public inbox for cygwin-developers@cygwin.com
* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
       [not found]                             ` <6e9bb35e-6f4f-cf78-e515-549da487b5ef@cornell.edu>
@ 2021-08-30  7:57                               ` Corinna Vinschen
  0 siblings, 0 replies; 250+ messages in thread
From: Corinna Vinschen @ 2021-08-30  7:57 UTC (permalink / raw)
  To: cygwin-developers

[Moving discussion to cygwin-developers where it belongs]

On Aug 29 18:42, Ken Brown via Cygwin wrote:
> On 8/29/2021 4:41 AM, Takashi Yano wrote:
> > Hi Ken,
> > 
> > On Sat, 28 Aug 2021 16:55:52 -0400
> > Ken Brown wrote:
> > > On 8/28/2021 11:43 AM, Takashi Yano via Cygwin wrote:
> > > > On Sat, 28 Aug 2021 13:58:08 +0200
> > > > Corinna Vinschen wrote:
> > > > > On Aug 28 18:41, Takashi Yano via Cygwin wrote:
> > > > > > On Sat, 28 Aug 2021 10:43:27 +0200
> > > > > > Corinna Vinschen wrote:
> > > > > > > [...]
> > > > > > If 'non-blocking' means overlapped I/O, only the problem will be:
> > > > > > https://cygwin.com/pipermail/cygwin/2021-March/247987.html
> > > > > 
> > > > > Sorry if that wasn't clear, but I was not talking about overlapped I/O,
> > > > > > which we should get rid of, but of real non-blocking mode, which
> > > > > Windows pipes are fortunately capable of.
> > > > 
> > > > > Do you mean the PIPE_NOWAIT flag? If this flag is specified for
> > > > > the read pipe, non-Cygwin apps cannot read the pipe correctly.
> > > 
> > > While waiting for Corinna's response to this, I have one more question.  Do you
> > > understand why nt_create() failed and you had to revert to create()?  Was it an
> > > access problem because nt_create requested FILE_WRITE_ATTRIBUTES?  Or did I make
> > > some careless mistake in writing nt_create?
> > 
> > I am sorry, but no. I don't understand why piping a C# program via
> > the pipe created by nt_create() has the issue. I tried changing the
> > setup parameters in nt_create(), but I did not succeed in getting it
> > to work. I also couldn't find any mistake in nt_create() so far.
> > 
> > Win32 programs which use ReadFile() and WriteFile() work even
> > with the pipe created by nt_create() as well as overlapped I/O.
> > 
> > How does a C# program differ from a legacy Win32 program at all?
> 
> I don't know.
> 
> By the way, when I introduced nt_create(), my preference would have been to
> simply change create() to use the NT API, but I was afraid to do that
> because I didn't want to take a chance on breaking something.  That's still
> my preference, if we can find a way to work around this problem with C#
> programs.

Maybe Procmon from sysinternals helps to find the difference in the
working vs. the non-working calls...?


Corinna


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
       [not found]                       ` <789f056a-f164-d71d-1dc9-230f5a41846d@cornell.edu>
@ 2021-08-30  8:27                         ` Corinna Vinschen
  2021-08-30 13:00                           ` Corinna Vinschen
       [not found]                         ` <20210830043756.8aa0ada77db0bfbbe3889f62@nifty.ne.jp>
  1 sibling, 1 reply; 250+ messages in thread
From: Corinna Vinschen @ 2021-08-30  8:27 UTC (permalink / raw)
  To: cygwin-developers

[Moving discussion to cygwin-developers]

On Aug 29 11:57, Ken Brown via Cygwin wrote:
> On 8/29/2021 5:07 AM, Takashi Yano via Cygwin wrote:
> > On Sat, 28 Aug 2021 18:41:02 +0900
> > Takashi Yano wrote:
> > > On Sat, 28 Aug 2021 10:43:27 +0200
> > > Corinna Vinschen wrote:
> > > > On Aug 28 02:21, Takashi Yano via Cygwin wrote:
> > > > > On Fri, 27 Aug 2021 12:00:50 -0400
> > > > > Ken Brown wrote:
> > > > > > Two years ago I thought I needed nt_create to avoid problems when calling
> > > > > > set_pipe_non_blocking.  Are you saying that's not an issue?  Is
> > > > > > set_pipe_non_blocking unnecessary?  Is that the point of your modification to
> > > > > > raw_read?
> > > > > 
> > > > > Yes. Instead of making windows read function itself non-blocking,
> > > > > it is possible to check if the pipe can be read before read using
> > > > > PeekNamedPipe(). If the pipe cannot be read right now, EAGAIN is
> > > > > returned.
> > > > 
> > > > The problem is this:
> > > > 
> > > >    if (PeekNamedPipe())
> > > >      ReadFile(blocking);
> > > > 
> > > > is not atomic.  I. e., if PeekNamedPipe succeeds, nothing keeps another
> > > > thread from draining the pipe between the PeekNamedPipe and the ReadFile
> > > > call.  And as soon as ReadFile runs, it hangs indefinitely and we can't
> > > > stop it via a signal.
> > > 
> > > Hmm, you are right. Mutex guard seems to be necessary like pty code
> > > if we go this way.
> > 
> > I have found that set_pipe_non_blocking() succeeds for both read and
> > write pipes if the write pipe is created by CreateNamedPipe() and the
> > read pipe is created by CreateFile() contrary to the current create()
> > code. Therefore, not only nt_create() but also PeekNamedPipe() become
> > unnecessary.
> > 
> > Please see the revised patch attached.
> 
> That's a great idea.
> 
> I've applied your two patches to the topic/pipe branch.  I also rebased it
> and did a forced push in order to bring in Corinna's loader script fix.  So
> you'll have to do 'git fetch' and 'git reset --hard origin/topic/pipe'.
> 
> Does this now fix all known problems with pipes?
> 
> Corinna, do you still see any benefit to switching to PIPE_NOWAIT?  AFAICT,
> it wouldn't decrease the code size at this point, so the only question is
> whether it might improve performance.

Pipes are already using PIPE_NOWAIT aka FILE_PIPE_COMPLETE_OPERATION
mode, see set_pipe_non_blocking.  The problem is that it's not used for
blocking pipes.  Rather, blocking pipes use overlapped IO.  Overlapped
IO is conceptually upside-down from the POSIX concept of non-blocking.
Also, the information returned in FilePipeLocalInformation is historically
borderline.  For kicks, see
https://cygwin.com/pipermail/cygwin-patches/2004q4/005002.html

So my suggestion is to try switching to non-blocking Windows pipes
entirely, even for blocking pipes on the user level.  It works nicely
for sockets.
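
For reference, the POSIX behaviour this would have to emulate can be
sketched as follows.  This is only an illustration in plain C for a POSIX
system, not Cygwin code; set_nonblocking and errno_of_read_on_empty_pipe
are hypothetical helper names.  With O_NONBLOCK set, a read on an empty
pipe fails right away with EAGAIN, so the readiness check and the read are
one call with no window in between, unlike a peek followed by a blocking
read.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Put FD into non-blocking mode.  Returns 0 on success, -1 on error. */
static int
set_nonblocking (int fd)
{
  int flags;

  assert (fd >= 0);
  flags = fcntl (fd, F_GETFL);
  if (flags < 0)
    return -1;
  return fcntl (fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Hypothetical helper: read from an empty, non-blocking pipe.
   Returns the errno the read failed with, 0 if the read unexpectedly
   succeeded, or -1 if setting up the pipe failed. */
static int
errno_of_read_on_empty_pipe (void)
{
  int fds[2];
  char buf[16];
  int result = -1;

  if (pipe (fds) < 0)
    return -1;
  if (set_nonblocking (fds[0]) == 0)
    {
      ssize_t n = read (fds[0], buf, sizeof buf);
      /* Save errno before close() has a chance to change it. */
      result = n < 0 ? errno : 0;
    }
  close (fds[0]);
  close (fds[1]);
  return result;
}
```

On a POSIX system the helper is expected to report EAGAIN, which is the
same result the reader would have to see from a Windows pipe in
non-blocking mode.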

> If you think it's worth trying, I'd be glad to code it up on a new branch,
> and we could compare the two.

We can do this in two version steps.  There's no pressure.

> Aside from that, I'm wondering how and when to merge the new pipe
> implementation to master.  It obviously needs much more widespread testing
> than it's gotten so far.  I'm a little nervous about it because I haven't
> thought about the details for two years, and no one other than me has tested
> it until a few days ago.

I could push out version 3.3.0, and afterwards you can just merge into
master.


Corinna


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
       [not found]                           ` <47e5dd74-b940-f305-fd5a-c6c9d8f41305@cornell.edu>
@ 2021-08-30  8:48                             ` Corinna Vinschen
  0 siblings, 0 replies; 250+ messages in thread
From: Corinna Vinschen @ 2021-08-30  8:48 UTC (permalink / raw)
  To: cygwin-developers

[Moved to cygwin-developers]

On Aug 29 17:09, Ken Brown via Cygwin wrote:
> On 8/29/2021 3:37 PM, Takashi Yano wrote:
> > The only small thing remaining is the pipe mode. If the pipe mode
> > is changed to byte mode, the following issue will be solved.
> > https://cygwin.com/pipermail/cygwin/2021-March/247987.html
> > 
> > How about the simple patch attached?
> > 
> > The comment in pipe code says:
> >       Note that the write side of the pipe is opened as PIPE_TYPE_MESSAGE.
> >       This *seems* to more closely mimic Linux pipe behavior and is
> >       definitely required for pty handling since fhandler_pty_master
> >       writes to the pipe in chunks, terminated by newline when CANON mode
> >       is specified.
> > 
> > This mentions pty behaviour in canonical mode; however, the
> > pty pipe is created in message mode even with this patch. Are there
> > any other reasons that message mode is preferred for pipes?
> 
> No idea.  All I remember is that there was a lot of discussion around the
> time that it was decided to use PIPE_TYPE_MESSAGE by default.  Corinna
> probably remembers the reasons.

No, sorry, I don't remember the exact discussion.  But it seemed to fix
quite a few issues at the time.


Corinna


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
       [not found]                               ` <20210830170204.fa91eaf110f310f13b67abc3@nifty.ne.jp>
@ 2021-08-30 10:20                                 ` Corinna Vinschen
  2021-08-30 10:38                                   ` Corinna Vinschen
  2021-08-30 12:04                                   ` Takashi Yano
  0 siblings, 2 replies; 250+ messages in thread
From: Corinna Vinschen @ 2021-08-30 10:20 UTC (permalink / raw)
  To: cygwin-developers

[Move discussion to cygwin-developers]

On Aug 30 17:02, Takashi Yano via Cygwin wrote:
> On Sun, 29 Aug 2021 22:15:29 -0400
> Ken Brown wrote:
> > On 8/29/2021 8:22 PM, Takashi Yano via Cygwin wrote:
> > > We have two easy options:
> > > 1) Configure the pipe with PIPE_ACCESS_DUPLEX.
> > > 2) Use nt_create() again and forget C# program issue.
> > 
> > I vote for 2), but let's see what Corinna thinks.
> 
> BTW. what's wrong if just:
> 
> static int
> nt_create (LPSECURITY_ATTRIBUTES sa_ptr, PHANDLE r, PHANDLE w,
>                 DWORD psize, int64_t *unique_id)
> {
>   if (r && w)
>     {
>       static volatile ULONG pipe_unique_id;
>       LONG id = InterlockedIncrement ((LONG *) &pipe_unique_id);
>       if (unique_id)
>         *unique_id = ((int64_t) id << 32 | GetCurrentProcessId ());
>       if (!CreatePipe (r, w, sa_ptr, psize))
>         {
>           *r = *w = NULL;
>           return GetLastError ();
>         }
>     }
>   return 0;
> }
> 
> ?
> 
> In my environment, I cannot find any defects.
> - No performance degradation.
> - set_pipe_non_blocking() works for both read and write pipes.
> - NtQueryInformationFile() in select() works for both r/w pipes.
> - Piping C# program works.
> 
> Is naming the pipe really necessary?

It's not, but CreatePipe is doing this anyway.

"Anonymous pipes are implemented using a named pipe with a unique name."
https://docs.microsoft.com/en-us/windows/win32/api/namedpipeapi/nf-namedpipeapi-createpipe

The reason CreateNamedPipe was used in the first place was that
FILE_READ_ATTRIBUTES isn't set by CreatePipe for the write side
of the pipe, however, it creates full duplex pipe:

https://cygwin.com/pipermail/cygwin-patches/2004q3/004912.html

Given the fact that CreatePipe is implemented in terms of
NtCreateNamedPipeFile anyway, why should the pipe created with
NtCreateNamedPipeFile fail where the pipe created with CreatePipe works?

The only reason can be some missing flag, I think.  Checking
fhandler_pipe.cc::nt_create and comparing that with the default flags
for files and other devices, it occurs to me that the SYNCHRONIZE stuff
is missing.  So, Takashi, what if you call NtCreateNamedPipeFile like
this in nt_create:

  status = NtCreateNamedPipeFile (r, access | SYNCHRONIZE, &attr, &io,
				  FILE_SHARE_READ | FILE_SHARE_WRITE,
				  FILE_CREATE, FILE_SYNCHRONOUS_IO_NONALERT,
				  pipe_type, FILE_PIPE_BYTE_STREAM_MODE,
				  0, 1, psize, psize, &timeout);

Does that fix the above problems, too?


Corinna


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-08-30 10:20                                 ` Corinna Vinschen
@ 2021-08-30 10:38                                   ` Corinna Vinschen
  2021-08-30 12:04                                   ` Takashi Yano
  1 sibling, 0 replies; 250+ messages in thread
From: Corinna Vinschen @ 2021-08-30 10:38 UTC (permalink / raw)
  To: cygwin-developers

On Aug 30 12:20, Corinna Vinschen wrote:
> [Move discussion to cygwin-developers]
> 
> On Aug 30 17:02, Takashi Yano via Cygwin wrote:
> > On Sun, 29 Aug 2021 22:15:29 -0400
> > Ken Brown wrote:
> > > On 8/29/2021 8:22 PM, Takashi Yano via Cygwin wrote:
> > > > We have two easy options:
> > > > 1) Configure the pipe with PIPE_ACCESS_DUPLEX.
> > > > 2) Use nt_create() again and forget C# program issue.
> > > 
> > > I vote for 2), but let's see what Corinna thinks.
> > 
> > BTW. what's wrong if just:
> > [...]
> >       if (!CreatePipe (r, w, sa_ptr, psize))
> > [...]
> > In my environment, I cannot find any defects.
> > - No performance degradation.
> > - set_pipe_non_blocking() works for both read and write pipes.
> > - NtQueryInformationFile() in select() works for both r/w pipes.
> > - Piping C# program works.
> > 
> > Is naming the pipe really necessary?
> 
> It's not, but CreatePipe is doing this anyway.
> [...]
> Given the fact that CreatePipe is implemented in terms of
> NtCreateNamedPipeFile anyway, why should the pipe created with
> NtCreateNamedPipeFile fail where the pipe created with CreatePipe works?
> 
> The only reason can be some missing flag, I think.  Checking
> fhandler_pipe.cc::nt_create and comparing that with the default flags
> for files and other devices, it occurs to me that the SYNCHRONIZE stuff
> is missing.  So, Takashi, what if you call NtCreateNamedPipeFile like
> this in nt_create:
> 
>   status = NtCreateNamedPipeFile (r, access | SYNCHRONIZE, &attr, &io,
> 				  FILE_SHARE_READ | FILE_SHARE_WRITE,
> 				  FILE_CREATE, FILE_SYNCHRONOUS_IO_NONALERT,
> 				  pipe_type, FILE_PIPE_BYTE_STREAM_MODE,
> 				  0, 1, psize, psize, &timeout);
> 
> Does that fix the above problems, too?

Btw., checking the calls with Procmon from sysinternals may give a clue,
too.  It was pretty helpful in the past.


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-08-30 10:20                                 ` Corinna Vinschen
  2021-08-30 10:38                                   ` Corinna Vinschen
@ 2021-08-30 12:04                                   ` Takashi Yano
  2021-08-30 12:55                                     ` Corinna Vinschen
  1 sibling, 1 reply; 250+ messages in thread
From: Takashi Yano @ 2021-08-30 12:04 UTC (permalink / raw)
  To: cygwin-developers

On Mon, 30 Aug 2021 12:20:30 +0200
Corinna Vinschen wrote:
> [Move discussion to cygwin-developers]
> 
> On Aug 30 17:02, Takashi Yano via Cygwin wrote:
> > On Sun, 29 Aug 2021 22:15:29 -0400
> > Ken Brown wrote:
> > > On 8/29/2021 8:22 PM, Takashi Yano via Cygwin wrote:
> > > > We have two easy options:
> > > > 1) Configure the pipe with PIPE_ACCESS_DUPLEX.
> > > > 2) Use nt_create() again and forget C# program issue.
> > > 
> > > I vote for 2), but let's see what Corinna thinks.
> > 
> > BTW. what's wrong if just:
> > 
> > static int
> > nt_create (LPSECURITY_ATTRIBUTES sa_ptr, PHANDLE r, PHANDLE w,
> >                 DWORD psize, int64_t *unique_id)
> > {
> >   if (r && w)
> >     {
> >       static volatile ULONG pipe_unique_id;
> >       LONG id = InterlockedIncrement ((LONG *) &pipe_unique_id);
> >       if (unique_id)
> >         *unique_id = ((int64_t) id << 32 | GetCurrentProcessId ());
> >       if (!CreatePipe (r, w, sa_ptr, psize))
> >         {
> >           *r = *w = NULL;
> >           return GetLastError ();
> >         }
> >     }
> >   return 0;
> > }
> > 
> > ?
> > 
> > In my environment, I cannot find any defects.
> > - No performance degradation.
> > - set_pipe_non_blocking() works for both read and write pipes.
> > - NtQueryInformationFile() in select() works for both r/w pipes.
> > - Piping C# program works.
> > 
> > Is naming the pipe really necessary?
> 
> It's not, but CreatePipe is doing this anyway.
> 
> "Anonymous pipes are implemented using a named pipe with a unique name."
> https://docs.microsoft.com/en-us/windows/win32/api/namedpipeapi/nf-namedpipeapi-createpipe
> 
> The reason CreateNamedPipe was used in the first place was that
> FILE_READ_ATTRIBUTES isn't set by CreatePipe for the write side
> of the pipe, however, it creates full duplex pipe:
> 
> https://cygwin.com/pipermail/cygwin-patches/2004q3/004912.html
> 
> Given the fact that CreatePipe is implemented in terms of
> NtCreateNamedPipeFile anyway, why should the pipe created with
> NtCreateNamedPipeFile fail where the pipe created with CreatePipe works?
> 
> The only reason can be some missing flag, I think.  Checking
> fhandler_pipe.cc::nt_create and comparing that with the default flags
> for files and other devices, it occurs to me that the SYNCHRONIZE stuff
> is missing.  So, Takashi, what if you call NtCreateNamedPipeFile like
> this in nt_create:
> 
>   status = NtCreateNamedPipeFile (r, access | SYNCHRONIZE, &attr, &io,
> 				  FILE_SHARE_READ | FILE_SHARE_WRITE,
> 				  FILE_CREATE, FILE_SYNCHRONOUS_IO_NONALERT,
> 				  pipe_type, FILE_PIPE_BYTE_STREAM_MODE,
> 				  0, 1, psize, psize, &timeout);
> 
> Does that fix the above problems, too?

Yes it does! Now, if CYGWIN=pipe_byte is also set, the piping issue
of C# program is gone!

In fact, I've already tested adding the SYNCHRONIZE access flag,
but it didn't solve the problem. It seems that the cause was
that FILE_SYNCHRONOUS_IO_NONALERT was missing.

Thank you for figuring out the solution!

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-08-30 12:04                                   ` Takashi Yano
@ 2021-08-30 12:55                                     ` Corinna Vinschen
  2021-08-30 13:31                                       ` Corinna Vinschen
  2021-08-30 13:51                                       ` Ken Brown
  0 siblings, 2 replies; 250+ messages in thread
From: Corinna Vinschen @ 2021-08-30 12:55 UTC (permalink / raw)
  To: cygwin-developers

On Aug 30 21:04, Takashi Yano wrote:
> On Mon, 30 Aug 2021 12:20:30 +0200
> Corinna Vinschen wrote:
> > [Move discussion to cygwin-developers]
> > 
> > On Aug 30 17:02, Takashi Yano via Cygwin wrote:
> > > [...]
> > > Is naming the pipe really necessary?
> > 
> > It's not, but CreatePipe is doing this anyway.
> > 
> > "Anonymous pipes are implemented using a named pipe with a unique name."
> > https://docs.microsoft.com/en-us/windows/win32/api/namedpipeapi/nf-namedpipeapi-createpipe
> > 
> > The reason CreateNamedPipe was used in the first place was that
> > FILE_READ_ATTRIBUTES isn't set by CreatePipe for the write side
> > of the pipe, however, it creates full duplex pipe:
> > 
> > https://cygwin.com/pipermail/cygwin-patches/2004q3/004912.html
> > 
> > Given the fact that CreatePipe is implemented in terms of
> > NtCreateNamedPipeFile anyway, why should the pipe created with
> > NtCreateNamedPipeFile fail where the pipe created with CreatePipe works?
> > 
> > The only reason can be some missing flag, I think.  Checking
> > fhandler_pipe.cc::nt_create and comparing that with the default flags
> > for files and other devices, it occurs to me that the SYNCHRONIZE stuff
> > is missing.  So, Takashi, what if you call NtCreateNamedPipeFile like
> > this in nt_create:
> > 
> >   status = NtCreateNamedPipeFile (r, access | SYNCHRONIZE, &attr, &io,
> > 				  FILE_SHARE_READ | FILE_SHARE_WRITE,
> > 				  FILE_CREATE, FILE_SYNCHRONOUS_IO_NONALERT,
> > 				  pipe_type, FILE_PIPE_BYTE_STREAM_MODE,
> > 				  0, 1, psize, psize, &timeout);
> > 
> > Does that fix the above problems, too?
> 
> Yes it does! Now, if CYGWIN=pipe_byte is also set, the piping issue
> of C# program is gone!
> 
> In fact, I've already tested adding the SYNCHRONIZE access flag,
> but it didn't solve the problem. It seems that the cause was
> that FILE_SYNCHRONOUS_IO_NONALERT was missing.
> 
> Thank you for figuring out the solution!

No worries.  The same should apply to the NtCreateFile side of the
pipe, btw.


Corinna


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-08-30  8:27                         ` Corinna Vinschen
@ 2021-08-30 13:00                           ` Corinna Vinschen
  2021-08-30 13:20                             ` Corinna Vinschen
  0 siblings, 1 reply; 250+ messages in thread
From: Corinna Vinschen @ 2021-08-30 13:00 UTC (permalink / raw)
  To: cygwin-developers

On Aug 30 10:27, Corinna Vinschen wrote:
> [Moving discussion to cygwin-developers]
> 
> On Aug 29 11:57, Ken Brown via Cygwin wrote:
> > Corinna, do you still see any benefit to switching to PIPE_NOWAIT?  AFAICT,
> > it wouldn't decrease the code size at this point, so the only question is
> > whether it might improve performance.
> 
> Pipes are already using PIPE_NOWAIT aka FILE_PIPE_COMPLETE_OPERATION
> mode, see set_pipe_non_blocking.  The problem is that it's not used for
> blocking pipes.  Rather, blocking pipes use overlapped IO.  Overlapped
> IO is conceptually upside-down from the POSIX concept of non-blocking.
> Also, the information returned in FilePipeLocalInformation is historically
> borderline.  For kicks, see
> https://cygwin.com/pipermail/cygwin-patches/2004q4/005002.html
> 
> So my suggestion is to try switching to non-blocking Windows pipes
> entirely, even for blocking pipes on the user level.  It works nicely
> for sockets.

On second thought, I'm not so sure how to block on non-blocking pipes
on writing.  Assuming a write fails because the buffer is full, we
don't have a waitable object to wait on.  Unless the pipe handle is
signalled if writing is allowed, but that would be a first in Windows.
So in theory this would still require overlapped IO.  Does that still
work as desired if the pipe mode is non-blocking?  I don't think I ever
tried that...
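
For comparison, here is how the same thing looks on POSIX, where the
descriptor itself can be waited on.  This is a minimal sketch in plain C
for a POSIX system, not Cygwin code; write_all_blocking is a hypothetical
name.  The poll (POLLOUT) call plays the role of the waitable object that
the Windows pipe handle does not obviously provide.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/* Write all LEN bytes of BUF to FD, which may be in non-blocking mode.
   Whenever the pipe is full, sleep in poll() until it is writable again.
   Returns 0 on success, -1 on error. */
static int
write_all_blocking (int fd, const char *buf, size_t len)
{
  assert (buf != NULL);
  while (len > 0)
    {
      ssize_t n = write (fd, buf, len);
      if (n >= 0)
        {
          /* Partial writes are normal; advance and try again. */
          buf += n;
          len -= (size_t) n;
        }
      else if (errno == EAGAIN)
        {
          /* Buffer full: block here instead of inside write(). */
          struct pollfd pfd = { .fd = fd, .events = POLLOUT, .revents = 0 };
          if (poll (&pfd, 1, -1) < 0 && errno != EINTR)
            return -1;
        }
      else if (errno != EINTR)
        return -1;
    }
  return 0;
}
```

The blocking behaviour lives entirely in the wait, so a signal can
interrupt it.  Whether something equivalent can be built on a Windows pipe
in non-blocking mode is exactly the open question here.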


Corinna


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-08-30 13:00                           ` Corinna Vinschen
@ 2021-08-30 13:20                             ` Corinna Vinschen
  2021-08-30 13:41                               ` Ken Brown
  0 siblings, 1 reply; 250+ messages in thread
From: Corinna Vinschen @ 2021-08-30 13:20 UTC (permalink / raw)
  To: cygwin-developers

On Aug 30 15:00, Corinna Vinschen wrote:
> On Aug 30 10:27, Corinna Vinschen wrote:
> > [Moving discussion to cygwin-developers]
> > 
> > On Aug 29 11:57, Ken Brown via Cygwin wrote:
> > > Corinna, do you still see any benefit to switching to PIPE_NOWAIT?  AFAICT,
> > > it wouldn't decrease the code size at this point, so the only question is
> > > whether it might improve performance.
> > 
> > Pipes are already using PIPE_NOWAIT aka FILE_PIPE_COMPLETE_OPERATION
> > mode, see set_pipe_non_blocking.  The problem is that it's not used for
> > blocking pipes.  Rather, blocking pipes use overlapped IO.  Overlapped
> > IO is conceptually upside-down from the POSIX concept of non-blocking.
> > Also, the information returned in FilePipeLocalInformation is historically
> > borderline.  For kicks, see
> > https://cygwin.com/pipermail/cygwin-patches/2004q4/005002.html
> > 
> > So my suggestion is to try switching to non-blocking Windows pipes
> > entirely, even for blocking pipes on the user level.  It works nicely
> > for sockets.
> 
> On second thought, I'm not so sure how to block on non-blocking pipes
> on writing.  Assuming a write fails because the buffer is full, we
> don't have a waitable object to wait on.  Unless the pipe handle is
> signalled if writing is allowed, but that would be a first in Windows.
> So in theory this would still require overlapped IO.  Does that still
> work as desired if the pipe mode is non-blocking?  I don't think I ever
> tried that...

That probably doesn't make sense.  If WriteFile returns without writing
something, what should overlapped io be waiting on?


Corinna


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-08-30 12:55                                     ` Corinna Vinschen
@ 2021-08-30 13:31                                       ` Corinna Vinschen
  2021-08-31  8:50                                         ` Takashi Yano
  2021-08-30 13:51                                       ` Ken Brown
  1 sibling, 1 reply; 250+ messages in thread
From: Corinna Vinschen @ 2021-08-30 13:31 UTC (permalink / raw)
  To: cygwin-developers

Hi Takashi,

On Aug 30 14:55, Corinna Vinschen wrote:
> On Aug 30 21:04, Takashi Yano wrote:
> > On Mon, 30 Aug 2021 12:20:30 +0200
> > Corinna Vinschen wrote:
> > > [Move discussion to cygwin-developers]
> > > 
> > > On Aug 30 17:02, Takashi Yano via Cygwin wrote:
> > > > [...]
> > > > Is naming the pipe really necessary?
> > > 
> > > It's not, but CreatePipe is doing this anyway.
> > > 
> > > "Anonymous pipes are implemented using a named pipe with a unique name."
> > > https://docs.microsoft.com/en-us/windows/win32/api/namedpipeapi/nf-namedpipeapi-createpipe
> > > 
> > > The reason CreateNamedPipe was used in the first place was that
> > > FILE_READ_ATTRIBUTES isn't set by CreatePipe for the write side
> > > of the pipe, however, it creates full duplex pipe:
> > > 
> > > https://cygwin.com/pipermail/cygwin-patches/2004q3/004912.html
> > > 
> > > Given the fact that CreatePipe is implemented in terms of
> > > NtCreateNamedPipeFile anyway, why should the pipe created with
> > > NtCreateNamedPipeFile fail where the pipe created with CreatePipe works?
> > > 
> > > The only reason can be some missing flag, I think.  Checking
> > > fhandler_pipe.cc::nt_create and comparing that with the default flags
> > > for files and other devices, it occurs to me that the SYNCHRONIZE stuff
> > > is missing.  So, Takashi, what if you call NtCreateNamedPipeFile like
> > > this in nt_create:
> > > 
> > >   status = NtCreateNamedPipeFile (r, access | SYNCHRONIZE, &attr, &io,
> > > 				  FILE_SHARE_READ | FILE_SHARE_WRITE,
> > > 				  FILE_CREATE, FILE_SYNCHRONOUS_IO_NONALERT,
> > > 				  pipe_type, FILE_PIPE_BYTE_STREAM_MODE,
> > > 				  0, 1, psize, psize, &timeout);
> > > 
> > > Does that fix the above problems, too?
> > 
> > Yes it does! Now, if CYGWIN=pipe_byte is also set, the piping issue
> > of C# program is gone!

I don't quite understand this one.  Is that C# example using the write
side of the pipe?  If it reads from the pipe, this behaviour would be
pretty puzzling, given the read mode is always BYTE.

Either way, assuming we switch the write side to BYTE mode only, is
the pty code robust enough to work with that?  The comment

  Note that the write side of the pipe is opened as PIPE_TYPE_MESSAGE.
  This *seems* to more closely mimic Linux pipe behavior and is
  definitely required for pty handling since fhandler_pty_master
  writes to the pipe in chunks, terminated by newline when CANON mode
  is specified. 

is old, so the problems the message mode was trying to solve for
CANON mode may not apply anymore...
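
As a reminder of what byte-stream semantics mean on the reading side, here
is a small sketch in plain C for a POSIX system.  It is not Cygwin code,
and bytes_from_one_read_after_two_writes is a hypothetical name.  Two
newline-terminated chunks are written separately, then a single read is
issued; on a byte-stream pipe the read is free to return both chunks at
once, so the reader cannot rely on write boundaries.

```c
#include <assert.h>
#include <unistd.h>

/* Write two separate chunks to a fresh pipe, then issue one read.
   Returns the number of bytes that single read delivered into OUT,
   or -1 on error. */
static ssize_t
bytes_from_one_read_after_two_writes (char *out, size_t outlen)
{
  int fds[2];
  ssize_t n = -1;

  assert (out != NULL);
  if (pipe (fds) < 0)
    return -1;
  /* Each write is a complete "line", as a pty master would send it. */
  if (write (fds[1], "ab\n", 3) == 3 && write (fds[1], "cd\n", 3) == 3)
    n = read (fds[0], out, outlen);
  close (fds[0]);
  close (fds[1]);
  return n;
}
```

On Linux, with a buffer of at least six bytes, the single read is expected
to deliver all six bytes rather than just the first line.  Any code that
assumed one read per written chunk would have to split on the newline
itself.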


Corinna


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
       [not found]                             ` <d217ef03-7858-5e22-0aa6-f0507eedd9da@cornell.edu>
       [not found]                               ` <20210830170204.fa91eaf110f310f13b67abc3@nifty.ne.jp>
@ 2021-08-30 13:36                               ` Ken Brown
  2021-08-30 14:05                                 ` Corinna Vinschen
  1 sibling, 1 reply; 250+ messages in thread
From: Ken Brown @ 2021-08-30 13:36 UTC (permalink / raw)
  To: cygwin-devel

On 8/29/2021 10:15 PM, Ken Brown via Cygwin wrote:
> On 8/29/2021 8:22 PM, Takashi Yano via Cygwin wrote:
>> On Mon, 30 Aug 2021 09:13:14 +0900
>> Takashi Yano wrote:
>>> On Sun, 29 Aug 2021 17:04:56 -0400
>>> Ken Brown wrote:
>>>> On 8/29/2021 5:07 AM, Takashi Yano via Cygwin wrote:
>>>>> On Sat, 28 Aug 2021 18:41:02 +0900
>>>>> Takashi Yano wrote:
>>>>>> On Sat, 28 Aug 2021 10:43:27 +0200
>>>>>> Corinna Vinschen wrote:
>>>>>>> On Aug 28 02:21, Takashi Yano via Cygwin wrote:
>>>>>>>> On Fri, 27 Aug 2021 12:00:50 -0400
>>>>>>>> Ken Brown wrote:
>>>>>>>>> Two years ago I thought I needed nt_create to avoid problems when calling
>>>>>>>>> set_pipe_non_blocking.  Are you saying that's not an issue?  Is
>>>>>>>>> set_pipe_non_blocking unnecessary?  Is that the point of your modification to
>>>>>>>>> raw_read?
>>>>>>>>
>>>>>>>> Yes. Instead of making windows read function itself non-blocking,
>>>>>>>> it is possible to check if the pipe can be read before read using
>>>>>>>> PeekNamedPipe(). If the pipe cannot be read right now, EAGAIN is
>>>>>>>> returned.
>>>>>>>
>>>>>>> The problem is this:
>>>>>>>
>>>>>>>     if (PeekNamedPipe())
>>>>>>>       ReadFile(blocking);
>>>>>>>
>>>>>>> is not atomic.  I. e., if PeekNamedPipe succeeds, nothing keeps another
>>>>>>> thread from draining the pipe between the PeekNamedPipe and the ReadFile
>>>>>>> call.  And as soon as ReadFile runs, it hangs indefinitely and we can't
>>>>>>> stop it via a signal.
>>>>>>
>>>>>> Hmm, you are right. Mutex guard seems to be necessary like pty code
>>>>>> if we go this way.
>>>>>
>>>>> I have found that set_pipe_non_blocking() succeeds for both read and
>>>>> write pipes if the write pipe is created by CreateNamedPipe() and the
>>>>> read pipe is created by CreateFile() contrary to the current create()
>>>>> code. Therefore, not only nt_create() but also PeekNamedPipe() become
>>>>> unnecessary.
>>>>>
>>>>> Please see the revised patch attached.
>>>>
>>>> I haven't had a chance to test this myself yet, but it occurs to me that we might
>>>> have a different problem after this patch: Does the write handle that we get
>>>> from CreateNamedPipe() have FILE_READ_ATTRIBUTES access?
>>>
>>> I have just checked this, and the answer is "No". Due to this problem,
>>> NtQueryInformationFile() call in select() fails on the write pipe.
>>>
>>> It seems that we need more consideration...
>>
>> We have two easy options:
>> 1) Configure the pipe with PIPE_ACCESS_DUPLEX.
>> 2) Use nt_create() again and forget C# program issue.
> 
> I vote for 2), but let's see what Corinna thinks.
> 
>> Even without this problem, select() on the write side of a pipe has
>> a bug and does not work as expected. The following patch seems to be
>> needed.
>>
>> diff --git a/winsup/cygwin/select.cc b/winsup/cygwin/select.cc
>> index 83e1c00e0..ac2fd227e 100644
>> --- a/winsup/cygwin/select.cc
>> +++ b/winsup/cygwin/select.cc
>> @@ -612,7 +612,6 @@ pipe_data_available (int fd, fhandler_base *fh, HANDLE h, bool writing)
>>             that.  This means that a pipe could still block since you could
>>             be trying to write more to the pipe than is available in the
>>             buffer but that is the hazard of select().  */
>> -      fpli.WriteQuotaAvailable = fpli.OutboundQuota - fpli.ReadDataAvailable;
>>         if (fpli.WriteQuotaAvailable > 0)
>>          {
>>            paranoid_printf ("fd %d, %s, write: size %u, avail %u", fd,
>>
> 
> I agree.

Now I'm starting to wonder.  The use of fpli.OutboundQuota - 
fpli.ReadDataAvailable instead of fpli.WriteQuotaAvailable was introduced in 
commit a010e6abe, with no comment in the commit message or the code to explain 
why.  Corinna, does this make sense to you?  Is it related to the issues raised 
in the message

  https://cygwin.com/pipermail/cygwin-patches/2004q4/005002.html

that you cited elsewhere in this thread?

BTW, when I was working on the pipe approach to AF_UNIX sockets (topic/af_unix 
branch), I had occasion to step through select.cc:pipe_data_available in gdb, 
and the use of fpli.OutboundQuota - fpli.ReadDataAvailable definitely seemed 
wrong to me.  So when I wrote peek_socket_unix on that branch, I used 
fpli.WriteQuotaAvailable, as Takashi is suggesting now.

Ken


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-08-30 13:20                             ` Corinna Vinschen
@ 2021-08-30 13:41                               ` Ken Brown
  2021-08-30 14:12                                 ` Corinna Vinschen
  0 siblings, 1 reply; 250+ messages in thread
From: Ken Brown @ 2021-08-30 13:41 UTC (permalink / raw)
  To: cygwin-developers

On 8/30/2021 9:20 AM, Corinna Vinschen wrote:
> On Aug 30 15:00, Corinna Vinschen wrote:
>> On Aug 30 10:27, Corinna Vinschen wrote:
>>> [Moving discussion to cygwin-developers]
>>>
>>> On Aug 29 11:57, Ken Brown via Cygwin wrote:
>>>> Corinna, do you still see any benefit to switching to PIPE_NOWAIT?  AFAICT,
>>>> it wouldn't decrease the code size at this point, so the only question is
>>>> whether it might improve performance.
>>>
>>> Pipes are already using PIPE_NOWAIT aka FILE_PIPE_COMPLETE_OPERATION
>>> mode, see set_pipe_non_blocking.  The problem is that it's not used for
>>> blocking pipes.  Rather, blocking pipes use overlapped IO.  Overlapped
>>> IO is conceptually upside-down from the POSIX concept of non-blocking.
>>> Also, the information returned in FilePipeLocalInformation is historically
>>> borderline.  For kicks, see
>>> https://cygwin.com/pipermail/cygwin-patches/2004q4/005002.html
>>>
>>> So my suggestion is to try switching to non-blocking Windows pipes
>>> entirely, even for blocking pipes on the user level.  It works nicely
>>> for sockets.
>>
>> On second thought, I'm not so sure how to block on non-blocking pipes
>> on writing.  Assuming a write fails because the buffer is full, we
>> don't have a waitable object to wait on.  Unless the pipe handle is
>> signalled if writing is allowed, but that would be a first in Windows.
>> So in theory this would still require overlapped IO.  Does that still
>> work as desired if the pipe mode is non-blocking?  I don't think I ever
>> tried that...
> 
> That probably doesn't make sense.  If WriteFile returns without writing
> something, what should overlapped io be waiting on?

The approach I've taken on the topic/pipe branch is to stop using overlapped I/O 
and to always keep the blocking mode of the Windows pipe in sync with the 
blocking mode of the fhandler.  This seems to work pretty well so far, although 
problems could certainly show up after further testing.

Ken

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-08-30 12:55                                     ` Corinna Vinschen
  2021-08-30 13:31                                       ` Corinna Vinschen
@ 2021-08-30 13:51                                       ` Ken Brown
  2021-08-30 15:00                                         ` Ken Brown
  1 sibling, 1 reply; 250+ messages in thread
From: Ken Brown @ 2021-08-30 13:51 UTC (permalink / raw)
  To: cygwin-developers

On 8/30/2021 8:55 AM, Corinna Vinschen wrote:
> On Aug 30 21:04, Takashi Yano wrote:
>> On Mon, 30 Aug 2021 12:20:30 +0200
>> Corinna Vinschen wrote:
>>> [Move discussion to cygwin-developers]
>>>
>>> On Aug 30 17:02, Takashi Yano via Cygwin wrote:
>>>> [...]
>>>> Is naming the pipe really necessary?
>>>
>>> It's not, but CreatePipe is doing this anyway.
>>>
>>> "Anonymous pipes are implemented using a named pipe with a unique name."
>>> https://docs.microsoft.com/en-us/windows/win32/api/namedpipeapi/nf-namedpipeapi-createpipe
>>>
>>> The reason CreateNamedPipe was used in the first place was that
>>> FILE_READ_ATTRIBUTES isn't set by CreatePipe for the write side
>>> of the pipe, however, it creates full duplex pipe:
>>>
>>> https://cygwin.com/pipermail/cygwin-patches/2004q3/004912.html
>>>
>>> Given the fact that CreatePipe is implemented in terms of
>>> NtCreateNamedPipeFile anyway, why should the pipe created with
>>> NtCreateNamedPipeFile fail where the pipe created with CreatePipe works?
>>>
>>> The only reason can be some missing flag, I think.  Checking
>>> fhandler_pipe.cc::nt_create and comparing that with the default flags
>>> for files and other devices, it occurs to me that the SYNCHRONIZE stuff
>>> is missing.  So, Takashi, what if you call NtCreateNamedPipeFile like
>>> this in nt_create:
>>>
>>>    status = NtCreateNamedPipeFile (r, access | SYNCHRONIZE, &attr, &io,
>>> 				  FILE_SHARE_READ | FILE_SHARE_WRITE,
>>> 				  FILE_CREATE, FILE_SYNCHRONOUS_IO_NONALERT,
>>> 				  pipe_type, FILE_PIPE_BYTE_STREAM_MODE,
>>> 				  0, 1, psize, psize, &timeout);
>>>
>>> Does that fix the above problems, too?
>>
>> Yes it does! Now, if CYGWIN=pipe_byte is also set, the piping issue
>> of C# program is gone!
>>
>> In fact, I've already tested adding the SYNCHRONIZE access flag,
>> but it didn't solve the problem. It seems that the cause was
>> that FILE_SYNCHRONOUS_IO_NONALERT was missing.
>>
>> Thank you for figuring out the solution!
> 
> No worries.  The same should apply to the NtCreateFile side of the
> pipe, btw.

I'll add my thanks.  I should have checked the default flags that are typically 
used for other devices when I wrote nt_create.  I'm glad you caught this.

So I'll reinstate the use of nt_create and then let Takashi recheck everything.

Ken

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-08-30 13:36                               ` Ken Brown
@ 2021-08-30 14:05                                 ` Corinna Vinschen
  2021-08-30 15:53                                   ` Corinna Vinschen
  0 siblings, 1 reply; 250+ messages in thread
From: Corinna Vinschen @ 2021-08-30 14:05 UTC (permalink / raw)
  To: cygwin-developers

On Aug 30 09:36, Ken Brown wrote:
> On 8/29/2021 10:15 PM, Ken Brown via Cygwin wrote:
> > On 8/29/2021 8:22 PM, Takashi Yano via Cygwin wrote:
> > > [...]
> > > Even without this problem, select() for writing pipe has a bug
> > > > and does not work as expected. The following patch seems to be
> > > needed.
> > > 
> > > diff --git a/winsup/cygwin/select.cc b/winsup/cygwin/select.cc
> > > index 83e1c00e0..ac2fd227e 100644
> > > --- a/winsup/cygwin/select.cc
> > > +++ b/winsup/cygwin/select.cc
> > > @@ -612,7 +612,6 @@ pipe_data_available (int fd, fhandler_base *fh,
> > > HANDLE h, bool writing)
> > >             that.  This means that a pipe could still block since you could
> > >             be trying to write more to the pipe than is available in the
> > >             buffer but that is the hazard of select().  */
> > > -      fpli.WriteQuotaAvailable = fpli.OutboundQuota - fpli.ReadDataAvailable;
> > >         if (fpli.WriteQuotaAvailable > 0)
> > >          {
> > >            paranoid_printf ("fd %d, %s, write: size %u, avail %u", fd,
> > > 
> > 
> > I agree.
> 
> Now I'm starting to wonder.  The use of fpli.OutboundQuota -
> fpli.ReadDataAvailable instead of fpli.WriteQuotaAvailable was introduced in
> commit a010e6abe, with no comment in the commit message or the code to
> explain why.  Corinna, does this make sense to you?  Is it related to the
> issues raised in the message
> 
>  https://cygwin.com/pipermail/cygwin-patches/2004q4/005002.html
> 
> that you cited elsewhere in this thread?

I thought so, but no.  This patch was introduced by cgf in 2008,
and I don't see any hint as to the why in any of my mail archives,
not even in private email.  The only vague hint is the release message
later on:

  - Reworked pipe handling for better speed and better support for signal
    processing.

I wonder if that was a typo, considering the observation from the above
mail:

  "But there is a strange twist:  When a read is pending on an empty
   pipe, then WriteQuotaAvailable is also decremented!"

> BTW, when I was working on the pipe approach to AF_UNIX sockets
> (topic/af_unix branch), I had occasion to step through
> select.cc:pipe_data_available in gdb, and the use of fpli.OutboundQuota -
> fpli.ReadDataAvailable definitely seemed wrong to me.  So when I wrote
> peek_socket_unix on that branch, I used fpli.WriteQuotaAvailable, as Takashi
> is suggesting now.

If that's working reliably these days (keeping fingers crossed for W7),
it's ok if we use that.  We may want to check if the above observation
in terms of WriteQuotaAvailable on a pipe with pending read is still an
issue.


Corinna

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-08-30 13:41                               ` Ken Brown
@ 2021-08-30 14:12                                 ` Corinna Vinschen
  2021-08-30 14:52                                   ` Ken Brown
  0 siblings, 1 reply; 250+ messages in thread
From: Corinna Vinschen @ 2021-08-30 14:12 UTC (permalink / raw)
  To: cygwin-developers

On Aug 30 09:41, Ken Brown wrote:
> On 8/30/2021 9:20 AM, Corinna Vinschen wrote:
> > On Aug 30 15:00, Corinna Vinschen wrote:
> > > On Aug 30 10:27, Corinna Vinschen wrote:
> > > > [Moving discussion to cygwin-developers]
> > > > 
> > > > On Aug 29 11:57, Ken Brown via Cygwin wrote:
> > > > > Corinna, do you still see any benefit to switching to PIPE_NOWAIT?  AFAICT,
> > > > > it wouldn't decrease the code size at this point, so the only question is
> > > > > whether it might improve performance.
> > > > 
> > > > Pipes are already using PIPE_NOWAIT aka FILE_PIPE_COMPLETE_OPERATION
> > > > mode, see set_pipe_non_blocking.  The problem is that it's not used for
> > > > blocking pipes.  Rather, blocking pipes use overlapped IO.  Overlapped
> > > > IO is conceptually upside-down from the POSIX concept of non-blocking.
> > > > Also, the information returned in FilePipeLocalInformation is historically
> > > > borderline.  For kicks, see
> > > > https://cygwin.com/pipermail/cygwin-patches/2004q4/005002.html
> > > > 
> > > > So my suggestion is to try switching to non-blocking Windows pipes
> > > > entirely, even for blocking pipes on the user level.  It works nicely
> > > > for sockets.
> > > 
> > > On second thought, I'm not so sure how to block on non-blocking pipes
> > > on writing.  Assuming a write fails because the buffer is full, we
> > > don't have a waitable object to wait on.  Unless the pipe handle is
> > > signalled if writing is allowed, but that would be a first in Windows.
> > > So in theory this would still require overlapped IO.  Does that still
> > > work as desired if the pipe mode is non-blocking?  I don't think I ever
> > > tried that...
> > 
> > That probably doesn't make sense.  If WriteFile returns without writing
> > something, what should overlapped io be waiting on?
> 
> The approach I've taken on the topic/pipe branch is to stop using overlapped
> I/O and to always keep the blocking mode of the Windows pipe in sync with
> the blocking mode of the fhandler.  This seems to work pretty well so far,
> although problems could certainly show up after further testing.

Erm... afaics, fhandler_pipe::raw_read and fhandler_pipe::raw_write are
still using overlapped IO on blocking pipes.  Otherwise, how'd you
handle signals?

Just to be clear, Windows pipes can be read and written in three modes:

- non-blocking	 (FILE_PIPE_COMPLETE_OPERATION)
- synchronous	 (FILE_PIPE_QUEUE_OPERATION, non-overlapped)
- asynchronous   (FILE_PIPE_QUEUE_OPERATION, overlapped)

Right now the pipe code uses non-blocking mode for non-blocking pipes
and asynchronous mode for blocking pipes.  This looks like the most
promising approach, afaics.
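
For reference, the three modes listed above can be written down as a
small lookup.  This is only a sketch: the enum and function names are
made up for illustration and are not Cygwin or NT identifiers, and it
assumes that the overlapped/non-overlapped distinction only matters in
FILE_PIPE_QUEUE_OPERATION mode.

```cpp
// Hypothetical stand-ins for the two NT completion modes named above.
enum class CompletionMode { QueueOperation, CompleteOperation };

// The three ways a Windows pipe can be read/written, as listed above.
enum class PipeIoMode { NonBlocking, Synchronous, Asynchronous };

// Classify a handle by completion mode and by whether it was opened
// for overlapped (asynchronous) I/O.  Assumption: in
// CompleteOperation mode the overlapped flag does not change the
// classification.
constexpr PipeIoMode classify (CompletionMode completion, bool overlapped)
{
  return completion == CompletionMode::CompleteOperation
	 ? PipeIoMode::NonBlocking
	 : (overlapped ? PipeIoMode::Asynchronous : PipeIoMode::Synchronous);
}

// The mapping described above for the current pipe code: POSIX
// non-blocking uses non-blocking mode, POSIX blocking uses
// asynchronous mode.
constexpr PipeIoMode mode_for_fhandler (bool posix_nonblocking)
{
  return posix_nonblocking ? PipeIoMode::NonBlocking
			   : PipeIoMode::Asynchronous;
}
```

In this model mode_for_fhandler (false) yields PipeIoMode::Asynchronous,
i.e. a blocking fhandler still goes through overlapped IO, and the
synchronous mode is never selected.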


Corinna

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-08-30 14:12                                 ` Corinna Vinschen
@ 2021-08-30 14:52                                   ` Ken Brown
  2021-08-30 15:15                                     ` Corinna Vinschen
  0 siblings, 1 reply; 250+ messages in thread
From: Ken Brown @ 2021-08-30 14:52 UTC (permalink / raw)
  To: cygwin-developers

On 8/30/2021 10:12 AM, Corinna Vinschen wrote:
> On Aug 30 09:41, Ken Brown wrote:
>> On 8/30/2021 9:20 AM, Corinna Vinschen wrote:
>>> On Aug 30 15:00, Corinna Vinschen wrote:
>>>> On Aug 30 10:27, Corinna Vinschen wrote:
>>>>> [Moving discussion to cygwin-developers]
>>>>>
>>>>> On Aug 29 11:57, Ken Brown via Cygwin wrote:
>>>>>> Corinna, do you still see any benefit to switching to PIPE_NOWAIT?  AFAICT,
>>>>>> it wouldn't decrease the code size at this point, so the only question is
>>>>>> whether it might improve performance.
>>>>>
>>>>> Pipes are already using PIPE_NOWAIT aka FILE_PIPE_COMPLETE_OPERATION
>>>>> mode, see set_pipe_non_blocking.  The problem is that it's not used for
>>>>> blocking pipes.  Rather, blocking pipes use overlapped IO.  Overlapped
>>>>> IO is conceptually upside-down from the POSIX concept of non-blocking.
>>>>> Also, the information returned in FilePipeLocalInformation is historically
>>>>> borderline.  For kicks, see
>>>>> https://cygwin.com/pipermail/cygwin-patches/2004q4/005002.html
>>>>>
>>>>> So my suggestion is to try switching to non-blocking Windows pipes
>>>>> entirely, even for blocking pipes on the user level.  It works nicely
>>>>> for sockets.
>>>>
>>>> On second thought, I'm not so sure how to block on non-blocking pipes
>>>> on writing.  Assuming a write fails because the buffer is full, we
>>>> don't have a waitable object to wait on.  Unless the pipe handle is
>>>> signalled if writing is allowed, but that would be a first in Windows.
>>>> So in theory this would still require overlapped IO.  Does that still
>>>> work as desired if the pipe mode is non-blocking?  I don't think I ever
>>>> tried that...
>>>
>>> That probably doesn't make sense.  If WriteFile returns without writing
>>> something, what should overlapped io be waiting on?
>>
>> The approach I've taken on the topic/pipe branch is to stop using overlapped
>> I/O and to always keep the blocking mode of the Windows pipe in sync with
>> the blocking mode of the fhandler.  This seems to work pretty well so far,
>> although problems could certainly show up after further testing.
> 
> Erm... afaics, fhandler_pipe::raw_read and fhandler_pipe::raw_write are
> still using overlapped IO on blocking pipes.  Otherwise, how'd you
> handle signals?
> 
> Just to be clear, Windows pipes can be read and written in three modes:
> 
> - non-blocking	 (FILE_PIPE_COMPLETE_OPERATION)
> - synchronous	 (FILE_PIPE_QUEUE_OPERATION, non-overlapped)
> - asynchronous   (FILE_PIPE_QUEUE_OPERATION, overlapped)

OK, I've been thoroughly confused about what "overlapped" means.  I thought it 
meant specifying FILE_FLAG_OVERLAPPED and a pointer to an OVERLAPPED structure, 
both of which (I thought) only made sense when using the Win32 API rather than 
the NT API.

I *think* I understand what you mean now.  By using an event in the calls to 
NtReadFile and NtWriteFile in the blocking case, I'm selecting asynchronous 
mode, right?

Ken

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-08-30 13:51                                       ` Ken Brown
@ 2021-08-30 15:00                                         ` Ken Brown
  2021-08-30 15:19                                           ` Corinna Vinschen
  0 siblings, 1 reply; 250+ messages in thread
From: Ken Brown @ 2021-08-30 15:00 UTC (permalink / raw)
  To: cygwin-developers

On 8/30/2021 9:51 AM, Ken Brown wrote:
> On 8/30/2021 8:55 AM, Corinna Vinschen wrote:
>> On Aug 30 21:04, Takashi Yano wrote:
>>> On Mon, 30 Aug 2021 12:20:30 +0200
>>> Corinna Vinschen wrote:
>>>> [Move discussion to cygwin-developers]
>>>>
>>>> On Aug 30 17:02, Takashi Yano via Cygwin wrote:
>>>>> [...]
>>>>> Is naming the pipe really necessary?
>>>>
>>>> It's not, but CreatePipe is doing this anyway.
>>>>
>>>> "Anonymous pipes are implemented using a named pipe with a unique name."
>>>> https://docs.microsoft.com/en-us/windows/win32/api/namedpipeapi/nf-namedpipeapi-createpipe 
>>>>
>>>>
>>>> The reason CreateNamedPipe was used in the first place was that
>>>> FILE_READ_ATTRIBUTES isn't set by CreatePipe for the write side
>>>> of the pipe, however, it creates full duplex pipe:
>>>>
>>>> https://cygwin.com/pipermail/cygwin-patches/2004q3/004912.html
>>>>
>>>> Given the fact that CreatePipe is implemented in terms of
>>>> NtCreateNamedPipeFile anyway, why should the pipe created with
>>>> NtCreateNamedPipeFile fail where the pipe created with CreatePipe works?
>>>>
>>>> The only reason can be some missing flag, I think.  Checking
>>>> fhandler_pipe.cc::nt_create and comparing that with the default flags
>>>> for files and other devices, it occurs to me that the SYNCHRONIZE stuff
>>>> is missing.  So, Takashi, what if you call NtCreateNamedPipeFile like
>>>> this in nt_create:
>>>>
>>>>    status = NtCreateNamedPipeFile (r, access | SYNCHRONIZE, &attr, &io,
>>>>                   FILE_SHARE_READ | FILE_SHARE_WRITE,
>>>>                   FILE_CREATE, FILE_SYNCHRONOUS_IO_NONALERT,
>>>>                   pipe_type, FILE_PIPE_BYTE_STREAM_MODE,
>>>>                   0, 1, psize, psize, &timeout);
>>>>
>>>> Does that fix the above problems, too?
>>>
>>> Yes it does! Now, if CYGWIN=pipe_byte is also set, the piping issue
>>> of C# program is gone!
>>>
>>> In fact, I've already tested adding the SYNCHRONIZE access flag,
>>> but it didn't solve the problem. It seems that the cause was
>>> that FILE_SYNCHRONOUS_IO_NONALERT was missing.
>>>
>>> Thank you for figuring out the solution!
>>
>> No worries.  The same should apply to the NtCreateFile side of the
>> pipe, btw.
> 
> I'll add my thanks.  I should have checked the default flags that are typically 
> used for other devices when I wrote nt_create.  I'm glad you caught this.
> 
> So I'll reinstate the use of nt_create and then let Takashi recheck everything.

I've done this now.  I'm still not sure I've got all the flags right.  For 
unknown reasons, I've used FILE_SHARE_READ | FILE_SHARE_WRITE in the call to 
NtCreateNamedPipeFile, and no sharing in the call to NtOpenFile.  Should I also 
use FILE_SHARE_READ | FILE_SHARE_WRITE in NtOpenFile?  Is sharing even relevant in 
this context?

Ken

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-08-30 14:52                                   ` Ken Brown
@ 2021-08-30 15:15                                     ` Corinna Vinschen
  0 siblings, 0 replies; 250+ messages in thread
From: Corinna Vinschen @ 2021-08-30 15:15 UTC (permalink / raw)
  To: cygwin-developers

On Aug 30 10:52, Ken Brown wrote:
> On 8/30/2021 10:12 AM, Corinna Vinschen wrote:
> > On Aug 30 09:41, Ken Brown wrote:
> > > The approach I've taken on the topic/pipe branch is to stop using overlapped
> > > I/O and to always keep the blocking mode of the Windows pipe in sync with
> > > the blocking mode of the fhandler.  This seems to work pretty well so far,
> > > although problems could certainly show up after further testing.
> > 
> > Erm... afaics, fhandler_pipe::raw_read and fhandler_pipe::raw_write are
> > still using overlapped IO on blocking pipes.  Otherwise, how'd you
> > handle signals?
> > 
> > Just to be clear, Windows pipes can be read and written in three modes:
> > 
> > - non-blocking	 (FILE_PIPE_COMPLETE_OPERATION)
> > - synchronous	 (FILE_PIPE_QUEUE_OPERATION, non-overlapped)
> > - asynchronous   (FILE_PIPE_QUEUE_OPERATION, overlapped)
> 
> OK, I've been thoroughly confused about what "overlapped" means.  I thought
> it meant specifying FILE_FLAG_OVERLAPPED and a pointer to an OVERLAPPED
> structure, both of which (I thought) only made sense when using the Win32
> API rather than the NT API.
> 
> I *think* I understand what you mean now.  By using an event in the calls to
> NtReadFile and NtWriteFile in the blocking case, I'm selecting asynchronous
> mode, right?

Sorry, yes, OVERLAPPED is a Win32 expression only.  The NT calls only
distinguish between synchronous and asynchronous calls.  For asynchronous
calls you can either call a wait function on the file handle or you can
add an event object or a completion routine.

But that makes me wonder...  Looks like my idea to add the
FILE_SYNCHRONOUS_IO_NONALERT flag was a red herring.  This enforces
synchronous operation, which is not what we want.  Bummer.

However, if C# can't work with asynchronous handles, I wonder how to
fix this issue at all.


Corinna

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-08-30 15:00                                         ` Ken Brown
@ 2021-08-30 15:19                                           ` Corinna Vinschen
  2021-08-30 15:43                                             ` Ken Brown
  2021-08-31  8:52                                             ` Takashi Yano
  0 siblings, 2 replies; 250+ messages in thread
From: Corinna Vinschen @ 2021-08-30 15:19 UTC (permalink / raw)
  To: cygwin-developers

On Aug 30 11:00, Ken Brown wrote:
> On 8/30/2021 9:51 AM, Ken Brown wrote:
> > On 8/30/2021 8:55 AM, Corinna Vinschen wrote:
> > > On Aug 30 21:04, Takashi Yano wrote:
> > > No worries.  The same should apply to the NtCreateFile side of the
> > > pipe, btw.
> > 
> > I'll add my thanks.  I should have checked the default flags that are
> > typically used for other devices when I wrote nt_create.  I'm glad you
> > caught this.
> > 
> > So I'll reinstate the use of nt_create and then let Takashi recheck everything.
> 
> I've done this now.  I'm still not sure I've got all the flags right.  For
> unknown reasons, I've used FILE_SHARE_READ | FILE_SHARE_WRITE in the call to
> NtCreateNamedPipeFile, and no sharing in the call to NtOpenFile.  Should I
> also use FILE_SHARE_READ | FILE_SHARE_WRITE in NtOpenFile?  Is sharing even
> relevant in this context?

This is only relevant if you want to open the pipe from another context,
calling CreateNamedPipe/CreateFile.  As long as the pipe is only
duplicated, it shouldn't matter at all.

But, as I just wrote in my previous mail, the FILE_SYNCHRONOUS_IO_NONALERT
flag is probably a good thing for C# apps, but not for Cygwin, because it
enforces synchronous operation.  Sorry about that...


Corinna

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-08-30 15:19                                           ` Corinna Vinschen
@ 2021-08-30 15:43                                             ` Ken Brown
  2021-08-31  9:43                                               ` Corinna Vinschen
  2021-08-31  8:52                                             ` Takashi Yano
  1 sibling, 1 reply; 250+ messages in thread
From: Ken Brown @ 2021-08-30 15:43 UTC (permalink / raw)
  To: cygwin-developers

On 8/30/2021 11:19 AM, Corinna Vinschen wrote:
> On Aug 30 11:00, Ken Brown wrote:
>> On 8/30/2021 9:51 AM, Ken Brown wrote:
>>> On 8/30/2021 8:55 AM, Corinna Vinschen wrote:
>>>> On Aug 30 21:04, Takashi Yano wrote:
>>>> No worries.  The same should apply to the NtCreateFile side of the
>>>> pipe, btw.
>>>
>>> I'll add my thanks.  I should have checked the default flags that are
>>> typically used for other devices when I wrote nt_create.  I'm glad you
>>> caught this.
>>>
>>> So I'll reinstate the use of nt_create and then let Takashi recheck everything.
>>
>> I've done this now.  I'm still not sure I've got all the flags right.  For
>> unknown reasons, I've used FILE_SHARE_READ | FILE_SHARE_WRITE in the call to
>> NtCreateNamedPipeFile, and no sharing in the call to NtOpenFile.  Should I
>> also use FILE_SHARE_READ | FILE_SHARE_WRITE in NtOpenFile?  Is sharing even
>> relevant in this context?
> 
> This is only relevant if you want to open the pipe from another context,
> calling CreateNamedPipe/CreateFile.  As long as the pipe is only
> duplicated, it shouldn't matter at all.

OK, then I think I should remove the sharing from NtCreateNamedPipeFile, since 
it could confuse someone reading the code.

> But, as I just wrote in my previous mail, the FILE_SYNCHRONOUS_IO_NONALERT
> flag is probably a good thing for C# apps, but not for Cygwin, because it
> enforces synchronous operation.  Sorry about that...

No problem.  I'll remove that flag for now, and we may have to live with the C# 
problem unless someone can find a different fix for it.

Ken

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-08-30 14:05                                 ` Corinna Vinschen
@ 2021-08-30 15:53                                   ` Corinna Vinschen
  2021-08-30 17:00                                     ` Corinna Vinschen
  0 siblings, 1 reply; 250+ messages in thread
From: Corinna Vinschen @ 2021-08-30 15:53 UTC (permalink / raw)
  To: cygwin-developers

On Aug 30 16:05, Corinna Vinschen wrote:
> On Aug 30 09:36, Ken Brown wrote:
> > BTW, when I was working on the pipe approach to AF_UNIX sockets
> > (topic/af_unix branch), I had occasion to step through
> > select.cc:pipe_data_available in gdb, and the use of fpli.OutboundQuota -
> > fpli.ReadDataAvailable definitely seemed wrong to me.  So when I wrote
> > peek_socket_unix on that branch, I used fpli.WriteQuotaAvailable, as Takashi
> > is suggesting now.
> 
> If that's working reliably these days (keeping fingers crossed for W7),
> it's ok if we use that.  We may want to check if the above observation
> in terms of WriteQuotaAvailable on a pipe with pending read is still an
> issue.

Ok, I wrote a small testcase.  It creates a named pipe, reads from the
pipe, then, later, writes to the pipe.  Interlaced with these calls, it
calls NtQueryInformationFile(FilePipeLocalInformation) on the write side
of the pipe.  Kind of like this:

  CreatePipe
  NtQueryInformationFile
  ReadFile
  NtQueryInformationFile
  WriteFile
  NtQueryInformationFile

Here's the result:

Before ReadFile:

  InboundQuota: 65536
  ReadDataAvailable: 0
  OutboundQuota: 65536
  WriteQuotaAvailable: 65536

While ReadFile is running:

  InboundQuota: 65536
  ReadDataAvailable: 0
  OutboundQuota: 65536
  WriteQuotaAvailable: 65494	!!!

After WriteFile and ReadFile succeeded:

  InboundQuota: 65536
  ReadDataAvailable: 0
  OutboundQuota: 65536
  WriteQuotaAvailable: 65536

That means, while a reader on the read side is waiting for data, the
WriteQuotaAvailable on the write side is decremented by the amount of
data requested by the reader (42 bytes in my case), just as outlined in that
mail from 2004.  And this is on W10 now.

What to do with this information?  TBD.

Side note:  My testcase is starting a second thread to call ReadFile.
For that reason I was using synchronous IO on the pipe since, well,
never mind if that thread is blocked in ReadFile, right?  Nothing keeps
us from calling NtQueryInformationFile on the write side of the pipe,
right?

Wrong.  While the second thread was blocked in ReadFile, the call to
NtQueryInformationFile was blocking, too :-P

I had to convert the read side of the pipe to asynchronous mode to be
able to call NtQueryInformationFile(FilePipeLocalInformation) on the
write side of the pipe, while the read side is performing a ReadFile
operation.


Corinna

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-08-30 15:53                                   ` Corinna Vinschen
@ 2021-08-30 17:00                                     ` Corinna Vinschen
  2021-08-30 17:11                                       ` Corinna Vinschen
                                                         ` (2 more replies)
  0 siblings, 3 replies; 250+ messages in thread
From: Corinna Vinschen @ 2021-08-30 17:00 UTC (permalink / raw)
  To: cygwin-developers

On Aug 30 17:53, Corinna Vinschen wrote:
> On Aug 30 16:05, Corinna Vinschen wrote:
> > On Aug 30 09:36, Ken Brown wrote:
> > > BTW, when I was working on the pipe approach to AF_UNIX sockets
> > > (topic/af_unix branch), I had occasion to step through
> > > select.cc:pipe_data_available in gdb, and the use of fpli.OutboundQuota -
> > > fpli.ReadDataAvailable definitely seemed wrong to me.  So when I wrote
> > > peek_socket_unix on that branch, I used fpli.WriteQuotaAvailable, as Takashi
> > > is suggesting now.
> > 
> > If that's working reliably these days (keeping fingers crossed for W7),
> > it's ok if we use that.  We may want to check if the above observation
> > in terms of WriteQuotaAvailable on a pipe with pending read is still an
> > issue.
> 
> Ok, I wrote a small testcase.  It creates a named pipe, reads from the
> pipe, then, later, writes to the pipe.  Interlaced with these calls, it
> calls NtQueryInformationFile(FilePipeLocalInformation) on the write side
> of the pipe.  Kind of like this:
> 
>   CreatePipe
>   NtQueryInformationFile
>   ReadFile
>   NtQueryInformationFile
>   WriteFile
>   NtQueryInformationFile
> 
> Here's the result:
> 
> Before ReadFile:
> 
>   InboundQuota: 65536
>   ReadDataAvailable: 0
>   OutboundQuota: 65536
>   WriteQuotaAvailable: 65536
> 
> While ReadFile is running:
> 
>   InboundQuota: 65536
>   ReadDataAvailable: 0
>   OutboundQuota: 65536
>   WriteQuotaAvailable: 65494	!!!
> 
> After WriteFile and ReadFile succeeded:
> 
>   InboundQuota: 65536
>   ReadDataAvailable: 0
>   OutboundQuota: 65536
>   WriteQuotaAvailable: 65536
> 
> That means, while a reader on the read side is waiting for data, the
> WriteQuotaAvailable on the write side is decremented by the amount of
> data requested by the reader (42 bytes in my case), just as outlined in that
> mail from 2004.  And this is on W10 now.
> 
> What to do with this information?  TBD.

Ok, let's discuss this.  I added more code to my testcase and here's
what I see.  I dropped all data from the output which doesn't change.

What I'm trying to get a grip on are the dependencies here.

After creating the pipe:

  read side: ReadDataAvailable: 0
  write side: WriteQuotaAvailable: 65536

After writing 20 bytes...

  read side: ReadDataAvailable: 20
  write side: WriteQuotaAvailable: 65516

After writing 40 more bytes...

  read side: ReadDataAvailable: 60
  write side: WriteQuotaAvailable: 65476

After reading 42 bytes...

  read side: ReadDataAvailable: 18
  write side: WriteQuotaAvailable: 65518

After writing 20 bytes...

  read side: ReadDataAvailable: 38
  write side: WriteQuotaAvailable: 65498

*While* reading 42 bytes with an empty buffer...

  read side: ReadDataAvailable: 0
  write side: WriteQuotaAvailable: 65494

Another important fun fact:  Assume the read and write buffer sizes
are specified differently.  I called CreateNamedPipe with an output
buffer size of 32K and an input buffer size of 64K:

After creating the pipe:

  read side:
    InboundQuota: 65536
    ReadDataAvailable: 0
    OutboundQuota: 32768
    WriteQuotaAvailable: 32768
  write side:
    InboundQuota: 65536
    ReadDataAvailable: 0
    OutboundQuota: 32768
    WriteQuotaAvailable: 65536	!!!

This last data point shows that:

- InboundQuota and OutboundQuota are always constant values and
  do not depend on the side the information has been queried on.
  That certainly makes sense.

- WriteQuotaAvailable does not depend on the OutboundQuota, but on
  the InboundQuota, and very likely on the InboundQuota of the read
  side.  The OutboundQuota *probably* only makes sense when using
  named pipes with remote clients, which we never do anyway.
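
To put numbers on that last data point, here is a small sketch that
applies the expression removed by the select.cc patch quoted earlier
in the thread (OutboundQuota - ReadDataAvailable) to the write-side
values printed above.  The struct is a stand-in with the same four
field names as the output, not the real NT structure, and the function
name is made up.

```cpp
#include <cstdint>

// Stand-in for the four counters printed above (not the NT struct).
struct PipeLocalInfo
{
  std::uint32_t InboundQuota;
  std::uint32_t ReadDataAvailable;
  std::uint32_t OutboundQuota;
  std::uint32_t WriteQuotaAvailable;
};

// The expression that the select.cc patch deletes.
constexpr std::uint32_t legacy_write_estimate (PipeLocalInfo info)
{
  return info.OutboundQuota - info.ReadDataAvailable;
}

// Write-side values reported right after creating the 32K/64K pipe.
constexpr PipeLocalInfo write_side_after_create = { 65536, 0, 32768, 65536 };
```

For these values legacy_write_estimate (write_side_after_create) is
32768, while the write side itself reported a WriteQuotaAvailable of
65536, so the two only agree when both buffer sizes are equal.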

The preceding output shows that ReadDataAvailable on the read side and
WriteQuotaAvailable on the write side are connected.  If we write 20
bytes, ReadDataAvailable is incremented by 20 and WriteQuotaAvailable is
decremented by 20.

So: write.WriteQuotaAvailable == InboundQuota - read.ReadDataAvailable.

Except when a ReadFile is pending on the read side.  It's as if the
running ReadFile already reserved write quota.  So the write side
WriteQuotaAvailable is the number of bytes we can write without blocking,
after all pending ReadFiles have been satisfied.

Unfortunately that doesn't really make sense when looked at from
user space.

What that means in the first place is that WriteQuotaAvailable on the
write side is unreliable.  What we really need is InboundQuota -
read.ReadDataAvailable.  The problem with that is that the write side
usually has no access to the read side of the pipe.

Long story short, I have no idea how to fix that ATM.
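
The bookkeeping observed above can be written down as a toy model.  This
is only a sketch: the struct and function names are made up for
illustration, it models the numbers from the test output rather than the
Windows implementation, and clamping the reported value at 0 is an
assumption, since the output above only shows values that stay positive.

```cpp
#include <cassert>
#include <cstdint>

// Toy model of the accounting seen in the test output.  All names here are
// hypothetical; this is not Windows or Cygwin code.
struct pipe_model
{
  uint32_t inbound_quota;    // InboundQuota, fixed when the pipe is created
  uint32_t read_data_avail;  // read.ReadDataAvailable: bytes in the buffer
  uint32_t pending_read;     // bytes requested by ReadFile calls still waiting
};

// What the write side appears to report as WriteQuotaAvailable: buffered
// data *and* pending reads are both subtracted.  Clamping at 0 is assumed.
uint32_t
reported_write_quota (const pipe_model &p)
{
  uint64_t reserved = (uint64_t) p.read_data_avail + p.pending_read;
  if (reserved >= p.inbound_quota)
    return 0;
  return p.inbound_quota - (uint32_t) reserved;
}

// What a writer actually wants to know: free space in the buffer,
// i.e. InboundQuota - read.ReadDataAvailable.
uint32_t
free_buffer_space (const pipe_model &p)
{
  assert (p.read_data_avail <= p.inbound_quota);
  return p.inbound_quota - p.read_data_avail;
}
```

With a 65536 byte pipe, an empty buffer and a pending 42 byte read,
reported_write_quota yields 65494 while free_buffer_space yields 65536,
which is the discrepancy shown in the output above.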


Corinna


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-08-30 17:00                                     ` Corinna Vinschen
@ 2021-08-30 17:11                                       ` Corinna Vinschen
  2021-08-30 18:59                                       ` Ken Brown
  2021-08-30 20:14                                       ` Corinna Vinschen
  2 siblings, 0 replies; 250+ messages in thread
From: Corinna Vinschen @ 2021-08-30 17:11 UTC (permalink / raw)
  To: cygwin-developers

On Aug 30 19:00, Corinna Vinschen wrote:
> - WriteQuotaAvailable does not depend on the OutboundQuota, but on
>   the InboundQuota, and very likely on the InboundQuota of the read
>   side.  The OutboundQuota *probably* only makes sense when using
>   named pipes with remote clients, which we never do anyway.

D'oh.  It makes sense for duplex pipes, of course.  I always forget them.


Corinna


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-08-30 17:00                                     ` Corinna Vinschen
  2021-08-30 17:11                                       ` Corinna Vinschen
@ 2021-08-30 18:59                                       ` Ken Brown
  2021-08-30 19:12                                         ` Ken Brown
  2021-08-30 20:21                                         ` Corinna Vinschen
  2021-08-30 20:14                                       ` Corinna Vinschen
  2 siblings, 2 replies; 250+ messages in thread
From: Ken Brown @ 2021-08-30 18:59 UTC (permalink / raw)
  To: cygwin-developers

On 8/30/2021 1:00 PM, Corinna Vinschen wrote:
> On Aug 30 17:53, Corinna Vinschen wrote:
>> On Aug 30 16:05, Corinna Vinschen wrote:
>>> On Aug 30 09:36, Ken Brown wrote:
>>>> BTW, when I was working on the pipe approach to AF_UNIX sockets
>>>> (topic/af_unix branch), I had occasion to step through
>>>> select.cc:pipe_data_available in gdb, and the use of fpli.OutboundQuota -
>>>> fpli.ReadDataAvailable definitely seemed wrong to me.  So when I wrote
>>>> peek_socket_unix on that branch, I used fpli.WriteQuotaAvailable, as Takashi
>>>> is suggesting now.
>>>
>>> If that's working reliably these days (keeping fingers crossed for W7),
>>> it's ok if we use that.  We may want to check if the above observation
>>> in terms of WriteQuotaAvailable on a pipe with a pending read is still an
>>> issue.
>>
>> Ok, I wrote a small testcase.  It creates a named pipe, reads from the
>> pipe, then, later, writes to the pipe.  Interlaced with these calls, it
>> calls NtQueryInformationFile(FilePipeLocalInformation) on the write side
>> of the pipe.  Kind of like this:
>>
>>    CreatePipe
>>    NtQueryInformationFile
>>    ReadFile
>>    NtQueryInformationFile
>>    WriteFile
>>    NtQueryInformationFile
>>
>> Here's the result:
>>
>> Before ReadFile:
>>
>>    InboundQuota: 65536
>>    ReadDataAvailable: 0
>>    OutboundQuota: 65536
>>    WriteQuotaAvailable: 65536
>>
>> While ReadFile is running:
>>
>>    InboundQuota: 65536
>>    ReadDataAvailable: 0
>>    OutboundQuota: 65536
>>    WriteQuotaAvailable: 65494	!!!
>>
>> After WriteFile and ReadFile succeeded:
>>
>>    InboundQuota: 65536
>>    ReadDataAvailable: 0
>>    OutboundQuota: 65536
>>    WriteQuotaAvailable: 65536
>>
>> That means, while a reader on the reader side is waiting for data, the
>> WriteQuotaAvailable on the write side is decremented by the amount of
>> data requested by the reader (42 bytes in my case), just as outlined in that
>> mail from 2004.  And this is on W10 now.
>>
>> What to do with this information?  TBD.
> 
> Ok, let's discuss this.  I added more code to my testcase and here's
> what I see.  I dropped all data from the output which doesn't change.
> 
> What I'm trying to get a grip on are the dependencies here.
> 
> After creating the pipe:
> 
>    read side: ReadDataAvailable: 0
>    write side: WriteQuotaAvailable: 65536
> 
> After writing 20 bytes...
> 
>    read side: ReadDataAvailable: 20
>    write side: WriteQuotaAvailable: 65516
> 
> After writing 40 more bytes...
> 
>    read side: ReadDataAvailable: 60
>    write side: WriteQuotaAvailable: 65476
> 
> After reading 42 bytes...
> 
>    read side: ReadDataAvailable: 18
>    write side: WriteQuotaAvailable: 65518
> 
> After writing 20 bytes...
> 
>    read side: ReadDataAvailable: 38
>    write side: WriteQuotaAvailable: 65498
> 
> *While* reading 42 bytes with an empty buffer...
> 
>    read side: ReadDataAvailable: 0
>    write side: WriteQuotaAvailable: 65494
> 
> Another important fun fact:  Assume the read and write buffer sizes
> are specified differently.  I called CreateNamedPipe with an outbuffer
> size of 32K and an inbuffer size of 64K:
> 
> After creating the pipe:
> 
>    read side:
>      InboundQuota: 65536
>      ReadDataAvailable: 0
>      OutboundQuota: 32768
>      WriteQuotaAvailable: 32768
>    write side:
>      InboundQuota: 65536
>      ReadDataAvailable: 0
>      OutboundQuota: 32768
>      WriteQuotaAvailable: 65536	!!!
> 
> This last data point shows that:
> 
> - InboundQuota and OutboundQuota are always constant values and
>    do not depend on the side the information has been queried on.
>    That certainly makes sense.
> 
> - WriteQuotaAvailable does not depend on the OutboundQuota, but on
>    the InboundQuota, and very likely on the InboundQuota of the read
>    side.  The OutboundQuota *probably* only makes sense when using
>    named pipes with remote clients, which we never do anyway.
> 
> The preceding output shows that ReadDataAvailable on the read side and
> WriteQuotaAvailable on the write side are connected.  If we write 20
> bytes, ReadDataAvailable is incremented by 20 and WriteQuotaAvailable is
> decremented by 20.
> 
> So: write.WriteQuotaAvailable == InboundQuota - read.ReadDataAvailable.
> 
> Except when a ReadFile is pending on the read side.  It's as if the
> running ReadFile already reserved write quota.  So the write side
> WriteQuotaAvailable is the number of bytes we can write without blocking,
> after all pending ReadFiles have been satisfied.
> 
> Unfortunately that doesn't really make sense when looked at from
> user space.
> 
> What that means in the first place is that WriteQuotaAvailable on the
> write side is unreliable.  What we really need is InboundQuota -
> read.ReadDataAvailable.  The problem with that is that the write side
> usually has no access to the read side of the pipe.

For the purposes of select.cc:pipe_data_available, we only need to know whether 
InboundQuota - read.ReadDataAvailable is positive.  If WriteQuotaAvailable is 
positive, then we're OK, even though its precise value might be too small.  But 
if WriteQuotaAvailable == 0, we don't know whether the buffer is actually full 
or there's a pending ReadFile.  It's only in this case that we need access to 
the read side.

What if we reverse the roles of the read and write sides of the pipe, so that 
the write side is the server and the read side is the client?  We can then try 
to use ImpersonateNamedPipeClient to get information about the read side when 
WriteQuotaAvailable == 0.  If we succeed, then we can determine whether or not 
there's space in the buffer.  If we fail, we simply report, possibly 
incorrectly, that there is space.  This is no worse than the current situation 
(on the master branch), in which we use OutboundQuota - ReadDataAvailable, which 
is always positive.
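
The decision described above can be sketched as a small helper.  This is
a hypothetical illustration, not code from select.cc: the function name
and the nullable read-side argument are assumptions, standing in for
whatever mechanism would provide read.ReadDataAvailable.

```cpp
#include <cstdint>

// Hypothetical helper, not Cygwin code: should select() report the write end
// of a pipe as writable?  read_side_avail points to read.ReadDataAvailable if
// the read side could be queried, or is nullptr if it could not.
bool
pipe_writable (uint32_t write_quota_avail, uint32_t inbound_quota,
               const uint32_t *read_side_avail)
{
  // A positive value may be too small, but it still implies free space.
  if (write_quota_avail > 0)
    return true;
  // Ambiguous case: the buffer is really full, or pending reads have used up
  // the reported quota.  Only read-side information can tell them apart.
  if (read_side_avail)
    return inbound_quota > *read_side_avail;
  // No read-side information: report space, possibly incorrectly.
  return true;
}
```

The read side only has to be consulted in the WriteQuotaAvailable == 0
case, so the common path needs nothing beyond the write-side query.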

Ken


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-08-30 18:59                                       ` Ken Brown
@ 2021-08-30 19:12                                         ` Ken Brown
  2021-08-30 20:21                                         ` Corinna Vinschen
  1 sibling, 0 replies; 250+ messages in thread
From: Ken Brown @ 2021-08-30 19:12 UTC (permalink / raw)
  To: cygwin-developers

On 8/30/2021 2:59 PM, Ken Brown wrote:
> [...]
> 
> What if we reverse the roles of the read and write sides of the pipe, so that 
> the write side is the server and the read side is the client.  We can then try 
> to use ImpersonateNamedPipeClient to get information about the read side when 
> WriteQuotaAvailable == 0.  If we succeed, then we can determine whether or not 
> there's space in the buffer.  If we fail, we simply report, possibly 
> incorrectly, that there is space.  This is no worse than the current situation 
> (on the master branch), in which we use OutboundQuota - ReadDataAvailable, which 
> is always positive.

I should add that I know absolutely nothing about ImpersonateNamedPipeClient, so 
this might be complete nonsense.

Ken


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-08-30 17:00                                     ` Corinna Vinschen
  2021-08-30 17:11                                       ` Corinna Vinschen
  2021-08-30 18:59                                       ` Ken Brown
@ 2021-08-30 20:14                                       ` Corinna Vinschen
  2021-08-30 20:47                                         ` Ken Brown
  2021-08-31  8:55                                         ` Takashi Yano
  2 siblings, 2 replies; 250+ messages in thread
From: Corinna Vinschen @ 2021-08-30 20:14 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 2269 bytes --]

Hi Ken, Hi Takashi,

On Aug 30 19:00, Corinna Vinschen wrote:
> Ok, let's discuss this.  I added more code to my testcase and here's
> what I see.  I dropped all data from the output which doesn't change.
> [...]
> - InboundQuota and OutboundQuota are always constant values and
>   do not depend on the side the information has been queried on.
>   That certainly makes sense.
> 
> - WriteQuotaAvailable does not depend on the OutboundQuota, but on
>   the InboundQuota, and very likely on the InboundQuota of the read
>   side.  The OutboundQuota *probably* only makes sense when using
>   named pipes with remote clients, which we never do anyway.
> 
> The preceding output shows that ReadDataAvailable on the read side and
> WriteQuotaAvailable on the write side are connected.  If we write 20
> bytes, ReadDataAvailable is incremented by 20 and WriteQuotaAvailable is
> decremented by 20.
> 
> So: write.WriteQuotaAvailable == InboundQuota - read.ReadDataAvailable.
> 
> Except when a ReadFile is pending on the read side.  It's as if the
> running ReadFile already reserved write quota.  So the write side
> WriteQuotaAvailable is the number of bytes we can write without blocking,
> after all pending ReadFiles have been satisfied.
> 
> Unfortunately that doesn't really make sense when looked at from
> user space.
> 
> What that means in the first place is that WriteQuotaAvailable on the
> write side is unreliable.  What we really need is InboundQuota -
> read.ReadDataAvailable.  The problem with that is that the write side
> usually has no access to the read side of the pipe.
> 
> Long story short, I have no idea how to fix that ATM.

Well, what about keeping a duplicate of the read side handle on the 
write side just for calling NtQueryInformationFile?

Attached is an untested patch, can you have a look if that makes sense?

Btw., I think I found a bug in the new fhandler_pipe::create.  If the
function fails to create the write side fhandler, it deletes the read
side fhandler, but neglects to close the read handle.  My patch fixes
that.

While looking into this I found a problem in fhandler_disk_file in
terms of handle inheritance of the special handle for pread/pwrite.
I already force pushed this onto topic/pipe.


Thanks,
Corinna

[-- Attachment #2: pipe.diff --]
[-- Type: text/plain, Size: 5407 bytes --]

diff --git a/winsup/cygwin/fhandler.h b/winsup/cygwin/fhandler.h
index 132e6002133b..a2de4301521b 100644
--- a/winsup/cygwin/fhandler.h
+++ b/winsup/cygwin/fhandler.h
@@ -462,6 +462,7 @@ public:
   virtual HANDLE& get_output_handle_nat () { return io_handle; }
   virtual HANDLE get_stat_handle () { return pc.handle () ?: io_handle; }
   virtual HANDLE get_echo_handle () const { return NULL; }
+  virtual HANDLE get_query_handle () const { return NULL; }
   virtual bool hit_eof () {return false;}
   virtual select_record *select_read (select_stuff *);
   virtual select_record *select_write (select_stuff *);
@@ -1171,6 +1172,7 @@ class fhandler_socket_unix : public fhandler_socket
 class fhandler_pipe: public fhandler_base
 {
 private:
+  HANDLE query_hdl;
   pid_t popen_pid;
   size_t max_atomic_write;
   void set_pipe_non_blocking (bool nonblocking);
@@ -1179,6 +1181,12 @@ public:
 
   bool ispipe() const { return true; }
 
+  HANDLE get_query_handle ()
+  {
+    return (get_device () == FH_PIPEW) ? query_hdl : get_handle ();
+  }
+  void set_query_handle (HANDLE qh) { query_hdl = qh; }
+
   void set_popen_pid (pid_t pid) {popen_pid = pid;}
   pid_t get_popen_pid () const {return popen_pid;}
   off_t lseek (off_t offset, int whence);
@@ -1187,7 +1195,9 @@ public:
   select_record *select_except (select_stuff *);
   char *get_proc_fd_name (char *buf);
   int open (int flags, mode_t mode = 0);
+  void fixup_after_fork (HANDLE);
   int dup (fhandler_base *child, int);
+  int close ();
   void __reg3 raw_read (void *ptr, size_t& len);
   ssize_t __reg3 raw_write (const void *ptr, size_t len);
   int ioctl (unsigned int cmd, void *);
diff --git a/winsup/cygwin/fhandler_pipe.cc b/winsup/cygwin/fhandler_pipe.cc
index b99f00c099f8..f698f9063207 100644
--- a/winsup/cygwin/fhandler_pipe.cc
+++ b/winsup/cygwin/fhandler_pipe.cc
@@ -405,22 +405,45 @@ fhandler_pipe::raw_write (const void *ptr, size_t len)
   return ret;
 }
 
+void
+fhandler_pipe::fixup_after_fork (HANDLE parent)
+{
+  if (query_hdl)
+    fork_fixup (parent, query_hdl, "query_hdl");
+  fhandler_base::fixup_after_fork (parent);
+}
+
 int
 fhandler_pipe::dup (fhandler_base *child, int flags)
 {
   fhandler_pipe *ftp = (fhandler_pipe *) child;
   ftp->set_popen_pid (0);
 
-  int res;
-  if (get_handle () && fhandler_base::dup (child, flags))
+  int res = 0;
+  if (fhandler_base::dup (child, flags))
     res = -1;
-  else
-    res = 0;
+  else if (query_hdl &&
+	   !DuplicateHandle (GetCurrentProcess (), query_hdl,
+			     GetCurrentProcess (), &ftp->query_hdl,
+			     0, !(flags & O_CLOEXEC), DUPLICATE_SAME_ACCESS))
+    {
+      __seterrno ();
+      ftp->close ();
+      res = -1;
+    }
 
   debug_printf ("res %d", res);
   return res;
 }
 
+int
+fhandler_pipe::close ()
+{
+  if (query_hdl)
+    NtClose (query_hdl);
+  return fhandler_base::close ();
+}
+
 #define PIPE_INTRO "\\\\.\\pipe\\cygwin-"
 
 /* Create a pipe, and return handles to the read and write ends,
@@ -608,6 +631,7 @@ fhandler_pipe::create (fhandler_pipe *fhs[2], unsigned psize, int mode)
   else if ((fhs[1] = (fhandler_pipe *) build_fh_dev (*pipew_dev)) == NULL)
     {
       delete fhs[0];
+      CloseHandle (r);
       CloseHandle (w);
     }
   else
@@ -617,7 +641,17 @@ fhandler_pipe::create (fhandler_pipe *fhs[2], unsigned psize, int mode)
 		    unique_id);
       fhs[1]->init (w, FILE_CREATE_PIPE_INSTANCE | GENERIC_WRITE, mode,
 		    unique_id);
-      res = 0;
+      if (!DuplicateHandle (GetCurrentProcess (), r, GetCurrentProcess (),
+			    &fhs[1]->query_hdl, FILE_READ_ATTRIBUTES,
+			    !(mode & O_CLOEXEC), 0))
+	{
+	  delete fhs[0];
+	  CloseHandle (r);
+	  delete fhs[1];
+	  CloseHandle (w);
+	}
+      else
+	res = 0;
     }
 
   debug_printf ("%R = pipe([%p, %p], %d, %y)", res, fhs[0], fhs[1], psize, mode);
diff --git a/winsup/cygwin/select.cc b/winsup/cygwin/select.cc
index 83e1c00e0ac7..dc0563a45729 100644
--- a/winsup/cygwin/select.cc
+++ b/winsup/cygwin/select.cc
@@ -608,11 +608,14 @@ pipe_data_available (int fd, fhandler_base *fh, HANDLE h, bool writing)
     }
   if (writing)
     {
-	/* If there is anything available in the pipe buffer then signal
-	   that.  This means that a pipe could still block since you could
-	   be trying to write more to the pipe than is available in the
-	   buffer but that is the hazard of select().  */
-      fpli.WriteQuotaAvailable = fpli.OutboundQuota - fpli.ReadDataAvailable;
+      /* If there is anything available in the pipe buffer then signal
+	 that.  This means that a pipe could still block since you could
+	 be trying to write more to the pipe than is available in the
+	 buffer but that is the hazard of select().
+	 Note that WriteQuotaAvailable is unreliable.  The only reliable
+	 information is available on the read side, which is why we fetch
+	 the info from the read side via the pipe-specific query handle. */
+      fpli.WriteQuotaAvailable = fpli.InboundQuota - fpli.ReadDataAvailable;
       if (fpli.WriteQuotaAvailable > 0)
 	{
 	  paranoid_printf ("fd %d, %s, write: size %u, avail %u", fd,
@@ -718,7 +721,7 @@ out:
       fhandler_pty_master *fhm = (fhandler_pty_master *) fh;
       fhm->set_mask_flusho (s->read_ready);
     }
-  h = fh->get_output_handle ();
+  h = fh->get_query_handle ();
   if (s->write_selected && dev != FH_PIPER)
     {
       gotone += s->write_ready =  pipe_data_available (s->fd, fh, h, true);


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-08-30 18:59                                       ` Ken Brown
  2021-08-30 19:12                                         ` Ken Brown
@ 2021-08-30 20:21                                         ` Corinna Vinschen
  1 sibling, 0 replies; 250+ messages in thread
From: Corinna Vinschen @ 2021-08-30 20:21 UTC (permalink / raw)
  To: cygwin-developers

On Aug 30 14:59, Ken Brown wrote:
> On 8/30/2021 1:00 PM, Corinna Vinschen wrote:
> > [...]
> > So: write.WriteQuotaAvailable == InboundQuota - read.ReadDataAvailable.
> > 
> > Except when a ReadFile is pending on the read side.  It's as if the
> > running ReadFile already reserved write quota.  So the write side
> > WriteQuotaAvailable is the number of bytes we can write without blocking,
> > after all pending ReadFiles have been satisfied.
> > 
> > Unfortunately that doesn't really make sense when looked at from
> > user space.
> > 
> > What that means in the first place is that WriteQuotaAvailable on the
> > write side is unreliable.  What we really need is InboundQuota -
> > read.ReadDataAvailable.  The problem with that is that the write side
> > usually has no access to the read side of the pipe.
> 
> For the purposes of select.cc:pipe_data_available, we only need to know
> whether InboundQuota - read.ReadDataAvailable is positive.  If
> WriteQuotaAvailable is positive, then we're OK, even though its precise
> value might be too small.  But if WriteQuotaAvailable == 0, we don't know
> whether the buffer is actually full or there's a pending ReadFile.  It's
> only in this case that we need access to the read side.

Any ReadFile with buffer size >= pipe buffer size will trigger that
immediately.

> What if we reverse the roles of the read and write sides of the pipe, so
> that the write side is the server and the read side is the client.  We can
> then try to use ImpersonateNamedPipeClient to get information about the read
> side when WriteQuotaAvailable == 0.

It's not a problem of impersonation, it's the problem of not having
access to the read side, because the write side just doesn't know
what handle in which process constitutes the read side of the pipe.
For all the write side knows, it could be some other fhandler in the
same process, or some far away HANDLE in some far away process in
the same session, sometimes not even a Cygwin process.


Corinna


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-08-30 20:14                                       ` Corinna Vinschen
@ 2021-08-30 20:47                                         ` Ken Brown
  2021-08-31  8:55                                         ` Takashi Yano
  1 sibling, 0 replies; 250+ messages in thread
From: Ken Brown @ 2021-08-30 20:47 UTC (permalink / raw)
  To: cygwin-developers

On 8/30/2021 4:14 PM, Corinna Vinschen wrote:
> Hi Ken, Hi Takashi,
> 
> On Aug 30 19:00, Corinna Vinschen wrote:
>> Ok, let's discuss this.  I added more code to my testcase and here's
>> what I see.  I dropped all data from the output which doesn't change.
>> [...]
>> - InboundQuota and OutboundQuota are always constant values and
>>    do not depend on the side the information has been queried on.
>>    That certainly makes sense.
>>
>> - WriteQuotaAvailable does not depend on the OutboundQuota, but on
>>    the InboundQuota, and very likely on the InboundQuota of the read
>>    side.  The OutboundQuota *probably* only makes sense when using
>>    named pipes with remote clients, which we never do anyway.
>>
>> The preceding output shows that ReadDataAvailable on the read side and
>> WriteQuotaAvailable on the write side are connected.  If we write 20
>> bytes, ReadDataAvailable is incremented by 20 and WriteQuotaAvailable is
>> decremented by 20.
>>
>> So: write.WriteQuotaAvailable == InboundQuota - read.ReadDataAvailable.
>>
>> Except when a ReadFile is pending on the read side.  It's as if the
>> running ReadFile already reserved write quota.  So the write side
>> WriteQuotaAvailable is the number of bytes we can write without blocking,
>> after all pending ReadFiles have been satisfied.
>>
>> Unfortunately that doesn't really make sense when looked at from
>> user space.
>>
>> What that means in the first place is that WriteQuotaAvailable on the
>> write side is unreliable.  What we really need is InboundQuota -
>> read.ReadDataAvailable.  The problem with that is that the write side
>> usually has no access to the read side of the pipe.
>>
>> Long story short, I have no idea how to fix that ATM.
> 
> Well, what about keeping a duplicate of the read side handle on the
> write side just for calling NtQueryInformationFile?
> 
> Attached is an untested patch, can you have a look if that makes sense?

I probably won't get a chance to test this until tomorrow, but at a glance it 
looks great!!

Ken


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-08-30 13:31                                       ` Corinna Vinschen
@ 2021-08-31  8:50                                         ` Takashi Yano
  0 siblings, 0 replies; 250+ messages in thread
From: Takashi Yano @ 2021-08-31  8:50 UTC (permalink / raw)
  To: cygwin-developers

On Mon, 30 Aug 2021 15:31:23 +0200
Corinna Vinschen wrote:
> Hi Takashi,
> 
> On Aug 30 14:55, Corinna Vinschen wrote:
> > On Aug 30 21:04, Takashi Yano wrote:
> > > On Mon, 30 Aug 2021 12:20:30 +0200
> > > Corinna Vinschen wrote:
> > > > [Move discussion to cygwin-developers]
> > > > 
> > > > On Aug 30 17:02, Takashi Yano via Cygwin wrote:
> > > > > [...]
> > > > > Is naming the pipe really necessary?
> > > > 
> > > > It's not, but CreatePipe is doing this anyway.
> > > > 
> > > > "Anonymous pipes are implemented using a named pipe with a unique name."
> > > > https://docs.microsoft.com/en-us/windows/win32/api/namedpipeapi/nf-namedpipeapi-createpipe
> > > > 
> > > > The reason CreateNamedPipe was used in the first place was that
> > > > FILE_READ_ATTRIBUTES isn't set by CreatePipe for the write side
> > > > of the pipe, however, it creates full duplex pipe:
> > > > 
> > > > https://cygwin.com/pipermail/cygwin-patches/2004q3/004912.html
> > > > 
> > > > Given the fact that CreatePipe is implemented in terms of
> > > > NtCreateNamedPipeFile anyway, why should the pipe created with
> > > > NtCreateNamedPipeFile fail where the pipe created with CreatePipe works?
> > > > 
> > > > The only reason can be some missing flag, I think.  Checking
> > > > fhandler_pipe.cc::nt_create and comparing that with the default flags
> > > > for files and other devices, it occurs to me that the SYNCHRONIZE stuff
> > > > is missing.  So, Takashi, what if you call NtCreateNamedPipeFile like
> > > > this in nt_create:
> > > > 
> > > >   status = NtCreateNamedPipeFile (r, access | SYNCHRONIZE, &attr, &io,
> > > > 				  FILE_SHARE_READ | FILE_SHARE_WRITE,
> > > > 				  FILE_CREATE, FILE_SYNCHRONOUS_IO_NONALERT,
> > > > 				  pipe_type, FILE_PIPE_BYTE_STREAM_MODE,
> > > > 				  0, 1, psize, psize, &timeout);
> > > > 
> > > > Does that fix the above problems, too?
> > > 
> > > Yes it does! Now, if CYGWIN=pipe_byte is also set, the piping issue
> > > of C# program is gone!
> 
> I don't quite understand this one.  Is that C# example using the write
> side of the pipe?  If it reads from the pipe, this behaviour would be
> pretty puzzling, given the read mode is always BYTE.

Both sides.  The writer and the reader are both C# programs.  However, even
if only the reader side is a C# program, the problem occurs when the reader
starts to read the pipe before the writer writes something to it; i.e.
something like:

(sleep 0.1; echo AAAAAAAA) | ./reader

> Either way, assuming we switch the write side to BYTE mode only, is
> the pty code robust enough to work with that?  The comment
> 
>   Note that the write side of the pipe is opened as PIPE_TYPE_MESSAGE.
>   This *seems* to more closely mimic Linux pipe behavior and is
>   definitely required for pty handling since fhandler_pty_master
>   writes to the pipe in chunks, terminated by newline when CANON mode
>   is specified. 
> 
> is old, so the problems the message mode was trying to solve for
> CANON mode may not apply anymore...

In the current topic/pipe head, the named pipe for the pty is configured
as PIPE_TYPE_MESSAGE even if CYGWIN=pipe_byte is set.  The pty needs
message mode for canonical reads.

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-08-30 15:19                                           ` Corinna Vinschen
  2021-08-30 15:43                                             ` Ken Brown
@ 2021-08-31  8:52                                             ` Takashi Yano
  2021-08-31  9:04                                               ` Corinna Vinschen
  1 sibling, 1 reply; 250+ messages in thread
From: Takashi Yano @ 2021-08-31  8:52 UTC (permalink / raw)
  To: cygwin-developers

On Mon, 30 Aug 2021 17:19:44 +0200
Corinna Vinschen wrote:
> On Aug 30 11:00, Ken Brown wrote:
> > On 8/30/2021 9:51 AM, Ken Brown wrote:
> > > On 8/30/2021 8:55 AM, Corinna Vinschen wrote:
> > > > On Aug 30 21:04, Takashi Yano wrote:
> > > > No worries.  The same should apply to the NtCreateFile side of the
> > > > pipe, btw.
> > > 
> > > I'll add my thanks.  I should have checked the default flags that are
> > > typically used for other devices when I wrote nt_create.  I'm glad you
> > > caught this.
> > > 
> > > So I'll reinstate the use of nt_create and then let Takashi recheck everything.
> > 
> > I've done this now.  I'm still not sure I've got all the flags right.  For
> > unknown reasons, I've used FILE_SHARE_READ | FILE_SHARE_WRITE in the call to
> > NtCreateNamedPipeFile, and no sharing in the call to NtOpenFile.  Should I
> > also use FILE_SHARE_READ | FILE_SHARE in NtOpenFile?  Is sharing even
> > relevant in this context?
> 
> This is only relevant if you want to open the pipe from another context,
> calling CreateNamedPipe/CreateFile.  As long as the pipe is only
> duplicated, it shouldn't matter at all.
> 
> But, as I just wrote in my previous mail, the FILE_SYNCHRONOUS_IO_NONALERT
> flag is probably a good thing for C# apps, but not for Cygwin, because it
> enforces synchronous operation.  Sorry about that...

With FILE_SYNCHRONOUS_IO_NONALERT, what kind of problems are you
specifically concerned about for Cygwin pipes?

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-08-30 20:14                                       ` Corinna Vinschen
  2021-08-30 20:47                                         ` Ken Brown
@ 2021-08-31  8:55                                         ` Takashi Yano
  2021-08-31  9:08                                           ` Corinna Vinschen
  1 sibling, 1 reply; 250+ messages in thread
From: Takashi Yano @ 2021-08-31  8:55 UTC (permalink / raw)
  To: cygwin-developers

On Mon, 30 Aug 2021 22:14:15 +0200
Corinna Vinschen wrote:
> Hi Ken, Hi Takashi,
> 
> On Aug 30 19:00, Corinna Vinschen wrote:
> > Ok, let's discuss this.  I added more code to my testcase and here's
> > what I see.  I dropped all data from the output which doesn't change.
> > [...]
> > - InboundQuota and OutboundQuota are always constant values and
> >   do not depend on the side the information has been queried on.
> >   That certainly makes sense.
> > 
> > - WriteQuotaAvailable does not depend on the OutboundQuota, but on
> >   the InboundQuota, and very likely on the InboundQuota of the read
> >   side.  The OutboundQuota *probably* only makes sense when using
> >   named pipes with remote clients, which we never do anyway.
> > 
> > The preceding output shows that ReadDataAvailable on the read side and
> > WriteQuotaAvailable on the write side are connected.  If we write 20
> > bytes, ReadDataAvailable is incremented by 20 and WriteQuotaAvailable is
> > decremented by 20.
> > 
> > So: write.WriteQuotaAvailable == InboundQuota - read.ReadDataAvailable.
> > 
> > Except when a ReadFile is pending on the read side.  It's as if the
> > running ReadFile already reserved write quota.  So the write side
> > WriteQuotaAvailable is the number of bytes we can write without blocking,
> > after all pending ReadFiles have been satisfied.
> > 
> > Unfortunately that doesn't really make sense when looked at it from the
> > user space.
> > 
> > What that means in the first place is that WriteQuotaAvailable on the
> > write side is unreliable.  What we really need is InboundQuota -
> > read.ReadDataAvailable.  The problem with that is that the write side
> > usually has no access to the read side of the pipe.
> > 
> > Long story short, I have no idea how to fix that ATM.
> 
> Well, what about keeping a duplicate of the read side handle on the 
> write side just for calling NtQueryInformationFile?
> 
> Attached is an untested patch, can you have a look if that makes sense?
> 
> Btw., I think I found a bug in the new fhandler_pipe::create.  If the
> function fails to create the write side fhandler, it deletes the read
> side fhandler, but neglects to close the read handle.  My patch fixes
> that.
> 
> While looking into this I found a problem in fhandler_disk_file in
> terms of handle inheritance of the special handle for pread/pwrite.
> I already force pushed this onto topic/pipe.

I tested the patch you attached. Unfortunately, select() does not work
as expected for the write side of the pipe. Even when select() reports
that the write pipe is ready, writing to the pipe fails. It seems that
your patch fails to detect a full pipe.
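
What "detecting a full pipe" means here can be sketched with a small
POSIX check. This is only an illustration of the expected select()
behaviour, not code from the patch, and the helper names are made up:
once a non-blocking write fails with EAGAIN, select() should stop
reporting the write end as ready until the reader drains the pipe.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/select.h>
#include <unistd.h>

/* Returns 1 if select() reports fd as writable right now, 0 if not,
   -1 on error.  A zero timeout makes select() poll instead of wait. */
int write_ready(int fd)
{
	fd_set wfds;
	struct timeval tv = {0, 0};
	int n;

	assert(fd >= 0);
	FD_ZERO(&wfds);
	FD_SET(fd, &wfds);
	n = select(fd + 1, NULL, &wfds, NULL, &tv);
	if (n < 0)
		return -1;
	return n > 0 && FD_ISSET(fd, &wfds);
}

/* Put fd into non-blocking mode.  Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
	int flags = fcntl(fd, F_GETFL);
	if (flags < 0)
		return -1;
	return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Write 4 KiB blocks until the pipe refuses more (EAGAIN).
   Returns the number of bytes written, or -1 on any other error. */
long fill_pipe(int wfd)
{
	char block[4096] = {0};
	long total = 0;

	for (;;) {
		ssize_t len = write(wfd, block, sizeof(block));
		if (len > 0)
			total += len;
		else if (len < 0 && errno == EAGAIN)
			return total;
		else
			return -1;
	}
}

/* Read until the pipe is empty (EAGAIN) or closed (EOF).
   Returns the number of bytes read, or -1 on any other error. */
long drain_pipe(int rfd)
{
	char block[4096];
	long total = 0;

	for (;;) {
		ssize_t len = read(rfd, block, sizeof(block));
		if (len > 0)
			total += len;
		else if (len == 0 || errno == EAGAIN)
			return total;
		else
			return -1;
	}
}
```

With both ends non-blocking, write_ready() is expected to return 1 on
an empty pipe, 0 right after fill_pipe(), and 1 again after
drain_pipe().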

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-08-31  8:52                                             ` Takashi Yano
@ 2021-08-31  9:04                                               ` Corinna Vinschen
  2021-08-31 11:05                                                 ` Takashi Yano
  0 siblings, 1 reply; 250+ messages in thread
From: Corinna Vinschen @ 2021-08-31  9:04 UTC (permalink / raw)
  To: cygwin-developers

On Aug 31 17:52, Takashi Yano wrote:
> On Mon, 30 Aug 2021 17:19:44 +0200
> Corinna Vinschen wrote:
> > On Aug 30 11:00, Ken Brown wrote:
> > > On 8/30/2021 9:51 AM, Ken Brown wrote:
> > > > On 8/30/2021 8:55 AM, Corinna Vinschen wrote:
> > > > > On Aug 30 21:04, Takashi Yano wrote:
> > > > > No worries.  The same should apply to the NtCreateFile side of the
> > > > > pipe, btw.
> > > > 
> > > > I'll add my thanks.  I should have checked the default flags that are
> > > > typically used for other devices when I wrote nt_create.  I'm glad you
> > > > caught this.
> > > > 
> > > > So I'll reinstate the use of nt_create and then let Takashi recheck everything.
> > > 
> > > I've done this now.  I'm still not sure I've got all the flags right.  For
> > > unknown reasons, I've used FILE_SHARE_READ | FILE_SHARE_WRITE in the call to
> > > NtCreateNamedPipeFile, and no sharing in the call to NtOpenFile.  Should I
> > > also use FILE_SHARE_READ | FILE_SHARE in NtOpenFile?  Is sharing even
> > > relevant in this context?
> > 
> > This is only relevant if you want to open the pipe from another context,
> > calling CreateNamedPipe/CreateFile.  As long as the pipe is only
> > duplicated, it shouldn't matter at all.
> > 
> > But, as I just wrote in my previous mail, the FILE_SYNCHRONOUS_IO_NONALERT
> > flag is probably a good thing for C# apps, but not for Cygwin, because it
> > enforces synchronous operation.  Sorry about that...
> 
> With FILE_SYNCHRONOUS_IO_NONALERT, what kind of problems are you
> specifically concerned about cygwin pipe? 

We're using asynchronous IO to be able to call WFMO and thus to be able
to handle signals and thread cancellation events.  With synchronous IO
this is not possible.
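
The general pattern, waiting on the I/O and on a wake-up event at the
same time, can be sketched with POSIX poll() as a rough analogy.  This
is not the Cygwin implementation and does not use WFMO; the function
name and the idea of an "event" descriptor standing in for the signal
event are assumptions made for illustration only.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: wait until either data_fd has data or event_fd
   becomes readable.  event_fd plays the role of a signal or
   cancellation event.  If the event fires, fail with EINTR instead of
   reading; otherwise perform the read.  A plain blocking read() on
   data_fd alone could not be woken up by event_fd. */
ssize_t interruptible_read(int data_fd, int event_fd, void *buf, size_t len)
{
	struct pollfd fds[2] = {
		{ .fd = event_fd, .events = POLLIN },
		{ .fd = data_fd,  .events = POLLIN },
	};

	assert(data_fd >= 0 && event_fd >= 0);
	if (poll(fds, 2, -1) < 0)
		return -1;		/* errno set by poll() */
	if (fds[0].revents & POLLIN) {
		errno = EINTR;		/* the event wins over pending data */
		return -1;
	}
	return read(data_fd, buf, len);
}
```

In this sketch the event takes priority: if both descriptors are
ready, the call returns -1 with errno set to EINTR and leaves the data
in the pipe.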


Corinna


^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-08-31  8:55                                         ` Takashi Yano
@ 2021-08-31  9:08                                           ` Corinna Vinschen
  2021-08-31  9:25                                             ` Takashi Yano
  0 siblings, 1 reply; 250+ messages in thread
From: Corinna Vinschen @ 2021-08-31  9:08 UTC (permalink / raw)
  To: cygwin-developers

On Aug 31 17:55, Takashi Yano wrote:
> On Mon, 30 Aug 2021 22:14:15 +0200
> Corinna Vinschen wrote:
> > Hi Ken, Hi Takashi,
> > 
> > On Aug 30 19:00, Corinna Vinschen wrote:
> > Well, what about keeping a duplicate of the read side handle on the 
> > write side just for calling NtQueryInformationFile?
> > 
> > Attached is an untested patch, can you have a look if that makes sense?
> > 
> > Btw., I think I found a bug in the new fhandler_pipe::create.  If the
> > function fails to create the write side fhandler, it deletes the read
> > side fhandler, but neglects to close the read handle.  My patch fixes
> > that.
> > 
> > While looking into this I found a problem in fhandler_disk_file in
> > terms of handle inheritance of the special handle for pread/pwrite.
> > I already force pushed this onto topic/pipe.
> 
> I tested your patch attached. Unfortunately, select() does not work
> as expected for write pipe. Even if the select reports write pipe
> is available, writing to pipe fails. It seems that your patch fails
> to detect pipe full.

Bummer.  Is that with byte mode pipes or with message mode pipes?  If
the latter, if you try to write more data than available in the buffer,
it's bound to fail.

Did you add debug output to pipe_data_available to see how the
information looks like?  Or do you have a simple, self-contained
testcase in plain C?


Thanks,
Corinna

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-08-31  9:08                                           ` Corinna Vinschen
@ 2021-08-31  9:25                                             ` Takashi Yano
  2021-08-31 10:05                                               ` Corinna Vinschen
  0 siblings, 1 reply; 250+ messages in thread
From: Takashi Yano @ 2021-08-31  9:25 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 1750 bytes --]

On Tue, 31 Aug 2021 11:08:42 +0200
Corinna Vinschen wrote:
> On Aug 31 17:55, Takashi Yano wrote:
> > On Mon, 30 Aug 2021 22:14:15 +0200
> > Corinna Vinschen wrote:
> > > Hi Ken, Hi Takashi,
> > > 
> > > On Aug 30 19:00, Corinna Vinschen wrote:
> > > Well, what about keeping a duplicate of the read side handle on the 
> > > write side just for calling NtQueryInformationFile?
> > > 
> > > Attached is an untested patch, can you have a look if that makes sense?
> > > 
> > > Btw., I think I found a bug in the new fhandler_pipe::create.  If the
> > > function fails to create the write side fhandler, it deletes the read
> > > side fhandler, but neglects to close the read handle.  My patch fixes
> > > that.
> > > 
> > > While looking into this I found a problem in fhandler_disk_file in
> > > terms of handle inheritance of the special handle for pread/pwrite.
> > > I already force pushed this onto topic/pipe.
> > 
> > I tested your patch attached. Unfortunately, select() does not work
> > as expected for write pipe. Even if the select reports write pipe
> > is available, writing to pipe fails. It seems that your patch fails
> > to detect pipe full.
> 
> Bummer.  Is that with byte mode pipes or with message mode pipes?  If
> the latter, if you try to write more data than available in the buffer,
> it's bound to fail.

Both message pipe and byte pipe.

> Did you add debug output to pipe_data_available to see how the
> information looks like?  Or do you have a simple, self-contained
> testcase in plain C?

The test case is attached. If select() works as expected, the program
does not show "r" or "w". However, with your patch, the program prints
many "w" (means write() fails with EAGAIN).

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>

[-- Attachment #2: stc.c --]
[-- Type: text/x-csrc, Size: 1797 bytes --]

#include <unistd.h>
#include <sys/wait.h>
#include <stdio.h>
#include <time.h>
#include <sys/select.h>
#include <fcntl.h>
#include <errno.h>

#define BLKSIZ 16384
#define NBLK (100*(1<<20)/BLKSIZ)

int main()
{
	int fd[2];
	pid_t pid;
	int nonblock = 1;

	pipe(fd);

	if (!(pid = fork ())) {
		int i;
		char buf[BLKSIZ] = {0,};
		close(fd[0]);
		if (nonblock) {
			int flags;
			flags = fcntl(fd[1], F_GETFL);
			flags |= O_NONBLOCK;
			fcntl(fd[1], F_SETFL, flags);
		}
		fd_set wfds;
		for (i=0; i<NBLK; i++) {
			FD_ZERO(&wfds);
			FD_SET(fd[1], &wfds);
			if (select(fd[1]+1, NULL, &wfds, NULL, NULL) > 0
					&& FD_ISSET(fd[1], &wfds)) {
				ssize_t len = write(fd[1], buf, sizeof(buf));
				if (len <= 0 && errno == EAGAIN) printf("w");
				if (len <= 0 && errno == EAGAIN) i --;
			}
		}
		close(fd[1]);
	} else {
		int i;
		char buf[BLKSIZ] = {0,};
		int total = 0;
		fd_set rfds;
		struct timespec tv0, tv1;
		double elasped;
		close(fd[1]);
		if (nonblock) {
			int flags;
			flags = fcntl(fd[0], F_GETFL);
			flags |= O_NONBLOCK;
			fcntl(fd[0], F_SETFL, flags);
		}
		usleep(100000); /* Delay to start reader */
		clock_gettime(CLOCK_MONOTONIC, &tv0);
		for (i=0; i<NBLK; i++) {
			FD_ZERO(&rfds);
			FD_SET(fd[0], &rfds);
			if (select(fd[0]+1, &rfds, NULL, NULL, NULL) > 0
					&& FD_ISSET(fd[0], &rfds)) {
				ssize_t len = read(fd[0], buf, sizeof(buf));
				if (len <= 0 && errno == EAGAIN) printf("r");
				if (len <= 0 && errno == EAGAIN) i --;
				else if(len < 0) break;
				else total += len;
			}
		}
		clock_gettime(CLOCK_MONOTONIC, &tv1);
		close(fd[0]);
		elasped = (tv1.tv_sec - tv0.tv_sec) + (tv1.tv_nsec - tv0.tv_nsec)*1e-9;
		printf("Total: %dMB in %f second, %fMB/s\n",
			total/(1<<20), elasped, total/(1<<20)/elasped);
		waitpid(pid, NULL, 0);
	}

	return 0;
}


^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-08-30 15:43                                             ` Ken Brown
@ 2021-08-31  9:43                                               ` Corinna Vinschen
  0 siblings, 0 replies; 250+ messages in thread
From: Corinna Vinschen @ 2021-08-31  9:43 UTC (permalink / raw)
  To: cygwin-developers

On Aug 30 11:43, Ken Brown wrote:
> On 8/30/2021 11:19 AM, Corinna Vinschen wrote:
> > On Aug 30 11:00, Ken Brown wrote:
> > > On 8/30/2021 9:51 AM, Ken Brown wrote:
> > > > On 8/30/2021 8:55 AM, Corinna Vinschen wrote:
> > > > > On Aug 30 21:04, Takashi Yano wrote:
> > > > > No worries.  The same should apply to the NtCreateFile side of the
> > > > > pipe, btw.
> > > > 
> > > > I'll add my thanks.  I should have checked the default flags that are
> > > > typically used for other devices when I wrote nt_create.  I'm glad you
> > > > caught this.
> > > > 
> > > > So I'll reinstate the use of nt_create and then let Takashi recheck everything.
> > > 
> > > I've done this now.  I'm still not sure I've got all the flags right.  For
> > > unknown reasons, I've used FILE_SHARE_READ | FILE_SHARE_WRITE in the call to
> > > NtCreateNamedPipeFile, and no sharing in the call to NtOpenFile.  Should I
> > > also use FILE_SHARE_READ | FILE_SHARE in NtOpenFile?  Is sharing even
> > > relevant in this context?
> > 
> > This is only relevant if you want to open the pipe from another context,
> > calling CreateNamedPipe/CreateFile.  As long as the pipe is only
> > duplicated, it shouldn't matter at all.
> 
> OK, then I think I should remove the sharing from NtCreateNamedPipeFile,
> since it could confuse someone reading the code.

I reverted both "fix flags" patches for the time being.  Removing the
sharing flags results in NtOpenFile for the write side to fail.


Corinna

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-08-31  9:25                                             ` Takashi Yano
@ 2021-08-31 10:05                                               ` Corinna Vinschen
  2021-08-31 10:18                                                 ` Corinna Vinschen
  0 siblings, 1 reply; 250+ messages in thread
From: Corinna Vinschen @ 2021-08-31 10:05 UTC (permalink / raw)
  To: cygwin-developers

On Aug 31 18:25, Takashi Yano wrote:
> On Tue, 31 Aug 2021 11:08:42 +0200
> Corinna Vinschen wrote:
> > On Aug 31 17:55, Takashi Yano wrote:
> > > On Mon, 30 Aug 2021 22:14:15 +0200
> > > Corinna Vinschen wrote:
> > > > Hi Ken, Hi Takashi,
> > > > 
> > > > On Aug 30 19:00, Corinna Vinschen wrote:
> > > > Well, what about keeping a duplicate of the read side handle on the 
> > > > write side just for calling NtQueryInformationFile?
> > > > 
> > > > Attached is an untested patch, can you have a look if that makes sense?
> > > > 
> > > > Btw., I think I found a bug in the new fhandler_pipe::create.  If the
> > > > function fails to create the write side fhandler, it deletes the read
> > > > side fhandler, but neglects to close the read handle.  My patch fixes
> > > > that.
> > > > 
> > > > While looking into this I found a problem in fhandler_disk_file in
> > > > terms of handle inheritance of the special handle for pread/pwrite.
> > > > I already force pushed this onto topic/pipe.
> > > 
> > > I tested your patch attached. Unfortunately, select() does not work
> > > as expected for write pipe. Even if the select reports write pipe
> > > is available, writing to pipe fails. It seems that your patch fails
> > > to detect pipe full.
> > 
> > Bummer.  Is that with byte mode pipes or with message mode pipes?  If
> > the latter, if you try to write more data than available in the buffer,
> > it's bound to fail.
> 
> Both message pipe and byte pipe.
> 
> > Did you add debug output to pipe_data_available to see how the
> > information looks like?  Or do you have a simple, self-contained
> > testcase in plain C?
> 
> The test case is attached. If select() works as expected, the program
> does not show "r" or "w". However, with your patch, the program prints
> many "w" (means write() fails with EAGAIN).

Thanks!  I found the culprit, but we have another problem.  Even if
select returns correct info, a write trying to write more bytes
than are available in the buffer hangs.  This shouldn't happen.
Still digging...


Corinna


^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-08-31 10:05                                               ` Corinna Vinschen
@ 2021-08-31 10:18                                                 ` Corinna Vinschen
  2021-08-31 11:45                                                   ` Takashi Yano
  0 siblings, 1 reply; 250+ messages in thread
From: Corinna Vinschen @ 2021-08-31 10:18 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 2316 bytes --]

On Aug 31 12:05, Corinna Vinschen wrote:
> On Aug 31 18:25, Takashi Yano wrote:
> > On Tue, 31 Aug 2021 11:08:42 +0200
> > Corinna Vinschen wrote:
> > > On Aug 31 17:55, Takashi Yano wrote:
> > > > On Mon, 30 Aug 2021 22:14:15 +0200
> > > > Corinna Vinschen wrote:
> > > > > Hi Ken, Hi Takashi,
> > > > > 
> > > > > On Aug 30 19:00, Corinna Vinschen wrote:
> > > > > Well, what about keeping a duplicate of the read side handle on the 
> > > > > write side just for calling NtQueryInformationFile?
> > > > > 
> > > > > Attached is an untested patch, can you have a look if that makes sense?
> > > > > 
> > > > > Btw., I think I found a bug in the new fhandler_pipe::create.  If the
> > > > > function fails to create the write side fhandler, it deletes the read
> > > > > side fhandler, but neglects to close the read handle.  My patch fixes
> > > > > that.
> > > > > 
> > > > > While looking into this I found a problem in fhandler_disk_file in
> > > > > terms of handle inheritance of the special handle for pread/pwrite.
> > > > > I already force pushed this onto topic/pipe.
> > > > 
> > > > I tested your patch attached. Unfortunately, select() does not work
> > > > as expected for write pipe. Even if the select reports write pipe
> > > > is available, writing to pipe fails. It seems that your patch fails
> > > > to detect pipe full.
> > > 
> > > Bummer.  Is that with byte mode pipes or with message mode pipes?  If
> > > the latter, if you try to write more data than available in the buffer,
> > > it's bound to fail.
> > 
> > Both message pipe and byte pipe.
> > 
> > > Did you add debug output to pipe_data_available to see how the
> > > information looks like?  Or do you have a simple, self-contained
> > > testcase in plain C?
> > 
> > The test case is attached. If select() works as expected, the program
> > does not show "r" or "w". However, with your patch, the program prints
> > many "w" (means write() fails with EAGAIN).
> 
> Thanks!  I found the culprit, but we have another problem.  Even if
> select returns correct info, a write trying to write more bytes
> than are available in the buffer hangs.  This shouldn't happen.
> Still digging...

That's, of course, correct behaviour for pipes in blocking mode.  D'oh! 

Please try the attached patch on top of topic/pipe.
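
The accounting problem the patch works around can be written down as a
small model.  This is a sketch under the assumptions stated earlier in
this thread (a pending read reserves write quota); the struct and
function names are invented for illustration and are not Windows or
Cygwin types.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the values discussed in this thread. */
struct pipe_model {
	size_t inbound_quota;		/* size of the read side buffer */
	size_t read_data_available;	/* bytes written but not yet read */
	size_t pending_read;		/* bytes requested by a pending read */
};

/* What the write side appears to report: the space left after both
   the buffered data and any pending read request are subtracted. */
size_t reported_write_quota(const struct pipe_model *p)
{
	size_t used = p->read_data_available + p->pending_read;

	return used >= p->inbound_quota ? 0 : p->inbound_quota - used;
}

/* What select() actually needs: InboundQuota - ReadDataAvailable,
   which is only visible from the read side. */
size_t usable_write_space(const struct pipe_model *p)
{
	assert(p->read_data_available <= p->inbound_quota);
	return p->inbound_quota - p->read_data_available;
}
```

With an empty 65536 byte buffer and a reader waiting for 65536 bytes,
the first function yields 0 while the second yields 65536, which is
the situation in which select on the write side would wrongly assume
the pipe is full.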


Thanks,
Corinna

[-- Attachment #2: pipe.diff --]
[-- Type: text/plain, Size: 6770 bytes --]

diff --git a/winsup/cygwin/fhandler.h b/winsup/cygwin/fhandler.h
index 132e6002133b..1f0f28077a7c 100644
--- a/winsup/cygwin/fhandler.h
+++ b/winsup/cygwin/fhandler.h
@@ -1171,6 +1171,7 @@ class fhandler_socket_unix : public fhandler_socket
 class fhandler_pipe: public fhandler_base
 {
 private:
+  HANDLE query_hdl;
   pid_t popen_pid;
   size_t max_atomic_write;
   void set_pipe_non_blocking (bool nonblocking);
@@ -1179,6 +1180,8 @@ public:
 
   bool ispipe() const { return true; }
 
+  HANDLE get_query_handle () const { return query_hdl; }
+
   void set_popen_pid (pid_t pid) {popen_pid = pid;}
   pid_t get_popen_pid () const {return popen_pid;}
   off_t lseek (off_t offset, int whence);
@@ -1187,7 +1190,9 @@ public:
   select_record *select_except (select_stuff *);
   char *get_proc_fd_name (char *buf);
   int open (int flags, mode_t mode = 0);
+  void fixup_after_fork (HANDLE);
   int dup (fhandler_base *child, int);
+  int close ();
   void __reg3 raw_read (void *ptr, size_t& len);
   ssize_t __reg3 raw_write (const void *ptr, size_t len);
   int ioctl (unsigned int cmd, void *);
diff --git a/winsup/cygwin/fhandler_pipe.cc b/winsup/cygwin/fhandler_pipe.cc
index 2dec0a84817c..479b62bbd4aa 100644
--- a/winsup/cygwin/fhandler_pipe.cc
+++ b/winsup/cygwin/fhandler_pipe.cc
@@ -405,22 +405,45 @@ fhandler_pipe::raw_write (const void *ptr, size_t len)
   return ret;
 }
 
+void
+fhandler_pipe::fixup_after_fork (HANDLE parent)
+{
+  if (query_hdl)
+    fork_fixup (parent, query_hdl, "query_hdl");
+  fhandler_base::fixup_after_fork (parent);
+}
+
 int
 fhandler_pipe::dup (fhandler_base *child, int flags)
 {
   fhandler_pipe *ftp = (fhandler_pipe *) child;
   ftp->set_popen_pid (0);
 
-  int res;
-  if (get_handle () && fhandler_base::dup (child, flags))
+  int res = 0;
+  if (fhandler_base::dup (child, flags))
     res = -1;
-  else
-    res = 0;
+  else if (query_hdl &&
+	   !DuplicateHandle (GetCurrentProcess (), query_hdl,
+			     GetCurrentProcess (), &ftp->query_hdl,
+			     0, !(flags & O_CLOEXEC), DUPLICATE_SAME_ACCESS))
+    {
+      __seterrno ();
+      ftp->close ();
+      res = -1;
+    }
 
   debug_printf ("res %d", res);
   return res;
 }
 
+int
+fhandler_pipe::close ()
+{
+  if (query_hdl)
+    NtClose (query_hdl);
+  return fhandler_base::close ();
+}
+
 #define PIPE_INTRO "\\\\.\\pipe\\cygwin-"
 
 /* Create a pipe, and return handles to the read and write ends,
@@ -608,6 +631,7 @@ fhandler_pipe::create (fhandler_pipe *fhs[2], unsigned psize, int mode)
   else if ((fhs[1] = (fhandler_pipe *) build_fh_dev (*pipew_dev)) == NULL)
     {
       delete fhs[0];
+      CloseHandle (r);
       CloseHandle (w);
     }
   else
@@ -617,10 +641,23 @@ fhandler_pipe::create (fhandler_pipe *fhs[2], unsigned psize, int mode)
 		    unique_id);
       fhs[1]->init (w, FILE_CREATE_PIPE_INSTANCE | GENERIC_WRITE, mode,
 		    unique_id);
-      res = 0;
+      /* For the write side of the pipe, duplicate the handle to the read side
+	 into query_hdl just for calling NtQueryInformationFile.  See longish
+	 comment in select.cc, pipe_data_available() for the reasoning. */
+      if (!DuplicateHandle (GetCurrentProcess (), r, GetCurrentProcess (),
+			    &fhs[1]->query_hdl, GENERIC_READ,
+			    !(mode & O_CLOEXEC), 0))
+	{
+	  delete fhs[0];
+	  CloseHandle (r);
+	  delete fhs[1];
+	  CloseHandle (w);
+	}
+      else
+	res = 0;
     }
 
-  debug_printf ("%R = pipe([%p, %p], %d, %y)", res, fhs[0], fhs[1], psize, mode);
+  debug_printf ("%R = pipe(%d, %y)", res, psize, mode);
   return res;
 }
 
@@ -661,7 +698,7 @@ nt_create (LPSECURITY_ATTRIBUTES sa_ptr, PHANDLE r, PHANDLE w,
   access = GENERIC_READ | FILE_WRITE_ATTRIBUTES;
 
   ULONG pipe_type = pipe_byte ? FILE_PIPE_BYTE_STREAM_TYPE
-    : FILE_PIPE_MESSAGE_TYPE;
+			      : FILE_PIPE_MESSAGE_TYPE;
 
   /* Retry NtCreateNamedPipeFile as long as the pipe name is in use.
      Retrying will probably never be necessary, but we want
diff --git a/winsup/cygwin/select.cc b/winsup/cygwin/select.cc
index 83e1c00e0ac7..7e69ad834d9a 100644
--- a/winsup/cygwin/select.cc
+++ b/winsup/cygwin/select.cc
@@ -608,15 +608,33 @@ pipe_data_available (int fd, fhandler_base *fh, HANDLE h, bool writing)
     }
   if (writing)
     {
-	/* If there is anything available in the pipe buffer then signal
-	   that.  This means that a pipe could still block since you could
-	   be trying to write more to the pipe than is available in the
-	   buffer but that is the hazard of select().  */
-      fpli.WriteQuotaAvailable = fpli.OutboundQuota - fpli.ReadDataAvailable;
+      /* If there is anything available in the pipe buffer then signal
+	 that.  This means that a pipe could still block since you could
+	 be trying to write more to the pipe than is available in the
+	 buffer but that is the hazard of select().
+
+	 Note that WriteQuotaAvailable is unreliable.
+
+	 Usually WriteQuotaAvailable on the write side reflects the space
+	 available in the inbound buffer on the read side.  However, if a
+	 pipe read is currently pending, WriteQuotaAvailable on the write side
+	 is decremented by the number of bytes the read side is requesting.
+	 So it's possible (even likely) that WriteQuotaAvailable is 0, even
+	 if the inbound buffer on the read side is not full.  This can lead to
+	 a deadlock situation: The reader is waiting for data, but select
+	 on the writer side assumes that no space is available in the read
+	 side inbound buffer.
+
+	 Consequentially, the only reliable information is available on the
+	 read side, so fetch info from the read side via the pipe-specific
+	 query handle.  Use fpli.WriteQuotaAvailable as storage for the actual
+	 interesting value, which is the InboundQuota on the read side,
+	 decremented by the number of bytes of data in that buffer. */
+      fpli.WriteQuotaAvailable = fpli.InboundQuota - fpli.ReadDataAvailable;
       if (fpli.WriteQuotaAvailable > 0)
 	{
 	  paranoid_printf ("fd %d, %s, write: size %u, avail %u", fd,
-			   fh->get_name (), fpli.OutboundQuota,
+			   fh->get_name (), fpli.InboundQuota,
 			   fpli.WriteQuotaAvailable);
 	  return 1;
 	}
@@ -718,9 +736,14 @@ out:
       fhandler_pty_master *fhm = (fhandler_pty_master *) fh;
       fhm->set_mask_flusho (s->read_ready);
     }
-  h = fh->get_output_handle ();
   if (s->write_selected && dev != FH_PIPER)
     {
+      /* For the write side of a pipe, fetch the handle to the read side.
+	 See the longish comment in pipe_data_available for the reasoning. */
+      if (dev == FH_PIPEW)
+	h = ((fhandler_pipe *) fh)->get_query_handle ();
+      else
+	h = fh->get_output_handle ();
       gotone += s->write_ready =  pipe_data_available (s->fd, fh, h, true);
       select_printf ("write: %s, gotone %d", fh->get_name (), gotone);
     }

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-08-31  9:04                                               ` Corinna Vinschen
@ 2021-08-31 11:05                                                 ` Takashi Yano
  2021-08-31 15:20                                                   ` Corinna Vinschen
  0 siblings, 1 reply; 250+ messages in thread
From: Takashi Yano @ 2021-08-31 11:05 UTC (permalink / raw)
  To: cygwin-developers

On Tue, 31 Aug 2021 11:04:05 +0200
Corinna Vinschen wrote:
> On Aug 31 17:52, Takashi Yano wrote:
> > On Mon, 30 Aug 2021 17:19:44 +0200
> > Corinna Vinschen wrote:
> > > On Aug 30 11:00, Ken Brown wrote:
> > > > On 8/30/2021 9:51 AM, Ken Brown wrote:
> > > > > On 8/30/2021 8:55 AM, Corinna Vinschen wrote:
> > > > > > On Aug 30 21:04, Takashi Yano wrote:
> > > > > > No worries.  The same should apply to the NtCreateFile side of the
> > > > > > pipe, btw.
> > > > > 
> > > > > I'll add my thanks.  I should have checked the default flags that are
> > > > > typically used for other devices when I wrote nt_create.  I'm glad you
> > > > > caught this.
> > > > > 
> > > > > So I'll reinstate the use of nt_create and then let Takashi recheck everything.
> > > > 
> > > > I've done this now.  I'm still not sure I've got all the flags right.  For
> > > > unknown reasons, I've used FILE_SHARE_READ | FILE_SHARE_WRITE in the call to
> > > > NtCreateNamedPipeFile, and no sharing in the call to NtOpenFile.  Should I
> > > > also use FILE_SHARE_READ | FILE_SHARE in NtOpenFile?  Is sharing even
> > > > relevant in this context?
> > > 
> > > This is only relevant if you want to open the pipe from another context,
> > > calling CreateNamedPipe/CreateFile.  As long as the pipe is only
> > > duplicated, it shouldn't matter at all.
> > > 
> > > But, as I just wrote in my previous mail, the FILE_SYNCHRONOUS_IO_NONALERT
> > > flag is probably a good thing for C# apps, but not for Cygwin, because it
> > > enforces synchronous operation.  Sorry about that...
> > 
> > With FILE_SYNCHRONOUS_IO_NONALERT, what kind of problems are you
> > specifically concerned about cygwin pipe? 
> 
> We're using asynchronous IO to be able to call WFMO and thus to be able
> to handle signals and thread cancellation events.  With synchronous IO
> this is not possible.

Thanks. How can I reproduce the above issue? Stopping with Ctrl-C or
killing the process with kill seems to work even with
FILE_SYNCHRONOUS_IO_NONALERT. Where is WFMO called for the pipe handle?

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-08-31 10:18                                                 ` Corinna Vinschen
@ 2021-08-31 11:45                                                   ` Takashi Yano
  2021-08-31 12:31                                                     ` Takashi Yano
  2021-08-31 12:33                                                     ` Ken Brown
  0 siblings, 2 replies; 250+ messages in thread
From: Takashi Yano @ 2021-08-31 11:45 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 3634 bytes --]

On Tue, 31 Aug 2021 12:18:57 +0200
Corinna Vinschen wrote:
> On Aug 31 12:05, Corinna Vinschen wrote:
> > On Aug 31 18:25, Takashi Yano wrote:
> > > On Tue, 31 Aug 2021 11:08:42 +0200
> > > Corinna Vinschen wrote:
> > > > On Aug 31 17:55, Takashi Yano wrote:
> > > > > On Mon, 30 Aug 2021 22:14:15 +0200
> > > > > Corinna Vinschen wrote:
> > > > > > Hi Ken, Hi Takashi,
> > > > > > 
> > > > > > On Aug 30 19:00, Corinna Vinschen wrote:
> > > > > > Well, what about keeping a duplicate of the read side handle on the 
> > > > > > write side just for calling NtQueryInformationFile?
> > > > > > 
> > > > > > Attached is an untested patch, can you have a look if that makes sense?
> > > > > > 
> > > > > > Btw., I think I found a bug in the new fhandler_pipe::create.  If the
> > > > > > function fails to create the write side fhandler, it deletes the read
> > > > > > side fhandler, but neglects to close the read handle.  My patch fixes
> > > > > > that.
> > > > > > 
> > > > > > While looking into this I found a problem in fhandler_disk_file in
> > > > > > terms of handle inheritance of the special handle for pread/pwrite.
> > > > > > I already force pushed this onto topic/pipe.
> > > > > 
> > > > > I tested your patch attached. Unfortunately, select() does not work
> > > > > as expected for write pipe. Even if the select reports write pipe
> > > > > is available, writing to pipe fails. It seems that your patch fails
> > > > > to detect pipe full.
> > > > 
> > > > Bummer.  Is that with byte mode pipes or with message mode pipes?  If
> > > > the latter, if you try to write more data than available in the buffer,
> > > > it's bound to fail.
> > > 
> > > Both message pipe and byte pipe.
> > > 
> > > > Did you add debug output to pipe_data_available to see how the
> > > > information looks like?  Or do you have a simple, self-contained
> > > > testcase in plain C?
> > > 
> > > The test case is attached. If select() works as expected, the program
> > > does not show "r" or "w". However, with your patch, the program prints
> > > many "w" (means write() fails with EAGAIN).
> > 
> > Thanks!  I found the culprit, but we have another problem.  Even if
> > select returns correct info, a write trying to write more bytes
> > than are available in the buffer hangs.  This shouldn't happen.
> > Still digging...
> 
> That's, of course, correct behaviour for pipes in blocking mode.  D'oh! 
> 
> Please try the attached patch on top of topic/pipe.

Thanks for the new patch. I have confirmed that the above issue
is fixed and select() for the write pipe seems to work as expected.


BTW, I found one minor difference between Linux and this pipe
implementation.

The test case is attached; it uses non-blocking I/O.
When this STC runs on Linux, the result is:

1024/1024
1740/1740
2958/2958
5028/5028
8547/8547
14529/14529
24699/24699
41988/41988
22227/71379
65536/121344
65536/206284
Total: 247KB in 0.000612 second, 403517.628166KB/s

On cygwin 3.2.0, the result is similar to Linux.

1024/1024
1740/1740
2957/2957
5026/5026
8544/8544
14524/14524
24690/24690
41972/41972
65536/71352
65536/121298
65536/206206
Total: 290KB in 0.062653 second, 4628.669018KB/s


However, with the topic/pipe implementation, the result is:

1024/1024
1740/1740
2957/2957
5026/5026
8544/8544
14524/14524
24690/24690
-1/41972
w-1/71352
w-1/121298
w-1/206206
wTotal: 57KB in 0.000330 second, 172989.377845KB/s

In non-blocking mode, this implementation fails with EAGAIN when
asked to write more than the free space in the pipe.

Linux and cygwin 3.2.0 instead seem to write as many bytes as fit
and return a short count.

Would this be difficult to fix?
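
For illustration, the behaviour seen on Linux can be isolated in a
much smaller sketch than the attached STC.  POSIX describes it for
write requests larger than PIPE_BUF on a pipe with O_NONBLOCK set:
if at least one byte fits, write() transfers what it can and returns
the count; if nothing fits, it returns -1 with EAGAIN.  The names
probe_nonblocking_write and struct nb_result are made up for this
sketch; nothing here is taken from the Cygwin sources.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper type: outcome of two consecutive writes. */
struct nb_result {
	ssize_t first;     /* return value of the oversized write */
	ssize_t second;    /* return value of a 1-byte write afterwards */
	int second_errno;  /* errno observed after the second write */
};

/* Write `request` bytes into a fresh, empty, non-blocking pipe that
   nobody reads from, then try to write one more byte.
   Returns 0 on success, -1 if the setup itself failed. */
int probe_nonblocking_write(size_t request, struct nb_result *res)
{
	int fd[2];
	int flags;
	char *buf;

	assert(res != NULL && request > 0);
	if (pipe(fd) < 0)
		return -1;
	flags = fcntl(fd[1], F_GETFL);
	buf = calloc(1, request);
	if (flags < 0 || buf == NULL
	    || fcntl(fd[1], F_SETFL, flags | O_NONBLOCK) < 0) {
		free(buf);
		close(fd[0]);
		close(fd[1]);
		return -1;
	}
	/* Larger than the pipe buffer: expect a short count. */
	res->first = write(fd[1], buf, request);
	/* The pipe should be full now: expect -1 with EAGAIN. */
	errno = 0;
	res->second = write(fd[1], buf, 1);
	res->second_errno = errno;
	free(buf);
	close(fd[0]);
	close(fd[1]);
	return 0;
}
```

Called with a request well above the pipe buffer size (1 MiB, say),
`first` should come back positive but smaller than the request, and
`second` should be -1 with EAGAIN.  The exact short count depends on
the system's pipe buffer size.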

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>

[-- Attachment #2: stc3.c --]
[-- Type: text/x-csrc, Size: 1786 bytes --]

#include <unistd.h>
#include <sys/wait.h>
#include <stdio.h>
#include <time.h>
#include <sys/select.h>
#include <fcntl.h>
#include <errno.h>

#define BLKSIZ (65536*4)
#define NBLK (100*(1<<20)/BLKSIZ)

int main()
{
	int fd[2];
	pid_t pid;
	int nonblock = 1;

	pipe(fd);

	if (!(pid = fork ())) {
		char buf[BLKSIZ] = {0,};
		close(fd[0]);
		if (nonblock) {
			int flags;
			flags = fcntl(fd[1], F_GETFL);
			flags |= O_NONBLOCK;
			fcntl(fd[1], F_SETFL, flags);
		}
		fd_set wfds;
		for (int wlen=1024; wlen<=BLKSIZ; wlen *= 1.7) {
			FD_ZERO(&wfds);
			FD_SET(fd[1], &wfds);
			if (select(fd[1]+1, NULL, &wfds, NULL, NULL) > 0
					&& FD_ISSET(fd[1], &wfds)) {
				ssize_t len = write(fd[1], buf, wlen);
				int err = errno; /* save before printf can clobber it */
				/* Print "written/requested"; %zd matches ssize_t. */
				printf("%zd/%d\n", len, wlen);
				/* "w" marks a write() that failed with EAGAIN. */
				if (len < 0 && err == EAGAIN) printf("w");
			}
		}
			}
		}
		close(fd[1]);
	} else {
		char buf[BLKSIZ] = {0,};
		int total = 0;
		fd_set rfds;
		struct timespec tv0, tv1;
		double elapsed;
		close(fd[1]);
		if (nonblock) {
			int flags;
			flags = fcntl(fd[0], F_GETFL);
			flags |= O_NONBLOCK;
			fcntl(fd[0], F_SETFL, flags);
		}
		usleep(1000000); /* Delay to start reader */
		clock_gettime(CLOCK_MONOTONIC, &tv0);
		for (;;) {
			FD_ZERO(&rfds);
			FD_SET(fd[0], &rfds);
			if (select(fd[0]+1, &rfds, NULL, NULL, NULL) > 0
					&& FD_ISSET(fd[0], &rfds)) {
				ssize_t len = read(fd[0], buf, sizeof(buf));
				/* "r" marks a read() that failed with EAGAIN. */
				if (len < 0 && errno == EAGAIN) printf("r");
				else if (len <= 0) break;
				else total += len;
			}
		}
		clock_gettime(CLOCK_MONOTONIC, &tv1);
		close(fd[0]);
		elapsed = (tv1.tv_sec - tv0.tv_sec) + (tv1.tv_nsec - tv0.tv_nsec)*1e-9;
		printf("Total: %dKB in %f second, %fKB/s\n",
			total/(1<<10), elapsed, total/(1<<10)/elapsed);
		waitpid(pid, NULL, 0);
	}

	return 0;
}


^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-08-31 11:45                                                   ` Takashi Yano
@ 2021-08-31 12:31                                                     ` Takashi Yano
  2021-08-31 15:08                                                       ` Corinna Vinschen
  2021-08-31 12:33                                                     ` Ken Brown
  1 sibling, 1 reply; 250+ messages in thread
From: Takashi Yano @ 2021-08-31 12:31 UTC (permalink / raw)
  To: cygwin-developers

On Tue, 31 Aug 2021 20:45:41 +0900
Takashi Yano wrote:
> On Tue, 31 Aug 2021 12:18:57 +0200
> Corinna Vinschen wrote:
> > On Aug 31 12:05, Corinna Vinschen wrote:
> > > On Aug 31 18:25, Takashi Yano wrote:
> > > > On Tue, 31 Aug 2021 11:08:42 +0200
> > > > > Corinna Vinschen wrote:
> > > > > On Aug 31 17:55, Takashi Yano wrote:
> > > > > > On Mon, 30 Aug 2021 22:14:15 +0200
> > > > > > Corinna Vinschen wrote:
> > > > > > > Hi Ken, Hi Takashi,
> > > > > > > 
> > > > > > > On Aug 30 19:00, Corinna Vinschen wrote:
> > > > > > > Well, what about keeping a duplicate of the read side handle on the 
> > > > > > > write side just for calling NtQueryInformationFile?
> > > > > > > 
> > > > > > > Attached is an untested patch, can you have a look if that makes sense?
> > > > > > > 
> > > > > > > Btw., I think I found a bug in the new fhandler_pipe::create.  If the
> > > > > > > function fails to create the write side fhandler, it deletes the read
> > > > > > > side fhandler, but neglects to close the read handle.  My patch fixes
> > > > > > > that.
> > > > > > > 
> > > > > > > While looking into this I found a problem in fhandler_disk_file in
> > > > > > > terms of handle inheritance of the special handle for pread/pwrite.
> > > > > > > I already force pushed this onto topic/pipe.
> > > > > > 
> > > > > > I tested your patch attached. Unfortunately, select() does not work
> > > > > > as expected for write pipe. Even if the select reports write pipe
> > > > > > is available, writing to pipe fails. It seems that your patch fails
> > > > > > to detect pipe full.
> > > > > 
> > > > > Bummer.  Is that with byte mode pipes or with message mode pipes?  If
> > > > > the latter, if you try to write more data than available in the buffer,
> > > > > it's bound to fail.
> > > > 
> > > > Both message pipe and byte pipe.
> > > > 
> > > > > Did you add debug output to pipe_data_available to see how the
> > > > > information looks like?  Or do you have a simple, self-contained
> > > > > testcase in plain C?
> > > > 
> > > > The test case is attached. If select() works as expected, the program
> > > > does not show "r" or "w". However, with your patch, the program prints
> > > > many "w" (means write() fails with EAGAIN).
> > > 
> > > Thanks!  I found the culprit, but we have another problem.  Even if
> > > select returns correct info, a write trying to write more bytes
> > > than are available in the buffer hangs.  This shouldn't happen.
> > > Still digging...
> > 
> > That's, of course, correct behaviour for pipes in blocking mode.  D'oh! 
> > 
> > Please try the attached patch on top of topic/pipe.
> 
> Thanks for the new patch. I have confirmed that above issue
> is fixed and select() for write pipe seems to work as expected.
> 
> 
> BTW, I found one minor difference between Linux and this pipe
> implementation.
> 
> The test case is attached. The test case uses non-blocking I/O.
> If this STC runs on Linux, the result is:
> 
> 1024/1024
> 1740/1740
> 2958/2958
> 5028/5028
> 8547/8547
> 14529/14529
> 24699/24699
> 41988/41988
> 22227/71379
> 65536/121344
> 65536/206284
> Total: 247KB in 0.000612 second, 403517.628166KB/s
> 
> On cygwin 3.2.0, the result is similar to Linux.
> 
> 1024/1024
> 1740/1740
> 2957/2957
> 5026/5026
> 8544/8544
> 14524/14524
> 24690/24690
> 41972/41972
> 65536/71352
> 65536/121298
> 65536/206206
> Total: 290KB in 0.062653 second, 4628.669018KB/s
> 
> 
> However, on topic/pipe implementation, the result is
> 
> 1024/1024
> 1740/1740
> 2957/2957
> 5026/5026
> 8544/8544
> 14524/14524
> 24690/24690
> -1/41972
> w-1/71352
> w-1/121298
> w-1/206206
> wTotal: 57KB in 0.000330 second, 172989.377845KB/s
> 
> In non-blocking mode, writing more than pipe space will fail with
> EAGAIN in this implementation.
> 
> In Linux and cygwin 3.2.0, it seems to write as much as writable.
> 
> Is this difficult to be fixed?

The following patch almost fixes the issue, but atomicity is the problem.

diff --git a/winsup/cygwin/fhandler_pipe.cc b/winsup/cygwin/fhandler_pipe.cc
index 2dec0a848..0a74a654d 100644
--- a/winsup/cygwin/fhandler_pipe.cc
+++ b/winsup/cygwin/fhandler_pipe.cc
@@ -323,10 +323,15 @@ fhandler_pipe::raw_write (const void *ptr, size_t len)
   if (!len)
     return 0;

-  if (len <= max_atomic_write)
+  FILE_PIPE_LOCAL_INFORMATION fpli;
+  NtQueryInformationFile (query_hdl, &io, &fpli, sizeof (fpli),
+                         FilePipeLocalInformation);
+  ULONG room = fpli.InboundQuota - fpli.ReadDataAvailable;
+
+  if (len <= room)
     chunk = len;
   else if (is_nonblocking ())
-    chunk = len = max_atomic_write;
+    chunk = len = room;
   else
     chunk = max_atomic_write;
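
To make the atomicity concern concrete: POSIX requires a pipe write
of at most PIPE_BUF bytes to be atomic.  In non-blocking mode such a
write must either go through completely or fail with EAGAIN, and
only larger requests may be shortened to the free space.  The sketch
below is illustration only and is not part of the patch above.
nonblocking_chunk is a made-up helper, max_atomic merely stands in
for max_atomic_write, and room stands in for
InboundQuota - ReadDataAvailable.

```c
#include <assert.h>
#include <stddef.h>

/* Decide how many bytes one non-blocking write attempt should try.
   A return value of 0 means "write nothing and report EAGAIN". */
size_t nonblocking_chunk(size_t len, size_t room, size_t max_atomic)
{
	assert(max_atomic > 0);
	if (len <= room)
		return len;   /* the whole request fits */
	if (len <= max_atomic)
		return 0;     /* small write: all or nothing, never split */
	return room;          /* large write: take what fits (may be 0) */
}
```

With this rule a 100000 byte request against 7011 bytes of free space
is shortened to 7011, while a 4096 byte request against 100 bytes of
free space is refused rather than truncated to 100.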


-- 
Takashi Yano <takashi.yano@nifty.ne.jp>

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-08-31 11:45                                                   ` Takashi Yano
  2021-08-31 12:31                                                     ` Takashi Yano
@ 2021-08-31 12:33                                                     ` Ken Brown
  2021-08-31 15:18                                                       ` Corinna Vinschen
  1 sibling, 1 reply; 250+ messages in thread
From: Ken Brown @ 2021-08-31 12:33 UTC (permalink / raw)
  To: cygwin-developers

On 8/31/2021 7:45 AM, Takashi Yano wrote:
> On Tue, 31 Aug 2021 12:18:57 +0200
> Corinna Vinschen wrote:
>> On Aug 31 12:05, Corinna Vinschen wrote:
>>> On Aug 31 18:25, Takashi Yano wrote:
>>>> On Tue, 31 Aug 2021 11:08:42 +0200
>>>> Corinna Vinschen wrote:
>>>>> On Aug 31 17:55, Takashi Yano wrote:
>>>>>> On Mon, 30 Aug 2021 22:14:15 +0200
>>>>>> Corinna Vinschen wrote:
>>>>>>> Hi Ken, Hi Takashi,
>>>>>>>
>>>>>>> On Aug 30 19:00, Corinna Vinschen wrote:
>>>>>>> Well, what about keeping a duplicate of the read side handle on the
>>>>>>> write side just for calling NtQueryInformationFile?
>>>>>>>
>>>>>>> Attached is an untested patch, can you have a look if that makes sense?
>>>>>>>
>>>>>>> Btw., I think I found a bug in the new fhandler_pipe::create.  If the
>>>>>>> function fails to create the write side fhandler, it deletes the read
>>>>>>> side fhandler, but neglects to close the read handle.  My patch fixes
>>>>>>> that.
>>>>>>>
>>>>>>> While looking into this I found a problem in fhandler_disk_file in
>>>>>>> terms of handle inheritance of the special handle for pread/pwrite.
>>>>>>> I already force pushed this onto topic/pipe.
>>>>>>
>>>>>> I tested your patch attached. Unfortunately, select() does not work
>>>>>> as expected for write pipe. Even if the select reports write pipe
>>>>>> is available, writing to pipe fails. It seems that your patch fails
>>>>>> to detect pipe full.
>>>>>
>>>>> Bummer.  Is that with byte mode pipes or with message mode pipes?  If
>>>>> the latter, if you try to write more data than available in the buffer,
>>>>> it's bound to fail.
>>>>
>>>> Both message pipe and byte pipe.
>>>>
>>>>> Did you add debug output to pipe_data_available to see how the
>>>>> information looks like?  Or do you have a simple, self-contained
>>>>> testcase in plain C?
>>>>
>>>> The test case is attached. If select() works as expected, the program
>>>> does not show "r" or "w". However, with your patch, the program prints
>>>> many "w" (means write() fails with EAGAIN).
>>>
>>> Thanks!  I found the culprit, but we have another problem.  Even if
>>> select returns correct info, a write trying to write more bytes
>>> than are available in the buffer hangs.  This shouldn't happen.
>>> Still digging...
>>
>> That's, of course, correct behaviour for pipes in blocking mode.  D'oh!
>>
>> Please try the attached patch on top of topic/pipe.
> 
> Thanks for the new patch. I have confirmed that above issue
> is fixed and select() for write pipe seems to work as expected.
> 
> 
> BTW, I found one minor difference between Linux and this pipe
> implementation.
> 
> The test case is attached. The test case uses non-blocking I/O.
> If this STC runs on Linux, the result is:
> 
> 1024/1024
> 1740/1740
> 2958/2958
> 5028/5028
> 8547/8547
> 14529/14529
> 24699/24699
> 41988/41988
> 22227/71379
> 65536/121344
> 65536/206284
> Total: 247KB in 0.000612 second, 403517.628166KB/s
> 
> On cygwin 3.2.0, the result is similar to Linux.
> 
> 1024/1024
> 1740/1740
> 2957/2957
> 5026/5026
> 8544/8544
> 14524/14524
> 24690/24690
> 41972/41972
> 65536/71352
> 65536/121298
> 65536/206206
> Total: 290KB in 0.062653 second, 4628.669018KB/s
> 
> 
> However, on topic/pipe implementation, the result is
> 
> 1024/1024
> 1740/1740
> 2957/2957
> 5026/5026
> 8544/8544
> 14524/14524
> 24690/24690
> -1/41972
> w-1/71352
> w-1/121298
> w-1/206206
> wTotal: 57KB in 0.000330 second, 172989.377845KB/s
> 
> In non-blocking mode, writing more than pipe space will fail with
> EAGAIN in this implementation.
> 
> In Linux and cygwin 3.2.0, it seems to write as much as writable.
> 
> Is this difficult to be fixed?

Two other remarks:

1. I think query_hdl needs to be initialized in the fhandler_pipe constructor.

2. When the read side of the pipe is non-blocking, there can be no pending 
reads, so shouldn't we be able to use WriteQuotaAvailable reliably on the write 
side?  (I can't test this at the moment.)  This applies in particular to the 
call to pipe_data_available at the end of peek_fifo, since all fifo readers use 
non-blocking pipes.  Maybe pipe_data_available needs an extra parameter so the 
caller can specify that WriteQuotaAvailable should be used.

Ken

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-08-31 12:31                                                     ` Takashi Yano
@ 2021-08-31 15:08                                                       ` Corinna Vinschen
  0 siblings, 0 replies; 250+ messages in thread
From: Corinna Vinschen @ 2021-08-31 15:08 UTC (permalink / raw)
  To: cygwin-developers

On Aug 31 21:31, Takashi Yano wrote:
> On Tue, 31 Aug 2021 20:45:41 +0900
> Takashi Yano wrote:
> > On Tue, 31 Aug 2021 12:18:57 +0200
> > Corinna Vinschen wrote:
> > > Please try the attached patch on top of topic/pipe.
> > 
> > Thanks for the new patch. I have confirmed that above issue
> > is fixed and select() for write pipe seems to work as expected.
> > 
> > 
> > BTW, I found one minor difference between Linux and this pipe
> > implementation.
> > 
> > The test case is attached. The test case uses non-blocking I/O.
> > If this STC runs on Linux, the result is:
> > 
> > 1024/1024
> > 1740/1740
> > 2958/2958
> > 5028/5028
> > 8547/8547
> > 14529/14529
> > 24699/24699
> > 41988/41988
> > 22227/71379
> > 65536/121344
> > 65536/206284
> > Total: 247KB in 0.000612 second, 403517.628166KB/s
> > 
> > On cygwin 3.2.0, the result is similar to Linux.
> > 
> > 1024/1024
> > 1740/1740
> > 2957/2957
> > 5026/5026
> > 8544/8544
> > 14524/14524
> > 24690/24690
> > 41972/41972
> > 65536/71352
> > 65536/121298
> > 65536/206206
> > Total: 290KB in 0.062653 second, 4628.669018KB/s
> > 
> > 
> > However, on topic/pipe implementation, the result is
> > 
> > 1024/1024
> > 1740/1740
> > 2957/2957
> > 5026/5026
> > 8544/8544
> > 14524/14524
> > 24690/24690
> > -1/41972
> > w-1/71352
> > w-1/121298
> > w-1/206206
> > wTotal: 57KB in 0.000330 second, 172989.377845KB/s
> > 
> > In non-blocking mode, writing more than pipe space will fail with
> > EAGAIN in this implementation.
> > 
> > In Linux and cygwin 3.2.0, it seems to write as much as writable.
> > 
> > Is this difficult to be fixed?
> 
> The following patch almost fixes the issue, but atomicity is the problem.

Thanks, I took the liberty to use your idea to implement a loop trying
to write again.  For me the output is now

  1024/1024
  1740/1740
  2958/2958
  5028/5028
  8547/8547
  14529/14529
  24699/24699
  7011/41988
  65536/71379
  65536/121344
  65536/206284
  Total: 256KB in 0.017771 second, 14405.248913KB/s

Could you try again with this patch?  I'd be glad if we can straighten
out the bugs :)

diff --git a/winsup/cygwin/fhandler_pipe.cc b/winsup/cygwin/fhandler_pipe.cc
index 2dec0a84817c..0aed8456bb0b 100644
--- a/winsup/cygwin/fhandler_pipe.cc
+++ b/winsup/cygwin/fhandler_pipe.cc
@@ -352,8 +352,30 @@ fhandler_pipe::raw_write (const void *ptr, size_t len)
       else
 	len1 = (ULONG) left;
       nbytes_now = 0;
-      status = NtWriteFile (get_handle (), evt, NULL, NULL, &io,
-			    (PVOID) ptr, len1, NULL, NULL);
+      while (true)
+	{
+	  status = NtWriteFile (get_handle (), evt, NULL, NULL, &io,
+				(PVOID) ptr, len1, NULL, NULL);
+	  if (evt || !NT_SUCCESS (status) || io.Information > 0)
+	    break;
+
+	  FILE_PIPE_LOCAL_INFORMATION fpli;
+	  IO_STATUS_BLOCK qio;
+
+	  if (!NT_SUCCESS (NtQueryInformationFile (query_hdl, &qio, &fpli,
+			   sizeof (fpli), FilePipeLocalInformation)))
+	    len1 >>= 1;
+	  else
+	    {
+	      fpli.WriteQuotaAvailable = fpli.InboundQuota
+					 - fpli.ReadDataAvailable;
+	      if (len1 > fpli.WriteQuotaAvailable
+		  && fpli.WriteQuotaAvailable > 0)
+		len1 = fpli.InboundQuota - fpli.ReadDataAvailable;
+	      else
+		break;
+	    }
+	}
       if (evt && status == STATUS_PENDING)
 	{
 	  waitret = cygwait (evt);
@@ -378,7 +400,7 @@ fhandler_pipe::raw_write (const void *ptr, size_t len)
 	  /* NtWriteFile returns success with # of bytes written == 0
 	     if writing on a non-blocking pipe fails because the pipe
 	     buffer doesn't have sufficient space. */
-	  if (nbytes_now == 0)
+	  if (nbytes_now == 0 && nbytes == 0)
 	    set_errno (EAGAIN);
 	  ptr = ((char *) ptr) + chunk;
 	  nbytes += nbytes_now;
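
For readers who want the retry idea without the NT specifics, here
is a plain-C simulation of the non-blocking path.  It is a sketch
under assumptions, not the Cygwin code.  struct fake_pipe and
fake_write are made-up stand-ins: quota and used play the roles of
InboundQuota and ReadDataAvailable, and fake_write mimics a
non-blocking NtWriteFile that reports 0 bytes written when the
request does not fit.  The blocking (evt) path is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a non-blocking byte pipe. */
struct fake_pipe {
	size_t quota;  /* total buffer size, cf. InboundQuota */
	size_t used;   /* bytes waiting to be read, cf. ReadDataAvailable */
};

/* All-or-nothing write: returns len if it fits, otherwise 0. */
static size_t fake_write(struct fake_pipe *p, size_t len)
{
	if (len > p->quota - p->used)
		return 0;
	p->used += len;
	return len;
}

/* Try to write len bytes.  If nothing was written, shrink the
   request to the free space and try again.  Returns the number of
   bytes written; 0 corresponds to failing with EAGAIN. */
size_t write_with_retry(struct fake_pipe *p, size_t len)
{
	assert(p != NULL && p->used <= p->quota);
	for (;;) {
		size_t done = fake_write(p, len);
		if (done > 0)
			return done;
		size_t room = p->quota - p->used;
		if (room > 0 && len > room)
			len = room;  /* retry with what fits */
		else
			return 0;    /* pipe is full, give up */
	}
}
```

Starting from a 65536 byte pipe that already holds 58525 bytes (the
sum of the first seven writes in the output above), a request for
41988 bytes is shortened to 7011, which matches the 7011/41988 line.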

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-08-31 12:33                                                     ` Ken Brown
@ 2021-08-31 15:18                                                       ` Corinna Vinschen
  2021-08-31 15:27                                                         ` Corinna Vinschen
  0 siblings, 1 reply; 250+ messages in thread
From: Corinna Vinschen @ 2021-08-31 15:18 UTC (permalink / raw)
  To: cygwin-developers

On Aug 31 08:33, Ken Brown wrote:
> On 8/31/2021 7:45 AM, Takashi Yano wrote:
> > On Tue, 31 Aug 2021 12:18:57 +0200
> > Corinna Vinschen wrote:
> > > Please try the attached patch on top of topic/pipe.
> > 
> > Thanks for the new patch. I have confirmed that above issue
> > is fixed and select() for write pipe seems to work as expected.
> > 
> > 
> > BTW, I found one minor difference between Linux and this pipe
> > implementation.
> > [...]
> > Is this difficult to be fixed?
> Two other remarks:
> 
> 1. I think query_hdl needs to be initialized in the fhandler_pipe constructor.

No, that's not necessary.  The fhandlers are always ccalloc'ed so they
are all 0 anyway.

> 2. When the read side of the pipe is non-blocking, there can be no pending
> reads, so shouldn't we be able to use WriteQuotaAvailable reliably on the
> write side?  (I can't test this at the moment.)

In theory, yes, but is it a safe bet that non-blocking reads won't change
WriteQuotaAvailable on the write side, at least for a very short time?
The question is, of course, if that really makes much of a difference.

> This applies in particular
> to the call to pipe_data_available at the end of peek_fifo, since all fifo
> readers use non-blocking pipes.  Maybe pipe_data_available needs an extra
> parameter so the caller can specify that WriteQuotaAvailable should be used.

I can fix up my patch to accommodate that.


Corinna

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-08-31 11:05                                                 ` Takashi Yano
@ 2021-08-31 15:20                                                   ` Corinna Vinschen
  2021-09-01  2:39                                                     ` Takashi Yano
  0 siblings, 1 reply; 250+ messages in thread
From: Corinna Vinschen @ 2021-08-31 15:20 UTC (permalink / raw)
  To: cygwin-developers

On Aug 31 20:05, Takashi Yano wrote:
> On Tue, 31 Aug 2021 11:04:05 +0200
> Corinna Vinschen wrote:
> > On Aug 31 17:52, Takashi Yano wrote:
> > > On Mon, 30 Aug 2021 17:19:44 +0200
> > > Corinna Vinschen wrote:
> > > > On Aug 30 11:00, Ken Brown wrote:
> > > > > On 8/30/2021 9:51 AM, Ken Brown wrote:
> > > > > > On 8/30/2021 8:55 AM, Corinna Vinschen wrote:
> > > > > > > On Aug 30 21:04, Takashi Yano wrote:
> > > > > > > No worries.  The same should apply to the NtCreateFile side of the
> > > > > > > pipe, btw.
> > > > > > 
> > > > > > I'll add my thanks.  I should have checked the default flags that are
> > > > > > typically used for other devices when I wrote nt_create.  I'm glad you
> > > > > > caught this.
> > > > > > 
> > > > > > So I'll reinstate the use of nt_create and then let Takashi recheck everything.
> > > > > 
> > > > > I've done this now.  I'm still not sure I've got all the flags right.  For
> > > > > unknown reasons, I've used FILE_SHARE_READ | FILE_SHARE_WRITE in the call to
> > > > > NtCreateNamedPipeFile, and no sharing in the call to NtOpenFile.  Should I
> > > > > also use FILE_SHARE_READ | FILE_SHARE in NtOpenFile?  Is sharing even
> > > > > relevant in this context?
> > > > 
> > > > This is only relevant if you want to open the pipe from another context,
> > > > calling CreateNamedPipe/CreateFile.  As long as the pipe is only
> > > > duplicated, it shouldn't matter at all.
> > > > 
> > > > But, as I just wrote in my previous mail, the FILE_SYNCHRONOUS_IO_NONALERT
> > > > flag is probably a good thing for C# apps, but not for Cygwin, because it
> > > > enforces synchronous operation.  Sorry about that...
> > > 
> > > With FILE_SYNCHRONOUS_IO_NONALERT, what kind of problems are you
> > > specifically concerned about cygwin pipe? 
> > 
> > We're using asynchronous IO to be able to call WFMO and thus to be able
> > to handle signals and thread cancellation events.  With synchronous IO
> > this is not possible.
> 
> Thanks. How can I reproduce the above issue? Stopping by Ctrl-C or killing
> the process by kill seems to work even with FILE_SYNCHRONOUS_IO_NONALERT.

It may depend on the thread you're running this in.  But really, just
call a blocking (SYNCHRONIZE + FILE_SYNCHRONOUS_IO_NONALERT) ReadFile
in the main thread of a Cygwin app, and you'll see that neither Ctrl-C
nor kill signalling will get through.

> Where is the WFMO called for pipe handle?

The cygwait function.


Corinna

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-08-31 15:18                                                       ` Corinna Vinschen
@ 2021-08-31 15:27                                                         ` Corinna Vinschen
  2021-08-31 15:50                                                           ` Corinna Vinschen
  0 siblings, 1 reply; 250+ messages in thread
From: Corinna Vinschen @ 2021-08-31 15:27 UTC (permalink / raw)
  To: cygwin-developers

On Aug 31 17:18, Corinna Vinschen wrote:
> On Aug 31 08:33, Ken Brown wrote:
> > On 8/31/2021 7:45 AM, Takashi Yano wrote:
> > > On Tue, 31 Aug 2021 12:18:57 +0200
> > > Corinna Vinschen wrote:
> > > > Please try the attached patch on top of topic/pipe.
> > > 
> > > Thanks for the new patch. I have confirmed that above issue
> > > is fixed and select() for write pipe seems to work as expected.
> > > 
> > > 
> > > BTW, I found one minor difference between Linux and this pipe
> > > implementation.
> > > [...]
> > > Is this difficult to be fixed?
> > Two other remarks:
> > 
> > 1. I think query_hdl needs to be initialized in the fhandler_pipe constructor.
> 
> No, that's not necessary.  The fhandlers are always ccalloc'ed so they
> are all 0 anyway.
> 
> > 2. When the read side of the pipe is non-blocking, there can be no pending
> > reads, so shouldn't we be able to use WriteQuotaAvailable reliably on the
> > write side?  (I can't test this at the moment.)
> 
> In theory, yes, but is it a safe bet that non-blocking reads won't change
> WriteQuotaAvailable on the write side, at least for a very short time?
> The question is, of course, if that really makes much of a difference.

Oh, btw... why do you want to use WriteQuotaAvailable for normal
pipes, even though the read side information is available anyway?

We can do that for fifos, no problem, but it doesn't make much sense
to distinguish between blocking and non-blocking pipes, the code flow
is the same.


Corinna

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-08-31 15:27                                                         ` Corinna Vinschen
@ 2021-08-31 15:50                                                           ` Corinna Vinschen
  2021-08-31 16:19                                                             ` Ken Brown
  2021-08-31 23:02                                                             ` Takashi Yano
  0 siblings, 2 replies; 250+ messages in thread
From: Corinna Vinschen @ 2021-08-31 15:50 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 1933 bytes --]

On Aug 31 17:27, Corinna Vinschen wrote:
> On Aug 31 17:18, Corinna Vinschen wrote:
> > On Aug 31 08:33, Ken Brown wrote:
> > > On 8/31/2021 7:45 AM, Takashi Yano wrote:
> > > > On Tue, 31 Aug 2021 12:18:57 +0200
> > > > Corinna Vinschen wrote:
> > > > > Please try the attached patch on top of topic/pipe.
> > > > 
> > > > Thanks for the new patch. I have confirmed that above issue
> > > > is fixed and select() for write pipe seems to work as expected.
> > > > 
> > > > 
> > > > BTW, I found one minor difference between Linux and this pipe
> > > > implementation.
> > > > [...]
> > > > Is this difficult to be fixed?
> > > Two other remarks:
> > > 
> > > 1. I think query_hdl needs to be initialized in the fhandler_pipe constructor.
> > 
> > No, that's not necessary.  The fhandlers are always ccalloc'ed so they
> > are all 0 anyway.
> > 
> > > 2. When the read side of the pipe is non-blocking, there can be no pending
> > > reads, so shouldn't we be able to use WriteQuotaAvailable reliably on the
> > > write side?  (I can't test this at the moment.)
> > 
> > In theory, yes, but is it a safe bet that non-blocking reads won't change
> > WriteQuotaAvailable on the write side, at least for a very short time?
> > The question is, of course, if that really makes much of a difference.
> 
> Oh, btw... why do you want to use WriteQuotaAvailable for normal
> pipes, even though the read side information is available anyway?
> 
> We can do that for fifos, no problem, but it doesn't make much sense
> to differ between blocking and non-blocking pipes, the code flow is the
> same.

So for the time being I suggest the below patch on top of topic/pipe.
It contains everything we discussed so far.

One question left is, do we want to switch to FILE_PIPE_BYTE_STREAM_TYPE
entirely for pipes?  I don't see that it's still necessary to use
FILE_PIPE_MESSAGE_TYPE for pipes.  Everything seems to work normally
with byte-type pipes.

[-- Attachment #2: pipe.diff --]
[-- Type: text/plain, Size: 10392 bytes --]

diff --git a/winsup/cygwin/fhandler.h b/winsup/cygwin/fhandler.h
index 132e6002133b..1f0f28077a7c 100644
--- a/winsup/cygwin/fhandler.h
+++ b/winsup/cygwin/fhandler.h
@@ -1171,6 +1171,7 @@ class fhandler_socket_unix : public fhandler_socket
 class fhandler_pipe: public fhandler_base
 {
 private:
+  HANDLE query_hdl;
   pid_t popen_pid;
   size_t max_atomic_write;
   void set_pipe_non_blocking (bool nonblocking);
@@ -1179,6 +1180,8 @@ public:
 
   bool ispipe() const { return true; }
 
+  HANDLE get_query_handle () const { return query_hdl; }
+
   void set_popen_pid (pid_t pid) {popen_pid = pid;}
   pid_t get_popen_pid () const {return popen_pid;}
   off_t lseek (off_t offset, int whence);
@@ -1187,7 +1190,9 @@ public:
   select_record *select_except (select_stuff *);
   char *get_proc_fd_name (char *buf);
   int open (int flags, mode_t mode = 0);
+  void fixup_after_fork (HANDLE);
   int dup (fhandler_base *child, int);
+  int close ();
   void __reg3 raw_read (void *ptr, size_t& len);
   ssize_t __reg3 raw_write (const void *ptr, size_t len);
   int ioctl (unsigned int cmd, void *);
diff --git a/winsup/cygwin/fhandler_pipe.cc b/winsup/cygwin/fhandler_pipe.cc
index 2dec0a84817c..2d9e87bb3450 100644
--- a/winsup/cygwin/fhandler_pipe.cc
+++ b/winsup/cygwin/fhandler_pipe.cc
@@ -352,8 +352,30 @@ fhandler_pipe::raw_write (const void *ptr, size_t len)
       else
 	len1 = (ULONG) left;
       nbytes_now = 0;
-      status = NtWriteFile (get_handle (), evt, NULL, NULL, &io,
-			    (PVOID) ptr, len1, NULL, NULL);
+      while (true)
+	{
+	  status = NtWriteFile (get_handle (), evt, NULL, NULL, &io,
+				(PVOID) ptr, len1, NULL, NULL);
+	  if (evt || !NT_SUCCESS (status) || io.Information > 0)
+	    break;
+
+	  FILE_PIPE_LOCAL_INFORMATION fpli;
+	  IO_STATUS_BLOCK qio;
+
+	  if (!NT_SUCCESS (NtQueryInformationFile (query_hdl, &qio, &fpli,
+			   sizeof (fpli), FilePipeLocalInformation)))
+	    len1 >>= 1;
+	  else
+	    {
+	      fpli.WriteQuotaAvailable = fpli.InboundQuota
+					 - fpli.ReadDataAvailable;
+	      if (len1 > fpli.WriteQuotaAvailable
+		  && fpli.WriteQuotaAvailable > 0)
+		len1 = fpli.InboundQuota - fpli.ReadDataAvailable;
+	      else
+		break;
+	    }
+	}
       if (evt && status == STATUS_PENDING)
 	{
 	  waitret = cygwait (evt);
@@ -378,7 +400,7 @@ fhandler_pipe::raw_write (const void *ptr, size_t len)
 	  /* NtWriteFile returns success with # of bytes written == 0
 	     if writing on a non-blocking pipe fails because the pipe
 	     buffer doesn't have sufficient space. */
-	  if (nbytes_now == 0)
+	  if (nbytes_now == 0 && nbytes == 0)
 	    set_errno (EAGAIN);
 	  ptr = ((char *) ptr) + chunk;
 	  nbytes += nbytes_now;
@@ -405,22 +427,45 @@ fhandler_pipe::raw_write (const void *ptr, size_t len)
   return ret;
 }
 
+void
+fhandler_pipe::fixup_after_fork (HANDLE parent)
+{
+  if (query_hdl)
+    fork_fixup (parent, query_hdl, "query_hdl");
+  fhandler_base::fixup_after_fork (parent);
+}
+
 int
 fhandler_pipe::dup (fhandler_base *child, int flags)
 {
   fhandler_pipe *ftp = (fhandler_pipe *) child;
   ftp->set_popen_pid (0);
 
-  int res;
-  if (get_handle () && fhandler_base::dup (child, flags))
+  int res = 0;
+  if (fhandler_base::dup (child, flags))
     res = -1;
-  else
-    res = 0;
+  else if (query_hdl &&
+	   !DuplicateHandle (GetCurrentProcess (), query_hdl,
+			     GetCurrentProcess (), &ftp->query_hdl,
+			     0, !(flags & O_CLOEXEC), DUPLICATE_SAME_ACCESS))
+    {
+      __seterrno ();
+      ftp->close ();
+      res = -1;
+    }
 
   debug_printf ("res %d", res);
   return res;
 }
 
+int
+fhandler_pipe::close ()
+{
+  if (query_hdl)
+    NtClose (query_hdl);
+  return fhandler_base::close ();
+}
+
 #define PIPE_INTRO "\\\\.\\pipe\\cygwin-"
 
 /* Create a pipe, and return handles to the read and write ends,
@@ -608,6 +653,7 @@ fhandler_pipe::create (fhandler_pipe *fhs[2], unsigned psize, int mode)
   else if ((fhs[1] = (fhandler_pipe *) build_fh_dev (*pipew_dev)) == NULL)
     {
       delete fhs[0];
+      CloseHandle (r);
       CloseHandle (w);
     }
   else
@@ -617,10 +663,23 @@ fhandler_pipe::create (fhandler_pipe *fhs[2], unsigned psize, int mode)
 		    unique_id);
       fhs[1]->init (w, FILE_CREATE_PIPE_INSTANCE | GENERIC_WRITE, mode,
 		    unique_id);
-      res = 0;
+      /* For the write side of the pipe, duplicate the handle to the read side
+	 into query_hdl just for calling NtQueryInformationFile.  See longish
+	 comment in select.cc, pipe_data_available() for the reasoning. */
+      if (!DuplicateHandle (GetCurrentProcess (), r, GetCurrentProcess (),
+			    &fhs[1]->query_hdl, GENERIC_READ,
+			    !(mode & O_CLOEXEC), 0))
+	{
+	  delete fhs[0];
+	  CloseHandle (r);
+	  delete fhs[1];
+	  CloseHandle (w);
+	}
+      else
+	res = 0;
     }
 
-  debug_printf ("%R = pipe([%p, %p], %d, %y)", res, fhs[0], fhs[1], psize, mode);
+  debug_printf ("%R = pipe(%d, %y)", res, psize, mode);
   return res;
 }
 
@@ -658,10 +717,10 @@ nt_create (LPSECURITY_ATTRIBUTES sa_ptr, PHANDLE r, PHANDLE w,
 				 &cygheap->installation_key,
 				 GetCurrentProcessId ());
 
-  access = GENERIC_READ | FILE_WRITE_ATTRIBUTES;
+  access = GENERIC_READ | FILE_WRITE_ATTRIBUTES | SYNCHRONIZE;
 
   ULONG pipe_type = pipe_byte ? FILE_PIPE_BYTE_STREAM_TYPE
-    : FILE_PIPE_MESSAGE_TYPE;
+			      : FILE_PIPE_MESSAGE_TYPE;
 
   /* Retry NtCreateNamedPipeFile as long as the pipe name is in use.
      Retrying will probably never be necessary, but we want
@@ -737,7 +796,7 @@ nt_create (LPSECURITY_ATTRIBUTES sa_ptr, PHANDLE r, PHANDLE w,
     {
       debug_printf ("NtOpenFile: name %S", &pipename);
 
-      access = GENERIC_WRITE | FILE_READ_ATTRIBUTES;
+      access = GENERIC_WRITE | FILE_READ_ATTRIBUTES | SYNCHRONIZE;
       status = NtOpenFile (w, access, &attr, &io, 0, 0);
       if (!NT_SUCCESS (status))
 	{
diff --git a/winsup/cygwin/select.cc b/winsup/cygwin/select.cc
index 83e1c00e0ac7..dc1f7961351b 100644
--- a/winsup/cygwin/select.cc
+++ b/winsup/cygwin/select.cc
@@ -585,7 +585,8 @@ no_verify (select_record *, fd_set *, fd_set *, fd_set *)
 }
 
 static int
-pipe_data_available (int fd, fhandler_base *fh, HANDLE h, bool writing)
+pipe_data_available (int fd, fhandler_base *fh, HANDLE h, bool writing,
+		     bool use_readside)
 {
   IO_STATUS_BLOCK iosb = {{0}, 0};
   FILE_PIPE_LOCAL_INFORMATION fpli = {0};
@@ -608,15 +609,34 @@ pipe_data_available (int fd, fhandler_base *fh, HANDLE h, bool writing)
     }
   if (writing)
     {
-	/* If there is anything available in the pipe buffer then signal
-	   that.  This means that a pipe could still block since you could
-	   be trying to write more to the pipe than is available in the
-	   buffer but that is the hazard of select().  */
-      fpli.WriteQuotaAvailable = fpli.OutboundQuota - fpli.ReadDataAvailable;
+      /* If there is anything available in the pipe buffer then signal
+	 that.  This means that a pipe could still block since you could
+	 be trying to write more to the pipe than is available in the
+	 buffer but that is the hazard of select().
+
+	 Note that WriteQuotaAvailable is unreliable.
+
+	 Usually WriteQuotaAvailable on the write side reflects the space
+	 available in the inbound buffer on the read side.  However, if a
+	 pipe read is currently pending, WriteQuotaAvailable on the write side
+	 is decremented by the number of bytes the read side is requesting.
+	 So it's possible (even likely) that WriteQuotaAvailable is 0, even
+	 if the inbound buffer on the read side is not full.  This can lead to
+	 a deadlock situation: The reader is waiting for data, but select
+	 on the writer side assumes that no space is available in the read
+	 side inbound buffer.
+
+	 Consequently, the only reliable information is available on the
+	 read side, so fetch info from the read side via the pipe-specific
+	 query handle.  Use fpli.WriteQuotaAvailable as storage for the actual
+	 interesting value, which is the InboundQuota on the read side,
+	 decremented by the number of bytes of data in that buffer. */
+      if (use_readside)
+	fpli.WriteQuotaAvailable = fpli.InboundQuota - fpli.ReadDataAvailable;
       if (fpli.WriteQuotaAvailable > 0)
 	{
 	  paranoid_printf ("fd %d, %s, write: size %u, avail %u", fd,
-			   fh->get_name (), fpli.OutboundQuota,
+			   fh->get_name (), fpli.InboundQuota,
 			   fpli.WriteQuotaAvailable);
 	  return 1;
 	}
@@ -684,10 +704,11 @@ peek_pipe (select_record *s, bool from_select)
 	  gotone = s->read_ready = true;
 	  goto out;
 	}
-      int n = pipe_data_available (s->fd, fh, h, false);
+      int n = pipe_data_available (s->fd, fh, h, false, false);
       /* On PTY masters, check if input from the echo pipe is available. */
       if (n == 0 && fh->get_echo_handle ())
-	n = pipe_data_available (s->fd, fh, fh->get_echo_handle (), false);
+	n = pipe_data_available (s->fd, fh, fh->get_echo_handle (), false,
+				 false);
 
       if (n < 0)
 	{
@@ -718,10 +739,16 @@ out:
       fhandler_pty_master *fhm = (fhandler_pty_master *) fh;
       fhm->set_mask_flusho (s->read_ready);
     }
-  h = fh->get_output_handle ();
   if (s->write_selected && dev != FH_PIPER)
     {
-      gotone += s->write_ready =  pipe_data_available (s->fd, fh, h, true);
+      /* For the write side of a pipe, fetch the handle to the read side.
+	 See the longish comment in pipe_data_available for the reasoning. */
+      if (dev == FH_PIPEW)
+	h = ((fhandler_pipe *) fh)->get_query_handle ();
+      else
+	h = fh->get_output_handle ();
+      gotone += s->write_ready = pipe_data_available (s->fd, fh, h, true,
+						      dev == FH_PIPEW);
       select_printf ("write: %s, gotone %d", fh->get_name (), gotone);
     }
   return gotone;
@@ -922,7 +949,7 @@ out:
   if (s->write_selected)
     {
       gotone += s->write_ready
-	= pipe_data_available (s->fd, fh, fh->get_handle (), true);
+	= pipe_data_available (s->fd, fh, fh->get_handle (), true, false);
       select_printf ("write: %s, gotone %d", fh->get_name (), gotone);
     }
   return gotone;
@@ -1368,7 +1395,8 @@ out:
   HANDLE h = ptys->get_output_handle ();
   if (s->write_selected)
     {
-      gotone += s->write_ready =  pipe_data_available (s->fd, fh, h, true);
+      gotone += s->write_ready =  pipe_data_available (s->fd, fh, h, true,
+						       false);
       select_printf ("write: %s, gotone %d", fh->get_name (), gotone);
     }
   return gotone;
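
The reasoning in the pipe_data_available comment can be illustrated with a
small model.  This is only a sketch: pipe_info_sketch is a hypothetical
stand-in for the relevant FILE_PIPE_LOCAL_INFORMATION fields, and clamping
the reported value at 0 is an assumption made for the illustration, not
documented Windows behaviour.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the FILE_PIPE_LOCAL_INFORMATION fields that
// matter here; this is not the real Windows structure.
struct pipe_info_sketch
{
  uint32_t inbound_quota;         // size of the read side's inbound buffer
  uint32_t read_data_available;   // bytes currently sitting in that buffer
  uint32_t write_quota_available; // value as seen from the write side
};

// Model the write-side view described in the comment: the free space is
// reduced by the size of a read request that is currently pending.
// Clamping at 0 is an assumption of this sketch.
pipe_info_sketch
snapshot_with_pending_read (uint32_t quota, uint32_t buffered,
			    uint32_t pending_read)
{
  assert (buffered <= quota);
  uint32_t free_space = quota - buffered;
  uint32_t reported = pending_read >= free_space
		      ? 0 : free_space - pending_read;
  return pipe_info_sketch { quota, buffered, reported };
}

// The value the patch derives from the read side instead.
uint32_t
space_from_read_side (const pipe_info_sketch &pi)
{
  return pi.inbound_quota - pi.read_data_available;
}
```

With an empty 64K buffer and a pending 64K read, the modelled write-side
value is 0 while the read-side computation still yields 65536, which is the
situation in which select on the writer would otherwise wait forever.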

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-08-31 15:50                                                           ` Corinna Vinschen
@ 2021-08-31 16:19                                                             ` Ken Brown
  2021-08-31 16:38                                                               ` Ken Brown
  2021-08-31 23:02                                                             ` Takashi Yano
  1 sibling, 1 reply; 250+ messages in thread
From: Ken Brown @ 2021-08-31 16:19 UTC (permalink / raw)
  To: cygwin-developers

On 8/31/2021 11:50 AM, Corinna Vinschen wrote:
> On Aug 31 17:27, Corinna Vinschen wrote:
>> On Aug 31 17:18, Corinna Vinschen wrote:
>>> On Aug 31 08:33, Ken Brown wrote:
>>>> On 8/31/2021 7:45 AM, Takashi Yano wrote:
>>>>> On Tue, 31 Aug 2021 12:18:57 +0200
>>>>> Corinna Vinschen wrote:
>>>>>> Please try the attached patch on top of topic/pipe.
>>>>>
>>>>> Thanks for the new patch. I have confirmed that above issue
>>>>> is fixed and select() for write pipe seems to work as expected.
>>>>>
>>>>>
>>>>> BTW, I found one minor difference between Linux and this pipe
>>>>> implementation.
>>>>> [...]
>>>>> Is this difficult to be fixed?
>>>> Two other remarks:
>>>>
>>>> 1. I think query_hdl needs to be initialized in the fhandler_pipe constructor.
>>>
>>> No, that's not necessary.  The fhandlers are always ccalloc'ed so they
>>> are all 0 anyway.
>>>
>>>> 2. When the read side of the pipe is non-blocking, there can be no pending
>>>> reads, so shouldn't we be able to use WriteQuotaAvailable reliably on the
>>>> write side?  (I can't test this at the moment.)
>>>
>>> In theory, yes, but is it a safe bet that non-blocking reads won't change
>>> WriteQuotaAvailable on the write side, at least for a very short time?
>>> The question is, of course, if that really makes much of a difference.
>>
>> Oh, btw... why do you want to use WriteQuotaAvailable for normal
>> pipes, even though the read side information is available anyway?
>>
>> We can do that for fifos, no problem, but it doesn't make much sense
>> to differ between blocking and non-blocking pipes, the code flow is the
>> same.

Agreed.  It was mainly the fifo case that I was concerned about, and your way of 
handling that is much better than what I suggested.  [I also wondered about the 
pty case, but I see you've dealt with that in your latest patch.]

> So for the time being I suggest the below patch on top of topic/pipe.
> It contains everything we discussed so far.
> 
> One question left is, do we want to switch to FILE_PIPE_BYTE_STREAM_TYPE
> entirely for pipes?  I don't see that it's still necessary to use
> FILE_PIPE_MESSAGE_TYPE for pipes.  Everything seems to work normally
> with byte-type pipes.

Since no one remembers why we're defaulting to FILE_PIPE_MESSAGE_TYPE, I agree.
If a problem shows up, we can always rethink it.  I suggest that we still
retain the CYGWIN option, at least for a while, in case someone encounters a
problem and wants to switch back to message type for testing.

I'm afraid I still haven't had a chance to do any testing of your patch, but I 
expect to be able to do that later today.

Ken

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-08-31 16:19                                                             ` Ken Brown
@ 2021-08-31 16:38                                                               ` Ken Brown
  2021-08-31 17:30                                                                 ` Corinna Vinschen
  0 siblings, 1 reply; 250+ messages in thread
From: Ken Brown @ 2021-08-31 16:38 UTC (permalink / raw)
  To: cygwin-developers

On 8/31/2021 12:19 PM, Ken Brown wrote:
> On 8/31/2021 11:50 AM, Corinna Vinschen wrote:
>> On Aug 31 17:27, Corinna Vinschen wrote:
>>> On Aug 31 17:18, Corinna Vinschen wrote:
>>>> On Aug 31 08:33, Ken Brown wrote:
>>>>> On 8/31/2021 7:45 AM, Takashi Yano wrote:
>>>>>> On Tue, 31 Aug 2021 12:18:57 +0200
>>>>>> Corinna Vinschen wrote:
>>>>>>> Please try the attached patch on top of topic/pipe.
>>>>>>
>>>>>> Thanks for the new patch. I have confirmed that above issue
>>>>>> is fixed and select() for write pipe seems to work as expected.
>>>>>>
>>>>>>
>>>>>> BTW, I found one minor difference between Linux and this pipe
>>>>>> implementation.
>>>>>> [...]
>>>>>> Is this difficult to be fixed?
>>>>> Two other remarks:
>>>>>
>>>>> 1. I think query_hdl needs to be initialized in the fhandler_pipe constructor.
>>>>
>>>> No, that's not necessary.  The fhandlers are always ccalloc'ed so they
>>>> are all 0 anyway.
>>>>
>>>>> 2. When the read side of the pipe is non-blocking, there can be no pending
>>>>> reads, so shouldn't we be able to use WriteQuotaAvailable reliably on the
>>>>> write side?  (I can't test this at the moment.)
>>>>
>>>> In theory, yes, but is it a safe bet that non-blocking reads won't change
>>>> WriteQuotaAvailable on the write side, at least for a very short time?
>>>> The question is, of course, if that really makes much of a difference.
>>>
>>> Oh, btw... why do you want to use WriteQuotaAvailable for normal
>>> pipes, even though the read side information is available anyway?
>>>
>>> We can do that for fifos, no problem, but it doesn't make much sense
>>> to differ between blocking and non-blocking pipes, the code flow is the
>>> same.
> 
> Agreed.  It was mainly the fifo case that I was concerned about, and your way of 
> handling that is much better than what I suggested.  [I also wondered about the 
> pty case, but I see you've dealt with that in your latest patch.]
> 
>> So for the time being I suggest the below patch on top of topic/pipe.
>> It contains everything we discussed so far.
>>
>> One question left is, do we want to switch to FILE_PIPE_BYTE_STREAM_TYPE
>> entirely for pipes?  I don't see that it's still necessary to use
>> FILE_PIPE_MESSAGE_TYPE for pipes.  Everything seems to work normally
>> with byte-type pipes.
> 
> Since no one remembers why we're defaulting to FILE_PIPE_MESSAGE_TYPE, I agree. 
>   If a problem shows up, we can always rethink it.  I suggest that we still 
> retain the CYGWIN option, at least for a while, in case someone encounters a 
> problem and wants to switch back to message type for testing.
> 
> I'm afraid I still haven't had a chance to do any testing of your patch, but I 
> expect to be able to do that later today.

And here's a really trivial comment about your patch to raw_write: Where you have

   len1 = fpli.InboundQuota - fpli.ReadDataAvailable;

I think the code would be slightly clearer if you wrote

   len1 = fpli.WriteQuotaAvailable;

Ken

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-08-31 16:38                                                               ` Ken Brown
@ 2021-08-31 17:30                                                                 ` Corinna Vinschen
  2021-08-31 18:54                                                                   ` Ken Brown
  0 siblings, 1 reply; 250+ messages in thread
From: Corinna Vinschen @ 2021-08-31 17:30 UTC (permalink / raw)
  To: cygwin-developers

On Aug 31 12:38, Ken Brown wrote:
> And here's a really trivial comment about your patch to raw_write: Where you have
> 
>   len1 = fpli.InboundQuota - fpli.ReadDataAvailable;
> 
> I think the code would be slightly clearer if you wrote
> 
>   len1 = fpli.WriteQuotaAvailable;

D'oh!  That was the idea.  Apparently I forgot it in mid-air...


Thanks,
Corinna

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-08-31 17:30                                                                 ` Corinna Vinschen
@ 2021-08-31 18:54                                                                   ` Ken Brown
  2021-08-31 19:51                                                                     ` Corinna Vinschen
  0 siblings, 1 reply; 250+ messages in thread
From: Ken Brown @ 2021-08-31 18:54 UTC (permalink / raw)
  To: cygwin-developers

On 8/31/2021 1:30 PM, Corinna Vinschen wrote:
> On Aug 31 12:38, Ken Brown wrote:
>> And here's a really trivial comment about your patch to raw_write: Where you have
>>
>>    len1 = fpli.InboundQuota - fpli.ReadDataAvailable;
>>
>> I think the code would be slightly clearer if you wrote
>>
>>    len1 = fpli.WriteQuotaAvailable;
> 
> D'oh!  That was the idea.  Apparently I forgot it in mid-air...

One more thing.  For a non-blocking write, according to POSIX, "A write request 
for {PIPE_BUF} or fewer bytes shall have the following effect: if there is 
sufficient space available in the pipe, write() shall transfer all the data and 
return the number of bytes requested. Otherwise, write() shall transfer no data 
and return -1 with errno set to [EAGAIN]."

So I think the condition for breaking from the retry loop has to be changed from

   evt || !NT_SUCCESS (status) || io.Information > 0

to

   evt || !NT_SUCCESS (status) || io.Information > 0 || len <= PIPE_BUF

And I wonder if we've now uncovered a reason for using message mode: If the pipe 
was created in byte mode, might we get a partial write when len <= PIPE_BUF?  I 
see the following under "Pipes" at

   https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-writefile

"When writing to a non-blocking, byte-mode pipe handle with insufficient buffer 
space, WriteFile returns TRUE with *lpNumberOfBytesWritten < nNumberOfBytesToWrite."
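
The POSIX rule quoted above can be checked on a POSIX system with a sketch
like the following.  It is only an illustration of the required semantics,
not Cygwin code; the function name is hypothetical.  It fills a pipe, drains
fewer bytes than it is about to write, and then verifies that a non-blocking
write of at most PIPE_BUF bytes is all-or-nothing.

```cpp
#include <cerrno>
#include <climits>	// PIPE_BUF
#include <fcntl.h>
#include <unistd.h>
#include <vector>

// Returns true if a non-blocking write of len (1..PIPE_BUF) bytes to an
// almost full pipe either transferred all len bytes or failed with EAGAIN.
// A short (partial) write makes it return false.
bool
small_write_is_all_or_nothing (size_t len)
{
  if (len == 0 || len > PIPE_BUF)
    return false;
  int fds[2];
  if (pipe (fds) != 0)
    return false;
  int fl = fcntl (fds[1], F_GETFL);
  if (fl == -1 || fcntl (fds[1], F_SETFL, fl | O_NONBLOCK) == -1)
    {
      close (fds[0]);
      close (fds[1]);
      return false;
    }

  std::vector<char> buf (PIPE_BUF, 'x');
  // Fill the pipe until no further PIPE_BUF-sized chunk fits.
  while (write (fds[1], buf.data (), buf.size ()) > 0)
    ;
  // Drain fewer bytes than we are about to write, so the pipe may have
  // some free space, but not necessarily enough for len bytes.
  size_t drained = len / 2;
  bool ok = true;
  if (drained > 0 && read (fds[0], buf.data (), drained) < 0)
    ok = false;

  errno = 0;
  ssize_t n = write (fds[1], buf.data (), len);
  ok = ok && (n == (ssize_t) len || (n == -1 && errno == EAGAIN));

  close (fds[0]);
  close (fds[1]);
  return ok;
}
```

A byte-mode Windows pipe that returns a short count for such a request would
fail this check, which is the concern raised above.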

Ken

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-08-31 18:54                                                                   ` Ken Brown
@ 2021-08-31 19:51                                                                     ` Corinna Vinschen
  0 siblings, 0 replies; 250+ messages in thread
From: Corinna Vinschen @ 2021-08-31 19:51 UTC (permalink / raw)
  To: cygwin-developers

On Aug 31 14:54, Ken Brown wrote:
> On 8/31/2021 1:30 PM, Corinna Vinschen wrote:
> > On Aug 31 12:38, Ken Brown wrote:
> > > And here's a really trivial comment about your patch to raw_write: Where you have
> > > 
> > >    len1 = fpli.InboundQuota - fpli.ReadDataAvailable;
> > > 
> > > I think the code would be slightly clearer if you wrote
> > > 
> > >    len1 = fpli.WriteQuotaAvailable;
> > 
> > D'oh!  That was the idea.  Apparently I forgot it in mid-air...
> 
> One more thing.  For a non-blocking write, according to POSIX, "A write
> request for {PIPE_BUF} or fewer bytes shall have the following effect: if
> there is sufficient space available in the pipe, write() shall transfer all
> the data and return the number of bytes requested. Otherwise, write() shall
> transfer no data and return -1 with errno set to [EAGAIN]."
> 
> So I think the condition for breaking from the retry loop has to be changed from
> 
>   evt || !NT_SUCCESS (status) || io.Information > 0
> 
> to
> 
>   evt || !NT_SUCCESS (status) || io.Information > 0 || len <= PIPE_BUF

Hmm.  I wonder if we shouldn't untangle the raw_write code and handle
blocking and non-blocking writes in two different branches of an if.
That should make things much clearer, shouldn't it?

> And I wonder if we've now uncovered a reason for using message mode: If the
> pipe was created in byte mode, might we get a partial write when len <=
> PIPE_BUF?  I see the following under "Pipes" at
> 
>   https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-writefile
> 
> "When writing to a non-blocking, byte-mode pipe handle with insufficient
> buffer space, WriteFile returns TRUE with *lpNumberOfBytesWritten <
> nNumberOfBytesToWrite."

Good point.


Corinna

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-08-31 15:50                                                           ` Corinna Vinschen
  2021-08-31 16:19                                                             ` Ken Brown
@ 2021-08-31 23:02                                                             ` Takashi Yano
  2021-09-01  0:16                                                               ` Takashi Yano
  1 sibling, 1 reply; 250+ messages in thread
From: Takashi Yano @ 2021-08-31 23:02 UTC (permalink / raw)
  To: cygwin-developers

On Tue, 31 Aug 2021 17:50:14 +0200
Corinna Vinschen wrote:
> On Aug 31 17:27, Corinna Vinschen wrote:
> > On Aug 31 17:18, Corinna Vinschen wrote:
> > > On Aug 31 08:33, Ken Brown wrote:
> > > > On 8/31/2021 7:45 AM, Takashi Yano wrote:
> > > > > On Tue, 31 Aug 2021 12:18:57 +0200
> > > > > Corinna Vinschen wrote:
> > > > > > Please try the attached patch on top of topic/pipe.
> > > > > 
> > > > > Thanks for the new patch. I have confirmed that above issue
> > > > > is fixed and select() for write pipe seems to work as expected.
> > > > > 
> > > > > 
> > > > > BTW, I found one minor difference between Linux and this pipe
> > > > > implementation.
> > > > > [...]
> > > > > Is this difficult to be fixed?
> > > > Two other remarks:
> > > > 
> > > > 1. I think query_hdl needs to be initialized in the fhandler_pipe constructor.
> > > 
> > > No, that's not necessary.  The fhandlers are always ccalloc'ed so they
> > > are all 0 anyway.
> > > 
> > > > 2. When the read side of the pipe is non-blocking, there can be no pending
> > > > reads, so shouldn't we be able to use WriteQuotaAvailable reliably on the
> > > > write side?  (I can't test this at the moment.)
> > > 
> > > In theory, yes, but is it a safe bet that non-blocking reads won't change
> > > WriteQuotaAvailable on the write side, at least for a very short time?
> > > The question is, of course, if that really makes much of a difference.
> > 
> > Oh, btw... why do you want to use WriteQuotaAvailable for normal
> > pipes, even though the read side information is available anyway?
> > 
> > We can do that for fifos, no problem, but it doesn't make much sense
> > to differ between blocking and non-blocking pipes, the code flow is the
> > same.
> 
> So for the time being I suggest the below patch on top of topic/pipe.
> It contains everything we discussed so far.

One more thing: with your patch, 'git log' cannot be quit normally with 'q'.

> One question left is, do we want to switch to FILE_PIPE_BYTE_STREAM_TYPE
> entirely for pipes?  I don't see that it's still necessary to use
> FILE_PIPE_MESSAGE_TYPE for pipes.  Everything seems to work normally
> with byte-type pipes.

Byte pipe seems to work for me too.

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-08-31 23:02                                                             ` Takashi Yano
@ 2021-09-01  0:16                                                               ` Takashi Yano
  2021-09-01  8:07                                                                 ` Corinna Vinschen
  0 siblings, 1 reply; 250+ messages in thread
From: Takashi Yano @ 2021-09-01  0:16 UTC (permalink / raw)
  To: cygwin-developers

On Wed, 1 Sep 2021 08:02:20 +0900
Takashi Yano wrote:
> On Tue, 31 Aug 2021 17:50:14 +0200
> Corinna Vinschen wrote:
> > On Aug 31 17:27, Corinna Vinschen wrote:
> > > On Aug 31 17:18, Corinna Vinschen wrote:
> > > > On Aug 31 08:33, Ken Brown wrote:
> > > > > On 8/31/2021 7:45 AM, Takashi Yano wrote:
> > > > > > On Tue, 31 Aug 2021 12:18:57 +0200
> > > > > > Corinna Vinschen wrote:
> > > > > > > Please try the attached patch on top of topic/pipe.
> > > > > > 
> > > > > > Thanks for the new patch. I have confirmed that above issue
> > > > > > is fixed and select() for write pipe seems to work as expected.
> > > > > > 
> > > > > > 
> > > > > > BTW, I found one minor difference between Linux and this pipe
> > > > > > implementation.
> > > > > > [...]
> > > > > > Is this difficult to be fixed?
> > > > > Two other remarks:
> > > > > 
> > > > > 1. I think query_hdl needs to be initialized in the fhandler_pipe constructor.
> > > > 
> > > > No, that's not necessary.  The fhandlers are always ccalloc'ed so they
> > > > are all 0 anyway.
> > > > 
> > > > > 2. When the read side of the pipe is non-blocking, there can be no pending
> > > > > reads, so shouldn't we be able to use WriteQuotaAvailable reliably on the
> > > > > write side?  (I can't test this at the moment.)
> > > > 
> > > > In theory, yes, but is it a safe bet that non-blocking reads won't change
> > > > WriteQuotaAvailable on the write side, at least for a very short time?
> > > > The question is, of course, if that really makes much of a difference.
> > > 
> > > Oh, btw... why do you want to use WriteQuotaAvailable for normal
> > > pipes, even though the read side information is available anyway?
> > > 
> > > We can do that for fifos, no problem, but it doesn't make much sense
> > > to differ between blocking and non-blocking pipes, the code flow is the
> > > same.
> > 
> > So for the time being I suggest the below patch on top of topic/pipe.
> > It contains everything we discussed so far.
> 
> One more thing. 'git log' cannot stop normally with 'q' with your patch.

The same happens with 'yes |less'.

The cause is that the write side cannot detect that the read side has been
closed, because query_hdl (a handle to the read side) is still open.
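
The effect has a direct POSIX analogue, sketched below.  This is not the
Windows code path; the function name is hypothetical, and the dup()'ed read
descriptor merely plays the role that query_hdl plays in the patch.

```cpp
#include <cerrno>
#include <csignal>
#include <unistd.h>

// Writes one byte to a pipe after the original read end has been closed.
// Returns 0 if the write succeeded, otherwise the errno of the failure.
// If keep_extra_reader is true, a duplicate of the read end stays open in
// the writer, like query_hdl does on the write side of the pipe.
int
write_after_reader_closed (bool keep_extra_reader)
{
  int fds[2];
  if (pipe (fds) != 0)
    return errno;
  // Ignore SIGPIPE process-wide so write() reports EPIPE instead of the
  // process being killed.
  signal (SIGPIPE, SIG_IGN);

  int extra = keep_extra_reader ? dup (fds[0]) : -1;
  close (fds[0]);		// the "real" reader goes away

  errno = 0;
  int err = write (fds[1], "x", 1) == 1 ? 0 : errno;

  if (extra != -1)
    close (extra);
  close (fds[1]);
  return err;
}
```

With the extra read descriptor open the write succeeds, so the writer never
learns that its consumer is gone; without it the write fails with EPIPE.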

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-08-31 15:20                                                   ` Corinna Vinschen
@ 2021-09-01  2:39                                                     ` Takashi Yano
  2021-09-01  8:03                                                       ` Corinna Vinschen
  0 siblings, 1 reply; 250+ messages in thread
From: Takashi Yano @ 2021-09-01  2:39 UTC (permalink / raw)
  To: cygwin-developers

On Tue, 31 Aug 2021 17:20:25 +0200
Corinna Vinschen wrote:
> On Aug 31 20:05, Takashi Yano wrote:
> > On Tue, 31 Aug 2021 11:04:05 +0200
> > Corinna Vinschen wrote:
> > > On Aug 31 17:52, Takashi Yano wrote:
> > > > On Mon, 30 Aug 2021 17:19:44 +0200
> > > > Corinna Vinschen wrote:
> > > > > On Aug 30 11:00, Ken Brown wrote:
> > > > > > On 8/30/2021 9:51 AM, Ken Brown wrote:
> > > > > > > On 8/30/2021 8:55 AM, Corinna Vinschen wrote:
> > > > > > > > On Aug 30 21:04, Takashi Yano wrote:
> > > > > > > > No worries.  The same should apply to the NtCreateFile side of the
> > > > > > > > pipe, btw.
> > > > > > > 
> > > > > > > I'll add my thanks.  I should have checked the default flags that are
> > > > > > > typically used for other devices when I wrote nt_create.  I'm glad you
> > > > > > > caught this.
> > > > > > > 
> > > > > > > So I'll reinstate the use of nt_create and then let Takashi recheck everything.
> > > > > > 
> > > > > > I've done this now.  I'm still not sure I've got all the flags right.  For
> > > > > > unknown reasons, I've used FILE_SHARE_READ | FILE_SHARE_WRITE in the call to
> > > > > > NtCreateNamedPipeFile, and no sharing in the call to NtOpenFile.  Should I
> > > > > > also use FILE_SHARE_READ | FILE_SHARE in NtOpenFile?  Is sharing even
> > > > > > relevant in this context?
> > > > > 
> > > > > This is only relevant if you want to open the pipe from another context,
> > > > > calling CreateNamedPipe/CreateFile.  As long as the pipe is only
> > > > > duplicated, it shouldn't matter at all.
> > > > > 
> > > > > But, as I just wrote in my previous mail, the FILE_SYNCHRONOUS_IO_NONALERT
> > > > > flag is probably a good thing for C# apps, but not for Cygwin, because it
> > > > > enforces synchronous operation.  Sorry about that...
> > > > 
> > > > With FILE_SYNCHRONOUS_IO_NONALERT, what kind of problems are you
> > > > specifically concerned about cygwin pipe? 
> > > 
> > > We're using asynchronous IO to be able to call WFMO and thus to be able
> > > > to handle signals and thread cancellation events.  With synchronous IO
> > > this is not possible.
> > 
> > Thanks. How can I regenerate above issue? Stopping by Ctrl-C or killing
> > the process by kill seems to work even with FILE_SYNCHRONOUS_IO_NONALERT.
> 
> It may depend on the thread you're running this in.  But really, just
> call a blocking (SYNCHRONIZE + FILE_SYNCHRONOUS_IO_NONALERT) ReadFile
> in the main thread of a Cygwin app, and you'll see that neither Ctrl-C
> nor kill signalling will get through.

I confirmed the issue with FILE_SYNCHRONOUS_IO_NONALERT.

Thanks.


-- 
Takashi Yano <takashi.yano@nifty.ne.jp>

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-01  2:39                                                     ` Takashi Yano
@ 2021-09-01  8:03                                                       ` Corinna Vinschen
  2021-09-01  8:13                                                         ` Corinna Vinschen
  0 siblings, 1 reply; 250+ messages in thread
From: Corinna Vinschen @ 2021-09-01  8:03 UTC (permalink / raw)
  To: cygwin-developers

On Sep  1 11:39, Takashi Yano wrote:
> On Tue, 31 Aug 2021 17:20:25 +0200
> Corinna Vinschen wrote:
> > On Aug 31 20:05, Takashi Yano wrote:
> > > On Tue, 31 Aug 2021 11:04:05 +0200
> > > Corinna Vinschen wrote:
> > > > On Aug 31 17:52, Takashi Yano wrote:
> > > > > On Mon, 30 Aug 2021 17:19:44 +0200
> > > > > Corinna Vinschen wrote:
> > > > > > But, as I just wrote in my previous mail, the FILE_SYNCHRONOUS_IO_NONALERT
> > > > > > flag is probably a good thing for C# apps, but not for Cygwin, because it
> > > > > > enforces synchronous operation.  Sorry about that...
> > > > > 
> > > > > With FILE_SYNCHRONOUS_IO_NONALERT, what kind of problems are you
> > > > > specifically concerned about cygwin pipe? 
> > > > 
> > > > We're using asynchronous IO to be able to call WFMO and thus to be able
> > > > > to handle signals and thread cancellation events.  With synchronous IO
> > > > this is not possible.
> > > 
> > > Thanks. How can I regenerate above issue? Stopping by Ctrl-C or killing
> > > the process by kill seems to work even with FILE_SYNCHRONOUS_IO_NONALERT.
> > 
> > It may depend on the thread you're running this in.  But really, just
> > call a blocking (SYNCHRONIZE + FILE_SYNCHRONOUS_IO_NONALERT) ReadFile
> > in the main thread of a Cygwin app, and you'll see that neither Ctrl-C
> > nor kill signalling will get through.
> 
> I confirmed the issue with FILE_SYNCHRONOUS_IO_NONALERT.

There's, of course, a workaround.  If you start a thread for each
synchronous ReadFile/WriteFile, you can cygwait in the caller for the
thread, rather than for the event object of the async IO.  If a signal
or a thread cancellation request arrives, you can then call
CancelSynchronousIo on the ReadFile/WriteFile thread and wait for thread
termination.  I implemented something along these lines in
fhandler_disk_file::mand_lock.

Maybe something to consider?  It would certainly fix the C# issue,
but I'm reluctant to do the thread juggle for each read/write.


Corinna

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-01  0:16                                                               ` Takashi Yano
@ 2021-09-01  8:07                                                                 ` Corinna Vinschen
  2021-09-01  8:23                                                                   ` Takashi Yano
  0 siblings, 1 reply; 250+ messages in thread
From: Corinna Vinschen @ 2021-09-01  8:07 UTC (permalink / raw)
  To: cygwin-developers

On Sep  1 09:16, Takashi Yano wrote:
> On Wed, 1 Sep 2021 08:02:20 +0900
> Takashi Yano wrote:
> > On Tue, 31 Aug 2021 17:50:14 +0200
> > Corinna Vinschen wrote:
> > > So for the time being I suggest the below patch on top of topic/pipe.
> > > It contains everything we discussed so far.
> > 
> > One more thing. 'git log' cannot stop normally with 'q' with your patch.
> 
> The same happens with 'yes |less'.
> 
> The cause is that write side cannot detect closing read side because
> query_hdl (read handle) is still opened.

Oh

my

god.


That kills the entire idea of keeping the read handle :(


Corinna

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-01  8:03                                                       ` Corinna Vinschen
@ 2021-09-01  8:13                                                         ` Corinna Vinschen
  0 siblings, 0 replies; 250+ messages in thread
From: Corinna Vinschen @ 2021-09-01  8:13 UTC (permalink / raw)
  To: cygwin-developers

On Sep  1 10:03, Corinna Vinschen wrote:
> On Sep  1 11:39, Takashi Yano wrote:
> > On Tue, 31 Aug 2021 17:20:25 +0200
> > Corinna Vinschen wrote:
> > > On Aug 31 20:05, Takashi Yano wrote:
> > > > On Tue, 31 Aug 2021 11:04:05 +0200
> > > > Corinna Vinschen wrote:
> > > > > On Aug 31 17:52, Takashi Yano wrote:
> > > > > > With FILE_SYNCHRONOUS_IO_NONALERT, what kind of problems are you
> > > > > > specifically concerned about cygwin pipe? 
> > > > > 
> > > > > We're using asynchronous IO to be able to call WFMO and thus to be able
> > > > > > to handle signals and thread cancellation events.  With synchronous IO
> > > > > this is not possible.
> > > > 
> > > > Thanks. How can I regenerate above issue? Stopping by Ctrl-C or killing
> > > > the process by kill seems to work even with FILE_SYNCHRONOUS_IO_NONALERT.
> > > 
> > > It may depend on the thread you're running this in.  But really, just
> > > call a blocking (SYNCHRONIZE + FILE_SYNCHRONOUS_IO_NONALERT) ReadFile
> > > in the main thread of a Cygwin app, and you'll see that neither Ctrl-C
> > > nor kill signalling will get through.
> > 
> > I confirmed the issue with FILE_SYNCHRONOUS_IO_NONALERT.
> 
> There's, of course, a workaround.  If you start a thread for each
> synchronous ReadFile/WriteFile, you can cygwait in the caller for the
> thread, rather than for the event object of the async IO.  If a signal
> or a thread cancellation request arrives, you can then call
> CancelSynchronousIo on the ReadFile/WriteFile thread and wait for thread
> termination.  I implemented something along these lines in
> fhandler_disk_file::mand_lock.
> 
> Maybe something to consider?  It would certainly fix the C# issue,
> but I'm reluctant to do the thread juggle for each read/write.

Oh, yeah, there's still the problem that NtQueryInformationFile on the
read side hangs when called during a synchronous ReadFile, as I learned
two days ago...


Corinna


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-01  8:07                                                                 ` Corinna Vinschen
@ 2021-09-01  8:23                                                                   ` Takashi Yano
  2021-09-01  8:46                                                                     ` Corinna Vinschen
  0 siblings, 1 reply; 250+ messages in thread
From: Takashi Yano @ 2021-09-01  8:23 UTC (permalink / raw)
  To: cygwin-developers

On Wed, 1 Sep 2021 10:07:48 +0200
Corinna Vinschen wrote:
> On Sep  1 09:16, Takashi Yano wrote:
> > On Wed, 1 Sep 2021 08:02:20 +0900
> > Takashi Yano wrote:
> > > On Tue, 31 Aug 2021 17:50:14 +0200
> > > Corinna Vinschen wrote:
> > > > So for the time being I suggest the below patch on top of topic/pipe.
> > > > It contains everything we discussed so far.
> > > 
> > > One more thing. 'git log' cannot stop normally with 'q' with your patch.
> > 
> > The same happens with 'yes |less'.
> > 
> > The cause is that write side cannot detect closing read side because
> > query_hdl (read handle) is still opened.
> 
> Oh
> 
> my
> 
> god.
> 
> 
> That kills the entire idea of keeping the read handle :(

One idea is:

Count the open read handles and write handles using NtQueryObject().
If the two counts are equal, only the write side (each write handle
paired with its query_hdl) is alive.
In this case, write() returns an error.
If the read side is alive, the number of read handles is greater than
the number of write handles.

But atomicity should be considered.
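
The counting rule can be written down as a small model.  This is only
a sketch under assumptions, not Cygwin code: PipeHandleCounts and
count_handles are hypothetical names, and the two counts stand in for
the HandleCount values that NtQueryObject() would report for the write
handle and for query_hdl.

```cpp
#include <cstddef>

// Hypothetical model of the handle bookkeeping, not the Cygwin code.
// Every write end owns one write handle plus one extra read handle
// (query_hdl); every read end owns one read handle.
struct PipeHandleCounts
{
  std::size_t write_handles; // open handles to the write side
  std::size_t read_handles;  // open handles to the read side, query_hdl included
};

// Counts the system would report for a pipe with this many open ends.
inline PipeHandleCounts
count_handles (std::size_t readers, std::size_t writers)
{
  return PipeHandleCounts { writers, readers + writers };
}

// True if only write ends (and their query_hdl copies) are left.
inline bool
reader_closed (const PipeHandleCounts &c)
{
  return c.read_handles == c.write_handles;
}
```

In the real thing the two counts come from two separate queries, so
they are not a consistent snapshot if a handle is opened or closed in
between.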

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-01  8:23                                                                   ` Takashi Yano
@ 2021-09-01  8:46                                                                     ` Corinna Vinschen
  2021-09-01 12:56                                                                       ` Ken Brown
  2021-09-02  8:15                                                                       ` Takashi Yano
  0 siblings, 2 replies; 250+ messages in thread
From: Corinna Vinschen @ 2021-09-01  8:46 UTC (permalink / raw)
  To: cygwin-developers

On Sep  1 17:23, Takashi Yano wrote:
> On Wed, 1 Sep 2021 10:07:48 +0200
> Corinna Vinschen wrote:
> > On Sep  1 09:16, Takashi Yano wrote:
> > > On Wed, 1 Sep 2021 08:02:20 +0900
> > > Takashi Yano wrote:
> > > > On Tue, 31 Aug 2021 17:50:14 +0200
> > > > Corinna Vinschen wrote:
> > > > > So for the time being I suggest the below patch on top of topic/pipe.
> > > > > It contains everything we discussed so far.
> > > > 
> > > > One more thing. 'git log' cannot stop normally with 'q' with your patch.
> > > 
> > > The same happens with 'yes |less'.
> > > 
> > > The cause is that write side cannot detect closing read side because
> > > query_hdl (read handle) is still opened.
> > 
> > Oh
> > 
> > my
> > 
> > god.
> > 
> > 
> > That kills the entire idea of keeping the read handle :(
> 
> One idea is:
> 
> Count the open read handles and write handles using NtQueryObject().
> If the two counts are equal, only the write side (each write handle
> paired with its query_hdl) is alive.
> In this case, write() returns an error.
> If the read side is alive, the number of read handles is greater than
> the number of write handles.

Interesting idea.  But where do you do the count?  The event object
will not get signalled, so WFMO will not return when performing a
blocking write.


Corinna


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-01  8:46                                                                     ` Corinna Vinschen
@ 2021-09-01 12:56                                                                       ` Ken Brown
  2021-09-01 13:52                                                                         ` Corinna Vinschen
  2021-09-02  8:15                                                                       ` Takashi Yano
  1 sibling, 1 reply; 250+ messages in thread
From: Ken Brown @ 2021-09-01 12:56 UTC (permalink / raw)
  To: cygwin-developers

On 9/1/2021 4:46 AM, Corinna Vinschen wrote:
> On Sep  1 17:23, Takashi Yano wrote:
>> On Wed, 1 Sep 2021 10:07:48 +0200
>> Corinna Vinschen wrote:
>>> On Sep  1 09:16, Takashi Yano wrote:
>>>> On Wed, 1 Sep 2021 08:02:20 +0900
>>>> Takashi Yano wrote:
>>>>> On Tue, 31 Aug 2021 17:50:14 +0200
>>>>> Corinna Vinschen wrote:
>>>>>> So for the time being I suggest the below patch on top of topic/pipe.
>>>>>> It contains everything we discussed so far.
>>>>>
>>>>> One more thing. 'git log' cannot stop normally with 'q' with your patch.
>>>>
>>>> The same happens with 'yes |less'.
>>>>
>>>> The cause is that write side cannot detect closing read side because
>>>> query_hdl (read handle) is still opened.
>>>
>>> Oh
>>>
>>> my
>>>
>>> god.
>>>
>>>
>>> That kills the entire idea of keeping the read handle :(
>>
>> One idea is:
>>
>> Count the open read handles and write handles using NtQueryObject().
>> If the two counts are equal, only the write side (each write handle
>> paired with its query_hdl) is alive.
>> In this case, write() returns an error.
>> If the read side is alive, the number of read handles is greater than
>> the number of write handles.
> 
> Interesting idea.  But where do you do the count?  The event object
> will not get signalled, so WFMO will not return when performing a
> blocking write.

What if we create an event that we signal every time a reader closes, and we add 
that to the events that WFMO is waiting for?

If this doesn't work for some reason, a different (but more complicated) idea is 
to keep a count of the number of open readers in shared memory.  When this is 0, 
write returns an error.  I'm thinking of shared memory as in topic/af_unix 
(which I copied in the fifo implementation), but maybe something simpler would 
work since we only have a single variable to keep track of.

Ken



* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-01 12:56                                                                       ` Ken Brown
@ 2021-09-01 13:52                                                                         ` Corinna Vinschen
  2021-09-01 23:02                                                                           ` Ken Brown
  0 siblings, 1 reply; 250+ messages in thread
From: Corinna Vinschen @ 2021-09-01 13:52 UTC (permalink / raw)
  To: cygwin-developers

On Sep  1 08:56, Ken Brown wrote:
> On 9/1/2021 4:46 AM, Corinna Vinschen wrote:
> > On Sep  1 17:23, Takashi Yano wrote:
> > > On Wed, 1 Sep 2021 10:07:48 +0200
> > > Corinna Vinschen wrote:
> > > > On Sep  1 09:16, Takashi Yano wrote:
> > > > > On Wed, 1 Sep 2021 08:02:20 +0900
> > > > > Takashi Yano wrote:
> > > > > > On Tue, 31 Aug 2021 17:50:14 +0200
> > > > > > Corinna Vinschen wrote:
> > > > > > > So for the time being I suggest the below patch on top of topic/pipe.
> > > > > > > It contains everything we discussed so far.
> > > > > > 
> > > > > > One more thing. 'git log' cannot stop normally with 'q' with your patch.
> > > > > 
> > > > > The same happens with 'yes |less'.
> > > > > 
> > > > > The cause is that write side cannot detect closing read side because
> > > > > query_hdl (read handle) is still opened.
> > > > 
> > > > Oh
> > > > 
> > > > my
> > > > 
> > > > god.
> > > > 
> > > > 
> > > > That kills the entire idea of keeping the read handle :(
> > > 
> > > One idea is:
> > > 
> > > > Count the open read handles and write handles using NtQueryObject().
> > > > If the two counts are equal, only the write side (each write handle
> > > > paired with its query_hdl) is alive.
> > > > In this case, write() returns an error.
> > > > If the read side is alive, the number of read handles is greater than
> > > > the number of write handles.
> > 
> > Interesting idea.  But where do you do the count?  The event object
> > will not get signalled, so WFMO will not return when performing a
> > blocking write.
> 
> What if we create an event that we signal every time a reader closes, and we
> add that to the events that WFMO is waiting for?
> 
> If this doesn't work for some reason, a different (but more complicated)
> idea is to keep a count of the number of open readers in shared memory.
> When this is 0, write returns an error.  I'm thinking of shared memory as in
> topic/af_unix (which I copied in the fifo implementation), but maybe
> something simpler would work since we only have a single variable to keep
> track of.

Great idea that.  What we need would be some semaphore upside down.
One that can be used to count items and which is signalled if it's
down to zero.

Hmm, something to think about...
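
To make the "semaphore upside down" concrete, here is a minimal
in-process sketch in portable C++.  It is an illustration under
assumptions, not a proposal for the actual implementation: the class
name zero_signalled_counter and its methods are hypothetical, and a
real version would have to work across processes.

```cpp
#include <condition_variable>
#include <cstddef>
#include <mutex>

// Hypothetical "upside-down semaphore": it counts items and releases
// its waiters whenever the count is zero.
class zero_signalled_counter
{
  std::mutex mtx;
  std::condition_variable cv;
  std::size_t count = 0;

public:
  // Register one more item (e.g. one more open reader).
  void add ()
  {
    std::lock_guard<std::mutex> lock (mtx);
    ++count;
  }

  // Unregister one item.  Returns true if this call dropped the count
  // to zero; in that case all waiters are woken up.
  bool remove ()
  {
    std::lock_guard<std::mutex> lock (mtx);
    if (count == 0)
      return false;             // nothing registered, do not underflow
    const bool now_zero = (--count == 0);
    if (now_zero)
      cv.notify_all ();
    return now_zero;
  }

  bool is_zero ()
  {
    std::lock_guard<std::mutex> lock (mtx);
    return count == 0;
  }

  // Block until the count is zero; returns at once if it already is.
  void wait_zero ()
  {
    std::unique_lock<std::mutex> lock (mtx);
    cv.wait (lock, [this] { return count == 0; });
  }
};
```

A writer would block in wait_zero() (or poll is_zero()) while every
reader calls add() on open and remove() on close.  A reader that goes
away without calling remove() leaves the count stale.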


Corinna

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-01 13:52                                                                         ` Corinna Vinschen
@ 2021-09-01 23:02                                                                           ` Ken Brown
  2021-09-02  8:17                                                                             ` Corinna Vinschen
  0 siblings, 1 reply; 250+ messages in thread
From: Ken Brown @ 2021-09-01 23:02 UTC (permalink / raw)
  To: cygwin-developers

On 9/1/2021 9:52 AM, Corinna Vinschen wrote:
> On Sep  1 08:56, Ken Brown wrote:
>> On 9/1/2021 4:46 AM, Corinna Vinschen wrote:
>>> On Sep  1 17:23, Takashi Yano wrote:
>>>> On Wed, 1 Sep 2021 10:07:48 +0200
>>>> Corinna Vinschen wrote:
>>>>> On Sep  1 09:16, Takashi Yano wrote:
>>>>>> On Wed, 1 Sep 2021 08:02:20 +0900
>>>>>> Takashi Yano wrote:
>>>>>>> On Tue, 31 Aug 2021 17:50:14 +0200
>>>>>>> Corinna Vinschen wrote:
>>>>>>>> So for the time being I suggest the below patch on top of topic/pipe.
>>>>>>>> It contains everything we discussed so far.
>>>>>>>
>>>>>>> One more thing. 'git log' cannot stop normally with 'q' with your patch.
>>>>>>
>>>>>> The same happens with 'yes |less'.
>>>>>>
>>>>>> The cause is that write side cannot detect closing read side because
>>>>>> query_hdl (read handle) is still opened.
>>>>>
>>>>> Oh
>>>>>
>>>>> my
>>>>>
>>>>> god.
>>>>>
>>>>>
>>>>> That kills the entire idea of keeping the read handle :(
>>>>
>>>> One idea is:
>>>>
>>>> Count the open read handles and write handles using NtQueryObject().
>>>> If the two counts are equal, only the write side (each write handle
>>>> paired with its query_hdl) is alive.
>>>> In this case, write() returns an error.
>>>> If the read side is alive, the number of read handles is greater than
>>>> the number of write handles.
>>>
>>> Interesting idea.  But where do you do the count?  The event object
>>> will not get signalled, so WFMO will not return when performing a
>>> blocking write.
>>
>> What if we create an event that we signal every time a reader closes, and we
>> add that to the events that WFMO is waiting for?
>>
>> If this doesn't work for some reason, a different (but more complicated)
>> idea is to keep a count of the number of open readers in shared memory.
>> When this is 0, write returns an error.  I'm thinking of shared memory as in
>> topic/af_unix (which I copied in the fifo implementation), but maybe
>> something simpler would work since we only have a single variable to keep
>> track of.
> 
> Great idea that.  What we need would be some semaphore upside down.
> One that can be used to count items and which is signalled if it's
> down to zero.

Here's an idea (untested), based on 
https://stackoverflow.com/questions/6559854/is-there-something-opposite-to-semaphore:

We create an ordinary Windows semaphore and use it to count the readers: It 
starts at 0, we increment it by calling ReleaseSemaphore when a reader is opened 
(by fhandler_pipe::create, fork/exec, dup), and we decrement it by calling WFSO 
when a reader closes.  When we decrement it, we test whether it's been reduced 
to 0.  We do this by calling ReleaseSemaphore and using its lpPreviousCount 
argument.

We also create an event that we can use to make WFMO return during a blocking 
write.  We signal this event if a reader closes and we've discovered that there 
are no more readers.  In this case we cancel the pending write [*] and return an 
error.

I'm sure I've overlooked something, but does this seem feasible?

Ken

[*] I don't know offhand if Windows provides a way to cancel a pending write. 
If not, we could use query_hdl to drain the pipe.


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-01  8:46                                                                     ` Corinna Vinschen
  2021-09-01 12:56                                                                       ` Ken Brown
@ 2021-09-02  8:15                                                                       ` Takashi Yano
  2021-09-02 18:54                                                                         ` Corinna Vinschen
  1 sibling, 1 reply; 250+ messages in thread
From: Takashi Yano @ 2021-09-02  8:15 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 1862 bytes --]

On Wed, 1 Sep 2021 10:46:24 +0200
Corinna Vinschen wrote:
> On Sep  1 17:23, Takashi Yano wrote:
> > On Wed, 1 Sep 2021 10:07:48 +0200
> > Corinna Vinschen wrote:
> > > On Sep  1 09:16, Takashi Yano wrote:
> > > > On Wed, 1 Sep 2021 08:02:20 +0900
> > > > Takashi Yano wrote:
> > > > > On Tue, 31 Aug 2021 17:50:14 +0200
> > > > > Corinna Vinschen wrote:
> > > > > > So for the time being I suggest the below patch on top of topic/pipe.
> > > > > > It contains everything we discussed so far.
> > > > > 
> > > > > One more thing. 'git log' cannot stop normally with 'q' with your patch.
> > > > 
> > > > The same happens with 'yes |less'.
> > > > 
> > > > The cause is that write side cannot detect closing read side because
> > > > query_hdl (read handle) is still opened.
> > > 
> > > Oh
> > > 
> > > my
> > > 
> > > god.
> > > 
> > > 
> > > That kills the entire idea of keeping the read handle :(
> > 
> > One idea is:
> > 
> > > Count the open read handles and write handles using NtQueryObject().
> > > If the two counts are equal, only the write side (each write handle
> > > paired with its query_hdl) is alive.
> > > In this case, write() returns an error.
> > > If the read side is alive, the number of read handles is greater than
> > > the number of write handles.
> 
> Interesting idea.  But where do you do the count?  The event object
> will not get signalled, so WFMO will not return when performing a
> blocking write.

I imagined something like attached patch.

Unfortunately, the attached patch seems to have a bug that
occasionally causes the following error while building
cygwin1.dll.

<command-line>: error: "GCC_VERSION" redefined [-Werror]
<command-line>: note: this is the location of the previous definition
cc1plus: all warnings being treated as errors
make[1]: *** [Makefile:1942: fhandler_proc.o] Error 1

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>

[-- Attachment #2: count-rw-handle.patch --]
[-- Type: application/octet-stream, Size: 5129 bytes --]

diff --git a/winsup/cygwin/fhandler.h b/winsup/cygwin/fhandler.h
index 1f0f28077..38390848f 100644
--- a/winsup/cygwin/fhandler.h
+++ b/winsup/cygwin/fhandler.h
@@ -1172,6 +1172,7 @@ class fhandler_pipe: public fhandler_base
 {
 private:
   HANDLE query_hdl;
+  HANDLE reader_evt;
   pid_t popen_pid;
   size_t max_atomic_write;
   void set_pipe_non_blocking (bool nonblocking);
@@ -1181,6 +1182,7 @@ public:
   bool ispipe() const { return true; }
 
   HANDLE get_query_handle () const { return query_hdl; }
+  bool reader_closed ();
 
   void set_popen_pid (pid_t pid) {popen_pid = pid;}
   pid_t get_popen_pid () const {return popen_pid;}
diff --git a/winsup/cygwin/fhandler_pipe.cc b/winsup/cygwin/fhandler_pipe.cc
index 2d9e87bb3..4773d04da 100644
--- a/winsup/cygwin/fhandler_pipe.cc
+++ b/winsup/cygwin/fhandler_pipe.cc
@@ -51,6 +51,8 @@ fhandler_pipe::set_pipe_non_blocking (bool nonblocking)
   fpi.ReadMode = FILE_PIPE_BYTE_STREAM_MODE;
   fpi.CompletionMode = nonblocking ? FILE_PIPE_COMPLETE_OPERATION
     : FILE_PIPE_QUEUE_OPERATION;
+  if (get_device () == FH_PIPEW)
+    fpi.CompletionMode = FILE_PIPE_COMPLETE_OPERATION;
   status = NtSetInformationFile (get_handle (), &io, &fpi, sizeof fpi,
 				 FilePipeInformation);
   if (!NT_SUCCESS (status))
@@ -308,6 +310,22 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
     }
   else if (status == STATUS_THREAD_CANCELED)
     pthread::static_cancel_self ();
+
+  if ((ssize_t)len > 0)
+    SetEvent (reader_evt);
+}
+
+bool
+fhandler_pipe::reader_closed (void)
+{
+  if (get_device () == FH_PIPER)
+    return false;
+  OBJECT_BASIC_INFORMATION obi;
+  NtQueryObject (get_handle (), ObjectBasicInformation, &obi, sizeof obi, NULL);
+  int n_writer = obi.HandleCount;
+  NtQueryObject (query_hdl, ObjectBasicInformation, &obi, sizeof obi, NULL);
+  int n_reader = obi.HandleCount;
+  return n_writer == n_reader;
 }
 
 ssize_t __reg3
@@ -323,6 +341,13 @@ fhandler_pipe::raw_write (const void *ptr, size_t len)
   if (!len)
     return 0;
 
+  if (reader_closed ())
+    {
+      set_errno(EPIPE);
+      raise(SIGPIPE);
+      return -1;
+    }
+
   if (len <= max_atomic_write)
     chunk = len;
   else if (is_nonblocking ())
@@ -400,10 +425,24 @@ fhandler_pipe::raw_write (const void *ptr, size_t len)
 	  /* NtWriteFile returns success with # of bytes written == 0
 	     if writing on a non-blocking pipe fails because the pipe
 	     buffer doesn't have sufficient space. */
-	  if (nbytes_now == 0 && nbytes == 0)
+	  if (nbytes_now == 0 && nbytes == 0 && is_nonblocking ())
 	    set_errno (EAGAIN);
-	  ptr = ((char *) ptr) + chunk;
+	  ptr = ((char *) ptr) + nbytes_now;
 	  nbytes += nbytes_now;
+	  if (reader_closed () && nbytes == 0)
+	    {
+	      set_errno(EPIPE);
+	      raise(SIGPIPE);
+	    }
+	  if (!is_nonblocking () && nbytes < len)
+	    {
+	      if (nbytes_now == 0)
+		{
+		  cygwait (reader_evt);
+		  ResetEvent (reader_evt);
+		}
+	      continue;
+	    }
 	}
       else if (STATUS_PIPE_IS_CLOSED (status))
 	{
@@ -430,9 +469,14 @@ fhandler_pipe::raw_write (const void *ptr, size_t len)
 void
 fhandler_pipe::fixup_after_fork (HANDLE parent)
 {
+  if (close_on_exec() && query_hdl)
+    CloseHandle (query_hdl);
   if (query_hdl)
     fork_fixup (parent, query_hdl, "query_hdl");
   fhandler_base::fixup_after_fork (parent);
+  if (!close_on_exec ())
+    DuplicateHandle (parent, reader_evt, GetCurrentProcess (), &reader_evt,
+		     0, 1, DUPLICATE_SAME_ACCESS);
 }
 
 int
@@ -453,6 +497,9 @@ fhandler_pipe::dup (fhandler_base *child, int flags)
       ftp->close ();
       res = -1;
     }
+  DuplicateHandle (GetCurrentProcess (), reader_evt,
+		   GetCurrentProcess (), &ftp->reader_evt,
+		   0, 1, DUPLICATE_SAME_ACCESS);
 
   debug_printf ("res %d", res);
   return res;
@@ -463,7 +510,11 @@ fhandler_pipe::close ()
 {
   if (query_hdl)
     NtClose (query_hdl);
-  return fhandler_base::close ();
+  int ret = fhandler_base::close ();
+  if (get_device () == FH_PIPER)
+    SetEvent (reader_evt);
+  CloseHandle (reader_evt);
+  return ret;
 }
 
 #define PIPE_INTRO "\\\\.\\pipe\\cygwin-"
@@ -678,6 +729,10 @@ fhandler_pipe::create (fhandler_pipe *fhs[2], unsigned psize, int mode)
       else
 	res = 0;
     }
+  fhs[0]->reader_evt = CreateEvent (NULL, true, false, NULL);
+  DuplicateHandle (GetCurrentProcess (), fhs[0]->reader_evt,
+		   GetCurrentProcess (), &fhs[1]->reader_evt,
+		   0, 1, DUPLICATE_SAME_ACCESS);
 
   debug_printf ("%R = pipe(%d, %y)", res, psize, mode);
   return res;
diff --git a/winsup/cygwin/select.cc b/winsup/cygwin/select.cc
index dc1f79613..0905b3d38 100644
--- a/winsup/cygwin/select.cc
+++ b/winsup/cygwin/select.cc
@@ -741,6 +741,13 @@ out:
     }
   if (s->write_selected && dev != FH_PIPER)
     {
+      if (dev == FH_PIPEW && ((fhandler_pipe *) fh)->reader_closed ())
+	{
+	  gotone += s->write_ready = true;
+	  if (s->except_selected)
+	    gotone += s->except_ready = true;
+	  return gotone;
+	}
       /* For the write side of a pipe, fetch the handle to the read side.
 	 See the longish comment in pipe_data_available for the reasoning. */
       if (dev == FH_PIPEW)


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-01 23:02                                                                           ` Ken Brown
@ 2021-09-02  8:17                                                                             ` Corinna Vinschen
  2021-09-02 13:01                                                                               ` Ken Brown
  0 siblings, 1 reply; 250+ messages in thread
From: Corinna Vinschen @ 2021-09-02  8:17 UTC (permalink / raw)
  To: cygwin-developers

On Sep  1 19:02, Ken Brown wrote:
> On 9/1/2021 9:52 AM, Corinna Vinschen wrote:
> > Great idea that.  What we need would be some semaphore upside down.
> > One that can be used to count items and which is signalled if it's
> > down to zero.
> 
> Here's an idea (untested), based on https://stackoverflow.com/questions/6559854/is-there-something-opposite-to-semaphore:
> 
> We create an ordinary Windows semaphore and use it to count the readers: It
> starts at 0, we increment it by calling ReleaseSemaphore when a reader is
> opened (by fhandler_pipe::create, fork/exec, dup), and we decrement it by
> calling WFSO when a reader closes.  When we decrement it, we test whether
> it's been reduced to 0.  We do this by calling ReleaseSemaphore and using
> its lpPreviousCount argument.
> 
> We also create an event that we can use to make WFMO return during a
> blocking write.  We signal this event if a reader closes and we've
> discovered that there are no more readers.  In this case we cancel the
> pending write [*] and return an error.
> 
> I'm sure I've overlooked something, but does this seem feasible?

It could work, but the problem with all these approaches is that they
are tricky and bound to fail as soon as a process is killed or crashes.

> [*] I don't know offhand if Windows provides a way to cancel a pending
> write. If not, we could use query_hdl to drain the pipe.

There's a CancelIoEx function to cancel all async IO on a handle.

In a lucid moment tonight, I had another idea.

First of all, scratch my patch.  Also, revert select to check only
for WriteQuotaAvailable.

Next, for sanity, let's assume that non-blocking reads don't change
WriteQuotaAvailable.  So the only important case is the blocking read,
which reduces WriteQuotaAvailable by the number of requested bytes.

Next, fact is, we're only interested in WriteQuotaAvailable > 0.
And we have a buffersize of 64K.

We can also safely assume that we only have a very small number of
readers, typically only one.

So here's the crazily simple idea:

What if the readers never request more than, say, 50 or even 25% of the
available buffer space?  Our buffer is 64K and there's no guarantee that
any read > PIPE_BUF (== 4K) is atomic anyway.  This can work without
having to check the other side of the pipe.  Something like this,
ignoring border cases:

pipe::create()
{
   [...]
   mutex = CreateMutex();
}

pipe::raw_read(char *buf, size_t num_requested)
{
  if (blocking)
    {
      WFSO(mutex);
      NtQueryInformationFile(FilePipeLocalInformation);
      if (!fpli.ReadDataAvailable
	  && num_requested > fpli.InboundQuota / 4)
	num_requested = fpli.InboundQuota / 4;
      NtReadFile(pipe, buf, num_requested);
      ReleaseMutex(mutex);
    }
}

It's not entirely foolproof, but it should fix 99% of the cases.
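
The size check from the pseudocode above can be isolated into a pure
function, which makes the intended behaviour easy to check.  This is a
sketch under assumptions: clamp_blocking_read is a hypothetical helper,
and its parameters stand in for the ReadDataAvailable and InboundQuota
fields of FILE_PIPE_LOCAL_INFORMATION.

```cpp
#include <cstddef>

// Hypothetical helper mirroring the check in the pseudocode.
// If the pipe is empty, a blocking read asks for at most a quarter of
// the buffer, so it can never use up the whole write quota.
inline std::size_t
clamp_blocking_read (std::size_t num_requested,
                     std::size_t read_data_available,
                     std::size_t inbound_quota)
{
  const std::size_t cap = inbound_quota / 4;
  if (read_data_available == 0 && num_requested > cap)
    return cap;
  return num_requested;
}
```

With a 64K buffer, a 64K read on an empty pipe is cut down to 16K,
while a request that is already small enough, or a read on a pipe that
already holds data, is passed through unchanged.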


Corinna


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-02  8:17                                                                             ` Corinna Vinschen
@ 2021-09-02 13:01                                                                               ` Ken Brown
  2021-09-02 19:00                                                                                 ` Corinna Vinschen
  0 siblings, 1 reply; 250+ messages in thread
From: Ken Brown @ 2021-09-02 13:01 UTC (permalink / raw)
  To: cygwin-developers

On 9/2/2021 4:17 AM, Corinna Vinschen wrote:
> On Sep  1 19:02, Ken Brown wrote:
>> On 9/1/2021 9:52 AM, Corinna Vinschen wrote:
>>> Great idea that.  What we need would be some semaphore upside down.
>>> One that can be used to count items and which is signalled if it's
>>> down to zero.
>>
>> Here's an idea (untested), based on https://stackoverflow.com/questions/6559854/is-there-something-opposite-to-semaphore:
>>
>> We create an ordinary Windows semaphore and use it to count the readers: It
>> starts at 0, we increment it by calling ReleaseSemaphore when a reader is
>> opened (by fhandler_pipe::create, fork/exec, dup), and we decrement it by
>> calling WFSO when a reader closes.  When we decrement it, we test whether
>> it's been reduced to 0.  We do this by calling ReleaseSemaphore and using
>> its lpPreviousCount argument.
>>
>> We also create an event that we can use to make WFMO return during a
>> blocking write.  We signal this event if a reader closes and we've
>> discovered that there are no more readers.  In this case we cancel the
>> pending write [*] and return an error.
>>
>> I'm sure I've overlooked something, but does this seem feasible?
> 
> It could work, but the problem with all these approaches is that they
> are tricky and bound to fail as soon as a process is killed or crashes.
> 
>> [*] I don't know offhand if Windows provides a way to cancel a pending
>> write. If not, we could use query_hdl to drain the pipe.
> 
> There's a CancelIoEx function to cancel all async IO on a handle.
> 
> In a lucid moment tonight, I had another idea.
> 
> First of all, scratch my patch.  Also, revert select to check only
> for WriteQuotaAvailable.
> 
> Next, for sanity, let's assume that non-blocking reads don't change
> WriteQuotaAvailable.  So the only important case is the blocking read,
> which reduces WriteQuotaAvailable by the number of requested bytes.
> 
> Next, fact is, we're only interested in WriteQuotaAvailable > 0.
> And we have a buffersize of 64K.
> 
> We can also safely assume that we only have a very small number of
> readers, typically only one.
> 
> So here's the crazily simple idea:
> 
> What if the readers never request more than, say, 50 or even 25% of the
> available buffer space?  Our buffer is 64K and there's no guarantee that
> any read > PIPE_BUF (== 4K) is atomic anyway.  This can work without
> having to check the other side of the pipe.  Something like this,
> ignoring border cases:
> 
> pipe::create()
> {
>     [...]
>     mutex = CreateMutex();
> }
> 
> pipe::raw_read(char *buf, size_t num_requested)
> {
>    if (blocking)
>      {
>        WFSO(mutex);
>        NtQueryInformationFile(FilePipeLocalInformation);
>        if (!fpli.ReadDataAvailable
> 	  && num_requested > fpli.InboundQuota / 4)
> 	num_requested = fpli.InboundQuota / 4;
>        NtReadFile(pipe, buf, num_requested);
>        ReleaseMutex(mutex);
>      }
> }
> 
> It's not entirely foolproof, but it should fix 99% of the cases.

I like it!

Do you think there's anything we can or should do to avoid a deadlock in the 
rare cases where this fails?  The only thing I can think of immediately is to 
always impose a timeout if select is called with infinite timeout on the write 
side of a pipe, after which we report that the pipe is write ready.  After all, 
we've lived since 2008 with a bug that caused select to *always* report write ready.

Alternatively, we could just wait and see if there's an actual use case in which 
someone encounters a deadlock.

Ken


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-02  8:15                                                                       ` Takashi Yano
@ 2021-09-02 18:54                                                                         ` Corinna Vinschen
  0 siblings, 0 replies; 250+ messages in thread
From: Corinna Vinschen @ 2021-09-02 18:54 UTC (permalink / raw)
  To: cygwin-developers

Hi Takashi,

On Sep  2 17:15, Takashi Yano wrote:
> On Wed, 1 Sep 2021 10:46:24 +0200
> Corinna Vinschen wrote:
> > On Sep  1 17:23, Takashi Yano wrote:
> > > One idea is:
> > > 
> > > Count the open read handles and write handles using NtQueryObject().
> > > If the two counts are equal, only the write side (each write handle
> > > paired with its query_hdl) is alive.
> > > In this case, write() returns an error.
> > > If the read side is alive, the number of read handles is greater than
> > > the number of write handles.
> > 
> > Interesting idea.  But where do you do the count?  The event object
> > will not get signalled, so WFMO will not return when performing a
> > blocking write.
> 
> I imagined something like attached patch.
> 
> Unfortunately, the attached patch seems to have a bug that
> occasionally causes the following error while building
> cygwin1.dll.
> 
> <command-line>: error: "GCC_VERSION" redefined [-Werror]
> <command-line>: note: this is the location of the previous definition
> cc1plus: all warnings being treated as errors
> make[1]: *** [Makefile:1942: fhandler_proc.o] Error 1

> diff --git a/winsup/cygwin/fhandler.h b/winsup/cygwin/fhandler.h
> index 1f0f28077..38390848f 100644
> --- a/winsup/cygwin/fhandler.h
> +++ b/winsup/cygwin/fhandler.h
> @@ -1172,6 +1172,7 @@ class fhandler_pipe: public fhandler_base
>  {
>  private:
>    HANDLE query_hdl;
> +  HANDLE reader_evt;
>    pid_t popen_pid;
>    size_t max_atomic_write;
>    void set_pipe_non_blocking (bool nonblocking);
> @@ -1181,6 +1182,7 @@ public:
>    bool ispipe() const { return true; }
>  
>    HANDLE get_query_handle () const { return query_hdl; }
> +  bool reader_closed ();
>  
>    void set_popen_pid (pid_t pid) {popen_pid = pid;}
>    pid_t get_popen_pid () const {return popen_pid;}
> diff --git a/winsup/cygwin/fhandler_pipe.cc b/winsup/cygwin/fhandler_pipe.cc
> index 2d9e87bb3..4773d04da 100644
> --- a/winsup/cygwin/fhandler_pipe.cc
> +++ b/winsup/cygwin/fhandler_pipe.cc
> @@ -51,6 +51,8 @@ fhandler_pipe::set_pipe_non_blocking (bool nonblocking)
>    fpi.ReadMode = FILE_PIPE_BYTE_STREAM_MODE;
>    fpi.CompletionMode = nonblocking ? FILE_PIPE_COMPLETE_OPERATION
>      : FILE_PIPE_QUEUE_OPERATION;
> +  if (get_device () == FH_PIPEW)
> +    fpi.CompletionMode = FILE_PIPE_COMPLETE_OPERATION;

Ok, so the write side is always non-blocking...


>    status = NtSetInformationFile (get_handle (), &io, &fpi, sizeof fpi,
>  				 FilePipeInformation);
>    if (!NT_SUCCESS (status))
> @@ -308,6 +310,22 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
>      }
>    else if (status == STATUS_THREAD_CANCELED)
>      pthread::static_cancel_self ();
> +
> +  if ((ssize_t)len > 0)
> +    SetEvent (reader_evt);
> +}

...and every successful read sets the event object to signalled.

Sounds good.

> +    }
> +
>    if (len <= max_atomic_write)
>      chunk = len;
>    else if (is_nonblocking ())
> @@ -400,10 +425,24 @@ fhandler_pipe::raw_write (const void *ptr, size_t len)
>  	  /* NtWriteFile returns success with # of bytes written == 0
>  	     if writing on a non-blocking pipe fails because the pipe
>  	     buffer doesn't have sufficient space. */
> -	  if (nbytes_now == 0 && nbytes == 0)
> +	  if (nbytes_now == 0 && nbytes == 0 && is_nonblocking ())
>  	    set_errno (EAGAIN);
> -	  ptr = ((char *) ptr) + chunk;
> +	  ptr = ((char *) ptr) + nbytes_now;
>  	  nbytes += nbytes_now;
> +	  if (reader_closed () && nbytes == 0)
> +	    {
> +	      set_errno(EPIPE);
> +	      raise(SIGPIPE);
> +	    }
> +	  if (!is_nonblocking () && nbytes < len)
> +	    {
> +	      if (nbytes_now == 0)
> +		{
> +		  cygwait (reader_evt);
> +		  ResetEvent (reader_evt);

But what if a read calls SetEvent between cygwait and ResetEvent?
This looks like a potential deadlock issue, no?
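
To make the window concrete, here is a minimal single-threaded model of that interleaving.  It is not Cygwin code: ManualEvent and AutoEvent are hypothetical stand-ins for Windows manual-reset and auto-reset event objects, and the steps are replayed by hand instead of by real threads.  It also assumes reader_evt is a manual-reset event, which the explicit ResetEvent suggests but the quoted hunks don't show.

```cpp
#include <cassert>

// Hypothetical model of a manual-reset event: it stays signalled until
// someone explicitly resets it.
struct ManualEvent
{
  bool signalled = false;
  void set () { signalled = true; }
  void reset () { signalled = false; }
  bool wait_would_return () const { return signalled; }
};

// Hypothetical model of an auto-reset event: a successful wait consumes
// the signal in the same step.
struct AutoEvent
{
  bool signalled = false;
  void set () { signalled = true; }
  bool try_wait ()
  {
    bool was_signalled = signalled;
    signalled = false;
    return was_signalled;
  }
};

// Replays: read #1 sets the event, the writer's wait returns, read #2
// sets it again inside the window, then the writer resets.
// Returns true if the signal from read #2 is still visible afterwards.
bool second_signal_survives_manual ()
{
  ManualEvent evt;
  evt.set ();                         // reader: first successful read
  assert (evt.wait_would_return ());  // writer: cygwait returns
  evt.set ();                         // reader: second read, in the window
  evt.reset ();                       // writer: ResetEvent wipes both signals
  return evt.wait_would_return ();
}

// Same replay with wait and reset folded into a single step.
bool second_signal_survives_auto ()
{
  AutoEvent evt;
  evt.set ();                         // reader: first successful read
  bool woke = evt.try_wait ();        // writer: wait consumes the signal
  assert (woke);
  evt.set ();                         // reader: second read
  return evt.try_wait ();             // still pending for the next wait
}
```

In this model the manual-reset variant loses the second signal and the auto-reset variant keeps it.  Whether the lost signal really ends in a deadlock depends on what the writer does next (in the quoted hunk it goes back and retries the write), so this only illustrates the window, not a proven hang.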

> +	    }
>  	}
>        else if (STATUS_PIPE_IS_CLOSED (status))
>  	{
> @@ -430,9 +469,14 @@ fhandler_pipe::raw_write (const void *ptr, size_t len)
>  void
>  fhandler_pipe::fixup_after_fork (HANDLE parent)
>  {
> +  if (close_on_exec() && query_hdl)
> +    CloseHandle (query_hdl);

Why do you close the handle here?  It already gets created with
inheritance settings according to the O_CLOEXEC flag.

>    if (query_hdl)

This is broken.  If you close query_hdl above, it's still != NULL
and fork_fixup will be called.

>      fork_fixup (parent, query_hdl, "query_hdl");
>    fhandler_base::fixup_after_fork (parent);
> +  if (!close_on_exec ())
> +    DuplicateHandle (parent, reader_evt, GetCurrentProcess (), &reader_evt,
> +		     0, 1, DUPLICATE_SAME_ACCESS);

Uhm... this is fixup_after_fork.  I'm a bit puzzled.  You create the
event object with inheritance set to TRUE unconditionally.  So the
forked process will have this handle anyway.  If you duplicate the
handle here, you'll have a handle leak.  What about creating and
duplicating with inheritance == !(flags & O_CLOEXEC) and just calling
fork_fixup?

> @@ -463,7 +510,11 @@ fhandler_pipe::close ()
>  {
>    if (query_hdl)
>      NtClose (query_hdl);
> -  return fhandler_base::close ();
> +  int ret = fhandler_base::close ();
> +  if (get_device () == FH_PIPER)
> +    SetEvent (reader_evt);
> +  CloseHandle (reader_evt);
> +  return ret;
>  }

What do you do if the reader process gets killed or crashes?  I fear
this solution has the same problem as a solution using a self-implemented
counter.


Corinna

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-02 13:01                                                                               ` Ken Brown
@ 2021-09-02 19:00                                                                                 ` Corinna Vinschen
  2021-09-02 19:34                                                                                   ` Ken Brown
  2021-09-02 19:35                                                                                   ` Corinna Vinschen
  0 siblings, 2 replies; 250+ messages in thread
From: Corinna Vinschen @ 2021-09-02 19:00 UTC (permalink / raw)
  To: cygwin-developers

On Sep  2 09:01, Ken Brown wrote:
> On 9/2/2021 4:17 AM, Corinna Vinschen wrote:
> > What if the readers never request more than, say, 50 or even 25% of the
> > available buffer space?  Our buffer is 64K and there's no guarantee that
> > any read > PIPE_BUF (== 4K) is atomic anyway.  This can work without
> > having to check the other side of the pipe.  Something like this,
> > ignoring border cases:
> > 
> > pipe::create()
> > {
> >     [...]
> >     mutex = CreateMutex();
> > }
> > 
> > pipe::raw_read(char *buf, size_t num_requested)
> > {
> >    if (blocking)
> >      {
> >        WFSO(mutex);
> >        NtQueryInformationFile(FilePipeLocalInformation);
> >        if (!fpli.ReadDataAvailable
> > 	  && num_requested > fpli.InboundQuota / 4)
> > 	num_requested = fpli.InboundQuota / 4;
> >        NtReadFile(pipe, buf, num_requested);
> >        ReleaseMutex(mutex);
> >      }
> > }
> > 
> > It's not entirely foolproof, but it should fix 99% of the cases.
> 
> I like it!
> 
> Do you think there's anything we can or should do to avoid a deadlock in the
> rare cases where this fails?  The only thing I can think of immediately is
> to always impose a timeout if select is called with infinite timeout on the
> write side of a pipe, after which we report that the pipe is write ready.
> After all, we've lived since 2008 with a bug that caused select to *always*
> report write ready.

Indeed.  Hmm.  What timeout are you thinking of?  Seconds?  Minutes?

> Alternatively, we could just wait and see if there's an actual use case in
> which someone encounters a deadlock.

Or that.  Fixing up select isn't too hard in that case, I guess.


Corinna

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-02 19:00                                                                                 ` Corinna Vinschen
@ 2021-09-02 19:34                                                                                   ` Ken Brown
  2021-09-02 19:35                                                                                   ` Corinna Vinschen
  1 sibling, 0 replies; 250+ messages in thread
From: Ken Brown @ 2021-09-02 19:34 UTC (permalink / raw)
  To: cygwin-developers

On 9/2/2021 3:00 PM, Corinna Vinschen wrote:
> On Sep  2 09:01, Ken Brown wrote:
>> On 9/2/2021 4:17 AM, Corinna Vinschen wrote:
>>> What if the readers never request more than, say, 50 or even 25% of the
>>> available buffer space?  Our buffer is 64K and there's no guarantee that
>>> any read > PIPE_BUF (== 4K) is atomic anyway.  This can work without
>>> having to check the other side of the pipe.  Something like this,
>>> ignoring border cases:
>>>
>>> pipe::create()
>>> {
>>>      [...]
>>>      mutex = CreateMutex();
>>> }
>>>
>>> pipe::raw_read(char *buf, size_t num_requested)
>>> {
>>>     if (blocking)
>>>       {
>>>         WFSO(mutex);
>>>         NtQueryInformationFile(FilePipeLocalInformation);
>>>         if (!fpli.ReadDataAvailable
>>> 	  && num_requested > fpli.InboundQuota / 4)
>>> 	num_requested = fpli.InboundQuota / 4;
>>>         NtReadFile(pipe, buf, num_requested);
>>>         ReleaseMutex(mutex);
>>>       }
>>> }
>>>
>>> It's not entirely foolproof, but it should fix 99% of the cases.
>>
>> I like it!
>>
>> Do you think there's anything we can or should do to avoid a deadlock in the
>> rare cases where this fails?  The only thing I can think of immediately is
>> to always impose a timeout if select is called with infinite timeout on the
>> write side of a pipe, after which we report that the pipe is write ready.
>> After all, we've lived since 2008 with a bug that caused select to *always*
>> report write ready.
> 
> Indeed.  Hmm.  What timeout are you thinking of?  Seconds?  Minutes?

Probably seconds (maybe 20, like the connect timeout for sockets?), but it's 
hard to know without seeing naturally occurring cases where this happens.

>> Alternatively, we could just wait and see if there's an actual use case in
>> which someone encounters a deadlock.
> 
> Or that.  Fixing up select isn't too hard in that case, I guess.

I think this would be my preference, at least during testing (if we can persuade 
people to test it).

Ken

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-02 19:00                                                                                 ` Corinna Vinschen
  2021-09-02 19:34                                                                                   ` Ken Brown
@ 2021-09-02 19:35                                                                                   ` Corinna Vinschen
  2021-09-02 20:19                                                                                     ` Ken Brown
                                                                                                       ` (2 more replies)
  1 sibling, 3 replies; 250+ messages in thread
From: Corinna Vinschen @ 2021-09-02 19:35 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 2566 bytes --]

On Sep  2 21:00, Corinna Vinschen wrote:
> On Sep  2 09:01, Ken Brown wrote:
> > On 9/2/2021 4:17 AM, Corinna Vinschen wrote:
> > > What if the readers never request more than, say, 50 or even 25% of the
> > > available buffer space?  Our buffer is 64K and there's no guarantee that
> > > any read > PIPE_BUF (== 4K) is atomic anyway.  This can work without
> > > having to check the other side of the pipe.  Something like this,
> > > ignoring border cases:
> > > 
> > > pipe::create()
> > > {
> > >     [...]
> > >     mutex = CreateMutex();
> > > }
> > > 
> > > pipe::raw_read(char *buf, size_t num_requested)
> > > {
> > >    if (blocking)
> > >      {
> > >        WFSO(mutex);
> > >        NtQueryInformationFile(FilePipeLocalInformation);
> > >        if (!fpli.ReadDataAvailable
> > > 	  && num_requested > fpli.InboundQuota / 4)
> > > 	num_requested = fpli.InboundQuota / 4;
> > >        NtReadFile(pipe, buf, num_requested);
> > >        ReleaseMutex(mutex);
> > >      }
> > > }
> > > 
> > > It's not entirely foolproof, but it should fix 99% of the cases.
> > 
> > I like it!
> > 
> > Do you think there's anything we can or should do to avoid a deadlock in the
> > rare cases where this fails?  The only thing I can think of immediately is
> > to always impose a timeout if select is called with infinite timeout on the
> > write side of a pipe, after which we report that the pipe is write ready.
> > After all, we've lived since 2008 with a bug that caused select to *always*
> > report write ready.
> 
> Indeed.  Hmm.  What timeout are you thinking of?  Seconds?  Minutes?
> 
> > Alternatively, we could just wait and see if there's an actual use case in
> > which someone encounters a deadlock.
> 
> Or that.  Fixing up select isn't too hard in that case, I guess.

It's getting too late again.  I drop off for tonight, but I attached
my POC code I have so far.  It also adds the snippets from my previous
patch which fixes stuff Takashi found during testing.  It also fixes
something which looks like a bug in raw_write:

-	  ptr = ((char *) ptr) + chunk;
+	  ptr = ((char *) ptr) + nbytes_now;

Incrementing ptr by chunk bytes while only nbytes_now have been written
looks incorrect.
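
A small standalone sketch of why the increment matters.  This is not the Cygwin code path: ShortSink is a hypothetical stand-in for a pipe that accepts fewer bytes than requested, and copy_through is an illustrative helper whose advance_by_chunk switch selects the old or the new pointer arithmetic.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical sink that accepts at most `cap` bytes per call, standing
// in for a pipe that takes less than was requested.
struct ShortSink
{
  std::string data;
  size_t cap;
  size_t write (const char *p, size_t n)
  {
    size_t now = n < cap ? n : cap;
    data.append (p, now);
    return now;
  }
};

// Pushes `in` through the sink in requests of up to `chunk` bytes.
// advance_by_chunk == true moves the pointer by the requested size (the
// old behaviour); false moves it by the bytes actually written.
std::string copy_through (const std::string &in, size_t chunk, size_t cap,
                          bool advance_by_chunk)
{
  ShortSink sink{ std::string (), cap };
  const char *ptr = in.data ();
  const char *end = ptr + in.size ();
  while (ptr < end)
    {
      size_t left = (size_t) (end - ptr);
      size_t want = left < chunk ? left : chunk;
      size_t now = sink.write (ptr, want);
      if (now == 0)
        break;                        // nothing accepted, stop
      ptr += advance_by_chunk ? want : now;
    }
  return sink.data;
}
```

With a 4 byte chunk and a sink that takes 3 bytes per call, advancing by the bytes written delivers the input intact, while advancing by chunk silently skips the bytes that were requested but never written.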

As for the reader, it makes the # of bytes to read dependent on the
number of reader handles.  I don't know if that's such a bright idea,
but this can be changed easily.
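
The reader-side limit from the attached raw_read hunk, restated as a pure function so the numbers are easy to check.  clamp_read_len and its parameter names are made up for this sketch; in the patch the values come from FILE_PIPE_LOCAL_INFORMATION and get_obj_handle_count.  The guard against a reader count of 0 is an addition here and is not in the patch.

```cpp
#include <cassert>
#include <cstddef>

// len:            bytes the caller asked for
// inbound_quota:  pipe buffer size (fpli.InboundQuota in the patch)
// data_available: bytes already buffered (fpli.ReadDataAvailable)
// reader_count:   handle count of the read side
size_t clamp_read_len (size_t len, size_t inbound_quota,
                       size_t data_available, unsigned long reader_count)
{
  if (data_available != 0)
    return len;                 // data is buffered, request left alone
  size_t max_len = 64;          // fallback used for 10 or more readers
  // The reader_count > 0 check is an addition for this sketch; a count
  // of 0 (e.g. from a failed query) would otherwise divide by zero.
  if (reader_count > 0 && reader_count < 10)
    max_len = inbound_quota / (2 * reader_count);
  return len > max_len ? max_len : len;
}
```

With the 64K buffer mentioned earlier in the thread, a single reader on an empty pipe is limited to half the buffer, two readers to a quarter each, and ten or more readers fall back to 64 bytes per request.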

Anyway, this runs all my testcases successfully but they are anything
but thorough.

Patch relative to topic/pipe attached.  Would you both mind taking a
scrutinizing look?


Thanks,
Corinna

[-- Attachment #2: pipe.diff --]
[-- Type: text/plain, Size: 10810 bytes --]

diff --git a/winsup/cygwin/fhandler.h b/winsup/cygwin/fhandler.h
index 132e6002133b..032ab5fb07ae 100644
--- a/winsup/cygwin/fhandler.h
+++ b/winsup/cygwin/fhandler.h
@@ -1171,6 +1171,7 @@ class fhandler_socket_unix : public fhandler_socket
 class fhandler_pipe: public fhandler_base
 {
 private:
+  HANDLE read_mtx;
   pid_t popen_pid;
   size_t max_atomic_write;
   void set_pipe_non_blocking (bool nonblocking);
@@ -1178,6 +1179,7 @@ public:
   fhandler_pipe ();
 
   bool ispipe() const { return true; }
+  void set_read_mutex (HANDLE mtx) { read_mtx = mtx; }
 
   void set_popen_pid (pid_t pid) {popen_pid = pid;}
   pid_t get_popen_pid () const {return popen_pid;}
@@ -1187,7 +1189,9 @@ public:
   select_record *select_except (select_stuff *);
   char *get_proc_fd_name (char *buf);
   int open (int flags, mode_t mode = 0);
+  void fixup_after_fork (HANDLE);
   int dup (fhandler_base *child, int);
+  int close ();
   void __reg3 raw_read (void *ptr, size_t& len);
   ssize_t __reg3 raw_write (const void *ptr, size_t len);
   int ioctl (unsigned int cmd, void *);
diff --git a/winsup/cygwin/fhandler_pipe.cc b/winsup/cygwin/fhandler_pipe.cc
index 2dec0a84817c..7a5cefb3d07c 100644
--- a/winsup/cygwin/fhandler_pipe.cc
+++ b/winsup/cygwin/fhandler_pipe.cc
@@ -240,8 +240,37 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
       keep_looping = false;
       if (evt)
 	ResetEvent (evt);
+      if (!is_nonblocking ())
+	{
+	  FILE_PIPE_LOCAL_INFORMATION fpli;
+	  ULONG reader_count;
+	  ULONG max_len = 64;
+
+	  WaitForSingleObject (read_mtx, INFINITE);
+
+	  /* Make sure never to request more bytes than half the pipe
+	     buffer size.  Every pending read lowers WriteQuotaAvailable
+	     on the write side and thus affects select's ability to return
+	     more or less reliable info whether a write succeeds or not.
+
+	     Let the size of the request depend on the number of readers
+	     at the time. */
+	  status = NtQueryInformationFile (get_handle (), &io,
+					   &fpli, sizeof (fpli),
+					   FilePipeLocalInformation);
+	  if (NT_SUCCESS (status) && fpli.ReadDataAvailable == 0)
+	    {
+	      reader_count = get_obj_handle_count (get_handle ());
+	      if (reader_count < 10)
+		max_len = fpli.InboundQuota / (2 * reader_count);
+	      if (len > max_len)
+		len = max_len;
+	    }
+	}
       status = NtReadFile (get_handle (), evt, NULL, NULL, &io, ptr,
 			   len, NULL, NULL);
+      if (!is_nonblocking ())
+	ReleaseMutex (read_mtx);
       if (evt && status == STATUS_PENDING)
 	{
 	  waitret = cygwait (evt);
@@ -313,7 +342,6 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
 ssize_t __reg3
 fhandler_pipe::raw_write (const void *ptr, size_t len)
 {
-  ssize_t ret = -1;
   size_t nbytes = 0;
   ULONG chunk;
   NTSTATUS status = STATUS_SUCCESS;
@@ -352,8 +380,36 @@ fhandler_pipe::raw_write (const void *ptr, size_t len)
       else
 	len1 = (ULONG) left;
       nbytes_now = 0;
-      status = NtWriteFile (get_handle (), evt, NULL, NULL, &io,
-			    (PVOID) ptr, len1, NULL, NULL);
+      /* NtWriteFile returns success with # of bytes written == 0 if writing
+         on a non-blocking pipe fails because the pipe buffer doesn't have
+	 sufficient space.
+
+	 POSIX requires
+	 - A write request for {PIPE_BUF} or fewer bytes shall have the
+	   following effect: if there is sufficient space available in the
+	   pipe, write() shall transfer all the data and return the number
+	   of bytes requested. Otherwise, write() shall transfer no data and
+	   return -1 with errno set to [EAGAIN].
+
+	 - A write request for more than {PIPE_BUF} bytes shall cause one
+	   of the following:
+
+	  - When at least one byte can be written, transfer what it can and
+	    return the number of bytes written. When all data previously
+	    written to the pipe is read, it shall transfer at least {PIPE_BUF}
+	    bytes.
+
+	  - When no data can be written, transfer no data, and return -1 with
+	    errno set to [EAGAIN]. */
+      while (len1 > 0)
+	{
+	  status = NtWriteFile (get_handle (), evt, NULL, NULL, &io,
+				(PVOID) ptr, len1, NULL, NULL);
+	  if (evt || !NT_SUCCESS (status) || io.Information > 0
+	      || len <= PIPE_BUF)
+	    break;
+	  len1 >>= 1;
+	}
       if (evt && status == STATUS_PENDING)
 	{
 	  waitret = cygwait (evt);
@@ -375,13 +431,11 @@ fhandler_pipe::raw_write (const void *ptr, size_t len)
       else if (NT_SUCCESS (status))
 	{
 	  nbytes_now = io.Information;
-	  /* NtWriteFile returns success with # of bytes written == 0
-	     if writing on a non-blocking pipe fails because the pipe
-	     buffer doesn't have sufficient space. */
-	  if (nbytes_now == 0)
-	    set_errno (EAGAIN);
-	  ptr = ((char *) ptr) + chunk;
+	  ptr = ((char *) ptr) + nbytes_now;
 	  nbytes += nbytes_now;
+	  /* 0 bytes returned?  EAGAIN.  See above. */
+	  if (nbytes == 0)
+	    set_errno (EAGAIN);
 	}
       else if (STATUS_PIPE_IS_CLOSED (status))
 	{
@@ -392,17 +446,23 @@ fhandler_pipe::raw_write (const void *ptr, size_t len)
 	__seterrno_from_nt_status (status);
 
       if (nbytes_now == 0)
-	len = 0;		/* Terminate loop. */
-      if (nbytes > 0)
-	ret = nbytes;
+	break;
     }
   if (evt)
     CloseHandle (evt);
-  if (status == STATUS_THREAD_SIGNALED && ret < 0)
+  if (status == STATUS_THREAD_SIGNALED && nbytes == 0)
     set_errno (EINTR);
   else if (status == STATUS_THREAD_CANCELED)
     pthread::static_cancel_self ();
-  return ret;
+  return nbytes ?: -1;
+}
+
+void
+fhandler_pipe::fixup_after_fork (HANDLE parent)
+{
+  if (read_mtx)
+    fork_fixup (parent, read_mtx, "read_mtx");
+  fhandler_base::fixup_after_fork (parent);
 }
 
 int
@@ -411,16 +471,31 @@ fhandler_pipe::dup (fhandler_base *child, int flags)
   fhandler_pipe *ftp = (fhandler_pipe *) child;
   ftp->set_popen_pid (0);
 
-  int res;
-  if (get_handle () && fhandler_base::dup (child, flags))
+  int res = 0;
+  if (fhandler_base::dup (child, flags))
     res = -1;
-  else
-    res = 0;
+  else if (read_mtx &&
+	   !DuplicateHandle (GetCurrentProcess (), read_mtx,
+			     GetCurrentProcess (), &ftp->read_mtx,
+			     0, !(flags & O_CLOEXEC), DUPLICATE_SAME_ACCESS))
+    {
+      __seterrno ();
+      ftp->close ();
+      res = -1;
+    }
 
   debug_printf ("res %d", res);
   return res;
 }
 
+int
+fhandler_pipe::close ()
+{
+  if (read_mtx)
+    NtClose (read_mtx);
+  return fhandler_base::close ();
+}
+
 #define PIPE_INTRO "\\\\.\\pipe\\cygwin-"
 
 /* Create a pipe, and return handles to the read and write ends,
@@ -608,6 +683,7 @@ fhandler_pipe::create (fhandler_pipe *fhs[2], unsigned psize, int mode)
   else if ((fhs[1] = (fhandler_pipe *) build_fh_dev (*pipew_dev)) == NULL)
     {
       delete fhs[0];
+      CloseHandle (r);
       CloseHandle (w);
     }
   else
@@ -617,7 +693,25 @@ fhandler_pipe::create (fhandler_pipe *fhs[2], unsigned psize, int mode)
 		    unique_id);
       fhs[1]->init (w, FILE_CREATE_PIPE_INSTANCE | GENERIC_WRITE, mode,
 		    unique_id);
-      res = 0;
+      /* For the read side of the pipe, add a mutex.  See raw_read for the
+	 usage. */
+      SECURITY_ATTRIBUTES sa = { .nLength = sizeof (SECURITY_ATTRIBUTES),
+				 .lpSecurityDescriptor = NULL,
+				 .bInheritHandle = !(mode & O_CLOEXEC)
+			       };
+      HANDLE mtx = CreateMutexW (&sa, FALSE, NULL);
+      if (!mtx)
+	{
+	  delete fhs[0];
+	  CloseHandle (r);
+	  delete fhs[1];
+	  CloseHandle (w);
+	}
+      else
+	{
+	  fhs[0]->set_read_mutex (mtx);
+	  res = 0;
+	}
     }
 
   debug_printf ("%R = pipe([%p, %p], %d, %y)", res, fhs[0], fhs[1], psize, mode);
@@ -658,7 +752,7 @@ nt_create (LPSECURITY_ATTRIBUTES sa_ptr, PHANDLE r, PHANDLE w,
 				 &cygheap->installation_key,
 				 GetCurrentProcessId ());
 
-  access = GENERIC_READ | FILE_WRITE_ATTRIBUTES;
+  access = GENERIC_READ | FILE_WRITE_ATTRIBUTES | SYNCHRONIZE;
 
   ULONG pipe_type = pipe_byte ? FILE_PIPE_BYTE_STREAM_TYPE
     : FILE_PIPE_MESSAGE_TYPE;
@@ -737,7 +831,7 @@ nt_create (LPSECURITY_ATTRIBUTES sa_ptr, PHANDLE r, PHANDLE w,
     {
       debug_printf ("NtOpenFile: name %S", &pipename);
 
-      access = GENERIC_WRITE | FILE_READ_ATTRIBUTES;
+      access = GENERIC_WRITE | FILE_READ_ATTRIBUTES | SYNCHRONIZE;
       status = NtOpenFile (w, access, &attr, &io, 0, 0);
       if (!NT_SUCCESS (status))
 	{
diff --git a/winsup/cygwin/flock.cc b/winsup/cygwin/flock.cc
index bd7a16d91ecd..2f12fc07e37b 100644
--- a/winsup/cygwin/flock.cc
+++ b/winsup/cygwin/flock.cc
@@ -216,22 +216,6 @@ allow_others_to_sync ()
   done = true;
 }
 
-/* Get the handle count of an object. */
-static ULONG
-get_obj_handle_count (HANDLE h)
-{
-  OBJECT_BASIC_INFORMATION obi;
-  NTSTATUS status;
-  ULONG hdl_cnt = 0;
-
-  status = NtQueryObject (h, ObjectBasicInformation, &obi, sizeof obi, NULL);
-  if (!NT_SUCCESS (status))
-    debug_printf ("NtQueryObject: %y", status);
-  else
-    hdl_cnt = obi.HandleCount;
-  return hdl_cnt;
-}
-
 /* Helper struct to construct a local OBJECT_ATTRIBUTES on the stack. */
 struct lockfattr_t
 {
diff --git a/winsup/cygwin/miscfuncs.cc b/winsup/cygwin/miscfuncs.cc
index f4c3a1c48e8e..dc36030ca572 100644
--- a/winsup/cygwin/miscfuncs.cc
+++ b/winsup/cygwin/miscfuncs.cc
@@ -18,6 +18,22 @@ details. */
 #include "tls_pbuf.h"
 #include "mmap_alloc.h"
 
+/* Get handle count of an object. */
+ULONG
+get_obj_handle_count (HANDLE h)
+{
+  OBJECT_BASIC_INFORMATION obi;
+  NTSTATUS status;
+  ULONG hdl_cnt = 0;
+
+  status = NtQueryObject (h, ObjectBasicInformation, &obi, sizeof obi, NULL);
+  if (!NT_SUCCESS (status))
+    debug_printf ("NtQueryObject: %y", status);
+  else
+    hdl_cnt = obi.HandleCount;
+  return hdl_cnt;
+}
+
 int __reg2
 check_invalid_virtual_addr (const void *s, unsigned sz)
 {
diff --git a/winsup/cygwin/miscfuncs.h b/winsup/cygwin/miscfuncs.h
index 1ff7ee0d3fde..47cef6f20c0a 100644
--- a/winsup/cygwin/miscfuncs.h
+++ b/winsup/cygwin/miscfuncs.h
@@ -98,6 +98,9 @@ transform_chars (PUNICODE_STRING upath, USHORT start_idx)
 
 PWCHAR transform_chars_af_unix (PWCHAR, const char *, __socklen_t);
 
+/* Get handle count of an object. */
+ULONG get_obj_handle_count (HANDLE h);
+
 /* Memory checking */
 int __reg2 check_invalid_virtual_addr (const void *s, unsigned sz);
 
diff --git a/winsup/cygwin/select.cc b/winsup/cygwin/select.cc
index 83e1c00e0ac7..ac2fd227eb17 100644
--- a/winsup/cygwin/select.cc
+++ b/winsup/cygwin/select.cc
@@ -612,7 +612,6 @@ pipe_data_available (int fd, fhandler_base *fh, HANDLE h, bool writing)
 	   that.  This means that a pipe could still block since you could
 	   be trying to write more to the pipe than is available in the
 	   buffer but that is the hazard of select().  */
-      fpli.WriteQuotaAvailable = fpli.OutboundQuota - fpli.ReadDataAvailable;
       if (fpli.WriteQuotaAvailable > 0)
 	{
 	  paranoid_printf ("fd %d, %s, write: size %u, avail %u", fd,
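
A standalone sketch of the halving retry that the raw_write hunk above adds for non-blocking writes.  It is a simplified model, not the patch itself: FakePipe is a hypothetical all-or-nothing pipe that mimics the documented NtWriteFile behaviour (success with 0 bytes written when the request doesn't fit), kPipeBuf stands in for PIPE_BUF (4K, as stated in the thread), and write_with_halving is an illustrative name.

```cpp
#include <cassert>
#include <cstddef>

const size_t kPipeBuf = 4096;   // stand-in for PIPE_BUF (4K per the thread)

// Hypothetical all-or-nothing pipe: a request either fits completely
// into the free space or transfers nothing.
struct FakePipe
{
  size_t free_space;
  size_t write (size_t n)
  {
    if (n > free_space)
      return 0;
    free_space -= n;
    return n;
  }
};

// One pass of the retry loop.  Returns the bytes written; 0 means the
// caller would report EAGAIN.
size_t write_with_halving (FakePipe &pipe, size_t len, size_t chunk)
{
  size_t len1 = len < chunk ? len : chunk;
  while (len1 > 0)
    {
      size_t written = pipe.write (len1);
      // Requests of PIPE_BUF or less must stay atomic, so they are
      // never split; larger ones retry with half the size.
      if (written > 0 || len <= kPipeBuf)
        return written;
      len1 >>= 1;
    }
  return 0;
}
```

A 16K request against 5000 free bytes ends up writing 4096 bytes after two halvings, whereas a 200 byte request against 100 free bytes writes nothing, which matches the all-or-nothing rule quoted from POSIX in the patch comment.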

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-02 19:35                                                                                   ` Corinna Vinschen
@ 2021-09-02 20:19                                                                                     ` Ken Brown
  2021-09-03  9:12                                                                                       ` Corinna Vinschen
  2021-09-03 10:00                                                                                     ` Takashi Yano
  2021-09-08 11:32                                                                                     ` Takashi Yano
  2 siblings, 1 reply; 250+ messages in thread
From: Ken Brown @ 2021-09-02 20:19 UTC (permalink / raw)
  To: cygwin-developers

On 9/2/2021 3:35 PM, Corinna Vinschen wrote:
> On Sep  2 21:00, Corinna Vinschen wrote:
>> On Sep  2 09:01, Ken Brown wrote:
>>> On 9/2/2021 4:17 AM, Corinna Vinschen wrote:
>>>> What if the readers never request more than, say, 50 or even 25% of the
>>>> available buffer space?  Our buffer is 64K and there's no guarantee that
>>>> any read > PIPE_BUF (== 4K) is atomic anyway.  This can work without
>>>> having to check the other side of the pipe.  Something like this,
>>>> ignoring border cases:
>>>>
>>>> pipe::create()
>>>> {
>>>>      [...]
>>>>      mutex = CreateMutex();
>>>> }
>>>>
>>>> pipe::raw_read(char *buf, size_t num_requested)
>>>> {
>>>>     if (blocking)
>>>>       {
>>>>         WFSO(mutex);
>>>>         NtQueryInformationFile(FilePipeLocalInformation);
>>>>         if (!fpli.ReadDataAvailable
>>>> 	  && num_requested > fpli.InboundQuota / 4)
>>>> 	num_requested = fpli.InboundQuota / 4;
>>>>         NtReadFile(pipe, buf, num_requested);
>>>>         ReleaseMutex(mutex);
>>>>       }
>>>> }
>>>>
>>>> It's not entirely foolproof, but it should fix 99% of the cases.
>>>
>>> I like it!
>>>
>>> Do you think there's anything we can or should do to avoid a deadlock in the
>>> rare cases where this fails?  The only thing I can think of immediately is
>>> to always impose a timeout if select is called with infinite timeout on the
>>> write side of a pipe, after which we report that the pipe is write ready.
>>> After all, we've lived since 2008 with a bug that caused select to *always*
>>> report write ready.
>>
>> Indeed.  Hmm.  What timeout are you thinking of?  Seconds?  Minutes?
>>
>>> Alternatively, we could just wait and see if there's an actual use case in
>>> which someone encounters a deadlock.
>>
>> Or that.  Fixing up select isn't too hard in that case, I guess.
> 
> It's getting too late again.  I drop off for tonight, but I attached
> my POC code I have so far.  It also adds the snippets from my previous
> patch which fixes stuff Takashi found during testing.  It also fixes
> something which looks like a bug in raw_write:
> 
> -	  ptr = ((char *) ptr) + chunk;
> +	  ptr = ((char *) ptr) + nbytes_now;
> 
> Incrementing ptr by chunk bytes while only nbytes_now have been written
> looks incorrect.

Yes.  I actually copied that bug from fhandler_base_overlapped::raw_write.
It's amazing that no one has bumped into it up to now.

> As for the reader, it makes the # of bytes to read dependent on the
> number of reader handles.  I don't know if that's such a bright idea,
> but this can be changed easily.
> 
> Anyway, this runs all my testcases successfully but they are anything
> but thorough.
> 
> Patch relative to topic/pipe attached.  Would you both mind taking a
> scrutinizing look?

Will do.

Ken

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-02 20:19                                                                                     ` Ken Brown
@ 2021-09-03  9:12                                                                                       ` Corinna Vinschen
  2021-09-03 19:00                                                                                         ` Ken Brown
  0 siblings, 1 reply; 250+ messages in thread
From: Corinna Vinschen @ 2021-09-03  9:12 UTC (permalink / raw)
  To: cygwin-developers

On Sep  2 16:19, Ken Brown wrote:
> On 9/2/2021 3:35 PM, Corinna Vinschen wrote:
> > On Sep  2 21:00, Corinna Vinschen wrote:
> > > On Sep  2 09:01, Ken Brown wrote:
> > > > Do you think there's anything we can or should do to avoid a deadlock in the
> > > > rare cases where this fails?  The only thing I can think of immediately is
> > > > to always impose a timeout if select is called with infinite timeout on the
> > > > write side of a pipe, after which we report that the pipe is write ready.
> > > > After all, we've lived since 2008 with a bug that caused select to *always*
> > > > report write ready.
> > > 
> > > Indeed.  Hmm.  What timeout are you thinking of?  Seconds?  Minutes?
> > > 
> > > > Alternatively, we could just wait and see if there's an actual use case in
> > > > which someone encounters a deadlock.
> > > 
> > > Or that.  Fixing up select isn't too hard in that case, I guess.
> > 
> > It's getting too late again.  I drop off for tonight, but I attached
> > my POC code I have so far.  It also adds the snippets from my previous
> > patch which fixes stuff Takashi found during testing.  It also fixes
> > something which looks like a bug in raw_write:
> > 
> > -	  ptr = ((char *) ptr) + chunk;
> > +	  ptr = ((char *) ptr) + nbytes_now;
> > 
> > Incrementing ptr by chunk bytes while only nbytes_now have been written
> > looks incorrect.
> 
> Yes.  I actually copied that bug from fhandler_base_overlapped::raw_write.
> It's amazing that no one has bumped into it up to now.
> 
> > As for the reader, it makes the # of bytes to read dependent on the
> > number of reader handles.  I don't know if that's such a bright idea,
> > but this can be changed easily.
> > 
> > Anyway, this runs all my testcases successfully but they are anything
> > but thorough.
> > 
> > Patch relative to topic/pipe attached.  Would you both mind taking a
> > scrutinizing look?
> 
> Will do.

I pushed my stuff to the topic/pipe branch split into hopefully useful
chunks.  Kick me if anything is wrong or not working.


Thanks,
Corinna

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-02 19:35                                                                                   ` Corinna Vinschen
  2021-09-02 20:19                                                                                     ` Ken Brown
@ 2021-09-03 10:00                                                                                     ` Takashi Yano
  2021-09-03 10:13                                                                                       ` Takashi Yano
  2021-09-03 10:38                                                                                       ` Takashi Yano
  2021-09-08 11:32                                                                                     ` Takashi Yano
  2 siblings, 2 replies; 250+ messages in thread
From: Takashi Yano @ 2021-09-03 10:00 UTC (permalink / raw)
  To: cygwin-developers

On Thu, 2 Sep 2021 21:35:21 +0200
Corinna Vinschen wrote:
> On Sep  2 21:00, Corinna Vinschen wrote:
> > On Sep  2 09:01, Ken Brown wrote:
> > > On 9/2/2021 4:17 AM, Corinna Vinschen wrote:
> > > > What if the readers never request more than, say, 50 or even 25% of the
> > > > available buffer space?  Our buffer is 64K and there's no guarantee that
> > > > any read > PIPE_BUF (== 4K) is atomic anyway.  This can work without
> > > > having to check the other side of the pipe.  Something like this,
> > > > ignoring border cases:
> > > > 
> > > > pipe::create()
> > > > {
> > > >     [...]
> > > >     mutex = CreateMutex();
> > > > }
> > > > 
> > > > pipe::raw_read(char *buf, size_t num_requested)
> > > > {
> > > >    if (blocking)
> > > >      {
> > > >        WFSO(mutex);
> > > >        NtQueryInformationFile(FilePipeLocalInformation);
> > > >        if (!fpli.ReadDataAvailable
> > > > 	  && num_requested > fpli.InboundQuota / 4)
> > > > 	num_requested = fpli.InboundQuota / 4;
> > > >        NtReadFile(pipe, buf, num_requested);
> > > >        ReleaseMutex(mutex);
> > > >      }
> > > > }
> > > > 
> > > > It's not entirely foolproof, but it should fix 99% of the cases.
> > > 
> > > I like it!
> > > 
> > > Do you think there's anything we can or should do to avoid a deadlock in the
> > > rare cases where this fails?  The only thing I can think of immediately is
> > > to always impose a timeout if select is called with infinite timeout on the
> > > write side of a pipe, after which we report that the pipe is write ready.
> > > After all, we've lived since 2008 with a bug that caused select to *always*
> > > report write ready.
> > 
> > Indeed.  Hmm.  What timeout are you thinking of?  Seconds?  Minutes?
> > 
> > > Alternatively, we could just wait and see if there's an actual use case in
> > > which someone encounters a deadlock.
> > 
> > Or that.  Fixing up select isn't too hard in that case, I guess.
> 
> It's getting too late again.  I drop off for tonight, but I attached
> my POC code I have so far.  It also adds the snippets from my previous
> patch which fixes stuff Takashi found during testing.  It also fixes
> something which looks like a bug in raw_write:
> 
> -	  ptr = ((char *) ptr) + chunk;
> +	  ptr = ((char *) ptr) + nbytes_now;
> 
> Incrementing ptr by chunk bytes while only nbytes_now have been written
> looks incorrect.
> 
> As for the reader, it makes the # of bytes to read dependent on the
> number of reader handles.  I don't know if that's such a bright idea,
> but this can be changed easily.
> 
> Anyway, this runs all my testcases successfully but they are anything
> but thorough.
> 
> Patch relative to topic/pipe attached.  Would you both mind taking a
> scrutinizing look?

Thanks.

Your code seems to make read() return only partial data even
if the pipe still has more data. Is this by design?

This happens in both the blocking and non-blocking cases.

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-03 10:00                                                                                     ` Takashi Yano
@ 2021-09-03 10:13                                                                                       ` Takashi Yano
  2021-09-03 11:31                                                                                         ` Corinna Vinschen
  2021-09-03 10:38                                                                                       ` Takashi Yano
  1 sibling, 1 reply; 250+ messages in thread
From: Takashi Yano @ 2021-09-03 10:13 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 3212 bytes --]

On Fri, 3 Sep 2021 19:00:46 +0900
Takashi Yano wrote:
> On Thu, 2 Sep 2021 21:35:21 +0200
> Corinna Vinschen wrote:
> > On Sep  2 21:00, Corinna Vinschen wrote:
> > > On Sep  2 09:01, Ken Brown wrote:
> > > > On 9/2/2021 4:17 AM, Corinna Vinschen wrote:
> > > > > What if the readers never request more than, say, 50 or even 25% of the
> > > > > available buffer space?  Our buffer is 64K and there's no guarantee that
> > > > > any read > PIPE_BUF (== 4K) is atomic anyway.  This can work without
> > > > > having to check the other side of the pipe.  Something like this,
> > > > > ignoring border cases:
> > > > > 
> > > > > pipe::create()
> > > > > {
> > > > >     [...]
> > > > >     mutex = CreateMutex();
> > > > > }
> > > > > 
> > > > > pipe::raw_read(char *buf, size_t num_requested)
> > > > > {
> > > > >    if (blocking)
> > > > >      {
> > > > >        WFSO(mutex);
> > > > >        NtQueryInformationFile(FilePipeLocalInformation);
> > > > >        if (!fpli.ReadDataAvailable
> > > > > 	  && num_requested > fpli.InboundQuota / 4)
> > > > > 	num_requested = fpli.InboundQuota / 4;
> > > > >        NtReadFile(pipe, buf, num_requested);
> > > > >        ReleaseMutex(mutex);
> > > > >      }
> > > > > }
> > > > > 
> > > > > It's not entirely foolproof, but it should fix 99% of the cases.
> > > > 
> > > > I like it!
> > > > 
> > > > Do you think there's anything we can or should do to avoid a deadlock in the
> > > > rare cases where this fails?  The only thing I can think of immediately is
> > > > to always impose a timeout if select is called with infinite timeout on the
> > > > write side of a pipe, after which we report that the pipe is write ready.
> > > > After all, we've lived since 2008 with a bug that caused select to *always*
> > > > report write ready.
> > > 
> > > Indeed.  Hmm.  What timeout are you thinking of?  Seconds?  Minutes?
> > > 
> > > > Alternatively, we could just wait and see if there's an actual use case in
> > > > which someone encounters a deadlock.
> > > 
> > > Or that.  Fixing up select isn't too hard in that case, I guess.
> > 
> > It's getting too late again.  I drop off for tonight, but I attached
> > my POC code I have so far.  It also adds the snippets from my previous
> > patch which fixes stuff Takashi found during testing.  It also fixes
> > something which looks like a bug in raw_write:
> > 
> > -	  ptr = ((char *) ptr) + chunk;
> > +	  ptr = ((char *) ptr) + nbytes_now;
> > 
> > Incrementing ptr by chunk bytes while only nbytes_now have been written
> > looks incorrect.
> > 
> > As for the reader, it makes the # of bytes to read dependent on the
> > number of reader handles.  I don't know if that's such a bright idea,
> > but this can be changed easily.
> > 
> > Anyway, this runs all my testcases successfully but they are anything
> > but thorough.
> > 
> > Patch relative to topic/pipe attached.  Would you both mind taking a
> > scrutinizing look?
> 
> Thanks.
> 
> Your code seems to make read() return only partial data even
> if the pipe still has more data. Is this by design?
> 
> This happens in both the blocking and non-blocking cases.

The patch attached seems to fix the issue.

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>

[-- Attachment #2: pipe-corinna4-fix.patch --]
[-- Type: application/octet-stream, Size: 3483 bytes --]

diff --git a/winsup/cygwin/fhandler_pipe.cc b/winsup/cygwin/fhandler_pipe.cc
index 7a5cefb3d..5b6b98892 100644
--- a/winsup/cygwin/fhandler_pipe.cc
+++ b/winsup/cygwin/fhandler_pipe.cc
@@ -222,6 +222,7 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
   DWORD waitret = WAIT_OBJECT_0;
   bool keep_looping = false;
   size_t orig_len = len;
+  size_t total_len = 0;
 
   if (!len)
     return;
@@ -236,29 +237,37 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
 
   do
     {
-      len = orig_len;
+      char *ptr1 = (char *) ptr + total_len;
+      len = orig_len - total_len;
       keep_looping = false;
       if (evt)
 	ResetEvent (evt);
-      if (!is_nonblocking ())
+
+      FILE_PIPE_LOCAL_INFORMATION fpli;
+      ULONG reader_count;
+      ULONG max_len = 64;
+
+      WaitForSingleObject (read_mtx, INFINITE);
+
+      /* Make sure never to request more bytes than half the pipe
+	 buffer size.  Every pending read lowers WriteQuotaAvailable
+	 on the write side and thus affects select's ability to return
+	 more or less reliable info whether a write succeeds or not.
+
+	 Let the size of the request depend on the number of readers
+	 at the time. */
+      status = NtQueryInformationFile (get_handle (), &io,
+				       &fpli, sizeof (fpli),
+				       FilePipeLocalInformation);
+      if (NT_SUCCESS (status) && fpli.ReadDataAvailable == 0)
 	{
-	  FILE_PIPE_LOCAL_INFORMATION fpli;
-	  ULONG reader_count;
-	  ULONG max_len = 64;
-
-	  WaitForSingleObject (read_mtx, INFINITE);
-
-	  /* Make sure never to request more bytes than half the pipe
-	     buffer size.  Every pending read lowers WriteQuotaAvailable
-	     on the write side and thus affects select's ability to return
-	     more or less reliable info whether a write succeeds or not.
-
-	     Let the size of the request depend on the number of readers
-	     at the time. */
-	  status = NtQueryInformationFile (get_handle (), &io,
-					   &fpli, sizeof (fpli),
-					   FilePipeLocalInformation);
-	  if (NT_SUCCESS (status) && fpli.ReadDataAvailable == 0)
+	  if (total_len != 0)
+	    {
+	      len = total_len;
+	      ReleaseMutex (read_mtx);
+	      break;
+	    }
+	  if (!is_nonblocking ())
 	    {
 	      reader_count = get_obj_handle_count (get_handle ());
 	      if (reader_count < 10)
@@ -267,10 +276,11 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
 		len = max_len;
 	    }
 	}
-      status = NtReadFile (get_handle (), evt, NULL, NULL, &io, ptr,
+
+      status = NtReadFile (get_handle (), evt, NULL, NULL, &io, ptr1,
 			   len, NULL, NULL);
-      if (!is_nonblocking ())
-	ReleaseMutex (read_mtx);
+      ReleaseMutex (read_mtx);
+
       if (evt && status == STATUS_PENDING)
 	{
 	  waitret = cygwait (evt);
@@ -291,9 +301,11 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
 	}
       else if (NT_SUCCESS (status))
 	{
-	  len = io.Information;
-	  if (len == 0)
+	  total_len += io.Information;
+	  if (total_len < orig_len)
 	    keep_looping = true;
+	  else
+	    len = total_len;
 	}
       else
 	{
@@ -308,9 +320,11 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
 	    case STATUS_MORE_ENTRIES:
 	    case STATUS_BUFFER_OVERFLOW:
 	      /* `io.Information' is supposedly valid.  */
-	      len = io.Information;
-	      if (len == 0)
+	      total_len += io.Information;
+	      if (total_len < orig_len)
 		keep_looping = true;
+	      else
+		len = total_len;
 	      break;
 	    case STATUS_PIPE_LISTENING:
 	    case STATUS_PIPE_EMPTY:

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-03 10:00                                                                                     ` Takashi Yano
  2021-09-03 10:13                                                                                       ` Takashi Yano
@ 2021-09-03 10:38                                                                                       ` Takashi Yano
  1 sibling, 0 replies; 250+ messages in thread
From: Takashi Yano @ 2021-09-03 10:38 UTC (permalink / raw)
  To: cygwin-developers

On Fri, 3 Sep 2021 19:00:46 +0900
Takashi Yano wrote:
> On Thu, 2 Sep 2021 21:35:21 +0200
> Corinna Vinschen wrote:
> > On Sep  2 21:00, Corinna Vinschen wrote:
> > > On Sep  2 09:01, Ken Brown wrote:
> > > > On 9/2/2021 4:17 AM, Corinna Vinschen wrote:
> > > > > What if the readers never request more than, say, 50 or even 25% of the
> > > > > available buffer space?  Our buffer is 64K and there's no guarantee that
> > > > > any read > PIPE_BUF (== 4K) is atomic anyway.  This can work without
> > > > > having to check the other side of the pipe.  Something like this,
> > > > > ignoring border cases:
> > > > > 
> > > > > pipe::create()
> > > > > {
> > > > >     [...]
> > > > >     mutex = CreateMutex();
> > > > > }
> > > > > 
> > > > > pipe::raw_read(char *buf, size_t num_requested)
> > > > > {
> > > > >    if (blocking)
> > > > >      {
> > > > >        WFSO(mutex);
> > > > >        NtQueryInformationFile(FilePipeLocalInformation);
> > > > >        if (!fpli.ReadDataAvailable
> > > > > 	  && num_requested > fpli.InboundQuota / 4)
> > > > > 	num_requested = fpli.InboundQuota / 4;
> > > > >        NtReadFile(pipe, buf, num_requested);
> > > > >        ReleaseMutex(mutex);
> > > > >      }
> > > > > }
> > > > > 
> > > > > It's not entirely foolproof, but it should fix 99% of the cases.
> > > > 
> > > > I like it!
> > > > 
> > > > Do you think there's anything we can or should do to avoid a deadlock in the
> > > > rare cases where this fails?  The only thing I can think of immediately is
> > > > to always impose a timeout if select is called with infinite timeout on the
> > > > write side of a pipe, after which we report that the pipe is write ready.
> > > > After all, we've lived since 2008 with a bug that caused select to *always*
> > > > report write ready.
> > > 
> > > Indeed.  Hmm.  What timeout are you thinking of?  Seconds?  Minutes?
> > > 
> > > > Alternatively, we could just wait and see if there's an actual use case in
> > > > which someone encounters a deadlock.
> > > 
> > > Or that.  Fixing up select isn't too hard in that case, I guess.
> > 
> > It's getting too late again.  I drop off for tonight, but I attached
> > my POC code I have so far.  It also adds the snippets from my previous
> > patch which fixes stuff Takashi found during testing.  It also fixes
> > something which looks like a bug in raw_write:
> > 
> > -	  ptr = ((char *) ptr) + chunk;
> > +	  ptr = ((char *) ptr) + nbytes_now;
> > 
> > Incrementing ptr by chunk bytes while only nbytes_now have been written
> > looks incorrect.
> > 
> > As for the reader, it makes the # of bytes to read dependent on the
> > number of reader handles.  I don't know if that's such a bright idea,
> > but this can be changed easily.
> > 
> > Anyway, this runs all my testcases successfully but they are anything
> > but thorough.
> > 
> > Patch relative to topic/pipe attached.  Would you both mind taking a
> > scrutinizing look?
> 
> Thanks.
> 
> Your code seems to make read() return only partial data even
> if the pipe still has more data. Is this by design?
> 
> This happens in both the blocking and non-blocking cases.

Sorry, this may only happen if the pipe is in blocking mode.

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-03 10:13                                                                                       ` Takashi Yano
@ 2021-09-03 11:31                                                                                         ` Corinna Vinschen
  2021-09-03 11:41                                                                                           ` Corinna Vinschen
  0 siblings, 1 reply; 250+ messages in thread
From: Corinna Vinschen @ 2021-09-03 11:31 UTC (permalink / raw)
  To: cygwin-developers

On Sep  3 19:13, Takashi Yano wrote:
> On Fri, 3 Sep 2021 19:00:46 +0900
> Takashi Yano wrote:
> > On Thu, 2 Sep 2021 21:35:21 +0200
> > Corinna Vinschen wrote:
> > > On Sep  2 21:00, Corinna Vinschen wrote:
> > > It's getting too late again.  I drop off for tonight, but I attached
> > > my POC code I have so far.  It also adds the snippets from my previous
> > > patch which fixes stuff Takashi found during testing.  It also fixes
> > > something which looks like a bug in raw_write:
> > > 
> > > -	  ptr = ((char *) ptr) + chunk;
> > > +	  ptr = ((char *) ptr) + nbytes_now;
> > > 
> > > Incrementing ptr by chunk bytes while only nbytes_now have been written
> > > looks incorrect.
> > > 
> > > As for the reader, it makes the # of bytes to read dependent on the
> > > number of reader handles.  I don't know if that's such a bright idea,
> > > but this can be changed easily.
> > > 
> > > Anyway, this runs all my testcases successfully but they are anything
> > > but thorough.
> > > 
> > > Patch relative to topic/pipe attached.  Would you both mind taking a
> > > scrutinizing look?
> > 
> > Thanks.
> > 
> > Your code seems to make read() return only partial data even
> > if the pipe still has more data. Is this by design?
> > 
> > This happens in both the blocking and non-blocking cases.
> 
> The patch attached seems to fix the issue.

I'm sorry, but I don't see what your patch is supposed to do in the
first place.  What I see is that it now calls NtQueryInformationFile
even in the non-blocking case, which is not supposed to happen.

I'm a bit puzzled what the actual bug is.

The code changing len is only called if there's no data in the pipe.
In that case we only request a partial buffer so as not to block
the writer on select.

If there *is* data in the pipe, it will just go straight to the
NtReadFile code without changing len.
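
To make the rule above concrete, here is a minimal sketch of the
decision only.  PipeInfo and request_size() are hypothetical names
invented for illustration; they stand in for the two
FILE_PIPE_LOCAL_INFORMATION fields used and for the len adjustment in
raw_read, and are not part of fhandler_pipe.cc.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the FILE_PIPE_LOCAL_INFORMATION fields
// that matter here.
struct PipeInfo
{
  std::size_t read_data_available;  // bytes currently buffered in the pipe
  std::size_t inbound_quota;        // total size of the pipe buffer
};

// Returns the number of bytes a blocking reader should request.
// The request is capped at max_len only when the pipe is empty;
// if data is already buffered, the caller's len is passed through.
std::size_t
request_size (const PipeInfo &info, std::size_t len, std::size_t max_len)
{
  assert (max_len > 0);
  if (info.read_data_available == 0 && len > max_len)
    return max_len;
  return len;
}
```

With a 64K buffer and max_len set to half of it, a 64K request on an
empty pipe is cut down to 32K, while the same request on a pipe that
already holds data goes through unchanged.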

Where's the mistake?


Thanks,
Corinna

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-03 11:31                                                                                         ` Corinna Vinschen
@ 2021-09-03 11:41                                                                                           ` Corinna Vinschen
  2021-09-03 12:13                                                                                             ` Ken Brown
  2021-09-03 12:22                                                                                             ` Takashi Yano
  0 siblings, 2 replies; 250+ messages in thread
From: Corinna Vinschen @ 2021-09-03 11:41 UTC (permalink / raw)
  To: cygwin-developers

On Sep  3 13:31, Corinna Vinschen wrote:
> On Sep  3 19:13, Takashi Yano wrote:
> > On Fri, 3 Sep 2021 19:00:46 +0900
> > Takashi Yano wrote:
> > > On Thu, 2 Sep 2021 21:35:21 +0200
> > > Corinna Vinschen wrote:
> > > > On Sep  2 21:00, Corinna Vinschen wrote:
> > > > It's getting too late again.  I drop off for tonight, but I attached
> > > > my POC code I have so far.  It also adds the snippets from my previous
> > > > patch which fixes stuff Takashi found during testing.  It also fixes
> > > > something which looks like a bug in raw_write:
> > > > 
> > > > -	  ptr = ((char *) ptr) + chunk;
> > > > +	  ptr = ((char *) ptr) + nbytes_now;
> > > > 
> > > > Incrementing ptr by chunk bytes while only nbytes_now have been written
> > > > looks incorrect.
> > > > 
> > > > As for the reader, it makes the # of bytes to read dependent on the
> > > > number of reader handles.  I don't know if that's such a bright idea,
> > > > but this can be changed easily.
> > > > 
> > > > Anyway, this runs all my testcases successfully but they are anything
> > > > but thorough.
> > > > 
> > > > Patch relative to topic/pipe attached.  Would you both mind taking a
> > > > scrutinizing look?
> > > 
> > > Thanks.
> > > 
> > > Your code seems to make read() return only partial data even
> > > if the pipe still has more data. Is this by design?
> > > 
> > > This happens in both the blocking and non-blocking cases.
> > 
> > The patch attached seems to fix the issue.
> 
> I'm sorry, but I don't see what your patch is supposed to do in the
> first place.  What I see is that it now calls NtQueryInformationFile
> even in the non-blocking case, which is not supposed to happen.
> 
> I'm a bit puzzled what the actual bug is.
> 
> The code changing len is only called if there's no data in the pipe.
> In that case we only request a partial buffer so as not to block
> the writer on select.
> 
> If there *is* data in the pipe, it will just go straight to the
> NtReadFile code without changing len.
> 
> Where's the mistake?

Oh, wait.  Do you mean, if we only request less than len bytes, but
after NtReadFile there's still data in the buffer, we should try to
deplete the buffer up to len bytes in a subsequent NtReadFile?

I thought this is unnecessary, actually, because of POSIX:

   The standard developers considered adding atomicity requirements  to  a
   pipe  or FIFO, but recognized that due to the nature of pipes and FIFOs
   there could be no guarantee of atomicity of reads of {PIPE_BUF} or  any
   other size that would be an aid to applications portability.


Corinna

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-03 11:41                                                                                           ` Corinna Vinschen
@ 2021-09-03 12:13                                                                                             ` Ken Brown
  2021-09-03 15:00                                                                                               ` Corinna Vinschen
  2021-09-03 12:22                                                                                             ` Takashi Yano
  1 sibling, 1 reply; 250+ messages in thread
From: Ken Brown @ 2021-09-03 12:13 UTC (permalink / raw)
  To: cygwin-developers

On 9/3/2021 7:41 AM, Corinna Vinschen wrote:
> On Sep  3 13:31, Corinna Vinschen wrote:
>> On Sep  3 19:13, Takashi Yano wrote:
>>> On Fri, 3 Sep 2021 19:00:46 +0900
>>> Takashi Yano wrote:
>>>> On Thu, 2 Sep 2021 21:35:21 +0200
>>>> Corinna Vinschen wrote:
>>>>> On Sep  2 21:00, Corinna Vinschen wrote:
>>>>> It's getting too late again.  I drop off for tonight, but I attached
>>>>> my POC code I have so far.  It also adds the snippets from my previous
>>>>> patch which fixes stuff Takashi found during testing.  It also fixes
>>>>> something which looks like a bug in raw_write:
>>>>>
>>>>> -	  ptr = ((char *) ptr) + chunk;
>>>>> +	  ptr = ((char *) ptr) + nbytes_now;
>>>>>
>>>>> Incrementing ptr by chunk bytes while only nbytes_now have been written
>>>>> looks incorrect.
>>>>>
>>>>> As for the reader, it makes the # of bytes to read dependent on the
>>>>> number of reader handles.  I don't know if that's such a bright idea,
>>>>> but this can be changed easily.
>>>>>
>>>>> Anyway, this runs all my testcases successfully but they are anything
>>>>> but thorough.
>>>>>
>>>>> Patch relative to topic/pipe attached.  Would you both mind taking a
>>>>> scrutinizing look?
>>>>
>>>> Thanks.
>>>>
>>>> Your code seems to make read() return only partial data even
>>>> if the pipe still has more data. Is this by design?
>>>>
>>>> This happens in both the blocking and non-blocking cases.
>>>
>>> The patch attached seems to fix the issue.
>>
>> I'm sorry, but I don't see what your patch is supposed to do in the
>> first place.  What I see is that it now calls NtQueryInformationFile
>> even in the non-blocking case, which is not supposed to happen.
>>
>> I'm a bit puzzled what the actual bug is.
>>
>> The code changing len is only called if there's no data in the pipe.
>> In that case we only request a partial buffer so as not to block
>> the writer on select.
>>
>> If there *is* data in the pipe, it will just go straight to the
>> NtReadFile code without changing len.
>>
>> Where's the mistake?
> 
> Oh, wait.  Do you mean, if we only request less than len bytes, but
> after NtReadFile there's still data in the buffer, we should try to
> deplete the buffer up to len bytes in a subsequent NtReadFile?
> 
> I thought this is unnecessary, actually, because of POSIX:
> 
>     The standard developers considered adding atomicity requirements  to  a
>     pipe  or FIFO, but recognized that due to the nature of pipes and FIFOs
>     there could be no guarantee of atomicity of reads of {PIPE_BUF} or  any
>     other size that would be an aid to applications portability.

I agree.  I think if read returns less than what was requested, it's up to the 
caller to call read again if desired.
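
For illustration, a caller that wants a full buffer could loop roughly
like this.  This is only a sketch: read_fully() is a hypothetical
helper name, not something in Cygwin or POSIX, and it deliberately
reports -1 on a real error even if some bytes were already collected.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <unistd.h>

// Hypothetical caller-side helper: keep calling read() until `want`
// bytes have arrived, EOF is reached, or a real error occurs.
// Returns the number of bytes read, or -1 on error with errno set.
ssize_t
read_fully (int fd, void *buf, std::size_t want)
{
  assert (buf != nullptr || want == 0);
  char *p = static_cast<char *> (buf);
  std::size_t got = 0;
  while (got < want)
    {
      ssize_t n = read (fd, p + got, want - got);
      if (n > 0)
        got += static_cast<std::size_t> (n);  // short read: ask again
      else if (n == 0)
        break;                                // EOF: all writers closed
      else if (errno != EINTR)
        return -1;                            // real error
      // EINTR: nothing was transferred by this call, just retry
    }
  return static_cast<ssize_t> (got);
}
```

An application written this way gets the same result whether a single
read() returns everything that is buffered or only part of it.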

One tiny thing I noticed: get_obj_handle_count can return 0.  So the line

   reader_count = get_obj_handle_count (get_handle ());

should be

   reader_count = get_obj_handle_count (get_handle ()) ?: 1;

Or else get_obj_handle_count should be changed to return 1 instead of 0 if 
NtQueryObject fails.
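
A standalone sketch of the guarded computation, to show why the zero
case matters.  max_request() and its parameters are hypothetical names
used for illustration; handle_count stands in for whatever
get_obj_handle_count returns, and the constants 64, 10 and the factor 2
are taken from the patch under discussion.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper mirroring the max_len computation in raw_read.
// handle_count may be 0 if the handle count could not be queried.
std::size_t
max_request (std::size_t inbound_quota, unsigned long handle_count)
{
  std::size_t max_len = 64;
  // Portable spelling of the GNU `?: 1' shorthand: treat 0 as 1.
  unsigned long reader_count = handle_count ? handle_count : 1;
  assert (reader_count != 0);   // the division below is now safe
  if (reader_count < 10)
    max_len = inbound_quota / (2 * reader_count);
  return max_len;
}
```

Without the guard, a count of 0 passes the `reader_count < 10' test and
the division by `2 * reader_count' divides by zero.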

Ken

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-03 11:41                                                                                           ` Corinna Vinschen
  2021-09-03 12:13                                                                                             ` Ken Brown
@ 2021-09-03 12:22                                                                                             ` Takashi Yano
  2021-09-03 13:27                                                                                               ` Ken Brown
  2021-09-03 15:37                                                                                               ` Corinna Vinschen
  1 sibling, 2 replies; 250+ messages in thread
From: Takashi Yano @ 2021-09-03 12:22 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 3507 bytes --]

On Fri, 3 Sep 2021 13:41:45 +0200
Corinna Vinschen wrote:
> On Sep  3 13:31, Corinna Vinschen wrote:
> > On Sep  3 19:13, Takashi Yano wrote:
> > > On Fri, 3 Sep 2021 19:00:46 +0900
> > > Takashi Yano wrote:
> > > > On Thu, 2 Sep 2021 21:35:21 +0200
> > > > Corinna Vinschen wrote:
> > > > > On Sep  2 21:00, Corinna Vinschen wrote:
> > > > > It's getting too late again.  I drop off for tonight, but I attached
> > > > > my POC code I have so far.  It also adds the snippets from my previous
> > > > > patch which fixes stuff Takashi found during testing.  It also fixes
> > > > > something which looks like a bug in raw_write:
> > > > > 
> > > > > -	  ptr = ((char *) ptr) + chunk;
> > > > > +	  ptr = ((char *) ptr) + nbytes_now;
> > > > > 
> > > > > Incrementing ptr by chunk bytes while only nbytes_now have been written
> > > > > looks incorrect.
> > > > > 
> > > > > As for the reader, it makes the # of bytes to read dependent on the
> > > > > number of reader handles.  I don't know if that's such a bright idea,
> > > > > but this can be changed easily.
> > > > > 
> > > > > Anyway, this runs all my testcases successfully but they are anything
> > > > > but thorough.
> > > > > 
> > > > > Patch relative to topic/pipe attached.  Would you both mind taking a
> > > > > scrutinizing look?
> > > > 
> > > > Thanks.
> > > > 
> > > > Your code seems to make read() return only partial data even
> > > > if the pipe still has more data. Is this by design?
> > > > 
> > > > This happens in both the blocking and non-blocking cases.
> > > 
> > > The patch attached seems to fix the issue.
> > 
> > I'm sorry, but I don't see what your patch is supposed to do in the
> > first place.  What I see is that it now calls NtQueryInformationFile
> > even in the non-blocking case, which is not supposed to happen.
> > 
> > I'm a bit puzzled what the actual bug is.
> > 
> > The code changing len is only called if there's no data in the pipe.
> > In that case we only request a partial buffer so as not to block
> > the writer on select.
> > 
> > If there *is* data in the pipe, it will just go straight to the
> > NtReadFile code without changing len.
> > 
> > Where's the mistake?
> 
> Oh, wait.  Do you mean, if we only request less than len bytes, but
> after NtReadFile there's still data in the buffer, we should try to
> deplete the buffer up to len bytes in a subsequent NtReadFile?

Yes. I am sorry, my intent was not clear because I did more than
necessary in the previous patch. Please see the attached revised patch.

> I thought this is unnecessary, actually, because of POSIX:
> 
>    The standard developers considered adding atomicity requirements  to  a
>    pipe  or FIFO, but recognized that due to the nature of pipes and FIFOs
>    there could be no guarantee of atomicity of reads of {PIPE_BUF} or  any
>    other size that would be an aid to applications portability.

POSIX says:
    The value returned may be less than nbyte if the number of bytes left
    in the file is less than nbyte, if the read() request was interrupted
    by a signal, or if the file is a pipe or FIFO or special file and has
                                                                      ~~~
    fewer than nbyte bytes immediately available for reading.
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

https://pubs.opengroup.org/onlinepubs/009604599/functions/read.html

Read the other way around, this implies that read() should return all
data immediately available, I think.
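
The behavior I have in mind can be sketched with a toy model.  ToyPipe
and read_available() are hypothetical names for illustration only; the
`chunk' parameter stands in for a capped low-level read request, and
nothing here touches the real NtReadFile path.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Toy model of a pipe: `data' is what is immediately available.
struct ToyPipe
{
  std::string data;

  // One low-level read that hands out at most `chunk' bytes.
  std::size_t read_some (char *dst, std::size_t len, std::size_t chunk)
  {
    std::size_t n = std::min ({len, chunk, data.size ()});
    data.copy (dst, n);
    data.erase (0, n);
    return n;
  }
};

// Accumulate partial reads until the caller's buffer is full or
// nothing more is immediately available.
std::size_t
read_available (ToyPipe &p, char *dst, std::size_t len, std::size_t chunk)
{
  assert (chunk > 0);
  std::size_t total = 0;
  while (total < len)
    {
      if (p.data.empty ())
        break;  // a real blocking read would wait here only if total == 0
      total += p.read_some (dst + total, len - total, chunk);
    }
  return total;
}
```

With 10 bytes buffered, an 8-byte request is filled completely even
though each low-level read is limited to 3 bytes; the next call returns
the remaining 2 bytes.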

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>

[-- Attachment #2: pipe-corinna4-fix.patch --]
[-- Type: application/octet-stream, Size: 2301 bytes --]

diff --git a/winsup/cygwin/fhandler_pipe.cc b/winsup/cygwin/fhandler_pipe.cc
index 85730d039..ef7823ae5 100644
--- a/winsup/cygwin/fhandler_pipe.cc
+++ b/winsup/cygwin/fhandler_pipe.cc
@@ -222,6 +222,7 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
   DWORD waitret = WAIT_OBJECT_0;
   bool keep_looping = false;
   size_t orig_len = len;
+  size_t total_len = 0;
 
   if (!len)
     return;
@@ -236,7 +237,8 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
 
   do
     {
-      len = orig_len;
+      char *ptr1 = (char *) ptr + total_len;
+      len = orig_len - total_len;
       keep_looping = false;
       if (evt)
 	ResetEvent (evt);
@@ -260,6 +262,12 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
 					   FilePipeLocalInformation);
 	  if (NT_SUCCESS (status) && fpli.ReadDataAvailable == 0)
 	    {
+	      if (total_len != 0)
+		{
+		  len = total_len;
+		  ReleaseMutex (read_mtx);
+		  break;
+		}
 	      reader_count = get_obj_handle_count (get_handle ());
 	      if (reader_count < 10)
 		max_len = fpli.InboundQuota / (2 * reader_count);
@@ -267,7 +275,7 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
 		len = max_len;
 	    }
 	}
-      status = NtReadFile (get_handle (), evt, NULL, NULL, &io, ptr,
+      status = NtReadFile (get_handle (), evt, NULL, NULL, &io, ptr1,
 			   len, NULL, NULL);
       if (!is_nonblocking ())
 	ReleaseMutex (read_mtx);
@@ -291,9 +299,13 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
 	}
       else if (NT_SUCCESS (status))
 	{
-	  len = io.Information;
-	  if (len == 0)
+	  if (io.Information == 0)
 	    keep_looping = true;
+	  total_len += io.Information;
+	  if (!is_nonblocking () && total_len < orig_len)
+	    keep_looping = true;
+	  else
+	    len = total_len;
 	}
       else
 	{
@@ -308,9 +320,13 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
 	    case STATUS_MORE_ENTRIES:
 	    case STATUS_BUFFER_OVERFLOW:
 	      /* `io.Information' is supposedly valid.  */
-	      len = io.Information;
-	      if (len == 0)
+	      if (io.Information == 0)
+		keep_looping = true;
+	      total_len += io.Information;
+	      if (!is_nonblocking () && total_len < orig_len)
 		keep_looping = true;
+	      else
+		len = total_len;
 	      break;
 	    case STATUS_PIPE_LISTENING:
 	    case STATUS_PIPE_EMPTY:

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-03 12:22                                                                                             ` Takashi Yano
@ 2021-09-03 13:27                                                                                               ` Ken Brown
  2021-09-03 15:37                                                                                               ` Corinna Vinschen
  1 sibling, 0 replies; 250+ messages in thread
From: Ken Brown @ 2021-09-03 13:27 UTC (permalink / raw)
  To: cygwin-developers

On 9/3/2021 8:22 AM, Takashi Yano wrote:
> On Fri, 3 Sep 2021 13:41:45 +0200
> Corinna Vinschen wrote:
>> On Sep  3 13:31, Corinna Vinschen wrote:
>>> On Sep  3 19:13, Takashi Yano wrote:
>>>> On Fri, 3 Sep 2021 19:00:46 +0900
>>>> Takashi Yano wrote:
>>>>> On Thu, 2 Sep 2021 21:35:21 +0200
>>>>> Corinna Vinschen wrote:
>>>>>> On Sep  2 21:00, Corinna Vinschen wrote:
>>>>>> It's getting too late again.  I drop off for tonight, but I attached
>>>>>> my POC code I have so far.  It also adds the snippets from my previous
>>>>>> patch which fixes stuff Takashi found during testing.  It also fixes
>>>>>> something which looks like a bug in raw_write:
>>>>>>
>>>>>> -	  ptr = ((char *) ptr) + chunk;
>>>>>> +	  ptr = ((char *) ptr) + nbytes_now;
>>>>>>
>>>>>> Incrementing ptr by chunk bytes while only nbytes_now have been written
>>>>>> looks incorrect.
>>>>>>
>>>>>> As for the reader, it makes the # of bytes to read dependent on the
>>>>>> number of reader handles.  I don't know if that's such a bright idea,
>>>>>> but this can be changed easily.
>>>>>>
>>>>>> Anyway, this runs all my testcases successfully but they are anything
>>>>>> but thorough.
>>>>>>
>>>>>> Patch relative to topic/pipe attached.  Would you both mind taking a
>>>>>> scrutinizing look?
>>>>>
>>>>> Thanks.
>>>>>
>>>>> Your code seems to make read() return only partial data even
>>>>> if the pipe still has more data. Is this by design?
>>>>>
>>>>> This happens in both the blocking and non-blocking cases.
>>>>
>>>> The patch attached seems to fix the issue.
>>>
>>> I'm sorry, but I don't see what your patch is supposed to do in the
>>> first place.  What I see is that it now calls NtQueryInformationFile
>>> even in the non-blocking case, which is not supposed to happen.
>>>
>>> I'm a bit puzzled what the actual bug is.
>>>
>>> The code changing len is only called if there's no data in the pipe.
>>> In that case we only request a partial buffer so as not to block
>>> the writer on select.
>>>
>>> If there *is* data in the pipe, it will just go straight to the
>>> NtReadFile code without changing len.
>>>
>>> Where's the mistake?
>>
>> Oh, wait.  Do you mean, if we only request less than len bytes, but
>> after NtReadFile there's still data in the buffer, we should try to
>> deplete the buffer up to len bytes in a subsequent NtReadFile?
> 
> Yes. I am sorry, my intent was not clear because I did more than
> necessary in the previous patch. Please see the attached revised patch.
> 
>> I thought this is unnecessary, actually, because of POSIX:
>>
>>     The standard developers considered adding atomicity requirements  to  a
>>     pipe  or FIFO, but recognized that due to the nature of pipes and FIFOs
>>     there could be no guarantee of atomicity of reads of {PIPE_BUF} or  any
>>     other size that would be an aid to applications portability.
> 
> POSIX says:
>      The value returned may be less than nbyte if the number of bytes left
>      in the file is less than nbyte, if the read() request was interrupted
>      by a signal, or if the file is a pipe or FIFO or special file and has
>                                                                        ~~~
>      fewer than nbyte bytes immediately available for reading.
>      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> 
> https://pubs.opengroup.org/onlinepubs/009604599/functions/read.html
> 
> If it is turned over, read() should read all data immediately available,
> I think.

I understand the reasoning now, but I think your patch isn't quite right.  As it 
stands, if the call to NtQueryInformationFile fails but total_length != 0, 
you're trying to read again without knowing that there's data in the pipe.

Also, I think you need the following:

diff --git a/winsup/cygwin/fhandler_pipe.cc b/winsup/cygwin/fhandler_pipe.cc
index ef7823ae5..46bb96961 100644
--- a/winsup/cygwin/fhandler_pipe.cc
+++ b/winsup/cygwin/fhandler_pipe.cc
@@ -348,8 +348,13 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
      CloseHandle (evt);
    if (status == STATUS_THREAD_SIGNALED)
      {
-      set_errno (EINTR);
-      len = (size_t) -1;
+      if (total_len == 0)
+       {
+         set_errno (EINTR);
+         len = (size_t) -1;
+       }
+      else
+       len = total_len;
      }
    else if (status == STATUS_THREAD_CANCELED)
      pthread::static_cancel_self ();
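
The rule this hunk implements can be modelled on its own.  The following
is only a minimal sketch, not Cygwin code: wait_result and finish_read
are hypothetical names, and errno is set directly instead of through
set_errno.  It shows that a signal only turns into EINTR when nothing
has been read yet; otherwise the bytes already collected are returned.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>

// Hypothetical outcome of waiting for a pending read to complete.
enum class wait_result { completed, signaled };

// Decide what a read loop should report once it stops.
// total_len is the number of bytes already copied to the caller's buffer.
long
finish_read (wait_result w, std::size_t total_len)
{
  if (w == wait_result::signaled && total_len == 0)
    {
      // Interrupted before any data arrived: fail with EINTR.
      errno = EINTR;
      return -1;
    }
  // Either the read completed, or a signal arrived after some data was
  // already read.  In both cases report the bytes we have.
  return static_cast<long> (total_len);
}
```

With this shape, a reader that was interrupted after receiving 4096 bytes
gets 4096 back and can simply call read() again for the rest.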

Ken


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-03 12:13                                                                                             ` Ken Brown
@ 2021-09-03 15:00                                                                                               ` Corinna Vinschen
  2021-09-03 15:14                                                                                                 ` Ken Brown
  0 siblings, 1 reply; 250+ messages in thread
From: Corinna Vinschen @ 2021-09-03 15:00 UTC (permalink / raw)
  To: cygwin-developers

On Sep  3 08:13, Ken Brown wrote:
> One tiny thing I noticed: get_obj_handle_count can return 0.  So the line
> 
>   reader_count = get_obj_handle_count (get_handle ());
> 
> should be
> 
>   reader_count = get_obj_handle_count (get_handle ()) ?: 1;
> 
> Or else get_obj_handle_count should be changed to return 1 instead of 0 if
> NtQueryObject fails.

We're in the reader with a valid read handle asking for the number of
open handles.  NtQueryObject can only fail if the handle is invalid.
I don't see how this could be the case here.


Corinna


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-03 15:00                                                                                               ` Corinna Vinschen
@ 2021-09-03 15:14                                                                                                 ` Ken Brown
  2021-09-03 15:17                                                                                                   ` Corinna Vinschen
  0 siblings, 1 reply; 250+ messages in thread
From: Ken Brown @ 2021-09-03 15:14 UTC (permalink / raw)
  To: cygwin-developers

On 9/3/2021 11:00 AM, Corinna Vinschen wrote:
> On Sep  3 08:13, Ken Brown wrote:
>> One tiny thing I noticed: get_obj_handle_count can return 0.  So the line
>>
>>    reader_count = get_obj_handle_count (get_handle ());
>>
>> should be
>>
>>    reader_count = get_obj_handle_count (get_handle ()) ?: 1;
>>
>> Or else get_obj_handle_count should be changed to return 1 instead of 0 if
>> NtQueryObject fails.
> 
> We're in the reader with a valid read handle asking for the number of
> open handles.  NtQueryObject can only fail if the handle is invalid.
> I don't see how this could be the case here.

I don't either.  I was just being paranoid and wanted to avoid the 
appearance that we might divide by 0.  But if you're sure it's 
impossible, that's fine.
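
For illustration, here is a minimal sketch of the guard I had in mind.
It is not the actual Cygwin code: per_reader_share is a hypothetical
helper, and the idea that a buffer size is divided by the reader count
is an assumption made for the example.  The portable conditional below
is the standard spelling of the GNU extension `x ?: 1`.

```cpp
#include <cassert>
#include <cstddef>

// handle_count stands in for the value returned by get_obj_handle_count,
// which reports 0 if the underlying query fails.
std::size_t
per_reader_share (std::size_t buffer_size, std::size_t handle_count)
{
  // Portable equivalent of `handle_count ?: 1` (GNU extension).
  std::size_t reader_count = handle_count ? handle_count : 1;
  assert (reader_count != 0);
  return buffer_size / reader_count;
}
```

With the fallback, a failed query behaves like a single reader instead
of dividing by zero.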

Ken


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-03 15:14                                                                                                 ` Ken Brown
@ 2021-09-03 15:17                                                                                                   ` Corinna Vinschen
  0 siblings, 0 replies; 250+ messages in thread
From: Corinna Vinschen @ 2021-09-03 15:17 UTC (permalink / raw)
  To: cygwin-developers

On Sep  3 11:14, Ken Brown wrote:
> On 9/3/2021 11:00 AM, Corinna Vinschen wrote:
> > On Sep  3 08:13, Ken Brown wrote:
> > > One tiny thing I noticed: get_obj_handle_count can return 0.  So the line
> > > 
> > >    reader_count = get_obj_handle_count (get_handle ());
> > > 
> > > should be
> > > 
> > >    reader_count = get_obj_handle_count (get_handle ()) ?: 1;
> > > 
> > > Or else get_obj_handle_count should be changed to return 1 instead of 0 if
> > > NtQueryObject fails.
> > 
> > We're in the reader with a valid read handle asking for the number of
> > open handles.  NtQueryObject can only fail if the handle is invalid.
> > I don't see how this could be the case here.
> 
> I don't either.  I was just being paranoid and wanted to avoid the
> appearance that we might divide by 0.

Good point.


Corinna


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-03 12:22                                                                                             ` Takashi Yano
  2021-09-03 13:27                                                                                               ` Ken Brown
@ 2021-09-03 15:37                                                                                               ` Corinna Vinschen
  2021-09-04 12:02                                                                                                 ` Takashi Yano
  1 sibling, 1 reply; 250+ messages in thread
From: Corinna Vinschen @ 2021-09-03 15:37 UTC (permalink / raw)
  To: cygwin-developers

On Sep  3 21:22, Takashi Yano wrote:
> On Fri, 3 Sep 2021 13:41:45 +0200
> Corinna Vinschen wrote:
> > Oh, wait.  Do you mean, if we only request less than len bytes, but
> > after NtReadFile there's still data in the buffer, we should try to
> > deplete the buffer up to len bytes in a subsequent NtReadFile?
> 
> Yes. I am sorry, my intent was not clear because I did more than
> necessary in the previous patch. Please see attached patch revised.
> 
> > I thought this is unnecessary, actually, because of POSIX:
> > 
> >    The standard developers considered adding atomicity requirements  to  a
> >    pipe  or FIFO, but recognized that due to the nature of pipes and FIFOs
> >    there could be no guarantee of atomicity of reads of {PIPE_BUF} or  any
> >    other size that would be an aid to applications portability.
> 
> POSIX says:
>     The value returned may be less than nbyte if the number of bytes left
>     in the file is less than nbyte, if the read() request was interrupted
>     by a signal, or if the file is a pipe or FIFO or special file and has
>                                                                       ~~~
>     fewer than nbyte bytes immediately available for reading.
>     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> 
> https://pubs.opengroup.org/onlinepubs/009604599/functions/read.html
> 
> If it is turned over, read() should read all data immediately available,
> I think.

Hmm, I see the point, but we might have another problem with that.

We can't keep the mutex while waiting on the pending read, and there
could be more than one pending read running at a time.  If so,
chances are extremely high that the data written to the buffer gets
split like this:

   reader 1                            reader 2

   calls read(65536)                   calls read(65536)

   calls NtReadFile(16384 bytes)
                                       calls NtReadFile(16384 bytes)

writer writes 65536 bytes

   wakes up and gets 16384 bytes
                                       wakes up and gets 16384 bytes
   gets the mutex, calls
   NtReadFile(32768) which 
   returns immediately with
   32768 bytes added to the
   caller's buffer.

so the buffer returned to reader 1 is 49152 bytes, with 16384 bytes
missing in the middle of it, *without* the reader knowing about that
fact.  If reader 1 only gets the first 16384 bytes, those bytes have
at least been read in a single call, so the byte order is not
unknowingly broken at the application level.

Does that make sense?
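
To make the scenario concrete, here is a toy model of it.  This is only
a sketch under simplifying assumptions: toy_pipe, first_gap and
reader1_view are hypothetical names, the pipe is a plain queue of byte
offsets, and the two pending reads are modelled as completing one after
the other once the writer has written.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Toy pipe: each "byte" is represented by its offset in the stream.
struct toy_pipe
{
  std::deque<std::size_t> data;
  std::size_t next = 0;

  void write (std::size_t n)
  {
    for (std::size_t i = 0; i < n; ++i)
      data.push_back (next++);
  }

  std::vector<std::size_t> read (std::size_t n)
  {
    std::vector<std::size_t> out;
    for (; n > 0 && !data.empty (); --n)
      {
        out.push_back (data.front ());
        data.pop_front ();
      }
    return out;
  }
};

// Index of the first element that does not directly follow its
// predecessor, or buf.size () if the buffer is contiguous.
std::size_t
first_gap (const std::vector<std::size_t> &buf)
{
  for (std::size_t i = 1; i < buf.size (); ++i)
    if (buf[i] != buf[i - 1] + 1)
      return i;
  return buf.size ();
}

// Replays the interleaving from the diagram and returns what reader 1
// ends up with in its buffer.
std::vector<std::size_t>
reader1_view ()
{
  toy_pipe p;
  p.write (65536);                                  // writer
  std::vector<std::size_t> r1 = p.read (16384);     // reader 1, pending read
  std::vector<std::size_t> r2 = p.read (16384);     // reader 2, pending read
  assert (r2.size () == 16384);
  std::vector<std::size_t> more = p.read (32768);   // reader 1, second read
  r1.insert (r1.end (), more.begin (), more.end ());
  return r1;
}
```

In this model reader 1 holds 49152 offsets, and first_gap reports a
discontinuity at index 16384, where the stream jumps from offset 16383
to offset 32768.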


Corinna


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-03  9:12                                                                                       ` Corinna Vinschen
@ 2021-09-03 19:00                                                                                         ` Ken Brown
  2021-09-03 19:53                                                                                           ` Ken Brown
  2021-09-03 19:54                                                                                           ` Corinna Vinschen
  0 siblings, 2 replies; 250+ messages in thread
From: Ken Brown @ 2021-09-03 19:00 UTC (permalink / raw)
  To: cygwin-developers

On 9/3/2021 5:12 AM, Corinna Vinschen wrote:
> I pushed my stuff to the topic/pipe branch split into hopefully useful
> chunks.  Kick me if anything is wrong or not working.

Some of the bugs you fixed in the pipe code exist in the fifo code also.  I 
started going through them and fixing them, but then I realized that 
fhandler_pipe::raw_write and fhandler_fifo::raw_write are identical.  For ease 
of maintenance, I'm thinking we should have a single function, say 
fhandler_base::raw_write_pipe or fhandler_base::raw_write_pipe_fifo, which is 
called by both of them.

WDYT?

Ken


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-03 19:00                                                                                         ` Ken Brown
@ 2021-09-03 19:53                                                                                           ` Ken Brown
  2021-09-03 19:54                                                                                           ` Corinna Vinschen
  1 sibling, 0 replies; 250+ messages in thread
From: Ken Brown @ 2021-09-03 19:53 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 691 bytes --]

On 9/3/2021 3:00 PM, Ken Brown wrote:
> On 9/3/2021 5:12 AM, Corinna Vinschen wrote:
>> I pushed my stuff to the topic/pipe branch split into hopefully useful
>> chunks.  Kick me if anything is wrong or not working.
> 
> Some of the bugs you fixed in the pipe code exist in the fifo code also.  I 
> started going through them and fixing them, but then I realized that 
> fhandler_pipe::raw_write and fhandler_fifo::raw_write are identical.  For ease 
> of maintenance, I'm thinking we should have a single function, say 
> fhandler_base::raw_write_pipe or fhandler_base::raw_write_pipe_fifo, which is 
> called by both of them.
> 
> WDYT?

Here's what that would look like (attached).

Ken

[-- Attachment #2: 0001-wip.patch --]
[-- Type: text/plain, Size: 13629 bytes --]

From 108a50006decade6dca0b91ca6f1c165bdfd20dd Mon Sep 17 00:00:00 2001
From: Ken Brown <kbrown@cornell.edu>
Date: Fri, 3 Sep 2021 15:43:17 -0400
Subject: [PATCH 1/2] wip

---
 winsup/cygwin/fhandler.cc      | 124 +++++++++++++++++++++++++++++++++
 winsup/cygwin/fhandler.h       |   6 +-
 winsup/cygwin/fhandler_fifo.cc |  98 +-------------------------
 winsup/cygwin/fhandler_pipe.cc | 121 +-------------------------------
 4 files changed, 130 insertions(+), 219 deletions(-)

diff --git a/winsup/cygwin/fhandler.cc b/winsup/cygwin/fhandler.cc
index f0c1b68f1..7d7aad216 100644
--- a/winsup/cygwin/fhandler.cc
+++ b/winsup/cygwin/fhandler.cc
@@ -280,6 +280,129 @@ fhandler_base::raw_write (const void *ptr, size_t len)
   return io.Information;
 }
 
+#define STATUS_PIPE_IS_CLOSED(status)	\
+		({ NTSTATUS _s = (status); \
+		   _s == STATUS_PIPE_CLOSING \
+		   || _s == STATUS_PIPE_BROKEN \
+		   || _s == STATUS_PIPE_EMPTY; })
+
+/* Used by fhandler_pipe::raw_write and fhandler_fifo::raw_write. */
+ssize_t __reg3
+fhandler_base::raw_write_pipe_fifo (const void *ptr, size_t len)
+{
+  size_t nbytes = 0;
+  ULONG chunk;
+  NTSTATUS status = STATUS_SUCCESS;
+  IO_STATUS_BLOCK io;
+  HANDLE evt = NULL;
+
+  if (!len)
+    return 0;
+
+  if (len <= max_atomic_write)
+    chunk = len;
+  else if (is_nonblocking ())
+    chunk = len = max_atomic_write;
+  else
+    chunk = max_atomic_write;
+
+  /* Create a wait event if the pipe or fifo is in blocking mode. */
+  if (!is_nonblocking () && !(evt = CreateEvent (NULL, false, false, NULL)))
+    {
+      __seterrno ();
+      return -1;
+    }
+
+  /* Write in chunks, accumulating a total.  If there's an error, just
+     return the accumulated total unless the first write fails, in
+     which case return -1. */
+  while (nbytes < len)
+    {
+      ULONG_PTR nbytes_now = 0;
+      size_t left = len - nbytes;
+      ULONG len1;
+      DWORD waitret = WAIT_OBJECT_0;
+
+      if (left > chunk)
+	len1 = chunk;
+      else
+	len1 = (ULONG) left;
+      /* NtWriteFile returns success with # of bytes written == 0 if writing
+         on a non-blocking pipe fails because the pipe buffer doesn't have
+	 sufficient space.
+
+	 POSIX requires
+	 - A write request for {PIPE_BUF} or fewer bytes shall have the
+	   following effect: if there is sufficient space available in the
+	   pipe, write() shall transfer all the data and return the number
+	   of bytes requested. Otherwise, write() shall transfer no data and
+	   return -1 with errno set to [EAGAIN].
+
+	 - A write request for more than {PIPE_BUF} bytes shall cause one
+	   of the following:
+
+	  - When at least one byte can be written, transfer what it can and
+	    return the number of bytes written. When all data previously
+	    written to the pipe is read, it shall transfer at least {PIPE_BUF}
+	    bytes.
+
+	  - When no data can be written, transfer no data, and return -1 with
+	    errno set to [EAGAIN]. */
+      while (len1 > 0)
+	{
+	  status = NtWriteFile (get_handle (), evt, NULL, NULL, &io,
+				(PVOID) ptr, len1, NULL, NULL);
+	  if (evt || !NT_SUCCESS (status) || io.Information > 0
+	      || len <= PIPE_BUF)
+	    break;
+	  len1 >>= 1;
+	}
+      if (evt && status == STATUS_PENDING)
+	{
+	  waitret = cygwait (evt);
+	  if (waitret == WAIT_OBJECT_0)
+	    status = io.Status;
+	}
+      if (waitret == WAIT_CANCELED)
+	status = STATUS_THREAD_CANCELED;
+      else if (waitret == WAIT_SIGNALED)
+	status = STATUS_THREAD_SIGNALED;
+      else if (isclosed ())  /* A signal handler might have closed the fd. */
+	{
+	  if (waitret == WAIT_OBJECT_0)
+	    set_errno (EBADF);
+	  else
+	    __seterrno ();
+	}
+      else if (NT_SUCCESS (status))
+	{
+	  nbytes_now = io.Information;
+	  ptr = ((char *) ptr) + nbytes_now;
+	  nbytes += nbytes_now;
+	  /* 0 bytes returned?  EAGAIN.  See above. */
+	  if (nbytes == 0)
+	    set_errno (EAGAIN);
+	}
+      else if (STATUS_PIPE_IS_CLOSED (status))
+	{
+	  set_errno (EPIPE);
+	  raise (SIGPIPE);
+	}
+      else
+	__seterrno_from_nt_status (status);
+
+      if (nbytes_now == 0)
+	break;
+    }
+  if (evt)
+    CloseHandle (evt);
+  if (status == STATUS_THREAD_SIGNALED && nbytes == 0)
+    set_errno (EINTR);
+  else if (status == STATUS_THREAD_CANCELED)
+    pthread::static_cancel_self ();
+  return nbytes ?: -1;
+}
+
 int
 fhandler_base::get_default_fmode (int flags)
 {
@@ -1464,6 +1587,7 @@ fhandler_base::fhandler_base () :
   _refcnt (0),
   openflags (0),
   unique_id (0),
+  max_atomic_write (DEFAULT_PIPEBUFSIZE),
   archetype (NULL),
   usecount (0)
 {
diff --git a/winsup/cygwin/fhandler.h b/winsup/cygwin/fhandler.h
index 032ab5fb0..ebddcca88 100644
--- a/winsup/cygwin/fhandler.h
+++ b/winsup/cygwin/fhandler.h
@@ -218,6 +218,9 @@ class fhandler_base
 
   HANDLE read_state;
 
+  /* Used by fhandler_pipe and fhandler_fifo. */
+  size_t max_atomic_write;
+
  public:
   LONG inc_refcnt () {return InterlockedIncrement (&_refcnt);}
   LONG dec_refcnt () {return InterlockedDecrement (&_refcnt);}
@@ -453,6 +456,7 @@ public:
 
   virtual void __reg3 raw_read (void *ptr, size_t& ulen);
   virtual ssize_t __reg3 raw_write (const void *ptr, size_t ulen);
+  ssize_t __reg3 raw_write_pipe_fifo (const void *ptr, size_t ulen);
 
   /* Virtual accessor functions to hide the fact
      that some fd's have two handles. */
@@ -1173,7 +1177,6 @@ class fhandler_pipe: public fhandler_base
 private:
   HANDLE read_mtx;
   pid_t popen_pid;
-  size_t max_atomic_write;
   void set_pipe_non_blocking (bool nonblocking);
 public:
   fhandler_pipe ();
@@ -1342,7 +1345,6 @@ class fhandler_fifo: public fhandler_base
   int nhandlers;                       /* Number of elements in the array. */
   af_unix_spinlock_t _fifo_client_lock;
   bool reader, writer, duplexer;
-  size_t max_atomic_write;
   fifo_reader_id_t me;
 
   HANDLE shmem_handle;
diff --git a/winsup/cygwin/fhandler_fifo.cc b/winsup/cygwin/fhandler_fifo.cc
index b55ba95e7..11b06ed08 100644
--- a/winsup/cygwin/fhandler_fifo.cc
+++ b/winsup/cygwin/fhandler_fifo.cc
@@ -108,14 +108,6 @@
 */
 
 
-/* This is only to be used for writers.  When reading,
-STATUS_PIPE_EMPTY simply means there's no data to be read. */
-#define STATUS_PIPE_IS_CLOSED(status)	\
-		({ NTSTATUS _s = (status); \
-		   _s == STATUS_PIPE_CLOSING \
-		   || _s == STATUS_PIPE_BROKEN \
-		   || _s == STATUS_PIPE_EMPTY; })
-
 #define STATUS_PIPE_NO_INSTANCE_AVAILABLE(status)	\
 		({ NTSTATUS _s = (status); \
 		   _s == STATUS_INSTANCE_NOT_AVAILABLE \
@@ -134,7 +126,6 @@ fhandler_fifo::fhandler_fifo ():
   cancel_evt (NULL), thr_sync_evt (NULL), pipe_name_buf (NULL),
   fc_handler (NULL), shandlers (0), nhandlers (0),
   reader (false), writer (false), duplexer (false),
-  max_atomic_write (DEFAULT_PIPEBUFSIZE),
   me (null_fr_id), shmem_handle (NULL), shmem (NULL),
   shared_fc_hdl (NULL), shared_fc_handler (NULL)
 {
@@ -1141,94 +1132,7 @@ fhandler_fifo::wait (HANDLE h)
 ssize_t __reg3
 fhandler_fifo::raw_write (const void *ptr, size_t len)
 {
-  ssize_t ret = -1;
-  size_t nbytes = 0;
-  ULONG chunk;
-  NTSTATUS status = STATUS_SUCCESS;
-  IO_STATUS_BLOCK io;
-  HANDLE evt = NULL;
-
-  if (!len)
-    return 0;
-
-  if (len <= max_atomic_write)
-    chunk = len;
-  else if (is_nonblocking ())
-    chunk = len = max_atomic_write;
-  else
-    chunk = max_atomic_write;
-
-  /* Create a wait event if the FIFO is in blocking mode. */
-  if (!is_nonblocking () && !(evt = CreateEvent (NULL, false, false, NULL)))
-    {
-      __seterrno ();
-      return -1;
-    }
-
-  /* Write in chunks, accumulating a total.  If there's an error, just
-     return the accumulated total unless the first write fails, in
-     which case return -1. */
-  while (nbytes < len)
-    {
-      ULONG_PTR nbytes_now = 0;
-      size_t left = len - nbytes;
-      ULONG len1;
-      DWORD waitret = WAIT_OBJECT_0;
-
-      if (left > chunk)
-	len1 = chunk;
-      else
-	len1 = (ULONG) left;
-      nbytes_now = 0;
-      status = NtWriteFile (get_handle (), evt, NULL, NULL, &io,
-			    (PVOID) ptr, len1, NULL, NULL);
-      if (evt && status == STATUS_PENDING)
-	{
-	  waitret = cygwait (evt);
-	  if (waitret == WAIT_OBJECT_0)
-	    status = io.Status;
-	}
-      if (waitret == WAIT_CANCELED)
-	status = STATUS_THREAD_CANCELED;
-      else if (waitret == WAIT_SIGNALED)
-	status = STATUS_THREAD_SIGNALED;
-      else if (isclosed ())  /* A signal handler might have closed the fd. */
-	{
-	  if (waitret == WAIT_OBJECT_0)
-	    set_errno (EBADF);
-	  else
-	    __seterrno ();
-	}
-      else if (NT_SUCCESS (status))
-	{
-	  nbytes_now = io.Information;
-	  /* NtWriteFile returns success with # of bytes written == 0
-	     if writing on a non-blocking pipe fails because the pipe
-	     buffer doesn't have sufficient space. */
-	  if (nbytes_now == 0)
-	    set_errno (EAGAIN);
-	  ptr = ((char *) ptr) + chunk;
-	  nbytes += nbytes_now;
-	}
-      else if (STATUS_PIPE_IS_CLOSED (status))
-	{
-	  set_errno (EPIPE);
-	  raise (SIGPIPE);
-	}
-      else
-	__seterrno_from_nt_status (status);
-      if (nbytes_now == 0)
-	len = 0;		/* Terminate loop. */
-      if (nbytes > 0)
-	ret = nbytes;
-    }
-  if (evt)
-    NtClose (evt);
-  if (status == STATUS_THREAD_SIGNALED && ret < 0)
-    set_errno (EINTR);
-  else if (status == STATUS_THREAD_CANCELED)
-    pthread::static_cancel_self ();
-  return ret;
+  return fhandler_base::raw_write_pipe_fifo (ptr, len);
 }
 
 /* Called from raw_read and select.cc:peek_fifo. */
diff --git a/winsup/cygwin/fhandler_pipe.cc b/winsup/cygwin/fhandler_pipe.cc
index 81b1aed5e..88c98d41b 100644
--- a/winsup/cygwin/fhandler_pipe.cc
+++ b/winsup/cygwin/fhandler_pipe.cc
@@ -20,18 +20,9 @@ details. */
 #include "pinfo.h"
 #include "shared_info.h"
 
-/* This is only to be used for writing.  When reading,
-STATUS_PIPE_EMPTY simply means there's no data to be read. */
-#define STATUS_PIPE_IS_CLOSED(status)	\
-		({ NTSTATUS _s = (status); \
-		   _s == STATUS_PIPE_CLOSING \
-		   || _s == STATUS_PIPE_BROKEN \
-		   || _s == STATUS_PIPE_EMPTY; })
-
 fhandler_pipe::fhandler_pipe ()
   : fhandler_base (), popen_pid (0)
 {
-  max_atomic_write = DEFAULT_PIPEBUFSIZE;
   need_fork_fixup (true);
 }
 
@@ -342,117 +333,7 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
 ssize_t __reg3
 fhandler_pipe::raw_write (const void *ptr, size_t len)
 {
-  size_t nbytes = 0;
-  ULONG chunk;
-  NTSTATUS status = STATUS_SUCCESS;
-  IO_STATUS_BLOCK io;
-  HANDLE evt = NULL;
-
-  if (!len)
-    return 0;
-
-  if (len <= max_atomic_write)
-    chunk = len;
-  else if (is_nonblocking ())
-    chunk = len = max_atomic_write;
-  else
-    chunk = max_atomic_write;
-
-  /* Create a wait event if the pipe is in blocking mode. */
-  if (!is_nonblocking () && !(evt = CreateEvent (NULL, false, false, NULL)))
-    {
-      __seterrno ();
-      return -1;
-    }
-
-  /* Write in chunks, accumulating a total.  If there's an error, just
-     return the accumulated total unless the first write fails, in
-     which case return -1. */
-  while (nbytes < len)
-    {
-      ULONG_PTR nbytes_now = 0;
-      size_t left = len - nbytes;
-      ULONG len1;
-      DWORD waitret = WAIT_OBJECT_0;
-
-      if (left > chunk)
-	len1 = chunk;
-      else
-	len1 = (ULONG) left;
-      /* NtWriteFile returns success with # of bytes written == 0 if writing
-         on a non-blocking pipe fails because the pipe buffer doesn't have
-	 sufficient space.
-
-	 POSIX requires
-	 - A write request for {PIPE_BUF} or fewer bytes shall have the
-	   following effect: if there is sufficient space available in the
-	   pipe, write() shall transfer all the data and return the number
-	   of bytes requested. Otherwise, write() shall transfer no data and
-	   return -1 with errno set to [EAGAIN].
-
-	 - A write request for more than {PIPE_BUF} bytes shall cause one
-	   of the following:
-
-	  - When at least one byte can be written, transfer what it can and
-	    return the number of bytes written. When all data previously
-	    written to the pipe is read, it shall transfer at least {PIPE_BUF}
-	    bytes.
-
-	  - When no data can be written, transfer no data, and return -1 with
-	    errno set to [EAGAIN]. */
-      while (len1 > 0)
-	{
-	  status = NtWriteFile (get_handle (), evt, NULL, NULL, &io,
-				(PVOID) ptr, len1, NULL, NULL);
-	  if (evt || !NT_SUCCESS (status) || io.Information > 0
-	      || len <= PIPE_BUF)
-	    break;
-	  len1 >>= 1;
-	}
-      if (evt && status == STATUS_PENDING)
-	{
-	  waitret = cygwait (evt);
-	  if (waitret == WAIT_OBJECT_0)
-	    status = io.Status;
-	}
-      if (waitret == WAIT_CANCELED)
-	status = STATUS_THREAD_CANCELED;
-      else if (waitret == WAIT_SIGNALED)
-	status = STATUS_THREAD_SIGNALED;
-      else if (isclosed ())  /* A signal handler might have closed the fd. */
-	{
-	  if (waitret == WAIT_OBJECT_0)
-	    set_errno (EBADF);
-	  else
-	    __seterrno ();
-	}
-      else if (NT_SUCCESS (status))
-	{
-	  nbytes_now = io.Information;
-	  ptr = ((char *) ptr) + nbytes_now;
-	  nbytes += nbytes_now;
-	  /* 0 bytes returned?  EAGAIN.  See above. */
-	  if (nbytes == 0)
-	    set_errno (EAGAIN);
-	}
-      else if (STATUS_PIPE_IS_CLOSED (status))
-	{
-	  set_errno (EPIPE);
-	  raise (SIGPIPE);
-	}
-      else
-	__seterrno_from_nt_status (status);
-
-      if (nbytes_now == 0)
-	break;
-    }
-  if (evt)
-    CloseHandle (evt);
-  if (status == STATUS_THREAD_SIGNALED && nbytes == 0)
-    set_errno (EINTR);
-  else if (status == STATUS_THREAD_CANCELED)
-    pthread::static_cancel_self ();
-  return nbytes ?: -1;
+  return fhandler_base::raw_write_pipe_fifo (ptr, len);
 }
 
 void
-- 
2.33.0
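
The retry loop in raw_write_pipe_fifo above (the `len1 >>= 1` part) can
be tried outside of Cygwin with a toy model.  This is only a sketch
under stated assumptions: toy_nb_pipe and write_with_halving are
hypothetical names, the toy pipe accepts a write only if it fits
entirely into the free space (mirroring the comment that NtWriteFile
reports success with 0 bytes written otherwise), and pipe_buf is passed
in rather than taken from PIPE_BUF.

```cpp
#include <cassert>
#include <cstddef>

// Toy non-blocking pipe: a write is all-or-nothing.
struct toy_nb_pipe
{
  std::size_t free_space;

  std::size_t try_write (std::size_t n)
  {
    if (n > free_space)
      return 0;                 // models "success, 0 bytes written"
    free_space -= n;
    return n;
  }
};

// One non-blocking write attempt of up to len bytes.  Requests larger
// than pipe_buf are retried with half the size until something fits.
// Requests of pipe_buf or fewer bytes are never split.
std::size_t
write_with_halving (toy_nb_pipe &p, std::size_t len, std::size_t pipe_buf)
{
  std::size_t written = 0;
  std::size_t len1 = len;
  while (len1 > 0)
    {
      written = p.try_write (len1);
      if (written > 0 || len <= pipe_buf)
        break;
      len1 >>= 1;
    }
  return written;               // 0 corresponds to the EAGAIN case
}
```

For example, with 20000 bytes free and a 65536-byte request, the model
tries 65536, then 32768, then 16384, and the last attempt succeeds.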



* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-03 19:00                                                                                         ` Ken Brown
  2021-09-03 19:53                                                                                           ` Ken Brown
@ 2021-09-03 19:54                                                                                           ` Corinna Vinschen
  2021-09-03 20:05                                                                                             ` Ken Brown
  1 sibling, 1 reply; 250+ messages in thread
From: Corinna Vinschen @ 2021-09-03 19:54 UTC (permalink / raw)
  To: cygwin-developers

On Sep  3 15:00, Ken Brown wrote:
> On 9/3/2021 5:12 AM, Corinna Vinschen wrote:
> > I pushed my stuff to the topic/pipe branch split into hopefully useful
> > chunks.  Kick me if anything is wrong or not working.
> 
> Some of the bugs you fixed in the pipe code exist in the fifo code also.  I
> started going through them and fixing them, but then I realized that
> fhandler_pipe::raw_write and fhandler_fifo::raw_write are identical.  For
> ease of maintenance, I'm thinking we should have a single function, say
> fhandler_base::raw_write_pipe or fhandler_base::raw_write_pipe_fifo, which
> is called by both of them.
> 
> WDYT?

Returning to fhandler_base_overlapped?  :)

No, seriously, the cleanest way is probably to implement a parent class
for fhandler_pipe and fhandler_fifo, derived from fhandler_base, and
just collect the common code there.  For a start it could simply provide
raw_write and nothing else.  fhandler_pipe_fifo, perhaps?
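
Roughly, the shape I have in mind is sketched below.  This is not the
real class layout: the *_sketch classes are hypothetical stand-ins for
fhandler_base, the proposed fhandler_pipe_fifo, fhandler_pipe and
fhandler_fifo, the 65536 default is an assumed value, and raw_write is
reduced to the non-blocking length cap so the example stays small.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Assumed default for illustration only.
constexpr std::size_t toy_pipe_buf_size = 65536;

class base_sketch
{
public:
  virtual ~base_sketch () = default;
  // Generic write: pretends everything was written.
  virtual long raw_write (const void *, std::size_t len)
  { return static_cast<long> (len); }
};

// Common parent collecting the code shared by pipes and fifos.
class pipe_fifo_sketch : public base_sketch
{
protected:
  std::size_t max_atomic_write = toy_pipe_buf_size;
  bool nonblocking = true;

public:
  long raw_write (const void *ptr, std::size_t len) override
  {
    (void) ptr;
    // Shared logic lives here once: a non-blocking write is capped at
    // max_atomic_write.
    if (nonblocking)
      len = std::min (len, max_atomic_write);
    return static_cast<long> (len);
  }
};

// Both derived classes inherit raw_write without repeating it.
class pipe_sketch : public pipe_fifo_sketch {};
class fifo_sketch : public pipe_fifo_sketch {};
```

The point of the intermediate class is that max_atomic_write and the
write logic stay out of fhandler_base, which is shared by handlers that
have nothing to do with pipes.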


Corinna


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-03 19:54                                                                                           ` Corinna Vinschen
@ 2021-09-03 20:05                                                                                             ` Ken Brown
  0 siblings, 0 replies; 250+ messages in thread
From: Ken Brown @ 2021-09-03 20:05 UTC (permalink / raw)
  To: cygwin-developers

On 9/3/2021 3:54 PM, Corinna Vinschen wrote:
> On Sep  3 15:00, Ken Brown wrote:
>> On 9/3/2021 5:12 AM, Corinna Vinschen wrote:
>>> I pushed my stuff to the topic/pipe branch split into hopefully useful
>>> chunks.  Kick me if anything is wrong or not working.
>>
>> Some of the bugs you fixed in the pipe code exist in the fifo code also.  I
>> started going through them and fixing them, but then I realized that
>> fhandler_pipe::raw_write and fhandler_fifo::raw_write are identical.  For
>> ease of maintenance, I'm thinking we should have a single function, say
>> fhandler_base::raw_write_pipe or fhandler_base::raw_write_pipe_fifo, which
>> is called by both of them.
>>
>> WDYT?
> 
> Returning to fhandler_base_overlapped?  :)

Touché.

> No, seriously, the cleanest way is probably to implement a parent class
> for fhandler_pipe and fhandler_fifo, derived from fhandler_base, and
> just collect the common code there.  For a start it could simply provide
> raw_write and nothing else.  fhandler_pipe_fifo, perhaps?

Sounds good.  Thanks.  Since the new class has only raw_write for now, I think I 
could just define that function in fhandler_pipe.cc rather than a new file 
fhandler_pipe_fifo.cc.  But I could also do the latter if you prefer.

Ken


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-03 15:37                                                                                               ` Corinna Vinschen
@ 2021-09-04 12:02                                                                                                 ` Takashi Yano
  2021-09-04 12:37                                                                                                   ` Takashi Yano
  0 siblings, 1 reply; 250+ messages in thread
From: Takashi Yano @ 2021-09-04 12:02 UTC (permalink / raw)
  To: cygwin-developers

Hi Corinna, Ken,

On Fri, 3 Sep 2021 09:27:37 -0400
Ken Brown wrote:
> On 9/3/2021 8:22 AM, Takashi Yano wrote:
> > POSIX says:
> >      The value returned may be less than nbyte if the number of bytes left
> >      in the file is less than nbyte, if the read() request was interrupted
> >      by a signal, or if the file is a pipe or FIFO or special file and has
> >                                                                        ~~~
> >      fewer than nbyte bytes immediately available for reading.
> >      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > 
> > https://pubs.opengroup.org/onlinepubs/009604599/functions/read.html
> > 
> > If it is turned over, read() should read all data immediately available,
> > I think.
> 
> I understand the reasoning now, but I think your patch isn't quite right.  As it 
> stands, if the call to NtQueryInformationFile fails but total_length != 0, 
> you're trying to read again without knowing that there's data in the pipe.
> 
> Also, I think you need the following:
> 
> diff --git a/winsup/cygwin/fhandler_pipe.cc b/winsup/cygwin/fhandler_pipe.cc
> index ef7823ae5..46bb96961 100644
> --- a/winsup/cygwin/fhandler_pipe.cc
> +++ b/winsup/cygwin/fhandler_pipe.cc
> @@ -348,8 +348,13 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
>       CloseHandle (evt);
>     if (status == STATUS_THREAD_SIGNALED)
>       {
> -      set_errno (EINTR);
> -      len = (size_t) -1;
> +      if (total_len == 0)
> +       {
> +         set_errno (EINTR);
> +         len = (size_t) -1;
> +       }
> +      else
> +       len = total_len;
>       }
>     else if (status == STATUS_THREAD_CANCELED)
>       pthread::static_cancel_self ();

Thanks for your advice. I fixed the issue and attached a new patch.

On Fri, 3 Sep 2021 17:37:13 +0200
Corinna Vinschen wrote:
> Hmm, I see the point, but we might have another problem with that.
> 
> We can't keep the mutex while waiting on the pending read, and there
> could be more than one pending read running at a time.  If so,
> chances are extremely high that the data written to the buffer gets
> split like this:
> 
>    reader 1                            reader 2
> 
>    calls read(65536)                   calls read(65536)
> 
>    calls NtReadFile(16384 bytes)
>                                        calls NtReadFile(16384 bytes)
> 
> writer writes 65536 bytes
> 
>    wakes up and gets 16384 bytes
>                                        wakes up and gets 16384 bytes
>    gets the mutex, calls
>    NtReadFile(32768) which 
>    returns immediately with
>    32768 bytes added to the
>    caller's buffer.
> 
> so the buffer returned to reader 1 is 49152 bytes, with 16384 bytes
> missing in the middle of it, *without* the reader knowing about that
> fact.  If reader 1 gets the first 16384 bytes, the 16384 bytes have
> been read in a single call, at least, so the byte order is not
> unknowingly broken on the application level.
> 
> Does that make sense?

Why can't we keep the mutex while waiting on the pending read?
If we can keep the mutex, the issue mentioned above does not
happen, right?

What about the attached patch? It holds the mutex for the whole
read(), and I have not seen any problems so far.
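
To make the difference concrete, here is a small single-threaded model of
Corinna's schedule. It is only a sketch under stated assumptions: FakePipe,
take, simulate and contiguous are hypothetical names, the "pipe" is a queue
of byte positions, and the two readers are replayed in a fixed order rather
than run as real threads.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical stand-in for a pipe: a queue of byte positions 0, 1, 2, ...
using FakePipe = std::deque<unsigned>;

// One simulated NtReadFile call: move up to n entries from pipe to out.
void take (FakePipe &pipe, std::vector<unsigned> &out, std::size_t n)
{
  n = std::min (n, pipe.size ());
  auto end = pipe.begin () + static_cast<std::ptrdiff_t> (n);
  out.insert (out.end (), pipe.begin (), end);
  pipe.erase (pipe.begin (), end);
}

struct Result { std::vector<unsigned> reader1, reader2; };

// Replay the schedule from the quoted diagram.  Without the mutex held,
// reader 2 gets a chunk between reader 1's two chunks.  With the mutex
// held, reader 2 only runs after reader 1 has finished.
Result simulate (bool hold_mutex)
{
  FakePipe pipe;
  for (unsigned i = 0; i < 65536; i++)   // the writer writes 65536 bytes
    pipe.push_back (i);

  Result res;
  take (pipe, res.reader1, 16384);
  if (!hold_mutex)
    take (pipe, res.reader2, 16384);
  take (pipe, res.reader1, 32768);
  if (hold_mutex)
    take (pipe, res.reader2, 16384);
  return res;
}

// True if every entry follows its predecessor without a gap.
bool contiguous (const std::vector<unsigned> &v)
{
  for (std::size_t i = 1; i < v.size (); i++)
    if (v[i] != v[i - 1] + 1)
      return false;
  return true;
}
```

In this model simulate (false) gives reader 1 49152 entries with a gap after
the first 16384, while simulate (true) gives it 49152 entries in order.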


-- 
Takashi Yano <takashi.yano@nifty.ne.jp>

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-04 12:02                                                                                                 ` Takashi Yano
@ 2021-09-04 12:37                                                                                                   ` Takashi Yano
  2021-09-04 14:04                                                                                                     ` Ken Brown
  0 siblings, 1 reply; 250+ messages in thread
From: Takashi Yano @ 2021-09-04 12:37 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 3518 bytes --]

On Sat, 4 Sep 2021 21:02:58 +0900
Takashi Yano wrote:
> Hi Corinna, Ken,
> 
> On Fri, 3 Sep 2021 09:27:37 -0400
> Ken Brown wrote:
> > On 9/3/2021 8:22 AM, Takashi Yano wrote:
> > > POSIX says:
> > >      The value returned may be less than nbyte if the number of bytes left
> > >      in the file is less than nbyte, if the read() request was interrupted
> > >      by a signal, or if the file is a pipe or FIFO or special file and has
> > >                                                                        ~~~
> > >      fewer than nbyte bytes immediately available for reading.
> > >      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > > 
> > > https://pubs.opengroup.org/onlinepubs/009604599/functions/read.html
> > > 
> > > If it is turned over, read() should read all data immediately available,
> > > I think.
> > 
> > I understand the reasoning now, but I think your patch isn't quite right.  As it 
> > > stands, if the call to NtQueryInformationFile fails but total_len != 0, 
> > you're trying to read again without knowing that there's data in the pipe.
> > 
> > Also, I think you need the following:
> > 
> > diff --git a/winsup/cygwin/fhandler_pipe.cc b/winsup/cygwin/fhandler_pipe.cc
> > index ef7823ae5..46bb96961 100644
> > --- a/winsup/cygwin/fhandler_pipe.cc
> > +++ b/winsup/cygwin/fhandler_pipe.cc
> > @@ -348,8 +348,13 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
> >       CloseHandle (evt);
> >     if (status == STATUS_THREAD_SIGNALED)
> >       {
> > -      set_errno (EINTR);
> > -      len = (size_t) -1;
> > +      if (total_len == 0)
> > +       {
> > +         set_errno (EINTR);
> > +         len = (size_t) -1;
> > +       }
> > +      else
> > +       len = total_len;
> >       }
> >     else if (status == STATUS_THREAD_CANCELED)
> >       pthread::static_cancel_self ();
> 
> Thanks for your advice. I fixed the issue and attached new patch.
> 
> On Fri, 3 Sep 2021 17:37:13 +0200
> Corinna Vinschen wrote:
> > Hmm, I see the point, but we might have another problem with that.
> > 
> > We can't keep the mutex while waiting on the pending read, and there
> > could be more than one pending read running at the time.  If so,
> > chances are extremely high that the data written to the buffer gets
> > split like this:
> > 
> >    reader 1		               reader 2
> > 
> >    calls read(65536)                   calls read(65536)
> > 
> >    calls NtReadFile(16384 bytes)
> >                                        calls NtReadFile(16384 bytes)
> > 
> > writer writes 65536 bytes
> > 
> >    wakes up and gets 16384 bytes
> >                                        wakes up and gets 16384 bytes
> >    gets the mutex, calls
> >    NtReadFile(32768) which 
> >    returns immediately with
> >    32768 bytes added to the
> >    caller's buffer.
> > 
> > so the buffer returned to reader 1 is 49152 bytes, with 16384 bytes
> > missing in the middle of it, *without* the reader knowing about that
> > fact.  If reader 1 gets the first 16384 bytes, the 16384 bytes have
> > been read in a single call, at least, so the byte order is not
> > unknowingly broken on the application level.
> > 
> > Does that make sense?
> 
> Why can't we keep the mutex while waiting on the pending read?
> If we can keep the mutex, the issue above mentioned does not
> happen, right?
> 
> What about the patch attached? This keeps the mutex while read()
> but I do not see any defects so far.

Sorry, I forgot to attach the patch.

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>

[-- Attachment #2: pipe-corinna4-fix.patch --]
[-- Type: application/octet-stream, Size: 3540 bytes --]

diff --git a/winsup/cygwin/fhandler_pipe.cc b/winsup/cygwin/fhandler_pipe.cc
index 85730d039..57354712d 100644
--- a/winsup/cygwin/fhandler_pipe.cc
+++ b/winsup/cygwin/fhandler_pipe.cc
@@ -222,6 +222,7 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
   DWORD waitret = WAIT_OBJECT_0;
   bool keep_looping = false;
   size_t orig_len = len;
+  size_t total_len = 0;
 
   if (!len)
     return;
@@ -234,9 +235,17 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
       return;
     }
 
+  DWORD timeout = is_nonblocking () ? 0 : INFINITE;
+  if (WAIT_TIMEOUT == WaitForSingleObject (read_mtx, timeout))
+    {
+      set_errno (EAGAIN);
+      len = (size_t) -1;
+      return;
+    }
   do
     {
-      len = orig_len;
+      char *ptr1 = (char *) ptr + total_len;
+      len = orig_len - total_len;
       keep_looping = false;
       if (evt)
 	ResetEvent (evt);
@@ -246,7 +255,6 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
 	  ULONG reader_count;
 	  ULONG max_len = 64;
 
-	  WaitForSingleObject (read_mtx, INFINITE);
 
 	  /* Make sure never to request more bytes than half the pipe
 	     buffer size.  Every pending read lowers WriteQuotaAvailable
@@ -260,17 +268,25 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
 					   FilePipeLocalInformation);
 	  if (NT_SUCCESS (status) && fpli.ReadDataAvailable == 0)
 	    {
+	      if (total_len != 0)
+		{
+		  len = total_len;
+		  break;
+		}
 	      reader_count = get_obj_handle_count (get_handle ());
 	      if (reader_count < 10)
 		max_len = fpli.InboundQuota / (2 * reader_count);
 	      if (len > max_len)
 		len = max_len;
 	    }
+	  if (!NT_SUCCESS (status) && total_len != 0)
+	    {
+	      len = total_len;
+	      break;
+	    }
 	}
-      status = NtReadFile (get_handle (), evt, NULL, NULL, &io, ptr,
+      status = NtReadFile (get_handle (), evt, NULL, NULL, &io, ptr1,
 			   len, NULL, NULL);
-      if (!is_nonblocking ())
-	ReleaseMutex (read_mtx);
       if (evt && status == STATUS_PENDING)
 	{
 	  waitret = cygwait (evt);
@@ -291,9 +307,11 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
 	}
       else if (NT_SUCCESS (status))
 	{
-	  len = io.Information;
-	  if (len == 0)
+	  total_len += io.Information;
+	  if (total_len < orig_len)
 	    keep_looping = true;
+	  else
+	    len = total_len;
 	}
       else
 	{
@@ -303,17 +321,24 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
 	    case STATUS_END_OF_FILE:
 	    case STATUS_PIPE_BROKEN:
 	      /* This is really EOF.  */
-	      len = 0;
+	      len = total_len;
 	      break;
 	    case STATUS_MORE_ENTRIES:
 	    case STATUS_BUFFER_OVERFLOW:
 	      /* `io.Information' is supposedly valid.  */
-	      len = io.Information;
-	      if (len == 0)
+	      total_len += io.Information;
+	      if (total_len < orig_len)
 		keep_looping = true;
+	      else
+		len = total_len;
 	      break;
 	    case STATUS_PIPE_LISTENING:
 	    case STATUS_PIPE_EMPTY:
+	      if (total_len != 0)
+		{
+		  len = total_len;
+		  break;
+		}
 	      if (is_nonblocking ())
 		{
 		  set_errno (EAGAIN);
@@ -328,12 +353,18 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
 	    }
 	}
     } while (keep_looping);
+  ReleaseMutex (read_mtx);
   if (evt)
     CloseHandle (evt);
   if (status == STATUS_THREAD_SIGNALED)
     {
-      set_errno (EINTR);
-      len = (size_t) -1;
+      if (total_len == 0)
+	{
+	  set_errno (EINTR);
+	  len = (size_t) -1;
+	}
+      else
+	len = total_len;
     }
   else if (status == STATUS_THREAD_CANCELED)
     pthread::static_cancel_self ();

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-04 12:37                                                                                                   ` Takashi Yano
@ 2021-09-04 14:04                                                                                                     ` Ken Brown
  2021-09-04 23:15                                                                                                       ` Takashi Yano
  0 siblings, 1 reply; 250+ messages in thread
From: Ken Brown @ 2021-09-04 14:04 UTC (permalink / raw)
  To: cygwin-developers

On 9/4/2021 8:37 AM, Takashi Yano wrote:
> On Sat, 4 Sep 2021 21:02:58 +0900
> Takashi Yano wrote:
>> Hi Corinna, Ken,
>>
>> On Fri, 3 Sep 2021 09:27:37 -0400
>> Ken Brown wrote:
>>> On 9/3/2021 8:22 AM, Takashi Yano wrote:
>>>> POSIX says:
>>>>       The value returned may be less than nbyte if the number of bytes left
>>>>       in the file is less than nbyte, if the read() request was interrupted
>>>>       by a signal, or if the file is a pipe or FIFO or special file and has
>>>>                                                                         ~~~
>>>>       fewer than nbyte bytes immediately available for reading.
>>>>       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>>
>>>> https://pubs.opengroup.org/onlinepubs/009604599/functions/read.html
>>>>
>>>> If it is turned over, read() should read all data immediately available,
>>>> I think.
>>>
>>> I understand the reasoning now, but I think your patch isn't quite right.  As it
>>> stands, if the call to NtQueryInformationFile fails but total_len != 0,
>>> you're trying to read again without knowing that there's data in the pipe.
>>>
>>> Also, I think you need the following:
>>>
>>> diff --git a/winsup/cygwin/fhandler_pipe.cc b/winsup/cygwin/fhandler_pipe.cc
>>> index ef7823ae5..46bb96961 100644
>>> --- a/winsup/cygwin/fhandler_pipe.cc
>>> +++ b/winsup/cygwin/fhandler_pipe.cc
>>> @@ -348,8 +348,13 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
>>>        CloseHandle (evt);
>>>      if (status == STATUS_THREAD_SIGNALED)
>>>        {
>>> -      set_errno (EINTR);
>>> -      len = (size_t) -1;
>>> +      if (total_len == 0)
>>> +       {
>>> +         set_errno (EINTR);
>>> +         len = (size_t) -1;
>>> +       }
>>> +      else
>>> +       len = total_len;
>>>        }
>>>      else if (status == STATUS_THREAD_CANCELED)
>>>        pthread::static_cancel_self ();
>>
>> Thanks for your advice. I fixed the issue and attached new patch.
>>
>> On Fri, 3 Sep 2021 17:37:13 +0200
>> Corinna Vinschen wrote:
>>> Hmm, I see the point, but we might have another problem with that.
>>>
>>> We can't keep the mutex while waiting on the pending read, and there
>>> could be more than one pending read running at the time.  If so,
>>> chances are extremely high that the data written to the buffer gets
>>> split like this:
>>>
>>>     reader 1		               reader 2
>>>
>>>     calls read(65536)                   calls read(65536)
>>>
>>>     calls NtReadFile(16384 bytes)
>>>                                         calls NtReadFile(16384 bytes)
>>>
>>> writer writes 65536 bytes
>>>
>>>     wakes up and gets 16384 bytes
>>>                                         wakes up and gets 16384 bytes
>>>     gets the mutex, calls
>>>     NtReadFile(32768) which
>>>     returns immediately with
>>>     32768 bytes added to the
>>>     caller's buffer.
>>>
>>> so the buffer returned to reader 1 is 49152 bytes, with 16384 bytes
>>> missing in the middle of it, *without* the reader knowing about that
>>> fact.  If reader 1 gets the first 16384 bytes, the 16384 bytes have
>>> been read in a single call, at least, so the byte order is not
>>> unknowingly broken on the application level.
>>>
>>> Does that make sense?
>>
>> Why can't we keep the mutex while waiting on the pending read?
>> If we can keep the mutex, the issue above mentioned does not
>> happen, right?
>>
>> What about the patch attached? This keeps the mutex while read()
>> but I do not see any defects so far.

LGTM.

If Corinna agrees, I have a couple of suggestions.

1. With this patch, we can no longer have more than one pending ReadFile.  So 
there's no longer a need to count read handles, and the problem with select is 
completely fixed as long as the number of bytes requested is less than the pipe 
buffer size.

2. raw_read is now reading in chunks, like raw_write.  For readability of the 
code, I think it would be better to make the two functions as similar as 
possible.  For example, you could replace the do/while loop by a 
while(total_len<orig_len) loop.  And you could even use similar names for the 
variables, e.g., nbytes instead of total_len, or vice versa.
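
A rough sketch of the loop shape meant in point 2, written against a fake
data source rather than the real pipe handle. FakePipe, read_some and
read_in_chunks are hypothetical names used only for illustration;
read_some stands in for a single NtReadFile call and returns 0 when
nothing is available.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for the pipe: hands out the bytes of `data` in order.
struct FakePipe
{
  std::string data;
  std::size_t pos = 0;

  // Models one low-level read: copies at most max bytes, returns the count.
  std::size_t read_some (char *dst, std::size_t max)
  {
    std::size_t n = std::min (max, data.size () - pos);
    std::memcpy (dst, data.data () + pos, n);
    pos += n;
    return n;
  }
};

// Read up to len bytes in pieces of at most chunk bytes, stopping early
// once a piece comes back empty.  Returns the total number of bytes read.
std::size_t read_in_chunks (FakePipe &pipe, char *ptr, std::size_t len,
                            std::size_t chunk)
{
  std::size_t nbytes = 0;
  while (nbytes < len)
    {
      std::size_t left = len - nbytes;
      std::size_t len1 = std::min (left, chunk);
      std::size_t nbytes_now = pipe.read_some (ptr, len1);
      if (nbytes_now == 0)
        break;                  // nothing more available right now
      ptr += nbytes_now;        // advance the destination pointer
      nbytes += nbytes_now;     // accumulate the running total
    }
  return nbytes;
}
```

The accumulate-and-advance pattern is the same one raw_write would use for
its own chunks, which is the point of keeping the names aligned.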

Ken

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-04 14:04                                                                                                     ` Ken Brown
@ 2021-09-04 23:15                                                                                                       ` Takashi Yano
  2021-09-05 13:40                                                                                                         ` Takashi Yano
  0 siblings, 1 reply; 250+ messages in thread
From: Takashi Yano @ 2021-09-04 23:15 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 4559 bytes --]

Hi Ken,

On Sat, 4 Sep 2021 10:04:12 -0400
Ken Brown wrote:
> On 9/4/2021 8:37 AM, Takashi Yano wrote:
> > On Sat, 4 Sep 2021 21:02:58 +0900
> > Takashi Yano wrote:
> >> Hi Corinna, Ken,
> >>
> >> On Fri, 3 Sep 2021 09:27:37 -0400
> >> Ken Brown wrote:
> >>> On 9/3/2021 8:22 AM, Takashi Yano wrote:
> >>>> POSIX says:
> >>>>       The value returned may be less than nbyte if the number of bytes left
> >>>>       in the file is less than nbyte, if the read() request was interrupted
> >>>>       by a signal, or if the file is a pipe or FIFO or special file and has
> >>>>                                                                         ~~~
> >>>>       fewer than nbyte bytes immediately available for reading.
> >>>>       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >>>>
> >>>> https://pubs.opengroup.org/onlinepubs/009604599/functions/read.html
> >>>>
> >>>> If it is turned over, read() should read all data immediately available,
> >>>> I think.
> >>>
> >>> I understand the reasoning now, but I think your patch isn't quite right.  As it
> >>> stands, if the call to NtQueryInformationFile fails but total_len != 0,
> >>> you're trying to read again without knowing that there's data in the pipe.
> >>>
> >>> Also, I think you need the following:
> >>>
> >>> diff --git a/winsup/cygwin/fhandler_pipe.cc b/winsup/cygwin/fhandler_pipe.cc
> >>> index ef7823ae5..46bb96961 100644
> >>> --- a/winsup/cygwin/fhandler_pipe.cc
> >>> +++ b/winsup/cygwin/fhandler_pipe.cc
> >>> @@ -348,8 +348,13 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
> >>>        CloseHandle (evt);
> >>>      if (status == STATUS_THREAD_SIGNALED)
> >>>        {
> >>> -      set_errno (EINTR);
> >>> -      len = (size_t) -1;
> >>> +      if (total_len == 0)
> >>> +       {
> >>> +         set_errno (EINTR);
> >>> +         len = (size_t) -1;
> >>> +       }
> >>> +      else
> >>> +       len = total_len;
> >>>        }
> >>>      else if (status == STATUS_THREAD_CANCELED)
> >>>        pthread::static_cancel_self ();
> >>
> >> Thanks for your advice. I fixed the issue and attached new patch.
> >>
> >> On Fri, 3 Sep 2021 17:37:13 +0200
> >> Corinna Vinschen wrote:
> >>> Hmm, I see the point, but we might have another problem with that.
> >>>
> >>> We can't keep the mutex while waiting on the pending read, and there
> >>> could be more than one pending read running at the time.  If so,
> >>> chances are extremely high that the data written to the buffer gets
> >>> split like this:
> >>>
> >>>     reader 1		               reader 2
> >>>
> >>>     calls read(65536)                   calls read(65536)
> >>>
> >>>     calls NtReadFile(16384 bytes)
> >>>                                         calls NtReadFile(16384 bytes)
> >>>
> >>> writer writes 65536 bytes
> >>>
> >>>     wakes up and gets 16384 bytes
> >>>                                         wakes up and gets 16384 bytes
> >>>     gets the mutex, calls
> >>>     NtReadFile(32768) which
> >>>     returns immediately with
> >>>     32768 bytes added to the
> >>>     caller's buffer.
> >>>
> >>> so the buffer returned to reader 1 is 49152 bytes, with 16384 bytes
> >>> missing in the middle of it, *without* the reader knowing about that
> >>> fact.  If reader 1 gets the first 16384 bytes, the 16384 bytes have
> >>> been read in a single call, at least, so the byte order is not
> >>> unknowingly broken on the application level.
> >>>
> >>> Does that make sense?
> >>
> >> Why can't we keep the mutex while waiting on the pending read?
> >> If we can keep the mutex, the issue above mentioned does not
> >> happen, right?
> >>
> >> What about the patch attached? This keeps the mutex while read()
> >> but I do not see any defects so far.
> 
> LGTM.
> 
> If Corinna agrees, I have a couple of suggestions.
> 
> 1. With this patch, we can no longer have more than one pending ReadFile.  So 
> there's no longer a need to count read handles, and the problem with select is 
> completely fixed as long as the number of bytes requested is less than the pipe 
> buffer size.
> 
> 2. raw_read is now reading in chunks, like raw_write.  For readability of the 
> code, I think it would be better to make the two functions as similar as 
> possible.  For example, you could replace the do/while loop by a 
> while(total_len<orig_len) loop.  And you could even use similar names for the 
> variables, e.g., nbytes instead of total_len, or vice versa.

Thanks for the suggestion. I have rebuilt the patch.
Please see the patch attached.
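
As a reading aid, the request-size rule for the blocking case can be
sketched as a free function. This is a simplified illustration, not the
code in the patch: request_size and its parameter names are hypothetical,
and the values that the patch takes from the NtQueryInformationFile result
and from max_atomic_write are passed in as plain arguments here.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper mirroring the blocking-mode size rule:
//   left                 bytes the caller still wants
//   query_ok             whether the pipe information query succeeded
//   read_data_available  bytes currently waiting in the pipe
//   inbound_quota        pipe buffer size reported by the query
//   max_atomic_write     constant used as the default basis for the chunk
unsigned long request_size (std::size_t left, bool query_ok,
                            unsigned long read_data_available,
                            unsigned long inbound_quota,
                            unsigned long max_atomic_write)
{
  // Default: half of max_atomic_write.
  unsigned long chunk = max_atomic_write / 2;
  // Empty pipe: the read will stay pending, so limit it to half the
  // buffer size actually reported for this pipe.
  if (query_ok && read_data_available == 0)
    chunk = inbound_quota / 2;
  // Never ask for more than the caller still wants.
  return left < chunk ? (unsigned long) left : chunk;
}
```

For example, with a 65536-byte buffer and an empty pipe, a 100000-byte
request is cut down to 32768 bytes, while a 100-byte request stays at 100.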

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>

[-- Attachment #2: 0001-Cygwin-pipe-Stop-counting-reader-and-read-all-availa.patch --]
[-- Type: application/octet-stream, Size: 5079 bytes --]

From 338c1ed9e260d5d456354ea5985002d1915168b5 Mon Sep 17 00:00:00 2001
From: Takashi Yano <takashi.yano@nifty.ne.jp>
Date: Sun, 5 Sep 2021 06:10:29 +0900
Subject: [PATCH] Cygwin: pipe: Stop counting reader and read all available
 data.

- By guarding read with read_mtx, no more than one ReadFile can
  be called simultaneously. So counting read handles is no longer
  necessary.
- Make raw_read code as similar as possible to raw_write code.
---
 winsup/cygwin/fhandler_pipe.cc | 84 +++++++++++++++++++---------------
 1 file changed, 46 insertions(+), 38 deletions(-)

diff --git a/winsup/cygwin/fhandler_pipe.cc b/winsup/cygwin/fhandler_pipe.cc
index 85730d039..544e5872d 100644
--- a/winsup/cygwin/fhandler_pipe.cc
+++ b/winsup/cygwin/fhandler_pipe.cc
@@ -216,12 +216,10 @@ fhandler_pipe::get_proc_fd_name (char *buf)
 void __reg3
 fhandler_pipe::raw_read (void *ptr, size_t& len)
 {
-  NTSTATUS status;
+  size_t nbytes = 0;
+  NTSTATUS status = STATUS_SUCCESS;
   IO_STATUS_BLOCK io;
   HANDLE evt = NULL;
-  DWORD waitret = WAIT_OBJECT_0;
-  bool keep_looping = false;
-  size_t orig_len = len;
 
   if (!len)
     return;
@@ -234,43 +232,47 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
       return;
     }
 
-  do
+  DWORD timeout = is_nonblocking () ? 0 : INFINITE;
+  if (WAIT_TIMEOUT == WaitForSingleObject (read_mtx, timeout))
+    {
+      set_errno (EAGAIN);
+      len = (size_t) -1;
+      return;
+    }
+  while (nbytes < len)
     {
-      len = orig_len;
-      keep_looping = false;
+      ULONG_PTR nbytes_now = 0;
+      size_t left = len - nbytes;
+      ULONG len1 = (ULONG) left;
+      DWORD waitret = WAIT_OBJECT_0;
+
       if (evt)
 	ResetEvent (evt);
       if (!is_nonblocking ())
 	{
 	  FILE_PIPE_LOCAL_INFORMATION fpli;
-	  ULONG reader_count;
-	  ULONG max_len = 64;
-
-	  WaitForSingleObject (read_mtx, INFINITE);
 
 	  /* Make sure never to request more bytes than half the pipe
-	     buffer size.  Every pending read lowers WriteQuotaAvailable
-	     on the write side and thus affects select's ability to return
-	     more or less reliable info whether a write succeeds or not.
-
-	     Let the size of the request depend on the number of readers
-	     at the time. */
+	     buffer size. Pending read lowers WriteQuotaAvailable on
+	     the write side and thus affects select's ability to return
+	     more or less reliable info whether a write succeeds or not. */
+	  ULONG chunk = max_atomic_write / 2;
 	  status = NtQueryInformationFile (get_handle (), &io,
 					   &fpli, sizeof (fpli),
 					   FilePipeLocalInformation);
 	  if (NT_SUCCESS (status) && fpli.ReadDataAvailable == 0)
 	    {
-	      reader_count = get_obj_handle_count (get_handle ());
-	      if (reader_count < 10)
-		max_len = fpli.InboundQuota / (2 * reader_count);
-	      if (len > max_len)
-		len = max_len;
+	      if (nbytes != 0)
+		break;
+	      chunk = fpli.InboundQuota / 2;
 	    }
+	  if (!NT_SUCCESS (status) && nbytes != 0)
+	    break;
+
+	  len1 = (left < chunk) ? (ULONG) left : chunk;
 	}
       status = NtReadFile (get_handle (), evt, NULL, NULL, &io, ptr,
-			   len, NULL, NULL);
-      if (!is_nonblocking ())
-	ReleaseMutex (read_mtx);
+			   len1, NULL, NULL);
       if (evt && status == STATUS_PENDING)
 	{
 	  waitret = cygwait (evt);
@@ -287,13 +289,13 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
 	    set_errno (EBADF);
 	  else
 	    __seterrno ();
-	  len = (size_t) -1;
+	  nbytes = (size_t) -1;
 	}
       else if (NT_SUCCESS (status))
 	{
-	  len = io.Information;
-	  if (len == 0)
-	    keep_looping = true;
+	  nbytes_now = io.Information;
+	  ptr = ((char *) ptr) + nbytes_now;
+	  nbytes += nbytes_now;
 	}
       else
 	{
@@ -303,40 +305,46 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
 	    case STATUS_END_OF_FILE:
 	    case STATUS_PIPE_BROKEN:
 	      /* This is really EOF.  */
-	      len = 0;
 	      break;
 	    case STATUS_MORE_ENTRIES:
 	    case STATUS_BUFFER_OVERFLOW:
 	      /* `io.Information' is supposedly valid.  */
-	      len = io.Information;
-	      if (len == 0)
-		keep_looping = true;
+	      nbytes_now = io.Information;
+	      ptr = ((char *) ptr) + nbytes_now;
+	      nbytes += nbytes_now;
 	      break;
 	    case STATUS_PIPE_LISTENING:
 	    case STATUS_PIPE_EMPTY:
+	      if (nbytes != 0)
+		break;
 	      if (is_nonblocking ())
 		{
 		  set_errno (EAGAIN);
-		  len = (size_t) -1;
+		  nbytes = (size_t) -1;
 		  break;
 		}
 	      fallthrough;
 	    default:
 	      __seterrno_from_nt_status (status);
-	      len = (size_t) -1;
+	      nbytes = (size_t) -1;
 	      break;
 	    }
 	}
-    } while (keep_looping);
+
+      if (nbytes_now == 0)
+	break;
+    }
+  ReleaseMutex (read_mtx);
   if (evt)
     CloseHandle (evt);
-  if (status == STATUS_THREAD_SIGNALED)
+  if (status == STATUS_THREAD_SIGNALED && nbytes == 0)
     {
       set_errno (EINTR);
-      len = (size_t) -1;
+      nbytes = (size_t) -1;
     }
   else if (status == STATUS_THREAD_CANCELED)
     pthread::static_cancel_self ();
+  len = nbytes;
 }
 
 ssize_t __reg3
-- 
2.33.0


^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-04 23:15                                                                                                       ` Takashi Yano
@ 2021-09-05 13:40                                                                                                         ` Takashi Yano
  2021-09-05 13:50                                                                                                           ` Takashi Yano
  0 siblings, 1 reply; 250+ messages in thread
From: Takashi Yano @ 2021-09-05 13:40 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 4991 bytes --]

On Sun, 5 Sep 2021 08:15:23 +0900
Takashi Yano wrote:
> Hi Ken,
> 
> On Sat, 4 Sep 2021 10:04:12 -0400
> Ken Brown wrote:
> > On 9/4/2021 8:37 AM, Takashi Yano wrote:
> > > On Sat, 4 Sep 2021 21:02:58 +0900
> > > Takashi Yano wrote:
> > >> Hi Corinna, Ken,
> > >>
> > >> On Fri, 3 Sep 2021 09:27:37 -0400
> > >> Ken Brown wrote:
> > >>> On 9/3/2021 8:22 AM, Takashi Yano wrote:
> > >>>> POSIX says:
> > >>>>       The value returned may be less than nbyte if the number of bytes left
> > >>>>       in the file is less than nbyte, if the read() request was interrupted
> > >>>>       by a signal, or if the file is a pipe or FIFO or special file and has
> > >>>>                                                                         ~~~
> > >>>>       fewer than nbyte bytes immediately available for reading.
> > >>>>       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > >>>>
> > >>>> https://pubs.opengroup.org/onlinepubs/009604599/functions/read.html
> > >>>>
> > >>>> If it is turned over, read() should read all data immediately available,
> > >>>> I think.
> > >>>
> > >>> I understand the reasoning now, but I think your patch isn't quite right.  As it
> > >>> stands, if the call to NtQueryInformationFile fails but total_len != 0,
> > >>> you're trying to read again without knowing that there's data in the pipe.
> > >>>
> > >>> Also, I think you need the following:
> > >>>
> > >>> diff --git a/winsup/cygwin/fhandler_pipe.cc b/winsup/cygwin/fhandler_pipe.cc
> > >>> index ef7823ae5..46bb96961 100644
> > >>> --- a/winsup/cygwin/fhandler_pipe.cc
> > >>> +++ b/winsup/cygwin/fhandler_pipe.cc
> > >>> @@ -348,8 +348,13 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
> > >>>        CloseHandle (evt);
> > >>>      if (status == STATUS_THREAD_SIGNALED)
> > >>>        {
> > >>> -      set_errno (EINTR);
> > >>> -      len = (size_t) -1;
> > >>> +      if (total_len == 0)
> > >>> +       {
> > >>> +         set_errno (EINTR);
> > >>> +         len = (size_t) -1;
> > >>> +       }
> > >>> +      else
> > >>> +       len = total_len;
> > >>>        }
> > >>>      else if (status == STATUS_THREAD_CANCELED)
> > >>>        pthread::static_cancel_self ();
> > >>
> > >> Thanks for your advice. I fixed the issue and attached new patch.
> > >>
> > >> On Fri, 3 Sep 2021 17:37:13 +0200
> > >> Corinna Vinschen wrote:
> > >>> Hmm, I see the point, but we might have another problem with that.
> > >>>
> > >>> We can't keep the mutex while waiting on the pending read, and there
> > >>> could be more than one pending read running at the time.  If so,
> > >>> chances are extremely high that the data written to the buffer gets
> > >>> split like this:
> > >>>
> > >>>     reader 1		               reader 2
> > >>>
> > >>>     calls read(65536)                   calls read(65536)
> > >>>
> > >>>     calls NtReadFile(16384 bytes)
> > >>>                                         calls NtReadFile(16384 bytes)
> > >>>
> > >>> writer writes 65536 bytes
> > >>>
> > >>>     wakes up and gets 16384 bytes
> > >>>                                         wakes up and gets 16384 bytes
> > >>>     gets the mutex, calls
> > >>>     NtReadFile(32768) which
> > >>>     returns immediately with
> > >>>     32768 bytes added to the
> > >>>     caller's buffer.
> > >>>
> > >>> so the buffer returned to reader 1 is 49152 bytes, with 16384 bytes
> > >>> missing in the middle of it, *without* the reader knowing about that
> > >>> fact.  If reader 1 gets the first 16384 bytes, the 16384 bytes have
> > >>> been read in a single call, at least, so the byte order is not
> > >>> unknowingly broken on the application level.
> > >>>
> > >>> Does that make sense?
> > >>
> > >> Why can't we keep the mutex while waiting on the pending read?
> > >> If we can keep the mutex, the issue above mentioned does not
> > >> happen, right?
> > >>
> > >> What about the patch attached? This keeps the mutex while read()
> > >> but I do not see any defects so far.
> > 
> > LGTM.
> > 
> > If Corinna agrees, I have a couple of suggestions.
> > 
> > 1. With this patch, we can no longer have more than one pending ReadFile.  So 
> > there's no longer a need to count read handles, and the problem with select is 
> > completely fixed as long as the number of bytes requested is less than the pipe 
> > buffer size.
> > 
> > 2. raw_read is now reading in chunks, like raw_write.  For readability of the 
> > code, I think it would be better to make the two functions as similar as 
> > possible.  For example, you could replace the do/while loop by a 
> > while(total_len<orig_len) loop.  And you could even use similar names for the 
> > variables, e.g., nbytes instead of total_len, or vice versa.
> 
> Thanks for the suggestion. I have rebuilt the patch.
> Please see the patch attached.

This patch does not seem to apply to the current git head of the
topic/pipe branch. I have rebased it onto the current topic/pipe.

Please see the patch attached.

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>

[-- Attachment #2: 0001-Cygwin-pipe-Stop-counting-reader-and-read-all-availa.patch --]
[-- Type: application/octet-stream, Size: 5024 bytes --]

From 8bd38b3748adb1aa0037fb0683d20d680ce553c4 Mon Sep 17 00:00:00 2001
From: Takashi Yano <takashi.yano@nifty.ne.jp>
Date: Sun, 5 Sep 2021 22:35:28 +0900
Subject: [PATCH] Cygwin: pipe: Stop counting reader and read all available
 data.

- By guarding read with read_mtx, no more than one ReadFile can
  be called simultaneously. So counting read handles is no longer
  necessary.
- Make raw_read code as similar as possible to raw_write code.
---
 winsup/cygwin/fhandler_pipe.cc | 82 +++++++++++++++++++---------------
 1 file changed, 45 insertions(+), 37 deletions(-)

diff --git a/winsup/cygwin/fhandler_pipe.cc b/winsup/cygwin/fhandler_pipe.cc
index 68974eb80..c094515b3 100644
--- a/winsup/cygwin/fhandler_pipe.cc
+++ b/winsup/cygwin/fhandler_pipe.cc
@@ -221,12 +221,10 @@ fhandler_pipe::get_proc_fd_name (char *buf)
 void __reg3
 fhandler_pipe::raw_read (void *ptr, size_t& len)
 {
-  NTSTATUS status;
+  size_t nbytes = 0;
+  NTSTATUS status = STATUS_SUCCESS;
   IO_STATUS_BLOCK io;
   HANDLE evt = NULL;
-  DWORD waitret = WAIT_OBJECT_0;
-  bool keep_looping = false;
-  size_t orig_len = len;
 
   if (!len)
     return;
@@ -239,43 +237,47 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
       return;
     }
 
-  do
+  DWORD timeout = is_nonblocking () ? 0 : INFINITE;
+  if (WAIT_TIMEOUT == WaitForSingleObject (read_mtx, timeout))
     {
-      len = orig_len;
-      keep_looping = false;
+      set_errno (EAGAIN);
+      len = (size_t) -1;
+      return;
+    }
+  while (nbytes < len)
+    {
+      ULONG_PTR nbytes_now = 0;
+      size_t left = len - nbytes;
+      ULONG len1 = (ULONG) left;
+      DWORD waitret = WAIT_OBJECT_0;
+
       if (evt)
 	ResetEvent (evt);
       if (!is_nonblocking ())
 	{
 	  FILE_PIPE_LOCAL_INFORMATION fpli;
-	  ULONG reader_count;
-	  ULONG max_len = 64;
-
-	  WaitForSingleObject (read_mtx, INFINITE);
 
 	  /* If the pipe is empty, don't request more bytes than half the
-	     pipe buffer size.  Every pending read lowers WriteQuotaAvailable
+	     pipe buffer size. pending read lowers WriteQuotaAvailable
 	     on the write side and thus affects select's ability to return
-	     more or less reliable info whether a write succeeds or not.
-
-	     Let the size of the request depend on the number of readers
-	     at the time. */
+	     more or less reliable info whether a write succeeds or not. */
+	  ULONG chunk = max_atomic_write / 2;
 	  status = NtQueryInformationFile (get_handle (), &io,
 					   &fpli, sizeof (fpli),
 					   FilePipeLocalInformation);
 	  if (NT_SUCCESS (status) && fpli.ReadDataAvailable == 0)
 	    {
-	      reader_count = get_obj_handle_count (get_handle ());
-	      if (reader_count < 10)
-		max_len = fpli.InboundQuota / (2 * reader_count);
-	      if (len > max_len)
-		len = max_len;
+	      if (nbytes != 0)
+		break;
+	      chunk = fpli.InboundQuota / 2;
 	    }
+	  if (!NT_SUCCESS (status) && nbytes != 0)
+	    break;
+
+	  len1 = (left < chunk) ? (ULONG) left : chunk;
 	}
       status = NtReadFile (get_handle (), evt, NULL, NULL, &io, ptr,
-			   len, NULL, NULL);
-      if (!is_nonblocking ())
-	ReleaseMutex (read_mtx);
+			   len1, NULL, NULL);
       if (evt && status == STATUS_PENDING)
 	{
 	  waitret = cygwait (evt);
@@ -292,13 +294,13 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
 	    set_errno (EBADF);
 	  else
 	    __seterrno ();
-	  len = (size_t) -1;
+	  nbytes = (size_t) -1;
 	}
       else if (NT_SUCCESS (status))
 	{
-	  len = io.Information;
-	  if (len == 0)
-	    keep_looping = true;
+	  nbytes_now = io.Information;
+	  ptr = ((char *) ptr) + nbytes_now;
+	  nbytes += nbytes_now;
 	}
       else
 	{
@@ -308,40 +310,46 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
 	    case STATUS_END_OF_FILE:
 	    case STATUS_PIPE_BROKEN:
 	      /* This is really EOF.  */
-	      len = 0;
 	      break;
 	    case STATUS_MORE_ENTRIES:
 	    case STATUS_BUFFER_OVERFLOW:
 	      /* `io.Information' is supposedly valid.  */
-	      len = io.Information;
-	      if (len == 0)
-		keep_looping = true;
+	      nbytes_now = io.Information;
+	      ptr = ((char *) ptr) + nbytes_now;
+	      nbytes += nbytes_now;
 	      break;
 	    case STATUS_PIPE_LISTENING:
 	    case STATUS_PIPE_EMPTY:
+	      if (nbytes != 0)
+		break;
 	      if (is_nonblocking ())
 		{
 		  set_errno (EAGAIN);
-		  len = (size_t) -1;
+		  nbytes = (size_t) -1;
 		  break;
 		}
 	      fallthrough;
 	    default:
 	      __seterrno_from_nt_status (status);
-	      len = (size_t) -1;
+	      nbytes = (size_t) -1;
 	      break;
 	    }
 	}
-    } while (keep_looping);
+
+      if (nbytes_now == 0)
+	break;
+    }
+  ReleaseMutex (read_mtx);
   if (evt)
     CloseHandle (evt);
-  if (status == STATUS_THREAD_SIGNALED)
+  if (status == STATUS_THREAD_SIGNALED && nbytes == 0)
     {
       set_errno (EINTR);
-      len = (size_t) -1;
+      nbytes = (size_t) -1;
     }
   else if (status == STATUS_THREAD_CANCELED)
     pthread::static_cancel_self ();
+  len = nbytes;
 }
 
 ssize_t __reg3
-- 
2.33.0
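
The control flow this patch gives raw_read can be modelled outside of
Cygwin. What follows is only a sketch under stated assumptions:
ChunkStatus, ChunkResult, ReadResult, ChunkReader, read_available and
run_script are hypothetical names invented for illustration, and the
low-level read is a callback rather than NtReadFile. It shows data being
accumulated chunk by chunk, with EINTR/EAGAIN reported only when nothing
has been read yet.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstring>
#include <functional>
#include <vector>

// Hypothetical outcome of one low-level read; a stand-in for the NTSTATUS
// cases handled in raw_read, not a Cygwin type.
enum class ChunkStatus { Ok, Empty, Eof, Signaled };

struct ChunkResult
{
  ChunkStatus status;
  std::size_t count;    // bytes delivered by this chunk
};

struct ReadResult
{
  long n;               // bytes read, or -1 on error
  int err;              // errno-style code when n == -1, else 0
};

// Hypothetical low-level reader: stores at most `max' bytes at `dst'.
using ChunkReader = std::function<ChunkResult (char *dst, std::size_t max)>;

ReadResult
read_available (const ChunkReader &read_chunk, char *buf, std::size_t len,
                std::size_t chunk_size)
{
  assert (chunk_size > 0);
  std::size_t nbytes = 0;
  while (nbytes < len)
    {
      std::size_t left = len - nbytes;
      std::size_t want = left < chunk_size ? left : chunk_size;
      ChunkResult r = read_chunk (buf + nbytes, want);

      if (r.status == ChunkStatus::Signaled && nbytes == 0)
        return { -1, EINTR };   // interrupted before any data arrived
      if (r.status == ChunkStatus::Empty && nbytes == 0)
        return { -1, EAGAIN };  // nothing available at all
      if (r.status != ChunkStatus::Ok || r.count == 0)
        break;                  // EOF, or empty/signal after partial data
      nbytes += r.count;
    }
  return { (long) nbytes, 0 };
}

// Helper for trying the sketch: replays a fixed script of chunk outcomes
// and reports "pipe empty" once the script is exhausted.
ReadResult
run_script (std::vector<ChunkResult> script, std::size_t len)
{
  std::size_t pos = 0;
  ChunkReader reader = [&] (char *dst, std::size_t max) -> ChunkResult
    {
      if (pos >= script.size ())
        return { ChunkStatus::Empty, 0 };
      ChunkResult r = script[pos++];
      if (r.count > max)
        r.count = max;
      std::memset (dst, 'x', r.count);
      return r;
    };
  std::vector<char> buf (len);
  return read_available (reader, buf.data (), len, 4096);
}
```

For example, run_script ({{ChunkStatus::Ok, 10}, {ChunkStatus::Signaled, 0}}, 1000)
returns the 10 bytes already read instead of failing with EINTR, which is
the behaviour Ken asked for earlier in the thread.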


^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-05 13:40                                                                                                         ` Takashi Yano
@ 2021-09-05 13:50                                                                                                           ` Takashi Yano
  2021-09-05 18:47                                                                                                             ` Ken Brown
  0 siblings, 1 reply; 250+ messages in thread
From: Takashi Yano @ 2021-09-05 13:50 UTC (permalink / raw)
  To: cygwin-developers

On Sun, 5 Sep 2021 22:40:59 +0900
Takashi Yano wrote:
> On Sun, 5 Sep 2021 08:15:23 +0900
> Takashi Yano wrote:
> > Hi Ken,
> > 
> > On Sat, 4 Sep 2021 10:04:12 -0400
> > Ken Brown wrote:
> > > On 9/4/2021 8:37 AM, Takashi Yano wrote:
> > > > On Sat, 4 Sep 2021 21:02:58 +0900
> > > > Takashi Yano wrote:
> > > >> Hi Corinna, Ken,
> > > >>
> > > >> On Fri, 3 Sep 2021 09:27:37 -0400
> > > >> Ken Brown wrote:
> > > >>> On 9/3/2021 8:22 AM, Takashi Yano wrote:
> > > >>>> POSIX says:
> > > >>>>       The value returned may be less than nbyte if the number of bytes left
> > > >>>>       in the file is less than nbyte, if the read() request was interrupted
> > > >>>>       by a signal, or if the file is a pipe or FIFO or special file and has
> > > >>>>                                                                         ~~~
> > > >>>>       fewer than nbyte bytes immediately available for reading.
> > > >>>>       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > > >>>>
> > > >>>> https://pubs.opengroup.org/onlinepubs/009604599/functions/read.html
> > > >>>>
> > > > >>>> Conversely, read() should read all data immediately available,
> > > > >>>> I think.
> > > >>>
> > > >>> I understand the reasoning now, but I think your patch isn't quite right.  As it
> > > >>> stands, if the call to NtQueryInformationFile fails but total_length != 0,
> > > >>> you're trying to read again without knowing that there's data in the pipe.
> > > >>>
> > > >>> Also, I think you need the following:
> > > >>>
> > > >>> diff --git a/winsup/cygwin/fhandler_pipe.cc b/winsup/cygwin/fhandler_pipe.cc
> > > >>> index ef7823ae5..46bb96961 100644
> > > >>> --- a/winsup/cygwin/fhandler_pipe.cc
> > > >>> +++ b/winsup/cygwin/fhandler_pipe.cc
> > > >>> @@ -348,8 +348,13 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
> > > >>>        CloseHandle (evt);
> > > >>>      if (status == STATUS_THREAD_SIGNALED)
> > > >>>        {
> > > >>> -      set_errno (EINTR);
> > > >>> -      len = (size_t) -1;
> > > >>> +      if (total_len == 0)
> > > >>> +       {
> > > >>> +         set_errno (EINTR);
> > > >>> +         len = (size_t) -1;
> > > >>> +       }
> > > >>> +      else
> > > >>> +       len = total_len;
> > > >>>        }
> > > >>>      else if (status == STATUS_THREAD_CANCELED)
> > > >>>        pthread::static_cancel_self ();
> > > >>
> > > >> Thanks for your advice. I fixed the issue and attached new patch.
> > > >>
> > > >> On Fri, 3 Sep 2021 17:37:13 +0200
> > > >> Corinna Vinschen wrote:
> > > >>> Hmm, I see the point, but we might have another problem with that.
> > > >>>
> > > >>> We can't keep the mutex while waiting on the pending read, and there
> > > > >>> could be more than one pending read running at the time.  If so,
> > > > >>> chances are extremely high that the data written to the buffer gets
> > > >>> split like this:
> > > >>>
> > > >>>     reader 1		               reader 2
> > > >>>
> > > >>>     calls read(65536)                   calls read(65536)
> > > >>>
> > > >>>     calls NtReadFile(16384 bytes)
> > > >>>                                         calls NtReadFile(16384 bytes)
> > > >>>
> > > >>> writer writes 65536 bytes
> > > >>>
> > > >>>     wakes up and gets 16384 bytes
> > > >>>                                         wakes up and gets 16384 bytes
> > > >>>     gets the mutex, calls
> > > >>>     NtReadFile(32768) which
> > > >>>     returns immediately with
> > > >>>     32768 bytes added to the
> > > >>>     caller's buffer.
> > > >>>
> > > >>> so the buffer returned to reader 1 is 49152 bytes, with 16384 bytes
> > > >>> missing in the middle of it, *without* the reader knowing about that
> > > >>> fact.  If reader 1 gets the first 16384 bytes, the 16384 bytes have
> > > >>> been read in a single call, at least, so the byte order is not
> > > >>> unknowingly broken on the application level.
> > > >>>
> > > >>> Does that make sense?
> > > >>
> > > >> Why can't we keep the mutex while waiting on the pending read?
> > > > >> If we can keep the mutex, the issue mentioned above does not
> > > > >> happen, right?
> > > > >>
> > > > >> What about the attached patch? It keeps the mutex held during
> > > > >> read(), and I have not seen any problems so far.
> > > 
> > > LGTM.
> > > 
> > > If Corinna agrees, I have a couple of suggestions.
> > > 
> > > 1. With this patch, we can no longer have more than one pending ReadFile.  So 
> > > there's no longer a need to count read handles, and the problem with select is 
> > > completely fixed as long as the number of bytes requested is less than the pipe 
> > > buffer size.
> > > 
> > > 2. raw_read is now reading in chunks, like raw_write.  For readability of the 
> > > code, I think it would be better to make the two functions as similar as 
> > > possible.  For example, you could replace the do/while loop by a 
> > > while(total_len<orig_len) loop.  And you could even use similar names for the 
> > > variables, e.g., nbytes instead of total_len, or vice versa.
> > 
> > Thanks for the suggestion. I have rebuilt the patch.
> > Please see the patch attached.
> 
> This patch seems to fail to apply to the current git head of the topic/pipe
> branch. I rebuilt the patch to fit the current topic/pipe.
> 
> Please see the patch attached.

Small typo.

-	     pipe buffer size.  Every pending read lowers WriteQuotaAvailable
+	     pipe buffer size. pending read lowers WriteQuotaAvailable

should be:

-	     pipe buffer size.  Every pending read lowers WriteQuotaAvailable
+	     pipe buffer size. Pending read lowers WriteQuotaAvailable

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>
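
The two-reader scenario quoted above, and the effect of holding read_mtx
for the whole read(), can be replayed deterministically. This is a toy
model only: ToyPipe, contiguous, reader1_interleaved and
reader1_serialized are hypothetical names, no threads or real pipes are
involved, and each byte is represented by its stream offset so that a gap
becomes visible.

```cpp
#include <cstddef>
#include <deque>
#include <vector>

// Toy pipe in which every byte is represented by its stream offset, so a
// gap in a reader's buffer is easy to detect.  Purely illustrative.
struct ToyPipe
{
  std::deque<unsigned> data;
  unsigned next = 0;

  void write (std::size_t n)
  {
    for (std::size_t i = 0; i < n; ++i)
      data.push_back (next++);
  }

  // Move up to n offsets from the pipe to the end of `out'.
  void read (std::vector<unsigned> &out, std::size_t n)
  {
    while (n-- > 0 && !data.empty ())
      {
        out.push_back (data.front ());
        data.pop_front ();
      }
  }
};

// True if the buffer holds one unbroken run of stream offsets.
bool contiguous (const std::vector<unsigned> &v)
{
  for (std::size_t i = 1; i < v.size (); ++i)
    if (v[i] != v[i - 1] + 1)
      return false;
  return true;
}

// Each reader has a 16384-byte read pending when 65536 bytes arrive;
// reader 1 then appends a 32768-byte read to the same buffer.
std::vector<unsigned> reader1_interleaved ()
{
  ToyPipe p;
  std::vector<unsigned> r1, r2;
  p.write (65536);
  p.read (r1, 16384);   // reader 1's pending read completes
  p.read (r2, 16384);   // reader 2's pending read completes
  p.read (r1, 32768);   // reader 1 continues into the same buffer
  return r1;
}

// Same traffic, but reader 1 keeps the lock for its whole read(65536),
// so reader 2 only gets to run afterwards.
std::vector<unsigned> reader1_serialized ()
{
  ToyPipe p;
  std::vector<unsigned> r1, r2;
  p.write (65536);
  while (r1.size () < 65536 && !p.data.empty ())
    p.read (r1, 16384);
  p.read (r2, 16384);   // finds the pipe empty in this model
  return r1;
}
```

In this model reader1_interleaved () yields 49152 offsets with a hole
after the first 16384, while reader1_serialized () yields one unbroken
run. The real code has to cope with blocking and signals as well, which
the model leaves out.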


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-05 13:50                                                                                                           ` Takashi Yano
@ 2021-09-05 18:47                                                                                                             ` Ken Brown
  2021-09-05 19:42                                                                                                               ` Takashi Yano
  2021-09-05 20:09                                                                                                               ` Takashi Yano
  0 siblings, 2 replies; 250+ messages in thread
From: Ken Brown @ 2021-09-05 18:47 UTC (permalink / raw)
  To: cygwin-developers

Hi Takashi,

On 9/5/2021 9:50 AM, Takashi Yano wrote:
> Small typo.
> 
> -	     pipe buffer size.  Every pending read lowers WriteQuotaAvailable
> +	     pipe buffer size. pending read lowers WriteQuotaAvailable
> 
> should be:
> 
> -	     pipe buffer size.  Every pending read lowers WriteQuotaAvailable
> +	     pipe buffer size. Pending read lowers WriteQuotaAvailable

The patch looks great to me.  Two minor nits:

1. The patch doesn't apply cleanly.  Could you rebase it against the current 
HEAD of topic/pipe?

2. There's no need for chunk to be less than the number of bytes requested if we 
know there's data in the pipe.  So maybe something like this (untested) would be 
better:

           ULONG chunk;
           status = NtQueryInformationFile (get_handle (), &io,
                                            &fpli, sizeof (fpli),
                                            FilePipeLocalInformation);
           if (NT_SUCCESS (status))
             {
               if (fpli.ReadDataAvailable > 0)
                 chunk = left;
               else if (nbytes != 0)
                 break;
               else
                 chunk = fpli.InboundQuota / 2;
             }
           else if (nbytes != 0)
             break;
           else
             chunk = max_atomic_write / 2;

           if (chunk < left)
             len1 = chunk;

Ken
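
The size selection suggested above can also be written as a standalone
function, which makes the five cases easy to check by hand. This is a
sketch, not the code that was committed: PipeInfo, Request and
next_request are hypothetical names, PipeInfo merely mirrors the two
FILE_PIPE_LOCAL_INFORMATION fields used in the snippet, and "proceed ==
false" stands for the break out of the read loop.

```cpp
#include <cstddef>

// Hypothetical stand-in for the two FILE_PIPE_LOCAL_INFORMATION fields
// the snippet looks at.
struct PipeInfo
{
  std::size_t read_data_available;
  std::size_t inbound_quota;
};

struct Request
{
  bool proceed;         // false: stop and return what was already read
  std::size_t size;     // number of bytes to request when proceed is true
};

Request
next_request (bool query_ok, const PipeInfo &info, std::size_t left,
              std::size_t nbytes, std::size_t max_atomic_write)
{
  std::size_t chunk;
  if (query_ok)
    {
      if (info.read_data_available > 0)
        chunk = left;                   // data is there: ask for all of it
      else if (nbytes != 0)
        return { false, 0 };            // pipe drained after a partial read
      else
        chunk = info.inbound_quota / 2; // empty pipe: at most half the buffer
    }
  else if (nbytes != 0)
    return { false, 0 };                // state unknown, but we have data
  else
    chunk = max_atomic_write / 2;       // state unknown: conservative size

  return { true, chunk < left ? chunk : left };
}
```

With a 65536-byte quota and an empty pipe, a first request for 100000
bytes is capped at 32768, while the same request is passed through
unchanged as soon as ReadDataAvailable is nonzero.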


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-05 18:47                                                                                                             ` Ken Brown
@ 2021-09-05 19:42                                                                                                               ` Takashi Yano
  2021-09-05 20:09                                                                                                               ` Takashi Yano
  1 sibling, 0 replies; 250+ messages in thread
From: Takashi Yano @ 2021-09-05 19:42 UTC (permalink / raw)
  To: cygwin-developers

On Sun, 5 Sep 2021 14:47:26 -0400
Ken Brown wrote:
> Hi Takashi,
> 
> The patch looks great to me.  Two minor nits:
> 
> 1. The patch doesn't apply cleanly.  Could you rebase it against the current 
> HEAD of topic/pipe?
> 
> 2. There's no need for chunk to be less than the number of bytes requested if we 
> know there's data in the pipe.  So maybe something like this (untested) would be 
> better:
> 
>            ULONG chunk;
>            status = NtQueryInformationFile (get_handle (), &io,
>                                             &fpli, sizeof (fpli),
>                                             FilePipeLocalInformation);
>            if (NT_SUCCESS (status))
>              {
>                if (fpli.ReadDataAvailable > 0)
>                  chunk = left;
>                else if (nbytes != 0)
>                  break;
>                else
>                  chunk = fpli.InboundQuota / 2;
>              }
>            else if (nbytes != 0)
>              break;
>            else
>              chunk = max_atomic_write / 2;
> 
>            if (chunk < left)
>              len1 = chunk;

Thanks for the advice.

As for 1., isn't the current git head 866a62037e235d558584e821a11d60d848e06234?

In my environment, the patch applies cleanly with git am.

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-05 18:47                                                                                                             ` Ken Brown
  2021-09-05 19:42                                                                                                               ` Takashi Yano
@ 2021-09-05 20:09                                                                                                               ` Takashi Yano
  2021-09-05 20:27                                                                                                                 ` Ken Brown
  2021-09-06  8:13                                                                                                                 ` Corinna Vinschen
  1 sibling, 2 replies; 250+ messages in thread
From: Takashi Yano @ 2021-09-05 20:09 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 6900 bytes --]

Hi Ken,

On Sun, 5 Sep 2021 14:47:26 -0400
Ken Brown wrote:
> Hi Takashi,
> 
> The patch looks great to me.  Two minor nits:
> 
> 1. The patch doesn't apply cleanly.  Could you rebase it against the current 
> HEAD of topic/pipe?
> 
> 2. There's no need for chunk to be less than the number of bytes requested if we 
> know there's data in the pipe.  So maybe something like this (untested) would be 
> better:
> 
>            ULONG chunk;
>            status = NtQueryInformationFile (get_handle (), &io,
>                                             &fpli, sizeof (fpli),
>                                             FilePipeLocalInformation);
>            if (NT_SUCCESS (status))
>              {
>                if (fpli.ReadDataAvailable > 0)
>                  chunk = left;
>                else if (nbytes != 0)
>                  break;
>                else
>                  chunk = fpli.InboundQuota / 2;
>              }
>            else if (nbytes != 0)
>              break;
>            else
>              chunk = max_atomic_write / 2;
> 
>            if (chunk < left)
>              len1 = chunk;

Could you please try the attached new patch?

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>

[-- Attachment #2: 0001-Cygwin-pipe-Stop-counting-reader-and-read-all-availa.patch --]
[-- Type: application/octet-stream, Size: 5165 bytes --]

From d83d0314a83467d4f1778ad835b12a36c9dcb83b Mon Sep 17 00:00:00 2001
From: Takashi Yano <takashi.yano@nifty.ne.jp>
Date: Mon, 6 Sep 2021 04:58:58 +0900
Subject: [PATCH] Cygwin: pipe: Stop counting reader and read all available
 data.

- By guarding read with read_mtx, no more than one ReadFile can
  be called simultaneously. So counting read handles is no longer
  necessary.
- Make raw_read code as similar as possible to raw_write code.
---
 winsup/cygwin/fhandler_pipe.cc | 90 +++++++++++++++++++---------------
 1 file changed, 51 insertions(+), 39 deletions(-)

diff --git a/winsup/cygwin/fhandler_pipe.cc b/winsup/cygwin/fhandler_pipe.cc
index 68974eb80..1a74551c6 100644
--- a/winsup/cygwin/fhandler_pipe.cc
+++ b/winsup/cygwin/fhandler_pipe.cc
@@ -221,12 +221,10 @@ fhandler_pipe::get_proc_fd_name (char *buf)
 void __reg3
 fhandler_pipe::raw_read (void *ptr, size_t& len)
 {
-  NTSTATUS status;
+  size_t nbytes = 0;
+  NTSTATUS status = STATUS_SUCCESS;
   IO_STATUS_BLOCK io;
   HANDLE evt = NULL;
-  DWORD waitret = WAIT_OBJECT_0;
-  bool keep_looping = false;
-  size_t orig_len = len;
 
   if (!len)
     return;
@@ -239,43 +237,51 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
       return;
     }
 
-  do
+  DWORD timeout = is_nonblocking () ? 0 : INFINITE;
+  if (WAIT_TIMEOUT == WaitForSingleObject (read_mtx, timeout))
     {
-      len = orig_len;
-      keep_looping = false;
+      set_errno (EAGAIN);
+      len = (size_t) -1;
+      return;
+    }
+  while (nbytes < len)
+    {
+      ULONG_PTR nbytes_now = 0;
+      size_t left = len - nbytes;
+      ULONG len1 = (ULONG) left;
+      DWORD waitret = WAIT_OBJECT_0;
+
       if (evt)
 	ResetEvent (evt);
       if (!is_nonblocking ())
 	{
 	  FILE_PIPE_LOCAL_INFORMATION fpli;
-	  ULONG reader_count;
-	  ULONG max_len = 64;
-
-	  WaitForSingleObject (read_mtx, INFINITE);
 
 	  /* If the pipe is empty, don't request more bytes than half the
-	     pipe buffer size.  Every pending read lowers WriteQuotaAvailable
-	     on the write side and thus affects select's ability to return
-	     more or less reliable info whether a write succeeds or not.
-
-	     Let the size of the request depend on the number of readers
-	     at the time. */
+	     pipe buffer size. Pending read lowers WriteQuotaAvailable on
+	     the write side and thus affects select's ability to return
+	     more or less reliable info whether a write succeeds or not. */
+	  ULONG chunk = max_atomic_write / 2;
 	  status = NtQueryInformationFile (get_handle (), &io,
 					   &fpli, sizeof (fpli),
 					   FilePipeLocalInformation);
-	  if (NT_SUCCESS (status) && fpli.ReadDataAvailable == 0)
+	  if (NT_SUCCESS (status))
 	    {
-	      reader_count = get_obj_handle_count (get_handle ());
-	      if (reader_count < 10)
-		max_len = fpli.InboundQuota / (2 * reader_count);
-	      if (len > max_len)
-		len = max_len;
+	      if (fpli.ReadDataAvailable > 0)
+		chunk = left;
+	      else if (nbytes != 0)
+		break;
+	      else
+		chunk = fpli.InboundQuota / 2;
 	    }
+	  else if (nbytes != 0)
+	    break;
+
+	  if (len1 > chunk)
+	    len1 = chunk;
 	}
       status = NtReadFile (get_handle (), evt, NULL, NULL, &io, ptr,
-			   len, NULL, NULL);
-      if (!is_nonblocking ())
-	ReleaseMutex (read_mtx);
+			   len1, NULL, NULL);
       if (evt && status == STATUS_PENDING)
 	{
 	  waitret = cygwait (evt);
@@ -292,13 +298,13 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
 	    set_errno (EBADF);
 	  else
 	    __seterrno ();
-	  len = (size_t) -1;
+	  nbytes = (size_t) -1;
 	}
       else if (NT_SUCCESS (status))
 	{
-	  len = io.Information;
-	  if (len == 0)
-	    keep_looping = true;
+	  nbytes_now = io.Information;
+	  ptr = ((char *) ptr) + nbytes_now;
+	  nbytes += nbytes_now;
 	}
       else
 	{
@@ -308,40 +314,46 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
 	    case STATUS_END_OF_FILE:
 	    case STATUS_PIPE_BROKEN:
 	      /* This is really EOF.  */
-	      len = 0;
 	      break;
 	    case STATUS_MORE_ENTRIES:
 	    case STATUS_BUFFER_OVERFLOW:
 	      /* `io.Information' is supposedly valid.  */
-	      len = io.Information;
-	      if (len == 0)
-		keep_looping = true;
+	      nbytes_now = io.Information;
+	      ptr = ((char *) ptr) + nbytes_now;
+	      nbytes += nbytes_now;
 	      break;
 	    case STATUS_PIPE_LISTENING:
 	    case STATUS_PIPE_EMPTY:
+	      if (nbytes != 0)
+		break;
 	      if (is_nonblocking ())
 		{
 		  set_errno (EAGAIN);
-		  len = (size_t) -1;
+		  nbytes = (size_t) -1;
 		  break;
 		}
 	      fallthrough;
 	    default:
 	      __seterrno_from_nt_status (status);
-	      len = (size_t) -1;
+	      nbytes = (size_t) -1;
 	      break;
 	    }
 	}
-    } while (keep_looping);
+
+      if (nbytes_now == 0)
+	break;
+    }
+  ReleaseMutex (read_mtx);
   if (evt)
     CloseHandle (evt);
-  if (status == STATUS_THREAD_SIGNALED)
+  if (status == STATUS_THREAD_SIGNALED && nbytes == 0)
     {
       set_errno (EINTR);
-      len = (size_t) -1;
+      nbytes = (size_t) -1;
     }
   else if (status == STATUS_THREAD_CANCELED)
     pthread::static_cancel_self ();
+  len = nbytes;
 }
 
 ssize_t __reg3
-- 
2.33.0



* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-05 20:09                                                                                                               ` Takashi Yano
@ 2021-09-05 20:27                                                                                                                 ` Ken Brown
  2021-09-06  8:13                                                                                                                 ` Corinna Vinschen
  1 sibling, 0 replies; 250+ messages in thread
From: Ken Brown @ 2021-09-05 20:27 UTC (permalink / raw)
  To: cygwin-developers

On 9/5/2021 4:09 PM, Takashi Yano wrote:
> Hi Ken,
> 
> On Sun, 5 Sep 2021 14:47:26 -0400
> Ken Brown wrote:
>> Hi Takashi,
>>
>> On 9/5/2021 9:50 AM, Takashi Yano wrote:
>>> On Sun, 5 Sep 2021 22:40:59 +0900
>>> Takashi Yano wrote:
>>>> On Sun, 5 Sep 2021 08:15:23 +0900
>>>> Takashi Yano wrote:
>>>>> Hi Ken,
>>>>>
>>>>> On Sat, 4 Sep 2021 10:04:12 -0400
>>>>> Ken Brown wrote:
>>>>>> On 9/4/2021 8:37 AM, Takashi Yano wrote:
>>>>>>> On Sat, 4 Sep 2021 21:02:58 +0900
>>>>>>> Takashi Yano wrote:
>>>>>>>> Hi Corinna, Ken,
>>>>>>>>
>>>>>>>> On Fri, 3 Sep 2021 09:27:37 -0400
>>>>>>>> Ken Brown wrote:
>>>>>>>>> On 9/3/2021 8:22 AM, Takashi Yano wrote:
>>>>>>>>>> POSIX says:
>>>>>>>>>>         The value returned may be less than nbyte if the number of bytes left
>>>>>>>>>>         in the file is less than nbyte, if the read() request was interrupted
>>>>>>>>>>         by a signal, or if the file is a pipe or FIFO or special file and has
>>>>>>>>>>                                                                           ~~~
>>>>>>>>>>         fewer than nbyte bytes immediately available for reading.
>>>>>>>>>>         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>>>>>>>>
>>>>>>>>>> https://pubs.opengroup.org/onlinepubs/009604599/functions/read.html
>>>>>>>>>>
>>>>>>>>>> If it is turned over, read() should read all data immediately available,
>>>>>>>>>> I think.
>>>>>>>>>
>>>>>>>>> I understand the reasoning now, but I think your patch isn't quite right.  As it
>>>>>>>>> stands, if the call to NtQueryInformationFile fails but total_length != 0,
>>>>>>>>> you're trying to read again without knowing that there's data in the pipe.
>>>>>>>>>
>>>>>>>>> Also, I think you need the following:
>>>>>>>>>
>>>>>>>>> diff --git a/winsup/cygwin/fhandler_pipe.cc b/winsup/cygwin/fhandler_pipe.cc
>>>>>>>>> index ef7823ae5..46bb96961 100644
>>>>>>>>> --- a/winsup/cygwin/fhandler_pipe.cc
>>>>>>>>> +++ b/winsup/cygwin/fhandler_pipe.cc
>>>>>>>>> @@ -348,8 +348,13 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
>>>>>>>>>          CloseHandle (evt);
>>>>>>>>>        if (status == STATUS_THREAD_SIGNALED)
>>>>>>>>>          {
>>>>>>>>> -      set_errno (EINTR);
>>>>>>>>> -      len = (size_t) -1;
>>>>>>>>> +      if (total_len == 0)
>>>>>>>>> +       {
>>>>>>>>> +         set_errno (EINTR);
>>>>>>>>> +         len = (size_t) -1;
>>>>>>>>> +       }
>>>>>>>>> +      else
>>>>>>>>> +       len = total_len;
>>>>>>>>>          }
>>>>>>>>>        else if (status == STATUS_THREAD_CANCELED)
>>>>>>>>>          pthread::static_cancel_self ();
>>>>>>>>
>>>>>>>> Thanks for your advice. I fixed the issue and attached new patch.
>>>>>>>>
>>>>>>>> On Fri, 3 Sep 2021 17:37:13 +0200
>>>>>>>> Corinna Vinschen wrote:
>>>>>>>>> Hmm, I see the point, but we might have another problem with that.
>>>>>>>>>
>>>>>>>>> We can't keep the mutex while waiting on the pending read, and there
>>>>>>>>> could be more than one pending read running at the time.  if so,
>>>>>>>>> chances are extremly high, that the data written to the buffer gets
>>>>>>>>> split like this:
>>>>>>>>>
>>>>>>>>>       reader 1		               reader 2
>>>>>>>>>
>>>>>>>>>       calls read(65536)                   calls read(65536)
>>>>>>>>>
>>>>>>>>>       calls NtReadFile(16384 bytes)
>>>>>>>>>                                           calls NtReadFile(16384 bytes)
>>>>>>>>>
>>>>>>>>> writer writes 65536 bytes
>>>>>>>>>
>>>>>>>>>       wakes up and gets 16384 bytes
>>>>>>>>>                                           wakes up and gets 16384 bytes
>>>>>>>>>       gets the mutex, calls
>>>>>>>>>       NtReadFile(32768) which
>>>>>>>>>       returns immediately with
>>>>>>>>>       32768 bytes added to the
>>>>>>>>>       caller's buffer.
>>>>>>>>>
>>>>>>>>> so the buffer returned to reader 1 is 49152 bytes, with 16384 bytes
>>>>>>>>> missing in the middle of it, *without* the reader knowing about that
>>>>>>>>> fact.  If reader 1 gets the first 16384 bytes, the 16384 bytes have
>>>>>>>>> been read in a single call, at least, so the byte order is not
>>>>>>>>> unknowingly broken on the application level.
>>>>>>>>>
>>>>>>>>> Does that make sense?
>>>>>>>>
>>>>>>>> Why can't we keep the mutex while waiting on the pending read?
>>>>>>>> If we can keep the mutex, the issue above mentioned does not
>>>>>>>> happen, right?
>>>>>>>>
>>>>>>>> What about the patch attached? This keeps the mutex while read()
>>>>>>>> but I do not see any defects so far.
>>>>>>
>>>>>> LGTM.
>>>>>>
>>>>>> If Corinna agrees, I have a couple of suggestions.
>>>>>>
>>>>>> 1. With this patch, we can no longer have more than one pending ReadFile.  So
>>>>>> there's no longer a need to count read handles, and the problem with select is
>>>>>> completely fixed as long as the number of bytes requested is less than the pipe
>>>>>> buffer size.
>>>>>>
>>>>>> 2. raw_read is now reading in chunks, like raw_write.  For readability of the
>>>>>> code, I think it would be better to make the two functions as similar as
>>>>>> possible.  For example, you could replace the do/while loop by a
>>>>>> while(total_len<orig_len) loop.  And you could even use similar names for the
>>>>>> variables, e.g., nbytes instead of total_len, or vice versa.
>>>>>
>>>>> Thanks for the suggestion. I have rebuilt the patch.
>>>>> Please see the patch attached.
>>>>
>>>> This patch seems to fail to adopt to current git head of topic/pipe
>>>> branch. I rebuilt the patch to fit current top/pipe.
>>>>
>>>> Please see the patch attached.
>>>
>>> Small typo.
>>>
>>> -	     pipe buffer size.  Every pending read lowers WriteQuotaAvailable
>>> +	     pipe buffer size. pending read lowers WriteQuotaAvailable
>>>
>>> should be:
>>>
>>> -	     pipe buffer size.  Every pending read lowers WriteQuotaAvailable
>>> +	     pipe buffer size. Pending read lowers WriteQuotaAvailable
>>
>> The patch looks great to me.  Two minor nits:
>>
>> 1. The patch doesn't apply cleanly.  Could you rebase it against the current
>> HEAD of topic/pipe?
>>
>> 2. There's no need for chunk to be less than the number of bytes requested if we
>> know there's data in the pipe.  So maybe something like this (untested) would be
>> better:
>>
>>             ULONG chunk;
>>             status = NtQueryInformationFile (get_handle (), &io,
>>                                              &fpli, sizeof (fpli),
>>                                              FilePipeLocalInformation);
>>             if (NT_SUCCESS (status))
>>               {
>>                 if (fpli.ReadDataAvailable > 0)
>>                   chunk = left;
>>                 else if (nbytes != 0)
>>                   break;
>>                 else
>>                   chunk = fpli.InboundQuota / 2;
>>               }
>>             else if (nbytes != 0)
>>               break;
>>             else
>>               chunk = max_atomic_write / 2;
>>
>>             if (chunk < left)
>>               len1 = chunk;
> 
> Could you please try attached new patch?

LGTM.  And it applies cleanly.  Maybe I did something wrong when I thought it 
didn't apply.

Thanks.

Ken


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-05 20:09                                                                                                               ` Takashi Yano
  2021-09-05 20:27                                                                                                                 ` Ken Brown
@ 2021-09-06  8:13                                                                                                                 ` Corinna Vinschen
  2021-09-06 11:16                                                                                                                   ` Takashi Yano
  1 sibling, 1 reply; 250+ messages in thread
From: Corinna Vinschen @ 2021-09-06  8:13 UTC (permalink / raw)
  To: cygwin-developers

On Sep  6 05:09, Takashi Yano wrote:
> On Sun, 5 Sep 2021 14:47:26 -0400
> Ken Brown wrote:
> > On 9/5/2021 9:50 AM, Takashi Yano wrote:
> > > On Sun, 5 Sep 2021 22:40:59 +0900
> > > Takashi Yano wrote:
> > >> On Sun, 5 Sep 2021 08:15:23 +0900
> > >> Takashi Yano wrote:
> > >>> On Sat, 4 Sep 2021 10:04:12 -0400
> > >>> Ken Brown wrote:
> > >>>> On 9/4/2021 8:37 AM, Takashi Yano wrote:
> > >>>>> On Sat, 4 Sep 2021 21:02:58 +0900
> > >>>>> Takashi Yano wrote:
> > >>>>>> On Fri, 3 Sep 2021 09:27:37 -0400
> > >>>>>> Ken Brown wrote:
> > >>>>>>> On 9/3/2021 8:22 AM, Takashi Yano wrote:
> > >>>>>>>> POSIX says:
> > >>>>>>>>        The value returned may be less than nbyte if the number of bytes left
> > >>>>>>>>        in the file is less than nbyte, if the read() request was interrupted
> > >>>>>>>>        by a signal, or if the file is a pipe or FIFO or special file and has
> > >>>>>>>>                                                                          ~~~
> > >>>>>>>>        fewer than nbyte bytes immediately available for reading.
> > >>>>>>>>        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > >>>>>>>>
> > >>>>>>>> https://pubs.opengroup.org/onlinepubs/009604599/functions/read.html
> > >>>>>>>>
> > >>>>>>>> If it is turned over, read() should read all data immediately available,
> > >>>>>>>> I think.
> > >>>>>>> [...]
> > >>>>>> Corinna Vinschen wrote:
> > >>>>>>> We can't keep the mutex while waiting on the pending read, and there
> > >>>>>>> could be more than one pending read running at the time.  if so,
> > >>>>>>> chances are extremly high, that the data written to the buffer gets
> > >>>>>>> split like this:
> > >>>>>>> [...]
> > >>>>> Takashi Yano wrote:
> > >>>>>> Why can't we keep the mutex while waiting on the pending read?
> > >>>>>> If we can keep the mutex, the issue above mentioned does not
> > >>>>>> happen, right?
> [...]
> @@ -239,43 +237,51 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
>        return;
>      }
>  
> -  do
> +  DWORD timeout = is_nonblocking () ? 0 : INFINITE;
> +  if (WAIT_TIMEOUT == WaitForSingleObject (read_mtx, timeout))

My code was originally not supposed to serialise the readers.  The
mutex block should be short lived and only create an atomic block
for the two calls NtQueryInformationFile and NtReadFile.

If you have multiple readers, all but one of them will hang in this
WFSO.  They will block here without a chance to kill or Ctrl-C them
and thread cancellation won't work.

To fix that you have to use cygwait and handle signals and thread
cancellation the same way the code below does after the NtReadFile call.
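
A portable sketch of that idea, for illustration only: the acquire step
wakes up either when the lock becomes free or when an interrupt is
flagged, and it reports EINTR instead of hanging.  It uses a standard
C++ condition variable rather than cygwait, and the class
InterruptibleLock with its methods is a hypothetical stand-in, not
Cygwin code.

```cpp
#include <cerrno>
#include <condition_variable>
#include <mutex>

// Hypothetical stand-in for read_mtx plus signal delivery; not Cygwin code.
class InterruptibleLock
{
  std::mutex m;
  std::condition_variable cv;
  bool held = false;        // true while some reader owns the lock
  bool interrupted = false; // set when a "signal" should wake a waiter

public:
  // Returns 0 when the lock was taken, EAGAIN when a nonblocking caller
  // finds it busy, EINTR when a blocking wait was interrupted.
  int acquire (bool nonblocking)
  {
    std::unique_lock<std::mutex> lk (m);
    if (nonblocking)
      {
        if (held)
          return EAGAIN;
        held = true;
        return 0;
      }
    // Wake up on either condition, so a waiter is never stuck for good.
    cv.wait (lk, [this] { return !held || interrupted; });
    if (held)
      {
        interrupted = false; // consume the interrupt
        return EINTR;
      }
    held = true;
    return 0;
  }

  void release ()
  {
    {
      std::lock_guard<std::mutex> lk (m);
      held = false;
    }
    cv.notify_all ();
  }

  void interrupt ()
  {
    {
      std::lock_guard<std::mutex> lk (m);
      interrupted = true;
    }
    cv.notify_all ();
  }
};
```

A plain WaitForSingleObject corresponds to waiting on "!held" alone,
which is why the other readers cannot be interrupted there.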

>  	  /* If the pipe is empty, don't request more bytes than half the
> -	     pipe buffer size.  Every pending read lowers WriteQuotaAvailable
> -	     on the write side and thus affects select's ability to return
> -	     more or less reliable info whether a write succeeds or not.
> -
> -	     Let the size of the request depend on the number of readers
> -	     at the time. */
> +	     pipe buffer size. Pending read lowers WriteQuotaAvailable on
> +	     the write side and thus affects select's ability to return
> +	     more or less reliable info whether a write succeeds or not. */
> +	  ULONG chunk = max_atomic_write / 2;
>  	  status = NtQueryInformationFile (get_handle (), &io,
>  					   &fpli, sizeof (fpli),
>  					   FilePipeLocalInformation);
> -	  if (NT_SUCCESS (status) && fpli.ReadDataAvailable == 0)

If the readers are serialized anyway, why fetch only half the remaining
buffer size?  In that case fetching fpli.InboundQuota - 1 is as good
as fetching just half of it, isn't it?
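
The size selection under discussion can be sketched as a pure function,
assuming the InboundQuota - 1 variant.  This is only an illustration of
the decision logic: PipeState, ReadPlan and plan_next_read are
hypothetical names, and the pipe query is replaced by plain struct
fields so the logic can be checked without a Windows pipe.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

// Hypothetical snapshot of what NtQueryInformationFile would report.
struct PipeState
{
  bool query_ok;                // did the query succeed?
  uint32_t read_data_available; // bytes currently in the pipe
  uint32_t inbound_quota;       // pipe buffer size
};

// Either stop the read loop or request `request' bytes next.
struct ReadPlan
{
  bool stop;
  uint32_t request;
};

// left:   bytes the caller still wants
// nbytes: bytes already collected by earlier iterations
ReadPlan
plan_next_read (size_t left, size_t nbytes, const PipeState &ps,
                uint32_t fallback_bufsize)
{
  // Clamp to 32 bits, since a single request size is a ULONG.
  uint32_t len1 = (uint32_t) std::min<size_t> (left, UINT32_MAX);
  uint32_t chunk = fallback_bufsize - 1;

  if (ps.query_ok)
    {
      if (ps.read_data_available > 0)
        chunk = len1;               // data is there, ask for all of it
      else if (nbytes != 0)
        return ReadPlan { true, 0 }; // pipe drained, return what we have
      else
        chunk = ps.inbound_quota - 1; // empty pipe: stay below buffer size
    }
  else if (nbytes != 0)
    return ReadPlan { true, 0 };    // cannot tell, do not risk blocking

  return ReadPlan { false, std::min (len1, chunk) };
}
```

With an empty 65536-byte pipe and nothing read yet, a 100000-byte
request is cut down to 65535 bytes, so the pending read never claims
the whole write quota.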


Corinna


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-06  8:13                                                                                                                 ` Corinna Vinschen
@ 2021-09-06 11:16                                                                                                                   ` Takashi Yano
  2021-09-06 12:49                                                                                                                     ` Corinna Vinschen
  0 siblings, 1 reply; 250+ messages in thread
From: Takashi Yano @ 2021-09-06 11:16 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 4027 bytes --]

Hi Corinna,

On Mon, 6 Sep 2021 10:13:10 +0200
Corinna Vinschen wrote:
> On Sep  6 05:09, Takashi Yano wrote:
> > On Sun, 5 Sep 2021 14:47:26 -0400
> > Ken Brown wrote:
> > > On 9/5/2021 9:50 AM, Takashi Yano wrote:
> > > > On Sun, 5 Sep 2021 22:40:59 +0900
> > > > Takashi Yano wrote:
> > > >> On Sun, 5 Sep 2021 08:15:23 +0900
> > > >> Takashi Yano wrote:
> > > >>> On Sat, 4 Sep 2021 10:04:12 -0400
> > > >>> Ken Brown wrote:
> > > >>>> On 9/4/2021 8:37 AM, Takashi Yano wrote:
> > > >>>>> On Sat, 4 Sep 2021 21:02:58 +0900
> > > >>>>> Takashi Yano wrote:
> > > >>>>>> On Fri, 3 Sep 2021 09:27:37 -0400
> > > >>>>>> Ken Brown wrote:
> > > >>>>>>> On 9/3/2021 8:22 AM, Takashi Yano wrote:
> > > >>>>>>>> POSIX says:
> > > >>>>>>>>        The value returned may be less than nbyte if the number of bytes left
> > > >>>>>>>>        in the file is less than nbyte, if the read() request was interrupted
> > > >>>>>>>>        by a signal, or if the file is a pipe or FIFO or special file and has
> > > >>>>>>>>                                                                          ~~~
> > > >>>>>>>>        fewer than nbyte bytes immediately available for reading.
> > > >>>>>>>>        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > > >>>>>>>>
> > > >>>>>>>> https://pubs.opengroup.org/onlinepubs/009604599/functions/read.html
> > > >>>>>>>>
> > > >>>>>>>> If it is turned over, read() should read all data immediately available,
> > > >>>>>>>> I think.
> > > >>>>>>> [...]
> > > >>>>>> Corinna Vinschen wrote:
> > > >>>>>>> We can't keep the mutex while waiting on the pending read, and there
> > > >>>>>>> could be more than one pending read running at the time.  if so,
> > > >>>>>>> chances are extremly high, that the data written to the buffer gets
> > > >>>>>>> split like this:
> > > >>>>>>> [...]
> > > >>>>> Takashi Yano wrote:
> > > >>>>>> Why can't we keep the mutex while waiting on the pending read?
> > > >>>>>> If we can keep the mutex, the issue above mentioned does not
> > > >>>>>> happen, right?
> > [...]
> > @@ -239,43 +237,51 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
> >        return;
> >      }
> >  
> > -  do
> > +  DWORD timeout = is_nonblocking () ? 0 : INFINITE;
> > +  if (WAIT_TIMEOUT == WaitForSingleObject (read_mtx, timeout))
> 
> My code was originally not supposed to serialise the readers.  The
> mutex block should be short lived and only create an atomic block
> for the two calls NtQueryInformationFile and NtReadFile.
> 
> If you have multiple readers, all but one of them will hang in this
> WFSO.  They will block here without a chance to kill or Ctrl-C them
> and thread cancellation won't work.
> 
> To fix that you have to use cygwait and handle signals and thread
> cancellation the same way as in the below code following the NtReadFile.

OK. Thanks for the advice. Then what about the patch attached?

> >  	  /* If the pipe is empty, don't request more bytes than half the
> > -	     pipe buffer size.  Every pending read lowers WriteQuotaAvailable
> > -	     on the write side and thus affects select's ability to return
> > -	     more or less reliable info whether a write succeeds or not.
> > -
> > -	     Let the size of the request depend on the number of readers
> > -	     at the time. */
> > +	     pipe buffer size. Pending read lowers WriteQuotaAvailable on
> > +	     the write side and thus affects select's ability to return
> > +	     more or less reliable info whether a write succeeds or not. */
> > +	  ULONG chunk = max_atomic_write / 2;
> >  	  status = NtQueryInformationFile (get_handle (), &io,
> >  					   &fpli, sizeof (fpli),
> >  					   FilePipeLocalInformation);
> > -	  if (NT_SUCCESS (status) && fpli.ReadDataAvailable == 0)
> 
> If the readers are serialized anyway, why fetch only half the remaining
> buffer size?  In that case fetching fpli.InboundQuota - 1 is as good
> as fetching just the half of it, isn't it?

It sounds reasonable. Adopted in the attached patch.

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>

[-- Attachment #2: 0001-Cygwin-pipe-Stop-counting-reader-and-read-all-availa.patch --]
[-- Type: application/octet-stream, Size: 5370 bytes --]

From 0bb2ea9e552f8cb788c097519174145cd368b45b Mon Sep 17 00:00:00 2001
From: Takashi Yano <takashi.yano@nifty.ne.jp>
Date: Mon, 6 Sep 2021 20:12:16 +0900
Subject: [PATCH] Cygwin: pipe: Stop counting reader and read all available
 data.

- By guarding read with read_mtx, no more than one ReadFile can
  be called simultaneously. So counting read handles is no longer
  necessary.
- Make raw_read code as similar as possible to raw_write code.
---
 winsup/cygwin/fhandler_pipe.cc | 100 ++++++++++++++++++++-------------
 1 file changed, 60 insertions(+), 40 deletions(-)

diff --git a/winsup/cygwin/fhandler_pipe.cc b/winsup/cygwin/fhandler_pipe.cc
index 68974eb80..fe0bf0ca2 100644
--- a/winsup/cygwin/fhandler_pipe.cc
+++ b/winsup/cygwin/fhandler_pipe.cc
@@ -221,12 +221,10 @@ fhandler_pipe::get_proc_fd_name (char *buf)
 void __reg3
 fhandler_pipe::raw_read (void *ptr, size_t& len)
 {
-  NTSTATUS status;
+  size_t nbytes = 0;
+  NTSTATUS status = STATUS_SUCCESS;
   IO_STATUS_BLOCK io;
   HANDLE evt = NULL;
-  DWORD waitret = WAIT_OBJECT_0;
-  bool keep_looping = false;
-  size_t orig_len = len;
 
   if (!len)
     return;
@@ -239,43 +237,59 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
       return;
     }
 
-  do
+  DWORD timeout = is_nonblocking () ? 0 : INFINITE;
+  DWORD waitret = cygwait (read_mtx, timeout);
+  switch (waitret)
     {
-      len = orig_len;
-      keep_looping = false;
+    case WAIT_OBJECT_0:
+      break;
+    case WAIT_TIMEOUT:
+      set_errno (EAGAIN);
+      len = (size_t) -1;
+      return;
+    default:
+      set_errno (EINTR);
+      len = (size_t) -1;
+      return;
+    }
+  while (nbytes < len)
+    {
+      ULONG_PTR nbytes_now = 0;
+      size_t left = len - nbytes;
+      ULONG len1 = (ULONG) left;
+      waitret = WAIT_OBJECT_0;
+
       if (evt)
 	ResetEvent (evt);
       if (!is_nonblocking ())
 	{
 	  FILE_PIPE_LOCAL_INFORMATION fpli;
-	  ULONG reader_count;
-	  ULONG max_len = 64;
-
-	  WaitForSingleObject (read_mtx, INFINITE);
 
-	  /* If the pipe is empty, don't request more bytes than half the
-	     pipe buffer size.  Every pending read lowers WriteQuotaAvailable
-	     on the write side and thus affects select's ability to return
-	     more or less reliable info whether a write succeeds or not.
-
-	     Let the size of the request depend on the number of readers
-	     at the time. */
+	  /* If the pipe is empty, don't request more bytes than pipe
+	     buffer size - 1. Pending read lowers WriteQuotaAvailable on
+	     the write side and thus affects select's ability to return
+	     more or less reliable info whether a write succeeds or not. */
+	  ULONG chunk = max_atomic_write - 1;
 	  status = NtQueryInformationFile (get_handle (), &io,
 					   &fpli, sizeof (fpli),
 					   FilePipeLocalInformation);
-	  if (NT_SUCCESS (status) && fpli.ReadDataAvailable == 0)
+	  if (NT_SUCCESS (status))
 	    {
-	      reader_count = get_obj_handle_count (get_handle ());
-	      if (reader_count < 10)
-		max_len = fpli.InboundQuota / (2 * reader_count);
-	      if (len > max_len)
-		len = max_len;
+	      if (fpli.ReadDataAvailable > 0)
+		chunk = left;
+	      else if (nbytes != 0)
+		break;
+	      else
+		chunk = fpli.InboundQuota - 1;
 	    }
+	  else if (nbytes != 0)
+	    break;
+
+	  if (len1 > chunk)
+	    len1 = chunk;
 	}
       status = NtReadFile (get_handle (), evt, NULL, NULL, &io, ptr,
-			   len, NULL, NULL);
-      if (!is_nonblocking ())
-	ReleaseMutex (read_mtx);
+			   len1, NULL, NULL);
       if (evt && status == STATUS_PENDING)
 	{
 	  waitret = cygwait (evt);
@@ -292,13 +306,13 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
 	    set_errno (EBADF);
 	  else
 	    __seterrno ();
-	  len = (size_t) -1;
+	  nbytes = (size_t) -1;
 	}
       else if (NT_SUCCESS (status))
 	{
-	  len = io.Information;
-	  if (len == 0)
-	    keep_looping = true;
+	  nbytes_now = io.Information;
+	  ptr = ((char *) ptr) + nbytes_now;
+	  nbytes += nbytes_now;
 	}
       else
 	{
@@ -308,40 +322,46 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
 	    case STATUS_END_OF_FILE:
 	    case STATUS_PIPE_BROKEN:
 	      /* This is really EOF.  */
-	      len = 0;
 	      break;
 	    case STATUS_MORE_ENTRIES:
 	    case STATUS_BUFFER_OVERFLOW:
 	      /* `io.Information' is supposedly valid.  */
-	      len = io.Information;
-	      if (len == 0)
-		keep_looping = true;
+	      nbytes_now = io.Information;
+	      ptr = ((char *) ptr) + nbytes_now;
+	      nbytes += nbytes_now;
 	      break;
 	    case STATUS_PIPE_LISTENING:
 	    case STATUS_PIPE_EMPTY:
+	      if (nbytes != 0)
+		break;
 	      if (is_nonblocking ())
 		{
 		  set_errno (EAGAIN);
-		  len = (size_t) -1;
+		  nbytes = (size_t) -1;
 		  break;
 		}
 	      fallthrough;
 	    default:
 	      __seterrno_from_nt_status (status);
-	      len = (size_t) -1;
+	      nbytes = (size_t) -1;
 	      break;
 	    }
 	}
-    } while (keep_looping);
+
+      if (nbytes_now == 0)
+	break;
+    }
+  ReleaseMutex (read_mtx);
   if (evt)
     CloseHandle (evt);
-  if (status == STATUS_THREAD_SIGNALED)
+  if (status == STATUS_THREAD_SIGNALED && nbytes == 0)
     {
       set_errno (EINTR);
-      len = (size_t) -1;
+      nbytes = (size_t) -1;
     }
   else if (status == STATUS_THREAD_CANCELED)
     pthread::static_cancel_self ();
+  len = nbytes;
 }
 
 ssize_t __reg3
-- 
2.33.0



* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-06 11:16                                                                                                                   ` Takashi Yano
@ 2021-09-06 12:49                                                                                                                     ` Corinna Vinschen
  2021-09-06 13:16                                                                                                                       ` Takashi Yano
  2021-09-07 16:14                                                                                                                       ` Ken Brown
  0 siblings, 2 replies; 250+ messages in thread
From: Corinna Vinschen @ 2021-09-06 12:49 UTC (permalink / raw)
  To: cygwin-developers

On Sep  6 20:16, Takashi Yano wrote:
> Hi Corinna,
> 
> On Mon, 6 Sep 2021 10:13:10 +0200
> Corinna Vinschen wrote:
> > If you have multiple readers, all but one of them will hang in this
> > WFSO.  They will block here without a chance to kill or Ctrl-C them
> > and thread cancellation won't work.
> > 
> > To fix that you have to use cygwait and handle signals and thread
> > cancellation the same way as in the below code following the NtReadFile.
> 
> OK. Thanks for the advice. Then what about the patch attached?
> 
> > >  	  /* If the pipe is empty, don't request more bytes than half the
> > > -	     pipe buffer size.  Every pending read lowers WriteQuotaAvailable
> > > -	     on the write side and thus affects select's ability to return
> > > -	     more or less reliable info whether a write succeeds or not.
> > > -
> > > -	     Let the size of the request depend on the number of readers
> > > -	     at the time. */
> > > +	     pipe buffer size. Pending read lowers WriteQuotaAvailable on
> > > +	     the write side and thus affects select's ability to return
> > > +	     more or less reliable info whether a write succeeds or not. */
> > > +	  ULONG chunk = max_atomic_write / 2;
> > >  	  status = NtQueryInformationFile (get_handle (), &io,
> > >  					   &fpli, sizeof (fpli),
> > >  					   FilePipeLocalInformation);
> > > -	  if (NT_SUCCESS (status) && fpli.ReadDataAvailable == 0)
> > 
> > If the readers are serialized anyway, why fetch only half the remaining
> > buffer size?  In that case fetching fpli.InboundQuota - 1 is as good
> > as fetching just the half of it, isn't it?
> 
> It sounds reasonable. Adopted in the attached patch.

Patch looks ok.

I added one more patch:

- It occurred to me that the code is lacking a CancelIo in case cygwait
  is waiting for NtReadFile/NtWriteFile.  Actually, calling cygwait
  without "mask" parameter will result in cygwait performing the thread
  cancellation by itself, but cancelling a thread does not cancel the
  async IO started by that thread.  So I fixed the cygwait calls in
  raw_read/raw_write to return to the caller and then call CancelIo
  before cancelling the thread.

I planned to push one more patch:

- Drop max_atomic_write, it's == DEFAULT_PIPEBUFSIZE anyway

But then some things were coming to mind, which we still have to discuss.

- I think setting chunk to DEFAULT_PIPEBUFSIZE - 1 in the read case and
  DEFAULT_PIPEBUFSIZE in the write case by default is dangerous.
  Assuming the pipe has been created by a non-Cygwin process, the values
  may be way too high.

  Suggestion: Actually set max_atomic_write to something useful.
  Set max_atomic_write to DEFAULT_PIPEBUFSIZE in fhandler_pipe::create.
  In case of stdio handles inherited from non-Cygwin processes, fetch
  the pipe buffer size via NtQueryInformationFile in
  dtable::init_std_file_from_handle().  Better, in a matching
  fhandler_pipe method called from init_std_file_from_handle().

- What about calling select for writing on pipes read by non-Cygwin
  processes?  In that case, we still can't rely on WriteQuotaAvailable,
  just as before.

  I have a vague idea that we might want to count readers in that case,
  but I have to think about it some more.


Corinna


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-06 12:49                                                                                                                     ` Corinna Vinschen
@ 2021-09-06 13:16                                                                                                                       ` Takashi Yano
  2021-09-06 16:08                                                                                                                         ` Corinna Vinschen
  2021-09-07 16:14                                                                                                                       ` Ken Brown
  1 sibling, 1 reply; 250+ messages in thread
From: Takashi Yano @ 2021-09-06 13:16 UTC (permalink / raw)
  To: cygwin-developers

On Mon, 6 Sep 2021 14:49:55 +0200
Corinna Vinschen wrote:
> On Sep  6 20:16, Takashi Yano wrote:
> > Hi Corinna,
> > 
> > On Mon, 6 Sep 2021 10:13:10 +0200
> > Corinna Vinschen wrote:
> > > If you have multiple readers, all but one of them will hang in this
> > > WFSO.  They will block here without a chance to kill or Ctrl-C them
> > > and thread cancellation won't work.
> > > 
> > > To fix that you have to use cygwait and handle signals and thread
> > > cancellation the same way as in the below code following the NtReadFile.
> > 
> > OK. Thanks for the advice. Then what about the patch attached?
> > 
> > > >  	  /* If the pipe is empty, don't request more bytes than half the
> > > > -	     pipe buffer size.  Every pending read lowers WriteQuotaAvailable
> > > > -	     on the write side and thus affects select's ability to return
> > > > -	     more or less reliable info whether a write succeeds or not.
> > > > -
> > > > -	     Let the size of the request depend on the number of readers
> > > > -	     at the time. */
> > > > +	     pipe buffer size. Pending read lowers WriteQuotaAvailable on
> > > > +	     the write side and thus affects select's ability to return
> > > > +	     more or less reliable info whether a write succeeds or not. */
> > > > +	  ULONG chunk = max_atomic_write / 2;
> > > >  	  status = NtQueryInformationFile (get_handle (), &io,
> > > >  					   &fpli, sizeof (fpli),
> > > >  					   FilePipeLocalInformation);
> > > > -	  if (NT_SUCCESS (status) && fpli.ReadDataAvailable == 0)
> > > 
> > > If the readers are serialized anyway, why fetch only half the remaining
> > > buffer size?  In that case fetching fpli.InboundQuota - 1 is as good
> > > as fetching just the half of it, isn't it?
> > 
> > It sounds reasonable. Adopted in the attached patch.
> 
> Patch looks ok.
> 
> I added one more patch:
> 
> - It occurred to me that the code is lacking a CancelIo in case cygwait
>   is waiting for NtReadFile/NtWriteFile.  Actually, calling cygwait
>   without "mask" parameter will result in cygwait performing the thread
>   cancellation by itself, but cancelling a thread does not cancel the
>   async IO started by that thread.  So I fixed the cygwait calls in
>   raw_read/raw_write to return to the caller and then call CancelIo
>   before cancelling the thread.
> 
> I planned to push one more patch:
> 
> - Drop max_atomic_write, it's == DEFAULT_PIPEBUFSIZE anyway
> 
> But then a few things came to mind that we still have to discuss.
> 
> - I think setting chunk to DEFAULT_PIPEBUFSIZE - 1 in the read case and
>   DEFAULT_PIPEBUFSIZE in the write case by default is dangerous.
>   Assuming the pipe has been created by a non-Cygwin process, the values
>   may be way too high.
> 
>   Suggestion: Actually set max_atomic_write to something useful.
>   Set max_atomic_write to DEFAULT_PIPEBUFSIZE in fhandler_pipe::create.
>   In case of stdio handles inherited from non-Cygwin processes, fetch
>   the pipe buffer size via NtQueryInformationFile in
>   dtable::init_std_file_from_handle().  Better, in a matching
>   fhandler_pipe method called from init_std_file_from_handle().
> 
> - What about calling select for writing on pipes read by non-Cygwin
>   processes?  In that case, we still can't rely on WriteQuotaAvailable,
>   just as before.
> 
>   I have a vague idea that we might want to count readers in that case,
>   but I have to think about it some more.

The current git head seems to have a bug. With or without my patch,
an sftp get of a large file fails with an error:

[yano@Express5800-S70 ~]$ sftp 192.168.0.133
yano@192.168.0.133's password:
Connected to 192.168.0.133.
sftp> get test.dat
Fetching /home/yano/test.dat to test.dat
test.dat                                       13%   66MB  66.4MB/s   00:06 ETAReceived message too long 1728053256
Ensure the remote shell produces no output for non-interactive sessions.
[yano@Express5800-S70 ~]$ sftp 192.168.0.133
yano@192.168.0.133's password:
Connected to 192.168.0.133.
sftp> get test.dat
Fetching /home/yano/test.dat to test.dat
test.dat                                       22%  111MB 110.6MB/s   00:03 ETAdo_download: parse: incomplete message
[yano@Express5800-S70 ~]$



-- 
Takashi Yano <takashi.yano@nifty.ne.jp>


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-06 13:16                                                                                                                       ` Takashi Yano
@ 2021-09-06 16:08                                                                                                                         ` Corinna Vinschen
  2021-09-06 23:39                                                                                                                           ` Takashi Yano
  0 siblings, 1 reply; 250+ messages in thread
From: Corinna Vinschen @ 2021-09-06 16:08 UTC (permalink / raw)
  To: cygwin-developers

On Sep  6 22:16, Takashi Yano wrote:
> The current git head seems to have a bug. With or without my patch,
> an sftp get of a large file fails with an error:
> 
> [yano@Express5800-S70 ~]$ sftp 192.168.0.133
> yano@192.168.0.133's password:
> Connected to 192.168.0.133.
> sftp> get test.dat
> Fetching /home/yano/test.dat to test.dat
> test.dat                                       13%   66MB  66.4MB/s   00:06 ETAReceived message too long 1728053256
> Ensure the remote shell produces no output for non-interactive sessions.
> [yano@Express5800-S70 ~]$ sftp 192.168.0.133
> yano@192.168.0.133's password:
> Connected to 192.168.0.133.
> sftp> get test.dat
> Fetching /home/yano/test.dat to test.dat
> test.dat                                       22%  111MB 110.6MB/s   00:03 ETAdo_download: parse: incomplete message
> [yano@Express5800-S70 ~]$

I bisected this down to commit 296bd3e78b52, but I'm at a loss in
terms of the cause of the problem, ATM.


Corinna


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-06 16:08                                                                                                                         ` Corinna Vinschen
@ 2021-09-06 23:39                                                                                                                           ` Takashi Yano
  2021-09-07  9:14                                                                                                                             ` Corinna Vinschen
  0 siblings, 1 reply; 250+ messages in thread
From: Takashi Yano @ 2021-09-06 23:39 UTC (permalink / raw)
  To: cygwin-developers

On Mon, 6 Sep 2021 18:08:54 +0200
Corinna Vinschen wrote:
> On Sep  6 22:16, Takashi Yano wrote:
> > The current git head seems to have a bug. With or without my patch,
> > an sftp get of a large file fails with an error:
> > 
> > [yano@Express5800-S70 ~]$ sftp 192.168.0.133
> > yano@192.168.0.133's password:
> > Connected to 192.168.0.133.
> > sftp> get test.dat
> > Fetching /home/yano/test.dat to test.dat
> > test.dat                                       13%   66MB  66.4MB/s   00:06 ETAReceived message too long 1728053256
> > Ensure the remote shell produces no output for non-interactive sessions.
> > [yano@Express5800-S70 ~]$ sftp 192.168.0.133
> > yano@192.168.0.133's password:
> > Connected to 192.168.0.133.
> > sftp> get test.dat
> > Fetching /home/yano/test.dat to test.dat
> > test.dat                                       22%  111MB 110.6MB/s   00:03 ETAdo_download: parse: incomplete message
> > [yano@Express5800-S70 ~]$
> 
> I bisected this down to commit 296bd3e78b52, but I'm at a loss in
> terms of the cause of the problem, ATM.

Thanks for bisecting this.

I am not sure this is the correct fix, but I found that the following
patch solves the issue.

diff --git a/winsup/cygwin/fhandler_pipe.cc b/winsup/cygwin/fhandler_pipe.cc
index 76ce895e2..83efb8296 100644
--- a/winsup/cygwin/fhandler_pipe.cc
+++ b/winsup/cygwin/fhandler_pipe.cc
@@ -292,7 +292,7 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
                           len1, NULL, NULL);
       if (evt && status == STATUS_PENDING)
        {
-         waitret = cygwait (evt, INFINITE, cw_cancel | cw_sig);
+         waitret = cygwait (evt, INFINITE, cw_cancel | cw_sig_restart);
          if (waitret == WAIT_OBJECT_0)
            status = io.Status;
        }
@@ -442,7 +442,7 @@ fhandler_pipe_fifo::raw_write (const void *ptr, size_t len)
        }
       if (evt && status == STATUS_PENDING)
        {
-         waitret = cygwait (evt, INFINITE, cw_cancel | cw_sig);
+         waitret = cygwait (evt, INFINITE, cw_cancel | cw_sig_restart);
          if (waitret == WAIT_OBJECT_0)
            status = io.Status;
        }

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
       [not found]           ` <20210827202440.47706fc2fc07c5e9a1bc0047@nifty.ne.jp>
       [not found]             ` <4f2cb5f3-ce9c-c617-f65f-841a5eca096e@cornell.edu>
@ 2021-09-07  3:26             ` Takashi Yano
  2021-09-07 10:50               ` Takashi Yano
  2021-09-09  3:41               ` Takashi Yano
  1 sibling, 2 replies; 250+ messages in thread
From: Takashi Yano @ 2021-09-07  3:26 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 1549 bytes --]

On Fri, 27 Aug 2021 20:24:40 +0900
Takashi Yano wrote:
> Hi Ken,
> 
> Thanks much! I tested topic/pipe branch.
> 
> [yano@cygwin-PC ~]$ scp test.dat yano@linux-server:.
> yano@linux-server's password:
> test.dat                                      100%  100MB  95.9MB/s   00:01
> [yano@cygwin-PC ~]$ scp yano@linux-server:test.dat .
> yano@linux-server's password:
> test.dat                                      100%  100MB   8.0MB/s   00:12
> 
> yano@linux-server:~$ scp yano@cygwin-PC:test.dat .
> yano@cygwin-PC's password:
> test.dat                                      100%  100MB 109.7MB/s   00:00
> yano@linux-server:~$ scp test.dat yano@cygwin-PC:.
> yano@cygwin-PC's password:
> test.dat                                      100%  100MB  31.4MB/s   00:03
> 
> As shown above, the outgoing transfer rate has improved to near the
> theoretical limit. However, the incoming transfer rate has not
> improved much.
> 
> I dug further and found that the first attached patch solves the
> issue, as follows.
> 
> [yano@cygwin-PC ~]$ scp yano@linux-server:test.dat .
> yano@linux-server's password:
> test.dat                                      100%  100MB 112.8MB/s   00:00
> 
> yano@linux-server2:~$ scp test.dat yano@cygwin-PC:.
> yano@cygwin-PC's password:
> test.dat                                      100%  100MB 102.5MB/s   00:00

With this patch (2e36ae2e), I found a problem: mintty goes into a
high-load state if several keys are typed quickly.

Therefore, I would like to propose the attached patch.
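
For reference, the polling backoff that the proposed patch restores in
select.cc can be modelled as below.  This is a simplified, portable
sketch: wait_ms, next_sleep_time and zero_timeout_polls are hypothetical
helper names, and only the `sleep_time >> 3` and `sleep_time < 80`
expressions are taken from the patch itself.

```cpp
#include <cstdint>

// Timeout, in milliseconds, handed to cygwait for a given sleep_time;
// mirrors the `sleep_time >> 3` expression in select.cc.
constexpr std::uint32_t
wait_ms (std::uint32_t sleep_time)
{
  return sleep_time >> 3;
}

// Backoff step: bump sleep_time once per loop iteration until it
// reaches 80, then keep it there.
constexpr std::uint32_t
next_sleep_time (std::uint32_t sleep_time)
{
  return sleep_time < 80 ? sleep_time + 1 : sleep_time;
}

// Number of polls made with a zero timeout before the first real wait.
constexpr unsigned
zero_timeout_polls ()
{
  unsigned polls = 0;
  for (std::uint32_t s = 0; wait_ms (s) == 0; s = next_sleep_time (s))
    ++polls;
  return polls;
}
```

In this model the loop polls eight times with a zero timeout before it
starts to wait, and the longest wait is 10 ms.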

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>

[-- Attachment #2: 0001-Cygwin-select-Introduce-select_evt-event-for-pipe.patch --]
[-- Type: application/octet-stream, Size: 6936 bytes --]

From a455ae9a0ed871e5f1e9ab5cf89ffdcbe34a49db Mon Sep 17 00:00:00 2001
From: Takashi Yano <takashi.yano@nifty.ne.jp>
Date: Tue, 7 Sep 2021 09:02:55 +0900
Subject: [PATCH] Cygwin: select: Introduce select_evt event for pipe.

- This patch reverts "Cygwin: select: Improve select/poll response",
  and introduces select_evt event which notifies pipe status change.
---
 winsup/cygwin/fhandler.cc      |  1 +
 winsup/cygwin/fhandler.h       |  3 +++
 winsup/cygwin/fhandler_pipe.cc | 28 +++++++++++++++++++++++
 winsup/cygwin/select.cc        | 41 +++++++++-------------------------
 4 files changed, 42 insertions(+), 31 deletions(-)

diff --git a/winsup/cygwin/fhandler.cc b/winsup/cygwin/fhandler.cc
index f0c1b68f1..265e8ee59 100644
--- a/winsup/cygwin/fhandler.cc
+++ b/winsup/cygwin/fhandler.cc
@@ -1464,6 +1464,7 @@ fhandler_base::fhandler_base () :
   _refcnt (0),
   openflags (0),
   unique_id (0),
+  select_evt (NULL),
   archetype (NULL),
   usecount (0)
 {
diff --git a/winsup/cygwin/fhandler.h b/winsup/cygwin/fhandler.h
index bb7eb09ce..9022aa09c 100644
--- a/winsup/cygwin/fhandler.h
+++ b/winsup/cygwin/fhandler.h
@@ -217,6 +217,7 @@ class fhandler_base
   void set_ino (ino_t i) { ino = i; }
 
   HANDLE read_state;
+  HANDLE select_evt;
 
  public:
   LONG inc_refcnt () {return InterlockedIncrement (&_refcnt);}
@@ -520,6 +521,8 @@ public:
     fh->copy_from (this);
     return fh;
   }
+
+  HANDLE get_select_evt () { return select_evt; }
 };
 
 struct wsa_event
diff --git a/winsup/cygwin/fhandler_pipe.cc b/winsup/cygwin/fhandler_pipe.cc
index 83efb8296..7cce4564c 100644
--- a/winsup/cygwin/fhandler_pipe.cc
+++ b/winsup/cygwin/fhandler_pipe.cc
@@ -367,6 +367,9 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
       CancelIo (get_handle ());
       pthread::static_cancel_self ();
     }
+  if (select_evt && nbytes)
+    for (ULONG i = 0; i < get_obj_handle_count (select_evt); i++)
+      SetEvent (select_evt);
   len = nbytes;
 }
 
@@ -489,6 +492,9 @@ fhandler_pipe_fifo::raw_write (const void *ptr, size_t len)
       CancelIo (get_handle ());
       pthread::static_cancel_self ();
     }
+  if (select_evt && nbytes)
+    for (ULONG i = 0; i < get_obj_handle_count (select_evt); i++)
+      SetEvent (select_evt);
   return nbytes ?: -1;
 }
 
@@ -497,6 +503,8 @@ fhandler_pipe::fixup_after_fork (HANDLE parent)
 {
   if (read_mtx)
     fork_fixup (parent, read_mtx, "read_mtx");
+  if (select_evt)
+    fork_fixup (parent, select_evt, "select_evt");
   fhandler_base::fixup_after_fork (parent);
 }
 
@@ -518,6 +526,15 @@ fhandler_pipe::dup (fhandler_base *child, int flags)
       ftp->close ();
       res = -1;
     }
+  else if (select_evt &&
+	   !DuplicateHandle (GetCurrentProcess (), select_evt,
+			    GetCurrentProcess (), &ftp->select_evt,
+			    0, !(flags & O_CLOEXEC), DUPLICATE_SAME_ACCESS))
+    {
+      __seterrno ();
+      ftp->close ();
+      res = -1;
+    }
 
   debug_printf ("res %d", res);
   return res;
@@ -528,6 +545,12 @@ fhandler_pipe::close ()
 {
   if (read_mtx)
     CloseHandle (read_mtx);
+  if (select_evt)
+    {
+      for (ULONG i = 0; i < get_obj_handle_count (select_evt); i++)
+	SetEvent (select_evt);
+      CloseHandle (select_evt);
+    }
   return fhandler_base::close ();
 }
 
@@ -747,6 +770,11 @@ fhandler_pipe::create (fhandler_pipe *fhs[2], unsigned psize, int mode)
 	  fhs[0]->set_read_mutex (mtx);
 	  res = 0;
 	}
+      fhs[0]->select_evt = CreateEvent (&sa, FALSE, FALSE, NULL);
+      if (fhs[0]->select_evt)
+	DuplicateHandle (GetCurrentProcess (), fhs[0]->select_evt,
+			 GetCurrentProcess (), &fhs[1]->select_evt,
+			 0, 1, DUPLICATE_SAME_ACCESS);
     }
 
   debug_printf ("%R = pipe([%p, %p], %d, %y)", res, fhs[0], fhs[1], psize, mode);
diff --git a/winsup/cygwin/select.cc b/winsup/cygwin/select.cc
index ac2fd227e..19efe9e95 100644
--- a/winsup/cygwin/select.cc
+++ b/winsup/cygwin/select.cc
@@ -734,7 +734,6 @@ thread_pipe (void *arg)
   select_pipe_info *pi = (select_pipe_info *) arg;
   DWORD sleep_time = 0;
   bool looping = true;
-  DWORD t0 = GetTickCount ();
 
   while (looping)
     {
@@ -754,12 +753,7 @@ thread_pipe (void *arg)
 	break;
       cygwait (pi->bye, sleep_time >> 3);
       if (sleep_time < 80)
-	{
-	  DWORD t1 = GetTickCount ();
-	  if (t0 != t1)
-	    ++sleep_time;
-	  t0 = t1;
-	}
+	++sleep_time;
       if (pi->stop_thread)
 	break;
     }
@@ -776,7 +770,9 @@ start_thread_pipe (select_record *me, select_stuff *stuff)
     {
       pi->start = &stuff->start;
       pi->stop_thread = false;
-      pi->bye = CreateEvent (&sec_none_nih, TRUE, FALSE, NULL);
+      pi->bye = me->fh->get_select_evt ();
+      if (pi->bye == NULL)
+	pi->bye = CreateEvent (&sec_none_nih, TRUE, FALSE, NULL);
       pi->thread = new cygthread (thread_pipe, pi, "pipesel");
       me->h = *pi->thread;
       if (!me->h)
@@ -786,7 +782,7 @@ start_thread_pipe (select_record *me, select_stuff *stuff)
 }
 
 static void
-pipe_cleanup (select_record *, select_stuff *stuff)
+pipe_cleanup (select_record *me, select_stuff *stuff)
 {
   select_pipe_info *pi = (select_pipe_info *) stuff->device_specific_pipe;
   if (!pi)
@@ -796,7 +792,8 @@ pipe_cleanup (select_record *, select_stuff *stuff)
       pi->stop_thread = true;
       SetEvent (pi->bye);
       pi->thread->detach ();
-      CloseHandle (pi->bye);
+      if (me->fh->get_select_evt () == NULL)
+	CloseHandle (pi->bye);
     }
   delete pi;
   stuff->device_specific_pipe = NULL;
@@ -935,7 +932,6 @@ thread_fifo (void *arg)
   select_fifo_info *pi = (select_fifo_info *) arg;
   DWORD sleep_time = 0;
   bool looping = true;
-  DWORD t0 = GetTickCount ();
 
   while (looping)
     {
@@ -955,12 +951,7 @@ thread_fifo (void *arg)
 	break;
       cygwait (pi->bye, sleep_time >> 3);
       if (sleep_time < 80)
-	{
-	  DWORD t1 = GetTickCount ();
-	  if (t0 != t1)
-	    ++sleep_time;
-	  t0 = t1;
-	}
+	++sleep_time;
       if (pi->stop_thread)
 	break;
     }
@@ -1136,7 +1127,6 @@ thread_console (void *arg)
   select_console_info *ci = (select_console_info *) arg;
   DWORD sleep_time = 0;
   bool looping = true;
-  DWORD t0 = GetTickCount ();
 
   while (looping)
     {
@@ -1156,12 +1146,7 @@ thread_console (void *arg)
 	break;
       cygwait (ci->bye, sleep_time >> 3);
       if (sleep_time < 80)
-	{
-	  DWORD t1 = GetTickCount ();
-	  if (t0 != t1)
-	    ++sleep_time;
-	  t0 = t1;
-	}
+	++sleep_time;
       if (ci->stop_thread)
 	break;
     }
@@ -1381,7 +1366,6 @@ thread_pty_slave (void *arg)
   select_pipe_info *pi = (select_pipe_info *) arg;
   DWORD sleep_time = 0;
   bool looping = true;
-  DWORD t0 = GetTickCount ();
 
   while (looping)
     {
@@ -1401,12 +1385,7 @@ thread_pty_slave (void *arg)
 	break;
       cygwait (pi->bye, sleep_time >> 3);
       if (sleep_time < 80)
-	{
-	  DWORD t1 = GetTickCount ();
-	  if (t0 != t1)
-	    ++sleep_time;
-	  t0 = t1;
-	}
+	++sleep_time;
       if (pi->stop_thread)
 	break;
     }
-- 
2.33.0



* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-06 23:39                                                                                                                           ` Takashi Yano
@ 2021-09-07  9:14                                                                                                                             ` Corinna Vinschen
  2021-09-07 11:03                                                                                                                               ` Takashi Yano
  0 siblings, 1 reply; 250+ messages in thread
From: Corinna Vinschen @ 2021-09-07  9:14 UTC (permalink / raw)
  To: cygwin-developers

On Sep  7 08:39, Takashi Yano wrote:
> On Mon, 6 Sep 2021 18:08:54 +0200
> Corinna Vinschen wrote:
> > On Sep  6 22:16, Takashi Yano wrote:
> > > The current git head seems to have a bug. With or without my patch,
> > > an sftp get of a large file fails with an error:
> > > 
> > > [yano@Express5800-S70 ~]$ sftp 192.168.0.133
> > > yano@192.168.0.133's password:
> > > Connected to 192.168.0.133.
> > > sftp> get test.dat
> > > Fetching /home/yano/test.dat to test.dat
> > > test.dat                                       13%   66MB  66.4MB/s   00:06 ETAReceived message too long 1728053256
> > > Ensure the remote shell produces no output for non-interactive sessions.
> > > [yano@Express5800-S70 ~]$ sftp 192.168.0.133
> > > yano@192.168.0.133's password:
> > > Connected to 192.168.0.133.
> > > sftp> get test.dat
> > > Fetching /home/yano/test.dat to test.dat
> > > test.dat                                       22%  111MB 110.6MB/s   00:03 ETAdo_download: parse: incomplete message
> > > [yano@Express5800-S70 ~]$
> > 
> > I bisected this down to commit 296bd3e78b52, but I'm at a loss in
> > terms of the cause of the problem, ATM.
> 
> Thanks for bisecting this.
> 
> I am not sure this is the correct fix, but I found that the following
> patch solves the issue.

Thanks for the patch!  It's not correct as such, because it enables
SA_RESTART behaviour unconditionally, but it gave me the right hint.

The underlying problem is that in case of a signal, the CancelIo call
was missing.  The signal was processed, but the IO was still ongoing
and so data was read or written without the application's knowledge.

Actually we can always call CancelIo.  It doesn't break the information
in the IO_STATUS_BLOCK if the IO was already finished.  It just sets
io.Status to STATUS_CANCELLED and io.Information to the number of bytes
processed if it really canceled the ongoing IO.
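
The effect of the missing CancelIo can be illustrated with a small
model.  This is not Windows or Cygwin code: pipe_model, pending_read,
cancel_io, os_completes and offset_seen_by_retry are hypothetical
stand-ins, and partial transfers (io.Information after a cancel) are
ignored for simplicity.

```cpp
#include <cstddef>

// Hypothetical model of one pipe end: it only tracks how many bytes
// have already been taken out of the pipe.
struct pipe_model
{
  std::size_t consumed;
};

// Hypothetical model of an asynchronous read that is still in flight.
struct pending_read
{
  std::size_t want;   // bytes requested
  bool cancelled;     // set by the CancelIo stand-in
};

// Stand-in for CancelIo: the in-flight request transfers no data.
constexpr void
cancel_io (pending_read &rd)
{
  rd.cancelled = true;
}

// Stand-in for the OS finishing the request some time later.
constexpr void
os_completes (pipe_model &p, const pending_read &rd)
{
  if (!rd.cancelled)
    p.consumed += rd.want;
}

// A read is started, a signal interrupts the wait, and the application
// retries.  Returns the stream offset at which the retry starts.
constexpr std::size_t
offset_seen_by_retry (bool cancel_on_signal)
{
  pipe_model p { 0 };
  pending_read rd { 4, false };
  // The signal arrives here; the application is told nothing was read.
  if (cancel_on_signal)
    cancel_io (rd);
  os_completes (p, rd);   // a stale request may still finish
  return p.consumed;
}
```

Without the cancel the retry starts at offset 4, so four bytes have gone
missing from the application's point of view; with the cancel it starts
at offset 0.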

I pushed a matching patch.


Thanks again!
Corinna


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-07  3:26             ` Takashi Yano
@ 2021-09-07 10:50               ` Takashi Yano
  2021-09-08  0:07                 ` Takashi Yano
  2021-09-09  3:41               ` Takashi Yano
  1 sibling, 1 reply; 250+ messages in thread
From: Takashi Yano @ 2021-09-07 10:50 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 1753 bytes --]

On Tue, 7 Sep 2021 12:26:31 +0900
Takashi Yano wrote:
> On Fri, 27 Aug 2021 20:24:40 +0900
> Takashi Yano wrote:
> > Hi Ken,
> > 
> > Thanks much! I tested topic/pipe branch.
> > 
> > [yano@cygwin-PC ~]$ scp test.dat yano@linux-server:.
> > yano@linux-server's password:
> > test.dat                                      100%  100MB  95.9MB/s   00:01
> > [yano@cygwin-PC ~]$ scp yano@linux-server:test.dat .
> > yano@linux-server's password:
> > test.dat                                      100%  100MB   8.0MB/s   00:12
> > 
> > yano@linux-server:~$ scp yano@cygwin-PC:test.dat .
> > yano@cygwin-PC's password:
> > test.dat                                      100%  100MB 109.7MB/s   00:00
> > yano@linux-server:~$ scp test.dat yano@cygwin-PC:.
> > yano@cygwin-PC's password:
> > test.dat                                      100%  100MB  31.4MB/s   00:03
> > 
> > As shown above, the outgoing transfer rate has improved to near the
> > theoretical limit. However, the incoming transfer rate has not
> > improved much.
> > 
> > I dug further and found that the first attached patch solves the
> > issue, as follows.
> > 
> > [yano@cygwin-PC ~]$ scp yano@linux-server:test.dat .
> > yano@linux-server's password:
> > test.dat                                      100%  100MB 112.8MB/s   00:00
> > 
> > yano@linux-server2:~$ scp test.dat yano@cygwin-PC:.
> > yano@cygwin-PC's password:
> > test.dat                                      100%  100MB 102.5MB/s   00:00
> 
> With this patch (2e36ae2e), I found a problem: mintty goes into a
> high-load state if several keys are typed quickly.
> 
> Therefore, I would like to propose the attached patch.

I revised this patch to fit the current git head of the topic/pipe branch.

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>

[-- Attachment #2: 0001-Cygwin-select-Introduce-select_evt-event-for-pipe.patch --]
[-- Type: application/octet-stream, Size: 6975 bytes --]

From 0a6901492631e5abb0dedf8fb70159fd7205c9e8 Mon Sep 17 00:00:00 2001
From: Takashi Yano <takashi.yano@nifty.ne.jp>
Date: Tue, 7 Sep 2021 19:38:58 +0900
Subject: [PATCH] Cygwin: select: Introduce select_evt event for pipe.

- This patch reverts "Cygwin: select: Improve select/poll response",
  and introduces select_evt event which notifies pipe status change.
---
 winsup/cygwin/fhandler.cc      |  1 +
 winsup/cygwin/fhandler.h       |  3 +++
 winsup/cygwin/fhandler_pipe.cc | 28 +++++++++++++++++++++++
 winsup/cygwin/select.cc        | 41 +++++++++-------------------------
 4 files changed, 42 insertions(+), 31 deletions(-)

diff --git a/winsup/cygwin/fhandler.cc b/winsup/cygwin/fhandler.cc
index f0c1b68f1..265e8ee59 100644
--- a/winsup/cygwin/fhandler.cc
+++ b/winsup/cygwin/fhandler.cc
@@ -1464,6 +1464,7 @@ fhandler_base::fhandler_base () :
   _refcnt (0),
   openflags (0),
   unique_id (0),
+  select_evt (NULL),
   archetype (NULL),
   usecount (0)
 {
diff --git a/winsup/cygwin/fhandler.h b/winsup/cygwin/fhandler.h
index bb7eb09ce..9022aa09c 100644
--- a/winsup/cygwin/fhandler.h
+++ b/winsup/cygwin/fhandler.h
@@ -217,6 +217,7 @@ class fhandler_base
   void set_ino (ino_t i) { ino = i; }
 
   HANDLE read_state;
+  HANDLE select_evt;
 
  public:
   LONG inc_refcnt () {return InterlockedIncrement (&_refcnt);}
@@ -520,6 +521,8 @@ public:
     fh->copy_from (this);
     return fh;
   }
+
+  HANDLE get_select_evt () { return select_evt; }
 };
 
 struct wsa_event
diff --git a/winsup/cygwin/fhandler_pipe.cc b/winsup/cygwin/fhandler_pipe.cc
index 4bd807a09..d995acbeb 100644
--- a/winsup/cygwin/fhandler_pipe.cc
+++ b/winsup/cygwin/fhandler_pipe.cc
@@ -376,6 +376,9 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
     }
   else if (status == STATUS_THREAD_CANCELED)
     pthread::static_cancel_self ();
+  if (select_evt && nbytes)
+    for (ULONG i = 0; i < get_obj_handle_count (select_evt); i++)
+      SetEvent (select_evt);
   len = nbytes;
 }
 
@@ -507,6 +510,9 @@ fhandler_pipe_fifo::raw_write (const void *ptr, size_t len)
     set_errno (EINTR);
   else if (status == STATUS_THREAD_CANCELED)
     pthread::static_cancel_self ();
+  if (select_evt && nbytes)
+    for (ULONG i = 0; i < get_obj_handle_count (select_evt); i++)
+      SetEvent (select_evt);
   return nbytes ?: -1;
 }
 
@@ -515,6 +521,8 @@ fhandler_pipe::fixup_after_fork (HANDLE parent)
 {
   if (read_mtx)
     fork_fixup (parent, read_mtx, "read_mtx");
+  if (select_evt)
+    fork_fixup (parent, select_evt, "select_evt");
   fhandler_base::fixup_after_fork (parent);
 }
 
@@ -536,6 +544,15 @@ fhandler_pipe::dup (fhandler_base *child, int flags)
       ftp->close ();
       res = -1;
     }
+  else if (select_evt &&
+	   !DuplicateHandle (GetCurrentProcess (), select_evt,
+			    GetCurrentProcess (), &ftp->select_evt,
+			    0, !(flags & O_CLOEXEC), DUPLICATE_SAME_ACCESS))
+    {
+      __seterrno ();
+      ftp->close ();
+      res = -1;
+    }
 
   debug_printf ("res %d", res);
   return res;
@@ -546,6 +563,12 @@ fhandler_pipe::close ()
 {
   if (read_mtx)
     CloseHandle (read_mtx);
+  if (select_evt)
+    {
+      for (ULONG i = 0; i < get_obj_handle_count (select_evt); i++)
+	SetEvent (select_evt);
+      CloseHandle (select_evt);
+    }
   return fhandler_base::close ();
 }
 
@@ -765,6 +788,11 @@ fhandler_pipe::create (fhandler_pipe *fhs[2], unsigned psize, int mode)
 	  fhs[0]->set_read_mutex (mtx);
 	  res = 0;
 	}
+      fhs[0]->select_evt = CreateEvent (&sa, FALSE, FALSE, NULL);
+      if (fhs[0]->select_evt)
+	DuplicateHandle (GetCurrentProcess (), fhs[0]->select_evt,
+			 GetCurrentProcess (), &fhs[1]->select_evt,
+			 0, 1, DUPLICATE_SAME_ACCESS);
     }
 
   debug_printf ("%R = pipe([%p, %p], %d, %y)", res, fhs[0], fhs[1], psize, mode);
diff --git a/winsup/cygwin/select.cc b/winsup/cygwin/select.cc
index ac2fd227e..19efe9e95 100644
--- a/winsup/cygwin/select.cc
+++ b/winsup/cygwin/select.cc
@@ -734,7 +734,6 @@ thread_pipe (void *arg)
   select_pipe_info *pi = (select_pipe_info *) arg;
   DWORD sleep_time = 0;
   bool looping = true;
-  DWORD t0 = GetTickCount ();
 
   while (looping)
     {
@@ -754,12 +753,7 @@ thread_pipe (void *arg)
 	break;
       cygwait (pi->bye, sleep_time >> 3);
       if (sleep_time < 80)
-	{
-	  DWORD t1 = GetTickCount ();
-	  if (t0 != t1)
-	    ++sleep_time;
-	  t0 = t1;
-	}
+	++sleep_time;
       if (pi->stop_thread)
 	break;
     }
@@ -776,7 +770,9 @@ start_thread_pipe (select_record *me, select_stuff *stuff)
     {
       pi->start = &stuff->start;
       pi->stop_thread = false;
-      pi->bye = CreateEvent (&sec_none_nih, TRUE, FALSE, NULL);
+      pi->bye = me->fh->get_select_evt ();
+      if (pi->bye == NULL)
+	pi->bye = CreateEvent (&sec_none_nih, TRUE, FALSE, NULL);
       pi->thread = new cygthread (thread_pipe, pi, "pipesel");
       me->h = *pi->thread;
       if (!me->h)
@@ -786,7 +782,7 @@ start_thread_pipe (select_record *me, select_stuff *stuff)
 }
 
 static void
-pipe_cleanup (select_record *, select_stuff *stuff)
+pipe_cleanup (select_record *me, select_stuff *stuff)
 {
   select_pipe_info *pi = (select_pipe_info *) stuff->device_specific_pipe;
   if (!pi)
@@ -796,7 +792,8 @@ pipe_cleanup (select_record *, select_stuff *stuff)
       pi->stop_thread = true;
       SetEvent (pi->bye);
       pi->thread->detach ();
-      CloseHandle (pi->bye);
+      if (me->fh->get_select_evt () == NULL)
+	CloseHandle (pi->bye);
     }
   delete pi;
   stuff->device_specific_pipe = NULL;
@@ -935,7 +932,6 @@ thread_fifo (void *arg)
   select_fifo_info *pi = (select_fifo_info *) arg;
   DWORD sleep_time = 0;
   bool looping = true;
-  DWORD t0 = GetTickCount ();
 
   while (looping)
     {
@@ -955,12 +951,7 @@ thread_fifo (void *arg)
 	break;
       cygwait (pi->bye, sleep_time >> 3);
       if (sleep_time < 80)
-	{
-	  DWORD t1 = GetTickCount ();
-	  if (t0 != t1)
-	    ++sleep_time;
-	  t0 = t1;
-	}
+	++sleep_time;
       if (pi->stop_thread)
 	break;
     }
@@ -1136,7 +1127,6 @@ thread_console (void *arg)
   select_console_info *ci = (select_console_info *) arg;
   DWORD sleep_time = 0;
   bool looping = true;
-  DWORD t0 = GetTickCount ();
 
   while (looping)
     {
@@ -1156,12 +1146,7 @@ thread_console (void *arg)
 	break;
       cygwait (ci->bye, sleep_time >> 3);
       if (sleep_time < 80)
-	{
-	  DWORD t1 = GetTickCount ();
-	  if (t0 != t1)
-	    ++sleep_time;
-	  t0 = t1;
-	}
+	++sleep_time;
       if (ci->stop_thread)
 	break;
     }
@@ -1381,7 +1366,6 @@ thread_pty_slave (void *arg)
   select_pipe_info *pi = (select_pipe_info *) arg;
   DWORD sleep_time = 0;
   bool looping = true;
-  DWORD t0 = GetTickCount ();
 
   while (looping)
     {
@@ -1401,12 +1385,7 @@ thread_pty_slave (void *arg)
 	break;
       cygwait (pi->bye, sleep_time >> 3);
       if (sleep_time < 80)
-	{
-	  DWORD t1 = GetTickCount ();
-	  if (t0 != t1)
-	    ++sleep_time;
-	  t0 = t1;
-	}
+	++sleep_time;
       if (pi->stop_thread)
 	break;
     }
-- 
2.33.0



* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-07  9:14                                                                                                                             ` Corinna Vinschen
@ 2021-09-07 11:03                                                                                                                               ` Takashi Yano
  0 siblings, 0 replies; 250+ messages in thread
From: Takashi Yano @ 2021-09-07 11:03 UTC (permalink / raw)
  To: cygwin-developers

On Tue, 7 Sep 2021 11:14:28 +0200
Corinna Vinschen wrote:
> On Sep  7 08:39, Takashi Yano wrote:
> > On Mon, 6 Sep 2021 18:08:54 +0200
> > Corinna Vinschen wrote:
> > > On Sep  6 22:16, Takashi Yano wrote:
> > > > The current git head seems to have a bug. With or without my patch,
> > > > an sftp get of a large file fails with an error:
> > > > 
> > > > [yano@Express5800-S70 ~]$ sftp 192.168.0.133
> > > > yano@192.168.0.133's password:
> > > > Connected to 192.168.0.133.
> > > > sftp> get test.dat
> > > > Fetching /home/yano/test.dat to test.dat
> > > > test.dat                                       13%   66MB  66.4MB/s   00:06 ETAReceived message too long 1728053256
> > > > Ensure the remote shell produces no output for non-interactive sessions.
> > > > [yano@Express5800-S70 ~]$ sftp 192.168.0.133
> > > > yano@192.168.0.133's password:
> > > > Connected to 192.168.0.133.
> > > > sftp> get test.dat
> > > > Fetching /home/yano/test.dat to test.dat
> > > > test.dat                                       22%  111MB 110.6MB/s   00:03 ETAdo_download: parse: incomplete message
> > > > [yano@Express5800-S70 ~]$
> > > 
> > > I bisected this down to commit 296bd3e78b52, but I'm at a loss in
> > > terms of the cause of the problem, ATM.
> > 
> > Thanks for bisecting this.
> > 
> > I am not sure this is the correct thing, however, found the following
> > patch solves the issue.
> 
> Thanks for the patch!  It's not correct as such, because it enables
> SA_RESTART behaviour unconditionally, but it gave me the right hint.
> 
> The underlying problem is that in case of a signal, the CancelIo call
> was missing.  The signal was processed, but the IO was still ongoing
> and so data was read or written without the application's knowledge.
> 
> Actually we can always call CancelIo.  It doesn't break the information
> in the IO_STATUS_BLOCK if the IO was already finished.  It just sets
> io.Status to STATUS_CANCELLED and io.Information to the number of bytes
> processed if it really canceled the ongoing IO.
> 
> I pushed a matching patch.

I have confirmed that your patch solves the issue. Thanks!


-- 
Takashi Yano <takashi.yano@nifty.ne.jp>


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-06 12:49                                                                                                                     ` Corinna Vinschen
  2021-09-06 13:16                                                                                                                       ` Takashi Yano
@ 2021-09-07 16:14                                                                                                                       ` Ken Brown
  2021-09-07 18:26                                                                                                                         ` Corinna Vinschen
  1 sibling, 1 reply; 250+ messages in thread
From: Ken Brown @ 2021-09-07 16:14 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 1353 bytes --]

On 9/6/2021 8:49 AM, Corinna Vinschen wrote:
> - I think setting chunk to DEFAULT_PIPEBUFSIZE - 1 in the read case and
>    DEFAULT_PIPEBUFSIZE in the write case by default is dangerous.
>    Assuming the pipe has been created by a non-Cygwin process, the values
>    may be way too high.
> 
>    Suggestion: Actually set max_atomic_write to something useful.
>    Set max_atomic_write to DEFAULT_PIPEBUFSIZE in fhandler_pipe::create.
>    In case of stdio handles inherited from non-Cygwin processes, fetch
>    the pipe buffer size via NtQueryInformationFile in
>    dtable::init_std_file_from_handle().  Better, in a matching
>    fhandler_pipe method called from init_std_file_from_handle().

How about something like the attached (untested)?

> - What about calling select for writing on pipes read by non-Cygwin
>    processes?  In that case, we still can't rely on WriteQuotaAvailable,
>    just as before.
> 
>    I have a vague idea that we might want to count readers in that case,
>    but I have to think about it some more.

Even if we count readers, we have no way of knowing whether a pending read has 
reduced WriteQuotaAvailable to 0.  Maybe this is a case where we should impose 
some artificial timeout, after which we report write ready.  Falsely reporting 
write ready in this corner case seems better than risking a deadlock.
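
A minimal sketch of such a timeout, purely illustrative: the class
write_ready_guard and its members are hypothetical names, not part of
Cygwin, and the caller is assumed to pass in the observed
WriteQuotaAvailable value and a monotonic time in msec.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not Cygwin code: report "write ready" if quota is
// available, or if the quota has been stuck at 0 for at least timeout_ms.
class write_ready_guard
{
  uint64_t timeout_ms;
  bool stalled = false;          // true while quota has been seen as 0
  uint64_t stalled_since_ms = 0; // time of the first zero-quota observation

public:
  explicit write_ready_guard (uint64_t timeout) : timeout_ms (timeout)
  {
    assert (timeout > 0);
  }

  // quota: observed WriteQuotaAvailable; now_ms: monotonic time in msec.
  bool write_ready (uint64_t quota, uint64_t now_ms)
  {
    if (quota > 0)
      {
        stalled = false;         // real progress, reset the timer
        return true;
      }
    if (!stalled)
      {
        stalled = true;
        stalled_since_ms = now_ms;
      }
    // Possibly a false positive, accepted to avoid a deadlock.
    return now_ms - stalled_since_ms >= timeout_ms;
  }
};
```

The timer restarts whenever a nonzero quota is seen, so only a quota
that stays at 0 for the whole timeout is reported as ready.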

Ken

[-- Attachment #2: pipe_buf.diff --]
[-- Type: text/plain, Size: 3225 bytes --]

diff --git a/winsup/cygwin/dtable.cc b/winsup/cygwin/dtable.cc
index 8085e656e..a638a5995 100644
--- a/winsup/cygwin/dtable.cc
+++ b/winsup/cygwin/dtable.cc
@@ -406,6 +406,11 @@ dtable::init_std_file_from_handle (int fd, HANDLE handle)
 	}
       if (!fh->init (handle, access, bin))
 	api_fatal ("couldn't initialize fd %d for %s", fd, fh->get_name ());
+      if (fh->ispipe ())
+	{
+	  fhandler_pipe *fhp = (fhandler_pipe *) fh;
+	  fhp->set_pipe_buf_size ();
+	}
 
       fh->open_setup (openflags);
       fh->usecount = 0;
diff --git a/winsup/cygwin/fhandler.h b/winsup/cygwin/fhandler.h
index bb7eb09ce..7aed089eb 100644
--- a/winsup/cygwin/fhandler.h
+++ b/winsup/cygwin/fhandler.h
@@ -1172,7 +1172,7 @@ class fhandler_socket_unix : public fhandler_socket
 class fhandler_pipe_fifo: public fhandler_base
 {
  protected:
-  size_t max_atomic_write;
+  size_t pipe_buf_size;
 
  public:
   fhandler_pipe_fifo ();
@@ -1192,6 +1192,7 @@ public:
 
   bool ispipe() const { return true; }
   void set_read_mutex (HANDLE mtx) { read_mtx = mtx; }
+  void set_pipe_buf_size ();
 
   void set_popen_pid (pid_t pid) {popen_pid = pid;}
   pid_t get_popen_pid () const {return popen_pid;}
diff --git a/winsup/cygwin/fhandler_pipe.cc b/winsup/cygwin/fhandler_pipe.cc
index 4bd807a09..c75476040 100644
--- a/winsup/cygwin/fhandler_pipe.cc
+++ b/winsup/cygwin/fhandler_pipe.cc
@@ -29,7 +29,7 @@ STATUS_PIPE_EMPTY simply means there's no data to be read. */
 		   || _s == STATUS_PIPE_EMPTY; })
 
 fhandler_pipe_fifo::fhandler_pipe_fifo ()
-  : fhandler_base (), max_atomic_write (DEFAULT_PIPEBUFSIZE)
+  : fhandler_base (), pipe_buf_size (DEFAULT_PIPEBUFSIZE)
 {
 }
 
@@ -269,7 +269,7 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
 	     buffer size - 1. Pending read lowers WriteQuotaAvailable on
 	     the write side and thus affects select's ability to return
 	     more or less reliable info whether a write succeeds or not. */
-	  ULONG chunk = max_atomic_write - 1;
+	  ULONG chunk = pipe_buf_size - 1;
 	  status = NtQueryInformationFile (get_handle (), &io,
 					   &fpli, sizeof (fpli),
 					   FilePipeLocalInformation);
@@ -391,12 +391,12 @@ fhandler_pipe_fifo::raw_write (const void *ptr, size_t len)
   if (!len)
     return 0;
 
-  if (len <= max_atomic_write)
+  if (len <= pipe_buf_size)
     chunk = len;
   else if (is_nonblocking ())
-    chunk = len = max_atomic_write;
+    chunk = len = pipe_buf_size;
   else
-    chunk = max_atomic_write;
+    chunk = pipe_buf_size;
 
   /* Create a wait event if the pipe or fifo is in blocking mode. */
   if (!is_nonblocking () && !(evt = CreateEvent (NULL, false, false, NULL)))
@@ -894,6 +894,21 @@ nt_create (LPSECURITY_ATTRIBUTES sa_ptr, HANDLE &r, HANDLE &w,
   return 0;
 }
 
+/* Called by dtable::init_std_file_from_handle for stdio handles
+   inherited from non-Cygwin processes. */
+void
+fhandler_pipe::set_pipe_buf_size ()
+{
+  NTSTATUS status;
+  IO_STATUS_BLOCK io;
+  FILE_PIPE_LOCAL_INFORMATION fpli;
+
+  status = NtQueryInformationFile (get_handle (), &io, &fpli, sizeof fpli,
+				   FilePipeLocalInformation);
+  if (NT_SUCCESS (status))
+    pipe_buf_size = fpli.InboundQuota;
+}
+
 int
 fhandler_pipe::ioctl (unsigned int cmd, void *p)
 {


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-07 16:14                                                                                                                       ` Ken Brown
@ 2021-09-07 18:26                                                                                                                         ` Corinna Vinschen
  0 siblings, 0 replies; 250+ messages in thread
From: Corinna Vinschen @ 2021-09-07 18:26 UTC (permalink / raw)
  To: cygwin-developers

On Sep  7 12:14, Ken Brown wrote:
> On 9/6/2021 8:49 AM, Corinna Vinschen wrote:
> > - I think setting chunk to DEFAULT_PIPEBUFSIZE - 1 in the read case and
> >    DEFAULT_PIPEBUFSIZE in the write case by default is dangerous.
> >    Assuming the pipe has been created by a non-Cygwin process, the values
> >    may be way too high.
> > 
> >    Suggestion: Actually set max_atomic_write to something useful.
> >    Set max_atomic_write to DEFAULT_PIPEBUFSIZE in fhandler_pipe::create.
> >    In case of stdio handles inherited from non-Cygwin processes, fetch
> >    the pipe buffer size via NtQueryInformationFile in
> >    dtable::init_std_file_from_handle().  Better, in a matching
> >    fhandler_pipe method called from init_std_file_from_handle().
> 
> How about something like the attached (untested)?

LGTM.  I like the name change!

> > - What about calling select for writing on pipes read by non-Cygwin
> >    processes?  In that case, we still can't rely on WriteQuotaAvailable,
> >    just as before.
> > 
> >    I have a vague idea that we might want to count readers in that case,
> >    but I have to think about it some more.
> 
> Even if we count readers, we have no way of knowing whether a pending read
> has reduced WriteQuotaAvailable to 0.  Maybe this is a case where we should
> impose some artificial timeout, after which we report write ready.  Falsely
> reporting write ready in this corner case seems better than risking a
> deadlock.

Yeah, it's an almost hopeless case.  A timeout may be a way out.


Corinna


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-07 10:50               ` Takashi Yano
@ 2021-09-08  0:07                 ` Takashi Yano
  2021-09-08  4:11                   ` Takashi Yano
  0 siblings, 1 reply; 250+ messages in thread
From: Takashi Yano @ 2021-09-08  0:07 UTC (permalink / raw)
  To: cygwin-developers

On Tue, 7 Sep 2021 19:50:23 +0900
Takashi Yano wrote:

> @@ -796,7 +792,8 @@ pipe_cleanup (select_record *, select_stuff *stuff)
>        pi->stop_thread = true;
>        SetEvent (pi->bye);
         ~~~~~~~~~~~~~~~~~~~
This is not correct. SetEvent() wakes up one of the thread_pipe()s,
but it may be a thread other than the one that should be stopped.

>        pi->thread->detach ();
> -      CloseHandle (pi->bye);
> +      if (me->fh->get_select_evt () == NULL)
> +	CloseHandle (pi->bye);
>      }
>    delete pi;
>    stuff->device_specific_pipe = NULL;

I think it also should be
> +    for (ULONG i = 0; i < get_obj_handle_count (select_evt); i++)
> +      SetEvent (select_evt);

Actually, I would like to use PulseEvent() here, if it were not **UNRELIABLE**.
https://docs.microsoft.com/en-us/windows-hardware/drivers/devtest/28648-pulseevent-is-an-unreliable-function

Would it make sense to use a semaphore object instead of an event,
and to release as many resources as there are handles?

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-08  0:07                 ` Takashi Yano
@ 2021-09-08  4:11                   ` Takashi Yano
  2021-09-08  9:01                     ` Takashi Yano
  2021-09-08  9:01                     ` Corinna Vinschen
  0 siblings, 2 replies; 250+ messages in thread
From: Takashi Yano @ 2021-09-08  4:11 UTC (permalink / raw)
  To: cygwin-developers

On Wed, 8 Sep 2021 09:07:48 +0900
Takashi Yano wrote:
> On Tue, 7 Sep 2021 19:50:23 +0900
> Takashi Yano wrote:
> 
> > @@ -796,7 +792,8 @@ pipe_cleanup (select_record *, select_stuff *stuff)
> >        pi->stop_thread = true;
> >        SetEvent (pi->bye);
>          ~~~~~~~~~~~~~~~~~~~
> This is not correct. SetEvent() wakes-up one of thread_pipe()s,
> but it may be other thread than one which should be stopped.
> 
> >        pi->thread->detach ();
> > -      CloseHandle (pi->bye);
> > +      if (me->fh->get_select_evt () == NULL)
> > +	CloseHandle (pi->bye);
> >      }
> >    delete pi;
> >    stuff->device_specific_pipe = NULL;
> 
> I think it also should be
> > +    for (ULONG i = 0; i < get_obj_handle_count (select_evt); i++)
> > +      SetEvent (select_evt);
> 
> Actually I want to use PulseEvent() here if it is not **UNRELIABLE**.
> https://docs.microsoft.com/en-us/windows-hardware/drivers/devtest/28648-pulseevent-is-an-unreliable-function
> 
> Does using semaphore object instead of event, and releasing
> resources equal to the number of handles make sense?

No, it does not. One thread may consume the semaphore multiple times.

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-08  4:11                   ` Takashi Yano
@ 2021-09-08  9:01                     ` Takashi Yano
  2021-09-08  9:01                     ` Corinna Vinschen
  1 sibling, 0 replies; 250+ messages in thread
From: Takashi Yano @ 2021-09-08  9:01 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 1570 bytes --]

On Wed, 8 Sep 2021 13:11:41 +0900
Takashi Yano wrote:
> On Wed, 8 Sep 2021 09:07:48 +0900
> Takashi Yano wrote:
> > On Tue, 7 Sep 2021 19:50:23 +0900
> > Takashi Yano wrote:
> > 
> > > @@ -796,7 +792,8 @@ pipe_cleanup (select_record *, select_stuff *stuff)
> > >        pi->stop_thread = true;
> > >        SetEvent (pi->bye);
> >          ~~~~~~~~~~~~~~~~~~~
> > This is not correct. SetEvent() wakes-up one of thread_pipe()s,
> > but it may be other thread than one which should be stopped.
> > 
> > >        pi->thread->detach ();
> > > -      CloseHandle (pi->bye);
> > > +      if (me->fh->get_select_evt () == NULL)
> > > +	CloseHandle (pi->bye);
> > >      }
> > >    delete pi;
> > >    stuff->device_specific_pipe = NULL;
> > 
> > I think it also should be
> > > +    for (ULONG i = 0; i < get_obj_handle_count (select_evt); i++)
> > > +      SetEvent (select_evt);
> > 
> > Actually I want to use PulseEvent() here if it is not **UNRELIABLE**.
> > https://docs.microsoft.com/en-us/windows-hardware/drivers/devtest/28648-pulseevent-is-an-unreliable-function
> > 
> > Does using semaphore object instead of event, and releasing
> > resources equal to the number of handles make sense?
> 
> No it does not. One thread may consume semaphore multiple times....

I wrote a simple test program in which 100 threads each count 1000
wake-ups, and confirmed that the semaphore method works as expected,
just as PulseEvent() does. All 100000 (100*1000) wake-ups were counted
without a miscount.

Therefore, the attached patch should work.
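
For illustration only, a portable sketch of this kind of counting test.
This is not the test program mentioned above: it uses a small
mutex/condition-variable semaphore (counting_sem, a made-up name) as a
stand-in for a Win32 semaphore object, and count_wakeups() is likewise
hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Minimal counting semaphore, standing in for a Win32 semaphore object.
class counting_sem
{
  std::mutex mtx;
  std::condition_variable cv;
  long count = 0;

public:
  // Add n units and wake the waiters (like ReleaseSemaphore (h, n, NULL)).
  void release (long n)
  {
    {
      std::lock_guard<std::mutex> lock (mtx);
      count += n;
    }
    cv.notify_all ();
  }

  // Block until one unit is available, then consume it.
  void acquire ()
  {
    std::unique_lock<std::mutex> lock (mtx);
    cv.wait (lock, [this] { return count > 0; });
    --count;
  }
};

// Start nthreads waiters which each acquire `rounds` times, release
// nthreads units per round, and return the total number of wake-ups.
long
count_wakeups (int nthreads, int rounds)
{
  assert (nthreads > 0 && rounds > 0);
  counting_sem sem;
  std::atomic<long> wakeups (0);
  std::vector<std::thread> waiters;

  for (int i = 0; i < nthreads; i++)
    waiters.emplace_back ([&sem, &wakeups, rounds] {
      for (int r = 0; r < rounds; r++)
        {
          sem.acquire ();
          ++wakeups;
        }
    });
  for (int r = 0; r < rounds; r++)
    sem.release (nthreads);
  for (auto &t : waiters)
    t.join ();
  return wakeups.load ();
}
```

In this model one thread may take several units from the same release,
but no unit is lost or counted twice, so the total is always
nthreads * rounds.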

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>

[-- Attachment #2: 0001-Cygwin-select-Introduce-select_sem-semaphore-for-pip.patch --]
[-- Type: application/octet-stream, Size: 6742 bytes --]

From aaf0d8ba7ff8385884968a3664c58d05931b09b6 Mon Sep 17 00:00:00 2001
From: Takashi Yano <takashi.yano@nifty.ne.jp>
Date: Wed, 8 Sep 2021 17:18:35 +0900
Subject: [PATCH] Cygwin: select: Introduce select_sem semaphore for pipe.

- This patch reverts "Cygwin: select: Improve select/poll response",
  and introduces select_sem semaphore which notifies pipe status
  change.
---
 winsup/cygwin/fhandler.cc      |  1 +
 winsup/cygwin/fhandler.h       |  3 +++
 winsup/cygwin/fhandler_pipe.cc | 25 ++++++++++++++++++++
 winsup/cygwin/select.cc        | 42 ++++++++++------------------------
 4 files changed, 41 insertions(+), 30 deletions(-)

diff --git a/winsup/cygwin/fhandler.cc b/winsup/cygwin/fhandler.cc
index f0c1b68f1..39fe2640a 100644
--- a/winsup/cygwin/fhandler.cc
+++ b/winsup/cygwin/fhandler.cc
@@ -1464,6 +1464,7 @@ fhandler_base::fhandler_base () :
   _refcnt (0),
   openflags (0),
   unique_id (0),
+  select_sem (NULL),
   archetype (NULL),
   usecount (0)
 {
diff --git a/winsup/cygwin/fhandler.h b/winsup/cygwin/fhandler.h
index 7aed089eb..d309be2f7 100644
--- a/winsup/cygwin/fhandler.h
+++ b/winsup/cygwin/fhandler.h
@@ -217,6 +217,7 @@ class fhandler_base
   void set_ino (ino_t i) { ino = i; }
 
   HANDLE read_state;
+  HANDLE select_sem;
 
  public:
   LONG inc_refcnt () {return InterlockedIncrement (&_refcnt);}
@@ -520,6 +521,8 @@ public:
     fh->copy_from (this);
     return fh;
   }
+
+  HANDLE get_select_sem () { return select_sem; }
 };
 
 struct wsa_event
diff --git a/winsup/cygwin/fhandler_pipe.cc b/winsup/cygwin/fhandler_pipe.cc
index c75476040..503d5c888 100644
--- a/winsup/cygwin/fhandler_pipe.cc
+++ b/winsup/cygwin/fhandler_pipe.cc
@@ -376,6 +376,8 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
     }
   else if (status == STATUS_THREAD_CANCELED)
     pthread::static_cancel_self ();
+  if (select_sem && nbytes)
+    ReleaseSemaphore (select_sem, get_obj_handle_count (select_sem), NULL);
   len = nbytes;
 }
 
@@ -507,6 +509,8 @@ fhandler_pipe_fifo::raw_write (const void *ptr, size_t len)
     set_errno (EINTR);
   else if (status == STATUS_THREAD_CANCELED)
     pthread::static_cancel_self ();
+  if (select_sem && nbytes)
+    ReleaseSemaphore (select_sem, get_obj_handle_count (select_sem), NULL);
   return nbytes ?: -1;
 }
 
@@ -515,6 +519,8 @@ fhandler_pipe::fixup_after_fork (HANDLE parent)
 {
   if (read_mtx)
     fork_fixup (parent, read_mtx, "read_mtx");
+  if (select_sem)
+    fork_fixup (parent, select_sem, "select_sem");
   fhandler_base::fixup_after_fork (parent);
 }
 
@@ -536,6 +542,15 @@ fhandler_pipe::dup (fhandler_base *child, int flags)
       ftp->close ();
       res = -1;
     }
+  else if (select_sem &&
+	   !DuplicateHandle (GetCurrentProcess (), select_sem,
+			    GetCurrentProcess (), &ftp->select_sem,
+			    0, !(flags & O_CLOEXEC), DUPLICATE_SAME_ACCESS))
+    {
+      __seterrno ();
+      ftp->close ();
+      res = -1;
+    }
 
   debug_printf ("res %d", res);
   return res;
@@ -546,6 +561,11 @@ fhandler_pipe::close ()
 {
   if (read_mtx)
     CloseHandle (read_mtx);
+  if (select_sem)
+    {
+      ReleaseSemaphore (select_sem, get_obj_handle_count (select_sem), NULL);
+      CloseHandle (select_sem);
+    }
   return fhandler_base::close ();
 }
 
@@ -765,6 +785,11 @@ fhandler_pipe::create (fhandler_pipe *fhs[2], unsigned psize, int mode)
 	  fhs[0]->set_read_mutex (mtx);
 	  res = 0;
 	}
+      fhs[0]->select_sem = CreateSemaphore (&sa, 0, INT32_MAX, NULL);
+      if (fhs[0]->select_sem)
+	DuplicateHandle (GetCurrentProcess (), fhs[0]->select_sem,
+			 GetCurrentProcess (), &fhs[1]->select_sem,
+			 0, 1, DUPLICATE_SAME_ACCESS);
     }
 
   debug_printf ("%R = pipe([%p, %p], %d, %y)", res, fhs[0], fhs[1], psize, mode);
diff --git a/winsup/cygwin/select.cc b/winsup/cygwin/select.cc
index 5e338e43f..e9e71b269 100644
--- a/winsup/cygwin/select.cc
+++ b/winsup/cygwin/select.cc
@@ -726,7 +726,6 @@ thread_pipe (void *arg)
   select_pipe_info *pi = (select_pipe_info *) arg;
   DWORD sleep_time = 0;
   bool looping = true;
-  DWORD t0 = GetTickCount ();
 
   while (looping)
     {
@@ -746,12 +745,7 @@ thread_pipe (void *arg)
 	break;
       cygwait (pi->bye, sleep_time >> 3);
       if (sleep_time < 80)
-	{
-	  DWORD t1 = GetTickCount ();
-	  if (t0 != t1)
-	    ++sleep_time;
-	  t0 = t1;
-	}
+	++sleep_time;
       if (pi->stop_thread)
 	break;
     }
@@ -768,7 +762,13 @@ start_thread_pipe (select_record *me, select_stuff *stuff)
     {
       pi->start = &stuff->start;
       pi->stop_thread = false;
-      pi->bye = CreateEvent (&sec_none_nih, TRUE, FALSE, NULL);
+      pi->bye = me->fh->get_select_sem ();
+      if (pi->bye)
+	DuplicateHandle (GetCurrentProcess (), pi->bye,
+			 GetCurrentProcess (), &pi->bye,
+			 0, 0, DUPLICATE_SAME_ACCESS);
+      else
+	pi->bye = CreateSemaphore (&sec_none_nih, 0, INT32_MAX, NULL);
       pi->thread = new cygthread (thread_pipe, pi, "pipesel");
       me->h = *pi->thread;
       if (!me->h)
@@ -786,7 +786,7 @@ pipe_cleanup (select_record *, select_stuff *stuff)
   if (pi->thread)
     {
       pi->stop_thread = true;
-      SetEvent (pi->bye);
+      ReleaseSemaphore (pi->bye, get_obj_handle_count (pi->bye), NULL);
       pi->thread->detach ();
       CloseHandle (pi->bye);
     }
@@ -927,7 +927,6 @@ thread_fifo (void *arg)
   select_fifo_info *pi = (select_fifo_info *) arg;
   DWORD sleep_time = 0;
   bool looping = true;
-  DWORD t0 = GetTickCount ();
 
   while (looping)
     {
@@ -947,12 +946,7 @@ thread_fifo (void *arg)
 	break;
       cygwait (pi->bye, sleep_time >> 3);
       if (sleep_time < 80)
-	{
-	  DWORD t1 = GetTickCount ();
-	  if (t0 != t1)
-	    ++sleep_time;
-	  t0 = t1;
-	}
+	++sleep_time;
       if (pi->stop_thread)
 	break;
     }
@@ -1128,7 +1122,6 @@ thread_console (void *arg)
   select_console_info *ci = (select_console_info *) arg;
   DWORD sleep_time = 0;
   bool looping = true;
-  DWORD t0 = GetTickCount ();
 
   while (looping)
     {
@@ -1148,12 +1141,7 @@ thread_console (void *arg)
 	break;
       cygwait (ci->bye, sleep_time >> 3);
       if (sleep_time < 80)
-	{
-	  DWORD t1 = GetTickCount ();
-	  if (t0 != t1)
-	    ++sleep_time;
-	  t0 = t1;
-	}
+	++sleep_time;
       if (ci->stop_thread)
 	break;
     }
@@ -1373,7 +1361,6 @@ thread_pty_slave (void *arg)
   select_pipe_info *pi = (select_pipe_info *) arg;
   DWORD sleep_time = 0;
   bool looping = true;
-  DWORD t0 = GetTickCount ();
 
   while (looping)
     {
@@ -1393,12 +1380,7 @@ thread_pty_slave (void *arg)
 	break;
       cygwait (pi->bye, sleep_time >> 3);
       if (sleep_time < 80)
-	{
-	  DWORD t1 = GetTickCount ();
-	  if (t0 != t1)
-	    ++sleep_time;
-	  t0 = t1;
-	}
+	++sleep_time;
       if (pi->stop_thread)
 	break;
     }
-- 
2.33.0



* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-08  4:11                   ` Takashi Yano
  2021-09-08  9:01                     ` Takashi Yano
@ 2021-09-08  9:01                     ` Corinna Vinschen
  2021-09-08  9:26                       ` Corinna Vinschen
  2021-09-08  9:37                       ` Takashi Yano
  1 sibling, 2 replies; 250+ messages in thread
From: Corinna Vinschen @ 2021-09-08  9:01 UTC (permalink / raw)
  To: cygwin-developers

On Sep  8 13:11, Takashi Yano wrote:
> On Wed, 8 Sep 2021 09:07:48 +0900
> Takashi Yano wrote:
> > On Tue, 7 Sep 2021 19:50:23 +0900
> > Takashi Yano wrote:
> > 
> > > @@ -796,7 +792,8 @@ pipe_cleanup (select_record *, select_stuff *stuff)
> > >        pi->stop_thread = true;
> > >        SetEvent (pi->bye);
> >          ~~~~~~~~~~~~~~~~~~~
> > This is not correct. SetEvent() wakes-up one of thread_pipe()s,
> > but it may be other thread than one which should be stopped.
> > 
> > >        pi->thread->detach ();
> > > -      CloseHandle (pi->bye);
> > > +      if (me->fh->get_select_evt () == NULL)
> > > +	CloseHandle (pi->bye);
> > >      }
> > >    delete pi;
> > >    stuff->device_specific_pipe = NULL;
> > 
> > I think it also should be
> > > +    for (ULONG i = 0; i < get_obj_handle_count (select_evt); i++)
> > > +      SetEvent (select_evt);
> > 
> > Actually I want to use PulseEvent() here if it is not **UNRELIABLE**.
> > https://docs.microsoft.com/en-us/windows-hardware/drivers/devtest/28648-pulseevent-is-an-unreliable-function
> > 
> > Does using semaphore object instead of event, and releasing
> > resources equal to the number of handles make sense?
> 
> No it does not. One thread may consume semaphore multiple times....

What exactly is the problem in the code which results in high CPU
load?  Can you explain this a bit?  Maybe we need an entirely
different approach to avoid that.


Corinna


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-08  9:01                     ` Corinna Vinschen
@ 2021-09-08  9:26                       ` Corinna Vinschen
  2021-09-08  9:45                         ` Takashi Yano
  2021-09-08  9:37                       ` Takashi Yano
  1 sibling, 1 reply; 250+ messages in thread
From: Corinna Vinschen @ 2021-09-08  9:26 UTC (permalink / raw)
  To: cygwin-developers

On Sep  8 11:01, Corinna Vinschen wrote:
> On Sep  8 13:11, Takashi Yano wrote:
> > On Wed, 8 Sep 2021 09:07:48 +0900
> > Takashi Yano wrote:
> > > On Tue, 7 Sep 2021 19:50:23 +0900
> > > Takashi Yano wrote:
> > > 
> > > > @@ -796,7 +792,8 @@ pipe_cleanup (select_record *, select_stuff *stuff)
> > > >        pi->stop_thread = true;
> > > >        SetEvent (pi->bye);
> > >          ~~~~~~~~~~~~~~~~~~~
> > > This is not correct. SetEvent() wakes-up one of thread_pipe()s,
> > > but it may be other thread than one which should be stopped.
> > > 
> > > >        pi->thread->detach ();
> > > > -      CloseHandle (pi->bye);
> > > > +      if (me->fh->get_select_evt () == NULL)
> > > > +	CloseHandle (pi->bye);
> > > >      }
> > > >    delete pi;
> > > >    stuff->device_specific_pipe = NULL;
> > > 
> > > I think it also should be
> > > > +    for (ULONG i = 0; i < get_obj_handle_count (select_evt); i++)
> > > > +      SetEvent (select_evt);
> > > 
> > > Actually I want to use PulseEvent() here if it is not **UNRELIABLE**.
> > > https://docs.microsoft.com/en-us/windows-hardware/drivers/devtest/28648-pulseevent-is-an-unreliable-function
> > > 
> > > Does using semaphore object instead of event, and releasing
> > > resources equal to the number of handles make sense?
> > 
> > No it does not. One thread may consume semaphore multiple times....
> 
> What exactly is the problem in the code which results in high CPU
> load?  Can you explain this a bit?  Maybe we need an entirely
> different approach to avoid that.

I saw your new patch, but I don't see the problem.  I typed a lot of
keys in mintty quickly and what happens is that the load of mintty
goes up to 9% on a 4 CPU system, but only temporarily while typing.
How do you reproduce the problem?


Thanks,
Corinna


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-08  9:01                     ` Corinna Vinschen
  2021-09-08  9:26                       ` Corinna Vinschen
@ 2021-09-08  9:37                       ` Takashi Yano
  1 sibling, 0 replies; 250+ messages in thread
From: Takashi Yano @ 2021-09-08  9:37 UTC (permalink / raw)
  To: cygwin-developers

On Wed, 8 Sep 2021 11:01:52 +0200
Corinna Vinschen wrote:
> On Sep  8 13:11, Takashi Yano wrote:
> > On Wed, 8 Sep 2021 09:07:48 +0900
> > Takashi Yano wrote:
> > > On Tue, 7 Sep 2021 19:50:23 +0900
> > > Takashi Yano wrote:
> > > 
> > > > @@ -796,7 +792,8 @@ pipe_cleanup (select_record *, select_stuff *stuff)
> > > >        pi->stop_thread = true;
> > > >        SetEvent (pi->bye);
> > >          ~~~~~~~~~~~~~~~~~~~
> > > This is not correct. SetEvent() wakes-up one of thread_pipe()s,
> > > but it may be other thread than one which should be stopped.
> > > 
> > > >        pi->thread->detach ();
> > > > -      CloseHandle (pi->bye);
> > > > +      if (me->fh->get_select_evt () == NULL)
> > > > +	CloseHandle (pi->bye);
> > > >      }
> > > >    delete pi;
> > > >    stuff->device_specific_pipe = NULL;
> > > 
> > > I think it also should be
> > > > +    for (ULONG i = 0; i < get_obj_handle_count (select_evt); i++)
> > > > +      SetEvent (select_evt);
> > > 
> > > Actually I want to use PulseEvent() here if it is not **UNRELIABLE**.
> > > https://docs.microsoft.com/en-us/windows-hardware/drivers/devtest/28648-pulseevent-is-an-unreliable-function
> > > 
> > > Does using semaphore object instead of event, and releasing
> > > resources equal to the number of handles make sense?
> > 
> > No it does not. One thread may consume semaphore multiple times....
> 
> What exactly is the problem in the code which results in high CPU
> load?  Can you explain this a bit?  Maybe we need an entirely
> different approach to avoid that.

The thread_pipe() code in the current git head of master looks like this:

while (looping)
  {
    ...
    if (peek_pipe ())
      looping = false;
    ...
    if (!looping)
      break;
    cygwait (pi->bye, sleep_time >> 3);
    if (sleep_time < 80)
      ++sleep_time;
    if (pi->stop_thread)
      break;
  }
return 0;

With this code, the first 8 loops call cygwait() with a zero timeout,
cygwait (pi->bye, 0);
and every loop after that calls it with
cygwait (pi->bye, nonzero value);

cygwait() with a nonzero timeout usually sleeps for at least 15 msec
(because of the timer resolution).

Looping 8 times takes just a moment for the CPU.

After these 8 loops, thread_pipe() responds slowly to a select() call,
because the thread notices a status change of the pipe only after the
timeout expires.

To avoid this, commit dccde0dc changed the code as follows.

diff --git a/winsup/cygwin/select.cc b/winsup/cygwin/select.cc
index 8ad982c12..83e1c00e0 100644
--- a/winsup/cygwin/select.cc
+++ b/winsup/cygwin/select.cc
@@ -735,6 +735,7 @@ thread_pipe (void *arg)
   select_pipe_info *pi = (select_pipe_info *) arg;
   DWORD sleep_time = 0;
   bool looping = true;
+  DWORD t0 = GetTickCount ();

   while (looping)
     {
@@ -754,7 +755,12 @@ thread_pipe (void *arg)
        break;
       cygwait (pi->bye, sleep_time >> 3);
       if (sleep_time < 80)
-       ++sleep_time;
+       {
+         DWORD t1 = GetTickCount ();
+         if (t0 != t1)
+           ++sleep_time;
+         t0 = t1;
+       }
       if (pi->stop_thread)
        break;
     }

With this code, I expected t0 != t1 to happen at most once every 1 msec,
so that the timeout for cygwait() becomes nonzero after 8 msec. During
these 8 msec, the CPU load is high because the loop runs without sleeping.

In fact, however, t0 != t1 happens only every 15 msec because the timer
resolution is low. So the CPU load is high during the first 120 msec
(8 * 15 msec) after thread_pipe() starts. If you type keys quickly, the
loop therefore always runs without sleeping.

Therefore, I proposed a new patch which uses an event or a semaphore
to notify the thread of read()/write()/close() on the pipe, instead of
just sleeping for a while.
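
The arithmetic above can be checked with a small model of the loop.
This is only a sketch, not Cygwin code: time is simulated, and the
names simulate_backoff and backoff_result as well as the per-iteration
cost iter_cost_us are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Result of the simulated zero-timeout phase of the backoff loop.
struct backoff_result
{
  uint64_t busy_iterations; // iterations which wait with timeout 0
  uint64_t busy_us;         // simulated time before the first real sleep
};

// tick_ms:      granularity of the GetTickCount()-like clock (e.g. 15)
// iter_cost_us: assumed cost of one zero-timeout iteration
// gate_on_tick: true models the tick-gated increment of commit dccde0dc,
//               false models the plain ++sleep_time
backoff_result
simulate_backoff (unsigned tick_ms, unsigned iter_cost_us, bool gate_on_tick)
{
  assert (tick_ms > 0 && iter_cost_us > 0);
  uint64_t now_us = 0;
  unsigned sleep_time = 0;
  uint64_t t0 = 0;          // last observed tick value in msec
  backoff_result res = { 0, 0 };

  // The wait timeout is sleep_time >> 3, i.e. 0 while sleep_time < 8.
  while ((sleep_time >> 3) == 0)
    {
      ++res.busy_iterations;
      now_us += iter_cost_us; // the zero-timeout wait returns at once
      if (gate_on_tick)
        {
          // Clock value rounded down to the tick granularity.
          uint64_t t1 = (now_us / 1000) / tick_ms * tick_ms;
          if (t0 != t1)
            ++sleep_time;
          t0 = t1;
        }
      else
        ++sleep_time;
    }
  res.busy_us = now_us;
  return res;
}
```

With an assumed cost of 10 usec per iteration, the ungated loop leaves
the busy phase after 8 iterations, the gated loop with a 1 msec tick
after 8000 usec, and the gated loop with a 15 msec tick only after
120000 usec.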

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-08  9:26                       ` Corinna Vinschen
@ 2021-09-08  9:45                         ` Takashi Yano
  2021-09-08 10:04                           ` Corinna Vinschen
  0 siblings, 1 reply; 250+ messages in thread
From: Takashi Yano @ 2021-09-08  9:45 UTC (permalink / raw)
  To: cygwin-developers

On Wed, 8 Sep 2021 11:26:21 +0200
Corinna Vinschen wrote:
> On Sep  8 11:01, Corinna Vinschen wrote:
> > On Sep  8 13:11, Takashi Yano wrote:
> > > On Wed, 8 Sep 2021 09:07:48 +0900
> > > Takashi Yano wrote:
> > > > On Tue, 7 Sep 2021 19:50:23 +0900
> > > > Takashi Yano wrote:
> > > > 
> > > > > @@ -796,7 +792,8 @@ pipe_cleanup (select_record *, select_stuff *stuff)
> > > > >        pi->stop_thread = true;
> > > > >        SetEvent (pi->bye);
> > > >          ~~~~~~~~~~~~~~~~~~~
> > > > This is not correct. SetEvent() wakes-up one of thread_pipe()s,
> > > > but it may be other thread than one which should be stopped.
> > > > 
> > > > >        pi->thread->detach ();
> > > > > -      CloseHandle (pi->bye);
> > > > > +      if (me->fh->get_select_evt () == NULL)
> > > > > +	CloseHandle (pi->bye);
> > > > >      }
> > > > >    delete pi;
> > > > >    stuff->device_specific_pipe = NULL;
> > > > 
> > > > I think it also should be
> > > > > +    for (ULONG i = 0; i < get_obj_handle_count (select_evt); i++)
> > > > > +      SetEvent (select_evt);
> > > > 
> > > > Actually I want to use PulseEvent() here if it is not **UNRELIABLE**.
> > > > https://docs.microsoft.com/en-us/windows-hardware/drivers/devtest/28648-pulseevent-is-an-unreliable-function
> > > > 
> > > > Does using semaphore object instead of event, and releasing
> > > > resources equal to the number of handles make sense?
> > > 
> > > No it does not. One thread may consume semaphore multiple times....
> > 
> > What exactly is the problem in the code which results in high CPU
> > load?  Can you explain this a bit?  Maybe we need an entirely
> > different approach to avoid that.
> 
> I saw your new patch, but I don't see the problem.  I typed a lot of
> keys in mintty quickly and what happens is that the load of mintty
> goes up to 9% on a 4 CPU system, but only temporarily while typing.
> How do you reproduce the problem?

Did you apply the patch
0001-Cygwin-select-Introduce-select_evt-event-for-pipe.patch
or
0001-Cygwin-select-Introduce-select_sem-semaphore-for-pip.patch
?

With either of these patches, the problem does not occur. The problem
occurs with commit dccde0dc.

With my 4-core, 8-thread CPU, the CPU load goes up to 12-13 % if
I type keys using key repeat (30 cps) after commit dccde0dc.

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-08  9:45                         ` Takashi Yano
@ 2021-09-08 10:04                           ` Corinna Vinschen
  2021-09-08 10:45                             ` Takashi Yano
  0 siblings, 1 reply; 250+ messages in thread
From: Corinna Vinschen @ 2021-09-08 10:04 UTC (permalink / raw)
  To: cygwin-developers

On Sep  8 18:45, Takashi Yano wrote:
> On Wed, 8 Sep 2021 11:26:21 +0200
> Corinna Vinschen wrote:
> > On Sep  8 11:01, Corinna Vinschen wrote:
> > > What exactly is the problem in the code which results in high CPU
> > > load?  Can you explain this a bit?  Maybe we need an entirely
> > > different approach to avoid that.
> > 
> > I saw your new patch, but I don't see the problem.  I typed a lot of
> > keys in mintty quickly and what happens is that the load of mintty
> > goes up to 9% on a 4 CPU system, but only temporarily while typing.
> > How do you reproduce the problem?
> 
> Did you apply the patch
> 0001-Cygwin-select-Introduce-select_evt-event-for-pipe.patch
> or
> 0001-Cygwin-select-Introduce-select_sem-semaphore-for-pip.patch
> ?
> 
> With these patch, the problem does not occur. The problem occurs
> with the commit dccde0dc.

dccde0dc is 23bb19efcc45 in topic/pipe ATM (after force push) so, yes,
I'm running this with topic/pipe HEAD including this patch.

> With my 4 core 8 thread CPU, CPU loads goes up to 12-13 % if
> I type keys using key repeat (30cps) after the commit dccde0dc.

Oh, wow!  As I wrote above, before applying "Cygwin: select: Introduce
select_sem semaphore for pipe." I only saw a 9% load.  After applying
the patch I saw the same load.

I don't know what I did differently, but after reverting your semaphore
patch I now see loads of up to 30%.  So, never mind, apparently I tested
wrongly before.  Your patch reduces the load tremendously.

Just one question.  Would you mind splitting your patch into two parts,
one being just the revert of your "Improve select/poll response." patch
and one introducing select_sem?


Thanks,
Corinna

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-08 10:04                           ` Corinna Vinschen
@ 2021-09-08 10:45                             ` Takashi Yano
  2021-09-08 10:51                               ` Corinna Vinschen
  0 siblings, 1 reply; 250+ messages in thread
From: Takashi Yano @ 2021-09-08 10:45 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 1918 bytes --]

Hi Corinna,

On Wed, 8 Sep 2021 12:04:21 +0200
Corinna Vinschen wrote:
> On Sep  8 18:45, Takashi Yano wrote:
> > On Wed, 8 Sep 2021 11:26:21 +0200
> > Corinna Vinschen wrote:
> > > On Sep  8 11:01, Corinna Vinschen wrote:
> > > > What exactly is the problem in the code which results in high CPU
> > > > load?  Can you explain this a bit?  Maybe we need an entirely
> > > > different approach to avoid that.
> > > 
> > > I saw your new patch, but I don't see the problem.  I typed a lot of
> > > keys in mintty quickly and what happens is that the load of mintty
> > > goes up to 9% on a 4 CPU system, but only temporarily while typing.
> > > How do you reproduce the problem?
> > 
> > Did you apply the patch
> > 0001-Cygwin-select-Introduce-select_evt-event-for-pipe.patch
> > or
> > 0001-Cygwin-select-Introduce-select_sem-semaphore-for-pip.patch
> > ?
> > 
> > With these patch, the problem does not occur. The problem occurs
> > with the commit dccde0dc.
> 
> dccde0dc is 23bb19efcc45 in topic/pipe ATM (after force push) so, yes,
> I'm running this with topic/pipe HEAD including this patch.
> 
> > With my 4 core 8 thread CPU, CPU loads goes up to 12-13 % if
> > I type keys using key repeat (30cps) after the commit dccde0dc.
> 
> Oh, wow!  As I wrote above, before applying "Cygwin: select: Introduce
> select_sem semaphore for pipe." I only saw a 9% load.  After applying
> the patch I saw the same load.
> 
> I don't know what I did differently, but after reverting your semaphore
> patch I now see loads of up to 30%.  So, never mind, apparently I tested
> wrongly before.  Your patch reduces the load tremendously.

Thanks for testing again.

> Just one question.  Would you mind to split your patch into two parts,
> one being just the revert of your "Improve select/poll response." patch
> and one introducing select_sem?

I split the patch as you advised.
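
As a rough illustration of the counting behaviour the select_sem patch
relies on, here is a toy model of a counting semaphore. It is neither the
Win32 object nor Cygwin code; sem_model and counts_left_for_others() are
names made up for this sketch. The patch releases
get_obj_handle_count (select_sem) counts in one call, presumably one per
open handle, so that every waiter can be woken. The model also shows the
caveat raised earlier in the thread, namely that one thread may consume
several of those counts.

```cpp
#include <cassert>

// Toy single-threaded model of a counting semaphore (NOT the Win32 object).
struct sem_model
{
  long count = 0;

  // Analogous to ReleaseSemaphore (h, n, NULL): add n counts.
  void release (long n) { count += n; }

  // Analogous to a zero-timeout wait: take one count if available.
  bool try_wait ()
  {
    if (count == 0)
      return false;
    --count;
    return true;
  }
};

// Release one count per waiter, let a single greedy thread attempt
// greedy_waits waits, and report how many counts remain for the others.
long counts_left_for_others (long waiters, long greedy_waits)
{
  sem_model sem;
  sem.release (waiters);
  for (long i = 0; i < greedy_waits; ++i)
    sem.try_wait ();
  return sem.count;
}

// Self-check of the cases described in the note below.
void check_sem_sketch ()
{
  assert (counts_left_for_others (3, 1) == 2); // well-behaved waiter
  assert (counts_left_for_others (3, 3) == 0); // one thread drains all
  assert (counts_left_for_others (3, 5) == 0); // count never goes negative
}
```

A waiter that loses out this way is not stuck in the patched code:
thread_pipe waits with cygwait (pi->bye, sleep_time >> 3), so it polls the
pipe again after at most 10 ms.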

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>

[-- Attachment #2: 0001-Revert-Cygwin-select-Improve-select-poll-response.patch --]
[-- Type: application/octet-stream, Size: 2525 bytes --]

From bc9fde7a1e4f0c80c4a58c7ed3e9b83c1f4f24d2 Mon Sep 17 00:00:00 2001
From: Takashi Yano <takashi.yano@nifty.ne.jp>
Date: Wed, 8 Sep 2021 19:22:40 +0900
Subject: [PATCH 1/2] Revert "Cygwin: select: Improve select/poll response."

... because this commit (23bb19ef) causes high CPU load.
---
 winsup/cygwin/select.cc | 32 ++++----------------------------
 1 file changed, 4 insertions(+), 28 deletions(-)

diff --git a/winsup/cygwin/select.cc b/winsup/cygwin/select.cc
index 5e338e43f..c85ce748c 100644
--- a/winsup/cygwin/select.cc
+++ b/winsup/cygwin/select.cc
@@ -726,7 +726,6 @@ thread_pipe (void *arg)
   select_pipe_info *pi = (select_pipe_info *) arg;
   DWORD sleep_time = 0;
   bool looping = true;
-  DWORD t0 = GetTickCount ();
 
   while (looping)
     {
@@ -746,12 +745,7 @@ thread_pipe (void *arg)
 	break;
       cygwait (pi->bye, sleep_time >> 3);
       if (sleep_time < 80)
-	{
-	  DWORD t1 = GetTickCount ();
-	  if (t0 != t1)
-	    ++sleep_time;
-	  t0 = t1;
-	}
+	++sleep_time;
       if (pi->stop_thread)
 	break;
     }
@@ -927,7 +921,6 @@ thread_fifo (void *arg)
   select_fifo_info *pi = (select_fifo_info *) arg;
   DWORD sleep_time = 0;
   bool looping = true;
-  DWORD t0 = GetTickCount ();
 
   while (looping)
     {
@@ -947,12 +940,7 @@ thread_fifo (void *arg)
 	break;
       cygwait (pi->bye, sleep_time >> 3);
       if (sleep_time < 80)
-	{
-	  DWORD t1 = GetTickCount ();
-	  if (t0 != t1)
-	    ++sleep_time;
-	  t0 = t1;
-	}
+	++sleep_time;
       if (pi->stop_thread)
 	break;
     }
@@ -1128,7 +1116,6 @@ thread_console (void *arg)
   select_console_info *ci = (select_console_info *) arg;
   DWORD sleep_time = 0;
   bool looping = true;
-  DWORD t0 = GetTickCount ();
 
   while (looping)
     {
@@ -1148,12 +1135,7 @@ thread_console (void *arg)
 	break;
       cygwait (ci->bye, sleep_time >> 3);
       if (sleep_time < 80)
-	{
-	  DWORD t1 = GetTickCount ();
-	  if (t0 != t1)
-	    ++sleep_time;
-	  t0 = t1;
-	}
+	++sleep_time;
       if (ci->stop_thread)
 	break;
     }
@@ -1373,7 +1355,6 @@ thread_pty_slave (void *arg)
   select_pipe_info *pi = (select_pipe_info *) arg;
   DWORD sleep_time = 0;
   bool looping = true;
-  DWORD t0 = GetTickCount ();
 
   while (looping)
     {
@@ -1393,12 +1374,7 @@ thread_pty_slave (void *arg)
 	break;
       cygwait (pi->bye, sleep_time >> 3);
       if (sleep_time < 80)
-	{
-	  DWORD t1 = GetTickCount ();
-	  if (t0 != t1)
-	    ++sleep_time;
-	  t0 = t1;
-	}
+	++sleep_time;
       if (pi->stop_thread)
 	break;
     }
-- 
2.33.0


[-- Attachment #3: 0002-Cygwin-select-Introduce-select_sem-semaphore-for-pip.patch --]
[-- Type: application/octet-stream, Size: 4714 bytes --]

From 3b35c69225f82554b186264c6c35b4b1dd874cbd Mon Sep 17 00:00:00 2001
From: Takashi Yano <takashi.yano@nifty.ne.jp>
Date: Wed, 8 Sep 2021 17:18:35 +0900
Subject: [PATCH 2/2] Cygwin: select: Introduce select_sem semaphore for pipe.

- This patch introduces select_sem semaphore which notifies pipe status
  change.
---
 winsup/cygwin/fhandler.cc      |  1 +
 winsup/cygwin/fhandler.h       |  3 +++
 winsup/cygwin/fhandler_pipe.cc | 25 +++++++++++++++++++++++++
 winsup/cygwin/select.cc        | 10 ++++++++--
 4 files changed, 37 insertions(+), 2 deletions(-)

diff --git a/winsup/cygwin/fhandler.cc b/winsup/cygwin/fhandler.cc
index f0c1b68f1..39fe2640a 100644
--- a/winsup/cygwin/fhandler.cc
+++ b/winsup/cygwin/fhandler.cc
@@ -1464,6 +1464,7 @@ fhandler_base::fhandler_base () :
   _refcnt (0),
   openflags (0),
   unique_id (0),
+  select_sem (NULL),
   archetype (NULL),
   usecount (0)
 {
diff --git a/winsup/cygwin/fhandler.h b/winsup/cygwin/fhandler.h
index 7aed089eb..d309be2f7 100644
--- a/winsup/cygwin/fhandler.h
+++ b/winsup/cygwin/fhandler.h
@@ -217,6 +217,7 @@ class fhandler_base
   void set_ino (ino_t i) { ino = i; }
 
   HANDLE read_state;
+  HANDLE select_sem;
 
  public:
   LONG inc_refcnt () {return InterlockedIncrement (&_refcnt);}
@@ -520,6 +521,8 @@ public:
     fh->copy_from (this);
     return fh;
   }
+
+  HANDLE get_select_sem () { return select_sem; }
 };
 
 struct wsa_event
diff --git a/winsup/cygwin/fhandler_pipe.cc b/winsup/cygwin/fhandler_pipe.cc
index c75476040..ea8ea41dd 100644
--- a/winsup/cygwin/fhandler_pipe.cc
+++ b/winsup/cygwin/fhandler_pipe.cc
@@ -376,6 +376,8 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
     }
   else if (status == STATUS_THREAD_CANCELED)
     pthread::static_cancel_self ();
+  if (select_sem && nbytes)
+    ReleaseSemaphore (select_sem, get_obj_handle_count (select_sem), NULL);
   len = nbytes;
 }
 
@@ -507,6 +509,8 @@ fhandler_pipe_fifo::raw_write (const void *ptr, size_t len)
     set_errno (EINTR);
   else if (status == STATUS_THREAD_CANCELED)
     pthread::static_cancel_self ();
+  if (select_sem && nbytes)
+    ReleaseSemaphore (select_sem, get_obj_handle_count (select_sem), NULL);
   return nbytes ?: -1;
 }
 
@@ -515,6 +519,8 @@ fhandler_pipe::fixup_after_fork (HANDLE parent)
 {
   if (read_mtx)
     fork_fixup (parent, read_mtx, "read_mtx");
+  if (select_sem)
+    fork_fixup (parent, select_sem, "select_sem");
   fhandler_base::fixup_after_fork (parent);
 }
 
@@ -536,6 +542,15 @@ fhandler_pipe::dup (fhandler_base *child, int flags)
       ftp->close ();
       res = -1;
     }
+  else if (select_sem &&
+	   !DuplicateHandle (GetCurrentProcess (), select_sem,
+			    GetCurrentProcess (), &ftp->select_sem,
+			    0, !(flags & O_CLOEXEC), DUPLICATE_SAME_ACCESS))
+    {
+      __seterrno ();
+      ftp->close ();
+      res = -1;
+    }
 
   debug_printf ("res %d", res);
   return res;
@@ -546,6 +561,11 @@ fhandler_pipe::close ()
 {
   if (read_mtx)
     CloseHandle (read_mtx);
+  if (select_sem)
+    {
+      ReleaseSemaphore (select_sem, get_obj_handle_count (select_sem), NULL);
+      CloseHandle (select_sem);
+    }
   return fhandler_base::close ();
 }
 
@@ -765,6 +785,11 @@ fhandler_pipe::create (fhandler_pipe *fhs[2], unsigned psize, int mode)
 	  fhs[0]->set_read_mutex (mtx);
 	  res = 0;
 	}
+      fhs[0]->select_sem = CreateSemaphore (&sa, 0, INT32_MAX, NULL);
+      if (fhs[0]->select_sem)
+	DuplicateHandle (GetCurrentProcess (), fhs[0]->select_sem,
+			 GetCurrentProcess (), &fhs[1]->select_sem,
+			 0, 1, DUPLICATE_SAME_ACCESS);
     }
 
   debug_printf ("%R = pipe([%p, %p], %d, %y)", res, fhs[0], fhs[1], psize, mode);
diff --git a/winsup/cygwin/select.cc b/winsup/cygwin/select.cc
index c85ce748c..e9e71b269 100644
--- a/winsup/cygwin/select.cc
+++ b/winsup/cygwin/select.cc
@@ -762,7 +762,13 @@ start_thread_pipe (select_record *me, select_stuff *stuff)
     {
       pi->start = &stuff->start;
       pi->stop_thread = false;
-      pi->bye = CreateEvent (&sec_none_nih, TRUE, FALSE, NULL);
+      pi->bye = me->fh->get_select_sem ();
+      if (pi->bye)
+	DuplicateHandle (GetCurrentProcess (), pi->bye,
+			 GetCurrentProcess (), &pi->bye,
+			 0, 0, DUPLICATE_SAME_ACCESS);
+      else
+	pi->bye = CreateSemaphore (&sec_none_nih, 0, INT32_MAX, NULL);
       pi->thread = new cygthread (thread_pipe, pi, "pipesel");
       me->h = *pi->thread;
       if (!me->h)
@@ -780,7 +786,7 @@ pipe_cleanup (select_record *, select_stuff *stuff)
   if (pi->thread)
     {
       pi->stop_thread = true;
-      SetEvent (pi->bye);
+      ReleaseSemaphore (pi->bye, get_obj_handle_count (pi->bye), NULL);
       pi->thread->detach ();
       CloseHandle (pi->bye);
     }
-- 
2.33.0


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-08 10:45                             ` Takashi Yano
@ 2021-09-08 10:51                               ` Corinna Vinschen
  2021-09-09  3:21                                 ` Takashi Yano
  0 siblings, 1 reply; 250+ messages in thread
From: Corinna Vinschen @ 2021-09-08 10:51 UTC (permalink / raw)
  To: cygwin-developers

On Sep  8 19:45, Takashi Yano wrote:
> Hi Corinna,
> 
> On Wed, 8 Sep 2021 12:04:21 +0200
> Corinna Vinschen wrote:
> > On Sep  8 18:45, Takashi Yano wrote:
> > > On Wed, 8 Sep 2021 11:26:21 +0200
> > > Corinna Vinschen wrote:
> > > > On Sep  8 11:01, Corinna Vinschen wrote:
> > > > > What exactly is the problem in the code which results in high CPU
> > > > > load?  Can you explain this a bit?  Maybe we need an entirely
> > > > > different approach to avoid that.
> > > > 
> > > > I saw your new patch, but I don't see the problem.  I typed a lot of
> > > > keys in mintty quickly and what happens is that the load of mintty
> > > > goes up to 9% on a 4 CPU system, but only temporarily while typing.
> > > > How do you reproduce the problem?
> > > 
> > > Did you apply the patch
> > > 0001-Cygwin-select-Introduce-select_evt-event-for-pipe.patch
> > > or
> > > 0001-Cygwin-select-Introduce-select_sem-semaphore-for-pip.patch
> > > ?
> > > 
> > > With these patch, the problem does not occur. The problem occurs
> > > with the commit dccde0dc.
> > 
> > dccde0dc is 23bb19efcc45 in topic/pipe ATM (after force push) so, yes,
> > I'm running this with topic/pipe HEAD including this patch.
> > 
> > > With my 4 core 8 thread CPU, CPU loads goes up to 12-13 % if
> > > I type keys using key repeat (30cps) after the commit dccde0dc.
> > 
> > Oh, wow!  As I wrote above, before applying "Cygwin: select: Introduce
> > select_sem semaphore for pipe." I only saw a 9% load.  After applying
> > the patch I saw the same load.
> > 
> > I don't know what I did differently, but after reverting your semaphore
> > patch I now see loads of up to 30%.  So, never mind, apparently I tested
> > wrongly before.  Your patch reduces the load tremendously.
> 
> Thanks for testing again.
> 
> > Just one question.  Would you mind to split your patch into two parts,
> > one being just the revert of your "Improve select/poll response." patch
> > and one introducing select_sem?
> 
> I split the patch as you advised.

Pushed.


Thanks,
Corinna




* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-02 19:35                                                                                   ` Corinna Vinschen
  2021-09-02 20:19                                                                                     ` Ken Brown
  2021-09-03 10:00                                                                                     ` Takashi Yano
@ 2021-09-08 11:32                                                                                     ` Takashi Yano
  2021-09-08 11:55                                                                                       ` Corinna Vinschen
  2 siblings, 1 reply; 250+ messages in thread
From: Takashi Yano @ 2021-09-08 11:32 UTC (permalink / raw)
  To: cygwin-developers

Hi Corinna,

On Thu, 2 Sep 2021 21:35:21 +0200
Corinna Vinschen wrote:
> On Sep  2 21:00, Corinna Vinschen wrote:
> > On Sep  2 09:01, Ken Brown wrote:
> > > On 9/2/2021 4:17 AM, Corinna Vinschen wrote:
> > > > What if the readers never request more than, say, 50 or even 25% of the
> > > > available buffer space?  Our buffer is 64K and there's no guarantee that
> > > > any read > PIPE_BUF (== 4K) is atomic anyway.  This can work without
> > > > having to check the other side of the pipe.  Something like this,
> > > > ignoring border cases:
> > > > 
> > > > pipe::create()
> > > > {
> > > >     [...]
> > > >     mutex = CreateMutex();
> > > > }
> > > > 
> > > > pipe::raw_read(char *buf, size_t num_requested)
> > > > {
> > > >    if (blocking)
> > > >      {
> > > >        WFSO(mutex);
> > > >        NtQueryInformationFile(FilePipeLocalInformation);
> > > >        if (!fpli.ReadDataAvailable
> > > > 	  && num_requested > fpli.InboundQuota / 4)
> > > > 	num_requested = fpli.InboundQuota / 4;
> > > >        NtReadFile(pipe, buf, num_requested);
> > > >        ReleaseMutex(mutex);
> > > >      }
> > > > }
> > > > 
> > > > It's not entirely foolproof, but it should fix 99% of the cases.
> > > 
> > > I like it!
> > > 
> > > Do you think there's anything we can or should do to avoid a deadlock in the
> > > rare cases where this fails?  The only thing I can think of immediately is
> > > to always impose a timeout if select is called with infinite timeout on the
> > > write side of a pipe, after which we report that the pipe is write ready.
> > > After all, we've lived since 2008 with a bug that caused select to *always*
> > > report write ready.
> > 
> > Indeed.  Hmm.  What timeout are you thinking of?  Seconds?  Minutes?
> > 
> > > Alternatively, we could just wait and see if there's an actual use case in
> > > which someone encounters a deadlock.
> > 
> > Or that.  Fixing up select isn't too hard in that case, I guess.
> 
> It's getting too late again.  I drop off for tonight, but I attached
> my POC code I have so far.  It also adds the snippets from my previous
> patch which fixes stuff Takashi found during testing.  It also fixes
> something which looks like a bug in raw_write:
> 
> -	  ptr = ((char *) ptr) + chunk;
> +	  ptr = ((char *) ptr) + nbytes_now;
> 
> Incrementing ptr by chunk bytes while only nbytes_now have been written
> looks incorrect.
> 
> As for the reader, it makes the # of bytes to read dependent on the
> number of reader handles.  I don't know if that's such a bright idea,
> but this can be changed easily.
> 
> Anyway, this runs all my testcases successfully but they are anything
> but thorough.
> 
> Patch relativ to topic/pipe attached.  Would you both mind to take a
> scrutinizing look?

Sorry for replying to an old post.

This patch introduced read_mtx. The handle is initialized only for the
read side of the pipe, yet on the write side it seems to be NULL even
without initialization. I wonder why initializing read_mtx in the
constructor is not necessary.

How do you guarantee that read_mtx is NULL on the write pipe?

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-08 11:32                                                                                     ` Takashi Yano
@ 2021-09-08 11:55                                                                                       ` Corinna Vinschen
  2021-09-08 12:33                                                                                         ` Takashi Yano
  2021-09-08 17:43                                                                                         ` Ken Brown
  0 siblings, 2 replies; 250+ messages in thread
From: Corinna Vinschen @ 2021-09-08 11:55 UTC (permalink / raw)
  To: cygwin-developers

On Sep  8 20:32, Takashi Yano wrote:
> Hi Corinna,
> 
> On Thu, 2 Sep 2021 21:35:21 +0200
> Corinna Vinschen wrote:
> > It's getting too late again.  I drop off for tonight, but I attached
> > my POC code I have so far.  It also adds the snippets from my previous
> > patch which fixes stuff Takashi found during testing.
> > [...]
> > Patch relativ to topic/pipe attached.  Would you both mind to take a
> > scrutinizing look?
> 
> Sorry for replying to old post.
> 
> As for this patch, read_mtx was introduced. This handle is initialized
> only for read pipe. However, this seems to be NULL even without
> initialization in write pipe. I wonder why initializing read_mtx in
> the constructor is not necessary.
> 
> How do you guarantee that read_mtx is NULL on the write pipe?

fhandlers are always calloc'ed on the cygheap, so you don't have
to initialize anything to NULL.  It doesn't hurt, of course, but
it's certainly not required.


Corinna

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-08 11:55                                                                                       ` Corinna Vinschen
@ 2021-09-08 12:33                                                                                         ` Takashi Yano
  2021-09-08 17:43                                                                                         ` Ken Brown
  1 sibling, 0 replies; 250+ messages in thread
From: Takashi Yano @ 2021-09-08 12:33 UTC (permalink / raw)
  To: cygwin-developers

On Wed, 8 Sep 2021 13:55:41 +0200
Corinna Vinschen wrote:
> On Sep  8 20:32, Takashi Yano wrote:
> > Hi Corinna,
> > 
> > On Thu, 2 Sep 2021 21:35:21 +0200
> > Corinna Vinschen wrote:
> > > It's getting too late again.  I drop off for tonight, but I attached
> > > my POC code I have so far.  It also adds the snippets from my previous
> > > patch which fixes stuff Takashi found during testing.
> > > [...]
> > > Patch relativ to topic/pipe attached.  Would you both mind to take a
> > > scrutinizing look?
> > 
> > Sorry for replying to old post.
> > 
> > As for this patch, read_mtx was introduced. This handle is initialized
> > only for read pipe. However, this seems to be NULL even without
> > initialization in write pipe. I wonder why initializing read_mtx in
> > the constructor is not necessary.
> > 
> > How do you guarantee that read_mtx is NULL on the write pipe?
> 
> fhandlers are always calloc'ed on the cygheap, so you don't have
> to initialize anything to NULL.  It doesn't hurt, of course, but
> it's certainly not required.

Thanks for the answer. I understand.


-- 
Takashi Yano <takashi.yano@nifty.ne.jp>

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-08 11:55                                                                                       ` Corinna Vinschen
  2021-09-08 12:33                                                                                         ` Takashi Yano
@ 2021-09-08 17:43                                                                                         ` Ken Brown
  2021-09-08 18:28                                                                                           ` Corinna Vinschen
  1 sibling, 1 reply; 250+ messages in thread
From: Ken Brown @ 2021-09-08 17:43 UTC (permalink / raw)
  To: cygwin-developers

On 9/8/2021 7:55 AM, Corinna Vinschen wrote:
> On Sep  8 20:32, Takashi Yano wrote:
>> As for this patch, read_mtx was introduced. This handle is initialized
>> only for read pipe. However, this seems to be NULL even without
>> initialization in write pipe. I wonder why initializing read_mtx in
>> the constructor is not necessary.
>>
>> How do you guarantee that read_mtx is NULL on the write pipe?
> 
> fhandlers are always calloc'ed on the cygheap, so you don't have
> to initialize anything to NULL.  It doesn't hurt, of course, but
> it's certainly not required.

Takashi and I both asked this question, and you had to answer it twice.  It's 
likely that future readers of the code will ask it again.  Would you be amenable 
to a code cleanup patch that answers it once and for all?  I would suggest (a) 
removing all 0/NULL initializers from fhandler constructors and (b) adding 
comments explaining why they're not needed.

Ken

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-08 17:43                                                                                         ` Ken Brown
@ 2021-09-08 18:28                                                                                           ` Corinna Vinschen
  0 siblings, 0 replies; 250+ messages in thread
From: Corinna Vinschen @ 2021-09-08 18:28 UTC (permalink / raw)
  To: cygwin-developers

On Sep  8 13:43, Ken Brown wrote:
> On 9/8/2021 7:55 AM, Corinna Vinschen wrote:
> > On Sep  8 20:32, Takashi Yano wrote:
> > > As for this patch, read_mtx was introduced. This handle is initialized
> > > only for read pipe. However, this seems to be NULL even without
> > > initialization in write pipe. I wonder why initializing read_mtx in
> > > the constructor is not necessary.
> > > 
> > > How do you guarantee that read_mtx is NULL on the write pipe?
> > 
> > fhandlers are always calloc'ed on the cygheap, so you don't have
> > to initialize anything to NULL.  It doesn't hurt, of course, but
> > it's certainly not required.
> 
> Takashi and I both asked this question, and you had to answer it twice.
> It's likely that future readers of the code will ask it again.

Yes, but ... it's not hidden knowledge.  You're using the function
build_fh_dev.  Looking into dtable.cc shows build_fh_dev calls
build_fh_pc which in turn calls fh_alloc.  fh_alloc uses the cnew macro,
defined in the same file, which expands to

  void* ptr = (void*) ccalloc (HEAP_FHANDLER, 1, sizeof (fhandler_pipe));
  new (ptr) fhandler_pipe (...);

So the constructor is called on a memory slot on the cygheap, allocated
with ccalloc, the cygheap equivalent to calloc on the user heap.
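
A minimal stand-alone sketch of that allocation pattern, for illustration
only. handler_sketch, alloc_handler_sketch() and read_mtx_is_null() are
hypothetical names, and std::calloc stands in for ccalloc; none of this is
Cygwin code. Note that ISO C++ does not promise that a member skipped by
the constructor keeps the zero bytes underneath it, so this demonstrates
the convention described above rather than portable C++. The check reads
the member through a volatile pointer so that the compiler performs a real
load of the calloc'ed memory.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>

// Hypothetical stand-in for an fhandler class; not Cygwin's real type.
struct handler_sketch
{
  void *read_mtx; // deliberately never mentioned in the constructor
  int unit;
  explicit handler_sketch (int u) : unit (u) {}
};

// Same two steps as the cnew expansion quoted above: zeroed allocation,
// then placement new on that memory.
handler_sketch *alloc_handler_sketch (int unit)
{
  void *ptr = std::calloc (1, sizeof (handler_sketch));
  if (!ptr)
    return nullptr;
  return new (ptr) handler_sketch (unit);
}

// Volatile read: forces an actual load of the bytes calloc zeroed.
bool read_mtx_is_null (const handler_sketch *h)
{
  void *const volatile *p = &h->read_mtx;
  return *p == nullptr;
}

// Self-check: the constructor set unit, and read_mtx is still all zero.
void check_alloc_sketch ()
{
  handler_sketch *h = alloc_handler_sketch (3);
  assert (h != nullptr);
  assert (h->unit == 3);
  assert (read_mtx_is_null (h));
  h->~handler_sketch ();
  std::free (h);
}
```

The point of the sketch is only that the zeroing happens in the allocator,
before any constructor runs, which is why a NULL initializer in the
constructor adds nothing under this convention.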

> Would you be
> amenable to a code cleanup patch that answers it once and for all?  I would
> suggest (a) removing all 0/NULL initializers from fhandler constructors and
> (b) adding comments explaining why they're not needed.

a) ok

b) Commenting on the same issue in every single fhandler_* file appears a
   bit excessive, IMHO.  Only one comment in fhandler.h should suffice.


Corinna

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-08 10:51                               ` Corinna Vinschen
@ 2021-09-09  3:21                                 ` Takashi Yano
  2021-09-09  9:37                                   ` Corinna Vinschen
  0 siblings, 1 reply; 250+ messages in thread
From: Takashi Yano @ 2021-09-09  3:21 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 2284 bytes --]

On Wed, 8 Sep 2021 12:51:31 +0200
Corinna Vinschen wrote:
> On Sep  8 19:45, Takashi Yano wrote:
> > Hi Corinna,
> > 
> > On Wed, 8 Sep 2021 12:04:21 +0200
> > Corinna Vinschen wrote:
> > > On Sep  8 18:45, Takashi Yano wrote:
> > > > On Wed, 8 Sep 2021 11:26:21 +0200
> > > > Corinna Vinschen wrote:
> > > > > On Sep  8 11:01, Corinna Vinschen wrote:
> > > > > > What exactly is the problem in the code which results in high CPU
> > > > > > load?  Can you explain this a bit?  Maybe we need an entirely
> > > > > > different approach to avoid that.
> > > > > 
> > > > > I saw your new patch, but I don't see the problem.  I typed a lot of
> > > > > keys in mintty quickly and what happens is that the load of mintty
> > > > > goes up to 9% on a 4 CPU system, but only temporarily while typing.
> > > > > How do you reproduce the problem?
> > > > 
> > > > Did you apply the patch
> > > > 0001-Cygwin-select-Introduce-select_evt-event-for-pipe.patch
> > > > or
> > > > 0001-Cygwin-select-Introduce-select_sem-semaphore-for-pip.patch
> > > > ?
> > > > 
> > > > With these patch, the problem does not occur. The problem occurs
> > > > with the commit dccde0dc.
> > > 
> > > dccde0dc is 23bb19efcc45 in topic/pipe ATM (after force push) so, yes,
> > > I'm running this with topic/pipe HEAD including this patch.
> > > 
> > > > With my 4 core 8 thread CPU, CPU loads goes up to 12-13 % if
> > > > I type keys using key repeat (30cps) after the commit dccde0dc.
> > > 
> > > Oh, wow!  As I wrote above, before applying "Cygwin: select: Introduce
> > > select_sem semaphore for pipe." I only saw a 9% load.  After applying
> > > the patch I saw the same load.
> > > 
> > > I don't know what I did differently, but after reverting your semaphore
> > > patch I now see loads of up to 30%.  So, never mind, apparently I tested
> > > wrongly before.  Your patch reduces the load tremendously.
> > 
> > Thanks for testing again.
> > 
> > > Just one question.  Would you mind to split your patch into two parts,
> > > one being just the revert of your "Improve select/poll response." patch
> > > and one introducing select_sem?
> > 
> > I split the patch as you advised.
> 
> Pushed.

The attached patch fixes the timing of the select_sem notification.


-- 
Takashi Yano <takashi.yano@nifty.ne.jp>

[-- Attachment #2: 0001-Cygwin-pipe-Fix-notification-timing-of-select_sem.patch --]
[-- Type: application/octet-stream, Size: 2381 bytes --]

From 56ff64eadfb539a18f4b7d5ffd3355dfb23391ab Mon Sep 17 00:00:00 2001
From: Takashi Yano <takashi.yano@nifty.ne.jp>
Date: Thu, 9 Sep 2021 12:08:39 +0900
Subject: [PATCH] Cygwin: pipe: Fix notification timing of select_sem.

- Make select_sem notify even when read/write partially.
---
 winsup/cygwin/fhandler_pipe.cc | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/winsup/cygwin/fhandler_pipe.cc b/winsup/cygwin/fhandler_pipe.cc
index ea8ea41dd..488e14be8 100644
--- a/winsup/cygwin/fhandler_pipe.cc
+++ b/winsup/cygwin/fhandler_pipe.cc
@@ -311,6 +311,9 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
 	    {
 	      status = STATUS_THREAD_SIGNALED;
 	      nbytes += io.Information;
+	      if (select_sem && io.Information > 0)
+		ReleaseSemaphore (select_sem,
+				  get_obj_handle_count (select_sem), NULL);
 	      break;
 	    }
 	  status = io.Status;
@@ -365,6 +368,8 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
 
       if (nbytes_now == 0)
 	break;
+      else if (select_sem)
+	ReleaseSemaphore (select_sem, get_obj_handle_count (select_sem), NULL);
     }
   ReleaseMutex (read_mtx);
   if (evt)
@@ -376,8 +381,6 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
     }
   else if (status == STATUS_THREAD_CANCELED)
     pthread::static_cancel_self ();
-  if (select_sem && nbytes)
-    ReleaseSemaphore (select_sem, get_obj_handle_count (select_sem), NULL);
   len = nbytes;
 }
 
@@ -472,6 +475,9 @@ fhandler_pipe_fifo::raw_write (const void *ptr, size_t len)
 	    {
 	      status = STATUS_THREAD_SIGNALED;
 	      nbytes += io.Information;
+	      if (select_sem && io.Information > 0)
+		ReleaseSemaphore (select_sem,
+				  get_obj_handle_count (select_sem), NULL);
 	      break;
 	    }
 	  status = io.Status;
@@ -502,6 +508,8 @@ fhandler_pipe_fifo::raw_write (const void *ptr, size_t len)
 
       if (nbytes_now == 0)
 	break;
+      else if (select_sem)
+	ReleaseSemaphore (select_sem, get_obj_handle_count (select_sem), NULL);
     }
   if (evt)
     CloseHandle (evt);
@@ -509,8 +517,6 @@ fhandler_pipe_fifo::raw_write (const void *ptr, size_t len)
     set_errno (EINTR);
   else if (status == STATUS_THREAD_CANCELED)
     pthread::static_cancel_self ();
-  if (select_sem && nbytes)
-    ReleaseSemaphore (select_sem, get_obj_handle_count (select_sem), NULL);
   return nbytes ?: -1;
 }
 
-- 
2.33.0


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-07  3:26             ` Takashi Yano
  2021-09-07 10:50               ` Takashi Yano
@ 2021-09-09  3:41               ` Takashi Yano
  2021-09-09  8:05                 ` Takashi Yano
  1 sibling, 1 reply; 250+ messages in thread
From: Takashi Yano @ 2021-09-09  3:41 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 3704 bytes --]

Hi Ken,

On Tue, 7 Sep 2021 12:26:31 +0900
Takashi Yano wrote:
> On Fri, 27 Aug 2021 20:24:40 +0900
> Takashi Yano wrote:
> > Hi Ken,
> > 
> > Thanks much! I tested topic/pipe branch.
> > 
> > [yano@cygwin-PC ~]$ scp test.dat yano@linux-server:.
> > yano@linux-server's password:
> > test.dat                                      100%  100MB  95.9MB/s   00:01
> > [yano@cygwin-PC ~]$ scp yano@linux-server:test.dat .
> > yano@linux-server's password:
> > test.dat                                      100%  100MB   8.0MB/s   00:12
> > 
> > yano@linux-server:~$ scp yano@cygwin-PC:test.dat .
> > yano@cygwin-PC's password:
> > test.dat                                      100%  100MB 109.7MB/s   00:00
> > yano@linux-server:~$ scp test.dat yano@cygwin-PC:.
> > yano@cygwin-PC's password:
> > test.dat                                      100%  100MB  31.4MB/s   00:03
> > 
> > As shown above, outgoing transfer-rate has been improved upto near
> > theoretical limit. However, incoming transfer-rate is not improved
> > much.
> > 
> > I digged further and found the first patch attached solves the issue
> > as follows.
> > 
> > [yano@cygwin-PC ~]$ scp yano@linux-server:test.dat .
> > yano@linux-server's password:
> > test.dat                                      100%  100MB 112.8MB/s   00:00
> > 
> > yano@linux-server2:~$ scp test.dat yano@cygwin-PC:.
> > yano@cygwin-PC's password:
> > test.dat                                      100%  100MB 102.5MB/s   00:00
> 
> With this patch (2e36ae2e), I found a problem that mintty gets into
> high load if several keys are typed quickly.
> 
> Therefore, I would like to propose a patch attached.

I found that the fifo has the same issue as the pipe: throughput
occasionally slows down.

Please try the simple test case attached (fifo_test.c).
The argument of fifo_test means:
0: with select, blocking I/O
1: with select, non-blocking I/O
2: without select, blocking I/O
3: without select, non-blocking I/O
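
Read against the attached fifo_test.c, the argument behaves as a two-bit
mask: bit 0 selects non-blocking I/O and bit 1 skips select(). A minimal
sketch of that decoding follows. The struct and the function name
decode_mode are hypothetical and only mirror the two atoi() bit tests in
the test program.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper, not part of fifo_test.c: decodes the mode
   argument the same way the test program's two bit tests do. */
struct fifo_test_mode {
	int nonblock;		/* bit 0: set O_NONBLOCK on both ends */
	int skip_select;	/* bit 1: read()/write() without select() */
};

static struct fifo_test_mode decode_mode(const char *arg)
{
	struct fifo_test_mode m = {0, 0};
	int v;

	assert(arg != NULL);
	v = atoi(arg);
	m.nonblock = (v & 1) != 0;
	m.skip_select = (v & 2) != 0;
	return m;
}
```

With this reading, "3" means non-blocking I/O with no select() call at
all, which is the busy-polling case that prints the 'w' and 'r' markers.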

In my environment, the results are as follows.

[yano@Express5800-S70 ~/pipe_test]$ ./fifo_test 0

Total: 100MB in 4.593770 second, 21.768613MB/s
[yano@Express5800-S70 ~/pipe_test]$ ./fifo_test 1

Total: 100MB in 0.564711 second, 177.081603MB/s
[yano@Express5800-S70 ~/pipe_test]$ ./fifo_test 2

Total: 100MB in 0.514730 second, 194.276724MB/s
[yano@Express5800-S70 ~/pipe_test]$ ./fifo_test 3
wwwwwwwwwwwwwwwwwwwwwwwwwwwwwrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr
Total: 100MB in 0.038880 second, 2572.003230MB/s

Therefore, I would like to propose the attached patch, which utilizes
select_sem for the fifo just as the pipe code does.

With the attached patch, the throughput of the fifo is much
improved, as follows.

[yano@Express5800-S70 ~/pipe_test]$ ./fifo_test 0

Total: 100MB in 0.076400 second, 1308.895384MB/s
[yano@Express5800-S70 ~/pipe_test]$ ./fifo_test 1

Total: 100MB in 0.064198 second, 1557.683351MB/s
[yano@Express5800-S70 ~/pipe_test]$ ./fifo_test 2

Total: 100MB in 0.021803 second, 4586.503754MB/s
[yano@Express5800-S70 ~/pipe_test]$ ./fifo_test 3
wwwwwwrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr
Total: 100MB in 0.021747 second, 4598.271969MB/s
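
For a rough sense of scale, the before and after figures in this mail can
be compared with a small helper. This is only an illustrative sketch:
speedup is a hypothetical name, and the arrays simply copy the MB/s values
printed by the fifo_test runs above, indexed by the fifo_test argument.

```c
#include <assert.h>

/* Hypothetical helper: ratio of patched to unpatched throughput (MB/s). */
static double speedup(double before_mbps, double after_mbps)
{
	assert(before_mbps > 0.0);
	return after_mbps / before_mbps;
}

/* Throughput copied from the fifo_test output in this mail,
   indexed by the fifo_test argument (0..3). */
static const double before_mbps[4] =
	{21.768613, 177.081603, 194.276724, 2572.003230};
static const double after_mbps[4] =
	{1308.895384, 1557.683351, 4586.503754, 4598.271969};
```

For mode 0 (select with blocking I/O) the ratio comes out at roughly 60,
while for mode 3 it is a little under 2. These are single runs on one
machine, so the ratios are indicative rather than a benchmark.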


-- 
Takashi Yano <takashi.yano@nifty.ne.jp>

[-- Attachment #2: fifo_test.c --]
[-- Type: text/x-csrc, Size: 2228 bytes --]

#include <unistd.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/wait.h>
#include <stdio.h>
#include <time.h>
#include <sys/select.h>
#include <fcntl.h>
#include <errno.h>

#define FIFO_NAME "/tmp/fifo1"
#define BLKSIZ 65536
#define NBLK (100*(1<<20)/BLKSIZ)


int main(int argc, char *argv[])
{
	pid_t pid;
	int nonblock = 0;
	int skip_select = 0;

	if (argc >= 2 && (atoi(argv[1]) & 1)) nonblock = 1;
	if (argc >= 2 && (atoi(argv[1]) & 2)) skip_select = 1;

	mkfifo(FIFO_NAME, 0644);

	if (!(pid = fork ())) {
		int fifo;
		char buf[BLKSIZ] = {0,};
		int i;
		fifo = open(FIFO_NAME, O_WRONLY);
		if (nonblock) {
			int flags;
			flags = fcntl(fifo, F_GETFL);
			flags |= O_NONBLOCK;
			fcntl(fifo, F_SETFL, flags);
		}
		fd_set wfds;
		for (i=0; i<NBLK; i++) {
			int total = 0;
			while (total < sizeof(buf)) {
				FD_ZERO(&wfds);
				FD_SET(fifo, &wfds);
				if (skip_select || select(fifo+1, NULL, &wfds, NULL, NULL) > 0
						&& FD_ISSET(fifo, &wfds)) {
					errno = 0;
					ssize_t len = write(fifo, buf + total, sizeof(buf) - total);
					if (len <= 0 && errno == EAGAIN) printf("w");
					if (len > 0) total += len;
				}
			}
		}
		close(fifo);
	} else {
		int fifo;
		char buf[BLKSIZ] = {0,};
		int total = 0;
		fd_set rfds;
		struct timespec tv0, tv1;
		double elapsed;
		fifo = open(FIFO_NAME, O_RDONLY);
		if (nonblock) {
			int flags;
			flags = fcntl(fifo, F_GETFL);
			flags |= O_NONBLOCK;
			fcntl(fifo, F_SETFL, flags);
		}
		clock_gettime(CLOCK_MONOTONIC, &tv0);
		for (;;) {
			FD_ZERO(&rfds);
			FD_SET(fifo, &rfds);
			/* Read when select() is skipped or reports the fifo readable. */
			if (skip_select || (select(fifo+1, &rfds, NULL, NULL, NULL) > 0
					&& FD_ISSET(fifo, &rfds))) {
				errno = 0;
				ssize_t len = read(fifo, buf, sizeof(buf));
				/* Print one 'r' per read that would have blocked. */
				if (len <= 0 && errno == EAGAIN) printf("r");
				else if(len <= 0) break;	/* EOF or error */
				else total += len;
			}
		}
		clock_gettime(CLOCK_MONOTONIC, &tv1);
		close(fifo);
		elapsed = (tv1.tv_sec - tv0.tv_sec) + (tv1.tv_nsec - tv0.tv_nsec)*1e-9;
		printf("\nTotal: %dMB in %f second, %fMB/s\n",
			total/(1<<20), elapsed, total/(1<<20)/elapsed);
		if (total != NBLK*BLKSIZ) {
			printf("Error: %d/%d (%f)\n",
				total, NBLK*BLKSIZ, (double)(total-NBLK*BLKSIZ)/BLKSIZ);
		}
		waitpid(pid, NULL, 0);
		unlink(FIFO_NAME);
	}

	return 0;
}


[-- Attachment #3: 0001-Cygwin-fifo-Utilize-select_sem-for-fifo-as-well-as-p.patch --]
[-- Type: application/octet-stream, Size: 4455 bytes --]

From beeae61f97145d795c4479b0d46cd3b75d6256c4 Mon Sep 17 00:00:00 2001
From: Takashi Yano <takashi.yano@nifty.ne.jp>
Date: Thu, 9 Sep 2021 11:53:11 +0900
Subject: [PATCH] Cygwin: fifo: Utilize select_sem for fifo as well as pipe.

---
 winsup/cygwin/fhandler_fifo.cc | 37 +++++++++++++++++++++++++++++++++-
 winsup/cygwin/select.cc        | 10 +++++++--
 2 files changed, 44 insertions(+), 3 deletions(-)

diff --git a/winsup/cygwin/fhandler_fifo.cc b/winsup/cygwin/fhandler_fifo.cc
index 6709fb974..c40573783 100644
--- a/winsup/cygwin/fhandler_fifo.cc
+++ b/winsup/cygwin/fhandler_fifo.cc
@@ -1047,6 +1047,12 @@ writer_shmem:
   ResetEvent (writer_opening);
   nwriters_unlock ();
 success:
+  if (!select_sem)
+    {
+      char name[MAX_PATH];
+      __small_sprintf(name, "semaphore-%W", get_pipe_name ()->Buffer);
+      select_sem = CreateSemaphore (&sec_none, 0, INT32_MAX, name);
+    }
   return 1;
 err_close_reader:
   saved_errno = get_errno ();
@@ -1235,6 +1241,10 @@ fhandler_fifo::raw_read (void *in_ptr, size_t& len)
 		  len = io.Information;
 		  fifo_client_unlock ();
 		  reading_unlock ();
+		  if (select_sem)
+		    ReleaseSemaphore (select_sem,
+				      get_obj_handle_count (select_sem),
+				      NULL);
 		  return;
 		}
 	      break;
@@ -1273,6 +1283,10 @@ fhandler_fifo::raw_read (void *in_ptr, size_t& len)
 		    fc_handler[i].last_read = true;
 		    fifo_client_unlock ();
 		    reading_unlock ();
+		    if (select_sem)
+		      ReleaseSemaphore (select_sem,
+					get_obj_handle_count (select_sem),
+					NULL);
 		    return;
 		  }
 		break;
@@ -1312,6 +1326,10 @@ fhandler_fifo::raw_read (void *in_ptr, size_t& len)
 		    fc_handler[i].last_read = true;
 		    fifo_client_unlock ();
 		    reading_unlock ();
+		    if (select_sem)
+		      ReleaseSemaphore (select_sem,
+					get_obj_handle_count (select_sem),
+					NULL);
 		    return;
 		  }
 		break;
@@ -1345,7 +1363,7 @@ maybe_retry:
       else
 	{
 	  /* Allow interruption and don't hog the CPU. */
-	  DWORD waitret = cygwait (NULL, 1, cw_cancel | cw_sig_eintr);
+	  DWORD waitret = cygwait (select_sem, 1, cw_cancel | cw_sig_eintr);
 	  if (waitret == WAIT_CANCELED)
 	    pthread::static_cancel_self ();
 	  else if (waitret == WAIT_SIGNALED)
@@ -1569,6 +1587,11 @@ fhandler_fifo::close ()
     NtClose (write_ready);
   if (writer_opening)
     NtClose (writer_opening);
+  if (select_sem)
+    {
+      ReleaseSemaphore (select_sem, get_obj_handle_count (select_sem), NULL);
+      NtClose (select_sem);
+    }
   if (nohandle ())
     return 0;
   else
@@ -1683,7 +1706,17 @@ fhandler_fifo::dup (fhandler_base *child, int flags)
     }
   if (writer)
     inc_nwriters ();
+  if (select_sem &&
+      !DuplicateHandle (GetCurrentProcess (), select_sem,
+			GetCurrentProcess (), &fhf->select_sem,
+			0, !(flags & O_CLOEXEC), DUPLICATE_SAME_ACCESS))
+    {
+      __seterrno ();
+      goto err_close_thr_sync_evt;
+    }
   return 0;
+err_close_thr_sync_evt:
+  NtClose (fhf->thr_sync_evt);
 err_close_cancel_evt:
   NtClose (fhf->cancel_evt);
 err_close_update_needed_evt:
@@ -1743,6 +1776,8 @@ fhandler_fifo::fixup_after_fork (HANDLE parent)
       me.winpid = GetCurrentProcessId ();
       new cygthread (fifo_reader_thread, this, "fifo_reader", thr_sync_evt);
     }
+  if (select_sem)
+    fork_fixup (parent, select_sem, "select_sem");
   if (writer)
     inc_nwriters ();
 }
diff --git a/winsup/cygwin/select.cc b/winsup/cygwin/select.cc
index e9e71b269..5e583434c 100644
--- a/winsup/cygwin/select.cc
+++ b/winsup/cygwin/select.cc
@@ -963,7 +963,13 @@ start_thread_fifo (select_record *me, select_stuff *stuff)
     {
       pi->start = &stuff->start;
       pi->stop_thread = false;
-      pi->bye = CreateEvent (&sec_none_nih, TRUE, FALSE, NULL);
+      pi->bye = me->fh->get_select_sem ();
+      if (pi->bye)
+	DuplicateHandle (GetCurrentProcess (), pi->bye,
+			 GetCurrentProcess (), &pi->bye,
+			 0, 0, DUPLICATE_SAME_ACCESS);
+      else
+	pi->bye = CreateSemaphore (&sec_none_nih, 0, INT32_MAX, NULL);
       pi->thread = new cygthread (thread_fifo, pi, "fifosel");
       me->h = *pi->thread;
       if (!me->h)
@@ -981,7 +987,7 @@ fifo_cleanup (select_record *, select_stuff *stuff)
   if (pi->thread)
     {
       pi->stop_thread = true;
-      SetEvent (pi->bye);
+      ReleaseSemaphore (pi->bye, get_obj_handle_count (pi->bye), NULL);
       pi->thread->detach ();
       CloseHandle (pi->bye);
     }
-- 
2.33.0


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-09  3:41               ` Takashi Yano
@ 2021-09-09  8:05                 ` Takashi Yano
  2021-09-09 12:19                   ` Takashi Yano
  0 siblings, 1 reply; 250+ messages in thread
From: Takashi Yano @ 2021-09-09  8:05 UTC (permalink / raw)
  To: cygwin-developers

On Thu, 9 Sep 2021 12:41:15 +0900
Takashi Yano wrote:
> diff --git a/winsup/cygwin/fhandler_fifo.cc b/winsup/cygwin/fhandler_fifo.cc
> index 6709fb974..c40573783 100644
> --- a/winsup/cygwin/fhandler_fifo.cc
> +++ b/winsup/cygwin/fhandler_fifo.cc
> @@ -1047,6 +1047,12 @@ writer_shmem:
>    ResetEvent (writer_opening);
>    nwriters_unlock ();
>  success:
> +  if (!select_sem)
> +    {
> +      char name[MAX_PATH];
> +      __small_sprintf(name, "semaphore-%W", get_pipe_name ()->Buffer);
> +      select_sem = CreateSemaphore (&sec_none, 0, INT32_MAX, name);
> +    }
>    return 1;
>  err_close_reader:
>    saved_errno = get_errno ();

Should this be:
> +      select_sem = CreateSemaphore (sa_buf, 0, INT32_MAX, name);
?

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-09  3:21                                 ` Takashi Yano
@ 2021-09-09  9:37                                   ` Corinna Vinschen
  2021-09-09 10:55                                     ` Takashi Yano
  0 siblings, 1 reply; 250+ messages in thread
From: Corinna Vinschen @ 2021-09-09  9:37 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 1200 bytes --]

On Sep  9 12:21, Takashi Yano wrote:
> On Wed, 8 Sep 2021 12:51:31 +0200
> Corinna Vinschen wrote:
> > > > Just one question.  Would you mind to split your patch into two parts,
> > > > one being just the revert of your "Improve select/poll response." patch
> > > > and one introducing select_sem?
> > > 
> > > I split the patch as you advised.
> > 
> > Pushed.
> 
> Timing of select_sem notification is fixed by the patch attached.

If I'm not entirely off-track, I think this isn't quite right yet.
I assume you want to release the semaphore in all cases some bytes
have been read or written, right?

If so, this should cover the STATUS_THREAD_CANCELED and
STATUS_BUFFER_OVERFLOW cases as well.  But then again, the
ReleaseSemaphore calls are a bit spread out over the calls.

I took the liberty to create a followup patch to the attached one.  It
merges all cases potentially reading or writing some bytes into a single
if branch, so only a single ReleaseSemaphore should be required.  I
dropped STATUS_MORE_ENTRIES because it's not an error, subsumed under
NT_SUCCESS, and IIUC, never emitted by the underlying code.
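
The point about STATUS_MORE_ENTRIES follows from how NT_SUCCESS classifies
status codes by sign. The sketch below illustrates that, together with the
folded condition for the read path. It is not the Cygwin source: the
function name may_have_moved_bytes_on_read is hypothetical, and the values
given for STATUS_THREAD_SIGNALED and STATUS_THREAD_CANCELED are assumed
stand-ins for Cygwin's private codes.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t NTSTATUS;

/* Success and informational codes are non-negative when the status is
   viewed as a signed 32-bit value; warnings and errors are negative. */
#define NT_SUCCESS(status) (((NTSTATUS)(status)) >= 0)

/* Values as documented by Microsoft.  The cast of 0x80000005 relies on
   the usual two's-complement wraparound. */
#define STATUS_MORE_ENTRIES	((NTSTATUS)0x00000105)
#define STATUS_BUFFER_OVERFLOW	((NTSTATUS)0x80000005)

/* Assumed stand-ins for Cygwin's private status codes. */
#define STATUS_THREAD_SIGNALED	((NTSTATUS)0xE0000001)
#define STATUS_THREAD_CANCELED	((NTSTATUS)0xE0000002)

/* Hypothetical predicate mirroring the folded "if" in the read path:
   nonzero when io.Information may hold a byte count worth signalling. */
static int may_have_moved_bytes_on_read(NTSTATUS status)
{
	return NT_SUCCESS(status)
	       || status == STATUS_BUFFER_OVERFLOW
	       || status == STATUS_THREAD_CANCELED
	       || status == STATUS_THREAD_SIGNALED;
}

/* Sanity check of the severity split described above. */
static void check_severity(void)
{
	assert(NT_SUCCESS(STATUS_MORE_ENTRIES));
	assert(!NT_SUCCESS(STATUS_BUFFER_OVERFLOW));
}
```

STATUS_MORE_ENTRIES already satisfies NT_SUCCESS, so listing it separately
adds nothing, whereas STATUS_BUFFER_OVERFLOW has warning severity and has
to be named explicitly.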

I attached the patch to this mail, can you please check it?


Thanks,
Corinna

[-- Attachment #2: 0001-Cygwin-pipes-always-signal-select_sem-if-any-bytes-a.patch --]
[-- Type: text/patch, Size: 4540 bytes --]

From 11bea88b788aabb762b4e972a7f3fb2423fb32d8 Mon Sep 17 00:00:00 2001
From: Corinna Vinschen <corinna@vinschen.de>
Date: Thu, 9 Sep 2021 11:35:54 +0200
Subject: [PATCH] Cygwin: pipes: always signal select_sem if any bytes are read
 or written

Fold all code branches potentially having read or written data into
a single if branch, so signalling select_sem catches all cases.

Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
---
 winsup/cygwin/fhandler_pipe.cc | 64 ++++++++++++----------------------
 1 file changed, 23 insertions(+), 41 deletions(-)

diff --git a/winsup/cygwin/fhandler_pipe.cc b/winsup/cygwin/fhandler_pipe.cc
index 488e14be8c2b..6994a5dceb84 100644
--- a/winsup/cygwin/fhandler_pipe.cc
+++ b/winsup/cygwin/fhandler_pipe.cc
@@ -303,20 +303,11 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
 	    waitret = WAIT_OBJECT_0;
 
 	  if (waitret == WAIT_CANCELED)
-	    {
-	      status = STATUS_THREAD_CANCELED;
-	      break;
-	    }
+	    status = STATUS_THREAD_CANCELED;
 	  else if (waitret == WAIT_SIGNALED)
-	    {
-	      status = STATUS_THREAD_SIGNALED;
-	      nbytes += io.Information;
-	      if (select_sem && io.Information > 0)
-		ReleaseSemaphore (select_sem,
-				  get_obj_handle_count (select_sem), NULL);
-	      break;
-	    }
-	  status = io.Status;
+	    status = STATUS_THREAD_SIGNALED;
+	  else
+	    status = io.Status;
 	}
       if (isclosed ())  /* A signal handler might have closed the fd. */
 	{
@@ -326,11 +317,17 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
 	    __seterrno ();
 	  nbytes = (size_t) -1;
 	}
-      else if (NT_SUCCESS (status))
+      else if (NT_SUCCESS (status)
+	       || status == STATUS_BUFFER_OVERFLOW
+	       || status == STATUS_THREAD_CANCELED
+	       || status == STATUS_THREAD_SIGNALED)
 	{
 	  nbytes_now = io.Information;
 	  ptr = ((char *) ptr) + nbytes_now;
 	  nbytes += nbytes_now;
+	  if (select_sem && nbytes_now > 0)
+	    ReleaseSemaphore (select_sem,
+			      get_obj_handle_count (select_sem), NULL);
 	}
       else
 	{
@@ -341,13 +338,6 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
 	    case STATUS_PIPE_BROKEN:
 	      /* This is really EOF.  */
 	      break;
-	    case STATUS_MORE_ENTRIES:
-	    case STATUS_BUFFER_OVERFLOW:
-	      /* `io.Information' is supposedly valid.  */
-	      nbytes_now = io.Information;
-	      ptr = ((char *) ptr) + nbytes_now;
-	      nbytes += nbytes_now;
-	      break;
 	    case STATUS_PIPE_LISTENING:
 	    case STATUS_PIPE_EMPTY:
 	      if (nbytes != 0)
@@ -366,10 +356,8 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
 	    }
 	}
 
-      if (nbytes_now == 0)
+      if (nbytes_now == 0 || status == STATUS_BUFFER_OVERFLOW)
 	break;
-      else if (select_sem)
-	ReleaseSemaphore (select_sem, get_obj_handle_count (select_sem), NULL);
     }
   ReleaseMutex (read_mtx);
   if (evt)
@@ -467,20 +455,11 @@ fhandler_pipe_fifo::raw_write (const void *ptr, size_t len)
 	    waitret = WAIT_OBJECT_0;
 
 	  if (waitret == WAIT_CANCELED)
-	    {
-	      status = STATUS_THREAD_CANCELED;
-	      break;
-	    }
+	    status = STATUS_THREAD_CANCELED;
 	  else if (waitret == WAIT_SIGNALED)
-	    {
-	      status = STATUS_THREAD_SIGNALED;
-	      nbytes += io.Information;
-	      if (select_sem && io.Information > 0)
-		ReleaseSemaphore (select_sem,
-				  get_obj_handle_count (select_sem), NULL);
-	      break;
-	    }
-	  status = io.Status;
+	    status = STATUS_THREAD_SIGNALED;
+	  else
+	    status = io.Status;
 	}
       if (isclosed ())  /* A signal handler might have closed the fd. */
 	{
@@ -489,13 +468,18 @@ fhandler_pipe_fifo::raw_write (const void *ptr, size_t len)
 	  else
 	    __seterrno ();
 	}
-      else if (NT_SUCCESS (status))
+      else if (NT_SUCCESS (status)
+	       || status == STATUS_THREAD_CANCELED
+	       || status == STATUS_THREAD_SIGNALED)
 	{
 	  nbytes_now = io.Information;
 	  ptr = ((char *) ptr) + nbytes_now;
 	  nbytes += nbytes_now;
+	  if (select_sem && nbytes_now > 0)
+	    ReleaseSemaphore (select_sem,
+			      get_obj_handle_count (select_sem), NULL);
 	  /* 0 bytes returned?  EAGAIN.  See above. */
-	  if (nbytes == 0)
+	  if (NT_SUCCESS (status) && nbytes == 0)
 	    set_errno (EAGAIN);
 	}
       else if (STATUS_PIPE_IS_CLOSED (status))
@@ -508,8 +492,6 @@ fhandler_pipe_fifo::raw_write (const void *ptr, size_t len)
 
       if (nbytes_now == 0)
 	break;
-      else if (select_sem)
-	ReleaseSemaphore (select_sem, get_obj_handle_count (select_sem), NULL);
     }
   if (evt)
     CloseHandle (evt);
-- 
2.31.1


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-09  9:37                                   ` Corinna Vinschen
@ 2021-09-09 10:55                                     ` Takashi Yano
  2021-09-09 11:41                                       ` Corinna Vinschen
  0 siblings, 1 reply; 250+ messages in thread
From: Takashi Yano @ 2021-09-09 10:55 UTC (permalink / raw)
  To: cygwin-developers

Hi Corinna,

On Thu, 9 Sep 2021 11:37:06 +0200
Corinna Vinschen wrote:
> On Sep  9 12:21, Takashi Yano wrote:
> > On Wed, 8 Sep 2021 12:51:31 +0200
> > Corinna Vinschen wrote:
> > > > > Just one question.  Would you mind to split your patch into two parts,
> > > > > one being just the revert of your "Improve select/poll response." patch
> > > > > and one introducing select_sem?
> > > > 
> > > > I split the patch as you advised.
> > > 
> > > Pushed.
> > 
> > Timing of select_sem notification is fixed by the patch attached.
> 
> If I'm not entirely off-track, I think this isn't quite right yet.
> I assume you want to release the semaphore in all cases some bytes
> have been read or written, right?

Right.

> If so, this should cover the STATUS_THREAD_CANCELED and
> STATUS_BUFFER_OVERFLOW cases as well.  But then again, the
> ReleaseSemaphore calls are a bit spread out over the calls.
> 
> I took the liberty to create a followup patch to the attached one.  It
> merges all cases potentially reading or writing some bytes into a single
> if branch, so only a single ReleaseSemaphore should be required.  I
> dropped STATUS_MORE_ENTRIES because it's not an error, subsumed under
> NT_SUCCESS, and IIUC, never emitted by the underlying code.
> 
> I attached the patch to this mail, can you please check it?

Thanks for the patch. LGTM. I also confirmed that your patch
solves the problem which I wanted to fix.

Thanks again.

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-09 10:55                                     ` Takashi Yano
@ 2021-09-09 11:41                                       ` Corinna Vinschen
  0 siblings, 0 replies; 250+ messages in thread
From: Corinna Vinschen @ 2021-09-09 11:41 UTC (permalink / raw)
  To: cygwin-developers

On Sep  9 19:55, Takashi Yano wrote:
> Hi Corinna,
> 
> On Thu, 9 Sep 2021 11:37:06 +0200
> Corinna Vinschen wrote:
> > On Sep  9 12:21, Takashi Yano wrote:
> > > On Wed, 8 Sep 2021 12:51:31 +0200
> > > Corinna Vinschen wrote:
> > > > > > Just one question.  Would you mind to split your patch into two parts,
> > > > > > one being just the revert of your "Improve select/poll response." patch
> > > > > > and one introducing select_sem?
> > > > > 
> > > > > I split the patch as you advised.
> > > > 
> > > > Pushed.
> > > 
> > > Timing of select_sem notification is fixed by the patch attached.
> > 
> > If I'm not entirely off-track, I think this isn't quite right yet.
> > I assume you want to release the semaphore in all cases some bytes
> > have been read or written, right?
> 
> Right.
> 
> > If so, this should cover the STATUS_THREAD_CANCELED and
> > STATUS_BUFFER_OVERFLOW cases as well.  But then again, the
> > ReleaseSemaphore calls are a bit spread out over the calls.
> > 
> > I took the liberty to create a followup patch to the attached one.  It
> > merges all cases potentially reading or writing some bytes into a single
> > if branch, so only a single ReleaseSemaphore should be required.  I
> > dropped STATUS_MORE_ENTRIES because it's not an error, subsumed under
> > NT_SUCCESS, and IIUC, never emitted by the underlying code.
> > 
> > I attached the patch to this mail, can you please check it?
> 
> Thanks for the patch. LGTM. I also confirmed that your patch
> solves the problem which I wanted to fix.

Great, I pushed both patches.


Thanks,
Corinna

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-09  8:05                 ` Takashi Yano
@ 2021-09-09 12:19                   ` Takashi Yano
  2021-09-09 12:42                     ` Takashi Yano
  0 siblings, 1 reply; 250+ messages in thread
From: Takashi Yano @ 2021-09-09 12:19 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 901 bytes --]

On Thu, 9 Sep 2021 17:05:49 +0900
Takashi Yano wrote:
> On Thu, 9 Sep 2021 12:41:15 +0900
> Takashi Yano wrote:
> > diff --git a/winsup/cygwin/fhandler_fifo.cc b/winsup/cygwin/fhandler_fifo.cc
> > index 6709fb974..c40573783 100644
> > --- a/winsup/cygwin/fhandler_fifo.cc
> > +++ b/winsup/cygwin/fhandler_fifo.cc
> > @@ -1047,6 +1047,12 @@ writer_shmem:
> >    ResetEvent (writer_opening);
> >    nwriters_unlock ();
> >  success:
> > +  if (!select_sem)
> > +    {
> > +      char name[MAX_PATH];
> > +      __small_sprintf(name, "semaphore-%W", get_pipe_name ()->Buffer);
> > +      select_sem = CreateSemaphore (&sec_none, 0, INT32_MAX, name);
> > +    }
> >    return 1;
> >  err_close_reader:
> >    saved_errno = get_errno ();
> 
> Should this be:
> > +      select_sem = CreateSemaphore (sa_buf, 0, INT32_MAX, name);
> ?

I revised the patch a bit.

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>

[-- Attachment #2: 0001-Cygwin-fifo-Utilize-select_sem-for-fifo-as-well-as-p.patch --]
[-- Type: application/octet-stream, Size: 4434 bytes --]

From b7be5fb11bac6258e1ac01eccfcfcf2401da1225 Mon Sep 17 00:00:00 2001
From: Takashi Yano <takashi.yano@nifty.ne.jp>
Date: Thu, 9 Sep 2021 17:16:50 +0900
Subject: [PATCH] Cygwin: fifo: Utilize select_sem for fifo as well as pipe.

---
 winsup/cygwin/fhandler_fifo.cc | 36 +++++++++++++++++++++++++++++++++-
 winsup/cygwin/select.cc        | 10 ++++++++--
 2 files changed, 43 insertions(+), 3 deletions(-)

diff --git a/winsup/cygwin/fhandler_fifo.cc b/winsup/cygwin/fhandler_fifo.cc
index 6709fb974..a6fcc38f4 100644
--- a/winsup/cygwin/fhandler_fifo.cc
+++ b/winsup/cygwin/fhandler_fifo.cc
@@ -1047,6 +1047,11 @@ writer_shmem:
   ResetEvent (writer_opening);
   nwriters_unlock ();
 success:
+  if (!select_sem)
+    {
+      __small_sprintf (npbuf, "semaphore.%08x.%016X", get_dev (), get_ino ());
+      select_sem = CreateSemaphore (sa_buf, 0, INT32_MAX, npbuf);
+    }
   return 1;
 err_close_reader:
   saved_errno = get_errno ();
@@ -1235,6 +1240,10 @@ fhandler_fifo::raw_read (void *in_ptr, size_t& len)
 		  len = io.Information;
 		  fifo_client_unlock ();
 		  reading_unlock ();
+		  if (select_sem)
+		    ReleaseSemaphore (select_sem,
+				      get_obj_handle_count (select_sem),
+				      NULL);
 		  return;
 		}
 	      break;
@@ -1273,6 +1282,10 @@ fhandler_fifo::raw_read (void *in_ptr, size_t& len)
 		    fc_handler[i].last_read = true;
 		    fifo_client_unlock ();
 		    reading_unlock ();
+		    if (select_sem)
+		      ReleaseSemaphore (select_sem,
+					get_obj_handle_count (select_sem),
+					NULL);
 		    return;
 		  }
 		break;
@@ -1312,6 +1325,10 @@ fhandler_fifo::raw_read (void *in_ptr, size_t& len)
 		    fc_handler[i].last_read = true;
 		    fifo_client_unlock ();
 		    reading_unlock ();
+		    if (select_sem)
+		      ReleaseSemaphore (select_sem,
+					get_obj_handle_count (select_sem),
+					NULL);
 		    return;
 		  }
 		break;
@@ -1345,7 +1362,7 @@ maybe_retry:
       else
 	{
 	  /* Allow interruption and don't hog the CPU. */
-	  DWORD waitret = cygwait (NULL, 1, cw_cancel | cw_sig_eintr);
+	  DWORD waitret = cygwait (select_sem, 1, cw_cancel | cw_sig_eintr);
 	  if (waitret == WAIT_CANCELED)
 	    pthread::static_cancel_self ();
 	  else if (waitret == WAIT_SIGNALED)
@@ -1569,6 +1586,11 @@ fhandler_fifo::close ()
     NtClose (write_ready);
   if (writer_opening)
     NtClose (writer_opening);
+  if (select_sem)
+    {
+      ReleaseSemaphore (select_sem, get_obj_handle_count (select_sem), NULL);
+      NtClose (select_sem);
+    }
   if (nohandle ())
     return 0;
   else
@@ -1683,7 +1705,17 @@ fhandler_fifo::dup (fhandler_base *child, int flags)
     }
   if (writer)
     inc_nwriters ();
+  if (select_sem &&
+      !DuplicateHandle (GetCurrentProcess (), select_sem,
+			GetCurrentProcess (), &fhf->select_sem,
+			0, !(flags & O_CLOEXEC), DUPLICATE_SAME_ACCESS))
+    {
+      __seterrno ();
+      goto err_close_thr_sync_evt;
+    }
   return 0;
+err_close_thr_sync_evt:
+  NtClose (fhf->thr_sync_evt);
 err_close_cancel_evt:
   NtClose (fhf->cancel_evt);
 err_close_update_needed_evt:
@@ -1743,6 +1775,8 @@ fhandler_fifo::fixup_after_fork (HANDLE parent)
       me.winpid = GetCurrentProcessId ();
       new cygthread (fifo_reader_thread, this, "fifo_reader", thr_sync_evt);
     }
+  if (select_sem)
+    fork_fixup (parent, select_sem, "select_sem");
   if (writer)
     inc_nwriters ();
 }
diff --git a/winsup/cygwin/select.cc b/winsup/cygwin/select.cc
index e9e71b269..5e583434c 100644
--- a/winsup/cygwin/select.cc
+++ b/winsup/cygwin/select.cc
@@ -963,7 +963,13 @@ start_thread_fifo (select_record *me, select_stuff *stuff)
     {
       pi->start = &stuff->start;
       pi->stop_thread = false;
-      pi->bye = CreateEvent (&sec_none_nih, TRUE, FALSE, NULL);
+      pi->bye = me->fh->get_select_sem ();
+      if (pi->bye)
+	DuplicateHandle (GetCurrentProcess (), pi->bye,
+			 GetCurrentProcess (), &pi->bye,
+			 0, 0, DUPLICATE_SAME_ACCESS);
+      else
+	pi->bye = CreateSemaphore (&sec_none_nih, 0, INT32_MAX, NULL);
       pi->thread = new cygthread (thread_fifo, pi, "fifosel");
       me->h = *pi->thread;
       if (!me->h)
@@ -981,7 +987,7 @@ fifo_cleanup (select_record *, select_stuff *stuff)
   if (pi->thread)
     {
       pi->stop_thread = true;
-      SetEvent (pi->bye);
+      ReleaseSemaphore (pi->bye, get_obj_handle_count (pi->bye), NULL);
       pi->thread->detach ();
       CloseHandle (pi->bye);
     }
-- 
2.33.0


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-09 12:19                   ` Takashi Yano
@ 2021-09-09 12:42                     ` Takashi Yano
  2021-09-09 21:53                       ` Takashi Yano
  2021-09-10 10:57                       ` Ken Brown
  0 siblings, 2 replies; 250+ messages in thread
From: Takashi Yano @ 2021-09-09 12:42 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 1044 bytes --]

On Thu, 9 Sep 2021 21:19:40 +0900
Takashi Yano wrote:
> On Thu, 9 Sep 2021 17:05:49 +0900
> Takashi Yano wrote:
> > On Thu, 9 Sep 2021 12:41:15 +0900
> > Takashi Yano wrote:
> > > diff --git a/winsup/cygwin/fhandler_fifo.cc b/winsup/cygwin/fhandler_fifo.cc
> > > index 6709fb974..c40573783 100644
> > > --- a/winsup/cygwin/fhandler_fifo.cc
> > > +++ b/winsup/cygwin/fhandler_fifo.cc
> > > @@ -1047,6 +1047,12 @@ writer_shmem:
> > >    ResetEvent (writer_opening);
> > >    nwriters_unlock ();
> > >  success:
> > > +  if (!select_sem)
> > > +    {
> > > +      char name[MAX_PATH];
> > > +      __small_sprintf(name, "semaphore-%W", get_pipe_name ()->Buffer);
> > > +      select_sem = CreateSemaphore (&sec_none, 0, INT32_MAX, name);
> > > +    }
> > >    return 1;
> > >  err_close_reader:
> > >    saved_errno = get_errno ();
> > 
> > Should this be:
> > > +      select_sem = CreateSemaphore (sa_buf, 0, INT32_MAX, name);
> > ?
> 
> I revised the patch a bit.

Sorry, I revised the patch again.

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>

[-- Attachment #2: 0001-Cygwin-fifo-Utilize-select_sem-for-fifo-as-well-as-p.patch --]
[-- Type: application/octet-stream, Size: 5062 bytes --]

From ba92a1ffb1c9b9bdacc363444f27fd328137fefd Mon Sep 17 00:00:00 2001
From: Takashi Yano <takashi.yano@nifty.ne.jp>
Date: Thu, 9 Sep 2021 21:40:37 +0900
Subject: [PATCH] Cygwin: fifo: Utilize select_sem for fifo as well as pipe.

---
 winsup/cygwin/fhandler_fifo.cc | 38 ++++++++++++++++++++++++++++++++--
 winsup/cygwin/select.cc        | 10 +++++++--
 2 files changed, 44 insertions(+), 4 deletions(-)

diff --git a/winsup/cygwin/fhandler_fifo.cc b/winsup/cygwin/fhandler_fifo.cc
index 6709fb974..6de9a229b 100644
--- a/winsup/cygwin/fhandler_fifo.cc
+++ b/winsup/cygwin/fhandler_fifo.cc
@@ -1047,6 +1047,11 @@ writer_shmem:
   ResetEvent (writer_opening);
   nwriters_unlock ();
 success:
+  if (!select_sem)
+    {
+      __small_sprintf (npbuf, "semaphore.%08x.%016X", get_dev (), get_ino ());
+      select_sem = CreateSemaphore (sa_buf, 0, INT32_MAX, npbuf);
+    }
   return 1;
 err_close_reader:
   saved_errno = get_errno ();
@@ -1235,6 +1240,10 @@ fhandler_fifo::raw_read (void *in_ptr, size_t& len)
 		  len = io.Information;
 		  fifo_client_unlock ();
 		  reading_unlock ();
+		  if (select_sem)
+		    ReleaseSemaphore (select_sem,
+				      get_obj_handle_count (select_sem),
+				      NULL);
 		  return;
 		}
 	      break;
@@ -1273,6 +1282,10 @@ fhandler_fifo::raw_read (void *in_ptr, size_t& len)
 		    fc_handler[i].last_read = true;
 		    fifo_client_unlock ();
 		    reading_unlock ();
+		    if (select_sem)
+		      ReleaseSemaphore (select_sem,
+					get_obj_handle_count (select_sem),
+					NULL);
 		    return;
 		  }
 		break;
@@ -1312,6 +1325,10 @@ fhandler_fifo::raw_read (void *in_ptr, size_t& len)
 		    fc_handler[i].last_read = true;
 		    fifo_client_unlock ();
 		    reading_unlock ();
+		    if (select_sem)
+		      ReleaseSemaphore (select_sem,
+					get_obj_handle_count (select_sem),
+					NULL);
 		    return;
 		  }
 		break;
@@ -1345,7 +1362,7 @@ maybe_retry:
       else
 	{
 	  /* Allow interruption and don't hog the CPU. */
-	  DWORD waitret = cygwait (NULL, 1, cw_cancel | cw_sig_eintr);
+	  DWORD waitret = cygwait (select_sem, 1, cw_cancel | cw_sig_eintr);
 	  if (waitret == WAIT_CANCELED)
 	    pthread::static_cancel_self ();
 	  else if (waitret == WAIT_SIGNALED)
@@ -1569,6 +1586,11 @@ fhandler_fifo::close ()
     NtClose (write_ready);
   if (writer_opening)
     NtClose (writer_opening);
+  if (select_sem)
+    {
+      ReleaseSemaphore (select_sem, get_obj_handle_count (select_sem), NULL);
+      NtClose (select_sem);
+    }
   if (nohandle ())
     return 0;
   else
@@ -1632,6 +1654,14 @@ fhandler_fifo::dup (fhandler_base *child, int flags)
       __seterrno ();
       goto err_close_writer_opening;
     }
+  if (select_sem &&
+      !DuplicateHandle (GetCurrentProcess (), select_sem,
+			GetCurrentProcess (), &fhf->select_sem,
+			0, !(flags & O_CLOEXEC), DUPLICATE_SAME_ACCESS))
+    {
+      __seterrno ();
+      goto err_close_shmem;
+    }
   if (fhf->reopen_shmem () < 0)
     goto err_close_shmem_handle;
   if (reader)
@@ -1648,7 +1678,7 @@ fhandler_fifo::dup (fhandler_base *child, int flags)
 			    0, !(flags & O_CLOEXEC), DUPLICATE_SAME_ACCESS))
 	{
 	  __seterrno ();
-	  goto err_close_shmem;
+	  goto err_close_select_sem;
 	}
       if (fhf->reopen_shared_fc_handler () < 0)
 	goto err_close_shared_fc_hdl;
@@ -1696,6 +1726,8 @@ err_close_shared_fc_handler:
   NtUnmapViewOfSection (GetCurrentProcess (), fhf->shared_fc_handler);
 err_close_shared_fc_hdl:
   NtClose (fhf->shared_fc_hdl);
+err_close_select_sem:
+  NtClose (fhf->select_sem);
 err_close_shmem:
   NtUnmapViewOfSection (GetCurrentProcess (), fhf->shmem);
 err_close_shmem_handle:
@@ -1720,6 +1752,8 @@ fhandler_fifo::fixup_after_fork (HANDLE parent)
   fork_fixup (parent, shmem_handle, "shmem_handle");
   if (reopen_shmem () < 0)
     api_fatal ("Can't reopen shared memory during fork, %E");
+  if (select_sem)
+    fork_fixup (parent, select_sem, "select_sem");
   if (reader)
     {
       /* Make sure the child starts unlocked. */
diff --git a/winsup/cygwin/select.cc b/winsup/cygwin/select.cc
index e9e71b269..5e583434c 100644
--- a/winsup/cygwin/select.cc
+++ b/winsup/cygwin/select.cc
@@ -963,7 +963,13 @@ start_thread_fifo (select_record *me, select_stuff *stuff)
     {
       pi->start = &stuff->start;
       pi->stop_thread = false;
-      pi->bye = CreateEvent (&sec_none_nih, TRUE, FALSE, NULL);
+      pi->bye = me->fh->get_select_sem ();
+      if (pi->bye)
+	DuplicateHandle (GetCurrentProcess (), pi->bye,
+			 GetCurrentProcess (), &pi->bye,
+			 0, 0, DUPLICATE_SAME_ACCESS);
+      else
+	pi->bye = CreateSemaphore (&sec_none_nih, 0, INT32_MAX, NULL);
       pi->thread = new cygthread (thread_fifo, pi, "fifosel");
       me->h = *pi->thread;
       if (!me->h)
@@ -981,7 +987,7 @@ fifo_cleanup (select_record *, select_stuff *stuff)
   if (pi->thread)
     {
       pi->stop_thread = true;
-      SetEvent (pi->bye);
+      ReleaseSemaphore (pi->bye, get_obj_handle_count (pi->bye), NULL);
       pi->thread->detach ();
       CloseHandle (pi->bye);
     }
-- 
2.33.0


^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-09 12:42                     ` Takashi Yano
@ 2021-09-09 21:53                       ` Takashi Yano
  2021-09-10  3:41                         ` Takashi Yano
  2021-09-10 10:57                       ` Ken Brown
  1 sibling, 1 reply; 250+ messages in thread
From: Takashi Yano @ 2021-09-09 21:53 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 1207 bytes --]

On Thu, 9 Sep 2021 21:42:46 +0900
Takashi Yano wrote:
> On Thu, 9 Sep 2021 21:19:40 +0900
> Takashi Yano wrote:
> > On Thu, 9 Sep 2021 17:05:49 +0900
> > Takashi Yano wrote:
> > > On Thu, 9 Sep 2021 12:41:15 +0900
> > > Takashi Yano wrote:
> > > > diff --git a/winsup/cygwin/fhandler_fifo.cc b/winsup/cygwin/fhandler_fifo.cc
> > > > index 6709fb974..c40573783 100644
> > > > --- a/winsup/cygwin/fhandler_fifo.cc
> > > > +++ b/winsup/cygwin/fhandler_fifo.cc
> > > > @@ -1047,6 +1047,12 @@ writer_shmem:
> > > >    ResetEvent (writer_opening);
> > > >    nwriters_unlock ();
> > > >  success:
> > > > +  if (!select_sem)
> > > > +    {
> > > > +      char name[MAX_PATH];
> > > > +      __small_sprintf(name, "semaphore-%W", get_pipe_name ()->Buffer);
> > > > +      select_sem = CreateSemaphore (&sec_none, 0, INT32_MAX, name);
> > > > +    }
> > > >    return 1;
> > > >  err_close_reader:
> > > >    saved_errno = get_errno ();
> > > 
> > > Should this be:
> > > > +      select_sem = CreateSemaphore (sa_buf, 0, INT32_MAX, name);
> > > ?
> > 
> > I revised the patch a bit.
> 
> Sorry, I revised the patch again.

The patch still has a mistake. Revised again.

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>

[-- Attachment #2: 0001-Cygwin-fifo-Utilize-select_sem-for-fifo-as-well-as-p.patch --]
[-- Type: application/octet-stream, Size: 5060 bytes --]

From 69e8fea5c34d746faa9a268ec3a3d00d92498127 Mon Sep 17 00:00:00 2001
From: Takashi Yano <takashi.yano@nifty.ne.jp>
Date: Fri, 10 Sep 2021 06:46:40 +0900
Subject: [PATCH] Cygwin: fifo: Utilize select_sem for fifo as well as pipe.

---
 winsup/cygwin/fhandler_fifo.cc | 38 ++++++++++++++++++++++++++++++++--
 winsup/cygwin/select.cc        | 10 +++++++--
 2 files changed, 44 insertions(+), 4 deletions(-)

diff --git a/winsup/cygwin/fhandler_fifo.cc b/winsup/cygwin/fhandler_fifo.cc
index 6709fb974..1e605d697 100644
--- a/winsup/cygwin/fhandler_fifo.cc
+++ b/winsup/cygwin/fhandler_fifo.cc
@@ -1047,6 +1047,11 @@ writer_shmem:
   ResetEvent (writer_opening);
   nwriters_unlock ();
 success:
+  if (!select_sem)
+    {
+      __small_sprintf (npbuf, "semaphore.%08x.%016X", get_dev (), get_ino ());
+      select_sem = CreateSemaphore (sa_buf, 0, INT32_MAX, npbuf);
+    }
   return 1;
 err_close_reader:
   saved_errno = get_errno ();
@@ -1235,6 +1240,10 @@ fhandler_fifo::raw_read (void *in_ptr, size_t& len)
 		  len = io.Information;
 		  fifo_client_unlock ();
 		  reading_unlock ();
+		  if (select_sem)
+		    ReleaseSemaphore (select_sem,
+				      get_obj_handle_count (select_sem),
+				      NULL);
 		  return;
 		}
 	      break;
@@ -1273,6 +1282,10 @@ fhandler_fifo::raw_read (void *in_ptr, size_t& len)
 		    fc_handler[i].last_read = true;
 		    fifo_client_unlock ();
 		    reading_unlock ();
+		    if (select_sem)
+		      ReleaseSemaphore (select_sem,
+					get_obj_handle_count (select_sem),
+					NULL);
 		    return;
 		  }
 		break;
@@ -1312,6 +1325,10 @@ fhandler_fifo::raw_read (void *in_ptr, size_t& len)
 		    fc_handler[i].last_read = true;
 		    fifo_client_unlock ();
 		    reading_unlock ();
+		    if (select_sem)
+		      ReleaseSemaphore (select_sem,
+					get_obj_handle_count (select_sem),
+					NULL);
 		    return;
 		  }
 		break;
@@ -1345,7 +1362,7 @@ maybe_retry:
       else
 	{
 	  /* Allow interruption and don't hog the CPU. */
-	  DWORD waitret = cygwait (NULL, 1, cw_cancel | cw_sig_eintr);
+	  DWORD waitret = cygwait (select_sem, 1, cw_cancel | cw_sig_eintr);
 	  if (waitret == WAIT_CANCELED)
 	    pthread::static_cancel_self ();
 	  else if (waitret == WAIT_SIGNALED)
@@ -1569,6 +1586,11 @@ fhandler_fifo::close ()
     NtClose (write_ready);
   if (writer_opening)
     NtClose (writer_opening);
+  if (select_sem)
+    {
+      ReleaseSemaphore (select_sem, get_obj_handle_count (select_sem), NULL);
+      NtClose (select_sem);
+    }
   if (nohandle ())
     return 0;
   else
@@ -1634,6 +1656,14 @@ fhandler_fifo::dup (fhandler_base *child, int flags)
     }
   if (fhf->reopen_shmem () < 0)
     goto err_close_shmem_handle;
+  if (select_sem &&
+      !DuplicateHandle (GetCurrentProcess (), select_sem,
+			GetCurrentProcess (), &fhf->select_sem,
+			0, !(flags & O_CLOEXEC), DUPLICATE_SAME_ACCESS))
+    {
+      __seterrno ();
+      goto err_close_shmem;
+    }
   if (reader)
     {
       /* Make sure the child starts unlocked. */
@@ -1648,7 +1678,7 @@ fhandler_fifo::dup (fhandler_base *child, int flags)
 			    0, !(flags & O_CLOEXEC), DUPLICATE_SAME_ACCESS))
 	{
 	  __seterrno ();
-	  goto err_close_shmem;
+	  goto err_close_select_sem;
 	}
       if (fhf->reopen_shared_fc_handler () < 0)
 	goto err_close_shared_fc_hdl;
@@ -1696,6 +1726,8 @@ err_close_shared_fc_handler:
   NtUnmapViewOfSection (GetCurrentProcess (), fhf->shared_fc_handler);
 err_close_shared_fc_hdl:
   NtClose (fhf->shared_fc_hdl);
+err_close_select_sem:
+  NtClose (fhf->select_sem);
 err_close_shmem:
   NtUnmapViewOfSection (GetCurrentProcess (), fhf->shmem);
 err_close_shmem_handle:
@@ -1720,6 +1752,8 @@ fhandler_fifo::fixup_after_fork (HANDLE parent)
   fork_fixup (parent, shmem_handle, "shmem_handle");
   if (reopen_shmem () < 0)
     api_fatal ("Can't reopen shared memory during fork, %E");
+  if (select_sem)
+    fork_fixup (parent, select_sem, "select_sem");
   if (reader)
     {
       /* Make sure the child starts unlocked. */
diff --git a/winsup/cygwin/select.cc b/winsup/cygwin/select.cc
index e9e71b269..5e583434c 100644
--- a/winsup/cygwin/select.cc
+++ b/winsup/cygwin/select.cc
@@ -963,7 +963,13 @@ start_thread_fifo (select_record *me, select_stuff *stuff)
     {
       pi->start = &stuff->start;
       pi->stop_thread = false;
-      pi->bye = CreateEvent (&sec_none_nih, TRUE, FALSE, NULL);
+      pi->bye = me->fh->get_select_sem ();
+      if (pi->bye)
+	DuplicateHandle (GetCurrentProcess (), pi->bye,
+			 GetCurrentProcess (), &pi->bye,
+			 0, 0, DUPLICATE_SAME_ACCESS);
+      else
+	pi->bye = CreateSemaphore (&sec_none_nih, 0, INT32_MAX, NULL);
       pi->thread = new cygthread (thread_fifo, pi, "fifosel");
       me->h = *pi->thread;
       if (!me->h)
@@ -981,7 +987,7 @@ fifo_cleanup (select_record *, select_stuff *stuff)
   if (pi->thread)
     {
       pi->stop_thread = true;
-      SetEvent (pi->bye);
+      ReleaseSemaphore (pi->bye, get_obj_handle_count (pi->bye), NULL);
       pi->thread->detach ();
       CloseHandle (pi->bye);
     }
-- 
2.33.0


^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-09 21:53                       ` Takashi Yano
@ 2021-09-10  3:41                         ` Takashi Yano
  0 siblings, 0 replies; 250+ messages in thread
From: Takashi Yano @ 2021-09-10  3:41 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 1407 bytes --]

On Fri, 10 Sep 2021 06:53:34 +0900
Takashi Yano wrote:
> On Thu, 9 Sep 2021 21:42:46 +0900
> Takashi Yano wrote:
> > On Thu, 9 Sep 2021 21:19:40 +0900
> > Takashi Yano wrote:
> > > On Thu, 9 Sep 2021 17:05:49 +0900
> > > Takashi Yano wrote:
> > > > On Thu, 9 Sep 2021 12:41:15 +0900
> > > > Takashi Yano wrote:
> > > > > diff --git a/winsup/cygwin/fhandler_fifo.cc b/winsup/cygwin/fhandler_fifo.cc
> > > > > index 6709fb974..c40573783 100644
> > > > > --- a/winsup/cygwin/fhandler_fifo.cc
> > > > > +++ b/winsup/cygwin/fhandler_fifo.cc
> > > > > @@ -1047,6 +1047,12 @@ writer_shmem:
> > > > >    ResetEvent (writer_opening);
> > > > >    nwriters_unlock ();
> > > > >  success:
> > > > > +  if (!select_sem)
> > > > > +    {
> > > > > +      char name[MAX_PATH];
> > > > > +      __small_sprintf(name, "semaphore-%W", get_pipe_name ()->Buffer);
> > > > > +      select_sem = CreateSemaphore (&sec_none, 0, INT32_MAX, name);
> > > > > +    }
> > > > >    return 1;
> > > > >  err_close_reader:
> > > > >    saved_errno = get_errno ();
> > > > 
> > > > Should this be:
> > > > > +      select_sem = CreateSemaphore (sa_buf, 0, INT32_MAX, name);
> > > > ?
> > > 
> > > I revised the patch a bit.
> > 
> > Sorry, I revised the patch again.
> 
> The patch still has a mistake. Revised again.

This version partially consolidates the ReleaseSemaphore() calls in raw_read()
into a common exit path.

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>

[-- Attachment #2: 0001-Cygwin-fifo-Utilize-select_sem-for-fifo-as-well-as-p.patch --]
[-- Type: application/octet-stream, Size: 5271 bytes --]

From 73e83d320bb7de7f5a68918e03ea377e6edf2d12 Mon Sep 17 00:00:00 2001
From: Takashi Yano <takashi.yano@nifty.ne.jp>
Date: Fri, 10 Sep 2021 08:43:59 +0900
Subject: [PATCH] Cygwin: fifo: Utilize select_sem for fifo as well as pipe.

---
 winsup/cygwin/fhandler_fifo.cc | 44 +++++++++++++++++++++++++---------
 winsup/cygwin/select.cc        | 10 ++++++--
 2 files changed, 41 insertions(+), 13 deletions(-)

diff --git a/winsup/cygwin/fhandler_fifo.cc b/winsup/cygwin/fhandler_fifo.cc
index 6709fb974..aa89fa7ae 100644
--- a/winsup/cygwin/fhandler_fifo.cc
+++ b/winsup/cygwin/fhandler_fifo.cc
@@ -1047,6 +1047,11 @@ writer_shmem:
   ResetEvent (writer_opening);
   nwriters_unlock ();
 success:
+  if (!select_sem)
+    {
+      __small_sprintf (npbuf, "semaphore.%08x.%016X", get_dev (), get_ino ());
+      select_sem = CreateSemaphore (sa_buf, 0, INT32_MAX, npbuf);
+    }
   return 1;
 err_close_reader:
   saved_errno = get_errno ();
@@ -1233,9 +1238,7 @@ fhandler_fifo::raw_read (void *in_ptr, size_t& len)
 	      if (io.Information > 0)
 		{
 		  len = io.Information;
-		  fifo_client_unlock ();
-		  reading_unlock ();
-		  return;
+		  goto out;
 		}
 	      break;
 	    case STATUS_PIPE_EMPTY:
@@ -1271,9 +1274,7 @@ fhandler_fifo::raw_read (void *in_ptr, size_t& len)
 		    if (j < nhandlers)
 		      fc_handler[j].last_read = false;
 		    fc_handler[i].last_read = true;
-		    fifo_client_unlock ();
-		    reading_unlock ();
-		    return;
+		    goto out;
 		  }
 		break;
 	      case STATUS_PIPE_EMPTY:
@@ -1310,9 +1311,7 @@ fhandler_fifo::raw_read (void *in_ptr, size_t& len)
 		    if (j < nhandlers)
 		      fc_handler[j].last_read = false;
 		    fc_handler[i].last_read = true;
-		    fifo_client_unlock ();
-		    reading_unlock ();
-		    return;
+		    goto out;
 		  }
 		break;
 	      case STATUS_PIPE_EMPTY:
@@ -1345,7 +1344,7 @@ maybe_retry:
       else
 	{
 	  /* Allow interruption and don't hog the CPU. */
-	  DWORD waitret = cygwait (NULL, 1, cw_cancel | cw_sig_eintr);
+	  DWORD waitret = cygwait (select_sem, 1, cw_cancel | cw_sig_eintr);
 	  if (waitret == WAIT_CANCELED)
 	    pthread::static_cancel_self ();
 	  else if (waitret == WAIT_SIGNALED)
@@ -1368,6 +1367,12 @@ maybe_retry:
     }
 errout:
   len = (size_t) -1;
+  return;
+out:
+  fifo_client_unlock ();
+  reading_unlock ();
+  if (select_sem)
+    ReleaseSemaphore (select_sem, get_obj_handle_count (select_sem), NULL);
 }
 
 int __reg2
@@ -1569,6 +1574,11 @@ fhandler_fifo::close ()
     NtClose (write_ready);
   if (writer_opening)
     NtClose (writer_opening);
+  if (select_sem)
+    {
+      ReleaseSemaphore (select_sem, get_obj_handle_count (select_sem), NULL);
+      NtClose (select_sem);
+    }
   if (nohandle ())
     return 0;
   else
@@ -1634,6 +1644,14 @@ fhandler_fifo::dup (fhandler_base *child, int flags)
     }
   if (fhf->reopen_shmem () < 0)
     goto err_close_shmem_handle;
+  if (select_sem &&
+      !DuplicateHandle (GetCurrentProcess (), select_sem,
+			GetCurrentProcess (), &fhf->select_sem,
+			0, !(flags & O_CLOEXEC), DUPLICATE_SAME_ACCESS))
+    {
+      __seterrno ();
+      goto err_close_shmem;
+    }
   if (reader)
     {
       /* Make sure the child starts unlocked. */
@@ -1648,7 +1666,7 @@ fhandler_fifo::dup (fhandler_base *child, int flags)
 			    0, !(flags & O_CLOEXEC), DUPLICATE_SAME_ACCESS))
 	{
 	  __seterrno ();
-	  goto err_close_shmem;
+	  goto err_close_select_sem;
 	}
       if (fhf->reopen_shared_fc_handler () < 0)
 	goto err_close_shared_fc_hdl;
@@ -1696,6 +1714,8 @@ err_close_shared_fc_handler:
   NtUnmapViewOfSection (GetCurrentProcess (), fhf->shared_fc_handler);
 err_close_shared_fc_hdl:
   NtClose (fhf->shared_fc_hdl);
+err_close_select_sem:
+  NtClose (fhf->select_sem);
 err_close_shmem:
   NtUnmapViewOfSection (GetCurrentProcess (), fhf->shmem);
 err_close_shmem_handle:
@@ -1720,6 +1740,8 @@ fhandler_fifo::fixup_after_fork (HANDLE parent)
   fork_fixup (parent, shmem_handle, "shmem_handle");
   if (reopen_shmem () < 0)
     api_fatal ("Can't reopen shared memory during fork, %E");
+  if (select_sem)
+    fork_fixup (parent, select_sem, "select_sem");
   if (reader)
     {
       /* Make sure the child starts unlocked. */
diff --git a/winsup/cygwin/select.cc b/winsup/cygwin/select.cc
index e9e71b269..5e583434c 100644
--- a/winsup/cygwin/select.cc
+++ b/winsup/cygwin/select.cc
@@ -963,7 +963,13 @@ start_thread_fifo (select_record *me, select_stuff *stuff)
     {
       pi->start = &stuff->start;
       pi->stop_thread = false;
-      pi->bye = CreateEvent (&sec_none_nih, TRUE, FALSE, NULL);
+      pi->bye = me->fh->get_select_sem ();
+      if (pi->bye)
+	DuplicateHandle (GetCurrentProcess (), pi->bye,
+			 GetCurrentProcess (), &pi->bye,
+			 0, 0, DUPLICATE_SAME_ACCESS);
+      else
+	pi->bye = CreateSemaphore (&sec_none_nih, 0, INT32_MAX, NULL);
       pi->thread = new cygthread (thread_fifo, pi, "fifosel");
       me->h = *pi->thread;
       if (!me->h)
@@ -981,7 +987,7 @@ fifo_cleanup (select_record *, select_stuff *stuff)
   if (pi->thread)
     {
       pi->stop_thread = true;
-      SetEvent (pi->bye);
+      ReleaseSemaphore (pi->bye, get_obj_handle_count (pi->bye), NULL);
       pi->thread->detach ();
       CloseHandle (pi->bye);
     }
-- 
2.33.0


^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-09 12:42                     ` Takashi Yano
  2021-09-09 21:53                       ` Takashi Yano
@ 2021-09-10 10:57                       ` Ken Brown
  2021-09-10 15:17                         ` Ken Brown
  1 sibling, 1 reply; 250+ messages in thread
From: Ken Brown @ 2021-09-10 10:57 UTC (permalink / raw)
  To: cygwin-developers

Hi Takashi,

On 9/9/2021 8:42 AM, Takashi Yano wrote:
> On Thu, 9 Sep 2021 21:19:40 +0900
> Takashi Yano wrote:
>> On Thu, 9 Sep 2021 17:05:49 +0900
>> Takashi Yano wrote:
>>> On Thu, 9 Sep 2021 12:41:15 +0900
>>> Takashi Yano wrote:
>>>> diff --git a/winsup/cygwin/fhandler_fifo.cc b/winsup/cygwin/fhandler_fifo.cc
>>>> index 6709fb974..c40573783 100644
>>>> --- a/winsup/cygwin/fhandler_fifo.cc
>>>> +++ b/winsup/cygwin/fhandler_fifo.cc
>>>> @@ -1047,6 +1047,12 @@ writer_shmem:
>>>>     ResetEvent (writer_opening);
>>>>     nwriters_unlock ();
>>>>   success:
>>>> +  if (!select_sem)
>>>> +    {
>>>> +      char name[MAX_PATH];
>>>> +      __small_sprintf(name, "semaphore-%W", get_pipe_name ()->Buffer);
>>>> +      select_sem = CreateSemaphore (&sec_none, 0, INT32_MAX, name);
>>>> +    }
>>>>     return 1;
>>>>   err_close_reader:
>>>>     saved_errno = get_errno ();
>>>
>>> Should this be:
>>>> +      select_sem = CreateSemaphore (sa_buf, 0, INT32_MAX, name);
>>> ?
>>
>> I revised the patch a bit.
> 
> Sorry, I revised the patch again.

Thanks!  This is an amazing speed-up.  Here's what I see on my system:

Using the HEAD of topic/pipe:

$ ./fifo_test 0

Total: 100MB in 7.646128 second, 13.078515MB/s

$ ./fifo_test 1

Total: 100MB in 5.472798 second, 18.272189MB/s

$ ./fifo_test 2

Total: 100MB in 0.191965 second, 520.928837MB/s

$ ./fifo_test 3
wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr
Total: 100MB in 0.025944 second, 3854.440894MB/s

After applying your patch:

$ ./fifo_test 0

Total: 100MB in 0.074328 second, 1345.391630MB/s

$ ./fifo_test 1

Total: 100MB in 0.062126 second, 1609.632038MB/s

$ ./fifo_test 2

Total: 100MB in 0.013286 second, 7527.003124MB/s

$ ./fifo_test 3
wwwwrrrrrrrrrr
Total: 100MB in 0.014044 second, 7120.326396MB/s

I need to study your patch a little more, but then I'll push it if I don't see 
any problems.

Thanks again.  This is great.

Ken

P.S. I wrote this yesterday before you sent further revisions, but I forgot to 
send it.  I'll recheck the latest version shortly.

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-10 10:57                       ` Ken Brown
@ 2021-09-10 15:17                         ` Ken Brown
  2021-09-10 15:26                           ` Corinna Vinschen
  2021-09-10 22:57                           ` Takashi Yano
  0 siblings, 2 replies; 250+ messages in thread
From: Ken Brown @ 2021-09-10 15:17 UTC (permalink / raw)
  To: cygwin-developers

On 9/10/2021 6:57 AM, Ken Brown wrote:
> Hi Takashi,
> 
> On 9/9/2021 8:42 AM, Takashi Yano wrote:
>> On Thu, 9 Sep 2021 21:19:40 +0900
>> Takashi Yano wrote:
>>> On Thu, 9 Sep 2021 17:05:49 +0900
>>> Takashi Yano wrote:
>>>> On Thu, 9 Sep 2021 12:41:15 +0900
>>>> Takashi Yano wrote:
>>>>> diff --git a/winsup/cygwin/fhandler_fifo.cc b/winsup/cygwin/fhandler_fifo.cc
>>>>> index 6709fb974..c40573783 100644
>>>>> --- a/winsup/cygwin/fhandler_fifo.cc
>>>>> +++ b/winsup/cygwin/fhandler_fifo.cc
>>>>> @@ -1047,6 +1047,12 @@ writer_shmem:
>>>>>     ResetEvent (writer_opening);
>>>>>     nwriters_unlock ();
>>>>>   success:
>>>>> +  if (!select_sem)
>>>>> +    {
>>>>> +      char name[MAX_PATH];
>>>>> +      __small_sprintf(name, "semaphore-%W", get_pipe_name ()->Buffer);
>>>>> +      select_sem = CreateSemaphore (&sec_none, 0, INT32_MAX, name);
>>>>> +    }
>>>>>     return 1;
>>>>>   err_close_reader:
>>>>>     saved_errno = get_errno ();
>>>>
>>>> Should this be:
>>>>> +      select_sem = CreateSemaphore (sa_buf, 0, INT32_MAX, name);
>>>> ?
>>>
>>> I revised the patch a bit.
>>
>> Sorry, I revised the patch again.
> 
> Thanks!  This is an amazing speed-up.  Here's what I see on my system:
> 
> Using the HEAD of topic/pipe:
> 
> $ ./fifo_test 0
> 
> Total: 100MB in 7.646128 second, 13.078515MB/s
> 
> $ ./fifo_test 1
> 
> Total: 100MB in 5.472798 second, 18.272189MB/s
> 
> $ ./fifo_test 2
> 
> Total: 100MB in 0.191965 second, 520.928837MB/s
> 
> $ ./fifo_test 3
> wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr 
> 
> Total: 100MB in 0.025944 second, 3854.440894MB/s
> 
> After applying your patch:
> 
> $ ./fifo_test 0
> 
> Total: 100MB in 0.074328 second, 1345.391630MB/s
> 
> $ ./fifo_test 1
> 
> Total: 100MB in 0.062126 second, 1609.632038MB/s
> 
> $ ./fifo_test 2
> 
> Total: 100MB in 0.013286 second, 7527.003124MB/s
> 
> $ ./fifo_test 3
> wwwwrrrrrrrrrr
> Total: 100MB in 0.014044 second, 7120.326396MB/s
> 
> I need to study your patch a little more, but then I'll push it if I don't see 
> any problems.
> 
> Thanks again.  This is great.
> 
> Ken
> 
> P.S. I wrote this yesterday before you sent further revisions, but I forgot to 
> send it.  I'll recheck the latest version shortly.

I've rerun your test with the latest version, and the test results are similar.
I've also run a suite of fifo tests that I've accumulated, and they all pass as
well, so I pushed your patch.

I think we're in pretty good shape now.  The only detail remaining, AFAIK, is 
how to best avoid a deadlock if the pipe has been created by a non-Cygwin 
process.  I've proposed a timeout, but maybe there's a better idea.

Ken

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-10 15:17                         ` Ken Brown
@ 2021-09-10 15:26                           ` Corinna Vinschen
  2021-09-10 22:57                           ` Takashi Yano
  1 sibling, 0 replies; 250+ messages in thread
From: Corinna Vinschen @ 2021-09-10 15:26 UTC (permalink / raw)
  To: cygwin-developers

On Sep 10 11:17, Ken Brown wrote:
> I've rerun your test with the latest version, and the test results are
> similar.  I've also run a suite of fifo tests that I've accumulated, and
> they all pass also, so I pushed your patch.
> 
> I think we're in pretty good shape now.  The only detail remaining, AFAIK,
> is how to best avoid a deadlock if the pipe has been created by a non-Cygwin
> process.  I've proposed a timeout, but maybe there's a better idea.

Not from my side.  Every time I think I have an idea it's another
sort of chicken-egg problem...


Corinna

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-10 15:17                         ` Ken Brown
  2021-09-10 15:26                           ` Corinna Vinschen
@ 2021-09-10 22:57                           ` Takashi Yano
  2021-09-11  2:17                             ` Ken Brown
  1 sibling, 1 reply; 250+ messages in thread
From: Takashi Yano @ 2021-09-10 22:57 UTC (permalink / raw)
  To: cygwin-developers

On Fri, 10 Sep 2021 11:17:58 -0400
Ken Brown wrote:
> I've rerun your test with the latest version, and the test results are similar. 
>   I've also run a suite of fifo tests that I've accumulated, and they all pass 
> also, so I pushed your patch.
> 
> I think we're in pretty good shape now.  The only detail remaining, AFAIK, is 
> how to best avoid a deadlock if the pipe has been created by a non-Cygwin 
> process.  I've proposed a timeout, but maybe there's a better idea.

I am not quite sure what the problem is, but isn't the following
patch enough?

diff --git a/winsup/cygwin/fhandler.h b/winsup/cygwin/fhandler.h
index d309be2f7..13fba9a14 100644
--- a/winsup/cygwin/fhandler.h
+++ b/winsup/cygwin/fhandler.h
@@ -1205,6 +1205,7 @@ public:
   select_record *select_except (select_stuff *);
   char *get_proc_fd_name (char *buf);
   int open (int flags, mode_t mode = 0);
+  void open_setup (int flags);
   void fixup_after_fork (HANDLE);
   int dup (fhandler_base *child, int);
   int close ();
diff --git a/winsup/cygwin/fhandler_pipe.cc b/winsup/cygwin/fhandler_pipe.cc
index 6994a5dce..d84e6ad84 100644
--- a/winsup/cygwin/fhandler_pipe.cc
+++ b/winsup/cygwin/fhandler_pipe.cc
@@ -191,6 +191,17 @@ out:
   return 0;
 }

+void
+fhandler_pipe::open_setup (int flags)
+{
+  fhandler_base::open_setup (flags);
+  if (get_dev () == FH_PIPER && !read_mtx)
+    {
+      SECURITY_ATTRIBUTES *sa = sec_none_cloexec (flags);
+      read_mtx = CreateMutexW (sa, FALSE, NULL);
+    }
+}
+
 off_t
 fhandler_pipe::lseek (off_t offset, int whence)
 {


AFAIK, another remaining problem is:

On Mon, 6 Sep 2021 14:49:55 +0200
Corinna Vinschen wrote:
> - What about calling select for writing on pipes read by non-Cygwin
>   processes?  In that case, we still can't rely on WriteQuotaAvailable,
>   just as before.


-- 
Takashi Yano <takashi.yano@nifty.ne.jp>

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-10 22:57                           ` Takashi Yano
@ 2021-09-11  2:17                             ` Ken Brown
  2021-09-11  2:35                               ` Takashi Yano
  0 siblings, 1 reply; 250+ messages in thread
From: Ken Brown @ 2021-09-11  2:17 UTC (permalink / raw)
  To: cygwin-developers

On 9/10/2021 6:57 PM, Takashi Yano wrote:
> On Fri, 10 Sep 2021 11:17:58 -0400
> Ken Brown wrote:
>> I've rerun your test with the latest version, and the test results are similar.
>>    I've also run a suite of fifo tests that I've accumulated, and they all pass
>> also, so I pushed your patch.
>>
>> I think we're in pretty good shape now.  The only detail remaining, AFAIK, is
>> how to best avoid a deadlock if the pipe has been created by a non-Cygwin
>> process.  I've proposed a timeout, but maybe there's a better idea.
> 
> I am not pretty sure what is the problem, but is not the following
> patch enough?
> 
> diff --git a/winsup/cygwin/fhandler.h b/winsup/cygwin/fhandler.h
> index d309be2f7..13fba9a14 100644
> --- a/winsup/cygwin/fhandler.h
> +++ b/winsup/cygwin/fhandler.h
> @@ -1205,6 +1205,7 @@ public:
>     select_record *select_except (select_stuff *);
>     char *get_proc_fd_name (char *buf);
>     int open (int flags, mode_t mode = 0);
> +  void open_setup (int flags);
>     void fixup_after_fork (HANDLE);
>     int dup (fhandler_base *child, int);
>     int close ();
> diff --git a/winsup/cygwin/fhandler_pipe.cc b/winsup/cygwin/fhandler_pipe.cc
> index 6994a5dce..d84e6ad84 100644
> --- a/winsup/cygwin/fhandler_pipe.cc
> +++ b/winsup/cygwin/fhandler_pipe.cc
> @@ -191,6 +191,17 @@ out:
>     return 0;
>   }
> 
> +void
> +fhandler_pipe::open_setup (int flags)
> +{
> +  fhandler_base::open_setup (flags);
> +  if (get_dev () == FH_PIPER && !read_mtx)
> +    {
> +      SECURITY_ATTRIBUTES *sa = sec_none_cloexec (flags);
> +      read_mtx = CreateMutexW (sa, FALSE, NULL);
> +    }
> +}
> +
>   off_t
>   fhandler_pipe::lseek (off_t offset, int whence)
>   {
> 
> 
> AFAIK, another problem remaining is:
> 
> On Mon, 6 Sep 2021 14:49:55 +0200
> Corinna Vinschen wrote:
>> - What about calling select for writing on pipes read by non-Cygwin
>>    processes?  In that case, we still can't rely on WriteQuotaAvailable,
>>    just as before.

This is the problem I was talking about.  In this case the non-Cygwin process 
might have a large pending read, so that the Cygwin process calling select on 
the write side will see WriteQuotaAvailable == 0.  This could lead to a deadlock 
with the Cygwin process waiting for write ready while the non-Cygwin process is 
blocked trying to read.

My suggestion is that we impose a timeout in this situation, after which select 
reports write ready.
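
To make the idea concrete, here is a minimal standalone sketch of the decision
I have in mind.  It is only a model, not select.cc code: the function name and
its parameters are hypothetical, and I haven't settled on an actual timeout
value.

```cpp
// Standalone model of the proposed rule; none of these names exist in Cygwin.
//
// write_quota_available: the value reported for the pipe.  0 may only mean
//   that a non-Cygwin reader has a large read pending, not that the pipe
//   buffer is really full.
// waited_ms:  how long select has already waited on this pipe.
// timeout_ms: hypothetical grace period after which we stop trusting a
//   zero quota.
bool
report_write_ready (unsigned long write_quota_available,
		    unsigned waited_ms, unsigned timeout_ms)
{
  /* Normal case: there is room in the pipe, so writing will not block. */
  if (write_quota_available > 0)
    return true;
  /* Quota is 0.  Keep waiting until the timeout has elapsed, then report
     write ready anyway so that the two sides cannot wait on each other
     forever. */
  return waited_ms >= timeout_ms;
}
```

The price is that select may occasionally report write ready on a pipe that is
genuinely full, in which case a subsequent blocking write could block.  That
seems preferable to a deadlock.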

Ken

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-11  2:17                             ` Ken Brown
@ 2021-09-11  2:35                               ` Takashi Yano
  2021-09-11 13:12                                 ` Ken Brown
  2021-09-13  9:07                                 ` Corinna Vinschen
  0 siblings, 2 replies; 250+ messages in thread
From: Takashi Yano @ 2021-09-11  2:35 UTC (permalink / raw)
  To: cygwin-developers

On Fri, 10 Sep 2021 22:17:21 -0400
Ken Brown wrote:
> On 9/10/2021 6:57 PM, Takashi Yano wrote:
> > On Fri, 10 Sep 2021 11:17:58 -0400
> > Ken Brown wrote:
> >> I've rerun your test with the latest version, and the test results are similar.
> >>    I've also run a suite of fifo tests that I've accumulated, and they all pass
> >> also, so I pushed your patch.
> >>
> >> I think we're in pretty good shape now.  The only detail remaining, AFAIK, is
> >> how to best avoid a deadlock if the pipe has been created by a non-Cygwin
> >> process.  I've proposed a timeout, but maybe there's a better idea.
> > 
> > I am not pretty sure what is the problem, but is not the following
> > patch enough?
> > 
> > diff --git a/winsup/cygwin/fhandler.h b/winsup/cygwin/fhandler.h
> > index d309be2f7..13fba9a14 100644
> > --- a/winsup/cygwin/fhandler.h
> > +++ b/winsup/cygwin/fhandler.h
> > @@ -1205,6 +1205,7 @@ public:
> >     select_record *select_except (select_stuff *);
> >     char *get_proc_fd_name (char *buf);
> >     int open (int flags, mode_t mode = 0);
> > +  void open_setup (int flags);
> >     void fixup_after_fork (HANDLE);
> >     int dup (fhandler_base *child, int);
> >     int close ();
> > diff --git a/winsup/cygwin/fhandler_pipe.cc b/winsup/cygwin/fhandler_pipe.cc
> > index 6994a5dce..d84e6ad84 100644
> > --- a/winsup/cygwin/fhandler_pipe.cc
> > +++ b/winsup/cygwin/fhandler_pipe.cc
> > @@ -191,6 +191,17 @@ out:
> >     return 0;
> >   }
> > 
> > +void
> > +fhandler_pipe::open_setup (int flags)
> > +{
> > +  fhandler_base::open_setup (flags);
> > +  if (get_dev () == FH_PIPER && !read_mtx)
> > +    {
> > +      SECURITY_ATTRIBUTES *sa = sec_none_cloexec (flags);
> > +      read_mtx = CreateMutexW (sa, FALSE, NULL);
> > +    }
> > +}
> > +
> >   off_t
> >   fhandler_pipe::lseek (off_t offset, int whence)
> >   {
> > 
> > 
> > AFAIK, another problem remaining is:
> > 
> > On Mon, 6 Sep 2021 14:49:55 +0200
> > Corinna Vinschen wrote:
> >> - What about calling select for writing on pipes read by non-Cygwin
> >>    processes?  In that case, we still can't rely on WriteQuotaAvailable,
> >>    just as before.
> 
> This is the problem I was talking about.  In this case the non-Cygwin process 
> might have a large pending read, so that the Cygwin process calling select on 
> the write side will see WriteQuotaAvailable == 0.  This could lead to a deadlock 
> with the Cygwin process waiting for write ready while the non-Cygwin process is 
> blocked trying to read.

Then the above patch addresses a different issue.
The problem happens when you:
1) Start a command prompt.
2) Run 'echo AAAAAAAAAAAA | \cygwin64\bin\cat'
This causes cat to hang. In this case, the pipe is created by cmd.exe,
so read_mtx is not created.
 
> My suggestion is that we impose a timeout in this situation, after which select 
> reports write ready.

Keeping a read handle in the write side of the pipe (Corinna's query_hdl)
causes the problem that the write side cannot detect a close on the read
side. Is it possible to open the read handle temporarily when
pipe_data_available() is called?

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-11  2:35                               ` Takashi Yano
@ 2021-09-11 13:12                                 ` Ken Brown
  2021-09-12  6:23                                   ` Takashi Yano
  2021-09-12  8:48                                   ` Takashi Yano
  2021-09-13  9:07                                 ` Corinna Vinschen
  1 sibling, 2 replies; 250+ messages in thread
From: Ken Brown @ 2021-09-11 13:12 UTC (permalink / raw)
  To: cygwin-developers

On 9/10/2021 10:35 PM, Takashi Yano wrote:
> On Fri, 10 Sep 2021 22:17:21 -0400
> Ken Brown wrote:
>> On 9/10/2021 6:57 PM, Takashi Yano wrote:
>>> On Fri, 10 Sep 2021 11:17:58 -0400
>>> Ken Brown wrote:
>>>> I've rerun your test with the latest version, and the test results are similar.
>>>>     I've also run a suite of fifo tests that I've accumulated, and they all pass
>>>> also, so I pushed your patch.
>>>>
>>>> I think we're in pretty good shape now.  The only detail remaining, AFAIK, is
>>>> how to best avoid a deadlock if the pipe has been created by a non-Cygwin
>>>> process.  I've proposed a timeout, but maybe there's a better idea.
>>>
>>> I am not pretty sure what is the problem, but is not the following
>>> patch enough?
>>>
>>> diff --git a/winsup/cygwin/fhandler.h b/winsup/cygwin/fhandler.h
>>> index d309be2f7..13fba9a14 100644
>>> --- a/winsup/cygwin/fhandler.h
>>> +++ b/winsup/cygwin/fhandler.h
>>> @@ -1205,6 +1205,7 @@ public:
>>>      select_record *select_except (select_stuff *);
>>>      char *get_proc_fd_name (char *buf);
>>>      int open (int flags, mode_t mode = 0);
>>> +  void open_setup (int flags);
>>>      void fixup_after_fork (HANDLE);
>>>      int dup (fhandler_base *child, int);
>>>      int close ();
>>> diff --git a/winsup/cygwin/fhandler_pipe.cc b/winsup/cygwin/fhandler_pipe.cc
>>> index 6994a5dce..d84e6ad84 100644
>>> --- a/winsup/cygwin/fhandler_pipe.cc
>>> +++ b/winsup/cygwin/fhandler_pipe.cc
>>> @@ -191,6 +191,17 @@ out:
>>>      return 0;
>>>    }
>>>
>>> +void
>>> +fhandler_pipe::open_setup (int flags)
>>> +{
>>> +  fhandler_base::open_setup (flags);
>>> +  if (get_dev () == FH_PIPER && !read_mtx)
>>> +    {
>>> +      SECURITY_ATTRIBUTES *sa = sec_none_cloexec (flags);
>>> +      read_mtx = CreateMutexW (sa, FALSE, NULL);
>>> +    }
>>> +}
>>> +
>>>    off_t
>>>    fhandler_pipe::lseek (off_t offset, int whence)
>>>    {
>>>
>>>
>>> AFAIK, another problem remaining is:
>>>
>>> On Mon, 6 Sep 2021 14:49:55 +0200
>>> Corinna Vinschen wrote:
>>>> - What about calling select for writing on pipes read by non-Cygwin
>>>>     processes?  In that case, we still can't rely on WriteQuotaAvailable,
>>>>     just as before.
>>
>> This is the problem I was talking about.  In this case the non-Cygwin process
>> might have a large pending read, so that the Cygwin process calling select on
>> the write side will see WriteQuotaAvailable == 0.  This could lead to a deadlock
>> with the Cygwin process waiting for write ready while the non-Cygwin process is
>> blocked trying to read.
> 
> Then, the above patch is for another issue.
> The problem happens when:
> 1) Start command prompt.
> 2) Run 'echo AAAAAAAAAAAA | \cygwin64\bin\cat'
> This causes hang up in cat. In this case, pipe is created by cmd.exe.
> Therefore, read_mtx is not created.

Confirmed, and your patch fixes it.  Maybe you should check for error in the 
call to CreateMutexW and print a debug message in that case.

>> My suggestion is that we impose a timeout in this situation, after which select
>> reports write ready.
> 
> Keeping read handle in write pipe (Corinna's query_hdl) causes problem
> that write side cannot detect close on read side.
> Is it possible to open read handle temporarily when pipe_data_available()
> is called?

That would be nice, but I have no idea how you could do that.

Ken

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-11 13:12                                 ` Ken Brown
@ 2021-09-12  6:23                                   ` Takashi Yano
  2021-09-12 14:39                                     ` Ken Brown
  2021-09-12  8:48                                   ` Takashi Yano
  1 sibling, 1 reply; 250+ messages in thread
From: Takashi Yano @ 2021-09-12  6:23 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 3229 bytes --]

On Sat, 11 Sep 2021 09:12:02 -0400
Ken Brown wrote:
> On 9/10/2021 10:35 PM, Takashi Yano wrote:
> > On Fri, 10 Sep 2021 22:17:21 -0400
> > Ken Brown wrote:
> >> On 9/10/2021 6:57 PM, Takashi Yano wrote:
> >>> On Fri, 10 Sep 2021 11:17:58 -0400
> >>> Ken Brown wrote:
> >>>> I've rerun your test with the latest version, and the test results are similar.
> >>>>     I've also run a suite of fifo tests that I've accumulated, and they all pass
> >>>> also, so I pushed your patch.
> >>>>
> >>>> I think we're in pretty good shape now.  The only detail remaining, AFAIK, is
> >>>> how to best avoid a deadlock if the pipe has been created by a non-Cygwin
> >>>> process.  I've proposed a timeout, but maybe there's a better idea.
> >>>
> >>> I am not pretty sure what is the problem, but is not the following
> >>> patch enough?
> >>>
> >>> diff --git a/winsup/cygwin/fhandler.h b/winsup/cygwin/fhandler.h
> >>> index d309be2f7..13fba9a14 100644
> >>> --- a/winsup/cygwin/fhandler.h
> >>> +++ b/winsup/cygwin/fhandler.h
> >>> @@ -1205,6 +1205,7 @@ public:
> >>>      select_record *select_except (select_stuff *);
> >>>      char *get_proc_fd_name (char *buf);
> >>>      int open (int flags, mode_t mode = 0);
> >>> +  void open_setup (int flags);
> >>>      void fixup_after_fork (HANDLE);
> >>>      int dup (fhandler_base *child, int);
> >>>      int close ();
> >>> diff --git a/winsup/cygwin/fhandler_pipe.cc b/winsup/cygwin/fhandler_pipe.cc
> >>> index 6994a5dce..d84e6ad84 100644
> >>> --- a/winsup/cygwin/fhandler_pipe.cc
> >>> +++ b/winsup/cygwin/fhandler_pipe.cc
> >>> @@ -191,6 +191,17 @@ out:
> >>>      return 0;
> >>>    }
> >>>
> >>> +void
> >>> +fhandler_pipe::open_setup (int flags)
> >>> +{
> >>> +  fhandler_base::open_setup (flags);
> >>> +  if (get_dev () == FH_PIPER && !read_mtx)
> >>> +    {
> >>> +      SECURITY_ATTRIBUTES *sa = sec_none_cloexec (flags);
> >>> +      read_mtx = CreateMutexW (sa, FALSE, NULL);
> >>> +    }
> >>> +}
> >>> +
> >>>    off_t
> >>>    fhandler_pipe::lseek (off_t offset, int whence)
> >>>    {
> >>>
> >>>
> >>> AFAIK, another problem remaining is:
> >>>
> >>> On Mon, 6 Sep 2021 14:49:55 +0200
> >>> Corinna Vinschen wrote:
> >>>> - What about calling select for writing on pipes read by non-Cygwin
> >>>>     processes?  In that case, we still can't rely on WriteQuotaAvailable,
> >>>>     just as before.
> >>
> >> This is the problem I was talking about.  In this case the non-Cygwin process
> >> might have a large pending read, so that the Cygwin process calling select on
> >> the write side will see WriteQuotaAvailable == 0.  This could lead to a deadlock
> >> with the Cygwin process waiting for write ready while the non-Cygwin process is
> >> blocked trying to read.
> > 
> > Then, the above patch is for another issue.
> > > The problem happens when:
> > > 1) Start command prompt.
> > > 2) Run 'echo AAAAAAAAAAAA | \cygwin64\bin\cat'
> > This causes hang up in cat. In this case, pipe is created by cmd.exe.
> > Therefore, read_mtx is not created.
> 
> Confirmed, and your patch fixes it.  Maybe you should check for error in the 
> call to CreateMutexW and print a debug message in that case.

I added the debug message.


-- 
Takashi Yano <takashi.yano@nifty.ne.jp>

[-- Attachment #2: 0001-Cygwin-pipe-Fix-deadlock-if-pipe-is-created-by-non-c.patch --]
[-- Type: application/octet-stream, Size: 1437 bytes --]

From 1c929f7d3633d99c0040e51552082829b17f550d Mon Sep 17 00:00:00 2001
From: Takashi Yano <takashi.yano@nifty.ne.jp>
Date: Sun, 12 Sep 2021 15:06:05 +0900
Subject: [PATCH] Cygwin: pipe: Fix deadlock if pipe is created by non-cygwin
 app.

---
 winsup/cygwin/fhandler.h       |  1 +
 winsup/cygwin/fhandler_pipe.cc | 13 +++++++++++++
 2 files changed, 14 insertions(+)

diff --git a/winsup/cygwin/fhandler.h b/winsup/cygwin/fhandler.h
index d309be2f7..13fba9a14 100644
--- a/winsup/cygwin/fhandler.h
+++ b/winsup/cygwin/fhandler.h
@@ -1205,6 +1205,7 @@ public:
   select_record *select_except (select_stuff *);
   char *get_proc_fd_name (char *buf);
   int open (int flags, mode_t mode = 0);
+  void open_setup (int flags);
   void fixup_after_fork (HANDLE);
   int dup (fhandler_base *child, int);
   int close ();
diff --git a/winsup/cygwin/fhandler_pipe.cc b/winsup/cygwin/fhandler_pipe.cc
index 6994a5dce..9b4255cfd 100644
--- a/winsup/cygwin/fhandler_pipe.cc
+++ b/winsup/cygwin/fhandler_pipe.cc
@@ -191,6 +191,19 @@ out:
   return 0;
 }
 
+void
+fhandler_pipe::open_setup (int flags)
+{
+  fhandler_base::open_setup (flags);
+  if (get_dev () == FH_PIPER && !read_mtx)
+    {
+      SECURITY_ATTRIBUTES *sa = sec_none_cloexec (flags);
+      read_mtx = CreateMutex (sa, FALSE, NULL);
+      if (!read_mtx)
+	debug_printf ("CreateMutex failed: %E");
+    }
+}
+
 off_t
 fhandler_pipe::lseek (off_t offset, int whence)
 {
-- 
2.33.0


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-11 13:12                                 ` Ken Brown
  2021-09-12  6:23                                   ` Takashi Yano
@ 2021-09-12  8:48                                   ` Takashi Yano
  2021-09-12 11:04                                     ` Takashi Yano
  1 sibling, 1 reply; 250+ messages in thread
From: Takashi Yano @ 2021-09-12  8:48 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 3968 bytes --]

On Sat, 11 Sep 2021 09:12:02 -0400
Ken Brown wrote:
> On 9/10/2021 10:35 PM, Takashi Yano wrote:
> > On Fri, 10 Sep 2021 22:17:21 -0400
> > Ken Brown wrote:
> >> On 9/10/2021 6:57 PM, Takashi Yano wrote:
> >>> On Fri, 10 Sep 2021 11:17:58 -0400
> >>> Ken Brown wrote:
> >>>> I've rerun your test with the latest version, and the test results are similar.
> >>>>     I've also run a suite of fifo tests that I've accumulated, and they all pass
> >>>> also, so I pushed your patch.
> >>>>
> >>>> I think we're in pretty good shape now.  The only detail remaining, AFAIK, is
> >>>> how to best avoid a deadlock if the pipe has been created by a non-Cygwin
> >>>> process.  I've proposed a timeout, but maybe there's a better idea.
> >>>
> >>> I am not pretty sure what is the problem, but is not the following
> >>> patch enough?
> >>>
> >>> diff --git a/winsup/cygwin/fhandler.h b/winsup/cygwin/fhandler.h
> >>> index d309be2f7..13fba9a14 100644
> >>> --- a/winsup/cygwin/fhandler.h
> >>> +++ b/winsup/cygwin/fhandler.h
> >>> @@ -1205,6 +1205,7 @@ public:
> >>>      select_record *select_except (select_stuff *);
> >>>      char *get_proc_fd_name (char *buf);
> >>>      int open (int flags, mode_t mode = 0);
> >>> +  void open_setup (int flags);
> >>>      void fixup_after_fork (HANDLE);
> >>>      int dup (fhandler_base *child, int);
> >>>      int close ();
> >>> diff --git a/winsup/cygwin/fhandler_pipe.cc b/winsup/cygwin/fhandler_pipe.cc
> >>> index 6994a5dce..d84e6ad84 100644
> >>> --- a/winsup/cygwin/fhandler_pipe.cc
> >>> +++ b/winsup/cygwin/fhandler_pipe.cc
> >>> @@ -191,6 +191,17 @@ out:
> >>>      return 0;
> >>>    }
> >>>
> >>> +void
> >>> +fhandler_pipe::open_setup (int flags)
> >>> +{
> >>> +  fhandler_base::open_setup (flags);
> >>> +  if (get_dev () == FH_PIPER && !read_mtx)
> >>> +    {
> >>> +      SECURITY_ATTRIBUTES *sa = sec_none_cloexec (flags);
> >>> +      read_mtx = CreateMutexW (sa, FALSE, NULL);
> >>> +    }
> >>> +}
> >>> +
> >>>    off_t
> >>>    fhandler_pipe::lseek (off_t offset, int whence)
> >>>    {
> >>>
> >>>
> >>> AFAIK, another problem remaining is:
> >>>
> >>> On Mon, 6 Sep 2021 14:49:55 +0200
> >>> Corinna Vinschen wrote:
> >>>> - What about calling select for writing on pipes read by non-Cygwin
> >>>>     processes?  In that case, we still can't rely on WriteQuotaAvailable,
> >>>>     just as before.
> >>
> >> This is the problem I was talking about.  In this case the non-Cygwin process
> >> might have a large pending read, so that the Cygwin process calling select on
> >> the write side will see WriteQuotaAvailable == 0.  This could lead to a deadlock
> >> with the Cygwin process waiting for write ready while the non-Cygwin process is
> >> blocked trying to read.
> > 
> > Then, the above patch is for another issue.
> > > The problem happens when:
> > > 1) Start command prompt.
> > > 2) Run 'echo AAAAAAAAAAAA | \cygwin64\bin\cat'
> > This causes hang up in cat. In this case, pipe is created by cmd.exe.
> > Therefore, read_mtx is not created.
> 
> Confirmed, and your patch fixes it.  Maybe you should check for error in the 
> call to CreateMutexW and print a debug message in that case.
> 
> >> My suggestion is that we impose a timeout in this situation, after which select
> >> reports write ready.
> > 
> > Keeping read handle in write pipe (Corinna's query_hdl) causes problem
> > that write side cannot detect close on read side.
> > > Is it possible to open read handle temporarily when pipe_data_available()
> > is called?
> 
> That would be nice, but I have no idea how you could do that.

Hmm. Then what about the attached PoC code? It returns to Corinna's
query_hdl approach and counts read/write handles to detect that the
reader side has been closed.

If the number of read handles equals the number of write handles, only
the write handle/query_hdl pairs are still alive, so the read side of
the pipe is assumed to be closed.
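
The counting argument can be sketched as a small model in portable C++. This is only an illustration under stated assumptions, not the code in the patch: PipeHandleCounts and the two count helpers are hypothetical, and the model assumes every write handle is paired with exactly one query_hdl, which is itself a read-side handle.

```cpp
#include <cassert>

// Hypothetical snapshot of who holds handles to one pipe.
struct PipeHandleCounts
{
  int readers;  // read handles held by real readers
  int writers;  // write handles; each is assumed to own one query_hdl
};

// query_hdl is a duplicate of the read handle, so it is counted on the
// read side together with the real readers.
int
read_side_handle_count (const PipeHandleCounts &c)
{
  return c.readers + c.writers;
}

int
write_side_handle_count (const PipeHandleCounts &c)
{
  return c.writers;
}

// If both counts are equal, every read-side handle is a query_hdl,
// i.e. no real reader is left.
bool
reader_closed (const PipeHandleCounts &c)
{
  assert (c.readers >= 0 && c.writers >= 0);
  return read_side_handle_count (c) == write_side_handle_count (c);
}
```

For example, one reader and one writer give counts of 2 and 1, so the reader is considered alive. Once the reader closes, both counts are 1 and reader_closed() returns true. The check goes wrong as soon as the pairing assumption is broken, for instance if a writer exists without a query_hdl.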

This patch depends on another patch I posted a few hours ago.

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>

[-- Attachment #2: 0001-Cygwin-pipe-Return-to-query_hdl-strategy-with-counti.patch --]
[-- Type: application/octet-stream, Size: 9622 bytes --]

From 050a1e204dd0753d3498b7f2e84dff5e6c7704d9 Mon Sep 17 00:00:00 2001
From: Takashi Yano <takashi.yano@nifty.ne.jp>
Date: Sun, 12 Sep 2021 17:42:03 +0900
Subject: [PATCH] Cygwin: pipe: Return to query_hdl strategy with counting r/w
 handles.

---
 winsup/cygwin/fhandler.h       |  5 ++
 winsup/cygwin/fhandler_pipe.cc | 91 +++++++++++++++++++++++-----------
 winsup/cygwin/select.cc        | 50 ++++++++++++++++---
 winsup/cygwin/spawn.cc         |  2 +
 4 files changed, 112 insertions(+), 36 deletions(-)

diff --git a/winsup/cygwin/fhandler.h b/winsup/cygwin/fhandler.h
index 13fba9a14..f09af2c37 100644
--- a/winsup/cygwin/fhandler.h
+++ b/winsup/cygwin/fhandler.h
@@ -1176,10 +1176,15 @@ class fhandler_pipe_fifo: public fhandler_base
 {
  protected:
   size_t pipe_buf_size;
+  HANDLE query_hdl;
 
  public:
   fhandler_pipe_fifo ();
 
+  HANDLE get_query_handle () const { return query_hdl; }
+  void close_query_handle () { CloseHandle (query_hdl); query_hdl = NULL; }
+  bool reader_closed ();
+
   ssize_t __reg3 raw_write (const void *ptr, size_t len);
 
 };
diff --git a/winsup/cygwin/fhandler_pipe.cc b/winsup/cygwin/fhandler_pipe.cc
index 9b4255cfd..2818421ec 100644
--- a/winsup/cygwin/fhandler_pipe.cc
+++ b/winsup/cygwin/fhandler_pipe.cc
@@ -56,6 +56,8 @@ fhandler_pipe::set_pipe_non_blocking (bool nonblocking)
   fpi.ReadMode = FILE_PIPE_BYTE_STREAM_MODE;
   fpi.CompletionMode = nonblocking ? FILE_PIPE_COMPLETE_OPERATION
     : FILE_PIPE_QUEUE_OPERATION;
+  if (query_hdl)
+    fpi.CompletionMode = FILE_PIPE_COMPLETE_OPERATION;
   status = NtSetInformationFile (get_handle (), &io, &fpi, sizeof fpi,
 				 FilePipeInformation);
   if (!NT_SUCCESS (status))
@@ -202,6 +204,8 @@ fhandler_pipe::open_setup (int flags)
       if (!read_mtx)
 	debug_printf ("CreateMutex failed: %E");
     }
+  if (get_dev () == FH_PIPEW && !query_hdl)
+    set_pipe_non_blocking (is_nonblocking ());
 }
 
 off_t
@@ -268,39 +272,11 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
   while (nbytes < len)
     {
       ULONG_PTR nbytes_now = 0;
-      size_t left = len - nbytes;
-      ULONG len1 = (ULONG) left;
+      ULONG len1 = (ULONG) (len - nbytes);
       waitret = WAIT_OBJECT_0;
 
       if (evt)
 	ResetEvent (evt);
-      if (!is_nonblocking ())
-	{
-	  FILE_PIPE_LOCAL_INFORMATION fpli;
-
-	  /* If the pipe is empty, don't request more bytes than pipe
-	     buffer size - 1. Pending read lowers WriteQuotaAvailable on
-	     the write side and thus affects select's ability to return
-	     more or less reliable info whether a write succeeds or not. */
-	  ULONG chunk = pipe_buf_size - 1;
-	  status = NtQueryInformationFile (get_handle (), &io,
-					   &fpli, sizeof (fpli),
-					   FilePipeLocalInformation);
-	  if (NT_SUCCESS (status))
-	    {
-	      if (fpli.ReadDataAvailable > 0)
-		chunk = left;
-	      else if (nbytes != 0)
-		break;
-	      else
-		chunk = fpli.InboundQuota - 1;
-	    }
-	  else if (nbytes != 0)
-	    break;
-
-	  if (len1 > chunk)
-	    len1 = chunk;
-	}
       status = NtReadFile (get_handle (), evt, NULL, NULL, &io, ptr,
 			   len1, NULL, NULL);
       if (evt && status == STATUS_PENDING)
@@ -385,6 +361,16 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
   len = nbytes;
 }
 
+bool
+fhandler_pipe_fifo::reader_closed ()
+{
+  if (!query_hdl)
+    return false;
+  int n_reader = get_obj_handle_count (query_hdl);
+  int n_writer = get_obj_handle_count (get_handle ());
+  return n_reader == n_writer;
+}
+
 ssize_t __reg3
 fhandler_pipe_fifo::raw_write (const void *ptr, size_t len)
 {
@@ -493,7 +479,20 @@ fhandler_pipe_fifo::raw_write (const void *ptr, size_t len)
 			      get_obj_handle_count (select_sem), NULL);
 	  /* 0 bytes returned?  EAGAIN.  See above. */
 	  if (NT_SUCCESS (status) && nbytes == 0)
-	    set_errno (EAGAIN);
+	    {
+	      if (reader_closed ())
+		{
+		  set_errno (EPIPE);
+		  raise (SIGPIPE);
+		}
+	      else if (is_nonblocking ())
+		set_errno (EAGAIN);
+	      else
+		{
+		  cygwait (select_sem, 10);
+		  continue;
+		}
+	    }
 	}
       else if (STATUS_PIPE_IS_CLOSED (status))
 	{
@@ -522,6 +521,10 @@ fhandler_pipe::fixup_after_fork (HANDLE parent)
     fork_fixup (parent, read_mtx, "read_mtx");
   if (select_sem)
     fork_fixup (parent, select_sem, "select_sem");
+  /* Do not duplicate query_hdl if it has been already inherited. */
+  if (query_hdl && !get_obj_handle_count (query_hdl))
+    fork_fixup (parent, query_hdl, "query_hdl");
+
   fhandler_base::fixup_after_fork (parent);
 }
 
@@ -552,6 +555,15 @@ fhandler_pipe::dup (fhandler_base *child, int flags)
       ftp->close ();
       res = -1;
     }
+  else if (query_hdl &&
+	   !DuplicateHandle (GetCurrentProcess (), query_hdl,
+			    GetCurrentProcess (), &ftp->query_hdl,
+			    0, !(flags & O_CLOEXEC), DUPLICATE_SAME_ACCESS))
+    {
+      __seterrno ();
+      ftp->close ();
+      res = -1;
+    }
 
   debug_printf ("res %d", res);
   return res;
@@ -567,6 +579,8 @@ fhandler_pipe::close ()
       ReleaseSemaphore (select_sem, get_obj_handle_count (select_sem), NULL);
       CloseHandle (select_sem);
     }
+  if (query_hdl)
+    CloseHandle (query_hdl);
   return fhandler_base::close ();
 }
 
@@ -791,6 +805,23 @@ fhandler_pipe::create (fhandler_pipe *fhs[2], unsigned psize, int mode)
 	DuplicateHandle (GetCurrentProcess (), fhs[0]->select_sem,
 			 GetCurrentProcess (), &fhs[1]->select_sem,
 			 0, 1, DUPLICATE_SAME_ACCESS);
+      if (!DuplicateHandle (GetCurrentProcess (), r,
+			    GetCurrentProcess (), &fhs[1]->query_hdl,
+			    GENERIC_READ, !(mode & O_CLOEXEC), 0))
+	{
+	  CloseHandle (fhs[0]->select_sem);
+	  delete fhs[0];
+	  CloseHandle (r);
+	  CloseHandle (fhs[1]->select_sem);
+	  delete fhs[1];
+	  CloseHandle (w);
+	}
+      else
+	{
+	  /* Call set_pipe_non_blocking() again after creating query_hdl. */
+	  fhs[1]->set_pipe_non_blocking (fhs[1]->is_nonblocking ());
+	  res = 0;
+	}
     }
 
   debug_printf ("%R = pipe([%p, %p], %d, %y)", res, fhs[0], fhs[1], psize, mode);
diff --git a/winsup/cygwin/select.cc b/winsup/cygwin/select.cc
index 5e583434c..c569a059c 100644
--- a/winsup/cygwin/select.cc
+++ b/winsup/cygwin/select.cc
@@ -608,12 +608,43 @@ pipe_data_available (int fd, fhandler_base *fh, HANDLE h, bool writing)
     }
   if (writing)
     {
-      /* WriteQuotaAvailable is decremented by the number of bytes requested
-	 by a blocking reader on the other side of the pipe.  Cygwin readers
-	 are serialized and never request a number of bytes equivalent to the
-	 full buffer size.  So WriteQuotaAvailable is 0 only if either the
-	 read buffer on the other side is really full, or if we have non-Cygwin
-	 readers. */
+      /* If there is anything available in the pipe buffer then signal
+        that.  This means that a pipe could still block since you could
+        be trying to write more to the pipe than is available in the
+        buffer but that is the hazard of select().
+
+        Note that WriteQuotaAvailable is unreliable.
+
+        Usually WriteQuotaAvailable on the write side reflects the space
+        available in the inbound buffer on the read side.  However, if a
+        pipe read is currently pending, WriteQuotaAvailable on the write side
+        is decremented by the number of bytes the read side is requesting.
+        So it's possible (even likely) that WriteQuotaAvailable is 0, even
+        if the inbound buffer on the read side is not full.  This can lead to
+        a deadlock situation: The reader is waiting for data, but select
+        on the writer side assumes that no space is available in the read
+        side inbound buffer.
+
+        Consequentially, the only reliable information is available on the
+        read side, so fetch info from the read side via the pipe-specific
+        query handle.  Use fpli.WriteQuotaAvailable as storage for the actual
+        interesting value, which is the OutboundQuota on the write side,
+        decremented by the number of bytes of data in that buffer. */
+      /* Note: Do not use NtQueryInformationFile() for query_hdl because
+	 NtQueryInformationFile() seems to interfere with reading pipes
+	 in non-cygwin apps. Instead, use PeekNamedPipe() here. */
+      if (fh->get_device () == FH_PIPEW)
+	{
+	  HANDLE query_hdl = ((fhandler_pipe *) fh)->get_query_handle ();
+	  if (query_hdl)
+	    {
+	      DWORD nbytes_in_pipe;
+	      PeekNamedPipe (query_hdl, NULL, 0, NULL, &nbytes_in_pipe, NULL);
+	      fpli.WriteQuotaAvailable = fpli.OutboundQuota - nbytes_in_pipe;
+	    }
+	  else
+	    return 1;
+	}
       if (fpli.WriteQuotaAvailable > 0)
 	{
 	  paranoid_printf ("fd %d, %s, write: size %u, avail %u", fd,
@@ -712,6 +743,13 @@ out:
   h = fh->get_output_handle ();
   if (s->write_selected && dev != FH_PIPER)
     {
+      if (dev == FH_PIPEW && ((fhandler_pipe *) fh)->reader_closed ())
+	{
+	  gotone += s->write_ready = true;
+	  if (s->except_selected)
+	    gotone += s->except_ready = true;
+	  return gotone;
+	}
       gotone += s->write_ready =  pipe_data_available (s->fd, fh, h, true);
       select_printf ("write: %s, gotone %d", fh->get_name (), gotone);
     }
diff --git a/winsup/cygwin/spawn.cc b/winsup/cygwin/spawn.cc
index 0bde0b04d..e902b5080 100644
--- a/winsup/cygwin/spawn.cc
+++ b/winsup/cygwin/spawn.cc
@@ -657,6 +657,8 @@ child_info_spawn::worker (const char *prog_arg, const char *const *argv,
 		ptys->create_invisible_console ();
 		ptys->setup_locale ();
 	      }
+	    else if (cfd->get_dev () == FH_PIPEW)
+	      ((fhandler_pipe *)(fhandler_base *) cfd)->close_query_handle ();
 	}
 
       bool enable_pcon = false;
-- 
2.33.0


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-12  8:48                                   ` Takashi Yano
@ 2021-09-12 11:04                                     ` Takashi Yano
  2021-09-12 15:10                                       ` Ken Brown
                                                         ` (2 more replies)
  0 siblings, 3 replies; 250+ messages in thread
From: Takashi Yano @ 2021-09-12 11:04 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 4233 bytes --]

On Sun, 12 Sep 2021 17:48:49 +0900
Takashi Yano wrote:
> On Sat, 11 Sep 2021 09:12:02 -0400
> Ken Brown wrote:
> > On 9/10/2021 10:35 PM, Takashi Yano wrote:
> > > On Fri, 10 Sep 2021 22:17:21 -0400
> > > Ken Brown wrote:
> > >> On 9/10/2021 6:57 PM, Takashi Yano wrote:
> > >>> On Fri, 10 Sep 2021 11:17:58 -0400
> > >>> Ken Brown wrote:
> > >>>> I've rerun your test with the latest version, and the test results are similar.
> > >>>>     I've also run a suite of fifo tests that I've accumulated, and they all pass
> > >>>> also, so I pushed your patch.
> > >>>>
> > >>>> I think we're in pretty good shape now.  The only detail remaining, AFAIK, is
> > >>>> how to best avoid a deadlock if the pipe has been created by a non-Cygwin
> > >>>> process.  I've proposed a timeout, but maybe there's a better idea.
> > >>>
> > >>> I am not pretty sure what is the problem, but is not the following
> > >>> patch enough?
> > >>>
> > >>> diff --git a/winsup/cygwin/fhandler.h b/winsup/cygwin/fhandler.h
> > >>> index d309be2f7..13fba9a14 100644
> > >>> --- a/winsup/cygwin/fhandler.h
> > >>> +++ b/winsup/cygwin/fhandler.h
> > >>> @@ -1205,6 +1205,7 @@ public:
> > >>>      select_record *select_except (select_stuff *);
> > >>>      char *get_proc_fd_name (char *buf);
> > >>>      int open (int flags, mode_t mode = 0);
> > >>> +  void open_setup (int flags);
> > >>>      void fixup_after_fork (HANDLE);
> > >>>      int dup (fhandler_base *child, int);
> > >>>      int close ();
> > >>> diff --git a/winsup/cygwin/fhandler_pipe.cc b/winsup/cygwin/fhandler_pipe.cc
> > >>> index 6994a5dce..d84e6ad84 100644
> > >>> --- a/winsup/cygwin/fhandler_pipe.cc
> > >>> +++ b/winsup/cygwin/fhandler_pipe.cc
> > >>> @@ -191,6 +191,17 @@ out:
> > >>>      return 0;
> > >>>    }
> > >>>
> > >>> +void
> > >>> +fhandler_pipe::open_setup (int flags)
> > >>> +{
> > >>> +  fhandler_base::open_setup (flags);
> > >>> +  if (get_dev () == FH_PIPER && !read_mtx)
> > >>> +    {
> > >>> +      SECURITY_ATTRIBUTES *sa = sec_none_cloexec (flags);
> > >>> +      read_mtx = CreateMutexW (sa, FALSE, NULL);
> > >>> +    }
> > >>> +}
> > >>> +
> > >>>    off_t
> > >>>    fhandler_pipe::lseek (off_t offset, int whence)
> > >>>    {
> > >>>
> > >>>
> > >>> AFAIK, another problem remaining is:
> > >>>
> > >>> On Mon, 6 Sep 2021 14:49:55 +0200
> > >>> Corinna Vinschen wrote:
> > >>>> - What about calling select for writing on pipes read by non-Cygwin
> > >>>>     processes?  In that case, we still can't rely on WriteQuotaAvailable,
> > >>>>     just as before.
> > >>
> > >> This is the problem I was talking about.  In this case the non-Cygwin process
> > >> might have a large pending read, so that the Cygwin process calling select on
> > >> the write side will see WriteQuotaAvailable == 0.  This could lead to a deadlock
> > >> with the Cygwin process waiting for write ready while the non-Cygwin process is
> > >> blocked trying to read.
> > > 
> > > Then, the above patch is for another issue.
> > > > The problem happens when:
> > > > 1) Start command prompt.
> > > > 2) Run 'echo AAAAAAAAAAAA | \cygwin64\bin\cat'
> > > This causes hang up in cat. In this case, pipe is created by cmd.exe.
> > > Therefore, read_mtx is not created.
> > 
> > Confirmed, and your patch fixes it.  Maybe you should check for error in the 
> > call to CreateMutexW and print a debug message in that case.
> > 
> > >> My suggestion is that we impose a timeout in this situation, after which select
> > >> reports write ready.
> > > 
> > > Keeping read handle in write pipe (Corinna's query_hdl) causes problem
> > > that write side cannot detect close on read side.
> > > > Is it possible to open read handle temporarily when pipe_data_available()
> > > is called?
> > 
> > That would be nice, but I have no idea how you could do that.
> 
> Hmm. Then, what about PoC code attached? This returns to Corinna's
> query_hdl, and counts read/write handles to detect closing reader side.
> 
> If the number of read handles is equal to number of write handles,
> only the pairs of write handle and query_hdl are alive. So, read pipe
> supposed to be closed.
> 
> This patch depends another patch I posted a few hours ago.

Revised a bit.

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>

[-- Attachment #2: 0001-Cygwin-pipe-Return-to-query_hdl-strategy-with-counti.patch --]
[-- Type: application/octet-stream, Size: 9924 bytes --]

From fc611be411e1d380f1258bb0b225d99757b4ef59 Mon Sep 17 00:00:00 2001
From: Takashi Yano <takashi.yano@nifty.ne.jp>
Date: Sun, 12 Sep 2021 19:59:53 +0900
Subject: [PATCH] Cygwin: pipe: Return to query_hdl strategy with counting r/w
 handles.

---
 winsup/cygwin/fhandler.h       |  5 ++
 winsup/cygwin/fhandler_pipe.cc | 98 ++++++++++++++++++++++++----------
 winsup/cygwin/select.cc        | 50 ++++++++++++++---
 winsup/cygwin/spawn.cc         |  2 +
 4 files changed, 121 insertions(+), 34 deletions(-)

diff --git a/winsup/cygwin/fhandler.h b/winsup/cygwin/fhandler.h
index 13fba9a14..f09af2c37 100644
--- a/winsup/cygwin/fhandler.h
+++ b/winsup/cygwin/fhandler.h
@@ -1176,10 +1176,15 @@ class fhandler_pipe_fifo: public fhandler_base
 {
  protected:
   size_t pipe_buf_size;
+  HANDLE query_hdl;
 
  public:
   fhandler_pipe_fifo ();
 
+  HANDLE get_query_handle () const { return query_hdl; }
+  void close_query_handle () { CloseHandle (query_hdl); query_hdl = NULL; }
+  bool reader_closed ();
+
   ssize_t __reg3 raw_write (const void *ptr, size_t len);
 
 };
diff --git a/winsup/cygwin/fhandler_pipe.cc b/winsup/cygwin/fhandler_pipe.cc
index 9b4255cfd..b051f5c03 100644
--- a/winsup/cygwin/fhandler_pipe.cc
+++ b/winsup/cygwin/fhandler_pipe.cc
@@ -56,6 +56,8 @@ fhandler_pipe::set_pipe_non_blocking (bool nonblocking)
   fpi.ReadMode = FILE_PIPE_BYTE_STREAM_MODE;
   fpi.CompletionMode = nonblocking ? FILE_PIPE_COMPLETE_OPERATION
     : FILE_PIPE_QUEUE_OPERATION;
+  if (query_hdl)
+    fpi.CompletionMode = FILE_PIPE_COMPLETE_OPERATION;
   status = NtSetInformationFile (get_handle (), &io, &fpi, sizeof fpi,
 				 FilePipeInformation);
   if (!NT_SUCCESS (status))
@@ -202,6 +204,8 @@ fhandler_pipe::open_setup (int flags)
       if (!read_mtx)
 	debug_printf ("CreateMutex failed: %E");
     }
+  if (get_dev () == FH_PIPEW && !query_hdl)
+    set_pipe_non_blocking (is_nonblocking ());
 }
 
 off_t
@@ -268,39 +272,22 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
   while (nbytes < len)
     {
       ULONG_PTR nbytes_now = 0;
-      size_t left = len - nbytes;
-      ULONG len1 = (ULONG) left;
+      ULONG len1 = (ULONG) (len - nbytes);
       waitret = WAIT_OBJECT_0;
 
       if (evt)
 	ResetEvent (evt);
-      if (!is_nonblocking ())
+      FILE_PIPE_LOCAL_INFORMATION fpli;
+      status = NtQueryInformationFile (get_handle (), &io,
+				       &fpli, sizeof (fpli),
+				       FilePipeLocalInformation);
+      if (NT_SUCCESS (status))
 	{
-	  FILE_PIPE_LOCAL_INFORMATION fpli;
-
-	  /* If the pipe is empty, don't request more bytes than pipe
-	     buffer size - 1. Pending read lowers WriteQuotaAvailable on
-	     the write side and thus affects select's ability to return
-	     more or less reliable info whether a write succeeds or not. */
-	  ULONG chunk = pipe_buf_size - 1;
-	  status = NtQueryInformationFile (get_handle (), &io,
-					   &fpli, sizeof (fpli),
-					   FilePipeLocalInformation);
-	  if (NT_SUCCESS (status))
-	    {
-	      if (fpli.ReadDataAvailable > 0)
-		chunk = left;
-	      else if (nbytes != 0)
-		break;
-	      else
-		chunk = fpli.InboundQuota - 1;
-	    }
-	  else if (nbytes != 0)
-	    break;
-
-	  if (len1 > chunk)
-	    len1 = chunk;
+	  if (fpli.ReadDataAvailable == 0 && nbytes != 0)
+	    break;
 	}
+      else if (nbytes != 0)
+	break;
       status = NtReadFile (get_handle (), evt, NULL, NULL, &io, ptr,
 			   len1, NULL, NULL);
       if (evt && status == STATUS_PENDING)
@@ -385,6 +372,16 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
   len = nbytes;
 }
 
+bool
+fhandler_pipe_fifo::reader_closed ()
+{
+  if (!query_hdl)
+    return false;
+  int n_reader = get_obj_handle_count (query_hdl);
+  int n_writer = get_obj_handle_count (get_handle ());
+  return n_reader == n_writer;
+}
+
 ssize_t __reg3
 fhandler_pipe_fifo::raw_write (const void *ptr, size_t len)
 {
@@ -493,7 +490,20 @@ fhandler_pipe_fifo::raw_write (const void *ptr, size_t len)
 			      get_obj_handle_count (select_sem), NULL);
 	  /* 0 bytes returned?  EAGAIN.  See above. */
 	  if (NT_SUCCESS (status) && nbytes == 0)
-	    set_errno (EAGAIN);
+	    {
+	      if (reader_closed ())
+		{
+		  set_errno (EPIPE);
+		  raise (SIGPIPE);
+		}
+	      else if (is_nonblocking ())
+		set_errno (EAGAIN);
+	      else
+		{
+		  cygwait (select_sem, 10);
+		  continue;
+		}
+	    }
 	}
       else if (STATUS_PIPE_IS_CLOSED (status))
 	{
@@ -522,6 +532,10 @@ fhandler_pipe::fixup_after_fork (HANDLE parent)
     fork_fixup (parent, read_mtx, "read_mtx");
   if (select_sem)
     fork_fixup (parent, select_sem, "select_sem");
+  /* Do not duplicate query_hdl if it has been already inherited. */
+  if (query_hdl && !get_obj_handle_count (query_hdl))
+    fork_fixup (parent, query_hdl, "query_hdl");
+
   fhandler_base::fixup_after_fork (parent);
 }
 
@@ -552,6 +566,15 @@ fhandler_pipe::dup (fhandler_base *child, int flags)
       ftp->close ();
       res = -1;
     }
+  else if (query_hdl &&
+	   !DuplicateHandle (GetCurrentProcess (), query_hdl,
+			    GetCurrentProcess (), &ftp->query_hdl,
+			    0, !(flags & O_CLOEXEC), DUPLICATE_SAME_ACCESS))
+    {
+      __seterrno ();
+      ftp->close ();
+      res = -1;
+    }
 
   debug_printf ("res %d", res);
   return res;
@@ -567,6 +590,8 @@ fhandler_pipe::close ()
       ReleaseSemaphore (select_sem, get_obj_handle_count (select_sem), NULL);
       CloseHandle (select_sem);
     }
+  if (query_hdl)
+    CloseHandle (query_hdl);
   return fhandler_base::close ();
 }
 
@@ -791,6 +816,23 @@ fhandler_pipe::create (fhandler_pipe *fhs[2], unsigned psize, int mode)
 	DuplicateHandle (GetCurrentProcess (), fhs[0]->select_sem,
 			 GetCurrentProcess (), &fhs[1]->select_sem,
 			 0, 1, DUPLICATE_SAME_ACCESS);
+      if (!DuplicateHandle (GetCurrentProcess (), r,
+			    GetCurrentProcess (), &fhs[1]->query_hdl,
+			    GENERIC_READ, !(mode & O_CLOEXEC), 0))
+	{
+	  CloseHandle (fhs[0]->select_sem);
+	  delete fhs[0];
+	  CloseHandle (r);
+	  CloseHandle (fhs[1]->select_sem);
+	  delete fhs[1];
+	  CloseHandle (w);
+	}
+      else
+	{
+	  /* Call set_pipe_non_blocking() again after creating query_hdl. */
+	  fhs[1]->set_pipe_non_blocking (fhs[1]->is_nonblocking ());
+	  res = 0;
+	}
     }
 
   debug_printf ("%R = pipe([%p, %p], %d, %y)", res, fhs[0], fhs[1], psize, mode);
diff --git a/winsup/cygwin/select.cc b/winsup/cygwin/select.cc
index 5e583434c..c569a059c 100644
--- a/winsup/cygwin/select.cc
+++ b/winsup/cygwin/select.cc
@@ -608,12 +608,43 @@ pipe_data_available (int fd, fhandler_base *fh, HANDLE h, bool writing)
     }
   if (writing)
     {
-      /* WriteQuotaAvailable is decremented by the number of bytes requested
-	 by a blocking reader on the other side of the pipe.  Cygwin readers
-	 are serialized and never request a number of bytes equivalent to the
-	 full buffer size.  So WriteQuotaAvailable is 0 only if either the
-	 read buffer on the other side is really full, or if we have non-Cygwin
-	 readers. */
+      /* If there is anything available in the pipe buffer then signal
+        that.  This means that a pipe could still block since you could
+        be trying to write more to the pipe than is available in the
+        buffer but that is the hazard of select().
+
+        Note that WriteQuotaAvailable is unreliable.
+
+        Usually WriteQuotaAvailable on the write side reflects the space
+        available in the inbound buffer on the read side.  However, if a
+        pipe read is currently pending, WriteQuotaAvailable on the write side
+        is decremented by the number of bytes the read side is requesting.
+        So it's possible (even likely) that WriteQuotaAvailable is 0, even
+        if the inbound buffer on the read side is not full.  This can lead to
+        a deadlock situation: The reader is waiting for data, but select
+        on the writer side assumes that no space is available in the read
+        side inbound buffer.
+
+        Consequently, the only reliable information is available on the
+        read side, so fetch info from the read side via the pipe-specific
+        query handle.  Use fpli.WriteQuotaAvailable as storage for the actual
+        interesting value, which is the OutboundQuota on the write side,
+        decremented by the number of bytes of data in that buffer. */
+      /* Note: Do not use NtQueryInformationFile() for query_hdl because
+	 NtQueryInformationFile() seems to interfere with reading pipes
+	 in non-cygwin apps. Instead, use PeekNamedPipe() here. */
+      if (fh->get_device () == FH_PIPEW)
+	{
+	  HANDLE query_hdl = ((fhandler_pipe *) fh)->get_query_handle ();
+	  if (query_hdl)
+	    {
+	      DWORD nbytes_in_pipe;
+	      PeekNamedPipe (query_hdl, NULL, 0, NULL, &nbytes_in_pipe, NULL);
+	      fpli.WriteQuotaAvailable = fpli.OutboundQuota - nbytes_in_pipe;
+	    }
+	  else
+	    return 1;
+	}
       if (fpli.WriteQuotaAvailable > 0)
 	{
 	  paranoid_printf ("fd %d, %s, write: size %u, avail %u", fd,
@@ -712,6 +743,13 @@ out:
   h = fh->get_output_handle ();
   if (s->write_selected && dev != FH_PIPER)
     {
+      if (dev == FH_PIPEW && ((fhandler_pipe *) fh)->reader_closed ())
+	{
+	  gotone += s->write_ready = true;
+	  if (s->except_selected)
+	    gotone += s->except_ready = true;
+	  return gotone;
+	}
       gotone += s->write_ready =  pipe_data_available (s->fd, fh, h, true);
       select_printf ("write: %s, gotone %d", fh->get_name (), gotone);
     }
diff --git a/winsup/cygwin/spawn.cc b/winsup/cygwin/spawn.cc
index 0bde0b04d..e902b5080 100644
--- a/winsup/cygwin/spawn.cc
+++ b/winsup/cygwin/spawn.cc
@@ -657,6 +657,8 @@ child_info_spawn::worker (const char *prog_arg, const char *const *argv,
 		ptys->create_invisible_console ();
 		ptys->setup_locale ();
 	      }
+	    else if (cfd->get_dev () == FH_PIPEW)
+	      ((fhandler_pipe *)(fhandler_base *) cfd)->close_query_handle ();
 	}
 
       bool enable_pcon = false;
-- 
2.33.0



* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-12  6:23                                   ` Takashi Yano
@ 2021-09-12 14:39                                     ` Ken Brown
  2021-09-13  9:11                                       ` Corinna Vinschen
  0 siblings, 1 reply; 250+ messages in thread
From: Ken Brown @ 2021-09-12 14:39 UTC (permalink / raw)
  To: cygwin-developers

On 9/12/2021 2:23 AM, Takashi Yano wrote:
> On Sat, 11 Sep 2021 09:12:02 -0400
> Ken Brown wrote:
>> On 9/10/2021 10:35 PM, Takashi Yano wrote:
>>> On Fri, 10 Sep 2021 22:17:21 -0400
>>> Ken Brown wrote:
>>>> On 9/10/2021 6:57 PM, Takashi Yano wrote:
>>>>> On Fri, 10 Sep 2021 11:17:58 -0400
>>>>> Ken Brown wrote:
>>>>>> I've rerun your test with the latest version, and the test results are similar.
>>>>>> I've also run a suite of fifo tests that I've accumulated, and they all pass
>>>>>> also, so I pushed your patch.
>>>>>>
>>>>>> I think we're in pretty good shape now.  The only detail remaining, AFAIK, is
>>>>>> how to best avoid a deadlock if the pipe has been created by a non-Cygwin
>>>>>> process.  I've proposed a timeout, but maybe there's a better idea.
>>>>>
>>>>> I am not quite sure what the problem is, but isn't the following
>>>>> patch enough?
>>>>>
>>>>> diff --git a/winsup/cygwin/fhandler.h b/winsup/cygwin/fhandler.h
>>>>> index d309be2f7..13fba9a14 100644
>>>>> --- a/winsup/cygwin/fhandler.h
>>>>> +++ b/winsup/cygwin/fhandler.h
>>>>> @@ -1205,6 +1205,7 @@ public:
>>>>>       select_record *select_except (select_stuff *);
>>>>>       char *get_proc_fd_name (char *buf);
>>>>>       int open (int flags, mode_t mode = 0);
>>>>> +  void open_setup (int flags);
>>>>>       void fixup_after_fork (HANDLE);
>>>>>       int dup (fhandler_base *child, int);
>>>>>       int close ();
>>>>> diff --git a/winsup/cygwin/fhandler_pipe.cc b/winsup/cygwin/fhandler_pipe.cc
>>>>> index 6994a5dce..d84e6ad84 100644
>>>>> --- a/winsup/cygwin/fhandler_pipe.cc
>>>>> +++ b/winsup/cygwin/fhandler_pipe.cc
>>>>> @@ -191,6 +191,17 @@ out:
>>>>>       return 0;
>>>>>     }
>>>>>
>>>>> +void
>>>>> +fhandler_pipe::open_setup (int flags)
>>>>> +{
>>>>> +  fhandler_base::open_setup (flags);
>>>>> +  if (get_dev () == FH_PIPER && !read_mtx)
>>>>> +    {
>>>>> +      SECURITY_ATTRIBUTES *sa = sec_none_cloexec (flags);
>>>>> +      read_mtx = CreateMutexW (sa, FALSE, NULL);
>>>>> +    }
>>>>> +}
>>>>> +
>>>>>     off_t
>>>>>     fhandler_pipe::lseek (off_t offset, int whence)
>>>>>     {
>>>>>
>>>>>
>>>>> AFAIK, another problem remaining is:
>>>>>
>>>>> On Mon, 6 Sep 2021 14:49:55 +0200
>>>>> Corinna Vinschen wrote:
>>>>>> - What about calling select for writing on pipes read by non-Cygwin
>>>>>>      processes?  In that case, we still can't rely on WriteQuotaAvailable,
>>>>>>      just as before.
>>>>
>>>> This is the problem I was talking about.  In this case the non-Cygwin process
>>>> might have a large pending read, so that the Cygwin process calling select on
>>>> the write side will see WriteQuotaAvailable == 0.  This could lead to a deadlock
>>>> with the Cygwin process waiting for write ready while the non-Cygwin process is
>>>> blocked trying to read.
>>>
>>> Then, the above patch is for another issue.
>>> The problem happens when you:
>>> 1) Start a command prompt.
>>> 2) Run 'echo AAAAAAAAAAAA | \cygwin64\bin\cat'.
>>> This causes cat to hang. In this case, the pipe is created by cmd.exe.
>>> Therefore, read_mtx is not created.
>>
>> Confirmed, and your patch fixes it.  Maybe you should check for error in the
>> call to CreateMutexW and print a debug message in that case.
> 
> I added the debug message.

LGTM.

Ken


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-12 11:04                                     ` Takashi Yano
@ 2021-09-12 15:10                                       ` Ken Brown
  2021-09-12 21:46                                         ` Ken Brown
  2021-09-12 23:41                                         ` Takashi Yano
  2021-09-13 17:42                                       ` Ken Brown
  2021-09-13 18:32                                       ` Corinna Vinschen
  2 siblings, 2 replies; 250+ messages in thread
From: Ken Brown @ 2021-09-12 15:10 UTC (permalink / raw)
  To: cygwin-developers

On 9/12/2021 7:04 AM, Takashi Yano wrote:
> On Sun, 12 Sep 2021 17:48:49 +0900
> Takashi Yano wrote:
>> [...]
>>>
>>>>> My suggestion is that we impose a timeout in this situation, after which select
>>>>> reports write ready.
>>>>
>>>> Keeping a read handle in the write pipe (Corinna's query_hdl) causes the
>>>> problem that the write side cannot detect a close on the read side.
>>>> Is it possible to open the read handle temporarily when
>>>> pipe_data_available() is called?
>>>
>>> That would be nice, but I have no idea how you could do that.
>>
>> Hmm. Then, what about the attached PoC code? This returns to Corinna's
>> query_hdl and counts read/write handles to detect the reader side closing.
>>
>> If the number of read handles is equal to the number of write handles,
>> only the pairs of write handle and query_hdl are alive, so the read side
>> is presumed to be closed.
>>
>> This patch depends on another patch I posted a few hours ago.
> 
> Revised a bit.

I don't see how this solves the problem.  In the case we were worried about 
where we have a non-Cygwin reader, the writer has no query_hdl, and you're just 
always reporting write ready, aren't you?  Or am I missing something?
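
To make the question concrete, here is a simplified model of the write-side check in pipe_data_available() as the patch appears to implement it. It is a sketch, not the patch itself: write_side_state and write_ready are hypothetical names, and bytes_in_pipe stands in for what PeekNamedPipe() on query_hdl would report.

```cpp
#include <cstdint>

// Hypothetical summary of what the write side knows when select() asks.
struct write_side_state
{
  bool has_query_hdl;            // false if no query_hdl exists for this writer
  std::uint32_t outbound_quota;  // pipe buffer size seen from the write side
  std::uint32_t bytes_in_pipe;   // unread bytes, as read via query_hdl
};

// Mirrors the two branches of the patched check: without a query handle the
// writer is reported ready unconditionally; with one, it is ready while the
// buffer holds fewer bytes than its quota.
inline bool
write_ready (const write_side_state &s)
{
  if (!s.has_query_hdl)
    return true;
  return s.outbound_quota > s.bytes_in_pipe;
}
```

The first branch is the one in question: with no query_hdl, the state of the reader never enters the decision.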

Ken


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-12 15:10                                       ` Ken Brown
@ 2021-09-12 21:46                                         ` Ken Brown
  2021-09-12 23:54                                           ` Takashi Yano
  2021-09-13  9:42                                           ` Corinna Vinschen
  2021-09-12 23:41                                         ` Takashi Yano
  1 sibling, 2 replies; 250+ messages in thread
From: Ken Brown @ 2021-09-12 21:46 UTC (permalink / raw)
  To: cygwin-developers

On 9/12/2021 11:10 AM, Ken Brown wrote:
> [...]
> 
> I don't see how this solves the problem.  In the case we were worried about 
> where we have a non-Cygwin reader, the writer has no query_hdl, and you're just 
> always reporting write ready, aren't you?  Or am I missing something?

BTW, we could just decide that always reporting write ready in this corner case 
is acceptable.  But then we could just do that without going back to query_hdl.
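
The two options for this corner case (always report write ready, or report it only after a timeout as proposed earlier in the thread) can be written as one policy with a single parameter. This is a minimal sketch with hypothetical names, not related to the actual select.cc code:

```cpp
#include <chrono>

using ms = std::chrono::milliseconds;

// Hypothetical policy: trust a positive quota; otherwise report write ready
// once select has already waited for at least `timeout`.
inline bool
select_write_ready (unsigned long write_quota, ms waited, ms timeout)
{
  if (write_quota > 0)
    return true;
  return waited >= timeout;
}
```

With a timeout of zero this degenerates to "always report write ready". A nonzero timeout only delays that answer while the quota stays at 0.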

Ken


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-12 15:10                                       ` Ken Brown
  2021-09-12 21:46                                         ` Ken Brown
@ 2021-09-12 23:41                                         ` Takashi Yano
  1 sibling, 0 replies; 250+ messages in thread
From: Takashi Yano @ 2021-09-12 23:41 UTC (permalink / raw)
  To: cygwin-developers

On Sun, 12 Sep 2021 11:10:54 -0400
Ken Brown wrote:
> [...]
> 
> I don't see how this solves the problem.  In the case we were worried about 
> where we have a non-Cygwin reader, the writer has no query_hdl, and you're just 
> always reporting write ready, aren't you?  Or am I missing something?

Do you assume the pipe is created by a non-Cygwin app?

My assumption is:
1) The pipe is created by Cygwin.
2) The writer is a Cygwin app using select().
3) The reader is a non-Cygwin app reading a large block.
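
In this scenario the difference between the two ways of measuring free space can be shown with plain arithmetic. The sketch below is a simplified model based on the behaviour described in this thread, not on documented Windows semantics. The function names are hypothetical and clamping at 0 is an assumption.

```cpp
#include <cstdint>

// Model of the value a writer sees in WriteQuotaAvailable: buffer size minus
// buffered bytes minus the bytes requested by a pending read.
inline std::uint32_t
write_quota_available (std::uint32_t buf_size, std::uint32_t buffered,
                       std::uint32_t pending_read)
{
  // Sum in 64 bits so that a huge pending read cannot overflow.
  std::uint64_t used = static_cast<std::uint64_t> (buffered) + pending_read;
  return used >= buf_size ? 0 : static_cast<std::uint32_t> (buf_size - used);
}

// What the query_hdl approach computes instead: buffer size minus the bytes
// actually sitting in the pipe, independent of any pending read.
inline std::uint32_t
space_via_query_hdl (std::uint32_t buf_size, std::uint32_t buffered)
{
  return buffered >= buf_size ? 0 : buf_size - buffered;
}
```

With an empty 64 KiB pipe and a reader blocked in a 64 KiB read, the first function yields 0, so select() would wait forever. The second yields the full buffer size, so the writer proceeds.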

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-12 21:46                                         ` Ken Brown
@ 2021-09-12 23:54                                           ` Takashi Yano
  2021-09-13  2:19                                             ` Ken Brown
  2021-09-13  8:40                                             ` Takashi Yano
  2021-09-13  9:42                                           ` Corinna Vinschen
  1 sibling, 2 replies; 250+ messages in thread
From: Takashi Yano @ 2021-09-12 23:54 UTC (permalink / raw)
  To: cygwin-developers

On Sun, 12 Sep 2021 17:46:47 -0400
Ken Brown wrote:
> On 9/12/2021 11:10 AM, Ken Brown wrote:
> > On 9/12/2021 7:04 AM, Takashi Yano wrote:
> >> On Sun, 12 Sep 2021 17:48:49 +0900
> >> Takashi Yano wrote:
> >>> On Sat, 11 Sep 2021 09:12:02 -0400
> >>> Ken Brown wrote:
> >>>> On 9/10/2021 10:35 PM, Takashi Yano wrote:
> >>>>> On Fri, 10 Sep 2021 22:17:21 -0400
> >>>>> Ken Brown wrote:
> >>>>>> On 9/10/2021 6:57 PM, Takashi Yano wrote:
> >>>>>>> On Fri, 10 Sep 2021 11:17:58 -0400
> >>>>>>> Ken Brown wrote:
> >>>>>>>> I've rerun your test with the latest version, and the test results are 
> >>>>>>>> similar.
> >>>>>>>>      I've also run a suite of fifo tests that I've accumulated, and they 
> >>>>>>>> all pass
> >>>>>>>> also, so I pushed your patch.
> >>>>>>>>
> >>>>>>>> I think we're in pretty good shape now.  The only detail remaining, 
> >>>>>>>> AFAIK, is
> >>>>>>>> how to best avoid a deadlock if the pipe has been created by a non-Cygwin
> >>>>>>>> process.  I've proposed a timeout, but maybe there's a better idea.
> >>>>>>>
> >>>>>>> I am not pretty sure what is the problem, but is not the following
> >>>>>>> patch enough?
> >>>>>>>
> >>>>>>> diff --git a/winsup/cygwin/fhandler.h b/winsup/cygwin/fhandler.h
> >>>>>>> index d309be2f7..13fba9a14 100644
> >>>>>>> --- a/winsup/cygwin/fhandler.h
> >>>>>>> +++ b/winsup/cygwin/fhandler.h
> >>>>>>> @@ -1205,6 +1205,7 @@ public:
> >>>>>>>       select_record *select_except (select_stuff *);
> >>>>>>>       char *get_proc_fd_name (char *buf);
> >>>>>>>       int open (int flags, mode_t mode = 0);
> >>>>>>> +  void open_setup (int flags);
> >>>>>>>       void fixup_after_fork (HANDLE);
> >>>>>>>       int dup (fhandler_base *child, int);
> >>>>>>>       int close ();
> >>>>>>> diff --git a/winsup/cygwin/fhandler_pipe.cc b/winsup/cygwin/fhandler_pipe.cc
> >>>>>>> index 6994a5dce..d84e6ad84 100644
> >>>>>>> --- a/winsup/cygwin/fhandler_pipe.cc
> >>>>>>> +++ b/winsup/cygwin/fhandler_pipe.cc
> >>>>>>> @@ -191,6 +191,17 @@ out:
> >>>>>>>       return 0;
> >>>>>>>     }
> >>>>>>>
> >>>>>>> +void
> >>>>>>> +fhandler_pipe::open_setup (int flags)
> >>>>>>> +{
> >>>>>>> +  fhandler_base::open_setup (flags);
> >>>>>>> +  if (get_dev () == FH_PIPER && !read_mtx)
> >>>>>>> +    {
> >>>>>>> +      SECURITY_ATTRIBUTES *sa = sec_none_cloexec (flags);
> >>>>>>> +      read_mtx = CreateMutexW (sa, FALSE, NULL);
> >>>>>>> +    }
> >>>>>>> +}
> >>>>>>> +
> >>>>>>>     off_t
> >>>>>>>     fhandler_pipe::lseek (off_t offset, int whence)
> >>>>>>>     {
> >>>>>>>
> >>>>>>>
> >>>>>>> AFAIK, another problem remaining is:
> >>>>>>>
> >>>>>>> On Mon, 6 Sep 2021 14:49:55 +0200
> >>>>>>> Corinna Vinschen wrote:
> >>>>>>>> - What about calling select for writing on pipes read by non-Cygwin
> >>>>>>>>      processes?  In that case, we still can't rely on WriteQuotaAvailable,
> >>>>>>>>      just as before.
> >>>>>>
> >>>>>> This is the problem I was talking about.  In this case the non-Cygwin process
> >>>>>> might have a large pending read, so that the Cygwin process calling select on
> >>>>>> the write side will see WriteQuotaAvailable == 0.  This could lead to a 
> >>>>>> deadlock
> >>>>>> with the Cygwin process waiting for write ready while the non-Cygwin 
> >>>>>> process is
> >>>>>> blocked trying to read.
> >>>>>
> >>>>> Then, the above patch is for another issue.
> >>>>> The problem happens when:
> >>>>> 1) Start command prompt.
> >>>>> 2) Run 'echo AAAAAAAAAAAA | \cygwin64\bin\cat
> >>>>> This causes hang up in cat. In this case, pipe is created by cmd.exe.
> >>>>> Therefore, read_mtx is not created.
> >>>>
> >>>> Confirmed, and your patch fixes it.  Maybe you should check for error in the
> >>>> call to CreateMutexW and print a debug message in that case.
> >>>>
> >>>>>> My suggestion is that we impose a timeout in this situation, after which 
> >>>>>> select
> >>>>>> reports write ready.
> >>>>>
> >>>>> Keeping read handle in write pipe (Corinna's query_hdl) causes problem
> >>>>> that write side cannot detect close on read side.
> >>>>> Is it possible to open read handle temporally when pipe_data_available()
> >>>>> is called?
> >>>>
> >>>> That would be nice, but I have no idea how you could do that.
> >>>
> >>> Hmm. Then, what about PoC code attached? This returns to Corinna's
> >>> query_hdl, and counts read/write handles to detect closing reader side.
> >>>
> >>> If the number of read handles is equal to number of write handles,
> >>> only the pairs of write handle and query_hdl are alive. So, read pipe
> >>> supposed to be closed.
> >>>
> >>> This patch depends another patch I posted a few hours ago.
> >>
> >> Revised a bit.
> > 
> > I don't see how this solves the problem.  In the case we were worried about 
> > where we have a non-Cygwin reader, the writer has no query_hdl, and you're just 
> > always reporting write ready, aren't you?  Or am I missing something?
> 
> BTW, we could just decide that always reporting write ready in this corner case 
> is acceptable.  But then we could just do that without going back to query_hdl.

The various combinations of cygwin and non-cygwin cases result in:

W P R: current, query_hdl
c c c: OK     , OK
c c n: NG     , OK
c n c: OK     , OK
c n n: NG     , select() always reports write ready

where

W: Writer
P: Pipe
R: Reader
c: cygwin
n: non-cygwin

* Reader requests a larger block than the pipe size.
* Writer cannot be non-cygwin because we assume the case
  that the writer uses select().

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-12 23:54                                           ` Takashi Yano
@ 2021-09-13  2:19                                             ` Ken Brown
  2021-09-13  8:40                                             ` Takashi Yano
  1 sibling, 0 replies; 250+ messages in thread
From: Ken Brown @ 2021-09-13  2:19 UTC (permalink / raw)
  To: cygwin-developers

On 9/12/2021 7:54 PM, Takashi Yano wrote:
> On Sun, 12 Sep 2021 17:46:47 -0400
> Ken Brown wrote:
>> On 9/12/2021 11:10 AM, Ken Brown wrote:
>>> On 9/12/2021 7:04 AM, Takashi Yano wrote:
>>>> On Sun, 12 Sep 2021 17:48:49 +0900
>>>> Takashi Yano wrote:
>>>>> On Sat, 11 Sep 2021 09:12:02 -0400
>>>>> Ken Brown wrote:
>>>>>> On 9/10/2021 10:35 PM, Takashi Yano wrote:
>>>>>>> On Fri, 10 Sep 2021 22:17:21 -0400
>>>>>>> Ken Brown wrote:
>>>>>>>> On 9/10/2021 6:57 PM, Takashi Yano wrote:
>>>>>>>>> On Fri, 10 Sep 2021 11:17:58 -0400
>>>>>>>>> Ken Brown wrote:
>>>>>>>>>> I've rerun your test with the latest version, and the test results are
>>>>>>>>>> similar.
>>>>>>>>>>       I've also run a suite of fifo tests that I've accumulated, and they
>>>>>>>>>> all pass
>>>>>>>>>> also, so I pushed your patch.
>>>>>>>>>>
>>>>>>>>>> I think we're in pretty good shape now.  The only detail remaining,
>>>>>>>>>> AFAIK, is
>>>>>>>>>> how to best avoid a deadlock if the pipe has been created by a non-Cygwin
>>>>>>>>>> process.  I've proposed a timeout, but maybe there's a better idea.
>>>>>>>>>
>>>>>>>>> I am not pretty sure what is the problem, but is not the following
>>>>>>>>> patch enough?
>>>>>>>>>
>>>>>>>>> diff --git a/winsup/cygwin/fhandler.h b/winsup/cygwin/fhandler.h
>>>>>>>>> index d309be2f7..13fba9a14 100644
>>>>>>>>> --- a/winsup/cygwin/fhandler.h
>>>>>>>>> +++ b/winsup/cygwin/fhandler.h
>>>>>>>>> @@ -1205,6 +1205,7 @@ public:
>>>>>>>>>        select_record *select_except (select_stuff *);
>>>>>>>>>        char *get_proc_fd_name (char *buf);
>>>>>>>>>        int open (int flags, mode_t mode = 0);
>>>>>>>>> +  void open_setup (int flags);
>>>>>>>>>        void fixup_after_fork (HANDLE);
>>>>>>>>>        int dup (fhandler_base *child, int);
>>>>>>>>>        int close ();
>>>>>>>>> diff --git a/winsup/cygwin/fhandler_pipe.cc b/winsup/cygwin/fhandler_pipe.cc
>>>>>>>>> index 6994a5dce..d84e6ad84 100644
>>>>>>>>> --- a/winsup/cygwin/fhandler_pipe.cc
>>>>>>>>> +++ b/winsup/cygwin/fhandler_pipe.cc
>>>>>>>>> @@ -191,6 +191,17 @@ out:
>>>>>>>>>        return 0;
>>>>>>>>>      }
>>>>>>>>>
>>>>>>>>> +void
>>>>>>>>> +fhandler_pipe::open_setup (int flags)
>>>>>>>>> +{
>>>>>>>>> +  fhandler_base::open_setup (flags);
>>>>>>>>> +  if (get_dev () == FH_PIPER && !read_mtx)
>>>>>>>>> +    {
>>>>>>>>> +      SECURITY_ATTRIBUTES *sa = sec_none_cloexec (flags);
>>>>>>>>> +      read_mtx = CreateMutexW (sa, FALSE, NULL);
>>>>>>>>> +    }
>>>>>>>>> +}
>>>>>>>>> +
>>>>>>>>>      off_t
>>>>>>>>>      fhandler_pipe::lseek (off_t offset, int whence)
>>>>>>>>>      {
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> AFAIK, another problem remaining is:
>>>>>>>>>
>>>>>>>>> On Mon, 6 Sep 2021 14:49:55 +0200
>>>>>>>>> Corinna Vinschen wrote:
>>>>>>>>>> - What about calling select for writing on pipes read by non-Cygwin
>>>>>>>>>>       processes?  In that case, we still can't rely on WriteQuotaAvailable,
>>>>>>>>>>       just as before.
>>>>>>>>
>>>>>>>> This is the problem I was talking about.  In this case the non-Cygwin process
>>>>>>>> might have a large pending read, so that the Cygwin process calling select on
>>>>>>>> the write side will see WriteQuotaAvailable == 0.  This could lead to a
>>>>>>>> deadlock
>>>>>>>> with the Cygwin process waiting for write ready while the non-Cygwin
>>>>>>>> process is
>>>>>>>> blocked trying to read.
>>>>>>>
>>>>>>> Then, the above patch is for another issue.
>>>>>>> The problem happens when:
>>>>>>> 1) Start command prompt.
>>>>>>> 2) Run 'echo AAAAAAAAAAAA | \cygwin64\bin\cat
>>>>>>> This causes hang up in cat. In this case, pipe is created by cmd.exe.
>>>>>>> Therefore, read_mtx is not created.
>>>>>>
>>>>>> Confirmed, and your patch fixes it.  Maybe you should check for error in the
>>>>>> call to CreateMutexW and print a debug message in that case.
>>>>>>
>>>>>>>> My suggestion is that we impose a timeout in this situation, after which
>>>>>>>> select
>>>>>>>> reports write ready.
>>>>>>>
>>>>>>> Keeping read handle in write pipe (Corinna's query_hdl) causes problem
>>>>>>> that write side cannot detect close on read side.
>>>>>>> Is it possible to open read handle temporally when pipe_data_available()
>>>>>>> is called?
>>>>>>
>>>>>> That would be nice, but I have no idea how you could do that.
>>>>>
>>>>> Hmm. Then, what about PoC code attached? This returns to Corinna's
>>>>> query_hdl, and counts read/write handles to detect closing reader side.
>>>>>
>>>>> If the number of read handles is equal to number of write handles,
>>>>> only the pairs of write handle and query_hdl are alive. So, read pipe
>>>>> supposed to be closed.
>>>>>
>>>>> This patch depends another patch I posted a few hours ago.
>>>>
>>>> Revised a bit.
>>>
>>> I don't see how this solves the problem.  In the case we were worried about
>>> where we have a non-Cygwin reader, the writer has no query_hdl, and you're just
>>> always reporting write ready, aren't you?  Or am I missing something?
>>
>> BTW, we could just decide that always reporting write ready in this corner case
>> is acceptable.  But then we could just do that without going back to query_hdl.
> 
> The various combinations of cygwin and non-cygwin cases result in:
> 
> W P R: current, query_hdl
> c c c: OK     , OK
> c c n: NG     , OK
> c n c: OK     , OK
> c n n: NG     , select() always reports write ready

This is the case I've been talking about.  I'm sorry for not being more clear.

> where
> 
> W: Writer
> P: Pipe
> R: Reader
> c: cygwin
> n: non-cygwin
> 
> * Reader requests a larger block than the pipe size.
> * Writer cannot be non-cygwin because we assume the case
>    that the writer uses select().

Ken

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-12 23:54                                           ` Takashi Yano
  2021-09-13  2:19                                             ` Ken Brown
@ 2021-09-13  8:40                                             ` Takashi Yano
  2021-09-13 12:51                                               ` Ken Brown
  1 sibling, 1 reply; 250+ messages in thread
From: Takashi Yano @ 2021-09-13  8:40 UTC (permalink / raw)
  To: cygwin-developers

On Mon, 13 Sep 2021 08:54:31 +0900
Takashi Yano wrote:
> On Sun, 12 Sep 2021 17:46:47 -0400
> Ken Brown wrote:
> > On 9/12/2021 11:10 AM, Ken Brown wrote:
> > > On 9/12/2021 7:04 AM, Takashi Yano wrote:
> > >> On Sun, 12 Sep 2021 17:48:49 +0900
> > >> Takashi Yano wrote:
> > >>> On Sat, 11 Sep 2021 09:12:02 -0400
> > >>> Ken Brown wrote:
> > >>>> On 9/10/2021 10:35 PM, Takashi Yano wrote:
> > >>>>> On Fri, 10 Sep 2021 22:17:21 -0400
> > >>>>> Ken Brown wrote:
> > >>>>>> On 9/10/2021 6:57 PM, Takashi Yano wrote:
> > >>>>>>> On Fri, 10 Sep 2021 11:17:58 -0400
> > >>>>>>> Ken Brown wrote:
> > >>>>>>>> I've rerun your test with the latest version, and the test results are 
> > >>>>>>>> similar.
> > >>>>>>>>      I've also run a suite of fifo tests that I've accumulated, and they 
> > >>>>>>>> all pass
> > >>>>>>>> also, so I pushed your patch.
> > >>>>>>>>
> > >>>>>>>> I think we're in pretty good shape now.  The only detail remaining, 
> > >>>>>>>> AFAIK, is
> > >>>>>>>> how to best avoid a deadlock if the pipe has been created by a non-Cygwin
> > >>>>>>>> process.  I've proposed a timeout, but maybe there's a better idea.
> > >>>>>>>
> > >>>>>>> I am not pretty sure what is the problem, but is not the following
> > >>>>>>> patch enough?
> > >>>>>>>
> > >>>>>>> diff --git a/winsup/cygwin/fhandler.h b/winsup/cygwin/fhandler.h
> > >>>>>>> index d309be2f7..13fba9a14 100644
> > >>>>>>> --- a/winsup/cygwin/fhandler.h
> > >>>>>>> +++ b/winsup/cygwin/fhandler.h
> > >>>>>>> @@ -1205,6 +1205,7 @@ public:
> > >>>>>>>       select_record *select_except (select_stuff *);
> > >>>>>>>       char *get_proc_fd_name (char *buf);
> > >>>>>>>       int open (int flags, mode_t mode = 0);
> > >>>>>>> +  void open_setup (int flags);
> > >>>>>>>       void fixup_after_fork (HANDLE);
> > >>>>>>>       int dup (fhandler_base *child, int);
> > >>>>>>>       int close ();
> > >>>>>>> diff --git a/winsup/cygwin/fhandler_pipe.cc b/winsup/cygwin/fhandler_pipe.cc
> > >>>>>>> index 6994a5dce..d84e6ad84 100644
> > >>>>>>> --- a/winsup/cygwin/fhandler_pipe.cc
> > >>>>>>> +++ b/winsup/cygwin/fhandler_pipe.cc
> > >>>>>>> @@ -191,6 +191,17 @@ out:
> > >>>>>>>       return 0;
> > >>>>>>>     }
> > >>>>>>>
> > >>>>>>> +void
> > >>>>>>> +fhandler_pipe::open_setup (int flags)
> > >>>>>>> +{
> > >>>>>>> +  fhandler_base::open_setup (flags);
> > >>>>>>> +  if (get_dev () == FH_PIPER && !read_mtx)
> > >>>>>>> +    {
> > >>>>>>> +      SECURITY_ATTRIBUTES *sa = sec_none_cloexec (flags);
> > >>>>>>> +      read_mtx = CreateMutexW (sa, FALSE, NULL);
> > >>>>>>> +    }
> > >>>>>>> +}
> > >>>>>>> +
> > >>>>>>>     off_t
> > >>>>>>>     fhandler_pipe::lseek (off_t offset, int whence)
> > >>>>>>>     {
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> AFAIK, another problem remaining is:
> > >>>>>>>
> > >>>>>>> On Mon, 6 Sep 2021 14:49:55 +0200
> > >>>>>>> Corinna Vinschen wrote:
> > >>>>>>>> - What about calling select for writing on pipes read by non-Cygwin
> > >>>>>>>>      processes?  In that case, we still can't rely on WriteQuotaAvailable,
> > >>>>>>>>      just as before.
> > >>>>>>
> > >>>>>> This is the problem I was talking about.  In this case the non-Cygwin process
> > >>>>>> might have a large pending read, so that the Cygwin process calling select on
> > >>>>>> the write side will see WriteQuotaAvailable == 0.  This could lead to a 
> > >>>>>> deadlock
> > >>>>>> with the Cygwin process waiting for write ready while the non-Cygwin 
> > >>>>>> process is
> > >>>>>> blocked trying to read.
> > >>>>>
> > >>>>> Then, the above patch is for another issue.
> > >>>>> The problem happens when:
> > >>>>> 1) Start command prompt.
> > >>>>> 2) Run 'echo AAAAAAAAAAAA | \cygwin64\bin\cat
> > >>>>> This causes hang up in cat. In this case, pipe is created by cmd.exe.
> > >>>>> Therefore, read_mtx is not created.
> > >>>>
> > >>>> Confirmed, and your patch fixes it.  Maybe you should check for error in the
> > >>>> call to CreateMutexW and print a debug message in that case.
> > >>>>
> > >>>>>> My suggestion is that we impose a timeout in this situation, after which 
> > >>>>>> select
> > >>>>>> reports write ready.
> > >>>>>
> > >>>>> Keeping read handle in write pipe (Corinna's query_hdl) causes problem
> > >>>>> that write side cannot detect close on read side.
> > >>>>> Is it possible to open read handle temporally when pipe_data_available()
> > >>>>> is called?
> > >>>>
> > >>>> That would be nice, but I have no idea how you could do that.
> > >>>
> > >>> Hmm. Then, what about PoC code attached? This returns to Corinna's
> > >>> query_hdl, and counts read/write handles to detect closing reader side.
> > >>>
> > >>> If the number of read handles is equal to number of write handles,
> > >>> only the pairs of write handle and query_hdl are alive. So, read pipe
> > >>> supposed to be closed.
> > >>>
> > >>> This patch depends another patch I posted a few hours ago.
> > >>
> > >> Revised a bit.
> > > 
> > > I don't see how this solves the problem.  In the case we were worried about 
> > > where we have a non-Cygwin reader, the writer has no query_hdl, and you're just 
> > > always reporting write ready, aren't you?  Or am I missing something?
> > 
> > BTW, we could just decide that always reporting write ready in this corner case 
> > is acceptable.  But then we could just do that without going back to query_hdl.
> 
> The various combinations of cygwin and non-cygwin cases result in:
> 
> W P R: current, query_hdl
> c c c: OK     , OK
> c c n: NG     , OK
> c n c: OK     , OK
> c n n: NG     , select() always reports write ready

Sorry, this was not correct. In fact,
W P R: current, query_hdl
c c c: OK     , OK
c c n: NG     , OK
c n c: OK     , select() always reports write ready
c n n: NG     , select() always reports write ready

:-(
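
To make the corrected table easier to check, here is a rough sketch that
encodes it as two small C++ functions.  The enum and function names are
made up for illustration and are not part of the Cygwin sources; each
rule is only what can be read off the four rows above (the writer is
always cygwin).

```cpp
#include <cassert>

// Illustrative only: encodes the corrected W/P/R table from this mail.
// All names here are hypothetical; nothing below is Cygwin code.
enum class Side { Cygwin, NonCygwin };
enum class Result { OK, NG, AlwaysWriteReady };

// "current" column: the rows are NG exactly when the reader is non-cygwin.
inline Result current_result (Side pipe_creator, Side reader)
{
  (void) pipe_creator;  // the "current" column does not depend on it
  return reader == Side::Cygwin ? Result::OK : Result::NG;
}

// "query_hdl" column: OK when the pipe was created by cygwin; otherwise
// select() always reports write ready.
inline Result query_hdl_result (Side pipe_creator, Side reader)
{
  (void) reader;  // the "query_hdl" column does not depend on it
  return pipe_creator == Side::Cygwin ? Result::OK
                                      : Result::AlwaysWriteReady;
}
```

For example, the "c n c" row corresponds to
assert (query_hdl_result (Side::NonCygwin, Side::Cygwin)
        == Result::AlwaysWriteReady);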

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-11  2:35                               ` Takashi Yano
  2021-09-11 13:12                                 ` Ken Brown
@ 2021-09-13  9:07                                 ` Corinna Vinschen
  2021-09-20 12:52                                   ` Takashi Yano
  1 sibling, 1 reply; 250+ messages in thread
From: Corinna Vinschen @ 2021-09-13  9:07 UTC (permalink / raw)
  To: cygwin-developers

On Sep 11 11:35, Takashi Yano wrote:
> On Fri, 10 Sep 2021 22:17:21 -0400
> Ken Brown wrote:
> > My suggestion is that we impose a timeout in this situation, after which select 
> > reports write ready.
> 
> Keeping read handle in write pipe (Corinna's query_hdl) causes problem
> that write side cannot detect close on read side.
> Is it possible to open read handle temporally when pipe_data_available()
> is called?

1. You would have to know which process keeps the other side of the pipe.
2. You would have to have the permission to open the other process to
   duplicate the pipe into your own process
3. You would have to know the HANDLE value of the read side of your pipe
   in that other process.

Point 1 is (kind of) doable using GetNamedPipeClientProcessId or
GetNamedPipeServerProcessId.  It's not clear how reliable these
functions are, given that both pipe sides are created by the same
process and then usually inherited by two child processes communicating
over that pipe.

Point 2 is most of the time the case, especially when talking with
native processes.

Point 3 requires some sort of IPC.

Having said that, I think this is too complicated.


Corinna

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-12 14:39                                     ` Ken Brown
@ 2021-09-13  9:11                                       ` Corinna Vinschen
  2021-09-13 12:30                                         ` Ken Brown
  0 siblings, 1 reply; 250+ messages in thread
From: Corinna Vinschen @ 2021-09-13  9:11 UTC (permalink / raw)
  To: cygwin-developers

On Sep 12 10:39, Ken Brown wrote:
> On 9/12/2021 2:23 AM, Takashi Yano wrote:
> > > Confirmed, and your patch fixes it.  Maybe you should check for error in the
> > > call to CreateMutexW and print a debug message in that case.
> > 
> > I added the debug message.
> 
> LGTM.

Are you going to push this, Ken?


Corinna

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-12 21:46                                         ` Ken Brown
  2021-09-12 23:54                                           ` Takashi Yano
@ 2021-09-13  9:42                                           ` Corinna Vinschen
  2021-09-13 13:03                                             ` Ken Brown
  1 sibling, 1 reply; 250+ messages in thread
From: Corinna Vinschen @ 2021-09-13  9:42 UTC (permalink / raw)
  To: cygwin-developers

[Guys, is it asking too much to trim your mails to the required context,
 rather than always performing a Full Quote?  Pretty please...]

On Sep 12 17:46, Ken Brown wrote:
> On 9/12/2021 11:10 AM, Ken Brown wrote:
> > On 9/12/2021 7:04 AM, Takashi Yano wrote:
> > > On Sun, 12 Sep 2021 17:48:49 +0900
> > > Takashi Yano wrote:
> > > > Hmm. Then, what about PoC code attached? This returns to Corinna's
> > > > query_hdl, and counts read/write handles to detect closing reader side.
> > > > 
> > > > If the number of read handles is equal to number of write handles,
> > > > only the pairs of write handle and query_hdl are alive. So, read pipe
> > > > supposed to be closed.
> > > > 
> > > > This patch depends another patch I posted a few hours ago.
> > > [...]
> > I don't see how this solves the problem.  In the case we were worried
> > about where we have a non-Cygwin reader, the writer has no query_hdl,
> > and you're just always reporting write ready, aren't you?  Or am I
> > missing something?
> 
> BTW, we could just decide that always reporting write ready in this corner
> case is acceptable.  But then we could just do that without going back to
> query_hdl.

The problem with the corner case is, how to find out?  You could have
arbitrarily complex process trees with the pipe inherited by grand
children, one of which is a non-Cygwin process.  How does a Cygwin
process tree member learn about that fact, if it didn't start the
non-Cygwin process by itself?

Looks like we have three choices:

- Reintroducing query_hdl as in Takashi's patch.

- select timeouts

- always return "pipe writable"

I think we might try Takashi's idea for a start, no?
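
For reference, the handle-counting idea from the quoted description can
be sketched roughly as follows.  This is only a model of the heuristic
as described in the quoted text, not the code from the patch; the struct
and function names are hypothetical, and it assumes every open write
handle is paired with exactly one query_hdl.

```cpp
#include <cassert>

// Hypothetical model of the handle-count heuristic; not the actual patch.
struct PipeHandleCounts
{
  unsigned read_handles;   // includes every writer's query_hdl
  unsigned write_handles;
};

// Assumption: each write handle comes with exactly one query_hdl, which
// is itself a read handle.  If the two counts match, the only read
// handles left are the query handles, so no real reader remains.
inline bool reader_looks_closed (const PipeHandleCounts &c)
{
  return c.read_handles == c.write_handles;
}
```

With two writers and no reader this gives {2, 2} and the reader counts
as closed; one real reader on top of that gives {3, 2} and it does not.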

Didn't we also have a problem with C# in terms of non-blocking pipes?
I wonder if we could just do the following: As soon as we spawn a
non-Cygwin process, just call set_pipe_non_blocking(false) for all
pipes.  Blocking vs. nonblocking mode is a per-handle thingy anyway.

What do you think?


Corinna

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-13  9:11                                       ` Corinna Vinschen
@ 2021-09-13 12:30                                         ` Ken Brown
  0 siblings, 0 replies; 250+ messages in thread
From: Ken Brown @ 2021-09-13 12:30 UTC (permalink / raw)
  To: cygwin-developers

On 9/13/2021 5:11 AM, Corinna Vinschen wrote:
> On Sep 12 10:39, Ken Brown wrote:
>> On 9/12/2021 2:23 AM, Takashi Yano wrote:
>>>> Confirmed, and your patch fixes it.  Maybe you should check for error in the
>>>> call to CreateMutexW and print a debug message in that case.
>>>
>>> I added the debug message.
>>
>> LGTM.
> 
> Are you going to push this, Ken?

Done.

Ken

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-13  8:40                                             ` Takashi Yano
@ 2021-09-13 12:51                                               ` Ken Brown
  2021-09-13 17:05                                                 ` Ken Brown
  0 siblings, 1 reply; 250+ messages in thread
From: Ken Brown @ 2021-09-13 12:51 UTC (permalink / raw)
  To: cygwin-developers

On 9/13/2021 4:40 AM, Takashi Yano wrote:
> On Mon, 13 Sep 2021 08:54:31 +0900
> Takashi Yano wrote:
>> On Sun, 12 Sep 2021 17:46:47 -0400
>> Ken Brown wrote:
>>> On 9/12/2021 11:10 AM, Ken Brown wrote:
>>>> I don't see how this solves the problem.  In the case we were worried about
>>>> where we have a non-Cygwin reader, the writer has no query_hdl, and you're just
>>>> always reporting write ready, aren't you?  Or am I missing something?
>>>
>>> BTW, we could just decide that always reporting write ready in this corner case
>>> is acceptable.  But then we could just do that without going back to query_hdl.
>>
>> The various combinations of cygwin and non-cygwin cases result in:
>>
>> W P R: current, query_hdl
>> c c c: OK     , OK
>> c c n: NG     , OK
>> c n c: OK     , OK
>> c n n: NG     , select() always reports write ready
> 
> Sorry, this was not correct. In fact,
> W P R: current, query_hdl
> c c c: OK     , OK
> c c n: NG     , OK
> c n c: OK     , select() always reports write ready
> c n n: NG     , select() always reports write ready

What if you use query_hdl to fix the ccn case, but also keep the current code in 
raw_read that serializes the reads and avoids a large blocking read?  Would that 
avoid breaking the cnc case?

Ken

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-13  9:42                                           ` Corinna Vinschen
@ 2021-09-13 13:03                                             ` Ken Brown
  2021-09-13 18:39                                               ` Takashi Yano
  0 siblings, 1 reply; 250+ messages in thread
From: Ken Brown @ 2021-09-13 13:03 UTC (permalink / raw)
  To: cygwin-developers

On 9/13/2021 5:42 AM, Corinna Vinschen wrote:
> On Sep 12 17:46, Ken Brown wrote:
>> BTW, we could just decide that always reporting write ready in this corner
>> case is acceptable.  But then we could just do that without going back to
>> query_hdl.
> 
> The problem with the corner case is, how to find out?  You could have
> arbitrarily complex process trees with the pipe inherited by grand
> children, one of which is a non-Cygwin process.  How does a Cygwin
> process tree member learn about that fact, if it didn't start the
> non-Cygwin process by itself?
> 
> Looks like we have three choices:
> 
> - Reintroducing query_hdl as in Takashi's patch.
> 
> - select timeouts
> 
> - always return "pipe writable"
> 
> I think we might try Takashi's idea for as start, no?

That sounds good to me, provided he can fix the "cnc" case I just asked him 
about -- Cygwin reader and writer on a pipe created by a non-Cygwin process.

> Didn't we also have a problem with C# in terms of non-blocking pipes?
> I wonder if we could just do the following: As soon as we spawn a
> non-Cygwin process, just call set_pipe_non_blocking(false) for all
> pipes.  Blocking vs. nonblocking mode is a per-handle thingy anyway.
> 
> What do you think?

I don't remember exactly what the issue was with C# programs, so I'll defer to 
Takashi on this.

Ken

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-13 12:51                                               ` Ken Brown
@ 2021-09-13 17:05                                                 ` Ken Brown
  0 siblings, 0 replies; 250+ messages in thread
From: Ken Brown @ 2021-09-13 17:05 UTC (permalink / raw)
  To: cygwin-developers

On 9/13/2021 8:51 AM, Ken Brown wrote:
> On 9/13/2021 4:40 AM, Takashi Yano wrote:
>> On Mon, 13 Sep 2021 08:54:31 +0900
>> Takashi Yano wrote:
>>> On Sun, 12 Sep 2021 17:46:47 -0400
>>> Ken Brown wrote:
>>>> On 9/12/2021 11:10 AM, Ken Brown wrote:
>>>>> I don't see how this solves the problem.  In the case we were worried about
>>>>> where we have a non-Cygwin reader, the writer has no query_hdl, and you're 
>>>>> just
>>>>> always reporting write ready, aren't you?  Or am I missing something?
>>>>
>>>> BTW, we could just decide that always reporting write ready in this corner case
>>>> is acceptable.  But then we could just do that without going back to query_hdl.
>>>
>>> The various combinations of cygwin and non-cygwin cases result in:
>>>
>>> W P R: current, query_hdl
>>> c c c: OK     , OK
>>> c c n: NG     , OK
>>> c n c: OK     , OK
>>> c n n: NG     , select() always reports write ready
>>
>> Sorry, this was not correct. In fact,
>> W P R: current, query_hdl
>> c c c: OK     , OK
>> c c n: NG     , OK
>> c n c: OK     , select() always reports write ready
>> c n n: NG     , select() always reports write ready
> 
> What if you use query_hdl to fix the ccn case, but also keep the current code in 
> raw_read that serializes the reads and avoids a large blocking read?  Would that 
> avoid breaking the cnc case?

Never mind.  I don't think that makes sense.  The way I see it now, if we have a 
query_hdl, we should use it.  If we don't have a query_hdl, then we know there 
were non-Cygwin processes in the process tree, and then if WriteQuotaAvailable 
== 0, the pipe buffer might actually be empty.  So I don't see any option other 
than reporting write ready, as in your patch.
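
Spelled out as code, the decision I have in mind looks roughly like the
sketch below.  The struct, its fields and the function name are invented
for illustration and do not match anything in select.cc; in particular,
how the free space is obtained through query_hdl is left open here.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical inputs for the decision; not taken from the Cygwin sources.
struct WriteSideState
{
  bool have_query_hdl;            // false when non-Cygwin processes were
                                  // involved and no query_hdl exists
  std::size_t space_via_query;    // free space as seen through query_hdl
                                  // (only meaningful if have_query_hdl)
  std::size_t write_quota_avail;  // WriteQuotaAvailable on the write handle
};

inline bool select_write_ready (const WriteSideState &s)
{
  if (s.have_query_hdl)
    return s.space_via_query > 0;  // read-side view: use it
  // No query_hdl: WriteQuotaAvailable == 0 may only mean that a read is
  // pending on the other side while the buffer is empty, so it cannot
  // be used to say "not ready".  Report write ready.
  return true;
}
```

So without a query_hdl the result is write ready regardless of
write_quota_avail, which is the corner case being accepted.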

I do have a couple of small comments about your patch, which I'll send later.

Ken

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-12 11:04                                     ` Takashi Yano
  2021-09-12 15:10                                       ` Ken Brown
@ 2021-09-13 17:42                                       ` Ken Brown
  2021-09-13 18:54                                         ` Takashi Yano
  2021-09-13 18:32                                       ` Corinna Vinschen
  2 siblings, 1 reply; 250+ messages in thread
From: Ken Brown @ 2021-09-13 17:42 UTC (permalink / raw)
  To: cygwin-developers

On 9/12/2021 7:04 AM, Takashi Yano wrote:
> On Sun, 12 Sep 2021 17:48:49 +0900
> Takashi Yano wrote:
>> Hmm. Then, what about PoC code attached? This returns to Corinna's
>> query_hdl, and counts read/write handles to detect closing reader side.
>>
>> If the number of read handles is equal to number of write handles,
>> only the pairs of write handle and query_hdl are alive. So, read pipe
>> supposed to be closed.
>>
>> This patch depends another patch I posted a few hours ago.
> 
> Revised a bit.

A few small comments/questions:

> diff --git a/winsup/cygwin/fhandler.h b/winsup/cygwin/fhandler.h
> index 13fba9a14..f09af2c37 100644
> --- a/winsup/cygwin/fhandler.h
> +++ b/winsup/cygwin/fhandler.h
> @@ -1176,10 +1176,15 @@ class fhandler_pipe_fifo: public fhandler_base
>  {
>   protected:
>    size_t pipe_buf_size;
> +  HANDLE query_hdl;
>  
>   public:
>    fhandler_pipe_fifo ();
>  
> +  HANDLE get_query_handle () const { return query_hdl; }
> +  void close_query_handle () { CloseHandle (query_hdl); query_hdl = NULL; }

Should you use if(query_hdl) here?  Or is it up to the caller to check that?

> @@ -522,6 +532,10 @@ fhandler_pipe::fixup_after_fork (HANDLE parent)
>      fork_fixup (parent, read_mtx, "read_mtx");
>    if (select_sem)
>      fork_fixup (parent, select_sem, "select_sem");
> +  /* Do not duplicate query_hdl if it has been already inherited. */
> +  if (query_hdl && !get_obj_handle_count (query_hdl))
> +    fork_fixup (parent, query_hdl, "query_hdl");


Why do you need to call get_obj_handle_count here?  Shouldn't fork_fixup take 
care of the case where the handle has been inherited?

> @@ -608,12 +608,43 @@ pipe_data_available (int fd, fhandler_base *fh, HANDLE h, bool writing)
>      }
>    if (writing)
>      {
> -      /* WriteQuotaAvailable is decremented by the number of bytes requested
> -	 by a blocking reader on the other side of the pipe.  Cygwin readers
> -	 are serialized and never request a number of bytes equivalent to the
> -	 full buffer size.  So WriteQuotaAvailable is 0 only if either the
> -	 read buffer on the other side is really full, or if we have non-Cygwin
> -	 readers. */
> +      /* If there is anything available in the pipe buffer then signal
> +        that.  This means that a pipe could still block since you could
> +        be trying to write more to the pipe than is available in the
> +        buffer but that is the hazard of select().
> +
> +        Note that WriteQuotaAvailable is unreliable.
> +
> +        Usually WriteQuotaAvailable on the write side reflects the space
> +        available in the inbound buffer on the read side.  However, if a
> +        pipe read is currently pending, WriteQuotaAvailable on the write side
> +        is decremented by the number of bytes the read side is requesting.
> +        So it's possible (even likely) that WriteQuotaAvailable is 0, even
> +        if the inbound buffer on the read side is not full.  This can lead to
> +        a deadlock situation: The reader is waiting for data, but select
> +        on the writer side assumes that no space is available in the read
> +        side inbound buffer.
> +
> +        Consequentially, the only reliable information is available on the
> +        read side, so fetch info from the read side via the pipe-specific
> +        query handle.  Use fpli.WriteQuotaAvailable as storage for the actual
> +        interesting value, which is the OutboundQuote on the write side,

I thought Corinna's experiments showed that InboundQuota and OutboundQuota are 
the same on the read and write sides, and that InboundQuota is the one we should 
be using.
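
As an aside, the WriteQuotaAvailable behavior described in the quoted
comment can be illustrated with a toy model.  This is not Windows or
Cygwin code; the function names are hypothetical, and clamping the value
at 0 is an assumption of the model.

```cpp
#include <cassert>
#include <cstddef>

// Toy model, not Windows code: what the writer would see as
// WriteQuotaAvailable when `buffered` bytes sit in the pipe buffer and
// a read of `pending_read` bytes is outstanding on the other side.
inline std::size_t modeled_write_quota (std::size_t buf_size,
                                        std::size_t buffered,
                                        std::size_t pending_read)
{
  std::size_t used = buffered + pending_read;
  return used >= buf_size ? 0 : buf_size - used;  // clamp at 0 (assumption)
}

// Free space actually left in the buffer, which is what select() on the
// write side would like to know.
inline std::size_t actual_space (std::size_t buf_size, std::size_t buffered)
{
  return buffered >= buf_size ? 0 : buf_size - buffered;
}
```

With a 65536-byte buffer, nothing buffered, and a pending 65536-byte
read, the modeled quota is 0 while the actual space is the full 65536
bytes, which is the deadlock scenario from the comment.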

Ken

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-12 11:04                                     ` Takashi Yano
  2021-09-12 15:10                                       ` Ken Brown
  2021-09-13 17:42                                       ` Ken Brown
@ 2021-09-13 18:32                                       ` Corinna Vinschen
  2021-09-13 19:37                                         ` Takashi Yano
  2 siblings, 1 reply; 250+ messages in thread
From: Corinna Vinschen @ 2021-09-13 18:32 UTC (permalink / raw)
  To: cygwin-developers

On Sep 12 20:04, Takashi Yano wrote:
> On Sun, 12 Sep 2021 17:48:49 +0900
> Takashi Yano wrote:
> > Hmm. Then, what about PoC code attached? This returns to Corinna's
> > query_hdl, and counts read/write handles to detect closing reader side.
> > 
> > If the number of read handles is equal to number of write handles,
> > only the pairs of write handle and query_hdl are alive. So, read pipe
> > supposed to be closed.
> > 
> > This patch depends another patch I posted a few hours ago.
> 
> Revised a bit.
> [...]

What I miss is a bit more detailed commit message...

> diff --git a/winsup/cygwin/fhandler.h b/winsup/cygwin/fhandler.h
> index 13fba9a14..f09af2c37 100644
> --- a/winsup/cygwin/fhandler.h
> +++ b/winsup/cygwin/fhandler.h
> @@ -1176,10 +1176,15 @@ class fhandler_pipe_fifo: public fhandler_base
>  {
>   protected:
>    size_t pipe_buf_size;
> +  HANDLE query_hdl;
>  
>   public:
>    fhandler_pipe_fifo ();
>  
> +  HANDLE get_query_handle () const { return query_hdl; }
> +  void close_query_handle () { CloseHandle (query_hdl); query_hdl = NULL; }
> +  bool reader_closed ();
> +
>    ssize_t __reg3 raw_write (const void *ptr, size_t len);
>  
>  };
> diff --git a/winsup/cygwin/fhandler_pipe.cc b/winsup/cygwin/fhandler_pipe.cc
> index 9b4255cfd..b051f5c03 100644
> --- a/winsup/cygwin/fhandler_pipe.cc
> +++ b/winsup/cygwin/fhandler_pipe.cc
> @@ -56,6 +56,8 @@ fhandler_pipe::set_pipe_non_blocking (bool nonblocking)
>    fpi.ReadMode = FILE_PIPE_BYTE_STREAM_MODE;
>    fpi.CompletionMode = nonblocking ? FILE_PIPE_COMPLETE_OPERATION
>      : FILE_PIPE_QUEUE_OPERATION;
> +  if (query_hdl)
> +    fpi.CompletionMode = FILE_PIPE_COMPLETE_OPERATION;

This should be a single expression, i.e.

   fpi.CompletionMode = nonblocking || query_hdl
                        ? FILE_PIPE_COMPLETE_OPERATION
                        : FILE_PIPE_QUEUE_OPERATION;

ideally combined with a comment.

But then again... you're basically switching the write side of
a pipe to nonblocking mode unconditionally.  The downside is a
busy wait:

>  fhandler_pipe_fifo::raw_write (const void *ptr, size_t len)
>  {
> @@ -493,7 +490,20 @@ fhandler_pipe_fifo::raw_write (const void *ptr, size_t len)
>  			      get_obj_handle_count (select_sem), NULL);
>  	  /* 0 bytes returned?  EAGAIN.  See above. */
>  	  if (NT_SUCCESS (status) && nbytes == 0)
> -	    set_errno (EAGAIN);
> +	    {
> +	      if (reader_closed ())
> +		{
> +		  set_errno (EPIPE);
> +		  raise (SIGPIPE);
> +		}
> +	      else if (is_nonblocking ())
> +		set_errno (EAGAIN);
> +	      else
> +		{
> +		  cygwait (select_sem, 10);
> +		  continue;

I'm a bit puzzled.  The cygwait branch neglects to check if select_sem
is NULL (the preceding ReleaseSemaphore expression does!)
And then it doesn't matter if the caller got blocked or not, it will
always perform a continue.  So why do it at all?  Worse, if this
expression loops, it will eat up the semaphore, because each call will
decrement the semaphore count until it blocks.  That sounds wrong to me.

Btw., while looking into the current pipe code, I wonder what select_sem
is doing in the pipe code at all so far.  It gets released, but it never
gets waited on?!?  Am I missing something?

>  	{
> @@ -522,6 +532,10 @@ fhandler_pipe::fixup_after_fork (HANDLE parent)
>      fork_fixup (parent, read_mtx, "read_mtx");
>    if (select_sem)
>      fork_fixup (parent, select_sem, "select_sem");
> +  /* Do not duplicate query_hdl if it has been already inherited. */
> +  if (query_hdl && !get_obj_handle_count (query_hdl))
> +    fork_fixup (parent, query_hdl, "query_hdl");

I don't understand why calling fork_fixup on query_hdl should depend
on the handle count.  If you duplicate a writer, you always have to
duplicate query_hdl as well to keep the count, no?  Inheritance is
handled by the O_CLOEXEC flag and fork_fixup will do the right thing.

> +      if (!DuplicateHandle (GetCurrentProcess (), r,
> +			    GetCurrentProcess (), &fhs[1]->query_hdl,
> +			    GENERIC_READ, !(mode & O_CLOEXEC), 0))

This is a bug I introduced accidentally during testing.  This
GENERIC_READ is actually supposed to be a FILE_READ_ATTRIBUTES.
Sorry about that.


Corinna

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-13 13:03                                             ` Ken Brown
@ 2021-09-13 18:39                                               ` Takashi Yano
  0 siblings, 0 replies; 250+ messages in thread
From: Takashi Yano @ 2021-09-13 18:39 UTC (permalink / raw)
  To: cygwin-developers

On Mon, 13 Sep 2021 09:03:04 -0400
Ken Brown wrote:
> On 9/13/2021 5:42 AM, Corinna Vinschen wrote:
> > Didn't we also have a problem with C# in terms of non-blocking pipes?
> > I wonder if we could just do the following: As soon as we spawn a
> > non-Cygwin process, just call set_pipe_non_blocking(false) for all
> > pipes.  Blocking vs. nonblocking mode is a per-handle thingy anyway.
> > 
> > What do you think?
> 
> I don't remember exactly what the issue was with C# programs, so I'll defer to 
> Takashi on this.

Actually, piping a C# program has a problem even when the pipe is set
to blocking. It also needs pipe_byte and FILE_SYNCHRONOUS_IO_NONALERT.
However, I agree it is safer to set the pipe to blocking for non-Cygwin
apps.

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-13 17:42                                       ` Ken Brown
@ 2021-09-13 18:54                                         ` Takashi Yano
  0 siblings, 0 replies; 250+ messages in thread
From: Takashi Yano @ 2021-09-13 18:54 UTC (permalink / raw)
  To: cygwin-developers

On Mon, 13 Sep 2021 13:42:46 -0400
Ken Brown wrote:
> > diff --git a/winsup/cygwin/fhandler.h b/winsup/cygwin/fhandler.h
> > index 13fba9a14..f09af2c37 100644
> > --- a/winsup/cygwin/fhandler.h
> > +++ b/winsup/cygwin/fhandler.h
> > @@ -1176,10 +1176,15 @@ class fhandler_pipe_fifo: public fhandler_base
> >  {
> >   protected:
> >    size_t pipe_buf_size;
> > +  HANDLE query_hdl;
> >  
> >   public:
> >    fhandler_pipe_fifo ();
> >  
> > +  HANDLE get_query_handle () const { return query_hdl; }
> > +  void close_query_handle () { CloseHandle (query_hdl); query_hdl = NULL; }
> 
> Should you use if(query_hdl) here?  Or is it up to the caller to check that?

Right. I will fix it.

> > @@ -522,6 +532,10 @@ fhandler_pipe::fixup_after_fork (HANDLE parent)
> >      fork_fixup (parent, read_mtx, "read_mtx");
> >    if (select_sem)
> >      fork_fixup (parent, select_sem, "select_sem");
> > +  /* Do not duplicate query_hdl if it has been already inherited. */
> > +  if (query_hdl && !get_obj_handle_count (query_hdl))
> > +    fork_fixup (parent, query_hdl, "query_hdl");
> 
> 
> Why do you need to call get_obj_handle_count here?  Shouldn't fork_fixup take 
> care of the case where the handle has been inherited?

I thought so too; however, counting query_hdl did not work
as expected without this check. I am not sure why...

There seems to be a case where the handle is already inherited
here without fork_fixup.

> > @@ -608,12 +608,43 @@ pipe_data_available (int fd, fhandler_base *fh, HANDLE h, bool writing)
> >      }
> >    if (writing)
> >      {
> > -      /* WriteQuotaAvailable is decremented by the number of bytes requested
> > -	 by a blocking reader on the other side of the pipe.  Cygwin readers
> > -	 are serialized and never request a number of bytes equivalent to the
> > -	 full buffer size.  So WriteQuotaAvailable is 0 only if either the
> > -	 read buffer on the other side is really full, or if we have non-Cygwin
> > -	 readers. */
> > +      /* If there is anything available in the pipe buffer then signal
> > +        that.  This means that a pipe could still block since you could
> > +        be trying to write more to the pipe than is available in the
> > +        buffer but that is the hazard of select().
> > +
> > +        Note that WriteQuotaAvailable is unreliable.
> > +
> > +        Usually WriteQuotaAvailable on the write side reflects the space
> > +        available in the inbound buffer on the read side.  However, if a
> > +        pipe read is currently pending, WriteQuotaAvailable on the write side
> > +        is decremented by the number of bytes the read side is requesting.
> > +        So it's possible (even likely) that WriteQuotaAvailable is 0, even
> > +        if the inbound buffer on the read side is not full.  This can lead to
> > +        a deadlock situation: The reader is waiting for data, but select
> > +        on the writer side assumes that no space is available in the read
> > +        side inbound buffer.
> > +
> > +        Consequentially, the only reliable information is available on the
> > +        read side, so fetch info from the read side via the pipe-specific
> > +        query handle.  Use fpli.WriteQuotaAvailable as storage for the actual
> > +        interesting value, which is the OutboundQuote on the write side,
> 
> I thought Corinna's experiments showed that InboundQuota and OutboundQuota are 
> the same on the read and write sides, and that InboundQuota is the one we should 
> be using.

I have confirmed that behaviour. You are right. Using InboundQuota
is the right thing. I'll fix it. Thanks.
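
As a rough illustration of why the write side value is unreliable while a read is pending, the following is a toy model only. `PipeSnapshot`, `write_side_quota` and `read_side_space` are hypothetical names, and the clamping at 0 is an assumption of the model rather than documented Windows behaviour.

```cpp
#include <cassert>

// Hypothetical snapshot of a pipe, not a Windows structure.
struct PipeSnapshot
{
  unsigned inbound_quota;  // size of the read side inbound buffer
  unsigned buffered;       // bytes written but not yet read
  unsigned pending_read;   // bytes requested by a currently blocked read
};

// What the write side would report in this model: free space minus the
// size of a pending read, clamped at 0 (the clamping is an assumption).
unsigned write_side_quota (const PipeSnapshot &s)
{
  assert (s.buffered <= s.inbound_quota);
  unsigned free_space = s.inbound_quota - s.buffered;
  return s.pending_read >= free_space ? 0 : free_space - s.pending_read;
}

// What can be derived on the read side: the quota minus the data that is
// still buffered.  A pending read does not distort this value.
unsigned read_side_space (const PipeSnapshot &s)
{
  assert (s.buffered <= s.inbound_quota);
  return s.inbound_quota - s.buffered;
}
```

With an empty 64K pipe and a reader blocked on a 64K request, the first function yields 0 while the second still yields the full buffer size, which is the deadlock scenario described in the quoted comment.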

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-13 18:32                                       ` Corinna Vinschen
@ 2021-09-13 19:37                                         ` Takashi Yano
  2021-09-13 20:15                                           ` Corinna Vinschen
  2021-09-14  8:08                                           ` Takashi Yano
  0 siblings, 2 replies; 250+ messages in thread
From: Takashi Yano @ 2021-09-13 19:37 UTC (permalink / raw)
  To: cygwin-developers

On Mon, 13 Sep 2021 20:32:33 +0200
Corinna Vinschen wrote:

> On Sep 12 20:04, Takashi Yano wrote:
> > On Sun, 12 Sep 2021 17:48:49 +0900
> > Takashi Yano wrote:
> > > Hmm. Then, what about PoC code attached? This returns to Corinna's
> > > query_hdl, and counts read/write handles to detect closing reader side.
> > > 
> > > If the number of read handles is equal to number of write handles,
> > > only the pairs of write handle and query_hdl are alive. So, read pipe
> > > supposed to be closed.
> > > 
> > > This patch depends another patch I posted a few hours ago.
> > 
> > Revised a bit.
> > [...]
> 
> What I miss is a bit more detailed commit message...

I am sorry, I will add a more detailed commit message.

> > diff --git a/winsup/cygwin/fhandler_pipe.cc b/winsup/cygwin/fhandler_pipe.cc
> > index 9b4255cfd..b051f5c03 100644
> > --- a/winsup/cygwin/fhandler_pipe.cc
> > +++ b/winsup/cygwin/fhandler_pipe.cc
> > @@ -56,6 +56,8 @@ fhandler_pipe::set_pipe_non_blocking (bool nonblocking)
> >    fpi.ReadMode = FILE_PIPE_BYTE_STREAM_MODE;
> >    fpi.CompletionMode = nonblocking ? FILE_PIPE_COMPLETE_OPERATION
> >      : FILE_PIPE_QUEUE_OPERATION;
> > +  if (query_hdl)
> > +    fpi.CompletionMode = FILE_PIPE_COMPLETE_OPERATION;
> 
> This should be a single expression, i.e.
> 
>    fpi.CompletionMode = nonblocking || query_hdl
>                         ? FILE_PIPE_COMPLETE_OPERATION
>                         : FILE_PIPE_QUEUE_OPERATION;
> 
> ideally combined with a comment.

Thanks. I'll do that.

> But then again... you're basically switching the write side of
> a pipe to nonblocking mode unconditionally.  The downside is a
> busy wait:
> 
> >  fhandler_pipe_fifo::raw_write (const void *ptr, size_t len)
> >  {
> > @@ -493,7 +490,20 @@ fhandler_pipe_fifo::raw_write (const void *ptr, size_t len)
> >  			      get_obj_handle_count (select_sem), NULL);
> >  	  /* 0 bytes returned?  EAGAIN.  See above. */
> >  	  if (NT_SUCCESS (status) && nbytes == 0)
> > -	    set_errno (EAGAIN);
> > +	    {
> > +	      if (reader_closed ())
> > +		{
> > +		  set_errno (EPIPE);
> > +		  raise (SIGPIPE);
> > +		}
> > +	      else if (is_nonblocking ())
> > +		set_errno (EAGAIN);
> > +	      else
> > +		{
> > +		  cygwait (select_sem, 10);
> > +		  continue;
> 
> I'm a bit puzzled.  The cygwait branch neglects to check if select_sem
> is NULL (the preceeding ReleaseSemaphore expression does!)
> And then it doesn't matter if the caller got blocked or not, it will
> always perform a continue.  So why do it at all?  Worse, if this
> expression loops, it will eat up the semaphore, because each call will
> decrement the semaphore count until it blocks.  That sounds wrong to me.

It is by design. ReleaseSemaphore() releases as many semaphore counts as
there can be waiters. If only one writer and one reader exist,
ReleaseSemaphore releases 2 counts. Then cygwait here consumes the semaphore
twice and returns to the wait state.
This wait state is released by raw_read() or close().
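
The release/consume pattern described above can be modelled in a few lines. This is a sketch under stated assumptions, not the Cygwin implementation: `SemModel` and `immediate_wakeups` are hypothetical names, and the timed wait is reduced to "consume one count if available, otherwise time out".

```cpp
#include <cassert>

// Hypothetical counting semaphore model; the real object is a Windows
// semaphore waited on through cygwait().
struct SemModel
{
  int count;

  // Mirrors ReleaseSemaphore (sem, n, NULL): add n to the count.
  void release (int n)
  {
    assert (n >= 0);
    count += n;
  }

  // Mirrors a timed wait: consume one count if available, otherwise
  // report a timeout.
  bool timed_wait ()
  {
    if (count == 0)
      return false;  // would block until the timeout expires
    --count;
    return true;
  }
};

// Number of retry iterations that return immediately before a wait blocks.
int immediate_wakeups (SemModel &sem)
{
  int n = 0;
  while (sem.timed_wait ())
    ++n;
  return n;
}
```

With one reader and one writer, a release of 2 lets the retry loop spin twice before it blocks again, which matches both the design intent and the concern that the loop drains the semaphore.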
 
> Btw., while looking into the current pipe code, I wonder what select_sem
> is doing in the pipe code at all so far.  It gets released, but it never
> gets waited on?!?  Am I missing something?

The semaphore is waited on in select.cc.
But, wait. What happens if select() is not called? Released semaphore
counts can accumulate up to INT32_MAX!!?

Let me consider.

> >  	{
> > @@ -522,6 +532,10 @@ fhandler_pipe::fixup_after_fork (HANDLE parent)
> >      fork_fixup (parent, read_mtx, "read_mtx");
> >    if (select_sem)
> >      fork_fixup (parent, select_sem, "select_sem");
> > +  /* Do not duplicate query_hdl if it has been already inherited. */
> > +  if (query_hdl && !get_obj_handle_count (query_hdl))
> > +    fork_fixup (parent, query_hdl, "query_hdl");
> 
> I don't understand why calling fork_fixup on query_hdl should depend
> on the handle count.  If you duplicate a writer, you always have to
> duplicate query_hdl as well to keep the count, no?  Inheritence is
> handled by the O_CLOEXEC flag and fork_fixup will do the right thing.

I thought so; however, counting query_hdl does not work as expected
without this check. The number of open query_hdl handles seems to exceed
the number of writers.

There seems to be a case where the handle is already inherited here
without fork_fixup. Any idea?

> 
> > +      if (!DuplicateHandle (GetCurrentProcess (), r,
> > +			    GetCurrentProcess (), &fhs[1]->query_hdl,
> > +			    GENERIC_READ, !(mode & O_CLOEXEC), 0))
> 
> This is a bug I introduced accidentally during testing.  This
> GENERIC_READ is actually supposed to be a FILE_READ_ATTRIBUTES.
> Sorry about that.

The PoC code uses PeekNamedPipe for query_hdl, so GENERIC_READ is
necessary I think.

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-13 19:37                                         ` Takashi Yano
@ 2021-09-13 20:15                                           ` Corinna Vinschen
  2021-09-14  8:07                                             ` Takashi Yano
  2021-09-14  8:08                                           ` Takashi Yano
  1 sibling, 1 reply; 250+ messages in thread
From: Corinna Vinschen @ 2021-09-13 20:15 UTC (permalink / raw)
  To: cygwin-developers

On Sep 14 04:37, Takashi Yano wrote:
> On Mon, 13 Sep 2021 20:32:33 +0200
> Corinna Vinschen wrote:
> > Btw., while looking into the current pipe code, I wonder what select_sem
> > is doing in the pipe code at all so far.  It gets released, but it never
> > gets waited on?!?  Am I missing something?
> 
> The semaphore is waited in select.cc.

Ouch, I missed the get_select_sem call, sorry.

> > I don't understand why calling fork_fixup on query_hdl should depend
> > on the handle count.  If you duplicate a writer, you always have to
> > duplicate query_hdl as well to keep the count, no?  Inheritence is
> > handled by the O_CLOEXEC flag and fork_fixup will do the right thing.
> 
> I thought so, however, counting query_hdl cannot work as expected
> without this check. The number of query_hdl opend seems to exceed
> the number of writer.

If the write handle as well as the query handle are both opened,
duplicated and closed in the same manner, they should never have a
different count, unless the write side is inherited by a non-Cygwin
client.

> There seems to be the case that handle is already inherited here
> without fork_fixup. Any idea?

That should depend on the O_CLOEXEC setting, but identically for
all handles in the fhandler.

I pushed two more patches to topic/pipe regarding inheritance,
maybe that gives a clue?

> > 
> > > +      if (!DuplicateHandle (GetCurrentProcess (), r,
> > > +			    GetCurrentProcess (), &fhs[1]->query_hdl,
> > > +			    GENERIC_READ, !(mode & O_CLOEXEC), 0))
> > 
> > This is a bug I introduced accidentally during testing.  This
> > GENERIC_READ is actually supposed to be a FILE_READ_ATTRIBUTES.
> > Sorry about that.
> 
> The PoC code uses PeekNamedPipe for query_hdl, so GENERIC_READ is
> necessary I think.

Oh, right, that's how it's documented.  Funny enough, the description
of FSCTL_PIPE_PEEK does not mention any permissions at all.  I tried the
permissions and it's FILE_READ_DATA which is required.


Thanks,
Corinna

^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-13 20:15                                           ` Corinna Vinschen
@ 2021-09-14  8:07                                             ` Takashi Yano
  2021-09-14  8:47                                               ` Corinna Vinschen
  0 siblings, 1 reply; 250+ messages in thread
From: Takashi Yano @ 2021-09-14  8:07 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 2099 bytes --]

On Mon, 13 Sep 2021 22:15:25 +0200
Corinna Vinschen wrote:
> On Sep 14 04:37, Takashi Yano wrote:
> > On Mon, 13 Sep 2021 20:32:33 +0200
> > Corinna Vinschen wrote:
> > > I don't understand why calling fork_fixup on query_hdl should depend
> > > on the handle count.  If you duplicate a writer, you always have to
> > > duplicate query_hdl as well to keep the count, no?  Inheritence is
> > > handled by the O_CLOEXEC flag and fork_fixup will do the right thing.
> > 
> > I thought so, however, counting query_hdl cannot work as expected
> > without this check. The number of query_hdl opend seems to exceed
> > the number of writer.
> 
> If the write handle as well as the query handle are both opened,
> duplicated and closed in the same manner, they should never have a
> different count, unless the write side is inherited by a non-Cygwin
> client.
> 
> > There seems to be the case that handle is already inherited here
> > without fork_fixup. Any idea?
> 
> That should depend on the O_CLOEXEC setting, but identically for
> all handles in the fhandler.

I found the cause. set_close_on_exec() in fhandler_pipe is missing.
set_no_inheritance() calls for all adjunct handles are necessary.
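
To illustrate the effect of the missing calls, here is a toy model. It is not Cygwin code: `HandleModel`, `PipeModel` and the `include_adjunct` switch are hypothetical and exist only to contrast the two behaviours.

```cpp
#include <vector>

// Hypothetical stand-in for a Windows handle and its inheritance flag.
struct HandleModel
{
  const char *name;
  bool inheritable;
};

struct PipeModel
{
  HandleModel io;                    // the pipe handle itself
  std::vector<HandleModel> adjunct;  // e.g. read_mtx, select_sem, query_hdl

  // Mirrors the intent of set_close_on_exec (val): with val == true nothing
  // should be inherited.  include_adjunct == false models the missing
  // set_no_inheritance() calls.
  void set_close_on_exec (bool val, bool include_adjunct)
  {
    io.inheritable = !val;
    if (include_adjunct)
      for (HandleModel &h : adjunct)
        h.inheritable = !val;
  }

  // Number of handles a child would receive through plain inheritance.
  int inherited_count () const
  {
    int n = io.inheritable ? 1 : 0;
    for (const HandleModel &h : adjunct)
      n += h.inheritable ? 1 : 0;
    return n;
  }
};
```

In the model, skipping the adjunct handles leaves them inheritable even though the pipe handle is not, so a child ends up with adjunct handles that have no matching pipe handle and the counts drift apart.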

> I pushed two more patches to topic/pipe in terms of inheritence,
> maybe that gives a clue?

I attached two additional patches for this issue.

> > > > +      if (!DuplicateHandle (GetCurrentProcess (), r,
> > > > +			    GetCurrentProcess (), &fhs[1]->query_hdl,
> > > > +			    GENERIC_READ, !(mode & O_CLOEXEC), 0))
> > > 
> > > This is a bug I introduced accidentally during testing.  This
> > > GENERIC_READ is actually supposed to be a FILE_READ_ATTRIBUTES.
> > > Sorry about that.
> > 
> > The PoC code uses PeekNamedPipe for query_hdl, so GENERIC_READ is
> > necessary I think.
> 
> Oh, right, that's how it's documented.  Funny enough, the descriptions
> of FSCTL_PIPE_PEEK does not mention any permissions at all.  I tried the
> permissions and it's FILE_READ_DATA which is required.

Thanks. I revised the patch and will attach it to the subsequent mail.

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>

[-- Attachment #2: 0001-Cygwin-fhandler_base-dup-Reflect-O_CLOEXEC-to-inheri.patch --]
[-- Type: application/octet-stream, Size: 1257 bytes --]

From 8dd2bdff579762497898044e9e5304b4b6ab5e93 Mon Sep 17 00:00:00 2001
From: Takashi Yano <takashi.yano@nifty.ne.jp>
Date: Tue, 14 Sep 2021 12:48:03 +0900
Subject: [PATCH 1/2] Cygwin: fhandler_base::dup Reflect O_CLOEXEC to
 inheritance flag.

- Currently fhandler_base::dup duplicates handles with bInheritHandle
  TRUE unconditionally. This patch reflects O_CLOEXEC flag to that
  parameter.
---
 winsup/cygwin/fhandler.cc | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/winsup/cygwin/fhandler.cc b/winsup/cygwin/fhandler.cc
index 39fe2640a..9dfe70be3 100644
--- a/winsup/cygwin/fhandler.cc
+++ b/winsup/cygwin/fhandler.cc
@@ -1308,7 +1308,7 @@ fhandler_base::init (HANDLE f, DWORD a, mode_t bin)
 }
 
 int
-fhandler_base::dup (fhandler_base *child, int)
+fhandler_base::dup (fhandler_base *child, int flags)
 {
   debug_printf ("in fhandler_base dup");
 
@@ -1317,7 +1317,7 @@ fhandler_base::dup (fhandler_base *child, int)
     {
       if (!DuplicateHandle (GetCurrentProcess (), get_handle (),
 			    GetCurrentProcess (), &nh,
-			    0, TRUE, DUPLICATE_SAME_ACCESS))
+			    0, !(flags & O_CLOEXEC), DUPLICATE_SAME_ACCESS))
 	{
 	  debug_printf ("dup(%s) failed, handle %p, %E",
 			get_name (), get_handle ());
-- 
2.33.0


[-- Attachment #3: 0002-Cygwin-pipe-fifo-Call-set_no_inheritance-for-adjunct.patch --]
[-- Type: application/octet-stream, Size: 1962 bytes --]

From a3c223f72f98baf14df3b58941761c72690f5eee Mon Sep 17 00:00:00 2001
From: Takashi Yano <takashi.yano@nifty.ne.jp>
Date: Tue, 14 Sep 2021 12:49:35 +0900
Subject: [PATCH 2/2] Cygwin: pipe, fifo: Call set_no_inheritance() for adjunct
 handles.

- Currently, set_no_inheritance() is not called for the adjunct handles
  such as select_sem. This patch fixes the issue.
---
 winsup/cygwin/fhandler.h       |  1 +
 winsup/cygwin/fhandler_fifo.cc |  2 ++
 winsup/cygwin/fhandler_pipe.cc | 10 ++++++++++
 3 files changed, 13 insertions(+)

diff --git a/winsup/cygwin/fhandler.h b/winsup/cygwin/fhandler.h
index 13fba9a14..46381c397 100644
--- a/winsup/cygwin/fhandler.h
+++ b/winsup/cygwin/fhandler.h
@@ -1208,6 +1208,7 @@ public:
   void open_setup (int flags);
   void fixup_after_fork (HANDLE);
   int dup (fhandler_base *child, int);
+  void set_close_on_exec (bool val);
   int close ();
   void __reg3 raw_read (void *ptr, size_t& len);
   int ioctl (unsigned int cmd, void *);
diff --git a/winsup/cygwin/fhandler_fifo.cc b/winsup/cygwin/fhandler_fifo.cc
index aa89fa7ae..37498f547 100644
--- a/winsup/cygwin/fhandler_fifo.cc
+++ b/winsup/cygwin/fhandler_fifo.cc
@@ -1817,4 +1817,6 @@ fhandler_fifo::set_close_on_exec (bool val)
 	set_no_inheritance (fc_handler[i].h, val);
       fifo_client_unlock ();
     }
+  if (select_sem)
+    set_no_inheritance (select_sem, val);
 }
diff --git a/winsup/cygwin/fhandler_pipe.cc b/winsup/cygwin/fhandler_pipe.cc
index 70cfa3784..da473a1dc 100644
--- a/winsup/cygwin/fhandler_pipe.cc
+++ b/winsup/cygwin/fhandler_pipe.cc
@@ -515,6 +515,16 @@ fhandler_pipe_fifo::raw_write (const void *ptr, size_t len)
   return nbytes ?: -1;
 }
 
+void
+fhandler_pipe::set_close_on_exec (bool val)
+{
+  fhandler_base::set_close_on_exec (val);
+  if (read_mtx)
+    set_no_inheritance (read_mtx, val);
+  if (select_sem)
+    set_no_inheritance (select_sem, val);
+}
+
 void
 fhandler_pipe::fixup_after_fork (HANDLE parent)
 {
-- 
2.33.0


^ permalink raw reply	[flat|nested] 250+ messages in thread

* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-13 19:37                                         ` Takashi Yano
  2021-09-13 20:15                                           ` Corinna Vinschen
@ 2021-09-14  8:08                                           ` Takashi Yano
  2021-09-14  9:03                                             ` Corinna Vinschen
  1 sibling, 1 reply; 250+ messages in thread
From: Takashi Yano @ 2021-09-14  8:08 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 4129 bytes --]

On Tue, 14 Sep 2021 04:37:18 +0900
Takashi Yano wrote:
> On Mon, 13 Sep 2021 20:32:33 +0200
> Corinna Vinschen wrote:
> > On Sep 12 20:04, Takashi Yano wrote:
> > > On Sun, 12 Sep 2021 17:48:49 +0900
> > > Takashi Yano wrote:
> > > > Hmm. Then, what about PoC code attached? This returns to Corinna's
> > > > query_hdl, and counts read/write handles to detect closing reader side.
> > > > 
> > > > If the number of read handles is equal to number of write handles,
> > > > only the pairs of write handle and query_hdl are alive. So, read pipe
> > > > supposed to be closed.
> > > > 
> > > > This patch depends another patch I posted a few hours ago.
> > > 
> > > Revised a bit.
> > > [...]
> > 
> > What I miss is a bit more detailed commit message...
> 
> I am sorry, I will add more detail commit message.

Done.

> > > diff --git a/winsup/cygwin/fhandler_pipe.cc b/winsup/cygwin/fhandler_pipe.cc
> > > index 9b4255cfd..b051f5c03 100644
> > > --- a/winsup/cygwin/fhandler_pipe.cc
> > > +++ b/winsup/cygwin/fhandler_pipe.cc
> > > @@ -56,6 +56,8 @@ fhandler_pipe::set_pipe_non_blocking (bool nonblocking)
> > >    fpi.ReadMode = FILE_PIPE_BYTE_STREAM_MODE;
> > >    fpi.CompletionMode = nonblocking ? FILE_PIPE_COMPLETE_OPERATION
> > >      : FILE_PIPE_QUEUE_OPERATION;
> > > +  if (query_hdl)
> > > +    fpi.CompletionMode = FILE_PIPE_COMPLETE_OPERATION;
> > 
> > This should be a single expression, i.e.
> > 
> >    fpi.CompletionMode = nonblocking || query_hdl
> >                         ? FILE_PIPE_COMPLETE_OPERATION
> >                         : FILE_PIPE_QUEUE_OPERATION;
> > 
> > ideally combined with a comment.
> 
> Thanks. I'll do that.

Done.


> > But then again... you're basically switching the write side of
> > a pipe to nonblocking mode unconditionally.  The downside is a
> > busy wait:
> > 
> > >  fhandler_pipe_fifo::raw_write (const void *ptr, size_t len)
> > >  {
> > > @@ -493,7 +490,20 @@ fhandler_pipe_fifo::raw_write (const void *ptr, size_t len)
> > >  			      get_obj_handle_count (select_sem), NULL);
> > >  	  /* 0 bytes returned?  EAGAIN.  See above. */
> > >  	  if (NT_SUCCESS (status) && nbytes == 0)
> > > -	    set_errno (EAGAIN);
> > > +	    {
> > > +	      if (reader_closed ())
> > > +		{
> > > +		  set_errno (EPIPE);
> > > +		  raise (SIGPIPE);
> > > +		}
> > > +	      else if (is_nonblocking ())
> > > +		set_errno (EAGAIN);
> > > +	      else
> > > +		{
> > > +		  cygwait (select_sem, 10);
> > > +		  continue;
> > 
> > I'm a bit puzzled.  The cygwait branch neglects to check if select_sem
> > is NULL (the preceeding ReleaseSemaphore expression does!)
> > And then it doesn't matter if the caller got blocked or not, it will
> > always perform a continue.  So why do it at all?  Worse, if this
> > expression loops, it will eat up the semaphore, because each call will
> > decrement the semaphore count until it blocks.  That sounds wrong to me.
> 
> It is by design. ReleaseSemaphore() releases maximum number of semaphore
> which the waiter can exists. If only one writer and one reader exist,
> ReleaseSemaphore releases 2 semaphores. Then cygwait here consume semaphore
> two times and return to wait state.
> This wait state is released by raw_read() or close().
>  
> > Btw., while looking into the current pipe code, I wonder what select_sem
> > is doing in the pipe code at all so far.  It gets released, but it never
> > gets waited on?!?  Am I missing something?
> 
> The semaphore is waited in select.cc.
> But, wait. Wat happens if select() is not called? Released semaphore
> can be accumulated up to INT32_MAX!!?
> 
> Let me consider.

See the second patch attached. With this patch, only the minimum number
of semaphore counts needed is released.


Please first apply the following two patches, which I attached to the previous mail:
0001-Cygwin-fhandler_base-dup-Reflect-O_CLOEXEC-to-inheri.patch
0002-Cygwin-pipe-fifo-Call-set_no_inheritance-for-adjunct.patch

Then apply the patches attached to this mail:
0001-Cygwin-pipe-Use-read-pipe-handle-for-select-on-write.patch
0002-Cygwin-pipe-fifo-Release-select_sem-semaphore-as-muc.patch

Thanks

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>

[-- Attachment #2: 0001-Cygwin-pipe-Use-read-pipe-handle-for-select-on-write.patch --]
[-- Type: application/octet-stream, Size: 12170 bytes --]

From 611ac5f87df0b5156e3ec82e98af27892a9c8882 Mon Sep 17 00:00:00 2001
From: Takashi Yano <takashi.yano@nifty.ne.jp>
Date: Tue, 14 Sep 2021 12:27:33 +0900
Subject: [PATCH 1/2] Cygwin: pipe: Use read pipe handle for select() on write
 pipe.

- Usually WriteQuotaAvailable retrieved by NtQueryInformationFile()
  on the write side reflects the space available in the inbound buffer
  on the read side. However, if a pipe read is currently pending,
  WriteQuotaAvailable on the write side is decremented by the number
  of bytes the read side is requesting. So it's possible (even likely)
  that WriteQuotaAvailable is 0, even if the inbound buffer on the
  read side is not full. This can lead to a deadlock situation:
  The reader is waiting for data, but select on the writer side
  assumes that no space is available in the read side inbound buffer.

  Currently, to avoid this situation, read() does not request a block
  larger than pipe size - 1. However, this mechanism has no effect
  if the reader side is a non-Cygwin app.

  The only reliable information is available on the read side, so
  fetch info from the read side via the newly introduced pipe-specific
  query handle (query_hdl).
---
 winsup/cygwin/fhandler.h       |  14 ++++-
 winsup/cygwin/fhandler_pipe.cc | 105 +++++++++++++++++++++++----------
 winsup/cygwin/select.cc        |  52 +++++++++++++---
 winsup/cygwin/spawn.cc         |  11 ++++
 4 files changed, 144 insertions(+), 38 deletions(-)

diff --git a/winsup/cygwin/fhandler.h b/winsup/cygwin/fhandler.h
index 46381c397..db2325144 100644
--- a/winsup/cygwin/fhandler.h
+++ b/winsup/cygwin/fhandler.h
@@ -1176,10 +1176,22 @@ class fhandler_pipe_fifo: public fhandler_base
 {
  protected:
   size_t pipe_buf_size;
+  HANDLE query_hdl;
 
  public:
   fhandler_pipe_fifo ();
 
+  HANDLE get_query_handle () const { return query_hdl; }
+  void close_query_handle ()
+  {
+    if (query_hdl)
+      {
+	CloseHandle (query_hdl);
+	query_hdl = NULL;
+      }
+  }
+  bool reader_closed ();
+
   ssize_t __reg3 raw_write (const void *ptr, size_t len);
 
 };
@@ -1189,7 +1201,6 @@ class fhandler_pipe: public fhandler_pipe_fifo
 private:
   HANDLE read_mtx;
   pid_t popen_pid;
-  void set_pipe_non_blocking (bool nonblocking);
 public:
   fhandler_pipe ();
 
@@ -1237,6 +1248,7 @@ public:
     fh->copy_from (this);
     return fh;
   }
+  void set_pipe_non_blocking (bool nonblocking);
 };
 
 #define CYGWIN_FIFO_PIPE_NAME_LEN     47
diff --git a/winsup/cygwin/fhandler_pipe.cc b/winsup/cygwin/fhandler_pipe.cc
index da473a1dc..4dab3015d 100644
--- a/winsup/cygwin/fhandler_pipe.cc
+++ b/winsup/cygwin/fhandler_pipe.cc
@@ -54,8 +54,12 @@ fhandler_pipe::set_pipe_non_blocking (bool nonblocking)
   FILE_PIPE_INFORMATION fpi;
 
   fpi.ReadMode = FILE_PIPE_BYTE_STREAM_MODE;
-  fpi.CompletionMode = nonblocking ? FILE_PIPE_COMPLETE_OPERATION
-    : FILE_PIPE_QUEUE_OPERATION;
+  /* If query_hdl is set, write pipe should check reader_closed()
+     while raw_read(). If the pipe is blocking, raw_write() stops
+     at NtWriteFile() and loses the chance to check it. Therefore,
+     always set write pipe to non-blocking. */
+  fpi.CompletionMode = (nonblocking || query_hdl)
+    ? FILE_PIPE_COMPLETE_OPERATION : FILE_PIPE_QUEUE_OPERATION;
   status = NtSetInformationFile (get_handle (), &io, &fpi, sizeof fpi,
 				 FilePipeInformation);
   if (!NT_SUCCESS (status))
@@ -202,6 +206,8 @@ fhandler_pipe::open_setup (int flags)
       if (!read_mtx)
 	debug_printf ("CreateMutex failed: %E");
     }
+  if (get_dev () == FH_PIPEW && !query_hdl)
+    set_pipe_non_blocking (is_nonblocking ());
 }
 
 off_t
@@ -268,39 +274,22 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
   while (nbytes < len)
     {
       ULONG_PTR nbytes_now = 0;
-      size_t left = len - nbytes;
-      ULONG len1 = (ULONG) left;
+      ULONG len1 = (ULONG) (len - nbytes);
       waitret = WAIT_OBJECT_0;
 
       if (evt)
 	ResetEvent (evt);
-      if (!is_nonblocking ())
+      FILE_PIPE_LOCAL_INFORMATION fpli;
+      status = NtQueryInformationFile (get_handle (), &io,
+				       &fpli, sizeof (fpli),
+				       FilePipeLocalInformation);
+      if (NT_SUCCESS (status))
 	{
-	  FILE_PIPE_LOCAL_INFORMATION fpli;
-
-	  /* If the pipe is empty, don't request more bytes than pipe
-	     buffer size - 1. Pending read lowers WriteQuotaAvailable on
-	     the write side and thus affects select's ability to return
-	     more or less reliable info whether a write succeeds or not. */
-	  ULONG chunk = pipe_buf_size - 1;
-	  status = NtQueryInformationFile (get_handle (), &io,
-					   &fpli, sizeof (fpli),
-					   FilePipeLocalInformation);
-	  if (NT_SUCCESS (status))
-	    {
-	      if (fpli.ReadDataAvailable > 0)
-		chunk = left;
-	      else if (nbytes != 0)
-		break;
-	      else
-		chunk = fpli.InboundQuota - 1;
-	    }
-	  else if (nbytes != 0)
-	    break;
-
-	  if (len1 > chunk)
-	    len1 = chunk;
+	if (fpli.ReadDataAvailable == 0 && nbytes != 0)
+	  break;
 	}
+      else if (nbytes != 0)
+	break;
       status = NtReadFile (get_handle (), evt, NULL, NULL, &io, ptr,
 			   len1, NULL, NULL);
       if (evt && status == STATUS_PENDING)
@@ -385,6 +374,16 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
   len = nbytes;
 }
 
+bool
+fhandler_pipe_fifo::reader_closed ()
+{
+  if (!query_hdl)
+    return false;
+  int n_reader = get_obj_handle_count (query_hdl);
+  int n_writer = get_obj_handle_count (get_handle ());
+  return n_reader == n_writer;
+}
+
 ssize_t __reg3
 fhandler_pipe_fifo::raw_write (const void *ptr, size_t len)
 {
@@ -493,7 +492,20 @@ fhandler_pipe_fifo::raw_write (const void *ptr, size_t len)
 			      get_obj_handle_count (select_sem), NULL);
 	  /* 0 bytes returned?  EAGAIN.  See above. */
 	  if (NT_SUCCESS (status) && nbytes == 0)
-	    set_errno (EAGAIN);
+	    {
+	      if (reader_closed ())
+		{
+		  set_errno (EPIPE);
+		  raise (SIGPIPE);
+		}
+	      else if (is_nonblocking ())
+		set_errno (EAGAIN);
+	      else
+		{
+		  cygwait (select_sem, 10);
+		  continue;
+		}
+	    }
 	}
       else if (STATUS_PIPE_IS_CLOSED (status))
 	{
@@ -523,6 +535,8 @@ fhandler_pipe::set_close_on_exec (bool val)
     set_no_inheritance (read_mtx, val);
   if (select_sem)
     set_no_inheritance (select_sem, val);
+  if (query_hdl)
+    set_no_inheritance (query_hdl, val);
 }
 
 void
@@ -532,6 +546,9 @@ fhandler_pipe::fixup_after_fork (HANDLE parent)
     fork_fixup (parent, read_mtx, "read_mtx");
   if (select_sem)
     fork_fixup (parent, select_sem, "select_sem");
+  if (query_hdl)
+    fork_fixup (parent, query_hdl, "query_hdl");
+
   fhandler_base::fixup_after_fork (parent);
 }
 
@@ -562,6 +579,15 @@ fhandler_pipe::dup (fhandler_base *child, int flags)
       ftp->close ();
       res = -1;
     }
+  else if (query_hdl &&
+	   !DuplicateHandle (GetCurrentProcess (), query_hdl,
+			    GetCurrentProcess (), &ftp->query_hdl,
+			    0, !(flags & O_CLOEXEC), DUPLICATE_SAME_ACCESS))
+    {
+      __seterrno ();
+      ftp->close ();
+      res = -1;
+    }
 
   debug_printf ("res %d", res);
   return res;
@@ -577,6 +603,8 @@ fhandler_pipe::close ()
       ReleaseSemaphore (select_sem, get_obj_handle_count (select_sem), NULL);
       CloseHandle (select_sem);
     }
+  if (query_hdl)
+    CloseHandle (query_hdl);
   return fhandler_base::close ();
 }
 
@@ -797,6 +825,23 @@ fhandler_pipe::create (fhandler_pipe *fhs[2], unsigned psize, int mode)
 	DuplicateHandle (GetCurrentProcess (), fhs[0]->select_sem,
 			 GetCurrentProcess (), &fhs[1]->select_sem,
 			 0, sa->bInheritHandle, DUPLICATE_SAME_ACCESS);
+      if (!DuplicateHandle (GetCurrentProcess (), r,
+			    GetCurrentProcess (), &fhs[1]->query_hdl,
+			    FILE_READ_DATA, sa->bInheritHandle, 0))
+	{
+	  CloseHandle (fhs[0]->select_sem);
+	  delete fhs[0];
+	  CloseHandle (r);
+	  CloseHandle (fhs[1]->select_sem);
+	  delete fhs[1];
+	  CloseHandle (w);
+	}
+      else
+	{
+	  /* Call set_pipe_non_blocking() again after creating query_hdl. */
+	  fhs[1]->set_pipe_non_blocking (fhs[1]->is_nonblocking ());
+	  res = 0;
+	}
     }
 
   debug_printf ("%R = pipe([%p, %p], %d, %y)", res, fhs[0], fhs[1], psize, mode);
diff --git a/winsup/cygwin/select.cc b/winsup/cygwin/select.cc
index 5e583434c..ac2f3a9e0 100644
--- a/winsup/cygwin/select.cc
+++ b/winsup/cygwin/select.cc
@@ -608,16 +608,47 @@ pipe_data_available (int fd, fhandler_base *fh, HANDLE h, bool writing)
     }
   if (writing)
     {
-      /* WriteQuotaAvailable is decremented by the number of bytes requested
-	 by a blocking reader on the other side of the pipe.  Cygwin readers
-	 are serialized and never request a number of bytes equivalent to the
-	 full buffer size.  So WriteQuotaAvailable is 0 only if either the
-	 read buffer on the other side is really full, or if we have non-Cygwin
-	 readers. */
+      /* If there is anything available in the pipe buffer then signal
+        that.  This means that a pipe could still block since you could
+        be trying to write more to the pipe than is available in the
+        buffer but that is the hazard of select().
+
+        Note that WriteQuotaAvailable is unreliable.
+
+        Usually WriteQuotaAvailable on the write side reflects the space
+        available in the inbound buffer on the read side.  However, if a
+        pipe read is currently pending, WriteQuotaAvailable on the write side
+        is decremented by the number of bytes the read side is requesting.
+        So it's possible (even likely) that WriteQuotaAvailable is 0, even
+        if the inbound buffer on the read side is not full.  This can lead to
+        a deadlock situation: The reader is waiting for data, but select
+        on the writer side assumes that no space is available in the read
+        side inbound buffer.
+
+        Consequently, the only reliable information is available on the
+        read side, so fetch info from the read side via the pipe-specific
+        query handle.  Use fpli.WriteQuotaAvailable as storage for the actual
+        interesting value, which is the InboundQuota on the write side,
+        decremented by the number of bytes of data in that buffer. */
+      /* Note: Do not use NtQueryInformationFile() for query_hdl because
+	 NtQueryInformationFile() seems to interfere with reading pipes
+	 in non-cygwin apps. Instead, use PeekNamedPipe() here. */
+      if (fh->get_device () == FH_PIPEW)
+	{
+	  HANDLE query_hdl = ((fhandler_pipe *) fh)->get_query_handle ();
+	  if (query_hdl)
+	    {
+	      DWORD nbytes_in_pipe;
+	      PeekNamedPipe (query_hdl, NULL, 0, NULL, &nbytes_in_pipe, NULL);
+	      fpli.WriteQuotaAvailable = fpli.InboundQuota - nbytes_in_pipe;
+	    }
+	  else
+	    return 1;
+	}
       if (fpli.WriteQuotaAvailable > 0)
 	{
 	  paranoid_printf ("fd %d, %s, write: size %u, avail %u", fd,
-			   fh->get_name (), fpli.OutboundQuota,
+			   fh->get_name (), fpli.InboundQuota,
 			   fpli.WriteQuotaAvailable);
 	  return 1;
 	}
@@ -712,6 +743,13 @@ out:
   h = fh->get_output_handle ();
   if (s->write_selected && dev != FH_PIPER)
     {
+      if (dev == FH_PIPEW && ((fhandler_pipe *) fh)->reader_closed ())
+	{
+	  gotone += s->write_ready = true;
+	  if (s->except_selected)
+	    gotone += s->except_ready = true;
+	  return gotone;
+	}
       gotone += s->write_ready =  pipe_data_available (s->fd, fh, h, true);
       select_printf ("write: %s, gotone %d", fh->get_name (), gotone);
     }
diff --git a/winsup/cygwin/spawn.cc b/winsup/cygwin/spawn.cc
index 0bde0b04d..6b2026776 100644
--- a/winsup/cygwin/spawn.cc
+++ b/winsup/cygwin/spawn.cc
@@ -657,6 +657,17 @@ child_info_spawn::worker (const char *prog_arg, const char *const *argv,
 		ptys->create_invisible_console ();
 		ptys->setup_locale ();
 	      }
+	    else if (cfd->get_dev () == FH_PIPEW)
+	      {
+		fhandler_pipe *pipe = (fhandler_pipe *)(fhandler_base *) cfd;
+		pipe->close_query_handle ();
+		pipe->set_pipe_non_blocking (false);
+	      }
+	    else if (cfd->get_dev () == FH_PIPER)
+	      {
+		fhandler_pipe *pipe = (fhandler_pipe *)(fhandler_base *) cfd;
+		pipe->set_pipe_non_blocking (false);
+	      }
 	}
 
       bool enable_pcon = false;
-- 
2.33.0
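
The select.cc hunk above derives the writable space from the read side: the inbound quota minus the number of bytes currently sitting in the pipe. A minimal sketch of that arithmetic follows, as portable C++ rather than Cygwin code. PipeState, write_space and write_ready are hypothetical names for illustration only, and the clamp at zero is an assumption added here to avoid unsigned wrap-around; the patch itself does a plain subtraction.

```cpp
#include <cstdint>

// Hypothetical stand-in for the two values the patch combines: the
// InboundQuota of the pipe and the byte count reported by PeekNamedPipe().
struct PipeState
{
  std::uint32_t inbound_quota;  // capacity of the read-side buffer
  std::uint32_t bytes_in_pipe;  // bytes queued and not yet read
};

// Free space as seen from the read side.  The clamp at zero is an
// addition in this sketch; the patch subtracts without clamping.
std::uint32_t
write_space (const PipeState &p)
{
  return p.bytes_in_pipe >= p.inbound_quota
	 ? 0 : p.inbound_quota - p.bytes_in_pipe;
}

// The write end counts as ready as soon as any space is free, so a
// large write may still block afterwards.
bool
write_ready (const PipeState &p)
{
  return write_space (p) > 0;
}
```

Because the value no longer depends on WriteQuotaAvailable, a pending read on the other side cannot make the pipe look full in this model.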


[-- Attachment #3: 0002-Cygwin-pipe-fifo-Release-select_sem-semaphore-as-muc.patch --]
[-- Type: application/octet-stream, Size: 5277 bytes --]

From e5c64960fddd43f08dae7afbe3ae1c75bd41c81d Mon Sep 17 00:00:00 2001
From: Takashi Yano <takashi.yano@nifty.ne.jp>
Date: Tue, 14 Sep 2021 13:10:54 +0900
Subject: [PATCH 2/2] Cygwin: pipe, fifo: Release select_sem semaphore as much
 as needed.

- Currently, raw_read(), raw_write() and close() release select_sem
  unconditionally even if no waiter for select_sem exists. With this
  patch, only the minimum number of semaphores required is released.
---
 winsup/cygwin/fhandler.h       |  4 ++++
 winsup/cygwin/fhandler_fifo.cc | 28 ++++++++++++++++++++++------
 winsup/cygwin/fhandler_pipe.cc | 28 +++++++++++++++++++++-------
 3 files changed, 47 insertions(+), 13 deletions(-)

diff --git a/winsup/cygwin/fhandler.h b/winsup/cygwin/fhandler.h
index db2325144..9580a698c 100644
--- a/winsup/cygwin/fhandler.h
+++ b/winsup/cygwin/fhandler.h
@@ -1177,6 +1177,7 @@ class fhandler_pipe_fifo: public fhandler_base
  protected:
   size_t pipe_buf_size;
   HANDLE query_hdl;
+  virtual void release_select_sem (const char *) {};
 
  public:
   fhandler_pipe_fifo ();
@@ -1201,6 +1202,7 @@ class fhandler_pipe: public fhandler_pipe_fifo
 private:
   HANDLE read_mtx;
   pid_t popen_pid;
+  void release_select_sem (const char *);
 public:
   fhandler_pipe ();
 
@@ -1444,6 +1446,8 @@ class fhandler_fifo: public fhandler_pipe_fifo
   void shared_fc_handler_updated (bool val)
   { shmem->shared_fc_handler_updated (val); }
 
+  void release_select_sem (const char *);
+
 public:
   fhandler_fifo ();
   ~fhandler_fifo ()
diff --git a/winsup/cygwin/fhandler_fifo.cc b/winsup/cygwin/fhandler_fifo.cc
index 37498f547..489ba528c 100644
--- a/winsup/cygwin/fhandler_fifo.cc
+++ b/winsup/cygwin/fhandler_fifo.cc
@@ -1185,6 +1185,22 @@ fhandler_fifo::take_ownership (DWORD timeout)
   return ret;
 }
 
+void
+fhandler_fifo::release_select_sem (const char *from)
+{
+  LONG n_release;
+  if (reader) /* Number of select() call. */
+    n_release = get_obj_handle_count (select_sem)
+      - get_obj_handle_count (read_ready);
+  else /* Number of select() and reader */
+    n_release = get_obj_handle_count (select_sem)
+      - get_obj_handle_count (get_handle ());
+  debug_printf("%s(%s) release %d", from,
+	       reader ? "reader" : "writer", n_release);
+  if (n_release)
+    ReleaseSemaphore (select_sem, n_release, NULL);
+}
+
 void __reg3
 fhandler_fifo::raw_read (void *in_ptr, size_t& len)
 {
@@ -1372,7 +1388,7 @@ out:
   fifo_client_unlock ();
   reading_unlock ();
   if (select_sem)
-    ReleaseSemaphore (select_sem, get_obj_handle_count (select_sem), NULL);
+    release_select_sem ("raw_read");
 }
 
 int __reg2
@@ -1483,6 +1499,11 @@ fhandler_fifo::cancel_reader_thread ()
 int
 fhandler_fifo::close ()
 {
+  if (select_sem)
+    {
+      release_select_sem ("close");
+      NtClose (select_sem);
+    }
   if (writer)
     {
       nwriters_lock ();
@@ -1574,11 +1595,6 @@ fhandler_fifo::close ()
     NtClose (write_ready);
   if (writer_opening)
     NtClose (writer_opening);
-  if (select_sem)
-    {
-      ReleaseSemaphore (select_sem, get_obj_handle_count (select_sem), NULL);
-      NtClose (select_sem);
-    }
   if (nohandle ())
     return 0;
   else
diff --git a/winsup/cygwin/fhandler_pipe.cc b/winsup/cygwin/fhandler_pipe.cc
index 4dab3015d..c0379e81d 100644
--- a/winsup/cygwin/fhandler_pipe.cc
+++ b/winsup/cygwin/fhandler_pipe.cc
@@ -237,6 +237,22 @@ fhandler_pipe::get_proc_fd_name (char *buf)
   return buf;
 }
 
+void
+fhandler_pipe::release_select_sem (const char *from)
+{
+  LONG n_release;
+  if (get_dev () == FH_PIPER) /* Number of select() and writer */
+    n_release = get_obj_handle_count (select_sem)
+      - get_obj_handle_count (read_mtx);
+  else /* Number of select() call */
+    n_release = get_obj_handle_count (select_sem)
+      - get_obj_handle_count (query_hdl);
+  debug_printf("%s(%s) release %d", from,
+	       get_dev () == FH_PIPER ? "PIPER" : "PIPEW", n_release);
+  if (n_release)
+    ReleaseSemaphore (select_sem, n_release, NULL);
+}
+
 void __reg3
 fhandler_pipe::raw_read (void *ptr, size_t& len)
 {
@@ -328,8 +344,7 @@ fhandler_pipe::raw_read (void *ptr, size_t& len)
 	  ptr = ((char *) ptr) + nbytes_now;
 	  nbytes += nbytes_now;
 	  if (select_sem && nbytes_now > 0)
-	    ReleaseSemaphore (select_sem,
-			      get_obj_handle_count (select_sem), NULL);
+	    release_select_sem ("raw_read");
 	}
       else
 	{
@@ -488,8 +503,7 @@ fhandler_pipe_fifo::raw_write (const void *ptr, size_t len)
 	  ptr = ((char *) ptr) + nbytes_now;
 	  nbytes += nbytes_now;
 	  if (select_sem && nbytes_now > 0)
-	    ReleaseSemaphore (select_sem,
-			      get_obj_handle_count (select_sem), NULL);
+	    release_select_sem ("raw_write");
 	  /* 0 bytes returned?  EAGAIN.  See above. */
 	  if (NT_SUCCESS (status) && nbytes == 0)
 	    {
@@ -596,13 +610,13 @@ fhandler_pipe::dup (fhandler_base *child, int flags)
 int
 fhandler_pipe::close ()
 {
-  if (read_mtx)
-    CloseHandle (read_mtx);
   if (select_sem)
     {
-      ReleaseSemaphore (select_sem, get_obj_handle_count (select_sem), NULL);
+      release_select_sem ("close");
       CloseHandle (select_sem);
     }
+  if (read_mtx)
+    CloseHandle (read_mtx);
   if (query_hdl)
     CloseHandle (query_hdl);
   return fhandler_base::close ();
-- 
2.33.0
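
Both release_select_sem() implementations in this patch compute the release count the same way: the number of handles open on select_sem minus the number of handles open on a companion object (read_mtx, query_hdl, read_ready or the pipe handle). A minimal sketch of that computation, assuming the handle counts are already available as plain integers; in the patch they come from get_obj_handle_count(). The name select_sem_release_count is hypothetical, and clamping negative results to zero is an assumption of this sketch, since the patch only skips the release when the difference is exactly zero.

```cpp
// Number of semaphore units to release, given how many handles are open
// on select_sem and on the companion object.  Each fhandler end is
// assumed to hold one handle on each, so the surplus approximates the
// extra parties waiting on select_sem.
long
select_sem_release_count (long select_sem_handles, long companion_handles)
{
  long n = select_sem_handles - companion_handles;
  // Clamping is specific to this sketch; the patch checks only for != 0.
  return n > 0 ? n : 0;
}
```

With five handles on select_sem and two on the companion object, three units would be released instead of five, which is the saving the commit message describes.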



* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-14  8:07                                             ` Takashi Yano
@ 2021-09-14  8:47                                               ` Corinna Vinschen
  2021-09-14 12:38                                                 ` Ken Brown
  0 siblings, 1 reply; 250+ messages in thread
From: Corinna Vinschen @ 2021-09-14  8:47 UTC (permalink / raw)
  To: cygwin-developers

On Sep 14 17:07, Takashi Yano wrote:
> On Mon, 13 Sep 2021 22:15:25 +0200
> Corinna Vinschen wrote:
> > That should depend on the O_CLOEXEC setting, but identically for
> > all handles in the fhandler.
> 
> I found the cause. set_close_on_exec() in fhandler_pipe is missing.
> set_no_inheritance() calls for all adjunct handles are necessary.
> 
> > I pushed two more patches to topic/pipe in terms of inheritance,
> > maybe that gives a clue?
> 
> I attached two additional patches for this issue.

Uh oh!  This patch to fhandler_base::dup made me check other fhandlers
and, yeah, we have more unconditional inheritance ignoring O_CLOEXEC
(fhandler_tape for instance).  We should fix that at some point, but that
requires your patch to go to master first.  Let's just keep that in mind
for now.

I'll push both patches in a bit.
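
The rule under discussion, that inheritance follows O_CLOEXEC identically for every handle in the fhandler, can be sketched in portable C++ as below. Handle, inherit_from_flags and set_close_on_exec_all are hypothetical names for illustration, not Cygwin's API; in the patch the per-handle work is done by set_no_inheritance() and by the bInheritHandle argument of DuplicateHandle().

```cpp
#include <fcntl.h>
#include <vector>

// Hypothetical stand-in for a Windows handle with an inheritance bit.
struct Handle
{
  bool inheritable;
};

// Same expression the dup() hunk passes as bInheritHandle.
bool
inherit_from_flags (int flags)
{
  return !(flags & O_CLOEXEC);
}

// Apply one close-on-exec setting to the main handle and to every
// adjunct handle; unset (null) handles are skipped, as in the patch.
void
set_close_on_exec_all (std::vector<Handle *> &handles, bool val)
{
  for (Handle *h : handles)
    if (h)
      h->inheritable = !val;
}
```

The point of routing every handle through one loop is that adding a new adjunct handle, such as query_hdl, cannot silently miss the O_CLOEXEC treatment.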


Corinna


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-14  8:08                                           ` Takashi Yano
@ 2021-09-14  9:03                                             ` Corinna Vinschen
  2021-09-14  9:56                                               ` Takashi Yano
  0 siblings, 1 reply; 250+ messages in thread
From: Corinna Vinschen @ 2021-09-14  9:03 UTC (permalink / raw)
  To: cygwin-developers

On Sep 14 17:08, Takashi Yano wrote:
> @@ -54,8 +54,12 @@ fhandler_pipe::set_pipe_non_blocking (bool nonblocking)
>    FILE_PIPE_INFORMATION fpi;
>  
>    fpi.ReadMode = FILE_PIPE_BYTE_STREAM_MODE;
> -  fpi.CompletionMode = nonblocking ? FILE_PIPE_COMPLETE_OPERATION
> -    : FILE_PIPE_QUEUE_OPERATION;
> +  /* If query_hdl is set, write pipe should check reader_closed()
> +     while raw_write(). If the pipe is blocking, raw_write() stops
> +     at NtWriteFile() and loses the chance to check it. Therefore,
> +     always set write pipe to non-blocking. */
> +  fpi.CompletionMode = (nonblocking || query_hdl)
> +    ? FILE_PIPE_COMPLETE_OPERATION : FILE_PIPE_QUEUE_OPERATION;
>    status = NtSetInformationFile (get_handle (), &io, &fpi, sizeof fpi,
>  				 FilePipeInformation);
>    if (!NT_SUCCESS (status))

I don't quite follow the argument.  Blocking pipes are using
asynchronous IO, so they are in fact not blocking calls on the
OS level.  After calling NtWriteFile, the blocking variation
will go into the subsequent

  waitret = cygwait (evt, INFINITE, cw_cancel | cw_sig);

So, wouldn't you get the same effect by keeping the pipe in
FILE_PIPE_QUEUE_OPERATION mode and just add a timeout to the above
cygwait call and handle select_sem in a not yet existing WAIT_TIMEOUT
conditional?
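
A minimal sketch of the loop shape proposed above, in portable C++ rather than Cygwin code: wait on the pending write with a finite timeout, and use each timeout to check whether the reader has gone away. The names WaitStatus, WriteResult and wait_for_write are hypothetical; timed_wait stands in for a cygwait() call with a finite timeout and reader_closed for fhandler_pipe_fifo::reader_closed().

```cpp
#include <functional>

enum class WaitStatus { Signaled, Timeout };
enum class WriteResult { Completed, ReaderGone };

// Wait for a queued write to finish while the pipe stays in queue
// (blocking) mode.  Every timeout is a chance to notice a closed reader.
WriteResult
wait_for_write (const std::function<WaitStatus ()> &timed_wait,
		const std::function<bool ()> &reader_closed)
{
  for (;;)
    {
      if (timed_wait () == WaitStatus::Signaled)
	return WriteResult::Completed;
      // Timeout branch: the caller would set EPIPE and raise SIGPIPE.
      if (reader_closed ())
	return WriteResult::ReaderGone;
    }
}
```

Signal and cancellation handling, which the real cygwait() call also covers through cw_cancel | cw_sig, is left out of this sketch.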


Corinna


P.S.: Maybe the cygwait call is just too simple.  It would be nice if it
      had been defined to take an array of handles, rather than just a
      single handle.  Another change we should keep in mind.


* Re: cygrunsrv + sshd + rsync = 20 times too slow -- throttled?
  2021-09-14  9:03                                             ` Corinna Vinschen
@ 2021-09-14  9:56                                               ` Takashi Yano
  2021-09-14 10:19                                                 ` Takashi Yano
  0 siblings, 1 reply; 250+ messages in thread
From: Takashi Yano @ 2021-09-14  9:56 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 1545 bytes --]

On Tue, 14 Sep 2021 11:03:39 +0200
Corinna Vinschen wrote:
> On Sep 14 17:08, Takashi Yano wrote:
> > @@ -54,8 +54,12 @@ fhandler_pipe::set_pipe_non_blocking (bool nonblocking)
> >    FILE_PIPE_INFORMATION fpi;
> >  
> >    fpi.ReadMode = FILE_PIPE_BYTE_STREAM_MODE;
> > -  fpi.CompletionMode = nonblocking ? FILE_PIPE_COMPLETE_OPERATION
> > -    : FILE_PIPE_QUEUE_OPERATION;
> > +  /* If query_hdl is set, write pipe should check reader_closed()
> > +     while raw_write(). If the pipe is blocking, raw_write() stops
> > +     at NtWriteFile() and loses the chance to check it. Therefore,
> > +     always set write pipe to non-blocking. */
> > +  fpi.CompletionMode = (nonblocking || query_hdl)
> > +    ? FILE_PIPE_COMPLETE_OPERATION : FILE_PIPE_QUEUE_OPERATION;
> >    status = NtSetInformationFile (get_handle (), &io, &fpi, sizeof fpi,
> >  				 FilePipeInformation);
> >    if (!NT_SUCCESS (status))
> 
> I don't quite follow the argument.  Blocking pipes are using
> asynchronous IO, so they are in fact not blocking calls on the
> OS level.  After calling NtWriteFile, the blocking variation
> will go into the subsequent
> 
>   waitret = cygwait (evt, INFINITE, cw_cancel | cw_sig);
> 
> So, wouldn't you get the same effect by keeping the pipe in
> FILE_PIPE_QUEUE_OPERATION mode and just add a timeout to the above
> cygwait call and handle select_sem in a not yet existing WAIT_TIMEOUT
> conditional?

Sounds reasonable. I revised the patches. Do you mean something like
the attached patch?

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>