Date: Fri, 25 Aug 2023 12:50:56 +0200
From: Corinna Vinschen
To: cygwin@cygwin.com
Subject: Re: scp stalls on uploading in cygwin 3.5 current master.
Reply-To: cygwin@cygwin.com
References: <20230824060502.c4798062cb19d4d35a5633ae@nifty.ne.jp> <20230824123131.390b4471915c963425c77608@nifty.ne.jp> <20230825174832.9ebae8112667d5d5411cb8db@nifty.ne.jp>
In-Reply-To: <20230825174832.9ebae8112667d5d5411cb8db@nifty.ne.jp>

On Aug 25 17:48, Takashi Yano via Cygwin wrote:
> On Thu, 24 Aug 2023 10:59:33 +0200
> Corinna Vinschen wrote:
> > > I'm not sure why at all, however, the following patch seems to
> > > solve the issue.
> > >
> > > diff --git a/winsup/cygwin/select.cc b/winsup/cygwin/select.cc
> > > index 7b9473849..de5794c9f 100644
> > > --- a/winsup/cygwin/select.cc
> > > +++ b/winsup/cygwin/select.cc
> > > @@ -1790,7 +1790,7 @@ peek_socket (select_record *me, bool)
> > >        if (events & FD_WRITE)
> > >          {
> > >            wfd_set w = { 1, { fh->get_socket () } };
> > > -          TIMEVAL t = { 0 };
> > > +          TIMEVAL t = { .tv_sec = 0, .tv_usec = 1 };
> > >
> > >            if (_win32_select (0, NULL, &w, NULL, &t) == 0)
> > >              events &= ~FD_WRITE;
> >
> > Yeah, this is weird.  A TIMEVAL value of 0 indicates non-blocking,
> > so why should waiting a usec make that better?  It also potentially
> > slows down Cygwin's select noticeably if multiple sockets are part
> > of the descriptor set.
> >
> > Hmmm.
> >
> > Is it possible that _win32_select returns with SOCKET_ERROR for
> > some reason?
> >
> > Unfortunately I'm a bit swamped ATM, but rather than setting t to 1
> > usec, what if the check goes:
> >
> >   if (_win32_select (0, NULL, &w, NULL, &t) != 1)
> >
> > ?
>
> This did not help. I looked into this deeper and noticed that:
> 1) _win32_select() sometimes returns 0.
> 2) If _win32_select() returns 0, WaitForMultipleObjects(..., INFINITE)
>    is called in thread_socket().
> 3) WaitForMultipleObjects() sometimes does not return for FD_WRITE
>    for unknown reason.
> This causes the stall.

So the situation is that the network event handling returned FD_WRITE,
because it always returns FD_WRITE as long as a non-blocking send()
function didn't explicitly fail due to buffer overrun.  However,
_win32_select will notice that the buffer is full, so it does not
return 1, but 0.  I.e., the socket is not ready for writing.

Now you're saying that it's possible that the following WFMO
(WaitForMultipleObjects) will never return?
That would mean that the FD_WRITE event won't be triggered again
because it already *had* been triggered and the only way to re-enable
it is to call one of the send() functions (see
https://learn.microsoft.com/en-us/windows/win32/api/winsock2/nf-winsock2-wsaeventselect).

I don't have an answer to this problem yet.  Can we use a zero-length
send, i.e. send(sock, "", 0, 0), to re-enable FD_WRITE, perhaps?


Corinna
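
For illustration only, here is a minimal POSIX sketch (not Winsock, and
not Cygwin code) of the zero-timeout probe discussed above: select()
with a { 0, 0 } timeout never blocks, returns 1 while the send buffer
has room, and returns 0 once the buffer is full.  The helper names
probe_writable, fill_send_buffer, drain and run_probe_demo are made up
for this example, and the AF_UNIX socketpair merely stands in for the
TCP socket scp would use.  Plain select() is level-triggered, so this
sketch does not reproduce the FD_WRITE re-arming behaviour of
WSAEventSelect that is suspected to cause the stall.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Zero-timeout select(): 1 if fd is writable right now, 0 if it is
   not, -1 on error.  A { 0, 0 } timeout means "poll, do not wait". */
static int
probe_writable (int fd)
{
  fd_set w;
  struct timeval t = { 0, 0 };

  FD_ZERO (&w);
  FD_SET (fd, &w);
  return select (fd + 1, NULL, &w, NULL, &t);
}

/* Queue data on a non-blocking socket until send() reports that it
   would block.  Returns the number of bytes queued, -1 on error. */
static long
fill_send_buffer (int fd)
{
  char chunk[4096];
  long total = 0;

  memset (chunk, 'x', sizeof chunk);
  for (;;)
    {
      ssize_t n = send (fd, chunk, sizeof chunk, 0);
      if (n > 0)
        total += n;
      else if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return total;
      else
        return -1;
    }
}

/* Read everything currently queued without blocking.  Returns the
   number of bytes read, -1 on error or unexpected EOF. */
static long
drain (int fd)
{
  char chunk[4096];
  long total = 0;

  for (;;)
    {
      ssize_t n = recv (fd, chunk, sizeof chunk, MSG_DONTWAIT);
      if (n > 0)
        total += n;
      else if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return total;
      else
        return -1;
    }
}

/* Hypothetical demo driver: records the probe result with an empty
   send buffer, a full one, and after the peer drained it again.
   Returns 0 on success, -1 on any setup or I/O failure. */
int
run_probe_demo (int result[3])
{
  int sv[2];
  int flags;
  long queued, drained;

  assert (result != NULL);
  if (socketpair (AF_UNIX, SOCK_STREAM, 0, sv) < 0)
    return -1;
  flags = fcntl (sv[0], F_GETFL, 0);
  if (flags < 0 || fcntl (sv[0], F_SETFL, flags | O_NONBLOCK) < 0)
    {
      close (sv[0]);
      close (sv[1]);
      return -1;
    }

  result[0] = probe_writable (sv[0]);   /* buffer empty */
  queued = fill_send_buffer (sv[0]);
  result[1] = probe_writable (sv[0]);   /* buffer full */
  drained = drain (sv[1]);
  result[2] = probe_writable (sv[0]);   /* buffer drained by peer */

  close (sv[0]);
  close (sv[1]);
  return (queued < 0 || drained != queued) ? -1 : 0;
}
```

Calling run_probe_demo() with an int[3] array is expected to leave 1,
0, 1 in it on Linux: writable, not writable once the buffer is full,
writable again after the peer has read the data.  The middle value
corresponds to the case where _win32_select returns 0 in peek_socket.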