public inbox for cygwin@cygwin.com
 help / color / mirror / Atom feed
* [PATCH] Fix poll/select signal socket as write ready on connect failure
@ 2020-07-15 18:54 Marc Hoersken
  2020-07-16  9:25 ` Corinna Vinschen
  0 siblings, 1 reply; 6+ messages in thread
From: Marc Hoersken @ 2020-07-15 18:54 UTC (permalink / raw)
  To: cygwin

[-- Attachment #1: Type: text/plain, Size: 2427 bytes --]

Hello everyone,

I identified an issue related to the way the events FD_CONNECT and 
FD_CLOSE returned by WSAEnumNetworkEvents are currently handled in 
winsup/cygwin/fhandler_socket_inet.cc.

It seems like the code does not handle the fact that those events are 
returned only once for a socket and if not acted upon by the calling 
program may not be received again. This means poll and select are 
currently not consistend about the socket still being writable after a 
connect failure. The first call to poll or select would signal the 
socket as writable, but not any following call. The first call consumes 
the FD_CONNECT and FD_CLOSE events, regardless of the event mask 
supplied by the calling program. So even if the calling program does not 
care about writability in the first call, the events are consumed and 
following calls checking for writability will not be able to detect a 
connection failure.

A very simple test to reproduce can be made with the Cygwin provided 
curl package. After installing with current Cygwin, issue the following 
command to make it try connecting to a local non-listening port (eg. 47):

curl -v 127.0.0.1:47

With current Cygwin this will never timeout. An explanation of the curl 
internals can be found here [1], but the short version is: curl waits on 
sockets without checking/handling writability in a first poll call and 
then after waiting, the writability (connection failure) is checked in a 
second poll call per wait-loop iteration. Therefore curl can never 
detect the connection failure in the second call, because the first call 
already consumed the relevant events.

As far as I understand calling poll and/or select should not 
change/reset the socket readyness state, therefore I created a simple 
fix which could be used to solve this issue. Attached you will find a 
suggested patch to make sure poll and select always signal writability 
of a connection failed socket. With this patch applied the above example 
command failed with a "Connection refused" as expected.

This patch only fixes the behaviour regarding connection failure (during 
FD_CONNECT), I am not sure if connection closure (during FD_CLOSE) is 
also affected, but I was not able to find code handling the fact that 
FD_CLOSE is only signalled once.

Please take a look and thanks in advance!

Best regards,
Marc Hörsken

[1] https://github.com/curl/curl/pull/5509#issuecomment-658357933


[-- Attachment #2: 0001-Cygwin-make-sure-failed-sockets-always-signal-writab.patch --]
[-- Type: text/plain, Size: 1365 bytes --]

From 7cd9d597a2a314c3aeb5b7c8aaa970ded6d56d7a Mon Sep 17 00:00:00 2001
From: Marc Hoersken <info@marc-hoersken.de>
Date: Wed, 15 Jul 2020 20:53:21 +0200
Subject: [PATCH] Cygwin: make sure failed sockets always signal writability

Since FD_CONNECT is only given once, we manually need to set
FD_WRITE for connection failed sockets to have consistent
behaviour in programs calling poll/select multiple times.

Example test to non-listening port: curl -v 127.0.0.1:47
---
 winsup/cygwin/fhandler_socket_inet.cc | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/winsup/cygwin/fhandler_socket_inet.cc b/winsup/cygwin/fhandler_socket_inet.cc
index 74c415d..e5b0d2d 100644
--- a/winsup/cygwin/fhandler_socket_inet.cc
+++ b/winsup/cygwin/fhandler_socket_inet.cc
@@ -376,6 +376,12 @@ fhandler_socket_wsock::evaluate_events (const long event_mask, long &events,
       if (erase)
 	wsock_events->events &= ~(events & ~(FD_WRITE | FD_CLOSE));
     }
+  /* Since FD_CONNECT is only given once, we manually need to set
+     FD_WRITE for connection failed sockets to have consistent
+     behaviour in programs calling poll/select multiple times.
+     Example test to non-listening port: curl -v 127.0.0.1:47 */
+  if ((connect_state () == connect_failed) && (event_mask & FD_WRITE))
+    wsock_events->events |= FD_WRITE;
   UNLOCK_EVENTS;
 
   return ret;
-- 
2.7.4


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Fix poll/select signal socket as write ready on connect failure
  2020-07-15 18:54 [PATCH] Fix poll/select signal socket as write ready on connect failure Marc Hoersken
@ 2020-07-16  9:25 ` Corinna Vinschen
  2020-07-17 18:56   ` Marc Hoersken
  0 siblings, 1 reply; 6+ messages in thread
From: Corinna Vinschen @ 2020-07-16  9:25 UTC (permalink / raw)
  To: Marc Hoersken; +Cc: cygwin

Hi Marc,

On Jul 15 20:54, Marc Hoersken via Cygwin wrote:
> Hello everyone,
> 
> I identified an issue related to the way the events FD_CONNECT and FD_CLOSE
> returned by WSAEnumNetworkEvents are currently handled in
> winsup/cygwin/fhandler_socket_inet.cc.
> 
> It seems like the code does not handle the fact that those events are
> returned only once for a socket and if not acted upon by the calling program
> may not be received again. This means poll and select are currently not
> consistend about the socket still being writable after a connect failure.
> The first call to poll or select would signal the socket as writable, but
> not any following call. The first call consumes the FD_CONNECT and FD_CLOSE
> events, regardless of the event mask supplied by the calling program. So
> even if the calling program does not care about writability in the first
> call, the events are consumed and following calls checking for writability
> will not be able to detect a connection failure.
> [...]
> As far as I understand calling poll and/or select should not change/reset
> the socket readyness state, therefore I created a simple fix which could be
> used to solve this issue. Attached you will find a suggested patch to make
> sure poll and select always signal writability of a connection failed
> socket. With this patch applied the above example command failed with a
> "Connection refused" as expected.
> 
> This patch only fixes the behaviour regarding connection failure (during
> FD_CONNECT), I am not sure if connection closure (during FD_CLOSE) is also
> affected, but I was not able to find code handling the fact that FD_CLOSE is
> only signalled once.
> 
> Please take a look and thanks in advance!

Thanks for the patch.  I pushed it.  But then I got second thoughts in
terms of how to fix the issue.  The reason is that the FD_CLOSE problem
shouldn't exist, simply for the fact that we never remove FD_CLOSE from
the events mask, see

https://sourceware.org/git/?p=newlib-cygwin.git;a=blob;f=winsup/cygwin/fhandler_socket_inet.cc;hb=HEAD#l377

So, rather than setting FD_WRITE at some later point in the code, what
about handling this where the other FD_CONNECT stuff is handled, by
not erasing the FD_CONNECT bit, just like with FD_CLOSE?

diff --git a/winsup/cygwin/fhandler_socket_inet.cc b/winsup/cygwin/fhandler_socket_inet.cc
index e5b0d2d1443e..b64d96225db1 100644
--- a/winsup/cygwin/fhandler_socket_inet.cc
+++ b/winsup/cygwin/fhandler_socket_inet.cc
@@ -354,7 +354,12 @@ fhandler_socket_wsock::evaluate_events (const long event_mask, long &events,
 	    }
 	  else
 	    wsock_events->events |= FD_WRITE;
-	  wsock_events->events &= ~FD_CONNECT;
+	  /* Since FD_CONNECT is only given once, we have to keep FD_CONNECT
+	     for connection failed sockets to have consistent behaviour in
+	     programs calling poll/select multiple times.  Example test to
+	     non-listening port: curl -v 127.0.0.1:47 */
+	  if (connect_state () != connect_failed)
+	    wsock_events->events &= ~FD_CONNECT;
 	  wsock_events->connect_errorcode = 0;
 	}
       /* This test makes accept/connect behave as on Linux when accept/connect
@@ -376,12 +381,6 @@ fhandler_socket_wsock::evaluate_events (const long event_mask, long &events,
       if (erase)
 	wsock_events->events &= ~(events & ~(FD_WRITE | FD_CLOSE));
     }
-  /* Since FD_CONNECT is only given once, we manually need to set
-     FD_WRITE for connection failed sockets to have consistent
-     behaviour in programs calling poll/select multiple times.
-     Example test to non-listening port: curl -v 127.0.0.1:47 */
-  if ((connect_state () == connect_failed) && (event_mask & FD_WRITE))
-    wsock_events->events |= FD_WRITE;
   UNLOCK_EVENTS;
 
   return ret;

What do you think?


Thanks,
Corinna

-- 
Corinna Vinschen
Cygwin Maintainer

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Fix poll/select signal socket as write ready on connect failure
  2020-07-16  9:25 ` Corinna Vinschen
@ 2020-07-17 18:56   ` Marc Hoersken
  2020-07-17 19:21     ` Corinna Vinschen
  0 siblings, 1 reply; 6+ messages in thread
From: Marc Hoersken @ 2020-07-17 18:56 UTC (permalink / raw)
  To: cygwin

Hi Corinna,

Am 16.07.2020 um 11:25 schrieb Corinna Vinschen:
> Thanks for the patch.  I pushed it.


thanks for pushing it already. Please excuse my delayed response, family 
live kept me busy.


> But then I got second thoughts in terms of how to fix the issue.


Yes, I also got second thoughts yesterday about my initial approach.


> The reason is that the FD_CLOSE problem shouldn't exist,
> simply for the fact that we never remove FD_CLOSE from
> the events mask, see
>
> https://sourceware.org/git/?p=newlib-cygwin.git;a=blob;f=winsup/cygwin/fhandler_socket_inet.cc;hb=HEAD#l377


Thanks, I also understood in the meantime, that some flags/events are 
not removed from wsock_events->event. FD_CLOSE seems not to be affected 
as you described and I was unable to produce an issue in case the 
connection was closed from the server side. So only the FD_CONNECT to 
FD_WRITE handling remained problematic (before the patch).


> So, rather than setting FD_WRITE at some later point in the code, what
> about handling this where the other FD_CONNECT stuff is handled, by
> not erasing the FD_CONNECT bit, just like with FD_CLOSE?


I think this makes more sense, yes. I am just not sure if the socket 
should also be write ready in case of a socket error. Looking at the 
description of EINPROGRESS on the man page of connect [1], it seems like 
writability is given regardless of the connection being successful or 
not, but as soon as the connection attempt is no longer pending. For 
successful connections FD_WRITE will be given already, so we will only 
need to set it for failed connections regardless of a socket error in 
wsock_events->connect_errorcode. Therefore I suggest to move the line 
setting FD_WRITE [2] one level up outside of the else branch.


> diff --git a/winsup/cygwin/fhandler_socket_inet.cc b/winsup/cygwin/fhandler_socket_inet.cc
> index e5b0d2d1443e..b64d96225db1 100644
> --- a/winsup/cygwin/fhandler_socket_inet.cc
> +++ b/winsup/cygwin/fhandler_socket_inet.cc
> @@ -354,7 +354,12 @@ fhandler_socket_wsock::evaluate_events (const long event_mask, long &events,
>   	    }
>   	  else
>   	    wsock_events->events |= FD_WRITE;
> -	  wsock_events->events &= ~FD_CONNECT;
> +	  /* Since FD_CONNECT is only given once, we have to keep FD_CONNECT
> +	     for connection failed sockets to have consistent behaviour in
> +	     programs calling poll/select multiple times.  Example test to
> +	     non-listening port: curl -v 127.0.0.1:47 */
> +	  if (connect_state () != connect_failed)
> +	    wsock_events->events &= ~FD_CONNECT;
>   	  wsock_events->connect_errorcode = 0;
>   	}
>         /* This test makes accept/connect behave as on Linux when accept/connect
> @@ -376,12 +381,6 @@ fhandler_socket_wsock::evaluate_events (const long event_mask, long &events,
>         if (erase)
>   	wsock_events->events &= ~(events & ~(FD_WRITE | FD_CLOSE));
>       }
> -  /* Since FD_CONNECT is only given once, we manually need to set
> -     FD_WRITE for connection failed sockets to have consistent
> -     behaviour in programs calling poll/select multiple times.
> -     Example test to non-listening port: curl -v 127.0.0.1:47 */
> -  if ((connect_state () == connect_failed) && (event_mask & FD_WRITE))
> -    wsock_events->events |= FD_WRITE;
>     UNLOCK_EVENTS;
>   
>     return ret;
>
> What do you think?


I already tested your diff successfully, so this could be an alternative 
approach to the issue. I just think the wsock_events->connect_errorcode 
should also only be reset if FD_CONNECT is removed, right? So the if 
branch would need to be extended to include the second line [3] as well.


Everything together, I think our suggestions together would look like this:


diff --git a/winsup/cygwin/fhandler_socket_inet.cc 
b/winsup/cygwin/fhandler_socket_inet.cc
index e5b0d2d14..84cd63698 100644
--- a/winsup/cygwin/fhandler_socket_inet.cc
+++ b/winsup/cygwin/fhandler_socket_inet.cc
@@ -352,10 +352,15 @@ fhandler_socket_wsock::evaluate_events (const long 
event_mask, long &events,
               WSASetLastError (wsa_err);
               ret = SOCKET_ERROR;
             }
-         else
-           wsock_events->events |= FD_WRITE;
-         wsock_events->events &= ~FD_CONNECT;
-         wsock_events->connect_errorcode = 0;
+         wsock_events->events |= FD_WRITE;
+         /* Since FD_CONNECT is only given once, we have to keep FD_CONNECT
+            for connection failed sockets to have consistent behaviour
+            programs calling poll/select multiple times.  Example test to
+            non-listening port: curl -v 127.0.0.1:47 */
+         if (connect_state () != connect_failed) {
+           wsock_events->events &= ~FD_CONNECT;
+           wsock_events->connect_errorcode = 0;
+         }
         }
        /* This test makes accept/connect behave as on Linux when 
accept/connect
           is called on a socket for which shutdown has been called.  
The second
@@ -376,12 +381,6 @@ fhandler_socket_wsock::evaluate_events (const long 
event_mask, long &events,
        if (erase)
         wsock_events->events &= ~(events & ~(FD_WRITE | FD_CLOSE));
      }
-  /* Since FD_CONNECT is only given once, we manually need to set
-     FD_WRITE for connection failed sockets to have consistent
-     behaviour in programs calling poll/select multiple times.
-     Example test to non-listening port: curl -v 127.0.0.1:47 */
-  if ((connect_state () == connect_failed) && (event_mask & FD_WRITE))
-    wsock_events->events |= FD_WRITE;
    UNLOCK_EVENTS;

    return ret;



Best regards,
Marc

[1] https://www.man7.org/linux/man-pages/man2/connect.2.html#ERRORS

[2] 
https://sourceware.org/git/?p=newlib-cygwin.git;a=blob;f=winsup/cygwin/fhandler_socket_inet.cc;h=e5b0d2d1443ecc4430104f6cfb78bf580a8116e5;hb=aa86784937ec7868c358dd90ea5e5324f0be750d#l356

[3] 
https://sourceware.org/git/?p=newlib-cygwin.git;a=blob;f=winsup/cygwin/fhandler_socket_inet.cc;h=e5b0d2d1443ecc4430104f6cfb78bf580a8116e5;hb=aa86784937ec7868c358dd90ea5e5324f0be750d#l358


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Fix poll/select signal socket as write ready on connect failure
  2020-07-17 18:56   ` Marc Hoersken
@ 2020-07-17 19:21     ` Corinna Vinschen
  2020-07-17 19:33       ` Marc Hoersken
  0 siblings, 1 reply; 6+ messages in thread
From: Corinna Vinschen @ 2020-07-17 19:21 UTC (permalink / raw)
  To: Marc Hoersken; +Cc: cygwin

Hi Marc,

On Jul 17 20:56, Marc Hoersken via Cygwin wrote:
> Hi Corinna,
> 
> Am 16.07.2020 um 11:25 schrieb Corinna Vinschen:
> [...]
> > So, rather than setting FD_WRITE at some later point in the code, what
> > about handling this where the other FD_CONNECT stuff is handled, by
> > not erasing the FD_CONNECT bit, just like with FD_CLOSE?
> 
> I think this makes more sense, yes. I am just not sure if the socket should
> also be write ready in case of a socket error. Looking at the description of
> EINPROGRESS on the man page of connect [1], it seems like writability is
> given regardless of the connection being successful or not, but as soon as
> the connection attempt is no longer pending. For successful connections
> FD_WRITE will be given already, so we will only need to set it for failed
> connections regardless of a socket error in wsock_events->connect_errorcode.
> Therefore I suggest to move the line setting FD_WRITE [2] one level up
> outside of the else branch.

Sounds right to me.

> [...]
> I already tested your diff successfully, so this could be an alternative
> approach to the issue. I just think the wsock_events->connect_errorcode
> should also only be reset if FD_CONNECT is removed, right? So the if branch
> would need to be extended to include the second line [3] as well.

I don't agree here.  The sole purpose for connect_errorcode is to set
SOL_SOCKET/SO_ERROR in case a caller requests FD_CONNECT and FD_CONNECT
is available.  After being set once, SOL_SOCKET/SO_ERROR should not be
rewritten, given the description of SO_ERROR in `man 7 socket':

    SO_ERROR
	  Get and clear the pending socket error.  This socket  option  is
	      ^^^^^^^^^
	  read-only.  Expects an integer.

Therefore I'm inclined to push this:

diff --git a/winsup/cygwin/fhandler_socket_inet.cc b/winsup/cygwin/fhandler_socket_inet.cc
index e5b0d2d1443e..2b50671e533d 100644
--- a/winsup/cygwin/fhandler_socket_inet.cc
+++ b/winsup/cygwin/fhandler_socket_inet.cc
@@ -352,9 +352,13 @@ fhandler_socket_wsock::evaluate_events (const long event_mask, long &events,
 	      WSASetLastError (wsa_err);
 	      ret = SOCKET_ERROR;
 	    }
-	  else
-	    wsock_events->events |= FD_WRITE;
-	  wsock_events->events &= ~FD_CONNECT;
+	  /* Since FD_CONNECT is only given once, we have to keep FD_CONNECT
+	     for connection failed sockets to have consistent behaviour in
+	     programs calling poll/select multiple times.  Example test to
+	     non-listening port: curl -v 127.0.0.1:47 */
+	  if (connect_state () != connect_failed)
+	    wsock_events->events &= ~FD_CONNECT;
+	  wsock_events->events |= FD_WRITE;
 	  wsock_events->connect_errorcode = 0;
 	}
       /* This test makes accept/connect behave as on Linux when accept/connect
@@ -376,12 +380,6 @@ fhandler_socket_wsock::evaluate_events (const long event_mask, long &events,
       if (erase)
 	wsock_events->events &= ~(events & ~(FD_WRITE | FD_CLOSE));
     }
-  /* Since FD_CONNECT is only given once, we manually need to set
-     FD_WRITE for connection failed sockets to have consistent
-     behaviour in programs calling poll/select multiple times.
-     Example test to non-listening port: curl -v 127.0.0.1:47 */
-  if ((connect_state () == connect_failed) && (event_mask & FD_WRITE))
-    wsock_events->events |= FD_WRITE;
   UNLOCK_EVENTS;
 
   return ret;


Make sense?


Thanks,
Corinna

-- 
Corinna Vinschen
Cygwin Maintainer

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Fix poll/select signal socket as write ready on connect failure
  2020-07-17 19:21     ` Corinna Vinschen
@ 2020-07-17 19:33       ` Marc Hoersken
  2020-07-20  7:55         ` Corinna Vinschen
  0 siblings, 1 reply; 6+ messages in thread
From: Marc Hoersken @ 2020-07-17 19:33 UTC (permalink / raw)
  To: cygwin

Hi Corinna,

Am 17.07.2020 um 21:21 schrieb Corinna Vinschen:
> I don't agree here.  The sole purpose for connect_errorcode is to set
> SOL_SOCKET/SO_ERROR in case a caller requests FD_CONNECT and FD_CONNECT
> is available.  After being set once, SOL_SOCKET/SO_ERROR should not be
> rewritten, given the description of SO_ERROR in `man 7 socket':
>
>      SO_ERROR
> 	  Get and clear the pending socket error.  This socket  option  is
> 	      ^^^^^^^^^
> 	  read-only.  Expects an integer.
>
> [...]
>
> Make sense?


yes, this makes sense. Please go for it.

Is there a public changelog I can check regulary to see if this has been 
released (once it is)? Thanks!

Best regards,
Marc


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Fix poll/select signal socket as write ready on connect failure
  2020-07-17 19:33       ` Marc Hoersken
@ 2020-07-20  7:55         ` Corinna Vinschen
  0 siblings, 0 replies; 6+ messages in thread
From: Corinna Vinschen @ 2020-07-20  7:55 UTC (permalink / raw)
  To: Marc Hoersken; +Cc: cygwin

On Jul 17 21:33, Marc Hoersken via Cygwin wrote:
> Hi Corinna,
> 
> Am 17.07.2020 um 21:21 schrieb Corinna Vinschen:
> > I don't agree here.  The sole purpose for connect_errorcode is to set
> > SOL_SOCKET/SO_ERROR in case a caller requests FD_CONNECT and FD_CONNECT
> > is available.  After being set once, SOL_SOCKET/SO_ERROR should not be
> > rewritten, given the description of SO_ERROR in `man 7 socket':
> > 
> >      SO_ERROR
> > 	  Get and clear the pending socket error.  This socket  option  is
> > 	      ^^^^^^^^^
> > 	  read-only.  Expects an integer.
> > 
> > [...]
> > 
> > Make sense?
> 
> 
> yes, this makes sense. Please go for it.

Great, done!

> Is there a public changelog I can check regulary to see if this has been
> released (once it is)? Thanks!

git log?  Or do you mean official Cygwin releases?  The only public
changelog is the announcement on cygwin-announce in this case.


Thanks,
Corinna

-- 
Corinna Vinschen
Cygwin Maintainer

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2020-07-20  7:56 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-15 18:54 [PATCH] Fix poll/select signal socket as write ready on connect failure Marc Hoersken
2020-07-16  9:25 ` Corinna Vinschen
2020-07-17 18:56   ` Marc Hoersken
2020-07-17 19:21     ` Corinna Vinschen
2020-07-17 19:33       ` Marc Hoersken
2020-07-20  7:55         ` Corinna Vinschen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).