public inbox for cygwin@cygwin.com
* Cygwin socket option SO_REUSEADDR operates unlike Linux
@ 2018-01-13  8:37 Mark Geisert
  2018-01-13 13:51 ` Corinna Vinschen
  0 siblings, 1 reply; 5+ messages in thread
From: Mark Geisert @ 2018-01-13  8:37 UTC (permalink / raw)
  To: cygwin

This report is based on a series of recent list emails with Subject: lines
"RPC clnt_create() adress already in use" which date back to last 
September but are unfortunately not chained together...  They contain a 
discussion I've been having with OP Raimund Paulus.

I believe I've distilled the issue(s) down as far as I can.  A
self-contained STC (simple test case) is included at the end of this email.

On the latest 64-bit Cygwin, running the STC shows:

~ netstat -an|grep :111
   TCP    0.0.0.0:111            0.0.0.0:0              LISTENING
   TCP    [::]:111               [::]:0                 LISTENING
   UDP    0.0.0.0:111            *:*
   UDP    [::]:111               *:*

~ ./bindtest
1st socket is 3
1st bind OK
1st connect OK
2nd socket is 3
2nd bind OK
2nd connect: Address already in use

~ ./bindtest
1st socket is 3
1st bind OK
1st connect: Address already in use

On Fedora 27, running the same STC shows:

[mark@lux ~]$ netstat -an|grep :111
tcp        0      0 0.0.0.0:111         0.0.0.0:*        LISTEN
tcp6       0      0 :::111              :::*             LISTEN
udp        0      0 0.0.0.0:111         0.0.0.0:*
udp6       0      0 :::111              :::*
[mark@lux ~]$ ./bindtest
1st socket is 3
1st bind OK
1st connect OK
2nd socket is 3
2nd bind OK
2nd connect OK
[mark@lux ~]$ ./bindtest
1st socket is 3
1st bind OK
1st connect OK
2nd socket is 3
2nd bind OK
2nd connect OK

The STC source code is given below.  It assumes you're running rpcbind on 
the local machine at TCP port 111.  Remember to abort rpcbind after 
testing if you don't need it running for other RPC services.

Two issues are visible.  (1) The 2nd connect attempt elicits an EADDRINUSE 
on Cygwin even though SO_REUSEADDR has been set.  Fedora allows the 2nd 
connect to succeed.  (2) The EADDRINUSE is being reported by connect(), 
not by the preceding bind(), which is where one usually sees it.

I've spent some time inside Cygwin's net.cc and fhandler_socket.cc and see 
how Cygwin deals with Winsock's peculiar notion of SO_REUSEADDR.  Between 
that unfortunate but necessary workaround and what I see on MSDN now, at
https://msdn.microsoft.com/en-us/library/windows/desktop/ms740621(v=vs.85).aspx 
I can't help wondering if the workaround has maybe stopped working.  The 
web page's section "Enhanced Socket Security" is what made me wonder this.

Here's the STC, a single source file bindtest.c to be compiled with
     gcc -g -Wall -o bindtest bindtest.c

Thanks,

..mark

--------8<--------
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define LCLADDR "127.0.0.1"
#define LCLPORT 8888
#define RMTADDR LCLADDR
#define RMTPORT 111

/* If VALUE is -1, print STRING plus the errno text and exit. */
void
checkresult (int value, const char *string)
{
     if (value == -1) {
         perror (string);
         exit (1);
     }
}

int
main (int argc, char **argv)
{
     const int one = 1;
     int res;
     int sock;
     struct sockaddr lcladdr, rmtaddr;
     struct sockaddr_in addr;
     socklen_t alen = sizeof (addr);

     memset (&addr, 0, alen);
     memset (&rmtaddr, 0, sizeof (rmtaddr));
     memset (&lcladdr, 0, sizeof (lcladdr));

     addr.sin_family = AF_INET;
     addr.sin_port = htons (RMTPORT);
     addr.sin_addr.s_addr = inet_addr (RMTADDR);
     memcpy (&rmtaddr, &addr, alen);

     addr.sin_port = htons (LCLPORT);
     addr.sin_addr.s_addr = inet_addr (LCLADDR);
     memcpy (&lcladdr, &addr, alen);

//  FIRST CONNECTION ATTEMPT
     sock = socket (AF_INET, SOCK_STREAM, 0);
     checkresult (sock, "1st socket");
     fprintf (stderr, "1st socket is %d\n", sock);

     res = setsockopt (sock, SOL_SOCKET, SO_REUSEADDR, &one, sizeof (one));
     checkresult (res, "1st setsockopt");

     res = bind (sock, &lcladdr, alen);
     checkresult (res, "1st bind");
     fprintf (stderr, "1st bind OK\n");

     res = connect (sock, &rmtaddr, alen);
     checkresult (res, "1st connect");
     fprintf (stderr, "1st connect OK\n");

     res = close (sock);
     checkresult (res, "1st close");

//  SECOND CONNECTION ATTEMPT
     sock = socket (AF_INET, SOCK_STREAM, 0);
     checkresult (sock, "2nd socket");
     fprintf (stderr, "2nd socket is %d\n", sock);

     res = setsockopt (sock, SOL_SOCKET, SO_REUSEADDR, &one, sizeof (one));
     checkresult (res, "2nd setsockopt");

     res = bind (sock, &lcladdr, alen);
     checkresult (res, "2nd bind");
     fprintf (stderr, "2nd bind OK\n");

     res = connect (sock, &rmtaddr, alen);
     checkresult (res, "2nd connect");
     fprintf (stderr, "2nd connect OK\n");

     res = close (sock);
     checkresult (res, "2nd close");

     return 0;
}

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple


* Re: Cygwin socket option SO_REUSEADDR operates unlike Linux
  2018-01-13  8:37 Cygwin socket option SO_REUSEADDR operates unlike Linux Mark Geisert
@ 2018-01-13 13:51 ` Corinna Vinschen
  2018-01-13 21:39   ` Mark Geisert
  0 siblings, 1 reply; 5+ messages in thread
From: Corinna Vinschen @ 2018-01-13 13:51 UTC (permalink / raw)
  To: cygwin

On Jan 13 00:36, Mark Geisert wrote:
> [...]
> On Fedora 27, running the same STC shows:
> [...]
> [mark@lux ~]$ ./bindtest
> 1st socket is 3
> 1st bind OK
> 1st connect OK
> 2nd socket is 3
> 2nd bind OK
> 2nd connect OK

I can't reproduce this:

$ uname -sr
Linux 4.14.13-300.fc27.x86_64
$ netstat -an|grep :111
tcp        0      0 0.0.0.0:111             0.0.0.0:*               LISTEN     
tcp6       0      0 :::111                  :::*                    LISTEN     
udp        0      0 0.0.0.0:111             0.0.0.0:*                          
udp6       0      0 :::111                  :::*                               
$ ./bindtest
1st socket is 3
1st bind OK
1st connect OK
2nd socket is 3
2nd bind OK
2nd connect: Cannot assign requested address

I tried this a couple of times, even as root, just to be sure, but the
result is invariably "2nd connect: Cannot assign requested address".

The error message is different from Cygwin, but the overall behaviour is
the same for me, and it matches the comment I wrote in cygwin_setsockopt
back in 2009 and 2011.

I'm very puzzled that it works for you.  As I wrote in my comment, a
complete duplicate of a local TCP address is not allowed, regardless of
SO_REUSEADDR.

If I may quote Mr. Network himself, the late W. R. Stevens, "UNIX
Network Programming, Networking APIs: Sockets and XTI", Volume 1, 2nd
Edition.  Section 7.5:

  "With TCP we are never able to start multiple servers that bind the
   same IP address and the same port: a 'complete duplicate binding'.
   That is, we cannot start one server that binds 198.69.10.2 port 80
   and start another that also binds 198.69.10.2 port 80, even if we set
   the SO_REUSEADDR socket option for the second server."
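
The "complete duplicate binding" rule quoted above can be sketched in C
roughly as follows.  This is only a minimal illustration under stated
assumptions, not code taken from Cygwin or from Stevens: the helper names
bound_socket and duplicate_bind_errno are hypothetical, 127.0.0.1 with a
kernel-chosen port stands in for 198.69.10.2 port 80, and the first socket
is put into the listening state before the second bind is attempted.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create a TCP socket with SO_REUSEADDR set and bind
   it to 127.0.0.1:port (port 0 lets the kernel choose a free port).
   Returns the descriptor, or -1 with errno set. */
static int
bound_socket (unsigned short port)
{
    const int one = 1;
    struct sockaddr_in addr;
    int sock = socket (AF_INET, SOCK_STREAM, 0);

    if (sock == -1)
        return -1;
    memset (&addr, 0, sizeof (addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons (port);
    addr.sin_addr.s_addr = htonl (INADDR_LOOPBACK);
    if (setsockopt (sock, SOL_SOCKET, SO_REUSEADDR, &one, sizeof (one)) == -1
        || bind (sock, (struct sockaddr *) &addr, sizeof (addr)) == -1) {
        int saved = errno;
        close (sock);
        errno = saved;
        return -1;
    }
    return sock;
}

/* Hypothetical helper: start one listening "server", then try to bind a
   second socket to exactly the same address and port.  Returns the errno
   of the second bind (0 if it succeeds), or -1 if the setup failed. */
int
duplicate_bind_errno (void)
{
    struct sockaddr_in addr;
    socklen_t alen = sizeof (addr);
    int first, second, err = 0;

    first = bound_socket (0);
    if (first == -1)
        return -1;
    if (listen (first, 1) == -1
        || getsockname (first, (struct sockaddr *) &addr, &alen) == -1) {
        close (first);
        return -1;
    }

    /* Complete duplicate: same IP address, same port, SO_REUSEADDR set. */
    second = bound_socket (ntohs (addr.sin_port));
    if (second == -1)
        err = errno;
    else
        close (second);
    close (first);
    return err;
}
```

On a stack that follows the rule, duplicate_bind_errno () should return
EADDRINUSE even though both sockets set SO_REUSEADDR.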


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat


* Re: Cygwin socket option SO_REUSEADDR operates unlike Linux
  2018-01-13 13:51 ` Corinna Vinschen
@ 2018-01-13 21:39   ` Mark Geisert
  2018-01-15 20:03     ` Corinna Vinschen
  0 siblings, 1 reply; 5+ messages in thread
From: Mark Geisert @ 2018-01-13 21:39 UTC (permalink / raw)
  To: cygwin

Corinna Vinschen wrote:
> [...]
> I'm very puzzled that it works for you.  As I wrote in my comment, a
> complete duplicate of a local TCP address is not allowed, regardless of
> SO_REUSEADDR.
> [...]

Rats.  I'll have to investigate a couple of directions, deeper.  It makes sense 
that connect() returns EADDRINUSE rather than bind() because only connect() 
knows about all 5 parts of the 5-tuple.  Stevens is/was the definitive network 
software guy.  Miss him.  Most accounts I've found deal with SO_REUSEADDR on the 
server side, not the client side, so my intuition is a bit faulty.
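
The point about the 5-tuple can be illustrated with a small C sketch.  It is
an assumption-laden illustration rather than anything from the STC: the
function name peer_known is hypothetical, a throwaway listener on 127.0.0.1
replaces rpcbind, and both ports are chosen by the kernel.  After bind()
only the protocol, local address and local port are fixed, so getpeername()
has nothing to report; the remote half exists only once connect() has run.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper.  Returns 1 if the bound client socket has a peer
   address, 0 if getpeername() fails with ENOTCONN, -1 on any other error. */
int
peer_known (int do_connect)
{
    struct sockaddr_in srv, lcl, peer;
    socklen_t len;
    int lsock, csock, result = -1;

    /* Throwaway listener on 127.0.0.1; port 0 lets the kernel pick. */
    memset (&srv, 0, sizeof (srv));
    srv.sin_family = AF_INET;
    srv.sin_addr.s_addr = htonl (INADDR_LOOPBACK);
    lsock = socket (AF_INET, SOCK_STREAM, 0);
    if (lsock == -1)
        return -1;
    len = sizeof (srv);
    if (bind (lsock, (struct sockaddr *) &srv, sizeof (srv)) == -1
        || listen (lsock, 1) == -1
        || getsockname (lsock, (struct sockaddr *) &srv, &len) == -1)
        goto out_listener;

    /* Client: bind() fixes only protocol, local address and local port. */
    memset (&lcl, 0, sizeof (lcl));
    lcl.sin_family = AF_INET;
    lcl.sin_addr.s_addr = htonl (INADDR_LOOPBACK);
    csock = socket (AF_INET, SOCK_STREAM, 0);
    if (csock == -1)
        goto out_listener;
    if (bind (csock, (struct sockaddr *) &lcl, sizeof (lcl)) == -1)
        goto out_client;

    /* connect() supplies the remote address and port, completing the tuple. */
    if (do_connect
        && connect (csock, (struct sockaddr *) &srv, sizeof (srv)) == -1)
        goto out_client;

    len = sizeof (peer);
    if (getpeername (csock, (struct sockaddr *) &peer, &len) == 0)
        result = 1;
    else if (errno == ENOTCONN)
        result = 0;

out_client:
    close (csock);
out_listener:
    close (lsock);
    return result;
}
```

Since bind() never sees the remote address, a clash of the complete tuple
can only be detected at connect() time, which fits the error showing up
there.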

Thanks for your time, Corinna.  Raimund: I'll have to do some more digging when 
I get back to a keyboard in a week or so.  Sorry for the delay on this.  There 
might be some issue inside libtirpc where it botches error returns from the 
kernel.  Or something.

Thanks all,

..mark



* Re: Cygwin socket option SO_REUSEADDR operates unlike Linux
  2018-01-13 21:39   ` Mark Geisert
@ 2018-01-15 20:03     ` Corinna Vinschen
  2018-01-16 15:42       ` Corinna Vinschen
  0 siblings, 1 reply; 5+ messages in thread
From: Corinna Vinschen @ 2018-01-15 20:03 UTC (permalink / raw)
  To: cygwin

On Jan 13 13:39, Mark Geisert wrote:
> Corinna Vinschen wrote:
> > [...]
> Rats.  I'll have to investigate a couple of directions, deeper.  It makes
> sense that connect() returns EADDRINUSE rather than bind() [...]

After some more digging it turns out that both of the above observations
on Linux are correct.  I can reproduce the 2nd connect succeeding by
simply adding a `sleep(1)' after the first close.  So it turns out that
Linux has a timing issue at socket cleanup which can be alleviated
by an extra sleep.  I opened a case about this issue.  EADDRNOTAVAIL
sounds a bit weird in this scenario, but it's kind of ok.
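
A minimal sketch of that experiment, for anyone who wants to try it without
rpcbind.  It is not the STC itself and rests on several assumptions: the
names reuse_bound_socket and second_connect_errno are hypothetical, a local
listener on 127.0.0.1 replaces rpcbind, the kernel chooses the ports, and
the "server" side is driven from the same thread.  The client closes first,
so the client end is the one left in TIME_WAIT; the second attempt then
reuses exactly the same local and remote address and port, optionally after
a delay.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: TCP socket with SO_REUSEADDR set, bound to
   127.0.0.1:port (port 0 lets the kernel choose).  -1 on failure. */
static int
reuse_bound_socket (unsigned short port)
{
    const int one = 1;
    struct sockaddr_in addr;
    int sock = socket (AF_INET, SOCK_STREAM, 0);

    if (sock == -1)
        return -1;
    memset (&addr, 0, sizeof (addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons (port);
    addr.sin_addr.s_addr = htonl (INADDR_LOOPBACK);
    if (setsockopt (sock, SOL_SOCKET, SO_REUSEADDR, &one, sizeof (one)) == -1
        || bind (sock, (struct sockaddr *) &addr, sizeof (addr)) == -1) {
        int saved = errno;
        close (sock);
        errno = saved;
        return -1;
    }
    return sock;
}

/* Hypothetical helper.  Returns 0 if the second bind and connect succeed,
   the errno of the failing call otherwise, or -1 if the setup (listener
   or first attempt) failed. */
int
second_connect_errno (unsigned int delay)
{
    struct sockaddr_in srv, lcl;
    socklen_t len;
    char byte;
    int lsock, csock, conn, err = -1;

    lsock = reuse_bound_socket (0);
    if (lsock == -1)
        return -1;
    len = sizeof (srv);
    if (listen (lsock, 1) == -1
        || getsockname (lsock, (struct sockaddr *) &srv, &len) == -1)
        goto out;

    /* First attempt: remember the local port, connect, accept. */
    csock = reuse_bound_socket (0);
    if (csock == -1)
        goto out;
    len = sizeof (lcl);
    if (getsockname (csock, (struct sockaddr *) &lcl, &len) == -1
        || connect (csock, (struct sockaddr *) &srv, sizeof (srv)) == -1
        || (conn = accept (lsock, NULL, NULL)) == -1) {
        close (csock);
        goto out;
    }
    close (csock);                      /* client closes first */
    if (read (conn, &byte, 1) == -1) {  /* 0 once the client's FIN arrives */
        close (conn);
        goto out;
    }
    close (conn);

    if (delay > 0)
        sleep (delay);                  /* the extra sleep after the close */

    /* Second attempt: same local and remote address and port. */
    csock = reuse_bound_socket (ntohs (lcl.sin_port));
    if (csock == -1) {
        err = errno;
        goto out;
    }
    err = connect (csock, (struct sockaddr *) &srv, sizeof (srv)) == -1
          ? errno : 0;
    close (csock);
out:
    close (lsock);
    return err;
}
```

Calling second_connect_errno (0) and second_connect_errno (1) from a small
main () and printing strerror () of each result shows whether the local
stack has the timing dependence; the outcome may well differ between
platforms and kernel settings.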

In terms of Cygwin, the EADDRINUSE is a completely different matter.

It turns out that the second connect fails because the first socket
connection is in TIME_WAIT state.  This is not exactly correct in POSIX
terms.  A connection in TIME_WAIT should not prevent a new socket from
reusing the same local address.  That's what we observe on Linux (apart
from the timing issue).

But here's the problem:  Regardless of whether we actually use
SO_REUSEADDR, Windows sockets apparently do not allow a subsequent
connect to succeed while the first socket is still in TIME_WAIT.  I
tweaked Cygwin to enforce SO_REUSEADDR before bind, but connect still
fails with EADDRINUSE as long as the first socket is in TIME_WAIT.

It seems the code path for listen/accept is different here compared to
connect.  Given that SO_REUSEADDR seems to cover mostly server-side
scenarios, and given that I don't see this scenario discussed at all
in Stevens' book, I wonder if bind/connect is a bit of a grey area.

Either way, the bottom line is that this is a WinSock restriction,
apparently.  As of today, I don't see any way around that.


Corinna


* Re: Cygwin socket option SO_REUSEADDR operates unlike Linux
  2018-01-15 20:03     ` Corinna Vinschen
@ 2018-01-16 15:42       ` Corinna Vinschen
  0 siblings, 0 replies; 5+ messages in thread
From: Corinna Vinschen @ 2018-01-16 15:42 UTC (permalink / raw)
  To: cygwin

On Jan 15 21:02, Corinna Vinschen wrote:
> [...]
> Either way, the bottom line is that this is a WinSock restriction,
> apparently.  As of today, I don't see any way around that.

For completeness' sake I converted your testcase into a WinSock-only
executable, built with Mingw-w64, and the problem persists, on Windows 7
as well as on Windows 10.


Corinna
