public inbox for cygwin@cygwin.com
* Spurious / persistent "exception" condition in half-closed sockets
@ 2022-07-09 15:37 Lavrentiev, Anton (NIH/NLM/NCBI) [C]
  2022-07-09 21:05 ` Ken Brown
  0 siblings, 1 reply; 5+ messages in thread
From: Lavrentiev, Anton (NIH/NLM/NCBI) [C] @ 2022-07-09 15:37 UTC (permalink / raw)
  To: 'cygwin@cygwin.com'

Hi all,

It took me a while to figure this one out, but I think I have a good test case to
demonstrate a (rather serious, actually) issue with Cygwin sockets and select/poll.

In short, when the reading end of a socket half-closes for write (basically, signaling
to the other end that there is no more data to expect, which results in a TCP FIN and,
subsequently, EOF in the other end's read()), and that end keeps reading the remaining
incoming data, it is faced with a lot of "exception" conditions, which are just spurious.
That also burns CPU instead of doing a proper wait: a read() that failed with EAGAIN, and
was then waited for with select() (or poll() -- the same issue), is attempted again
immediately (because the socket is reported as "ready" with an "exception") and results
in another EAGAIN, and so on.  Eventually, successful reads do get squeezed in between,
though.
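
For reference, the half-close semantics described above can be sketched in
isolation.  This is only an illustration, not part of the test case: the helper
name half_close_demo() is hypothetical, and a local socketpair() is assumed to
behave like the TCP connection as far as shutdown() is concerned.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: half-close one end of a connected stream-socket pair
 * and check what each end observes.  Returns 0 when done, -1 if the pair
 * could not be set up. */
static int half_close_demo(void)
{
    int sv[2];
    char buf[16];
    ssize_t n;

    /* Assumption: a local socketpair stands in for the TCP connection. */
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;

    /* sv[0] is done sending: half-close its write direction. */
    if (shutdown(sv[0], SHUT_WR) != 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    /* The other end's read() now reports EOF (0 bytes)... */
    n = read(sv[1], buf, sizeof(buf));
    assert(n == 0);

    /* ...but it can still send, and the half-closed end can still receive. */
    n = write(sv[1], "data", 4);
    assert(n == 4);
    n = read(sv[0], buf, sizeof(buf));
    assert(n == 4 && memcmp(buf, "data", 4) == 0);

    close(sv[0]);
    close(sv[1]);
    return 0;
}
```

The point is that shutdown(SHUT_WR) only ends one direction; the half-closed
end remains a perfectly ordinary reader.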

Here's the "client" code ("server" code follows):

$ cat client.c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <arpa/inet.h>
#include <netinet/in.h>


static void error(const char* what)
{
    fflush(stdout);
    perror(what);
    exit(1);
}


int main(int argc, const char* argv[])
{
    struct sockaddr_in sin;
    size_t total = 0;
    int c = socket(AF_INET, SOCK_STREAM, 0);

    if (c == -1)
        error("socket");

    memset(&sin, 0, sizeof(sin));
    sin.sin_family      = AF_INET;
    sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    sin.sin_port        = htons(atoi(argv[1]));

    if (connect(c, (struct sockaddr*) &sin, (socklen_t) sizeof(sin)) != 0)
        error("connect");
    if (fcntl(c, F_SETFL, fcntl(c, F_GETFL, 0) | O_NONBLOCK) == -1)
        error("fcntl");
#ifdef BUG
    if (shutdown(c, SHUT_WR) != 0)
        error("shutdown");
#endif

    for(;;) {
        char buf[1000];
        ssize_t n = read(c, buf, sizeof(buf));

        if (n > 0) {
            printf("%zd byte%s received from server\n", n, &"s"[n==1]);
            total += n;
            continue;
        }
        if (n == 0) {
            printf("Connection closed, %zu byte%s received\n",
                   total, &"s"[total==1]);
            break;
        }
        if (errno != EAGAIN  &&  errno != EWOULDBLOCK)
            error("read");
        fflush(stdout);
        perror("read");
        for (;;) {
            fd_set rfds, efds;
            struct timeval tv;
            int m;

            FD_ZERO(&rfds);
            FD_ZERO(&efds);
            FD_SET(c, &rfds);
            FD_SET(c, &efds);
            memset(&tv, 0, sizeof(tv));
            tv.tv_sec = 2;

            printf("Waiting...\n");
            m = select(c + 1, &rfds, 0/*wfds*/, &efds, &tv);
            if (!m)
                continue;
            if (m < 0)
                error("select");
            if (FD_ISSET(c, &efds)) {
                printf("Exception??\n");
                break;
            }
            if (FD_ISSET(c, &rfds)) {
                printf("Read-ready!\n");
                break;
            }
            error("select bug");
            abort();
        }
    }
    close(c);
    printf("Bye-bye\n");
    return 0;
}

$ cat server.c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <arpa/inet.h>
#include <netinet/in.h>


static void error(const char* what)
{
    fflush(stdout);
    perror(what);
    exit(1);
}


int main(int argc, const char* argv[])
{
    struct sockaddr_in sin;
    int s = socket(AF_INET, SOCK_STREAM, 0);

    if (s == -1)
        error("socket");

    memset(&sin, 0, sizeof(sin));
    sin.sin_family      = AF_INET;
    sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    sin.sin_port        = htons(atoi(argv[1]));

    if (bind(s, (struct sockaddr*) &sin, (socklen_t) sizeof(sin)) != 0)
        error("bind");
    if (listen(s, 1) != 0)
        error("listen");

    for(;;) {
        size_t total = 0;
        socklen_t sinlen = sizeof(sin);
        int c = accept(s, (struct sockaddr*) &sin, &sinlen);
        if (c < 0)
            error("accept");
        printf("Client accepted\n");

        for (;;) {
            char buf[1000];
            size_t len = rand() & 255;
            ssize_t n = write(c, buf, len ? len : 1);
            if (n <= 0)
                error("write");
            total += n;
            printf("%zd byte%s sent to client\n", n, &"s"[n==1]);
            if (rand() & 1)
                usleep(100);
            if (!(rand() % 11)) {
                printf("Closing connection, %zu byte%s sent\n",
                       total, &"s"[total==1]);
                break;
            }
        }
        close(c);
    }
}

$ cc -Wall -o server server.c
$ cc -Wall -o client client.c

Start the server (which just sends random garbage to the client, once it's accepted)
in a separate Cygwin terminal:

$ ./server 5555

Now run the client from another Cygwin terminal:

$ ./client 5555

You should see the client connecting and receiving (maybe sometimes waiting)
but never having a blank read (EAGAIN) after a successful select() (that was
read-ready).  Try running the client a few times to see how it works.

Now, since the client is not sending anything (or is done sending, in a real-world
scenario), it would want to notify the server that it's only going to receive
(by issuing a shutdown() call).  Recompile the client with -DBUG enabled:

$ cc -Wall -DBUG -o client client.c
$ ./client 5555

When you start the client again, you'll see a ton of Exceptions: all the waits
(select(), but poll() behaves exactly the same way, I checked) return immediately, and
the client keeps spinning around the read()s -- most of them blank, with EAGAIN.
In the end, the client does get everything sent to it, but at the cost of A LOT of
unnecessary CPU cycles.  Now suppose that you have thousands of such clients:
that would create a lot of unnecessary contention.  Try running the client a few
times to see how disastrous those blank reads can be in numbers!

That's not correct behavior -- you can check by running the same code on Linux
(or BSD -- Mac).  I don't think there's any "exception" on the socket to begin
with.  Also, it looks like the condition simply gets stuck and is not properly
re-evaluated (even though the I/O is still progressing).
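
To put a number on those blank reads, here is a rough sketch of a poll()-based
variant of the client's read loop.  The names wait_readable() and
drain_counting() are hypothetical, and it assumes that POLLPRI is how poll()
reports what select() reports through its exception set; treat it as an
illustration rather than as the exact code I ran.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <sys/types.h>
#include <unistd.h>

enum { WAIT_ERROR = -1, WAIT_TIMEOUT = 0, WAIT_READABLE = 1, WAIT_EXCEPTION = 2 };

/* Hypothetical helper: wait up to timeout_ms for fd to become readable.
 * Assumption: POLLPRI corresponds to select()'s "exception" set. */
static int wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd;
    int m;

    assert(fd >= 0);
    pfd.fd      = fd;
    pfd.events  = POLLIN | POLLPRI;
    pfd.revents = 0;

    m = poll(&pfd, 1, timeout_ms);
    if (m < 0)
        return WAIT_ERROR;
    if (m == 0)
        return WAIT_TIMEOUT;
    if (pfd.revents & POLLPRI)
        return WAIT_EXCEPTION;
    /* POLLIN, or POLLHUP / POLLERR that the next read() will report. */
    return WAIT_READABLE;
}

/* Hypothetical helper: read fd in nonblocking mode until EOF.  Counts in
 * *spurious every read() that failed with EAGAIN right after a wait had
 * reported the descriptor as ready.  Returns total bytes read, or -1. */
static long drain_counting(int fd, unsigned long *spurious)
{
    char buf[1000];
    long total = 0;
    int  woke  = 0;   /* did the last wait report "ready"? */
    int  flags = fcntl(fd, F_GETFL, 0);

    *spurious = 0;
    if (flags == -1 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1)
        return -1;

    for (;;) {
        ssize_t n = read(fd, buf, sizeof(buf));
        int r;

        if (n > 0) {
            total += n;
            woke = 0;
            continue;
        }
        if (n == 0)
            return total;                 /* EOF */
        if (errno != EAGAIN && errno != EWOULDBLOCK)
            return -1;
        if (woke)
            ++*spurious;                  /* reported ready, nothing to read */
        r = wait_readable(fd, 2000);
        if (r == WAIT_ERROR)
            return -1;
        woke = (r != WAIT_TIMEOUT);
    }
}
```

Where the waits behave properly, *spurious should stay at 0 however long the
transfer runs; with the behavior described above it would grow with every
wait that returns immediately.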

Thanks for looking!

Anton Lavrentiev
Contractor NIH/NLM/NCBI



* Re: Spurious / persistent "exception" condition in half-closed sockets
  2022-07-09 15:37 Spurious / persistent "exception" condition in half-closed sockets Lavrentiev, Anton (NIH/NLM/NCBI) [C]
@ 2022-07-09 21:05 ` Ken Brown
  2022-07-09 23:02   ` [EXTERNAL] " Lavrentiev, Anton (NIH/NLM/NCBI) [C]
  0 siblings, 1 reply; 5+ messages in thread
From: Ken Brown @ 2022-07-09 21:05 UTC (permalink / raw)
  To: cygwin

On 7/9/2022 11:37 AM, Lavrentiev, Anton (NIH/NLM/NCBI) [C] via Cygwin wrote:
> Hi all,
> 
> It took me awhile to figure this one out, but I think I have a good test case to
> demonstrate a (rather serious, actually) issue with Cygwin sockets and select/poll.
> 
> In short, when a reading end of a socket half closes for write (basically, signaling the
> other end of no more data to expect, resulting in TCP FIN and, subsequently, EOF in the other
> end's read()), if that end keeps reading the still incoming remaining data, it will face with
> a lot of "exception" conditions, which are just spurious.

This was fixed in Cygwin 3.3.0, as the announcement of the latter stated:

   https://cygwin.com/pipermail/cygwin-announce/2021-October/010268.html

You stated in a different thread that you have chosen to use an old version of 
Cygwin for your everyday work.  But you can still run a parallel Cygwin 
installation that you keep up to date, so that you can check whether a bug has 
been fixed before reporting it.
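
If it helps, a test program can also check which release it is running under
before concluding anything.  The following is only a sketch: the function names
are hypothetical, and it assumes that the release string returned by uname()
on Cygwin starts with "major.minor" (for example "3.3.0(...)"), which you
should verify on your installation.  On other systems uname() reports the
kernel release, so the result means nothing there.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/utsname.h>

/* Hypothetical helper: returns 1 if the release string denotes 3.3 or newer,
 * 0 if it is older, and -1 if it does not start with "major.minor". */
static int release_has_fix(const char *release)
{
    int major = 0, minor = 0;

    assert(release != NULL);
    if (sscanf(release, "%d.%d", &major, &minor) != 2)
        return -1;
    return major > 3 || (major == 3 && minor >= 3);
}

/* Hypothetical helper: apply the check to the running system.
 * Only meaningful when the program actually runs under Cygwin. */
static int running_release_has_fix(void)
{
    struct utsname u;

    if (uname(&u) != 0)
        return -1;
    return release_has_fix(u.release);
}
```

A test case could print a note and exit early when running_release_has_fix()
returns 0, instead of reporting behavior that 3.3.0 already changed.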

Ken


* RE: [EXTERNAL] Re: Spurious / persistent "exception" condition in half-closed sockets
  2022-07-09 21:05 ` Ken Brown
@ 2022-07-09 23:02   ` Lavrentiev, Anton (NIH/NLM/NCBI) [C]
  2022-07-11  7:52     ` Corinna Vinschen
  0 siblings, 1 reply; 5+ messages in thread
From: Lavrentiev, Anton (NIH/NLM/NCBI) [C] @ 2022-07-09 23:02 UTC (permalink / raw)
  To: Ken Brown, cygwin

> This was fixed in Cygwin 3.3.0, as the announcement of the latter stated:

Thanks!  So maybe it is time to upgrade... after all LOL

> But you can still run a parallel Cygwin installation

I tried that before...  And it did not work out well.  Unless it's a VM,
there's a small but real chance that at some point they will get intertwined,
and then ... it's quite a mess (learned that the hard way, unfortunately).

Anton Lavrentiev
Contractor NIH/NLM/NCBI



* Re: [EXTERNAL] Re: Spurious / persistent "exception" condition in half-closed sockets
  2022-07-09 23:02   ` [EXTERNAL] " Lavrentiev, Anton (NIH/NLM/NCBI) [C]
@ 2022-07-11  7:52     ` Corinna Vinschen
  2022-07-11  8:16       ` Corinna Vinschen
  0 siblings, 1 reply; 5+ messages in thread
From: Corinna Vinschen @ 2022-07-11  7:52 UTC (permalink / raw)
  To: cygwin

On Jul  9 23:02, Lavrentiev, Anton (NIH/NLM/NCBI) [C] via Cygwin wrote:
> > This was fixed in Cygwin 3.3.0, as the announcement of the latter stated:
> 
> Thanks!  So maybe it is time to upgrade... after all LOL
> 
> > But you can still run a parallel Cygwin installation
> 
> I tried that before...  And it did not work out well.  Unless it's a VM,
> there's a small but real chance that at some point they are to get intertwined,

This must have been very long ago.  For a long time now, Cygwin's path
handling and shared memory interaction between Cygwin processes have been
based on the installation path of the Cygwin DLL a process is running
under.  A Cygwin process running under a Cygwin DLL from path A uses
different default Windows PATH and different shared memory names than a
process running under Cygwin DLL from path B.  Keeping Cygwin
installations separate just requires never to run processes from
installation A under Cygwin DLL B.


> and then ... it's quite a mess (learned that the hard way, unfortunately).

It really isn't.  Only if you start to mix paths from two parallel
Cygwin installations inside the same shell session, which should be
easy to avoid.


Corinna


* Re: [EXTERNAL] Re: Spurious / persistent "exception" condition in half-closed sockets
  2022-07-11  7:52     ` Corinna Vinschen
@ 2022-07-11  8:16       ` Corinna Vinschen
  0 siblings, 0 replies; 5+ messages in thread
From: Corinna Vinschen @ 2022-07-11  8:16 UTC (permalink / raw)
  To: cygwin

On Jul 11 09:52, Corinna Vinschen wrote:
> On Jul  9 23:02, Lavrentiev, Anton (NIH/NLM/NCBI) [C] via Cygwin wrote:
> > > This was fixed in Cygwin 3.3.0, as the announcement of the latter stated:
> > 
> > Thanks!  So maybe it is time to upgrade... after all LOL
> > 
> > > But you can still run a parallel Cygwin installation
> > 
> > I tried that before...  And it did not work out well.  Unless it's a VM,
> > there's a small but real chance that at some point they are to get intertwined,
> 
> This must have been very long ago.  For a long time, Cygwin's path
> handling and shared memory interaction between Cygwin processes is
> based on the installation path of the Cygwin DLL a process is running
> under.  A Cygwin process running under a Cygwin DLL from path A uses
> different default Windows PATH and different shared memory names than a

make that "different names for all shared objects"

> process running under Cygwin DLL from path B.  Keeping Cygwin
> installations separate just requires never to run processes from
> installation A under Cygwin DLL B.
> 
> 
> > and then ... it's quite a mess (learned that the hard way, unfortunately).
> 
> It really isn't.  Only if you start to mix paths from two parallel
> Cygwin installations inside the same shell session, which should be
> easy to avoid.
> 
> 
> Corinna

