public inbox for cygwin@cygwin.com
* AF_UNIX/SOCK_DGRAM is dropping messages
@ 2021-03-23 15:37 sten.kristian.ivarsson
  2021-03-23 19:20 ` Glenn Strauss
  0 siblings, 1 reply; 27+ messages in thread
From: sten.kristian.ivarsson @ 2021-03-23 15:37 UTC (permalink / raw)
  To: cygwin

[-- Attachment #1: Type: text/plain, Size: 744 bytes --]

Hi all

Using AF_UNIX/SOCK_DGRAM with the current version (3.2.0) seems to drop messages, or at least they are not received in the same order they are sent.

Attached is a C-ish sample program (it uses C++ threads, though) that essentially creates a "client" that writes as much as possible and a "server" that consumes as much as possible.

It seems like some buffer fills up and then messages are dropped (or at least it appears so). If anyone wants to test this, the "sleep" might need to be adjusted in order to make it happen.

Hopefully it's just a flaw in our application (and sample), but as far as we can see, this should work.


Does anyone, perhaps named Ken, have any insightful thoughts about this ?


Best regards,
Kristian

[-- Attachment #2: af_unix.cpp --]
[-- Type: text/plain, Size: 2538 bytes --]

#include <sys/socket.h>
#include <sys/un.h>

#include <unistd.h>

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include <thread>
#include <chrono>


// $ g++ --std=gnu++17 af_unix.cpp

const char* const path = "address";
const int count = 1000;
const int size = BUFSIZ * 2;

int client()
{
    const int fd = socket( AF_UNIX, SOCK_DGRAM, 0);

    if( fd == -1)
    {
        perror( "socket error");
        return -1;
    }

    struct sockaddr_un address{};

    strcpy( address.sun_path, path);
    address.sun_family = AF_UNIX;

    char buffer[size] = {};

    for( int idx = 0; idx < count; ++idx)
    {
        memcpy( buffer, &idx, sizeof idx);

        const ssize_t result = sendto( fd, buffer, size, 0, (struct sockaddr*)&address, sizeof address);

        if( result == -1)
        {
            perror( "sendto error");
            return -1;
        }
    }

    close( fd);
    return 0;
}

int server()
{
    const int fd = socket( AF_UNIX, SOCK_DGRAM, 0);

    if( fd == -1)
    {
        perror( "socket error");
        return -1;
    }

    struct sockaddr_un address{};

    strcpy( address.sun_path, path);
    address.sun_family = AF_UNIX;

    const int result = bind( fd, (struct sockaddr*)&address, sizeof address);

    if( result == -1)
    {
        perror( "bind error");
        return -1;
    }

    return fd;
}

int main( int argc, char* argv[])
{
    const int fd = server( );

    if( fd != -1)
    {
        fprintf( stdout, "%d\tnumber of packages\n", count);
        fprintf( stdout, "%d\tbytes per package\n", size);

        std::thread c{ [&](){client( );}};

        std::this_thread::sleep_for( std::chrono::microseconds( 500));
    
        char buffer[size] = {};

        for( int idx = 0; idx < count; ++idx)
        {
            const ssize_t result = recv( fd, buffer, size, 0);

            if( result == -1)
            {
                perror("recv error");
                c.join();
                unlink( path);
                return -1;
            }

            int index = 0;
            memcpy( &index, buffer, sizeof idx);

            if( index != idx)
            {
                fprintf( stderr, "expected %d but got %d\n", idx, index);
                c.join();
                unlink( path);
                return -1;
            }
        }

        c.join();
        close( fd);
        unlink( path);
    }

    return 0;
}

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: AF_UNIX/SOCK_DGRAM is dropping messages
  2021-03-23 15:37 AF_UNIX/SOCK_DGRAM is dropping messages sten.kristian.ivarsson
@ 2021-03-23 19:20 ` Glenn Strauss
  2021-03-24  9:18   ` sten.kristian.ivarsson
  0 siblings, 1 reply; 27+ messages in thread
From: Glenn Strauss @ 2021-03-23 19:20 UTC (permalink / raw)
  To: sten.kristian.ivarsson; +Cc: cygwin

On Tue, Mar 23, 2021 at 04:37:52PM +0100, Kristian Ivarsson via Cygwin wrote:
> Hi all
> 
> Using AF_UNIX/SOCK_DGRAM with current version (3.2.0) seems to drop messages or at least they are not received in the same order they are sent
> 
> Attached C:ish (with C++ threads though) sample program that essentially creates a "client" that writes as much as possible and a "server" that consumes as much as possible
> 
> It seems like some buffer is filled and then messages are dropped (or at least it appears so) (if someone is about to test this, the "sleep" might need to be adjusted in order to make this happen)
> 
> Hopefully it's just a flaw in our application (and sample), but as far as we can see, this should work
> 
> 
> Does anyone, perhaps named Ken, have any insightful thoughts about this ?


> const int size = BUFSIZ * 2;


>     char buffer[size] = {};
> 
>     for( int idx = 0; idx < count; ++idx)
>     {
>         memcpy( buffer, &idx, sizeof idx);
> 
>         const ssize_t result = sendto( fd, buffer, size, 0, (struct sockaddr*)&address, sizeof address);


>             const ssize_t result = recv( fd, buffer, size, 0);
...
>             int index = 0;
>             memcpy( &index, buffer, sizeof idx);

This appears to be a programming error, unrelated to Cygwin.

I know that what you provided was an example test case, but you might
want to check if your app is sending way too much when the actual
payload size is much smaller.  In the example you provided, you are
sending 16KB instead of 4 bytes for the count.

Is your code handling partial read/recv and partial write/sendto?
(It is definitely a bug in the use of recv() in the sample code.)

Partial reads and writes can occur more frequently with non-blocking
sockets, but it is still good defensive programming to detect and
handle partial read/writes.

It goes without saying that if your protocol sends a fixed-size chunk
of data, you should ensure that you read the entire fixed size,
even if only using part of the data.
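
A minimal sketch of the kind of check discussed above, assuming a blocking
datagram socket.  With SOCK_DGRAM each recv() returns at most one datagram,
so the useful distinction is between a failed call (errno is set) and a
datagram of unexpected length (errno is not meaningful).  The names
recv_datagram and RecvStatus are made up for illustration and do not appear
in the test case:

```cpp
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

#include <cassert>
#include <cerrno>
#include <cstddef>

// Outcome of one receive attempt (hypothetical names, not from the thread).
enum class RecvStatus { complete, short_datagram, failed };

// Receive one datagram and classify the result.  errno is only meaningful
// when recv() returned -1, so a short datagram is reported separately
// instead of being handed to perror().
RecvStatus recv_datagram( int fd, void* buffer, std::size_t expected, int flags = 0)
{
    assert( buffer != nullptr);

    ssize_t result;
    do
    {
        result = recv( fd, buffer, expected, flags);
    } while( result == -1 && errno == EINTR);   // retry if interrupted by a signal

    if( result == -1)
        return RecvStatus::failed;

    return static_cast<std::size_t>( result) == expected
        ? RecvStatus::complete
        : RecvStatus::short_datagram;
}
```

A caller would call perror() only for RecvStatus::failed and print its own
message for RecvStatus::short_datagram.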

Cheers, Glenn


* RE: AF_UNIX/SOCK_DGRAM is dropping messages
  2021-03-23 19:20 ` Glenn Strauss
@ 2021-03-24  9:18   ` sten.kristian.ivarsson
  2021-03-30 14:17     ` Ken Brown
  0 siblings, 1 reply; 27+ messages in thread
From: sten.kristian.ivarsson @ 2021-03-24  9:18 UTC (permalink / raw)
  To: 'Glenn Strauss'; +Cc: cygwin

[-- Attachment #1: Type: text/plain, Size: 3417 bytes --]

Hi Glenn

Thanks for the reply; more below

> > Hi all
> >
> > Using AF_UNIX/SOCK_DGRAM with current version (3.2.0) seems to drop  messages or at least they are not received in the same order they are  sent
> >
> > Attached C:ish (with C++ threads though) sample program that  essentially creates a "client" that writes as much as possible and a  "server" that consumes as much as possible
> >
> > It seems like some buffer is filled and then messages are dropped (or at least it appears so) (if someone is about to test this, the "sleep" might need to be adjusted in order to make this happen)
> >
> > Hopefully it's just a flaw in our application (and sample), but as far as we can see, this should work
> >
> >
> > Does anyone, perhaps named Ken, have any insightful thoughts about this ?
> 
> 
> > const int size = BUFSIZ * 2;
> 
> 
> >     char buffer[size] = {};
> >
> >     for( int idx = 0; idx < count; ++idx)
> >     {
> >         memcpy( buffer, &idx, sizeof idx);
> >
> >         const ssize_t result = sendto( fd, buffer, size, 0, (struct
> > sockaddr*)&address, sizeof address);
> 
> 
> >             const ssize_t result = recv( fd, buffer, size, 0);
> ...
> >             int index = 0;
> >             memcpy( &index, buffer, sizeof idx);
> 
> This appears to be a programming error, unrelated to Cygwin.
> 
> I know that what you provided was an example test case, but you might want  to check if your app is sending way too much when the actual payload size is much smaller.  In the example you provided, you are sending 16KB instead of 4 bytes for the count.

Sending a larger buffer (in this case 16 KB) is intentional, even though only the first sizeof(int) bytes are relevant. The point is just to send many bytes and verify that they arrive on the other side in the correct order


> Is your code handling partial read/recv and partial write/sendto?  (It is definitely a bug in the use of recv() in the sample code.) 

It was not, and the updated version does not either. That is not the issue, though, but I added a test to verify that the whole chunk is sent/read

> Partial reads and writes can occur more frequently with non-blocking sockets, but it is still good defensive programming to detect and handle partial read/writes.

That might be the case, but these are blocking calls (or maybe I've misunderstood the flags?). Regardless, the test case is not about how to handle partial writes/reads but is meant to show that messages seem to be lost. Of course the code should not be flawed, so thanks for the feedback

It almost seems like it is UDP-semantics and that packages can get lost or end up in non sequential order, and of course SOCK_DGRAM tells you that, but the posix description says "UNIX domain datagram sockets are always reliable and don't reorder datagrams"

It seems that when an internal buffer of about 64 KB is filled, the rest of the packages are dropped until it is consumed. So in this case the first 32 packages arrive in the correct order, but after that any random package (with index > 32) seems to end up at the "server"
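
The figure of 32 can be checked with a little arithmetic.  This is only a sketch under two assumptions: that BUFSIZ is 1024 (which I believe is the default in Cygwin's newlib <stdio.h>), so each datagram is 2 KB, and that the 64 KB capacity estimated above is right.  The function name datagrams_that_fit is made up for illustration:

```cpp
#include <cassert>
#include <cstddef>

// How many fixed-size datagrams fit into a buffer of the given capacity.
constexpr std::size_t datagrams_that_fit( std::size_t capacity, std::size_t datagram_size)
{
    return datagram_size == 0 ? 0 : capacity / datagram_size;
}

// Assumed values: BUFSIZ taken as 1024, and the 64 KB capacity is the
// estimate from the message above, not a verified figure.
constexpr std::size_t assumed_bufsiz = 1024;
constexpr std::size_t datagram_size = assumed_bufsiz * 2;    // "size" in the test case
constexpr std::size_t assumed_capacity = 64 * 1024;

static_assert( datagrams_that_fit( assumed_capacity, datagram_size) == 32,
    "32 datagrams of 2 KB fill a 64 KB buffer");
```

With a BUFSIZ of 8192, as in glibc, the same "size" would be 16 KB and only 4 datagrams would fit, so the numbers depend on the platform's <stdio.h>.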

> It goes without saying that if your protocol sends a fixed size chunk of data, that you should ensure that you read the entire fixed size, even if only using part of the data.

That's done in the updated version, or at least verified

> Cheers, Glenn

Best regards,
Kristian

[-- Attachment #2: af_unix.cpp --]
[-- Type: text/plain, Size: 2508 bytes --]

#include <sys/socket.h>
#include <sys/un.h>

#include <unistd.h>

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include <thread>
#include <chrono>


// $ g++ --std=gnu++17 af_unix.cpp

const char* const path = "address";
const int count = 1000;
const int size = BUFSIZ * 2;

int client()
{
    const int fd = socket( AF_UNIX, SOCK_DGRAM, 0);

    if( fd == -1)
    {
        perror( "socket error");
        return -1;
    }

    struct sockaddr_un address{};

    strcpy( address.sun_path, path);
    address.sun_family = AF_UNIX;

    char buffer[size] = {};

    for( int idx = 0; idx < count; ++idx)
    {
        memcpy( buffer, &idx, sizeof idx);

        const ssize_t result = sendto( fd, buffer, size, 0, (struct sockaddr*)&address, sizeof address);

        if( result == -1 || result != size)
        {
            perror( "sendto error");
            return -1;
        }
    }

    close( fd);
    return 0;
}

int server()
{
    const int fd = socket( AF_UNIX, SOCK_DGRAM, 0);

    if( fd == -1)
    {
        perror( "socket error");
        return -1;
    }

    struct sockaddr_un address{};

    strcpy( address.sun_path, path);
    address.sun_family = AF_UNIX;

    const int result = bind( fd, (struct sockaddr*)&address, sizeof address);

    if( result == -1)
    {
        perror( "bind error");
        return -1;
    }

    return fd;
}

int main( int argc, char* argv[])
{
    const int fd = server( );

    if( fd != -1)
    {
        fprintf( stdout, "%d\tnumber of packages\n", count);
        fprintf( stdout, "%d\tbytes per package\n", size);

        std::thread{ [&](){client( );}}.detach();

        std::this_thread::sleep_for( std::chrono::microseconds( 500));
    
        char buffer[size] = {};

        for( int idx = 0; idx < count; ++idx)
        {
            const ssize_t result = recv( fd, buffer, size, 0);

            if( result == -1 || result != size)
            {
                perror("recv error");
                unlink( path);
                return -1;
            }

            int index = 0;
            memcpy( &index, buffer, sizeof idx);

            if( index != idx)
            {
                fprintf( stderr, "expected %d but got %d\n", idx, index);
                unlink( path);
                return -1;
            }
        }

        close( fd);
        unlink( path);
    }

    return 0;
}


* Re: AF_UNIX/SOCK_DGRAM is dropping messages
  2021-03-24  9:18   ` sten.kristian.ivarsson
@ 2021-03-30 14:17     ` Ken Brown
  2021-03-31  8:24       ` sten.kristian.ivarsson
  0 siblings, 1 reply; 27+ messages in thread
From: Ken Brown @ 2021-03-30 14:17 UTC (permalink / raw)
  To: sten.kristian.ivarsson; +Cc: cygwin

On 3/24/2021 5:18 AM, Kristian Ivarsson via Cygwin wrote:
> Hi Glenn
> 
> Thanks for the reply, so more below
> 
>>> Hi all
>>>
>>> Using AF_UNIX/SOCK_DGRAM with current version (3.2.0) seems to drop  messages or at least they are not received in the same order they are  sent
>>>
>>> Attached C:ish (with C++ threads though) sample program that  essentially creates a "client" that writes as much as possible and a  "server" that consumes as much as possible
>>>
>>> It seems like some buffer is filled and then messages are dropped (or at least it appears so) (if someone is about to test this, the "sleep" might need to be adjusted in order to make this happen)
>>>
>>> Hopefully it's just a flaw in our application (and sample), but as far as we can see, this should work
>>>
>>>
>>> Does anyone, perhaps named Ken, have any insightful thoughts about this ?
>>
>>
>>> const int size = BUFSIZ * 2;
>>
>>
>>>      char buffer[size] = {};
>>>
>>>      for( int idx = 0; idx < count; ++idx)
>>>      {
>>>          memcpy( buffer, &idx, sizeof idx);
>>>
>>>          const ssize_t result = sendto( fd, buffer, size, 0, (struct
>>> sockaddr*)&address, sizeof address);
>>
>>
>>>              const ssize_t result = recv( fd, buffer, size, 0);
>> ...
>>>              int index = 0;
>>>              memcpy( &index, buffer, sizeof idx);
>>
>> This appears to be a programming error, unrelated to Cygwin.
>>
>> I know that what you provided was an example test case, but you might want  to check if your app is sending way too much when the actual payload size is much smaller.  In the example you provided, you are sending 16KB instead of 4 bytes for the count.
> 
> To send a larger buffer (in this case 16 KB) is intentional, but just the sizeof int is relevant. The reason is just to send many bytes and verify that they end up on the other side in correct order
> 
> 
>> Is your code handling partial read/recv and partial write/sendto?  (It is definitely a bug in the use of recv() in the sample code.)
> 
> It was not and the updated version does not either, but that is not the issue though but I added a test to verify that the whole chunk is sent/read
> 
>> Partial reads and writes can occur more frequently with non-blocking sockets, but it is still good defensive programming to detect and handle partial read/writes.
> 
> That might be the case, but this is blocking attempts though (or maybe I've misunderstood the flags ?), but regardless of that, the test-case is not about how to handle partial writes/reads though, but to kind of show that messages seems to be lost, but of cource code need not to be flawed so thanx for the feedback
> 
> It almost seems like it is UDP-semantics and that packages can get lost or end up in non sequential order, and of course SOCK_DGRAM tells you that, but the posix description says "UNIX domain datagram sockets are always reliable and don't reorder datagrams"
> 
> It seems like when an internal buffer or so of 64 KB is filled the rest of the packages are dropped until consumed, so in this case the 32 first packages arrive in correct order but after that any random package (with index > 32) seems to end up at the "server"
> 
>> It goes without saying that if your protocol sends a fixed size chunk of data, that you should ensure that you read the entire fixed size, even if only using part of the data.
> 
> That's done in the updated version, or at least verified

Thanks for the test case.  I can confirm the problem.  I'm not familiar enough 
with the current AF_UNIX implementation to debug this easily.  I'd rather spend 
my time on the new implementation (on the topic/af_unix branch).  It turns out 
that your test case fails there too, but in a completely different way, due to a 
bug in sendto for datagrams.  I'll see if I can fix that bug and then try again.

Ken


* RE: AF_UNIX/SOCK_DGRAM is dropping messages
  2021-03-30 14:17     ` Ken Brown
@ 2021-03-31  8:24       ` sten.kristian.ivarsson
  2021-03-31 15:07         ` Ken Brown
  0 siblings, 1 reply; 27+ messages in thread
From: sten.kristian.ivarsson @ 2021-03-31  8:24 UTC (permalink / raw)
  To: cygwin

[snip]
> >>> Using AF_UNIX/SOCK_DGRAM with current version (3.2.0) seems to drop
> >>> messages or at least they are not received in the same order they
> >>> are  sent

[snip] 

> Thanks for the test case.  I can confirm the problem.  I'm not familiar enough
> with the current AF_UNIX implementation to debug this easily.  I'd rather
> spend my time on the new implementation (on the topic/af_unix branch).  It
> turns out that your test case fails there too, but in a completely different way,
> due to a bug in sendto for datagrams.  I'll see if I can fix that bug and then try
> again.
> 
> Ken

Ok, too bad it wasn't our own code base but good that the "mystery" is verified

I finally succeeded in building topic/af_unix (after finding out what version of zlib was needed), but not with -D__WITH_AF_UNIX added to CXXFLAGS, and thus I haven’t tested it yet

Is it sufficient to add the define to the "main" Makefile, or do you have to add it to all the Makefiles? I guess I can find out though

Is topic/af_unix fairly up to date with master branch ?

Either way, I'll be glad to help out testing topic/af_unix

Keep up the good work

Best regards,
Kristian



* Re: AF_UNIX/SOCK_DGRAM is dropping messages
  2021-03-31  8:24       ` sten.kristian.ivarsson
@ 2021-03-31 15:07         ` Ken Brown
  2021-04-01 16:02           ` Ken Brown
  0 siblings, 1 reply; 27+ messages in thread
From: Ken Brown @ 2021-03-31 15:07 UTC (permalink / raw)
  To: sten.kristian.ivarsson, cygwin

On 3/31/2021 4:24 AM, sten.kristian.ivarsson@gmail.com wrote:
> [snip]
>>>>> Using AF_UNIX/SOCK_DGRAM with current version (3.2.0) seems to drop
>>>>> messages or at least they are not received in the same order they
>>>>> are  sent
> 
> [snip]
> 
>> Thanks for the test case.  I can confirm the problem.  I'm not familiar enough
>> with the current AF_UNIX implementation to debug this easily.  I'd rather
>> spend my time on the new implementation (on the topic/af_unix branch).  It
>> turns out that your test case fails there too, but in a completely different way,
>> due to a bug in sendto for datagrams.  I'll see if I can fix that bug and then try
>> again.
>>
>> Ken
> 
> Ok, too bad it wasn't our own code base but good that the "mystery" is verified
> 
> I finally succeed to build topic/af_unix (after finding out what version of zlib was needed), but not with -D__WITH_AF_UNIX to CXXFLAGS though and thus I haven’t tested it yet
> 
> Is it sufficient to add the define to the "main" Makefile or do you have to add it to all the Makefile:s ? I guess I can find out though

I do it on the configure line, like this:

   ../af_unix/configure CXXFLAGS="-g -O0 -D__WITH_AF_UNIX" --prefix=...

> Is topic/af_unix fairly up to date with master branch ?

Yes, I periodically cherry-pick commits from master to topic/af_unix.  I'll do 
that again right now.

> Either way, I'll be glad to help out testing topic/af_unix

Thanks!

Ken


* Re: AF_UNIX/SOCK_DGRAM is dropping messages
  2021-03-31 15:07         ` Ken Brown
@ 2021-04-01 16:02           ` Ken Brown
  2021-04-06  7:52             ` Noel Grandin
  2021-04-06 14:50             ` sten.kristian.ivarsson
  0 siblings, 2 replies; 27+ messages in thread
From: Ken Brown @ 2021-04-01 16:02 UTC (permalink / raw)
  To: sten.kristian.ivarsson, cygwin

On 3/31/2021 11:07 AM, Ken Brown via Cygwin wrote:
> On 3/31/2021 4:24 AM, sten.kristian.ivarsson@gmail.com wrote:
>> [snip]
>>>>>> Using AF_UNIX/SOCK_DGRAM with current version (3.2.0) seems to drop
>>>>>> messages or at least they are not received in the same order they
>>>>>> are  sent
>>
>> [snip]
>>
>>> Thanks for the test case.  I can confirm the problem.  I'm not familiar enough
>>> with the current AF_UNIX implementation to debug this easily.  I'd rather
>>> spend my time on the new implementation (on the topic/af_unix branch).  It
>>> turns out that your test case fails there too, but in a completely different 
>>> way,
>>> due to a bug in sendto for datagrams.  I'll see if I can fix that bug and 
>>> then try
>>> again.
>>>
>>> Ken
>>
>> Ok, too bad it wasn't our own code base but good that the "mystery" is verified
>>
>> I finally succeed to build topic/af_unix (after finding out what version of 
>> zlib was needed), but not with -D__WITH_AF_UNIX to CXXFLAGS though and thus I 
>> haven’t tested it yet
>>
>> Is it sufficient to add the define to the "main" Makefile or do you have to 
>> add it to all the Makefile:s ? I guess I can find out though
> 
> I do it on the configure line, like this:
> 
>   ../af_unix/configure CXXFLAGS="-g -O0 -D__WITH_AF_UNIX" --prefix=...
> 
>> Is topic/af_unix fairly up to date with master branch ?
> 
> Yes, I periodically cherry-pick commits from master to topic/af_unix.  I'll do 
> that again right now.
> 
>> Either way, I'll be glad to help out testing topic/af_unix
> 
> Thanks!

I've now pushed a fix for that sendto bug, and your test case runs without error 
on the topic/af_unix branch.

By the way, I think the implementation of sendto/recv for datagrams is very 
inefficient when there are repeated calls to sendto as in your test case. 
Nevertheless, your test case actually runs slightly faster on the topic/af_unix 
branch than it does on master (when the latter succeeds, which it does about 
half the time for me).  So I'm not sure whether it's worth worrying about this.

Here's the issue, briefly.  The communication is done via a Windows named pipe.
The receiver creates the pipe when it creates and binds its socket.  It
creates only one pipe instance.  The sender connects to the pipe, writes, and
closes its handle.  But the pipe is not available for another sender to connect
to until the receiver reads the message, after which it disconnects the sender.
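
To make the consequence concrete, here is a toy model of that behaviour in
portable C++.  It does not use the Windows named pipe API and is not the
Cygwin implementation; it only assumes what is described above: a single
slot that must be emptied by the receiver before the next sender can
proceed.  All names (SingleInstancePipe, run_model) are hypothetical:

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Toy stand-in for a pipe with a single instance: at most one message can be
// pending, and the next send cannot proceed until the receiver has read it.
class SingleInstancePipe
{
public:
    void send( int message)
    {
        std::unique_lock<std::mutex> lock{ mutex_};
        free_.wait( lock, [this]{ return ! has_pending_; });
        pending_ = message;
        has_pending_ = true;
        ready_.notify_one();
    }

    int receive()
    {
        std::unique_lock<std::mutex> lock{ mutex_};
        ready_.wait( lock, [this]{ return has_pending_; });
        const int message = pending_;
        has_pending_ = false;      // "disconnect" so the next sender can proceed
        free_.notify_one();
        return message;
    }

private:
    std::mutex mutex_;
    std::condition_variable free_;
    std::condition_variable ready_;
    int pending_ = 0;
    bool has_pending_ = false;
};

// Send 0..count-1 from one thread and collect them in the calling thread.
std::vector<int> run_model( int count)
{
    assert( count >= 0);
    SingleInstancePipe pipe;
    std::thread sender{ [&]{ for( int idx = 0; idx < count; ++idx) pipe.send( idx); }};

    std::vector<int> received;
    for( int idx = 0; idx < count; ++idx)
        received.push_back( pipe.receive());

    sender.join();
    return received;
}
```

In this model every send() waits for the previous message to be consumed,
which keeps the order intact but serialises sender and receiver, so repeated
sends cannot run ahead of the reader.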

Ken


* Re: AF_UNIX/SOCK_DGRAM is dropping messages
  2021-04-01 16:02           ` Ken Brown
@ 2021-04-06  7:52             ` Noel Grandin
  2021-04-06 14:59               ` Ken Brown
  2021-04-06 14:50             ` sten.kristian.ivarsson
  1 sibling, 1 reply; 27+ messages in thread
From: Noel Grandin @ 2021-04-06  7:52 UTC (permalink / raw)
  To: Ken Brown, sten.kristian.ivarsson, cygwin



On 2021/04/01 6:02 pm, Ken Brown via Cygwin wrote:
> Here's the issue, briefly.  The communication is done via a Windows named pipe. The receiver creates the pipe when it 
> creates and binds its socket.  It creates only one pipe instance.  The sender connects to the pipe, writes, and closes 
> its handle.  But the pipe is not available for another sender to connect to until the receiver reads the message, after 
> which it disconnects the sender.
> 

This

    https://docs.microsoft.com/en-us/windows/win32/ipc/named-pipe-instances

seems to indicate that multiple pipe instances are needed to handle multiple clients nicely - it also has sample code 
for such.


* RE: AF_UNIX/SOCK_DGRAM is dropping messages
  2021-04-01 16:02           ` Ken Brown
  2021-04-06  7:52             ` Noel Grandin
@ 2021-04-06 14:50             ` sten.kristian.ivarsson
  2021-04-06 15:24               ` Ken Brown
  2021-04-07 14:56               ` Ken Brown
  1 sibling, 2 replies; 27+ messages in thread
From: sten.kristian.ivarsson @ 2021-04-06 14:50 UTC (permalink / raw)
  To: 'Ken Brown', cygwin

[-- Attachment #1: Type: text/plain, Size: 3938 bytes --]


> >>>>>> Using AF_UNIX/SOCK_DGRAM with current version (3.2.0) seems to
> >>>>>> drop messages or at least they are not received in the same order
> >>>>>> they are  sent
> >>
> >> [snip]
> >>
> >>> Thanks for the test case.  I can confirm the problem.  I'm not
> >>> familiar enough with the current AF_UNIX implementation to debug
> >>> this easily.  I'd rather spend my time on the new implementation (on
> >>> the topic/af_unix branch).  It turns out that your test case fails
> >>> there too, but in a completely different way, due to a bug in sendto
> >>> for datagrams.  I'll see if I can fix that bug and then try again.
> >>>
> >>> Ken
> >>
> >> Ok, too bad it wasn't our own code base but good that the "mystery"
> >> is verified
> >>
> >> I finally succeed to build topic/af_unix (after finding out what
> >> version of zlib was needed), but not with -D__WITH_AF_UNIX to
> >> CXXFLAGS though and thus I haven’t tested it yet
> >>
> >> Is it sufficient to add the define to the "main" Makefile or do you
> >> have to add it to all the Makefile:s ? I guess I can find out though
> >
> > I do it on the configure line, like this:
> >
> >   ../af_unix/configure CXXFLAGS="-g -O0 -D__WITH_AF_UNIX" --prefix=...
> >
> >> Is topic/af_unix fairly up to date with master branch ?
> >
> > Yes, I periodically cherry-pick commits from master to topic/af_unix.
> > I'll do that again right now.
> >
> >> Either way, I'll be glad to help out testing topic/af_unix
> >
> > Thanks!
> 
> I've now pushed a fix for that sendto bug, and your test case runs without
> error on the topic/af_unix branch.

It seems like the test case does work now with topic/af_unix in blocking mode, but when using non-blocking calls (with MSG_DONTWAIT) there are some issues, I think:

1. When the queue is empty with non-blocking recv(), errno is set to EPIPE, but I think it should be EAGAIN (or maybe the pipe really is getting broken for some reason?)

2. When using non-blocking recv() and no message is written at all, it seems like recv() blocks forever

3. Using non-blocking recv() where the "client" sends fewer than "count" messages, recv() sometimes blocks forever (as well)


My naïve analysis is that for the first issue (if it is one) the wrong errno is set, and for the second issue recv() blocks if no sendto() is done after the first recv(), i.e. nothing kicks the "reader thread" in the butt to make it realise the queue is empty. It is not super clear what POSIX says about creating blocking descriptors and then using non-blocking flags with recv(), but this works on Linux anyway

Let me know if I should provide a more specific explanation, but I think minor modifications of the test case can provoke all of these behaviours. I think 2 and 3 have the same cause, though (as described above)
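
For reference, a minimal sketch of the behaviour I would expect, based on how this works on Linux: a recv() with MSG_DONTWAIT on an empty queue returns -1 with errno set to EAGAIN (or EWOULDBLOCK) instead of blocking or reporting EPIPE. The names try_recv and TryRecv are made up for illustration and are not part of the attached test case:

```cpp
#include <sys/types.h>
#include <sys/socket.h>

#include <cassert>
#include <cerrno>
#include <cstddef>

// Result of one non-blocking receive attempt (hypothetical names).
enum class TryRecv { received, would_block, failed };

// Try to receive one datagram without blocking.  On Linux, an empty queue
// makes recv() return -1 with errno set to EAGAIN or EWOULDBLOCK.
TryRecv try_recv( int fd, void* buffer, std::size_t size, ssize_t& length)
{
    assert( buffer != nullptr);
    length = recv( fd, buffer, size, MSG_DONTWAIT);

    if( length >= 0)
        return TryRecv::received;

    if( errno == EAGAIN || errno == EWOULDBLOCK)
        return TryRecv::would_block;

    return TryRecv::failed;     // any other errno, e.g. EBADF or EPIPE
}
```

With a helper like this, the receiving loop could treat would_block as "nothing to read yet" and only give up on failed.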


> By the way, I think the implementation of sendto/recv for datagrams is very
> inefficient when there are repeated calls to sendto as in your test case.
> Nevertheless, your test case actually runs slightly faster on the topic/af_unix
> branch than it does on master (when the latter succeeds, which it does about
> half the time for me).  So I'm not sure whether it's worth worrying about this.

Of course we would like the best throughput possible 😉

> Here's the issue, briefly.  The communication is done via a Windows named
> pipe.
>   The receiver creates the pipe when it creates and binds its socket.  It creates
> only one pipe instance.  The sender connects to the pipe, writes, and closes its
> handle.  But the pipe is not available for another sender to connect to until the
> receiver reads the message, after which it disconnects the sender.

OK, in our application we will use long-lived descriptors and multiple writers that possibly send large business messages (chunked into smaller pieces per sendto()/recv())

> Ken

Best regards,
Kristian

[-- Attachment #2: af_unix.cpp --]
[-- Type: text/plain, Size: 2679 bytes --]

#include <sys/socket.h>
#include <sys/un.h>

#undef AF_UNIX
#define AF_UNIX 31

#include <unistd.h>

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include <thread>
#include <chrono>


// $ g++ --std=gnu++17 af_unix.cpp

const char* const path = "address";
const int count = 10000;
const int size = BUFSIZ * 8;

int client()
{
    const int fd = socket( AF_UNIX, SOCK_DGRAM, 0);

    if( fd == -1)
    {
        perror( "socket error");
        return -1;
    }

    struct sockaddr_un address{};

    strcpy( address.sun_path, path);
    address.sun_family = AF_UNIX;

    char buffer[size] = {};

    for( int idx = 0; idx < 100; ++idx)
    {
        memcpy( buffer, &idx, sizeof idx);

        const ssize_t result = sendto( fd, buffer, size, 0, (struct sockaddr*)&address, sizeof address);

        // Assume the whole chunk can be sent
        if( result != size)
        {
            perror( "sendto error");
            return -1;
        }
    }

    close( fd);
    return 0;
}

int server()
{
    const int fd = socket( AF_UNIX, SOCK_DGRAM, 0);

    if( fd == -1)
    {
        perror( "socket error");
        return -1;
    }

    struct sockaddr_un address{};

    strcpy( address.sun_path, path);
    address.sun_family = AF_UNIX;

    const int result = bind( fd, (struct sockaddr*)&address, sizeof address);

    if( result == -1)
    {
        perror( "bind error");
        return -1;
    }

    return fd;
}

int main( int argc, char* argv[])
{
    const int fd = server( );

    if( fd != -1)
    {
        fprintf( stdout, "%d\tnumber of packages\n", count);
        fprintf( stdout, "%d\tbytes per package\n", size);

        std::thread{ [&](){client( );}}.detach();

        std::this_thread::sleep_for( std::chrono::microseconds( 500));
    
        char buffer[size] = {};

        for( int idx = 0; idx < count; ++idx)
        {
            const ssize_t result = recv( fd, buffer, size, MSG_DONTWAIT);

            // Assume the whole chunk can be read
            if( result != size)
            {
                perror("recv error");
                //fprintf( stderr, "index: %d\n", idx);
                unlink( path);
                return -1;
            }

            int index = 0;
            memcpy( &index, buffer, sizeof idx);

            if( index != idx)
            {
                fprintf( stderr, "expected %d but got %d\n", idx, index);
                unlink( path);
                return -1;
            }
        }

        close( fd);
        unlink( path);
    }

    return 0;
}


* Re: AF_UNIX/SOCK_DGRAM is dropping messages
  2021-04-06  7:52             ` Noel Grandin
@ 2021-04-06 14:59               ` Ken Brown
  0 siblings, 0 replies; 27+ messages in thread
From: Ken Brown @ 2021-04-06 14:59 UTC (permalink / raw)
  To: Noel Grandin, sten.kristian.ivarsson, cygwin

On 4/6/2021 3:52 AM, Noel Grandin wrote:
> 
> 
> On 2021/04/01 6:02 pm, Ken Brown via Cygwin wrote:
>> Here's the issue, briefly.  The communication is done via a Windows named 
>> pipe. The receiver creates the pipe when it creates and binds its socket.  It 
>> creates only one pipe instance.  The sender connects to the pipe, writes, and 
>> closes its handle.  But the pipe is not available for another sender to 
>> connect to until the receiver reads the message, after which it disconnects 
>> the sender.
>>
> 
> This
> 
>    https://docs.microsoft.com/en-us/windows/win32/ipc/named-pipe-instances
> 
> seems to indicate that multiple pipe instances are needed to handle multiple 
> clients nicely - it also has sample code for such.

Yes, we do that for stream sockets that are listening.  Whenever there's a 
connection, a new pipe instance is created so that the listening socket can 
continue listening.  But I don't see an easy way to adapt this to datagram 
sockets, and I'm not even sure it's appropriate in that case.

Ken

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: AF_UNIX/SOCK_DGRAM is dropping messages
  2021-04-06 14:50             ` sten.kristian.ivarsson
@ 2021-04-06 15:24               ` Ken Brown
  2021-04-07 14:56               ` Ken Brown
  1 sibling, 0 replies; 27+ messages in thread
From: Ken Brown @ 2021-04-06 15:24 UTC (permalink / raw)
  To: sten.kristian.ivarsson, cygwin

On 4/6/2021 10:50 AM, sten.kristian.ivarsson@gmail.com wrote:
> 
>>>>>>>> Using AF_UNIX/SOCK_DGRAM with current version (3.2.0) seems to
>>>>>>>> drop messages or at least they are not received in the same order
>>>>>>>> they are  sent
>>>>
>>>> [snip]
>>>>
>>>>> Thanks for the test case.  I can confirm the problem.  I'm not
>>>>> familiar enough with the current AF_UNIX implementation to debug
>>>>> this easily.  I'd rather spend my time on the new implementation (on
>>>>> the topic/af_unix branch).  It turns out that your test case fails
>>>>> there too, but in a completely different way, due to a bug in sendto
>>>>> for datagrams.  I'll see if I can fix that bug and then try again.
>>>>>
>>>>> Ken
>>>>
>>>> Ok, too bad it wasn't our own code base but good that the "mystery"
>>>> is verified
>>>>
>>>> I finally succeed to build topic/af_unix (after finding out what
>>>> version of zlib was needed), but not with -D__WITH_AF_UNIX to
>>>> CXXFLAGS though and thus I haven’t tested it yet
>>>>
>>>> Is it sufficient to add the define to the "main" Makefile or do you
>>>> have to add it to all the Makefile:s ? I guess I can find out though
>>>
>>> I do it on the configure line, like this:
>>>
>>>    ../af_unix/configure CXXFLAGS="-g -O0 -D__WITH_AF_UNIX" --prefix=...
>>>
>>>> Is topic/af_unix fairly up to date with master branch ?
>>>
>>> Yes, I periodically cherry-pick commits from master to topic/af_unix.
>>> I'll do that again right now.
>>>
>>>> Either way, I'll be glad to help out testing topic/af_unix
>>>
>>> Thanks!
>>
>> I've now pushed a fix for that sendto bug, and your test case runs without
>> error on the topic/af_unix branch.
> 
> It seems like the test-case do work now with topic/af_unix in blocking mode, but when using non-blocking (with MSG_DONTWAIT) there are some issues I think
> 
> 1. When the queue is empty with non-blocking recv(), errno is set to EPIPE but I think it should be EAGAIN (or maybe the pipe is getting broken for real of some reason ?)
> 
> 2. When using non-blocking recv() and no message is written at all, it seems like recv() blocks forever
> 
> 3. Using non-blocking recv() where the "client" does send less than "count" messages, sometimes recv() blocks forever (as well)
> 
> 
> My naïve analysis of this is that for the first issue (if any) the wrong errno is set and for the second issue it blocks if no sendto() is done after the first recv(), i.e. nothing kicks the "reader thread" in the butt to realise the queue is empty. It is not super clear though what POSIX says about creating blocking descriptors and then using non-blocking-flags with recv(), but this works in Linux any way
> 
> Let me know if I should provide more a specific explanation, but I think minor modifications of the test-case can provoke all behaviours. I think 2 and 3 are of the same reason though (as described above)

Thanks, I'll take a look.

Ken

> 
> 
>> By the way, I think the implementation of sendto/recv for datagrams is very
>> inefficient when there are repeated calls to sendto as in your test case.
>> Nevertheless, your test case actually runs slightly faster on the topic/af_unix
>> branch than it does on master (when the latter succeeds, which it does about
>> half the time for me).  So I'm not sure whether it's worth worrying about this.
> 
> Of course we would like the best throughput possible 😉
> 
>> Here's the issue, briefly.  The communication is done via a Windows named
>> pipe.
>>    The receiver creates the pipe when it creates and binds its socket.  It creates
>> only one pipe instance.  The sender connects to the pipe, writes, and closes its
>> handle.  But the pipe is not available for another sender to connect to until the
>> receiver reads the message, after which it disconnects the sender.
> 
> Ok, in our application we will use long lived descriptors and multiple writers that possible send large business messages (chunked into some smaller pieces per sendto()/recv())
> 
>> Ken[Kristian]
> 
> Best regards,
> Kristian
> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: AF_UNIX/SOCK_DGRAM is dropping messages
  2021-04-06 14:50             ` sten.kristian.ivarsson
  2021-04-06 15:24               ` Ken Brown
@ 2021-04-07 14:56               ` Ken Brown
  2021-04-08  8:37                 ` sten.kristian.ivarsson
  2021-04-13 14:06                 ` sten.kristian.ivarsson
  1 sibling, 2 replies; 27+ messages in thread
From: Ken Brown @ 2021-04-07 14:56 UTC (permalink / raw)
  To: sten.kristian.ivarsson, cygwin

On 4/6/2021 10:50 AM, sten.kristian.ivarsson@gmail.com wrote:
> 
>>>>>>>> Using AF_UNIX/SOCK_DGRAM with current version (3.2.0) seems to
>>>>>>>> drop messages or at least they are not received in the same order
>>>>>>>> they are  sent
>>>>
>>>> [snip]
>>>>
>>>>> Thanks for the test case.  I can confirm the problem.  I'm not
>>>>> familiar enough with the current AF_UNIX implementation to debug
>>>>> this easily.  I'd rather spend my time on the new implementation (on
>>>>> the topic/af_unix branch).  It turns out that your test case fails
>>>>> there too, but in a completely different way, due to a bug in sendto
>>>>> for datagrams.  I'll see if I can fix that bug and then try again.
>>>>>
>>>>> Ken
>>>>
>>>> Ok, too bad it wasn't our own code base but good that the "mystery"
>>>> is verified
>>>>
>>>> I finally succeed to build topic/af_unix (after finding out what
>>>> version of zlib was needed), but not with -D__WITH_AF_UNIX to
>>>> CXXFLAGS though and thus I haven’t tested it yet
>>>>
>>>> Is it sufficient to add the define to the "main" Makefile or do you
>>>> have to add it to all the Makefile:s ? I guess I can find out though
>>>
>>> I do it on the configure line, like this:
>>>
>>>    ../af_unix/configure CXXFLAGS="-g -O0 -D__WITH_AF_UNIX" --prefix=...
>>>
>>>> Is topic/af_unix fairly up to date with master branch ?
>>>
>>> Yes, I periodically cherry-pick commits from master to topic/af_unix.
>>> I'll do that again right now.
>>>
>>>> Either way, I'll be glad to help out testing topic/af_unix
>>>
>>> Thanks!
>>
>> I've now pushed a fix for that sendto bug, and your test case runs without
>> error on the topic/af_unix branch.
> 
> It seems like the test-case do work now with topic/af_unix in blocking mode, but when using non-blocking (with MSG_DONTWAIT) there are some issues I think
> 
> 1. When the queue is empty with non-blocking recv(), errno is set to EPIPE but I think it should be EAGAIN (or maybe the pipe is getting broken for real of some reason ?)
> 
> 2. When using non-blocking recv() and no message is written at all, it seems like recv() blocks forever
> 
> 3. Using non-blocking recv() where the "client" does send less than "count" messages, sometimes recv() blocks forever (as well)
> 
> 
> My naïve analysis of this is that for the first issue (if any) the wrong errno is set and for the second issue it blocks if no sendto() is done after the first recv(), i.e. nothing kicks the "reader thread" in the butt to realise the queue is empty. It is not super clear though what POSIX says about creating blocking descriptors and then using non-blocking-flags with recv(), but this works in Linux any way

The explanation is actually much simpler.  In the recv code where a bound 
datagram socket waits for a remote socket to connect to the pipe, I simply 
forgot to handle MSG_DONTWAIT.  I've pushed a fix.  Please retest.
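
For reference, here is a minimal sketch of how a receiver can tell an empty queue apart from a real failure when calling recv() with MSG_DONTWAIT.  It assumes POSIX semantics as seen on Linux; the names recv_status and recv_datagram_nowait are hypothetical and are not part of the test case or of Cygwin.

```cpp
#include <sys/socket.h>
#include <sys/types.h>

#include <cassert>
#include <cerrno>
#include <cstddef>

// Hypothetical helper type, introduced only for this sketch.
enum class recv_status { message, empty, error };

// Try to read one datagram without blocking.
// On recv_status::message, `length` holds the number of bytes received.
recv_status recv_datagram_nowait( int fd, char* buffer, std::size_t capacity, std::size_t& length)
{
    assert( buffer != nullptr && capacity > 0);

    const ssize_t result = recv( fd, buffer, capacity, MSG_DONTWAIT);

    if( result >= 0)
    {
        length = static_cast< std::size_t>( result);
        return recv_status::message;
    }

    // An empty queue is not a failure; POSIX allows either errno value.
    if( errno == EAGAIN || errno == EWOULDBLOCK)
        return recv_status::empty;

    // Anything else (EPIPE, for example) is a genuine error.
    return recv_status::error;
}
```

A caller would typically retry later, or wait with poll(), when it gets recv_status::empty, and give up only on recv_status::error.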

I should add that in all my work so far on the topic/af_unix branch, I've 
thought mainly about stream sockets.  So there may still be things remaining to 
be implemented for the datagram case.

Ken

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: AF_UNIX/SOCK_DGRAM is dropping messages
  2021-04-07 14:56               ` Ken Brown
@ 2021-04-08  8:37                 ` sten.kristian.ivarsson
  2021-04-08 19:47                   ` sten.kristian.ivarsson
  2021-04-13 14:06                 ` sten.kristian.ivarsson
  1 sibling, 1 reply; 27+ messages in thread
From: sten.kristian.ivarsson @ 2021-04-08  8:37 UTC (permalink / raw)
  To: 'Ken Brown', cygwin

[-- Attachment #1: Type: text/plain, Size: 4081 bytes --]

> >>>>>>>> Using AF_UNIX/SOCK_DGRAM with current version (3.2.0) seems
> to
> >>>>>>>> drop messages or at least they are not received in the same
> >>>>>>>> order they are  sent
> >>>>
> >>>> [snip]
> >>>>
> >>>>> Thanks for the test case.  I can confirm the problem.  I'm not
> >>>>> familiar enough with the current AF_UNIX implementation to debug
> >>>>> this easily.  I'd rather spend my time on the new implementation
> >>>>> (on the topic/af_unix branch).  It turns out that your test case
> >>>>> fails there too, but in a completely different way, due to a bug
> >>>>> in sendto for datagrams.  I'll see if I can fix that bug and then try again.
> >>>>>
> >>>>> Ken
> >>>>
> >>>> Ok, too bad it wasn't our own code base but good that the "mystery"
> >>>> is verified
> >>>>
> >>>> I finally succeed to build topic/af_unix (after finding out what
> >>>> version of zlib was needed), but not with -D__WITH_AF_UNIX to
> >>>> CXXFLAGS though and thus I haven’t tested it yet
> >>>>
> >>>> Is it sufficient to add the define to the "main" Makefile or do you
> >>>> have to add it to all the Makefile:s ? I guess I can find out
> >>>> though
> >>>
> >>> I do it on the configure line, like this:
> >>>
> >>>    ../af_unix/configure CXXFLAGS="-g -O0 -D__WITH_AF_UNIX" --prefix=...
> >>>
> >>>> Is topic/af_unix fairly up to date with master branch ?
> >>>
> >>> Yes, I periodically cherry-pick commits from master to topic/af_unix.
> >>> I'll do that again right now.
> >>>
> >>>> Either way, I'll be glad to help out testing topic/af_unix
> >>>
> >>> Thanks!
> >>
> >> I've now pushed a fix for that sendto bug, and your test case runs
> >> without error on the topic/af_unix branch.
> >
> > It seems like the test-case do work now with topic/af_unix in blocking
> > mode, but when using non-blocking (with MSG_DONTWAIT) there are
> some
> > issues I think
> >
> > 1. When the queue is empty with non-blocking recv(), errno is set to
> > EPIPE but I think it should be EAGAIN (or maybe the pipe is getting
> > broken for real of some reason ?)
> >
> > 2. When using non-blocking recv() and no message is written at all, it
> > seems like recv() blocks forever
> >
> > 3. Using non-blocking recv() where the "client" does send less than
> > "count" messages, sometimes recv() blocks forever (as well)
> >
> >
> > My naïve analysis of this is that for the first issue (if any) the
> > wrong errno is set and for the second issue it blocks if no sendto()
> > is done after the first recv(), i.e. nothing kicks the "reader thread"
> > in the butt to realise the queue is empty. It is not super clear
> > though what POSIX says about creating blocking descriptors and then
> > using non-blocking-flags with recv(), but this works in Linux any way
> 
> The explanation is actually much simpler.  In the recv code where a bound
> datagram socket waits for a remote socket to connect to the pipe, I simply
> forget to handle MSG_DONTWAIT.  I've pushed a fix.  Please retest.

I tested it and now we seem to get EAGAIN when there's no message on the queue, but it seems like the client is blocked as well and cannot write any more messages until the previous one is consumed by the server, so the af_unix.cpp test client ends prematurely

If sendto() is used with MSG_DONTWAIT as well, it gets EAGAIN, but the socket itself is not a non-blocking socket; it is just the recv() that is done in a non-blocking fashion

As I said earlier, it's a bit fuzzy (at least to me) what POSIX means by blocking/non-blocking descriptors combined with blocking/non-blocking operations, but as far as I understand, it should be possible to use blocking sendto() and have messages written (as long as some buffer is not full) while someone else is doing non-blocking recv()
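
To illustrate the behaviour I would expect, here is a minimal sketch, assuming Linux-like semantics where the receive queue of a datagram socket can hold more than one message. The helper names send_burst and drain_in_order are made up for this example, and it uses a connected pair of descriptors rather than the bind()/sendto() setup of the attached test case.

```cpp
#include <sys/socket.h>
#include <sys/types.h>

#include <cassert>

// Hypothetical helper: queue `count` small datagrams on `fd` before the
// peer reads anything. Returns how many sends succeeded.
int send_burst( int fd, int count)
{
    assert( count >= 0);

    int sent = 0;
    for( int idx = 0; idx < count; ++idx)
    {
        // MSG_DONTWAIT is used only so the sketch cannot hang if the queue is full
        if( send( fd, &idx, sizeof idx, MSG_DONTWAIT) != static_cast< ssize_t>( sizeof idx))
            break;
        ++sent;
    }
    return sent;
}

// Hypothetical helper: read `count` datagrams without blocking and check
// that they arrive in the order 0, 1, 2, ...
bool drain_in_order( int fd, int count)
{
    for( int idx = 0; idx < count; ++idx)
    {
        int index = -1;
        if( recv( fd, &index, sizeof index, MSG_DONTWAIT) != static_cast< ssize_t>( sizeof index))
            return false;
        if( index != idx)
            return false;
    }
    return true;
}
```

With descriptors from socketpair( AF_UNIX, SOCK_DGRAM, 0, fds), I would expect send_burst( fds[1], 4) to succeed for all four messages before drain_in_order( fds[0], 4) is called, at least on Linux.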

What is your take on this ?

> I should add that in all my work so far on the topic/af_unix branch, I've
> thought mainly about stream sockets.  So there may still be things remaining
> to be implemented for the datagram case.
> 
> Ken

[-- Attachment #2: af_unix.cpp --]
[-- Type: text/plain, Size: 2737 bytes --]

#include <sys/socket.h>
#include <sys/un.h>

#undef AF_UNIX
#define AF_UNIX 31

#include <unistd.h>

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include <thread>
#include <chrono>


// $ g++ --std=gnu++17 af_unix.cpp

const char* const path = "address";
const int count = 10;
const int size = BUFSIZ * 1;

int client()
{
    const int fd = socket( AF_UNIX, SOCK_DGRAM, 0);

    if( fd == -1)
    {
        perror( "socket error");
        return -1;
    }

    struct sockaddr_un address{};

    strcpy( address.sun_path, path);
    address.sun_family = AF_UNIX;

    char buffer[size] = {};

    for( int idx = 0; idx < count; ++idx)
    {
        memcpy( buffer, &idx, sizeof idx);

        fprintf( stdout, "client idx: %d\n", idx);
    
        const ssize_t result = sendto( fd, buffer, size, 0, (struct sockaddr*)&address, sizeof address);

        // Assume the whole chunk can be sent
        if( result != size)
        {
            perror( "sendto error");
            return -1;
        }
    }

    close( fd);
    return 0;
}

int server()
{
    const int fd = socket( AF_UNIX, SOCK_DGRAM, 0);

    if( fd == -1)
    {
        perror( "socket error");
        return -1;
    }

    struct sockaddr_un address{};

    strcpy( address.sun_path, path);
    address.sun_family = AF_UNIX;

    const int result = bind( fd, (struct sockaddr*)&address, sizeof address);

    if( result == -1)
    {
        perror( "bind error");
        return -1;
    }

    return fd;
}

int main( int argc, char* argv[])
{
    const int fd = server( );

    if( fd != -1)
    {
        fprintf( stdout, "%d\tnumber of packages\n", count);
        fprintf( stdout, "%d\tbytes per package\n", size);

        std::thread{ [&](){client( );}}.detach();

        std::this_thread::sleep_for( std::chrono::microseconds( 250));
    
        char buffer[size] = {};

        for( int idx = 0; idx < count; ++idx)
        {
            fprintf( stdout, "server idx: %d\n", idx);

            const ssize_t result = recv( fd, buffer, size, MSG_DONTWAIT);

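            // With MSG_DONTWAIT an empty queue is reported as -1 (errno EAGAIN/EWOULDBLOCK)
            if( result == -1)
            {
                perror("recv error");
                unlink( path);
                return -1;
            }

            // Assume the whole chunk can be read; errno is not meaningful for a short read
            if( result != size)
            {
                fprintf( stderr, "short read: %zd of %d bytes\n", result, size);
                unlink( path);
                return -1;
            }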

            int index = 0;
            memcpy( &index, buffer, sizeof idx);

            if( index != idx)
            {
                fprintf( stderr, "expected %d but got %d\n", idx, index);
                unlink( path);
                return -1;
            }
        }

        close( fd);
        unlink( path);
    }

    return 0;
}

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: AF_UNIX/SOCK_DGRAM is dropping messages
  2021-04-08  8:37                 ` sten.kristian.ivarsson
@ 2021-04-08 19:47                   ` sten.kristian.ivarsson
  2021-04-08 21:02                     ` Ken Brown
  0 siblings, 1 reply; 27+ messages in thread
From: sten.kristian.ivarsson @ 2021-04-08 19:47 UTC (permalink / raw)
  To: 'Ken Brown', cygwin

> > >>>>>>>> Using AF_UNIX/SOCK_DGRAM with current version (3.2.0) seems
> > to
> > >>>>>>>> drop messages or at least they are not received in the same
> > >>>>>>>> order they are  sent
> > >>>>
> > >>>> [snip]
> > >>>>
> > >>>>> Thanks for the test case.  I can confirm the problem.  I'm not
> > >>>>> familiar enough with the current AF_UNIX implementation to debug
> > >>>>> this easily.  I'd rather spend my time on the new implementation
> > >>>>> (on the topic/af_unix branch).  It turns out that your test case
> > >>>>> fails there too, but in a completely different way, due to a bug
> > >>>>> in sendto for datagrams.  I'll see if I can fix that bug and then try
> again.
> > >>>>>
> > >>>>> Ken
> > >>>>
> > >>>> Ok, too bad it wasn't our own code base but good that the "mystery"
> > >>>> is verified
> > >>>>
> > >>>> I finally succeed to build topic/af_unix (after finding out what
> > >>>> version of zlib was needed), but not with -D__WITH_AF_UNIX to
> > >>>> CXXFLAGS though and thus I haven’t tested it yet
> > >>>>
> > >>>> Is it sufficient to add the define to the "main" Makefile or do
> > >>>> you have to add it to all the Makefile:s ? I guess I can find out
> > >>>> though
> > >>>
> > >>> I do it on the configure line, like this:
> > >>>
> > >>>    ../af_unix/configure CXXFLAGS="-g -O0 -D__WITH_AF_UNIX" --prefix=...
> > >>>
> > >>>> Is topic/af_unix fairly up to date with master branch ?
> > >>>
> > >>> Yes, I periodically cherry-pick commits from master to topic/af_unix.
> > >>> I'll do that again right now.
> > >>>
> > >>>> Either way, I'll be glad to help out testing topic/af_unix
> > >>>
> > >>> Thanks!
> > >>
> > >> I've now pushed a fix for that sendto bug, and your test case runs
> > >> without error on the topic/af_unix branch.
> > >
> > > It seems like the test-case do work now with topic/af_unix in
> > > blocking mode, but when using non-blocking (with MSG_DONTWAIT) there
> > > are
> > some
> > > issues I think
> > >
> > > 1. When the queue is empty with non-blocking recv(), errno is set to
> > > EPIPE but I think it should be EAGAIN (or maybe the pipe is getting
> > > broken for real of some reason ?)
> > >
> > > 2. When using non-blocking recv() and no message is written at all,
> > > it seems like recv() blocks forever
> > >
> > > 3. Using non-blocking recv() where the "client" does send less than
> > > "count" messages, sometimes recv() blocks forever (as well)
> > >
> > >
> > > My naïve analysis of this is that for the first issue (if any) the
> > > wrong errno is set and for the second issue it blocks if no sendto()
> > > is done after the first recv(), i.e. nothing kicks the "reader thread"
> > > in the butt to realise the queue is empty. It is not super clear
> > > though what POSIX says about creating blocking descriptors and then
> > > using non-blocking-flags with recv(), but this works in Linux any
> > > way
> >
> > The explanation is actually much simpler.  In the recv code where a
> > bound datagram socket waits for a remote socket to connect to the
> > pipe, I simply forget to handle MSG_DONTWAIT.  I've pushed a fix.  Please
> retest.
> 
> I tested it and now it seems like we get EAGAIN when there's no msg on the
> queue, but it seems like the client is blocked as well and that it cannot write
> any more messages until it is consumed by the server, so the af_unix.cpp test-
> client end prematurely
> 
> If using sendto() with MSG_DONTWAIT as well, that is getting a EAGAIN, but
> the socket in it self is not a non-blocking socket, it is just the recv() that is done
> in a non-blocking fashion
> 
> As I said earlier, it's a bit fuzzy (or at least for me) what POSIX mean by
> non/blocking descriptors combined with non/blocking operations, but as far
> as I understand, it should be possible to use blocking sendto()and messages
> should be written (as long as some buffer is not filled) at the same time
> someone is doing non-blocking recv()
> 
> What is your take on this ?

I thought about this again and came to the conclusion that the fix is probably semantically correct

It was just me who didn't realise that only one message can be on the queue at a time, even in blocking mode

The problem is not functional but merely a performance issue, which I guess you have already realised; you mentioned it in a previous message, but I thought it was about some other issue


So, the fix seems to work (I haven't done any more tests than with the sample program), but from a throughput perspective it would be a good idea to let more messages be written to the queue before the first one is consumed (I guess you already have some thoughts about this?)

Keep up the good work,
Kristian


> > I should add that in all my work so far on the topic/af_unix branch,
> > I've thought mainly about stream sockets.  So there may still be
> > things remaining to be implemented for the datagram case.
> >
> > Ken


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: AF_UNIX/SOCK_DGRAM is dropping messages
  2021-04-08 19:47                   ` sten.kristian.ivarsson
@ 2021-04-08 21:02                     ` Ken Brown
  0 siblings, 0 replies; 27+ messages in thread
From: Ken Brown @ 2021-04-08 21:02 UTC (permalink / raw)
  To: sten.kristian.ivarsson, cygwin

On 4/8/2021 3:47 PM, sten.kristian.ivarsson@gmail.com wrote:
>>>>>>>>>>> Using AF_UNIX/SOCK_DGRAM with current version (3.2.0) seems
>>> to
>>>>>>>>>>> drop messages or at least they are not received in the same
>>>>>>>>>>> order they are  sent
>>>>>>>
>>>>>>> [snip]
>>>>>>>
>>>>>>>> Thanks for the test case.  I can confirm the problem.  I'm not
>>>>>>>> familiar enough with the current AF_UNIX implementation to debug
>>>>>>>> this easily.  I'd rather spend my time on the new implementation
>>>>>>>> (on the topic/af_unix branch).  It turns out that your test case
>>>>>>>> fails there too, but in a completely different way, due to a bug
>>>>>>>> in sendto for datagrams.  I'll see if I can fix that bug and then try
>> again.
>>>>>>>>
>>>>>>>> Ken
>>>>>>>
>>>>>>> Ok, too bad it wasn't our own code base but good that the "mystery"
>>>>>>> is verified
>>>>>>>
>>>>>>> I finally succeed to build topic/af_unix (after finding out what
>>>>>>> version of zlib was needed), but not with -D__WITH_AF_UNIX to
>>>>>>> CXXFLAGS though and thus I haven’t tested it yet
>>>>>>>
>>>>>>> Is it sufficient to add the define to the "main" Makefile or do
>>>>>>> you have to add it to all the Makefile:s ? I guess I can find out
>>>>>>> though
>>>>>>
>>>>>> I do it on the configure line, like this:
>>>>>>
>>>>>>     ../af_unix/configure CXXFLAGS="-g -O0 -D__WITH_AF_UNIX" --prefix=...
>>>>>>
>>>>>>> Is topic/af_unix fairly up to date with master branch ?
>>>>>>
>>>>>> Yes, I periodically cherry-pick commits from master to topic/af_unix.
>>>>>> I'll do that again right now.
>>>>>>
>>>>>>> Either way, I'll be glad to help out testing topic/af_unix
>>>>>>
>>>>>> Thanks!
>>>>>
>>>>> I've now pushed a fix for that sendto bug, and your test case runs
>>>>> without error on the topic/af_unix branch.
>>>>
>>>> It seems like the test-case do work now with topic/af_unix in
>>>> blocking mode, but when using non-blocking (with MSG_DONTWAIT) there
>>>> are
>>> some
>>>> issues I think
>>>>
>>>> 1. When the queue is empty with non-blocking recv(), errno is set to
>>>> EPIPE but I think it should be EAGAIN (or maybe the pipe is getting
>>>> broken for real of some reason ?)
>>>>
>>>> 2. When using non-blocking recv() and no message is written at all,
>>>> it seems like recv() blocks forever
>>>>
>>>> 3. Using non-blocking recv() where the "client" does send less than
>>>> "count" messages, sometimes recv() blocks forever (as well)
>>>>
>>>>
>>>> My naïve analysis of this is that for the first issue (if any) the
>>>> wrong errno is set and for the second issue it blocks if no sendto()
>>>> is done after the first recv(), i.e. nothing kicks the "reader thread"
>>>> in the butt to realise the queue is empty. It is not super clear
>>>> though what POSIX says about creating blocking descriptors and then
>>>> using non-blocking-flags with recv(), but this works in Linux any
>>>> way
>>>
>>> The explanation is actually much simpler.  In the recv code where a
>>> bound datagram socket waits for a remote socket to connect to the
>>> pipe, I simply forget to handle MSG_DONTWAIT.  I've pushed a fix.  Please
>> retest.
>>
>> I tested it and now it seems like we get EAGAIN when there's no msg on the
>> queue, but it seems like the client is blocked as well and that it cannot write
>> any more messages until it is consumed by the server, so the af_unix.cpp test-
>> client end prematurely
>>
>> If using sendto() with MSG_DONTWAIT as well, that is getting a EAGAIN, but
>> the socket in it self is not a non-blocking socket, it is just the recv() that is done
>> in a non-blocking fashion
>>
>> As I said earlier, it's a bit fuzzy (or at least for me) what POSIX mean by
>> non/blocking descriptors combined with non/blocking operations, but as far
>> as I understand, it should be possible to use blocking sendto()and messages
>> should be written (as long as some buffer is not filled) at the same time
>> someone is doing non-blocking recv()
>>
>> What is your take on this ?
> 
> I was thinking of this again and came to the conclusion that the fix semantically probably works ok
> 
> It was just me that didn't realise that only one message can be on the queue simultaneously even in blocking mode
> 
> The problem is not functional but merely a performance hog, that I guess you have already realised and you mentioned it in previous message but I guess I thought it was about some other issue
> 
> 
> So, I guess the fix works ok (I haven't done any more tests than with the sample program), but I guess out of an throughput aspect I guess it would be a good idea to let more messages be written to the queue before the first is consumed or so (I guess you already have some thoughts about this?)

I have some thoughts, but nothing definitive yet.  I'll keep thinking.

Ken

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: AF_UNIX/SOCK_DGRAM is dropping messages
  2021-04-07 14:56               ` Ken Brown
  2021-04-08  8:37                 ` sten.kristian.ivarsson
@ 2021-04-13 14:06                 ` sten.kristian.ivarsson
  2021-04-13 14:47                   ` Ken Brown
  1 sibling, 1 reply; 27+ messages in thread
From: sten.kristian.ivarsson @ 2021-04-13 14:06 UTC (permalink / raw)
  To: 'Ken Brown', cygwin

Hi Ken

> >>>>>>>> Using AF_UNIX/SOCK_DGRAM with current version (3.2.0) seems
> to
> >>>>>>>> drop messages or at least they are not received in the same
> >>>>>>>> order they are  sent
> >>>>
> >>>> [snip]
> >>>>
> >>>>> Thanks for the test case.  I can confirm the problem.  I'm not
> >>>>> familiar enough with the current AF_UNIX implementation to debug
> >>>>> this easily.  I'd rather spend my time on the new implementation
> >>>>> (on the topic/af_unix branch).  It turns out that your test case
> >>>>> fails there too, but in a completely different way, due to a bug
> >>>>> in sendto for datagrams.  I'll see if I can fix that bug and then try again.
> >>>>>
> >>>>> Ken
> >>>>
> >>>> Ok, too bad it wasn't our own code base but good that the "mystery"
> >>>> is verified
> >>>>
> >>>> I finally succeed to build topic/af_unix (after finding out what
> >>>> version of zlib was needed), but not with -D__WITH_AF_UNIX to
> >>>> CXXFLAGS though and thus I haven’t tested it yet
> >>>>
> >>>> Is it sufficient to add the define to the "main" Makefile or do you
> >>>> have to add it to all the Makefile:s ? I guess I can find out
> >>>> though
> >>>
> >>> I do it on the configure line, like this:
> >>>
> >>>    ../af_unix/configure CXXFLAGS="-g -O0 -D__WITH_AF_UNIX" --prefix=...
> >>>
> >>>> Is topic/af_unix fairly up to date with master branch ?
> >>>
> >>> Yes, I periodically cherry-pick commits from master to topic/af_unix.
> >>> I'll do that again right now.
> >>>
> >>>> Either way, I'll be glad to help out testing topic/af_unix
> >>>
> >>> Thanks!
> >>
> >> I've now pushed a fix for that sendto bug, and your test case runs
> >> without error on the topic/af_unix branch.
> >
> > It seems like the test-case do work now with topic/af_unix in blocking
> > mode, but when using non-blocking (with MSG_DONTWAIT) there are
> some
> > issues I think
> >
> > 1. When the queue is empty with non-blocking recv(), errno is set to
> > EPIPE but I think it should be EAGAIN (or maybe the pipe is getting
> > broken for real of some reason ?)
> >
> > 2. When using non-blocking recv() and no message is written at all, it
> > seems like recv() blocks forever
> >
> > 3. Using non-blocking recv() where the "client" does send less than
> > "count" messages, sometimes recv() blocks forever (as well)
> >
> >
> > My naïve analysis of this is that for the first issue (if any) the
> > wrong errno is set and for the second issue it blocks if no sendto()
> > is done after the first recv(), i.e. nothing kicks the "reader thread"
> > in the butt to realise the queue is empty. It is not super clear
> > though what POSIX says about creating blocking descriptors and then
> > using non-blocking-flags with recv(), but this works in Linux any way
> 
> The explanation is actually much simpler.  In the recv code where a bound
> datagram socket waits for a remote socket to connect to the pipe, I simply
> forget to handle MSG_DONTWAIT.  I've pushed a fix.  Please retest.
> 
> I should add that in all my work so far on the topic/af_unix branch, I've
> thought mainly about stream sockets.  So there may still be things remaining
> to be implemented for the datagram case.

I finally got some time to test topic/af_unix in our "real" Cygwin application (casual) and unfortunately very few of our unit tests pass

The symptoms are unexpected eternal blocking, sometimes an unexpected EADDRNOTAVAIL, and sometimes what looks like memory corruption (and core dumps)

Of course the memory corruption etc. could be our own fault and the core dumps might be due to uncaught exceptions

Needless to say, all unit tests pass on Linux, but of course topic/af_unix could be behaving according to the POSIX standard and the failures could be due to our own misinterpretation of how POSIX works


I will try to narrow down the quite complex logic and reproduce the problems

If you for some reason want to try it with casual, I'd be glad to help you out (it should be easier now than last time, but there might still be some Cygwin documentation missing)

https://bitbucket.org/casualcore/


Best regards,
Kristian

> Ken


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: AF_UNIX/SOCK_DGRAM is dropping messages
  2021-04-13 14:06                 ` sten.kristian.ivarsson
@ 2021-04-13 14:47                   ` Ken Brown
  2021-04-13 22:43                     ` Ken Brown
  0 siblings, 1 reply; 27+ messages in thread
From: Ken Brown @ 2021-04-13 14:47 UTC (permalink / raw)
  To: sten.kristian.ivarsson, cygwin

On 4/13/2021 10:06 AM, sten.kristian.ivarsson@gmail.com wrote:
> Hi Ken
> 
>>>>>>>>>> Using AF_UNIX/SOCK_DGRAM with current version (3.2.0) seems
>> to
>>>>>>>>>> drop messages or at least they are not received in the same
>>>>>>>>>> order they are  sent
>>>>>>
>>>>>> [snip]
>>>>>>
>>>>>>> Thanks for the test case.  I can confirm the problem.  I'm not
>>>>>>> familiar enough with the current AF_UNIX implementation to debug
>>>>>>> this easily.  I'd rather spend my time on the new implementation
>>>>>>> (on the topic/af_unix branch).  It turns out that your test case
>>>>>>> fails there too, but in a completely different way, due to a bug
>>>>>>> in sendto for datagrams.  I'll see if I can fix that bug and then try again.
>>>>>>>
>>>>>>> Ken
>>>>>>
>>>>>> Ok, too bad it wasn't our own code base but good that the "mystery"
>>>>>> is verified
>>>>>>
>>>>>> I finally succeed to build topic/af_unix (after finding out what
>>>>>> version of zlib was needed), but not with -D__WITH_AF_UNIX to
>>>>>> CXXFLAGS though and thus I haven’t tested it yet
>>>>>>
>>>>>> Is it sufficient to add the define to the "main" Makefile or do you
>>>>>> have to add it to all the Makefile:s ? I guess I can find out
>>>>>> though
>>>>>
>>>>> I do it on the configure line, like this:
>>>>>
>>>>>     ../af_unix/configure CXXFLAGS="-g -O0 -D__WITH_AF_UNIX" --
>> prefix=...
>>>>>
>>>>>> Is topic/af_unix fairly up to date with master branch ?
>>>>>
>>>>> Yes, I periodically cherry-pick commits from master to topic/af_unix.
>>>>> I'll do that again right now.
>>>>>
>>>>>> Either way, I'll be glad to help out testing topic/af_unix
>>>>>
>>>>> Thanks!
>>>>
>>>> I've now pushed a fix for that sendto bug, and your test case runs
>>>> without error on the topic/af_unix branch.
>>>
>>> It seems like the test-case do work now with topic/af_unix in blocking
>>> mode, but when using non-blocking (with MSG_DONTWAIT) there are
>> some
>>> issues I think
>>>
>>> 1. When the queue is empty with non-blocking recv(), errno is set to
>>> EPIPE but I think it should be EAGAIN (or maybe the pipe is getting
>>> broken for real of some reason ?)
>>>
>>> 2. When using non-blocking recv() and no message is written at all, it
>>> seems like recv() blocks forever
>>>
>>> 3. Using non-blocking recv() where the "client" does send less than
>>> "count" messages, sometimes recv() blocks forever (as well)
>>>
>>>
>>> My naïve analysis of this is that for the first issue (if any) the
>>> wrong errno is set and for the second issue it blocks if no sendto()
>>> is done after the first recv(), i.e. nothing kicks the "reader thread"
>>> in the butt to realise the queue is empty. It is not super clear
>>> though what POSIX says about creating blocking descriptors and then
>>> using non-blocking-flags with recv(), but this works in Linux any way
>>
>> The explanation is actually much simpler.  In the recv code where a bound
>> datagram socket waits for a remote socket to connect to the pipe, I simply
>> forget to handle MSG_DONTWAIT.  I've pushed a fix.  Please retest.
>>
>> I should add that in all my work so far on the topic/af_unix branch, I've
>> thought mainly about stream sockets.  So there may still be things remaining
>> to be implemented for the datagram case.
> 
> I finally got some time to test topic/af_unix in our "real" cygwin-application (casual) and unfortunately very few of our unittests pass
> 
> The symptoms are that there's unexpected eternal blocking, sometimes there's unexpected EADDRNOTAVAIL, sometimes it looks like some memory corruption (and core-dumps)
> 
> Of course the memory corruption etc could be our self and the core-dumps might be because of uncaught exceptions
> 
> Needles to say is that all unittests pass on Linux, but of course cygwin-topic/af_unix could act according to POSIX-standard and the behaviour could be due to our own misinterpretation of how POSIX works

More likely it's due to bugs in the topic/af_unix branch.  This is still very 
much a work in progress.

> I will try to narrow down the quite complex logic and reproduce the problems

That would be ideal.

> If you of some reason wanna try it with casual, I'd be glad to help you out (it should be easier now that last time (but there might be some documentation missing for Cygwin still))
> 
> https://bitbucket.org/casualcore/

I'm going on vacation in a few days, but I might do this when I get back.

Thanks for your testing.

Ken

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: AF_UNIX/SOCK_DGRAM is dropping messages
  2021-04-13 14:47                   ` Ken Brown
@ 2021-04-13 22:43                     ` Ken Brown
  2021-04-14 15:53                       ` Ken Brown
  2021-04-14 17:14                       ` sten.kristian.ivarsson
  0 siblings, 2 replies; 27+ messages in thread
From: Ken Brown @ 2021-04-13 22:43 UTC (permalink / raw)
  To: sten.kristian.ivarsson, cygwin

On 4/13/2021 10:47 AM, Ken Brown via Cygwin wrote:
> On 4/13/2021 10:06 AM, sten.kristian.ivarsson@gmail.com wrote:
>> Hi Ken
>>
>>>>>>>>>>> Using AF_UNIX/SOCK_DGRAM with current version (3.2.0) seems
>>> to
>>>>>>>>>>> drop messages or at least they are not received in the same
>>>>>>>>>>> order they are  sent
>>>>>>>
>>>>>>> [snip]
>>>>>>>
>>>>>>>> Thanks for the test case.  I can confirm the problem.  I'm not
>>>>>>>> familiar enough with the current AF_UNIX implementation to debug
>>>>>>>> this easily.  I'd rather spend my time on the new implementation
>>>>>>>> (on the topic/af_unix branch).  It turns out that your test case
>>>>>>>> fails there too, but in a completely different way, due to a bug
>>>>>>>> in sendto for datagrams.  I'll see if I can fix that bug and then try 
>>>>>>>> again.
>>>>>>>>
>>>>>>>> Ken
>>>>>>>
>>>>>>> Ok, too bad it wasn't our own code base but good that the "mystery"
>>>>>>> is verified
>>>>>>>
>>>>>>> I finally succeed to build topic/af_unix (after finding out what
>>>>>>> version of zlib was needed), but not with -D__WITH_AF_UNIX to
>>>>>>> CXXFLAGS though and thus I haven’t tested it yet
>>>>>>>
>>>>>>> Is it sufficient to add the define to the "main" Makefile or do you
>>>>>>> have to add it to all the Makefile:s ? I guess I can find out
>>>>>>> though
>>>>>>
>>>>>> I do it on the configure line, like this:
>>>>>>
>>>>>>     ../af_unix/configure CXXFLAGS="-g -O0 -D__WITH_AF_UNIX" --
>>> prefix=...
>>>>>>
>>>>>>> Is topic/af_unix fairly up to date with master branch ?
>>>>>>
>>>>>> Yes, I periodically cherry-pick commits from master to topic/af_unix.
>>>>>> I'll do that again right now.
>>>>>>
>>>>>>> Either way, I'll be glad to help out testing topic/af_unix
>>>>>>
>>>>>> Thanks!
>>>>>
>>>>> I've now pushed a fix for that sendto bug, and your test case runs
>>>>> without error on the topic/af_unix branch.
>>>>
>>>> It seems like the test-case do work now with topic/af_unix in blocking
>>>> mode, but when using non-blocking (with MSG_DONTWAIT) there are
>>> some
>>>> issues I think
>>>>
>>>> 1. When the queue is empty with non-blocking recv(), errno is set to
>>>> EPIPE but I think it should be EAGAIN (or maybe the pipe is getting
>>>> broken for real of some reason ?)
>>>>
>>>> 2. When using non-blocking recv() and no message is written at all, it
>>>> seems like recv() blocks forever
>>>>
>>>> 3. Using non-blocking recv() where the "client" does send less than
>>>> "count" messages, sometimes recv() blocks forever (as well)
>>>>
>>>>
>>>> My naïve analysis of this is that for the first issue (if any) the
>>>> wrong errno is set and for the second issue it blocks if no sendto()
>>>> is done after the first recv(), i.e. nothing kicks the "reader thread"
>>>> in the butt to realise the queue is empty. It is not super clear
>>>> though what POSIX says about creating blocking descriptors and then
>>>> using non-blocking-flags with recv(), but this works in Linux any way
>>>
>>> The explanation is actually much simpler.  In the recv code where a bound
>>> datagram socket waits for a remote socket to connect to the pipe, I simply
>>> forget to handle MSG_DONTWAIT.  I've pushed a fix.  Please retest.
>>>
>>> I should add that in all my work so far on the topic/af_unix branch, I've
>>> thought mainly about stream sockets.  So there may still be things remaining
>>> to be implemented for the datagram case.
>>
>> I finally got some time to test topic/af_unix in our "real" cygwin-application 
>> (casual) and unfortunately very few of our unittests pass
>>
>> The symptoms are that there's unexpected eternal blocking, sometimes there's 
>> unexpected EADDRNOTAVAIL, sometimes it looks like some memory corruption (and 
>> core-dumps)
>>
>> Of course the memory corruption etc could be our self and the core-dumps might 
>> be because of uncaught exceptions
>>
>> Needles to say is that all unittests pass on Linux, but of course 
>> cygwin-topic/af_unix could act according to POSIX-standard and the behaviour 
>> could be due to our own misinterpretation of how POSIX works
> 
> More likely it's due to bugs in the topic/af_unix branch.  This is still very 
> much a work in progress.
> 
>> I will try to narrow down the quite complex logic and reproduce the problems
> 
> That would be ideal.
> 
>> If you of some reason wanna try it with casual, I'd be glad to help you out 
>> (it should be easier now that last time (but there might be some documentation 
>> missing for Cygwin still))
>>
>> https://bitbucket.org/casualcore/
> 
> I'm going on vacation in a few days, but I might do this when I get back.
> 
> Thanks for your testing.

By the way, if your code is using datagram sockets, then there are very serious 
problems with our implementation (even aside from the performance issue that 
we've already discussed).  For example, I don't know of any reasonable way for 
select to test whether such a socket is ready for writing.  We'll need to solve 
that somehow.
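
For context, this is the application-side check that select would have to answer for such a socket.  A minimal sketch, assuming a POSIX system; `writable_within` is a hypothetical helper name used only for illustration, and nothing here reflects how Cygwin implements select internally:

```cpp
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

#include <cassert>

// Hypothetical helper: returns true if select() reports fd as ready for
// writing within timeout_ms milliseconds, false on timeout or error.
bool writable_within(int fd, int timeout_ms)
{
    // select() can only watch descriptors below FD_SETSIZE.
    assert(fd >= 0 && fd < FD_SETSIZE);

    fd_set writefds;
    FD_ZERO(&writefds);
    FD_SET(fd, &writefds);

    struct timeval tv;
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    // Only the write set is passed; read and exception sets are unused.
    const int ready = select(fd + 1, nullptr, &writefds, nullptr, &tv);
    return ready > 0 && FD_ISSET(fd, &writefds);
}
```

An application would typically call something like this before sendto() and back off when it returns false; the open question is what the implementation can truthfully report for a datagram socket.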

Ken

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: AF_UNIX/SOCK_DGRAM is dropping messages
  2021-04-13 22:43                     ` Ken Brown
@ 2021-04-14 15:53                       ` Ken Brown
  2021-04-14 17:14                       ` sten.kristian.ivarsson
  1 sibling, 0 replies; 27+ messages in thread
From: Ken Brown @ 2021-04-14 15:53 UTC (permalink / raw)
  To: cygwin

On 4/13/2021 6:43 PM, Ken Brown via Cygwin wrote:
> On 4/13/2021 10:47 AM, Ken Brown via Cygwin wrote:
>> On 4/13/2021 10:06 AM, sten.kristian.ivarsson@gmail.com wrote:
>>> Hi Ken
>>>
>>>>>>>>>>>> Using AF_UNIX/SOCK_DGRAM with current version (3.2.0) seems
>>>> to
>>>>>>>>>>>> drop messages or at least they are not received in the same
>>>>>>>>>>>> order they are  sent
>>>>>>>>
>>>>>>>> [snip]
>>>>>>>>
>>>>>>>>> Thanks for the test case.  I can confirm the problem.  I'm not
>>>>>>>>> familiar enough with the current AF_UNIX implementation to debug
>>>>>>>>> this easily.  I'd rather spend my time on the new implementation
>>>>>>>>> (on the topic/af_unix branch).  It turns out that your test case
>>>>>>>>> fails there too, but in a completely different way, due to a bug
>>>>>>>>> in sendto for datagrams.  I'll see if I can fix that bug and then try 
>>>>>>>>> again.
>>>>>>>>>
>>>>>>>>> Ken
>>>>>>>>
>>>>>>>> Ok, too bad it wasn't our own code base but good that the "mystery"
>>>>>>>> is verified
>>>>>>>>
>>>>>>>> I finally succeed to build topic/af_unix (after finding out what
>>>>>>>> version of zlib was needed), but not with -D__WITH_AF_UNIX to
>>>>>>>> CXXFLAGS though and thus I haven’t tested it yet
>>>>>>>>
>>>>>>>> Is it sufficient to add the define to the "main" Makefile or do you
>>>>>>>> have to add it to all the Makefile:s ? I guess I can find out
>>>>>>>> though
>>>>>>>
>>>>>>> I do it on the configure line, like this:
>>>>>>>
>>>>>>>     ../af_unix/configure CXXFLAGS="-g -O0 -D__WITH_AF_UNIX" --
>>>> prefix=...
>>>>>>>
>>>>>>>> Is topic/af_unix fairly up to date with master branch ?
>>>>>>>
>>>>>>> Yes, I periodically cherry-pick commits from master to topic/af_unix.
>>>>>>> I'll do that again right now.
>>>>>>>
>>>>>>>> Either way, I'll be glad to help out testing topic/af_unix
>>>>>>>
>>>>>>> Thanks!
>>>>>>
>>>>>> I've now pushed a fix for that sendto bug, and your test case runs
>>>>>> without error on the topic/af_unix branch.
>>>>>
>>>>> It seems like the test-case do work now with topic/af_unix in blocking
>>>>> mode, but when using non-blocking (with MSG_DONTWAIT) there are
>>>> some
>>>>> issues I think
>>>>>
>>>>> 1. When the queue is empty with non-blocking recv(), errno is set to
>>>>> EPIPE but I think it should be EAGAIN (or maybe the pipe is getting
>>>>> broken for real of some reason ?)
>>>>>
>>>>> 2. When using non-blocking recv() and no message is written at all, it
>>>>> seems like recv() blocks forever
>>>>>
>>>>> 3. Using non-blocking recv() where the "client" does send less than
>>>>> "count" messages, sometimes recv() blocks forever (as well)
>>>>>
>>>>>
>>>>> My naïve analysis of this is that for the first issue (if any) the
>>>>> wrong errno is set and for the second issue it blocks if no sendto()
>>>>> is done after the first recv(), i.e. nothing kicks the "reader thread"
>>>>> in the butt to realise the queue is empty. It is not super clear
>>>>> though what POSIX says about creating blocking descriptors and then
>>>>> using non-blocking-flags with recv(), but this works in Linux any way
>>>>
>>>> The explanation is actually much simpler.  In the recv code where a bound
>>>> datagram socket waits for a remote socket to connect to the pipe, I simply
>>>> forget to handle MSG_DONTWAIT.  I've pushed a fix.  Please retest.
>>>>
>>>> I should add that in all my work so far on the topic/af_unix branch, I've
>>>> thought mainly about stream sockets.  So there may still be things remaining
>>>> to be implemented for the datagram case.
>>>
>>> I finally got some time to test topic/af_unix in our "real" 
>>> cygwin-application (casual) and unfortunately very few of our unittests pass
>>>
>>> The symptoms are that there's unexpected eternal blocking, sometimes there's 
>>> unexpected EADDRNOTAVAIL, sometimes it looks like some memory corruption (and
>>> core-dumps)
>>>
>>> Of course the memory corruption etc could be our self and the core-dumps might
>>> be because of uncaught exceptions
>>>
>>> Needles to say is that all unittests pass on Linux, but of course 
>>> cygwin-topic/af_unix could act according to POSIX-standard and the behaviour 
>>> could be due to our own misinterpretation of how POSIX works
>>
>> More likely it's due to bugs in the topic/af_unix branch.  This is still very 
>> much a work in progress.
>>
>>> I will try to narrow down the quite complex logic and reproduce the problems
>>
>> That would be ideal.
>>
>>> If you of some reason wanna try it with casual, I'd be glad to help you out 
>>> (it should be easier now that last time (but there might be some 
>>> documentation missing for Cygwin still))
>>>
>>> https://bitbucket.org/casualcore/
>>
>> I'm going on vacation in a few days, but I might do this when I get back.
>>
>> Thanks for your testing.
> 
> By the way, if your code is using datagram sockets, then there are very serious 
> problems with our implementation (even aside from the performance issue that 
> we've already discussed).  For example, I don't know of any reasonable way for 
> select to test whether such a socket is ready for writing.  We'll need to solve 
> that somehow.

I'm going to follow up on the cygwin-developers list.

Ken

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: AF_UNIX/SOCK_DGRAM is dropping messages
  2021-04-13 22:43                     ` Ken Brown
  2021-04-14 15:53                       ` Ken Brown
@ 2021-04-14 17:14                       ` sten.kristian.ivarsson
  2021-04-14 21:58                         ` Ken Brown
  1 sibling, 1 reply; 27+ messages in thread
From: sten.kristian.ivarsson @ 2021-04-14 17:14 UTC (permalink / raw)
  To: 'Ken Brown', cygwin

> >> Hi Ken
> >>
> >>>>>>>>>>> Using AF_UNIX/SOCK_DGRAM with current version (3.2.0)
> seems
> >>> to
> >>>>>>>>>>> drop messages or at least they are not received in the same
> >>>>>>>>>>> order they are  sent
> >>>>>>>
> >>>>>>> [snip]
> >>>>>>>
> >>>>>>>> Thanks for the test case.  I can confirm the problem.  I'm not
> >>>>>>>> familiar enough with the current AF_UNIX implementation to
> >>>>>>>> debug this easily.  I'd rather spend my time on the new
> >>>>>>>> implementation (on the topic/af_unix branch).  It turns out
> >>>>>>>> that your test case fails there too, but in a completely
> >>>>>>>> different way, due to a bug in sendto for datagrams.  I'll see
> >>>>>>>> if I can fix that bug and then try again.
> >>>>>>>>
> >>>>>>>> Ken
> >>>>>>>
> >>>>>>> Ok, too bad it wasn't our own code base but good that the
> "mystery"
> >>>>>>> is verified
> >>>>>>>
> >>>>>>> I finally succeed to build topic/af_unix (after finding out what
> >>>>>>> version of zlib was needed), but not with -D__WITH_AF_UNIX to
> >>>>>>> CXXFLAGS though and thus I haven’t tested it yet
> >>>>>>>
> >>>>>>> Is it sufficient to add the define to the "main" Makefile or do
> >>>>>>> you have to add it to all the Makefile:s ? I guess I can find
> >>>>>>> out though
> >>>>>>
> >>>>>> I do it on the configure line, like this:
> >>>>>>
> >>>>>>     ../af_unix/configure CXXFLAGS="-g -O0 -D__WITH_AF_UNIX" --
> >>> prefix=...
> >>>>>>
> >>>>>>> Is topic/af_unix fairly up to date with master branch ?
> >>>>>>
> >>>>>> Yes, I periodically cherry-pick commits from master to topic/af_unix.
> >>>>>> I'll do that again right now.
> >>>>>>
> >>>>>>> Either way, I'll be glad to help out testing topic/af_unix
> >>>>>>
> >>>>>> Thanks!
> >>>>>
> >>>>> I've now pushed a fix for that sendto bug, and your test case runs
> >>>>> without error on the topic/af_unix branch.
> >>>>
> >>>> It seems like the test-case do work now with topic/af_unix in
> >>>> blocking mode, but when using non-blocking (with MSG_DONTWAIT)
> >>>> there are
> >>> some
> >>>> issues I think
> >>>>
> >>>> 1. When the queue is empty with non-blocking recv(), errno is set
> >>>> to EPIPE but I think it should be EAGAIN (or maybe the pipe is
> >>>> getting broken for real of some reason ?)
> >>>>
> >>>> 2. When using non-blocking recv() and no message is written at all,
> >>>> it seems like recv() blocks forever
> >>>>
> >>>> 3. Using non-blocking recv() where the "client" does send less than
> >>>> "count" messages, sometimes recv() blocks forever (as well)
> >>>>
> >>>>
> >>>> My naïve analysis of this is that for the first issue (if any) the
> >>>> wrong errno is set and for the second issue it blocks if no
> >>>> sendto() is done after the first recv(), i.e. nothing kicks the "reader
> thread"
> >>>> in the butt to realise the queue is empty. It is not super clear
> >>>> though what POSIX says about creating blocking descriptors and then
> >>>> using non-blocking-flags with recv(), but this works in Linux any
> >>>> way
> >>>
> >>> The explanation is actually much simpler.  In the recv code where a
> >>> bound datagram socket waits for a remote socket to connect to the
> >>> pipe, I simply forget to handle MSG_DONTWAIT.  I've pushed a
> fix.  Please retest.
> >>>
> >>> I should add that in all my work so far on the topic/af_unix branch,
> >>> I've thought mainly about stream sockets.  So there may still be
> >>> things remaining to be implemented for the datagram case.
> >>
> >> I finally got some time to test topic/af_unix in our "real"
> >> cygwin-application
> >> (casual) and unfortunately very few of our unittests pass
> >>
> >> The symptoms are that there's unexpected eternal blocking, sometimes
> >> there's unexpected EADDRNOTAVAIL, sometimes it looks like some
> memory
> >> corruption (and
> >> core-dumps)
> >>
> >> Of course the memory corruption etc could be our self and the
> >> core-dumps might be because of uncaught exceptions
> >>
> >> Needles to say is that all unittests pass on Linux, but of course
> >> cygwin-topic/af_unix could act according to POSIX-standard and the
> >> behaviour could be due to our own misinterpretation of how POSIX works
> >
> > More likely it's due to bugs in the topic/af_unix branch.  This is
> > still very much a work in progress.
> >
> >> I will try to narrow down the quite complex logic and reproduce the
> >> problems
> >
> > That would be ideal.
> >
> >> If you of some reason wanna try it with casual, I'd be glad to help
> >> you out (it should be easier now that last time (but there might be
> >> some documentation missing for Cygwin still))
> >>
> >> https://bitbucket.org/casualcore/
> >
> > I'm going on vacation in a few days, but I might do this when I get back.
> >
> > Thanks for your testing.
> 
> By the way, if your code is using datagram sockets, then there are very serious
> problems with our implementation (even aside from the performance issue
> that we've already discussed).  For example, I don't know of any reasonable
> way for select to test whether such a socket is ready for writing.  We'll need to
> solve that somehow.

If by that you mean whether we're using SOCK_DGRAM, the answer is yes

I tried SOCK_STREAM (and SOCK_SEQPACKET I think) for CYGWIN 3.2.0 but that didn't work at all

As far as I understand, all of these socket types preserve message ordering on pretty much all implementations though

I haven't tried SOCK_STREAM and/or SOCK_SEQPACKET with the topic/af_unix-branch. Is that worth a try ?

Best regards,
Kristian

> Ken


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: AF_UNIX/SOCK_DGRAM is dropping messages
  2021-04-14 17:14                       ` sten.kristian.ivarsson
@ 2021-04-14 21:58                         ` Ken Brown
  2021-04-15 13:15                           ` sten.kristian.ivarsson
  0 siblings, 1 reply; 27+ messages in thread
From: Ken Brown @ 2021-04-14 21:58 UTC (permalink / raw)
  To: sten.kristian.ivarsson, cygwin

On 4/14/2021 1:14 PM, sten.kristian.ivarsson@gmail.com wrote:
>>>> Hi Ken
>>>>
>>>>>>>>>>>>> Using AF_UNIX/SOCK_DGRAM with current version (3.2.0)
>> seems
>>>>> to
>>>>>>>>>>>>> drop messages or at least they are not received in the same
>>>>>>>>>>>>> order they are  sent
>>>>>>>>>
>>>>>>>>> [snip]
>>>>>>>>>
>>>>>>>>>> Thanks for the test case.  I can confirm the problem.  I'm not
>>>>>>>>>> familiar enough with the current AF_UNIX implementation to
>>>>>>>>>> debug this easily.  I'd rather spend my time on the new
>>>>>>>>>> implementation (on the topic/af_unix branch).  It turns out
>>>>>>>>>> that your test case fails there too, but in a completely
>>>>>>>>>> different way, due to a bug in sendto for datagrams.  I'll see
>>>>>>>>>> if I can fix that bug and then try again.
>>>>>>>>>>
>>>>>>>>>> Ken
>>>>>>>>>
>>>>>>>>> Ok, too bad it wasn't our own code base but good that the
>> "mystery"
>>>>>>>>> is verified
>>>>>>>>>
>>>>>>>>> I finally succeed to build topic/af_unix (after finding out what
>>>>>>>>> version of zlib was needed), but not with -D__WITH_AF_UNIX to
>>>>>>>>> CXXFLAGS though and thus I haven’t tested it yet
>>>>>>>>>
>>>>>>>>> Is it sufficient to add the define to the "main" Makefile or do
>>>>>>>>> you have to add it to all the Makefile:s ? I guess I can find
>>>>>>>>> out though
>>>>>>>>
>>>>>>>> I do it on the configure line, like this:
>>>>>>>>
>>>>>>>>      ../af_unix/configure CXXFLAGS="-g -O0 -D__WITH_AF_UNIX" --
>>>>> prefix=...
>>>>>>>>
>>>>>>>>> Is topic/af_unix fairly up to date with master branch ?
>>>>>>>>
>>>>>>>> Yes, I periodically cherry-pick commits from master to topic/af_unix.
>>>>>>>> I'll do that again right now.
>>>>>>>>
>>>>>>>>> Either way, I'll be glad to help out testing topic/af_unix
>>>>>>>>
>>>>>>>> Thanks!
>>>>>>>
>>>>>>> I've now pushed a fix for that sendto bug, and your test case runs
>>>>>>> without error on the topic/af_unix branch.
>>>>>>
>>>>>> It seems like the test-case do work now with topic/af_unix in
>>>>>> blocking mode, but when using non-blocking (with MSG_DONTWAIT)
>>>>>> there are
>>>>> some
>>>>>> issues I think
>>>>>>
>>>>>> 1. When the queue is empty with non-blocking recv(), errno is set
>>>>>> to EPIPE but I think it should be EAGAIN (or maybe the pipe is
>>>>>> getting broken for real of some reason ?)
>>>>>>
>>>>>> 2. When using non-blocking recv() and no message is written at all,
>>>>>> it seems like recv() blocks forever
>>>>>>
>>>>>> 3. Using non-blocking recv() where the "client" does send less than
>>>>>> "count" messages, sometimes recv() blocks forever (as well)
>>>>>>
>>>>>>
>>>>>> My naïve analysis of this is that for the first issue (if any) the
>>>>>> wrong errno is set and for the second issue it blocks if no
>>>>>> sendto() is done after the first recv(), i.e. nothing kicks the "reader
>> thread"
>>>>>> in the butt to realise the queue is empty. It is not super clear
>>>>>> though what POSIX says about creating blocking descriptors and then
>>>>>> using non-blocking-flags with recv(), but this works in Linux any
>>>>>> way
>>>>>
>>>>> The explanation is actually much simpler.  In the recv code where a
>>>>> bound datagram socket waits for a remote socket to connect to the
>>>>> pipe, I simply forget to handle MSG_DONTWAIT.  I've pushed a
>> fix.  Please retest.
>>>>>
>>>>> I should add that in all my work so far on the topic/af_unix branch,
>>>>> I've thought mainly about stream sockets.  So there may still be
>>>>> things remaining to be implemented for the datagram case.
>>>>
>>>> I finally got some time to test topic/af_unix in our "real"
>>>> cygwin-application
>>>> (casual) and unfortunately very few of our unittests pass
>>>>
>>>> The symptoms are that there's unexpected eternal blocking, sometimes
>>>> there's unexpected EADDRNOTAVAIL, sometimes it looks like some
>> memory
>>>> corruption (and
>>>> core-dumps)
>>>>
>>>> Of course the memory corruption etc could be our self and the
>>>> core-dumps might be because of uncaught exceptions
>>>>
>>>> Needles to say is that all unittests pass on Linux, but of course
>>>> cygwin-topic/af_unix could act according to POSIX-standard and the
>>>> behaviour could be due to our own misinterpretation of how POSIX works
>>>
>>> More likely it's due to bugs in the topic/af_unix branch.  This is
>>> still very much a work in progress.
>>>
>>>> I will try to narrow down the quite complex logic and reproduce the
>>>> problems
>>>
>>> That would be ideal.
>>>
>>>> If you of some reason wanna try it with casual, I'd be glad to help
>>>> you out (it should be easier now that last time (but there might be
>>>> some documentation missing for Cygwin still))
>>>>
>>>> https://bitbucket.org/casualcore/
>>>
>>> I'm going on vacation in a few days, but I might do this when I get back.
>>>
>>> Thanks for your testing.
>>
>> By the way, if your code is using datagram sockets, then there are very serious
>> problems with our implementation (even aside from the performance issue
>> that we've already discussed).  For example, I don't know of any reasonable
>> way for select to test whether such a socket is ready for writing.  We'll need to
>> solve that somehow.
> 
> If you by that mean if we're using SOCK_DGRAM, the answer is yes
> 
> I tried SOCK_STREAM (and SOCK_SEQPACKET I think) for CYGWIN 3.2.0 but that didn't work at all
> 
> As far as I understand, both all types on pretty much all implementations preserves message ordering though
> 
> I haven't tried SOCK_STREAM and/or SOCK_SEQPACKET with the topic/af_unix-branch. Is that worth a try ?

SOCK_STREAM is definitely worth a try.  The implementation of that should be 
much more reliable than the implementation of SOCK_DGRAM at the moment.  We 
don't implement SOCK_SEQPACKET.

Ken

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: AF_UNIX/SOCK_DGRAM is dropping messages
  2021-04-14 21:58                         ` Ken Brown
@ 2021-04-15 13:15                           ` sten.kristian.ivarsson
  2021-04-15 15:01                             ` Ken Brown
  0 siblings, 1 reply; 27+ messages in thread
From: sten.kristian.ivarsson @ 2021-04-15 13:15 UTC (permalink / raw)
  To: 'Ken Brown', cygwin

[snip]

> > I tried SOCK_STREAM (and SOCK_SEQPACKET I think) for CYGWIN 3.2.0 but
> > that didn't work at all
> >
> > As far as I understand, both all types on pretty much all
> > implementations preserves message ordering though
> >
> > I haven't tried SOCK_STREAM and/or SOCK_SEQPACKET with the
> topic/af_unix-branch. Is that worth a try ?
> 
> SOCK_STREAM is definitely worth a try.  The implementation of that should be
> much more reliable than the implementation of SOCK_DGRAM at the
> moment.  We don't implement SOCK_SEQPACKET.

That might require a complete rewrite of our semantics though, because SOCK_STREAM is connection-based, allows just one writer on each "channel", and messages (chunks) cannot be handled "atomically"
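
To illustrate the "atomic" part of that concern: a stream socket carries no message boundaries, so the application has to add its own framing.  A minimal sketch, assuming a simple 4-byte length prefix in host byte order (fine for a local socket); `send_message`, `recv_message`, `write_all` and `read_all` are hypothetical helper names, not anything from casual or Cygwin, and this does nothing about the one-writer-per-connection limitation:

```cpp
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdint>
#include <string>

// Write exactly len bytes, retrying on short writes and EINTR.
bool write_all(int fd, const void* data, std::size_t len)
{
    const char* p = static_cast<const char*>(data);
    while (len > 0)
    {
        const ssize_t n = send(fd, p, len, 0);
        if (n == -1)
        {
            if (errno == EINTR) continue;
            return false;
        }
        p += n;
        len -= static_cast<std::size_t>(n);
    }
    return true;
}

// Read exactly len bytes; false on error or if the peer closes early.
bool read_all(int fd, void* data, std::size_t len)
{
    char* p = static_cast<char*>(data);
    while (len > 0)
    {
        const ssize_t n = recv(fd, p, len, 0);
        if (n == 0) return false;  // peer closed the connection
        if (n == -1)
        {
            if (errno == EINTR) continue;
            return false;
        }
        p += n;
        len -= static_cast<std::size_t>(n);
    }
    return true;
}

// Send one message as: 4-byte length header, then the payload bytes.
bool send_message(int fd, const std::string& payload)
{
    assert(payload.size() <= UINT32_MAX);
    const std::uint32_t size = static_cast<std::uint32_t>(payload.size());
    return write_all(fd, &size, sizeof size)
        && write_all(fd, payload.data(), payload.size());
}

// Receive one complete message, however the stream happened to chunk it.
bool recv_message(int fd, std::string& payload)
{
    std::uint32_t size = 0;
    if (!read_all(fd, &size, sizeof size)) return false;
    payload.resize(size);
    return size == 0 || read_all(fd, &payload[0], size);
}
```

With this, the receiver gets whole messages back in order, but several writers sharing one receiver would still each need their own connection.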


Best regards,
Kristian

> Ken


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: AF_UNIX/SOCK_DGRAM is dropping messages
  2021-04-15 13:15                           ` sten.kristian.ivarsson
@ 2021-04-15 15:01                             ` Ken Brown
  2021-04-27 14:56                               ` Ken Brown
  0 siblings, 1 reply; 27+ messages in thread
From: Ken Brown @ 2021-04-15 15:01 UTC (permalink / raw)
  To: sten.kristian.ivarsson, cygwin

On 4/15/2021 9:15 AM, sten.kristian.ivarsson@gmail.com wrote:
> [snip]
> 
>>> I tried SOCK_STREAM (and SOCK_SEQPACKET I think) for CYGWIN 3.2.0 but
>>> that didn't work at all
>>>
>>> As far as I understand, both all types on pretty much all
>>> implementations preserves message ordering though
>>>
>>> I haven't tried SOCK_STREAM and/or SOCK_SEQPACKET with the
>> topic/af_unix-branch. Is that worth a try ?
>>
>> SOCK_STREAM is definitely worth a try.  The implementation of that should be
>> much more reliable than the implementation of SOCK_DGRAM at the
>> moment.  We don't implement SOCK_SEQPACKET.
> 
> It might be a complete rewrite of our semantics though, because it's connection based and allows just one writer on each "channel" and messages (chunks) cannot be handled "atomically"

In that case, let's try to get the DGRAM case to work.  Corinna has already 
suggested (on cygwin-developers) a way to deal with the select issue I 
mentioned.  I'll make that change shortly.

Ken

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: AF_UNIX/SOCK_DGRAM is dropping messages
  2021-04-15 15:01                             ` Ken Brown
@ 2021-04-27 14:56                               ` Ken Brown
  2021-04-28  7:15                                 ` sten.kristian.ivarsson
  0 siblings, 1 reply; 27+ messages in thread
From: Ken Brown @ 2021-04-27 14:56 UTC (permalink / raw)
  To: sten.kristian.ivarsson, cygwin

On 4/15/2021 11:01 AM, Ken Brown via Cygwin wrote:
> On 4/15/2021 9:15 AM, sten.kristian.ivarsson@gmail.com wrote:
>> [snip]
>>
>>>> I tried SOCK_STREAM (and SOCK_SEQPACKET I think) for CYGWIN 3.2.0 but
>>>> that didn't work at all
>>>>
>>>> As far as I understand, both all types on pretty much all
>>>> implementations preserves message ordering though
>>>>
>>>> I haven't tried SOCK_STREAM and/or SOCK_SEQPACKET with the
>>> topic/af_unix-branch. Is that worth a try ?
>>>
>>> SOCK_STREAM is definitely worth a try.  The implementation of that should be
>>> much more reliable than the implementation of SOCK_DGRAM at the
>>> moment.  We don't implement SOCK_SEQPACKET.
>>
>> It might be a complete rewrite of our semantics though, because it's 
>> connection based and allows just one writer on each "channel" and messages 
>> (chunks) cannot be handled "atomically"
> 
> In that case, let's try to get the DGRAM case to work.

I decided to (finally) dig into the AF_UNIX implementation on the master branch 
and try to understand why DGRAM sockets are unreliable.  I think the answer is 
simply that Cygwin implements AF_UNIX sockets using Windows AF_INET sockets, and 
DGRAM sockets in this setting are documented to be unreliable.  It appears that 
if too much is written without anything being read, the Windows WSASendTo 
function simply drops messages without giving any error.
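
Since the sender gets no error, the loss is only visible on the receiving side.  A minimal sketch of how a receiver could notice it, assuming the sender copies a 32-bit counter to the start of every datagram (the test program in this thread does something similar with an int); `SequenceStats`, `sequence_of` and `observe` are hypothetical names for illustration:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

struct SequenceStats
{
    std::uint32_t expected = 0;  // next sequence number we expect to see
    std::uint32_t missing = 0;   // numbers skipped over (dropped or not yet arrived)
    std::uint32_t late = 0;      // numbers below expected (reordered or duplicated)
};

// Extract the sequence number stored at the start of a datagram.
std::uint32_t sequence_of(const char* datagram)
{
    assert(datagram != nullptr);
    std::uint32_t seq = 0;
    std::memcpy(&seq, datagram, sizeof seq);
    return seq;
}

// Update the statistics with the sequence number of a received datagram.
void observe(SequenceStats& stats, std::uint32_t seq)
{
    if (seq == stats.expected)
    {
        ++stats.expected;
    }
    else if (seq > stats.expected)
    {
        // Everything between expected and seq has not been seen (yet).
        stats.missing += seq - stats.expected;
        stats.expected = seq + 1;
    }
    else
    {
        // Arrived after a higher number; it was counted as missing earlier.
        ++stats.late;
    }
}
```

Calling observe(stats, sequence_of(buffer)) after every recv() gives a rough picture: if there are no duplicates, missing minus late is the number of datagrams that never arrived.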

Unfortunately, switching to native Windows AF_UNIX sockets wouldn't help, 
because they don't support DGRAM sockets.

I'm going to follow up on cygwin-developers.

Ken

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: AF_UNIX/SOCK_DGRAM is dropping messages
  2021-04-27 14:56                               ` Ken Brown
@ 2021-04-28  7:15                                 ` sten.kristian.ivarsson
  2021-08-12 12:56                                   ` sten.kristian.ivarsson
  0 siblings, 1 reply; 27+ messages in thread
From: sten.kristian.ivarsson @ 2021-04-28  7:15 UTC (permalink / raw)
  To: 'Ken Brown', cygwin

> >> [snip]
> >>
> >>>> I tried SOCK_STREAM (and SOCK_SEQPACKET I think) for CYGWIN 3.2.0
> >>>> but that didn't work at all
> >>>>
> >>>> As far as I understand, both all types on pretty much all
> >>>> implementations preserves message ordering though
> >>>>
> >>>> I haven't tried SOCK_STREAM and/or SOCK_SEQPACKET with the
> >>> topic/af_unix-branch. Is that worth a try ?
> >>>
> >>> SOCK_STREAM is definitely worth a try.  The implementation of that
> >>> should be much more reliable than the implementation of SOCK_DGRAM
> >>> at the moment.  We don't implement SOCK_SEQPACKET.
> >>
> >> It might be a complete rewrite of our semantics though, because it's
> >> connection based and allows just one writer on each "channel" and
> >> messages (chunks) cannot be handled "atomically"
> >
> > In that case, let's try to get the DGRAM case to work.
> 
> I decided to (finally) dig into the AF_UNIX implementation on the master
> branch and try to understand why DGRAM sockets are unreliable.  I think the
> answer is simply that Cygwin implements AF_UNIX sockets using Windows
> AF_INET sockets, and DGRAM sockets in this setting are documented to be
> unreliable.  It appears that if too much is written without anything being read,
> the Windows WSASendTo function simply drops messages without giving any
> error.

Yeah, that was my amateur analysis as well a while ago

> Unfortunately, switching to native Windows AF_UNIX sockets wouldn't help,
> because they don't support DGRAM sockets. 

That's a bummer ☹

> I'm going to follow up on cygwin-developers.

Great, I'll read about it there

Keep up the good work

Best regards,
Kristian

> Ken



* RE: AF_UNIX/SOCK_DGRAM is dropping messages
  2021-04-28  7:15                                 ` sten.kristian.ivarsson
@ 2021-08-12 12:56                                   ` sten.kristian.ivarsson
  2021-08-13 11:19                                     ` Ken Brown
  0 siblings, 1 reply; 27+ messages in thread
From: sten.kristian.ivarsson @ 2021-08-12 12:56 UTC (permalink / raw)
  To: 'Ken Brown', cygwin

[snip]

> > I'm going to follow up on cygwin-developers.
> 
> Great, I'll read about it there

Does anyone know anything about the progress of this issue ?

Best regards,
Kristian

> Keep up the good work
> 
> Best regards,
> Kristian
> 
> > Ken




* Re: AF_UNIX/SOCK_DGRAM is dropping messages
  2021-08-12 12:56                                   ` sten.kristian.ivarsson
@ 2021-08-13 11:19                                     ` Ken Brown
  0 siblings, 0 replies; 27+ messages in thread
From: Ken Brown @ 2021-08-13 11:19 UTC (permalink / raw)
  To: sten.kristian.ivarsson, cygwin

On 8/12/2021 8:56 AM, sten.kristian.ivarsson@gmail.com wrote:
> [snip]
> 
>>> I'm going to follow up on cygwin-developers.
>>
>> Great, I'll read about it there
> 
> Does anyone know anything about the progress of this issue ?

I'm afraid there has not been any progress.  We weren't able to find a solution 
using the existing AF_UNIX implementation.

The proposed implementation based on Windows named pipes (topic/af_unix branch) 
stalled because of an issue you reported, which I referred to as a performance 
problem.  But it's a pretty severe performance problem.

An alternative proposed by Mark Geisert based on POSIX message queues 
(topic/af_unix_mq branch) also has issues, which may or may not be surmountable.

Sorry I don't have better news.

Ken


end of thread, other threads:[~2021-08-13 11:19 UTC | newest]

Thread overview: 27+ messages
2021-03-23 15:37 AF_UNIX/SOCK_DGRAM is dropping messages sten.kristian.ivarsson
2021-03-23 19:20 ` Glenn Strauss
2021-03-24  9:18   ` sten.kristian.ivarsson
2021-03-30 14:17     ` Ken Brown
2021-03-31  8:24       ` sten.kristian.ivarsson
2021-03-31 15:07         ` Ken Brown
2021-04-01 16:02           ` Ken Brown
2021-04-06  7:52             ` Noel Grandin
2021-04-06 14:59               ` Ken Brown
2021-04-06 14:50             ` sten.kristian.ivarsson
2021-04-06 15:24               ` Ken Brown
2021-04-07 14:56               ` Ken Brown
2021-04-08  8:37                 ` sten.kristian.ivarsson
2021-04-08 19:47                   ` sten.kristian.ivarsson
2021-04-08 21:02                     ` Ken Brown
2021-04-13 14:06                 ` sten.kristian.ivarsson
2021-04-13 14:47                   ` Ken Brown
2021-04-13 22:43                     ` Ken Brown
2021-04-14 15:53                       ` Ken Brown
2021-04-14 17:14                       ` sten.kristian.ivarsson
2021-04-14 21:58                         ` Ken Brown
2021-04-15 13:15                           ` sten.kristian.ivarsson
2021-04-15 15:01                             ` Ken Brown
2021-04-27 14:56                               ` Ken Brown
2021-04-28  7:15                                 ` sten.kristian.ivarsson
2021-08-12 12:56                                   ` sten.kristian.ivarsson
2021-08-13 11:19                                     ` Ken Brown
