public inbox for cygwin@cygwin.com
From: Ken Brown <kbrown@cornell.edu>
To: sten.kristian.ivarsson@gmail.com, cygwin@cygwin.com
Subject: Re: AF_UNIX/SOCK_DGRAM is dropping messages
Date: Thu, 8 Apr 2021 17:02:10 -0400
Message-ID: <3e7e2393-b704-0675-f82c-5f070747ada4@cornell.edu>
In-Reply-To: <000601d72cb0$0263cc40$072b64c0$@gmail.com>

On 4/8/2021 3:47 PM, sten.kristian.ivarsson@gmail.com wrote:
>>>>>>>>>>> Using AF_UNIX/SOCK_DGRAM with current version (3.2.0) seems to
>>>>>>>>>>> drop messages or at least they are not received in the same
>>>>>>>>>>> order they are sent
>>>>>>>
>>>>>>> [snip]
>>>>>>>
>>>>>>>> Thanks for the test case.  I can confirm the problem.  I'm not
>>>>>>>> familiar enough with the current AF_UNIX implementation to debug
>>>>>>>> this easily.  I'd rather spend my time on the new implementation
>>>>>>>> (on the topic/af_unix branch).  It turns out that your test case
>>>>>>>> fails there too, but in a completely different way, due to a bug
>>>>>>>> in sendto for datagrams.  I'll see if I can fix that bug and then
>>>>>>>> try again.
>>>>>>>>
>>>>>>>> Ken
>>>>>>>
>>>>>>> Ok, too bad it wasn't our own code base but good that the "mystery"
>>>>>>> is verified
>>>>>>>
>>>>>>> I finally succeeded in building topic/af_unix (after finding out what
>>>>>>> version of zlib was needed), but not with -D__WITH_AF_UNIX added to
>>>>>>> CXXFLAGS, so I haven't tested it yet
>>>>>>>
>>>>>>> Is it sufficient to add the define to the "main" Makefile or do
>>>>>>> you have to add it to all the Makefile:s ? I guess I can find out
>>>>>>> though
>>>>>>
>>>>>> I do it on the configure line, like this:
>>>>>>
>>>>>>     ../af_unix/configure CXXFLAGS="-g -O0 -D__WITH_AF_UNIX" --prefix=...
>>>>>>
>>>>>>> Is topic/af_unix fairly up to date with master branch ?
>>>>>>
>>>>>> Yes, I periodically cherry-pick commits from master to topic/af_unix.
>>>>>> I'll do that again right now.
>>>>>>
>>>>>>> Either way, I'll be glad to help out testing topic/af_unix
>>>>>>
>>>>>> Thanks!
>>>>>
>>>>> I've now pushed a fix for that sendto bug, and your test case runs
>>>>> without error on the topic/af_unix branch.
>>>>
>>>> It seems like the test case does work now with topic/af_unix in
>>>> blocking mode, but when using non-blocking (with MSG_DONTWAIT) there
>>>> are some issues, I think
>>>>
>>>> 1. When the queue is empty with non-blocking recv(), errno is set to
>>>> EPIPE but I think it should be EAGAIN (or maybe the pipe is getting
>>>> broken for real for some reason?)
>>>>
>>>> 2. When using non-blocking recv() and no message is written at all,
>>>> it seems like recv() blocks forever
>>>>
>>>> 3. Using non-blocking recv() where the "client" sends fewer than
>>>> "count" messages, sometimes recv() blocks forever (as well)
>>>>
>>>>
>>>> My naïve analysis of this is that for the first issue (if any) the
>>>> wrong errno is set, and for the second issue it blocks if no sendto()
>>>> is done after the first recv(), i.e. nothing kicks the "reader thread"
>>>> in the butt to realise the queue is empty. It is not super clear
>>>> though what POSIX says about creating blocking descriptors and then
>>>> using non-blocking flags with recv(), but this works on Linux anyway
>>>
>>> The explanation is actually much simpler.  In the recv code where a
>>> bound datagram socket waits for a remote socket to connect to the
>>> pipe, I simply forgot to handle MSG_DONTWAIT.  I've pushed a fix.  Please
>>> retest.
>>
>> I tested it and now it seems like we get EAGAIN when there's no message on
>> the queue, but it seems like the client is blocked as well and cannot write
>> any more messages until the queued message is consumed by the server, so
>> the af_unix.cpp test client ends prematurely
>>
>> If sendto() is used with MSG_DONTWAIT as well, it gets EAGAIN, but the
>> socket itself is not a non-blocking socket; it is just the recv() that is
>> done in a non-blocking fashion
>>
>> As I said earlier, it's a bit fuzzy (at least to me) what POSIX means by
>> blocking/non-blocking descriptors combined with blocking/non-blocking
>> operations, but as far as I understand, it should be possible to use
>> blocking sendto() and have messages written (as long as some buffer is not
>> full) at the same time as someone is doing non-blocking recv()
>>
>> What is your take on this ?
> 
> I was thinking about this again and came to the conclusion that the fix
> probably works OK semantically
>
> It was just me who didn't realise that only one message can be on the queue
> at a time, even in blocking mode
>
> The problem is not functional but merely a performance issue, which I guess
> you have already realised; you mentioned it in a previous message, but I
> thought that was about some other issue
>
> So, I guess the fix works OK (I haven't done any more tests than with the
> sample program), but from a throughput perspective it would be a good idea
> to let more messages be written to the queue before the first is consumed
> (I guess you already have some thoughts about this?)

I have some thoughts, but nothing definitive yet.  I'll keep thinking.

Ken
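
For readers following along, the behaviour discussed above can be sketched in C: a recv() with MSG_DONTWAIT on an otherwise blocking AF_UNIX/SOCK_DGRAM socket is expected to fail with EAGAIN (or EWOULDBLOCK) when no datagram is queued, and to return a datagram once a blocking sendto() has delivered one. This is not the af_unix.cpp test case from the thread. The helper names bind_dgram, try_recv, send_to_path and run_demo, and any socket path passed in, are made up for illustration, and the sequence shown is what Linux is known to do, not necessarily what the topic/af_unix branch does.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: fill in a sockaddr_un for `path`. */
static void make_addr(struct sockaddr_un *addr, const char *path)
{
    memset(addr, 0, sizeof *addr);
    addr->sun_family = AF_UNIX;
    strncpy(addr->sun_path, path, sizeof addr->sun_path - 1);
}

/* Hypothetical helper: create a (blocking) datagram socket bound to `path`.
   Returns the descriptor, or -1 on error. */
static int bind_dgram(const char *path)
{
    struct sockaddr_un addr;
    int fd = socket(AF_UNIX, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;
    make_addr(&addr, path);
    unlink(path);                       /* remove a stale socket file, if any */
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Non-blocking receive on a blocking descriptor.
   Returns 1 if a datagram was read (length stored in *len),
   0 if the queue was empty (EAGAIN/EWOULDBLOCK), -1 on any other error. */
static int try_recv(int fd, char *buf, size_t cap, ssize_t *len)
{
    ssize_t n = recv(fd, buf, cap, MSG_DONTWAIT);
    if (n >= 0) {
        *len = n;
        return 1;
    }
    if (errno == EAGAIN || errno == EWOULDBLOCK)
        return 0;
    return -1;
}

/* Blocking sendto() of `msg` from an unbound socket to the socket at `path`.
   Returns 0 if the whole datagram was sent, -1 otherwise. */
static int send_to_path(const char *path, const char *msg)
{
    struct sockaddr_un addr;
    ssize_t n;
    int fd = socket(AF_UNIX, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;
    make_addr(&addr, path);
    n = sendto(fd, msg, strlen(msg), 0, (struct sockaddr *)&addr, sizeof addr);
    close(fd);
    return n == (ssize_t)strlen(msg) ? 0 : -1;
}

/* Runs the sequence: empty queue, then one send, then one receive.
   Returns 0 if every step behaved as described above, -1 otherwise. */
static int run_demo(const char *path)
{
    char buf[64];
    ssize_t len = 0;
    int ok = 0;
    int fd = bind_dgram(path);
    if (fd < 0)
        return -1;
    if (try_recv(fd, buf, sizeof buf, &len) == 0      /* nothing queued yet */
        && send_to_path(path, "ping") == 0            /* blocking sendto() */
        && try_recv(fd, buf, sizeof buf, &len) == 1   /* datagram now queued */
        && len == 4 && memcmp(buf, "ping", 4) == 0)
        ok = 1;
    close(fd);
    unlink(path);
    return ok ? 0 : -1;
}
```

Calling run_demo() from main() with a writable socket path and checking for a 0 return is enough to try it. A -1 return means one of the steps behaved differently, for example recv() reporting an errno other than EAGAIN on the empty queue.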
