From: Ken Brown <kbrown@cornell.edu>
To: Norton Allen <allen@huarp.harvard.edu>, cygwin <cygwin@cygwin.com>
Subject: Re: Unix Domain Socket Limitation?
Date: Sat, 5 Dec 2020 18:52:47 -0500
Message-ID: <816668c9-4848-caa8-7fae-349be2cd5ab7@cornell.edu>
In-Reply-To: <a13ab85d-bee7-71e3-41d0-1a67422a859f@huarp.harvard.edu>
On 12/4/2020 8:51 AM, Norton Allen wrote:
> On 12/3/2020 8:11 PM, Ken Brown wrote:
>> On 12/2/2020 12:30 PM, Norton Allen wrote:
>>> On 11/30/2020 9:22 PM, Norton Allen wrote:
>>>> Yeah, so now the example no longer blocks for me. Unfortunately these bugs
>>>> are not present in my application, so I will need to keep working on this.
>>>>
>>>
>>> After paring the main application down and back up, I finally narrowed in on
>>> the condition that was causing this blocking behavior. The issue arises when
>>> a client connect()s twice to the same server with non-blocking unix-domain
>>> sockets before calling select().
>>>
>>> There are a few pieces to this. With the client configured to connect() just
>>> once, I can see that the server's select() returns as soon as the client
>>> calls connect(), but then the server's accept() blocks until the client calls
>>> select(). That is not proper non-blocking behavior, but it appears that the
>>> implementation under Cygwin does require that client and server both be
>>> communicating synchronously to accomplish the connect() operation.
>>>
>>> I tried running this under Ubuntu 16.04 and found that connect() succeeded
>>> immediately, so no subsequent select() is required, and there does not appear
>>> to be a possibility for this collision. That proves to hold true even if the
>>> server is not waiting in select() to process the connect() with accept().
>>>
>>> A workaround for this issue may be to keep the socket blocking until after
>>> connect().
>>>
>>> I have pushed the new minimal example program, 'rapid_connects' to
>>> https://github.com/nthallen/cygwin_unix
>>>
>>> The server is run like before as:
>>>
>>> $ ./rapid_connects server
>>>
>>> The client can be run in two different modes. To connect with just one socket:
>>>
>>> $ ./rapid_connects client1
>>>
>>> To connect with two:
>>>
>>> $ ./rapid_connects client2
>>>
>>> My immediate strategy will be to develop a workaround for my project. Having
>>> spent a day inside cygwin1.dll, I can see that I have a steep learning curve
>>> to make much of a contribution there.
>>
>> I'm traveling at the moment and unable to do any testing, but I wonder if
>> you're bumping into an issue that was just discussed on the cygwin-developers
>> list:
>>
>> https://cygwin.com/pipermail/cygwin-developers/2020-December/012015.html
>>
>> A different workaround is described there.
>>
>> If it's the same issue, then I don't think it will happen with the new AF_UNIX
>> implementation. More in a few days.
>>
> It does seem related.
>
> A workaround that is working for me is to do a blocking connect() and switch to
> non-blocking when that completes. In my application, the connect() generally
> occurs once at the beginning of a run, so blocking for a few milliseconds does
> not impact responsiveness.

For the record, I can confirm that (a) the problem occurs with the current
AF_UNIX implementation and (b) it does not occur with the new implementation (on
the topic/af_unix branch). With both client1 and client2, I see "connect()
apparently succeeded immediately" using the new implementation.
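
For readers following along, here is a rough sketch of the client-side sequence under discussion: put the socket in non-blocking mode, call connect(), and if it does not complete immediately, wait in select() for writability and read SO_ERROR. This is not the code from rapid_connects; the helper names start_connect() and finish_connect() and the timeout parameter are made up for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper (not from rapid_connects): start a non-blocking
 * connect() to the Unix-domain socket at `path`.
 * Returns the fd, or -1 with errno set. *immediate is set to 1 if
 * connect() completed right away, 0 if it is still in progress. */
int start_connect(const char *path, int *immediate)
{
    struct sockaddr_un addr;
    int fd, flags, saved;

    assert(path != NULL && immediate != NULL);
    if (strlen(path) >= sizeof(addr.sun_path)) {
        errno = ENAMETOOLONG;
        return -1;
    }
    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        goto fail;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path);

    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) == 0) {
        *immediate = 1;          /* completed without waiting */
        return fd;
    }
    if (errno == EINPROGRESS) {
        *immediate = 0;          /* caller must wait for writability */
        return fd;
    }
fail:
    saved = errno;
    close(fd);
    errno = saved;
    return -1;
}

/* Hypothetical helper: wait up to timeout_ms for the connect to finish.
 * Returns 0 once the socket is connected, -1 with errno set otherwise. */
int finish_connect(int fd, int timeout_ms)
{
    fd_set wfds;
    struct timeval tv;
    int err = 0, n;
    socklen_t len = sizeof(err);

    FD_ZERO(&wfds);
    FD_SET(fd, &wfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    n = select(fd + 1, NULL, &wfds, NULL, &tv);
    if (n < 0)
        return -1;
    if (n == 0) {
        errno = ETIMEDOUT;
        return -1;
    }
    /* Writable: SO_ERROR tells us whether the connect succeeded. */
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
        return -1;
    if (err != 0) {
        errno = err;
        return -1;
    }
    return 0;
}
```

Calling finish_connect() is harmless when start_connect() reports an immediate success, since an already-connected socket is writable and SO_ERROR reads back as 0.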

The new implementation is not yet ready for prime time, but with any luck it
might be ready within a few months.
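
Until then, the workaround Norton describes (do a blocking connect(), then switch the socket to non-blocking) might look something like the following. This is only a minimal sketch, not code from his project; the helper name connect_then_nonblock() is made up, and it does not retry if connect() is interrupted by a signal.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: connect to the Unix-domain stream socket at `path`
 * while the descriptor is still blocking, then set O_NONBLOCK.
 * Returns the connected fd, or -1 with errno set. */
int connect_then_nonblock(const char *path)
{
    struct sockaddr_un addr;
    int fd, flags, saved;

    assert(path != NULL);
    if (strlen(path) >= sizeof(addr.sun_path)) {
        errno = ENAMETOOLONG;
        return -1;
    }
    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path);

    /* The socket is blocking here, so connect() returns only after the
     * connection has been established or has failed. */
    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        goto fail;

    /* Switch to non-blocking for all subsequent I/O. */
    flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        goto fail;
    return fd;

fail:
    saved = errno;   /* preserve the original error across close() */
    close(fd);
    errno = saved;
    return -1;
}
```

The cost is that the caller blocks for the duration of the connect(), which is acceptable when, as in Norton's application, the connection is made once at startup.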

Ken