From: Ken Brown <kbrown@cornell.edu>
To: cygwin-developers@cygwin.com
Subject: Re: The unreliability of AF_UNIX datagram sockets
Date: Thu, 29 Apr 2021 12:44:48 -0400
Message-ID: <16e1d55e-15ea-6c0e-04e4-aa6cb2c0c1bd@cornell.edu>
In-Reply-To: <YIrLQezXLUnEo8BS@calimero.vinschen.de>
On 4/29/2021 11:05 AM, Corinna Vinschen wrote:
> On Apr 29 10:38, Ken Brown wrote:
>> On 4/29/2021 7:05 AM, Corinna Vinschen wrote:
>>> On Apr 27 11:47, Ken Brown wrote:
>>>> I'm willing to start working on the switch to native AF_UNIX sockets. (I'm
>>>> frankly getting bored with working on the pipe implementation, and this
>>> ^^^^^^^^^^^^^
>>> I'm not really surprised, Windows pipe semantics are annoying.
>>>
>>>> doesn't really seem like it has much of a future.) But I'd like to be
>>>> confident that there's a good solution to the datagram problem before I
>>>> invest too much time in this.
>>>
>>> Summary of our short discussion on IRC:
>>>
>>> - Switching to SOCK_STREAM under the hood adds the necessary reliability
>>> but breaks DGRAM message boundaries.
>>>
>>> - There appears to be no way in Winsock to handle send buffer overflow
>>> gracefully so that user space knows that messages have been discarded.
>>> Strangely enough there's a SIO_ENABLE_CIRCULAR_QUEUEING ioctl, but that
>>> just makes things worse, by dropping older messages in favor of the
>>> newer ones :-P
>>>
>>> I think it should be possible to switch to STREAM sockets to emulate
>>> DGRAM semantics. Our advantage is that this is all local. For all
>>> practical purposes there's no chance data gets really lost. Windows has
>>> an almost indefinite send buffer.
>>>
>>> If you look at the STREAM as a kind of tunneling layer for getting DGRAM
>>> messages over the (local) line, the DGRAM content could simply be
>>> encapsulated in a tunnel packet or frame, basically the same way the
>>> new, boring AF_UNIX code does it. A DGRAM message encapsulated in a
>>> STREAM message always has a header which at least contains the length of
>>> the actual DGRAM message. So when the peer reads from the socket, it
>>> always only reads the header until it's complete. Then it knows how
>>> much payload is expected and then it reads until the payload has been
>>> received.
>>
>> This should work. We could even use MSG_PEEK to read the header and then
>> MSG_WAITALL to read the whole packet.
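As a rough illustration of the framing idea (Python for brevity, not Cygwin code): each datagram goes over the stream as a length header followed by the payload, and the reader peeks at the header before pulling the whole frame with MSG_WAITALL. The 4-byte big-endian header and the names `send_dgram` and `recv_dgram` are assumptions made for this sketch. It also assumes the platform lets MSG_PEEK return a complete header; whether MSG_PEEK and MSG_WAITALL can be combined reliably varies by platform, so a short peek is treated as an error here rather than retried.

```python
import socket
import struct

# Hypothetical frame header: payload length as a 4-byte big-endian integer.
HEADER = struct.Struct("!I")


def send_dgram(sock, payload):
    """Send one datagram as a single frame: header followed by payload."""
    sock.sendall(HEADER.pack(len(payload)) + payload)


def recv_dgram(sock):
    """Return the next datagram payload, or None if the peer has closed."""
    # Peek at the header without consuming it, so we learn the frame size.
    hdr = sock.recv(HEADER.size, socket.MSG_PEEK | socket.MSG_WAITALL)
    if not hdr:
        return None
    if len(hdr) < HEADER.size:
        # Sketch only: a real implementation would wait for the rest.
        raise ConnectionError("incomplete frame header")
    (length,) = HEADER.unpack(hdr)

    # Consume header and payload together; MSG_WAITALL asks the kernel to
    # block until the whole frame has arrived.
    frame = sock.recv(HEADER.size + length, socket.MSG_WAITALL)
    if len(frame) < HEADER.size + length:
        raise ConnectionError("stream ended inside a frame")
    return frame[HEADER.size:]
```

Because the reader never returns less than one frame, message boundaries survive even though the transport is a byte stream, and a zero-length datagram stays distinguishable from end-of-stream.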
>>
>> I'd be happy to try to implement this. Do you want to create a branch
>> (maybe topic/dgram or something like that) for working on it?
>
> You can create topic branches as you see fit, don't worry about it.
>
>>> Ultimately this would even allow us to emulate DGRAMs when using native
>>> Windows AF_UNIX sockets. Then we'd just have to keep the old code for
>>> backward compat.
>>
>> Yep.
>>
>>> There's just one problem with this entire switch to non-pipes: Sending
>>> descriptors between peers running under different accounts requires
>>> being able to switch the user context. You need this if the sender is a
>>> non-admin account to call ImpersonateNamedPipeClient in the receiver.
>>> So we might need to keep the pipes even if just for the purpose of being
>>> able to call ImpersonateNamedPipeClient...
>>>
>>>
>>> Thoughts?
>>
>> Sounds great. Thanks.
>
> Don't start just yet.
>
> I'm still not quite sure if that's really the way to go. As I see it we
> still have something to discuss here.
>
> For one thing, using native AF_UNIX sockets will split our user base
> into two. Those who are not using a recent enough Windows will get the
> old code and no descriptor passing. However, if an application has been
> built with descriptor passing, it won't work for those running older
> Windows versions. I don't think we want that for the distro, or do we?
Good point. Sounds like a nightmare.
> Next problem... implementing actual STREAM sockets. Even using native
> AF_UNIX sockets, these, too, would have to encapsulate the actual
> payload because of the ancillary data we want to send with them.
> Whether or not we use native AF_UNIX sockets, they won't be compatible
> with native applications...
>
> So maybe we should really think hard about the alternative
> implementation using POSIX message queues, I guess. And *if* we do
> that, this should be used likewise for STREAM as for DGRAM sockets, so
> the code is easier to maintain. Obvious advantage: No problem with
> older OS versions. And maybe it's even dirt easy to implement in
> comparison with using other methods, because the transport mechanism
> is already in place.
Yes, I don't think it should be too hard. The one thing I can think of that's
missing is a facility for doing a partial read of a message on the message
queue. (This would be needed for a recv call on a STREAM socket, in which the
buffer is smaller than the payload of the next message on the queue.) But this
should be straightforward to implement.
Alternatively, I guess we could read the whole message and store the excess in a
readahead buffer.
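
A minimal sketch of the readahead variant, again in Python purely for illustration. A `collections.deque` stands in for the message queue, and the class name `StreamOverMessages` is hypothetical; a real implementation would block on an empty queue instead of returning an empty result.

```python
from collections import deque


class StreamOverMessages:
    """STREAM-style recv on top of a whole-message transport (sketch)."""

    def __init__(self, queue):
        self.queue = queue      # stand-in for the message queue
        self.readahead = b""    # unread tail of the last dequeued message

    def recv(self, bufsize):
        # Only touch the queue once the readahead buffer is drained.
        if not self.readahead:
            if not self.queue:
                return b""      # a real socket would block here
            self.readahead = self.queue.popleft()
        # Hand out at most bufsize bytes and keep the excess for next time.
        data = self.readahead[:bufsize]
        self.readahead = self.readahead[bufsize:]
        return data
```

With messages b"abcdef" and b"gh" queued, successive `recv(4)` calls yield b"abcd", b"ef" and b"gh": short reads are allowed on a stream, so there is no need to coalesce across messages.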
> What's missing is the ImpersonateNamedPipeClient stuff (but that's not
> different from using native AF_UNIX) and reflections about the permission
> handling.
On 4/29/2021 11:18 AM, Corinna Vinschen wrote:
> While searching the net I found this additional gem of information:
>
> Native AF_UNIX sockets don't support abstract sockets. You must bind to
> a valid path, so you always have a visible file in the filesystem.
> Discussed here: https://github.com/microsoft/WSL/issues/4240
>
> We could work around that with our POSIX unlink semantics, probably,
> but it's YA downside
Agreed. The more features that are missing from native AF_UNIX sockets, the
less appealing they become.
Concerning abstract sockets, would we still have an issue if we used message
queues? Wouldn't there be a visible file under /dev/mqueue? Or is there a way
around that?
Ken