Re: Hangs on connect to UNIX socket being listened on in the same process (was: Cygwin hanging in pselect)

public inbox for cygwin@cygwin.com

* Re: Hangs on connect to UNIX socket being listened on in the same process (was: Cygwin hanging in pselect)
@ 2017-01-09 13:29 Erik Bray
  2017-01-09 14:13 ` Corinna Vinschen
  0 siblings, 1 reply; 8+ messages in thread
From: Erik Bray @ 2017-01-09 13:29 UTC (permalink / raw)
  To: cygwin

On Mon, Jan 9, 2017 at 12:01 PM, Erik Bray <erik.m.bray@gmail.com> wrote:
> On Fri, Jan 6, 2017 at 12:40 PM, Erik Bray <erik.m.bray@gmail.com> wrote:
>> Hello, and happy new-ish year,
>>
>> I've been working on and off over the past few months on bringing
>> Python's compatibility with Cygwin up to snuff, including having all
>> pertinent tests passing.  I've noticed that there are several tests
>> (which I currently skip) that cause the process to hang indefinitely,
>> and not respond to any signals from Cygwin (it can only be killed from
>> Windows).  This is Cygwin 64-bit--I have not tested 32-bit.
>>
>> I finally looked into this problem and found the lockup to be in
>> pselect() somewhere.  Attached I've provided the most minimal example
>> I've been able to come up with so far that reproduces the problem,
>> which I'll describe in a bit more detail next. I would attach a
>> cygcheck output if requested, but I was also able to reproduce this on
>> a recent build from source.
>>
>> So far as I've been able to tell, the problem only occurs with AF_UNIX
>> sockets.  In the example I have a 'server' socket and a 'client'
>> socket both set to non-blocking.  The client connects to the socket,
>> returning errno EINPROGRESS as expected.  Then I do a pselect on the
>> client socket to wait until it is ready to be read from.  The hang
>> only happens when I pselect on the client socket, and not on the
>> server socket.  It doesn't seem to make a difference what the timeout
>> is.  One thing I have not tried is if the client and server are
>> actually different processes, but the example from the Python tests
>> this is reproducing is where they are both in the same process.
>>
>> Below is (I think) the most relevant output from strace on the test
>> case.  It seems to hang somewhere in socket_cleanup, but I haven't
>> investigated any further than that.
>
> I made a little bit of progress debugging this, but now I'm stumped.
> It seems the problem is this:
>
> For each socket whose fd is passed to select() a thread_socket is
> started which calls peek_socket until there are bits ready on the
> socket, or until the timeout is reached.  This in turn calls
> fhandler_socket::evaluate_events.
>
> The reason it's only locking up on my "client thread" on which
> connect() is called, is that evaluate_events notes that the socket is
> waiting to connect, and this passes control to
> fhandler_socket::af_local_connect().  af_local_connect() temporarily
> sets the socket to blocking, then sends a magic string to the socket
> (you can see in my strace log that this succeeds).  What's strange,
> and what I don't understand, is that there are no FD_READ or FD_OOB
> events recorded for the WSASendTo call from af_local_send_secret().
> Then, after af_local_send_secret() it calls af_local_recv_secret().
> This calls recv_internal() which in turn calls recursively into
> fhandler_socket::evaluate_events where it waits for an FD_READ or
> FD_OOB event that never arrives.  And since it set the socket to
> blocking it just sits in an infinite loop.
>
> Meanwhile the timer for the select() call expires and tries to shut
> down the thread_socket but it can't because it never completes.
>
> What I don't understand is why there is not an event recorded for the
> WSASendTo in send_internal.  I even wrapped it with the following
> debug code to wait for an FD_READ event immediately following the
> WSASendTo:
>
>       else if (get_socket_type () == SOCK_STREAM)
>       {
>         WSAEventSelect(get_socket (), wsock_evt, EVENT_MASK);
>         res = WSASendTo (get_socket (), out_buf, out_idx, &ret, flags,
>                  wsamsg->name, wsamsg->namelen, NULL, NULL);
>           debug_printf("WSASendTo sent %d bytes; ret: %d", ret, res);
>           while (!(res=wait_for_events (FD_READ | FD_OOB, 0))) {
>               debug_printf("Waiting for socket to be readable");
>           }
>       }
>
>
>
> But the strace at this point just outputs:
>    62  108286 [socksel] poll_test 24152
> fhandler_socket::af_local_connect: af_local_connect called,
> no_getpeereid=0
>   156  108442 [socksel] poll_test 24152
> fhandler_socket::send_internal: WSASendTo sent 16 bytes; ret: 0
>
> It never returns from send_internal.  I don't have deep knowledge of
> WinSock, but from what I've read ISTM WSASendTo should have triggered
> an FD_READ event on the socket, and it doesn't for some reason.

After playing around with this a bit more I came up with a much
simpler example.  This has nothing to do with select( ) at all,
directly.

The simplified example is just:

#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

int main(void) {
    int sock_server, sock_client;
    int retval;
    struct sockaddr_un addr;

    /* Server and client share the same AF_UNIX address. */
    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, "@test.sock");

    /* Server side: bind and listen, but never call accept(). */
    sock_server = socket(AF_UNIX, SOCK_STREAM, 0);
    if (bind(sock_server, (struct sockaddr*)&addr, sizeof(addr))) {
        perror("binding server socket failed");
        return 1;
    }

    retval = listen(sock_server, 5);
    printf("Ret from listen: %d\n", retval);

    /* Client side, same process: this connect() is the call that hangs. */
    sock_client = socket(AF_UNIX, SOCK_STREAM, 0);
    errno = 0;
    retval = connect(sock_client, (struct sockaddr*)&addr, sizeof(addr));
    printf("Ret from client connect: %d; errno: %d\n", retval, errno);

    return 0;
}


On Linux this example works as I expect, and the connect() call
returns immediately.  However, on Cygwin the connect() call hangs
after af_local_send_secret(), as described in my first message.

However, when I split this example up into separate client and server
processes it works as expected and the connect() is properly
negotiated and returns immediately.

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple


* Re: Hangs on connect to UNIX socket being listened on in the same process (was: Cygwin hanging in pselect)
  2017-01-09 13:29 Hangs on connect to UNIX socket being listened on in the same process (was: Cygwin hanging in pselect) Erik Bray
@ 2017-01-09 14:13 ` Corinna Vinschen
  2017-01-09 15:46   ` Erik Bray
  0 siblings, 1 reply; 8+ messages in thread
From: Corinna Vinschen @ 2017-01-09 14:13 UTC (permalink / raw)
  To: cygwin

Hi Erik,

On Jan  9 14:29, Erik Bray wrote:
> On Mon, Jan 9, 2017 at 12:01 PM, Erik Bray <erik.m.bray@gmail.com> wrote:
> > On Fri, Jan 6, 2017 at 12:40 PM, Erik Bray <erik.m.bray@gmail.com> wrote:
> >> Hello, and happy new-ish year,
> >>
> >> I've been working on and off over the past few months on bringing
> >> Python's compatibility with Cygwin up to snuff, including having all
> >> pertinent tests passing.  I've noticed that there are several tests
> >> (which I currently skip) that cause the process to hang indefinitely,
> >> and not respond to any signals from Cygwin (it can only be killed from
> >> Windows).  This is Cygwin 64-bit--I have not tested 32-bit.
> >> [...]
> > I made a little bit of progress debugging this, but now I'm stumped.
> > It seems the problem is this:
> >
> > For each socket whose fd is passed to select() a thread_socket is
> > started which calls peek_socket until there are bits ready on the

Yes and no.  One thread_socket is called per 62 sockets, to account
for the maximum number of handles per WaitForMultipleObjects call.
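
As a rough illustration of that ratio (this is not Cygwin's actual code):
Windows limits WaitForMultipleObjects to MAXIMUM_WAIT_OBJECTS (64) handles
per call, and with the 62 sockets per thread mentioned above the number of
helper threads is a ceiling division.  SOCKETS_PER_THREAD and
threads_needed() are names made up for this sketch.

```c
#include <stddef.h>

/* Assumption taken from the message above: one thread_socket helper
   thread watches at most 62 sockets. */
#define SOCKETS_PER_THREAD 62

/* Hypothetical helper: how many helper threads a select() call over
   nsockets sockets would need, using ceiling division. */
size_t threads_needed(size_t nsockets)
{
    return (nsockets + SOCKETS_PER_THREAD - 1) / SOCKETS_PER_THREAD;
}
```

So a select() on a single socket, as in the test case, would involve one
such thread, and a second thread would only appear from the 63rd socket on.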

> > socket, or until the timeout is reached.  This in turn calls
> > fhandler_socket::evaluate_events.
> > [...]
> After playing around with this a bit more I came up with a much
> simpler example.  This has nothing to do with select( ) at all,
> directly.

Right.  It has to do with how connect/accept works on AF_LOCAL sockets.
The handshake doesn't work well for situations like yours, where the
same thread tries to connect and accept on the same socket.

This was already found to be a problem when porting postfix, and at the
time we added a patch to circumvent it.  Before calling connect, add
this:

  setsockopt (sock_server, SOL_SOCKET, SO_PEERCRED, NULL, 0);
  setsockopt (sock_client, SOL_SOCKET, SO_PEERCRED, NULL, 0);

This is, of course, a hack.  The problem here is that server and client
of a socket are independent of each other, and there's typically no
way to know which process created the server side unless you already
are connected.  Chicken/egg.
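
Applied to the simplified test case, the workaround could look roughly
like the sketch below.  The helper name connect_in_same_process() and the
error handling are made up for illustration; only the two setsockopt calls
come from the message above.  The return value of setsockopt is ignored on
purpose, on the assumption that other systems (Linux, for instance) reject
this call and do not need it.

```c
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Hypothetical helper: listen and connect on the same AF_UNIX path from
   a single thread.  Returns 0 if connect() succeeded, -1 otherwise. */
int connect_in_same_process(const char *path)
{
    struct sockaddr_un addr;
    int server = -1, client = -1, rc = -1;

    if (strlen(path) >= sizeof(addr.sun_path))
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path);
    unlink(path);                       /* remove a stale socket file */

    server = socket(AF_UNIX, SOCK_STREAM, 0);
    client = socket(AF_UNIX, SOCK_STREAM, 0);
    if (server < 0 || client < 0)
        goto out;
    if (bind(server, (struct sockaddr *)&addr, sizeof(addr)) != 0)
        goto out;
    if (listen(server, 5) != 0)
        goto out;

#ifdef SO_PEERCRED
    /* The workaround: on Cygwin this disables the AF_LOCAL handshake on
       both ends.  Failure is ignored deliberately (see the text above). */
    (void) setsockopt(server, SOL_SOCKET, SO_PEERCRED, NULL, 0);
    (void) setsockopt(client, SOL_SOCKET, SO_PEERCRED, NULL, 0);
#endif

    /* No accept() has been called, yet connect() should now return. */
    if (connect(client, (struct sockaddr *)&addr, sizeof(addr)) == 0)
        rc = 0;

out:
    if (client >= 0)
        close(client);
    if (server >= 0)
        close(server);
    unlink(path);
    return rc;
}
```

Note that with the handshake disabled, the peer credentials are not
exchanged either, which is part of why this is only a hack.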

While replying to your mail, a thought occurred to me, though.

We might get away without the above setsockopt calls by adding a check
to connect.  It could test if the socket has already been opened by
the same process and is bound.  This could be accomplished by scanning
the file descriptor table (dtable) of the process.  If we find it,
we set the above socket option on both ends and continue without the
secret and credential check.  Credentials could be set manually since we
know user, group, and pid at this point.

It's a bit of work but might be feasible.

Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat


* Re: Hangs on connect to UNIX socket being listened on in the same process (was: Cygwin hanging in pselect)
  2017-01-09 14:13 ` Corinna Vinschen
@ 2017-01-09 15:46   ` Erik Bray
  2017-01-09 17:16     ` Corinna Vinschen
  0 siblings, 1 reply; 8+ messages in thread
From: Erik Bray @ 2017-01-09 15:46 UTC (permalink / raw)
  To: cygwin

Hi Corinna,

Thanks for the response.

On Mon, Jan 9, 2017 at 3:13 PM, Corinna Vinschen
<corinna-cygwin@cygwin.com> wrote:
> Hi Erik,
>
> On Jan  9 14:29, Erik Bray wrote:
>> On Mon, Jan 9, 2017 at 12:01 PM, Erik Bray <erik.m.bray@gmail.com> wrote:
>> > On Fri, Jan 6, 2017 at 12:40 PM, Erik Bray <erik.m.bray@gmail.com> wrote:
>> >> Hello, and happy new-ish year,
>> >>
>> >> I've been working on and off over the past few months on bringing
>> >> Python's compatibility with Cygwin up to snuff, including having all
>> >> pertinent tests passing.  I've noticed that there are several tests
>> >> (which I currently skip) that cause the process to hang indefinitely,
>> >> and not respond to any signals from Cygwin (it can only be killed from
>> >> Windows).  This is Cygwin 64-bit--I have not tested 32-bit.
>> >> [...]
>> > I made a little bit of progress debugging this, but now I'm stumped.
>> > It seems the problem is this:
>> >
>> > For each socket whose fd is passed to select() a thread_socket is
>> > started which calls peek_socket until there are bits ready on the
>
> Yes and no.  One thread_socket is called per 62 sockets, to account
> for the maximum number of handles per WaitForMultipleObjects call.
>
>> > socket, or until the timeout is reached.  This in turn calls
>> > fhandler_socket::evaluate_events.
>> > [...]
>> After playing around with this a bit more I came up with a much
>> simpler example.  This has nothing to do with select( ) at all,
>> directly.
>
> Right.  It has to do with how connect/accept works on AF_LOCAL sockets.
> The handshake doesn't work well for situations like yours, where the
> same thread tries to connect and accept on the same socket.

Actually I'm not entirely sure now that that's the issue, even
considering that this has come up before.  Or at the very least,
there's an additional issue.  I realized that when I tried separate
client/server processes, in the server I had put an accept() call at
the end so it would block there.  With the server waiting to accept a
connection it succeeded.  However, when I replaced the accept() with a
long sleep(), the client's connect() never returns.

IIUC the handshake can't succeed until and unless the server accepts a
connection from the client.  On Linux, however, connect() returns
immediately after a successful TCP handshake, and the connection is
placed on the server's listen queue.  I don't know if the same holds
on Windows.  But since the underlying winsock is in non-blocking mode
anyways it shouldn't have to then block until the af_local handshake
can succeed.  I almost wonder if the server side in this case
shouldn't start up a thread to accept the af_local handshake, but you
would know better.
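
To make the Linux behaviour I'm describing concrete, here is a minimal
sketch (the function name connect_before_accept() is made up for this
mail, and I have only reasoned about it for Linux).  The client socket is
non-blocking, the server only calls listen(), and accept() is called
after connect() has already completed:

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Hypothetical helper.  Returns 0 if connect() completed before accept()
   was called and accept() then picked up the queued connection. */
int connect_before_accept(const char *path)
{
    struct sockaddr_un addr;
    struct timeval tv = { 2, 0 };       /* upper bound for the wait */
    fd_set wfds;
    socklen_t len = sizeof(int);
    int server = -1, client = -1, peer = -1, err = 0, rc = -1;

    if (strlen(path) >= sizeof(addr.sun_path))
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path);
    unlink(path);

    server = socket(AF_UNIX, SOCK_STREAM, 0);
    client = socket(AF_UNIX, SOCK_STREAM, 0);
    if (server < 0 || client < 0)
        goto out;
    if (bind(server, (struct sockaddr *)&addr, sizeof(addr)) != 0
        || listen(server, 5) != 0)
        goto out;
    if (fcntl(client, F_SETFL, fcntl(client, F_GETFL) | O_NONBLOCK) != 0)
        goto out;

    if (connect(client, (struct sockaddr *)&addr, sizeof(addr)) != 0) {
        if (errno != EINPROGRESS)
            goto out;
        /* Wait until the connect finishes, then fetch its result. */
        FD_ZERO(&wfds);
        FD_SET(client, &wfds);
        if (select(client + 1, NULL, &wfds, NULL, &tv) <= 0)
            goto out;
        if (getsockopt(client, SOL_SOCKET, SO_ERROR, &err, &len) != 0 || err)
            goto out;
    }

    /* Only now does the "server" accept; the connection is already queued. */
    peer = accept(server, NULL, NULL);
    if (peer >= 0)
        rc = 0;

out:
    if (peer >= 0)
        close(peer);
    if (client >= 0)
        close(client);
    if (server >= 0)
        close(server);
    unlink(path);
    return rc;
}
```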

> This has been found a problem in porting postfix already and at the time
> we added a patch to circumvent the problem.  Before calling connect, add
> this:
>
>   setsockopt (sock_server, SOL_SOCKET, SO_PEERCRED, NULL, 0);
>   setsockopt (sock_client, SOL_SOCKET, SO_PEERCRED, NULL, 0);
>
> This is, of course, a hack.  The problem here is that server and client
> of a socket are independent of each other, and there's typically no
> way to know which process created the server side unless you already
> are connected.  Chicken/egg.

I tried it and it worked, both in the single process and separate
process examples.  I see now--this sets
fhandler_socket::no_getpeereid=true, so it doesn't have to do the
handshake at all.

> While replying to your mail, a thought occurred to me, though.
>
> We might get away without the above setsockopt calls by adding a check
> to connect.  It could test if the socket has already been opened by
> the same process and is bound.  This could be accomplished by scanning
> the file descriptor table (dtable) of the process.  If we find it,
> we set the above socket option on both ends and continue without the
> secret and credential check.  Credentials could be set manually since we
> know user, group, and pid at this point.
>
> It's a bit of work but might be feasible.

I see what you're saying, but it appears that would only work in the
case where both sockets are opened by the same process.  Of course,
that was my original use case, but now I'm realizing the problem
extends beyond that--that the handshake can't complete unless the
server is explicitly accepting connections.

Thanks,
Erik


* Re: Hangs on connect to UNIX socket being listened on in the same process (was: Cygwin hanging in pselect)
  2017-01-09 15:46   ` Erik Bray
@ 2017-01-09 17:16     ` Corinna Vinschen
  2017-01-12 10:59       ` Erik Bray
  0 siblings, 1 reply; 8+ messages in thread
From: Corinna Vinschen @ 2017-01-09 17:16 UTC (permalink / raw)
  To: cygwin

On Jan  9 16:46, Erik Bray wrote:
> Hi Corinna,
> 
> Thanks for the response.
> 
> On Mon, Jan 9, 2017 at 3:13 PM, Corinna Vinschen wrote:
> > Right.  It has to do with how connect/accept works on AF_LOCAL sockets.
> > The handshake doesn't work well for situations like yours, where the
> > same thread tries to connect and accept on the same socket.
> 
> Actually I'm not entirely sure now that that's the issue, even
> considering that this has come up before.  Or at the very least,
> there's an additional issue.  I realized that when I tried separate
> client/server processes, in the server I had put an accept() call at
> the end so it would block there.  With the server waiting to accept a
> connection it succeeded.  However, when I replaced the accept() with a
> long sleep(), the client's connect() never returns.

That's because connect waits indefinitely for accept to reply with the
second half of the handshake.
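
The two halves can be pictured with the simplified sketch below.  This is
not the Cygwin implementation: it uses a socketpair to stand in for the
two ends, a made-up 16-byte secret (the length matches the 16 bytes in
your strace output), and hypothetical helper names.  The point is the
ordering: the connecting side's final receive can only complete once the
accepting side has sent its reply.

```c
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>

#define SECRET_LEN 16   /* assumption: same length as seen in the strace */

/* Write exactly len bytes; returns 0 on success, -1 on error. */
static int write_full(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n <= 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Read exactly len bytes; returns 0 on success, -1 on error or EOF. */
static int read_full(int fd, void *buf, size_t len)
{
    char *p = buf;
    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n <= 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Receive the peer's secret and compare it with the expected one. */
static int recv_and_check(int fd, const unsigned char *expected)
{
    unsigned char got[SECRET_LEN];
    if (read_full(fd, got, SECRET_LEN) != 0)
        return -1;
    return memcmp(got, expected, SECRET_LEN) == 0 ? 0 : -1;
}

/* Hypothetical demo: sv[0] plays the connecting side, sv[1] the
   accepting side.  Returns 0 if both halves of the exchange succeed. */
int handshake_demo(void)
{
    static const unsigned char secret[SECRET_LEN] = "0123456789abcde";
    int sv[2], rc = -1;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;
    if (write_full(sv[0], secret, SECRET_LEN) == 0      /* connect: send   */
        && recv_and_check(sv[1], secret) == 0           /* accept: verify  */
        && write_full(sv[1], secret, SECRET_LEN) == 0   /* accept: reply   */
        && recv_and_check(sv[0], secret) == 0)          /* connect: verify */
        rc = 0;
    close(sv[0]);
    close(sv[1]);
    return rc;
}
```

If the accepting side never runs its two steps, the last call on the
connecting side blocks, which is the hang you are seeing.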

> IIUC the handshake can't succeed until and unless the server accepts a
> connection from the client.

This is exactly the underlying problem.  And interestingly enough, even
though the handshake has been in Cygwin since 2001, we never had a problem
with this until Christian started porting postfix in 2014!

> I almost wonder if the server side in this case
> shouldn't start up a thread to accept the af_local handshake, but you
> would know better.

No, I don't.  We discussed this issue briefly back in 2014, but as
you can see we don't have a solution for this border case yet.

Starting a thread may or may not work, but there are a couple of
use-cases to keep in mind (which I can't reproduce off the top of my head).
The old postfix cygwin-apps thread from 2014 might give you some idea.

> > This has been found a problem in porting postfix already and at the time
> > we added a patch to circumvent the problem.  Before calling connect, add
> > this:
> >
> >   setsockopt (sock_server, SOL_SOCKET, SO_PEERCRED, NULL, 0);
> >   setsockopt (sock_client, SOL_SOCKET, SO_PEERCRED, NULL, 0);
> >
> > This is, of course, a hack.  The problem here is that server and client
> > of a socket are independent of each other, and there's typically no
> > way to know which process created the server side unless you already
> > are connected.  Chicken/egg.
> 
> I tried it and it worked, both in the single process and separate
> process examples.  I see now--this sets
> fhandler_socket::no_getpeereid=true, so it doesn't have to do the
> handshake at all.

Right.  A better solution for the problem would be nice.  Ultimately
we want to check if the other side of the socket is actually a Cygwin
process which knows the secret, not a stray native Windows process
which accidentally hopped on the bandwagon, and we want to exchange
the credentials so a subsequent SO_PEERCRED call returns correct values.


Corinna


* Re: Hangs on connect to UNIX socket being listened on in the same process (was: Cygwin hanging in pselect)
  2017-01-09 17:16     ` Corinna Vinschen
@ 2017-01-12 10:59       ` Erik Bray
  2017-01-12 22:13         ` Corinna Vinschen
  0 siblings, 1 reply; 8+ messages in thread
From: Erik Bray @ 2017-01-12 10:59 UTC (permalink / raw)
  To: cygwin

On Mon, Jan 9, 2017 at 6:16 PM, Corinna Vinschen
<corinna-cygwin@cygwin.com> wrote:
> On Jan  9 16:46, Erik Bray wrote:
>> Hi Corinna,
>>
>> Thanks for the response.
>>
>> On Mon, Jan 9, 2017 at 3:13 PM, Corinna Vinschen wrote:
>> > Right.  It has to do with how connect/accept works on AF_LOCAL sockets.
>> > The handshake doesn't work well for situations like yours, where the
>> > same thread tries to connect and accept on the same socket.
>>
>> Actually I'm not entirely sure now that that's the issue, even
>> considering that this has come up before.  Or at the very least,
>> there's an additional issue.  I realized that when I tried separate
>> client/server processes, in the server I had put an accept() call at
>> the end so it would block there.  With the server waiting to accept a
>> connection it succeeded.  However, when I replaced the accept() with a
>> long sleep(), the client's connect() never returns.
>
> That's because connect infinitely waits for the accept to reply the
> second half of the handshake.
>
>> IIUC the handshake can't succeed until and unless the server accepts a
>> connection from the client.
>
> This is exactly the underlying problem.  And interesting enough, even
> though the handshake is in Cygwin since 2001, we never had a problem
> with this until Christian started porting postfix in 2014!
>
>> I almost wonder if the server side in this case
>> shouldn't start up a thread to accept the af_local handshake, but you
>> would know better.
>
> No, I don't.  We discussed this issue briefly back in 2014, but as
> you can see we don't have a solution for this border case yet.
>
> Starting a thread may or may not work, but there are a couple of
> use-cases to keep in mind (which I can't reproduce off the top of my head).
> The old postfix cygwin-apps thread from 2014 might give you some idea.
>
>> > This has been found a problem in porting postfix already and at the time
>> > we added a patch to circumvent the problem.  Before calling connect, add
>> > this:
>> >
>> >   setsockopt (sock_server, SOL_SOCKET, SO_PEERCRED, NULL, 0);
>> >   setsockopt (sock_client, SOL_SOCKET, SO_PEERCRED, NULL, 0);
>> >
>> > This is, of course, a hack.  The problem here is that server and client
>> > of a socket are independent of each other, and there's typically no
>> > way to know which process created the server side unless you already
>> > are connected.  Chicken/egg.
>>
>> I tried it and it worked, both in the single process and separate
>> process examples.  I see now--this sets
>> fhandler_socket::no_getpeereid=true, so it doesn't have to do the
>> handshake at all.
>
> Right.  A better solution for the problem would be nice.  Ultimately
> we want to check if the other side of the socket is actually a Cygwin
> process which knows the secret, not a stray native Windows process
> which accidentally hopped on the bandwagon, and we want to exchange
> the credentials so a subsequent SO_PEERCRED call returns correct values.

Ah, okay. I found the original thread you mentioned, and I see that
you sort of discussed some possibilities but nothing was quite
satisfactory at the time, and it was dropped--you mentioned some idea
about exchanging information via pipes, but that was a bit complicated
and half-baked.

Christian described a scheme in that thread which at least seemed like
a way out of the connect hanging problem, and also improved the
security (I think) by having separate server and client secrets, so
that a malicious server could not gain the socket secret from the
client.  But he also worried:

> The only drawback which remains is that the client performs the send()
> before first recv() unconditionally. It will realize the bad server secret
> lately on first recv().

Though you wrote:

> Yeah, but it might be better than nothing and if it avoids the hangs,
> even better.

Which is sort of how I feel, though I do appreciate the security
implication.  One workaround to that which I think might be relatively
simple:  In Christian's scheme, after a connect() the client would be
in a "connected but secret missing" state.  What I would propose
adding is that the client then fires up a thread to wait on receiving
the server's secret (which it would send after receiving the client's
secret in an accept()).  Meanwhile, while the client is in the "secret
missing" state, any subsequent send()s would place the sent data on a
local buffer (no bigger than getsockopt(SO_SNDBUF) ?) that would only
get flushed out to actual WSASendTo calls once the server secret is
received.

The only downside I see to this is the added overhead of having to
start a thread for the purpose of waiting to receive the server's
secret which--in many common cases--would be unneeded since the server
may accept() immediately.  So in that case we might default to
blocking to receive the server's secret, but with a relatively brief
timeout, and then only start up a thread in case the server secret
isn't received quickly.
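
To make the buffering part of the idea a bit more concrete, here is a
small sketch of the state I have in mind.  Everything in it is
hypothetical (the names, the fixed PENDING_CAP standing in for the
SO_SNDBUF size, and the "flushed" counter standing in for the real
WSASendTo calls); it only models the bookkeeping, not the thread or the
socket I/O.

```c
#include <stddef.h>
#include <string.h>

#define PENDING_CAP 4096    /* assumption: stand-in for the SO_SNDBUF size */

enum conn_state { SECRET_MISSING, SECRET_OK };

struct pending_conn {
    enum conn_state state;
    size_t used;                        /* bytes waiting in buf */
    size_t flushed;                     /* bytes handed to the "real" send */
    unsigned char buf[PENDING_CAP];
};

/* State right after connect(): connected, but server secret not seen. */
void pending_init(struct pending_conn *c)
{
    c->state = SECRET_MISSING;
    c->used = 0;
    c->flushed = 0;
}

/* send() replacement: queue locally while the secret is missing,
   otherwise pass straight through.  Returns the bytes accepted, which
   may be short when the local buffer is full. */
size_t pending_send(struct pending_conn *c, const void *data, size_t len)
{
    size_t room;

    if (c->state == SECRET_OK) {
        c->flushed += len;              /* real code would send here */
        return len;
    }
    room = PENDING_CAP - c->used;
    if (len > room)
        len = room;
    memcpy(c->buf + c->used, data, len);
    c->used += len;
    return len;
}

/* Called once the server's secret has arrived and been verified:
   flush the queued bytes and switch to pass-through mode. */
void pending_secret_received(struct pending_conn *c)
{
    c->flushed += c->used;              /* real code would send buf here */
    c->used = 0;
    c->state = SECRET_OK;
}
```

If the server's secret turns out to be wrong, the queued data would simply
be dropped and the socket put into an error state instead of flushing.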

Best,
Erik


* Re: Hangs on connect to UNIX socket being listened on in the same process (was: Cygwin hanging in pselect)
  2017-01-12 10:59       ` Erik Bray
@ 2017-01-12 22:13         ` Corinna Vinschen
  2017-01-13  0:54           ` Michael Enright
  0 siblings, 1 reply; 8+ messages in thread
From: Corinna Vinschen @ 2017-01-12 22:13 UTC (permalink / raw)
  To: cygwin

On Jan 12 11:59, Erik Bray wrote:
> On Mon, Jan 9, 2017 at 6:16 PM, Corinna Vinschen wrote:
> > Right.  A better solution for the problem would be nice.  Ultimately
> > we want to check if the other side of the socket is actually a Cygwin
> > process which knows the secret, not a stray native Windows process
> > which accidentally hopped on the bandwagon, and we want to exchange
> > the credentials so a subsequent SO_PEERCRED call returns correct values.
> 
> Ah, okay. I found the original thread you mentioned, and I see that
> you sort of discussed some possibilities but nothing was quite
> satisfactory at the time, and it was dropped--you mentioned some idea
> about exchanging information via pipes, but that was a bit complicated
> and half-baked.
> 
> Christian described a scheme in that thread which at least seemed like
> a way out of the connect hanging problem, and also improved the
> security (I think) by having separate server and client secrets, so
> that a malicious server could not gain the socket secret from the
> client.  But he also worried:
> 
> > The only drawback which remains is that the client performs the send()
> > before first recv() unconditionally. It will realize the bad server secret
> > lately on first recv().
> 
> Though you wrote:
> 
> > Yeah, but it might be better than nothing and if it avoids the hangs,
> > even better.
> 
> Which is sort of how I feel, though I do appreciate the security
> implication.

I'm not sure there actually are security implications.  AF_LOCAL sockets
are local only, with no network access.  Every Cygwin process that knows
the name of the socket file and has sufficient permission to read the
file can connect.  So can a non-Cygwin process written by someone who
knows how Cygwin AF_LOCAL sockets work.  It's open source after all.

What this method mainly solves is to make reasonably sure that the peers
are actually Cygwin processes on both sides which know that this is an
AF_INET socket emulating an AF_LOCAL socket.  Plus SO_PEERCRED
emulation, but that's another problem just attached to the original
handshake.

Maybe we need to take a step back and just consider for a while what we
want:

Step 1:

  Make sure with whatever method that the process on the other side is
  actually a Cygwin process opening this socket as an AF_LOCAL socket.

Step 2:

  Exchange SO_PEERCRED information.

Step 3:

  If we did it really intelligently, maybe we would finally also have a
  method to implement descriptor passing.  Finally.  After all these years.

And maybe we should not actually use the socket itself to exchange
the information, but rather create some kind of side-channel for that.

Especially in terms of step 3, I've been mulling this over for years now,
and something else always got in the way and had to be done first.
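
For reference, this is roughly what step 3 looks like from the application
side on a POSIX system that supports it: a descriptor travels over a
connected AF_UNIX socket in an SCM_RIGHTS control message via sendmsg()
and recvmsg().  The sketch shows the interface applications expect, not
how Cygwin would implement it; send_fd(), recv_fd() and fd_passing_demo()
are hypothetical helper names.

```c
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Send descriptor fd over the connected AF_UNIX socket sock.
   Returns 0 on success, -1 on error. */
int send_fd(int sock, int fd)
{
    char dummy = 'x';                   /* one data byte carries the message */
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union { struct cmsghdr hdr; char buf[CMSG_SPACE(sizeof(int))]; } ctl;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&msg, 0, sizeof(msg));
    memset(&ctl, 0, sizeof(ctl));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof(ctl.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor from sock.  Returns the new fd, or -1. */
int recv_fd(int sock)
{
    char dummy;
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union { struct cmsghdr hdr; char buf[CMSG_SPACE(sizeof(int))]; } ctl;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd = -1;

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof(ctl.buf);

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;
    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET
        || cmsg->cmsg_type != SCM_RIGHTS)
        return -1;
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}

/* Demo: pass the write end of a pipe across a socketpair, write through
   the received copy, and read the byte back from the pipe's read end.
   Returns 0 on success, -1 otherwise. */
int fd_passing_demo(void)
{
    int sv[2], p[2], got = -1, ok = 0;
    char c = 0;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;
    if (pipe(p) != 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    if (send_fd(sv[0], p[1]) == 0 && (got = recv_fd(sv[1])) >= 0) {
        if (write(got, "k", 1) == 1 && read(p[0], &c, 1) == 1 && c == 'k')
            ok = 1;
        close(got);
    }
    close(p[0]);
    close(p[1]);
    close(sv[0]);
    close(sv[1]);
    return ok ? 0 : -1;
}
```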

Corinna


* Re: Hangs on connect to UNIX socket being listened on in the same process (was: Cygwin hanging in pselect)
  2017-01-12 22:13         ` Corinna Vinschen
@ 2017-01-13  0:54           ` Michael Enright
  2017-01-13  8:42             ` Corinna Vinschen
  0 siblings, 1 reply; 8+ messages in thread
From: Michael Enright @ 2017-01-13  0:54 UTC (permalink / raw)
  To: cygwin

On Thu, Jan 12, 2017 at 2:13 PM, Corinna Vinschen
<corinna-cygwin@cygwin.com> wrote:
> Step 3:
>
>   If we did it really intelligent, maybe we finally also have a method
>   to implement descriptor passing.  Finally.  After all these years.
>
> And maybe, we should not actually use the socket itself to exchange
> the information but rather create some kind of side-channel for that.
>
> Especially in terms of step 3, I'm mulling over this for years now
> and always something else got in the way and had to be done first.
>
>

I made a program that needed to pass Windows HANDLEs between processes
so that the receiving process could access the shared memory
represented by the HANDLEs. I was emulating facilities many programs
implement using sendmsg, but I was using Windows (named?) pipes. It
felt a lot like what you need for sendmsg, and it required newer
Windows APIs. So by doing the crazy thing of completely rewriting your
AF_UNIX sockets you could "easily" add descriptor passing.


* Re: Hangs on connect to UNIX socket being listened on in the same process (was: Cygwin hanging in pselect)
  2017-01-13  0:54           ` Michael Enright
@ 2017-01-13  8:42             ` Corinna Vinschen
  0 siblings, 0 replies; 8+ messages in thread
From: Corinna Vinschen @ 2017-01-13  8:42 UTC (permalink / raw)
  To: cygwin

On Jan 12 16:54, Michael Enright wrote:
> On Thu, Jan 12, 2017 at 2:13 PM, Corinna Vinschen
> <corinna-cygwin@cygwin.com> wrote:
> > Step 3:
> >
> >   If we did it really intelligent, maybe we finally also have a method
> >   to implement descriptor passing.  Finally.  After all these years.
> >
> > And maybe, we should not actually use the socket itself to exchange
> > the information but rather create some kind of side-channel for that.
> >
> > Especially in terms of step 3, I'm mulling over this for years now
> > and always something else got in the way and had to be done first.
> >
> >
> 
> I made a program that needed to pass windows HANDLEs between processes
> and so that receiving process could access the shared memory
> represented by the HANDLEs. I was emulating facilities many programs
> implement using sendmsg, but I was using Windows (named?) pipes. It
> felt a lot like what you need for sendmsg, and it required newer
> Windows APIs. So by doing the crazy thing of completely rewriting your
> AF_UNIX sockets you could "easily" add descriptor passing.

/me spilled her coffee reading the word "easily".

I'm aware that named pipes have a facility to switch the user context,
which helps to handle the descriptor duplication.  I thought about this,
too, but it's really a lot of work since it doesn't fit well into the
current fhandler layout.

I'm not generally opposed to splitting off AF_LOCAL sockets from the
generic socket fhandler and rewriting them completely, but it took a long
time to get sockets to behave mostly POSIXy, and I fear we would introduce
a completely new set of POSIX incompatibilities which would take another
long time to iron out.  That's why I suggested using an additional named
pipe per AF_LOCAL socket as a side-channel.


Corinna

