From: Erik Bray
Date: Mon, 09 Jan 2017 15:46:00 -0000
Subject: Re: Hangs on connect to UNIX socket being listened on in the same process (was: Cygwin hanging in pselect)
To: cygwin@cygwin.com
In-Reply-To: <20170109141306.GB843@calimero.vinschen.de>
References: <20170109141306.GB843@calimero.vinschen.de>

Hi Corinna,

Thanks for the response.

On Mon, Jan 9, 2017 at 3:13 PM, Corinna Vinschen wrote:
> Hi Erik,
>
> On Jan  9 14:29, Erik Bray wrote:
>> On Mon, Jan 9, 2017 at 12:01 PM, Erik Bray wrote:
>> > On Fri, Jan 6, 2017 at 12:40 PM, Erik Bray wrote:
>> >> Hello, and happy new-ish year,
>> >>
>> >> I've been working on and off over the past few months on bringing
>> >> Python's compatibility with Cygwin up to snuff, including having all
>> >> pertinent tests passing.  I've noticed that there are several tests
>> >> (which I currently skip) that cause the process to hang indefinitely,
>> >> and not respond to any signals from Cygwin (it can only be killed from
>> >> Windows).  This is Cygwin 64-bit--I have not tested 32-bit.
>> >> [...]
>> > I made a little bit of progress debugging this, but now I'm stumped.
>> > It seems the problem is this:
>> >
>> > For each socket whose fd is passed to select() a thread_socket is
>> > started which calls peek_socket until there are bits ready on the
>
> Yes and no.  One thread_socket is called per 62 sockets, to account
> for the maximum number of handles per WaitForMultipleObjects call.
>
>> > socket, or until the timeout is reached.  This in turn calls
>> > fhandler_socket::evaluate_events.
>> > [...]
>> After playing around with this a bit more I came up with a much
>> simpler example.  This has nothing to do with select() at all,
>> directly.
>
> Right.  It has to do with how connect/accept works on AF_LOCAL sockets.
> The handshake doesn't work well for situations like yours, where the
> same thread tries to connect and accept on the same socket.
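
For readers following along, the same-thread scenario quoted above can be sketched roughly as follows. This is only an illustrative reconstruction, not the test case from the earlier messages: the function name `same_thread_connect_accept` and the socket path passed to it are hypothetical.

```c
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Listen on an AF_UNIX stream socket, then connect() and accept() from the
 * same thread.  Returns 0 if the round trip works, -1 on any error.
 * `path` is a hypothetical socket path chosen by the caller. */
int same_thread_connect_accept(const char *path)
{
    struct sockaddr_un addr;
    int srv = -1, cli = -1, conn = -1, rc = -1;
    char buf[4] = {0};

    memset(&addr, 0, sizeof addr);
    addr.sun_family = AF_UNIX;
    strncpy(addr.sun_path, path, sizeof addr.sun_path - 1);
    unlink(path);                       /* remove a stale socket file */

    srv = socket(AF_UNIX, SOCK_STREAM, 0);
    cli = socket(AF_UNIX, SOCK_STREAM, 0);
    if (srv < 0 || cli < 0)
        goto out;
    if (bind(srv, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(srv, 1) < 0)
        goto out;

    /* Nobody is in accept() yet: this is the call reported to hang. */
    if (connect(cli, (struct sockaddr *)&addr, sizeof addr) < 0)
        goto out;

    conn = accept(srv, NULL, NULL);
    if (conn < 0)
        goto out;

    /* Push two bytes through to confirm the connection is usable. */
    if (write(cli, "ok", 2) == 2 && read(conn, buf, 2) == 2 &&
        memcmp(buf, "ok", 2) == 0)
        rc = 0;
out:
    if (conn >= 0) close(conn);
    if (cli >= 0) close(cli);
    if (srv >= 0) close(srv);
    unlink(path);
    return rc;
}
```

Calling `same_thread_connect_accept("/tmp/demo.sock")` from a small `main` should return 0 on a system where connect() does not wait for accept(). On the Cygwin version discussed in this thread, the connect() call is where the hang would be expected.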

Actually I'm not entirely sure now that that's the issue, even
considering that this has come up before.  Or at the very least,
there's an additional issue.

I realized that when I tried separate client/server processes, I had
put an accept() call at the end of the server so it would block there.
With the server waiting to accept a connection, the connect succeeded.
However, when I replaced the accept() with a long sleep(), the
client's connect() never returned.

IIUC the handshake can't succeed until and unless the server accepts a
connection from the client.  On Linux, however, connect() returns as
soon as the connection has been placed on the server's listen queue
(for TCP, right after a successful handshake); it does not wait for
accept().  I don't know if the same holds on Windows.  But since the
underlying winsock socket is in non-blocking mode anyway, connect()
shouldn't have to block until the AF_LOCAL handshake can succeed.  I
almost wonder if the server side in this case shouldn't start up a
thread to accept the AF_LOCAL handshake, but you would know better.

> This has been found a problem in porting postfix already and at the time
> we added a patch to circumvent the problem.  Before calling connect, add
> this:
>
>   setsockopt (sock_server, SOL_SOCKET, SO_PEERCRED, NULL, 0);
>   setsockopt (sock_client, SOL_SOCKET, SO_PEERCRED, NULL, 0);
>
> This is, of course, a hack.  The problem here is that server and client
> of a socket are independent of each other, and there's typically no
> way to know which process created the server side unless you already
> are connected.  Chicken/egg.

I tried it and it worked, both in the single-process and
separate-process examples.  I see now--this sets
fhandler_socket::no_getpeerid=true, so it doesn't have to do the
handshake at all.

> While replying to your mail, a thought occurred to me, though.
>
> We might get away without the above setsockopt calls by adding a check
> to connect.  It could test if the socket has already been opened by
> the same process and is bound.  This could be accomplished by scanning
> the file descriptor table (dtable) of the process.  If we find it,
> we set the above socket option on both ends and continue without the
> secret and credential check.  Credentials could be set manually since we
> know user, group, and pid at this point.
>
> It's a bit of work but might be feasible.

I see what you're saying, but it appears that would only work in the
case where both sockets are opened by the same process.  Of course,
that was my original use case, but now I'm realizing the problem
extends beyond that: the handshake can't complete unless the server
is explicitly accepting connections.

Thanks,
Erik
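
The separate-process case described in this message (a server that listens but stays out of accept()) can be sketched along these lines. It is a rough illustration under stated assumptions rather than the actual test programs: the function name `connect_returns_before_accept`, the socket path, and the use of a pipe plus poll() to observe the client are all hypothetical choices made for this sketch.

```c
#include <poll.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <sys/wait.h>

/* Returns 1 if a client's connect() completed while the server was
 * listening but had not called accept(), 0 if it did not complete within
 * timeout_ms, and -1 on setup errors.  `path` is a hypothetical path. */
int connect_returns_before_accept(const char *path, int timeout_ms)
{
    struct sockaddr_un addr;
    struct pollfd pfd;
    int pipefd[2], srv, result = 0;
    char byte;
    pid_t pid;

    memset(&addr, 0, sizeof addr);
    addr.sun_family = AF_UNIX;
    strncpy(addr.sun_path, path, sizeof addr.sun_path - 1);
    unlink(path);

    srv = socket(AF_UNIX, SOCK_STREAM, 0);
    if (srv < 0)
        return -1;
    if (bind(srv, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(srv, 1) < 0 || pipe(pipefd) < 0) {
        close(srv);
        unlink(path);
        return -1;
    }

    pid = fork();
    if (pid < 0) {
        close(pipefd[0]); close(pipefd[1]); close(srv);
        unlink(path);
        return -1;
    }
    if (pid == 0) {
        /* Client: connect, then report success through the pipe. */
        int cli = socket(AF_UNIX, SOCK_STREAM, 0);
        close(pipefd[0]);
        close(srv);
        if (cli < 0 ||
            connect(cli, (struct sockaddr *)&addr, sizeof addr) < 0 ||
            write(pipefd[1], "c", 1) != 1)
            _exit(1);
        _exit(0);
    }

    /* Server: deliberately never call accept(); just watch the pipe. */
    close(pipefd[1]);
    pfd.fd = pipefd[0];
    pfd.events = POLLIN;
    if (poll(&pfd, 1, timeout_ms) == 1 && (pfd.revents & POLLIN) &&
        read(pipefd[0], &byte, 1) == 1)
        result = 1;

    /* Clean up.  SIGKILL is harmless if the child has already exited; a
     * child stuck the way this thread describes might not react to it. */
    kill(pid, SIGKILL);
    waitpid(pid, NULL, 0);
    close(pipefd[0]);
    close(srv);
    unlink(path);
    return result;
}
```

On Linux, `connect_returns_before_accept("/tmp/demo.sock", 2000)` is expected to return 1, since the connection is queued without the server accepting it. A result of 0 would correspond to the behaviour reported here for Cygwin.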