From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Wed, 2 Dec 2020 14:38:13 +0100
From: Corinna Vinschen
To: cygwin-developers@cygwin.com
Subject: Re: python fails asyncio tests (py 3.7 & 3.8)
Message-ID: <20201202133813.GP303847@calimero.vinschen.de>
Reply-To: cygwin-developers@cygwin.com
Mail-Followup-To: cygwin-developers@cygwin.com
References: <9976c726-8bfd-febe-ac86-f7cbf3cc958b@maxrnd.com>
In-Reply-To: <9976c726-8bfd-febe-ac86-f7cbf3cc958b@maxrnd.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
List-Id: Cygwin core component developers mailing list

Hi Mark,

On Dec  2 01:01, Mark Geisert wrote:
> Hi folks,
> I'm following up on the OP's investigation supplied in
> https://cygwin.com/pipermail/cygwin/2020-November/246832.html .
> The situation is a socket select thread stuck in a wait-for-event loop that
> doesn't realize select() is trying to clean up that thread before returning
> a result to the app.  Here is the relevant part of an strace log:
>
> >   114 8495682 [main] python3.8 1987 start_thread_socket: stuff_start 0xFFFF8C38
> >    68 8495750 [main] python3.8 1987 cygthread::create: name socksel, id 0x737C, this 0x180234778
> >    76 8495826 [main] python3.8 1987 cygthread::create: activated name 'socksel', thread_sync 0x3A8 for id 0x737C
> >   122 8495948 [socksel] python3.8 1987 thread_socket: stuff_start 0xFFFF8C38, timeout 4294967295
> >    78 8496026 [main] python3.8 1987 select_stuff::wait: m 4, us 10000, wmfo_timeout -1
> >    77 8496103 [socksel] python3.8 1987 fhandler_socket_local::af_local_connect: af_local_connect called, no_getpeereid=0
> >   115 8496218 [socksel] python3.8 1987 fhandler_socket_local::af_local_send_secret: Sending af_local secret succeeded
> >    95 8496313 [socksel] python3.8 1987 fhandler_socket_local::af_local_recv_secret: entered
> > 11450 8507763 [main] python3.8 1987 select_stuff::wait: wait_ret 3, m = 4.  verifying
> >   135 8507898 [main] python3.8 1987 select_stuff::wait: timed out
> >    98 8507996 [main] python3.8 1987 select_stuff::wait: returning 1
> >    84 8508080 [main] python3.8 1987 select: sel.wait returns 1
> >    73 8508153 [main] python3.8 1987 select_stuff::cleanup: calling cleanup routines
> >    78 8508231 [main] python3.8 1987 socket_cleanup: si 0x800324910 si->thread 0x180234778
> [end of strace.. nothing further happens]
>
> The 'socksel' thread is shown entering af_local_recv_secret(), so this is
> all part of local socket connection startup, when a secret is sent and
> received, then credentials are sent and received.  The socksel thread is
> looping on a WSAWaitForMultipleEvents() call.  The OP suggested using
> select_info.stop_thread to indicate that wait loops should exit.  That would
> work further up the stack, but at this level the code doesn't (currently)
> see the appropriate select_info.

This is apparently an old problem in the still-current AF_LOCAL
implementation.  Christian Franke encountered it when porting Postfix:

https://sourceware.org/legacy-ml/cygwin/2014-08/msg00420.html

The problem is the security handshake between the listening/accepting
socket and the connecting socket.  The connecting socket sends its half
of the handshake and waits for accept on the other side to return the
other half.  However, if the listening side doesn't accept right away,
the connecting side hangs.

The workaround right now is to call

  int peercred_off = 1;

  fd = socket (AF_LOCAL, SOCK_STREAM, 0);
  setsockopt (fd, SOL_SOCKET, SO_PEERCRED, &peercred_off, sizeof peercred_off);

This disables the security handshake.


Corinna
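
For reference, the workaround above might be wrapped up with error
checking roughly as follows.  This is only a sketch: the helper name
open_local_socket_no_handshake() and its out-parameter are made up for
illustration and are not part of Cygwin.  Whether setsockopt accepts
SO_PEERCRED at all, and with which argument values, depends on the
platform and possibly the Cygwin version, so the sketch checks the
return value rather than assuming the handshake was switched off.

```c
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>
#include <sys/socket.h>

/* Hypothetical helper (not a Cygwin API): create an AF_LOCAL stream
   socket and try to switch off the connect/accept security handshake.
   Returns the socket descriptor, or -1 if socket() itself failed.
   *handshake_disabled is set to 1 only if setsockopt() succeeded.  */
int
open_local_socket_no_handshake (int *handshake_disabled)
{
  int peercred_off = 1;		/* same value as in the mail above */
  int fd = socket (AF_LOCAL, SOCK_STREAM, 0);

  if (handshake_disabled)
    *handshake_disabled = 0;
  if (fd < 0)
    return -1;

#ifdef SO_PEERCRED
  if (setsockopt (fd, SOL_SOCKET, SO_PEERCRED,
		  &peercred_off, sizeof peercred_off) == 0)
    {
      if (handshake_disabled)
	*handshake_disabled = 1;
    }
  else
    /* Assumption: platforms that treat SO_PEERCRED as read-only, or
       reject these arguments, fail here; the socket stays usable.  */
    fprintf (stderr, "SO_PEERCRED not settable: %s\n", strerror (errno));
#endif

  return fd;
}
```

A caller would use the returned descriptor for connect() as usual and
could fall back to some other strategy (for example, making sure the
peer is already in accept()) when handshake_disabled comes back as 0.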