Date: Mon, 16 Jan 2023 23:45:32 +0900
From: Takashi Yano
To: cygwin@cygwin.com
Subject: Re: Cygwin 3.4.3 and 3.5.0... hangs in make, top, procps, ls /proc/PID/...
Message-Id: <20230116234532.f567e64fe7bf9a0a13704af9@nifty.ne.jp>
References: <4a4427cc-422b-1d14-015e-26523e620d9b@Shaw.ca>
 <20230102113201.476c10bef7a5643bddc00762@nifty.ne.jp>
 <20230102143803.53f89d07a545a1bdd596e1e8@nifty.ne.jp>
 <20230102172147.83789d400bb0400cb8c8ca74@nifty.ne.jp>
 <20230116180213.0e03a896f512d784933f54da@nifty.ne.jp>

On Mon, 16 Jan 2023 11:23:54 +0100
Corinna Vinschen wrote:
> On Jan 16 18:02, Takashi Yano via Cygwin wrote:
> > Hi Corinna,
> >
> > On Mon, 9 Jan 2023 14:20:56 +0100
> > Corinna Vinschen wrote:
> > > On Jan 2 17:21, Takashi Yano via Cygwin wrote:
> > > > On Mon, 2 Jan 2023 14:38:03 +0900
> > > > Takashi Yano wrote:
> > > > > On Mon, 2 Jan 2023 11:32:01 +0900
> > > > > Takashi Yano wrote:
> > > > > > On Sat, 31 Dec 2022 13:01:29 -0700
> > > > > > Brian Inglis wrote:
> > > > > > > was also getting the messages below locally and still on GitHub scallywag:
> > > > > > >
> > > > > > > cygcheck (6936) child_copy: cygheap read copy failed,
> > > > > > >
> > > > > > > ../curl/scallywag/1_x86_64 build.log:2022-12-26T00:39:35.6163236Z 0
> > > > > > > [main] cygcheck (6936) child_copy: cygheap read copy failed, 0x0..0x80003B5F0,
> > > > > > > done 0, windows pid 6936, Win32 error 299
> > > > > > > [...]
> > > > > I found this issue occurs after the commit 30add3e6b3e3:
> > > > > "Cygwin: exec: don't access cygheap before it's initialized"
> > > > > .
> > > > >
> > > > > Reverting this commit solves the issue.
> > >
> > > That would break strace again, but...
> > >
> > > > I'm not sure if this is the right thing, but the following
> > > > patch seems to fix the issue.
> > >
> > > This looks pretty good to me and it keeps strace working per the
> > > description in 30add3e6b3e3. Please push this to master and the
> > > 3.4 branch.
> >
> > I noticed that the following error occurs even with this patch.
> > If you run:
> > while true; do cygcheck -cd cygwin > /dev/null; done
> > for one day or so, you will find the issue can be reproduced.
> >
> > Both cygwin-3_4-branch and main (master) branch have this issue,
> > while cygwin 3.3.6 does not.
> >
> > $ while true; do cygcheck -cd cygwin > /dev/null; done
> > 0 [main] cygcheck (15244) C:\cygwin64\bin\cygcheck.exe: *** fatal error -
> > MapViewOfFileEx 'shared.5'(0x138), Win32 error 487. Terminating.
> > 3540 [main] cygcheck (15244) cygwin_exception::open_stackdumpfile: Dumping stack trace to cygcheck.exe.stackdump
> > 0 [main] cygcheck (10844) C:\cygwin64\bin\cygcheck.exe: *** fatal error -
> > MapViewOfFileEx 'cygpid.51742'(0x148), Win32 error 487. Terminating.
> > 0 [main] cygcheck (1976) C:\cygwin64\bin\cygcheck.exe: *** fatal error - M
> [...]
> > Errors seem to be three types: (null), cygpid.xxx and shared.5.
> > I'm not sure what is happening and why at all, however, this
> > did not seem to happen before the commit 30add3e6b3e3.
>
> I'll try to reproduce this issue. But the weird thing is certainly
> this: The affected shared mem regions are apparently not the cygheap.
> Rather, they are the "shared" and "cygpid" shared mem regions, which
> should not at all collide with the cygheap. I guess we need more
> debug output in the api_fatal call inside open_shared...

I am now running the test case with commit 60675f1a7eb2
"Cygwin: decouple shared mem regions from Cygwin DLL"
reverted, and the issue has not occurred for several hours so far.

I guess this commit is most likely the direct cause of the problem.

-- 
Takashi Yano