From: Takashi Yano
To: cygwin@cygwin.com
Subject: Re: GNU make losing jobserver tokens
Date: Fri, 1 Apr 2022 17:45:51 +0900
Message-Id: <20220401174551.820cbc148852554108397e03@nifty.ne.jp>
In-Reply-To: <9b9da583-124d-9d5f-4c10-6622602ca8dc@oracle.com>
References: <9b9da583-124d-9d5f-4c10-6622602ca8dc@oracle.com>

On Mon, 21 Mar 2022 15:28:17 +0100
Magnus Ihse Bursie wrote:

> Hi,
>
> I'm working for Oracle on the OpenJDK build team. We're using GNU make
> to build the JDK on all supported platforms. For Windows, we use Cygwin
> as our build environment, including the Cygwin version of GNU make.
>
> We have had a long-standing issue with make losing jobserver tokens.
> ("long-standing" here means for years, and years, at least since GNU
> make 4.0, up to and including the current latest version in Cygwin.)
>
> Most runs end with something like:
>
> make[2]: INTERNAL: Exiting with 11 jobserver tokens available; should be 12!
>
> Since the build still succeeds, and it just affects performance (and
> typically not that much), we have not spend too much time getting to the
> bottom of this.
>
> Now, however, I've come across a machine where this happens repeatedly,
> and on a much worse scale:
>
> make[2]: INTERNAL: Exiting with 1 jobserver tokens available; should be 24!
>
> This effectively turns the highly parallelized builds into
> single-threaded builds, and is absolutely detrimental for performance.
> On the flip side, this also makes for the perfect testing environment to
> really get to the bottom of this issue.
>
> I started out by sending a question to bug-make@gnu.org. The folks over
> there reported that this was not a known problem with GNU make on
> Windows in general, and that as far as they knew, the mingw port did not
> suffer from this problem.
>
> Instead, they suggested that it was a Cygwin-specific problem, possibly
> related to issues with emulating Posix pipes and/or signals in Cygwin.
>
> So, my first question is: Is this a known problem in Cygwin GNU make?
> Are there any workarounds/fixes to get around it?
>
> Otherwise: Any suggestions on how to go on and debug this? I am willing
> to build and test an instrumented debug build of make, but I will need
> assistance to find my way around the source and spot likely candidates
> for the source of the problem.

I tried to reproduce the issue by building OpenJDK from source, but
could not. Instead, I ran into a different problem: the OpenJDK build
sometimes (rarely) fails with errors such as:

      0 [sig] make 5484 sig_send: error sending signal 11, pid 5484, pipe handle 0x118, nb 0, packsize 176, Win32 error 0
 124917 [main] make 5484 sig_send: error sending signal -72, pid 5484, pipe handle 0x118, nb 0, packsize 176, Win32 error 0
common/modules/GensrcModuleInfo.gmk:77: *** open: /home/yano/jdk/build/windows-x86-server-release/make-support/vardeps/make/common/modules/GensrcModuleInfo.gmk/jdk.accessibility/ALL_MODULES.vardeps: No such file or directory. Stop.
make[2]: *** [make/Main.gmk:141: jdk.accessibility-gensrc-moduleinfo] Error 2
make[2]: *** Waiting for unfinished jobs....

I looked into this new problem and found that the wait_sig() thread
crashes with a segfault. It seems that accessing _main_tls causes an
access violation if a signal is sent just after the process is started.

static void WINAPI
wait_sig (VOID *)
{
  [...]
      if (!pack.mask)
        {
          tl_entry = cygheap->find_tls (_main_tls);
          dummy_mask = _main_tls->sigmask; // <--- Segfault here
          cygheap->unlock_tls (tl_entry);
          pack.mask = &dummy_mask;
        }

I also found that the following patch resolves the issue.

diff --git a/winsup/cygwin/sigproc.cc b/winsup/cygwin/sigproc.cc
index 62df96652..3824af199 100644
--- a/winsup/cygwin/sigproc.cc
+++ b/winsup/cygwin/sigproc.cc
@@ -1325,6 +1325,10 @@ wait_sig (VOID *)
   _sig_tls = &_my_tls;
   bool sig_held = false;
 
+  /* Wait for _main_tls initialization. */
+  while (!cygwin_finished_initializing)
+    Sleep (10);
+
   sigproc_printf ("entering ReadFile loop, my_readsig %p, my_sendsig %p",
 		  my_readsig, my_sendsig);
 

I guess _main_tls may not be initialized correctly until
cygwin_finished_initializing is set.

Any comments would be appreciated.

-- 
Takashi Yano
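
The idea behind the patch can be illustrated outside of Cygwin with a small standalone sketch: a helper thread polls an "initialization finished" flag before it dereferences a pointer that the main thread sets up during startup. This is not Cygwin code. The names FakeTls, main_tls, finished_init and demo_wait_for_init are hypothetical stand-ins for _main_tls, cygwin_finished_initializing and the wait_sig() thread, and std::atomic with acquire/release ordering is an assumption of the sketch rather than what the patch itself does.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical stand-in for the per-thread data behind _main_tls.
struct FakeTls { int sigmask; };

static FakeTls *main_tls = nullptr;             // plays the role of _main_tls
static std::atomic<bool> finished_init{false};  // plays the role of cygwin_finished_initializing

// Analogue of the patched wait_sig(): poll the flag before touching main_tls.
static int read_sigmask_when_ready ()
{
  while (!finished_init.load (std::memory_order_acquire))
    std::this_thread::sleep_for (std::chrono::milliseconds (10));
  assert (main_tls != nullptr);  // holds because the flag is set after the pointer
  return main_tls->sigmask;
}

// Starts the helper thread first, then finishes "startup" a little later.
static int demo_wait_for_init ()
{
  static FakeTls tls{0x42};
  int seen = -1;
  std::thread sig_thread ([&seen] { seen = read_sigmask_when_ready (); });

  // Simulated slow startup: the helper thread is already running here.
  std::this_thread::sleep_for (std::chrono::milliseconds (50));
  main_tls = &tls;
  finished_init.store (true, std::memory_order_release);

  sig_thread.join ();
  return seen;
}
```

Calling demo_wait_for_init() returns 0x42. Without the polling loop, the helper thread could dereference main_tls while it is still null, which corresponds to the crash described above. A 10 ms poll is simple but adds a little latency at startup; whether an event-based wait would suit Cygwin better is left open here.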