public inbox for cygwin@cygwin.com
From: Takashi Yano <takashi.yano@nifty.ne.jp>
To: cygwin@cygwin.com
Subject: Re: GNU make losing jobserver tokens
Date: Fri, 1 Apr 2022 17:45:51 +0900
Message-ID: <20220401174551.820cbc148852554108397e03@nifty.ne.jp>
In-Reply-To: <9b9da583-124d-9d5f-4c10-6622602ca8dc@oracle.com>

On Mon, 21 Mar 2022 15:28:17 +0100
Magnus Ihse Bursie wrote:
> Hi,
> 
> I'm working for Oracle on the OpenJDK build team. We're using GNU make 
> to build the JDK on all supported platforms. For Windows, we use Cygwin 
> as our build environment, including the Cygwin version of GNU make.
> 
> We have had a long-standing issue with make losing jobserver tokens. 
> ("long-standing" here means for years, and years, at least since GNU 
> make 4.0, up to and including the current latest version in Cygwin.)
> 
> Most runs end with something like:
> 
> make[2]: INTERNAL: Exiting with 11 jobserver tokens available; should be 
> 12!
> 
> Since the build still succeeds, and it just affects performance (and 
> typically not that much), we have not spent too much time getting to the 
> bottom of this.
> 
> Now, however, I've come across a machine where this happens repeatedly, 
> and on a much worse scale:
> 
> make[2]: INTERNAL: Exiting with 1 jobserver tokens available; should be 24!
> 
> This effectively turns the highly parallelized builds into 
> single-threaded builds, and is absolutely detrimental for performance. 
> On the flip side, this also makes for the perfect testing environment to 
> really get to the bottom of this issue.
> 
> I started out by sending a question to bug-make@gnu.org. The folks over 
> there reported that this was not a known problem with GNU make on 
> Windows in general, and that as far as they knew, the mingw port did not 
> suffer from this problem.
> 
> Instead, they suggested that it was a Cygwin-specific problem, possibly 
> related to issues with emulating Posix pipes and/or signals in Cygwin.
> 
> So, my first question is: Is this a known problem in Cygwin GNU make? 
> Are there any workarounds/fixes to get around it?
> 
> Otherwise: Any suggestions on how to go on and debug this? I am willing 
> to build and test an instrumented debug build of make, but I will need 
> assistance to find my way around the source and spot likely candidates 
> for the source of the problem.

I tried to reproduce the issue by building OpenJDK
from source, but could not.

Instead, I encountered another issue.

Building OpenJDK sometimes (rarely) failed with errors such as:

      0 [sig] make 5484 sig_send: error sending signal 11, pid 5484, pipe handle 0x118, nb 0, packsize 176, Win32 error 0
 124917 [main] make 5484 sig_send: error sending signal -72, pid 5484, pipe handle 0x118, nb 0, packsize 176, Win32 error 0
common/modules/GensrcModuleInfo.gmk:77: *** open: /home/yano/jdk/build/windows-x86-server-release/make-support/vardeps/make/common/modules/GensrcModuleInfo.gmk/jdk.accessibility/ALL_MODULES.vardeps: No such file or directory.  Stop.
make[2]: *** [make/Main.gmk:141: jdk.accessibility-gensrc-moduleinfo] Error 2
make[2]: *** Waiting for unfinished jobs....


I looked into this new problem and found that the wait_sig() thread
crashes with a segfault. It seems that accessing _main_tls causes
an access violation if a signal is sent just after the process has
started.

static void WINAPI
wait_sig (VOID *)
{
  [...]
      if (!pack.mask)
	{
	  tl_entry = cygheap->find_tls (_main_tls);
	  dummy_mask = _main_tls->sigmask;       // <--- Segfault here
	  cygheap->unlock_tls (tl_entry);
	  pack.mask = &dummy_mask;
	}

I also found that the following patch resolves the issue.

diff --git a/winsup/cygwin/sigproc.cc b/winsup/cygwin/sigproc.cc
index 62df96652..3824af199 100644
--- a/winsup/cygwin/sigproc.cc
+++ b/winsup/cygwin/sigproc.cc
@@ -1325,6 +1325,10 @@ wait_sig (VOID *)
   _sig_tls = &_my_tls;
   bool sig_held = false;
 
+  /* Wait for _main_tls initialization. */
+  while (!cygwin_finished_initializing)
+    Sleep (10);
+
   sigproc_printf ("entering ReadFile loop, my_readsig %p, my_sendsig %p",
 		  my_readsig, my_sendsig);
 

I guess _main_tls may not be initialized correctly until
cygwin_finished_initializing is set.
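
To illustrate the ordering I have in mind, here is a small standalone
sketch of the same idea. It is not Cygwin code: tls_area, main_tls,
finished_initializing, signal_thread() and run_demo() are hypothetical
stand-ins, and the use of std::atomic is my assumption for the sketch,
not a statement about how the flag is declared in Cygwin. The helper
thread polls the flag and only then dereferences the pointer, which is
what the patch above does with Sleep (10).

```cpp
// Illustrative sketch only; all names are hypothetical, not Cygwin's.
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

struct tls_area { unsigned sigmask; };

// Stand-in for _main_tls: not valid until startup has finished.
static tls_area *main_tls = nullptr;
// Stand-in for cygwin_finished_initializing (atomic is an assumption).
static std::atomic<bool> finished_initializing{false};

// Stand-in for wait_sig(): wait for initialization before touching main_tls.
static unsigned
signal_thread ()
{
  while (!finished_initializing.load (std::memory_order_acquire))
    std::this_thread::sleep_for (std::chrono::milliseconds (10));
  assert (main_tls != nullptr);	// holds because the flag is set last
  return main_tls->sigmask;
}

// Start the helper thread first, then finish "startup" late, as in the race.
static unsigned
run_demo ()
{
  static tls_area area = { 0x2au };
  main_tls = nullptr;
  finished_initializing.store (false, std::memory_order_relaxed);

  unsigned seen = 0;
  std::thread t ([&seen] { seen = signal_thread (); });

  // Simulated slow startup in the main thread.
  std::this_thread::sleep_for (std::chrono::milliseconds (50));
  main_tls = &area;
  finished_initializing.store (true, std::memory_order_release);

  t.join ();
  return seen;
}
```

Calling run_demo() returns the sigmask value stored by the main thread.
Without the polling loop, signal_thread() could dereference main_tls
while it is still null, which is the kind of access violation described
above.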

Any comments would be appreciated.

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>
