public inbox for cygwin-developers@cygwin.com
From: Corinna Vinschen <corinna-cygwin@cygwin.com>
To: Mark Geisert <mark@maxrnd.com>, cygwin-developers@cygwin.com
Subject: Re: load average calculation failing
Date: Mon, 9 May 2022 10:45:58 +0200
Message-ID: <YnjUxi/IsgkTInKL@calimero.vinschen.de>
In-Reply-To: <223aa826-7bf9-281a-aed8-e16349de5b96@dronecode.org.uk>

[redirect back to cygwin-developers]

On May  8 11:27, Jon Turney wrote:
> On 08/05/2022 08:01, Mark Geisert wrote:
> > Mark Geisert wrote (on the main Cygwin mailing list):
> > > I've recently noticed that the 'xload' I routinely run shows zero
> > > load even with compute-bound processes running.  This is on both
> > > Cygwin pre-3.4.0 as well as 3.3.4.  A test program, shown below,
> > > indicates that getloadavg() is returning with 0 status, i.e. not an
> > > error but no elems of the passed-in array updated.
> > > 
> > > Stepping with gdb through the test program seems weird within the
> > > loadavginfo::load_init method.  Single-stepping at line
> > > loadavg.cc:68 goes to strace.h:52 and then to _sigbe ?!
> > > 
> > > I had recently updated both Cygwin and Windows 10 to latest at the
> > > same time so I cannot say when the failure started.  Last day or two
> > > at most.
> > > 
> [...]
> > 
> > I've debugged a bit further..  Within Cygwin's loadavg.cc:load_init(),
> > the PdhOpenQueryW() call returns successfully.  The subsequent
> > PdhAddEnglishCounterW() call is unsuccessful.  It returns status

This is a bit weird.  I tried to debug this for a while on Friday on
W11, and there I can reproduce *a* problem, too, just not the same one
you report here.

On W11 I see load_init() working fine, the calls to
PdhAddEnglishCounterW succeed.  But then the call to
PdhGetFormattedCounterValue in get_load() fails with error
PDH_INVALID_DATA.  The CStatus member of fmtvalue1 is set to
PDH_CSTATUS_NO_INSTANCE.

If I tweak get_load to call PdhCollectQueryData again after a failure,
the second call succeeds.  The only problem is that the returned data
doesn't make a lot of sense.  It only starts to make sense if I add a
Sleep(1000) before the second PdhCollectQueryData call, which is
rather disappointing.

Jon, would it, perhaps, make sense to call PdhCollectQueryData in
load_init(), without actually checking the return value?  The idea is
to make sure there is a baseline for the next call to
PdhCollectQueryData from inside get_load.

But even then, the first values returned by getloadavg might not make
much sense, so I guess this is just clutching at straws...
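
To illustrate why a baseline sample matters: a rate-style counter such
as "% Processor Time" is derived from the difference between two raw
samples, so a single collection cannot yield a value, and two
collections taken almost back to back cover too short an interval to
be meaningful.  The following is only a minimal model of that
behaviour, not the PDH implementation; RawSample and RateCounter are
hypothetical names and the tick values used with them are made up.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical raw sample: cumulative busy time and the time it was taken,
// both in the same (arbitrary) tick unit.
struct RawSample
{
  std::uint64_t busy_ticks;
  std::uint64_t timestamp;
};

// Minimal model of a rate counter that needs two samples to produce a value.
class RateCounter
{
public:
  // Store a new raw sample; the previous one becomes the baseline.
  void collect (RawSample s)
  {
    // Assumption of this model: timestamps never go backwards.
    assert (samples_ == 0 || s.timestamp >= cur_.timestamp);
    prev_ = cur_;
    cur_ = s;
    if (samples_ < 2)
      ++samples_;
  }

  // Compute the busy percentage between the last two samples.  Returns
  // false when there is no baseline yet or no time has elapsed.
  bool formatted (double &percent) const
  {
    if (samples_ < 2 || cur_.timestamp == prev_.timestamp)
      return false;
    double busy = static_cast<double> (cur_.busy_ticks - prev_.busy_ticks);
    double elapsed = static_cast<double> (cur_.timestamp - prev_.timestamp);
    percent = 100.0 * busy / elapsed;
    return true;
  }

private:
  RawSample prev_ {};
  RawSample cur_ {};
  int samples_ = 0;
};
```

In this model, the first collect() plays the role of a
PdhCollectQueryData call in load_init() whose result is ignored: it
only establishes the baseline, and formatted() starts returning a
value from the second collect() onwards.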

> > 0x800007D0 == PDH_CSTATUS_NO_MACHINE. The code (at line 68 mentioned

This is a weird error.

  "The path did not contain a computer name and the function was unable
   to retrieve the local computer name."

Yeah, sure.

Mark, did you try to add the computer name to the path by calling
GetComputerName() in load_init?

I tried the patch pasted at the end of this mail, but it did not help
the first PdhGetFormattedCounterValue call in get_load to return
success.
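
For reference, the path such a patch hands to PdhAddEnglishCounterW
has the form \\MACHINE\Object(Instance)\Counter.  A portable sketch of
the same string assembly follows; make_counter_path is a hypothetical
helper, and using std::wstring is an assumption made for brevity (the
patch itself uses wcpcpy on a fixed-size buffer).

```cpp
#include <cassert>
#include <string>

// Build a full PDH-style counter path.  If machine is empty, the
// machine-relative counter path is returned unchanged, which mirrors the
// fallback when the computer name cannot be retrieved.
std::wstring
make_counter_path (const std::wstring &machine, const std::wstring &counter)
{
  // A counter path is expected to start with a single backslash.
  assert (!counter.empty () && counter[0] == L'\\');
  if (machine.empty ())
    return counter;
  return L"\\\\" + machine + counter;
}
```

Called with a made-up machine name "HOST" and the queue length
counter, this yields \\HOST\System\Processor Queue Length.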

> > above) calls debug_printf() to conditionally display the error, which is
> > what leads to the strace.h and _sigbe; that's fine.
> > 
> > The weird PDH_CSTATUS_NO_MACHINE is the problem.  I'll try running the
> > example from an elevated shell.  Or rebooting the machine.  After that
> > it's consulting some oracle TBD. :-(
> > 
> 
> Thanks for looking into this.
> You can find the user space version of this code I initially wrote at
> https://github.com/jon-turney/windows-loadavg, which might save you some
> time.
> 
> I can't reproduce this on W10 21H1, so I think this must be due to some
> change in later Windows...

I can reproduce this on W10 21H1, too, and the problem is the one
I outlined above, with load_init working fine and just the
PdhGetFormattedCounterValue failing in get_load.


Corinna



diff --git a/winsup/cygwin/loadavg.cc b/winsup/cygwin/loadavg.cc
index 127591a2e1f5..a014c2eb758c 100644
--- a/winsup/cygwin/loadavg.cc
+++ b/winsup/cygwin/loadavg.cc
@@ -40,6 +40,7 @@
 #include <time.h>
 #include <sys/strace.h>
 #include <pdh.h>
 
 static PDH_HQUERY query;
 static PDH_HCOUNTER counter1;
@@ -55,6 +55,17 @@ static bool load_init (void)
     tried = true;
 
     PDH_STATUS status;
+    DWORD size = MAX_PATH;
+    WCHAR machine_name[MAX_PATH];
+    WCHAR counter_name[MAX_PATH + 64];
+    PWCHAR counter_p = counter_name;
+
+    if (GetComputerNameW (machine_name, &size))
+      {
+	*counter_p++ = L'\\';
+	*counter_p++ = L'\\';
+	counter_p = wcpcpy (counter_p, machine_name);
+      }
 
     status = PdhOpenQueryW (NULL, 0, &query);
     if (status != STATUS_SUCCESS)
@@ -62,18 +73,17 @@ static bool load_init (void)
 	debug_printf ("PdhOpenQueryW, status %y", status);
 	return false;
       }
-    status = PdhAddEnglishCounterW (query,
-				    L"\\Processor(_Total)\\% Processor Time",
-				    0, &counter1);
+
+    wcpcpy (counter_p, L"\\Processor(_Total)\\% Processor Time");
+    status = PdhAddEnglishCounterW (query, counter_name, 0, &counter1);
     if (status != STATUS_SUCCESS)
       {
 	debug_printf ("PdhAddEnglishCounterW(time), status %y", status);
 	return false;
       }
-    status = PdhAddEnglishCounterW (query,
-				    L"\\System\\Processor Queue Length",
-				    0, &counter2);
 
+    wcpcpy (counter_p, L"\\System\\Processor Queue Length");
+    status = PdhAddEnglishCounterW (query, counter_name, 0, &counter2);
     if (status != STATUS_SUCCESS)
       {
 	debug_printf ("PdhAddEnglishCounterW(queue length), status %y", status);
