public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
From: Simon Marchi <simon.marchi@efficios.com>
To: gdb-patches@sourceware.org
Cc: Simon Marchi <simon.marchi@efficios.com>
Subject: [PATCH 2/2] gdb/linux-nat: get core count using /sys/devices/system/cpu/possible
Date: Fri,  4 Nov 2022 11:51:37 -0400	[thread overview]
Message-ID: <20221104155137.1463129-2-simon.marchi@efficios.com> (raw)
In-Reply-To: <20221104155137.1463129-1-simon.marchi@efficios.com>

I get this test failure on my CI;

  FAIL: gdb.base/info-os.exp: get process list

The particularity of this setup is that builds are done in containers
who are allocated 4 CPUs on a machine that has 40.  The code in
nat/linux-osdata.c fails to properly fetch the core number for each
task.

linux_xfer_osdata_processes uses `sysconf (_SC_NPROCESSORS_ONLN)`, which
returns 4, so it allocates an array of 4 integers.  However, the core
numbers read from /proc/pid/task/tid/stat, by function
linux_common_core_of_thread, returns a value anywhere between 0 and 39.
The core numbers above 3 are therefore ignored, many processes end up
with no core value, and the regexp in the test doesn't match (it
requires an integer as the core field).

The way this the CPUs are exposed to the container is that the container
sees 40 CPUs "present" and "possible", but only 4 arbitrary CPUs
actually online:

    root@ci-node-jammy-amd64-04-08:~# cat /sys/devices/system/cpu/present
    0-39
    root@ci-node-jammy-amd64-04-08:~# cat /sys/devices/system/cpu/online
    5,11,24,31
    root@ci-node-jammy-amd64-04-08:~# cat /sys/devices/system/cpu/possible
    0-39

The solution proposed in this patch is to find out the number of
possible CPUs using /sys/devices/system/cpu/possible.  In practice, this
will probably always contain `0-N`, where N is the number of CPUs, minus
one.  But the documentation [1] doesn't such guarantee, so I'll assume
that it can contain a more complex range list such as `2,4-31,32-63`,
like the other files in that directory can have.  The solution is to
iterate over these numbers to find the highest possible CPU id, and
use that that value plus one as the size of the array to allocate.

[1] https://www.kernel.org/doc/Documentation/admin-guide/cputopology.rst

Change-Id: I7abce2e43b000c1327fa94cd7b99d46e49d7ccf3
---
 gdb/nat/linux-osdata.c | 70 +++++++++++++++++++++++++++++++++++++++---
 1 file changed, 66 insertions(+), 4 deletions(-)

diff --git a/gdb/nat/linux-osdata.c b/gdb/nat/linux-osdata.c
index f9c43f6691e..8639f090910 100644
--- a/gdb/nat/linux-osdata.c
+++ b/gdb/nat/linux-osdata.c
@@ -271,6 +271,68 @@ get_cores_used_by_process (PID_T pid, int *cores, const int num_cores)
   return task_count;
 }
 
+/* get_core_array_size helper that uses /sys/devices/system/cpu/possible.  */
+
+static gdb::optional<size_t>
+get_core_array_size_using_sys_possible ()
+{
+  gdb::optional<std::string> possible
+    = read_text_file_to_string ("/sys/devices/system/cpu/possible");
+
+  if (!possible.has_value ())
+    return {};
+
+  /* The format is documented here:
+
+       https://www.kernel.org/doc/Documentation/admin-guide/cputopology.rst
+
+     For the purpose of this function, we assume the file can contain a complex
+     set of ranges, like `2,4-31,32-63`.  Read all number, disregarding commands
+     and dashes, in order to find the largest possible core number.  The size
+     of the array to allocate is that plus one.  */
+
+  unsigned long max_id = 0;
+  for (std::string::size_type start = 0; start < possible->size ();)
+    {
+      const char *start_p = &(*possible)[start];
+      char *end_p;
+
+      /* Parse one number.  */
+      errno = 0;
+      unsigned long id = strtoul (start_p, &end_p, 10);
+      if (errno != 0)
+	return {};
+
+      max_id = std::max (max_id, id);
+
+      start += end_p - start_p;
+      gdb_assert (start <= possible->size ());
+
+      /* Skip comma, dash, or new line (if we are at the end).  */
+      ++start;
+    }
+
+  return max_id + 1;
+}
+
+/* Return the array size to allocate in order to be able to index it using
+   CPU core numbers.  This may be more than the actual number of cores if
+   the core numbers are not contiguous.  */
+
+static size_t
+get_core_array_size ()
+{
+  /* Using /sys/.../possible is prefered, because it handles the case where
+     we are in a container that has access to a subset of the host's cores.
+     It will return a size that considers all the CPU cores available to the
+     host.  If that fials for some reason, fall back to sysconf.  */
+  gdb::optional<size_t> count = get_core_array_size_using_sys_possible ();
+  if (count.has_value ())
+    return *count;
+
+  return sysconf (_SC_NPROCESSORS_ONLN);
+}
+
 static void
 linux_xfer_osdata_processes (struct buffer *buffer)
 {
@@ -281,7 +343,7 @@ linux_xfer_osdata_processes (struct buffer *buffer)
   dirp = opendir ("/proc");
   if (dirp)
     {
-      const int num_cores = sysconf (_SC_NPROCESSORS_ONLN);
+      const int core_array_size = get_core_array_size ();
       struct dirent *dp;
 
       while ((dp = readdir (dirp)) != NULL)
@@ -308,10 +370,10 @@ linux_xfer_osdata_processes (struct buffer *buffer)
 	    strcpy (user, "?");
 
 	  /* Find CPU cores used by the process.  */
-	  cores = XCNEWVEC (int, num_cores);
-	  task_count = get_cores_used_by_process (pid, cores, num_cores);
+	  cores = XCNEWVEC (int, core_array_size);
+	  task_count = get_cores_used_by_process (pid, cores, core_array_size);
 
-	  for (i = 0; i < num_cores && task_count > 0; ++i)
+	  for (i = 0; i < core_array_size && task_count > 0; ++i)
 	    if (cores[i])
 	      {
 		string_appendf (cores_str, "%d", i);
-- 
2.37.3


  reply	other threads:[~2022-11-04 15:51 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-11-04 15:51 [PATCH 1/2] gdbsupport, gdb: add read_text_file_to_string, use it in linux_common_core_of_thread Simon Marchi
2022-11-04 15:51 ` Simon Marchi [this message]
2022-11-08 17:17   ` [PATCH 2/2] gdb/linux-nat: get core count using /sys/devices/system/cpu/possible Tom Tromey
2022-11-08 17:13 ` [PATCH 1/2] gdbsupport, gdb: add read_text_file_to_string, use it in linux_common_core_of_thread Tom Tromey
2022-11-08 20:36   ` Simon Marchi
2022-11-08 21:48     ` Tom Tromey
2022-11-08 21:52       ` Simon Marchi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20221104155137.1463129-2-simon.marchi@efficios.com \
    --to=simon.marchi@efficios.com \
    --cc=gdb-patches@sourceware.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).