public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
* [PATCH v2 0/3] Fix attaching to process when it has zombie threads
@ 2024-04-20  5:56 Thiago Jung Bauermann
  2024-04-20  5:56 ` [PATCH v2 1/3] gdb/nat: Use procfs(5) indexes in linux_common_core_of_thread Thiago Jung Bauermann
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Thiago Jung Bauermann @ 2024-04-20  5:56 UTC (permalink / raw)
  To: gdb-patches; +Cc: Christophe Lyon, Luis Machado, Pedro Alves

Hello,

Thank you for your review of the patches! This version incorporates the
comments.

Luis suggested adding macros to identify the field indexes in
/proc/PID/stat, but Pedro doesn't think they help. I personally think they
help, but I agree that not by that much, so I don't have a strong opinion.

Since the previous version didn't have the macros, I'm sending v2 with
them so that it's easy to compare both and decide. I can remove the
macros if they're deemed unnecessary.

v1 of this series is here:

https://inbox.sourceware.org/gdb-patches/20240321231149.519549-1-thiago.bauermann@linaro.org/

Thiago Jung Bauermann (3):
  gdb/nat: Use procfs(5) indexes in linux_common_core_of_thread
  gdb/nat: Factor linux_proc_get_stat_field out of
    linux_common_core_of_thread
  gdb/nat/linux: Fix attaching to process when it has zombie threads

 gdb/nat/linux-osdata.c | 38 +++--------------
 gdb/nat/linux-osdata.h |  3 ++
 gdb/nat/linux-procfs.c | 96 ++++++++++++++++++++++++++++++++++++++++++
 gdb/nat/linux-procfs.h | 11 +++++
 4 files changed, 115 insertions(+), 33 deletions(-)


base-commit: edc27e2d90a2af3afddd68e7a534afcc428719e5

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v2 1/3] gdb/nat: Use procfs(5) indexes in linux_common_core_of_thread
  2024-04-20  5:56 [PATCH v2 0/3] Fix attaching to process when it has zombie threads Thiago Jung Bauermann
@ 2024-04-20  5:56 ` Thiago Jung Bauermann
  2024-04-22 19:39   ` Pedro Alves
  2024-04-20  5:56 ` [PATCH v2 2/3] gdb/nat: Factor linux_proc_get_stat_field out of linux_common_core_of_thread Thiago Jung Bauermann
  2024-04-20  5:56 ` [PATCH v2 3/3] gdb/nat/linux: Fix attaching to process when it has zombie threads Thiago Jung Bauermann
  2 siblings, 1 reply; 8+ messages in thread
From: Thiago Jung Bauermann @ 2024-04-20  5:56 UTC (permalink / raw)
  To: gdb-patches; +Cc: Christophe Lyon, Luis Machado, Pedro Alves

The code and comment reference stat fields by made-up indexes.  The
procfs(5) man page, which describes the /proc/PID/stat file, has a numbered
list of these fields so it's more convenient to use those numbers instead.

This is currently an implementation detail inside the function so it's
not really relevant with the code as-is, but a future patch will do some
refactoring which will make the index more prominent.

Therefore, make this change in a separate patch so that it's simpler to
review.

Reviewed-By: Luis Machado <luis.machado@arm.com>
---
 gdb/nat/linux-osdata.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

Changes in v2:
- Added macros for field indexes in /proc/PID/stat (Suggested by Luis).

diff --git a/gdb/nat/linux-osdata.c b/gdb/nat/linux-osdata.c
index 6ffabe90aa7d..079f9ea20c37 100644
--- a/gdb/nat/linux-osdata.c
+++ b/gdb/nat/linux-osdata.c
@@ -52,6 +52,10 @@ typedef long long TIME_T;
 
 #define MAX_PID_T_STRLEN  (sizeof ("-9223372036854775808") - 1)
 
+/* Index of fields of interest in /proc/PID/stat, from procfs(5) man page.  */
+#define LINUX_PROC_STAT_STATE 3
+#define LINUX_PROC_STAT_PROCESSOR 39
+
 /* Returns the CPU core that thread PTID is currently running on.  */
 
 /* Compute and return the processor core of a given thread.  */
@@ -74,10 +78,10 @@ linux_common_core_of_thread (ptid_t ptid)
   if (pos == std::string::npos)
     return -1;
 
-  /* If the first field after program name has index 0, then core number is
-     the field with index 36 (so, the 37th).  There's no constant for that
-     anywhere.  */
-  for (int i = 0; i < 37; ++i)
+  /* If the first field after program name has index 3, then core number is
+     the field with index 39.  These are the indexes shown in the procfs(5)
+     man page.  */
+  for (int i = LINUX_PROC_STAT_STATE; i <= LINUX_PROC_STAT_PROCESSOR; ++i)
     {
       /* Find separator.  */
       pos = content->find_first_of (' ', pos);

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v2 2/3] gdb/nat: Factor linux_proc_get_stat_field out of linux_common_core_of_thread
  2024-04-20  5:56 [PATCH v2 0/3] Fix attaching to process when it has zombie threads Thiago Jung Bauermann
  2024-04-20  5:56 ` [PATCH v2 1/3] gdb/nat: Use procfs(5) indexes in linux_common_core_of_thread Thiago Jung Bauermann
@ 2024-04-20  5:56 ` Thiago Jung Bauermann
  2024-04-20  5:56 ` [PATCH v2 3/3] gdb/nat/linux: Fix attaching to process when it has zombie threads Thiago Jung Bauermann
  2 siblings, 0 replies; 8+ messages in thread
From: Thiago Jung Bauermann @ 2024-04-20  5:56 UTC (permalink / raw)
  To: gdb-patches; +Cc: Christophe Lyon, Luis Machado, Pedro Alves

The new function will be used in a subsequent patch to read a different
stat field.

The new code is believed to be equivalent to the old code, so there
should be no change in GDB behaviour.  The only material change was to
use std::string and string_printf rather than a fixed char array to
build the path to the stat file.

Also, take the opportunity to move the function's documentation comment to
the header file, to conform with GDB practice.

Reviewed-By: Luis Machado <luis.machado@arm.com>
---
 gdb/nat/linux-osdata.c | 42 +++++------------------------------------
 gdb/nat/linux-osdata.h |  3 +++
 gdb/nat/linux-procfs.c | 43 ++++++++++++++++++++++++++++++++++++++++++
 gdb/nat/linux-procfs.h | 10 ++++++++++
 4 files changed, 61 insertions(+), 37 deletions(-)

Changes in v2:
- Added macros for field indexes in /proc/PID/stat (Suggested by Luis).
- Moved linux_find_proc_stat_field from linux-osdata.c to linux-procfs.c
  and changed prefix to linux_proc (Suggested by Pedro).
- Use string_printf in linux_proc_get_stat_field to build path to stat
  file, to avoid having to copy the PID_T and MAX_PID_T_STRLEN macros.

diff --git a/gdb/nat/linux-osdata.c b/gdb/nat/linux-osdata.c
index 079f9ea20c37..3a6215015f12 100644
--- a/gdb/nat/linux-osdata.c
+++ b/gdb/nat/linux-osdata.c
@@ -36,6 +36,7 @@
 #include <sys/stat.h>
 #include "gdbsupport/filestuff.h"
 #include <algorithm>
+#include "linux-procfs.h"
 
 #define NAMELEN(dirent) strlen ((dirent)->d_name)
 
@@ -52,51 +53,18 @@ typedef long long TIME_T;
 
 #define MAX_PID_T_STRLEN  (sizeof ("-9223372036854775808") - 1)
 
-/* Index of fields of interest in /proc/PID/stat, from procfs(5) man page.  */
-#define LINUX_PROC_STAT_STATE 3
-#define LINUX_PROC_STAT_PROCESSOR 39
-
-/* Returns the CPU core that thread PTID is currently running on.  */
-
-/* Compute and return the processor core of a given thread.  */
+/* See linux-osdata.h.  */
 
 int
 linux_common_core_of_thread (ptid_t ptid)
 {
-  char filename[sizeof ("/proc//task//stat") + 2 * MAX_PID_T_STRLEN];
+  std::optional<std::string> field
+    = linux_proc_get_stat_field (ptid, LINUX_PROC_STAT_PROCESSOR);
   int core;
 
-  sprintf (filename, "/proc/%lld/task/%lld/stat",
-	   (PID_T) ptid.pid (), (PID_T) ptid.lwp ());
-
-  std::optional<std::string> content = read_text_file_to_string (filename);
-  if (!content.has_value ())
+  if (!field.has_value () || sscanf (field->c_str (), "%d", &core) == 0)
     return -1;
 
-  /* ps command also relies on no trailing fields ever contain ')'.  */
-  std::string::size_type pos = content->find_last_of (')');
-  if (pos == std::string::npos)
-    return -1;
-
-  /* If the first field after program name has index 3, then core number is
-     the field with index 39.  These are the indexes shown in the procfs(5)
-     man page.  */
-  for (int i = LINUX_PROC_STAT_STATE; i <= LINUX_PROC_STAT_PROCESSOR; ++i)
-    {
-      /* Find separator.  */
-      pos = content->find_first_of (' ', pos);
-      if (pos == std::string::npos)
-	return {};
-
-      /* Find beginning of field.  */
-      pos = content->find_first_not_of (' ', pos);
-      if (pos == std::string::npos)
-	return {};
-    }
-
-  if (sscanf (&(*content)[pos], "%d", &core) == 0)
-    core = -1;
-
   return core;
 }
 
diff --git a/gdb/nat/linux-osdata.h b/gdb/nat/linux-osdata.h
index 833915cdb2fd..a82fb08b998e 100644
--- a/gdb/nat/linux-osdata.h
+++ b/gdb/nat/linux-osdata.h
@@ -20,7 +20,10 @@
 #ifndef NAT_LINUX_OSDATA_H
 #define NAT_LINUX_OSDATA_H
 
+/* Returns the CPU core that thread PTID is currently running on.  */
+
 extern int linux_common_core_of_thread (ptid_t ptid);
+
 extern LONGEST linux_common_xfer_osdata (const char *annex, gdb_byte *readbuf,
 					 ULONGEST offset, ULONGEST len);
 
diff --git a/gdb/nat/linux-procfs.c b/gdb/nat/linux-procfs.c
index e2086952ce6b..23231c301a3f 100644
--- a/gdb/nat/linux-procfs.c
+++ b/gdb/nat/linux-procfs.c
@@ -230,6 +230,49 @@ linux_proc_pid_is_zombie (pid_t pid)
 
 /* See linux-procfs.h.  */
 
+std::optional<std::string>
+linux_proc_get_stat_field (ptid_t ptid, int field)
+{
+  /* We never need to read PID from the stat file, and there's
+     command_from_pid to read the comm field.  */
+  gdb_assert (field >= LINUX_PROC_STAT_STATE);
+
+  std::string filename = string_printf ("/proc/%ld/task/%ld/stat",
+					(long) ptid.pid (), (long) ptid.lwp ());
+
+  std::optional<std::string> content
+    = read_text_file_to_string (filename.c_str ());
+  if (!content.has_value ())
+    return {};
+
+  /* ps command also relies on no trailing fields ever contain ')'.  */
+  std::string::size_type pos = content->find_last_of (')');
+  if (pos == std::string::npos)
+    return {};
+
+  for (int i = LINUX_PROC_STAT_STATE; i <= field; ++i)
+    {
+      /* Find separator.  */
+      pos = content->find_first_of (' ', pos);
+      if (pos == std::string::npos)
+	return {};
+
+      /* Find beginning of field.  */
+      pos = content->find_first_not_of (' ', pos);
+      if (pos == std::string::npos)
+	return {};
+    }
+
+  /* Find end of field.  */
+  std::string::size_type end_pos = content->find_first_of (' ', pos);
+  if (end_pos == std::string::npos)
+    return content->substr (pos);
+  else
+    return content->substr (pos, end_pos - pos);
+}
+
+/* See linux-procfs.h.  */
+
 const char *
 linux_proc_tid_get_name (ptid_t ptid)
 {
diff --git a/gdb/nat/linux-procfs.h b/gdb/nat/linux-procfs.h
index 880afbcdd615..ec1f37651fbf 100644
--- a/gdb/nat/linux-procfs.h
+++ b/gdb/nat/linux-procfs.h
@@ -54,6 +54,16 @@ extern int linux_proc_pid_is_zombie_nowarn (pid_t pid);
 
 extern int linux_proc_pid_is_gone (pid_t pid);
 
+/* Index of fields of interest in /proc/PID/stat, from procfs(5) man page.  */
+#define LINUX_PROC_STAT_STATE 3
+#define LINUX_PROC_STAT_PROCESSOR 39
+
+/* Returns FIELD (as numbered in procfs(5) man page) of
+   /proc/PID/task/LWP/stat file.  */
+
+extern std::optional<std::string> linux_proc_get_stat_field (ptid_t ptid,
+							     int field);
+
 /* Return a string giving the thread's name or NULL if the
    information is unavailable.  The returned value points to a statically
    allocated buffer.  The value therefore becomes invalid at the next

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v2 3/3] gdb/nat/linux: Fix attaching to process when it has zombie threads
  2024-04-20  5:56 [PATCH v2 0/3] Fix attaching to process when it has zombie threads Thiago Jung Bauermann
  2024-04-20  5:56 ` [PATCH v2 1/3] gdb/nat: Use procfs(5) indexes in linux_common_core_of_thread Thiago Jung Bauermann
  2024-04-20  5:56 ` [PATCH v2 2/3] gdb/nat: Factor linux_proc_get_stat_field out of linux_common_core_of_thread Thiago Jung Bauermann
@ 2024-04-20  5:56 ` Thiago Jung Bauermann
  2024-04-22 19:59   ` Pedro Alves
  2 siblings, 1 reply; 8+ messages in thread
From: Thiago Jung Bauermann @ 2024-04-20  5:56 UTC (permalink / raw)
  To: gdb-patches; +Cc: Christophe Lyon, Luis Machado, Pedro Alves

When GDB attaches to a multi-threaded process, it calls
linux_proc_attach_tgid_threads () to go through all threads found in
/proc/PID/task/ and call attach_proc_task_lwp_callback () on each of
them.  If it does that twice without the callback reporting that a new
thread was found, then it considers that all inferior threads have been
found and returns.

The problem is that the callback considers any thread that it hasn't
attached to yet as new.  This causes problems if the process has one or
more zombie threads, because GDB can't attach to it and the loop will
always "find" a new thread (the zombie one), and get stuck in an
infinite loop.

This is easy to trigger (at least on aarch64-linux and powerpc64le-linux)
with the gdb.threads/attach-many-short-lived-threads.exp testcase, because
its test program constantly creates and finishes joinable threads so the
chance of having zombie threads is high.

This problem causes the following failures:

FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: attach (timeout)
FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: no new threads (timeout)
FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: set breakpoint always-inserted on (timeout)
FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: break break_fn (timeout)
FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: break at break_fn: 1 (timeout)
FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: break at break_fn: 2 (timeout)
FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: break at break_fn: 3 (timeout)
FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: reset timer in the inferior (timeout)
FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: print seconds_left (timeout)
FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: detach (timeout)
FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: set breakpoint always-inserted off (timeout)
FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: delete all breakpoints, watchpoints, tracepoints, and catchpoints in delete_breakpoints (timeout)
ERROR: breakpoints not deleted

The iteration number is random, and all tests in the subsequent iterations
fail too, because GDB is stuck in the attach command at the beginning of
the iteration.

The solution is to make linux_proc_attach_tgid_threads () remember when it
has already processed a given LWP and skip it in the subsequent iterations.

PR testsuite/31312
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31312

Reviewed-By: Luis Machado <luis.machado@arm.com>
---
 gdb/nat/linux-procfs.c | 53 ++++++++++++++++++++++++++++++++++++++++++
 gdb/nat/linux-procfs.h |  1 +
 2 files changed, 54 insertions(+)

Changes in v2:
- Added macro for field index in /proc/PID/stat (Suggested by Luis).
- Moved linux_get_starttime to linux-procfs.c and changed its prefix
  to linux_proc (Suggested by Pedro).
- Changed visited_lwps from std::set to std::unordered_set. Had to add a
  hash function (Suggested by Pedro).

diff --git a/gdb/nat/linux-procfs.c b/gdb/nat/linux-procfs.c
index 23231c301a3f..2ed7b36ecd55 100644
--- a/gdb/nat/linux-procfs.c
+++ b/gdb/nat/linux-procfs.c
@@ -20,6 +20,8 @@
 #include "gdbsupport/filestuff.h"
 #include <dirent.h>
 #include <sys/stat.h>
+#include <unordered_set>
+#include <utility>
 
 /* Return the TGID of LWPID from /proc/pid/status.  Returns -1 if not
    found.  */
@@ -271,6 +273,29 @@ linux_proc_get_stat_field (ptid_t ptid, int field)
     return content->substr (pos, end_pos - pos);
 }
 
+/* Get the start time of thread PTID.  */
+
+static std::optional<ULONGEST>
+linux_proc_get_starttime (ptid_t ptid)
+{
+  std::optional<std::string> field
+    = linux_proc_get_stat_field (ptid, LINUX_PROC_STAT_STARTTIME);
+
+  if (!field.has_value ())
+    return {};
+
+  errno = 0;
+  const char *trailer;
+  ULONGEST starttime = strtoulst (field->c_str (), &trailer, 10);
+  if (starttime == ULONGEST_MAX && errno == ERANGE)
+    return {};
+  else if (*trailer != '\0')
+    /* There were unexpected characters.  */
+    return {};
+
+  return starttime;
+}
+
 /* See linux-procfs.h.  */
 
 const char *
@@ -332,6 +357,21 @@ linux_proc_attach_tgid_threads (pid_t pid,
       return;
     }
 
+  /* Callable object to hash elements in visited_lpws.  */
+  struct pair_hash
+  {
+    std::size_t operator() (const std::pair<unsigned long, ULONGEST> &v) const
+    {
+      return (std::hash<unsigned long>() (v.first)
+	      ^ std::hash<ULONGEST>() (v.second));
+    }
+  };
+
+  /* Keeps track of the LWPs we have already visited in /proc,
+     identified by their PID and starttime to detect PID reuse.  */
+  std::unordered_set<std::pair<unsigned long, ULONGEST>,
+		     pair_hash> visited_lwps;
+
   /* Scan the task list for existing threads.  While we go through the
      threads, new threads may be spawned.  Cycle through the list of
      threads until we have done two iterations without finding new
@@ -350,6 +390,19 @@ linux_proc_attach_tgid_threads (pid_t pid,
 	  if (lwp != 0)
 	    {
 	      ptid_t ptid = ptid_t (pid, lwp);
+	      std::optional<ULONGEST> starttime
+		= linux_proc_get_starttime (ptid);
+
+	      if (starttime.has_value ())
+		{
+		  std::pair<unsigned long, ULONGEST> key (lwp, *starttime);
+
+		  /* If we already visited this LWP, skip it this time.  */
+		  if (visited_lwps.find (key) != visited_lwps.cend ())
+		    continue;
+
+		  visited_lwps.insert (key);
+		}
 
 	      if (attach_lwp (ptid))
 		new_threads_found = 1;
diff --git a/gdb/nat/linux-procfs.h b/gdb/nat/linux-procfs.h
index ec1f37651fbf..64224801c8f2 100644
--- a/gdb/nat/linux-procfs.h
+++ b/gdb/nat/linux-procfs.h
@@ -56,6 +56,7 @@ extern int linux_proc_pid_is_gone (pid_t pid);
 
 /* Index of fields of interest in /proc/PID/stat, from procfs(5) man page.  */
 #define LINUX_PROC_STAT_STATE 3
+#define LINUX_PROC_STAT_STARTTIME 22
 #define LINUX_PROC_STAT_PROCESSOR 39
 
 /* Returns FIELD (as numbered in procfs(5) man page) of

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 1/3] gdb/nat: Use procfs(5) indexes in linux_common_core_of_thread
  2024-04-20  5:56 ` [PATCH v2 1/3] gdb/nat: Use procfs(5) indexes in linux_common_core_of_thread Thiago Jung Bauermann
@ 2024-04-22 19:39   ` Pedro Alves
  2024-04-24 23:09     ` Thiago Jung Bauermann
  0 siblings, 1 reply; 8+ messages in thread
From: Pedro Alves @ 2024-04-22 19:39 UTC (permalink / raw)
  To: Thiago Jung Bauermann, gdb-patches; +Cc: Christophe Lyon, Luis Machado

On 2024-04-20 06:56, Thiago Jung Bauermann wrote:

> diff --git a/gdb/nat/linux-osdata.c b/gdb/nat/linux-osdata.c
> index 6ffabe90aa7d..079f9ea20c37 100644
> --- a/gdb/nat/linux-osdata.c
> +++ b/gdb/nat/linux-osdata.c
> @@ -52,6 +52,10 @@ typedef long long TIME_T;
>  
>  #define MAX_PID_T_STRLEN  (sizeof ("-9223372036854775808") - 1)
>  
> +/* Index of fields of interest in /proc/PID/stat, from procfs(5) man page.  */
> +#define LINUX_PROC_STAT_STATE 3
> +#define LINUX_PROC_STAT_PROCESSOR 39
> +


> +  /* If the first field after program name has index 3, then core number is
> +     the field with index 39.  These are the indexes shown in the procfs(5)
> +     man page.  */

If you're going to have macros, then the comment shouldn't be repeating the
numbers.  This is one way I thought it would be worse -- before the numbers were
all in the same four lines, there was basically no chance of getting them out of
sync.  Now they are spread out in two different places.  

For the logic below, what isn't obvious and deserves a comment is, IMO:

- the state field is that one that comes first after the command name field.
- we want to get to the "processor" field (not obvious because we call it "core").

> +  for (int i = LINUX_PROC_STAT_STATE; i <= LINUX_PROC_STAT_PROCESSOR; ++i)
>      {
>        /* Find separator.  */
>        pos = content->find_first_of (' ', pos);



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 3/3] gdb/nat/linux: Fix attaching to process when it has zombie threads
  2024-04-20  5:56 ` [PATCH v2 3/3] gdb/nat/linux: Fix attaching to process when it has zombie threads Thiago Jung Bauermann
@ 2024-04-22 19:59   ` Pedro Alves
  2024-04-24 23:15     ` Thiago Jung Bauermann
  0 siblings, 1 reply; 8+ messages in thread
From: Pedro Alves @ 2024-04-22 19:59 UTC (permalink / raw)
  To: Thiago Jung Bauermann, gdb-patches; +Cc: Christophe Lyon, Luis Machado

On 2024-04-20 06:56, Thiago Jung Bauermann wrote:
> When GDB attaches to a multi-threaded process, it calls
> linux_proc_attach_tgid_threads () to go through all threads found in
> /proc/PID/task/ and call attach_proc_task_lwp_callback () on each of
> them.  If it does that twice without the callback reporting that a new
> thread was found, then it considers that all inferior threads have been
> found and returns.
> 
> The problem is that the callback considers any thread that it hasn't
> attached to yet as new.  This causes problems if the process has one or
> more zombie threads, because GDB can't attach to it and the loop will
> always "find" a new thread (the zombie one), and get stuck in an
> infinite loop.
> 
> This is easy to trigger (at least on aarch64-linux and powerpc64le-linux)
> with the gdb.threads/attach-many-short-lived-threads.exp testcase, because
> its test program constantly creates and finishes joinable threads so the
> chance of having zombie threads is high.
> 
> This problem causes the following failures:
> 
> FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: attach (timeout)
> FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: no new threads (timeout)
> FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: set breakpoint always-inserted on (timeout)
> FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: break break_fn (timeout)
> FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: break at break_fn: 1 (timeout)
> FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: break at break_fn: 2 (timeout)
> FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: break at break_fn: 3 (timeout)
> FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: reset timer in the inferior (timeout)
> FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: print seconds_left (timeout)
> FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: detach (timeout)
> FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: set breakpoint always-inserted off (timeout)
> FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: delete all breakpoints, watchpoints, tracepoints, and catchpoints in delete_breakpoints (timeout)
> ERROR: breakpoints not deleted
> 
> The iteration number is random, and all tests in the subsequent iterations
> fail too, because GDB is stuck in the attach command at the beginning of
> the iteration.
> 
> The solution is to make linux_proc_attach_tgid_threads () remember when it
> has already processed a given LWP and skip it in the subsequent iterations.
> 
> PR testsuite/31312
> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31312
> 
> Reviewed-By: Luis Machado <luis.machado@arm.com>

Approved-By: Pedro Alves <pedro@palves.net>

BTW, after seeing the other patches after patch #1, I do agree with giving names
to the stat fields.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 1/3] gdb/nat: Use procfs(5) indexes in linux_common_core_of_thread
  2024-04-22 19:39   ` Pedro Alves
@ 2024-04-24 23:09     ` Thiago Jung Bauermann
  0 siblings, 0 replies; 8+ messages in thread
From: Thiago Jung Bauermann @ 2024-04-24 23:09 UTC (permalink / raw)
  To: Pedro Alves; +Cc: gdb-patches, Christophe Lyon, Luis Machado


Hello Pedro,

Thank you for the review!

Pedro Alves <pedro@palves.net> writes:

> On 2024-04-20 06:56, Thiago Jung Bauermann wrote:
>
>> diff --git a/gdb/nat/linux-osdata.c b/gdb/nat/linux-osdata.c
>> index 6ffabe90aa7d..079f9ea20c37 100644
>> --- a/gdb/nat/linux-osdata.c
>> +++ b/gdb/nat/linux-osdata.c
>> @@ -52,6 +52,10 @@ typedef long long TIME_T;
>>  
>>  #define MAX_PID_T_STRLEN  (sizeof ("-9223372036854775808") - 1)
>>  
>> +/* Index of fields of interest in /proc/PID/stat, from procfs(5) man page.  */
>> +#define LINUX_PROC_STAT_STATE 3
>> +#define LINUX_PROC_STAT_PROCESSOR 39
>> +
>
>
>> +  /* If the first field after program name has index 3, then core number is
>> +     the field with index 39.  These are the indexes shown in the procfs(5)
>> +     man page.  */
>
> If you're going to have macros, then the comment shouldn't be repeating the
> numbers.  This is one way I thought it would be worse -- before the numbers were
> all in the same four lines, there was basically no chance of getting them out of
> sync.  Now they are spread out in two different places.  
>
> For the logic below, what isn't obvious and deserves a comment is, IMO:
>
> - the state field is that one that comes first after the command name field.
> - we want to get to the "processor" field (not obvious because we call it "core").

Yes, good points. I reworded the comment in v3 to address them.

>> +  for (int i = LINUX_PROC_STAT_STATE; i <= LINUX_PROC_STAT_PROCESSOR; ++i)
>>      {
>>        /* Find separator.  */
>>        pos = content->find_first_of (' ', pos);

-- 
Thiago

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 3/3] gdb/nat/linux: Fix attaching to process when it has zombie threads
  2024-04-22 19:59   ` Pedro Alves
@ 2024-04-24 23:15     ` Thiago Jung Bauermann
  0 siblings, 0 replies; 8+ messages in thread
From: Thiago Jung Bauermann @ 2024-04-24 23:15 UTC (permalink / raw)
  To: Pedro Alves; +Cc: gdb-patches, Christophe Lyon, Luis Machado

Pedro Alves <pedro@palves.net> writes:

> On 2024-04-20 06:56, Thiago Jung Bauermann wrote:
>> When GDB attaches to a multi-threaded process, it calls
>> linux_proc_attach_tgid_threads () to go through all threads found in
>> /proc/PID/task/ and call attach_proc_task_lwp_callback () on each of
>> them.  If it does that twice without the callback reporting that a new
>> thread was found, then it considers that all inferior threads have been
>> found and returns.
>> 
>> The problem is that the callback considers any thread that it hasn't
>> attached to yet as new.  This causes problems if the process has one or
>> more zombie threads, because GDB can't attach to it and the loop will
>> always "find" a new thread (the zombie one), and get stuck in an
>> infinite loop.
>> 
>> This is easy to trigger (at least on aarch64-linux and powerpc64le-linux)
>> with the gdb.threads/attach-many-short-lived-threads.exp testcase, because
>> its test program constantly creates and finishes joinable threads so the
>> chance of having zombie threads is high.
>> 
>> This problem causes the following failures:
>> 
>> FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: attach (timeout)
>> FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: no new threads (timeout)
>> FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: set breakpoint
>> always-inserted on (timeout)
>> FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: break break_fn (timeout)
>> FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: break at break_fn: 1
>> (timeout)
>> FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: break at break_fn: 2
>> (timeout)
>> FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: break at break_fn: 3
>> (timeout)
>> FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: reset timer in the
>> inferior (timeout)
>> FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: print seconds_left
>> (timeout)
>> FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: detach (timeout)
>> FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: set breakpoint
>> always-inserted off (timeout)
>> FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: delete all breakpoints,
>> watchpoints, tracepoints, and catchpoints in delete_breakpoints (timeout)
>> ERROR: breakpoints not deleted
>> 
>> The iteration number is random, and all tests in the subsequent iterations
>> fail too, because GDB is stuck in the attach command at the beginning of
>> the iteration.
>> 
>> The solution is to make linux_proc_attach_tgid_threads () remember when it
>> has already processed a given LWP and skip it in the subsequent iterations.
>> 
>> PR testsuite/31312
>> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31312
>> 
>> Reviewed-By: Luis Machado <luis.machado@arm.com>
>
> Approved-By: Pedro Alves <pedro@palves.net>

Thank you!

> BTW, after seeing the other patches after patch #1, I do agree with giving names
> to the stat fields.

Great. You did bring up interesting points about them though, so it was
an interesting exercise.

-- 
Thiago

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2024-04-24 23:15 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-04-20  5:56 [PATCH v2 0/3] Fix attaching to process when it has zombie threads Thiago Jung Bauermann
2024-04-20  5:56 ` [PATCH v2 1/3] gdb/nat: Use procfs(5) indexes in linux_common_core_of_thread Thiago Jung Bauermann
2024-04-22 19:39   ` Pedro Alves
2024-04-24 23:09     ` Thiago Jung Bauermann
2024-04-20  5:56 ` [PATCH v2 2/3] gdb/nat: Factor linux_proc_get_stat_field out of linux_common_core_of_thread Thiago Jung Bauermann
2024-04-20  5:56 ` [PATCH v2 3/3] gdb/nat/linux: Fix attaching to process when it has zombie threads Thiago Jung Bauermann
2024-04-22 19:59   ` Pedro Alves
2024-04-24 23:15     ` Thiago Jung Bauermann

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).