public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
* [PATCH] Linux: Access memory even if threads are running
@ 2021-06-11 17:36 Pedro Alves
  2021-06-11 17:55 ` Pedro Alves
  2021-07-01 13:18 ` [PATCH] " Luis Machado
  0 siblings, 2 replies; 16+ messages in thread
From: Pedro Alves @ 2021-06-11 17:36 UTC (permalink / raw)
  To: gdb-patches

Currently, on GNU/Linux, if you try to access memory and you have a
running thread selected, GDB fails the memory accesses, like:

 (gdb) c&
 Continuing.
 (gdb) p global_var
 Cannot access memory at address 0x555555558010

Or:

 (gdb) b main
 Breakpoint 2 at 0x55555555524d: file access-mem-running.c, line 59.
 Warning:
 Cannot insert breakpoint 2.
 Cannot access memory at address 0x55555555524d

This patch removes this limitation.  It teaches the native Linux
target to read/write memory even if the target is running.  And it
does this without temporarily stopping threads.  We now get:

 (gdb) c&
 Continuing.
 (gdb) p global_var
 $1 = 123
 (gdb) b main
 Breakpoint 2 at 0x555555555259: file access-mem-running.c, line 62.

(The scenarios above work correctly with current GDBserver, because
GDBserver temporarily stops all threads in the process whenever GDB
wants to access memory (see prepare_to_access_memory /
done_accessing_memory).  Freezing the whole process makes sense when
we need to be sure that we have a consistent view of memory and don't
race with the inferior changing it at the same time as GDB is
accessing it.  But I think that's a too-heavy hammer for the default
behavior.  I think that ideally, whether to stop all threads or not
should be policy decided by gdb core, probably best implemented by
exposing something like gdbserver's prepare_to_access_memory /
done_accessing_memory to gdb core.)

Currently, if we're accessing (reading/writing) just a few bytes, then
the Linux native backend does not try accessing memory via
/proc/<pid>/mem and goes straight to ptrace
PTRACE_PEEKTEXT/PTRACE_POKETEXT.  However, ptrace always fails when
the ptracee is running.  So the first step is to prefer
/proc/<pid>/mem even for small accesses.  Without further changes
however, that may cause a performance regression, due to constantly
opening and closing /proc/<pid>/mem for each memory access.  So the
next step is to keep the /proc/<pid>/mem file open across memory
accesses.  If we have this, then it doesn't make sense anymore to even
have the ptrace fallback, so the patch disables it.

I've made it such that GDB only ever has one /proc/<pid>/mem file open
at any time.  As long as a memory access hits the same inferior
process as the previous access, then we reuse the previously open
file.  If however, we access memory of a different process, then we
close the previous file and open a new one for the new process.

If we wanted, we could keep one /proc/<pid>/mem file open per
inferior, and never close them (unless the inferior exits or execs).
However, having seen bfd patches recently about hitting too many open
file descriptors, I kept the logic to have only one file open tops.
Also, we need to handle memory accesses for processes for which we
don't have an inferior object, for when we need to detach a
fork-child, and we'd probaly want to handle caching the open file for
that scenario (no inferior for process) too, which would probably end
up meaning caching for last non-inferior process, which is very much
what I'm proposing anyhow.  So always having one file open likely ends
up a smaller patch.

The next step is handling the case of GDB reading/writing memory
through a thread that is running and exits.  The access should not
result in a user-visible failure if the inferior/process is still
alive.

Once we manage to open a /proc/<lwpid>/mem file, then that file is
usable for memory accesses even if the corresponding lwp exits and is
reaped.  I double checked that trying to open the same
/proc/<lwpid>/mem path again fails because the lwp is really gone so
there's no /proc/<lwpid>/ entry on the filesystem anymore, but the
previously open file remains usable.  It's only when the whole process
execs that we need to reopen a new file.

When the kernel destroys the whole address space, i.e., when the
process exits or execs, the reads/writes fail with 0 aka EOF, in which
case there's nothing else to do than returning a memory access
failure.  Note this means that when we get an exec event, we need to
reopen the file, to access the process's new address space.

If we need to open (or reopen) the /proc/<pid>/mem file, and the LWP
we're opening it for exits before we open it and before we reap the
LWP (i.e., the LWP is zombie), the open fails with EACCES.  The patch
handles this by just looking for another thread until it finds one
that we can open a /proc/<pid>/mem successfully for.

If we need to open (or reopen) the /proc/<pid>/mem file, and the LWP
we're opening it had exited and we already reaped it, which is the
case if the selected thread is in THREAD_EXIT state, the open fails
with ENOENT.  The patch handles this the same way as a zombie race
(EACESS), instead of checking upfront whether we're accessing a
known-exited thread, because that would result in more complicated
code, because we also need to handle accessing lwps that are not
listed in the core thread list, and it's the core thread list that
records the THREAD_EXIT state.

The patch includes two testcases:

#1 - gdb.threads/access-mem-running.exp

  This is the conceptually simplest - it has two threads, and has GDB
  read and write memory from each of the threads.  It also tests
  setting a breakpoint while all threads are running, and checks that
  the breakpoint is hit immediately.

#2 - gdb.threads/access-mem-running-thread-exit.exp

  This one is more elaborate, as it continuously spawns short-lived
  threads in order to exercise accessing memory just while threads are
  exiting.  It also spawns two different processes and alternates
  accessing memory between the two processes to exercise the reopening
  the /proc file frequently.  This also ends up exercising GDB reading
  from an exited thread frequently.  I confirmed by putting abort()
  calls in the EACESS/ENOENT paths added by the patch that we do hit
  all of them frequently with the testcase.  It also exits the
  process's main thread (i.e., the main thread becomes zombie), to
  make sure accessing memory in such a corner-case scenario works now
  and in the future.

The tests fail on GNU/Linux native before the code changes, and pass
after.  They pass against current GDBserver, again because GDBserver
supports memory access even if all threads are running, by
transparently pausing the whole process.

gdb/ChangeLog:
yyyy-mm-dd  Pedro Alves  <pedro@palves.net>

	* linux-nat.c (linux_nat_target::detach): Close the
	/proc/<pid>/mem file if it was open for this process.
	(linux_handle_extended_wait) <PTRACE_EVENT_EXEC>: Close the
	/proc/<pid>/mem file if it was open for this process.
	(linux_nat_target::mourn_inferior): Close the /proc/<pid>/mem file
	if it was open for this process.
	(linux_nat_target::xfer_partial): Adjust.  Do not fall back to
	inf_ptrace_target::xfer_partial for memory accesses.
	(last_proc_mem_file): New.
	(maybe_close_proc_mem_file): New.
	(linux_proc_xfer_memory_partial_pid): New, with bits factored out
	from linux_proc_xfer_partial.
	(linux_proc_xfer_partial): Delete.
	(linux_proc_xfer_memory_partial): New.

gdb/testsuite/ChangeLog
yyyy-mm-dd  Pedro Alves  <pedro@palves.net>

	* gdb.threads/access-mem-running.c: New.
	* gdb.threads/access-mem-running.exp: New.
	* gdb.threads/access-mem-running-thread-exit.c: New.
	* gdb.threads/access-mem-running-thread-exit.exp: New.

Change-Id: Ib3c082528872662a3fc0ca9b31c34d4876c874c9
---
 gdb/linux-nat.c                               | 262 ++++++++++++++----
 .../access-mem-running-thread-exit.c          | 123 ++++++++
 .../access-mem-running-thread-exit.exp        | 166 +++++++++++
 .../gdb.threads/access-mem-running.c          |  80 ++++++
 .../gdb.threads/access-mem-running.exp        | 132 +++++++++
 5 files changed, 716 insertions(+), 47 deletions(-)
 create mode 100644 gdb/testsuite/gdb.threads/access-mem-running-thread-exit.c
 create mode 100644 gdb/testsuite/gdb.threads/access-mem-running-thread-exit.exp
 create mode 100644 gdb/testsuite/gdb.threads/access-mem-running.c
 create mode 100644 gdb/testsuite/gdb.threads/access-mem-running.exp

diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
index 34a2aee41d7..0b4107af523 100644
--- a/gdb/linux-nat.c
+++ b/gdb/linux-nat.c
@@ -280,6 +280,8 @@ static int lwp_status_pending_p (struct lwp_info *lp);
 
 static void save_stop_reason (struct lwp_info *lp);
 
+static void maybe_close_proc_mem_file (pid_t pid);
+
 \f
 /* LWP accessors.  */
 
@@ -1486,6 +1488,8 @@ linux_nat_target::detach (inferior *inf, int from_tty)
 
       detach_success (inf);
     }
+
+  maybe_close_proc_mem_file (pid);
 }
 
 /* Resume execution of the inferior process.  If STEP is nonzero,
@@ -2025,6 +2029,10 @@ linux_handle_extended_wait (struct lwp_info *lp, int status)
     {
       linux_nat_debug_printf ("Got exec event from LWP %ld", lp->ptid.lwp ());
 
+      /* Close the /proc/<pid>/mem file if it was open for this
+	 inferior.  */
+      maybe_close_proc_mem_file (lp->ptid.pid ());
+
       ourstatus->kind = TARGET_WAITKIND_EXECD;
       ourstatus->value.execd_pathname
 	= xstrdup (linux_proc_pid_to_exec_file (pid));
@@ -3589,6 +3597,10 @@ linux_nat_target::mourn_inferior ()
 
   purge_lwp_list (pid);
 
+  /* Close the /proc/<pid>/mem file if it was open for this
+     inferior.  */
+  maybe_close_proc_mem_file (pid);
+
   if (! forks_exist_p ())
     /* Normal case, no other forks available.  */
     inf_ptrace_target::mourn_inferior ();
@@ -3681,10 +3693,8 @@ linux_nat_xfer_osdata (enum target_object object,
 		       ULONGEST *xfered_len);
 
 static enum target_xfer_status
-linux_proc_xfer_partial (enum target_object object,
-			 const char *annex, gdb_byte *readbuf,
-			 const gdb_byte *writebuf,
-			 ULONGEST offset, LONGEST len, ULONGEST *xfered_len);
+linux_proc_xfer_memory_partial (gdb_byte *readbuf, const gdb_byte *writebuf,
+				ULONGEST offset, LONGEST len, ULONGEST *xfered_len);
 
 enum target_xfer_status
 linux_nat_target::xfer_partial (enum target_object object,
@@ -3692,8 +3702,6 @@ linux_nat_target::xfer_partial (enum target_object object,
 				const gdb_byte *writebuf,
 				ULONGEST offset, ULONGEST len, ULONGEST *xfered_len)
 {
-  enum target_xfer_status xfer;
-
   if (object == TARGET_OBJECT_SIGNAL_INFO)
     return linux_xfer_siginfo (object, annex, readbuf, writebuf,
 			       offset, len, xfered_len);
@@ -3712,25 +3720,21 @@ linux_nat_target::xfer_partial (enum target_object object,
     return linux_nat_xfer_osdata (object, annex, readbuf, writebuf,
 				  offset, len, xfered_len);
 
-  /* GDB calculates all addresses in the largest possible address
-     width.
-     The address width must be masked before its final use - either by
-     linux_proc_xfer_partial or inf_ptrace_target::xfer_partial.
-
-     Compare ADDR_BIT first to avoid a compiler warning on shift overflow.  */
-
   if (object == TARGET_OBJECT_MEMORY)
     {
+      /* GDB calculates all addresses in the largest possible address
+	 width.  The address width must be masked before its final use
+	 by linux_proc_xfer_partial.
+
+	 Compare ADDR_BIT first to avoid a compiler warning on shift overflow.  */
       int addr_bit = gdbarch_addr_bit (target_gdbarch ());
 
       if (addr_bit < (sizeof (ULONGEST) * HOST_CHAR_BIT))
 	offset &= ((ULONGEST) 1 << addr_bit) - 1;
-    }
 
-  xfer = linux_proc_xfer_partial (object, annex, readbuf, writebuf,
-				  offset, len, xfered_len);
-  if (xfer != TARGET_XFER_EOF)
-    return xfer;
+      return linux_proc_xfer_memory_partial (readbuf, writebuf,
+					     offset, len, xfered_len);
+    }
 
   return inf_ptrace_target::xfer_partial (object, annex, readbuf, writebuf,
 					  offset, len, xfered_len);
@@ -3794,35 +3798,88 @@ linux_nat_target::pid_to_exec_file (int pid)
   return linux_proc_pid_to_exec_file (pid);
 }
 
-/* Implement the to_xfer_partial target method using /proc/<pid>/mem.
-   Because we can use a single read/write call, this can be much more
-   efficient than banging away at PTRACE_PEEKTEXT.  */
+/* Keep the /proc/<pid>/mem file open between memory accesses, as a
+   cache to avoid constantly closing/opening the file in the common
+   case of multiple memory reads/writes from/to the same inferior.
+   Note we don't keep a file open per inferior to avoid keeping too
+   many file descriptors open, which can run into resource limits.  */
+static struct
+{
+  /* The LWP this open file is for.  Note that after opening the file,
+     even if the thread subsequently exits, the open file is still
+     usable for accessing memory.  It's only when the whole process
+     exits or execs that the file becomes invalid (at which point
+     reads/writes return EOF).  */
+  ptid_t ptid;
 
-static enum target_xfer_status
-linux_proc_xfer_partial (enum target_object object,
-			 const char *annex, gdb_byte *readbuf,
-			 const gdb_byte *writebuf,
-			 ULONGEST offset, LONGEST len, ULONGEST *xfered_len)
+  /* The file descriptor.  -1 if file is not open.  */
+  int fd = -1;
+
+  /* Close FD and clear it to -1.  */
+  void close ()
+  {
+    linux_nat_debug_printf ("closing fd %d for pid %ld\n",
+			    fd, ptid.lwp ());
+    ::close (fd);
+    fd = -1;
+  }
+} last_proc_mem_file;
+
+/* Close the /proc/<pid>/mem file if its LWP matches PTID.  */
+
+static void
+maybe_close_proc_mem_file (pid_t pid)
 {
-  LONGEST ret;
-  int fd;
-  char filename[64];
+  if (last_proc_mem_file.ptid.pid () == pid)
+    last_proc_mem_file.close ();
+}
 
-  if (object != TARGET_OBJECT_MEMORY)
-    return TARGET_XFER_EOF;
+/* Helper for linux_proc_xfer_memory_partial.  Accesses /proc via
+   PTID.  Returns -1 on error, with errno set.  Returns number of
+   read/written bytes otherwise.  Returns 0 on EOF, which indicates
+   the address space is gone (because the process exited or
+   execed).  */
 
-  /* Don't bother for one word.  */
-  if (len < 3 * sizeof (long))
-    return TARGET_XFER_EOF;
+static ssize_t
+linux_proc_xfer_memory_partial_pid (ptid_t ptid,
+				    gdb_byte *readbuf, const gdb_byte *writebuf,
+				    ULONGEST offset, LONGEST len)
+{
+  ssize_t ret;
 
-  /* We could keep this file open and cache it - possibly one per
-     thread.  That requires some juggling, but is even faster.  */
-  xsnprintf (filename, sizeof filename, "/proc/%ld/mem",
-	     inferior_ptid.lwp ());
-  fd = gdb_open_cloexec (filename, ((readbuf ? O_RDONLY : O_WRONLY)
-				    | O_LARGEFILE), 0);
-  if (fd == -1)
-    return TARGET_XFER_EOF;
+  /* As long as we're hitting the same inferior, the previously open
+     file is good, even if the thread it was open for exits.  */
+  if (last_proc_mem_file.fd != -1
+      && last_proc_mem_file.ptid.pid () != ptid.pid ())
+    last_proc_mem_file.close ();
+
+  if (last_proc_mem_file.fd == -1)
+    {
+      /* Actually use /proc/<pid>/task/<lwp>/mem instead of
+	 /proc/<lwp>/mem to avoid PID-reuse races, as we may be trying
+	 to read memory via a thread which we've already reaped.  If
+	 the PID is reused for the same process it's OK.  */
+      char filename[64];
+      xsnprintf (filename, sizeof filename,
+		 "/proc/%d/task/%ld/mem", ptid.pid (), ptid.lwp ());
+
+      last_proc_mem_file.fd
+	= gdb_open_cloexec (filename, O_RDWR | O_LARGEFILE, 0);
+
+      if (last_proc_mem_file.fd == -1)
+	{
+	  linux_nat_debug_printf ("opening fd for lwp %ld failed: %s (%d)\n",
+				  ptid.lwp (),
+				  safe_strerror (errno), errno);
+	  return -1;
+	}
+      last_proc_mem_file.ptid = ptid;
+
+      linux_nat_debug_printf ("opened fd %d for lwp %ld\n",
+			      last_proc_mem_file.fd, ptid.lwp ());
+    }
+
+  int fd = last_proc_mem_file.fd;
 
   /* Use pread64/pwrite64 if available, since they save a syscall and can
      handle 64-bit offsets even on 32-bit platforms (for instance, SPARC
@@ -3837,17 +3894,128 @@ linux_proc_xfer_partial (enum target_object object,
 	   : write (fd, writebuf, len));
 #endif
 
-  close (fd);
+  if (ret == -1)
+    {
+      linux_nat_debug_printf ("accessing fd %d for pid %ld failed: %s (%d)\n",
+			      fd, ptid.lwp (),
+			      safe_strerror (errno), errno);
+    }
+  else if (ret == 0)
+    {
+      linux_nat_debug_printf ("accessing fd %d for pid %ld got EOF\n",
+			      fd, ptid.lwp ());
+    }
+
+  return ret;
+}
 
-  if (ret == -1 || ret == 0)
-    return TARGET_XFER_EOF;
+/* Implement the to_xfer_partial target method using /proc/<pid>/mem.
+   Because we can use a single read/write call, this can be much more
+   efficient than banging away at PTRACE_PEEKTEXT.  Also, unlike
+   PTRACE_PEEKTEXT/PTRACE_POKETEXT, this works with running
+   threads.  */
+
+static enum target_xfer_status
+linux_proc_xfer_memory_partial (gdb_byte *readbuf, const gdb_byte *writebuf,
+				ULONGEST offset, LONGEST len,
+				ULONGEST *xfered_len)
+{
+  /* Unlike PTRACE_PEEKTEXT/PTRACE_POKETEXT, reading/writing from/to
+     /proc/<pid>/mem works with running threads, and even exited
+     threads if the file was already open.  If we need to open or
+     reopen the /proc file though, we may get an EACCES error
+     ("Permission denied"), meaning the thread is gone but its exit
+     status isn't reaped yet, or ENOENT if the thread is gone and
+     already reaped.  In that case, just try opening the file for
+     another thread in the process.  If all threads fail, then it must
+     mean the whole process exited, in which case there's nothing else
+     to do and we just fail the memory access.
+
+     Note we don't simply always access via the leader thread because
+     the leader may exit without exiting the whole process.  See
+     gdb.threads/leader-exit.exp, for example.  */
+
+  /* It's frequently the case that the selected thread is stopped, and
+     is thus not likely to exit (unless something kills the process
+     outside our control, with e.g., SIGKILL).  Give that one a try
+     first.
+
+     Also, inferior_ptid may actually point at an LWP not in lwp_list.
+     This happens when we're detaching from a fork child that we don't
+     want to debug ("set detach-on-fork on"), and the breakpoints
+     module uninstalls breakpoints from the fork child.  Which process
+     to access is given by inferior_ptid.  */
+  int res = linux_proc_xfer_memory_partial_pid (inferior_ptid,
+						readbuf, writebuf,
+						offset, len);
+  if (res == 0)
+    {
+      /* EOF means the address space is gone, the whole
+	 process exited or execed.  */
+      return TARGET_XFER_EOF;
+    }
+  else if (res != -1)
+    {
+      *xfered_len = res;
+      return TARGET_XFER_OK;
+    }
   else
     {
-      *xfered_len = ret;
+      /* If we simply raced with the thread exiting (EACCES), or the
+	 current thread is THREAD_EXITED (ENOENT), try some other
+	 thread.  It's easier to handle an ENOENT failure than check
+	 for THREAD_EXIT upfront because this function is called
+	 before a thread for inferior_ptid is added to the thread
+	 list.  */
+      if (errno != EACCES && errno != ENOENT)
+	return TARGET_XFER_EOF;
+    }
+
+  int cur_pid = current_inferior ()->pid;
+
+  if (inferior_ptid.pid () != cur_pid)
+    {
+      /* We're accessing a fork child, and the access above failed.
+	 Don't bother iterating the LWP list, since there's no other
+	 LWP for this process.  */
+      return TARGET_XFER_EOF;
+    }
+
+  /* Iterate over LWPs of the current inferior, trying to access
+     memory through one of them.  */
+  for (lwp_info *lp = lwp_list; lp != nullptr; lp = lp->next)
+    {
+      if (lp->ptid.pid () != cur_pid)
+	continue;
+
+      res = linux_proc_xfer_memory_partial_pid (lp->ptid,
+						readbuf, writebuf,
+						offset, len);
+
+      if (res == 0)
+	{
+	  /* EOF means the address space is gone, the whole process
+	     exited or execed.  */
+	  return TARGET_XFER_EOF;
+	}
+      else if (res == -1)
+	{
+	  if (errno == EACCES)
+	    {
+	      /* This LWP is gone, try another one.  */
+	      continue;
+	    }
+
+	  return TARGET_XFER_EOF;
+	}
+
+      *xfered_len = res;
       return TARGET_XFER_OK;
     }
-}
 
+  /* No luck.  */
+  return TARGET_XFER_EOF;
+}
 
 /* Parse LINE as a signal set and add its set bits to SIGS.  */
 
diff --git a/gdb/testsuite/gdb.threads/access-mem-running-thread-exit.c b/gdb/testsuite/gdb.threads/access-mem-running-thread-exit.c
new file mode 100644
index 00000000000..5f89d223065
--- /dev/null
+++ b/gdb/testsuite/gdb.threads/access-mem-running-thread-exit.c
@@ -0,0 +1,123 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2021 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#define _GNU_SOURCE
+#include <assert.h>
+#include <pthread.h>
+#include <stdlib.h>
+#include <stdio.h>
+#include <unistd.h>
+#include <string.h>
+
+#define THREADS 20
+
+static volatile unsigned int global_var = 123;
+
+/* Wrapper around pthread_create.   */
+
+static void
+create_thread (pthread_t *child,
+	       void *(*start_routine) (void *), void *arg)
+{
+  int rc;
+
+  while ((rc = pthread_create (child, NULL, start_routine, arg)) != 0)
+    {
+      fprintf (stderr, "unexpected error from pthread_create: %s (%d)\n",
+	       strerror (rc), rc);
+      sleep (1);
+    }
+}
+
+/* Data passed to threads on creation.  This is allocated on the heap
+   and ownership transferred from parent to child.  */
+
+struct thread_arg
+{
+  /* The thread's parent.  */
+  pthread_t parent;
+
+  /* Whether to call pthread_join on the parent.  */
+  int join_parent;
+};
+
+/* Entry point for threads.  */
+
+static void *
+thread_fn (void *arg)
+{
+  struct thread_arg *p = arg;
+
+  /* Passing no argument makes the thread exit immediately.  */
+  if (p == NULL)
+    return NULL;
+
+  if (p->join_parent)
+    assert (pthread_join (p->parent, NULL) == 0);
+
+  /* Spawn a number of threads that exit immediately, and then join
+     them.  The idea is to maximize the time window when we mostly
+     have threads exiting.  */
+  {
+    pthread_t child[THREADS];
+    int i;
+
+    /* Passing no argument makes the thread exit immediately.  */
+    for (i = 0; i < THREADS; i++)
+      create_thread (&child[i], thread_fn, NULL);
+
+    for (i = 0; i < THREADS; i++)
+      pthread_join (child[i], NULL);
+  }
+
+  /* Spawn a new thread that joins us, and exit.  The idea here is to
+     not have any thread that stays around forever.  */
+  {
+    pthread_t child;
+
+    p->parent = pthread_self ();
+    p->join_parent = 1;
+    create_thread (&child, thread_fn, p);
+  }
+
+  return NULL;
+}
+
+int
+main (void)
+{
+  int i;
+
+  for (i = 0; i < 4; i++)
+    {
+      struct thread_arg *p;
+      pthread_t child;
+
+      p = malloc (sizeof *p);
+      p->parent = pthread_self ();
+      /* Only join the parent once.  */
+      if (i == 0)
+	p->join_parent = 1;
+      else
+	p->join_parent = 0;
+      create_thread (&child, thread_fn, p);
+    }
+
+  /* Exit the leader to make sure that we can access memory with the
+     leader gone.  */
+  pthread_exit (NULL);
+}
diff --git a/gdb/testsuite/gdb.threads/access-mem-running-thread-exit.exp b/gdb/testsuite/gdb.threads/access-mem-running-thread-exit.exp
new file mode 100644
index 00000000000..ea228e4ba13
--- /dev/null
+++ b/gdb/testsuite/gdb.threads/access-mem-running-thread-exit.exp
@@ -0,0 +1,166 @@
+# Copyright (C) 2021 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# Test that we can access memory while all the threads of the inferior
+# are running, and even if:
+#
+# - the leader thread exits
+# - the selected thread exits
+#
+# This test constantly spawns short lived threads to make sure that on
+# systems with debug APIs that require passing down a specific thread
+# to work with (e.g., GNU/Linux ptrace and /proc filesystem), GDB
+# copes with accessing memory just while the thread it is accessing
+# memory through exits.
+#
+# The test spawns two processes and alternates memory accesses between
+# them to force flushing per-process caches.  At the time of writing,
+# the Linux backend accesses inferior memory via /proc/<pid>/mem, and
+# keeps one such file open, as a cache.  Alternating inferiors forces
+# opening such file for a different process, which fails if GDB tries
+# to open the file for a thread that exited.  The test does ensures
+# those reopen/fail code paths are exercised.
+
+standard_testfile
+
+if {[prepare_for_testing "failed to prepare" $testfile $srcfile {debug pthreads}] == -1} {
+    return -1
+}
+
+# The test proper.  NON_STOP indicates whether we're testing in
+# non-stop, or all-stop mode.
+
+proc test { non_stop } {
+    global binfile
+    global gdb_prompt
+    global GDBFLAGS
+
+    save_vars { GDBFLAGS } {
+      append GDBFLAGS " -ex \"set non-stop $non_stop\""
+      clean_restart ${binfile}
+    }
+
+    if ![runto_main] {
+	fail "cannot run to main"
+	return -1
+    }
+
+    # If debugging with target remote, check whether the all-stop variant
+    # of the RSP is being used.  If so, we can't run the background tests.
+    if {!$non_stop
+	&& [target_info exists gdb_protocol]
+	&& ([target_info gdb_protocol] == "remote"
+	    || [target_info gdb_protocol] == "extended-remote")} {
+
+	gdb_test_multiple "maint show target-non-stop" "" {
+	    -wrap -re "(is|currently) on.*" {
+	    }
+	    -wrap -re "(is|currently) off.*" {
+		unsupported "can't issue commands while target is running"
+		return 0
+	    }
+	}
+    }
+
+    delete_breakpoints
+
+    # Start the second inferior.
+    with_test_prefix "second inferior" {
+	gdb_test "add-inferior -no-connection" "New inferior 2.*"
+	gdb_test "inferior 2" "Switching to inferior 2 .*"
+
+	gdb_load $binfile
+
+	if ![runto_main] {
+	    fail "cannot run to main"
+	    return -1
+	}
+    }
+
+    delete_breakpoints
+
+    # These put too much noise in the logs.
+    gdb_test_no_output "set print thread-events off"
+
+    # Continue all threads of both processes.
+    gdb_test_no_output "set schedule-multiple on"
+    if {$non_stop == "off"} {
+	set cmd "continue &"
+    } else {
+	set cmd "continue -a &"
+    }
+    gdb_test_multiple $cmd "continuing" {
+	-re "Continuing\.\r\n$gdb_prompt " {
+	    pass $gdb_test_name
+	}
+    }
+
+    # Like gdb_test, but:
+    # - don't issue a pass on success.
+    # - on failure, clear the ok variable in the calling context, and
+    #   break it.
+    proc my_gdb_test {cmd pattern message} {
+	upvar inf inf
+	upvar iter iter
+	if {[gdb_test_multiple $cmd "access mem ($message, inf=$inf, iter=$iter)" {
+	    -wrap -re $pattern {
+	    }
+	}] != 0} {
+	    uplevel 1 {set ok 0}
+	    return -code break
+	}
+    }
+
+    # Hammer away for 5 seconds, alternating between inferiors.
+    set ::done 0
+    after 5000 { set ::done 1 }
+
+    set inf 1
+    set ok 1
+    set iter 0
+    while {!$::done && $ok} {
+	incr iter
+	verbose -log "xxxxx: iteration $iter"
+	gdb_test "info threads" ".*" ""
+
+	if {$inf == 1} {
+	    set inf 2
+	} else {
+	    set inf 1
+	}
+
+	my_gdb_test "inferior $inf" ".*" "inferior $inf"
+
+	my_gdb_test "print global_var = 555" " = 555" \
+	    "write to global_var"
+	my_gdb_test "print global_var" " = 555" \
+	    "print global_var after writing"
+	my_gdb_test "print global_var = 333" " = 333" \
+	    "write to global_var again"
+	my_gdb_test "print global_var" " = 333" \
+	    "print global_var after writing again"
+    }
+
+   if {$ok} {
+       pass "access mem"
+   }
+}
+
+foreach non_stop { "off" "on" } {
+    set stop_mode [expr ($non_stop=="off")?"all-stop":"non-stop"]
+    with_test_prefix "$stop_mode" {
+	test $non_stop
+    }
+}
diff --git a/gdb/testsuite/gdb.threads/access-mem-running.c b/gdb/testsuite/gdb.threads/access-mem-running.c
new file mode 100644
index 00000000000..91359ee863b
--- /dev/null
+++ b/gdb/testsuite/gdb.threads/access-mem-running.c
@@ -0,0 +1,80 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2021 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#define _GNU_SOURCE
+#include <assert.h>
+#include <pthread.h>
+#include <stdlib.h>
+#include <stdio.h>
+#include <unistd.h>
+#include <string.h>
+
+static pthread_barrier_t barrier;
+
+static unsigned int global_counter;
+
+static volatile unsigned int global_var = 123;
+
+static void
+maybe_stop_here ()
+{
+}
+
+static void *
+child_function (void *arg)
+{
+  global_counter = 1;
+
+  pthread_barrier_wait (&barrier);
+
+  while (global_counter > 0)
+    {
+      global_counter++;
+
+      /* Less than 1s, so the counter increments at least once while
+	 the .exp sleep 1s, but slow enough that the counter doesn't
+	 wrap in 1s.  */
+      usleep (5000);
+
+      maybe_stop_here ();
+    }
+
+  pthread_exit (NULL);
+}
+
+static int
+wait_threads (void)
+{
+  return 1;
+}
+
+int
+main (void)
+{
+  pthread_t child;
+
+  pthread_barrier_init (&barrier, NULL, 2);
+
+  pthread_create (&child, NULL, child_function, NULL);
+
+  pthread_barrier_wait (&barrier);
+  wait_threads ();
+
+  pthread_join (child, NULL);
+
+  return 0;
+}
diff --git a/gdb/testsuite/gdb.threads/access-mem-running.exp b/gdb/testsuite/gdb.threads/access-mem-running.exp
new file mode 100644
index 00000000000..fbcfa8c8633
--- /dev/null
+++ b/gdb/testsuite/gdb.threads/access-mem-running.exp
@@ -0,0 +1,132 @@
+# Copyright (C) 2021 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# Test that we can access memory while all the threads of the inferior
+# are running.
+
+standard_testfile
+
+if {[prepare_for_testing "failed to prepare" $testfile $srcfile {debug pthreads}] == -1} {
+    return -1
+}
+
+# The test proper.  NON_STOP indicates whether we're testing in
+# non-stop, or all-stop mode.
+
+proc test { non_stop } {
+    global srcfile binfile
+    global gdb_prompt
+    global GDBFLAGS
+    global decimal
+
+    save_vars { GDBFLAGS } {
+      append GDBFLAGS " -ex \"set non-stop $non_stop\""
+      clean_restart ${binfile}
+    }
+
+    if ![runto wait_threads] {
+	return -1
+    }
+
+    # If debugging with target remote, check whether the all-stop variant
+    # of the RSP is being used.  If so, we can't run the background tests.
+    if {!$non_stop
+	&& [target_info exists gdb_protocol]
+	&& ([target_info gdb_protocol] == "remote"
+	    || [target_info gdb_protocol] == "extended-remote")} {
+
+	gdb_test_multiple "maint show target-non-stop" "" {
+	    -wrap -re "(is|currently) on.*" {
+	    }
+	    -wrap -re "(is|currently) off.*" {
+		unsupported "can't issue commands while target is running"
+		return 0
+	    }
+	}
+    }
+
+    delete_breakpoints
+
+    if {$non_stop == "off"} {
+	set cmd "continue &"
+    } else {
+	set cmd "continue -a &"
+    }
+    gdb_test_multiple $cmd "continuing" {
+	-re "Continuing\.\r\n$gdb_prompt " {
+	    pass $gdb_test_name
+	}
+    }
+
+    # Check we can read/write variables, no matter which thread is
+    # selected.
+    for {set thr 1} {$thr <= 2} {incr thr} {
+	with_test_prefix "thread $thr" {
+	    gdb_test "thread $thr" "Switching to .*"
+
+	    # Check that we can read the counter variable, and that
+	    # the counter is increasing, meaning the threads really
+	    # are running.
+
+	    sleep 1
+	    set global_counter1 \
+		[get_integer_valueof "global_counter" 0 \
+		     "get global_counter once"]
+
+	    sleep 1
+	    set global_counter2 \
+		[get_integer_valueof "global_counter" 0 \
+		     "get global_counter twice"]
+
+	    gdb_assert {$global_counter1 != 0 \
+			    && $global_counter2 != 0 \
+			    && $global_counter1 != $global_counter2} \
+		"value changed"
+
+	    # Check that we can write variables.
+
+	    gdb_test "print global_var" " = 123" \
+		"print global_var before writing"
+	    gdb_test "print global_var = 321" " = 321" \
+		"write to global_var"
+	    gdb_test "print global_var" " = 321" \
+		"print global_var after writing"
+	    gdb_test "print global_var = 123" " = 123" \
+		"write to global_var again"
+	}
+    }
+
+    # Check we can set a breakpoint while all threads are running.
+    # The breakpoint should hit immediately.
+    set any "\[^\r\n\]*"
+
+    gdb_test_multiple "b maybe_stop_here" "" {
+	-re "Breakpoint $decimal at $any: file $any${srcfile}, line $decimal.\r\n$gdb_prompt " {
+	    pass $gdb_test_name
+	}
+    }
+    gdb_test_multiple "" "breakpoint hits" {
+	-re "Thread 2 ${any}hit Breakpoint $decimal, maybe_stop_here \\(\\) at $any${srcfile}:$decimal\r\n" {
+	    pass "$gdb_test_name"
+	}
+    }
+}
+
+foreach non_stop { "off" "on" } {
+    set stop_mode [expr ($non_stop=="off")?"all-stop":"non-stop"]
+    with_test_prefix "$stop_mode" {
+	test $non_stop
+    }
+}

base-commit: 873793ae09b5dcba8c8da7345ee283f296558b8e
-- 
2.26.2


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Linux: Access memory even if threads are running
  2021-06-11 17:36 [PATCH] Linux: Access memory even if threads are running Pedro Alves
@ 2021-06-11 17:55 ` Pedro Alves
  2021-06-11 18:00   ` [PATCH v2] " Pedro Alves
  2021-07-01 13:18 ` [PATCH] " Luis Machado
  1 sibling, 1 reply; 16+ messages in thread
From: Pedro Alves @ 2021-06-11 17:55 UTC (permalink / raw)
  To: gdb-patches

On 2021-06-11 6:36 p.m., Pedro Alves wrote:

> #1 - gdb.threads/access-mem-running.exp
> 
>   This is the conceptually simplest - it has two threads, and has GDB
>   read and write memory from each of the threads.  It also tests
>   setting a breakpoint while all threads are running, and checks that
>   the breakpoint is hit immediately.

Hmm, after pressing send, I realized that this testcase doesn't have to
be multithreaded -- if I make it single-threaded, then it becomes runnable
against more targets.  The multi-threading aspect it was testing is already
covered by the other testcase.  I've send a v2 with that change shortly...

Thanks,
Pedro Alves


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v2] Linux: Access memory even if threads are running
  2021-06-11 17:55 ` Pedro Alves
@ 2021-06-11 18:00   ` Pedro Alves
  2021-06-25 16:48     ` Simon Marchi
  0 siblings, 1 reply; 16+ messages in thread
From: Pedro Alves @ 2021-06-11 18:00 UTC (permalink / raw)
  To: gdb-patches

In v2:
  The access-mem-running.exp testcase is now single-threaded, and was
  moved from gdb.threads/ to gdb.base/.

Currently, on GNU/Linux, if you try to access memory and you have a
running thread selected, GDB fails the memory accesses, like:

 (gdb) c&
 Continuing.
 (gdb) p global_var
 Cannot access memory at address 0x555555558010

Or:

 (gdb) b main
 Breakpoint 2 at 0x55555555524d: file access-mem-running.c, line 59.
 Warning:
 Cannot insert breakpoint 2.
 Cannot access memory at address 0x55555555524d

This patch removes this limitation.  It teaches the native Linux
target to read/write memory even if the target is running.  And it
does this without temporarily stopping threads.  We now get:

 (gdb) c&
 Continuing.
 (gdb) p global_var
 $1 = 123
 (gdb) b main
 Breakpoint 2 at 0x555555555259: file access-mem-running.c, line 62.

(The scenarios above work correctly with current GDBserver, because
GDBserver temporarily stops all threads in the process whenever GDB
wants to access memory (see prepare_to_access_memory /
done_accessing_memory).  Freezing the whole process makes sense when
we need to be sure that we have a consistent view of memory and don't
race with the inferior changing it at the same time as GDB is
accessing it.  But I think that's a too-heavy hammer for the default
behavior.  I think that ideally, whether to stop all threads or not
should be policy decided by gdb core, probably best implemented by
exposing something like gdbserver's prepare_to_access_memory /
done_accessing_memory to gdb core.)

Currently, if we're accessing (reading/writing) just a few bytes, then
the Linux native backend does not try accessing memory via
/proc/<pid>/mem and goes straight to ptrace
PTRACE_PEEKTEXT/PTRACE_POKETEXT.  However, ptrace always fails when
the ptracee is running.  So the first step is to prefer
/proc/<pid>/mem even for small accesses.  Without further changes
however, that may cause a performance regression, due to constantly
opening and closing /proc/<pid>/mem for each memory access.  So the
next step is to keep the /proc/<pid>/mem file open across memory
accesses.  If we have this, then it doesn't make sense anymore to even
have the ptrace fallback, so the patch disables it.

I've made it such that GDB only ever has one /proc/<pid>/mem file open
at any time.  As long as a memory access hits the same inferior
process as the previous access, then we reuse the previously open
file.  If however, we access memory of a different process, then we
close the previous file and open a new one for the new process.

If we wanted, we could keep one /proc/<pid>/mem file open per
inferior, and never close them (unless the inferior exits or execs).
However, having seen bfd patches recently about hitting too many open
file descriptors, I kept the logic to have only one file open tops.
Also, we need to handle memory accesses for processes for which we
don't have an inferior object, for when we need to detach a
fork-child, and we'd probaly want to handle caching the open file for
that scenario (no inferior for process) too, which would probably end
up meaning caching for last non-inferior process, which is very much
what I'm proposing anyhow.  So always having one file open likely ends
up a smaller patch.

The next step is handling the case of GDB reading/writing memory
through a thread that is running and exits.  The access should not
result in a user-visible failure if the inferior/process is still
alive.

Once we manage to open a /proc/<lwpid>/mem file, then that file is
usable for memory accesses even if the corresponding lwp exits and is
reaped.  I double checked that trying to open the same
/proc/<lwpid>/mem path again fails because the lwp is really gone so
there's no /proc/<lwpid>/ entry on the filesystem anymore, but the
previously open file remains usable.  It's only when the whole process
execs that we need to reopen a new file.

When the kernel destroys the whole address space, i.e., when the
process exits or execs, the reads/writes fail with 0 aka EOF, in which
case there's nothing else to do than returning a memory access
failure.  Note this means that when we get an exec event, we need to
reopen the file, to access the process's new address space.

If we need to open (or reopen) the /proc/<pid>/mem file, and the LWP
we're opening it for exits before we open it and before we reap the
LWP (i.e., the LWP is zombie), the open fails with EACCES.  The patch
handles this by just looking for another thread until it finds one
that we can open a /proc/<pid>/mem successfully for.

If we need to open (or reopen) the /proc/<pid>/mem file, and the LWP
we're opening it had exited and we already reaped it, which is the
case if the selected thread is in THREAD_EXIT state, the open fails
with ENOENT.  The patch handles this the same way as a zombie race
(EACESS), instead of checking upfront whether we're accessing a
known-exited thread, because that would result in more complicated
code, because we also need to handle accessing lwps that are not
listed in the core thread list, and it's the core thread list that
records the THREAD_EXIT state.

The patch includes two testcases:

#1 - gdb.base/access-mem-running.exp

  This is the conceptually simplest - it is single-threaded, and has
  GDB read and write memory while the program is running.  It also
  tests setting a breakpoint while the program is running, and checks
  that the breakpoint is hit immediately.

#2 - gdb.threads/access-mem-running-thread-exit.exp

  This one is more elaborate, as it continuously spawns short-lived
  threads in order to exercise accessing memory just while threads are
  exiting.  It also spawns two different processes and alternates
  accessing memory between the two processes to exercise the reopening
  the /proc file frequently.  This also ends up exercising GDB reading
  from an exited thread frequently.  I confirmed by putting abort()
  calls in the EACESS/ENOENT paths added by the patch that we do hit
  all of them frequently with the testcase.  It also exits the
  process's main thread (i.e., the main thread becomes zombie), to
  make sure accessing memory in such a corner-case scenario works now
  and in the future.

The tests fail on GNU/Linux native before the code changes, and pass
after.  They pass against current GDBserver, again because GDBserver
supports memory access even if all threads are running, by
transparently pausing the whole process.

gdb/ChangeLog:
yyyy-mm-dd  Pedro Alves  <pedro@palves.net>

	* linux-nat.c (linux_nat_target::detach): Close the
	/proc/<pid>/mem file if it was open for this process.
	(linux_handle_extended_wait) <PTRACE_EVENT_EXEC>: Close the
	/proc/<pid>/mem file if it was open for this process.
	(linux_nat_target::mourn_inferior): Close the /proc/<pid>/mem file
	if it was open for this process.
	(linux_nat_target::xfer_partial): Adjust.  Do not fall back to
	inf_ptrace_target::xfer_partial for memory accesses.
	(last_proc_mem_file): New.
	(maybe_close_proc_mem_file): New.
	(linux_proc_xfer_memory_partial_pid): New, with bits factored out
	from linux_proc_xfer_partial.
	(linux_proc_xfer_partial): Delete.
	(linux_proc_xfer_memory_partial): New.

gdb/testsuite/ChangeLog
yyyy-mm-dd  Pedro Alves  <pedro@palves.net>

	* gdb.base/access-mem-running.c: New.
	* gdb.base/access-mem-running.exp: New.
	* gdb.threads/access-mem-running-thread-exit.c: New.
	* gdb.threads/access-mem-running-thread-exit.exp: New.

Change-Id: Ib3c082528872662a3fc0ca9b31c34d4876c874c9
---
 gdb/linux-nat.c                               | 262 ++++++++++++++----
 gdb/testsuite/gdb.base/access-mem-running.c   |  47 ++++
 gdb/testsuite/gdb.base/access-mem-running.exp | 124 +++++++++
 .../access-mem-running-thread-exit.c          | 123 ++++++++
 .../access-mem-running-thread-exit.exp        | 166 +++++++++++
 5 files changed, 675 insertions(+), 47 deletions(-)
 create mode 100644 gdb/testsuite/gdb.base/access-mem-running.c
 create mode 100644 gdb/testsuite/gdb.base/access-mem-running.exp
 create mode 100644 gdb/testsuite/gdb.threads/access-mem-running-thread-exit.c
 create mode 100644 gdb/testsuite/gdb.threads/access-mem-running-thread-exit.exp

diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
index 34a2aee41d7..0b4107af523 100644
--- a/gdb/linux-nat.c
+++ b/gdb/linux-nat.c
@@ -280,6 +280,8 @@ static int lwp_status_pending_p (struct lwp_info *lp);
 
 static void save_stop_reason (struct lwp_info *lp);
 
+static void maybe_close_proc_mem_file (pid_t pid);
+
 \f
 /* LWP accessors.  */
 
@@ -1486,6 +1488,8 @@ linux_nat_target::detach (inferior *inf, int from_tty)
 
       detach_success (inf);
     }
+
+  maybe_close_proc_mem_file (pid);
 }
 
 /* Resume execution of the inferior process.  If STEP is nonzero,
@@ -2025,6 +2029,10 @@ linux_handle_extended_wait (struct lwp_info *lp, int status)
     {
       linux_nat_debug_printf ("Got exec event from LWP %ld", lp->ptid.lwp ());
 
+      /* Close the /proc/<pid>/mem file if it was open for this
+	 inferior.  */
+      maybe_close_proc_mem_file (lp->ptid.pid ());
+
       ourstatus->kind = TARGET_WAITKIND_EXECD;
       ourstatus->value.execd_pathname
 	= xstrdup (linux_proc_pid_to_exec_file (pid));
@@ -3589,6 +3597,10 @@ linux_nat_target::mourn_inferior ()
 
   purge_lwp_list (pid);
 
+  /* Close the /proc/<pid>/mem file if it was open for this
+     inferior.  */
+  maybe_close_proc_mem_file (pid);
+
   if (! forks_exist_p ())
     /* Normal case, no other forks available.  */
     inf_ptrace_target::mourn_inferior ();
@@ -3681,10 +3693,8 @@ linux_nat_xfer_osdata (enum target_object object,
 		       ULONGEST *xfered_len);
 
 static enum target_xfer_status
-linux_proc_xfer_partial (enum target_object object,
-			 const char *annex, gdb_byte *readbuf,
-			 const gdb_byte *writebuf,
-			 ULONGEST offset, LONGEST len, ULONGEST *xfered_len);
+linux_proc_xfer_memory_partial (gdb_byte *readbuf, const gdb_byte *writebuf,
+				ULONGEST offset, LONGEST len, ULONGEST *xfered_len);
 
 enum target_xfer_status
 linux_nat_target::xfer_partial (enum target_object object,
@@ -3692,8 +3702,6 @@ linux_nat_target::xfer_partial (enum target_object object,
 				const gdb_byte *writebuf,
 				ULONGEST offset, ULONGEST len, ULONGEST *xfered_len)
 {
-  enum target_xfer_status xfer;
-
   if (object == TARGET_OBJECT_SIGNAL_INFO)
     return linux_xfer_siginfo (object, annex, readbuf, writebuf,
 			       offset, len, xfered_len);
@@ -3712,25 +3720,21 @@ linux_nat_target::xfer_partial (enum target_object object,
     return linux_nat_xfer_osdata (object, annex, readbuf, writebuf,
 				  offset, len, xfered_len);
 
-  /* GDB calculates all addresses in the largest possible address
-     width.
-     The address width must be masked before its final use - either by
-     linux_proc_xfer_partial or inf_ptrace_target::xfer_partial.
-
-     Compare ADDR_BIT first to avoid a compiler warning on shift overflow.  */
-
   if (object == TARGET_OBJECT_MEMORY)
     {
+      /* GDB calculates all addresses in the largest possible address
+	 width.  The address width must be masked before its final use
+	 by linux_proc_xfer_partial.
+
+	 Compare ADDR_BIT first to avoid a compiler warning on shift overflow.  */
       int addr_bit = gdbarch_addr_bit (target_gdbarch ());
 
       if (addr_bit < (sizeof (ULONGEST) * HOST_CHAR_BIT))
 	offset &= ((ULONGEST) 1 << addr_bit) - 1;
-    }
 
-  xfer = linux_proc_xfer_partial (object, annex, readbuf, writebuf,
-				  offset, len, xfered_len);
-  if (xfer != TARGET_XFER_EOF)
-    return xfer;
+      return linux_proc_xfer_memory_partial (readbuf, writebuf,
+					     offset, len, xfered_len);
+    }
 
   return inf_ptrace_target::xfer_partial (object, annex, readbuf, writebuf,
 					  offset, len, xfered_len);
@@ -3794,35 +3798,88 @@ linux_nat_target::pid_to_exec_file (int pid)
   return linux_proc_pid_to_exec_file (pid);
 }
 
-/* Implement the to_xfer_partial target method using /proc/<pid>/mem.
-   Because we can use a single read/write call, this can be much more
-   efficient than banging away at PTRACE_PEEKTEXT.  */
+/* Keep the /proc/<pid>/mem file open between memory accesses, as a
+   cache to avoid constantly closing/opening the file in the common
+   case of multiple memory reads/writes from/to the same inferior.
+   Note we don't keep a file open per inferior to avoid keeping too
+   many file descriptors open, which can run into resource limits.  */
+static struct
+{
+  /* The LWP this open file is for.  Note that after opening the file,
+     even if the thread subsequently exits, the open file is still
+     usable for accessing memory.  It's only when the whole process
+     exits or execs that the file becomes invalid (at which point
+     reads/writes return EOF).  */
+  ptid_t ptid;
 
-static enum target_xfer_status
-linux_proc_xfer_partial (enum target_object object,
-			 const char *annex, gdb_byte *readbuf,
-			 const gdb_byte *writebuf,
-			 ULONGEST offset, LONGEST len, ULONGEST *xfered_len)
+  /* The file descriptor.  -1 if file is not open.  */
+  int fd = -1;
+
+  /* Close FD and clear it to -1.  */
+  void close ()
+  {
+    linux_nat_debug_printf ("closing fd %d for pid %ld\n",
+			    fd, ptid.lwp ());
+    ::close (fd);
+    fd = -1;
+  }
+} last_proc_mem_file;
+
+/* Close the /proc/<pid>/mem file if its LWP matches PTID.  */
+
+static void
+maybe_close_proc_mem_file (pid_t pid)
 {
-  LONGEST ret;
-  int fd;
-  char filename[64];
+  if (last_proc_mem_file.ptid.pid () == pid)
+    last_proc_mem_file.close ();
+}
 
-  if (object != TARGET_OBJECT_MEMORY)
-    return TARGET_XFER_EOF;
+/* Helper for linux_proc_xfer_memory_partial.  Accesses /proc via
+   PTID.  Returns -1 on error, with errno set.  Returns number of
+   read/written bytes otherwise.  Returns 0 on EOF, which indicates
+   the address space is gone (because the process exited or
+   execed).  */
 
-  /* Don't bother for one word.  */
-  if (len < 3 * sizeof (long))
-    return TARGET_XFER_EOF;
+static ssize_t
+linux_proc_xfer_memory_partial_pid (ptid_t ptid,
+				    gdb_byte *readbuf, const gdb_byte *writebuf,
+				    ULONGEST offset, LONGEST len)
+{
+  ssize_t ret;
 
-  /* We could keep this file open and cache it - possibly one per
-     thread.  That requires some juggling, but is even faster.  */
-  xsnprintf (filename, sizeof filename, "/proc/%ld/mem",
-	     inferior_ptid.lwp ());
-  fd = gdb_open_cloexec (filename, ((readbuf ? O_RDONLY : O_WRONLY)
-				    | O_LARGEFILE), 0);
-  if (fd == -1)
-    return TARGET_XFER_EOF;
+  /* As long as we're hitting the same inferior, the previously open
+     file is good, even if the thread it was open for exits.  */
+  if (last_proc_mem_file.fd != -1
+      && last_proc_mem_file.ptid.pid () != ptid.pid ())
+    last_proc_mem_file.close ();
+
+  if (last_proc_mem_file.fd == -1)
+    {
+      /* Actually use /proc/<pid>/task/<lwp>/mem instead of
+	 /proc/<lwp>/mem to avoid PID-reuse races, as we may be trying
+	 to read memory via a thread which we've already reaped.  If
+	 the PID is reused for the same process it's OK.  */
+      char filename[64];
+      xsnprintf (filename, sizeof filename,
+		 "/proc/%d/task/%ld/mem", ptid.pid (), ptid.lwp ());
+
+      last_proc_mem_file.fd
+	= gdb_open_cloexec (filename, O_RDWR | O_LARGEFILE, 0);
+
+      if (last_proc_mem_file.fd == -1)
+	{
+	  linux_nat_debug_printf ("opening fd for lwp %ld failed: %s (%d)\n",
+				  ptid.lwp (),
+				  safe_strerror (errno), errno);
+	  return -1;
+	}
+      last_proc_mem_file.ptid = ptid;
+
+      linux_nat_debug_printf ("opened fd %d for lwp %ld\n",
+			      last_proc_mem_file.fd, ptid.lwp ());
+    }
+
+  int fd = last_proc_mem_file.fd;
 
   /* Use pread64/pwrite64 if available, since they save a syscall and can
      handle 64-bit offsets even on 32-bit platforms (for instance, SPARC
@@ -3837,17 +3894,128 @@ linux_proc_xfer_partial (enum target_object object,
 	   : write (fd, writebuf, len));
 #endif
 
-  close (fd);
+  if (ret == -1)
+    {
+      linux_nat_debug_printf ("accessing fd %d for pid %ld failed: %s (%d)\n",
+			      fd, ptid.lwp (),
+			      safe_strerror (errno), errno);
+    }
+  else if (ret == 0)
+    {
+      linux_nat_debug_printf ("accessing fd %d for pid %ld got EOF\n",
+			      fd, ptid.lwp ());
+    }
+
+  return ret;
+}
 
-  if (ret == -1 || ret == 0)
-    return TARGET_XFER_EOF;
+/* Implement the to_xfer_partial target method using /proc/<pid>/mem.
+   Because we can use a single read/write call, this can be much more
+   efficient than banging away at PTRACE_PEEKTEXT.  Also, unlike
+   PTRACE_PEEKTEXT/PTRACE_POKETEXT, this works with running
+   threads.  */
+
+static enum target_xfer_status
+linux_proc_xfer_memory_partial (gdb_byte *readbuf, const gdb_byte *writebuf,
+				ULONGEST offset, LONGEST len,
+				ULONGEST *xfered_len)
+{
+  /* Unlike PTRACE_PEEKTEXT/PTRACE_POKETEXT, reading/writing from/to
+     /proc/<pid>/mem works with running threads, and even exited
+     threads if the file was already open.  If we need to open or
+     reopen the /proc file though, we may get an EACCES error
+     ("Permission denied"), meaning the thread is gone but its exit
+     status isn't reaped yet, or ENOENT if the thread is gone and
+     already reaped.  In that case, just try opening the file for
+     another thread in the process.  If all threads fail, then it must
+     mean the whole process exited, in which case there's nothing else
+     to do and we just fail the memory access.
+
+     Note we don't simply always access via the leader thread because
+     the leader may exit without exiting the whole process.  See
+     gdb.threads/leader-exit.exp, for example.  */
+
+  /* It's frequently the case that the selected thread is stopped, and
+     is thus not likely to exit (unless something kills the process
+     outside our control, with e.g., SIGKILL).  Give that one a try
+     first.
+
+     Also, inferior_ptid may actually point at an LWP not in lwp_list.
+     This happens when we're detaching from a fork child that we don't
+     want to debug ("set detach-on-fork on"), and the breakpoints
+     module uninstalls breakpoints from the fork child.  Which process
+     to access is given by inferior_ptid.  */
+  int res = linux_proc_xfer_memory_partial_pid (inferior_ptid,
+						readbuf, writebuf,
+						offset, len);
+  if (res == 0)
+    {
+      /* EOF means the address space is gone, the whole
+	 process exited or execed.  */
+      return TARGET_XFER_EOF;
+    }
+  else if (res != -1)
+    {
+      *xfered_len = res;
+      return TARGET_XFER_OK;
+    }
   else
     {
-      *xfered_len = ret;
+      /* If we simply raced with the thread exiting (EACCES), or the
+	 current thread is THREAD_EXITED (ENOENT), try some other
+	 thread.  It's easier to handle an ENOENT failure than check
+	 for THREAD_EXIT upfront because this function is called
+	 before a thread for inferior_ptid is added to the thread
+	 list.  */
+      if (errno != EACCES && errno != ENOENT)
+	return TARGET_XFER_EOF;
+    }
+
+  int cur_pid = current_inferior ()->pid;
+
+  if (inferior_ptid.pid () != cur_pid)
+    {
+      /* We're accessing a fork child, and the access above failed.
+	 Don't bother iterating the LWP list, since there's no other
+	 LWP for this process.  */
+      return TARGET_XFER_EOF;
+    }
+
+  /* Iterate over LWPs of the current inferior, trying to access
+     memory through one of them.  */
+  for (lwp_info *lp = lwp_list; lp != nullptr; lp = lp->next)
+    {
+      if (lp->ptid.pid () != cur_pid)
+	continue;
+
+      res = linux_proc_xfer_memory_partial_pid (lp->ptid,
+						readbuf, writebuf,
+						offset, len);
+
+      if (res == 0)
+	{
+	  /* EOF means the address space is gone, the whole process
+	     exited or execed.  */
+	  return TARGET_XFER_EOF;
+	}
+      else if (res == -1)
+	{
+	  if (errno == EACCES)
+	    {
+	      /* This LWP is gone, try another one.  */
+	      continue;
+	    }
+
+	  return TARGET_XFER_EOF;
+	}
+
+      *xfered_len = res;
       return TARGET_XFER_OK;
     }
-}
 
+  /* No luck.  */
+  return TARGET_XFER_EOF;
+}
 
 /* Parse LINE as a signal set and add its set bits to SIGS.  */
 
diff --git a/gdb/testsuite/gdb.base/access-mem-running.c b/gdb/testsuite/gdb.base/access-mem-running.c
new file mode 100644
index 00000000000..2bb4f9766b5
--- /dev/null
+++ b/gdb/testsuite/gdb.base/access-mem-running.c
@@ -0,0 +1,47 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2021 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include <unistd.h>
+
+static unsigned int global_counter = 1;
+
+static volatile unsigned int global_var = 123;
+
+static void
+maybe_stop_here ()
+{
+}
+
+int
+main (void)
+{
+  global_counter = 1;
+
+  while (global_counter > 0)
+    {
+      global_counter++;
+
+      /* Less than 1s, so the counter increments at least once while
+	 the .exp sleep 1s, but slow enough that the counter doesn't
+	 wrap in 1s.  */
+      usleep (5000);
+
+      maybe_stop_here ();
+    }
+
+  return 0;
+}
diff --git a/gdb/testsuite/gdb.base/access-mem-running.exp b/gdb/testsuite/gdb.base/access-mem-running.exp
new file mode 100644
index 00000000000..6990d906da2
--- /dev/null
+++ b/gdb/testsuite/gdb.base/access-mem-running.exp
@@ -0,0 +1,124 @@
+# Copyright (C) 2021 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# Test that we can access memory while the inferior is running.
+
+standard_testfile
+
+if {[prepare_for_testing "failed to prepare" $testfile $srcfile {debug}] == -1} {
+    return -1
+}
+
+# The test proper.  NON_STOP indicates whether we're testing in
+# non-stop, or all-stop mode.
+
+proc test { non_stop } {
+    global srcfile binfile
+    global gdb_prompt
+    global GDBFLAGS
+    global decimal
+
+    save_vars { GDBFLAGS } {
+      append GDBFLAGS " -ex \"set non-stop $non_stop\""
+      clean_restart ${binfile}
+    }
+
+    if ![runto_main] {
+	return -1
+    }
+
+    # If debugging with target remote, check whether the all-stop variant
+    # of the RSP is being used.  If so, we can't run the background tests.
+    if {!$non_stop
+	&& [target_info exists gdb_protocol]
+	&& ([target_info gdb_protocol] == "remote"
+	    || [target_info gdb_protocol] == "extended-remote")} {
+
+	gdb_test_multiple "maint show target-non-stop" "" {
+	    -wrap -re "(is|currently) on.*" {
+	    }
+	    -wrap -re "(is|currently) off.*" {
+		unsupported "can't issue commands while target is running"
+		return 0
+	    }
+	}
+    }
+
+    delete_breakpoints
+
+    if {$non_stop == "off"} {
+	set cmd "continue &"
+    } else {
+	set cmd "continue -a &"
+    }
+    gdb_test_multiple $cmd "continuing" {
+	-re "Continuing\.\r\n$gdb_prompt " {
+	    pass $gdb_test_name
+	}
+    }
+
+    # Check we can read/write variables.
+
+    # Check that we can read the counter variable, and that the
+    # counter is increasing, meaning the process really is running.
+
+    sleep 1
+    set global_counter1 \
+	[get_integer_valueof "global_counter" 0 \
+	     "get global_counter once"]
+
+    sleep 1
+    set global_counter2 \
+	[get_integer_valueof "global_counter" 0 \
+	     "get global_counter twice"]
+
+    gdb_assert {$global_counter1 != 0 \
+		    && $global_counter2 != 0 \
+		    && $global_counter1 != $global_counter2} \
+	"value changed"
+
+    # Check that we can write variables.
+
+    gdb_test "print global_var" " = 123" \
+	"print global_var before writing"
+    gdb_test "print global_var = 321" " = 321" \
+	"write to global_var"
+    gdb_test "print global_var" " = 321" \
+	"print global_var after writing"
+    gdb_test "print global_var = 123" " = 123" \
+	"write to global_var again"
+
+    # Check we can set a breakpoint while the process is running.  The
+    # breakpoint should hit immediately.
+    set any "\[^\r\n\]*"
+
+    gdb_test_multiple "b maybe_stop_here" "" {
+	-re "Breakpoint $decimal at $any: file $any${srcfile}, line $decimal.\r\n$gdb_prompt " {
+	    pass $gdb_test_name
+	}
+    }
+    gdb_test_multiple "" "breakpoint hits" {
+	-re "Breakpoint $decimal, maybe_stop_here \\(\\) at $any${srcfile}:$decimal\r\n" {
+	    pass "$gdb_test_name"
+	}
+    }
+}
+
+foreach non_stop { "off" "on" } {
+    set stop_mode [expr ($non_stop=="off")?"all-stop":"non-stop"]
+    with_test_prefix "$stop_mode" {
+	test $non_stop
+    }
+}
diff --git a/gdb/testsuite/gdb.threads/access-mem-running-thread-exit.c b/gdb/testsuite/gdb.threads/access-mem-running-thread-exit.c
new file mode 100644
index 00000000000..5f89d223065
--- /dev/null
+++ b/gdb/testsuite/gdb.threads/access-mem-running-thread-exit.c
@@ -0,0 +1,123 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2021 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#define _GNU_SOURCE
+#include <assert.h>
+#include <pthread.h>
+#include <stdlib.h>
+#include <stdio.h>
+#include <unistd.h>
+#include <string.h>
+
+#define THREADS 20
+
+static volatile unsigned int global_var = 123;
+
+/* Wrapper around pthread_create.   */
+
+static void
+create_thread (pthread_t *child,
+	       void *(*start_routine) (void *), void *arg)
+{
+  int rc;
+
+  while ((rc = pthread_create (child, NULL, start_routine, arg)) != 0)
+    {
+      fprintf (stderr, "unexpected error from pthread_create: %s (%d)\n",
+	       strerror (rc), rc);
+      sleep (1);
+    }
+}
+
+/* Data passed to threads on creation.  This is allocated on the heap
+   and ownership transferred from parent to child.  */
+
+struct thread_arg
+{
+  /* The thread's parent.  */
+  pthread_t parent;
+
+  /* Whether to call pthread_join on the parent.  */
+  int join_parent;
+};
+
+/* Entry point for threads.  */
+
+static void *
+thread_fn (void *arg)
+{
+  struct thread_arg *p = arg;
+
+  /* Passing no argument makes the thread exit immediately.  */
+  if (p == NULL)
+    return NULL;
+
+  if (p->join_parent)
+    assert (pthread_join (p->parent, NULL) == 0);
+
+  /* Spawn a number of threads that exit immediately, and then join
+     them.  The idea is to maximize the time window when we mostly
+     have threads exiting.  */
+  {
+    pthread_t child[THREADS];
+    int i;
+
+    /* Passing no argument makes the thread exit immediately.  */
+    for (i = 0; i < THREADS; i++)
+      create_thread (&child[i], thread_fn, NULL);
+
+    for (i = 0; i < THREADS; i++)
+      pthread_join (child[i], NULL);
+  }
+
+  /* Spawn a new thread that joins us, and exit.  The idea here is to
+     not have any thread that stays around forever.  */
+  {
+    pthread_t child;
+
+    p->parent = pthread_self ();
+    p->join_parent = 1;
+    create_thread (&child, thread_fn, p);
+  }
+
+  return NULL;
+}
+
+int
+main (void)
+{
+  int i;
+
+  for (i = 0; i < 4; i++)
+    {
+      struct thread_arg *p;
+      pthread_t child;
+
+      p = malloc (sizeof *p);
+      p->parent = pthread_self ();
+      /* Only join the parent once.  */
+      if (i == 0)
+	p->join_parent = 1;
+      else
+	p->join_parent = 0;
+      create_thread (&child, thread_fn, p);
+    }
+
+  /* Exit the leader to make sure that we can access memory with the
+     leader gone.  */
+  pthread_exit (NULL);
+}
diff --git a/gdb/testsuite/gdb.threads/access-mem-running-thread-exit.exp b/gdb/testsuite/gdb.threads/access-mem-running-thread-exit.exp
new file mode 100644
index 00000000000..ea228e4ba13
--- /dev/null
+++ b/gdb/testsuite/gdb.threads/access-mem-running-thread-exit.exp
@@ -0,0 +1,166 @@
+# Copyright (C) 2021 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# Test that we can access memory while all the threads of the inferior
+# are running, and even if:
+#
+# - the leader thread exits
+# - the selected thread exits
+#
+# This test constantly spawns short lived threads to make sure that on
+# systems with debug APIs that require passing down a specific thread
+# to work with (e.g., GNU/Linux ptrace and /proc filesystem), GDB
+# copes with accessing memory just while the thread it is accessing
+# memory through exits.
+#
+# The test spawns two processes and alternates memory accesses between
+# them to force flushing per-process caches.  At the time of writing,
+# the Linux backend accesses inferior memory via /proc/<pid>/mem, and
+# keeps one such file open, as a cache.  Alternating inferiors forces
+# opening such file for a different process, which fails if GDB tries
+# to open the file for a thread that exited.  The test does ensures
+# those reopen/fail code paths are exercised.
+
+standard_testfile
+
+if {[prepare_for_testing "failed to prepare" $testfile $srcfile {debug pthreads}] == -1} {
+    return -1
+}
+
+# The test proper.  NON_STOP indicates whether we're testing in
+# non-stop, or all-stop mode.
+
+proc test { non_stop } {
+    global binfile
+    global gdb_prompt
+    global GDBFLAGS
+
+    save_vars { GDBFLAGS } {
+      append GDBFLAGS " -ex \"set non-stop $non_stop\""
+      clean_restart ${binfile}
+    }
+
+    if ![runto_main] {
+	fail "cannot run to main"
+	return -1
+    }
+
+    # If debugging with target remote, check whether the all-stop variant
+    # of the RSP is being used.  If so, we can't run the background tests.
+    if {!$non_stop
+	&& [target_info exists gdb_protocol]
+	&& ([target_info gdb_protocol] == "remote"
+	    || [target_info gdb_protocol] == "extended-remote")} {
+
+	gdb_test_multiple "maint show target-non-stop" "" {
+	    -wrap -re "(is|currently) on.*" {
+	    }
+	    -wrap -re "(is|currently) off.*" {
+		unsupported "can't issue commands while target is running"
+		return 0
+	    }
+	}
+    }
+
+    delete_breakpoints
+
+    # Start the second inferior.
+    with_test_prefix "second inferior" {
+	gdb_test "add-inferior -no-connection" "New inferior 2.*"
+	gdb_test "inferior 2" "Switching to inferior 2 .*"
+
+	gdb_load $binfile
+
+	if ![runto_main] {
+	    fail "cannot run to main"
+	    return -1
+	}
+    }
+
+    delete_breakpoints
+
+    # These put too much noise in the logs.
+    gdb_test_no_output "set print thread-events off"
+
+    # Continue all threads of both processes.
+    gdb_test_no_output "set schedule-multiple on"
+    if {$non_stop == "off"} {
+	set cmd "continue &"
+    } else {
+	set cmd "continue -a &"
+    }
+    gdb_test_multiple $cmd "continuing" {
+	-re "Continuing\.\r\n$gdb_prompt " {
+	    pass $gdb_test_name
+	}
+    }
+
+    # Like gdb_test, but:
+    # - don't issue a pass on success.
+    # - on failure, clear the ok variable in the calling context, and
+    #   break it.
+    proc my_gdb_test {cmd pattern message} {
+	upvar inf inf
+	upvar iter iter
+	if {[gdb_test_multiple $cmd "access mem ($message, inf=$inf, iter=$iter)" {
+	    -wrap -re $pattern {
+	    }
+	}] != 0} {
+	    uplevel 1 {set ok 0}
+	    return -code break
+	}
+    }
+
+    # Hammer away for 5 seconds, alternating between inferiors.
+    set ::done 0
+    after 5000 { set ::done 1 }
+
+    set inf 1
+    set ok 1
+    set iter 0
+    while {!$::done && $ok} {
+	incr iter
+	verbose -log "xxxxx: iteration $iter"
+	gdb_test "info threads" ".*" ""
+
+	if {$inf == 1} {
+	    set inf 2
+	} else {
+	    set inf 1
+	}
+
+	my_gdb_test "inferior $inf" ".*" "inferior $inf"
+
+	my_gdb_test "print global_var = 555" " = 555" \
+	    "write to global_var"
+	my_gdb_test "print global_var" " = 555" \
+	    "print global_var after writing"
+	my_gdb_test "print global_var = 333" " = 333" \
+	    "write to global_var again"
+	my_gdb_test "print global_var" " = 333" \
+	    "print global_var after writing again"
+    }
+
+   if {$ok} {
+       pass "access mem"
+   }
+}
+
+foreach non_stop { "off" "on" } {
+    set stop_mode [expr ($non_stop=="off")?"all-stop":"non-stop"]
+    with_test_prefix "$stop_mode" {
+	test $non_stop
+    }
+}

base-commit: 873793ae09b5dcba8c8da7345ee283f296558b8e
-- 
2.26.2


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2] Linux: Access memory even if threads are running
  2021-06-11 18:00   ` [PATCH v2] " Pedro Alves
@ 2021-06-25 16:48     ` Simon Marchi
  2021-07-01 13:07       ` Pedro Alves
  0 siblings, 1 reply; 16+ messages in thread
From: Simon Marchi @ 2021-06-25 16:48 UTC (permalink / raw)
  To: Pedro Alves, gdb-patches

Hi Pedrot,

This patch LGTM, I noted some small comments below.

On 2021-06-11 2:00 p.m., Pedro Alves wrote:
> In v2:
>   The access-mem-running.exp testcase is now single-threaded, and was
>   moved from gdb.threads/ to gdb.base/.
> 
> Currently, on GNU/Linux, if you try to access memory and you have a
> running thread selected, GDB fails the memory accesses, like:
> 
>  (gdb) c&
>  Continuing.
>  (gdb) p global_var
>  Cannot access memory at address 0x555555558010
> 
> Or:
> 
>  (gdb) b main
>  Breakpoint 2 at 0x55555555524d: file access-mem-running.c, line 59.
>  Warning:
>  Cannot insert breakpoint 2.
>  Cannot access memory at address 0x55555555524d
> 
> This patch removes this limitation.  It teaches the native Linux
> target to read/write memory even if the target is running.  And it
> does this without temporarily stopping threads.  We now get:
> 
>  (gdb) c&
>  Continuing.
>  (gdb) p global_var
>  $1 = 123
>  (gdb) b main
>  Breakpoint 2 at 0x555555555259: file access-mem-running.c, line 62.
> 
> (The scenarios above work correctly with current GDBserver, because
> GDBserver temporarily stops all threads in the process whenever GDB
> wants to access memory (see prepare_to_access_memory /
> done_accessing_memory).  Freezing the whole process makes sense when
> we need to be sure that we have a consistent view of memory and don't
> race with the inferior changing it at the same time as GDB is
> accessing it.  But I think that's a too-heavy hammer for the default
> behavior.  I think that ideally, whether to stop all threads or not
> should be policy decided by gdb core, probably best implemented by
> exposing something like gdbserver's prepare_to_access_memory /
> done_accessing_memory to gdb core.)
> 
> Currently, if we're accessing (reading/writing) just a few bytes, then
> the Linux native backend does not try accessing memory via
> /proc/<pid>/mem and goes straight to ptrace
> PTRACE_PEEKTEXT/PTRACE_POKETEXT.  However, ptrace always fails when
> the ptracee is running.  So the first step is to prefer
> /proc/<pid>/mem even for small accesses.  Without further changes
> however, that may cause a performance regression, due to constantly
> opening and closing /proc/<pid>/mem for each memory access.  So the
> next step is to keep the /proc/<pid>/mem file open across memory
> accesses.  If we have this, then it doesn't make sense anymore to even
> have the ptrace fallback, so the patch disables it.
> 
> I've made it such that GDB only ever has one /proc/<pid>/mem file open
> at any time.  As long as a memory access hits the same inferior
> process as the previous access, then we reuse the previously open
> file.  If however, we access memory of a different process, then we
> close the previous file and open a new one for the new process.
> 
> If we wanted, we could keep one /proc/<pid>/mem file open per
> inferior, and never close them (unless the inferior exits or execs).
> However, having seen bfd patches recently about hitting too many open
> file descriptors, I kept the logic to have only one file open tops.
> Also, we need to handle memory accesses for processes for which we
> don't have an inferior object, for when we need to detach a
> fork-child, and we'd probaly want to handle caching the open file for
> that scenario (no inferior for process) too, which would probably end
> up meaning caching for last non-inferior process, which is very much
> what I'm proposing anyhow.  So always having one file open likely ends
> up a smaller patch.
> 
> The next step is handling the case of GDB reading/writing memory
> through a thread that is running and exits.  The access should not
> result in a user-visible failure if the inferior/process is still
> alive.
> 
> Once we manage to open a /proc/<lwpid>/mem file, then that file is
> usable for memory accesses even if the corresponding lwp exits and is
> reaped.  I double checked that trying to open the same
> /proc/<lwpid>/mem path again fails because the lwp is really gone so
> there's no /proc/<lwpid>/ entry on the filesystem anymore, but the
> previously open file remains usable.  It's only when the whole process
> execs that we need to reopen a new file.
> 
> When the kernel destroys the whole address space, i.e., when the
> process exits or execs, the reads/writes fail with 0 aka EOF, in which
> case there's nothing else to do than returning a memory access
> failure.  Note this means that when we get an exec event, we need to
> reopen the file, to access the process's new address space.
> 
> If we need to open (or reopen) the /proc/<pid>/mem file, and the LWP
> we're opening it for exits before we open it and before we reap the
> LWP (i.e., the LWP is zombie), the open fails with EACCES.  The patch
> handles this by just looking for another thread until it finds one
> that we can open a /proc/<pid>/mem successfully for.
> 
> If we need to open (or reopen) the /proc/<pid>/mem file, and the LWP
> we're opening it had exited and we already reaped it, which is the

had -> has

> case if the selected thread is in THREAD_EXIT state, the open fails
> with ENOENT.  The patch handles this the same way as a zombie race
> (EACESS), instead of checking upfront whether we're accessing a

EACESS -> EACCES

> known-exited thread, because that would result in more complicated
> code, because we also need to handle accessing lwps that are not
> listed in the core thread list, and it's the core thread list that
> records the THREAD_EXIT state.
> 
> The patch includes two testcases:
> 
> #1 - gdb.base/access-mem-running.exp
> 
>   This is the conceptually simplest - it is single-threaded, and has
>   GDB read and write memory while the program is running.  It also
>   tests setting a breakpoint while the program is running, and checks
>   that the breakpoint is hit immediately.
> 
> #2 - gdb.threads/access-mem-running-thread-exit.exp
> 
>   This one is more elaborate, as it continuously spawns short-lived
>   threads in order to exercise accessing memory just while threads are
>   exiting.  It also spawns two different processes and alternates
>   accessing memory between the two processes to exercise the reopening
>   the /proc file frequently.  This also ends up exercising GDB reading
>   from an exited thread frequently.  I confirmed by putting abort()
>   calls in the EACESS/ENOENT paths added by the patch that we do hit

EACESS -> EACCES

> +/* Keep the /proc/<pid>/mem file open between memory accesses, as a
> +   cache to avoid constantly closing/opening the file in the common
> +   case of multiple memory reads/writes from/to the same inferior.
> +   Note we don't keep a file open per inferior to avoid keeping too
> +   many file descriptors open, which can run into resource limits.  */
> +static struct
> +{
> +  /* The LWP this open file is for.  Note that after opening the file,
> +     even if the thread subsequently exits, the open file is still
> +     usable for accessing memory.  It's only when the whole process
> +     exits or execs that the file becomes invalid (at which point
> +     reads/writes return EOF).  */
> +  ptid_t ptid;

I don't think there is, but just to be sure: is there a problem with
tid reuse here?  If we open the mem file for a given ptid, that thread
exits, and another subsequent thread uses the same ptid.  As long as the
process doesn't exit/exec, the file handle will stay valid, as you
explained.

>  
> -static enum target_xfer_status
> -linux_proc_xfer_partial (enum target_object object,
> -			 const char *annex, gdb_byte *readbuf,
> -			 const gdb_byte *writebuf,
> -			 ULONGEST offset, LONGEST len, ULONGEST *xfered_len)
> +  /* The file descriptor.  -1 if file is not open.  */
> +  int fd = -1;
> +
> +  /* Close FD and clear it to -1.  */
> +  void close ()
> +  {
> +    linux_nat_debug_printf ("closing fd %d for pid %ld\n",
> +			    fd, ptid.lwp ());

To be unambiguous and to be able to relate this to other debug messages,
I think it would be good to print both pid and lwp here.

> +static ssize_t
> +linux_proc_xfer_memory_partial_pid (ptid_t ptid,
> +				    gdb_byte *readbuf, const gdb_byte *writebuf,
> +				    ULONGEST offset, LONGEST len)
> +{
> +  ssize_t ret;
>  
> -  /* We could keep this file open and cache it - possibly one per
> -     thread.  That requires some juggling, but is even faster.  */
> -  xsnprintf (filename, sizeof filename, "/proc/%ld/mem",
> -	     inferior_ptid.lwp ());
> -  fd = gdb_open_cloexec (filename, ((readbuf ? O_RDONLY : O_WRONLY)
> -				    | O_LARGEFILE), 0);
> -  if (fd == -1)
> -    return TARGET_XFER_EOF;
> +  /* As long as we're hitting the same inferior, the previously open
> +     file is good, even if the thread it was open for exits.  */
> +  if (last_proc_mem_file.fd != -1
> +      && last_proc_mem_file.ptid.pid () != ptid.pid ())
> +    last_proc_mem_file.close ();
> +
> +  if (last_proc_mem_file.fd == -1)
> +    {
> +      /* Actually use /proc/<pid>/task/<lwp>/mem instead of
> +	 /proc/<lwp>/mem to avoid PID-reuse races, as we may be trying
> +	 to read memory via a thread which we've already reaped.  If
> +	 the PID is reused for the same process it's OK.  */
> +      char filename[64];
> +      xsnprintf (filename, sizeof filename,
> +		 "/proc/%d/task/%ld/mem", ptid.pid (), ptid.lwp ());
> +
> +      last_proc_mem_file.fd
> +	= gdb_open_cloexec (filename, O_RDWR | O_LARGEFILE, 0);
> +
> +      if (last_proc_mem_file.fd == -1)
> +	{
> +	  linux_nat_debug_printf ("opening fd for lwp %ld failed: %s (%d)\n",
> +				  ptid.lwp (),
> +				  safe_strerror (errno), errno);

Here, you might as well print the path we tried to open (`filename`):

  opening %s failed: ...

> +proc test { non_stop } {
> +    global srcfile binfile
> +    global gdb_prompt
> +    global GDBFLAGS
> +    global decimal
> +
> +    save_vars { GDBFLAGS } {
> +      append GDBFLAGS " -ex \"set non-stop $non_stop\""
> +      clean_restart ${binfile}
> +    }

I'm starting to think that clean_restart should have a switch:

  clean_restart ${binfile} -append-gdbflags "..."

Simon

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2] Linux: Access memory even if threads are running
  2021-06-25 16:48     ` Simon Marchi
@ 2021-07-01 13:07       ` Pedro Alves
  2021-07-01 13:25         ` Tom Tromey
  0 siblings, 1 reply; 16+ messages in thread
From: Pedro Alves @ 2021-07-01 13:07 UTC (permalink / raw)
  To: Simon Marchi, gdb-patches

Hi Simon,

Thanks for the review.

On 2021-06-25 5:48 p.m., Simon Marchi wrote:

>> If we need to open (or reopen) the /proc/<pid>/mem file, and the LWP
>> we're opening it had exited and we already reaped it, which is the
> 
> had -> has

I changed, though I think past perfect also worked.  I don't know why
I put the "it" in there though, I removed it now:

    If we need to open (or reopen) the /proc/<pid>/mem file, and the LWP
    we're opening has exited and we already reaped it, which is the case

> 
>> case if the selected thread is in THREAD_EXIT state, the open fails
>> with ENOENT.  The patch handles this the same way as a zombie race
>> (EACESS), instead of checking upfront whether we're accessing a
> 
> EACESS -> EACCES

Thanks, fixed throughout.

> I don't think there is, but just to be sure: is there a problem with
> tid reuse here?  If we open the mem file for a given ptid, that thread
> exits, and another subsequent thread uses the same ptid.  As long as the
> process doesn't exit/exec, the file handle will stay valid, as you
> explained.

Right, the pid is only used by the kernel when the file is opened.  Internally in the
kernel, the file holds a reference to the mm (address space).  At  read/lseek time, the
pid is not used at all.  This is all in fs/proc/base.c in the kernel, see the
proc_mem_operations callbacks.

> 
>>  
>> -static enum target_xfer_status
>> -linux_proc_xfer_partial (enum target_object object,
>> -			 const char *annex, gdb_byte *readbuf,
>> -			 const gdb_byte *writebuf,
>> -			 ULONGEST offset, LONGEST len, ULONGEST *xfered_len)
>> +  /* The file descriptor.  -1 if file is not open.  */
>> +  int fd = -1;
>> +
>> +  /* Close FD and clear it to -1.  */
>> +  void close ()
>> +  {
>> +    linux_nat_debug_printf ("closing fd %d for pid %ld\n",
>> +			    fd, ptid.lwp ());
> 
> To be unambiguous and to be able to relate this to other debug messages,
> I think it would be good to print both pid and lwp here.

I've made it print the full path here:

+    linux_nat_debug_printf ("closing fd %d for /proc/%d/task/%ld/mem\n",
+                           fd, ptid.pid (), ptid.lwp ());

>> +      if (last_proc_mem_file.fd == -1)
>> +	{
>> +	  linux_nat_debug_printf ("opening fd for lwp %ld failed: %s (%d)\n",
>> +				  ptid.lwp (),
>> +				  safe_strerror (errno), errno);
> 
> Here, you might as well print the path we tried to open (`filename`):
> 
>   opening %s failed: ...

Indeed, I've done that now, thanks:

+         linux_nat_debug_printf ("opening %s failed: %s (%d)\n",
+                                 filename, safe_strerror (errno), errno);


I've now merged the patch, as below.

BTW, I forgot to mention earlier, but the first thing I tried was to use
process_vm_readv/process_vm_writev but those turn out useless for us,
because they don't bypass memory protections, i.e., you can't write
breakpoints with them, for example.

From 05c06f318fd9a112529dfc313e6512b399a645e4 Mon Sep 17 00:00:00 2001
From: Pedro Alves <pedro@palves.net>
Date: Fri, 11 Jun 2021 17:56:32 +0100
Subject: [PATCH] Linux: Access memory even if threads are running

Currently, on GNU/Linux, if you try to access memory and you have a
running thread selected, GDB fails the memory accesses, like:

 (gdb) c&
 Continuing.
 (gdb) p global_var
 Cannot access memory at address 0x555555558010

Or:

 (gdb) b main
 Breakpoint 2 at 0x55555555524d: file access-mem-running.c, line 59.
 Warning:
 Cannot insert breakpoint 2.
 Cannot access memory at address 0x55555555524d

This patch removes this limitation.  It teaches the native Linux
target to read/write memory even if the target is running.  And it
does this without temporarily stopping threads.  We now get:

 (gdb) c&
 Continuing.
 (gdb) p global_var
 $1 = 123
 (gdb) b main
 Breakpoint 2 at 0x555555555259: file access-mem-running.c, line 62.

(The scenarios above work correctly with current GDBserver, because
GDBserver temporarily stops all threads in the process whenever GDB
wants to access memory (see prepare_to_access_memory /
done_accessing_memory).  Freezing the whole process makes sense when
we need to be sure that we have a consistent view of memory and don't
race with the inferior changing it at the same time as GDB is
accessing it.  But I think that's a too-heavy hammer for the default
behavior.  I think that ideally, whether to stop all threads or not
should be policy decided by gdb core, probably best implemented by
exposing something like gdbserver's prepare_to_access_memory /
done_accessing_memory to gdb core.)

Currently, if we're accessing (reading/writing) just a few bytes, then
the Linux native backend does not try accessing memory via
/proc/<pid>/mem and goes straight to ptrace
PTRACE_PEEKTEXT/PTRACE_POKETEXT.  However, ptrace always fails when
the ptracee is running.  So the first step is to prefer
/proc/<pid>/mem even for small accesses.  Without further changes
however, that may cause a performance regression, due to constantly
opening and closing /proc/<pid>/mem for each memory access.  So the
next step is to keep the /proc/<pid>/mem file open across memory
accesses.  If we have this, then it doesn't make sense anymore to even
have the ptrace fallback, so the patch disables it.

I've made it such that GDB only ever has one /proc/<pid>/mem file open
at any time.  As long as a memory access hits the same inferior
process as the previous access, then we reuse the previously open
file.  If however, we access memory of a different process, then we
close the previous file and open a new one for the new process.

If we wanted, we could keep one /proc/<pid>/mem file open per
inferior, and never close them (unless the inferior exits or execs).
However, having seen bfd patches recently about hitting too many open
file descriptors, I kept the logic to have only one file open tops.
Also, we need to handle memory accesses for processes for which we
don't have an inferior object, for when we need to detach a
fork-child, and we'd probaly want to handle caching the open file for
that scenario (no inferior for process) too, which would probably end
up meaning caching for last non-inferior process, which is very much
what I'm proposing anyhow.  So always having one file open likely ends
up a smaller patch.

The next step is handling the case of GDB reading/writing memory
through a thread that is running and exits.  The access should not
result in a user-visible failure if the inferior/process is still
alive.

Once we manage to open a /proc/<lwpid>/mem file, then that file is
usable for memory accesses even if the corresponding lwp exits and is
reaped.  I double checked that trying to open the same
/proc/<lwpid>/mem path again fails because the lwp is really gone so
there's no /proc/<lwpid>/ entry on the filesystem anymore, but the
previously open file remains usable.  It's only when the whole process
execs that we need to reopen a new file.

When the kernel destroys the whole address space, i.e., when the
process exits or execs, the reads/writes fail with 0 aka EOF, in which
case there's nothing else to do than returning a memory access
failure.  Note this means that when we get an exec event, we need to
reopen the file, to access the process's new address space.

If we need to open (or reopen) the /proc/<pid>/mem file, and the LWP
we're opening it for exits before we open it and before we reap the
LWP (i.e., the LWP is zombie), the open fails with EACCES.  The patch
handles this by just looking for another thread until it finds one
that we can open a /proc/<pid>/mem successfully for.

If we need to open (or reopen) the /proc/<pid>/mem file, and the LWP
we're opening has exited and we already reaped it, which is the case
if the selected thread is in THREAD_EXIT state, the open fails with
ENOENT.  The patch handles this the same way as a zombie race
(EACCES), instead of checking upfront whether we're accessing a
known-exited thread, because that would result in more complicated
code, because we also need to handle accessing lwps that are not
listed in the core thread list, and it's the core thread list that
records the THREAD_EXIT state.

The patch includes two testcases:

#1 - gdb.base/access-mem-running.exp

  This is the conceptually simplest - it is single-threaded, and has
  GDB read and write memory while the program is running.  It also
  tests setting a breakpoint while the program is running, and checks
  that the breakpoint is hit immediately.

#2 - gdb.threads/access-mem-running-thread-exit.exp

  This one is more elaborate, as it continuously spawns short-lived
  threads in order to exercise accessing memory just while threads are
  exiting.  It also spawns two different processes and alternates
  accessing memory between the two processes to exercise the reopening
  the /proc file frequently.  This also ends up exercising GDB reading
  from an exited thread frequently.  I confirmed by putting abort()
  calls in the EACCES/ENOENT paths added by the patch that we do hit
  all of them frequently with the testcase.  It also exits the
  process's main thread (i.e., the main thread becomes zombie), to
  make sure accessing memory in such a corner-case scenario works now
  and in the future.

The tests fail on GNU/Linux native before the code changes, and pass
after.  They pass against current GDBserver, again because GDBserver
supports memory access even if all threads are running, by
transparently pausing the whole process.

gdb/ChangeLog:
yyyy-mm-dd  Pedro Alves  <pedro@palves.net>

	PR mi/15729
	PR gdb/13463
	* linux-nat.c (linux_nat_target::detach): Close the
	/proc/<pid>/mem file if it was open for this process.
	(linux_handle_extended_wait) <PTRACE_EVENT_EXEC>: Close the
	/proc/<pid>/mem file if it was open for this process.
	(linux_nat_target::mourn_inferior): Close the /proc/<pid>/mem file
	if it was open for this process.
	(linux_nat_target::xfer_partial): Adjust.  Do not fall back to
	inf_ptrace_target::xfer_partial for memory accesses.
	(last_proc_mem_file): New.
	(maybe_close_proc_mem_file): New.
	(linux_proc_xfer_memory_partial_pid): New, with bits factored out
	from linux_proc_xfer_partial.
	(linux_proc_xfer_partial): Delete.
	(linux_proc_xfer_memory_partial): New.

gdb/testsuite/ChangeLog
yyyy-mm-dd  Pedro Alves  <pedro@palves.net>

	PR mi/15729
	PR gdb/13463
	* gdb.base/access-mem-running.c: New.
	* gdb.base/access-mem-running.exp: New.
	* gdb.threads/access-mem-running-thread-exit.c: New.
	* gdb.threads/access-mem-running-thread-exit.exp: New.

Change-Id: Ib3c082528872662a3fc0ca9b31c34d4876c874c9
---
 gdb/ChangeLog                                 |  19 ++
 gdb/linux-nat.c                               | 265 ++++++++++++++----
 gdb/testsuite/ChangeLog                       |   9 +
 gdb/testsuite/gdb.base/access-mem-running.c   |  47 ++++
 gdb/testsuite/gdb.base/access-mem-running.exp | 124 ++++++++
 .../access-mem-running-thread-exit.c          | 123 ++++++++
 .../access-mem-running-thread-exit.exp        | 166 +++++++++++
 7 files changed, 706 insertions(+), 47 deletions(-)
 create mode 100644 gdb/testsuite/gdb.base/access-mem-running.c
 create mode 100644 gdb/testsuite/gdb.base/access-mem-running.exp
 create mode 100644 gdb/testsuite/gdb.threads/access-mem-running-thread-exit.c
 create mode 100644 gdb/testsuite/gdb.threads/access-mem-running-thread-exit.exp

diff --git a/gdb/ChangeLog b/gdb/ChangeLog
index 699bf6fe568..d1091792b26 100644
--- a/gdb/ChangeLog
+++ b/gdb/ChangeLog
@@ -1,3 +1,22 @@
+2021-07-01  Pedro Alves  <pedro@palves.net>
+
+	PR mi/15729
+	PR gdb/13463
+	* linux-nat.c (linux_nat_target::detach): Close the
+	/proc/<pid>/mem file if it was open for this process.
+	(linux_handle_extended_wait) <PTRACE_EVENT_EXEC>: Close the
+	/proc/<pid>/mem file if it was open for this process.
+	(linux_nat_target::mourn_inferior): Close the /proc/<pid>/mem file
+	if it was open for this process.
+	(linux_nat_target::xfer_partial): Adjust.  Do not fall back to
+	inf_ptrace_target::xfer_partial for memory accesses.
+	(last_proc_mem_file): New.
+	(maybe_close_proc_mem_file): New.
+	(linux_proc_xfer_memory_partial_pid): New, with bits factored out
+	from linux_proc_xfer_partial.
+	(linux_proc_xfer_partial): Delete.
+	(linux_proc_xfer_memory_partial): New.
+
 2021-06-29  Simon Marchi  <simon.marchi@polymtl.ca>
 
 	* frame.h (FRAME_SCOPED_DEBUG_ENTER_EXIT): New.
diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
index 34a2aee41d7..f206b874929 100644
--- a/gdb/linux-nat.c
+++ b/gdb/linux-nat.c
@@ -280,6 +280,8 @@ static int lwp_status_pending_p (struct lwp_info *lp);
 
 static void save_stop_reason (struct lwp_info *lp);
 
+static void maybe_close_proc_mem_file (pid_t pid);
+
 \f
 /* LWP accessors.  */
 
@@ -1486,6 +1488,8 @@ linux_nat_target::detach (inferior *inf, int from_tty)
 
       detach_success (inf);
     }
+
+  maybe_close_proc_mem_file (pid);
 }
 
 /* Resume execution of the inferior process.  If STEP is nonzero,
@@ -2025,6 +2029,10 @@ linux_handle_extended_wait (struct lwp_info *lp, int status)
     {
       linux_nat_debug_printf ("Got exec event from LWP %ld", lp->ptid.lwp ());
 
+      /* Close the /proc/<pid>/mem file if it was open for this
+	 inferior.  */
+      maybe_close_proc_mem_file (lp->ptid.pid ());
+
       ourstatus->kind = TARGET_WAITKIND_EXECD;
       ourstatus->value.execd_pathname
 	= xstrdup (linux_proc_pid_to_exec_file (pid));
@@ -3589,6 +3597,10 @@ linux_nat_target::mourn_inferior ()
 
   purge_lwp_list (pid);
 
+  /* Close the /proc/<pid>/mem file if it was open for this
+     inferior.  */
+  maybe_close_proc_mem_file (pid);
+
   if (! forks_exist_p ())
     /* Normal case, no other forks available.  */
     inf_ptrace_target::mourn_inferior ();
@@ -3681,10 +3693,8 @@ linux_nat_xfer_osdata (enum target_object object,
 		       ULONGEST *xfered_len);
 
 static enum target_xfer_status
-linux_proc_xfer_partial (enum target_object object,
-			 const char *annex, gdb_byte *readbuf,
-			 const gdb_byte *writebuf,
-			 ULONGEST offset, LONGEST len, ULONGEST *xfered_len);
+linux_proc_xfer_memory_partial (gdb_byte *readbuf, const gdb_byte *writebuf,
+				ULONGEST offset, LONGEST len, ULONGEST *xfered_len);
 
 enum target_xfer_status
 linux_nat_target::xfer_partial (enum target_object object,
@@ -3692,8 +3702,6 @@ linux_nat_target::xfer_partial (enum target_object object,
 				const gdb_byte *writebuf,
 				ULONGEST offset, ULONGEST len, ULONGEST *xfered_len)
 {
-  enum target_xfer_status xfer;
-
   if (object == TARGET_OBJECT_SIGNAL_INFO)
     return linux_xfer_siginfo (object, annex, readbuf, writebuf,
 			       offset, len, xfered_len);
@@ -3712,25 +3720,21 @@ linux_nat_target::xfer_partial (enum target_object object,
     return linux_nat_xfer_osdata (object, annex, readbuf, writebuf,
 				  offset, len, xfered_len);
 
-  /* GDB calculates all addresses in the largest possible address
-     width.
-     The address width must be masked before its final use - either by
-     linux_proc_xfer_partial or inf_ptrace_target::xfer_partial.
-
-     Compare ADDR_BIT first to avoid a compiler warning on shift overflow.  */
-
   if (object == TARGET_OBJECT_MEMORY)
     {
+      /* GDB calculates all addresses in the largest possible address
+	 width.  The address width must be masked before its final use
+	 by linux_proc_xfer_partial.
+
+	 Compare ADDR_BIT first to avoid a compiler warning on shift overflow.  */
       int addr_bit = gdbarch_addr_bit (target_gdbarch ());
 
       if (addr_bit < (sizeof (ULONGEST) * HOST_CHAR_BIT))
 	offset &= ((ULONGEST) 1 << addr_bit) - 1;
-    }
 
-  xfer = linux_proc_xfer_partial (object, annex, readbuf, writebuf,
-				  offset, len, xfered_len);
-  if (xfer != TARGET_XFER_EOF)
-    return xfer;
+      return linux_proc_xfer_memory_partial (readbuf, writebuf,
+					     offset, len, xfered_len);
+    }
 
   return inf_ptrace_target::xfer_partial (object, annex, readbuf, writebuf,
 					  offset, len, xfered_len);
@@ -3794,35 +3798,91 @@ linux_nat_target::pid_to_exec_file (int pid)
   return linux_proc_pid_to_exec_file (pid);
 }
 
-/* Implement the to_xfer_partial target method using /proc/<pid>/mem.
-   Because we can use a single read/write call, this can be much more
-   efficient than banging away at PTRACE_PEEKTEXT.  */
+/* Keep the /proc/<pid>/mem file open between memory accesses, as a
+   cache to avoid constantly closing/opening the file in the common
+   case of multiple memory reads/writes from/to the same inferior.
+   Note we don't keep a file open per inferior to avoid keeping too
+   many file descriptors open, which can run into resource limits.  */
+static struct
+{
+  /* The LWP this open file is for.  Note that after opening the file,
+     even if the thread subsequently exits, the open file is still
+     usable for accessing memory.  It's only when the whole process
+     exits or execs that the file becomes invalid (at which point
+     reads/writes return EOF).  */
+  ptid_t ptid;
 
-static enum target_xfer_status
-linux_proc_xfer_partial (enum target_object object,
-			 const char *annex, gdb_byte *readbuf,
-			 const gdb_byte *writebuf,
-			 ULONGEST offset, LONGEST len, ULONGEST *xfered_len)
+  /* The file descriptor.  -1 if file is not open.  */
+  int fd = -1;
+
+  /* Close FD and clear it to -1.  */
+  void close ()
+  {
+    linux_nat_debug_printf ("closing fd %d for /proc/%d/task/%ld/mem\n",
+			    fd, ptid.pid (), ptid.lwp ());
+    ::close (fd);
+    fd = -1;
+  }
+} last_proc_mem_file;
+
+/* Close the /proc/<pid>/mem file if its LWP matches PTID.  */
+
+static void
+maybe_close_proc_mem_file (pid_t pid)
 {
-  LONGEST ret;
-  int fd;
-  char filename[64];
+  if (last_proc_mem_file.ptid.pid () == pid)
+    last_proc_mem_file.close ();
+}
 
-  if (object != TARGET_OBJECT_MEMORY)
-    return TARGET_XFER_EOF;
+/* Helper for linux_proc_xfer_memory_partial.  Accesses /proc via
+   PTID.  Returns -1 on error, with errno set.  Returns number of
+   read/written bytes otherwise.  Returns 0 on EOF, which indicates
+   the address space is gone (because the process exited or
+   execed).  */
 
-  /* Don't bother for one word.  */
-  if (len < 3 * sizeof (long))
-    return TARGET_XFER_EOF;
+static ssize_t
+linux_proc_xfer_memory_partial_pid (ptid_t ptid,
+				    gdb_byte *readbuf, const gdb_byte *writebuf,
+				    ULONGEST offset, LONGEST len)
+{
+  ssize_t ret;
 
-  /* We could keep this file open and cache it - possibly one per
-     thread.  That requires some juggling, but is even faster.  */
-  xsnprintf (filename, sizeof filename, "/proc/%ld/mem",
-	     inferior_ptid.lwp ());
-  fd = gdb_open_cloexec (filename, ((readbuf ? O_RDONLY : O_WRONLY)
-				    | O_LARGEFILE), 0);
-  if (fd == -1)
-    return TARGET_XFER_EOF;
+  /* As long as we're hitting the same inferior, the previously open
+     file is good, even if the thread it was open for exits.  */
+  if (last_proc_mem_file.fd != -1
+      && last_proc_mem_file.ptid.pid () != ptid.pid ())
+    last_proc_mem_file.close ();
+
+  if (last_proc_mem_file.fd == -1)
+    {
+      /* Actually use /proc/<pid>/task/<lwp>/mem instead of
+	 /proc/<lwp>/mem to avoid PID-reuse races, as we may be trying
+	 to read memory via a thread which we've already reaped.
+	 /proc/<lwp>/mem could open a file for the wrong process.  If
+	 the LWPID is reused for the same process it's OK, we can read
+	 memory through it just fine.  If the LWPID is reused for a
+	 different process, then the open will fail because the path
+	 won't exist.  */
+      char filename[64];
+      xsnprintf (filename, sizeof filename,
+		 "/proc/%d/task/%ld/mem", ptid.pid (), ptid.lwp ());
+
+      last_proc_mem_file.fd
+	= gdb_open_cloexec (filename, O_RDWR | O_LARGEFILE, 0);
+
+      if (last_proc_mem_file.fd == -1)
+	{
+	  linux_nat_debug_printf ("opening %s failed: %s (%d)\n",
+				  filename, safe_strerror (errno), errno);
+	  return -1;
+	}
+      last_proc_mem_file.ptid = ptid;
+
+      linux_nat_debug_printf ("opened fd %d for %s\n",
+			      last_proc_mem_file.fd, filename);
+    }
+
+  int fd = last_proc_mem_file.fd;
 
   /* Use pread64/pwrite64 if available, since they save a syscall and can
      handle 64-bit offsets even on 32-bit platforms (for instance, SPARC
@@ -3837,17 +3897,128 @@ linux_proc_xfer_partial (enum target_object object,
 	   : write (fd, writebuf, len));
 #endif
 
-  close (fd);
+  if (ret == -1)
+    {
+      linux_nat_debug_printf ("accessing fd %d for pid %ld failed: %s (%d)\n",
+			      fd, ptid.lwp (),
+			      safe_strerror (errno), errno);
+    }
+  else if (ret == 0)
+    {
+      linux_nat_debug_printf ("accessing fd %d for pid %ld got EOF\n",
+			      fd, ptid.lwp ());
+    }
+
+  return ret;
+}
 
-  if (ret == -1 || ret == 0)
-    return TARGET_XFER_EOF;
+/* Implement the to_xfer_partial target method using /proc/<pid>/mem.
+   Because we can use a single read/write call, this can be much more
+   efficient than banging away at PTRACE_PEEKTEXT.  Also, unlike
+   PTRACE_PEEKTEXT/PTRACE_POKETEXT, this works with running
+   threads.  */
+
+static enum target_xfer_status
+linux_proc_xfer_memory_partial (gdb_byte *readbuf, const gdb_byte *writebuf,
+				ULONGEST offset, LONGEST len,
+				ULONGEST *xfered_len)
+{
+  /* Unlike PTRACE_PEEKTEXT/PTRACE_POKETEXT, reading/writing from/to
+     /proc/<pid>/mem works with running threads, and even exited
+     threads if the file was already open.  If we need to open or
+     reopen the /proc file though, we may get an EACCES error
+     ("Permission denied"), meaning the thread is gone but its exit
+     status isn't reaped yet, or ENOENT if the thread is gone and
+     already reaped.  In that case, just try opening the file for
+     another thread in the process.  If all threads fail, then it must
+     mean the whole process exited, in which case there's nothing else
+     to do and we just fail the memory access.
+
+     Note we don't simply always access via the leader thread because
+     the leader may exit without exiting the whole process.  See
+     gdb.threads/leader-exit.exp, for example.  */
+
+  /* It's frequently the case that the selected thread is stopped, and
+     is thus not likely to exit (unless something kills the process
+     outside our control, with e.g., SIGKILL).  Give that one a try
+     first.
+
+     Also, inferior_ptid may actually point at an LWP not in lwp_list.
+     This happens when we're detaching from a fork child that we don't
+     want to debug ("set detach-on-fork on"), and the breakpoints
+     module uninstalls breakpoints from the fork child.  Which process
+     to access is given by inferior_ptid.  */
+  int res = linux_proc_xfer_memory_partial_pid (inferior_ptid,
+						readbuf, writebuf,
+						offset, len);
+  if (res == 0)
+    {
+      /* EOF means the address space is gone, the whole
+	 process exited or execed.  */
+      return TARGET_XFER_EOF;
+    }
+  else if (res != -1)
+    {
+      *xfered_len = res;
+      return TARGET_XFER_OK;
+    }
   else
     {
-      *xfered_len = ret;
+      /* If we simply raced with the thread exiting (EACCES), or the
+	 current thread is THREAD_EXITED (ENOENT), try some other
+	 thread.  It's easier to handle an ENOENT failure than check
+	 for THREAD_EXIT upfront because this function is called
+	 before a thread for inferior_ptid is added to the thread
+	 list.  */
+      if (errno != EACCES && errno != ENOENT)
+	return TARGET_XFER_EOF;
+    }
+
+  int cur_pid = current_inferior ()->pid;
+
+  if (inferior_ptid.pid () != cur_pid)
+    {
+      /* We're accessing a fork child, and the access above failed.
+	 Don't bother iterating the LWP list, since there's no other
+	 LWP for this process.  */
+      return TARGET_XFER_EOF;
+    }
+
+  /* Iterate over LWPs of the current inferior, trying to access
+     memory through one of them.  */
+  for (lwp_info *lp = lwp_list; lp != nullptr; lp = lp->next)
+    {
+      if (lp->ptid.pid () != cur_pid)
+	continue;
+
+      res = linux_proc_xfer_memory_partial_pid (lp->ptid,
+						readbuf, writebuf,
+						offset, len);
+
+      if (res == 0)
+	{
+	  /* EOF means the address space is gone, the whole process
+	     exited or execed.  */
+	  return TARGET_XFER_EOF;
+	}
+      else if (res == -1)
+	{
+	  if (errno == EACCES)
+	    {
+	      /* This LWP is gone, try another one.  */
+	      continue;
+	    }
+
+	  return TARGET_XFER_EOF;
+	}
+
+      *xfered_len = res;
       return TARGET_XFER_OK;
     }
-}
 
+  /* No luck.  */
+  return TARGET_XFER_EOF;
+}
 
 /* Parse LINE as a signal set and add its set bits to SIGS.  */
 
diff --git a/gdb/testsuite/ChangeLog b/gdb/testsuite/ChangeLog
index 38d8bbccd6c..67fa42cbe3b 100644
--- a/gdb/testsuite/ChangeLog
+++ b/gdb/testsuite/ChangeLog
@@ -1,3 +1,12 @@
+2021-07-01  Pedro Alves  <pedro@palves.net>
+
+	PR mi/15729
+	PR gdb/13463
+	* gdb.base/access-mem-running.c: New.
+	* gdb.base/access-mem-running.exp: New.
+	* gdb.threads/access-mem-running-thread-exit.c: New.
+	* gdb.threads/access-mem-running-thread-exit.exp: New.
+
 2021-06-29  Simon Marchi  <simon.marchi@polymtl.ca>
 
 	* gdb.dwarf2/dw2-reg-undefined.exp: Update regexp.
diff --git a/gdb/testsuite/gdb.base/access-mem-running.c b/gdb/testsuite/gdb.base/access-mem-running.c
new file mode 100644
index 00000000000..2bb4f9766b5
--- /dev/null
+++ b/gdb/testsuite/gdb.base/access-mem-running.c
@@ -0,0 +1,47 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2021 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include <unistd.h>
+
+static unsigned int global_counter = 1;
+
+static volatile unsigned int global_var = 123;
+
+static void
+maybe_stop_here ()
+{
+}
+
+int
+main (void)
+{
+  global_counter = 1;
+
+  while (global_counter > 0)
+    {
+      global_counter++;
+
+      /* Less than 1s, so the counter increments at least once while
+	 the .exp sleep 1s, but slow enough that the counter doesn't
+	 wrap in 1s.  */
+      usleep (5000);
+
+      maybe_stop_here ();
+    }
+
+  return 0;
+}
diff --git a/gdb/testsuite/gdb.base/access-mem-running.exp b/gdb/testsuite/gdb.base/access-mem-running.exp
new file mode 100644
index 00000000000..6990d906da2
--- /dev/null
+++ b/gdb/testsuite/gdb.base/access-mem-running.exp
@@ -0,0 +1,124 @@
+# Copyright (C) 2021 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# Test that we can access memory while the inferior is running.
+
+standard_testfile
+
+if {[prepare_for_testing "failed to prepare" $testfile $srcfile {debug}] == -1} {
+    return -1
+}
+
+# The test proper.  NON_STOP indicates whether we're testing in
+# non-stop, or all-stop mode.
+
+proc test { non_stop } {
+    global srcfile binfile
+    global gdb_prompt
+    global GDBFLAGS
+    global decimal
+
+    save_vars { GDBFLAGS } {
+      append GDBFLAGS " -ex \"set non-stop $non_stop\""
+      clean_restart ${binfile}
+    }
+
+    if ![runto_main] {
+	return -1
+    }
+
+    # If debugging with target remote, check whether the all-stop variant
+    # of the RSP is being used.  If so, we can't run the background tests.
+    if {!$non_stop
+	&& [target_info exists gdb_protocol]
+	&& ([target_info gdb_protocol] == "remote"
+	    || [target_info gdb_protocol] == "extended-remote")} {
+
+	gdb_test_multiple "maint show target-non-stop" "" {
+	    -wrap -re "(is|currently) on.*" {
+	    }
+	    -wrap -re "(is|currently) off.*" {
+		unsupported "can't issue commands while target is running"
+		return 0
+	    }
+	}
+    }
+
+    delete_breakpoints
+
+    if {$non_stop == "off"} {
+	set cmd "continue &"
+    } else {
+	set cmd "continue -a &"
+    }
+    gdb_test_multiple $cmd "continuing" {
+	-re "Continuing\.\r\n$gdb_prompt " {
+	    pass $gdb_test_name
+	}
+    }
+
+    # Check we can read/write variables.
+
+    # Check that we can read the counter variable, and that the
+    # counter is increasing, meaning the process really is running.
+
+    sleep 1
+    set global_counter1 \
+	[get_integer_valueof "global_counter" 0 \
+	     "get global_counter once"]
+
+    sleep 1
+    set global_counter2 \
+	[get_integer_valueof "global_counter" 0 \
+	     "get global_counter twice"]
+
+    gdb_assert {$global_counter1 != 0 \
+		    && $global_counter2 != 0 \
+		    && $global_counter1 != $global_counter2} \
+	"value changed"
+
+    # Check that we can write variables.
+
+    gdb_test "print global_var" " = 123" \
+	"print global_var before writing"
+    gdb_test "print global_var = 321" " = 321" \
+	"write to global_var"
+    gdb_test "print global_var" " = 321" \
+	"print global_var after writing"
+    gdb_test "print global_var = 123" " = 123" \
+	"write to global_var again"
+
+    # Check we can set a breakpoint while the process is running.  The
+    # breakpoint should hit immediately.
+    set any "\[^\r\n\]*"
+
+    gdb_test_multiple "b maybe_stop_here" "" {
+	-re "Breakpoint $decimal at $any: file $any${srcfile}, line $decimal.\r\n$gdb_prompt " {
+	    pass $gdb_test_name
+	}
+    }
+    gdb_test_multiple "" "breakpoint hits" {
+	-re "Breakpoint $decimal, maybe_stop_here \\(\\) at $any${srcfile}:$decimal\r\n" {
+	    pass "$gdb_test_name"
+	}
+    }
+}
+
+foreach non_stop { "off" "on" } {
+    set stop_mode [expr ($non_stop=="off")?"all-stop":"non-stop"]
+    with_test_prefix "$stop_mode" {
+	test $non_stop
+    }
+}
diff --git a/gdb/testsuite/gdb.threads/access-mem-running-thread-exit.c b/gdb/testsuite/gdb.threads/access-mem-running-thread-exit.c
new file mode 100644
index 00000000000..5f89d223065
--- /dev/null
+++ b/gdb/testsuite/gdb.threads/access-mem-running-thread-exit.c
@@ -0,0 +1,123 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2021 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#define _GNU_SOURCE
+#include <assert.h>
+#include <pthread.h>
+#include <stdlib.h>
+#include <stdio.h>
+#include <unistd.h>
+#include <string.h>
+
+#define THREADS 20
+
+static volatile unsigned int global_var = 123;
+
+/* Wrapper around pthread_create.   */
+
+static void
+create_thread (pthread_t *child,
+	       void *(*start_routine) (void *), void *arg)
+{
+  int rc;
+
+  while ((rc = pthread_create (child, NULL, start_routine, arg)) != 0)
+    {
+      fprintf (stderr, "unexpected error from pthread_create: %s (%d)\n",
+	       strerror (rc), rc);
+      sleep (1);
+    }
+}
+
+/* Data passed to threads on creation.  This is allocated on the heap
+   and ownership transferred from parent to child.  */
+
+struct thread_arg
+{
+  /* The thread's parent.  */
+  pthread_t parent;
+
+  /* Whether to call pthread_join on the parent.  */
+  int join_parent;
+};
+
+/* Entry point for threads.  */
+
+static void *
+thread_fn (void *arg)
+{
+  struct thread_arg *p = arg;
+
+  /* Passing no argument makes the thread exit immediately.  */
+  if (p == NULL)
+    return NULL;
+
+  if (p->join_parent)
+    assert (pthread_join (p->parent, NULL) == 0);
+
+  /* Spawn a number of threads that exit immediately, and then join
+     them.  The idea is to maximize the time window when we mostly
+     have threads exiting.  */
+  {
+    pthread_t child[THREADS];
+    int i;
+
+    /* Passing no argument makes the thread exit immediately.  */
+    for (i = 0; i < THREADS; i++)
+      create_thread (&child[i], thread_fn, NULL);
+
+    for (i = 0; i < THREADS; i++)
+      pthread_join (child[i], NULL);
+  }
+
+  /* Spawn a new thread that joins us, and exit.  The idea here is to
+     not have any thread that stays around forever.  */
+  {
+    pthread_t child;
+
+    p->parent = pthread_self ();
+    p->join_parent = 1;
+    create_thread (&child, thread_fn, p);
+  }
+
+  return NULL;
+}
+
+int
+main (void)
+{
+  int i;
+
+  for (i = 0; i < 4; i++)
+    {
+      struct thread_arg *p;
+      pthread_t child;
+
+      p = malloc (sizeof *p);
+      p->parent = pthread_self ();
+      /* Only join the parent once.  */
+      if (i == 0)
+	p->join_parent = 1;
+      else
+	p->join_parent = 0;
+      create_thread (&child, thread_fn, p);
+    }
+
+  /* Exit the leader to make sure that we can access memory with the
+     leader gone.  */
+  pthread_exit (NULL);
+}
diff --git a/gdb/testsuite/gdb.threads/access-mem-running-thread-exit.exp b/gdb/testsuite/gdb.threads/access-mem-running-thread-exit.exp
new file mode 100644
index 00000000000..ea228e4ba13
--- /dev/null
+++ b/gdb/testsuite/gdb.threads/access-mem-running-thread-exit.exp
@@ -0,0 +1,166 @@
+# Copyright (C) 2021 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# Test that we can access memory while all the threads of the inferior
+# are running, and even if:
+#
+# - the leader thread exits
+# - the selected thread exits
+#
+# This test constantly spawns short lived threads to make sure that on
+# systems with debug APIs that require passing down a specific thread
+# to work with (e.g., GNU/Linux ptrace and /proc filesystem), GDB
+# copes with accessing memory just while the thread it is accessing
+# memory through exits.
+#
+# The test spawns two processes and alternates memory accesses between
+# them to force flushing per-process caches.  At the time of writing,
+# the Linux backend accesses inferior memory via /proc/<pid>/mem, and
+# keeps one such file open, as a cache.  Alternating inferiors forces
+# opening such file for a different process, which fails if GDB tries
+# to open the file for a thread that exited.  The test does ensures
+# those reopen/fail code paths are exercised.
+
+standard_testfile
+
+if {[prepare_for_testing "failed to prepare" $testfile $srcfile {debug pthreads}] == -1} {
+    return -1
+}
+
+# The test proper.  NON_STOP indicates whether we're testing in
+# non-stop, or all-stop mode.
+
+proc test { non_stop } {
+    global binfile
+    global gdb_prompt
+    global GDBFLAGS
+
+    save_vars { GDBFLAGS } {
+      append GDBFLAGS " -ex \"set non-stop $non_stop\""
+      clean_restart ${binfile}
+    }
+
+    if ![runto_main] {
+	fail "cannot run to main"
+	return -1
+    }
+
+    # If debugging with target remote, check whether the all-stop variant
+    # of the RSP is being used.  If so, we can't run the background tests.
+    if {!$non_stop
+	&& [target_info exists gdb_protocol]
+	&& ([target_info gdb_protocol] == "remote"
+	    || [target_info gdb_protocol] == "extended-remote")} {
+
+	gdb_test_multiple "maint show target-non-stop" "" {
+	    -wrap -re "(is|currently) on.*" {
+	    }
+	    -wrap -re "(is|currently) off.*" {
+		unsupported "can't issue commands while target is running"
+		return 0
+	    }
+	}
+    }
+
+    delete_breakpoints
+
+    # Start the second inferior.
+    with_test_prefix "second inferior" {
+	gdb_test "add-inferior -no-connection" "New inferior 2.*"
+	gdb_test "inferior 2" "Switching to inferior 2 .*"
+
+	gdb_load $binfile
+
+	if ![runto_main] {
+	    fail "cannot run to main"
+	    return -1
+	}
+    }
+
+    delete_breakpoints
+
+    # These put too much noise in the logs.
+    gdb_test_no_output "set print thread-events off"
+
+    # Continue all threads of both processes.
+    gdb_test_no_output "set schedule-multiple on"
+    if {$non_stop == "off"} {
+	set cmd "continue &"
+    } else {
+	set cmd "continue -a &"
+    }
+    gdb_test_multiple $cmd "continuing" {
+	-re "Continuing\.\r\n$gdb_prompt " {
+	    pass $gdb_test_name
+	}
+    }
+
+    # Like gdb_test, but:
+    # - don't issue a pass on success.
+    # - on failure, clear the ok variable in the calling context, and
+    #   break it.
+    proc my_gdb_test {cmd pattern message} {
+	upvar inf inf
+	upvar iter iter
+	if {[gdb_test_multiple $cmd "access mem ($message, inf=$inf, iter=$iter)" {
+	    -wrap -re $pattern {
+	    }
+	}] != 0} {
+	    uplevel 1 {set ok 0}
+	    return -code break
+	}
+    }
+
+    # Hammer away for 5 seconds, alternating between inferiors.
+    set ::done 0
+    after 5000 { set ::done 1 }
+
+    set inf 1
+    set ok 1
+    set iter 0
+    while {!$::done && $ok} {
+	incr iter
+	verbose -log "xxxxx: iteration $iter"
+	gdb_test "info threads" ".*" ""
+
+	if {$inf == 1} {
+	    set inf 2
+	} else {
+	    set inf 1
+	}
+
+	my_gdb_test "inferior $inf" ".*" "inferior $inf"
+
+	my_gdb_test "print global_var = 555" " = 555" \
+	    "write to global_var"
+	my_gdb_test "print global_var" " = 555" \
+	    "print global_var after writing"
+	my_gdb_test "print global_var = 333" " = 333" \
+	    "write to global_var again"
+	my_gdb_test "print global_var" " = 333" \
+	    "print global_var after writing again"
+    }
+
+   if {$ok} {
+       pass "access mem"
+   }
+}
+
+foreach non_stop { "off" "on" } {
+    set stop_mode [expr ($non_stop=="off")?"all-stop":"non-stop"]
+    with_test_prefix "$stop_mode" {
+	test $non_stop
+    }
+}

base-commit: 75a2da57a1bbff8686f56a43aabe1d7e55147894
-- 
2.26.2


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Linux: Access memory even if threads are running
  2021-06-11 17:36 [PATCH] Linux: Access memory even if threads are running Pedro Alves
  2021-06-11 17:55 ` Pedro Alves
@ 2021-07-01 13:18 ` Luis Machado
  2021-07-01 14:34   ` Pedro Alves
  1 sibling, 1 reply; 16+ messages in thread
From: Luis Machado @ 2021-07-01 13:18 UTC (permalink / raw)
  To: Pedro Alves, gdb-patches

Out of curiosity... Why not attempt to lift the restriction of PTRACE 
not being able to read/write memory of running threads? Sounds like that 
would've been useful.

On 6/11/21 2:36 PM, Pedro Alves wrote:
> Currently, on GNU/Linux, if you try to access memory and you have a
> running thread selected, GDB fails the memory accesses, like:
> 
>   (gdb) c&
>   Continuing.
>   (gdb) p global_var
>   Cannot access memory at address 0x555555558010
> 
> Or:
> 
>   (gdb) b main
>   Breakpoint 2 at 0x55555555524d: file access-mem-running.c, line 59.
>   Warning:
>   Cannot insert breakpoint 2.
>   Cannot access memory at address 0x55555555524d
> 
> This patch removes this limitation.  It teaches the native Linux
> target to read/write memory even if the target is running.  And it
> does this without temporarily stopping threads.  We now get:
> 
>   (gdb) c&
>   Continuing.
>   (gdb) p global_var
>   $1 = 123
>   (gdb) b main
>   Breakpoint 2 at 0x555555555259: file access-mem-running.c, line 62.
> 
> (The scenarios above work correctly with current GDBserver, because
> GDBserver temporarily stops all threads in the process whenever GDB
> wants to access memory (see prepare_to_access_memory /
> done_accessing_memory).  Freezing the whole process makes sense when
> we need to be sure that we have a consistent view of memory and don't
> race with the inferior changing it at the same time as GDB is
> accessing it.  But I think that's a too-heavy hammer for the default
> behavior.  I think that ideally, whether to stop all threads or not
> should be policy decided by gdb core, probably best implemented by
> exposing something like gdbserver's prepare_to_access_memory /
> done_accessing_memory to gdb core.)
> 
> Currently, if we're accessing (reading/writing) just a few bytes, then
> the Linux native backend does not try accessing memory via
> /proc/<pid>/mem and goes straight to ptrace
> PTRACE_PEEKTEXT/PTRACE_POKETEXT.  However, ptrace always fails when
> the ptracee is running.  So the first step is to prefer
> /proc/<pid>/mem even for small accesses.  Without further changes
> however, that may cause a performance regression, due to constantly
> opening and closing /proc/<pid>/mem for each memory access.  So the
> next step is to keep the /proc/<pid>/mem file open across memory
> accesses.  If we have this, then it doesn't make sense anymore to even
> have the ptrace fallback, so the patch disables it.
> 
> I've made it such that GDB only ever has one /proc/<pid>/mem file open
> at any time.  As long as a memory access hits the same inferior
> process as the previous access, then we reuse the previously open
> file.  If however, we access memory of a different process, then we
> close the previous file and open a new one for the new process.
> 
> If we wanted, we could keep one /proc/<pid>/mem file open per
> inferior, and never close them (unless the inferior exits or execs).
> However, having seen bfd patches recently about hitting too many open
> file descriptors, I kept the logic to have only one file open tops.
> Also, we need to handle memory accesses for processes for which we
> don't have an inferior object, for when we need to detach a
> fork-child, and we'd probaly want to handle caching the open file for
> that scenario (no inferior for process) too, which would probably end
> up meaning caching for last non-inferior process, which is very much
> what I'm proposing anyhow.  So always having one file open likely ends
> up a smaller patch.
> 
> The next step is handling the case of GDB reading/writing memory
> through a thread that is running and exits.  The access should not
> result in a user-visible failure if the inferior/process is still
> alive.
> 
> Once we manage to open a /proc/<lwpid>/mem file, then that file is
> usable for memory accesses even if the corresponding lwp exits and is
> reaped.  I double checked that trying to open the same
> /proc/<lwpid>/mem path again fails because the lwp is really gone so
> there's no /proc/<lwpid>/ entry on the filesystem anymore, but the
> previously open file remains usable.  It's only when the whole process
> execs that we need to reopen a new file.
> 
> When the kernel destroys the whole address space, i.e., when the
> process exits or execs, the reads/writes fail with 0 aka EOF, in which
> case there's nothing else to do than returning a memory access
> failure.  Note this means that when we get an exec event, we need to
> reopen the file, to access the process's new address space.
> 
> If we need to open (or reopen) the /proc/<pid>/mem file, and the LWP
> we're opening it for exits before we open it and before we reap the
> LWP (i.e., the LWP is zombie), the open fails with EACCES.  The patch
> handles this by just looking for another thread until it finds one
> that we can open a /proc/<pid>/mem successfully for.
> 
> If we need to open (or reopen) the /proc/<pid>/mem file, and the LWP
> we're opening it had exited and we already reaped it, which is the
> case if the selected thread is in THREAD_EXIT state, the open fails
> with ENOENT.  The patch handles this the same way as a zombie race
> (EACESS), instead of checking upfront whether we're accessing a
> known-exited thread, because that would result in more complicated
> code, because we also need to handle accessing lwps that are not
> listed in the core thread list, and it's the core thread list that
> records the THREAD_EXIT state.
> 
> The patch includes two testcases:
> 
> #1 - gdb.threads/access-mem-running.exp
> 
>    This is the conceptually simplest - it has two threads, and has GDB
>    read and write memory from each of the threads.  It also tests
>    setting a breakpoint while all threads are running, and checks that
>    the breakpoint is hit immediately.
> 
> #2 - gdb.threads/access-mem-running-thread-exit.exp
> 
>    This one is more elaborate, as it continuously spawns short-lived
>    threads in order to exercise accessing memory just while threads are
>    exiting.  It also spawns two different processes and alternates
>    accessing memory between the two processes to exercise the reopening
>    the /proc file frequently.  This also ends up exercising GDB reading
>    from an exited thread frequently.  I confirmed by putting abort()
>    calls in the EACESS/ENOENT paths added by the patch that we do hit
>    all of them frequently with the testcase.  It also exits the
>    process's main thread (i.e., the main thread becomes zombie), to
>    make sure accessing memory in such a corner-case scenario works now
>    and in the future.
> 
> The tests fail on GNU/Linux native before the code changes, and pass
> after.  They pass against current GDBserver, again because GDBserver
> supports memory access even if all threads are running, by
> transparently pausing the whole process.
> 
> gdb/ChangeLog:
> yyyy-mm-dd  Pedro Alves  <pedro@palves.net>
> 
> 	* linux-nat.c (linux_nat_target::detach): Close the
> 	/proc/<pid>/mem file if it was open for this process.
> 	(linux_handle_extended_wait) <PTRACE_EVENT_EXEC>: Close the
> 	/proc/<pid>/mem file if it was open for this process.
> 	(linux_nat_target::mourn_inferior): Close the /proc/<pid>/mem file
> 	if it was open for this process.
> 	(linux_nat_target::xfer_partial): Adjust.  Do not fall back to
> 	inf_ptrace_target::xfer_partial for memory accesses.
> 	(last_proc_mem_file): New.
> 	(maybe_close_proc_mem_file): New.
> 	(linux_proc_xfer_memory_partial_pid): New, with bits factored out
> 	from linux_proc_xfer_partial.
> 	(linux_proc_xfer_partial): Delete.
> 	(linux_proc_xfer_memory_partial): New.
> 
> gdb/testsuite/ChangeLog
> yyyy-mm-dd  Pedro Alves  <pedro@palves.net>
> 
> 	* gdb.threads/access-mem-running.c: New.
> 	* gdb.threads/access-mem-running.exp: New.
> 	* gdb.threads/access-mem-running-thread-exit.c: New.
> 	* gdb.threads/access-mem-running-thread-exit.exp: New.
> 
> Change-Id: Ib3c082528872662a3fc0ca9b31c34d4876c874c9
> ---
>   gdb/linux-nat.c                               | 262 ++++++++++++++----
>   .../access-mem-running-thread-exit.c          | 123 ++++++++
>   .../access-mem-running-thread-exit.exp        | 166 +++++++++++
>   .../gdb.threads/access-mem-running.c          |  80 ++++++
>   .../gdb.threads/access-mem-running.exp        | 132 +++++++++
>   5 files changed, 716 insertions(+), 47 deletions(-)
>   create mode 100644 gdb/testsuite/gdb.threads/access-mem-running-thread-exit.c
>   create mode 100644 gdb/testsuite/gdb.threads/access-mem-running-thread-exit.exp
>   create mode 100644 gdb/testsuite/gdb.threads/access-mem-running.c
>   create mode 100644 gdb/testsuite/gdb.threads/access-mem-running.exp
> 
> diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
> index 34a2aee41d7..0b4107af523 100644
> --- a/gdb/linux-nat.c
> +++ b/gdb/linux-nat.c
> @@ -280,6 +280,8 @@ static int lwp_status_pending_p (struct lwp_info *lp);
>   
>   static void save_stop_reason (struct lwp_info *lp);
>   
> +static void maybe_close_proc_mem_file (pid_t pid);
> +
>   \f
>   /* LWP accessors.  */
>   
> @@ -1486,6 +1488,8 @@ linux_nat_target::detach (inferior *inf, int from_tty)
>   
>         detach_success (inf);
>       }
> +
> +  maybe_close_proc_mem_file (pid);
>   }
>   
>   /* Resume execution of the inferior process.  If STEP is nonzero,
> @@ -2025,6 +2029,10 @@ linux_handle_extended_wait (struct lwp_info *lp, int status)
>       {
>         linux_nat_debug_printf ("Got exec event from LWP %ld", lp->ptid.lwp ());
>   
> +      /* Close the /proc/<pid>/mem file if it was open for this
> +	 inferior.  */
> +      maybe_close_proc_mem_file (lp->ptid.pid ());
> +
>         ourstatus->kind = TARGET_WAITKIND_EXECD;
>         ourstatus->value.execd_pathname
>   	= xstrdup (linux_proc_pid_to_exec_file (pid));
> @@ -3589,6 +3597,10 @@ linux_nat_target::mourn_inferior ()
>   
>     purge_lwp_list (pid);
>   
> +  /* Close the /proc/<pid>/mem file if it was open for this
> +     inferior.  */
> +  maybe_close_proc_mem_file (pid);
> +
>     if (! forks_exist_p ())
>       /* Normal case, no other forks available.  */
>       inf_ptrace_target::mourn_inferior ();
> @@ -3681,10 +3693,8 @@ linux_nat_xfer_osdata (enum target_object object,
>   		       ULONGEST *xfered_len);
>   
>   static enum target_xfer_status
> -linux_proc_xfer_partial (enum target_object object,
> -			 const char *annex, gdb_byte *readbuf,
> -			 const gdb_byte *writebuf,
> -			 ULONGEST offset, LONGEST len, ULONGEST *xfered_len);
> +linux_proc_xfer_memory_partial (gdb_byte *readbuf, const gdb_byte *writebuf,
> +				ULONGEST offset, LONGEST len, ULONGEST *xfered_len);
>   
>   enum target_xfer_status
>   linux_nat_target::xfer_partial (enum target_object object,
> @@ -3692,8 +3702,6 @@ linux_nat_target::xfer_partial (enum target_object object,
>   				const gdb_byte *writebuf,
>   				ULONGEST offset, ULONGEST len, ULONGEST *xfered_len)
>   {
> -  enum target_xfer_status xfer;
> -
>     if (object == TARGET_OBJECT_SIGNAL_INFO)
>       return linux_xfer_siginfo (object, annex, readbuf, writebuf,
>   			       offset, len, xfered_len);
> @@ -3712,25 +3720,21 @@ linux_nat_target::xfer_partial (enum target_object object,
>       return linux_nat_xfer_osdata (object, annex, readbuf, writebuf,
>   				  offset, len, xfered_len);
>   
> -  /* GDB calculates all addresses in the largest possible address
> -     width.
> -     The address width must be masked before its final use - either by
> -     linux_proc_xfer_partial or inf_ptrace_target::xfer_partial.
> -
> -     Compare ADDR_BIT first to avoid a compiler warning on shift overflow.  */
> -
>     if (object == TARGET_OBJECT_MEMORY)
>       {
> +      /* GDB calculates all addresses in the largest possible address
> +	 width.  The address width must be masked before its final use
> +	 by linux_proc_xfer_partial.
> +
> +	 Compare ADDR_BIT first to avoid a compiler warning on shift overflow.  */
>         int addr_bit = gdbarch_addr_bit (target_gdbarch ());
>   
>         if (addr_bit < (sizeof (ULONGEST) * HOST_CHAR_BIT))
>   	offset &= ((ULONGEST) 1 << addr_bit) - 1;
> -    }
>   
> -  xfer = linux_proc_xfer_partial (object, annex, readbuf, writebuf,
> -				  offset, len, xfered_len);
> -  if (xfer != TARGET_XFER_EOF)
> -    return xfer;
> +      return linux_proc_xfer_memory_partial (readbuf, writebuf,
> +					     offset, len, xfered_len);
> +    }
>   
>     return inf_ptrace_target::xfer_partial (object, annex, readbuf, writebuf,
>   					  offset, len, xfered_len);
> @@ -3794,35 +3798,88 @@ linux_nat_target::pid_to_exec_file (int pid)
>     return linux_proc_pid_to_exec_file (pid);
>   }
>   
> -/* Implement the to_xfer_partial target method using /proc/<pid>/mem.
> -   Because we can use a single read/write call, this can be much more
> -   efficient than banging away at PTRACE_PEEKTEXT.  */
> +/* Keep the /proc/<pid>/mem file open between memory accesses, as a
> +   cache to avoid constantly closing/opening the file in the common
> +   case of multiple memory reads/writes from/to the same inferior.
> +   Note we don't keep a file open per inferior to avoid keeping too
> +   many file descriptors open, which can run into resource limits.  */
> +static struct
> +{
> +  /* The LWP this open file is for.  Note that after opening the file,
> +     even if the thread subsequently exits, the open file is still
> +     usable for accessing memory.  It's only when the whole process
> +     exits or execs that the file becomes invalid (at which point
> +     reads/writes return EOF).  */
> +  ptid_t ptid;
>   
> -static enum target_xfer_status
> -linux_proc_xfer_partial (enum target_object object,
> -			 const char *annex, gdb_byte *readbuf,
> -			 const gdb_byte *writebuf,
> -			 ULONGEST offset, LONGEST len, ULONGEST *xfered_len)
> +  /* The file descriptor.  -1 if file is not open.  */
> +  int fd = -1;
> +
> +  /* Close FD and clear it to -1.  */
> +  void close ()
> +  {
> +    linux_nat_debug_printf ("closing fd %d for pid %ld\n",
> +			    fd, ptid.lwp ());
> +    ::close (fd);
> +    fd = -1;
> +  }
> +} last_proc_mem_file;
> +
> +/* Close the /proc/<pid>/mem file if its LWP matches PTID.  */
> +
> +static void
> +maybe_close_proc_mem_file (pid_t pid)
>   {
> -  LONGEST ret;
> -  int fd;
> -  char filename[64];
> +  if (last_proc_mem_file.ptid.pid () == pid)
> +    last_proc_mem_file.close ();
> +}
>   
> -  if (object != TARGET_OBJECT_MEMORY)
> -    return TARGET_XFER_EOF;
> +/* Helper for linux_proc_xfer_memory_partial.  Accesses /proc via
> +   PTID.  Returns -1 on error, with errno set.  Returns number of
> +   read/written bytes otherwise.  Returns 0 on EOF, which indicates
> +   the address space is gone (because the process exited or
> +   execed).  */
>   
> -  /* Don't bother for one word.  */
> -  if (len < 3 * sizeof (long))
> -    return TARGET_XFER_EOF;
> +static ssize_t
> +linux_proc_xfer_memory_partial_pid (ptid_t ptid,
> +				    gdb_byte *readbuf, const gdb_byte *writebuf,
> +				    ULONGEST offset, LONGEST len)
> +{
> +  ssize_t ret;
>   
> -  /* We could keep this file open and cache it - possibly one per
> -     thread.  That requires some juggling, but is even faster.  */
> -  xsnprintf (filename, sizeof filename, "/proc/%ld/mem",
> -	     inferior_ptid.lwp ());
> -  fd = gdb_open_cloexec (filename, ((readbuf ? O_RDONLY : O_WRONLY)
> -				    | O_LARGEFILE), 0);
> -  if (fd == -1)
> -    return TARGET_XFER_EOF;
> +  /* As long as we're hitting the same inferior, the previously open
> +     file is good, even if the thread it was open for exits.  */
> +  if (last_proc_mem_file.fd != -1
> +      && last_proc_mem_file.ptid.pid () != ptid.pid ())
> +    last_proc_mem_file.close ();
> +
> +  if (last_proc_mem_file.fd == -1)
> +    {
> +      /* Actually use /proc/<pid>/task/<lwp>/mem instead of
> +	 /proc/<lwp>/mem to avoid PID-reuse races, as we may be trying
> +	 to read memory via a thread which we've already reaped.  If
> +	 the PID is reused for the same process it's OK.  */
> +      char filename[64];
> +      xsnprintf (filename, sizeof filename,
> +		 "/proc/%d/task/%ld/mem", ptid.pid (), ptid.lwp ());
> +
> +      last_proc_mem_file.fd
> +	= gdb_open_cloexec (filename, O_RDWR | O_LARGEFILE, 0);
> +
> +      if (last_proc_mem_file.fd == -1)
> +	{
> +	  linux_nat_debug_printf ("opening fd for lwp %ld failed: %s (%d)\n",
> +				  ptid.lwp (),
> +				  safe_strerror (errno), errno);
> +	  return -1;
> +	}
> +      last_proc_mem_file.ptid = ptid;
> +
> +      linux_nat_debug_printf ("opened fd %d for lwp %ld\n",
> +			      last_proc_mem_file.fd, ptid.lwp ());
> +    }
> +
> +  int fd = last_proc_mem_file.fd;
>   
>     /* Use pread64/pwrite64 if available, since they save a syscall and can
>        handle 64-bit offsets even on 32-bit platforms (for instance, SPARC
> @@ -3837,17 +3894,128 @@ linux_proc_xfer_partial (enum target_object object,
>   	   : write (fd, writebuf, len));
>   #endif
>   
> -  close (fd);
> +  if (ret == -1)
> +    {
> +      linux_nat_debug_printf ("accessing fd %d for pid %ld failed: %s (%d)\n",
> +			      fd, ptid.lwp (),
> +			      safe_strerror (errno), errno);
> +    }
> +  else if (ret == 0)
> +    {
> +      linux_nat_debug_printf ("accessing fd %d for pid %ld got EOF\n",
> +			      fd, ptid.lwp ());
> +    }
> +
> +  return ret;
> +}
>   
> -  if (ret == -1 || ret == 0)
> -    return TARGET_XFER_EOF;
> +/* Implement the to_xfer_partial target method using /proc/<pid>/mem.
> +   Because we can use a single read/write call, this can be much more
> +   efficient than banging away at PTRACE_PEEKTEXT.  Also, unlike
> +   PTRACE_PEEKTEXT/PTRACE_POKETEXT, this works with running
> +   threads.  */
> +
> +static enum target_xfer_status
> +linux_proc_xfer_memory_partial (gdb_byte *readbuf, const gdb_byte *writebuf,
> +				ULONGEST offset, LONGEST len,
> +				ULONGEST *xfered_len)
> +{
> +  /* Unlike PTRACE_PEEKTEXT/PTRACE_POKETEXT, reading/writing from/to
> +     /proc/<pid>/mem works with running threads, and even exited
> +     threads if the file was already open.  If we need to open or
> +     reopen the /proc file though, we may get an EACCES error
> +     ("Permission denied"), meaning the thread is gone but its exit
> +     status isn't reaped yet, or ENOENT if the thread is gone and
> +     already reaped.  In that case, just try opening the file for
> +     another thread in the process.  If all threads fail, then it must
> +     mean the whole process exited, in which case there's nothing else
> +     to do and we just fail the memory access.
> +
> +     Note we don't simply always access via the leader thread because
> +     the leader may exit without exiting the whole process.  See
> +     gdb.threads/leader-exit.exp, for example.  */
> +
> +  /* It's frequently the case that the selected thread is stopped, and
> +     is thus not likely to exit (unless something kills the process
> +     outside our control, with e.g., SIGKILL).  Give that one a try
> +     first.
> +
> +     Also, inferior_ptid may actually point at an LWP not in lwp_list.
> +     This happens when we're detaching from a fork child that we don't
> +     want to debug ("set detach-on-fork on"), and the breakpoints
> +     module uninstalls breakpoints from the fork child.  Which process
> +     to access is given by inferior_ptid.  */
> +  int res = linux_proc_xfer_memory_partial_pid (inferior_ptid,
> +						readbuf, writebuf,
> +						offset, len);
> +  if (res == 0)
> +    {
> +      /* EOF means the address space is gone, the whole
> +	 process exited or execed.  */
> +      return TARGET_XFER_EOF;
> +    }
> +  else if (res != -1)
> +    {
> +      *xfered_len = res;
> +      return TARGET_XFER_OK;
> +    }
>     else
>       {
> -      *xfered_len = ret;
> +      /* If we simply raced with the thread exiting (EACCES), or the
> +	 current thread is THREAD_EXITED (ENOENT), try some other
> +	 thread.  It's easier to handle an ENOENT failure than check
> +	 for THREAD_EXIT upfront because this function is called
> +	 before a thread for inferior_ptid is added to the thread
> +	 list.  */
> +      if (errno != EACCES && errno != ENOENT)
> +	return TARGET_XFER_EOF;
> +    }
> +
> +  int cur_pid = current_inferior ()->pid;
> +
> +  if (inferior_ptid.pid () != cur_pid)
> +    {
> +      /* We're accessing a fork child, and the access above failed.
> +	 Don't bother iterating the LWP list, since there's no other
> +	 LWP for this process.  */
> +      return TARGET_XFER_EOF;
> +    }
> +
> +  /* Iterate over LWPs of the current inferior, trying to access
> +     memory through one of them.  */
> +  for (lwp_info *lp = lwp_list; lp != nullptr; lp = lp->next)
> +    {
> +      if (lp->ptid.pid () != cur_pid)
> +	continue;
> +
> +      res = linux_proc_xfer_memory_partial_pid (lp->ptid,
> +						readbuf, writebuf,
> +						offset, len);
> +
> +      if (res == 0)
> +	{
> +	  /* EOF means the address space is gone, the whole process
> +	     exited or execed.  */
> +	  return TARGET_XFER_EOF;
> +	}
> +      else if (res == -1)
> +	{
> +	  if (errno == EACCES)
> +	    {
> +	      /* This LWP is gone, try another one.  */
> +	      continue;
> +	    }
> +
> +	  return TARGET_XFER_EOF;
> +	}
> +
> +      *xfered_len = res;
>         return TARGET_XFER_OK;
>       }
> -}
>   
> +  /* No luck.  */
> +  return TARGET_XFER_EOF;
> +}
>   
>   /* Parse LINE as a signal set and add its set bits to SIGS.  */
>   
> diff --git a/gdb/testsuite/gdb.threads/access-mem-running-thread-exit.c b/gdb/testsuite/gdb.threads/access-mem-running-thread-exit.c
> new file mode 100644
> index 00000000000..5f89d223065
> --- /dev/null
> +++ b/gdb/testsuite/gdb.threads/access-mem-running-thread-exit.c
> @@ -0,0 +1,123 @@
> +/* This testcase is part of GDB, the GNU debugger.
> +
> +   Copyright 2021 Free Software Foundation, Inc.
> +
> +   This program is free software; you can redistribute it and/or modify
> +   it under the terms of the GNU General Public License as published by
> +   the Free Software Foundation; either version 3 of the License, or
> +   (at your option) any later version.
> +
> +   This program is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +   GNU General Public License for more details.
> +
> +   You should have received a copy of the GNU General Public License
> +   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
> +
> +#define _GNU_SOURCE
> +#include <assert.h>
> +#include <pthread.h>
> +#include <stdlib.h>
> +#include <stdio.h>
> +#include <unistd.h>
> +#include <string.h>
> +
> +#define THREADS 20
> +
> +static volatile unsigned int global_var = 123;
> +
> +/* Wrapper around pthread_create.   */
> +
> +static void
> +create_thread (pthread_t *child,
> +	       void *(*start_routine) (void *), void *arg)
> +{
> +  int rc;
> +
> +  while ((rc = pthread_create (child, NULL, start_routine, arg)) != 0)
> +    {
> +      fprintf (stderr, "unexpected error from pthread_create: %s (%d)\n",
> +	       strerror (rc), rc);
> +      sleep (1);
> +    }
> +}
> +
> +/* Data passed to threads on creation.  This is allocated on the heap
> +   and ownership transferred from parent to child.  */
> +
> +struct thread_arg
> +{
> +  /* The thread's parent.  */
> +  pthread_t parent;
> +
> +  /* Whether to call pthread_join on the parent.  */
> +  int join_parent;
> +};
> +
> +/* Entry point for threads.  */
> +
> +static void *
> +thread_fn (void *arg)
> +{
> +  struct thread_arg *p = arg;
> +
> +  /* Passing no argument makes the thread exit immediately.  */
> +  if (p == NULL)
> +    return NULL;
> +
> +  if (p->join_parent)
> +    assert (pthread_join (p->parent, NULL) == 0);
> +
> +  /* Spawn a number of threads that exit immediately, and then join
> +     them.  The idea is to maximize the time window when we mostly
> +     have threads exiting.  */
> +  {
> +    pthread_t child[THREADS];
> +    int i;
> +
> +    /* Passing no argument makes the thread exit immediately.  */
> +    for (i = 0; i < THREADS; i++)
> +      create_thread (&child[i], thread_fn, NULL);
> +
> +    for (i = 0; i < THREADS; i++)
> +      pthread_join (child[i], NULL);
> +  }
> +
> +  /* Spawn a new thread that joins us, and exit.  The idea here is to
> +     not have any thread that stays around forever.  */
> +  {
> +    pthread_t child;
> +
> +    p->parent = pthread_self ();
> +    p->join_parent = 1;
> +    create_thread (&child, thread_fn, p);
> +  }
> +
> +  return NULL;
> +}
> +
> +int
> +main (void)
> +{
> +  int i;
> +
> +  for (i = 0; i < 4; i++)
> +    {
> +      struct thread_arg *p;
> +      pthread_t child;
> +
> +      p = malloc (sizeof *p);
> +      p->parent = pthread_self ();
> +      /* Only join the parent once.  */
> +      if (i == 0)
> +	p->join_parent = 1;
> +      else
> +	p->join_parent = 0;
> +      create_thread (&child, thread_fn, p);
> +    }
> +
> +  /* Exit the leader to make sure that we can access memory with the
> +     leader gone.  */
> +  pthread_exit (NULL);
> +}
> diff --git a/gdb/testsuite/gdb.threads/access-mem-running-thread-exit.exp b/gdb/testsuite/gdb.threads/access-mem-running-thread-exit.exp
> new file mode 100644
> index 00000000000..ea228e4ba13
> --- /dev/null
> +++ b/gdb/testsuite/gdb.threads/access-mem-running-thread-exit.exp
> @@ -0,0 +1,166 @@
> +# Copyright (C) 2021 Free Software Foundation, Inc.
> +
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 3 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program.  If not, see <http://www.gnu.org/licenses/>.
> +
> +# Test that we can access memory while all the threads of the inferior
> +# are running, and even if:
> +#
> +# - the leader thread exits
> +# - the selected thread exits
> +#
> +# This test constantly spawns short lived threads to make sure that on
> +# systems with debug APIs that require passing down a specific thread
> +# to work with (e.g., GNU/Linux ptrace and /proc filesystem), GDB
> +# copes with accessing memory just while the thread it is accessing
> +# memory through exits.
> +#
> +# The test spawns two processes and alternates memory accesses between
> +# them to force flushing per-process caches.  At the time of writing,
> +# the Linux backend accesses inferior memory via /proc/<pid>/mem, and
> +# keeps one such file open, as a cache.  Alternating inferiors forces
> +# opening such file for a different process, which fails if GDB tries
> +# to open the file for a thread that exited.  The test does ensures
> +# those reopen/fail code paths are exercised.
> +
> +standard_testfile
> +
> +if {[prepare_for_testing "failed to prepare" $testfile $srcfile {debug pthreads}] == -1} {
> +    return -1
> +}
> +
> +# The test proper.  NON_STOP indicates whether we're testing in
> +# non-stop, or all-stop mode.
> +
> +proc test { non_stop } {
> +    global binfile
> +    global gdb_prompt
> +    global GDBFLAGS
> +
> +    save_vars { GDBFLAGS } {
> +      append GDBFLAGS " -ex \"set non-stop $non_stop\""
> +      clean_restart ${binfile}
> +    }
> +
> +    if ![runto_main] {
> +	fail "cannot run to main"
> +	return -1
> +    }
> +
> +    # If debugging with target remote, check whether the all-stop variant
> +    # of the RSP is being used.  If so, we can't run the background tests.
> +    if {!$non_stop
> +	&& [target_info exists gdb_protocol]
> +	&& ([target_info gdb_protocol] == "remote"
> +	    || [target_info gdb_protocol] == "extended-remote")} {
> +
> +	gdb_test_multiple "maint show target-non-stop" "" {
> +	    -wrap -re "(is|currently) on.*" {
> +	    }
> +	    -wrap -re "(is|currently) off.*" {
> +		unsupported "can't issue commands while target is running"
> +		return 0
> +	    }
> +	}
> +    }
> +
> +    delete_breakpoints
> +
> +    # Start the second inferior.
> +    with_test_prefix "second inferior" {
> +	gdb_test "add-inferior -no-connection" "New inferior 2.*"
> +	gdb_test "inferior 2" "Switching to inferior 2 .*"
> +
> +	gdb_load $binfile
> +
> +	if ![runto_main] {
> +	    fail "cannot run to main"
> +	    return -1
> +	}
> +    }
> +
> +    delete_breakpoints
> +
> +    # These put too much noise in the logs.
> +    gdb_test_no_output "set print thread-events off"
> +
> +    # Continue all threads of both processes.
> +    gdb_test_no_output "set schedule-multiple on"
> +    if {$non_stop == "off"} {
> +	set cmd "continue &"
> +    } else {
> +	set cmd "continue -a &"
> +    }
> +    gdb_test_multiple $cmd "continuing" {
> +	-re "Continuing\.\r\n$gdb_prompt " {
> +	    pass $gdb_test_name
> +	}
> +    }
> +
> +    # Like gdb_test, but:
> +    # - don't issue a pass on success.
> +    # - on failure, clear the ok variable in the calling context, and
> +    #   break it.
> +    proc my_gdb_test {cmd pattern message} {
> +	upvar inf inf
> +	upvar iter iter
> +	if {[gdb_test_multiple $cmd "access mem ($message, inf=$inf, iter=$iter)" {
> +	    -wrap -re $pattern {
> +	    }
> +	}] != 0} {
> +	    uplevel 1 {set ok 0}
> +	    return -code break
> +	}
> +    }
> +
> +    # Hammer away for 5 seconds, alternating between inferiors.
> +    set ::done 0
> +    after 5000 { set ::done 1 }
> +
> +    set inf 1
> +    set ok 1
> +    set iter 0
> +    while {!$::done && $ok} {
> +	incr iter
> +	verbose -log "xxxxx: iteration $iter"
> +	gdb_test "info threads" ".*" ""
> +
> +	if {$inf == 1} {
> +	    set inf 2
> +	} else {
> +	    set inf 1
> +	}
> +
> +	my_gdb_test "inferior $inf" ".*" "inferior $inf"
> +
> +	my_gdb_test "print global_var = 555" " = 555" \
> +	    "write to global_var"
> +	my_gdb_test "print global_var" " = 555" \
> +	    "print global_var after writing"
> +	my_gdb_test "print global_var = 333" " = 333" \
> +	    "write to global_var again"
> +	my_gdb_test "print global_var" " = 333" \
> +	    "print global_var after writing again"
> +    }
> +
> +   if {$ok} {
> +       pass "access mem"
> +   }
> +}
> +
> +foreach non_stop { "off" "on" } {
> +    set stop_mode [expr ($non_stop=="off")?"all-stop":"non-stop"]
> +    with_test_prefix "$stop_mode" {
> +	test $non_stop
> +    }
> +}
> diff --git a/gdb/testsuite/gdb.threads/access-mem-running.c b/gdb/testsuite/gdb.threads/access-mem-running.c
> new file mode 100644
> index 00000000000..91359ee863b
> --- /dev/null
> +++ b/gdb/testsuite/gdb.threads/access-mem-running.c
> @@ -0,0 +1,80 @@
> +/* This testcase is part of GDB, the GNU debugger.
> +
> +   Copyright 2021 Free Software Foundation, Inc.
> +
> +   This program is free software; you can redistribute it and/or modify
> +   it under the terms of the GNU General Public License as published by
> +   the Free Software Foundation; either version 3 of the License, or
> +   (at your option) any later version.
> +
> +   This program is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +   GNU General Public License for more details.
> +
> +   You should have received a copy of the GNU General Public License
> +   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
> +
> +#define _GNU_SOURCE
> +#include <assert.h>
> +#include <pthread.h>
> +#include <stdlib.h>
> +#include <stdio.h>
> +#include <unistd.h>
> +#include <string.h>
> +
> +static pthread_barrier_t barrier;
> +
> +static unsigned int global_counter;
> +
> +static volatile unsigned int global_var = 123;
> +
> +static void
> +maybe_stop_here ()
> +{
> +}
> +
> +static void *
> +child_function (void *arg)
> +{
> +  global_counter = 1;
> +
> +  pthread_barrier_wait (&barrier);
> +
> +  while (global_counter > 0)
> +    {
> +      global_counter++;
> +
> +      /* Less than 1s, so the counter increments at least once while
> +	 the .exp sleep 1s, but slow enough that the counter doesn't
> +	 wrap in 1s.  */
> +      usleep (5000);
> +
> +      maybe_stop_here ();
> +    }
> +
> +  pthread_exit (NULL);
> +}
> +
> +static int
> +wait_threads (void)
> +{
> +  return 1;
> +}
> +
> +int
> +main (void)
> +{
> +  pthread_t child;
> +
> +  pthread_barrier_init (&barrier, NULL, 2);
> +
> +  pthread_create (&child, NULL, child_function, NULL);
> +
> +  pthread_barrier_wait (&barrier);
> +  wait_threads ();
> +
> +  pthread_join (child, NULL);
> +
> +  return 0;
> +}
> diff --git a/gdb/testsuite/gdb.threads/access-mem-running.exp b/gdb/testsuite/gdb.threads/access-mem-running.exp
> new file mode 100644
> index 00000000000..fbcfa8c8633
> --- /dev/null
> +++ b/gdb/testsuite/gdb.threads/access-mem-running.exp
> @@ -0,0 +1,132 @@
> +# Copyright (C) 2021 Free Software Foundation, Inc.
> +
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 3 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program.  If not, see <http://www.gnu.org/licenses/>.
> +
> +# Test that we can access memory while all the threads of the inferior
> +# are running.
> +
> +standard_testfile
> +
> +if {[prepare_for_testing "failed to prepare" $testfile $srcfile {debug pthreads}] == -1} {
> +    return -1
> +}
> +
> +# The test proper.  NON_STOP indicates whether we're testing in
> +# non-stop, or all-stop mode.
> +
> +proc test { non_stop } {
> +    global srcfile binfile
> +    global gdb_prompt
> +    global GDBFLAGS
> +    global decimal
> +
> +    save_vars { GDBFLAGS } {
> +      append GDBFLAGS " -ex \"set non-stop $non_stop\""
> +      clean_restart ${binfile}
> +    }
> +
> +    if ![runto wait_threads] {
> +	return -1
> +    }
> +
> +    # If debugging with target remote, check whether the all-stop variant
> +    # of the RSP is being used.  If so, we can't run the background tests.
> +    if {!$non_stop
> +	&& [target_info exists gdb_protocol]
> +	&& ([target_info gdb_protocol] == "remote"
> +	    || [target_info gdb_protocol] == "extended-remote")} {
> +
> +	gdb_test_multiple "maint show target-non-stop" "" {
> +	    -wrap -re "(is|currently) on.*" {
> +	    }
> +	    -wrap -re "(is|currently) off.*" {
> +		unsupported "can't issue commands while target is running"
> +		return 0
> +	    }
> +	}
> +    }
> +
> +    delete_breakpoints
> +
> +    if {$non_stop == "off"} {
> +	set cmd "continue &"
> +    } else {
> +	set cmd "continue -a &"
> +    }
> +    gdb_test_multiple $cmd "continuing" {
> +	-re "Continuing\.\r\n$gdb_prompt " {
> +	    pass $gdb_test_name
> +	}
> +    }
> +
> +    # Check we can read/write variables, no matter which thread is
> +    # selected.
> +    for {set thr 1} {$thr <= 2} {incr thr} {
> +	with_test_prefix "thread $thr" {
> +	    gdb_test "thread $thr" "Switching to .*"
> +
> +	    # Check that we can read the counter variable, and that
> +	    # the counter is increasing, meaning the threads really
> +	    # are running.
> +
> +	    sleep 1
> +	    set global_counter1 \
> +		[get_integer_valueof "global_counter" 0 \
> +		     "get global_counter once"]
> +
> +	    sleep 1
> +	    set global_counter2 \
> +		[get_integer_valueof "global_counter" 0 \
> +		     "get global_counter twice"]
> +
> +	    gdb_assert {$global_counter1 != 0 \
> +			    && $global_counter2 != 0 \
> +			    && $global_counter1 != $global_counter2} \
> +		"value changed"
> +
> +	    # Check that we can write variables.
> +
> +	    gdb_test "print global_var" " = 123" \
> +		"print global_var before writing"
> +	    gdb_test "print global_var = 321" " = 321" \
> +		"write to global_var"
> +	    gdb_test "print global_var" " = 321" \
> +		"print global_var after writing"
> +	    gdb_test "print global_var = 123" " = 123" \
> +		"write to global_var again"
> +	}
> +    }
> +
> +    # Check we can set a breakpoint while all threads are running.
> +    # The breakpoint should hit immediately.
> +    set any "\[^\r\n\]*"
> +
> +    gdb_test_multiple "b maybe_stop_here" "" {
> +	-re "Breakpoint $decimal at $any: file $any${srcfile}, line $decimal.\r\n$gdb_prompt " {
> +	    pass $gdb_test_name
> +	}
> +    }
> +    gdb_test_multiple "" "breakpoint hits" {
> +	-re "Thread 2 ${any}hit Breakpoint $decimal, maybe_stop_here \\(\\) at $any${srcfile}:$decimal\r\n" {
> +	    pass "$gdb_test_name"
> +	}
> +    }
> +}
> +
> +foreach non_stop { "off" "on" } {
> +    set stop_mode [expr ($non_stop=="off")?"all-stop":"non-stop"]
> +    with_test_prefix "$stop_mode" {
> +	test $non_stop
> +    }
> +}
> 
> base-commit: 873793ae09b5dcba8c8da7345ee283f296558b8e
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2] Linux: Access memory even if threads are running
  2021-07-01 13:07       ` Pedro Alves
@ 2021-07-01 13:25         ` Tom Tromey
  0 siblings, 0 replies; 16+ messages in thread
From: Tom Tromey @ 2021-07-01 13:25 UTC (permalink / raw)
  To: Pedro Alves; +Cc: Simon Marchi, gdb-patches

Pedro> BTW, I forgot to mention earlier, but the first thing I tried was to use
Pedro> process_vm_readv/process_vm_writev but those turn out useless for us,
Pedro> because they don't bypass memory protections, i.e., you can't write
Pedro> breakpoints with them, for example.

Thanks for looking into that.  I've thought about changing gdb to use
these, in the past, but now we can officially not bother.

Tom

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Linux: Access memory even if threads are running
  2021-07-01 13:18 ` [PATCH] " Luis Machado
@ 2021-07-01 14:34   ` Pedro Alves
  2021-07-01 15:49     ` Luis Machado
  0 siblings, 1 reply; 16+ messages in thread
From: Pedro Alves @ 2021-07-01 14:34 UTC (permalink / raw)
  To: Luis Machado, gdb-patches

On 2021-07-01 2:18 p.m., Luis Machado wrote:
> Out of curiosity... Why not attempt to lift the restriction of PTRACE not being able to read/write memory of running threads? Sounds like that would've been useful.

PTRACE_PEEKTEXT/PTRACE_POKETEXT can only read/write a word at a time, while with /proc
there's no practical limit on how much you can read/write with one syscall.  We
were already preferring /proc if the access was more than a word.  We'd still want to have
the /proc code working for efficiency, and the new code that now tries to open a different lwp if
the thread exits would still be required.  I'm not seeing how changing ptrace would help
in practice.  (And then it would only work with newer kernels, of course.)

Pedro Alves

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Linux: Access memory even if threads are running
  2021-07-01 14:34   ` Pedro Alves
@ 2021-07-01 15:49     ` Luis Machado
  2021-07-01 16:49       ` Pedro Alves
  0 siblings, 1 reply; 16+ messages in thread
From: Luis Machado @ 2021-07-01 15:49 UTC (permalink / raw)
  To: Pedro Alves, gdb-patches

On 7/1/21 11:34 AM, Pedro Alves wrote:
> On 2021-07-01 2:18 p.m., Luis Machado wrote:
>> Out of curiosity... Why not attempt to lift the restriction of PTRACE not being able to read/write memory of running threads? Sounds like that would've been useful.
> 
> PTRACE_PEEKTEXT/PTRACE_POKETEXT can only read/write a word at a time, while with /proc
> there's no practical limit on how much you can read/write with one syscall.  We
> were already preferring /proc if the access was more than a word.  We'd still want to have
> the /proc code working for efficiency, and the new code that now tries to open a different lwp if
> the thread exits would still be required.  I'm not seeing how changing ptrace would help
> in practice.  (And then it would only work with newer kernels, of course.)

Right. I get the motivation for this particular use case.

Lifting this restriction would be an improvement to the ptrace interface 
in general. I'm thinking beyond reading individual words of memory, but 
reading registers, memory tags etc. I'm assuming any ptrace request for 
a running thread will result in a failure. Is that the case?

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Linux: Access memory even if threads are running
  2021-07-01 15:49     ` Luis Machado
@ 2021-07-01 16:49       ` Pedro Alves
  2021-07-01 17:08         ` Luis Machado
  0 siblings, 1 reply; 16+ messages in thread
From: Pedro Alves @ 2021-07-01 16:49 UTC (permalink / raw)
  To: Luis Machado, gdb-patches

On 2021-07-01 4:49 p.m., Luis Machado wrote:
> On 7/1/21 11:34 AM, Pedro Alves wrote:
>> On 2021-07-01 2:18 p.m., Luis Machado wrote:
>>> Out of curiosity... Why not attempt to lift the restriction of PTRACE not being able to read/write memory of running threads? Sounds like that would've been useful.
>>
>> PTRACE_PEEKTEXT/PTRACE_POKETEXT can only read/write a word at a time, while with /proc
>> there's no practical limit on how much you can read/write with one syscall.  We
>> were already preferring /proc if the access was more than a word.  We'd still want to have
>> the /proc code working for efficiency, and the new code that now tries to open a different lwp if
>> the thread exits would still be required.  I'm not seeing how changing ptrace would help
>> in practice.  (And then it would only work with newer kernels, of course.)
> 
> Right. I get the motivation for this particular use case.
> 
> Lifting this restriction would be an improvement to the ptrace interface in general. I'm thinking beyond reading individual words of memory, but reading registers, memory tags etc. I'm assuming any ptrace request for a running thread will result in a failure. Is that the case?
> 

Yes, most ptrace operations require a stopped tracee.  From man ptrace:

             
              "(...) For requests other than PTRACE_ATTACH, PTRACE_SEIZE, PTRACE_INTERRUPT, and PTRACE_KILL, the tracee must be
              stopped."

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Linux: Access memory even if threads are running
  2021-07-01 16:49       ` Pedro Alves
@ 2021-07-01 17:08         ` Luis Machado
  2021-07-01 17:30           ` Pedro Alves
  0 siblings, 1 reply; 16+ messages in thread
From: Luis Machado @ 2021-07-01 17:08 UTC (permalink / raw)
  To: Pedro Alves, gdb-patches

On 7/1/21 1:49 PM, Pedro Alves wrote:
> On 2021-07-01 4:49 p.m., Luis Machado wrote:
>> On 7/1/21 11:34 AM, Pedro Alves wrote:
>>> On 2021-07-01 2:18 p.m., Luis Machado wrote:
>>>> Out of curiosity... Why not attempt to lift the restriction of PTRACE not being able to read/write memory of running threads? Sounds like that would've been useful.
>>>
>>> PTRACE_PEEKTEXT/PTRACE_POKETEXT can only read/write a word at a time, while with /proc
>>> there's no practical limit on how much you can read/write with one syscall.  We
>>> were already preferring /proc if the access was more than a word.  We'd still want to have
>>> the /proc code working for efficiency, and the new code that now tries to open a different lwp if
>>> the thread exits would still be required.  I'm not seeing how changing ptrace would help
>>> in practice.  (And then it would only work with newer kernels, of course.)
>>
>> Right. I get the motivation for this particular use case.
>>
>> Lifting this restriction would be an improvement to the ptrace interface in general. I'm thinking beyond reading individual words of memory, but reading registers, memory tags etc. I'm assuming any ptrace request for a running thread will result in a failure. Is that the case?
>>
> 
> Yes, most ptrace operations require a stopped tracee.  From man ptrace:
> 
>               
>                "(...) For requests other than PTRACE_ATTACH, PTRACE_SEIZE, PTRACE_INTERRUPT, and PTRACE_KILL, the tracee must be
>                stopped."
> 

I see. Thanks.

So even though memory reads/writes will now be allowed, reading 
registers/memory tags or doing any other sort of ptrace request for data 
won't work.

For example, if we want to validate memory tags for a particular tagged 
pointer, we won't be able to do so because the memory will be available, 
but the tags won't be.

I guess it would be a similar situation if you have a variable that 
lives in a register, right? Or a variable that lives in a vector 
register that has its size determined by another register (SVE vector 
length).

In this context, I guess I'm missing the main motivation for enabling 
this feature like this instead of fixing ptrace. I understand that 
fixing ptrace is a much more complicated issue, but still, it sounds 
like a more long term solution.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Linux: Access memory even if threads are running
  2021-07-01 17:08         ` Luis Machado
@ 2021-07-01 17:30           ` Pedro Alves
  2021-07-01 18:03             ` Luis Machado
  0 siblings, 1 reply; 16+ messages in thread
From: Pedro Alves @ 2021-07-01 17:30 UTC (permalink / raw)
  To: Luis Machado, gdb-patches

On 2021-07-01 6:08 p.m., Luis Machado wrote:
> On 7/1/21 1:49 PM, Pedro Alves wrote:
>> On 2021-07-01 4:49 p.m., Luis Machado wrote:
>>> On 7/1/21 11:34 AM, Pedro Alves wrote:
>>>> On 2021-07-01 2:18 p.m., Luis Machado wrote:
>>>>> Out of curiosity... Why not attempt to lift the restriction of PTRACE not being able to read/write memory of running threads? Sounds like that would've been useful.
>>>>
>>>> PTRACE_PEEKTEXT/PTRACE_POKETEXT can only read/write a word at a time, while with /proc
>>>> there's no practical limit on how much you can read/write with one syscall.  We
>>>> were already preferring /proc if the access was more than a word.  We'd still want to have
>>>> the /proc code working for efficiency, and the new code that now tries to open a different lwp if
>>>> the thread exits would still be required.  I'm not seeing how changing ptrace would help
>>>> in practice.  (And then it would only work with newer kernels, of course.)
>>>
>>> Right. I get the motivation for this particular use case.
>>>
>>> Lifting this restriction would be an improvement to the ptrace interface in general. I'm thinking beyond reading individual words of memory, but reading registers, memory tags etc. I'm assuming any ptrace request for a running thread will result in a failure. Is that the case?
>>>
>>
>> Yes, most ptrace operations require a stopped tracee.  From man ptrace:
>>
>>                              "(...) For requests other than PTRACE_ATTACH, PTRACE_SEIZE, PTRACE_INTERRUPT, and PTRACE_KILL, the tracee must be
>>                stopped."
>>
> 
> I see. Thanks.
> 
> So even though memory reads/writes will now be allowed, reading registers/memory tags or doing any other sort of ptrace request for data won't work.
> 
> For example, if we want to validate memory tags for a particular tagged pointer, we won't be able to do so because the memory will be available, but the tags won't be.
> 
> I guess it would be a similar situation if you have a variable that lives in a register, right? Or a variable that lives in a vector register that has its size determined by another register (SVE vector length).

GDB won't even try to read registers off of running threads:

 (gdb) info registers 
 Selected thread is running.
 (gdb) p $rax
 Selected thread is running.

Variables that live in registers only make sense in the context of some frame.  But if the
thread is running, what frame would that be?  Between thinking about reading the register,
and actually reading it, a zillion instructions will have executed.  Might be useful to be able
to get a snapshot of registers for sampling where the program is (basically read the PC) with
minimal disturbance (though the kernel would still need to interrupt the thread for a brief
moment, I believe), but I'm not sure I'd let the user that in the normal case.  It seems more
confusing than helpful.  I think that the only registers that make sense reading normally
are global system registers shared by all threads.  I don't think SVE registers fit
in that model?

I don't know how memory tags work, I have not followed that thread.

> 
> In this context, I guess I'm missing the main motivation for enabling this feature like this instead of fixing ptrace. I understand that fixing ptrace is a much more complicated issue, but still, it sounds like a more long term solution.
> 

I think you're still missing my point quoted above -- even if you made ptrace work with running thread,
you'd still want to work with /proc for reads/writes larger than a word, and so all the code I added is
still desirable.  There's no "instead of".  If someone makes ptrace work with running threads, it won't
make the new code obsolete.

I did not consider changing ptrace simply because I had no use for reading registers off of
running threads.

Pedro Alves

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Linux: Access memory even if threads are running
  2021-07-01 17:30           ` Pedro Alves
@ 2021-07-01 18:03             ` Luis Machado
  2021-07-01 18:34               ` Pedro Alves
  0 siblings, 1 reply; 16+ messages in thread
From: Luis Machado @ 2021-07-01 18:03 UTC (permalink / raw)
  To: Pedro Alves, gdb-patches

On 7/1/21 2:30 PM, Pedro Alves wrote:
> On 2021-07-01 6:08 p.m., Luis Machado wrote:
>> On 7/1/21 1:49 PM, Pedro Alves wrote:
>>> On 2021-07-01 4:49 p.m., Luis Machado wrote:
>>>> On 7/1/21 11:34 AM, Pedro Alves wrote:
>>>>> On 2021-07-01 2:18 p.m., Luis Machado wrote:
>>>>>> Out of curiosity... Why not attempt to lift the restriction of PTRACE not being able to read/write memory of running threads? Sounds like that would've been useful.
>>>>>
>>>>> PTRACE_PEEKTEXT/PTRACE_POKETEXT can only read/write a word at a time, while with /proc
>>>>> there's no practical limit on how much you can read/write with one syscall.  We
>>>>> were already preferring /proc if the access was more than a word.  We'd still want to have
>>>>> the /proc code working for efficiency, and the new code that now tries to open a different lwp if
>>>>> the thread exits would still be required.  I'm not seeing how changing ptrace would help
>>>>> in practice.  (And then it would only work with newer kernels, of course.)
>>>>
>>>> Right. I get the motivation for this particular use case.
>>>>
>>>> Lifting this restriction would be an improvement to the ptrace interface in general. I'm thinking beyond reading individual words of memory, but reading registers, memory tags etc. I'm assuming any ptrace request for a running thread will result in a failure. Is that the case?
>>>>
>>>
>>> Yes, most ptrace operations require a stopped tracee.  From man ptrace:
>>>
>>>                               "(...) For requests other than PTRACE_ATTACH, PTRACE_SEIZE, PTRACE_INTERRUPT, and PTRACE_KILL, the tracee must be
>>>                 stopped."
>>>
>>
>> I see. Thanks.
>>
>> So even though memory reads/writes will now be allowed, reading registers/memory tags or doing any other sort of ptrace request for data won't work.
>>
>> For example, if we want to validate memory tags for a particular tagged pointer, we won't be able to do so because the memory will be available, but the tags won't be.
>>
>> I guess it would be a similar situation if you have a variable that lives in a register, right? Or a variable that lives in a vector register that has its size determined by another register (SVE vector length).
> 
> GDB won't even try to read registers off of running threads:
> 
>   (gdb) info registers
>   Selected thread is running.
>   (gdb) p $rax
>   Selected thread is running.
> 
> Variables that live in registers only make sense in the context of some frame.  But if the
> thread is running, what frame would that be?  Between thinking about reading the register,
> and actually reading it, a zillion instructions will have executed.  Might be useful to be able
> to get a snapshot of registers for sampling where the program is (basically read the PC) with
> minimal disturbance (though the kernel would still need to interrupt the thread for a brief
> moment, I believe), but I'm not sure I'd let the user that in the normal case.  It seems more
> confusing than helpful.  I think that the only registers that make sense reading normally
> are global system registers shared by all threads.  I don't think SVE registers fit
> in that model?
> 
> I don't know how memory tags work, I have not followed that thread.
> 

Whenever you attempt to print a tagged pointer, GDB will try to fetch 
the memory tags (via ptrace). In this particular case, it will fail to 
fetch the tags, even if the architecture supports it.

I guess since this is mostly non-stop, which is used less often than 
all-stop, this won't matter much.

>>
>> In this context, I guess I'm missing the main motivation for enabling this feature like this instead of fixing ptrace. I understand that fixing ptrace is a much more complicated issue, but still, it sounds like a more long term solution.
>>
> 
> I think you're still missing my point quoted above -- even if you made ptrace work with running thread,
> you'd still want to work with /proc for reads/writes larger than a word, and so all the code I added is
> still desirable.  There's no "instead of".  If someone makes ptrace work with running threads, it won't
> make the new code obsolete.

No, I get that. But memory can be just as volatile in a running 
environment as a register. It is just a snapshot at that moment after 
all. Thus why it wasn't clear to me what the motivation was for allowing 
memory snapshots off of a running thread to be accessible.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Linux: Access memory even if threads are running
  2021-07-01 18:03             ` Luis Machado
@ 2021-07-01 18:34               ` Pedro Alves
  2021-07-01 19:14                 ` Luis Machado
  0 siblings, 1 reply; 16+ messages in thread
From: Pedro Alves @ 2021-07-01 18:34 UTC (permalink / raw)
  To: Luis Machado, gdb-patches

On 2021-07-01 7:03 p.m., Luis Machado wrote:
> On 7/1/21 2:30 PM, Pedro Alves wrote:

>> I don't know how memory tags work, I have not followed that thread.
>>
> 
> Whenever you attempt to print a tagged pointer, GDB will try to fetch the memory tags (via ptrace). In this particular case, it will fail to fetch the tags, even if the architecture supports it.

Even if the pointer variable is e.g., a history variable (which could be for an unrelated inferior,
or for a past instance of the inferior, etc.) ?

> 
> I guess since this is mostly non-stop, which is used less often than all-stop, this won't matter much.

That's a bit dismissive.  :-)  Some users and some IDEs rely on non-stop mode all the time.  But yeah,
not great, but not different from what you had before my patch.  The user just needs to
pause some thread.  But I'd suggest trying it out and seeing how bad is the breakage.  AFAICS,
print_command_1 does the validation before printing the variable.  It might be useful to catch
errors and say something along the lines of "cannot validate tags, thread is running" and then
still print the value.

> 
>>>
>>> In this context, I guess I'm missing the main motivation for enabling this feature like this instead of fixing ptrace. I understand that fixing ptrace is a much more complicated issue, but still, it sounds like a more long term solution.
>>>
>>
>> I think you're still missing my point quoted above -- even if you made ptrace work with running thread,
>> you'd still want to work with /proc for reads/writes larger than a word, and so all the code I added is
>> still desirable.  There's no "instead of".  If someone makes ptrace work with running threads, it won't
>> make the new code obsolete.
> 
> No, I get that. But memory can be just as volatile in a running environment as a register. It is just a snapshot at that moment after all. Thus why it wasn't clear to me what the motivation was for allowing memory snapshots off of a running thread to be accessible.
> 

Well, that's an orthogonal remark to whether we fix ptrace or use /proc.  But the accessing memory use case
is much more about global state than local thread state.  The main use case is being able to set breakpoints.
Still, global variables aren't as volatile as local variables, which come in and out of existence in a blip.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Linux: Access memory even if threads are running
  2021-07-01 18:34               ` Pedro Alves
@ 2021-07-01 19:14                 ` Luis Machado
  2021-07-02  9:05                   ` Pedro Alves
  0 siblings, 1 reply; 16+ messages in thread
From: Luis Machado @ 2021-07-01 19:14 UTC (permalink / raw)
  To: Pedro Alves, gdb-patches

On 7/1/21 3:34 PM, Pedro Alves wrote:
> On 2021-07-01 7:03 p.m., Luis Machado wrote:
>> On 7/1/21 2:30 PM, Pedro Alves wrote:
> 
>>> I don't know how memory tags work, I have not followed that thread.
>>>
>>
>> Whenever you attempt to print a tagged pointer, GDB will try to fetch the memory tags (via ptrace). In this particular case, it will fail to fetch the tags, even if the architecture supports it.
> 
> Even if the pointer variable is e.g., a history variable (which could be for an unrelated inferior,
> or for a past instance of the inferior, etc.) ?

Yes. Any TYPE_CODE_PTR that is tagged will have its logical tag 
validated against the allocation tag in memory.

> 
>>
>> I guess since this is mostly non-stop, which is used less often than all-stop, this won't matter much.
> 
> That's a bit dismissive.  :-)  Some users and some IDEs rely on non-stop mode all the time.  But yeah,
> not great, but not different from what you had before my patch.  The user just needs to
> pause some thread.  But I'd suggest trying it out and seeing how bad is the breakage.  AFAICS,
> print_command_1 does the validation before printing the variable.  It might be useful to catch
> errors and say something along the lines of "cannot validate tags, thread is running" and then
> still print the value.
> 

Sorry if it came out that way. My point is not to dismiss non-stop, but 
most of the cases I'm handling are all-stop-target-stopped scenarios. 
I'm trying to assess what is the impact of this new use case for memory 
tagging. If most of our users are using all-stop, then this change won't 
matter much. That's the context of my point above.

I have a patch on the list to catch validation errors and fallback to 
regular printing of variables. In that case, users will just not see the 
tags being validated, which is not ideal, but acceptable.

>>
>>>>
>>>> In this context, I guess I'm missing the main motivation for enabling this feature like this instead of fixing ptrace. I understand that fixing ptrace is a much more complicated issue, but still, it sounds like a more long term solution.
>>>>
>>>
>>> I think you're still missing my point quoted above -- even if you made ptrace work with running thread,
>>> you'd still want to work with /proc for reads/writes larger than a word, and so all the code I added is
>>> still desirable.  There's no "instead of".  If someone makes ptrace work with running threads, it won't
>>> make the new code obsolete.
>>
>> No, I get that. But memory can be just as volatile in a running environment as a register. It is just a snapshot at that moment after all. Thus why it wasn't clear to me what the motivation was for allowing memory snapshots off of a running thread to be accessible.
>>
> 
> Well, that's an orthogonal remark to whether we fix ptrace or use /proc.  But the accessing memory use case
> is much more about global state than local thread state.  The main use case is being able to set breakpoints.
> Still, global variables aren't as volatile as local variables, which come in and out of existence in a blip.
> 

Thanks. That clears things up for me.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Linux: Access memory even if threads are running
  2021-07-01 19:14                 ` Luis Machado
@ 2021-07-02  9:05                   ` Pedro Alves
  0 siblings, 0 replies; 16+ messages in thread
From: Pedro Alves @ 2021-07-02  9:05 UTC (permalink / raw)
  To: Luis Machado, gdb-patches

On 2021-07-01 8:14 p.m., Luis Machado wrote:
> On 7/1/21 3:34 PM, Pedro Alves wrote:
>> On 2021-07-01 7:03 p.m., Luis Machado wrote:
>>> On 7/1/21 2:30 PM, Pedro Alves wrote:
>>
>>>> I don't know how memory tags work, I have not followed that thread.
>>>>
>>>
>>> Whenever you attempt to print a tagged pointer, GDB will try to fetch the memory tags (via ptrace). In this particular case, it will fail to fetch the tags, even if the architecture supports it.
>>
>> Even if the pointer variable is e.g., a history variable (which could be for an unrelated inferior,
>> or for a past instance of the inferior, etc.) ?
> 
> Yes. Any TYPE_CODE_PTR that is tagged will have its logical tag validated against the allocation tag in memory.
> 
>>
>>>
>>> I guess since this is mostly non-stop, which is used less often than all-stop, this won't matter much.
>>
>> That's a bit dismissive.  :-)  Some users and some IDEs rely on non-stop mode all the time.  But yeah,
>> not great, but not different from what you had before my patch.  The user just needs to
>> pause some thread.  But I'd suggest trying it out and seeing how bad is the breakage.  AFAICS,
>> print_command_1 does the validation before printing the variable.  It might be useful to catch
>> errors and say something along the lines of "cannot validate tags, thread is running" and then
>> still print the value.
>>
> 
> Sorry if it came out that way. My point is not to dismiss non-stop, but most of the cases I'm handling are all-stop-target-stopped scenarios. I'm trying to assess what is the impact of this new use case for memory tagging. If most of our users are using all-stop, then this change won't matter much. That's the context of my point above.

No worries.  Note this isn't a new use case.  With gdbserver, it has been possible to set breakpoints
and access memory while threads are running for years.  Only native debugging did not.

> 
> I have a patch on the list to catch validation errors and fallback to regular printing of variables. In that case, users will just not see the tags being validated, which is not ideal, but acceptable.

I've found it, and replied on that thread.

Cheers,
Pedro Alves

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2021-07-02  9:05 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-11 17:36 [PATCH] Linux: Access memory even if threads are running Pedro Alves
2021-06-11 17:55 ` Pedro Alves
2021-06-11 18:00   ` [PATCH v2] " Pedro Alves
2021-06-25 16:48     ` Simon Marchi
2021-07-01 13:07       ` Pedro Alves
2021-07-01 13:25         ` Tom Tromey
2021-07-01 13:18 ` [PATCH] " Luis Machado
2021-07-01 14:34   ` Pedro Alves
2021-07-01 15:49     ` Luis Machado
2021-07-01 16:49       ` Pedro Alves
2021-07-01 17:08         ` Luis Machado
2021-07-01 17:30           ` Pedro Alves
2021-07-01 18:03             ` Luis Machado
2021-07-01 18:34               ` Pedro Alves
2021-07-01 19:14                 ` Luis Machado
2021-07-02  9:05                   ` Pedro Alves

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).