public inbox for gdb-patches@sourceware.org
From: Andrew Burgess <aburgess@redhat.com>
To: gdb-patches@sourceware.org
Cc: Andrew Burgess <aburgess@redhat.com>
Subject: [PATCHv2 07/10] gdb: add qMachineId packet
Date: Fri, 25 Aug 2023 16:34:40 +0100
Message-ID: <050e4522564ed979d49880203cd7ac7b9af6ec48.1692977354.git.aburgess@redhat.com>
In-Reply-To: <cover.1692977354.git.aburgess@redhat.com>

In later commits I want to make two related changes to GDB:

  1. Have GDB know when it can safely ignore a 'target:' prefix in the
  sysroot, and so avoid copying files from the remote target, and

  2. Have GDB know that it can safely use the file specified with the
  'file' command to start a remote inferior, rather than requiring the
  'set remote exec-file' command to have been used.

Both of these changes require that GDB be able to know if it is
running on the same host as the remote target.

In this commit I propose a mechanism by which this can be achieved,
that is, the introduction of the qMachineId packet.

The idea of the qMachineId packet is that, during the initial
connection phase, GDB will send the qMachineId packet, and the remote
will return a reply that describes the machine the remote target is
running on.

Back on the GDB side, GDB will generate a description of the machine
it is running on and compare this to the reply received from the
remote target.

If the two match then GDB will assume it is running on the same
machine as the remote target, and that it can access the same set of
files.  In this case we can enable the two improvements listed above.

If the remote target doesn't support qMachineId, or the reply from the
remote target doesn't match the machine-id generated within GDB, then
GDB will assume that the target is truly remote, just as it does right
now.

This commit does NOT implement the two improvements listed above;
these will be added in follow-on commits.  This commit just adds
support for the qMachineId packet.

Generating a suitable machine-id is, I think, always going to be
target specific.  As such, I've structured the code in a way that
allows different targets to provide their own implementations, but
I've only implemented a solution for the Linux targets.

The reply to a qMachineId packet looks like this:

  predicate;key=value[;key=value]*

the idea being that the reply consists of a number of key/value pairs,
each of which must match in order for GDB to consider the machine-id a
match.  I currently propose just two keys:

  linux-boot-id - this returns the value from the file
  /proc/sys/kernel/random/boot_id, which, if I understand correctly,
  should be unique(ish) for each boot of each machine, and

  cuserid - this returns the value of the cuserid call.

My thinking is that if we know we are on the same machine (thanks to
linux-boot-id), and we know we are the same effective user (thanks to
cuserid) then there's a pretty good chance that GDB and the remote can
access the same set of files.
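
As a rough illustration of the reply format (this is not code from the
patch), the sketch below splits a 'predicate;' reply into key/value
pairs.  The function name parse_predicate_reply is hypothetical.  It
applies the documented rules (non-empty keys and values, no '=' inside
a value, unique keys), which is slightly stricter than the parser in
remote.c, since that one accepts an empty value.

```cpp
#include <cstddef>
#include <map>
#include <string>

/* Hypothetical helper, for illustration only.  Split REPLY, which should
   look like 'predicate;key=value[;key=value]*', into PAIRS.  Return true
   on success, or false if REPLY is malformed.  */

bool
parse_predicate_reply (const std::string &reply,
		       std::map<std::string, std::string> &pairs)
{
  const std::string prefix = "predicate;";
  pairs.clear ();
  if (reply.compare (0, prefix.size (), prefix) != 0)
    return false;

  std::size_t pos = prefix.size ();
  while (pos < reply.size ())
    {
      /* Each item runs up to the next ';' or the end of the reply.  */
      std::size_t end = reply.find (';', pos);
      if (end == std::string::npos)
	end = reply.size ();
      std::string item = reply.substr (pos, end - pos);

      /* Require exactly one '=', with text on both sides of it.  */
      std::size_t eq = item.find ('=');
      if (eq == std::string::npos || eq == 0 || eq + 1 == item.size ()
	  || item.find ('=', eq + 1) != std::string::npos)
	return false;

      /* Each key may appear only once.  */
      if (!pairs.emplace (item.substr (0, eq), item.substr (eq + 1)).second)
	return false;

      pos = end + 1;
    }

  /* At least one key/value pair is required.  */
  return !pairs.empty ();
}
```

Given the reply 'predicate;linux-boot-id=abc;cuserid=user' this fills
PAIRS with two entries, each of which GDB would then compare against
its own locally computed value.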

As well as the 'predicate;' based reply, a remote can respond to a
qMachineId packet with one of these replies:

  local
  remote

The 'local' reply forces GDB to consider the remote as being local to
GDB.  I've documented this as something that should be used with
extreme care: it would be easy for a user to run such a remote
non-locally, in which case, because the remote claims to be local, GDB
would try to access the files directly.  But maybe there will be some
use case where this is helpful.

The 'remote' reply is the opposite: it forces GDB to consider the
remote as being truly remote (which is the behaviour we have today).
It's always safe to return this reply, though this prevents GDB from
performing any of the improvements listed above.
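
The three reply kinds can be summarised with a small sketch.  This is
an illustration only: the enum and function names are hypothetical and
do not appear in the patch.  Error replies are assumed to have been
handled before this point, and anything unrecognised is treated the
same as 'remote'.

```cpp
#include <string>

/* Hypothetical names, for illustration only; the patch does not define
   this enum or function.  */

enum class machine_id_reply_kind
{
  local,	/* Remote claims to share a host with GDB.  */
  remote,	/* Remote asks to be treated as truly remote.  */
  predicate,	/* Key/value pairs follow; GDB must compare them.  */
  unrecognised	/* Anything else; handled like 'remote'.  */
};

machine_id_reply_kind
classify_machine_id_reply (const std::string &reply)
{
  if (reply == "local")
    return machine_id_reply_kind::local;
  if (reply == "remote")
    return machine_id_reply_kind::remote;

  /* "predicate;" is ten characters long.  */
  if (reply.compare (0, 10, "predicate;") == 0)
    return machine_id_reply_kind::predicate;

  return machine_id_reply_kind::unrecognised;
}
```

Only the 'local' kind, or a 'predicate' kind whose pairs all match,
would lead GDB to treat the remote as sharing its filesystem.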

For the GDB/gdbserver implementation, the code to generate the values
for the machine-id has been placed in gdb/nat/linux-machine-id.c, and
is shared between GDB and gdbserver.
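
For anyone implementing the packet in a different stub, the following
is a minimal sketch of how a reply might be assembled.  It is not the
gdbserver implementation; normalise_boot_id and build_predicate_reply
are hypothetical names.  The sketch assumes the boot-id has already
been read from /proc/sys/kernel/random/boot_id, and it falls back to
the always-safe 'remote' reply when no pairs are available.

```cpp
#include <cctype>
#include <string>
#include <utility>
#include <vector>

/* Hypothetical helper: keep only the hex digits of RAW, which drops the
   '-' separators and any trailing newline from a boot-id string.  */

std::string
normalise_boot_id (const std::string &raw)
{
  std::string id;
  for (unsigned char c : raw)
    if (std::isxdigit (c))
      id += static_cast<char> (c);
  return id;
}

/* Hypothetical helper: join PAIRS into a 'predicate;' reply.  With no
   pairs to send, fall back to 'remote'.  */

std::string
build_predicate_reply
  (const std::vector<std::pair<std::string, std::string>> &pairs)
{
  if (pairs.empty ())
    return "remote";

  std::string reply = "predicate";
  for (const auto &kv : pairs)
    reply += ";" + kv.first + "=" + kv.second;
  return reply;
}
```

A stub would pass the normalised boot-id as the value of the
'linux-boot-id' key, followed by any other keys it can compute.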

There are no tests in this commit as there are no new commands or
user-visible behaviour changes (without turning on debug output) to
check.  However, later commits will add new functionality, which
will rely on this packet working correctly.
---
 gdb/Makefile.in            |   3 +
 gdb/NEWS                   |   5 ++
 gdb/configure.nat          |   2 +-
 gdb/doc/gdb.texinfo        | 126 +++++++++++++++++++++++++++
 gdb/linux-nat.c            |  42 +++++++++
 gdb/nat/linux-machine-id.c |  85 +++++++++++++++++++
 gdb/nat/linux-machine-id.h |  75 ++++++++++++++++
 gdb/nat/linux-namespaces.c |  13 +++
 gdb/nat/linux-namespaces.h |  14 +++
 gdb/remote-machine-id.c    |  69 +++++++++++++++
 gdb/remote-machine-id.h    | 108 ++++++++++++++++++++++++
 gdb/remote.c               | 169 +++++++++++++++++++++++++++++++++++++
 gdbserver/Makefile.in      |   1 +
 gdbserver/configure.srv    |   2 +-
 gdbserver/linux-low.cc     |  24 ++++++
 gdbserver/linux-low.h      |   2 +
 gdbserver/server.cc        |  12 +++
 gdbserver/target.cc        |   8 ++
 gdbserver/target.h         |   9 ++
 19 files changed, 767 insertions(+), 2 deletions(-)
 create mode 100644 gdb/nat/linux-machine-id.c
 create mode 100644 gdb/nat/linux-machine-id.h
 create mode 100644 gdb/remote-machine-id.c
 create mode 100644 gdb/remote-machine-id.h

diff --git a/gdb/Makefile.in b/gdb/Makefile.in
index 9b992a3d8c0..8eda606f838 100644
--- a/gdb/Makefile.in
+++ b/gdb/Makefile.in
@@ -1171,6 +1171,7 @@ COMMON_SFILES = \
 	reggroups.c \
 	remote.c \
 	remote-fileio.c \
+	remote-machine-id.c \
 	remote-notif.c \
 	reverse.c \
 	run-on-main-thread.c \
@@ -1447,6 +1448,7 @@ HFILES_NO_SRCDIR = \
 	regset.h \
 	remote.h \
 	remote-fileio.h \
+	remote-machine-id.h \
 	remote-notif.h \
 	riscv-fbsd-tdep.h \
 	riscv-ravenscar-thread.h \
@@ -1567,6 +1569,7 @@ HFILES_NO_SRCDIR = \
 	nat/gdb_thread_db.h \
 	nat/fork-inferior.h \
 	nat/linux-btrace.h \
+	nat/linux-machine-id.h \
 	nat/linux-namespaces.h \
 	nat/linux-nat.h \
 	nat/linux-osdata.h \
diff --git a/gdb/NEWS b/gdb/NEWS
index d78929c1398..2b1f265f5b8 100644
--- a/gdb/NEWS
+++ b/gdb/NEWS
@@ -293,6 +293,11 @@ qDefaultExecAndArgs
   which the server was started.  If no such information was given to
   the server then this is reflected in the reply.
 
+qMachineId
+  This packet returns an identifier that allows GDB to determine if
+  the remote server and GDB are running on the same host, and can see
+  the same filesystem.
+
 *** Changes in GDB 13
 
 * MI version 1 is deprecated, and will be removed in GDB 14.
diff --git a/gdb/configure.nat b/gdb/configure.nat
index aabcdeff989..92a8c4592bc 100644
--- a/gdb/configure.nat
+++ b/gdb/configure.nat
@@ -58,7 +58,7 @@ case ${gdb_host} in
 		proc-service.o \
 		linux-thread-db.o linux-nat.o nat/linux-osdata.o linux-fork.o \
 		nat/linux-procfs.o nat/linux-ptrace.o nat/linux-waitpid.o \
-		nat/linux-personality.o nat/linux-namespaces.o'
+		nat/linux-personality.o nat/linux-machine-id.o nat/linux-namespaces.o'
 	NAT_CDEPS='$(srcdir)/proc-service.list'
 	LOADLIBES='-ldl $(RDYNAMIC)'
 	;;
diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
index fa062e25bca..6b417ab85d8 100644
--- a/gdb/doc/gdb.texinfo
+++ b/gdb/doc/gdb.texinfo
@@ -41730,6 +41730,7 @@
 * Traceframe Info Format::
 * Branch Trace Format::
 * Branch Trace Configuration Format::
+* Machine-Id Details::
 @end menu
 
 @node Overview
@@ -44812,6 +44813,49 @@
 Indicates an error was encountered.
 @end table
 
+@anchor{Machine-Id Packet}
+@cindex query remote machine-id, remote request
+@cindex @samp{qMachineId} packet
+@item qMachineId
+Access a remote @dfn{machine-id}.  The machine-id returned in response
+to this packet is compared to a machine-id created on the host the
+debugger is running on.  If the two machine-ids match then the debugger
+will assume that the remote server and the debugger are running on the
+same machine, and can access the same files.  The debugger will use
+this knowledge to avoid unnecessary copying of files from the remote
+(@pxref{File-I/O Remote Protocol Extension}).
+
+Reply:
+@table @samp
+@item predicate;@var{key}=@var{value}@r{[};@var{key}=@var{value}@r{]*}
+Returning a string starting with @samp{predicate;}, followed by one or
+more @var{key}=@var{value} pairs, defines a machine-id.  Each
+@var{key} and @var{value} is a non-empty string that must not contain
+the characters @samp{;} or @samp{=}.  Each @var{key} must be unique
+within a single reply.  See @ref{Machine-Id Details}, for details of
+valid @var{key}s and their @var{value}s.
+
+@item remote
+Returning the string @samp{remote} indicates that the remote server
+should always be considered truly remote, and files the debugger needs
+to access should be first copied from the remote.
+
+@item local
+Returning the string @samp{local} indicates that the remote server
+should always be treated as running on the same host as the debugger.
+The debugger will avoid copying files from the remote server, and will
+instead try to access files directly.
+
+Sending this reply will rarely be appropriate, as it implies certainty
+about where the remote server and debugger are running.  However, in
+some tightly controlled environments this might be appropriate.  Using
+a @samp{predicate} based reply would be better if at all possible.
+
+@item E @var{NN}
+@itemx E.errtext
+Indicates an error was encountered.
+@end table
+
 @item Qbtrace:bts
 Enable branch tracing for the current thread using Branch Trace Store.
 
@@ -47519,6 +47563,88 @@
 <!ATTLIST pt	size	CDATA	#IMPLIED>
 @end smallexample
 
+@node Machine-Id Details
+@section Machine-Id Details
+@cindex machine-id key-value pair details
+
+This section describes the valid @var{key}s and @var{value}s that can
+be returned in response to the @samp{qMachineId} packet
+(@pxref{Machine-Id Packet}), specifically, when using a
+@samp{predicate;} based reply.  Other reply types for the
+@samp{qMachineId} packet don't include @var{key}s and @var{value}s.
+
+There are two types of @var{key}, master keys and secondary keys.  A
+reply should contain exactly one master key, and zero or more
+secondary keys.  The set of valid secondary keys will depend on which
+master key is used.
+
+No @var{key} or @var{value} can contain the characters @samp{;} or
+@samp{=}.
+
+The order of the @var{key}/@var{value} pairs in the reply does not
+matter.
+
+Currently supported master and secondary keys are described below:
+
+@table @samp
+@item linux-boot-id
+The value for this master key contains the contents of the first line
+of the file @file{/proc/sys/kernel/random/boot_id} with any @samp{-}
+characters filtered out.
+
+@table @samp
+@item cuserid
+The value for this secondary key contains a username string associated
+with the effective user ID of the remote server process.
+
+@item namespaces
+The value for this secondary key contains a set of namespace
+identifiers for the remote server process.  Currently the only two
+namespaces that are identified are @file{mnt} and @file{user}.  The
+format of this string is @samp{mnt:@var{mnt-id},user:@var{user-id}}.
+
+The @var{mnt-id} and @var{user-id} are either the character @samp{-}
+if the particular namespace is not supported on this host, or the
+inode of the underlying namespace (as obtained by a @code{stat()}
+call) formatted in hex without any @samp{0x} prefix.
+
+An example of this string is:
+
+@smallexample
+mnt:f0000000,user:effffffd
+@end smallexample
+@end table
+
+An example reply using @samp{linux-boot-id} and all secondary keys is
+below.  Due to its length the line had to be wrapped to fit on the
+page, but the actual format does not include any extra newlines or
+white space:
+
+@smallexample
+@c Not sure if wrapping this is the best solution.  If I don't wrap
+@c it, then in the pdf version of the manual this overflows the page
+@c and is truncated, but if I do wrap it, then maybe it's not clear to
+@c the reader what the actual format is?
+predicate;linux-boot-id=28d154b3b1518383b3b4efcbd221fa7d;
+       cuserid=username;
+       namespaces=mnt:f0000000,user:effffffd
+@end smallexample
+
+@end table
+
+When matching a machine-id @value{GDBN} first checks the reply for a
+master key that it understands.  If a suitable key is found
+@value{GDBN} checks that the value for the master key matches its
+value for the master key.  If the master key value matches, then
+@value{GDBN} checks all the remaining @var{key}/@var{value} pairs;
+each @var{key} must be a known secondary key associated with the
+previously matched master key, and the secondary @var{value} must
+match @value{GDBN}'s computed value.
+
+If all @var{key}s are known, and their @var{value}s match, then
+@value{GDBN} considers the machine-id a match; otherwise, the
+machine-id is considered non-matching.
+
 @include agentexpr.texi
 
 @node Target Descriptions
diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
index e58d67183b5..9f8db6bb7cc 100644
--- a/gdb/linux-nat.c
+++ b/gdb/linux-nat.c
@@ -69,6 +69,8 @@
 #include "gdbsupport/scope-exit.h"
 #include "gdbsupport/gdb-sigmask.h"
 #include "gdbsupport/common-debug.h"
+#include "remote-machine-id.h"
+#include "nat/linux-machine-id.h"
 #include <unordered_map>
 
 /* This comment documents high-level logic of this file.
@@ -4491,6 +4493,42 @@ current_lwp_ptid (void)
   return inferior_ptid;
 }
 
+struct linux_nat_machine_id_validation : public machine_id_validation
+{
+  linux_nat_machine_id_validation ()
+    : machine_id_validation ("linux-boot-id")
+  { /* Nothing.  */ }
+
+  bool check_master_key (const std::string &value) override
+  {
+    std::string boot_id = gdb_linux_machine_id_linux_boot_id ();
+    if (boot_id.empty ())
+      return false;
+    return boot_id == value;
+  }
+
+  bool check_secondary_key (const std::string &key,
+			    const std::string &value) override
+  {
+    if (key == "cuserid")
+      {
+	std::string username = gdb_linux_machine_id_cuserid ();
+	if (username.empty ())
+	  return false;
+	return username == value;
+      }
+    else if (key == "namespaces")
+      {
+	std::string namespaces = gdb_linux_machine_id_namespaces ();
+	if (namespaces.empty ())
+	  return false;
+	return namespaces == value;
+      }
+
+    return false;
+  }
+};
+
 void _initialize_linux_nat ();
 void
 _initialize_linux_nat ()
@@ -4528,6 +4566,10 @@ Enables printf debugging output."),
   sigemptyset (&blocked_mask);
 
   lwp_lwpid_htab_create ();
+
+  std::unique_ptr<linux_nat_machine_id_validation> validation
+    (new linux_nat_machine_id_validation);
+  register_machine_id_validation (std::move (validation));
 }
 \f
 
diff --git a/gdb/nat/linux-machine-id.c b/gdb/nat/linux-machine-id.c
new file mode 100644
index 00000000000..afa32177dbd
--- /dev/null
+++ b/gdb/nat/linux-machine-id.c
@@ -0,0 +1,85 @@
+/* Copyright (C) 2023 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include "gdbsupport/common-defs.h"
+
+#include "nat/linux-machine-id.h"
+#include "nat/linux-namespaces.h"
+
+#include "safe-ctype.h"
+
+#include <unistd.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+
+/* See nat/linux-machine-id.h.  */
+
+std::string
+gdb_linux_machine_id_linux_boot_id ()
+{
+  int fd = open ("/proc/sys/kernel/random/boot_id", O_RDONLY);
+  if (fd < 0)
+    return "";
+
+  std::string boot_id;
+  char buf;
+  while (read (fd, &buf, sizeof (buf)) == sizeof (buf))
+    {
+      if (ISXDIGIT (buf))
+	boot_id += buf;
+    }
+
+  close (fd);
+
+  return boot_id;
+}
+
+/* See nat/linux-machine-id.h.  */
+
+std::string
+gdb_linux_machine_id_cuserid ()
+{
+  char cuserid_str[L_cuserid];
+  char *res = cuserid (cuserid_str);
+  if (res == nullptr)
+    return "";
+
+  return std::string (cuserid_str);
+}
+
+/* See nat/linux-machine-id.h.  */
+
+std::string
+gdb_linux_machine_id_namespaces ()
+{
+  std::vector<std::string> namespaces;
+  namespaces.emplace_back (linux_ns_id (LINUX_NS_MNT));
+  namespaces.emplace_back (linux_ns_id (LINUX_NS_USER));
+
+  /* Ensure namespaces are sorted alphabetically.  */
+  std::sort (namespaces.begin (), namespaces.end ());
+
+  std::string str;
+  for (const std::string &n : namespaces)
+    {
+      if (!str.empty ())
+	str += ",";
+      str += n;
+    }
+
+  return str;
+}
diff --git a/gdb/nat/linux-machine-id.h b/gdb/nat/linux-machine-id.h
new file mode 100644
index 00000000000..4167aca4bcb
--- /dev/null
+++ b/gdb/nat/linux-machine-id.h
@@ -0,0 +1,75 @@
+/* Copyright (C) 2023 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#ifndef NAT_LINUX_MACHINE_ID_H
+#define NAT_LINUX_MACHINE_ID_H
+
+#include <string>
+
+/* Return a string that contains the Linux boot-id, formatted for use in
+   the qMachineId packet.  If anything goes wrong then an empty string is
+   returned, otherwise a non-empty string is returned.
+
+   This is used by gdbserver when sending the reply to a qMachineId packet,
+   and used by GDB to check the value returned for a qMachineId
+   packet.  */
+
+extern std::string gdb_linux_machine_id_linux_boot_id ();
+
+/* Return a string that contains the result of calling cuserid, that is, a
+   username associated with the effective user-id of the current process.
+   If anything goes wrong then an empty string is returned, otherwise a
+   non-empty string is returned.
+
+   This is used by gdbserver when sending the reply to a qMachineId packet,
+   and used by GDB to check the value returned for a qMachineId
+   packet.  */
+
+extern std::string gdb_linux_machine_id_cuserid ();
+
+/* Return a string describing various namespaces of the current process.
+   The format of the returned string is this:
+
+   <STRING> ::= <DESC-LIST>
+
+   <DESC-LIST> ::= <DESC-ITEM>
+                 | <DESC-ITEM> "," <DESC-LIST>
+
+   <DESC-ITEM> ::= <NAME> ":" <INODE-NUMBER>
+                 | <NAME> ":" "-"
+
+   The <DESC-ITEM>s in the <DESC-LIST> are sorted alphabetically in
+   ascending order.
+
+   Each <NAME> is the name of a namespace, as found in /proc/self/ns/,
+   e.g. 'mnt', 'pid', 'user', etc.
+
+   The <INODE-NUMBER> is the inode of the underlying namespace (as returned
+   by a stat call), formatted as hex with no '0x' prefix.  If the namespace
+   is not supported on the current host then the <INODE-NUMBER> is replaced
+   with the character "-".
+
+   If anything goes wrong building the namespace string then an empty
+   string is returned.
+
+   This is used by gdbserver when sending the reply to a qMachineId packet,
+   and used by GDB to check the value returned for a qMachineId
+   packet.  */
+
+extern std::string gdb_linux_machine_id_namespaces ();
+
+#endif /* NAT_LINUX_MACHINE_ID_H */
diff --git a/gdb/nat/linux-namespaces.c b/gdb/nat/linux-namespaces.c
index 4b1fee18425..3bec724c9c4 100644
--- a/gdb/nat/linux-namespaces.c
+++ b/gdb/nat/linux-namespaces.c
@@ -1049,3 +1049,16 @@ linux_mntns_readlink (pid_t pid, const char *filename,
 
   return ret;
 }
+
+/* See nat/linux-namespaces.h.  */
+
+std::string
+linux_ns_id (enum linux_ns_type type)
+{
+  struct linux_ns *ns = linux_ns_get_namespace (type);
+  gdb_assert (ns->initialized);
+  if (!ns->supported)
+    return (std::string (ns->filename) + ":-");
+  return (std::string (ns->filename) + ":"
+	  + string_printf ("%llx", (unsigned long long) ns->id));
+}
diff --git a/gdb/nat/linux-namespaces.h b/gdb/nat/linux-namespaces.h
index a45b99cf650..523a6e709f1 100644
--- a/gdb/nat/linux-namespaces.h
+++ b/gdb/nat/linux-namespaces.h
@@ -73,4 +73,18 @@ extern int linux_mntns_unlink (pid_t pid, const char *filename);
 extern ssize_t linux_mntns_readlink (pid_t pid, const char *filename,
 				     char *buf, size_t bufsiz);
 
+/* Return an identification string representing namespace TYPE.  The
+   string has the format 'name:inode' or 'name:-'.  If something goes
+   wrong obtaining the information about namespace TYPE then an empty
+   string is returned.
+
+   The 'name' part of the returned string is the short name found in
+   /proc/self/ns/, e.g. 'mnt', 'user', etc.
+
+   The 'inode' part of the returned string is the inode of the underlying
+   namespace formatted as hex but with no '0x' prefix.  The character '-'
+   is returned if namespace TYPE is not supported on this host.  */
+
+extern std::string linux_ns_id (enum linux_ns_type type);
+
 #endif /* NAT_LINUX_NAMESPACES_H */
diff --git a/gdb/remote-machine-id.c b/gdb/remote-machine-id.c
new file mode 100644
index 00000000000..155bf5b3a0c
--- /dev/null
+++ b/gdb/remote-machine-id.c
@@ -0,0 +1,69 @@
+/* Copyright (C) 2023 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include "defs.h"
+#include "remote-machine-id.h"
+
+#include <string>
+#include <vector>
+
+/* List of all registered machine_id_validation objects.  */
+static std::vector<std::unique_ptr<machine_id_validation>> validation_list;
+
+/* See remote-machine-id.h.  */
+
+void
+register_machine_id_validation (std::unique_ptr<machine_id_validation> &&validation)
+{
+  validation_list.emplace_back (std::move (validation));
+}
+
+/* See remote-machine-id.h.  */
+
+bool
+validate_machine_id (const std::unordered_map<std::string, std::string> &kv_pairs)
+{
+  for (const auto &validator : validation_list)
+    {
+      const auto kv_master = kv_pairs.find (validator->master_key ());
+      if (kv_master == kv_pairs.end ())
+	continue;
+
+      if (!validator->check_master_key (kv_master->second))
+	continue;
+
+      /* Check all the secondary keys in KV_PAIRS.  */
+      bool match_failed = false;
+      for (const auto &kv : kv_pairs)
+	{
+	  if (kv.first == validator->master_key ())
+	    continue;
+
+	  if (!validator->check_secondary_key (kv.first, kv.second))
+	    {
+	      match_failed = true;
+	      break;
+	    }
+	}
+
+      if (!match_failed)
+	return true;
+    }
+
+  /* None of the machine_id_validation objects matched KV_PAIRS.  */
+  return false;
+}
diff --git a/gdb/remote-machine-id.h b/gdb/remote-machine-id.h
new file mode 100644
index 00000000000..8dd3b377591
--- /dev/null
+++ b/gdb/remote-machine-id.h
@@ -0,0 +1,108 @@
+/* Copyright (C) 2023 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#ifndef REMOTE_MACHINE_ID_H
+#define REMOTE_MACHINE_ID_H
+
+#include <memory>
+#include <unordered_map>
+
+/* A base class from which machine-id validation objects can be created.
+   A remote target can send GDB a machine-id, which can be used to check
+   if the remote target and GDB are running on the same machine, and have
+   a common view of the file-system.  Knowing this allows GDB to optimise
+   some of its interactions with the remote target.
+
+   A machine-id consists of a set of key-value pairs, where both keys and
+   values are std::string objects.  A machine-id has a single master key
+   and some number of secondary keys.
+
+   Within GDB the native target will register one or more of these objects
+   by calling register_machine_id_validation.  When GDB receives a
+   machine-id from a remote-target each registered machine_id_validation
+   object will be checked in turn to see if it matches the machine-id.
+   If any machine_id_validation matches then this indicates that GDB and
+   the remote target are on the same machine.  */
+struct machine_id_validation
+{
+  /* Constructor.  MASTER_KEY is the name of the master key that this
+     machine_id_validation object validates for.  */
+  machine_id_validation (std::string &&master_key)
+    : m_master_key (std::move (master_key))
+  { /* Nothing.  */ }
+
+  /* Destructor.  */
+  virtual ~machine_id_validation ()
+  { /* Nothing.  */ }
+
+  /* Return a reference to the master key.  */
+  const std::string &
+  master_key () const
+  {
+    return m_master_key;
+  }
+
+  /* VALUE is a string passed from the remote target corresponding to the
+     key for master_key().  If the remote target didn't pass a key
+     matching master_key() then this function should not be called.
+
+     Return true if VALUE matches the value calculated for the host on
+     which GDB is currently running.  */
+  virtual bool
+  check_master_key (const std::string &value) = 0;
+
+  /* This function will only be called for a machine-id which contains a
+     key matching master_key(), and for which check_master_key() returned
+     true.
+
+     KEY and VALUE are a key-value pair passed from the remote target.
+     This function should return true if KEY is known, and VALUE matches
+     the value calculated for the host on which GDB is running.  If KEY is
+     not known, or VALUE doesn't match, then this function should return
+     false.  */
+  virtual bool
+  check_secondary_key (const std::string &key, const std::string &value) = 0;
+
+private:
+  /* The master key for which this object validates machine-ids.  */
+  std::string m_master_key;
+};
+
+/* Register a new machine-id validation object.  */
+
+extern void register_machine_id_validation
+  (std::unique_ptr<machine_id_validation> &&validation);
+
+/* KV_PAIRS contains the machine-id obtained from the remote target, the
+   keys are the index into the map, and the values are the values of the
+   map.  These pairs are checked against all of the registered
+   machine_id_validation objects.
+
+   If any machine_id_validation matches all the data in KV_PAIRS then this
+   function returns true, otherwise, this function returns false.
+
+   For KV_PAIRS to match against a machine_id_validation object, KV_PAIRS
+   must contain a key matching machine_id_validation::master_key(), and the
+   value for that key must return true when passed to the function
+   machine_id_validation::check_master_key().  Then, for every other
+   key/value pair machine_id_validation::check_secondary_key() must return
+   true.  */
+
+extern bool validate_machine_id
+  (const std::unordered_map<std::string, std::string> &kv_pairs);
+
+#endif /* REMOTE_MACHINE_ID_H */
diff --git a/gdb/remote.c b/gdb/remote.c
index 77dc9607997..1a1e963f832 100644
--- a/gdb/remote.c
+++ b/gdb/remote.c
@@ -80,6 +80,7 @@
 #include "async-event.h"
 #include "gdbsupport/selftest.h"
 #include "cli/cli-style.h"
+#include "remote-machine-id.h"
 
 /* The remote target.  */
 
@@ -306,6 +307,9 @@ enum {
   /* Support the qDefaultExecAndArgs packet.  */
   PACKET_qDefaultExecAndArgs,
 
+  /* Support the qMachineId packet.  */
+  PACKET_qMachineId,
+
   PACKET_MAX
 };
 
@@ -557,6 +561,15 @@ class remote_state
      this can go away.  */
   int wait_forever_enabled_p = 1;
 
+  /* Set to true if the remote target returned a machine-id (see
+     qMachineId packet) which matched one of the registered validation
+     objects.  This indicates that the remote target is running on the
+     same host as GDB (and can see the same filesystem as GDB).
+
+     Otherwise, this is false, which indicates the remote target should be
+     treated as truly remote.  */
+  bool remote_target_is_local_p = false;
+
 private:
   /* Mapping of remote protocol data for each gdbarch.  Usually there
      is only one entry here, though we may see more with stubs that
@@ -1341,6 +1354,12 @@ class remote_target : public process_stratum_target
   /* Fetch the executable filename and argument string from the remote.  */
   remote_exec_and_args_info fetch_default_executable_and_arguments ();
 
+  /* Send the qMachineId packet and process the reply.  Update the
+     remote_state::remote_target_is_local_p field based on the result.  We
+     assume that when this is called remote_target_is_local_p will be
+     false by default.  */
+  void fetch_remote_machine_id ();
+
   bool start_remote_1 (int from_tty, int extended_p);
 
   /* The remote state.  Don't reference this directly.  Use the
@@ -5055,6 +5074,151 @@ struct scoped_mark_target_starting
   scoped_restore_tmpl<bool> m_restore_starting_up;
 };
 
+/* Extract a machine-id key/value pair from the null-terminated string
+   **STRP, and update STRP to point to the first character after the parsed
+   key/value pair, including skipping any ';' that appears after the
+   key/value pair.
+
+   A key/value pair consists of two strings separated by an '=' character,
+   neither string will contain a '=' or ';' character.
+
+   Characters are read from *STRP until '=', ';' or the null character are
+   found, this forms the key string.  If ';' or null character were found
+   then the value string is empty.  Otherwise, '=' was found, the '=' is
+   skipped, and characters are read until ';' or the null character are
+   found, this forms the value string.
+
+   This function will throw an error if the key string is found to be zero
+   length (e.g. '=abc' is invalid), or if the value string contains a '='
+   character (e.g. 'foo=def=ghi' is invalid).
+
+   The pair <key, value> is then returned.  */
+
+static
+std::pair<std::string, std::string> extract_kv_pair (const char **strp)
+{
+  gdb_assert (strp != nullptr);
+  gdb_assert (*strp != nullptr);
+  gdb_assert (**strp != '\0');
+
+  std::string key, value;
+  const char *str = *strp;
+  while (*str != '=' && *str != ';' && *str != '\0')
+    {
+      key += *str;
+      ++str;
+    }
+
+  if (key.empty ())
+    error (_("empty key while parsing '%s'"), *strp);
+
+  if (*str == '\0' || *str == ';')
+    {
+      if (*str == ';')
+	++str;
+      *strp = str;
+      return { key, "" };
+    }
+
+  gdb_assert (*str == '=');
+  ++str;
+
+  while (*str != ';' && *str != '\0')
+    {
+      if (*str == '=')
+	error (_("found '=' character in value string while parsing '%s'"),
+	       *strp);
+      value += *str;
+      ++str;
+    }
+
+  if (*str == ';')
+    ++str;
+  *strp = str;
+  return { key, value };
+}
+
+/* See declaration in class above.   */
+
+void
+remote_target::fetch_remote_machine_id ()
+{
+  struct remote_state *rs = get_remote_state ();
+
+  /* This should only be called for newly created remote_target objects, so
+     the remote_state::remote_target_is_local_p within the remote_target
+     should be false by default.  */
+  gdb_assert (!rs->remote_target_is_local_p);
+
+  if (m_features.packet_support (PACKET_qMachineId) == PACKET_DISABLE)
+    return;
+
+  putpkt ("qMachineId");
+  getpkt (&rs->buf, 0);
+
+  auto packet_result = m_features.packet_ok (rs->buf, PACKET_qMachineId);
+  if (packet_result == PACKET_UNKNOWN)
+    return;
+
+  if (packet_result == PACKET_ERROR)
+    {
+      warning (_("Remote error: %s"), rs->buf.data ());
+      return;
+    }
+
+  /* If the machine-id is the string 'remote' then we are done.  The
+     remote_target_is_local_p field is false by default.  */
+  const char *id = rs->buf.data ();
+  if (startswith (id, "remote") && (id[6] == ';' || id[6] == '\0'))
+    return;
+
+  /* If the machine-id is the string 'local' then the remote claims to
+     "know" that it is on the same machine as GDB.  Good luck with that.  */
+  if (startswith (id, "local") && (id[5] == ';' || id[5] == '\0'))
+    {
+      rs->remote_target_is_local_p = true;
+      return;
+    }
+
+  /* If the machine-id starts with the string 'predicate;', then
+     everything after that string is the part of the machine-id that we
+     need to match against to confirm we are on the same machine as the
+     remote target.  */
+  static const char *predicate_prefix = "predicate;";
+  if (!startswith (id, predicate_prefix))
+    return;
+  id += strlen (predicate_prefix);
+
+  /* Split the ID string into key/value pairs.  */
+  std::unordered_map<std::string, std::string> kv;
+  try
+    {
+      while (*id != '\0')
+	{
+	  auto kv_pair = extract_kv_pair (&id);
+	  kv.emplace (std::move (kv_pair.first), std::move (kv_pair.second));
+	}
+    }
+  catch (const gdb_exception &ex)
+    {
+      /* Let the user know something went wrong, and then return, treating
+	 the target as truly remote.  */
+      warning (_("Error parsing qMachineId packet: %s"), ex.what ());
+      return;
+    }
+
+  /* If there were no predicates, then this looks like a badly behaved
+     remote target; warn the user, and assume the target is remote.  */
+  if (kv.empty ())
+    {
+      warning (_("no machine-id predicates in qMachineId packet reply"));
+      return;
+    }
+
+  /* Check to see if the remote machine is actually local.  */
+  rs->remote_target_is_local_p = validate_machine_id (kv);
+}
+
 /* See declaration in class above.   */
 
 remote_exec_and_args_info
@@ -5188,6 +5352,8 @@ remote_target::start_remote_1 (int from_tty, int extended_p)
 	rs->noack_mode = 1;
     }
 
+  fetch_remote_machine_id ();
+
   auto exec_and_args = fetch_default_executable_and_arguments ();
 
   /* Update the inferior with the executable and argument string from the
@@ -15640,6 +15806,9 @@ Show the maximum size of the address (in bits) in a memory packet."), NULL,
   add_packet_config_cmd (PACKET_qDefaultExecAndArgs, "qDefaultExecAndArgs",
 			 "fetch-exec-and-args", 0);
 
+  add_packet_config_cmd (PACKET_qMachineId, "qMachineId",
+			 "fetch-machine-id", 0);
+
   /* Assert that we've registered "set remote foo-packet" commands
      for all packet configs.  */
   {
diff --git a/gdbserver/Makefile.in b/gdbserver/Makefile.in
index 39cb9e7a151..c746a950bed 100644
--- a/gdbserver/Makefile.in
+++ b/gdbserver/Makefile.in
@@ -220,6 +220,7 @@ SFILES = \
 	$(srcdir)/../gdb/nat/aarch64-mte-linux-ptrace.c \
 	$(srcdir)/../gdb/nat/aarch64-sve-linux-ptrace.c \
 	$(srcdir)/../gdb/nat/linux-btrace.c \
+	$(srcdir)/../gdb/nat/linux-machine-id.c \
 	$(srcdir)/../gdb/nat/linux-namespaces.c \
 	$(srcdir)/../gdb/nat/linux-osdata.c \
 	$(srcdir)/../gdb/nat/linux-personality.c \
diff --git a/gdbserver/configure.srv b/gdbserver/configure.srv
index f0101994529..838d446d53e 100644
--- a/gdbserver/configure.srv
+++ b/gdbserver/configure.srv
@@ -26,7 +26,7 @@ ipa_ppc_linux_regobj="powerpc-32l-ipa.o powerpc-altivec32l-ipa.o powerpc-vsx32l-
 
 # Linux object files.  This is so we don't have to repeat
 # these files over and over again.
-srv_linux_obj="linux-low.o nat/linux-osdata.o nat/linux-procfs.o nat/linux-ptrace.o nat/linux-waitpid.o nat/linux-personality.o nat/linux-namespaces.o fork-child.o nat/fork-inferior.o"
+srv_linux_obj="linux-low.o nat/linux-osdata.o nat/linux-procfs.o nat/linux-ptrace.o nat/linux-waitpid.o nat/linux-personality.o nat/linux-machine-id.o nat/linux-namespaces.o fork-child.o nat/fork-inferior.o"
 
 # Input is taken from the "${host}" and "${target}" variables.
 
diff --git a/gdbserver/linux-low.cc b/gdbserver/linux-low.cc
index e1806ade82f..7d7ed36b5f9 100644
--- a/gdbserver/linux-low.cc
+++ b/gdbserver/linux-low.cc
@@ -61,6 +61,7 @@
 #include <elf.h>
 #endif
 #include "nat/linux-namespaces.h"
+#include "nat/linux-machine-id.h"
 
 #ifndef O_LARGEFILE
 #define O_LARGEFILE 0
@@ -6940,6 +6941,29 @@ linux_process_target::thread_pending_child (thread_info *thread)
   return get_lwp_thread (child);
 }
 
+/* See target.h.  */
+
+std::string
+linux_process_target::get_machine_id () const
+{
+  std::string boot_id = gdb_linux_machine_id_linux_boot_id ();
+  if (boot_id.empty ())
+    return "";
+  boot_id = "linux-boot-id=" + boot_id;
+
+  std::string username = gdb_linux_machine_id_cuserid ();
+  if (username.empty ())
+    return "";
+  username = "cuserid=" + username;
+
+  std::string namespaces = gdb_linux_machine_id_namespaces ();
+  if (namespaces.empty ())
+    return "";
+  namespaces = "namespaces=" + namespaces;
+
+  return boot_id + ";" + username + ";" + namespaces;
+}
+
 /* Default implementation of linux_target_ops method "set_pc" for
    32-bit pc register which is literally named "pc".  */
 
diff --git a/gdbserver/linux-low.h b/gdbserver/linux-low.h
index 6dc93197f5c..1728370d1f2 100644
--- a/gdbserver/linux-low.h
+++ b/gdbserver/linux-low.h
@@ -317,6 +317,8 @@ class linux_process_target : public process_stratum_target
 
   bool supports_catch_syscall () override;
 
+  std::string get_machine_id () const override;
+
   /* Return the information to access registers.  This has public
      visibility because proc-service uses it.  */
   virtual const regs_info *get_regs_info () = 0;
diff --git a/gdbserver/server.cc b/gdbserver/server.cc
index e749194e039..dda6406854a 100644
--- a/gdbserver/server.cc
+++ b/gdbserver/server.cc
@@ -2730,6 +2730,18 @@ handle_query (char *own_buf, int packet_len, int *new_packet_len_p)
       return;
     }
 
+  if (strcmp ("qMachineId", own_buf) == 0)
+    {
+      std::string machine_id = the_target->get_machine_id ();
+      if (!machine_id.empty ())
+	machine_id = std::string ("predicate;") + machine_id;
+      else
+	machine_id = std::string ("remote");
+
+      strcpy (own_buf, machine_id.c_str ());
+      return;
+    }
+
   /* Otherwise we didn't know what packet it was.  Say we didn't
      understand it.  */
   own_buf[0] = 0;
diff --git a/gdbserver/target.cc b/gdbserver/target.cc
index f8e592d20c3..2d43dfbe8de 100644
--- a/gdbserver/target.cc
+++ b/gdbserver/target.cc
@@ -442,6 +442,14 @@ process_stratum_target::store_memtags (CORE_ADDR address, size_t len,
   gdb_assert_not_reached ("target op store_memtags not supported");
 }
 
+/* See target.h.  */
+
+std::string
+process_stratum_target::get_machine_id () const
+{
+  return "";
+}
+
 int
 process_stratum_target::read_offsets (CORE_ADDR *text, CORE_ADDR *data)
 {
diff --git a/gdbserver/target.h b/gdbserver/target.h
index d993e361b76..092a6d9d3df 100644
--- a/gdbserver/target.h
+++ b/gdbserver/target.h
@@ -508,6 +508,15 @@ class process_stratum_target
      Returns true if successful and false otherwise.  */
   virtual bool store_memtags (CORE_ADDR address, size_t len,
 			      const gdb::byte_vector &tags, int type);
+
+  /* Return a string representing a machine-id suitable for returning
+     within a qMachineId packet response, but don't include the
+     'predicate;' prefix.
+
+     If the current target doesn't support machine-id, or if we fail to
+     build the machine-id for any reason, then return an empty string; the
+     server will then send back a suitable reply to the debugger.  */
+  virtual std::string get_machine_id () const;
 };
 
 extern process_stratum_target *the_target;
-- 
2.25.4
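
For readers who want to experiment with the reply format outside of GDB, the
parsing rules described in the patch ('remote', 'local', or 'predicate;'
followed by ';'-separated key=value pairs) can be sketched as a stand-alone
C++ fragment.  This is only an illustration under stated assumptions: the
names reply_kind, parsed_reply, split_predicates and parse_machine_id_reply
are hypothetical and do not appear in the patch, and std::invalid_argument
stands in for GDB's error ().  Unlike the patch, which silently treats an
unrecognised reply as remote, the sketch reports it as 'unknown' so the
caller can decide.

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical names for illustration only; none of these exist in GDB.
enum class reply_kind { remote, local, predicate, unknown };

struct parsed_reply
{
  reply_kind kind = reply_kind::unknown;
  std::map<std::string, std::string> predicates;
};

// Split "k1=v1;k2=v2" into key/value pairs.  Following the rules in the
// patch's extract_kv_pair comment: an empty key is an error, a '=' inside
// a value is an error, and a key with no '=' gets an empty value.
static std::map<std::string, std::string>
split_predicates (const std::string &text)
{
  std::map<std::string, std::string> result;
  std::string::size_type pos = 0;

  while (pos < text.size ())
    {
      std::string::size_type end = text.find (';', pos);
      if (end == std::string::npos)
        end = text.size ();

      std::string item = text.substr (pos, end - pos);
      std::string::size_type eq = item.find ('=');
      std::string key = item.substr (0, eq);
      std::string value
        = (eq == std::string::npos) ? "" : item.substr (eq + 1);

      if (key.empty ())
        throw std::invalid_argument ("empty key while parsing '" + item + "'");
      if (value.find ('=') != std::string::npos)
        throw std::invalid_argument ("found '=' in value while parsing '"
                                     + item + "'");

      // Like unordered_map::emplace in the patch, the first occurrence of
      // a duplicated key wins.
      result.emplace (key, value);
      pos = end + 1;	// Step over the ';' (or past the end of the string).
    }

  return result;
}

// Classify a qMachineId reply and, for 'predicate;' replies, parse the
// key/value pairs that follow the prefix.
static parsed_reply
parse_machine_id_reply (const std::string &reply)
{
  static const std::string prefix = "predicate;";
  parsed_reply out;

  if (reply == "remote" || reply.rfind ("remote;", 0) == 0)
    out.kind = reply_kind::remote;
  else if (reply == "local" || reply.rfind ("local;", 0) == 0)
    out.kind = reply_kind::local;
  else if (reply.rfind (prefix, 0) == 0)
    {
      out.kind = reply_kind::predicate;
      out.predicates = split_predicates (reply.substr (prefix.size ()));
    }

  return out;
}
```

With this sketch, a reply such as 'predicate;linux-boot-id=abc;cuserid=andrew'
would be classified as a predicate reply carrying two pairs, while
'predicate;=abc' would throw because the key is empty.  Comparing the parsed
pairs against the local machine (the job of validate_machine_id in the patch)
is deliberately left out.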


Thread overview: 46+ messages
2023-08-16 15:54 [PATCH 00/10] Improve GDB/gdbserver experience when using a local gdbserver Andrew Burgess
2023-08-16 15:54 ` [PATCH 01/10] gdb: have remote_target::extended_remote_run take the exec filename Andrew Burgess
2023-08-23  9:30   ` Alexandra Petlanova Hajkova
2023-08-16 15:54 ` [PATCH 02/10] gdb: improve how 'remote exec-file' is stored and accessed Andrew Burgess
2023-08-23  8:44   ` Alexandra Petlanova Hajkova
2023-08-16 15:54 ` [PATCH 03/10] gdb: improve show text and help text for 'remote exec-file' Andrew Burgess
2023-08-23 11:36   ` Mark Wielaard
2023-08-24  8:56   ` Alexandra Petlanova Hajkova
2023-08-16 15:55 ` [PATCH 04/10] gdb/gdbserver: add new qDefaultExecAndArgs packet Andrew Burgess
2023-08-16 16:36   ` Eli Zaretskii
2023-08-28 15:35   ` Tom Tromey
2023-08-16 15:55 ` [PATCH 05/10] gdb: detect when gdbserver has no default executable set Andrew Burgess
2023-08-16 15:55 ` [PATCH 06/10] gdb: make use of is_target_filename Andrew Burgess
2023-08-23 13:35   ` Mark Wielaard
2023-08-16 15:55 ` [PATCH 07/10] gdb: add qMachineId packet Andrew Burgess
2023-08-16 16:34   ` Eli Zaretskii
2023-08-25 14:49     ` Andrew Burgess
2023-08-25 15:01       ` Eli Zaretskii
2023-09-26 14:42         ` Andrew Burgess
2023-09-29  7:45           ` Eli Zaretskii
2023-08-22  2:39   ` Thiago Jung Bauermann
2023-08-23  9:24   ` Mark Wielaard
2023-08-23 11:36     ` Andrew Burgess
2023-08-28 16:06   ` Tom Tromey
2023-08-16 15:55 ` [PATCH 08/10] gdb: remote filesystem can be local to GDB in some cases Andrew Burgess
2023-08-16 16:40   ` Eli Zaretskii
2023-08-16 15:55 ` [PATCH 09/10] gdb: use exec_file with remote targets when possible Andrew Burgess
2023-08-16 15:55 ` [PATCH 10/10] gdb: remote the get_remote_exec_file function Andrew Burgess
2023-08-23 13:42   ` Mark Wielaard
2023-08-22 10:41 ` [PATCH 00/10] Improve GDB/gdbserver experience when using a local gdbserver Alexandra Petlanova Hajkova
2023-08-23 14:32 ` Mark Wielaard
2023-08-23 15:26   ` Andrew Burgess
2023-08-25 15:34 ` [PATCHv2 " Andrew Burgess
2023-08-25 15:34   ` [PATCHv2 01/10] gdb: have remote_target::extended_remote_run take the exec filename Andrew Burgess
2023-08-25 15:34   ` [PATCHv2 02/10] gdb: improve how 'remote exec-file' is stored and accessed Andrew Burgess
2023-08-25 15:34   ` [PATCHv2 03/10] gdb: improve show text and help text for 'remote exec-file' Andrew Burgess
2023-08-25 15:34   ` [PATCHv2 04/10] gdb/gdbserver: add new qDefaultExecAndArgs packet Andrew Burgess
2023-08-26  6:46     ` Eli Zaretskii
2023-08-25 15:34   ` [PATCHv2 05/10] gdb: detect when gdbserver has no default executable set Andrew Burgess
2023-08-25 15:34   ` [PATCHv2 06/10] gdb: make use of is_target_filename Andrew Burgess
2023-08-25 15:34   ` Andrew Burgess [this message]
2023-08-26  6:54     ` [PATCHv2 07/10] gdb: add qMachineId packet Eli Zaretskii
2023-08-25 15:34   ` [PATCHv2 08/10] gdb: remote filesystem can be local to GDB in some cases Andrew Burgess
2023-08-26  6:49     ` Eli Zaretskii
2023-08-25 15:34   ` [PATCHv2 09/10] gdb: use exec_file with remote targets when possible Andrew Burgess
2023-08-25 15:34   ` [PATCHv2 10/10] gdb: remove the get_remote_exec_file function Andrew Burgess
