public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
* [PATCH] aarch64: Check for valid inferior thread/regcache before reading pauth registers
@ 2023-03-16 10:39 Luis Machado
  2023-03-16 15:49 ` Andrew Burgess
  2023-03-23 10:56 ` [PATCH, v2] " Luis Machado
  0 siblings, 2 replies; 17+ messages in thread
From: Luis Machado @ 2023-03-16 10:39 UTC (permalink / raw)
  To: gdb-patches, aburgess, pedro; +Cc: marcan

There were reports of gdb throwing internal errors when calling
inferior_thread ()/get_current_regcache () on a system with
Pointer Authentication enabled.

In such cases, gdb produces the following backtrace:

../../../repos/binutils-gdb/gdb/thread.c:86: internal-error: inferior_thread: Assertion `current_thread_ != nullptr' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
----- Backtrace -----
0xaaaae04a571f gdb_internal_backtrace_1
	../../../repos/binutils-gdb/gdb/bt-utils.c:122
0xaaaae04a57f3 _Z22gdb_internal_backtracev
	../../../repos/binutils-gdb/gdb/bt-utils.c:168
0xaaaae0b52ccf internal_vproblem
	../../../repos/binutils-gdb/gdb/utils.c:401
0xaaaae0b5310b _Z15internal_verrorPKciS0_St9__va_list
	../../../repos/binutils-gdb/gdb/utils.c:481
0xaaaae0e24b8f _Z18internal_error_locPKciS0_z
	../../../repos/binutils-gdb/gdbsupport/errors.cc:58
0xaaaae0a88983 _Z15inferior_threadv
	../../../repos/binutils-gdb/gdb/thread.c:86
0xaaaae0956c87 _Z20get_current_regcachev
	../../../repos/binutils-gdb/gdb/regcache.c:428
0xaaaae035223f aarch64_remove_non_address_bits
	../../../repos/binutils-gdb/gdb/aarch64-tdep.c:3572
0xaaaae03e8abb _Z31gdbarch_remove_non_address_bitsP7gdbarchm
	../../../repos/binutils-gdb/gdb/gdbarch.c:3109
0xaaaae0a692d7 memory_xfer_partial
	../../../repos/binutils-gdb/gdb/target.c:1620
0xaaaae0a695e3 _Z19target_xfer_partialP10target_ops13target_objectPKcPhPKhmmPm
	../../../repos/binutils-gdb/gdb/target.c:1684
0xaaaae0a69e9f target_read_partial
	../../../repos/binutils-gdb/gdb/target.c:1937
0xaaaae0a69fdf _Z11target_readP10target_ops13target_objectPKcPhml
	../../../repos/binutils-gdb/gdb/target.c:1977
0xaaaae0a69937 _Z18target_read_memorymPhl
	../../../repos/binutils-gdb/gdb/target.c:1773
0xaaaae08be523 ps_xfer_memory
	../../../repos/binutils-gdb/gdb/proc-service.c:90
0xaaaae08be6db ps_pdread
	../../../repos/binutils-gdb/gdb/proc-service.c:124
0x40001ed7c3b3 _td_fetch_value
	/build/glibc-RIFKjK/glibc-2.31/nptl_db/fetch-value.c:115
0x40001ed791ef td_ta_map_lwp2thr
	/build/glibc-RIFKjK/glibc-2.31/nptl_db/td_ta_map_lwp2thr.c:194
0xaaaae07f4473 thread_from_lwp
	../../../repos/binutils-gdb/gdb/linux-thread-db.c:413
0xaaaae07f6d6f _ZN16thread_db_target4waitE6ptid_tP17target_waitstatus10enum_flagsI16target_wait_flagE
	../../../repos/binutils-gdb/gdb/linux-thread-db.c:1420
0xaaaae0a6b33b _Z11target_wait6ptid_tP17target_waitstatus10enum_flagsI16target_wait_flagE
	../../../repos/binutils-gdb/gdb/target.c:2586
0xaaaae0789cf7 do_target_wait_1
	../../../repos/binutils-gdb/gdb/infrun.c:3825
0xaaaae0789e6f operator()
	../../../repos/binutils-gdb/gdb/infrun.c:3884
0xaaaae078a167 do_target_wait
	../../../repos/binutils-gdb/gdb/infrun.c:3903
0xaaaae078b0af _Z20fetch_inferior_eventv
	../../../repos/binutils-gdb/gdb/infrun.c:4314
0xaaaae076652f _Z22inferior_event_handler19inferior_event_type
	../../../repos/binutils-gdb/gdb/inf-loop.c:41
0xaaaae07dc68b handle_target_event
	../../../repos/binutils-gdb/gdb/linux-nat.c:4206
0xaaaae0e25fbb handle_file_event
	../../../repos/binutils-gdb/gdbsupport/event-loop.cc:573
0xaaaae0e264f3 gdb_wait_for_event
	../../../repos/binutils-gdb/gdbsupport/event-loop.cc:694
0xaaaae0e24f9b _Z16gdb_do_one_eventi
	../../../repos/binutils-gdb/gdbsupport/event-loop.cc:217
0xaaaae080f033 start_event_loop
	../../../repos/binutils-gdb/gdb/main.c:411
0xaaaae080f1b7 captured_command_loop
	../../../repos/binutils-gdb/gdb/main.c:475
0xaaaae0810b97 captured_main
	../../../repos/binutils-gdb/gdb/main.c:1318
0xaaaae0810c1b _Z8gdb_mainP18captured_main_args
	../../../repos/binutils-gdb/gdb/main.c:1337
0xaaaae0338453 main
	../../../repos/binutils-gdb/gdb/gdb.c:32
---------------------
../../../repos/binutils-gdb/gdb/thread.c:86: internal-error: inferior_thread: Assertion `current_thread_ != nullptr' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n)

This issue started after commit d88cb738e6a7a7179dfaff8af78d69250c852af1, which
enabled more broad use of pointer authentication masks to remove non-address
bits of pointers.

The above crash happens because gdb is in the middle of handling an event,
and do_target_wait_1 calls switch_to_inferior_no_thread, nullifying the
current thread.  This means a call to inferior_thread () will assert, and
attempting to call get_current_regcache () will also call inferior_thread (),
resulting in an assertion as well.

There is another crash scenario after we kill a previously active inferior, in
which case the gdbarch will still say we support pointer authentication but we
will also have no current thread (inferior_thread () will assert etc).

I thought about this for a bit, and I'm not sure what is the best way
to address this other than actively checking for a valid current thread
and valid current register cache.

If the target has support for Pointer Authentication, gdb needs to use
a couple (or 4, for bare-metal) mask registers to mask off some bits of
pointers, and for that it needs to access the registers.

At some points, like the one from the backtrace above, there is no active
thread/current regcache because gdb is in the middle of doing event handling
and switching between threads.

The following patch improves this situation by checking for an active thread
and register cache. If those don't exist, gdb falls back to using a default
mask which should be enough for what it is trying to do at the moment.

I discussed with Pedro the possibility of caching the mask register values
(which are per-process and can change mid-execution), but there isn't a good
spot to cache those values. Besides, the mask registers can change constantly
for bare-metal debugging when switching between exception levels.

In some cases, it is just not possible to get access to these mask registers,
like the case where threads are running. In those cases, using a default mask
to remove the non-address bits should be enough.

In addition to the check for a valid current register cache, I also introduced
a try/catch guard against errors reading registers, in case the current thread
is running, also falling back to using a default mask to remove non-address
bits.

This can happen when we let threads run in the background and then we attempt
to access a memory address (now that gdb is capable of reading memory even
with threads running).  Thus gdb will attempt to remove non-address bits
of that memory access, will attempt to access registers, running into errors.

Regression-tested on aarch64-linux Ubuntu 20.04.

Thoughts?
---
 gdb/aarch64-tdep.c | 38 +++++++++++++++++++++++++++++---------
 gdb/gdbthread.h    |  8 ++++++++
 gdb/regcache.c     | 10 ++++++++++
 gdb/regcache.h     |  8 ++++++++
 gdb/thread.c       |  8 ++++++++
 5 files changed, 63 insertions(+), 9 deletions(-)

diff --git a/gdb/aarch64-tdep.c b/gdb/aarch64-tdep.c
index 5b1b9921f87..8368592a7db 100644
--- a/gdb/aarch64-tdep.c
+++ b/gdb/aarch64-tdep.c
@@ -3562,13 +3562,17 @@ aarch64_remove_non_address_bits (struct gdbarch *gdbarch, CORE_ADDR pointer)
      select bit (55).  */
   CORE_ADDR mask = AARCH64_TOP_BITS_MASK;
 
-  if (tdep->has_pauth ())
+  /* Make sure we have a valid active current register cache to fetch pointer
+     authentication masks from.  In some situations, there may be no active
+     threads to fetch the data from (no inferiors, gdb in the middle of
+     handling events etc).  If we don't have a valid register cache, fallback
+     to using a default mask to remove non-address bits.  */
+  if (tdep->has_pauth () && current_regcache_exists ())
     {
       /* Fetch the PAC masks.  These masks are per-process, so we can just
-	 fetch data from whatever thread we have at the moment.
+	 fetch data from whatever thread we have at the moment, hence why we
+	 use the current register cache.  */
 
-	 Also, we have both a code mask and a data mask.  For now they are the
-	 same, but this may change in the future.  */
       struct regcache *regs = get_current_regcache ();
       CORE_ADDR cmask, dmask;
       int dmask_regnum = AARCH64_PAUTH_DMASK_REGNUM (tdep->pauth_reg_base);
@@ -3583,13 +3587,29 @@ aarch64_remove_non_address_bits (struct gdbarch *gdbarch, CORE_ADDR pointer)
 	  cmask_regnum = AARCH64_PAUTH_CMASK_HIGH_REGNUM (tdep->pauth_reg_base);
 	}
 
-      if (regs->cooked_read (dmask_regnum, &dmask) != REG_VALID)
-	dmask = mask;
+      /* GDB supports reading memory from threads that are running, but in
+	 those cases GDB cannot access registers.  If reading the pointer
+	 authentication registers fails, fallback to using a default mask.
+
+	 We could potentially get it wrong if the masks have been updated
+	 while the thread is running, but eventually we will see the thread
+	 stop and will be able to fetch the correct data.  */
+      try
+	{
+	  /* We have both a code mask and a data mask.  For now they are
+	     the same, but this may change in the future.  */
+	  if (regs->cooked_read (dmask_regnum, &dmask) != REG_VALID)
+	    dmask = mask;
 
-      if (regs->cooked_read (cmask_regnum, &cmask) != REG_VALID)
-	cmask = mask;
+	  if (regs->cooked_read (cmask_regnum, &cmask) != REG_VALID)
+	    cmask = mask;
 
-      mask |= aarch64_mask_from_pac_registers (cmask, dmask);
+	  mask |= aarch64_mask_from_pac_registers (cmask, dmask);
+	}
+      catch (const gdb_exception_error &e)
+	{
+	  /* We failed to fetch the contents of the masks from registers.  */
+	}
     }
 
   return aarch64_remove_top_bits (pointer, mask);
diff --git a/gdb/gdbthread.h b/gdb/gdbthread.h
index c0f27a8a66e..ebb8f140770 100644
--- a/gdb/gdbthread.h
+++ b/gdb/gdbthread.h
@@ -875,6 +875,14 @@ class scoped_restore_current_thread
   enum language m_lang;
 };
 
+/* Return TRUE if an inferior thread exists and is currently selected,
+   meaning it is safe to call inferior_thread () without risking an assertion.
+
+   Return FALSE otherwise.  If so, inferior_thread () will assert if
+   called.  */
+
+extern bool inferior_thread_exists (void);
+
 /* Returns a pointer into the thread_info corresponding to
    INFERIOR_PTID.  INFERIOR_PTID *must* be in the thread list.  */
 extern struct thread_info* inferior_thread (void);
diff --git a/gdb/regcache.c b/gdb/regcache.c
index af76fab1a34..4685b01e9ab 100644
--- a/gdb/regcache.c
+++ b/gdb/regcache.c
@@ -422,6 +422,16 @@ get_thread_regcache (thread_info *thread)
 			      thread->ptid);
 }
 
+/* See regcache.h.  */
+
+bool
+current_regcache_exists (void)
+{
+  /* If we have an inferior thread selected, we should have a register
+     cache as well.  */
+  return inferior_thread_exists ();
+}
+
 struct regcache *
 get_current_regcache (void)
 {
diff --git a/gdb/regcache.h b/gdb/regcache.h
index b9ffab9950d..adff7e44cb9 100644
--- a/gdb/regcache.h
+++ b/gdb/regcache.h
@@ -30,6 +30,14 @@ struct address_space;
 class thread_info;
 struct process_stratum_target;
 
+/* Return TRUE if we can access the register cache from the
+   currently-selected thread.  If so, we can call get_current_regcache ()
+   safely without risking an assertion due to the lack of an inferior thread.
+
+   Return FALSE otherwise.  In this case, attempting to call
+   get_current_regcache () will run into an assertion.  */
+
+extern bool current_regcache_exists (void);
 extern struct regcache *get_current_regcache (void);
 extern struct regcache *get_thread_regcache (process_stratum_target *target,
 					     ptid_t ptid);
diff --git a/gdb/thread.c b/gdb/thread.c
index e4d98ce966a..545bbf2a411 100644
--- a/gdb/thread.c
+++ b/gdb/thread.c
@@ -80,6 +80,14 @@ is_current_thread (const thread_info *thr)
   return thr == current_thread_;
 }
 
+/* See gdbthread.h.  */
+
+bool
+inferior_thread_exists ()
+{
+  return current_thread_ != nullptr;
+}
+
 struct thread_info*
 inferior_thread (void)
 {
-- 
2.25.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] aarch64: Check for valid inferior thread/regcache before reading pauth registers
  2023-03-16 10:39 [PATCH] aarch64: Check for valid inferior thread/regcache before reading pauth registers Luis Machado
@ 2023-03-16 15:49 ` Andrew Burgess
  2023-03-16 23:15   ` Luis Machado
  2023-03-23 10:56 ` [PATCH, v2] " Luis Machado
  1 sibling, 1 reply; 17+ messages in thread
From: Andrew Burgess @ 2023-03-16 15:49 UTC (permalink / raw)
  To: Luis Machado, gdb-patches, pedro; +Cc: marcan

Luis Machado <luis.machado@arm.com> writes:

> There were reports of gdb throwing internal errors when calling
> inferior_thread ()/get_current_regcache () on a system with
> Pointer Authentication enabled.
>
> In such cases, gdb produces the following backtrace:
>
> ../../../repos/binutils-gdb/gdb/thread.c:86: internal-error: inferior_thread: Assertion `current_thread_ != nullptr' failed.
> A problem internal to GDB has been detected,
> further debugging may prove unreliable.
> ----- Backtrace -----
> 0xaaaae04a571f gdb_internal_backtrace_1
> 	../../../repos/binutils-gdb/gdb/bt-utils.c:122
> 0xaaaae04a57f3 _Z22gdb_internal_backtracev
> 	../../../repos/binutils-gdb/gdb/bt-utils.c:168
> 0xaaaae0b52ccf internal_vproblem
> 	../../../repos/binutils-gdb/gdb/utils.c:401
> 0xaaaae0b5310b _Z15internal_verrorPKciS0_St9__va_list
> 	../../../repos/binutils-gdb/gdb/utils.c:481
> 0xaaaae0e24b8f _Z18internal_error_locPKciS0_z
> 	../../../repos/binutils-gdb/gdbsupport/errors.cc:58
> 0xaaaae0a88983 _Z15inferior_threadv
> 	../../../repos/binutils-gdb/gdb/thread.c:86
> 0xaaaae0956c87 _Z20get_current_regcachev
> 	../../../repos/binutils-gdb/gdb/regcache.c:428
> 0xaaaae035223f aarch64_remove_non_address_bits
> 	../../../repos/binutils-gdb/gdb/aarch64-tdep.c:3572
> 0xaaaae03e8abb _Z31gdbarch_remove_non_address_bitsP7gdbarchm
> 	../../../repos/binutils-gdb/gdb/gdbarch.c:3109
> 0xaaaae0a692d7 memory_xfer_partial
> 	../../../repos/binutils-gdb/gdb/target.c:1620
> 0xaaaae0a695e3 _Z19target_xfer_partialP10target_ops13target_objectPKcPhPKhmmPm
> 	../../../repos/binutils-gdb/gdb/target.c:1684
> 0xaaaae0a69e9f target_read_partial
> 	../../../repos/binutils-gdb/gdb/target.c:1937
> 0xaaaae0a69fdf _Z11target_readP10target_ops13target_objectPKcPhml
> 	../../../repos/binutils-gdb/gdb/target.c:1977
> 0xaaaae0a69937 _Z18target_read_memorymPhl
> 	../../../repos/binutils-gdb/gdb/target.c:1773
> 0xaaaae08be523 ps_xfer_memory
> 	../../../repos/binutils-gdb/gdb/proc-service.c:90
> 0xaaaae08be6db ps_pdread
> 	../../../repos/binutils-gdb/gdb/proc-service.c:124
> 0x40001ed7c3b3 _td_fetch_value
> 	/build/glibc-RIFKjK/glibc-2.31/nptl_db/fetch-value.c:115
> 0x40001ed791ef td_ta_map_lwp2thr
> 	/build/glibc-RIFKjK/glibc-2.31/nptl_db/td_ta_map_lwp2thr.c:194
> 0xaaaae07f4473 thread_from_lwp
> 	../../../repos/binutils-gdb/gdb/linux-thread-db.c:413
> 0xaaaae07f6d6f _ZN16thread_db_target4waitE6ptid_tP17target_waitstatus10enum_flagsI16target_wait_flagE
> 	../../../repos/binutils-gdb/gdb/linux-thread-db.c:1420
> 0xaaaae0a6b33b _Z11target_wait6ptid_tP17target_waitstatus10enum_flagsI16target_wait_flagE
> 	../../../repos/binutils-gdb/gdb/target.c:2586
> 0xaaaae0789cf7 do_target_wait_1
> 	../../../repos/binutils-gdb/gdb/infrun.c:3825
> 0xaaaae0789e6f operator()
> 	../../../repos/binutils-gdb/gdb/infrun.c:3884
> 0xaaaae078a167 do_target_wait
> 	../../../repos/binutils-gdb/gdb/infrun.c:3903
> 0xaaaae078b0af _Z20fetch_inferior_eventv
> 	../../../repos/binutils-gdb/gdb/infrun.c:4314
> 0xaaaae076652f _Z22inferior_event_handler19inferior_event_type
> 	../../../repos/binutils-gdb/gdb/inf-loop.c:41
> 0xaaaae07dc68b handle_target_event
> 	../../../repos/binutils-gdb/gdb/linux-nat.c:4206
> 0xaaaae0e25fbb handle_file_event
> 	../../../repos/binutils-gdb/gdbsupport/event-loop.cc:573
> 0xaaaae0e264f3 gdb_wait_for_event
> 	../../../repos/binutils-gdb/gdbsupport/event-loop.cc:694
> 0xaaaae0e24f9b _Z16gdb_do_one_eventi
> 	../../../repos/binutils-gdb/gdbsupport/event-loop.cc:217
> 0xaaaae080f033 start_event_loop
> 	../../../repos/binutils-gdb/gdb/main.c:411
> 0xaaaae080f1b7 captured_command_loop
> 	../../../repos/binutils-gdb/gdb/main.c:475
> 0xaaaae0810b97 captured_main
> 	../../../repos/binutils-gdb/gdb/main.c:1318
> 0xaaaae0810c1b _Z8gdb_mainP18captured_main_args
> 	../../../repos/binutils-gdb/gdb/main.c:1337
> 0xaaaae0338453 main
> 	../../../repos/binutils-gdb/gdb/gdb.c:32
> ---------------------
> ../../../repos/binutils-gdb/gdb/thread.c:86: internal-error: inferior_thread: Assertion `current_thread_ != nullptr' failed.
> A problem internal to GDB has been detected,
> further debugging may prove unreliable.
> Quit this debugging session? (y or n)
>
> This issue started after commit d88cb738e6a7a7179dfaff8af78d69250c852af1, which
> enabled more broad use of pointer authentication masks to remove non-address
> bits of pointers.
>
> The above crash happens because gdb is in the middle of handling an event,
> and do_target_wait_1 calls switch_to_inferior_no_thread, nullifying the
> current thread.  This means a call to inferior_thread () will assert, and
> attempting to call get_current_regcache () will also call inferior_thread (),
> resulting in an assertion as well.
>
> There is another crash scenario after we kill a previously active inferior, in
> which case the gdbarch will still say we support pointer authentication but we
> will also have no current thread (inferior_thread () will assert etc).
>
> I thought about this for a bit, and I'm not sure what is the best way
> to address this other than actively checking for a valid current thread
> and valid current register cache.
>
> If the target has support for Pointer Authentication, gdb needs to use
> a couple (or 4, for bare-metal) mask registers to mask off some bits of
> pointers, and for that it needs to access the registers.
>
> At some points, like the one from the backtrace above, there is no active
> thread/current regcache because gdb is in the middle of doing event handling
> and switching between threads.
>
> The following patch improves this situation by checking for an active thread
> and register cache. If those don't exist, gdb falls back to using a default
> mask which should be enough for what it is trying to do at the moment.
>
> I discussed with Pedro the possibility of caching the mask register values
> (which are per-process and can change mid-execution), but there isn't a good
> spot to cache those values. Besides, the mask registers can change constantly
> for bare-metal debugging when switching between exception levels.
>
> In some cases, it is just not possible to get access to these mask registers,
> like the case where threads are running. In those cases, using a default mask
> to remove the non-address bits should be enough.
>
> In addition to the check for a valid current register cache, I also introduced
> a try/catch guard against errors reading registers, in case the current thread
> is running, also falling back to using a default mask to remove non-address
> bits.
>
> This can happen when we let threads run in the background and then we attempt
> to access a memory address (now that gdb is capable of reading memory even
> with threads running).  Thus gdb will attempt to remove non-address bits
> of that memory access, will attempt to access registers, running into errors.
>
> Regression-tested on aarch64-linux Ubuntu 20.04.
>
> Thoughts?
> ---
>  gdb/aarch64-tdep.c | 38 +++++++++++++++++++++++++++++---------
>  gdb/gdbthread.h    |  8 ++++++++
>  gdb/regcache.c     | 10 ++++++++++
>  gdb/regcache.h     |  8 ++++++++
>  gdb/thread.c       |  8 ++++++++

Would it be possible to write a test for this issue?  Or maybe some of
the existing tests fail due to this issue?  If there are some tests that
already expose this issue then it would be great to mention them in the
commit message.

>  5 files changed, 63 insertions(+), 9 deletions(-)
>
> diff --git a/gdb/aarch64-tdep.c b/gdb/aarch64-tdep.c
> index 5b1b9921f87..8368592a7db 100644
> --- a/gdb/aarch64-tdep.c
> +++ b/gdb/aarch64-tdep.c
> @@ -3562,13 +3562,17 @@ aarch64_remove_non_address_bits (struct gdbarch *gdbarch, CORE_ADDR pointer)
>       select bit (55).  */
>    CORE_ADDR mask = AARCH64_TOP_BITS_MASK;
>  
> -  if (tdep->has_pauth ())
> +  /* Make sure we have a valid active current register cache to fetch pointer
> +     authentication masks from.  In some situations, there may be no active
> +     threads to fetch the data from (no inferiors, gdb in the middle of
> +     handling events etc).  If we don't have a valid register cache, fallback
> +     to using a default mask to remove non-address bits.  */
> +  if (tdep->has_pauth () && current_regcache_exists ())

Maybe I'm completely off the mark here, but does target_has_registers
not help at all here?

For a process_stratum_target, this should end up in this function:

  bool
  process_stratum_target::has_registers ()
  {
    /* Can't read registers from no inferior.  */
    return inferior_ptid != null_ptid;
  }

And given that switch_to_no_thread () includes the line:

  inferior_ptid = null_ptid;

My hope would be that when we have no thread selected,
target_has_registers will return false.

The other thing I worried about before suggesting the above is this
depends on the current_inferior in order to identify a target stack.
But reading registers already makes that assumption, right.. reading a
register might require us to actually go get the registers, so we are
already assuming that the current_inferior() is sane at this point.

Thanks,
Andrew




>      {
>        /* Fetch the PAC masks.  These masks are per-process, so we can just
> -	 fetch data from whatever thread we have at the moment.
> +	 fetch data from whatever thread we have at the moment, hence why we
> +	 use the current register cache.  */
>  
> -	 Also, we have both a code mask and a data mask.  For now they are the
> -	 same, but this may change in the future.  */
>        struct regcache *regs = get_current_regcache ();
>        CORE_ADDR cmask, dmask;
>        int dmask_regnum = AARCH64_PAUTH_DMASK_REGNUM (tdep->pauth_reg_base);
> @@ -3583,13 +3587,29 @@ aarch64_remove_non_address_bits (struct gdbarch *gdbarch, CORE_ADDR pointer)
>  	  cmask_regnum = AARCH64_PAUTH_CMASK_HIGH_REGNUM (tdep->pauth_reg_base);
>  	}
>  
> -      if (regs->cooked_read (dmask_regnum, &dmask) != REG_VALID)
> -	dmask = mask;
> +      /* GDB supports reading memory from threads that are running, but in
> +	 those cases GDB cannot access registers.  If reading the pointer
> +	 authentication registers fails, fallback to using a default mask.
> +
> +	 We could potentially get it wrong if the masks have been updated
> +	 while the thread is running, but eventually we will see the thread
> +	 stop and will be able to fetch the correct data.  */
> +      try
> +	{
> +	  /* We have both a code mask and a data mask.  For now they are
> +	     the same, but this may change in the future.  */
> +	  if (regs->cooked_read (dmask_regnum, &dmask) != REG_VALID)
> +	    dmask = mask;
>  
> -      if (regs->cooked_read (cmask_regnum, &cmask) != REG_VALID)
> -	cmask = mask;
> +	  if (regs->cooked_read (cmask_regnum, &cmask) != REG_VALID)
> +	    cmask = mask;
>  
> -      mask |= aarch64_mask_from_pac_registers (cmask, dmask);
> +	  mask |= aarch64_mask_from_pac_registers (cmask, dmask);
> +	}
> +      catch (const gdb_exception_error &e)
> +	{
> +	  /* We failed to fetch the contents of the masks from registers.  */
> +	}
>      }
>  
>    return aarch64_remove_top_bits (pointer, mask);
> diff --git a/gdb/gdbthread.h b/gdb/gdbthread.h
> index c0f27a8a66e..ebb8f140770 100644
> --- a/gdb/gdbthread.h
> +++ b/gdb/gdbthread.h
> @@ -875,6 +875,14 @@ class scoped_restore_current_thread
>    enum language m_lang;
>  };
>  
> +/* Return TRUE if an inferior thread exists and is currently selected,
> +   meaning it is safe to call inferior_thread () without risking an assertion.
> +
> +   Return FALSE otherwise.  If so, inferior_thread () will assert if
> +   called.  */
> +
> +extern bool inferior_thread_exists (void);
> +
>  /* Returns a pointer into the thread_info corresponding to
>     INFERIOR_PTID.  INFERIOR_PTID *must* be in the thread list.  */
>  extern struct thread_info* inferior_thread (void);
> diff --git a/gdb/regcache.c b/gdb/regcache.c
> index af76fab1a34..4685b01e9ab 100644
> --- a/gdb/regcache.c
> +++ b/gdb/regcache.c
> @@ -422,6 +422,16 @@ get_thread_regcache (thread_info *thread)
>  			      thread->ptid);
>  }
>  
> +/* See regcache.h.  */
> +
> +bool
> +current_regcache_exists (void)
> +{
> +  /* If we have an inferior thread selected, we should have a register
> +     cache as well.  */
> +  return inferior_thread_exists ();
> +}
> +
>  struct regcache *
>  get_current_regcache (void)
>  {
> diff --git a/gdb/regcache.h b/gdb/regcache.h
> index b9ffab9950d..adff7e44cb9 100644
> --- a/gdb/regcache.h
> +++ b/gdb/regcache.h
> @@ -30,6 +30,14 @@ struct address_space;
>  class thread_info;
>  struct process_stratum_target;
>  
> +/* Return TRUE if we can access the register cache from the
> +   currently-selected thread.  If so, we can call get_current_regcache ()
> +   safely without risking an assertion due to the lack of an inferior thread.
> +
> +   Return FALSE otherwise.  In this case, attempting to call
> +   get_current_regcache () will run into an assertion.  */
> +
> +extern bool current_regcache_exists (void);
>  extern struct regcache *get_current_regcache (void);
>  extern struct regcache *get_thread_regcache (process_stratum_target *target,
>  					     ptid_t ptid);
> diff --git a/gdb/thread.c b/gdb/thread.c
> index e4d98ce966a..545bbf2a411 100644
> --- a/gdb/thread.c
> +++ b/gdb/thread.c
> @@ -80,6 +80,14 @@ is_current_thread (const thread_info *thr)
>    return thr == current_thread_;
>  }
>  
> +/* See gdbthread.h.  */
> +
> +bool
> +inferior_thread_exists ()
> +{
> +  return current_thread_ != nullptr;
> +}
> +
>  struct thread_info*
>  inferior_thread (void)
>  {
> -- 
> 2.25.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] aarch64: Check for valid inferior thread/regcache before reading pauth registers
  2023-03-16 15:49 ` Andrew Burgess
@ 2023-03-16 23:15   ` Luis Machado
  2023-03-17 17:15     ` Simon Marchi
  2023-03-20 20:06     ` Andrew Burgess
  0 siblings, 2 replies; 17+ messages in thread
From: Luis Machado @ 2023-03-16 23:15 UTC (permalink / raw)
  To: Andrew Burgess, gdb-patches, pedro; +Cc: marcan

On 3/16/23 15:49, Andrew Burgess wrote:
> Luis Machado <luis.machado@arm.com> writes:
> 
>> There were reports of gdb throwing internal errors when calling
>> inferior_thread ()/get_current_regcache () on a system with
>> Pointer Authentication enabled.
>>
>> In such cases, gdb produces the following backtrace:
>>
>> ../../../repos/binutils-gdb/gdb/thread.c:86: internal-error: inferior_thread: Assertion `current_thread_ != nullptr' failed.
>> A problem internal to GDB has been detected,
>> further debugging may prove unreliable.
>> ----- Backtrace -----
>> 0xaaaae04a571f gdb_internal_backtrace_1
>> 	../../../repos/binutils-gdb/gdb/bt-utils.c:122
>> 0xaaaae04a57f3 _Z22gdb_internal_backtracev
>> 	../../../repos/binutils-gdb/gdb/bt-utils.c:168
>> 0xaaaae0b52ccf internal_vproblem
>> 	../../../repos/binutils-gdb/gdb/utils.c:401
>> 0xaaaae0b5310b _Z15internal_verrorPKciS0_St9__va_list
>> 	../../../repos/binutils-gdb/gdb/utils.c:481
>> 0xaaaae0e24b8f _Z18internal_error_locPKciS0_z
>> 	../../../repos/binutils-gdb/gdbsupport/errors.cc:58
>> 0xaaaae0a88983 _Z15inferior_threadv
>> 	../../../repos/binutils-gdb/gdb/thread.c:86
>> 0xaaaae0956c87 _Z20get_current_regcachev
>> 	../../../repos/binutils-gdb/gdb/regcache.c:428
>> 0xaaaae035223f aarch64_remove_non_address_bits
>> 	../../../repos/binutils-gdb/gdb/aarch64-tdep.c:3572
>> 0xaaaae03e8abb _Z31gdbarch_remove_non_address_bitsP7gdbarchm
>> 	../../../repos/binutils-gdb/gdb/gdbarch.c:3109
>> 0xaaaae0a692d7 memory_xfer_partial
>> 	../../../repos/binutils-gdb/gdb/target.c:1620
>> 0xaaaae0a695e3 _Z19target_xfer_partialP10target_ops13target_objectPKcPhPKhmmPm
>> 	../../../repos/binutils-gdb/gdb/target.c:1684
>> 0xaaaae0a69e9f target_read_partial
>> 	../../../repos/binutils-gdb/gdb/target.c:1937
>> 0xaaaae0a69fdf _Z11target_readP10target_ops13target_objectPKcPhml
>> 	../../../repos/binutils-gdb/gdb/target.c:1977
>> 0xaaaae0a69937 _Z18target_read_memorymPhl
>> 	../../../repos/binutils-gdb/gdb/target.c:1773
>> 0xaaaae08be523 ps_xfer_memory
>> 	../../../repos/binutils-gdb/gdb/proc-service.c:90
>> 0xaaaae08be6db ps_pdread
>> 	../../../repos/binutils-gdb/gdb/proc-service.c:124
>> 0x40001ed7c3b3 _td_fetch_value
>> 	/build/glibc-RIFKjK/glibc-2.31/nptl_db/fetch-value.c:115
>> 0x40001ed791ef td_ta_map_lwp2thr
>> 	/build/glibc-RIFKjK/glibc-2.31/nptl_db/td_ta_map_lwp2thr.c:194
>> 0xaaaae07f4473 thread_from_lwp
>> 	../../../repos/binutils-gdb/gdb/linux-thread-db.c:413
>> 0xaaaae07f6d6f _ZN16thread_db_target4waitE6ptid_tP17target_waitstatus10enum_flagsI16target_wait_flagE
>> 	../../../repos/binutils-gdb/gdb/linux-thread-db.c:1420
>> 0xaaaae0a6b33b _Z11target_wait6ptid_tP17target_waitstatus10enum_flagsI16target_wait_flagE
>> 	../../../repos/binutils-gdb/gdb/target.c:2586
>> 0xaaaae0789cf7 do_target_wait_1
>> 	../../../repos/binutils-gdb/gdb/infrun.c:3825
>> 0xaaaae0789e6f operator()
>> 	../../../repos/binutils-gdb/gdb/infrun.c:3884
>> 0xaaaae078a167 do_target_wait
>> 	../../../repos/binutils-gdb/gdb/infrun.c:3903
>> 0xaaaae078b0af _Z20fetch_inferior_eventv
>> 	../../../repos/binutils-gdb/gdb/infrun.c:4314
>> 0xaaaae076652f _Z22inferior_event_handler19inferior_event_type
>> 	../../../repos/binutils-gdb/gdb/inf-loop.c:41
>> 0xaaaae07dc68b handle_target_event
>> 	../../../repos/binutils-gdb/gdb/linux-nat.c:4206
>> 0xaaaae0e25fbb handle_file_event
>> 	../../../repos/binutils-gdb/gdbsupport/event-loop.cc:573
>> 0xaaaae0e264f3 gdb_wait_for_event
>> 	../../../repos/binutils-gdb/gdbsupport/event-loop.cc:694
>> 0xaaaae0e24f9b _Z16gdb_do_one_eventi
>> 	../../../repos/binutils-gdb/gdbsupport/event-loop.cc:217
>> 0xaaaae080f033 start_event_loop
>> 	../../../repos/binutils-gdb/gdb/main.c:411
>> 0xaaaae080f1b7 captured_command_loop
>> 	../../../repos/binutils-gdb/gdb/main.c:475
>> 0xaaaae0810b97 captured_main
>> 	../../../repos/binutils-gdb/gdb/main.c:1318
>> 0xaaaae0810c1b _Z8gdb_mainP18captured_main_args
>> 	../../../repos/binutils-gdb/gdb/main.c:1337
>> 0xaaaae0338453 main
>> 	../../../repos/binutils-gdb/gdb/gdb.c:32
>> ---------------------
>> ../../../repos/binutils-gdb/gdb/thread.c:86: internal-error: inferior_thread: Assertion `current_thread_ != nullptr' failed.
>> A problem internal to GDB has been detected,
>> further debugging may prove unreliable.
>> Quit this debugging session? (y or n)
>>
>> This issue started after commit d88cb738e6a7a7179dfaff8af78d69250c852af1, which
>> enabled more broad use of pointer authentication masks to remove non-address
>> bits of pointers.
>>
>> The above crash happens because gdb is in the middle of handling an event,
>> and do_target_wait_1 calls switch_to_inferior_no_thread, nullifying the
>> current thread.  This means a call to inferior_thread () will assert, and
>> attempting to call get_current_regcache () will also call inferior_thread (),
>> resulting in an assertion as well.
>>
>> There is another crash scenario after we kill a previously active inferior, in
>> which case the gdbarch will still say we support pointer authentication but we
>> will also have no current thread (inferior_thread () will assert etc).
>>
>> I thought about this for a bit, and I'm not sure what is the best way
>> to address this other than actively checking for a valid current thread
>> and valid current register cache.
>>
>> If the target has support for Pointer Authentication, gdb needs to use
>> a couple (or 4, for bare-metal) mask registers to mask off some bits of
>> pointers, and for that it needs to access the registers.
>>
>> At some points, like the one from the backtrace above, there is no active
>> thread/current regcache because gdb is in the middle of doing event handling
>> and switching between threads.
>>
>> The following patch improves this situation by checking for an active thread
>> and register cache. If those don't exist, gdb falls back to using a default
>> mask which should be enough for what it is trying to do at the moment.
>>
>> I discussed with Pedro the possibility of caching the mask register values
>> (which are per-process and can change mid-execution), but there isn't a good
>> spot to cache those values. Besides, the mask registers can change constantly
>> for bare-metal debugging when switching between exception levels.
>>
>> In some cases, it is just not possible to get access to these mask registers,
>> like the case where threads are running. In those cases, using a default mask
>> to remove the non-address bits should be enough.
>>
>> In addition to the check for a valid current register cache, I also introduced
>> a try/catch guard against errors reading registers, in case the current thread
>> is running, also falling back to using a default mask to remove non-address
>> bits.
>>
>> This can happen when we let threads run in the background and then we attempt
>> to access a memory address (now that gdb is capable of reading memory even
>> with threads running).  Thus gdb will attempt to remove non-address bits
>> of that memory access, will attempt to access registers, running into errors.
>>
>> Regression-tested on aarch64-linux Ubuntu 20.04.
>>
>> Thoughts?
>> ---
>>   gdb/aarch64-tdep.c | 38 +++++++++++++++++++++++++++++---------
>>   gdb/gdbthread.h    |  8 ++++++++
>>   gdb/regcache.c     | 10 ++++++++++
>>   gdb/regcache.h     |  8 ++++++++
>>   gdb/thread.c       |  8 ++++++++
> 
> Would it be possible to write a test for this issue?  Or maybe some of
> the existing tests fail due to this issue?  If there are some tests that
> already expose this issue then it would be great to mention them in the
> commit message.
> 

Existing tests fail due to this. Two example are gdb.base/break.exp and gdb.base/access-mem-running.exp are
two examples. I'll add those to the commit message for context.

The catch here is that they will only fail on a system that supports pointer authentication natively, and
there are only a few of those at the moment (Graviton 3, M2 and system QEMU are 3 examples I know of).

>>   5 files changed, 63 insertions(+), 9 deletions(-)
>>
>> diff --git a/gdb/aarch64-tdep.c b/gdb/aarch64-tdep.c
>> index 5b1b9921f87..8368592a7db 100644
>> --- a/gdb/aarch64-tdep.c
>> +++ b/gdb/aarch64-tdep.c
>> @@ -3562,13 +3562,17 @@ aarch64_remove_non_address_bits (struct gdbarch *gdbarch, CORE_ADDR pointer)
>>        select bit (55).  */
>>     CORE_ADDR mask = AARCH64_TOP_BITS_MASK;
>>   
>> -  if (tdep->has_pauth ())
>> +  /* Make sure we have a valid active current register cache to fetch pointer
>> +     authentication masks from.  In some situations, there may be no active
>> +     threads to fetch the data from (no inferiors, gdb in the middle of
>> +     handling events etc).  If we don't have a valid register cache, fallback
>> +     to using a default mask to remove non-address bits.  */
>> +  if (tdep->has_pauth () && current_regcache_exists ())
> 
> Maybe I'm completely off the mark here, but does target_has_registers
> not help at all here?

It does, and that was the first thing I considered for guarding this block against a situation where we
have no current thread.

But...

> 
> For a process_stratum_target, this should end up in this function:
> 
>    bool
>    process_stratum_target::has_registers ()
>    {
>      /* Can't read registers from no inferior.  */
>      return inferior_ptid != null_ptid;
>    }
> 
> And given that switch_to_no_thread () includes the line:
> 
>    inferior_ptid = null_ptid;
> 

... despite the call to switch_to_no_thread in switch_to_inferior_no_thread from do_target_wait_1
in the backtrace above, the call to ps_xfer_memory sets inferior_ptid momentarily before reading
memory:


static ps_err_e
ps_xfer_memory (const struct ps_prochandle *ph, psaddr_t addr,
                 gdb_byte *buf, size_t len, int write)
{
   scoped_restore_current_inferior restore_inferior;
   set_current_inferior (ph->thread->inf);

   scoped_restore_current_program_space restore_current_progspace;
   set_current_program_space (ph->thread->inf->pspace);

   scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
   inferior_ptid = ph->thread->ptid;

   CORE_ADDR core_addr = ps_addr_to_core_addr (addr);

   int ret;
   if (write)
     ret = target_write_memory (core_addr, buf, len);
   else
     ret = target_read_memory (core_addr, buf, len);
   return (ret == 0 ? PS_OK : PS_ERR);
}


Maybe this shouldn't happen, or maybe it is just an unfortunate state to be in. But this
prevents the use of target_has_registers to guard against the lack of registers, since,
although current_thread_ is still nullptr, inferior_ptid is valid and is not null_ptid.

So doing a direct check of current_thread_ through a helper function seemed safer.

And just to clarify, this is just one of the issues where we attempt to fetch a current
register cache with a null current_thread_, leading to an assertion.

The other issue, which is solved with a try/catch block, yields an error message about not
being able to access registers (because the thread we want to fetch registers from is still
running).

Maybe in the future we should prevent calls to register fetch/store functions if the thread we
want to fetch data from is still running. Right now this isn't a problem besides the AArch64
target though.

> My hope would be that when we have no thread selected,
> target_has_registers will return false.
> 
> The other thing I worried about before suggesting the above is this
> depends on the current_inferior in order to identify a target stack.
> But reading registers already makes that assumption, right.. reading a
> register might require us to actually go get the registers, so we are
> already assuming that the current_inferior() is sane at this point.
> 
> Thanks,
> Andrew
> 
> 
> 
> 
>>       {
>>         /* Fetch the PAC masks.  These masks are per-process, so we can just
>> -	 fetch data from whatever thread we have at the moment.
>> +	 fetch data from whatever thread we have at the moment, hence why we
>> +	 use the current register cache.  */
>>   
>> -	 Also, we have both a code mask and a data mask.  For now they are the
>> -	 same, but this may change in the future.  */
>>         struct regcache *regs = get_current_regcache ();
>>         CORE_ADDR cmask, dmask;
>>         int dmask_regnum = AARCH64_PAUTH_DMASK_REGNUM (tdep->pauth_reg_base);
>> @@ -3583,13 +3587,29 @@ aarch64_remove_non_address_bits (struct gdbarch *gdbarch, CORE_ADDR pointer)
>>   	  cmask_regnum = AARCH64_PAUTH_CMASK_HIGH_REGNUM (tdep->pauth_reg_base);
>>   	}
>>   
>> -      if (regs->cooked_read (dmask_regnum, &dmask) != REG_VALID)
>> -	dmask = mask;
>> +      /* GDB supports reading memory from threads that are running, but in
>> +	 those cases GDB cannot access registers.  If reading the pointer
>> +	 authentication registers fails, fallback to using a default mask.
>> +
>> +	 We could potentially get it wrong if the masks have been updated
>> +	 while the thread is running, but eventually we will see the thread
>> +	 stop and will be able to fetch the correct data.  */
>> +      try
>> +	{
>> +	  /* We have both a code mask and a data mask.  For now they are
>> +	     the same, but this may change in the future.  */
>> +	  if (regs->cooked_read (dmask_regnum, &dmask) != REG_VALID)
>> +	    dmask = mask;
>>   
>> -      if (regs->cooked_read (cmask_regnum, &cmask) != REG_VALID)
>> -	cmask = mask;
>> +	  if (regs->cooked_read (cmask_regnum, &cmask) != REG_VALID)
>> +	    cmask = mask;
>>   
>> -      mask |= aarch64_mask_from_pac_registers (cmask, dmask);
>> +	  mask |= aarch64_mask_from_pac_registers (cmask, dmask);
>> +	}
>> +      catch (const gdb_exception_error &e)
>> +	{
>> +	  /* We failed to fetch the contents of the masks from registers.  */
>> +	}
>>       }
>>   
>>     return aarch64_remove_top_bits (pointer, mask);
>> diff --git a/gdb/gdbthread.h b/gdb/gdbthread.h
>> index c0f27a8a66e..ebb8f140770 100644
>> --- a/gdb/gdbthread.h
>> +++ b/gdb/gdbthread.h
>> @@ -875,6 +875,14 @@ class scoped_restore_current_thread
>>     enum language m_lang;
>>   };
>>   
>> +/* Return TRUE if an inferior thread exists and is currently selected,
>> +   meaning it is safe to call inferior_thread () without risking an assertion.
>> +
>> +   Return FALSE otherwise.  If so, inferior_thread () will assert if
>> +   called.  */
>> +
>> +extern bool inferior_thread_exists (void);
>> +
>>   /* Returns a pointer into the thread_info corresponding to
>>      INFERIOR_PTID.  INFERIOR_PTID *must* be in the thread list.  */
>>   extern struct thread_info* inferior_thread (void);
>> diff --git a/gdb/regcache.c b/gdb/regcache.c
>> index af76fab1a34..4685b01e9ab 100644
>> --- a/gdb/regcache.c
>> +++ b/gdb/regcache.c
>> @@ -422,6 +422,16 @@ get_thread_regcache (thread_info *thread)
>>   			      thread->ptid);
>>   }
>>   
>> +/* See regcache.h.  */
>> +
>> +bool
>> +current_regcache_exists (void)
>> +{
>> +  /* If we have an inferior thread selected, we should have a register
>> +     cache as well.  */
>> +  return inferior_thread_exists ();
>> +}
>> +
>>   struct regcache *
>>   get_current_regcache (void)
>>   {
>> diff --git a/gdb/regcache.h b/gdb/regcache.h
>> index b9ffab9950d..adff7e44cb9 100644
>> --- a/gdb/regcache.h
>> +++ b/gdb/regcache.h
>> @@ -30,6 +30,14 @@ struct address_space;
>>   class thread_info;
>>   struct process_stratum_target;
>>   
>> +/* Return TRUE if we can access the register cache from the
>> +   currently-selected thread.  If so, we can call get_current_regcache ()
>> +   safely without risking an assertion due to the lack of an inferior thread.
>> +
>> +   Return FALSE otherwise.  In this case, attempting to call
>> +   get_current_regcache () will run into an assertion.  */
>> +
>> +extern bool current_regcache_exists (void);
>>   extern struct regcache *get_current_regcache (void);
>>   extern struct regcache *get_thread_regcache (process_stratum_target *target,
>>   					     ptid_t ptid);
>> diff --git a/gdb/thread.c b/gdb/thread.c
>> index e4d98ce966a..545bbf2a411 100644
>> --- a/gdb/thread.c
>> +++ b/gdb/thread.c
>> @@ -80,6 +80,14 @@ is_current_thread (const thread_info *thr)
>>     return thr == current_thread_;
>>   }
>>   
>> +/* See gdbthread.h.  */
>> +
>> +bool
>> +inferior_thread_exists ()
>> +{
>> +  return current_thread_ != nullptr;
>> +}
>> +
>>   struct thread_info*
>>   inferior_thread (void)
>>   {
>> -- 
>> 2.25.1
> 


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] aarch64: Check for valid inferior thread/regcache before reading pauth registers
  2023-03-16 23:15   ` Luis Machado
@ 2023-03-17 17:15     ` Simon Marchi
  2023-03-17 17:25       ` Luis Machado
  2023-03-20 20:06     ` Andrew Burgess
  1 sibling, 1 reply; 17+ messages in thread
From: Simon Marchi @ 2023-03-17 17:15 UTC (permalink / raw)
  To: Luis Machado, Andrew Burgess, gdb-patches, pedro; +Cc: marcan

> ... despite the call to switch_to_no_thread in switch_to_inferior_no_thread from do_target_wait_1
> in the backtrace above, the call to ps_xfer_memory sets inferior_ptid momentarily before reading
> memory:
> 
> 
> static ps_err_e
> ps_xfer_memory (const struct ps_prochandle *ph, psaddr_t addr,
>                 gdb_byte *buf, size_t len, int write)
> {
>   scoped_restore_current_inferior restore_inferior;
>   set_current_inferior (ph->thread->inf);
> 
>   scoped_restore_current_program_space restore_current_progspace;
>   set_current_program_space (ph->thread->inf->pspace);
> 
>   scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
>   inferior_ptid = ph->thread->ptid;
> 
>   CORE_ADDR core_addr = ps_addr_to_core_addr (addr);
> 
>   int ret;
>   if (write)
>     ret = target_write_memory (core_addr, buf, len);
>   else
>     ret = target_read_memory (core_addr, buf, len);
>   return (ret == 0 ? PS_OK : PS_ERR);
> }
> 
> 
> Maybe this shouldn't happen, or maybe it is just an unfortunate state to be in. But this
> prevents the use of target_has_registers to guard against the lack of registers, since,
> although current_thread_ is still nullptr, inferior_ptid is valid and is not null_ptid.
It's needed to set inferior_ptid here because it's how we communicate to
the targets which inferior / thread we want to read or write memory
from.  And we can't set current_thread_ to the the correponding
thread_info *, because it does not exist at this point (we are in the
process of creating it).  And the problem is that
aarch64_remove_non_address_bits assumes there is a current_thread_.  Can
we change aarch64_remove_non_address_bits to get the regcache using the
ptid?

  get_thread_regcache (current_inferior ()->process_target (), inferior_ptid);

Simon


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] aarch64: Check for valid inferior thread/regcache before reading pauth registers
  2023-03-17 17:15     ` Simon Marchi
@ 2023-03-17 17:25       ` Luis Machado
  2023-03-17 17:29         ` Simon Marchi
  0 siblings, 1 reply; 17+ messages in thread
From: Luis Machado @ 2023-03-17 17:25 UTC (permalink / raw)
  To: Simon Marchi, Andrew Burgess, gdb-patches, pedro; +Cc: marcan

On 3/17/23 17:15, Simon Marchi wrote:
>> ... despite the call to switch_to_no_thread in switch_to_inferior_no_thread from do_target_wait_1
>> in the backtrace above, the call to ps_xfer_memory sets inferior_ptid momentarily before reading
>> memory:
>>
>>
>> static ps_err_e
>> ps_xfer_memory (const struct ps_prochandle *ph, psaddr_t addr,
>>                  gdb_byte *buf, size_t len, int write)
>> {
>>    scoped_restore_current_inferior restore_inferior;
>>    set_current_inferior (ph->thread->inf);
>>
>>    scoped_restore_current_program_space restore_current_progspace;
>>    set_current_program_space (ph->thread->inf->pspace);
>>
>>    scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
>>    inferior_ptid = ph->thread->ptid;
>>
>>    CORE_ADDR core_addr = ps_addr_to_core_addr (addr);
>>
>>    int ret;
>>    if (write)
>>      ret = target_write_memory (core_addr, buf, len);
>>    else
>>      ret = target_read_memory (core_addr, buf, len);
>>    return (ret == 0 ? PS_OK : PS_ERR);
>> }
>>
>>
>> Maybe this shouldn't happen, or maybe it is just an unfortunate state to be in. But this
>> prevents the use of target_has_registers to guard against the lack of registers, since,
>> although current_thread_ is still nullptr, inferior_ptid is valid and is not null_ptid.
> It's needed to set inferior_ptid here because it's how we communicate to
> the targets which inferior / thread we want to read or write memory
> from.  And we can't set current_thread_ to the the correponding
> thread_info *, because it does not exist at this point (we are in the
> process of creating it).  And the problem is that
> aarch64_remove_non_address_bits assumes there is a current_thread_.  Can
> we change aarch64_remove_non_address_bits to get the regcache using the
> ptid?
> 
>    get_thread_regcache (current_inferior ()->process_target (), inferior_ptid);

What would be the outcome of this call when there is no active register cache because the inferior
has been killed? I think it would assert in find_inferior_pid with a null_ptid.

aarch64_remove_non_address_bits gets called for any memory access, so we may be in a state where we have
a valid stopped thread, a valid running thread, core file, no inferiors (just reading symbols/bfd) and so
on.

Or were you considering doing a check for inferior_ptid == null_ptid before calling get_thread_regcache?

> 
> Simon
> 


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] aarch64: Check for valid inferior thread/regcache before reading pauth registers
  2023-03-17 17:25       ` Luis Machado
@ 2023-03-17 17:29         ` Simon Marchi
  2023-03-17 17:38           ` Luis Machado
  0 siblings, 1 reply; 17+ messages in thread
From: Simon Marchi @ 2023-03-17 17:29 UTC (permalink / raw)
  To: Luis Machado, Andrew Burgess, gdb-patches, pedro; +Cc: marcan


> What would be the outcome of this call when there is no active register cache because the inferior
> has been killed? I think it would assert in find_inferior_pid with a null_ptid.
> 
> aarch64_remove_non_address_bits gets called for any memory access, so we may be in a state where we have
> a valid stopped thread, a valid running thread, core file, no inferiors (just reading symbols/bfd) and so
> on.
> 
> Or were you considering doing a check for inferior_ptid == null_ptid before calling get_thread_regcache?

Yes, a check for inferior_ptid == null_ptid would make sense.  I was
really just thinking about the initial problem, which is
aarch64_remove_non_address_bits being called while processing the event
in thread-db.

Simon

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] aarch64: Check for valid inferior thread/regcache before reading pauth registers
  2023-03-17 17:29         ` Simon Marchi
@ 2023-03-17 17:38           ` Luis Machado
  0 siblings, 0 replies; 17+ messages in thread
From: Luis Machado @ 2023-03-17 17:38 UTC (permalink / raw)
  To: Simon Marchi, Andrew Burgess, gdb-patches, pedro; +Cc: marcan

On 3/17/23 17:29, Simon Marchi wrote:
> 
>> What would be the outcome of this call when there is no active register cache because the inferior
>> has been killed? I think it would assert in find_inferior_pid with a null_ptid.
>>
>> aarch64_remove_non_address_bits gets called for any memory access, so we may be in a state where we have
>> a valid stopped thread, a valid running thread, core file, no inferiors (just reading symbols/bfd) and so
>> on.
>>
>> Or were you considering doing a check for inferior_ptid == null_ptid before calling get_thread_regcache?
> 
> Yes, a check for inferior_ptid == null_ptid would make sense.  I was
> really just thinking about the initial problem, which is
> aarch64_remove_non_address_bits being called while processing the event
> in thread-db.

I could switch to using inferior_ptid instead of using the helper functions to check current_thread_ if it is
deemed more appropriate.

If the threads are running while we try to read memory, the try/catch block will guard against register read errors.

> 
> Simon


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] aarch64: Check for valid inferior thread/regcache before reading pauth registers
  2023-03-16 23:15   ` Luis Machado
  2023-03-17 17:15     ` Simon Marchi
@ 2023-03-20 20:06     ` Andrew Burgess
  2023-03-22 10:26       ` Luis Machado
  1 sibling, 1 reply; 17+ messages in thread
From: Andrew Burgess @ 2023-03-20 20:06 UTC (permalink / raw)
  To: Luis Machado, gdb-patches, pedro; +Cc: marcan

Luis Machado <luis.machado@arm.com> writes:

> On 3/16/23 15:49, Andrew Burgess wrote:
>> Luis Machado <luis.machado@arm.com> writes:
>> 
>>> There were reports of gdb throwing internal errors when calling
>>> inferior_thread ()/get_current_regcache () on a system with
>>> Pointer Authentication enabled.
>>>
>>> In such cases, gdb produces the following backtrace:
>>>
>>> ../../../repos/binutils-gdb/gdb/thread.c:86: internal-error: inferior_thread: Assertion `current_thread_ != nullptr' failed.
>>> A problem internal to GDB has been detected,
>>> further debugging may prove unreliable.
>>> ----- Backtrace -----
>>> 0xaaaae04a571f gdb_internal_backtrace_1
>>> 	../../../repos/binutils-gdb/gdb/bt-utils.c:122
>>> 0xaaaae04a57f3 _Z22gdb_internal_backtracev
>>> 	../../../repos/binutils-gdb/gdb/bt-utils.c:168
>>> 0xaaaae0b52ccf internal_vproblem
>>> 	../../../repos/binutils-gdb/gdb/utils.c:401
>>> 0xaaaae0b5310b _Z15internal_verrorPKciS0_St9__va_list
>>> 	../../../repos/binutils-gdb/gdb/utils.c:481
>>> 0xaaaae0e24b8f _Z18internal_error_locPKciS0_z
>>> 	../../../repos/binutils-gdb/gdbsupport/errors.cc:58
>>> 0xaaaae0a88983 _Z15inferior_threadv
>>> 	../../../repos/binutils-gdb/gdb/thread.c:86
>>> 0xaaaae0956c87 _Z20get_current_regcachev
>>> 	../../../repos/binutils-gdb/gdb/regcache.c:428
>>> 0xaaaae035223f aarch64_remove_non_address_bits
>>> 	../../../repos/binutils-gdb/gdb/aarch64-tdep.c:3572
>>> 0xaaaae03e8abb _Z31gdbarch_remove_non_address_bitsP7gdbarchm
>>> 	../../../repos/binutils-gdb/gdb/gdbarch.c:3109
>>> 0xaaaae0a692d7 memory_xfer_partial
>>> 	../../../repos/binutils-gdb/gdb/target.c:1620
>>> 0xaaaae0a695e3 _Z19target_xfer_partialP10target_ops13target_objectPKcPhPKhmmPm
>>> 	../../../repos/binutils-gdb/gdb/target.c:1684
>>> 0xaaaae0a69e9f target_read_partial
>>> 	../../../repos/binutils-gdb/gdb/target.c:1937
>>> 0xaaaae0a69fdf _Z11target_readP10target_ops13target_objectPKcPhml
>>> 	../../../repos/binutils-gdb/gdb/target.c:1977
>>> 0xaaaae0a69937 _Z18target_read_memorymPhl
>>> 	../../../repos/binutils-gdb/gdb/target.c:1773
>>> 0xaaaae08be523 ps_xfer_memory
>>> 	../../../repos/binutils-gdb/gdb/proc-service.c:90
>>> 0xaaaae08be6db ps_pdread
>>> 	../../../repos/binutils-gdb/gdb/proc-service.c:124
>>> 0x40001ed7c3b3 _td_fetch_value
>>> 	/build/glibc-RIFKjK/glibc-2.31/nptl_db/fetch-value.c:115
>>> 0x40001ed791ef td_ta_map_lwp2thr
>>> 	/build/glibc-RIFKjK/glibc-2.31/nptl_db/td_ta_map_lwp2thr.c:194
>>> 0xaaaae07f4473 thread_from_lwp
>>> 	../../../repos/binutils-gdb/gdb/linux-thread-db.c:413
>>> 0xaaaae07f6d6f _ZN16thread_db_target4waitE6ptid_tP17target_waitstatus10enum_flagsI16target_wait_flagE
>>> 	../../../repos/binutils-gdb/gdb/linux-thread-db.c:1420
>>> 0xaaaae0a6b33b _Z11target_wait6ptid_tP17target_waitstatus10enum_flagsI16target_wait_flagE
>>> 	../../../repos/binutils-gdb/gdb/target.c:2586
>>> 0xaaaae0789cf7 do_target_wait_1
>>> 	../../../repos/binutils-gdb/gdb/infrun.c:3825
>>> 0xaaaae0789e6f operator()
>>> 	../../../repos/binutils-gdb/gdb/infrun.c:3884
>>> 0xaaaae078a167 do_target_wait
>>> 	../../../repos/binutils-gdb/gdb/infrun.c:3903
>>> 0xaaaae078b0af _Z20fetch_inferior_eventv
>>> 	../../../repos/binutils-gdb/gdb/infrun.c:4314
>>> 0xaaaae076652f _Z22inferior_event_handler19inferior_event_type
>>> 	../../../repos/binutils-gdb/gdb/inf-loop.c:41
>>> 0xaaaae07dc68b handle_target_event
>>> 	../../../repos/binutils-gdb/gdb/linux-nat.c:4206
>>> 0xaaaae0e25fbb handle_file_event
>>> 	../../../repos/binutils-gdb/gdbsupport/event-loop.cc:573
>>> 0xaaaae0e264f3 gdb_wait_for_event
>>> 	../../../repos/binutils-gdb/gdbsupport/event-loop.cc:694
>>> 0xaaaae0e24f9b _Z16gdb_do_one_eventi
>>> 	../../../repos/binutils-gdb/gdbsupport/event-loop.cc:217
>>> 0xaaaae080f033 start_event_loop
>>> 	../../../repos/binutils-gdb/gdb/main.c:411
>>> 0xaaaae080f1b7 captured_command_loop
>>> 	../../../repos/binutils-gdb/gdb/main.c:475
>>> 0xaaaae0810b97 captured_main
>>> 	../../../repos/binutils-gdb/gdb/main.c:1318
>>> 0xaaaae0810c1b _Z8gdb_mainP18captured_main_args
>>> 	../../../repos/binutils-gdb/gdb/main.c:1337
>>> 0xaaaae0338453 main
>>> 	../../../repos/binutils-gdb/gdb/gdb.c:32
>>> ---------------------
>>> ../../../repos/binutils-gdb/gdb/thread.c:86: internal-error: inferior_thread: Assertion `current_thread_ != nullptr' failed.
>>> A problem internal to GDB has been detected,
>>> further debugging may prove unreliable.
>>> Quit this debugging session? (y or n)
>>>
>>> This issue started after commit d88cb738e6a7a7179dfaff8af78d69250c852af1, which
>>> enabled more broad use of pointer authentication masks to remove non-address
>>> bits of pointers.
>>>
>>> The above crash happens because gdb is in the middle of handling an event,
>>> and do_target_wait_1 calls switch_to_inferior_no_thread, nullifying the
>>> current thread.  This means a call to inferior_thread () will assert, and
>>> attempting to call get_current_regcache () will also call inferior_thread (),
>>> resulting in an assertion as well.
>>>
>>> There is another crash scenario after we kill a previously active inferior, in
>>> which case the gdbarch will still say we support pointer authentication but we
>>> will also have no current thread (inferior_thread () will assert etc).
>>>
>>> I thought about this for a bit, and I'm not sure what is the best way
>>> to address this other than actively checking for a valid current thread
>>> and valid current register cache.
>>>
>>> If the target has support for Pointer Authentication, gdb needs to use
>>> a couple (or 4, for bare-metal) mask registers to mask off some bits of
>>> pointers, and for that it needs to access the registers.
>>>
>>> At some points, like the one from the backtrace above, there is no active
>>> thread/current regcache because gdb is in the middle of doing event handling
>>> and switching between threads.
>>>
>>> The following patch improves this situation by checking for an active thread
>>> and register cache. If those don't exist, gdb falls back to using a default
>>> mask which should be enough for what it is trying to do at the moment.
>>>
>>> I discussed with Pedro the possibility of caching the mask register values
>>> (which are per-process and can change mid-execution), but there isn't a good
>>> spot to cache those values. Besides, the mask registers can change constantly
>>> for bare-metal debugging when switching between exception levels.
>>>
>>> In some cases, it is just not possible to get access to these mask registers,
>>> like the case where threads are running. In those cases, using a default mask
>>> to remove the non-address bits should be enough.
>>>
>>> In addition to the check for a valid current register cache, I also introduced
>>> a try/catch guard against errors reading registers, in case the current thread
>>> is running, also falling back to using a default mask to remove non-address
>>> bits.
>>>
>>> This can happen when we let threads run in the background and then we attempt
>>> to access a memory address (now that gdb is capable of reading memory even
>>> with threads running).  Thus gdb will attempt to remove non-address bits
>>> of that memory access, will attempt to access registers, running into errors.
>>>
>>> Regression-tested on aarch64-linux Ubuntu 20.04.
>>>
>>> Thoughts?
>>> ---
>>>   gdb/aarch64-tdep.c | 38 +++++++++++++++++++++++++++++---------
>>>   gdb/gdbthread.h    |  8 ++++++++
>>>   gdb/regcache.c     | 10 ++++++++++
>>>   gdb/regcache.h     |  8 ++++++++
>>>   gdb/thread.c       |  8 ++++++++
>> 
>> Would it be possible to write a test for this issue?  Or maybe some of
>> the existing tests fail due to this issue?  If there are some tests that
>> already expose this issue then it would be great to mention them in the
>> commit message.
>> 
>
> Existing tests fail due to this. Two example are gdb.base/break.exp and gdb.base/access-mem-running.exp are
> two examples. I'll add those to the commit message for context.
>
> The catch here is that they will only fail on a system that supports pointer authentication natively, and
> there are only a few of those at the moment (Graviton 3, M2 and system
> QEMU are 3 examples I know of).

I don't think that's a problem, you should just list a few tests that
exhibit failures and under what circumstances you expect to see the
failures, e.g. on a system with pointer authentication.

>
>>>   5 files changed, 63 insertions(+), 9 deletions(-)
>>>
>>> diff --git a/gdb/aarch64-tdep.c b/gdb/aarch64-tdep.c
>>> index 5b1b9921f87..8368592a7db 100644
>>> --- a/gdb/aarch64-tdep.c
>>> +++ b/gdb/aarch64-tdep.c
>>> @@ -3562,13 +3562,17 @@ aarch64_remove_non_address_bits (struct gdbarch *gdbarch, CORE_ADDR pointer)
>>>        select bit (55).  */
>>>     CORE_ADDR mask = AARCH64_TOP_BITS_MASK;
>>>   
>>> -  if (tdep->has_pauth ())
>>> +  /* Make sure we have a valid active current register cache to fetch pointer
>>> +     authentication masks from.  In some situations, there may be no active
>>> +     threads to fetch the data from (no inferiors, gdb in the middle of
>>> +     handling events etc).  If we don't have a valid register cache, fallback
>>> +     to using a default mask to remove non-address bits.  */
>>> +  if (tdep->has_pauth () && current_regcache_exists ())
>> 
>> Maybe I'm completely off the mark here, but does target_has_registers
>> not help at all here?
>
> It does, and that was the first thing I considered for guarding this block against a situation where we
> have no current thread.
>
> But...
>
>> 
>> For a process_stratum_target, this should end up in this function:
>> 
>>    bool
>>    process_stratum_target::has_registers ()
>>    {
>>      /* Can't read registers from no inferior.  */
>>      return inferior_ptid != null_ptid;
>>    }
>> 
>> And given that switch_to_no_thread () includes the line:
>> 
>>    inferior_ptid = null_ptid;
>> 
>
> ... despite the call to switch_to_no_thread in switch_to_inferior_no_thread from do_target_wait_1
> in the backtrace above, the call to ps_xfer_memory sets inferior_ptid momentarily before reading
> memory:
>
>
> static ps_err_e
> ps_xfer_memory (const struct ps_prochandle *ph, psaddr_t addr,
>                  gdb_byte *buf, size_t len, int write)
> {
>    scoped_restore_current_inferior restore_inferior;
>    set_current_inferior (ph->thread->inf);
>
>    scoped_restore_current_program_space restore_current_progspace;
>    set_current_program_space (ph->thread->inf->pspace);
>
>    scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
>    inferior_ptid = ph->thread->ptid;
>
>    CORE_ADDR core_addr = ps_addr_to_core_addr (addr);
>
>    int ret;
>    if (write)
>      ret = target_write_memory (core_addr, buf, len);
>    else
>      ret = target_read_memory (core_addr, buf, len);
>    return (ret == 0 ? PS_OK : PS_ERR);
> }
>
>
> Maybe this shouldn't happen, or maybe it is just an unfortunate state to be in. But this
> prevents the use of target_has_registers to guard against the lack of registers, since,
> although current_thread_ is still nullptr, inferior_ptid is valid and is not null_ptid.
>
> So doing a direct check of current_thread_ through a helper function
> seemed safer.

Personally, I find this sort of information really valuable to include
in the commit message.

If someone is digging through this code years from now, all they'll have
is your commit message.  They, like me, might think about
target_has_registers, and then they have to change GDB, test everything,
and rediscover why target_has_registers isn't going to work.

>
> And just to clarify, this is just one of the issues where we attempt to fetch a current
> register cache with a null current_thread_, leading to an assertion.
>
> The other issue, which is solved with a try/catch block, yields an error message about not
> being able to access registers (because the thread we want to fetch registers from is still
> running).
>
> Maybe in the future we should prevent calls to register fetch/store functions if the thread we
> want to fetch data from is still running. Right now this isn't a problem besides the AArch64
> target though.

I think thread_info::executing() should tell you if the thread is
running... At least, I think that's the intention.  I don't think that
field is always correct, so keeping the try/catch is probably a good
idea, at least for now.

I don't object to this patch going in as is then (maybe with an extended
commit message), but I guess you should figure out with Simon if his
idea is better or not...

Thanks for all the details,
Andrew


>
>> My hope would be that when we have no thread selected,
>> target_has_registers will return false.
>> 
>> The other thing I worried about before suggesting the above is this
>> depends on the current_inferior in order to identify a target stack.
>> But reading registers already makes that assumption, right.. reading a
>> register might require us to actually go get the registers, so we are
>> already assuming that the current_inferior() is sane at this point.
>> 
>> Thanks,
>> Andrew
>> 
>> 
>> 
>> 
>>>       {
>>>         /* Fetch the PAC masks.  These masks are per-process, so we can just
>>> -	 fetch data from whatever thread we have at the moment.
>>> +	 fetch data from whatever thread we have at the moment, hence why we
>>> +	 use the current register cache.  */
>>>   
>>> -	 Also, we have both a code mask and a data mask.  For now they are the
>>> -	 same, but this may change in the future.  */
>>>         struct regcache *regs = get_current_regcache ();
>>>         CORE_ADDR cmask, dmask;
>>>         int dmask_regnum = AARCH64_PAUTH_DMASK_REGNUM (tdep->pauth_reg_base);
>>> @@ -3583,13 +3587,29 @@ aarch64_remove_non_address_bits (struct gdbarch *gdbarch, CORE_ADDR pointer)
>>>   	  cmask_regnum = AARCH64_PAUTH_CMASK_HIGH_REGNUM (tdep->pauth_reg_base);
>>>   	}
>>>   
>>> -      if (regs->cooked_read (dmask_regnum, &dmask) != REG_VALID)
>>> -	dmask = mask;
>>> +      /* GDB supports reading memory from threads that are running, but in
>>> +	 those cases GDB cannot access registers.  If reading the pointer
>>> +	 authentication registers fails, fallback to using a default mask.
>>> +
>>> +	 We could potentially get it wrong if the masks have been updated
>>> +	 while the thread is running, but eventually we will see the thread
>>> +	 stop and will be able to fetch the correct data.  */
>>> +      try
>>> +	{
>>> +	  /* We have both a code mask and a data mask.  For now they are
>>> +	     the same, but this may change in the future.  */
>>> +	  if (regs->cooked_read (dmask_regnum, &dmask) != REG_VALID)
>>> +	    dmask = mask;
>>>   
>>> -      if (regs->cooked_read (cmask_regnum, &cmask) != REG_VALID)
>>> -	cmask = mask;
>>> +	  if (regs->cooked_read (cmask_regnum, &cmask) != REG_VALID)
>>> +	    cmask = mask;
>>>   
>>> -      mask |= aarch64_mask_from_pac_registers (cmask, dmask);
>>> +	  mask |= aarch64_mask_from_pac_registers (cmask, dmask);
>>> +	}
>>> +      catch (const gdb_exception_error &e)
>>> +	{
>>> +	  /* We failed to fetch the contents of the masks from registers.  */
>>> +	}
>>>       }
>>>   
>>>     return aarch64_remove_top_bits (pointer, mask);
>>> diff --git a/gdb/gdbthread.h b/gdb/gdbthread.h
>>> index c0f27a8a66e..ebb8f140770 100644
>>> --- a/gdb/gdbthread.h
>>> +++ b/gdb/gdbthread.h
>>> @@ -875,6 +875,14 @@ class scoped_restore_current_thread
>>>     enum language m_lang;
>>>   };
>>>   
>>> +/* Return TRUE if an inferior thread exists and is currently selected,
>>> +   meaning it is safe to call inferior_thread () without risking an assertion.
>>> +
>>> +   Return FALSE otherwise.  If so, inferior_thread () will assert if
>>> +   called.  */
>>> +
>>> +extern bool inferior_thread_exists (void);
>>> +
>>>   /* Returns a pointer into the thread_info corresponding to
>>>      INFERIOR_PTID.  INFERIOR_PTID *must* be in the thread list.  */
>>>   extern struct thread_info* inferior_thread (void);
>>> diff --git a/gdb/regcache.c b/gdb/regcache.c
>>> index af76fab1a34..4685b01e9ab 100644
>>> --- a/gdb/regcache.c
>>> +++ b/gdb/regcache.c
>>> @@ -422,6 +422,16 @@ get_thread_regcache (thread_info *thread)
>>>   			      thread->ptid);
>>>   }
>>>   
>>> +/* See regcache.h.  */
>>> +
>>> +bool
>>> +current_regcache_exists (void)
>>> +{
>>> +  /* If we have an inferior thread selected, we should have a register
>>> +     cache as well.  */
>>> +  return inferior_thread_exists ();
>>> +}
>>> +
>>>   struct regcache *
>>>   get_current_regcache (void)
>>>   {
>>> diff --git a/gdb/regcache.h b/gdb/regcache.h
>>> index b9ffab9950d..adff7e44cb9 100644
>>> --- a/gdb/regcache.h
>>> +++ b/gdb/regcache.h
>>> @@ -30,6 +30,14 @@ struct address_space;
>>>   class thread_info;
>>>   struct process_stratum_target;
>>>   
>>> +/* Return TRUE if we can access the register cache from the
>>> +   currently-selected thread.  If so, we can call get_current_regcache ()
>>> +   safely without risking an assertion due to the lack of an inferior thread.
>>> +
>>> +   Return FALSE otherwise.  In this case, attempting to call
>>> +   get_current_regcache () will run into an assertion.  */
>>> +
>>> +extern bool current_regcache_exists (void);
>>>   extern struct regcache *get_current_regcache (void);
>>>   extern struct regcache *get_thread_regcache (process_stratum_target *target,
>>>   					     ptid_t ptid);
>>> diff --git a/gdb/thread.c b/gdb/thread.c
>>> index e4d98ce966a..545bbf2a411 100644
>>> --- a/gdb/thread.c
>>> +++ b/gdb/thread.c
>>> @@ -80,6 +80,14 @@ is_current_thread (const thread_info *thr)
>>>     return thr == current_thread_;
>>>   }
>>>   
>>> +/* See gdbthread.h.  */
>>> +
>>> +bool
>>> +inferior_thread_exists ()
>>> +{
>>> +  return current_thread_ != nullptr;
>>> +}
>>> +
>>>   struct thread_info*
>>>   inferior_thread (void)
>>>   {
>>> -- 
>>> 2.25.1
>> 


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] aarch64: Check for valid inferior thread/regcache before reading pauth registers
  2023-03-20 20:06     ` Andrew Burgess
@ 2023-03-22 10:26       ` Luis Machado
  0 siblings, 0 replies; 17+ messages in thread
From: Luis Machado @ 2023-03-22 10:26 UTC (permalink / raw)
  To: Andrew Burgess, gdb-patches, pedro; +Cc: marcan

On 3/20/23 20:06, Andrew Burgess wrote:
> Luis Machado <luis.machado@arm.com> writes:
> 
>> On 3/16/23 15:49, Andrew Burgess wrote:
>>> Luis Machado <luis.machado@arm.com> writes:
>>>
>>>> There were reports of gdb throwing internal errors when calling
>>>> inferior_thread ()/get_current_regcache () on a system with
>>>> Pointer Authentication enabled.
>>>>
>>>> In such cases, gdb produces the following backtrace:
>>>>
>>>> ../../../repos/binutils-gdb/gdb/thread.c:86: internal-error: inferior_thread: Assertion `current_thread_ != nullptr' failed.
>>>> A problem internal to GDB has been detected,
>>>> further debugging may prove unreliable.
>>>> ----- Backtrace -----
>>>> 0xaaaae04a571f gdb_internal_backtrace_1
>>>> 	../../../repos/binutils-gdb/gdb/bt-utils.c:122
>>>> 0xaaaae04a57f3 _Z22gdb_internal_backtracev
>>>> 	../../../repos/binutils-gdb/gdb/bt-utils.c:168
>>>> 0xaaaae0b52ccf internal_vproblem
>>>> 	../../../repos/binutils-gdb/gdb/utils.c:401
>>>> 0xaaaae0b5310b _Z15internal_verrorPKciS0_St9__va_list
>>>> 	../../../repos/binutils-gdb/gdb/utils.c:481
>>>> 0xaaaae0e24b8f _Z18internal_error_locPKciS0_z
>>>> 	../../../repos/binutils-gdb/gdbsupport/errors.cc:58
>>>> 0xaaaae0a88983 _Z15inferior_threadv
>>>> 	../../../repos/binutils-gdb/gdb/thread.c:86
>>>> 0xaaaae0956c87 _Z20get_current_regcachev
>>>> 	../../../repos/binutils-gdb/gdb/regcache.c:428
>>>> 0xaaaae035223f aarch64_remove_non_address_bits
>>>> 	../../../repos/binutils-gdb/gdb/aarch64-tdep.c:3572
>>>> 0xaaaae03e8abb _Z31gdbarch_remove_non_address_bitsP7gdbarchm
>>>> 	../../../repos/binutils-gdb/gdb/gdbarch.c:3109
>>>> 0xaaaae0a692d7 memory_xfer_partial
>>>> 	../../../repos/binutils-gdb/gdb/target.c:1620
>>>> 0xaaaae0a695e3 _Z19target_xfer_partialP10target_ops13target_objectPKcPhPKhmmPm
>>>> 	../../../repos/binutils-gdb/gdb/target.c:1684
>>>> 0xaaaae0a69e9f target_read_partial
>>>> 	../../../repos/binutils-gdb/gdb/target.c:1937
>>>> 0xaaaae0a69fdf _Z11target_readP10target_ops13target_objectPKcPhml
>>>> 	../../../repos/binutils-gdb/gdb/target.c:1977
>>>> 0xaaaae0a69937 _Z18target_read_memorymPhl
>>>> 	../../../repos/binutils-gdb/gdb/target.c:1773
>>>> 0xaaaae08be523 ps_xfer_memory
>>>> 	../../../repos/binutils-gdb/gdb/proc-service.c:90
>>>> 0xaaaae08be6db ps_pdread
>>>> 	../../../repos/binutils-gdb/gdb/proc-service.c:124
>>>> 0x40001ed7c3b3 _td_fetch_value
>>>> 	/build/glibc-RIFKjK/glibc-2.31/nptl_db/fetch-value.c:115
>>>> 0x40001ed791ef td_ta_map_lwp2thr
>>>> 	/build/glibc-RIFKjK/glibc-2.31/nptl_db/td_ta_map_lwp2thr.c:194
>>>> 0xaaaae07f4473 thread_from_lwp
>>>> 	../../../repos/binutils-gdb/gdb/linux-thread-db.c:413
>>>> 0xaaaae07f6d6f _ZN16thread_db_target4waitE6ptid_tP17target_waitstatus10enum_flagsI16target_wait_flagE
>>>> 	../../../repos/binutils-gdb/gdb/linux-thread-db.c:1420
>>>> 0xaaaae0a6b33b _Z11target_wait6ptid_tP17target_waitstatus10enum_flagsI16target_wait_flagE
>>>> 	../../../repos/binutils-gdb/gdb/target.c:2586
>>>> 0xaaaae0789cf7 do_target_wait_1
>>>> 	../../../repos/binutils-gdb/gdb/infrun.c:3825
>>>> 0xaaaae0789e6f operator()
>>>> 	../../../repos/binutils-gdb/gdb/infrun.c:3884
>>>> 0xaaaae078a167 do_target_wait
>>>> 	../../../repos/binutils-gdb/gdb/infrun.c:3903
>>>> 0xaaaae078b0af _Z20fetch_inferior_eventv
>>>> 	../../../repos/binutils-gdb/gdb/infrun.c:4314
>>>> 0xaaaae076652f _Z22inferior_event_handler19inferior_event_type
>>>> 	../../../repos/binutils-gdb/gdb/inf-loop.c:41
>>>> 0xaaaae07dc68b handle_target_event
>>>> 	../../../repos/binutils-gdb/gdb/linux-nat.c:4206
>>>> 0xaaaae0e25fbb handle_file_event
>>>> 	../../../repos/binutils-gdb/gdbsupport/event-loop.cc:573
>>>> 0xaaaae0e264f3 gdb_wait_for_event
>>>> 	../../../repos/binutils-gdb/gdbsupport/event-loop.cc:694
>>>> 0xaaaae0e24f9b _Z16gdb_do_one_eventi
>>>> 	../../../repos/binutils-gdb/gdbsupport/event-loop.cc:217
>>>> 0xaaaae080f033 start_event_loop
>>>> 	../../../repos/binutils-gdb/gdb/main.c:411
>>>> 0xaaaae080f1b7 captured_command_loop
>>>> 	../../../repos/binutils-gdb/gdb/main.c:475
>>>> 0xaaaae0810b97 captured_main
>>>> 	../../../repos/binutils-gdb/gdb/main.c:1318
>>>> 0xaaaae0810c1b _Z8gdb_mainP18captured_main_args
>>>> 	../../../repos/binutils-gdb/gdb/main.c:1337
>>>> 0xaaaae0338453 main
>>>> 	../../../repos/binutils-gdb/gdb/gdb.c:32
>>>> ---------------------
>>>> ../../../repos/binutils-gdb/gdb/thread.c:86: internal-error: inferior_thread: Assertion `current_thread_ != nullptr' failed.
>>>> A problem internal to GDB has been detected,
>>>> further debugging may prove unreliable.
>>>> Quit this debugging session? (y or n)
>>>>
>>>> This issue started after commit d88cb738e6a7a7179dfaff8af78d69250c852af1, which
>>>> enabled more broad use of pointer authentication masks to remove non-address
>>>> bits of pointers.
>>>>
>>>> The above crash happens because gdb is in the middle of handling an event,
>>>> and do_target_wait_1 calls switch_to_inferior_no_thread, nullifying the
>>>> current thread.  This means a call to inferior_thread () will assert, and
>>>> attempting to call get_current_regcache () will also call inferior_thread (),
>>>> resulting in an assertion as well.
>>>>
>>>> There is another crash scenario after we kill a previously active inferior, in
>>>> which case the gdbarch will still say we support pointer authentication but we
>>>> will also have no current thread (inferior_thread () will assert etc).
>>>>
>>>> I thought about this for a bit, and I'm not sure what is the best way
>>>> to address this other than actively checking for a valid current thread
>>>> and valid current register cache.
>>>>
>>>> If the target has support for Pointer Authentication, gdb needs to use
>>>> a couple (or 4, for bare-metal) mask registers to mask off some bits of
>>>> pointers, and for that it needs to access the registers.
>>>>
>>>> At some points, like the one from the backtrace above, there is no active
>>>> thread/current regcache because gdb is in the middle of doing event handling
>>>> and switching between threads.
>>>>
>>>> The following patch improves this situation by checking for an active thread
>>>> and register cache. If those don't exist, gdb falls back to using a default
>>>> mask which should be enough for what it is trying to do at the moment.
>>>>
>>>> I discussed with Pedro the possibility of caching the mask register values
>>>> (which are per-process and can change mid-execution), but there isn't a good
>>>> spot to cache those values. Besides, the mask registers can change constantly
>>>> for bare-metal debugging when switching between exception levels.
>>>>
>>>> In some cases, it is just not possible to get access to these mask registers,
>>>> like the case where threads are running. In those cases, using a default mask
>>>> to remove the non-address bits should be enough.
>>>>
>>>> In addition to the check for a valid current register cache, I also introduced
>>>> a try/catch guard against errors reading registers, in case the current thread
>>>> is running, also falling back to using a default mask to remove non-address
>>>> bits.
>>>>
>>>> This can happen when we let threads run in the background and then we attempt
>>>> to access a memory address (now that gdb is capable of reading memory even
>>>> with threads running).  Thus gdb will attempt to remove non-address bits
>>>> of that memory access, will attempt to access registers, running into errors.
>>>>
>>>> Regression-tested on aarch64-linux Ubuntu 20.04.
>>>>
>>>> Thoughts?
>>>> ---
>>>>    gdb/aarch64-tdep.c | 38 +++++++++++++++++++++++++++++---------
>>>>    gdb/gdbthread.h    |  8 ++++++++
>>>>    gdb/regcache.c     | 10 ++++++++++
>>>>    gdb/regcache.h     |  8 ++++++++
>>>>    gdb/thread.c       |  8 ++++++++
>>>
>>> Would it be possible to write a test for this issue?  Or maybe some of
>>> the existing tests fail due to this issue?  If there are some tests that
>>> already expose this issue then it would be great to mention them in the
>>> commit message.
>>>
>>
>> Existing tests fail due to this. Two example are gdb.base/break.exp and gdb.base/access-mem-running.exp are
>> two examples. I'll add those to the commit message for context.
>>
>> The catch here is that they will only fail on a system that supports pointer authentication natively, and
>> there are only a few of those at the moment (Graviton 3, M2 and system
>> QEMU are 3 examples I know of).
> 
> I don't think that's a problem, you should just list a few tests that
> exhibit failures and under what circumstances you expect to see the
> failures, e.g. on a system with pointer authentication.
> 

Done now. It will be in v2.

>>
>>>>    5 files changed, 63 insertions(+), 9 deletions(-)
>>>>
>>>> diff --git a/gdb/aarch64-tdep.c b/gdb/aarch64-tdep.c
>>>> index 5b1b9921f87..8368592a7db 100644
>>>> --- a/gdb/aarch64-tdep.c
>>>> +++ b/gdb/aarch64-tdep.c
>>>> @@ -3562,13 +3562,17 @@ aarch64_remove_non_address_bits (struct gdbarch *gdbarch, CORE_ADDR pointer)
>>>>         select bit (55).  */
>>>>      CORE_ADDR mask = AARCH64_TOP_BITS_MASK;
>>>>    
>>>> -  if (tdep->has_pauth ())
>>>> +  /* Make sure we have a valid active current register cache to fetch pointer
>>>> +     authentication masks from.  In some situations, there may be no active
>>>> +     threads to fetch the data from (no inferiors, gdb in the middle of
>>>> +     handling events etc).  If we don't have a valid register cache, fallback
>>>> +     to using a default mask to remove non-address bits.  */
>>>> +  if (tdep->has_pauth () && current_regcache_exists ())
>>>
>>> Maybe I'm completely off the mark here, but does target_has_registers
>>> not help at all here?
>>
>> It does, and that was the first thing I considered for guarding this block against a situation where we
>> have no current thread.
>>
>> But...
>>
>>>
>>> For a process_stratum_target, this should end up in this function:
>>>
>>>     bool
>>>     process_stratum_target::has_registers ()
>>>     {
>>>       /* Can't read registers from no inferior.  */
>>>       return inferior_ptid != null_ptid;
>>>     }
>>>
>>> And given that switch_to_no_thread () includes the line:
>>>
>>>     inferior_ptid = null_ptid;
>>>
>>
>> ... despite the call to switch_to_no_thread in switch_to_inferior_no_thread from do_target_wait_1
>> in the backtrace above, the call to ps_xfer_memory sets inferior_ptid momentarily before reading
>> memory:
>>
>>
>> static ps_err_e
>> ps_xfer_memory (const struct ps_prochandle *ph, psaddr_t addr,
>>                   gdb_byte *buf, size_t len, int write)
>> {
>>     scoped_restore_current_inferior restore_inferior;
>>     set_current_inferior (ph->thread->inf);
>>
>>     scoped_restore_current_program_space restore_current_progspace;
>>     set_current_program_space (ph->thread->inf->pspace);
>>
>>     scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
>>     inferior_ptid = ph->thread->ptid;
>>
>>     CORE_ADDR core_addr = ps_addr_to_core_addr (addr);
>>
>>     int ret;
>>     if (write)
>>       ret = target_write_memory (core_addr, buf, len);
>>     else
>>       ret = target_read_memory (core_addr, buf, len);
>>     return (ret == 0 ? PS_OK : PS_ERR);
>> }
>>
>>
>> Maybe this shouldn't happen, or maybe it is just an unfortunate state to be in. But this
>> prevents the use of target_has_registers to guard against the lack of registers, since,
>> although current_thread_ is still nullptr, inferior_ptid is valid and is not null_ptid.
>>
>> So doing a direct check of current_thread_ through a helper function
>> seemed safer.
> 
> Personally, I find this sort of information really valuable to include
> in the commit message.
> 
> If someone is digging through this code years from now, all they'll have
> is your commit message.  They, like me, might think about
> target_has_registers, and then they have to change GDB, test everything,
> and rediscover why target_has_registers isn't going to work.
> 

You're right. I'll add that to v2.

>>
>> And just to clarify, this is just one of the issues where we attempt to fetch a current
>> register cache with a null current_thread_, leading to an assertion.
>>
>> The other issue, which is solved with a try/catch block, yields an error message about not
>> being able to access registers (because the thread we want to fetch registers from is still
>> running).
>>
>> Maybe in the future we should prevent calls to register fetch/store functions if the thread we
>> want to fetch data from is still running. Right now this isn't a problem besides the AArch64
>> target though.
> 
> I think thread_info::executing() should tell you if the thread is
> running... At least, I think that's the intention.  I don't think that
> field is always correct, so keeping the try/catch is probably a good
> idea, at least for now.
> 
> I don't object to this patch going in as is then (maybe with an extended
> commit message), but I guess you should figure out with Simon if his
> idea is better or not...

Thanks for the feedback. I'll wait for Simon's take on it.

> 
> Thanks for all the details,
> Andrew
> 
> 
>>
>>> My hope would be that when we have no thread selected,
>>> target_has_registers will return false.
>>>
>>> The other thing I worried about before suggesting the above is this
>>> depends on the current_inferior in order to identify a target stack.
>>> But reading registers already makes that assumption, right.. reading a
>>> register might require us to actually go get the registers, so we are
>>> already assuming that the current_inferior() is sane at this point.
>>>
>>> Thanks,
>>> Andrew
>>>
>>>
>>>
>>>
>>>>        {
>>>>          /* Fetch the PAC masks.  These masks are per-process, so we can just
>>>> -	 fetch data from whatever thread we have at the moment.
>>>> +	 fetch data from whatever thread we have at the moment, hence why we
>>>> +	 use the current register cache.  */
>>>>    
>>>> -	 Also, we have both a code mask and a data mask.  For now they are the
>>>> -	 same, but this may change in the future.  */
>>>>          struct regcache *regs = get_current_regcache ();
>>>>          CORE_ADDR cmask, dmask;
>>>>          int dmask_regnum = AARCH64_PAUTH_DMASK_REGNUM (tdep->pauth_reg_base);
>>>> @@ -3583,13 +3587,29 @@ aarch64_remove_non_address_bits (struct gdbarch *gdbarch, CORE_ADDR pointer)
>>>>    	  cmask_regnum = AARCH64_PAUTH_CMASK_HIGH_REGNUM (tdep->pauth_reg_base);
>>>>    	}
>>>>    
>>>> -      if (regs->cooked_read (dmask_regnum, &dmask) != REG_VALID)
>>>> -	dmask = mask;
>>>> +      /* GDB supports reading memory from threads that are running, but in
>>>> +	 those cases GDB cannot access registers.  If reading the pointer
>>>> +	 authentication registers fails, fallback to using a default mask.
>>>> +
>>>> +	 We could potentially get it wrong if the masks have been updated
>>>> +	 while the thread is running, but eventually we will see the thread
>>>> +	 stop and will be able to fetch the correct data.  */
>>>> +      try
>>>> +	{
>>>> +	  /* We have both a code mask and a data mask.  For now they are
>>>> +	     the same, but this may change in the future.  */
>>>> +	  if (regs->cooked_read (dmask_regnum, &dmask) != REG_VALID)
>>>> +	    dmask = mask;
>>>>    
>>>> -      if (regs->cooked_read (cmask_regnum, &cmask) != REG_VALID)
>>>> -	cmask = mask;
>>>> +	  if (regs->cooked_read (cmask_regnum, &cmask) != REG_VALID)
>>>> +	    cmask = mask;
>>>>    
>>>> -      mask |= aarch64_mask_from_pac_registers (cmask, dmask);
>>>> +	  mask |= aarch64_mask_from_pac_registers (cmask, dmask);
>>>> +	}
>>>> +      catch (const gdb_exception_error &e)
>>>> +	{
>>>> +	  /* We failed to fetch the contents of the masks from registers.  */
>>>> +	}
>>>>        }
>>>>    
>>>>      return aarch64_remove_top_bits (pointer, mask);
>>>> diff --git a/gdb/gdbthread.h b/gdb/gdbthread.h
>>>> index c0f27a8a66e..ebb8f140770 100644
>>>> --- a/gdb/gdbthread.h
>>>> +++ b/gdb/gdbthread.h
>>>> @@ -875,6 +875,14 @@ class scoped_restore_current_thread
>>>>      enum language m_lang;
>>>>    };
>>>>    
>>>> +/* Return TRUE if an inferior thread exists and is currently selected,
>>>> +   meaning it is safe to call inferior_thread () without risking an assertion.
>>>> +
>>>> +   Return FALSE otherwise.  If so, inferior_thread () will assert if
>>>> +   called.  */
>>>> +
>>>> +extern bool inferior_thread_exists (void);
>>>> +
>>>>    /* Returns a pointer into the thread_info corresponding to
>>>>       INFERIOR_PTID.  INFERIOR_PTID *must* be in the thread list.  */
>>>>    extern struct thread_info* inferior_thread (void);
>>>> diff --git a/gdb/regcache.c b/gdb/regcache.c
>>>> index af76fab1a34..4685b01e9ab 100644
>>>> --- a/gdb/regcache.c
>>>> +++ b/gdb/regcache.c
>>>> @@ -422,6 +422,16 @@ get_thread_regcache (thread_info *thread)
>>>>    			      thread->ptid);
>>>>    }
>>>>    
>>>> +/* See regcache.h.  */
>>>> +
>>>> +bool
>>>> +current_regcache_exists (void)
>>>> +{
>>>> +  /* If we have an inferior thread selected, we should have a register
>>>> +     cache as well.  */
>>>> +  return inferior_thread_exists ();
>>>> +}
>>>> +
>>>>    struct regcache *
>>>>    get_current_regcache (void)
>>>>    {
>>>> diff --git a/gdb/regcache.h b/gdb/regcache.h
>>>> index b9ffab9950d..adff7e44cb9 100644
>>>> --- a/gdb/regcache.h
>>>> +++ b/gdb/regcache.h
>>>> @@ -30,6 +30,14 @@ struct address_space;
>>>>    class thread_info;
>>>>    struct process_stratum_target;
>>>>    
>>>> +/* Return TRUE if we can access the register cache from the
>>>> +   currently-selected thread.  If so, we can call get_current_regcache ()
>>>> +   safely without risking an assertion due to the lack of an inferior thread.
>>>> +
>>>> +   Return FALSE otherwise.  In this case, attempting to call
>>>> +   get_current_regcache () will run into an assertion.  */
>>>> +
>>>> +extern bool current_regcache_exists (void);
>>>>    extern struct regcache *get_current_regcache (void);
>>>>    extern struct regcache *get_thread_regcache (process_stratum_target *target,
>>>>    					     ptid_t ptid);
>>>> diff --git a/gdb/thread.c b/gdb/thread.c
>>>> index e4d98ce966a..545bbf2a411 100644
>>>> --- a/gdb/thread.c
>>>> +++ b/gdb/thread.c
>>>> @@ -80,6 +80,14 @@ is_current_thread (const thread_info *thr)
>>>>      return thr == current_thread_;
>>>>    }
>>>>    
>>>> +/* See gdbthread.h.  */
>>>> +
>>>> +bool
>>>> +inferior_thread_exists ()
>>>> +{
>>>> +  return current_thread_ != nullptr;
>>>> +}
>>>> +
>>>>    struct thread_info*
>>>>    inferior_thread (void)
>>>>    {
>>>> -- 
>>>> 2.25.1
>>>
> 


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH, v2] aarch64: Check for valid inferior thread/regcache before reading pauth registers
  2023-03-16 10:39 [PATCH] aarch64: Check for valid inferior thread/regcache before reading pauth registers Luis Machado
  2023-03-16 15:49 ` Andrew Burgess
@ 2023-03-23 10:56 ` Luis Machado
  2023-03-23 18:25   ` Kevin Buettner
  2023-03-24  8:19   ` Luis Machado
  1 sibling, 2 replies; 17+ messages in thread
From: Luis Machado @ 2023-03-23 10:56 UTC (permalink / raw)
  To: gdb-patches, aburgess, simon.marchi, pedro; +Cc: marcan

Changes in v2:

- Dropped helper functions and used inferior_ptid/current_inferior ()
instead.  Changes are now aarch64-specific.

There were reports of gdb throwing internal errors when calling
inferior_thread ()/get_current_regcache () on a system with
Pointer Authentication enabled.

In such cases, gdb produces the following backtrace:

../../../repos/binutils-gdb/gdb/thread.c:86: internal-error: inferior_thread: Assertion `current_thread_ != nullptr' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
----- Backtrace -----
0xaaaae04a571f gdb_internal_backtrace_1
	../../../repos/binutils-gdb/gdb/bt-utils.c:122
0xaaaae04a57f3 _Z22gdb_internal_backtracev
	../../../repos/binutils-gdb/gdb/bt-utils.c:168
0xaaaae0b52ccf internal_vproblem
	../../../repos/binutils-gdb/gdb/utils.c:401
0xaaaae0b5310b _Z15internal_verrorPKciS0_St9__va_list
	../../../repos/binutils-gdb/gdb/utils.c:481
0xaaaae0e24b8f _Z18internal_error_locPKciS0_z
	../../../repos/binutils-gdb/gdbsupport/errors.cc:58
0xaaaae0a88983 _Z15inferior_threadv
	../../../repos/binutils-gdb/gdb/thread.c:86
0xaaaae0956c87 _Z20get_current_regcachev
	../../../repos/binutils-gdb/gdb/regcache.c:428
0xaaaae035223f aarch64_remove_non_address_bits
	../../../repos/binutils-gdb/gdb/aarch64-tdep.c:3572
0xaaaae03e8abb _Z31gdbarch_remove_non_address_bitsP7gdbarchm
	../../../repos/binutils-gdb/gdb/gdbarch.c:3109
0xaaaae0a692d7 memory_xfer_partial
	../../../repos/binutils-gdb/gdb/target.c:1620
0xaaaae0a695e3 _Z19target_xfer_partialP10target_ops13target_objectPKcPhPKhmmPm
	../../../repos/binutils-gdb/gdb/target.c:1684
0xaaaae0a69e9f target_read_partial
	../../../repos/binutils-gdb/gdb/target.c:1937
0xaaaae0a69fdf _Z11target_readP10target_ops13target_objectPKcPhml
	../../../repos/binutils-gdb/gdb/target.c:1977
0xaaaae0a69937 _Z18target_read_memorymPhl
	../../../repos/binutils-gdb/gdb/target.c:1773
0xaaaae08be523 ps_xfer_memory
	../../../repos/binutils-gdb/gdb/proc-service.c:90
0xaaaae08be6db ps_pdread
	../../../repos/binutils-gdb/gdb/proc-service.c:124
0x40001ed7c3b3 _td_fetch_value
	/build/glibc-RIFKjK/glibc-2.31/nptl_db/fetch-value.c:115
0x40001ed791ef td_ta_map_lwp2thr
	/build/glibc-RIFKjK/glibc-2.31/nptl_db/td_ta_map_lwp2thr.c:194
0xaaaae07f4473 thread_from_lwp
	../../../repos/binutils-gdb/gdb/linux-thread-db.c:413
0xaaaae07f6d6f _ZN16thread_db_target4waitE6ptid_tP17target_waitstatus10enum_flagsI16target_wait_flagE
	../../../repos/binutils-gdb/gdb/linux-thread-db.c:1420
0xaaaae0a6b33b _Z11target_wait6ptid_tP17target_waitstatus10enum_flagsI16target_wait_flagE
	../../../repos/binutils-gdb/gdb/target.c:2586
0xaaaae0789cf7 do_target_wait_1
	../../../repos/binutils-gdb/gdb/infrun.c:3825
0xaaaae0789e6f operator()
	../../../repos/binutils-gdb/gdb/infrun.c:3884
0xaaaae078a167 do_target_wait
	../../../repos/binutils-gdb/gdb/infrun.c:3903
0xaaaae078b0af _Z20fetch_inferior_eventv
	../../../repos/binutils-gdb/gdb/infrun.c:4314
0xaaaae076652f _Z22inferior_event_handler19inferior_event_type
	../../../repos/binutils-gdb/gdb/inf-loop.c:41
0xaaaae07dc68b handle_target_event
	../../../repos/binutils-gdb/gdb/linux-nat.c:4206
0xaaaae0e25fbb handle_file_event
	../../../repos/binutils-gdb/gdbsupport/event-loop.cc:573
0xaaaae0e264f3 gdb_wait_for_event
	../../../repos/binutils-gdb/gdbsupport/event-loop.cc:694
0xaaaae0e24f9b _Z16gdb_do_one_eventi
	../../../repos/binutils-gdb/gdbsupport/event-loop.cc:217
0xaaaae080f033 start_event_loop
	../../../repos/binutils-gdb/gdb/main.c:411
0xaaaae080f1b7 captured_command_loop
	../../../repos/binutils-gdb/gdb/main.c:475
0xaaaae0810b97 captured_main
	../../../repos/binutils-gdb/gdb/main.c:1318
0xaaaae0810c1b _Z8gdb_mainP18captured_main_args
	../../../repos/binutils-gdb/gdb/main.c:1337
0xaaaae0338453 main
	../../../repos/binutils-gdb/gdb/gdb.c:32
---------------------
../../../repos/binutils-gdb/gdb/thread.c:86: internal-error: inferior_thread: Assertion `current_thread_ != nullptr' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n)

We also see failures across the testsuite if the tests get executed on a target
that has native support for the pointer authentication feature. But
gdb.base/break.exp and gdb.base/access-mem-running.exp are two examples of
tests that run into errors and internal errors.

This issue started after commit d88cb738e6a7a7179dfaff8af78d69250c852af1, which
enabled more broad use of pointer authentication masks to remove non-address
bits of pointers, but wasn't immediately detected because systems with native
support for pointer authentication are not that common yet.

The above crash happens because gdb is in the middle of handling an event,
and do_target_wait_1 calls switch_to_inferior_no_thread, nullifying the
current thread.  This means a call to inferior_thread () will assert, and
attempting to call get_current_regcache () will also call inferior_thread (),
resulting in an assertion as well.

target_has_registers was one function that seemed useful for detecting these
types of situation where we don't have a register cache. The problem with that
is the inconsistent state of inferior_ptid, which is used by
target_has_registers.

Despite the call to switch_to_no_thread in switch_to_inferior_no_thread from
do_target_wait_1 in the backtrace above clearing inferior_ptid, the call to
ps_xfer_memory sets inferior_ptid momentarily before reading memory:

static ps_err_e
ps_xfer_memory (const struct ps_prochandle *ph, psaddr_t addr,
                gdb_byte *buf, size_t len, int write)
{
  scoped_restore_current_inferior restore_inferior;
  set_current_inferior (ph->thread->inf);

  scoped_restore_current_program_space restore_current_progspace;
  set_current_program_space (ph->thread->inf->pspace);

  scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
  inferior_ptid = ph->thread->ptid;

  CORE_ADDR core_addr = ps_addr_to_core_addr (addr);

  int ret;
  if (write)
    ret = target_write_memory (core_addr, buf, len);
  else
    ret = target_read_memory (core_addr, buf, len);
  return (ret == 0 ? PS_OK : PS_ERR);
}

Maybe this shouldn't happen, or maybe it is just an unfortunate state to be
in. But this prevents the use of target_has_registers to guard against the
lack of registers, since, although current_thread_ is still nullptr,
inferior_ptid is valid and is not null_ptid.

There is another crash scenario after we kill a previously active inferior, in
which case the gdbarch will still say we support pointer authentication but we
will also have no current thread (inferior_thread () will assert etc).

If the target has support for pointer authentication, gdb needs to use
a couple (or 4, for bare-metal) mask registers to mask off some bits of
pointers, and for that it needs to access the registers.

At some points, like the one from the backtrace above, there is no active
thread/current regcache because gdb is in the middle of doing event handling
and switching between threads.

Simon suggested the use of inferior_ptid to fetch the register cache, as
opposed to relying on the current register cache.  Though we need to make sure
inferior_ptid is valid (not null_ptid), I think this works nicely.

With inferior_ptid, we can do safety checks along the way, making sure we have
a thread to fetch a register cache from and checking if the thread is actually
stopped or running.

The following patch implements this idea with safety checks to make sure we
don't run into assertions or errors.  If any of the checks fail, we fallback to
using a default mask to remove non-address bits of a pointer.

I discussed with Pedro the possibility of caching the mask register values
(which are per-process and can change mid-execution), but there isn't a good
spot to cache those values. Besides, the mask registers can change constantly
for bare-metal debugging when switching between exception levels.

In some cases, it is just not possible to get access to these mask registers,
like the case where threads are running. In those cases, using a default mask
to remove the non-address bits should be enough.

This can happen when we let threads run in the background and then we attempt
to access a memory address (now that gdb is capable of reading memory even
with threads running).  Thus gdb will attempt to remove non-address bits
of that memory access, will attempt to access registers, running into errors.

Regression-tested on aarch64-linux Ubuntu 20.04.
---
 gdb/aarch64-tdep.c | 88 ++++++++++++++++++++++++++++++++--------------
 1 file changed, 62 insertions(+), 26 deletions(-)

diff --git a/gdb/aarch64-tdep.c b/gdb/aarch64-tdep.c
index 5b1b9921f87..d11d8320799 100644
--- a/gdb/aarch64-tdep.c
+++ b/gdb/aarch64-tdep.c
@@ -55,6 +55,9 @@
 #include <algorithm>
 #include <unordered_map>
 
+/* For inferior_ptid and current_inferior ().  */
+#include "inferior.h"
+
 /* A Homogeneous Floating-Point or Short-Vector Aggregate may have at most
    four members.  */
 #define HA_MAX_NUM_FLDS		4
@@ -3556,40 +3559,73 @@ aarch64_stack_frame_destroyed_p (struct gdbarch *gdbarch, CORE_ADDR pc)
 static CORE_ADDR
 aarch64_remove_non_address_bits (struct gdbarch *gdbarch, CORE_ADDR pointer)
 {
-  aarch64_gdbarch_tdep *tdep = gdbarch_tdep<aarch64_gdbarch_tdep> (gdbarch);
-
   /* By default, we assume TBI and discard the top 8 bits plus the VA range
-     select bit (55).  */
+     select bit (55).  Below we try to fetch information about pointer
+     authentication masks in order to make non-address removal more
+     precise.  */
   CORE_ADDR mask = AARCH64_TOP_BITS_MASK;
 
-  if (tdep->has_pauth ())
+  /* Check if we have an inferior first.  If not, just use the default
+     mask.
+
+     We use the inferior_ptid here because the pointer authentication masks
+     should be the same across threads of a process.  Since we may not have
+     access to the current thread (gdb may have switched to no inferiors
+     momentarily), we use the inferior ptid.  */
+  if (inferior_ptid != null_ptid)
     {
-      /* Fetch the PAC masks.  These masks are per-process, so we can just
-	 fetch data from whatever thread we have at the moment.
-
-	 Also, we have both a code mask and a data mask.  For now they are the
-	 same, but this may change in the future.  */
-      struct regcache *regs = get_current_regcache ();
-      CORE_ADDR cmask, dmask;
-      int dmask_regnum = AARCH64_PAUTH_DMASK_REGNUM (tdep->pauth_reg_base);
-      int cmask_regnum = AARCH64_PAUTH_CMASK_REGNUM (tdep->pauth_reg_base);
-
-      /* If we have a kernel address and we have kernel-mode address mask
-	 registers, use those instead.  */
-      if (tdep->pauth_reg_count > 2
-	  && pointer & VA_RANGE_SELECT_BIT_MASK)
+      /* If we do have an inferior, attempt to fetch its thread's thread_info
+	 struct.  */
+      thread_info *thread
+	= find_thread_ptid (current_inferior ()->process_target (),
+			    inferior_ptid);
+
+      /* If the thread is running, we will not be able to fetch the mask
+	 registers.  */
+      if (thread != nullptr && thread->state != THREAD_RUNNING)
 	{
-	  dmask_regnum = AARCH64_PAUTH_DMASK_HIGH_REGNUM (tdep->pauth_reg_base);
-	  cmask_regnum = AARCH64_PAUTH_CMASK_HIGH_REGNUM (tdep->pauth_reg_base);
-	}
+	  /* Otherwise, fetch the register cache and the masks.  */
+	  struct regcache *regs
+	    = get_thread_regcache (current_inferior ()->process_target (),
+				   inferior_ptid);
+
+	  /* Use the gdbarch from the register cache to check for pointer
+	     authentication support, as it matches the features found in
+	     that particular thread.  */
+	  aarch64_gdbarch_tdep *tdep
+	    = gdbarch_tdep<aarch64_gdbarch_tdep> (regs->arch ());
 
-      if (regs->cooked_read (dmask_regnum, &dmask) != REG_VALID)
-	dmask = mask;
+	  /* Is there pointer authentication support?  */
+	  if (tdep->has_pauth ())
+	    {
+	      CORE_ADDR cmask, dmask;
+	      int dmask_regnum
+		= AARCH64_PAUTH_DMASK_REGNUM (tdep->pauth_reg_base);
+	      int cmask_regnum
+		= AARCH64_PAUTH_CMASK_REGNUM (tdep->pauth_reg_base);
+
+	      /* If we have a kernel address and we have kernel-mode address
+		 mask registers, use those instead.  */
+	      if (tdep->pauth_reg_count > 2
+		  && pointer & VA_RANGE_SELECT_BIT_MASK)
+		{
+		  dmask_regnum
+		    = AARCH64_PAUTH_DMASK_HIGH_REGNUM (tdep->pauth_reg_base);
+		  cmask_regnum
+		    = AARCH64_PAUTH_CMASK_HIGH_REGNUM (tdep->pauth_reg_base);
+		}
 
-      if (regs->cooked_read (cmask_regnum, &cmask) != REG_VALID)
-	cmask = mask;
+	      /* We have both a code mask and a data mask.  For now they are
+		 the same, but this may change in the future.  */
+	      if (regs->cooked_read (dmask_regnum, &dmask) != REG_VALID)
+		dmask = mask;
 
-      mask |= aarch64_mask_from_pac_registers (cmask, dmask);
+	      if (regs->cooked_read (cmask_regnum, &cmask) != REG_VALID)
+		cmask = mask;
+
+	      mask |= aarch64_mask_from_pac_registers (cmask, dmask);
+	    }
+	}
     }
 
   return aarch64_remove_top_bits (pointer, mask);
-- 
2.25.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH, v2] aarch64: Check for valid inferior thread/regcache before reading pauth registers
  2023-03-23 10:56 ` [PATCH, v2] " Luis Machado
@ 2023-03-23 18:25   ` Kevin Buettner
  2023-03-23 18:27     ` Luis Machado
  2023-03-24  8:19   ` Luis Machado
  1 sibling, 1 reply; 17+ messages in thread
From: Kevin Buettner @ 2023-03-23 18:25 UTC (permalink / raw)
  To: Luis Machado via Gdb-patches; +Cc: Luis Machado

On Thu, 23 Mar 2023 10:56:19 +0000
Luis Machado via Gdb-patches <gdb-patches@sourceware.org> wrote:

> Changes in v2:
> 
> - Dropped helper functions and used inferior_ptid/current_inferior ()
> instead.  Changes are now aarch64-specific.

A bug which your patch fixes had been reported in Fedora bugzilla; see:

https://bugzilla.redhat.com/show_bug.cgi?id=2177655

I made a scratch build using your v1 patch and one user reported the
bug fixed.

I've looked over your v2 patch and it looks good to me.

Kevin


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH, v2] aarch64: Check for valid inferior thread/regcache before reading pauth registers
  2023-03-23 18:25   ` Kevin Buettner
@ 2023-03-23 18:27     ` Luis Machado
  2023-03-23 18:32       ` Luis Machado
  0 siblings, 1 reply; 17+ messages in thread
From: Luis Machado @ 2023-03-23 18:27 UTC (permalink / raw)
  To: Kevin Buettner, Luis Machado via Gdb-patches

On 3/23/23 18:25, Kevin Buettner wrote:
> On Thu, 23 Mar 2023 10:56:19 +0000
> Luis Machado via Gdb-patches <gdb-patches@sourceware.org> wrote:
> 
>> Changes in v2:
>>
>> - Dropped helper functions and used inferior_ptid/current_inferior ()
>> instead.  Changes are now aarch64-specific.
> 
> A bug which your patch fixes had been reported in Fedora bugzilla; see:
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=2177655
> 
> I made a scratch build using your v1 patch and one user reported the
> bug fixed.
> 
> I've looked over your v2 patch and it looks good to me.

Thanks for testing. Andrew B. had pinged me about a report on his end.

Hopefully we can get this sorted out soon, as it is quite annoying.

> 
> Kevin
> 


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH, v2] aarch64: Check for valid inferior thread/regcache before reading pauth registers
  2023-03-23 18:27     ` Luis Machado
@ 2023-03-23 18:32       ` Luis Machado
  2023-03-23 18:39         ` Kevin Buettner
  0 siblings, 1 reply; 17+ messages in thread
From: Luis Machado @ 2023-03-23 18:32 UTC (permalink / raw)
  To: Kevin Buettner, Luis Machado via Gdb-patches

On 3/23/23 18:27, Luis Machado wrote:
> On 3/23/23 18:25, Kevin Buettner wrote:
>> On Thu, 23 Mar 2023 10:56:19 +0000
>> Luis Machado via Gdb-patches <gdb-patches@sourceware.org> wrote:
>>
>>> Changes in v2:
>>>
>>> - Dropped helper functions and used inferior_ptid/current_inferior ()
>>> instead.  Changes are now aarch64-specific.
>>
>> A bug which your patch fixes had been reported in Fedora bugzilla; see:
>>
>> https://bugzilla.redhat.com/show_bug.cgi?id=2177655
>>
>> I made a scratch build using your v1 patch and one user reported the
>> bug fixed.
>>
>> I've looked over your v2 patch and it looks good to me.
> 
> Thanks for testing. Andrew B. had pinged me about a report on his end.
> 
> Hopefully we can get this sorted out soon, as it is quite annoying.
> 
>>
>> Kevin
>>
> 

Worth mentioning that I plan to backport this fix to GDB 13.1, which might require a
slightly different patch because in GDB 13 this code is linux-specific instead of
OS-independent.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH, v2] aarch64: Check for valid inferior thread/regcache before reading pauth registers
  2023-03-23 18:32       ` Luis Machado
@ 2023-03-23 18:39         ` Kevin Buettner
  2023-03-24  6:28           ` Luis Machado
  0 siblings, 1 reply; 17+ messages in thread
From: Kevin Buettner @ 2023-03-23 18:39 UTC (permalink / raw)
  To: Luis Machado; +Cc: gdb-patches

On Thu, 23 Mar 2023 18:32:31 +0000
Luis Machado <luis.machado@arm.com> wrote:

> On 3/23/23 18:27, Luis Machado wrote:
> > On 3/23/23 18:25, Kevin Buettner wrote:  
> >> On Thu, 23 Mar 2023 10:56:19 +0000
> >> Luis Machado via Gdb-patches <gdb-patches@sourceware.org> wrote:
> >>  
> >>> Changes in v2:
> >>>
> >>> - Dropped helper functions and used inferior_ptid/current_inferior ()
> >>> instead.__ Changes are now aarch64-specific.  
> >>
> >> A bug which your patch fixes had been reported in Fedora bugzilla; see:
> >>
> >> https://bugzilla.redhat.com/show_bug.cgi?id=2177655
> >>
> >> I made a scratch build using your v1 patch and one user reported the
> >> bug fixed.
> >>
> >> I've looked over your v2 patch and it looks good to me.  
> > 
> > Thanks for testing. Andrew B. had pinged me about a report on his end.
> > 
> > Hopefully we can get this sorted out soon, as it is quite annoying.
> >   
> >>
> >> Kevin
> >>  
> >   
> 
> Worth mentioning that I plan to backport this fix to GDB 13.1, which might require a
> slightly different patch because in GDB 13 this code is linux-specific instead of
> OS-independent.

In order to apply your v1 patch to (Fedora) GDB 13.1, I first
cherry-picked commit 6d0020873de.  I think it might make sense
to do the same for the 13.1 branch.

Kevin


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH, v2] aarch64: Check for valid inferior thread/regcache before reading pauth registers
  2023-03-23 18:39         ` Kevin Buettner
@ 2023-03-24  6:28           ` Luis Machado
  0 siblings, 0 replies; 17+ messages in thread
From: Luis Machado @ 2023-03-24  6:28 UTC (permalink / raw)
  To: Kevin Buettner; +Cc: gdb-patches

On 3/23/23 18:39, Kevin Buettner wrote:
> On Thu, 23 Mar 2023 18:32:31 +0000
> Luis Machado <luis.machado@arm.com> wrote:
> 
>> On 3/23/23 18:27, Luis Machado wrote:
>>> On 3/23/23 18:25, Kevin Buettner wrote:
>>>> On Thu, 23 Mar 2023 10:56:19 +0000
>>>> Luis Machado via Gdb-patches <gdb-patches@sourceware.org> wrote:
>>>>   
>>>>> Changes in v2:
>>>>>
>>>>> - Dropped helper functions and used inferior_ptid/current_inferior ()
>>>>> instead.__ Changes are now aarch64-specific.
>>>>
>>>> A bug which your patch fixes had been reported in Fedora bugzilla; see:
>>>>
>>>> https://bugzilla.redhat.com/show_bug.cgi?id=2177655
>>>>
>>>> I made a scratch build using your v1 patch and one user reported the
>>>> bug fixed.
>>>>
>>>> I've looked over your v2 patch and it looks good to me.
>>>
>>> Thanks for testing. Andrew B. had pinged me about a report on his end.
>>>
>>> Hopefully we can get this sorted out soon, as it is quite annoying.
>>>    
>>>>
>>>> Kevin
>>>>   
>>>    
>>
>> Worth mentioning that I plan to backport this fix to GDB 13.1, which might require a
>> slightly different patch because in GDB 13 this code is linux-specific instead of
>> OS-independent.
> 
> In order to apply your v1 patch to (Fedora) GDB 13.1, I first
> cherry-picked commit 6d0020873de.  I think it might make sense
> to do the same for the 13.1 branch.

Though that works, it will enable pauth support for bare metal in GDB 13.1. The change could be
backported without that particular feature.

> 
> Kevin
> 


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH, v2] aarch64: Check for valid inferior thread/regcache before reading pauth registers
  2023-03-23 10:56 ` [PATCH, v2] " Luis Machado
  2023-03-23 18:25   ` Kevin Buettner
@ 2023-03-24  8:19   ` Luis Machado
  2023-03-24 13:42     ` Luis Machado
  1 sibling, 1 reply; 17+ messages in thread
From: Luis Machado @ 2023-03-24  8:19 UTC (permalink / raw)
  To: gdb-patches, aburgess, simon.marchi, pedro; +Cc: marcan

Hi,


On 3/23/23 10:56, Luis Machado via Gdb-patches wrote:
> Changes in v2:
> 
> - Dropped helper functions and used inferior_ptid/current_inferior ()
> instead.  Changes are now aarch64-specific.
> 
> There were reports of gdb throwing internal errors when calling
> inferior_thread ()/get_current_regcache () on a system with
> Pointer Authentication enabled.
> 
> In such cases, gdb produces the following backtrace:
> 
> ../../../repos/binutils-gdb/gdb/thread.c:86: internal-error: inferior_thread: Assertion `current_thread_ != nullptr' failed.
> A problem internal to GDB has been detected,
> further debugging may prove unreliable.
> ----- Backtrace -----
> 0xaaaae04a571f gdb_internal_backtrace_1
> 	../../../repos/binutils-gdb/gdb/bt-utils.c:122
> 0xaaaae04a57f3 _Z22gdb_internal_backtracev
> 	../../../repos/binutils-gdb/gdb/bt-utils.c:168
> 0xaaaae0b52ccf internal_vproblem
> 	../../../repos/binutils-gdb/gdb/utils.c:401
> 0xaaaae0b5310b _Z15internal_verrorPKciS0_St9__va_list
> 	../../../repos/binutils-gdb/gdb/utils.c:481
> 0xaaaae0e24b8f _Z18internal_error_locPKciS0_z
> 	../../../repos/binutils-gdb/gdbsupport/errors.cc:58
> 0xaaaae0a88983 _Z15inferior_threadv
> 	../../../repos/binutils-gdb/gdb/thread.c:86
> 0xaaaae0956c87 _Z20get_current_regcachev
> 	../../../repos/binutils-gdb/gdb/regcache.c:428
> 0xaaaae035223f aarch64_remove_non_address_bits
> 	../../../repos/binutils-gdb/gdb/aarch64-tdep.c:3572
> 0xaaaae03e8abb _Z31gdbarch_remove_non_address_bitsP7gdbarchm
> 	../../../repos/binutils-gdb/gdb/gdbarch.c:3109
> 0xaaaae0a692d7 memory_xfer_partial
> 	../../../repos/binutils-gdb/gdb/target.c:1620
> 0xaaaae0a695e3 _Z19target_xfer_partialP10target_ops13target_objectPKcPhPKhmmPm
> 	../../../repos/binutils-gdb/gdb/target.c:1684
> 0xaaaae0a69e9f target_read_partial
> 	../../../repos/binutils-gdb/gdb/target.c:1937
> 0xaaaae0a69fdf _Z11target_readP10target_ops13target_objectPKcPhml
> 	../../../repos/binutils-gdb/gdb/target.c:1977
> 0xaaaae0a69937 _Z18target_read_memorymPhl
> 	../../../repos/binutils-gdb/gdb/target.c:1773
> 0xaaaae08be523 ps_xfer_memory
> 	../../../repos/binutils-gdb/gdb/proc-service.c:90
> 0xaaaae08be6db ps_pdread
> 	../../../repos/binutils-gdb/gdb/proc-service.c:124
> 0x40001ed7c3b3 _td_fetch_value
> 	/build/glibc-RIFKjK/glibc-2.31/nptl_db/fetch-value.c:115
> 0x40001ed791ef td_ta_map_lwp2thr
> 	/build/glibc-RIFKjK/glibc-2.31/nptl_db/td_ta_map_lwp2thr.c:194
> 0xaaaae07f4473 thread_from_lwp
> 	../../../repos/binutils-gdb/gdb/linux-thread-db.c:413
> 0xaaaae07f6d6f _ZN16thread_db_target4waitE6ptid_tP17target_waitstatus10enum_flagsI16target_wait_flagE
> 	../../../repos/binutils-gdb/gdb/linux-thread-db.c:1420
> 0xaaaae0a6b33b _Z11target_wait6ptid_tP17target_waitstatus10enum_flagsI16target_wait_flagE
> 	../../../repos/binutils-gdb/gdb/target.c:2586
> 0xaaaae0789cf7 do_target_wait_1
> 	../../../repos/binutils-gdb/gdb/infrun.c:3825
> 0xaaaae0789e6f operator()
> 	../../../repos/binutils-gdb/gdb/infrun.c:3884
> 0xaaaae078a167 do_target_wait
> 	../../../repos/binutils-gdb/gdb/infrun.c:3903
> 0xaaaae078b0af _Z20fetch_inferior_eventv
> 	../../../repos/binutils-gdb/gdb/infrun.c:4314
> 0xaaaae076652f _Z22inferior_event_handler19inferior_event_type
> 	../../../repos/binutils-gdb/gdb/inf-loop.c:41
> 0xaaaae07dc68b handle_target_event
> 	../../../repos/binutils-gdb/gdb/linux-nat.c:4206
> 0xaaaae0e25fbb handle_file_event
> 	../../../repos/binutils-gdb/gdbsupport/event-loop.cc:573
> 0xaaaae0e264f3 gdb_wait_for_event
> 	../../../repos/binutils-gdb/gdbsupport/event-loop.cc:694
> 0xaaaae0e24f9b _Z16gdb_do_one_eventi
> 	../../../repos/binutils-gdb/gdbsupport/event-loop.cc:217
> 0xaaaae080f033 start_event_loop
> 	../../../repos/binutils-gdb/gdb/main.c:411
> 0xaaaae080f1b7 captured_command_loop
> 	../../../repos/binutils-gdb/gdb/main.c:475
> 0xaaaae0810b97 captured_main
> 	../../../repos/binutils-gdb/gdb/main.c:1318
> 0xaaaae0810c1b _Z8gdb_mainP18captured_main_args
> 	../../../repos/binutils-gdb/gdb/main.c:1337
> 0xaaaae0338453 main
> 	../../../repos/binutils-gdb/gdb/gdb.c:32
> ---------------------
> ../../../repos/binutils-gdb/gdb/thread.c:86: internal-error: inferior_thread: Assertion `current_thread_ != nullptr' failed.
> A problem internal to GDB has been detected,
> further debugging may prove unreliable.
> Quit this debugging session? (y or n)
> 
> We also see failures across the testsuite if the tests get executed on a target
> that has native support for the pointer authentication feature. But
> gdb.base/break.exp and gdb.base/access-mem-running.exp are two examples of
> tests that run into errors and internal errors.
> 
> This issue started after commit d88cb738e6a7a7179dfaff8af78d69250c852af1, which
> enabled more broad use of pointer authentication masks to remove non-address
> bits of pointers, but wasn't immediately detected because systems with native
> support for pointer authentication are not that common yet.
> 
> The above crash happens because gdb is in the middle of handling an event,
> and do_target_wait_1 calls switch_to_inferior_no_thread, nullifying the
> current thread.  This means a call to inferior_thread () will assert, and
> attempting to call get_current_regcache () will also call inferior_thread (),
> resulting in an assertion as well.
> 
> target_has_registers was one function that seemed useful for detecting these
> types of situation where we don't have a register cache. The problem with that
> is the inconsistent state of inferior_ptid, which is used by
> target_has_registers.
> 
> Despite the call to switch_to_no_thread in switch_to_inferior_no_thread from
> do_target_wait_1 in the backtrace above clearing inferior_ptid, the call to
> ps_xfer_memory sets inferior_ptid momentarily before reading memory:
> 
> static ps_err_e
> ps_xfer_memory (const struct ps_prochandle *ph, psaddr_t addr,
>                  gdb_byte *buf, size_t len, int write)
> {
>    scoped_restore_current_inferior restore_inferior;
>    set_current_inferior (ph->thread->inf);
> 
>    scoped_restore_current_program_space restore_current_progspace;
>    set_current_program_space (ph->thread->inf->pspace);
> 
>    scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
>    inferior_ptid = ph->thread->ptid;
> 
>    CORE_ADDR core_addr = ps_addr_to_core_addr (addr);
> 
>    int ret;
>    if (write)
>      ret = target_write_memory (core_addr, buf, len);
>    else
>      ret = target_read_memory (core_addr, buf, len);
>    return (ret == 0 ? PS_OK : PS_ERR);
> }
> 
> Maybe this shouldn't happen, or maybe it is just an unfortunate state to be
> in. But this prevents the use of target_has_registers to guard against the
> lack of registers, since, although current_thread_ is still nullptr,
> inferior_ptid is valid and is not null_ptid.
> 
> There is another crash scenario after we kill a previously active inferior, in
> which case the gdbarch will still say we support pointer authentication but we
> will also have no current thread (inferior_thread () will assert etc).
> 
> If the target has support for pointer authentication, gdb needs to use
> a couple (or 4, for bare-metal) mask registers to mask off some bits of
> pointers, and for that it needs to access the registers.
> 
> At some points, like the one from the backtrace above, there is no active
> thread/current regcache because gdb is in the middle of doing event handling
> and switching between threads.
> 
> Simon suggested the use of inferior_ptid to fetch the register cache, as
> opposed to relying on the current register cache.  Though we need to make sure
> inferior_ptid is valid (not null_ptid), I think this works nicely.
> 
> With inferior_ptid, we can do safety checks along the way, making sure we have
> a thread to fetch a register cache from and checking if the thread is actually
> stopped or running.
> 
> The following patch implements this idea with safety checks to make sure we
> don't run into assertions or errors.  If any of the checks fail, we fallback to
> using a default mask to remove non-address bits of a pointer.
> 
> I discussed with Pedro the possibility of caching the mask register values
> (which are per-process and can change mid-execution), but there isn't a good
> spot to cache those values. Besides, the mask registers can change constantly
> for bare-metal debugging when switching between exception levels.
> 
> In some cases, it is just not possible to get access to these mask registers,
> like the case where threads are running. In those cases, using a default mask
> to remove the non-address bits should be enough.
> 
> This can happen when we let threads run in the background and then we attempt
> to access a memory address (now that gdb is capable of reading memory even
> with threads running).  Thus gdb will attempt to remove non-address bits
> of that memory access, will attempt to access registers, running into errors.
> 
> Regression-tested on aarch64-linux Ubuntu 20.04.
> ---
>   gdb/aarch64-tdep.c | 88 ++++++++++++++++++++++++++++++++--------------
>   1 file changed, 62 insertions(+), 26 deletions(-)
> 
> diff --git a/gdb/aarch64-tdep.c b/gdb/aarch64-tdep.c
> index 5b1b9921f87..d11d8320799 100644
> --- a/gdb/aarch64-tdep.c
> +++ b/gdb/aarch64-tdep.c
> @@ -55,6 +55,9 @@
>   #include <algorithm>
>   #include <unordered_map>
>   
> +/* For inferior_ptid and current_inferior ().  */
> +#include "inferior.h"
> +
>   /* A Homogeneous Floating-Point or Short-Vector Aggregate may have at most
>      four members.  */
>   #define HA_MAX_NUM_FLDS		4
> @@ -3556,40 +3559,73 @@ aarch64_stack_frame_destroyed_p (struct gdbarch *gdbarch, CORE_ADDR pc)
>   static CORE_ADDR
>   aarch64_remove_non_address_bits (struct gdbarch *gdbarch, CORE_ADDR pointer)
>   {
> -  aarch64_gdbarch_tdep *tdep = gdbarch_tdep<aarch64_gdbarch_tdep> (gdbarch);
> -
>     /* By default, we assume TBI and discard the top 8 bits plus the VA range
> -     select bit (55).  */
> +     select bit (55).  Below we try to fetch information about pointer
> +     authentication masks in order to make non-address removal more
> +     precise.  */
>     CORE_ADDR mask = AARCH64_TOP_BITS_MASK;
>   
> -  if (tdep->has_pauth ())
> +  /* Check if we have an inferior first.  If not, just use the default
> +     mask.
> +
> +     We use the inferior_ptid here because the pointer authentication masks
> +     should be the same across threads of a process.  Since we may not have
> +     access to the current thread (gdb may have switched to no inferiors
> +     momentarily), we use the inferior ptid.  */
> +  if (inferior_ptid != null_ptid)
>       {
> -      /* Fetch the PAC masks.  These masks are per-process, so we can just
> -	 fetch data from whatever thread we have at the moment.
> -
> -	 Also, we have both a code mask and a data mask.  For now they are the
> -	 same, but this may change in the future.  */
> -      struct regcache *regs = get_current_regcache ();
> -      CORE_ADDR cmask, dmask;
> -      int dmask_regnum = AARCH64_PAUTH_DMASK_REGNUM (tdep->pauth_reg_base);
> -      int cmask_regnum = AARCH64_PAUTH_CMASK_REGNUM (tdep->pauth_reg_base);
> -
> -      /* If we have a kernel address and we have kernel-mode address mask
> -	 registers, use those instead.  */
> -      if (tdep->pauth_reg_count > 2
> -	  && pointer & VA_RANGE_SELECT_BIT_MASK)
> +      /* If we do have an inferior, attempt to fetch its thread's thread_info
> +	 struct.  */
> +      thread_info *thread
> +	= find_thread_ptid (current_inferior ()->process_target (),
> +			    inferior_ptid);
> +
> +      /* If the thread is running, we will not be able to fetch the mask
> +	 registers.  */
> +      if (thread != nullptr && thread->state != THREAD_RUNNING)
>   	{
> -	  dmask_regnum = AARCH64_PAUTH_DMASK_HIGH_REGNUM (tdep->pauth_reg_base);
> -	  cmask_regnum = AARCH64_PAUTH_CMASK_HIGH_REGNUM (tdep->pauth_reg_base);
> -	}
> +	  /* Otherwise, fetch the register cache and the masks.  */
> +	  struct regcache *regs
> +	    = get_thread_regcache (current_inferior ()->process_target (),
> +				   inferior_ptid);
> +
> +	  /* Use the gdbarch from the register cache to check for pointer
> +	     authentication support, as it matches the features found in
> +	     that particular thread.  */
> +	  aarch64_gdbarch_tdep *tdep
> +	    = gdbarch_tdep<aarch64_gdbarch_tdep> (regs->arch ());
>   
> -      if (regs->cooked_read (dmask_regnum, &dmask) != REG_VALID)
> -	dmask = mask;
> +	  /* Is there pointer authentication support?  */
> +	  if (tdep->has_pauth ())
> +	    {
> +	      CORE_ADDR cmask, dmask;
> +	      int dmask_regnum
> +		= AARCH64_PAUTH_DMASK_REGNUM (tdep->pauth_reg_base);
> +	      int cmask_regnum
> +		= AARCH64_PAUTH_CMASK_REGNUM (tdep->pauth_reg_base);
> +
> +	      /* If we have a kernel address and we have kernel-mode address
> +		 mask registers, use those instead.  */
> +	      if (tdep->pauth_reg_count > 2
> +		  && pointer & VA_RANGE_SELECT_BIT_MASK)
> +		{
> +		  dmask_regnum
> +		    = AARCH64_PAUTH_DMASK_HIGH_REGNUM (tdep->pauth_reg_base);
> +		  cmask_regnum
> +		    = AARCH64_PAUTH_CMASK_HIGH_REGNUM (tdep->pauth_reg_base);
> +		}
>   
> -      if (regs->cooked_read (cmask_regnum, &cmask) != REG_VALID)
> -	cmask = mask;
> +	      /* We have both a code mask and a data mask.  For now they are
> +		 the same, but this may change in the future.  */
> +	      if (regs->cooked_read (dmask_regnum, &dmask) != REG_VALID)
> +		dmask = mask;
>   
> -      mask |= aarch64_mask_from_pac_registers (cmask, dmask);
> +	      if (regs->cooked_read (cmask_regnum, &cmask) != REG_VALID)
> +		cmask = mask;
> +
> +	      mask |= aarch64_mask_from_pac_registers (cmask, dmask);
> +	    }
> +	}
>       }
>   
>     return aarch64_remove_top_bits (pointer, mask);


Unless there are objections to the strategy used in the patch, I'm planning on pushing this later today and
also pushing a backport to gdb 13.1 (where this code is linux-specific).

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH, v2] aarch64: Check for valid inferior thread/regcache before reading pauth registers
  2023-03-24  8:19   ` Luis Machado
@ 2023-03-24 13:42     ` Luis Machado
  0 siblings, 0 replies; 17+ messages in thread
From: Luis Machado @ 2023-03-24 13:42 UTC (permalink / raw)
  To: gdb-patches, aburgess, simon.marchi, pedro; +Cc: marcan

On 3/24/23 08:19, Luis Machado via Gdb-patches wrote:
> Hi,
> 
> 
> On 3/23/23 10:56, Luis Machado via Gdb-patches wrote:
>> Changes in v2:
>>
>> - Dropped helper functions and used inferior_ptid/current_inferior ()
>> instead.  Changes are now aarch64-specific.
>>
>> There were reports of gdb throwing internal errors when calling
>> inferior_thread ()/get_current_regcache () on a system with
>> Pointer Authentication enabled.
>>
>> In such cases, gdb produces the following backtrace:
>>
>> ../../../repos/binutils-gdb/gdb/thread.c:86: internal-error: inferior_thread: Assertion `current_thread_ != nullptr' failed.
>> A problem internal to GDB has been detected,
>> further debugging may prove unreliable.
>> ----- Backtrace -----
>> 0xaaaae04a571f gdb_internal_backtrace_1
>>     ../../../repos/binutils-gdb/gdb/bt-utils.c:122
>> 0xaaaae04a57f3 _Z22gdb_internal_backtracev
>>     ../../../repos/binutils-gdb/gdb/bt-utils.c:168
>> 0xaaaae0b52ccf internal_vproblem
>>     ../../../repos/binutils-gdb/gdb/utils.c:401
>> 0xaaaae0b5310b _Z15internal_verrorPKciS0_St9__va_list
>>     ../../../repos/binutils-gdb/gdb/utils.c:481
>> 0xaaaae0e24b8f _Z18internal_error_locPKciS0_z
>>     ../../../repos/binutils-gdb/gdbsupport/errors.cc:58
>> 0xaaaae0a88983 _Z15inferior_threadv
>>     ../../../repos/binutils-gdb/gdb/thread.c:86
>> 0xaaaae0956c87 _Z20get_current_regcachev
>>     ../../../repos/binutils-gdb/gdb/regcache.c:428
>> 0xaaaae035223f aarch64_remove_non_address_bits
>>     ../../../repos/binutils-gdb/gdb/aarch64-tdep.c:3572
>> 0xaaaae03e8abb _Z31gdbarch_remove_non_address_bitsP7gdbarchm
>>     ../../../repos/binutils-gdb/gdb/gdbarch.c:3109
>> 0xaaaae0a692d7 memory_xfer_partial
>>     ../../../repos/binutils-gdb/gdb/target.c:1620
>> 0xaaaae0a695e3 _Z19target_xfer_partialP10target_ops13target_objectPKcPhPKhmmPm
>>     ../../../repos/binutils-gdb/gdb/target.c:1684
>> 0xaaaae0a69e9f target_read_partial
>>     ../../../repos/binutils-gdb/gdb/target.c:1937
>> 0xaaaae0a69fdf _Z11target_readP10target_ops13target_objectPKcPhml
>>     ../../../repos/binutils-gdb/gdb/target.c:1977
>> 0xaaaae0a69937 _Z18target_read_memorymPhl
>>     ../../../repos/binutils-gdb/gdb/target.c:1773
>> 0xaaaae08be523 ps_xfer_memory
>>     ../../../repos/binutils-gdb/gdb/proc-service.c:90
>> 0xaaaae08be6db ps_pdread
>>     ../../../repos/binutils-gdb/gdb/proc-service.c:124
>> 0x40001ed7c3b3 _td_fetch_value
>>     /build/glibc-RIFKjK/glibc-2.31/nptl_db/fetch-value.c:115
>> 0x40001ed791ef td_ta_map_lwp2thr
>>     /build/glibc-RIFKjK/glibc-2.31/nptl_db/td_ta_map_lwp2thr.c:194
>> 0xaaaae07f4473 thread_from_lwp
>>     ../../../repos/binutils-gdb/gdb/linux-thread-db.c:413
>> 0xaaaae07f6d6f _ZN16thread_db_target4waitE6ptid_tP17target_waitstatus10enum_flagsI16target_wait_flagE
>>     ../../../repos/binutils-gdb/gdb/linux-thread-db.c:1420
>> 0xaaaae0a6b33b _Z11target_wait6ptid_tP17target_waitstatus10enum_flagsI16target_wait_flagE
>>     ../../../repos/binutils-gdb/gdb/target.c:2586
>> 0xaaaae0789cf7 do_target_wait_1
>>     ../../../repos/binutils-gdb/gdb/infrun.c:3825
>> 0xaaaae0789e6f operator()
>>     ../../../repos/binutils-gdb/gdb/infrun.c:3884
>> 0xaaaae078a167 do_target_wait
>>     ../../../repos/binutils-gdb/gdb/infrun.c:3903
>> 0xaaaae078b0af _Z20fetch_inferior_eventv
>>     ../../../repos/binutils-gdb/gdb/infrun.c:4314
>> 0xaaaae076652f _Z22inferior_event_handler19inferior_event_type
>>     ../../../repos/binutils-gdb/gdb/inf-loop.c:41
>> 0xaaaae07dc68b handle_target_event
>>     ../../../repos/binutils-gdb/gdb/linux-nat.c:4206
>> 0xaaaae0e25fbb handle_file_event
>>     ../../../repos/binutils-gdb/gdbsupport/event-loop.cc:573
>> 0xaaaae0e264f3 gdb_wait_for_event
>>     ../../../repos/binutils-gdb/gdbsupport/event-loop.cc:694
>> 0xaaaae0e24f9b _Z16gdb_do_one_eventi
>>     ../../../repos/binutils-gdb/gdbsupport/event-loop.cc:217
>> 0xaaaae080f033 start_event_loop
>>     ../../../repos/binutils-gdb/gdb/main.c:411
>> 0xaaaae080f1b7 captured_command_loop
>>     ../../../repos/binutils-gdb/gdb/main.c:475
>> 0xaaaae0810b97 captured_main
>>     ../../../repos/binutils-gdb/gdb/main.c:1318
>> 0xaaaae0810c1b _Z8gdb_mainP18captured_main_args
>>     ../../../repos/binutils-gdb/gdb/main.c:1337
>> 0xaaaae0338453 main
>>     ../../../repos/binutils-gdb/gdb/gdb.c:32
>> ---------------------
>> ../../../repos/binutils-gdb/gdb/thread.c:86: internal-error: inferior_thread: Assertion `current_thread_ != nullptr' failed.
>> A problem internal to GDB has been detected,
>> further debugging may prove unreliable.
>> Quit this debugging session? (y or n)
>>
>> We also see failures across the testsuite if the tests get executed on a target
>> that has native support for the pointer authentication feature. But
>> gdb.base/break.exp and gdb.base/access-mem-running.exp are two examples of
>> tests that run into errors and internal errors.
>>
>> This issue started after commit d88cb738e6a7a7179dfaff8af78d69250c852af1, which
>> enabled more broad use of pointer authentication masks to remove non-address
>> bits of pointers, but wasn't immediately detected because systems with native
>> support for pointer authentication are not that common yet.
>>
>> The above crash happens because gdb is in the middle of handling an event,
>> and do_target_wait_1 calls switch_to_inferior_no_thread, nullifying the
>> current thread.  This means a call to inferior_thread () will assert, and
>> attempting to call get_current_regcache () will also call inferior_thread (),
>> resulting in an assertion as well.
>>
>> target_has_registers was one function that seemed useful for detecting these
>> types of situation where we don't have a register cache. The problem with that
>> is the inconsistent state of inferior_ptid, which is used by
>> target_has_registers.
>>
>> Despite the call to switch_to_no_thread in switch_to_inferior_no_thread from
>> do_target_wait_1 in the backtrace above clearing inferior_ptid, the call to
>> ps_xfer_memory sets inferior_ptid momentarily before reading memory:
>>
>> static ps_err_e
>> ps_xfer_memory (const struct ps_prochandle *ph, psaddr_t addr,
>>                  gdb_byte *buf, size_t len, int write)
>> {
>>    scoped_restore_current_inferior restore_inferior;
>>    set_current_inferior (ph->thread->inf);
>>
>>    scoped_restore_current_program_space restore_current_progspace;
>>    set_current_program_space (ph->thread->inf->pspace);
>>
>>    scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
>>    inferior_ptid = ph->thread->ptid;
>>
>>    CORE_ADDR core_addr = ps_addr_to_core_addr (addr);
>>
>>    int ret;
>>    if (write)
>>      ret = target_write_memory (core_addr, buf, len);
>>    else
>>      ret = target_read_memory (core_addr, buf, len);
>>    return (ret == 0 ? PS_OK : PS_ERR);
>> }
>>
>> Maybe this shouldn't happen, or maybe it is just an unfortunate state to be
>> in. But this prevents the use of target_has_registers to guard against the
>> lack of registers, since, although current_thread_ is still nullptr,
>> inferior_ptid is valid and is not null_ptid.
>>
>> There is another crash scenario after we kill a previously active inferior, in
>> which case the gdbarch will still say we support pointer authentication but we
>> will also have no current thread (inferior_thread () will assert etc).
>>
>> If the target has support for pointer authentication, gdb needs to use
>> a couple (or 4, for bare-metal) mask registers to mask off some bits of
>> pointers, and for that it needs to access the registers.
>>
>> At some points, like the one from the backtrace above, there is no active
>> thread/current regcache because gdb is in the middle of doing event handling
>> and switching between threads.
>>
>> Simon suggested the use of inferior_ptid to fetch the register cache, as
>> opposed to relying on the current register cache.  Though we need to make sure
>> inferior_ptid is valid (not null_ptid), I think this works nicely.
>>
>> With inferior_ptid, we can do safety checks along the way, making sure we have
>> a thread to fetch a register cache from and checking if the thread is actually
>> stopped or running.
>>
>> The following patch implements this idea with safety checks to make sure we
>> don't run into assertions or errors.  If any of the checks fail, we fallback to
>> using a default mask to remove non-address bits of a pointer.
>>
>> I discussed with Pedro the possibility of caching the mask register values
>> (which are per-process and can change mid-execution), but there isn't a good
>> spot to cache those values. Besides, the mask registers can change constantly
>> for bare-metal debugging when switching between exception levels.
>>
>> In some cases, it is just not possible to get access to these mask registers,
>> like the case where threads are running. In those cases, using a default mask
>> to remove the non-address bits should be enough.
>>
>> This can happen when we let threads run in the background and then we attempt
>> to access a memory address (now that gdb is capable of reading memory even
>> with threads running).  Thus gdb will attempt to remove non-address bits
>> of that memory access, will attempt to access registers, running into errors.
>>
>> Regression-tested on aarch64-linux Ubuntu 20.04.
>> ---
>>   gdb/aarch64-tdep.c | 88 ++++++++++++++++++++++++++++++++--------------
>>   1 file changed, 62 insertions(+), 26 deletions(-)
>>
>> diff --git a/gdb/aarch64-tdep.c b/gdb/aarch64-tdep.c
>> index 5b1b9921f87..d11d8320799 100644
>> --- a/gdb/aarch64-tdep.c
>> +++ b/gdb/aarch64-tdep.c
>> @@ -55,6 +55,9 @@
>>   #include <algorithm>
>>   #include <unordered_map>
>> +/* For inferior_ptid and current_inferior ().  */
>> +#include "inferior.h"
>> +
>>   /* A Homogeneous Floating-Point or Short-Vector Aggregate may have at most
>>      four members.  */
>>   #define HA_MAX_NUM_FLDS        4
>> @@ -3556,40 +3559,73 @@ aarch64_stack_frame_destroyed_p (struct gdbarch *gdbarch, CORE_ADDR pc)
>>   static CORE_ADDR
>>   aarch64_remove_non_address_bits (struct gdbarch *gdbarch, CORE_ADDR pointer)
>>   {
>> -  aarch64_gdbarch_tdep *tdep = gdbarch_tdep<aarch64_gdbarch_tdep> (gdbarch);
>> -
>>     /* By default, we assume TBI and discard the top 8 bits plus the VA range
>> -     select bit (55).  */
>> +     select bit (55).  Below we try to fetch information about pointer
>> +     authentication masks in order to make non-address removal more
>> +     precise.  */
>>     CORE_ADDR mask = AARCH64_TOP_BITS_MASK;
>> -  if (tdep->has_pauth ())
>> +  /* Check if we have an inferior first.  If not, just use the default
>> +     mask.
>> +
>> +     We use the inferior_ptid here because the pointer authentication masks
>> +     should be the same across threads of a process.  Since we may not have
>> +     access to the current thread (gdb may have switched to no inferiors
>> +     momentarily), we use the inferior ptid.  */
>> +  if (inferior_ptid != null_ptid)
>>       {
>> -      /* Fetch the PAC masks.  These masks are per-process, so we can just
>> -     fetch data from whatever thread we have at the moment.
>> -
>> -     Also, we have both a code mask and a data mask.  For now they are the
>> -     same, but this may change in the future.  */
>> -      struct regcache *regs = get_current_regcache ();
>> -      CORE_ADDR cmask, dmask;
>> -      int dmask_regnum = AARCH64_PAUTH_DMASK_REGNUM (tdep->pauth_reg_base);
>> -      int cmask_regnum = AARCH64_PAUTH_CMASK_REGNUM (tdep->pauth_reg_base);
>> -
>> -      /* If we have a kernel address and we have kernel-mode address mask
>> -     registers, use those instead.  */
>> -      if (tdep->pauth_reg_count > 2
>> -      && pointer & VA_RANGE_SELECT_BIT_MASK)
>> +      /* If we do have an inferior, attempt to fetch its thread's thread_info
>> +     struct.  */
>> +      thread_info *thread
>> +    = find_thread_ptid (current_inferior ()->process_target (),
>> +                inferior_ptid);
>> +
>> +      /* If the thread is running, we will not be able to fetch the mask
>> +     registers.  */
>> +      if (thread != nullptr && thread->state != THREAD_RUNNING)
>>       {
>> -      dmask_regnum = AARCH64_PAUTH_DMASK_HIGH_REGNUM (tdep->pauth_reg_base);
>> -      cmask_regnum = AARCH64_PAUTH_CMASK_HIGH_REGNUM (tdep->pauth_reg_base);
>> -    }
>> +      /* Otherwise, fetch the register cache and the masks.  */
>> +      struct regcache *regs
>> +        = get_thread_regcache (current_inferior ()->process_target (),
>> +                   inferior_ptid);
>> +
>> +      /* Use the gdbarch from the register cache to check for pointer
>> +         authentication support, as it matches the features found in
>> +         that particular thread.  */
>> +      aarch64_gdbarch_tdep *tdep
>> +        = gdbarch_tdep<aarch64_gdbarch_tdep> (regs->arch ());
>> -      if (regs->cooked_read (dmask_regnum, &dmask) != REG_VALID)
>> -    dmask = mask;
>> +      /* Is there pointer authentication support?  */
>> +      if (tdep->has_pauth ())
>> +        {
>> +          CORE_ADDR cmask, dmask;
>> +          int dmask_regnum
>> +        = AARCH64_PAUTH_DMASK_REGNUM (tdep->pauth_reg_base);
>> +          int cmask_regnum
>> +        = AARCH64_PAUTH_CMASK_REGNUM (tdep->pauth_reg_base);
>> +
>> +          /* If we have a kernel address and we have kernel-mode address
>> +         mask registers, use those instead.  */
>> +          if (tdep->pauth_reg_count > 2
>> +          && pointer & VA_RANGE_SELECT_BIT_MASK)
>> +        {
>> +          dmask_regnum
>> +            = AARCH64_PAUTH_DMASK_HIGH_REGNUM (tdep->pauth_reg_base);
>> +          cmask_regnum
>> +            = AARCH64_PAUTH_CMASK_HIGH_REGNUM (tdep->pauth_reg_base);
>> +        }
>> -      if (regs->cooked_read (cmask_regnum, &cmask) != REG_VALID)
>> -    cmask = mask;
>> +          /* We have both a code mask and a data mask.  For now they are
>> +         the same, but this may change in the future.  */
>> +          if (regs->cooked_read (dmask_regnum, &dmask) != REG_VALID)
>> +        dmask = mask;
>> -      mask |= aarch64_mask_from_pac_registers (cmask, dmask);
>> +          if (regs->cooked_read (cmask_regnum, &cmask) != REG_VALID)
>> +        cmask = mask;
>> +
>> +          mask |= aarch64_mask_from_pac_registers (cmask, dmask);
>> +        }
>> +    }
>>       }
>>     return aarch64_remove_top_bits (pointer, mask);
> 
> 
> Unless there are objections to the strategy used in the patch, I'm planning on pushing this later today and
> also pushing a backport to gdb 13.1 (where this code is linux-specific).

Pushed to master (ef1398987a132769779679fd9ffd353dce840f95) and gdb-13 branch (b3eff3e15576229af9bae026c5c23ee694b90389) now.

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2023-03-24 13:42 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-16 10:39 [PATCH] aarch64: Check for valid inferior thread/regcache before reading pauth registers Luis Machado
2023-03-16 15:49 ` Andrew Burgess
2023-03-16 23:15   ` Luis Machado
2023-03-17 17:15     ` Simon Marchi
2023-03-17 17:25       ` Luis Machado
2023-03-17 17:29         ` Simon Marchi
2023-03-17 17:38           ` Luis Machado
2023-03-20 20:06     ` Andrew Burgess
2023-03-22 10:26       ` Luis Machado
2023-03-23 10:56 ` [PATCH, v2] " Luis Machado
2023-03-23 18:25   ` Kevin Buettner
2023-03-23 18:27     ` Luis Machado
2023-03-23 18:32       ` Luis Machado
2023-03-23 18:39         ` Kevin Buettner
2023-03-24  6:28           ` Luis Machado
2023-03-24  8:19   ` Luis Machado
2023-03-24 13:42     ` Luis Machado

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).