public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
From: Luis Machado <luis.machado@arm.com>
To: Andrew Burgess <aburgess@redhat.com>,
	gdb-patches@sourceware.org, pedro@palves.net
Cc: marcan@marcan.st
Subject: Re: [PATCH] aarch64: Check for valid inferior thread/regcache before reading pauth registers
Date: Thu, 16 Mar 2023 23:15:29 +0000	[thread overview]
Message-ID: <5ce96222-d665-5129-8d65-27c6933a6623@arm.com> (raw)
In-Reply-To: <873564g0h3.fsf@redhat.com>

On 3/16/23 15:49, Andrew Burgess wrote:
> Luis Machado <luis.machado@arm.com> writes:
> 
>> There were reports of gdb throwing internal errors when calling
>> inferior_thread ()/get_current_regcache () on a system with
>> Pointer Authentication enabled.
>>
>> In such cases, gdb produces the following backtrace:
>>
>> ../../../repos/binutils-gdb/gdb/thread.c:86: internal-error: inferior_thread: Assertion `current_thread_ != nullptr' failed.
>> A problem internal to GDB has been detected,
>> further debugging may prove unreliable.
>> ----- Backtrace -----
>> 0xaaaae04a571f gdb_internal_backtrace_1
>> 	../../../repos/binutils-gdb/gdb/bt-utils.c:122
>> 0xaaaae04a57f3 _Z22gdb_internal_backtracev
>> 	../../../repos/binutils-gdb/gdb/bt-utils.c:168
>> 0xaaaae0b52ccf internal_vproblem
>> 	../../../repos/binutils-gdb/gdb/utils.c:401
>> 0xaaaae0b5310b _Z15internal_verrorPKciS0_St9__va_list
>> 	../../../repos/binutils-gdb/gdb/utils.c:481
>> 0xaaaae0e24b8f _Z18internal_error_locPKciS0_z
>> 	../../../repos/binutils-gdb/gdbsupport/errors.cc:58
>> 0xaaaae0a88983 _Z15inferior_threadv
>> 	../../../repos/binutils-gdb/gdb/thread.c:86
>> 0xaaaae0956c87 _Z20get_current_regcachev
>> 	../../../repos/binutils-gdb/gdb/regcache.c:428
>> 0xaaaae035223f aarch64_remove_non_address_bits
>> 	../../../repos/binutils-gdb/gdb/aarch64-tdep.c:3572
>> 0xaaaae03e8abb _Z31gdbarch_remove_non_address_bitsP7gdbarchm
>> 	../../../repos/binutils-gdb/gdb/gdbarch.c:3109
>> 0xaaaae0a692d7 memory_xfer_partial
>> 	../../../repos/binutils-gdb/gdb/target.c:1620
>> 0xaaaae0a695e3 _Z19target_xfer_partialP10target_ops13target_objectPKcPhPKhmmPm
>> 	../../../repos/binutils-gdb/gdb/target.c:1684
>> 0xaaaae0a69e9f target_read_partial
>> 	../../../repos/binutils-gdb/gdb/target.c:1937
>> 0xaaaae0a69fdf _Z11target_readP10target_ops13target_objectPKcPhml
>> 	../../../repos/binutils-gdb/gdb/target.c:1977
>> 0xaaaae0a69937 _Z18target_read_memorymPhl
>> 	../../../repos/binutils-gdb/gdb/target.c:1773
>> 0xaaaae08be523 ps_xfer_memory
>> 	../../../repos/binutils-gdb/gdb/proc-service.c:90
>> 0xaaaae08be6db ps_pdread
>> 	../../../repos/binutils-gdb/gdb/proc-service.c:124
>> 0x40001ed7c3b3 _td_fetch_value
>> 	/build/glibc-RIFKjK/glibc-2.31/nptl_db/fetch-value.c:115
>> 0x40001ed791ef td_ta_map_lwp2thr
>> 	/build/glibc-RIFKjK/glibc-2.31/nptl_db/td_ta_map_lwp2thr.c:194
>> 0xaaaae07f4473 thread_from_lwp
>> 	../../../repos/binutils-gdb/gdb/linux-thread-db.c:413
>> 0xaaaae07f6d6f _ZN16thread_db_target4waitE6ptid_tP17target_waitstatus10enum_flagsI16target_wait_flagE
>> 	../../../repos/binutils-gdb/gdb/linux-thread-db.c:1420
>> 0xaaaae0a6b33b _Z11target_wait6ptid_tP17target_waitstatus10enum_flagsI16target_wait_flagE
>> 	../../../repos/binutils-gdb/gdb/target.c:2586
>> 0xaaaae0789cf7 do_target_wait_1
>> 	../../../repos/binutils-gdb/gdb/infrun.c:3825
>> 0xaaaae0789e6f operator()
>> 	../../../repos/binutils-gdb/gdb/infrun.c:3884
>> 0xaaaae078a167 do_target_wait
>> 	../../../repos/binutils-gdb/gdb/infrun.c:3903
>> 0xaaaae078b0af _Z20fetch_inferior_eventv
>> 	../../../repos/binutils-gdb/gdb/infrun.c:4314
>> 0xaaaae076652f _Z22inferior_event_handler19inferior_event_type
>> 	../../../repos/binutils-gdb/gdb/inf-loop.c:41
>> 0xaaaae07dc68b handle_target_event
>> 	../../../repos/binutils-gdb/gdb/linux-nat.c:4206
>> 0xaaaae0e25fbb handle_file_event
>> 	../../../repos/binutils-gdb/gdbsupport/event-loop.cc:573
>> 0xaaaae0e264f3 gdb_wait_for_event
>> 	../../../repos/binutils-gdb/gdbsupport/event-loop.cc:694
>> 0xaaaae0e24f9b _Z16gdb_do_one_eventi
>> 	../../../repos/binutils-gdb/gdbsupport/event-loop.cc:217
>> 0xaaaae080f033 start_event_loop
>> 	../../../repos/binutils-gdb/gdb/main.c:411
>> 0xaaaae080f1b7 captured_command_loop
>> 	../../../repos/binutils-gdb/gdb/main.c:475
>> 0xaaaae0810b97 captured_main
>> 	../../../repos/binutils-gdb/gdb/main.c:1318
>> 0xaaaae0810c1b _Z8gdb_mainP18captured_main_args
>> 	../../../repos/binutils-gdb/gdb/main.c:1337
>> 0xaaaae0338453 main
>> 	../../../repos/binutils-gdb/gdb/gdb.c:32
>> ---------------------
>> ../../../repos/binutils-gdb/gdb/thread.c:86: internal-error: inferior_thread: Assertion `current_thread_ != nullptr' failed.
>> A problem internal to GDB has been detected,
>> further debugging may prove unreliable.
>> Quit this debugging session? (y or n)
>>
>> This issue started after commit d88cb738e6a7a7179dfaff8af78d69250c852af1, which
>> enabled more broad use of pointer authentication masks to remove non-address
>> bits of pointers.
>>
>> The above crash happens because gdb is in the middle of handling an event,
>> and do_target_wait_1 calls switch_to_inferior_no_thread, nullifying the
>> current thread.  This means a call to inferior_thread () will assert, and
>> attempting to call get_current_regcache () will also call inferior_thread (),
>> resulting in an assertion as well.
>>
>> There is another crash scenario after we kill a previously active inferior, in
>> which case the gdbarch will still say we support pointer authentication but we
>> will also have no current thread (inferior_thread () will assert etc).
>>
>> I thought about this for a bit, and I'm not sure what is the best way
>> to address this other than actively checking for a valid current thread
>> and valid current register cache.
>>
>> If the target has support for Pointer Authentication, gdb needs to use
>> a couple (or 4, for bare-metal) mask registers to mask off some bits of
>> pointers, and for that it needs to access the registers.
>>
>> At some points, like the one from the backtrace above, there is no active
>> thread/current regcache because gdb is in the middle of doing event handling
>> and switching between threads.
>>
>> The following patch improves this situation by checking for an active thread
>> and register cache. If those don't exist, gdb falls back to using a default
>> mask which should be enough for what it is trying to do at the moment.
>>
>> I discussed with Pedro the possibility of caching the mask register values
>> (which are per-process and can change mid-execution), but there isn't a good
>> spot to cache those values. Besides, the mask registers can change constantly
>> for bare-metal debugging when switching between exception levels.
>>
>> In some cases, it is just not possible to get access to these mask registers,
>> like the case where threads are running. In those cases, using a default mask
>> to remove the non-address bits should be enough.
>>
>> In addition to the check for a valid current register cache, I also introduced
>> a try/catch guard against errors reading registers, in case the current thread
>> is running, also falling back to using a default mask to remove non-address
>> bits.
>>
>> This can happen when we let threads run in the background and then we attempt
>> to access a memory address (now that gdb is capable of reading memory even
>> with threads running).  Thus gdb will attempt to remove non-address bits
>> of that memory access, will attempt to access registers, running into errors.
>>
>> Regression-tested on aarch64-linux Ubuntu 20.04.
>>
>> Thoughts?
>> ---
>>   gdb/aarch64-tdep.c | 38 +++++++++++++++++++++++++++++---------
>>   gdb/gdbthread.h    |  8 ++++++++
>>   gdb/regcache.c     | 10 ++++++++++
>>   gdb/regcache.h     |  8 ++++++++
>>   gdb/thread.c       |  8 ++++++++
> 
> Would it be possible to write a test for this issue?  Or maybe some of
> the existing tests fail due to this issue?  If there are some tests that
> already expose this issue then it would be great to mention them in the
> commit message.
> 

Existing tests fail due to this. Two example are gdb.base/break.exp and gdb.base/access-mem-running.exp are
two examples. I'll add those to the commit message for context.

The catch here is that they will only fail on a system that supports pointer authentication natively, and
there are only a few of those at the moment (Graviton 3, M2 and system QEMU are 3 examples I know of).

>>   5 files changed, 63 insertions(+), 9 deletions(-)
>>
>> diff --git a/gdb/aarch64-tdep.c b/gdb/aarch64-tdep.c
>> index 5b1b9921f87..8368592a7db 100644
>> --- a/gdb/aarch64-tdep.c
>> +++ b/gdb/aarch64-tdep.c
>> @@ -3562,13 +3562,17 @@ aarch64_remove_non_address_bits (struct gdbarch *gdbarch, CORE_ADDR pointer)
>>        select bit (55).  */
>>     CORE_ADDR mask = AARCH64_TOP_BITS_MASK;
>>   
>> -  if (tdep->has_pauth ())
>> +  /* Make sure we have a valid active current register cache to fetch pointer
>> +     authentication masks from.  In some situations, there may be no active
>> +     threads to fetch the data from (no inferiors, gdb in the middle of
>> +     handling events etc).  If we don't have a valid register cache, fallback
>> +     to using a default mask to remove non-address bits.  */
>> +  if (tdep->has_pauth () && current_regcache_exists ())
> 
> Maybe I'm completely off the mark here, but does target_has_registers
> not help at all here?

It does, and that was the first thing I considered for guarding this block against a situation where we
have no current thread.

But...

> 
> For a process_stratum_target, this should end up in this function:
> 
>    bool
>    process_stratum_target::has_registers ()
>    {
>      /* Can't read registers from no inferior.  */
>      return inferior_ptid != null_ptid;
>    }
> 
> And given that switch_to_no_thread () includes the line:
> 
>    inferior_ptid = null_ptid;
> 

... despite the call to switch_to_no_thread in switch_to_inferior_no_thread from do_target_wait_1
in the backtrace above, the call to ps_xfer_memory sets inferior_ptid momentarily before reading
memory:


static ps_err_e
ps_xfer_memory (const struct ps_prochandle *ph, psaddr_t addr,
                 gdb_byte *buf, size_t len, int write)
{
   scoped_restore_current_inferior restore_inferior;
   set_current_inferior (ph->thread->inf);

   scoped_restore_current_program_space restore_current_progspace;
   set_current_program_space (ph->thread->inf->pspace);

   scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
   inferior_ptid = ph->thread->ptid;

   CORE_ADDR core_addr = ps_addr_to_core_addr (addr);

   int ret;
   if (write)
     ret = target_write_memory (core_addr, buf, len);
   else
     ret = target_read_memory (core_addr, buf, len);
   return (ret == 0 ? PS_OK : PS_ERR);
}


Maybe this shouldn't happen, or maybe it is just an unfortunate state to be in. But this
prevents the use of target_has_registers to guard against the lack of registers, since,
although current_thread_ is still nullptr, inferior_ptid is valid and is not null_ptid.

So doing a direct check of current_thread_ through a helper function seemed safer.

And just to clarify, this is just one of the issues where we attempt to fetch a current
register cache with a null current_thread_, leading to an assertion.

The other issue, which is solved with a try/catch block, yields an error message about not
being able to access registers (because the thread we want to fetch registers from is still
running).

Maybe in the future we should prevent calls to register fetch/store functions if the thread we
want to fetch data from is still running. Right now this isn't a problem besides the AArch64
target though.

> My hope would be that when we have no thread selected,
> target_has_registers will return false.
> 
> The other thing I worried about before suggesting the above is this
> depends on the current_inferior in order to identify a target stack.
> But reading registers already makes that assumption, right.. reading a
> register might require us to actually go get the registers, so we are
> already assuming that the current_inferior() is sane at this point.
> 
> Thanks,
> Andrew
> 
> 
> 
> 
>>       {
>>         /* Fetch the PAC masks.  These masks are per-process, so we can just
>> -	 fetch data from whatever thread we have at the moment.
>> +	 fetch data from whatever thread we have at the moment, hence why we
>> +	 use the current register cache.  */
>>   
>> -	 Also, we have both a code mask and a data mask.  For now they are the
>> -	 same, but this may change in the future.  */
>>         struct regcache *regs = get_current_regcache ();
>>         CORE_ADDR cmask, dmask;
>>         int dmask_regnum = AARCH64_PAUTH_DMASK_REGNUM (tdep->pauth_reg_base);
>> @@ -3583,13 +3587,29 @@ aarch64_remove_non_address_bits (struct gdbarch *gdbarch, CORE_ADDR pointer)
>>   	  cmask_regnum = AARCH64_PAUTH_CMASK_HIGH_REGNUM (tdep->pauth_reg_base);
>>   	}
>>   
>> -      if (regs->cooked_read (dmask_regnum, &dmask) != REG_VALID)
>> -	dmask = mask;
>> +      /* GDB supports reading memory from threads that are running, but in
>> +	 those cases GDB cannot access registers.  If reading the pointer
>> +	 authentication registers fails, fallback to using a default mask.
>> +
>> +	 We could potentially get it wrong if the masks have been updated
>> +	 while the thread is running, but eventually we will see the thread
>> +	 stop and will be able to fetch the correct data.  */
>> +      try
>> +	{
>> +	  /* We have both a code mask and a data mask.  For now they are
>> +	     the same, but this may change in the future.  */
>> +	  if (regs->cooked_read (dmask_regnum, &dmask) != REG_VALID)
>> +	    dmask = mask;
>>   
>> -      if (regs->cooked_read (cmask_regnum, &cmask) != REG_VALID)
>> -	cmask = mask;
>> +	  if (regs->cooked_read (cmask_regnum, &cmask) != REG_VALID)
>> +	    cmask = mask;
>>   
>> -      mask |= aarch64_mask_from_pac_registers (cmask, dmask);
>> +	  mask |= aarch64_mask_from_pac_registers (cmask, dmask);
>> +	}
>> +      catch (const gdb_exception_error &e)
>> +	{
>> +	  /* We failed to fetch the contents of the masks from registers.  */
>> +	}
>>       }
>>   
>>     return aarch64_remove_top_bits (pointer, mask);
>> diff --git a/gdb/gdbthread.h b/gdb/gdbthread.h
>> index c0f27a8a66e..ebb8f140770 100644
>> --- a/gdb/gdbthread.h
>> +++ b/gdb/gdbthread.h
>> @@ -875,6 +875,14 @@ class scoped_restore_current_thread
>>     enum language m_lang;
>>   };
>>   
>> +/* Return TRUE if an inferior thread exists and is currently selected,
>> +   meaning it is safe to call inferior_thread () without risking an assertion.
>> +
>> +   Return FALSE otherwise.  If so, inferior_thread () will assert if
>> +   called.  */
>> +
>> +extern bool inferior_thread_exists (void);
>> +
>>   /* Returns a pointer into the thread_info corresponding to
>>      INFERIOR_PTID.  INFERIOR_PTID *must* be in the thread list.  */
>>   extern struct thread_info* inferior_thread (void);
>> diff --git a/gdb/regcache.c b/gdb/regcache.c
>> index af76fab1a34..4685b01e9ab 100644
>> --- a/gdb/regcache.c
>> +++ b/gdb/regcache.c
>> @@ -422,6 +422,16 @@ get_thread_regcache (thread_info *thread)
>>   			      thread->ptid);
>>   }
>>   
>> +/* See regcache.h.  */
>> +
>> +bool
>> +current_regcache_exists (void)
>> +{
>> +  /* If we have an inferior thread selected, we should have a register
>> +     cache as well.  */
>> +  return inferior_thread_exists ();
>> +}
>> +
>>   struct regcache *
>>   get_current_regcache (void)
>>   {
>> diff --git a/gdb/regcache.h b/gdb/regcache.h
>> index b9ffab9950d..adff7e44cb9 100644
>> --- a/gdb/regcache.h
>> +++ b/gdb/regcache.h
>> @@ -30,6 +30,14 @@ struct address_space;
>>   class thread_info;
>>   struct process_stratum_target;
>>   
>> +/* Return TRUE if we can access the register cache from the
>> +   currently-selected thread.  If so, we can call get_current_regcache ()
>> +   safely without risking an assertion due to the lack of an inferior thread.
>> +
>> +   Return FALSE otherwise.  In this case, attempting to call
>> +   get_current_regcache () will run into an assertion.  */
>> +
>> +extern bool current_regcache_exists (void);
>>   extern struct regcache *get_current_regcache (void);
>>   extern struct regcache *get_thread_regcache (process_stratum_target *target,
>>   					     ptid_t ptid);
>> diff --git a/gdb/thread.c b/gdb/thread.c
>> index e4d98ce966a..545bbf2a411 100644
>> --- a/gdb/thread.c
>> +++ b/gdb/thread.c
>> @@ -80,6 +80,14 @@ is_current_thread (const thread_info *thr)
>>     return thr == current_thread_;
>>   }
>>   
>> +/* See gdbthread.h.  */
>> +
>> +bool
>> +inferior_thread_exists ()
>> +{
>> +  return current_thread_ != nullptr;
>> +}
>> +
>>   struct thread_info*
>>   inferior_thread (void)
>>   {
>> -- 
>> 2.25.1
> 


  reply	other threads:[~2023-03-16 23:15 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-03-16 10:39 Luis Machado
2023-03-16 15:49 ` Andrew Burgess
2023-03-16 23:15   ` Luis Machado [this message]
2023-03-17 17:15     ` Simon Marchi
2023-03-17 17:25       ` Luis Machado
2023-03-17 17:29         ` Simon Marchi
2023-03-17 17:38           ` Luis Machado
2023-03-20 20:06     ` Andrew Burgess
2023-03-22 10:26       ` Luis Machado
2023-03-23 10:56 ` [PATCH, v2] " Luis Machado
2023-03-23 18:25   ` Kevin Buettner
2023-03-23 18:27     ` Luis Machado
2023-03-23 18:32       ` Luis Machado
2023-03-23 18:39         ` Kevin Buettner
2023-03-24  6:28           ` Luis Machado
2023-03-24  8:19   ` Luis Machado
2023-03-24 13:42     ` Luis Machado

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5ce96222-d665-5129-8d65-27c6933a6623@arm.com \
    --to=luis.machado@arm.com \
    --cc=aburgess@redhat.com \
    --cc=gdb-patches@sourceware.org \
    --cc=marcan@marcan.st \
    --cc=pedro@palves.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).