Re: [PATCH][AArch64] Fix missing watchpoint hits/endless loop

public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed

From: Alan Hayward <Alan.Hayward@arm.com>
To: Luis Machado <luis.machado@linaro.org>
Cc: "gdb-patches\\@sourceware.org" <gdb-patches@sourceware.org>,
	"david.spickett@linaro.org" <david.spickett@linaro.org>,
	nd <nd@arm.com>
Subject: Re: [PATCH][AArch64] Fix missing watchpoint hits/endless loop
Date: Wed, 2 Jun 2021 16:57:26 +0000	[thread overview]
Message-ID: <01A08E0C-97D3-43DA-AACC-05EF8D126714@arm.com> (raw)
In-Reply-To: <20210602142745.2174-1-luis.machado@linaro.org>



> On 2 Jun 2021, at 15:27, Luis Machado <luis.machado@linaro.org> wrote:
> 
> I ran into a situation where a hardware watchpoint hit is not detected
> correctly, misleading GDB into thinking it was a delayed breakpoint hit.
> 
> The problem is that hardware watchpoints are not skippable on AArch64, so
> that makes GDB loop endlessly trying to run past the instruction.
> 
> The most obvious case where this happens is when the load/store pair
> instructions access 16 bytes of memory.
> 
> Suppose we have a stp instruction that will write a couple 64-bit registers
> to address 0x10 (stp x3,x4 [x2]). It will write data from 0x10 up to 0x20.
> 
> Now suppose a write watchpoint is created to monitor memory address 0x18,
> which is the start of the second register write. It can have whatever length,
> but let's assume it has length 8.
> 
> When we execute that stp instruction, it will trap and the reported stopped
> data address from the kernel will be 0x10 (the beginning of the memory range
> accessed by the instruction).
> 
> The current code won't be able to detect a valid trigger because it assumes an
> alignment of 8 bytes for the watchpoint address. Forcing that kind of alignment
> won't be enough to detect a 16-byte access if the trap address falls outside of
> the 8-byte alignment window. We need to know how many bytes the instruction
> will access, but we won't have that data unless we go parsing instructions.
> 
> Another issue with the current code seems to be that it assumes the accesses
> will always be 8 bytes in size, since it wants to align the watchpoint address
> to that particular boundary. This leads to problems when we have unaligned
> addresses and unaligned watchpoints.
> 
> For example, suppose we have a str instruction storing 8 bytes to memory
> address 0xf. Now suppose we have a write watchpoint at address 0x16,
> monitoring 8 bytes.
> 
> The trap address will be 0xf, but forcing 0x16 to 8-byte alignment yields
> 0x10, and so GDB doesn't think this is a watchpoint hit.

Ouch.
I’m starting to wonder now if I have encountered this bug myself, and just
shrugged it off thinking the variable wasn’t being hit.

> 
> I believe you can trigger the same problem with smaller memory accesses,
> except one that accesses a single byte.
> 
> As I said earlier, ideally we'd go parsing instructions to figure out how many
> bytes they access. That is a bit complex though, so meanwhile I think we should
> go with a simpler solution.
> 
> We keep the assumption that the instructions access 8 bytes, except when we
> detect one of the instructions that access 16 bytes.

What about NEON and SVE instructions? They can access a lot of data at once.

> 
> For the trigger detection, instead of forcing an alignment and then checking
> if the trap address falls within the range of the watchpoint, we just check
> if the memory access and the watchpoint range overlap. If they do, then we
> have a watchpoint hit.
> 
> This still has potential to give false positives if we choose the right
> combination of memory address, access size and watchpoint address. But it
> should match the old code in this regard, while giving better results for
> accesses of 16 bytes and unaligned accesses.

Is there a plan for fixing those?

> 
> It also fixes these two failures in the testsuite:
> 
> FAIL: gdb.base/watchpoint-unaligned.exp: continue (timeout)
> FAIL: gdb.base/watchpoint-unaligned.exp: size8twice write
> 
> Regression tested on aarch64-linux Ubuntu/20.04.
> 
> gdb/ChangeLog:
> 
> YYYY-MM-DD  Luis Machado  <luis.machado@linaro.org>
> 
> 	* aarch64-linux-nat.c
> 	(aarch64_linux_nat_target::stopped_data_address): Refactor.
> 	* nat/aarch64-linux-hw-point.c (hw_watch_is_16b_ldst)
> 	(hw_watch_regions_overlap, hw_watch_detect_trigger): New functions.
> 	* nat/aarch64-linux-hw-point.h (LDP_STP_MASK, STP_OPC_64)
> 	(LDP_OPC_64): New constants.
> 	(hw_watch_is_16b_ldst, hw_watch_regions_overlap)
> 	(hw_watch_detect_trigger): New prototypes.
> 
> gdbserver/ChangeLog:
> 
> YYYY-MM-DD  Luis Machado  <luis.machado@linaro.org>
> 
> 	* linux-aarch64-low.cc
> 	(aarch64_target::low_stopped_data_address): Refactor.
> ---
> gdb/aarch64-linux-nat.c          |  45 ++----------
> gdb/nat/aarch64-linux-hw-point.c | 120 +++++++++++++++++++++++++++++++
> gdb/nat/aarch64-linux-hw-point.h |  16 +++++
> gdbserver/linux-aarch64-low.cc   |  49 ++++---------
> 4 files changed, 154 insertions(+), 76 deletions(-)
> 
> diff --git a/gdb/aarch64-linux-nat.c b/gdb/aarch64-linux-nat.c
> index 61224022f6a..de405a5e351 100644
> --- a/gdb/aarch64-linux-nat.c
> +++ b/gdb/aarch64-linux-nat.c
> @@ -951,7 +951,6 @@ bool
> aarch64_linux_nat_target::stopped_data_address (CORE_ADDR *addr_p)
> {
>   siginfo_t siginfo;
> -  int i;
>   struct aarch64_debug_reg_state *state;
> 
>   if (!linux_nat_get_siginfo (inferior_ptid, &siginfo))
> @@ -969,46 +968,14 @@ aarch64_linux_nat_target::stopped_data_address (CORE_ADDR *addr_p)
>   const CORE_ADDR addr_trap
>     = address_significant (gdbarch, (CORE_ADDR) siginfo.si_addr);
> 
> +  struct regcache *regs = get_thread_regcache (this, inferior_ptid);
> +  CORE_ADDR trigger_pc = regcache_read_pc (regs);
> +  uint32_t insn;
> +  read_memory (trigger_pc, (gdb_byte *) &insn, 4);
> +
>   /* Check if the address matches any watched address.  */
>   state = aarch64_get_debug_reg_state (inferior_ptid.pid ());
> -  for (i = aarch64_num_wp_regs - 1; i >= 0; --i)
> -    {
> -      const unsigned int offset
> -	= aarch64_watchpoint_offset (state->dr_ctrl_wp[i]);
> -      const unsigned int len = aarch64_watchpoint_length (state->dr_ctrl_wp[i]);
> -      const CORE_ADDR addr_watch = state->dr_addr_wp[i] + offset;
> -      const CORE_ADDR addr_watch_aligned = align_down (state->dr_addr_wp[i], 8);
> -      const CORE_ADDR addr_orig = state->dr_addr_orig_wp[i];
> -
> -      if (state->dr_ref_count_wp[i]
> -	  && DR_CONTROL_ENABLED (state->dr_ctrl_wp[i])
> -	  && addr_trap >= addr_watch_aligned
> -	  && addr_trap < addr_watch + len)
> -	{
> -	  /* ADDR_TRAP reports the first address of the memory range
> -	     accessed by the CPU, regardless of what was the memory
> -	     range watched.  Thus, a large CPU access that straddles
> -	     the ADDR_WATCH..ADDR_WATCH+LEN range may result in an
> -	     ADDR_TRAP that is lower than the
> -	     ADDR_WATCH..ADDR_WATCH+LEN range.  E.g.:
> -
> -	     addr: |   4   |   5   |   6   |   7   |   8   |
> -				   |---- range watched ----|
> -		   |----------- range accessed ------------|
> -
> -	     In this case, ADDR_TRAP will be 4.
> -
> -	     To match a watchpoint known to GDB core, we must never
> -	     report *ADDR_P outside of any ADDR_WATCH..ADDR_WATCH+LEN
> -	     range.  ADDR_WATCH <= ADDR_TRAP < ADDR_ORIG is a false
> -	     positive on kernels older than 4.10.  See PR
> -	     external/20207.  */
> -	  *addr_p = addr_orig;
> -	  return true;
> -	}
> -    }
> -
> -  return false;
> +  return hw_watch_detect_trigger (state, insn, addr_trap, addr_p);
> }
> 
> /* Implement the "stopped_by_watchpoint" target_ops method.  */
> diff --git a/gdb/nat/aarch64-linux-hw-point.c b/gdb/nat/aarch64-linux-hw-point.c
> index af2cc4254e2..32ed76157ac 100644
> --- a/gdb/nat/aarch64-linux-hw-point.c
> +++ b/gdb/nat/aarch64-linux-hw-point.c
> @@ -865,3 +865,123 @@ aarch64_linux_region_ok_for_watchpoint (CORE_ADDR addr, int len)
>      the checking is costly.  */
>   return 1;
> }
> +
> +/* Return true if INSN is a ldp/stp that accesses 16 bytes of memory.
> +   Return false otherwise.  */
> +
> +bool
> +hw_watch_is_16b_ldst (CORE_ADDR insn)
> +{
> +  return ((insn & LDP_STP_MASK) == STP_OPC_64
> +	  || (insn & LDP_STP_MASK) == LDP_OPC_64);
> +}
> +
> +/* Return true if the regions [mem_addr, mem_addr + mem_len] and
> +   [watch_addr, watch_addr + watch_len] overlap.  False otherwise.  */
> +
> +bool
> +hw_watch_regions_overlap (CORE_ADDR mem_addr, size_t mem_len,
> +			  CORE_ADDR watch_addr, size_t watch_len)
> +{
> +  if (watch_addr > (mem_addr + mem_len)
> +      || mem_addr > (watch_addr + watch_len))
> +    return false;
> +
> +  CORE_ADDR start = std::max (mem_addr, watch_addr);
> +  CORE_ADDR end = std::min (mem_addr + mem_len, watch_addr + watch_len);
> +
> +  return ((end - start) > 0);
> +}
> +
> +/* Check if a hardware watchpoint has triggered.  If a trigger is detected,
> +   return true and update ADDR_P with the stopped data address.
> +   Otherwise return false.
> +
> +   STATE is the debug register's state, INSN is the instruction the inferior
> +   stopped at and ADDR_TRAP is the reported stopped data address.  */
> +
> +bool
> +hw_watch_detect_trigger (const struct aarch64_debug_reg_state *state,
> +			 CORE_ADDR insn, CORE_ADDR addr_trap,
> +			 CORE_ADDR *addr_p)
> +{
> +  /* There are 6 variations of watchpoint range and memory access
> +     range positioning:
> +
> +     - W is the byte in the watchpoint range only.
> +
> +     - M is the byte in the memory access range ony.
> +
> +     - O is the byte in the overlapping region of the watchpoint range and
> +       the memory access range.
> +
> +     1 - Non-overlapping, no triggers.
> +
> +     [WWWWWWWW]...[MMMMMMMM]
> +
> +     2 - Non-overlapping, no triggers.
> +
> +     [MMMMMMMM]...[WWWWWWWW]
> +
> +     3 - Overlapping partially, triggers.
> +
> +     [MMMMOOOOWWWW]
> +
> +     4 - Overlapping partially, triggers.
> +
> +     [WWWWOOOOMMMM]
> +
> +     5 - Memory access contained in watchpoint range, triggers.
> +
> +     [WWWWOOOOOOOOWWWW]
> +
> +     6 - Memory access containing watchpoint range, triggers.
> +
> +     [MMMMOOOOOOOOMMMM]

Very minor nit: M looks like an upside down W, so there was an initial slight confusion.
Not worth changing, because M and W are the right letter choices.

> +  */
> +
> +  /* We assume the memory access size is 8 bytes.  */
> +  unsigned int memory_access_size = 8;
> +
> +  /* Check if the memory access size is 16 bytes.  */
> +  if (hw_watch_is_16b_ldst (insn))
> +    memory_access_size = 16;
> +
> +  for (int i = aarch64_num_wp_regs - 1; i >= 0; --i)
> +    {
> +      const unsigned int offset
> +	= aarch64_watchpoint_offset (state->dr_ctrl_wp[i]);
> +      const unsigned int len = aarch64_watchpoint_length (state->dr_ctrl_wp[i]);
> +      const CORE_ADDR addr_watch = state->dr_addr_wp[i] + offset;
> +      const CORE_ADDR addr_orig = state->dr_addr_orig_wp[i];
> +
> +      if ((state->dr_ref_count_wp[i]
> +	  && DR_CONTROL_ENABLED (state->dr_ctrl_wp[i]))
> +	  && hw_watch_regions_overlap (addr_trap, memory_access_size,
> +				       addr_watch, len))
> +	{
> +	  /* ADDR_TRAP reports the first address of the memory range
> +	     accessed by the CPU, regardless of what was the memory
> +	     range watched.  Thus, a large CPU access that straddles
> +	     the ADDR_WATCH..ADDR_WATCH+LEN range may result in an
> +	     ADDR_TRAP that is lower than the
> +	     ADDR_WATCH..ADDR_WATCH+LEN range.  E.g.:
> +
> +	     addr: |   4   |   5   |   6   |   7   |   8   |
> +				   |---- range watched ----|
> +		   |----------- range accessed ------------|
> +
> +	     In this case, ADDR_TRAP will be 4.
> +
> +	     To match a watchpoint known to GDB core, we must never
> +	     report *ADDR_P outside of any ADDR_WATCH..ADDR_WATCH+LEN
> +	     range.  ADDR_WATCH <= ADDR_TRAP < ADDR_ORIG is a false
> +	     positive on kernels older than 4.10.  See PR
> +	     external/20207.  */
> +	  *addr_p = addr_orig;
> +	  return true;
> +	}
> +    }
> +
> +  return false;
> +}
> diff --git a/gdb/nat/aarch64-linux-hw-point.h b/gdb/nat/aarch64-linux-hw-point.h
> index 2fc4b400ece..62e755dd1e9 100644
> --- a/gdb/nat/aarch64-linux-hw-point.h
> +++ b/gdb/nat/aarch64-linux-hw-point.h
> @@ -132,6 +132,13 @@ typedef ULONGEST dr_changed_t;
> #define DR_HAS_CHANGED(x) ((x) != 0)
> #define DR_N_HAS_CHANGED(x, n) ((x) & ((dr_changed_t)1 << (n)))
> 
> +/* Mask for matching LDP and STP instruction variants.  */
> +#define LDP_STP_MASK  0xFE400000
> +/* stp and stnp with 64-bit registers.  */
> +#define STP_OPC_64    0xA8000000
> +/* ldp and ldnp with 64-bit registers.  */
> +#define LDP_OPC_64    0xA8400000
> +
> /* Structure for managing the hardware breakpoint/watchpoint resources.
>    DR_ADDR_* stores the address, DR_CTRL_* stores the control register
>    content, and DR_REF_COUNT_* counts the numbers of references to the
> @@ -197,4 +204,13 @@ struct aarch64_debug_reg_state *aarch64_get_debug_reg_state (pid_t pid);
> 
> int aarch64_linux_region_ok_for_watchpoint (CORE_ADDR addr, int len);
> 
> +bool hw_watch_is_16b_ldst (CORE_ADDR insn);
> +
> +bool hw_watch_regions_overlap (CORE_ADDR mem_addr, size_t mem_len,
> +			       CORE_ADDR watch_addr, size_t watch_len);
> +
> +bool hw_watch_detect_trigger (const struct aarch64_debug_reg_state *state,
> +			      CORE_ADDR insn, CORE_ADDR stopped_data_address,
> +			      CORE_ADDR *addr_p);
> +
> #endif /* NAT_AARCH64_LINUX_HW_POINT_H */
> diff --git a/gdbserver/linux-aarch64-low.cc b/gdbserver/linux-aarch64-low.cc
> index daccfef746e..5df632fe724 100644
> --- a/gdbserver/linux-aarch64-low.cc
> +++ b/gdbserver/linux-aarch64-low.cc
> @@ -518,7 +518,7 @@ CORE_ADDR
> aarch64_target::low_stopped_data_address ()
> {
>   siginfo_t siginfo;
> -  int pid, i;
> +  int pid;
>   struct aarch64_debug_reg_state *state;
> 
>   pid = lwpid_of (current_thread);
> @@ -538,45 +538,20 @@ aarch64_target::low_stopped_data_address ()
>   const CORE_ADDR addr_trap
>     = address_significant ((CORE_ADDR) siginfo.si_addr);
> 
> +  struct regcache *regs = get_thread_regcache (current_thread, 1);
> +  CORE_ADDR trigger_pc = regcache_read_pc (regs);
> +  uint32_t insn;
> +  read_memory (trigger_pc, (gdb_byte *) &insn, 4);
> +
>   /* Check if the address matches any watched address.  */
>   state = aarch64_get_debug_reg_state (pid_of (current_thread));
> -  for (i = aarch64_num_wp_regs - 1; i >= 0; --i)
> -    {
> -      const unsigned int offset
> -	= aarch64_watchpoint_offset (state->dr_ctrl_wp[i]);
> -      const unsigned int len = aarch64_watchpoint_length (state->dr_ctrl_wp[i]);
> -      const CORE_ADDR addr_watch = state->dr_addr_wp[i] + offset;
> -      const CORE_ADDR addr_watch_aligned = align_down (state->dr_addr_wp[i], 8);
> -      const CORE_ADDR addr_orig = state->dr_addr_orig_wp[i];
> -
> -      if (state->dr_ref_count_wp[i]
> -	  && DR_CONTROL_ENABLED (state->dr_ctrl_wp[i])
> -	  && addr_trap >= addr_watch_aligned
> -	  && addr_trap < addr_watch + len)
> -	{
> -	  /* ADDR_TRAP reports the first address of the memory range
> -	     accessed by the CPU, regardless of what was the memory
> -	     range watched.  Thus, a large CPU access that straddles
> -	     the ADDR_WATCH..ADDR_WATCH+LEN range may result in an
> -	     ADDR_TRAP that is lower than the
> -	     ADDR_WATCH..ADDR_WATCH+LEN range.  E.g.:
> -
> -	     addr: |   4   |   5   |   6   |   7   |   8   |
> -				   |---- range watched ----|
> -		   |----------- range accessed ------------|
> -
> -	     In this case, ADDR_TRAP will be 4.
> -
> -	     To match a watchpoint known to GDB core, we must never
> -	     report *ADDR_P outside of any ADDR_WATCH..ADDR_WATCH+LEN
> -	     range.  ADDR_WATCH <= ADDR_TRAP < ADDR_ORIG is a false
> -	     positive on kernels older than 4.10.  See PR
> -	     external/20207.  */
> -	  return addr_orig;
> -	}
> -    }
> 
> -  return (CORE_ADDR) 0;
> +  CORE_ADDR trigger_addr;
> +
> +  if (hw_watch_detect_trigger (state, insn, addr_trap, &trigger_addr))
> +    return trigger_addr;
> +
> +  return 0;
> }
> 
> /* Implementation of linux target ops method "low_stopped_by_watchpoint".  */
> -- 
> 2.25.1
>

next prev parent reply	other threads:[~2021-06-02 16:57 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-02 14:27 Luis Machado
2021-06-02 16:57 ` Alan Hayward [this message]
2021-06-02 17:22   ` Luis Machado
2021-06-03 10:22     ` Alan Hayward
2021-07-23 13:24 ` [PATCH, v2][AArch64] " Luis Machado
2021-08-17 15:30   ` Alan Hayward

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=01A08E0C-97D3-43DA-AACC-05EF8D126714@arm.com \
    --to=alan.hayward@arm.com \
    --cc=david.spickett@linaro.org \
    --cc=gdb-patches@sourceware.org \
    --cc=luis.machado@linaro.org \
    --cc=nd@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).