From: Alan Hayward <Alan.Hayward@arm.com>
To: Luis Machado <luis.machado@linaro.org>
Cc: "gdb-patches\\@sourceware.org" <gdb-patches@sourceware.org>,
"david.spickett@linaro.org" <david.spickett@linaro.org>,
nd <nd@arm.com>
Subject: Re: [PATCH][AArch64] Fix missing watchpoint hits/endless loop
Date: Wed, 2 Jun 2021 16:57:26 +0000 [thread overview]
Message-ID: <01A08E0C-97D3-43DA-AACC-05EF8D126714@arm.com> (raw)
In-Reply-To: <20210602142745.2174-1-luis.machado@linaro.org>
> On 2 Jun 2021, at 15:27, Luis Machado <luis.machado@linaro.org> wrote:
>
> I ran into a situation where a hardware watchpoint hit is not detected
> correctly, misleading GDB into thinking it was a delayed breakpoint hit.
>
> The problem is that hardware watchpoints are not skippable on AArch64, so
> that makes GDB loop endlessly trying to run past the instruction.
>
> The most obvious case where this happens is when the load/store pair
> instructions access 16 bytes of memory.
>
> Suppose we have a stp instruction that will write a couple 64-bit registers
> to address 0x10 (stp x3,x4 [x2]). It will write data from 0x10 up to 0x20.
>
> Now suppose a write watchpoint is created to monitor memory address 0x18,
> which is the start of the second register write. It can have whatever length,
> but let's assume it has length 8.
>
> When we execute that stp instruction, it will trap and the reported stopped
> data address from the kernel will be 0x10 (the beginning of the memory range
> accessed by the instruction).
>
> The current code won't be able to detect a valid trigger because it assumes an
> alignment of 8 bytes for the watchpoint address. Forcing that kind of alignment
> won't be enough to detect a 16-byte access if the trap address falls outside of
> the 8-byte alignment window. We need to know how many bytes the instruction
> will access, but we won't have that data unless we go parsing instructions.
>
> Another issue with the current code seems to be that it assumes the accesses
> will always be 8 bytes in size, since it wants to align the watchpoint address
> to that particular boundary. This leads to problems when we have unaligned
> addresses and unaligned watchpoints.
>
> For example, suppose we have a str instruction storing 8 bytes to memory
> address 0xf. Now suppose we have a write watchpoint at address 0x16,
> monitoring 8 bytes.
>
> The trap address will be 0xf, but forcing 0x16 to 8-byte alignment yields
> 0x10, and so GDB doesn't think this is a watchpoint hit.
Ouch.
I’m starting to wonder now if I have encountered this bug myself, and just
shrugged it off thinking the variable wasn’t being hit.
>
> I believe you can trigger the same problem with smaller memory accesses,
> except one that accesses a single byte.
>
> As I said earlier, ideally we'd go parsing instructions to figure out how many
> bytes they access. That is a bit complex though, so meanwhile I think we should
> go with a simpler solution.
>
> We keep the assumption that the instructions access 8 bytes, except when we
> detect one of the instructions that access 16 bytes.
What about NEON and SVE instructions? They can access a lot of data at once.
>
> For the trigger detection, instead of forcing an alignment and then checking
> if the trap address falls within the range of the watchpoint, we just check
> if the memory access and the watchpoint range overlap. If they do, then we
> have a watchpoint hit.
>
> This still has potential to give false positives if we choose the right
> combination of memory address, access size and watchpoint address. But it
> should match the old code in this regard, while giving better results for
> accesses of 16 bytes and unaligned accesses.
Is there a plan for fixing those?
>
> It also fixes these two failures in the testsuite:
>
> FAIL: gdb.base/watchpoint-unaligned.exp: continue (timeout)
> FAIL: gdb.base/watchpoint-unaligned.exp: size8twice write
>
> Regression tested on aarch64-linux Ubuntu/20.04.
>
> gdb/ChangeLog:
>
> YYYY-MM-DD Luis Machado <luis.machado@linaro.org>
>
> * aarch64-linux-nat.c
> (aarch64_linux_nat_target::stopped_data_address): Refactor.
> * nat/aarch64-linux-hw-point.c (hw_watch_is_16b_ldst)
> (hw_watch_regions_overlap, hw_watch_detect_trigger): New functions.
> * nat/aarch64-linux-hw-point.h (LDP_STP_MASK, STP_OPC_64)
> (LDP_OPC_64): New constants.
> (hw_watch_is_16b_ldst, hw_watch_regions_overlap)
> (hw_watch_detect_trigger): New prototypes.
>
> gdbserver/ChangeLog:
>
> YYYY-MM-DD Luis Machado <luis.machado@linaro.org>
>
> * linux-aarch64-low.cc
> (aarch64_target::low_stopped_data_address): Refactor.
> ---
> gdb/aarch64-linux-nat.c | 45 ++----------
> gdb/nat/aarch64-linux-hw-point.c | 120 +++++++++++++++++++++++++++++++
> gdb/nat/aarch64-linux-hw-point.h | 16 +++++
> gdbserver/linux-aarch64-low.cc | 49 ++++---------
> 4 files changed, 154 insertions(+), 76 deletions(-)
>
> diff --git a/gdb/aarch64-linux-nat.c b/gdb/aarch64-linux-nat.c
> index 61224022f6a..de405a5e351 100644
> --- a/gdb/aarch64-linux-nat.c
> +++ b/gdb/aarch64-linux-nat.c
> @@ -951,7 +951,6 @@ bool
> aarch64_linux_nat_target::stopped_data_address (CORE_ADDR *addr_p)
> {
> siginfo_t siginfo;
> - int i;
> struct aarch64_debug_reg_state *state;
>
> if (!linux_nat_get_siginfo (inferior_ptid, &siginfo))
> @@ -969,46 +968,14 @@ aarch64_linux_nat_target::stopped_data_address (CORE_ADDR *addr_p)
> const CORE_ADDR addr_trap
> = address_significant (gdbarch, (CORE_ADDR) siginfo.si_addr);
>
> + struct regcache *regs = get_thread_regcache (this, inferior_ptid);
> + CORE_ADDR trigger_pc = regcache_read_pc (regs);
> + uint32_t insn;
> + read_memory (trigger_pc, (gdb_byte *) &insn, 4);
> +
> /* Check if the address matches any watched address. */
> state = aarch64_get_debug_reg_state (inferior_ptid.pid ());
> - for (i = aarch64_num_wp_regs - 1; i >= 0; --i)
> - {
> - const unsigned int offset
> - = aarch64_watchpoint_offset (state->dr_ctrl_wp[i]);
> - const unsigned int len = aarch64_watchpoint_length (state->dr_ctrl_wp[i]);
> - const CORE_ADDR addr_watch = state->dr_addr_wp[i] + offset;
> - const CORE_ADDR addr_watch_aligned = align_down (state->dr_addr_wp[i], 8);
> - const CORE_ADDR addr_orig = state->dr_addr_orig_wp[i];
> -
> - if (state->dr_ref_count_wp[i]
> - && DR_CONTROL_ENABLED (state->dr_ctrl_wp[i])
> - && addr_trap >= addr_watch_aligned
> - && addr_trap < addr_watch + len)
> - {
> - /* ADDR_TRAP reports the first address of the memory range
> - accessed by the CPU, regardless of what was the memory
> - range watched. Thus, a large CPU access that straddles
> - the ADDR_WATCH..ADDR_WATCH+LEN range may result in an
> - ADDR_TRAP that is lower than the
> - ADDR_WATCH..ADDR_WATCH+LEN range. E.g.:
> -
> - addr: | 4 | 5 | 6 | 7 | 8 |
> - |---- range watched ----|
> - |----------- range accessed ------------|
> -
> - In this case, ADDR_TRAP will be 4.
> -
> - To match a watchpoint known to GDB core, we must never
> - report *ADDR_P outside of any ADDR_WATCH..ADDR_WATCH+LEN
> - range. ADDR_WATCH <= ADDR_TRAP < ADDR_ORIG is a false
> - positive on kernels older than 4.10. See PR
> - external/20207. */
> - *addr_p = addr_orig;
> - return true;
> - }
> - }
> -
> - return false;
> + return hw_watch_detect_trigger (state, insn, addr_trap, addr_p);
> }
>
> /* Implement the "stopped_by_watchpoint" target_ops method. */
> diff --git a/gdb/nat/aarch64-linux-hw-point.c b/gdb/nat/aarch64-linux-hw-point.c
> index af2cc4254e2..32ed76157ac 100644
> --- a/gdb/nat/aarch64-linux-hw-point.c
> +++ b/gdb/nat/aarch64-linux-hw-point.c
> @@ -865,3 +865,123 @@ aarch64_linux_region_ok_for_watchpoint (CORE_ADDR addr, int len)
> the checking is costly. */
> return 1;
> }
> +
> +/* Return true if INSN is a ldp/stp that accesses 16 bytes of memory.
> + Return false otherwise. */
> +
> +bool
> +hw_watch_is_16b_ldst (CORE_ADDR insn)
> +{
> + return ((insn & LDP_STP_MASK) == STP_OPC_64
> + || (insn & LDP_STP_MASK) == LDP_OPC_64);
> +}
> +
> +/* Return true if the regions [mem_addr, mem_addr + mem_len] and
> + [watch_addr, watch_addr + watch_len] overlap. False otherwise. */
> +
> +bool
> +hw_watch_regions_overlap (CORE_ADDR mem_addr, size_t mem_len,
> + CORE_ADDR watch_addr, size_t watch_len)
> +{
> + if (watch_addr > (mem_addr + mem_len)
> + || mem_addr > (watch_addr + watch_len))
> + return false;
> +
> + CORE_ADDR start = std::max (mem_addr, watch_addr);
> + CORE_ADDR end = std::min (mem_addr + mem_len, watch_addr + watch_len);
> +
> + return ((end - start) > 0);
> +}
> +
> +/* Check if a hardware watchpoint has triggered. If a trigger is detected,
> + return true and update ADDR_P with the stopped data address.
> + Otherwise return false.
> +
> + STATE is the debug register's state, INSN is the instruction the inferior
> + stopped at and ADDR_TRAP is the reported stopped data address. */
> +
> +bool
> +hw_watch_detect_trigger (const struct aarch64_debug_reg_state *state,
> + CORE_ADDR insn, CORE_ADDR addr_trap,
> + CORE_ADDR *addr_p)
> +{
> + /* There are 6 variations of watchpoint range and memory access
> + range positioning:
> +
> + - W is the byte in the watchpoint range only.
> +
> + - M is the byte in the memory access range ony.
> +
> + - O is the byte in the overlapping region of the watchpoint range and
> + the memory access range.
> +
> + 1 - Non-overlapping, no triggers.
> +
> + [WWWWWWWW]...[MMMMMMMM]
> +
> + 2 - Non-overlapping, no triggers.
> +
> + [MMMMMMMM]...[WWWWWWWW]
> +
> + 3 - Overlapping partially, triggers.
> +
> + [MMMMOOOOWWWW]
> +
> + 4 - Overlapping partially, triggers.
> +
> + [WWWWOOOOMMMM]
> +
> + 5 - Memory access contained in watchpoint range, triggers.
> +
> + [WWWWOOOOOOOOWWWW]
> +
> + 6 - Memory access containing watchpoint range, triggers.
> +
> + [MMMMOOOOOOOOMMMM]
Very minor nit: M looks like an upside down W, so there was an initial slight confusion.
Not worth changing, because M and W are the right letter choices.
> + */
> +
> + /* We assume the memory access size is 8 bytes. */
> + unsigned int memory_access_size = 8;
> +
> + /* Check if the memory access size is 16 bytes. */
> + if (hw_watch_is_16b_ldst (insn))
> + memory_access_size = 16;
> +
> + for (int i = aarch64_num_wp_regs - 1; i >= 0; --i)
> + {
> + const unsigned int offset
> + = aarch64_watchpoint_offset (state->dr_ctrl_wp[i]);
> + const unsigned int len = aarch64_watchpoint_length (state->dr_ctrl_wp[i]);
> + const CORE_ADDR addr_watch = state->dr_addr_wp[i] + offset;
> + const CORE_ADDR addr_orig = state->dr_addr_orig_wp[i];
> +
> + if ((state->dr_ref_count_wp[i]
> + && DR_CONTROL_ENABLED (state->dr_ctrl_wp[i]))
> + && hw_watch_regions_overlap (addr_trap, memory_access_size,
> + addr_watch, len))
> + {
> + /* ADDR_TRAP reports the first address of the memory range
> + accessed by the CPU, regardless of what was the memory
> + range watched. Thus, a large CPU access that straddles
> + the ADDR_WATCH..ADDR_WATCH+LEN range may result in an
> + ADDR_TRAP that is lower than the
> + ADDR_WATCH..ADDR_WATCH+LEN range. E.g.:
> +
> + addr: | 4 | 5 | 6 | 7 | 8 |
> + |---- range watched ----|
> + |----------- range accessed ------------|
> +
> + In this case, ADDR_TRAP will be 4.
> +
> + To match a watchpoint known to GDB core, we must never
> + report *ADDR_P outside of any ADDR_WATCH..ADDR_WATCH+LEN
> + range. ADDR_WATCH <= ADDR_TRAP < ADDR_ORIG is a false
> + positive on kernels older than 4.10. See PR
> + external/20207. */
> + *addr_p = addr_orig;
> + return true;
> + }
> + }
> +
> + return false;
> +}
> diff --git a/gdb/nat/aarch64-linux-hw-point.h b/gdb/nat/aarch64-linux-hw-point.h
> index 2fc4b400ece..62e755dd1e9 100644
> --- a/gdb/nat/aarch64-linux-hw-point.h
> +++ b/gdb/nat/aarch64-linux-hw-point.h
> @@ -132,6 +132,13 @@ typedef ULONGEST dr_changed_t;
> #define DR_HAS_CHANGED(x) ((x) != 0)
> #define DR_N_HAS_CHANGED(x, n) ((x) & ((dr_changed_t)1 << (n)))
>
> +/* Mask for matching LDP and STP instruction variants. */
> +#define LDP_STP_MASK 0xFE400000
> +/* stp and stnp with 64-bit registers. */
> +#define STP_OPC_64 0xA8000000
> +/* ldp and ldnp with 64-bit registers. */
> +#define LDP_OPC_64 0xA8400000
> +
> /* Structure for managing the hardware breakpoint/watchpoint resources.
> DR_ADDR_* stores the address, DR_CTRL_* stores the control register
> content, and DR_REF_COUNT_* counts the numbers of references to the
> @@ -197,4 +204,13 @@ struct aarch64_debug_reg_state *aarch64_get_debug_reg_state (pid_t pid);
>
> int aarch64_linux_region_ok_for_watchpoint (CORE_ADDR addr, int len);
>
> +bool hw_watch_is_16b_ldst (CORE_ADDR insn);
> +
> +bool hw_watch_regions_overlap (CORE_ADDR mem_addr, size_t mem_len,
> + CORE_ADDR watch_addr, size_t watch_len);
> +
> +bool hw_watch_detect_trigger (const struct aarch64_debug_reg_state *state,
> + CORE_ADDR insn, CORE_ADDR stopped_data_address,
> + CORE_ADDR *addr_p);
> +
> #endif /* NAT_AARCH64_LINUX_HW_POINT_H */
> diff --git a/gdbserver/linux-aarch64-low.cc b/gdbserver/linux-aarch64-low.cc
> index daccfef746e..5df632fe724 100644
> --- a/gdbserver/linux-aarch64-low.cc
> +++ b/gdbserver/linux-aarch64-low.cc
> @@ -518,7 +518,7 @@ CORE_ADDR
> aarch64_target::low_stopped_data_address ()
> {
> siginfo_t siginfo;
> - int pid, i;
> + int pid;
> struct aarch64_debug_reg_state *state;
>
> pid = lwpid_of (current_thread);
> @@ -538,45 +538,20 @@ aarch64_target::low_stopped_data_address ()
> const CORE_ADDR addr_trap
> = address_significant ((CORE_ADDR) siginfo.si_addr);
>
> + struct regcache *regs = get_thread_regcache (current_thread, 1);
> + CORE_ADDR trigger_pc = regcache_read_pc (regs);
> + uint32_t insn;
> + read_memory (trigger_pc, (gdb_byte *) &insn, 4);
> +
> /* Check if the address matches any watched address. */
> state = aarch64_get_debug_reg_state (pid_of (current_thread));
> - for (i = aarch64_num_wp_regs - 1; i >= 0; --i)
> - {
> - const unsigned int offset
> - = aarch64_watchpoint_offset (state->dr_ctrl_wp[i]);
> - const unsigned int len = aarch64_watchpoint_length (state->dr_ctrl_wp[i]);
> - const CORE_ADDR addr_watch = state->dr_addr_wp[i] + offset;
> - const CORE_ADDR addr_watch_aligned = align_down (state->dr_addr_wp[i], 8);
> - const CORE_ADDR addr_orig = state->dr_addr_orig_wp[i];
> -
> - if (state->dr_ref_count_wp[i]
> - && DR_CONTROL_ENABLED (state->dr_ctrl_wp[i])
> - && addr_trap >= addr_watch_aligned
> - && addr_trap < addr_watch + len)
> - {
> - /* ADDR_TRAP reports the first address of the memory range
> - accessed by the CPU, regardless of what was the memory
> - range watched. Thus, a large CPU access that straddles
> - the ADDR_WATCH..ADDR_WATCH+LEN range may result in an
> - ADDR_TRAP that is lower than the
> - ADDR_WATCH..ADDR_WATCH+LEN range. E.g.:
> -
> - addr: | 4 | 5 | 6 | 7 | 8 |
> - |---- range watched ----|
> - |----------- range accessed ------------|
> -
> - In this case, ADDR_TRAP will be 4.
> -
> - To match a watchpoint known to GDB core, we must never
> - report *ADDR_P outside of any ADDR_WATCH..ADDR_WATCH+LEN
> - range. ADDR_WATCH <= ADDR_TRAP < ADDR_ORIG is a false
> - positive on kernels older than 4.10. See PR
> - external/20207. */
> - return addr_orig;
> - }
> - }
>
> - return (CORE_ADDR) 0;
> + CORE_ADDR trigger_addr;
> +
> + if (hw_watch_detect_trigger (state, insn, addr_trap, &trigger_addr))
> + return trigger_addr;
> +
> + return 0;
> }
>
> /* Implementation of linux target ops method "low_stopped_by_watchpoint". */
> --
> 2.25.1
>
next prev parent reply other threads:[~2021-06-02 16:57 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-06-02 14:27 Luis Machado
2021-06-02 16:57 ` Alan Hayward [this message]
2021-06-02 17:22 ` Luis Machado
2021-06-03 10:22 ` Alan Hayward
2021-07-23 13:24 ` [PATCH, v2][AArch64] " Luis Machado
2021-08-17 15:30 ` Alan Hayward
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=01A08E0C-97D3-43DA-AACC-05EF8D126714@arm.com \
--to=alan.hayward@arm.com \
--cc=david.spickett@linaro.org \
--cc=gdb-patches@sourceware.org \
--cc=luis.machado@linaro.org \
--cc=nd@arm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).