Re: [PATCH, v4] [AArch64] MTE corefile support

public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed

From: Luis Machado <luis.machado@arm.com>
To: Pedro Alves <pedro@palves.net>, gdb-patches@sourceware.org
Subject: Re: [PATCH, v4] [AArch64] MTE corefile support
Date: Mon, 11 Jul 2022 11:13:44 +0100	[thread overview]
Message-ID: <6d32ac14-992f-fe6c-b09e-adfecb214dba@arm.com> (raw)
In-Reply-To: <c529aa29-1942-b196-7387-3f8c84f3158c@palves.net>

Hi,

On 6/27/22 15:51, Pedro Alves wrote:
> Hi Luis,
> 
> On 2022-05-03 22:56, Luis Machado via Gdb-patches wrote:
>> v4:
>>
>> - Updated documentation (added cross-references).
>> - Updated the segment name from PT_ARM_MEMTAG_MTE to
>>    PT_AARCH64_MEMTAG_MTE.
>>
>> v3:
>>
>> - Updated NEWS and documentation to be more thorough.
>>
>> v2:
>>
>> - Rework memory tag section handling to use generic section fields.
>>
>> --
>>
>> Teach GDB how to dump memory tags for AArch64 when using the gcore command
>> and how to read memory tag data back from a core file generated by GDB
>> (via gcore) or by the Linux kernel.
>>
>> The format is documented in the Linux Kernel documentation [1].
>>
>> Each tagged memory range (listed in /proc/<pid>/smaps) gets dumped to its
>> own PT_AARCH64_MEMTAG_MTE segment.
> 
> ...
> 
>> A section named ".memtag" is created for each
>> of those segments when reading the core file back.
> 
> It's "memtag", not ".memtag", AFAICT.
> 

Yep. I had fixed this already in a local version for an upcoming v5.

>> +/* AArch64 Linux implementation of the gdbarch_create_memtag_section hook.  */
>> +
>> +static asection *
>> +aarch64_linux_create_memtag_section (struct gdbarch *gdbarch, bfd *obfd,
>> +				     CORE_ADDR address, size_t size)
>> +{
>> +  gdb_assert (obfd != nullptr);
>> +  gdb_assert (size > 0);
>> +
>> +  /* Create the section and associated program header.  */
>> +  asection *mte_section = bfd_make_section_anyway (obfd, "memtag");
>> +
>> +  if (mte_section == nullptr)
>> +    return nullptr;
>> +
>> +  bfd_set_section_vma (mte_section, address);
>> +  /* The size of the memory range covered by the memory tags.  We reuse the
>> +     section's rawsize field for this purpose.  */
>> +  mte_section->rawsize = size;
>> +  /* Tags are stored packed as 2 tags per byte.  */
>> +  bfd_set_section_size (mte_section, (size / AARCH64_MTE_GRANULE_SIZE) / 2);
> 
> Shouldn't the /2 division round up, rather than down, to account for odd number
> of tags?  I.e., this:
> 
>     bfd_set_section_size (mte_section, (size / AARCH64_MTE_GRANULE_SIZE + 1) / 2);
> 

Indeed. I think we just got away with it because we can only allocate maps sized in multiples of the page size.

So that will guarantee we will also have an exact division here. It is not quite correct for an odd number
of tag granules. I'll fix this.

>> +  /* Make sure the section's flags has SEC_HAS_CONTENTS, otherwise BFD will
>> +     refuse to write data to this section.  */
>> +  bfd_set_section_flags (mte_section, SEC_HAS_CONTENTS);
> 
> How about using bfd_make_section_anyway_with_flags further above?
> 

Should work fine. Done now.

>> +
>> +  /* Store program header information.  */
>> +  bfd_record_phdr (obfd, PT_AARCH64_MEMTAG_MTE, 1, 0, 0, 0, 0, 0, 1,
>> +		   &mte_section);
>> +
>> +  return mte_section;
>> +}
>> +
> 
> 
>> +
>> +/* AArch64 Linux implementation of the gdbarch_fill_memtag_section hook.  */
>> +
>> +static bool
>> +aarch64_linux_fill_memtag_section (struct gdbarch *gdbarch, asection *osec)
>> +{
>> +  /* We only handle MTE tags for now.  */
>> +
>> +  size_t segment_size = osec->rawsize;
>> +  CORE_ADDR start_address = bfd_section_vma (osec);
>> +  CORE_ADDR end_address = start_address + segment_size;
>> +
>> +  /* Figure out how many tags we need to store in this memory range.  */
>> +  size_t granules = aarch64_mte_get_tag_granules (start_address, segment_size,
>> +						  AARCH64_MTE_GRANULE_SIZE);
>> +
>> +  /* If there are no tag granules to fetch, just return.  */
>> +  if (granules == 0)
>> +    return true;
>> +
>> +  CORE_ADDR address = start_address;
>> +
>> +  /* Vector of tags.  */
>> +  gdb::byte_vector tags;
>> +
>> +  while (granules > 0)
>> +    {
>> +      /* Transfer tags in chunks.  */
>> +      gdb::byte_vector tags_read;
>> +      size_t xfer_len
>> +	= (granules >= MAX_TAGS_TO_TRANSFER)?
>> +	  MAX_TAGS_TO_TRANSFER * AARCH64_MTE_GRANULE_SIZE :
>> +	  granules * AARCH64_MTE_GRANULE_SIZE;
> 
> '?' and ':' go on the next line, like other operators.  Wrap expression in
> parens for proper alignment, per GNU standards.  So:
> 
>    = ((granules >= MAX_TAGS_TO_TRANSFER)
>       ? MAX_TAGS_TO_TRANSFER * AARCH64_MTE_GRANULE_SIZE
>       : granules * AARCH64_MTE_GRANULE_SIZE);
> 
> (note how you can read "?" as "then" and ":" as "else", this way.)
> 

Fixed the formatting now. Thanks.

>> +
>> +      if (!target_fetch_memtags (address, xfer_len, tags_read,
>> +				 static_cast<int> (memtag_type::allocation)))
>> +	{
>> +	  warning (_("Failed to read MTE tags from memory range [%s,%s)."),
>> +		     phex_nz (start_address, sizeof (start_address)),
>> +		     phex_nz (end_address, sizeof (end_address)));
>> +	  return false;
>> +	}
>> +
>> +      /* Transfer over the tags that have been read.  */
>> +      tags.insert (tags.end (), tags_read.begin (), tags_read.end ());
> 
> (Just a passing comment, feel free to ignore: If target_fetch_memtags were adjusted to
> append to the buffer instead of clearing it, you could pass "tags" directly to
> target_fetch_memtags and get rid of this copying.)
> 

I think it is a reasonable idea. I'm not keen on chaging it up at this point, but that would be a good
improvement. I just need to make sure the callers all cleanup the tags vector before passing it onwards
to the functions.

>> +
>> +      /* Adjust the remaining granules and starting address.  */
>> +      granules -= tags_read.size ();
>> +      address += tags_read.size () * AARCH64_MTE_GRANULE_SIZE;
>> +    }
>> +
>> +  /* Pack the MTE tag bits.  */
>> +  aarch64_mte_pack_tags (tags);
>> +
>> +  if (!bfd_set_section_contents (osec->owner, osec, tags.data (),
>> +				 0, tags.size ()))
>> +    {
>> +      warning (_("Failed to write %s bytes of corefile memory "
>> +		 "tag content (%s)."),
>> +	       pulongest (tags.size ()),
>> +	       bfd_errmsg (bfd_get_error ()));
>> +    }
>> +  return true;
>> +}
>> +
> 
>> +
>> +/* See arch/aarch64-mte-linux.h */
>> +
>>   size_t
>>   aarch64_mte_get_tag_granules (CORE_ADDR addr, size_t len, size_t granule_size)
>>   {
>> diff --git a/gdb/arch/aarch64-mte-linux.h b/gdb/arch/aarch64-mte-linux.h
>> index d158926feff..8a145b447aa 100644
>> --- a/gdb/arch/aarch64-mte-linux.h
>> +++ b/gdb/arch/aarch64-mte-linux.h
>> @@ -32,6 +32,7 @@
>>   
>>   /* We have one tag per 16 bytes of memory.  */
>>   #define AARCH64_MTE_GRANULE_SIZE 16
>> +#define AARCH64_MTE_TAG_BIT_SIZE 4
>>   #define AARCH64_MTE_LOGICAL_TAG_START_BIT 56
>>   #define AARCH64_MTE_LOGICAL_MAX_VALUE 0xf
>>   
>> @@ -71,4 +72,13 @@ extern CORE_ADDR aarch64_mte_set_ltag (CORE_ADDR address, CORE_ADDR tag);
>>      It is always possible to get the logical tag.  */
>>   extern CORE_ADDR aarch64_mte_get_ltag (CORE_ADDR address);
>>   
>> +/* Given a TAGS vector containing 1 MTE tag per byte, pack the data as
>> +   2 tags per byte and resize the vector.  */
>> +void aarch64_mte_pack_tags (gdb::byte_vector &tags);
>> +
>> +/* Given a TAGS vector containing 2 MTE tags per byte, unpack the data as
>> +   1 tag per byte and resize the vector.  If SKIP_FIRST is TRUE, skip the
>> +   first unpacked element.  Otherwise leave it in the unpacked vector.  */
>> +void aarch64_mte_unpack_tags (gdb::byte_vector &tags, bool skip_first);
> 
> All other declarations in the header have explicit "extern".
> 

Done now. Thanks.

>> +
>>   #endif /* ARCH_AARCH64_LINUX_H */
> 
> 
>> +/* Implementation of the "fetch_memtags" target_ops method.  */
>> +
>> +bool
>> +core_target::fetch_memtags (CORE_ADDR address, size_t len,
>> +			    gdb::byte_vector &tags, int type)
>> +{
>> +  struct gdbarch *gdbarch = target_gdbarch ();
>> +
>> +  /* Make sure we have a way to decode the memory tag notes.  */
>> +  if (!gdbarch_decode_memtag_section_p (gdbarch))
>> +    error (_("gdbarch_decode_memtag_section not implemented for this "
>> +	     "architecture."));
>> +
>> +  memtag_section_info info;
>> +  info.memtag_section = nullptr;
>> +
>> +  while (get_next_core_memtag_section (core_bfd, info.memtag_section,
>> +				       address, info))
>> +  {
>> +    size_t adjusted_length
>> +      = (address + len < info.end_address)? len : (info.end_address - address);
> 
> Space before "?".
> 

Fixed.

>> +
>> +    /* Decode the memory tag note and return the tags.  */
>> +    gdb::byte_vector tags_read
>> +      = gdbarch_decode_memtag_section (gdbarch, info.memtag_section, type,
>> +				       address, adjusted_length);
>> +
>> +    /* Transfer over the tags that have been read.  */
>> +    tags.insert (tags.end (), tags_read.begin (), tags_read.end ());
>> +
>> +    /* ADDRESS + LEN may cross the boundaries of a particular memory tag
>> +       segment.  Check if we need to fetch tags from a different section.  */
>> +    if (!tags_read.empty () && (address + len) < info.end_address)
>> +      return true;
>> +
>> +    /* There are more tags to fetch.  Update ADDRESS and LEN.  */
>> +    len -= (info.end_address - address);
>> +    address = info.end_address;
>> +  }
>> +
>> +  return false;
>> +}
>> +
> +
> 
> 
>> +
>> +bool
>> +get_next_core_memtag_section (bfd *abfd, asection *section,
>> +			      CORE_ADDR address, memtag_section_info &info)
>> +{
>> +  /* If the caller provided no SECTION to start from, search from the
>> +     beginning.  */
>> +  if (section == nullptr)
>> +    section = bfd_get_section_by_name (abfd, "memtag");
>> +
>> +  /* Go through all the memtag sections and figure out if ADDRESS
>> +     falls within one of the memory ranges that contain tags.  */
>> +  while (section != nullptr)
>> +    {
>> +      size_t memtag_range_size = section->rawsize;
>> +      size_t tags_size = bfd_section_size (section);
>> +
>> +      /* Empty memory range and empty tag dump should not happen.  */
>> +      gdb_assert (memtag_range_size != 0);
>> +      gdb_assert (tags_size != 0);
> 
> Seems like this could happen with corrupted input?  GDB should not assert
> on bad input.
> 

I turned this into a warning and then we skip to the next section.

>> +
>> +      CORE_ADDR start_address = bfd_section_vma (section);
>> +      CORE_ADDR end_address = start_address + memtag_range_size;
>> +
>> +      /* Is the address within [start_address, end_address)?  */
>> +      if (address >= start_address
>> +	  && address < end_address)
>> +	{
>> +	  info.start_address = start_address;
>> +	  info.end_address = end_address;
>> +	  info.memtag_section = section;
>> +	  return true;
>> +	}
>> +      section = bfd_get_next_section_by_name (abfd, section);
>> +    }
>> +  return false;
>> +}
> 
> 
>> --- /dev/null
>> +++ b/gdb/testsuite/gdb.arch/aarch64-mte-gcore.c
>> @@ -0,0 +1,93 @@
>> +/* This test program is part of GDB, the GNU debugger.
>> +
>> +   Copyright 2021 Free Software Foundation, Inc.
> 
> 2022 missing.
> 
> 

Fixed now.

>> diff --git a/gdb/testsuite/gdb.arch/aarch64-mte-gcore.exp b/gdb/testsuite/gdb.arch/aarch64-mte-gcore.exp
>> new file mode 100644
>> index 00000000000..8a19c4b449e
>> --- /dev/null
>> +++ b/gdb/testsuite/gdb.arch/aarch64-mte-gcore.exp
>> @@ -0,0 +1,107 @@
>> +# Copyright (C) 2018-2021 Free Software Foundation, Inc.
> 
> 2021 -> 2022.
> 
>> +# Run until a crash and confirm GDB displays memory tag violation
>> +# information.
>> +gdb_test "continue" \
>> +    [multi_line \
>> +	"Program received signal SIGSEGV, Segmentation fault" \
>> +	"Memory tag violation while accessing address $hex" \
>> +	"Allocation tag $hex" \
>> +	"Logical tag $hex\." \
>> +	"$hex in access_memory \\(.*\\) at .*" \
>> +	".*tagged_ptr\\\[0\\\] = 'a';"] \
>> +	 "display tag violation information for live process"
>> +
>> +# Generate the core file.
>> +set core_filename [standard_output_file "$testfile.core"]
>> +set core_generated [gdb_gcore_cmd "$core_filename" "generate core file"]
> 
> I'd suggest also making sure gdb is able to read kernel-generated core properly.
> You can use core_find to produce one.

I spent a bit of time doing this. I wasn't aware of core_find. The testcase now
exercises both gcore and native core files.

v5 of this patch is on its way.

next prev parent reply	other threads:[~2022-07-11 10:13 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-31 14:03 Luis Machado
2022-04-21 15:20 ` [PATCH, v2] " Luis Machado
2022-04-21 15:52   ` Eli Zaretskii
2022-04-22  8:12     ` Luis Machado
2022-04-22  8:30       ` Eli Zaretskii
2022-04-22  8:37         ` Luis Machado
2022-04-22  8:43           ` Eli Zaretskii
2022-04-22  8:44             ` Luis Machado
2022-04-22 13:27 ` [PATCH 1/2] " Luis Machado
2022-04-22 13:33   ` Eli Zaretskii
2022-04-22 13:30 ` [PATCH, v3] " Luis Machado
2022-05-03 21:56 ` [PATCH, v4] " Luis Machado
2022-05-12 10:36   ` Luis Machado
2022-05-18 12:46   ` Luis Machado
2022-05-18 13:58     ` John Baldwin
2022-05-23  9:50       ` Luis Machado
2022-05-23  9:49     ` Luis Machado
2022-06-06  9:28   ` Luis Machado
2022-06-06  9:42     ` Kuan-Ying Lee
2022-06-06  9:47       ` Luis Machado
2022-06-06  9:54         ` Kuan-Ying Lee
2022-06-06 10:49     ` Eli Zaretskii
2022-06-22  9:04   ` Luis Machado
2022-06-27 14:51   ` Pedro Alves
2022-07-11 10:13     ` Luis Machado [this message]
2022-07-11 10:57   ` [PATCH] [AArch64,v5] " Luis Machado
2022-07-18 13:54     ` Pedro Alves
2022-07-19 14:25       ` Luis Machado
  -- strict thread matches above, loose matches on Subject: below --
2021-05-18 20:20 [PATCH] [AArch64] " Luis Machado
2021-05-31 16:44 ` [PATCH,v4][AArch64] " Luis Machado

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6d32ac14-992f-fe6c-b09e-adfecb214dba@arm.com \
    --to=luis.machado@arm.com \
    --cc=gdb-patches@sourceware.org \
    --cc=pedro@palves.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).