Re: [PATCH,v2] [AArch64] MTE corefile support

public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed

From: Luis Machado <luis.machado@linaro.org>
To: Simon Marchi <simon.marchi@polymtl.ca>, gdb-patches@sourceware.org
Cc: david.spickett@linaro.org, catalin.marinas@arm.com,
	John Baldwin <jhb@FreeBSD.org>
Subject: Re: [PATCH,v2] [AArch64] MTE corefile support
Date: Mon, 31 May 2021 11:12:22 -0300	[thread overview]
Message-ID: <4ae84e2c-f0c4-bcac-1e5d-b9b34d8521db@linaro.org> (raw)
In-Reply-To: <ce359d1d-9853-06c1-e243-b9b6cb8b4a5a@polymtl.ca>

On 5/29/21 12:14 AM, Simon Marchi wrote:
> On 2021-05-26 10:08 a.m., Luis Machado via Gdb-patches wrote:
>> +/* AArch64 Linux implementation of the decode_memtag_note gdbarch
>> +   hook.  Decode a memory tag note and return the requested tags.
>> +
>> +   The note is guaranteed to cover the [ADDRESS, ADDRESS + length)
>> +   range.  */
>> +
>> +static gdb::byte_vector
>> +aarch64_linux_decode_memtag_note (struct gdbarch *gdbarch,
>> +				  gdb::array_view <const gdb_byte> note,
>> +				  CORE_ADDR address, size_t length)
>> +{
>> +  gdb::byte_vector tags;
>> +
>> +  /* Sanity check.  */
>> +  if (note.size () < MEMTAG_NOTE_HEADER_SIZE)
>> +    {
>> +      warning (_("malformed core note - too short for MTE header"));
> 
> I would suggest indicating the expected minimum size and the actualy
> size.  If that ever fails, this bit of information could help the user.
> 

Fixed now.

>> +      return tags;
>> +    }
>> +
>> +  /* The amount of memory tag granules we need to fetch.  */
>> +  int granules
>> +    = aarch64_mte_get_tag_granules (address, length, AARCH64_MTE_GRANULE_SIZE);
>> +
>> +  /* If there are no tag granules to decode, just return.  */
>> +  if (granules == 0)
>> +    return tags;
>> +
>> +  /* Read the generic header.  */
>> +  enum bfd_endian byte_order = gdbarch_byte_order (gdbarch);
>> +  struct tag_dump_header header;
>> +  const gdb_byte *buf = note.data ();
>> +
>> +  header.format
>> +    = extract_unsigned_integer (buf, sizeof (uint16_t), byte_order);
>> +  buf += sizeof (uint16_t);
>> +
>> +  header.start_vma
>> +    = extract_unsigned_integer (buf, sizeof (uint64_t), byte_order);
>> +  buf += sizeof (uint64_t);
>> +
>> +  header.end_vma
>> +    = extract_unsigned_integer (buf, sizeof (uint64_t), byte_order);
>> +  buf += sizeof (uint64_t);
>> +
>> +  /* Sanity check.  */
>> +  gdb_assert (address + length < header.end_vma);
>> +
>> +  /* Sanity check  */
>> +  if (header.format != NT_MEMTAG_TYPE_AARCH_MTE)
>> +    {
>> +      warning (_("Unexpected memory tag note format (%x).\n"), header.format);
> 
> Here, it's ambiguous (from the point of view of the user) whether the
> value you print is the expected value or actual value.  For clarity,
> might was well print both.
> 

Updated to show the expected format hex value and what we really got.

>> +      return tags;
>> +    }
>> +
>> +  /* Calculate how many granules we need to skip to get to the granule of
>> +     ADDRESS.  Align both the start address and the requested address
>> +     so it is easier to get the number of granules to skip.  This way we
>> +     don't need to consider cases where ADDRESS falls in the middle of a
>> +     tag granule range.  */
>> +  CORE_ADDR aligned_start_address
>> +    = align_down (header.start_vma, AARCH64_MTE_GRANULE_SIZE);
>> +  CORE_ADDR aligned_address = align_down (address, AARCH64_MTE_GRANULE_SIZE);
>> +
>> +  int skipped_granules
>> +    = aarch64_mte_get_tag_granules (aligned_start_address,
>> +				    aligned_address - aligned_start_address,
>> +				    AARCH64_MTE_GRANULE_SIZE);
> 
> aarch64_mte_get_tag_granules returns size_t, so we might as well make
> the local variable size_t...
> 

Updated all the variables now, and the loop variables.

>> +
>> +  /* Point to the block of data that contains the first granule we are
>> +     interested in.  */
>> +  const gdb_byte *tags_data
>> +    = note.data () + MEMTAG_NOTE_HEADER_SIZE + skipped_granules;
> 
> To benefit from (potential, I've been wanting to add that for a while)
> array_view bound_checking, I suggest making an array_view slice instead
> of falling back on a plain pointer.  You should be able to call
> 
>    note.slice (MEMTAG_NOTE_HEADER_SIZE + skipped_granules)

Done now.

> 
> which will get view another array view that starts where you need.
> 
>> +
>> +  /* Read the tag granules.  */
>> +  for (unsigned int i = 0; i < granules; i++)
> 
> ... and the iteration variable size_t too.  This might apply to other
> calls to aarch64_mte_get_tag_granules.
> 

Done.

>> +      tags.push_back (tags_data[i]);
> 
> I feel like we are missing some bound checking here.  Couldn't malformed
> data send us past the array_view's window?
> 

I've added a new check after we read the header, but before reading the 
tag data. It validates the note has enough tag bytes for the specified 
start_vma/end_vma range. It could have more though, in which case I 
don't do anything and assume it is OK as is.

>> +
>> +  return tags;
>> +}
> 
> In the computations above, the code seems to assume that each granule's
> tag takes up one byte.  That is a safe assumption? >

Even though MTE tags are 4 bits in size, we write them down as 1 byte 
tags for simplicity. So it is correct to handle the tag data here as 1 
tag per byte.

There are plans to add the capability of compressing the tag data. That 
would make it smaller. But we're not looking at that at this time.

>> @@ -1862,6 +2049,17 @@ aarch64_linux_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch)
>>   
>>         set_gdbarch_report_signal_info (gdbarch,
>>   				      aarch64_linux_report_signal_info);
>> +
>> +      /* Core file helpers.  */
>> +
>> +      /* Core file helper to create memory tag notes for a particular range of
>> +	 addresses.  */
>> +      set_gdbarch_create_memtag_notes_from_range (gdbarch,
>> +				  aarch64_linux_create_memtag_notes_from_range);
> 
> Indent.
> 

Done.

>> @@ -71,4 +72,18 @@ extern CORE_ADDR aarch64_mte_set_ltag (CORE_ADDR address, CORE_ADDR tag);
>>      It is always possible to get the logical tag.  */
>>   extern CORE_ADDR aarch64_mte_get_ltag (CORE_ADDR address);
>>   
>> +/* MTE-specific NT_MEMTAG header.  */
>> +struct tag_dump_mte
>> +{
>> +  /* Size of the tag granule in bytes.  */
>> +  uint16_t granule_byte_size;
>> +  /* Size of the tag in bits.  */
>> +  uint16_t tag_bit_size;
>> +  /* Reserved field for the future.  */
>> +  uint16_t __unused;
>> +};
> 
> This struct is unused.
> 

I'm inclined to leave that as documentation instead, so it is clear what 
the header sizes are referring to.

>> +
>> +/* Size of the MTE header for a NT_MEMTAG note.  */
>> +#define MEMTAG_MTE_HEADER_SIZE (3 * sizeof (uint16_t))
> 
> NT_MEMTAG_MTE_HEADER_SIZE?  Otherwise there's nothing in the name that
> refers to the fact that it's the header of the note.
> 

Done now.

>> @@ -1115,6 +1122,107 @@ core_target::info_proc (const char *args, enum info_proc_what request)
>>     return true;
>>   }
>>   
>> +/* Implementation of the "supports_memory_tagging" target_ops method.  */
>> +
>> +bool
>> +core_target::supports_memory_tagging ()
>> +{
>> +  /* Look for memory tag notes.  If they exist, that means this core file
>> +     supports memory tagging.  */
>> +  if (bfd_get_section_by_name (core_bfd, ".memtag") == nullptr)
>> +    return false;
>> +
>> +  return true;
> 
> Use:
> 
>    bfd_get_section_by_name (...) != nullptr;
> 

Changed now.

>> +}
>> +
>> +/* Size of the generic header for the NT_MEMTAG note.  This is OS-independent
>> +   and should be shared with OS' and arch-specific code.  */
>> +#define NT_MEMTAG_HEADER_SIZE (sizeof (uint16_t) + 2 * sizeof (uint64_t))
> 
> It's a little strange, NT_MEMTAG_HEADER_SIZE is defined at two places,
> linux-tdep.h and here.  If this is really OS-independent, there should
> probably be just one definition, in an OS-independent header file.  Not
> sure where though...
> 

Right. I ran into this as well. I thought about this a bit, and I guess 
the simplest way is to have a new memtag.h header file and define it there.

It would be generic enough and other OS' would be able to include it as 
well. And it would work for Core files.

>> +
>> +/* Implementation of the "fetch_memtags" target_ops method.  */
>> +
>> +bool
>> +core_target::fetch_memtags (CORE_ADDR address, size_t len,
>> +			    gdb::byte_vector &tags, int type)
>> +{
>> +  struct gdbarch *gdbarch = target_gdbarch ();
>> +
>> +  /* Make sure we have a way to decode the memory tag notes.  */
>> +  if (!gdbarch_decode_memtag_note_p (gdbarch))
>> +    warning (_("gdbarch_decode_memtag_note not implemented for this "
>> +	       "architecture."));
> 
> Just a warning?  You want to continue this function even if this
> happens?
> 

This should've been an error. Thanks.

>> +
>> +  asection *section
>> +    = bfd_get_section_by_name (core_bfd, ".memtag");
> 
> Fits on one line.
> 

Fixed.

>> @@ -1473,6 +1474,137 @@ linux_address_in_memtag_page (CORE_ADDR address)
>>     return false;
>>   }
>>   
>> +/* Helper that checks if an address is in a memory tag page for a core file
>> +   process.  */
>> +
>> +static bool
>> +linux_core_file_address_in_memtag_page (CORE_ADDR address)
>> +{
>> +  if (core_bfd == nullptr)
>> +    return false;
>> +
>> +  asection *section
>> +    = bfd_get_section_by_name (core_bfd, ".memtag");
>> +
>> +  /* Go through all the memtag sections and figure out if ADDRESS
>> +     falls within one of the memory ranges that contain tags.  */
>> +  while (section != nullptr)
>> +    {
>> +      size_t note_size = bfd_section_size (section);
>> +
>> +      /* If the note is smaller than the size of the header, this core note
>> +	 is malformed.  */
>> +      if (note_size < NT_MEMTAG_HEADER_SIZE)
>> +	{
>> +	  warning (_("malformed core note - too short for header"));
>> +	  return false;
>> +	}
>> +

I've updated this warning to show the expected and the actual size of 
the header.

>> +      gdb::byte_vector note (note_size);
>> +
>> +      /* Fetch the contents of this particular memtag note.  */
>> +      if (!bfd_get_section_contents (core_bfd, section,
>> +				     note.data (), 0, note_size))
>> +	{
>> +	  warning (_("could not get core note contents."));
>> +	  return false;
>> +	}
>> +
>> +      /* Read the generic header of the note.  Those contain the format,
>> +	 start address and end address.  */
>> +      uint64_t start_address
>> +	= bfd_get_64 (core_bfd, note.data () + sizeof (uint16_t));
>> +      uint64_t end_address
>> +	= bfd_get_64 (core_bfd, note.data () + sizeof (uint16_t)
>> +				+ sizeof (uint64_t));
> 
> The code of this function approximately up to here is really similar to
> the code in corelow.c.  Is there a change there could be a single
> implementation of it, that takes a callback?  Like
> 
>    for_each_matching_memtag_note_section_matching_address
>      (CORE_ADDR address, gdb::function_view<...> callback)
> 
> except with a better name.
> 

I'm aware of that. I was hoping to leave this cleanup for a later stage.

A complicating factor here is that we need to traverse the NT_MEMTAG 
notes for two slightly different reasons.

The first is to decide if an address (no length specified) falls within 
one of the NT_MEMTAG entries, in which case we just return true/false;

The second is to actually fetch the tags the user has requested (via 
address + length). This may potentially span multiple NT_MEMTAG notes, 
so the callback may be invoked multiple times.

It should be possible to refactor the code to invoke callbacks, but we 
would need to have two callbacks (one for corelow and another for 
linux-tdep). The one for linux-tdep just needs to check if a note exists 
for a particular range, so it would be an useless callback in my opinion.

With that said, I reworked things and moved the common code to a new 
function. This new function goes through the notes and returns whenever 
we hit a note that contains the address we're looking for. Then we have 
a chance to do some processing, return or continue going through notes.

I think this simplifies things.

>> +/* For each memory map entry that has memory tagging enabled, create a new
>> +   core file note that contains all of its memory tags.  Save the data to
>> +   NOTE_DATA and update NOTE_SIZE accordingly.  */
>> +
>> +static void
>> +linux_make_memtag_corefile_notes (struct gdbarch *gdbarch, bfd *obfd,
>> +				  gdb::unique_xmalloc_ptr<char> &note_data,
>> +				  int *note_size)
>> +{
>> +  if (current_inferior ()->fake_pid_p)
>> +    return;
>> +
>> +  /* If the architecture doesn't have a hook to return memory tag notes,
>> +     there is nothing left to do.  */
>> +  if (!gdbarch_create_memtag_notes_from_range_p (gdbarch))
>> +    return;
>> +
>> +  pid_t pid = current_inferior ()->pid;
>> +
>> +  std::string smaps_file = string_printf ("/proc/%d/smaps", pid);
>> +
>> +  gdb::unique_xmalloc_ptr<char> data
>> +    = target_fileio_read_stralloc (NULL, smaps_file.c_str ());
>> +
>> +  if (data == nullptr)
>> +    return;
>> +
>> +  std::vector<struct smaps_data> smaps;
>> +
>> +  /* Parse the contents of smaps into a vector.  */
>> +  smaps = parse_smaps_data (data.get (), smaps_file);
> 
> Declare when assigning:
> 
>    std::vector<smaps_data> smaps = parse_smaps_data (...);
> 

Done.

>> +
>> +  for (const smaps_data &map : smaps)
>> +    {
>> +      /* Does this mapping have memory tagging enabled? If so, save the
>> +	 memory tags to the core file note.  */
>> +      if (map.vmflags.memory_tagging == 0)
>> +	continue;
>> +
>> +      /* Ask the architecture to create (one or more) NT_MEMTAG notes for
>> +	 this particular memory range, including the header.
>> +
>> +	 If the notes are too big, we may need to break up the transfer
>> +	 into smaller chunks.
>> +
>> +	 If the architecture returns an empty vector, that means there are
>> +	 no memory tag notes to write.  */
>> +      std::vector<gdb::byte_vector> memory_tag_notes;
>> +      memory_tag_notes
>> +	= gdbarch_create_memtag_notes_from_range (gdbarch,
>> +						  map.start_address,
>> +						  map.end_address);
> 
> Here too.
> 

Done.

>> +      /* Write notes to the core file.  */
>> +      for (gdb::byte_vector note : memory_tag_notes)
> 
> This copies each byte_vector in the iteration variable - iterate with
> const reference:
> 
>    for (const gdb::byte_vector &note : memory_tag_notes)
> 

Done.

>> diff --git a/gdb/linux-tdep.h b/gdb/linux-tdep.h
>> index 28b60e46579..1ff4d008df4 100644
>> --- a/gdb/linux-tdep.h
>> +++ b/gdb/linux-tdep.h
>> @@ -26,6 +26,19 @@
>>   struct inferior;
>>   struct regcache;
>>   
>> +#define NT_MEMTAG_HEADER_SIZE (sizeof (uint16_t) + 2 * sizeof (uint64_t))
>> +
>> +/* Generic NT_MEMTAG header.  */
>> +struct tag_dump_header
>> +{
>> +  /* Tag format.  */
>> +  uint16_t format;
>> +  /* Start address of the tagged range.  */
>> +  uint64_t start_vma;
>> +  /* End address of the tagged range.  */
>> +  uint64_t end_vma;
>> +};
> 
> This struct is used once, in aarch64_linux_decode_memtag_note, but where
> it's used it's not very useful, it could very well be replaced with 3
> local variables, and then it could be removed.  I'm guessing that an
> earlier version of this patch used this struct to read the data contents
> directly from the buffer, but that didn't work with different byte
> orders?
> 

Like the MTE NT_MEMTAG header, this is mostly for documentation 
purposes. We don't have a lot of use for the header structs right now, 
but may have more uses for them in the future. We may want to pass the 
header around and access it in a simpler way than just using local 
variables and skipping individual sizes.

This is because we're using a packed struct and we don't want to run 
into issues with alignments and padding inserted by the compilers, or 
having to handle endianness differences at the corelow.c level.

I'll do the same as I've done with the NT_MEMTAG_MTE_HEADER_SIZE case 
and will document this struct alongside the definition of 
NT_MEMTAG_HEADER_SIZE (in a new generic file, memtag.h).

>> diff --git a/gdb/testsuite/gdb.arch/aarch64-mte-gcore.exp b/gdb/testsuite/gdb.arch/aarch64-mte-gcore.exp
>> new file mode 100644
>> index 00000000000..bb529a8b369
>> --- /dev/null
>> +++ b/gdb/testsuite/gdb.arch/aarch64-mte-gcore.exp
>> @@ -0,0 +1,115 @@
>> +# Copyright (C) 2018-2021 Free Software Foundation, Inc.
>> +#
>> +# This program is free software; you can redistribute it and/or modify
>> +# it under the terms of the GNU General Public License as published by
>> +# the Free Software Foundation; either version 3 of the License, or
>> +# (at your option) any later version.
>> +#
>> +# This program is distributed in the hope that it will be useful,
>> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> +# GNU General Public License for more details.
>> +#
>> +# You should have received a copy of the GNU General Public License
>> +# along with this program.  If not, see <http://www.gnu.org/licenses/>.
>> +
>> +# This file is part of the gdb testsuite.
>> +
>> +# Test generating and reading a core file with MTE memory tags.
>> +
>> +if {![is_aarch64_target]} {
>> +    verbose "Skipping ${gdb_test_file_name}."
>> +    return
>> +}
>> +
>> +standard_testfile
>> +if { [prepare_for_testing "failed to prepare" ${testfile} ${srcfile}] } {
>> +    return -1
>> +}
>> +
>> +if ![runto_main] {
>> +    untested "could not run to main"
>> +    return -1
>> +}
>> +
>> +# Targets that don't support memory tagging should not execute the
>> +# runtime memory tagging tests.
>> +if {![supports_memtag]} {
>> +    unsupported "memory tagging unsupported"
>> +    return -1
>> +}
>> +
>> +gdb_breakpoint "access_memory"
>> +
>> +if [gdb_continue "access_memory"] {
>> +    return -1
>> +}
>> +
>> +# Set each tag granule to a different tag value, from 0x0 to 0xf.
>> +set atag_msg "Allocation tag\\(s\\) updated successfully\."
>> +for {set i 15} {$i >= 0} {incr i -1} {
>> +    set index [expr [expr 15 - $i] * 16]
>> +    set tag [format "%02x" $i]
>> +    gdb_test "memory-tag set-allocation-tag &tagged_ptr\[$index\] 1 $tag" \
>> +	     $atag_msg \
>> +	     "set memory tag of &tagged_ptr\[$index\] to $tag"
>> +}
>> +
>> +# Run until a crash and confirm GDB displays memory tag violation
>> +# information.
>> +gdb_test "continue" \
>> +    [multi_line \
>> +	"Program received signal SIGSEGV, Segmentation fault" \
>> +	"Memory tag violation while accessing address $hex" \
>> +	"Allocation tag $hex" \
>> +	"Logical tag $hex\." \
>> +	"$hex in access_memory \\(.*\\) at .*" \
>> +	".*tagged_ptr\\\[0\\\] = 'a';"] \
>> +	 "display tag violation information for live process"
>> +
>> +# Generate the core file.
>> +set core_filename [standard_output_file "$testfile.core"]
>> +set core_generated [gdb_gcore_cmd "$core_filename" "generate core file"]
>> +
>> +if { !$core_generated } {
>> +    return -1
>> +}
>> +
>> +clean_restart
>> +
>> +# Load the program file.
>> +set program_filename [standard_output_file $testfile]
>> +set program_loaded [gdb_file_cmd $program_filename]
> 
> Can the two lines above just be replaced with passing $binfile to
> clean_restart?

Done now. Thanks.

next prev parent reply	other threads:[~2021-05-31 14:12 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-18 20:20 [PATCH] " Luis Machado
2021-05-19 10:01 ` David Spickett
2021-05-19 11:11   ` Luis Machado
2021-05-19 12:13 ` Eli Zaretskii
2021-05-21 15:12 ` Alan Hayward
2021-05-21 15:30   ` Luis Machado
2021-05-21 17:20     ` John Baldwin
2021-05-24 13:41       ` Luis Machado
2021-05-24  8:07     ` Alan Hayward
2021-05-24 12:45       ` Luis Machado
2021-05-26 14:08 ` [PATCH,v2] " Luis Machado
2021-05-29  3:14   ` Simon Marchi
2021-05-31 14:12     ` Luis Machado [this message]
2021-05-31 14:49       ` Simon Marchi
2021-05-31 14:56         ` Luis Machado
2021-05-31 14:15   ` [PATCH,v3][AArch64] " Luis Machado
2021-05-31 16:44 ` [PATCH,v4][AArch64] " Luis Machado
2021-06-01 17:45 ` [PATCH,v5][AArch64] " Luis Machado
2021-06-15 14:10   ` [Ping][PATCH,v5][AArch64] " Luis Machado
2021-06-24 14:00   ` [PATCH,v5][AArch64] " Alan Hayward
2021-06-24 14:37     ` Luis Machado
2021-06-24 15:18       ` Alan Hayward
2021-07-01 13:50   ` [PING][PATCH,v5][AArch64] " Luis Machado
2021-07-11 14:22     ` Joel Brobecker
2021-07-14 13:07       ` Catalin Marinas
2021-07-29  2:26         ` Simon Marchi
2021-07-29 16:03           ` John Baldwin
2021-07-29 18:10           ` Catalin Marinas
2021-07-29 18:20             ` Simon Marchi
2021-08-01 15:44               ` Joel Brobecker
2021-08-02 12:06                 ` Luis Machado
2021-07-19 19:05   ` Luis Machado
2021-07-27 16:10   ` Luis Machado
2022-03-31 14:03 [AArch64] " Luis Machado
2022-04-21 15:20 ` [PATCH, v2] " Luis Machado
2022-04-21 15:52   ` Eli Zaretskii
2022-04-22  8:12     ` Luis Machado
2022-04-22  8:30       ` Eli Zaretskii
2022-04-22  8:37         ` Luis Machado
2022-04-22  8:43           ` Eli Zaretskii
2022-04-22  8:44             ` Luis Machado

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4ae84e2c-f0c4-bcac-1e5d-b9b34d8521db@linaro.org \
    --to=luis.machado@linaro.org \
    --cc=catalin.marinas@arm.com \
    --cc=david.spickett@linaro.org \
    --cc=gdb-patches@sourceware.org \
    --cc=jhb@FreeBSD.org \
    --cc=simon.marchi@polymtl.ca \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).