From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <alves.ped@gmail.com>
Received: from mail-wm1-f49.google.com (mail-wm1-f49.google.com
 [209.85.128.49])
 by sourceware.org (Postfix) with ESMTPS id E246B38387B6
 for <gdb-patches@sourceware.org>; Mon, 27 Jun 2022 14:51:03 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org E246B38387B6
Authentication-Results: sourceware.org;
 dmarc=none (p=none dis=none) header.from=palves.net
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com
Received: by mail-wm1-f49.google.com with SMTP id
 h14-20020a1ccc0e000000b0039eff745c53so5784963wmb.5
 for <gdb-patches@sourceware.org>; Mon, 27 Jun 2022 07:51:03 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20210112;
 h=x-gm-message-state:message-id:date:mime-version:user-agent:subject
 :content-language:to:references:from:in-reply-to
 :content-transfer-encoding;
 bh=DPbasDGTe0FunhTN8EeOn9hPe7cUwAICMgpS0IiqVcQ=;
 b=cfm7+LbTPAJo/9eXITuv4gtR1Fpi8FRrpYfuZZ9yEqBSenClOL62FFlNHm14QflYZq
 2343beLKjfMq7Io6Ds8iI5D1gUmzU4KXnZWltBsGXrsX4ZVDz+FgILhwOFG3w9ebymNT
 O7ib70H/l4iF9pDAE045KaGwTfbpFoPDH7HEZjipUzFiVh7KiHa3lLXFu5ZBz+b6mz/7
 F25rfcbITqrBiK6IDd2/TbKjOlq52sLA+Mk4YDWrvrh72bRP4IBXADxupCVBtl3NJu6/
 JZRk+e3l3DhozNRAYcANyiwOTAO33su6YT6APr0w1n6AmYP3DR3aaoWwzNrr0WRi0c9N
 jN9g==
X-Gm-Message-State: AJIora9X924U+lJLV7d7wvDeCGhEqS+hdv4pfikQI9IqO5Fj/WR+StIG
 UolOfb5VSoPaXHv1fxz4+XoT13RnQPM=
X-Google-Smtp-Source: AGRyM1tE18zt6mXAMiZrLjkFCohtxW2MSOSeD/jyA7A48X5W10tSjGjiLja1d4SwVnFxf37kkBpgWw==
X-Received: by 2002:a1c:7517:0:b0:3a0:2d36:4dcc with SMTP id
 o23-20020a1c7517000000b003a02d364dccmr21032285wmc.21.1656341462669; 
 Mon, 27 Jun 2022 07:51:02 -0700 (PDT)
Received: from ?IPV6:2001:8a0:f924:2600:5b14:8ad0:780f:bdda?
 ([2001:8a0:f924:2600:5b14:8ad0:780f:bdda])
 by smtp.gmail.com with ESMTPSA id
 bk20-20020a0560001d9400b0021b8b998ca5sm10142426wrb.107.2022.06.27.07.51.01
 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128);
 Mon, 27 Jun 2022 07:51:02 -0700 (PDT)
Message-ID: <c529aa29-1942-b196-7387-3f8c84f3158c@palves.net>
Date: Mon, 27 Jun 2022 15:51:00 +0100
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101
 Thunderbird/91.10.0
Subject: Re: [PATCH, v4] [AArch64] MTE corefile support
Content-Language: en-US
To: Luis Machado <luis.machado@arm.com>, gdb-patches@sourceware.org
References: <20220331140343.9047-1-luis.machado@arm.com>
 <20220503215632.914608-1-luis.machado@arm.com>
From: Pedro Alves <pedro@palves.net>
In-Reply-To: <20220503215632.914608-1-luis.machado@arm.com>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
X-Spam-Status: No, score=-10.9 required=5.0 tests=BAYES_00,
 FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0,
 HEADER_FROM_DIFFERENT_DOMAINS, KAM_DMARC_STATUS, NICE_REPLY_A,
 RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_PASS, TXREP,
 T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
 server2.sourceware.org
X-BeenThere: gdb-patches@sourceware.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gdb-patches mailing list <gdb-patches.sourceware.org>
List-Unsubscribe: <https://sourceware.org/mailman/options/gdb-patches>,
 <mailto:gdb-patches-request@sourceware.org?subject=unsubscribe>
List-Archive: <https://sourceware.org/pipermail/gdb-patches/>
List-Post: <mailto:gdb-patches@sourceware.org>
List-Help: <mailto:gdb-patches-request@sourceware.org?subject=help>
List-Subscribe: <https://sourceware.org/mailman/listinfo/gdb-patches>,
 <mailto:gdb-patches-request@sourceware.org?subject=subscribe>
X-List-Received-Date: Mon, 27 Jun 2022 14:51:09 -0000

Hi Luis,

On 2022-05-03 22:56, Luis Machado via Gdb-patches wrote:
> v4:
> 
> - Updated documentation (added cross-references).
> - Updated the segment name from PT_ARM_MEMTAG_MTE to
>   PT_AARCH64_MEMTAG_MTE.
> 
> v3:
> 
> - Updated NEWS and documentation to be more thorough.
> 
> v2:
> 
> - Rework memory tag section handling to use generic section fields.
> 
> --
> 
> Teach GDB how to dump memory tags for AArch64 when using the gcore command
> and how to read memory tag data back from a core file generated by GDB
> (via gcore) or by the Linux kernel.
> 
> The format is documented in the Linux Kernel documentation [1].
> 
> Each tagged memory range (listed in /proc/<pid>/smaps) gets dumped to its
> own PT_AARCH64_MEMTAG_MTE segment. 

...

> A section named ".memtag" is created for each
> of those segments when reading the core file back.

It's "memtag", not ".memtag", AFAICT.

> +/* AArch64 Linux implementation of the gdbarch_create_memtag_section hook.  */
> +
> +static asection *
> +aarch64_linux_create_memtag_section (struct gdbarch *gdbarch, bfd *obfd,
> +				     CORE_ADDR address, size_t size)
> +{
> +  gdb_assert (obfd != nullptr);
> +  gdb_assert (size > 0);
> +
> +  /* Create the section and associated program header.  */
> +  asection *mte_section = bfd_make_section_anyway (obfd, "memtag");
> +
> +  if (mte_section == nullptr)
> +    return nullptr;
> +
> +  bfd_set_section_vma (mte_section, address);
> +  /* The size of the memory range covered by the memory tags.  We reuse the
> +     section's rawsize field for this purpose.  */
> +  mte_section->rawsize = size;
> +  /* Tags are stored packed as 2 tags per byte.  */
> +  bfd_set_section_size (mte_section, (size / AARCH64_MTE_GRANULE_SIZE) / 2);

Shouldn't the /2 division round up, rather than down, to account for odd number
of tags?  I.e., this:

   bfd_set_section_size (mte_section, (size / AARCH64_MTE_GRANULE_SIZE + 1) / 2);

> +  /* Make sure the section's flags has SEC_HAS_CONTENTS, otherwise BFD will
> +     refuse to write data to this section.  */
> +  bfd_set_section_flags (mte_section, SEC_HAS_CONTENTS);

How about using bfd_make_section_anyway_with_flags further above?

> +
> +  /* Store program header information.  */
> +  bfd_record_phdr (obfd, PT_AARCH64_MEMTAG_MTE, 1, 0, 0, 0, 0, 0, 1,
> +		   &mte_section);
> +
> +  return mte_section;
> +}
> +


> +
> +/* AArch64 Linux implementation of the gdbarch_fill_memtag_section hook.  */
> +
> +static bool
> +aarch64_linux_fill_memtag_section (struct gdbarch *gdbarch, asection *osec)
> +{
> +  /* We only handle MTE tags for now.  */
> +
> +  size_t segment_size = osec->rawsize;
> +  CORE_ADDR start_address = bfd_section_vma (osec);
> +  CORE_ADDR end_address = start_address + segment_size;
> +
> +  /* Figure out how many tags we need to store in this memory range.  */
> +  size_t granules = aarch64_mte_get_tag_granules (start_address, segment_size,
> +						  AARCH64_MTE_GRANULE_SIZE);
> +
> +  /* If there are no tag granules to fetch, just return.  */
> +  if (granules == 0)
> +    return true;
> +
> +  CORE_ADDR address = start_address;
> +
> +  /* Vector of tags.  */
> +  gdb::byte_vector tags;
> +
> +  while (granules > 0)
> +    {
> +      /* Transfer tags in chunks.  */
> +      gdb::byte_vector tags_read;
> +      size_t xfer_len
> +	= (granules >= MAX_TAGS_TO_TRANSFER)?
> +	  MAX_TAGS_TO_TRANSFER * AARCH64_MTE_GRANULE_SIZE :
> +	  granules * AARCH64_MTE_GRANULE_SIZE;

'?' and ':' go on the next line, like other operators.  Wrap expression in
parens for proper alignment, per GNU standards.  So:

  = ((granules >= MAX_TAGS_TO_TRANSFER)
     ? MAX_TAGS_TO_TRANSFER * AARCH64_MTE_GRANULE_SIZE
     : granules * AARCH64_MTE_GRANULE_SIZE);

(note how you can read "?" as "then" and ":" as "else", this way.)

> +
> +      if (!target_fetch_memtags (address, xfer_len, tags_read,
> +				 static_cast<int> (memtag_type::allocation)))
> +	{
> +	  warning (_("Failed to read MTE tags from memory range [%s,%s)."),
> +		     phex_nz (start_address, sizeof (start_address)),
> +		     phex_nz (end_address, sizeof (end_address)));
> +	  return false;
> +	}
> +
> +      /* Transfer over the tags that have been read.  */
> +      tags.insert (tags.end (), tags_read.begin (), tags_read.end ());

(Just a passing comment, feel free to ignore: If target_fetch_memtags were adjusted to
append to the buffer instead of clearing it, you could pass "tags" directly to
target_fetch_memtags and get rid of this copying.)

> +
> +      /* Adjust the remaining granules and starting address.  */
> +      granules -= tags_read.size ();
> +      address += tags_read.size () * AARCH64_MTE_GRANULE_SIZE;
> +    }
> +
> +  /* Pack the MTE tag bits.  */
> +  aarch64_mte_pack_tags (tags);
> +
> +  if (!bfd_set_section_contents (osec->owner, osec, tags.data (),
> +				 0, tags.size ()))
> +    {
> +      warning (_("Failed to write %s bytes of corefile memory "
> +		 "tag content (%s)."),
> +	       pulongest (tags.size ()),
> +	       bfd_errmsg (bfd_get_error ()));
> +    }
> +  return true;
> +}
> +

> +
> +/* See arch/aarch64-mte-linux.h */
> +
>  size_t
>  aarch64_mte_get_tag_granules (CORE_ADDR addr, size_t len, size_t granule_size)
>  {
> diff --git a/gdb/arch/aarch64-mte-linux.h b/gdb/arch/aarch64-mte-linux.h
> index d158926feff..8a145b447aa 100644
> --- a/gdb/arch/aarch64-mte-linux.h
> +++ b/gdb/arch/aarch64-mte-linux.h
> @@ -32,6 +32,7 @@
>  
>  /* We have one tag per 16 bytes of memory.  */
>  #define AARCH64_MTE_GRANULE_SIZE 16
> +#define AARCH64_MTE_TAG_BIT_SIZE 4
>  #define AARCH64_MTE_LOGICAL_TAG_START_BIT 56
>  #define AARCH64_MTE_LOGICAL_MAX_VALUE 0xf
>  
> @@ -71,4 +72,13 @@ extern CORE_ADDR aarch64_mte_set_ltag (CORE_ADDR address, CORE_ADDR tag);
>     It is always possible to get the logical tag.  */
>  extern CORE_ADDR aarch64_mte_get_ltag (CORE_ADDR address);
>  
> +/* Given a TAGS vector containing 1 MTE tag per byte, pack the data as
> +   2 tags per byte and resize the vector.  */
> +void aarch64_mte_pack_tags (gdb::byte_vector &tags);
> +
> +/* Given a TAGS vector containing 2 MTE tags per byte, unpack the data as
> +   1 tag per byte and resize the vector.  If SKIP_FIRST is TRUE, skip the
> +   first unpacked element.  Otherwise leave it in the unpacked vector.  */
> +void aarch64_mte_unpack_tags (gdb::byte_vector &tags, bool skip_first);

All other declarations in the header have explicit "extern".

> +
>  #endif /* ARCH_AARCH64_LINUX_H */


> +/* Implementation of the "fetch_memtags" target_ops method.  */
> +
> +bool
> +core_target::fetch_memtags (CORE_ADDR address, size_t len,
> +			    gdb::byte_vector &tags, int type)
> +{
> +  struct gdbarch *gdbarch = target_gdbarch ();
> +
> +  /* Make sure we have a way to decode the memory tag notes.  */
> +  if (!gdbarch_decode_memtag_section_p (gdbarch))
> +    error (_("gdbarch_decode_memtag_section not implemented for this "
> +	     "architecture."));
> +
> +  memtag_section_info info;
> +  info.memtag_section = nullptr;
> +
> +  while (get_next_core_memtag_section (core_bfd, info.memtag_section,
> +				       address, info))
> +  {
> +    size_t adjusted_length
> +      = (address + len < info.end_address)? len : (info.end_address - address);

Space before "?".

> +
> +    /* Decode the memory tag note and return the tags.  */
> +    gdb::byte_vector tags_read
> +      = gdbarch_decode_memtag_section (gdbarch, info.memtag_section, type,
> +				       address, adjusted_length);
> +
> +    /* Transfer over the tags that have been read.  */
> +    tags.insert (tags.end (), tags_read.begin (), tags_read.end ());
> +
> +    /* ADDRESS + LEN may cross the boundaries of a particular memory tag
> +       segment.  Check if we need to fetch tags from a different section.  */
> +    if (!tags_read.empty () && (address + len) < info.end_address)
> +      return true;
> +
> +    /* There are more tags to fetch.  Update ADDRESS and LEN.  */
> +    len -= (info.end_address - address);
> +    address = info.end_address;
> +  }
> +
> +  return false;
> +}
> +
+


> +
> +bool
> +get_next_core_memtag_section (bfd *abfd, asection *section,
> +			      CORE_ADDR address, memtag_section_info &info)
> +{
> +  /* If the caller provided no SECTION to start from, search from the
> +     beginning.  */
> +  if (section == nullptr)
> +    section = bfd_get_section_by_name (abfd, "memtag");
> +
> +  /* Go through all the memtag sections and figure out if ADDRESS
> +     falls within one of the memory ranges that contain tags.  */
> +  while (section != nullptr)
> +    {
> +      size_t memtag_range_size = section->rawsize;
> +      size_t tags_size = bfd_section_size (section);
> +
> +      /* Empty memory range and empty tag dump should not happen.  */
> +      gdb_assert (memtag_range_size != 0);
> +      gdb_assert (tags_size != 0);

Seems like this could happen with corrupted input?  GDB should not assert
on bad input.

> +
> +      CORE_ADDR start_address = bfd_section_vma (section);
> +      CORE_ADDR end_address = start_address + memtag_range_size;
> +
> +      /* Is the address within [start_address, end_address)?  */
> +      if (address >= start_address
> +	  && address < end_address)
> +	{
> +	  info.start_address = start_address;
> +	  info.end_address = end_address;
> +	  info.memtag_section = section;
> +	  return true;
> +	}
> +      section = bfd_get_next_section_by_name (abfd, section);
> +    }
> +  return false;
> +}


> --- /dev/null
> +++ b/gdb/testsuite/gdb.arch/aarch64-mte-gcore.c
> @@ -0,0 +1,93 @@
> +/* This test program is part of GDB, the GNU debugger.
> +
> +   Copyright 2021 Free Software Foundation, Inc.

2022 missing.


> diff --git a/gdb/testsuite/gdb.arch/aarch64-mte-gcore.exp b/gdb/testsuite/gdb.arch/aarch64-mte-gcore.exp
> new file mode 100644
> index 00000000000..8a19c4b449e
> --- /dev/null
> +++ b/gdb/testsuite/gdb.arch/aarch64-mte-gcore.exp
> @@ -0,0 +1,107 @@
> +# Copyright (C) 2018-2021 Free Software Foundation, Inc.

2021 -> 2022.

> +# Run until a crash and confirm GDB displays memory tag violation
> +# information.
> +gdb_test "continue" \
> +    [multi_line \
> +	"Program received signal SIGSEGV, Segmentation fault" \
> +	"Memory tag violation while accessing address $hex" \
> +	"Allocation tag $hex" \
> +	"Logical tag $hex\." \
> +	"$hex in access_memory \\(.*\\) at .*" \
> +	".*tagged_ptr\\\[0\\\] = 'a';"] \
> +	 "display tag violation information for live process"
> +
> +# Generate the core file.
> +set core_filename [standard_output_file "$testfile.core"]
> +set core_generated [gdb_gcore_cmd "$core_filename" "generate core file"]

I'd suggest also making sure gdb is able to read kernel-generated core properly.
You can use core_find to produce one.