Re: [PATCH v3 05/16] [gdb/aarch64] sme: Enable SME registers and pseudo-registers

public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed

From: Luis Machado <luis.machado@arm.com>
To: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
Cc: gdb-patches@sourceware.org
Subject: Re: [PATCH v3 05/16] [gdb/aarch64] sme: Enable SME registers and pseudo-registers
Date: Thu, 27 Jul 2023 10:01:50 +0100	[thread overview]
Message-ID: <317fa7b7-800e-87e0-1118-acaf9d1c1008@arm.com> (raw)
In-Reply-To: <87pm4e8ms4.fsf@linaro.org>

On 7/26/23 21:01, Thiago Jung Bauermann wrote:
> 
> Hello,
> 
> Just minor nits...
> 
> 
> Luis Machado via Gdb-patches <gdb-patches@sourceware.org> writes:
> 
>> The SME (Scalable Matrix Extension) [1] exposes a new matrix register ZA with
>> variable sizes.  It also exposes a new mode called streaming mode.
>>
>> Similarly to SVE, the ZA register size is dictated by a vector length, but the
>> SME vector length is called streaming vetor length. The total size for
> 
> s/vetor/vector/
> 

Fixed.

>> diff --git a/gdb/aarch64-linux-nat.c b/gdb/aarch64-linux-nat.c
>> index a775631e9ec..cb97b205c86 100644
>> --- a/gdb/aarch64-linux-nat.c
>> +++ b/gdb/aarch64-linux-nat.c
>> @@ -55,6 +55,7 @@
>>  #include "arch/aarch64-mte-linux.h"
>>  
>>  #include "nat/aarch64-mte-linux-ptrace.h"
>> +#include "arch/aarch64-scalable-linux.h"
>>  
>>  #include <string.h>
>>  
>> @@ -313,8 +314,11 @@ store_fpregs_to_thread (const struct regcache *regcache)
>>      }
>>  }
>>  
>> -/* Fill GDB's register array with the sve register values
>> -   from the current thread.  */
>> +/* Fill GDB's REGCACHE with the valid SVE register values from the current
>> +   thread.
> 
> The function uses the regcache's ptid to identify the thread, so it's
> not from the current thread that it gets the register values.
> 

Yeah, this is a bit confusing. I think what happens here (historically) is that we say current thread
but we mean "this regcache's thread". And copy/pasting over the years perpetuates it.

I've changed it from current thread to the thread associated with REGCACHE.

>> +
>> +   This function handles reading data from SVE or SSVE states, depending
>> +   on which state is active at the moment.  */
>>  
>>  static void
>>  fetch_sveregs_from_thread (struct regcache *regcache)
>> @@ -323,8 +327,11 @@ fetch_sveregs_from_thread (struct regcache *regcache)
>>    aarch64_sve_regs_copy_to_reg_buf (regcache->ptid ().lwp (), regcache);
>>  }
>>  
>> -/* Store to the current thread the valid sve register
>> -   values in the GDB's register array.  */
>> +/* Store the valid SVE register values from GDB's REGCACHE to the current
>> +   thread.
> 
> Same comment about the current thread here.
> 

Fixed.

>> +
>> +   This function handles writing data to SVE or SSVE states, depending
>> +   on which state is active at the moment.  */
>>  
>>  static void
>>  store_sveregs_to_thread (struct regcache *regcache)
>> @@ -334,6 +341,41 @@ store_sveregs_to_thread (struct regcache *regcache)
>>    aarch64_sve_regs_copy_from_reg_buf (regcache->ptid ().lwp (), regcache);
>>  }
>>  
>> +/* Fill GDB's REGCACHE with the ZA register set contents from the
>> +   current thread.  If there is no active ZA registe state, make the
>> +   ZA register contents zero.  */
> 
> Same comment about the current thread here.
> Also, s/registe/register/
> 

Fixed.

>> +
>> +static void
>> +fetch_za_from_thread (struct regcache *regcache)
>> +{
>> +  aarch64_gdbarch_tdep *tdep
>> +    = gdbarch_tdep<aarch64_gdbarch_tdep> (regcache->arch ());
>> +
>> +  /* Read ZA state from the thread to the register cache.  */
>> +  aarch64_za_regs_copy_to_reg_buf (regcache->ptid ().lwp (),
>> +				   regcache,
>> +				   tdep->sme_za_regnum,
>> +				   tdep->sme_svg_regnum,
>> +				   tdep->sme_svcr_regnum);
>> +}
>> +
>> +/* Store the NT_ARM_ZA register set contents from GDB's REGCACHE to the current
>> +   thread.  */
> 
> Same comment about the current thread here.
> 

Fixed.

>> +
>> +static void
>> +store_za_to_thread (struct regcache *regcache)
>> +{
>> +  aarch64_gdbarch_tdep *tdep
>> +    = gdbarch_tdep<aarch64_gdbarch_tdep> (regcache->arch ());
>> +
>> +  /* Write ZA state from the register cache to the thread.  */
>> +  aarch64_za_regs_copy_from_reg_buf (regcache->ptid ().lwp (),
>> +				     regcache,
>> +				     tdep->sme_za_regnum,
>> +				     tdep->sme_svg_regnum,
>> +				     tdep->sme_svcr_regnum);
>> +}
>> +
>> @@ -890,21 +959,27 @@ aarch64_linux_nat_target::thread_architecture (ptid_t ptid)
>>    if (gdbarch_bfd_arch_info (inf->gdbarch)->bits_per_word == 32)
>>      return inf->gdbarch;
>>  
>> -  /* Only return it if the current vector length matches the one in the tdep.  */
>> +  /* Only return the inferior's gdbarch if both vq and svq match the ones in
>> +     the tdep.  */
>>    aarch64_gdbarch_tdep *tdep
>>      = gdbarch_tdep<aarch64_gdbarch_tdep> (inf->gdbarch);
>>    uint64_t vq = aarch64_sve_get_vq (ptid.lwp ());
>> -  if (vq == tdep->vq)
>> +  uint64_t svq = aarch64_za_get_svq (ptid.lwp ());
>> +  if (vq == tdep->vq && svq == tdep->sme_svq)
>>      return inf->gdbarch;
>>  
>> -  /* We reach here if the vector length for the thread is different from its
>> +  /* We reach here if any vector length for the thread is different from its
>>       value at process start.  Lookup gdbarch via info (potentially creating a
>> -     new one) by using a target description that corresponds to the new vq value
>> -     and the current architecture features.  */
>> +     new one) by using a target description that corresponds to the new vq/svq
>> +     value and the current architecture features.  */
>>  
>>    const struct target_desc *tdesc = gdbarch_target_desc (inf->gdbarch);
>>    aarch64_features features = aarch64_features_from_target_desc (tdesc);
>>    features.vq = vq;
>> +  /* We cast to uint8_t here because the features struct can get large, and it
>> +     is important to store the information in as little storage as
>> +     possible.  */
> 
> Is it because features is used as a key in tdesc_aarch64_map? Mentioning
> in the comment why it needs to be small would be useful.
> 

I've changed this to:

/* The svq value in the features struct is stored as uint8_t, so cast it
   properly in here.  */

It makes me wonder if this cast is useful though. Technically we will implicitly cast
it to uint8_t anyway, so the value will be truncated. When we hash things into feature bits, it
will be handled properly.

I'm considering removing the comment altogether and just doing the assignment with the implicit
conversion.

>> +  features.svq = (uint8_t) svq;
>>  
>>    struct gdbarch_info info;
>>    info.bfd_arch_info = bfd_lookup_arch (bfd_arch_aarch64, bfd_mach_aarch64);
> 
>> @@ -2802,6 +3124,122 @@ aarch64_pseudo_read_value_1 (struct gdbarch *gdbarch,
>>    return result_value;
>>   }
>>  
>> +/* Helper function for reading/writing ZA pseudo-registers.  Given REGNUM,
>> +   a ZA pseudo-register number, return, in OFFSETS, the information on positioning
>> +   of the bytes that must be read from/written to.  */
>> +
>> +static void
>> +aarch64_za_offsets_from_regnum (struct gdbarch *gdbarch, int regnum,
>> +				struct za_offsets &offsets)
>> +{
>> +  aarch64_gdbarch_tdep *tdep = gdbarch_tdep<aarch64_gdbarch_tdep> (gdbarch);
>> +
>> +  gdb_assert (tdep->has_sme ());
>> +  gdb_assert (tdep->sme_svq > 0);
>> +  gdb_assert (tdep->sme_pseudo_base <= regnum);
>> +  gdb_assert (regnum < tdep->sme_pseudo_base + tdep->sme_pseudo_count);
>> +
>> +  struct za_pseudo_encoding encoding;
>> +
>> +  /* Decode the ZA pseudo-register number.  */
>> +  aarch64_za_decode_pseudos (gdbarch, regnum, encoding);
>> +
>> +  /* Fetch the streaming vector length.  */
>> +  size_t svl = sve_vl_from_vq (tdep->sme_svq);
>> +
>> +  if (is_sme_tile_slice_pseudo_register (gdbarch, regnum))
>> +    {
>> +      if (encoding.horizontal)
>> +	{
>> +	  /* Horizontal tile slices are contiguous ranges of svl bytes.  */
>> +
>> +	  /* The starting offset depends on the tile index (to locate the tile
>> +	     in the ZA buffer), the slice index (to locate the slice within the
>> +	     tile) and the qualifier.  */
>> +	  offsets.starting_offset
>> +	    = encoding.tile_index * svl + encoding.slice_index
>> +					  * (svl >> encoding.qualifier_index);
>> +	  /* Horizontal tile slice data is contiguous and thus doesn't have
>> +	     a stride.  */
>> +	  offsets.stride_size = 0;
>> +	  /* Horizontal tile slice data is contiguous and thus only has 1
>> +	     chunk.  */
>> +	  offsets.chunks = 1;
>> +	  /* The chunk size is always svl bytes.  */
>> +	  offsets.chunk_size = svl;
>> +	}
>> +      else
>> +	{
>> +	  /* Vertical tile slices are non-contiguous ranges of
>> +	     (1 << qualifier_index) bytes.  */
>> +
>> +	  /* The starting offset depends on the tile number (to locate the
>> +	     tile in the ZA buffer), the slice index (to locate the element
>> +	     within the tile slice) and the qualifier.  */
>> +	  offsets.starting_offset
>> +	    = encoding.tile_index * svl + encoding.slice_index
>> +					  * (1 << encoding.qualifier_index);
>> +	  /* The offset between vertical tile slices depends on the qualifier
>> +	     and svl.  */
>> +	  offsets.stride_size = svl << (encoding.qualifier_index);
> 
> The parentheses above are unnecessary.
> 

Yeah. Artifact of intermediate changes. Fixed all 3 cases now.

>> +	  /* The number of chunks depends on svl and the qualifier size.  */
>> +	  offsets.chunks = svl >> encoding.qualifier_index;
>> +	  /* The chunk size depends on the qualifier.  */
>> +	  offsets.chunk_size = 1 << encoding.qualifier_index;
>> +	}
>> +    }
>> +  else
>> +    {
>> +      /* ZA tile pseudo-register.  */
>> +
>> +      /* Starting offset depends on the tile index and qualifier.  */
>> +      offsets.starting_offset = (encoding.tile_index) * svl;
> 
> The parentheses above are unnecessary.
> 
>> +      /* The offset between tile slices depends on the qualifier and svl.  */
>> +      offsets.stride_size = svl << (encoding.qualifier_index);
> 
> The parentheses above are unnecessary.
> 
>> +      /* The number of chunks depends on the qualifier and svl.  */
>> +      offsets.chunks = svl >> encoding.qualifier_index;
>> +      /* The chunk size is always svl bytes.  */
>> +      offsets.chunk_size = svl;
>> +    }
>> +}
> 
>> +/* See nat/aarch64-scalable-linux-ptrace.h.  */
>> +
>> +bool
>> +write_sve_header (int tid, const struct user_sve_header &header)
>> +{
>> +  struct iovec iovec;
>> +
>> +  iovec.iov_len = sizeof (header);
>> +  iovec.iov_base = (void *) &header;
> 
> Unnecessary cast.
> 

Turns out this is necessary, as otherwise we'll get a complaint about trying to assign a (const void *)
to (void *). Same for the other const struct cases where we write headers.

>> +
>> +  if (ptrace (PTRACE_SETREGSET, tid, NT_ARM_SVE, &iovec) < 0)
>> +    {
>> +      /* SVE is not supported.  */
>> +      return false;
>> +    }
>> +  return true;
>> +}
>> +
>> +/* See nat/aarch64-scalable-linux-ptrace.h.  */
>> +
>> +bool
>> +read_ssve_header (int tid, struct user_sve_header &header)
>> +{
>> +  struct iovec iovec;
>> +
>> +  iovec.iov_len = sizeof (header);
>> +  iovec.iov_base = &header;
>> +
>> +  if (ptrace (PTRACE_GETREGSET, tid, NT_ARM_SSVE, &iovec) < 0)
>> +    {
>> +      /* SSVE is not supported.  */
>> +      return false;
>> +    }
>> +  return true;
>> +}
>> +
>> +/* See nat/aarch64-scalable-linux-ptrace.h.  */
>> +
>> +bool
>> +write_ssve_header (int tid, const struct user_sve_header &header)
>> +{
>> +  struct iovec iovec;
>> +
>> +  iovec.iov_len = sizeof (header);
>> +  iovec.iov_base = (void *) &header;
> 
> Unnecessary cast.
> 
>> +
>> +  if (ptrace (PTRACE_SETREGSET, tid, NT_ARM_SSVE, &iovec) < 0)
>> +    {
>> +      /* SSVE is not supported.  */
>> +      return false;
>> +    }
>> +  return true;
>> +}
>> +
>> +/* See nat/aarch64-scalable-linux-ptrace.h.  */
>> +
>> +bool
>> +read_za_header (int tid, struct user_za_header &header)
>> +{
>> +  struct iovec iovec;
>> +
>> +  iovec.iov_len = sizeof (header);
>> +  iovec.iov_base = &header;
>> +
>> +  if (ptrace (PTRACE_GETREGSET, tid, NT_ARM_ZA, &iovec) < 0)
>> +    {
>> +      /* ZA is not supported.  */
>> +      return false;
>> +    }
>> +  return true;
>> +}
>> +
>> +/* See nat/aarch64-scalable-linux-ptrace.h.  */
>> +
>> +bool
>> +write_za_header (int tid, const struct user_za_header &header)
>> +{
>> +  struct iovec iovec;
>> +
>> +  iovec.iov_len = sizeof (header);
>> +  iovec.iov_base = (void *) &header;
> 
> Unnecessary cast.
> 
>> +
>> +  if (ptrace (PTRACE_SETREGSET, tid, NT_ARM_ZA, &iovec) < 0)
>> +    {
>> +      /* ZA is not supported.  */
>> +      return false;
>> +    }
>> +  return true;
>> +}
> 
>> +/* See nat/aarch64-scalable-linux-ptrace.h.  */
>> +
>> +bool
>> +aarch64_store_za_regset (int tid, const gdb::byte_vector &za_state)
>> +{
>> +  struct iovec iovec;
>> +  iovec.iov_len = za_state.size ();
>> +  iovec.iov_base = (void *) za_state.data ();
> 
> Unnecessary cast.
> 
>> +
>> +  if (ptrace (PTRACE_SETREGSET, tid, NT_ARM_ZA, &iovec) < 0)
>> +    perror_with_name (_("Failed to write to the NT_ARM_ZA register set."));
>> +
>> +  return true;
>> +}
>> +
>> +/* See nat/aarch64-scalable-linux-ptrace.h.  */
>> +
>> +void
>> +aarch64_initialize_za_regset (int tid)
>> +{
>> +  /* First fetch the NT_ARM_ZA header so we can fetch the streaming vector
>> +     length.  */
>> +  struct user_za_header header;
>> +  if (!read_za_header (tid, header))
>> +    error (_("Failed to read NT_ARM_ZA header."));
>> +
>> +  /* The vector should be default-initialized to zero, and we should account
>> +     for the payload as well.  */
>> +  std::vector<gdb_byte> za_new_state (ZA_PT_SIZE (sve_vq_from_vl (header.vl)));
>> +
>> +  /* Adjust the header size since we are adding the initialized ZA
>> +     payload.  */
>> +  header.size = ZA_PT_SIZE (sve_vq_from_vl (header.vl));
>> +
>> +  /* Overlay the modified header onto the new ZA state.  */
>> +  const gdb_byte *base = (gdb_byte *) &header;
>> +  memcpy (za_new_state.data (), base, sizeof (user_za_header));
>> +
>> +  /* Set the ptrace request up and update the NT_ARM_ZA register set.  */
>> +  struct iovec iovec;
>> +  iovec.iov_len = za_new_state.size ();
>> +  iovec.iov_base = (void *) za_new_state.data ();
> 
> Unnecessary cast.
> 

This one is indeed not required. Fixed now.

>> +
>> +  if (ptrace (PTRACE_SETREGSET, tid, NT_ARM_ZA, &iovec) < 0)
>> +    perror_with_name (_("Failed to initialize the NT_ARM_ZA register set."));
>> +
>> +  /* The NT_ARM_ZA register set should now contain a zero-initialized ZA
>> +     payload.  */
>> +}
>> +
> 

Thanks!

next prev parent reply	other threads:[~2023-07-27  9:02 UTC|newest]

Thread overview: 47+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-06-30 13:46 [PATCH v3 00/16] SME support for AArch64 gdb/gdbserver on Linux Luis Machado
2023-06-30 13:46 ` [PATCH v3 01/16] [gdb/aarch64] Fix register fetch/store order for native AArch64 Linux Luis Machado
2023-07-24 15:53   ` Thiago Jung Bauermann
2023-07-24 16:26     ` Luis Machado
2023-06-30 13:46 ` [PATCH v3 02/16] [gdb/aarch64] refactor: Rename SVE-specific files Luis Machado
2023-06-30 13:46 ` [PATCH v3 03/16] [gdb/gdbserver] refactor: Simplify SVE interface to read/write registers Luis Machado
2023-07-24 16:19   ` Thiago Jung Bauermann
2023-07-25  9:28     ` Luis Machado
2023-06-30 13:46 ` [PATCH v3 04/16] [gdb/aarch64] sve: Fix return command when using V registers in a SVE-enabled target Luis Machado
2023-06-30 13:46 ` [PATCH v3 05/16] [gdb/aarch64] sme: Enable SME registers and pseudo-registers Luis Machado
2023-07-26 20:01   ` Thiago Jung Bauermann
2023-07-27  9:01     ` Luis Machado [this message]
2023-07-28  1:19       ` Thiago Jung Bauermann
2023-06-30 13:46 ` [PATCH v3 06/16] [gdbserver/aarch64] refactor: Adjust expedited registers dynamically Luis Machado
2023-06-30 13:46 ` [PATCH v3 07/16] [gdbserver/aarch64] sme: Add support for SME Luis Machado
2023-07-27 19:41   ` Thiago Jung Bauermann
2023-06-30 13:46 ` [PATCH v3 08/16] [gdb/aarch64] sve: Fix signal frame z/v register restore Luis Machado
2023-07-27 21:52   ` Thiago Jung Bauermann
2023-07-31 12:22     ` Luis Machado
2023-06-30 13:46 ` [PATCH v3 09/16] [gdb/aarch64] sme: Signal frame support Luis Machado
2023-07-27 22:25   ` Thiago Jung Bauermann
2023-07-31 12:23     ` Luis Machado
2023-06-30 13:46 ` [PATCH v3 10/16] [gdb/aarch64] sme: Fixup sigframe gdbarch when vg/svg changes Luis Machado
2023-07-28  1:01   ` Thiago Jung Bauermann
2023-07-31 12:27     ` Luis Machado
2023-06-30 13:46 ` [PATCH v3 11/16] [gdb/aarch64] sme: Support TPIDR2 signal frame context Luis Machado
2023-06-30 13:46 ` [PATCH v3 12/16] [gdb/generic] corefile/bug: Use thread-specific gdbarch when dumping register state to core files Luis Machado
2023-06-30 13:46 ` [PATCH v3 13/16] [gdb/generic] corefile/bug: Fixup (gcore) core file target description reading order Luis Machado
2023-07-28  3:12   ` Thiago Jung Bauermann
2023-07-31 11:38     ` Luis Machado
2023-09-05  8:28     ` Luis Machado
2023-06-30 13:46 ` [PATCH v3 14/16] [gdb/aarch64] sme: Core file support for Linux Luis Machado
2023-08-03  0:18   ` Thiago Jung Bauermann
2023-08-03 11:37     ` Luis Machado
2023-08-04 20:45       ` Thiago Jung Bauermann
2023-06-30 13:46 ` [PATCH v3 15/16] [gdb/testsuite] sme: Add SVE/SME testcases Luis Machado
2023-08-04  0:59   ` Thiago Jung Bauermann
2023-08-11 15:42     ` Luis Machado
2023-08-12  0:42       ` Thiago Jung Bauermann
2023-06-30 13:46 ` [PATCH v3 16/16] [gdb/docs] sme: Document SME registers and features Luis Machado
2023-07-01  8:58   ` Eli Zaretskii
2023-07-03  9:52     ` Luis Machado
2023-07-03 12:03       ` Eli Zaretskii
2023-07-03 12:06         ` Luis Machado
2023-07-17 11:40 ` [PING][PATCH v3 00/16] SME support for AArch64 gdb/gdbserver on Linux Luis Machado
2023-07-24  8:15 ` Luis Machado
2023-08-04 21:24 ` [PATCH " Thiago Jung Bauermann

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=317fa7b7-800e-87e0-1118-acaf9d1c1008@arm.com \
    --to=luis.machado@arm.com \
    --cc=gdb-patches@sourceware.org \
    --cc=thiago.bauermann@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).