[PATCH][AArch64] SVE/FPSIMD fixup for big endian

public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed

* [PATCH][AArch64] SVE/FPSIMD fixup for big endian
@ 2020-11-30 18:55 Luis Machado
  2020-12-01 11:28 ` Alan Hayward
  2020-12-02 17:57 ` [PATCH,v2] " Luis Machado
  0 siblings, 2 replies; 13+ messages in thread
From: Luis Machado @ 2020-11-30 18:55 UTC (permalink / raw)
  To: gdb-patches

The FPSIMD dump in signal frames and ptrace FPSIMD dump in the SVE context
structure follows the target endianness, whereas the SVE dumps are
endianness-independent (LE).

Therefore, when the system is in BE mode, we need to reverse the bytes
for the FPSIMD data.

Given the V registers are larger than 64-bit, I've added a way for value
bytes to be set, as opposed to passing a 64-bit fixed quantity. This fits
nicely with the unwinding *_got_bytes function and makes the trad-frame
more flexible and capable of saving larger registers.

The memory for the bytes is allocated via the frame obstack, so it gets freed
after we're done inspecting the frame.

gdb/ChangeLog:

YYYY-MM-DD  Luis Machado  <luis.machado@linaro.org>

	* aarch64-linux-tdep.c (aarch64_linux_restore_vreg) New function.
	(aarch64_linux_sigframe_init): Call aarch64_linux_restore_vreg.
	* nat/aarch64-sve-linux-ptrace.c: Include endian.h.
	(aarch64_maybe_swab128): New function.
	(aarch64_sve_regs_copy_to_reg_buf)
	(aarch64_sve_regs_copy_from_reg_buf): Adjust FPSIMD entries.
	* trad-frame.c (trad_frame_reset_saved_regs): Initialize
	the data field.
	(TF_REG_VALUE_BYTES): New enum value.
	(trad_frame_value_bytes_p): New function.
	(trad_frame_set_value_bytes): New function.
	(trad_frame_set_reg_value_bytes): New function.
	(trad_frame_get_prev_register): Handle register values saved as bytes.
	* trad-frame.h (trad_frame_set_reg_value_bytes): New prototype.
	(struct trad_frame_saved_reg) <data>: New field.
	(trad_frame_set_value_bytes): New prototype.
	(trad_frame_value_bytes_p): New prototype.
---
 gdb/aarch64-linux-tdep.c           | 115 ++++++++++++++++++++++++-----
 gdb/nat/aarch64-sve-linux-ptrace.c |  57 +++++++++++++-
 gdb/trad-frame.c                   |  46 +++++++++++-
 gdb/trad-frame.h                   |  19 +++++
 4 files changed, 213 insertions(+), 24 deletions(-)

diff --git a/gdb/aarch64-linux-tdep.c b/gdb/aarch64-linux-tdep.c
index c9898bdafd..108f96be71 100644
--- a/gdb/aarch64-linux-tdep.c
+++ b/gdb/aarch64-linux-tdep.c
@@ -180,6 +180,94 @@ read_aarch64_ctx (CORE_ADDR ctx_addr, enum bfd_endian byte_order,
   return magic;
 }
 
+/* Given CACHE, use the trad_frame* functions to restore the FPSIMD
+   registers from a signal frame.
+
+   VREG_NUM is the number of the V register being restored, OFFSET is the
+   address containing the register value, BYTE_ORDER is the endianness and
+   HAS_SVE tells us if we have a valid SVE context or not.  */
+
+static void
+aarch64_linux_restore_vreg (struct trad_frame_cache *cache, int num_regs,
+			    int vreg_num, CORE_ADDR offset,
+			    enum bfd_endian byte_order, bool has_sve)
+{
+  /* WARNING: SIMD state is laid out in memory in target-endian format, while
+     SVE state is laid out in an endianness-independent format (LE).
+
+     So we have a couple cases to consider:
+
+     1 - If the target is big endian, then SIMD state is big endian,
+     requiring a byteswap.
+
+     2 - If the target is little endian, then SIMD state is little endian,
+     which matches the SVE format, so no byteswap is needed. */
+
+  if (byte_order == BFD_ENDIAN_BIG)
+    {
+      gdb_byte buf[V_REGISTER_SIZE];
+
+      if (target_read_memory (offset, buf, V_REGISTER_SIZE) != 0)
+	{
+	  size_t size = V_REGISTER_SIZE/2;
+
+	  /* Read the two halves of the V register in reverse byte order.  */
+	  CORE_ADDR u64 = extract_unsigned_integer (buf, size,
+						    byte_order);
+	  CORE_ADDR l64 = extract_unsigned_integer (buf + size, size,
+						    byte_order);
+
+	  /* Copy the reversed bytes to the buffer.  */
+	  store_unsigned_integer (buf, size, BFD_ENDIAN_LITTLE, l64);
+	  store_unsigned_integer (buf + size , size, BFD_ENDIAN_LITTLE, u64);
+
+	  /* Now we can store the correct bytes for the V register.  */
+	  trad_frame_set_reg_value_bytes (cache, AARCH64_V0_REGNUM + vreg_num,
+					  buf, V_REGISTER_SIZE);
+	  trad_frame_set_reg_value_bytes (cache,
+					  num_regs + AARCH64_Q0_REGNUM
+					  + vreg_num, buf, Q_REGISTER_SIZE);
+	  trad_frame_set_reg_value_bytes (cache,
+					  num_regs + AARCH64_D0_REGNUM
+					  + vreg_num, buf, D_REGISTER_SIZE);
+	  trad_frame_set_reg_value_bytes (cache,
+					  num_regs + AARCH64_S0_REGNUM
+					  + vreg_num, buf, S_REGISTER_SIZE);
+	  trad_frame_set_reg_value_bytes (cache,
+					  num_regs + AARCH64_H0_REGNUM
+					  + vreg_num, buf, H_REGISTER_SIZE);
+	  trad_frame_set_reg_value_bytes (cache,
+					  num_regs + AARCH64_B0_REGNUM
+					  + vreg_num, buf, B_REGISTER_SIZE);
+
+	  if (has_sve)
+	    trad_frame_set_reg_value_bytes (cache,
+					    num_regs + AARCH64_SVE_V0_REGNUM
+					    + vreg_num, buf, V_REGISTER_SIZE);
+	}
+      return;
+    }
+
+  /* Little endian, just point at the address containing the register
+     value.  */
+  trad_frame_set_reg_addr (cache, AARCH64_V0_REGNUM + vreg_num, offset);
+  trad_frame_set_reg_addr (cache, num_regs + AARCH64_Q0_REGNUM + vreg_num,
+			   offset);
+  trad_frame_set_reg_addr (cache, num_regs + AARCH64_D0_REGNUM + vreg_num,
+			   offset);
+  trad_frame_set_reg_addr (cache, num_regs + AARCH64_S0_REGNUM + vreg_num,
+			   offset);
+  trad_frame_set_reg_addr (cache, num_regs + AARCH64_H0_REGNUM + vreg_num,
+			   offset);
+  trad_frame_set_reg_addr (cache, num_regs + AARCH64_B0_REGNUM + vreg_num,
+			   offset);
+
+  if (has_sve)
+    trad_frame_set_reg_addr (cache, num_regs + AARCH64_SVE_V0_REGNUM
+			     + vreg_num, offset);
+
+}
+
 /* Implement the "init" method of struct tramp_frame.  */
 
 static void
@@ -332,27 +420,16 @@ aarch64_linux_sigframe_init (const struct tramp_frame *self,
 
       /* If there was no SVE section then set up the V registers.  */
       if (sve_regs == 0)
-	for (int i = 0; i < 32; i++)
-	  {
-	    CORE_ADDR offset = (fpsimd + AARCH64_FPSIMD_V0_OFFSET
+	{
+	  for (int i = 0; i < 32; i++)
+	    {
+	      CORE_ADDR offset = (fpsimd + AARCH64_FPSIMD_V0_OFFSET
 				  + (i * AARCH64_FPSIMD_VREG_SIZE));
 
-	    trad_frame_set_reg_addr (this_cache, AARCH64_V0_REGNUM + i, offset);
-	    trad_frame_set_reg_addr (this_cache,
-				     num_regs + AARCH64_Q0_REGNUM + i, offset);
-	    trad_frame_set_reg_addr (this_cache,
-				     num_regs + AARCH64_D0_REGNUM + i, offset);
-	    trad_frame_set_reg_addr (this_cache,
-				     num_regs + AARCH64_S0_REGNUM + i, offset);
-	    trad_frame_set_reg_addr (this_cache,
-				     num_regs + AARCH64_H0_REGNUM + i, offset);
-	    trad_frame_set_reg_addr (this_cache,
-				     num_regs + AARCH64_B0_REGNUM + i, offset);
-	    if (tdep->has_sve ())
-	      trad_frame_set_reg_addr (this_cache,
-				       num_regs + AARCH64_SVE_V0_REGNUM + i,
-				       offset);
-	  }
+	      aarch64_linux_restore_vreg (this_cache, num_regs, i, offset,
+					  byte_order, tdep->has_sve ());
+	    }
+	}
     }
 
   trad_frame_set_id (this_cache, frame_id_build (sp, func));
diff --git a/gdb/nat/aarch64-sve-linux-ptrace.c b/gdb/nat/aarch64-sve-linux-ptrace.c
index 2ce90ccfd7..9ef5e91801 100644
--- a/gdb/nat/aarch64-sve-linux-ptrace.c
+++ b/gdb/nat/aarch64-sve-linux-ptrace.c
@@ -26,6 +26,7 @@
 #include "arch/aarch64.h"
 #include "gdbsupport/common-regcache.h"
 #include "gdbsupport/byte-vector.h"
+#include <endian.h>
 
 /* See nat/aarch64-sve-linux-ptrace.h.  */
 
@@ -142,6 +143,27 @@ aarch64_sve_get_sveregs (int tid)
   return buf;
 }
 
+/* If we are running in BE mode, convert the contents
+   of VALUE (a 16 byte buffer) to LE.  */
+
+static void
+aarch64_maybe_swab128 (gdb_byte *value)
+{
+  gdb_assert (value != nullptr);
+
+#if (__BYTE_ORDER == __BIG_ENDIAN)
+  gdb_byte copy[16];
+
+  /* Save the original value.  */
+  memcpy (copy, value, 16);
+
+  for (int i = 0; i < 15; i++)
+    value[i] = copy[15 - i];
+#else
+  /* Nothing to be done.  */
+#endif
+}
+
 /* See nat/aarch64-sve-linux-ptrace.h.  */
 
 void
@@ -184,11 +206,22 @@ aarch64_sve_regs_copy_to_reg_buf (struct reg_buffer_common *reg_buf,
     }
   else
     {
+      /* WARNING: SIMD state is laid out in memory in target-endian format,
+	 while SVE state is laid out in an endianness-independent format (LE).
+
+	 So we have a couple cases to consider:
+
+	 1 - If the target is big endian, then SIMD state is big endian,
+	 requiring a byteswap.
+
+	 2 - If the target is little endian, then SIMD state is little endian,
+	 which matches the SVE format, so no byteswap is needed. */
+
       /* There is no SVE state yet - the register dump contains a fpsimd
 	 structure instead.  These registers still exist in the hardware, but
 	 the kernel has not yet initialised them, and so they will be null.  */
 
-      char *zero_reg = (char *) alloca (SVE_PT_SVE_ZREG_SIZE (vq));
+      gdb_byte *zero_reg = (gdb_byte *) alloca (SVE_PT_SVE_ZREG_SIZE (vq));
       struct user_fpsimd_state *fpsimd
 	= (struct user_fpsimd_state *)(base + SVE_PT_FPSIMD_OFFSET);
 
@@ -199,7 +232,9 @@ aarch64_sve_regs_copy_to_reg_buf (struct reg_buffer_common *reg_buf,
 
       for (int i = 0; i < AARCH64_SVE_Z_REGS_NUM; i++)
 	{
-	  memcpy (zero_reg, &fpsimd->vregs[i], sizeof (__int128_t));
+	  memcpy (zero_reg, &fpsimd->vregs[i], 16);
+	  /* Handle big endian/little endian SIMD/SVE conversion.  */
+	  aarch64_maybe_swab128 (zero_reg);
 	  reg_buf->raw_supply (AARCH64_SVE_Z0_REGNUM + i, zero_reg);
 	}
 
@@ -240,7 +275,7 @@ aarch64_sve_regs_copy_from_reg_buf (const struct reg_buffer_common *reg_buf,
 	 kernel, which is why we try to avoid it.  */
 
       bool has_sve_state = false;
-      char *zero_reg = (char *) alloca (SVE_PT_SVE_ZREG_SIZE (vq));
+      gdb_byte *zero_reg = (gdb_byte *) alloca (SVE_PT_SVE_ZREG_SIZE (vq));
       struct user_fpsimd_state *fpsimd
 	= (struct user_fpsimd_state *)(base + SVE_PT_FPSIMD_OFFSET);
 
@@ -274,6 +309,18 @@ aarch64_sve_regs_copy_from_reg_buf (const struct reg_buffer_common *reg_buf,
 	 write out state and return.  */
       if (!has_sve_state)
 	{
+	  /* WARNING: SIMD state is laid out in memory in target-endian format,
+	     while SVE state is laid out in an endianness-independent format
+	     (LE).
+
+	     So we have a couple cases to consider:
+
+	     1 - If the target is big endian, then SIMD state is big endian,
+	     requiring a byteswap.
+
+	     2 - If the target is little endian, then SIMD state is little
+	     endian, which matches the SVE format, so no byteswap is needed. */
+
 	  /* The collects of the Z registers will overflow the size of a vreg.
 	     There is enough space in the structure to allow for this, but we
 	     cannot overflow into the next register as we might not be
@@ -285,7 +332,9 @@ aarch64_sve_regs_copy_from_reg_buf (const struct reg_buffer_common *reg_buf,
 		  == reg_buf->get_register_status (AARCH64_SVE_Z0_REGNUM + i))
 		{
 		  reg_buf->raw_collect (AARCH64_SVE_Z0_REGNUM + i, zero_reg);
-		  memcpy (&fpsimd->vregs[i], zero_reg, sizeof (__int128_t));
+		  /* Handle big endian/little endian SIMD/SVE conversion.  */
+		  aarch64_maybe_swab128 (zero_reg);
+		  memcpy (&fpsimd->vregs[i], zero_reg, 16);
 		}
 	    }
 
diff --git a/gdb/trad-frame.c b/gdb/trad-frame.c
index a6a84790a9..8a1aa818ad 100644
--- a/gdb/trad-frame.c
+++ b/gdb/trad-frame.c
@@ -56,6 +56,7 @@ trad_frame_reset_saved_regs (struct gdbarch *gdbarch,
     {
       regs[regnum].realreg = regnum;
       regs[regnum].addr = -1;
+      regs[regnum].data = nullptr;
     }
 }
 
@@ -83,7 +84,7 @@ trad_frame_alloc_saved_regs (struct frame_info *this_frame)
   return trad_frame_alloc_saved_regs (gdbarch);
 }
 
-enum { TF_REG_VALUE = -1, TF_REG_UNKNOWN = -2 };
+enum { TF_REG_VALUE = -1, TF_REG_UNKNOWN = -2, TF_REG_VALUE_BYTES = -3 };
 
 int
 trad_frame_value_p (struct trad_frame_saved_reg this_saved_regs[], int regnum)
@@ -106,6 +107,16 @@ trad_frame_realreg_p (struct trad_frame_saved_reg this_saved_regs[],
 	  && this_saved_regs[regnum].addr == -1);
 }
 
+/* See trad-frame.h.  */
+
+bool
+trad_frame_value_bytes_p (struct trad_frame_saved_reg this_saved_regs[],
+			  int regnum)
+{
+  return (this_saved_regs[regnum].realreg == TF_REG_VALUE_BYTES
+	  && this_saved_regs[regnum].data != nullptr);
+}
+
 void
 trad_frame_set_value (struct trad_frame_saved_reg this_saved_regs[],
 		      int regnum, LONGEST val)
@@ -224,6 +235,35 @@ trad_frame_set_unknown (struct trad_frame_saved_reg this_saved_regs[],
   this_saved_regs[regnum].addr = -1;
 }
 
+/* See trad-frame.h.  */
+
+void
+trad_frame_set_value_bytes (struct trad_frame_saved_reg this_saved_regs[],
+			    int regnum, const gdb_byte *bytes,
+			    size_t size)
+{
+  this_saved_regs[regnum].realreg = TF_REG_VALUE_BYTES;
+
+  /* Allocate the space and copy the data bytes.  */
+  this_saved_regs[regnum].data = FRAME_OBSTACK_CALLOC (size, gdb_byte);
+  memcpy (this_saved_regs[regnum].data, bytes, size);
+}
+
+/* See trad-frame.h.  */
+
+void
+trad_frame_set_reg_value_bytes (struct trad_frame_cache *this_trad_cache,
+				int regnum, const gdb_byte *bytes,
+				size_t size)
+{
+  /* External interface for users of trad_frame_cache
+     (who cannot access the prev_regs object directly).  */
+  trad_frame_set_value_bytes (this_trad_cache->prev_regs, regnum, bytes,
+			      size);
+}
+
+
+
 struct value *
 trad_frame_get_prev_register (struct frame_info *this_frame,
 			      struct trad_frame_saved_reg this_saved_regs[],
@@ -240,6 +280,10 @@ trad_frame_get_prev_register (struct frame_info *this_frame,
     /* The register's value is available.  */
     return frame_unwind_got_constant (this_frame, regnum,
 				      this_saved_regs[regnum].addr);
+  else if (trad_frame_value_bytes_p (this_saved_regs, regnum))
+    /* The register's value is available as a sequence of bytes.  */
+    return frame_unwind_got_bytes (this_frame, regnum,
+				   this_saved_regs[regnum].data);
   else
     return frame_unwind_got_optimized (this_frame, regnum);
 }
diff --git a/gdb/trad-frame.h b/gdb/trad-frame.h
index 7b5785616e..38db439579 100644
--- a/gdb/trad-frame.h
+++ b/gdb/trad-frame.h
@@ -52,6 +52,12 @@ void trad_frame_set_reg_regmap (struct trad_frame_cache *this_trad_cache,
 void trad_frame_set_reg_value (struct trad_frame_cache *this_cache,
 			       int regnum, LONGEST val);
 
+/* Given the cache in THIS_TRAD_CACHE, set the value of REGNUM to the bytes
+   contained in BYTES with size SIZE.  */
+void trad_frame_set_reg_value_bytes (struct trad_frame_cache *this_trad_cache,
+				     int regnum, const gdb_byte *bytes,
+				     size_t size);
+
 struct value *trad_frame_get_register (struct trad_frame_cache *this_trad_cache,
 				       struct frame_info *this_frame,
 				       int regnum);
@@ -86,6 +92,8 @@ struct trad_frame_saved_reg
 {
   LONGEST addr; /* A CORE_ADDR fits in a longest.  */
   int realreg;
+  /* Register data (for values that don't fit in ADDR).  */
+  gdb_byte *data;
 };
 
 /* Encode REGNUM value in the trad-frame.  */
@@ -104,6 +112,12 @@ void trad_frame_set_addr (struct trad_frame_saved_reg this_trad_cache[],
 void trad_frame_set_unknown (struct trad_frame_saved_reg this_saved_regs[],
 			     int regnum);
 
+/* Encode REGNUM value in the trad-frame as a sequence of bytes.  This is
+   useful when the value is larger than what primitive types can hold.  */
+void trad_frame_set_value_bytes (struct trad_frame_saved_reg this_saved_regs[],
+				 int regnum, const gdb_byte *bytes,
+				 size_t size);
+
 /* Convenience functions, return non-zero if the register has been
    encoded as specified.  */
 int trad_frame_value_p (struct trad_frame_saved_reg this_saved_regs[],
@@ -113,6 +127,11 @@ int trad_frame_addr_p (struct trad_frame_saved_reg this_saved_regs[],
 int trad_frame_realreg_p (struct trad_frame_saved_reg this_saved_regs[],
 			  int regnum);
 
+/* Return TRUE if REGNUM is stored as a sequence of bytes, and FALSE
+   otherwise.  */
+bool trad_frame_value_bytes_p (struct trad_frame_saved_reg this_saved_regs[],
+			      int regnum);
+
 /* Reset the save regs cache, setting register values to -1.  */
 void trad_frame_reset_saved_regs (struct gdbarch *gdbarch,
 				  struct trad_frame_saved_reg *regs);
-- 
2.25.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH][AArch64] SVE/FPSIMD fixup for big endian
  2020-11-30 18:55 [PATCH][AArch64] SVE/FPSIMD fixup for big endian Luis Machado
@ 2020-12-01 11:28 ` Alan Hayward
  2020-12-01 12:19   ` Luis Machado
  2020-12-02 17:57 ` [PATCH,v2] " Luis Machado
  1 sibling, 1 reply; 13+ messages in thread
From: Alan Hayward @ 2020-12-01 11:28 UTC (permalink / raw)
  To: Luis Machado; +Cc: gdb-patches\@sourceware.org, nd



> On 30 Nov 2020, at 18:55, Luis Machado <luis.machado@linaro.org> wrote:
> 
> The FPSIMD dump in signal frames and ptrace FPSIMD dump in the SVE context
> structure follows the target endianness, whereas the SVE dumps are
> endianness-independent (LE).
> 
> Therefore, when the system is in BE mode, we need to reverse the bytes
> for the FPSIMD data.
> 
> Given the V registers are larger than 64-bit, I've added a way for value
> bytes to be set, as opposed to passing a 64-bit fixed quantity. This fits
> nicely with the unwinding *_got_bytes function and makes the trad-frame
> more flexible and capable of saving larger registers.
> 
> The memory for the bytes is allocated via the frame obstack, so it gets freed
> after we're done inspecting the frame.
> 
> gdb/ChangeLog:
> 
> YYYY-MM-DD  Luis Machado  <luis.machado@linaro.org>
> 
> 	* aarch64-linux-tdep.c (aarch64_linux_restore_vreg) New function.
> 	(aarch64_linux_sigframe_init): Call aarch64_linux_restore_vreg.
> 	* nat/aarch64-sve-linux-ptrace.c: Include endian.h.
> 	(aarch64_maybe_swab128): New function.
> 	(aarch64_sve_regs_copy_to_reg_buf)
> 	(aarch64_sve_regs_copy_from_reg_buf): Adjust FPSIMD entries.
> 	* trad-frame.c (trad_frame_reset_saved_regs): Initialize
> 	the data field.
> 	(TF_REG_VALUE_BYTES): New enum value.
> 	(trad_frame_value_bytes_p): New function.
> 	(trad_frame_set_value_bytes): New function.
> 	(trad_frame_set_reg_value_bytes): New function.
> 	(trad_frame_get_prev_register): Handle register values saved as bytes.
> 	* trad-frame.h (trad_frame_set_reg_value_bytes): New prototype.
> 	(struct trad_frame_saved_reg) <data>: New field.
> 	(trad_frame_set_value_bytes): New prototype.
> 	(trad_frame_value_bytes_p): New prototype.
> ---
> gdb/aarch64-linux-tdep.c           | 115 ++++++++++++++++++++++++-----
> gdb/nat/aarch64-sve-linux-ptrace.c |  57 +++++++++++++-
> gdb/trad-frame.c                   |  46 +++++++++++-
> gdb/trad-frame.h                   |  19 +++++
> 4 files changed, 213 insertions(+), 24 deletions(-)
> 
> diff --git a/gdb/aarch64-linux-tdep.c b/gdb/aarch64-linux-tdep.c
> index c9898bdafd..108f96be71 100644
> --- a/gdb/aarch64-linux-tdep.c
> +++ b/gdb/aarch64-linux-tdep.c
> @@ -180,6 +180,94 @@ read_aarch64_ctx (CORE_ADDR ctx_addr, enum bfd_endian byte_order,
>   return magic;
> }
> 
> +/* Given CACHE, use the trad_frame* functions to restore the FPSIMD
> +   registers from a signal frame.
> +
> +   VREG_NUM is the number of the V register being restored, OFFSET is the
> +   address containing the register value, BYTE_ORDER is the endianness and
> +   HAS_SVE tells us if we have a valid SVE context or not.  */
> +
> +static void
> +aarch64_linux_restore_vreg (struct trad_frame_cache *cache, int num_regs,
> +			    int vreg_num, CORE_ADDR offset,
> +			    enum bfd_endian byte_order, bool has_sve)
> +{
> +  /* WARNING: SIMD state is laid out in memory in target-endian format, while
> +     SVE state is laid out in an endianness-independent format (LE).
> +
> +     So we have a couple cases to consider:
> +
> +     1 - If the target is big endian, then SIMD state is big endian,
> +     requiring a byteswap.
> +
> +     2 - If the target is little endian, then SIMD state is little endian,
> +     which matches the SVE format, so no byteswap is needed. */
> +

With this function, we are only handling FPSIMD values, so no need to mention SVE.
As it is now, it makes the has_sve parts confusing because they are being treated
the same as the rest of the fpsimd.

The same comment is fine when used elsewhere in the patch.


> +  if (byte_order == BFD_ENDIAN_BIG)
> +    {
> +      gdb_byte buf[V_REGISTER_SIZE];
> +
> +      if (target_read_memory (offset, buf, V_REGISTER_SIZE) != 0)
> +	{
> +	  size_t size = V_REGISTER_SIZE/2;
> +
> +	  /* Read the two halves of the V register in reverse byte order.  */
> +	  CORE_ADDR u64 = extract_unsigned_integer (buf, size,
> +						    byte_order);
> +	  CORE_ADDR l64 = extract_unsigned_integer (buf + size, size,
> +						    byte_order);
> +
> +	  /* Copy the reversed bytes to the buffer.  */
> +	  store_unsigned_integer (buf, size, BFD_ENDIAN_LITTLE, l64);
> +	  store_unsigned_integer (buf + size , size, BFD_ENDIAN_LITTLE, u64);
> +
> +	  /* Now we can store the correct bytes for the V register.  */
> +	  trad_frame_set_reg_value_bytes (cache, AARCH64_V0_REGNUM + vreg_num,
> +					  buf, V_REGISTER_SIZE);
> +	  trad_frame_set_reg_value_bytes (cache,
> +					  num_regs + AARCH64_Q0_REGNUM
> +					  + vreg_num, buf, Q_REGISTER_SIZE);
> +	  trad_frame_set_reg_value_bytes (cache,
> +					  num_regs + AARCH64_D0_REGNUM
> +					  + vreg_num, buf, D_REGISTER_SIZE);
> +	  trad_frame_set_reg_value_bytes (cache,
> +					  num_regs + AARCH64_S0_REGNUM
> +					  + vreg_num, buf, S_REGISTER_SIZE);
> +	  trad_frame_set_reg_value_bytes (cache,
> +					  num_regs + AARCH64_H0_REGNUM
> +					  + vreg_num, buf, H_REGISTER_SIZE);
> +	  trad_frame_set_reg_value_bytes (cache,
> +					  num_regs + AARCH64_B0_REGNUM
> +					  + vreg_num, buf, B_REGISTER_SIZE);
> +
> +	  if (has_sve)
> +	    trad_frame_set_reg_value_bytes (cache,
> +					    num_regs + AARCH64_SVE_V0_REGNUM
> +					    + vreg_num, buf, V_REGISTER_SIZE);
> +	}
> +      return;
> +    }
> +
> +  /* Little endian, just point at the address containing the register
> +     value.  */
> +  trad_frame_set_reg_addr (cache, AARCH64_V0_REGNUM + vreg_num, offset);
> +  trad_frame_set_reg_addr (cache, num_regs + AARCH64_Q0_REGNUM + vreg_num,
> +			   offset);
> +  trad_frame_set_reg_addr (cache, num_regs + AARCH64_D0_REGNUM + vreg_num,
> +			   offset);
> +  trad_frame_set_reg_addr (cache, num_regs + AARCH64_S0_REGNUM + vreg_num,
> +			   offset);
> +  trad_frame_set_reg_addr (cache, num_regs + AARCH64_H0_REGNUM + vreg_num,
> +			   offset);
> +  trad_frame_set_reg_addr (cache, num_regs + AARCH64_B0_REGNUM + vreg_num,
> +			   offset);
> +
> +  if (has_sve)
> +    trad_frame_set_reg_addr (cache, num_regs + AARCH64_SVE_V0_REGNUM
> +			     + vreg_num, offset);
> +
> +}
> +
> /* Implement the "init" method of struct tramp_frame.  */
> 
> static void
> @@ -332,27 +420,16 @@ aarch64_linux_sigframe_init (const struct tramp_frame *self,
> 
>       /* If there was no SVE section then set up the V registers.  */
>       if (sve_regs == 0)
> -	for (int i = 0; i < 32; i++)
> -	  {
> -	    CORE_ADDR offset = (fpsimd + AARCH64_FPSIMD_V0_OFFSET
> +	{
> +	  for (int i = 0; i < 32; i++)
> +	    {
> +	      CORE_ADDR offset = (fpsimd + AARCH64_FPSIMD_V0_OFFSET
> 				  + (i * AARCH64_FPSIMD_VREG_SIZE));
> 
> -	    trad_frame_set_reg_addr (this_cache, AARCH64_V0_REGNUM + i, offset);
> -	    trad_frame_set_reg_addr (this_cache,
> -				     num_regs + AARCH64_Q0_REGNUM + i, offset);
> -	    trad_frame_set_reg_addr (this_cache,
> -				     num_regs + AARCH64_D0_REGNUM + i, offset);
> -	    trad_frame_set_reg_addr (this_cache,
> -				     num_regs + AARCH64_S0_REGNUM + i, offset);
> -	    trad_frame_set_reg_addr (this_cache,
> -				     num_regs + AARCH64_H0_REGNUM + i, offset);
> -	    trad_frame_set_reg_addr (this_cache,
> -				     num_regs + AARCH64_B0_REGNUM + i, offset);
> -	    if (tdep->has_sve ())
> -	      trad_frame_set_reg_addr (this_cache,
> -				       num_regs + AARCH64_SVE_V0_REGNUM + i,
> -				       offset);
> -	  }
> +	      aarch64_linux_restore_vreg (this_cache, num_regs, i, offset,
> +					  byte_order, tdep->has_sve ());
> +	    }
> +	}
>     }
> 
>   trad_frame_set_id (this_cache, frame_id_build (sp, func));
> diff --git a/gdb/nat/aarch64-sve-linux-ptrace.c b/gdb/nat/aarch64-sve-linux-ptrace.c
> index 2ce90ccfd7..9ef5e91801 100644
> --- a/gdb/nat/aarch64-sve-linux-ptrace.c
> +++ b/gdb/nat/aarch64-sve-linux-ptrace.c
> @@ -26,6 +26,7 @@
> #include "arch/aarch64.h"
> #include "gdbsupport/common-regcache.h"
> #include "gdbsupport/byte-vector.h"
> +#include <endian.h>
> 
> /* See nat/aarch64-sve-linux-ptrace.h.  */
> 
> @@ -142,6 +143,27 @@ aarch64_sve_get_sveregs (int tid)
>   return buf;
> }
> 
> +/* If we are running in BE mode, convert the contents
> +   of VALUE (a 16 byte buffer) to LE.  */
> +
> +static void
> +aarch64_maybe_swab128 (gdb_byte *value)
> +{
> +  gdb_assert (value != nullptr);
> +
> +#if (__BYTE_ORDER == __BIG_ENDIAN)
> +  gdb_byte copy[16];
> +
> +  /* Save the original value.  */
> +  memcpy (copy, value, 16);
> +
> +  for (int i = 0; i < 15; i++)
> +    value[i] = copy[15 - i];
> +#else
> +  /* Nothing to be done.  */
> +#endif
> +}
> +
> /* See nat/aarch64-sve-linux-ptrace.h.  */
> 
> void
> @@ -184,11 +206,22 @@ aarch64_sve_regs_copy_to_reg_buf (struct reg_buffer_common *reg_buf,
>     }
>   else
>     {
> +      /* WARNING: SIMD state is laid out in memory in target-endian format,
> +	 while SVE state is laid out in an endianness-independent format (LE).
> +
> +	 So we have a couple cases to consider:
> +
> +	 1 - If the target is big endian, then SIMD state is big endian,
> +	 requiring a byteswap.
> +
> +	 2 - If the target is little endian, then SIMD state is little endian,
> +	 which matches the SVE format, so no byteswap is needed. */
> +
>       /* There is no SVE state yet - the register dump contains a fpsimd
> 	 structure instead.  These registers still exist in the hardware, but
> 	 the kernel has not yet initialised them, and so they will be null.  */
> 
> -      char *zero_reg = (char *) alloca (SVE_PT_SVE_ZREG_SIZE (vq));
> +      gdb_byte *zero_reg = (gdb_byte *) alloca (SVE_PT_SVE_ZREG_SIZE (vq));
>       struct user_fpsimd_state *fpsimd
> 	= (struct user_fpsimd_state *)(base + SVE_PT_FPSIMD_OFFSET);
> 
> @@ -199,7 +232,9 @@ aarch64_sve_regs_copy_to_reg_buf (struct reg_buffer_common *reg_buf,
> 
>       for (int i = 0; i < AARCH64_SVE_Z_REGS_NUM; i++)
> 	{
> -	  memcpy (zero_reg, &fpsimd->vregs[i], sizeof (__int128_t));
> +	  memcpy (zero_reg, &fpsimd->vregs[i], 16);
> +	  /* Handle big endian/little endian SIMD/SVE conversion.  */
> +	  aarch64_maybe_swab128 (zero_reg);

I think we have a long standing bug here. zero_reg was meant to stay as the value 0. But then
it got reused as a general temp buffer.
It’s not shown in the diff, but we do:
* memset zero_reg to 0
* use zero_reg as a temp buffer for copying fpsimd values.
* use zero_reg as value 0 for fpsr and fpcr.

The memset needs moving after using it for fpsimd. (maybe also rename zero_reg to buf?)

Can we also reduce the number of memcpys - just byte swap vregs[i] directly into the zero_reg buffer?


> 	  reg_buf->raw_supply (AARCH64_SVE_Z0_REGNUM + i, zero_reg);
> 	}
> 
> @@ -240,7 +275,7 @@ aarch64_sve_regs_copy_from_reg_buf (const struct reg_buffer_common *reg_buf,
> 	 kernel, which is why we try to avoid it.  */
> 
>       bool has_sve_state = false;
> -      char *zero_reg = (char *) alloca (SVE_PT_SVE_ZREG_SIZE (vq));
> +      gdb_byte *zero_reg = (gdb_byte *) alloca (SVE_PT_SVE_ZREG_SIZE (vq));
>       struct user_fpsimd_state *fpsimd
> 	= (struct user_fpsimd_state *)(base + SVE_PT_FPSIMD_OFFSET);
> 
> @@ -274,6 +309,18 @@ aarch64_sve_regs_copy_from_reg_buf (const struct reg_buffer_common *reg_buf,
> 	 write out state and return.  */
>       if (!has_sve_state)
> 	{
> +	  /* WARNING: SIMD state is laid out in memory in target-endian format,
> +	     while SVE state is laid out in an endianness-independent format
> +	     (LE).
> +
> +	     So we have a couple cases to consider:
> +
> +	     1 - If the target is big endian, then SIMD state is big endian,
> +	     requiring a byteswap.
> +
> +	     2 - If the target is little endian, then SIMD state is little
> +	     endian, which matches the SVE format, so no byteswap is needed. */
> +
> 	  /* The collects of the Z registers will overflow the size of a vreg.
> 	     There is enough space in the structure to allow for this, but we
> 	     cannot overflow into the next register as we might not be
> @@ -285,7 +332,9 @@ aarch64_sve_regs_copy_from_reg_buf (const struct reg_buffer_common *reg_buf,
> 		  == reg_buf->get_register_status (AARCH64_SVE_Z0_REGNUM + i))
> 		{
> 		  reg_buf->raw_collect (AARCH64_SVE_Z0_REGNUM + i, zero_reg);
> -		  memcpy (&fpsimd->vregs[i], zero_reg, sizeof (__int128_t));
> +		  /* Handle big endian/little endian SIMD/SVE conversion.  */
> +		  aarch64_maybe_swab128 (zero_reg);
> +		  memcpy (&fpsimd->vregs[i], zero_reg, 16);
> 		}
> 	    }
> 
> diff --git a/gdb/trad-frame.c b/gdb/trad-frame.c
> index a6a84790a9..8a1aa818ad 100644
> --- a/gdb/trad-frame.c
> +++ b/gdb/trad-frame.c
> @@ -56,6 +56,7 @@ trad_frame_reset_saved_regs (struct gdbarch *gdbarch,
>     {
>       regs[regnum].realreg = regnum;
>       regs[regnum].addr = -1;
> +      regs[regnum].data = nullptr;
>     }
> }
> 
> @@ -83,7 +84,7 @@ trad_frame_alloc_saved_regs (struct frame_info *this_frame)
>   return trad_frame_alloc_saved_regs (gdbarch);
> }
> 
> -enum { TF_REG_VALUE = -1, TF_REG_UNKNOWN = -2 };
> +enum { TF_REG_VALUE = -1, TF_REG_UNKNOWN = -2, TF_REG_VALUE_BYTES = -3 };
> 
> int
> trad_frame_value_p (struct trad_frame_saved_reg this_saved_regs[], int regnum)
> @@ -106,6 +107,16 @@ trad_frame_realreg_p (struct trad_frame_saved_reg this_saved_regs[],
> 	  && this_saved_regs[regnum].addr == -1);
> }
> 
> +/* See trad-frame.h.  */
> +
> +bool
> +trad_frame_value_bytes_p (struct trad_frame_saved_reg this_saved_regs[],
> +			  int regnum)
> +{
> +  return (this_saved_regs[regnum].realreg == TF_REG_VALUE_BYTES
> +	  && this_saved_regs[regnum].data != nullptr);
> +}
> +
> void
> trad_frame_set_value (struct trad_frame_saved_reg this_saved_regs[],
> 		      int regnum, LONGEST val)
> @@ -224,6 +235,35 @@ trad_frame_set_unknown (struct trad_frame_saved_reg this_saved_regs[],
>   this_saved_regs[regnum].addr = -1;
> }
> 
> +/* See trad-frame.h.  */
> +
> +void
> +trad_frame_set_value_bytes (struct trad_frame_saved_reg this_saved_regs[],
> +			    int regnum, const gdb_byte *bytes,
> +			    size_t size)
> +{
> +  this_saved_regs[regnum].realreg = TF_REG_VALUE_BYTES;
> +
> +  /* Allocate the space and copy the data bytes.  */
> +  this_saved_regs[regnum].data = FRAME_OBSTACK_CALLOC (size, gdb_byte);

Am I right to assume this means data will be automatically unallocated when
the trad_frame_saved_reg goes out of scope?


> +  memcpy (this_saved_regs[regnum].data, bytes, size);
> +}
> +
> +/* See trad-frame.h.  */
> +
> +void
> +trad_frame_set_reg_value_bytes (struct trad_frame_cache *this_trad_cache,
> +				int regnum, const gdb_byte *bytes,
> +				size_t size)
> +{
> +  /* External interface for users of trad_frame_cache
> +     (who cannot access the prev_regs object directly).  */
> +  trad_frame_set_value_bytes (this_trad_cache->prev_regs, regnum, bytes,
> +			      size);
> +}
> +
> +
> +
> struct value *
> trad_frame_get_prev_register (struct frame_info *this_frame,
> 			      struct trad_frame_saved_reg this_saved_regs[],
> @@ -240,6 +280,10 @@ trad_frame_get_prev_register (struct frame_info *this_frame,
>     /* The register's value is available.  */
>     return frame_unwind_got_constant (this_frame, regnum,
> 				      this_saved_regs[regnum].addr);
> +  else if (trad_frame_value_bytes_p (this_saved_regs, regnum))
> +    /* The register's value is available as a sequence of bytes.  */
> +    return frame_unwind_got_bytes (this_frame, regnum,
> +				   this_saved_regs[regnum].data);
>   else
>     return frame_unwind_got_optimized (this_frame, regnum);
> }
> diff --git a/gdb/trad-frame.h b/gdb/trad-frame.h
> index 7b5785616e..38db439579 100644
> --- a/gdb/trad-frame.h
> +++ b/gdb/trad-frame.h
> @@ -52,6 +52,12 @@ void trad_frame_set_reg_regmap (struct trad_frame_cache *this_trad_cache,
> void trad_frame_set_reg_value (struct trad_frame_cache *this_cache,
> 			       int regnum, LONGEST val);
> 
> +/* Given the cache in THIS_TRAD_CACHE, set the value of REGNUM to the bytes
> +   contained in BYTES with size SIZE.  */
> +void trad_frame_set_reg_value_bytes (struct trad_frame_cache *this_trad_cache,
> +				     int regnum, const gdb_byte *bytes,
> +				     size_t size);
> +
> struct value *trad_frame_get_register (struct trad_frame_cache *this_trad_cache,
> 				       struct frame_info *this_frame,
> 				       int regnum);
> @@ -86,6 +92,8 @@ struct trad_frame_saved_reg
> {
>   LONGEST addr; /* A CORE_ADDR fits in a longest.  */
>   int realreg;
> +  /* Register data (for values that don't fit in ADDR).  */
> +  gdb_byte *data;
> };
> 
> /* Encode REGNUM value in the trad-frame.  */
> @@ -104,6 +112,12 @@ void trad_frame_set_addr (struct trad_frame_saved_reg this_trad_cache[],
> void trad_frame_set_unknown (struct trad_frame_saved_reg this_saved_regs[],
> 			     int regnum);
> 
> +/* Encode REGNUM value in the trad-frame as a sequence of bytes.  This is
> +   useful when the value is larger than what primitive types can hold.  */
> +void trad_frame_set_value_bytes (struct trad_frame_saved_reg this_saved_regs[],
> +				 int regnum, const gdb_byte *bytes,
> +				 size_t size);
> +
> /* Convenience functions, return non-zero if the register has been
>    encoded as specified.  */
> int trad_frame_value_p (struct trad_frame_saved_reg this_saved_regs[],
> @@ -113,6 +127,11 @@ int trad_frame_addr_p (struct trad_frame_saved_reg this_saved_regs[],
> int trad_frame_realreg_p (struct trad_frame_saved_reg this_saved_regs[],
> 			  int regnum);
> 
> +/* Return TRUE if REGNUM is stored as a sequence of bytes, and FALSE
> +   otherwise.  */
> +bool trad_frame_value_bytes_p (struct trad_frame_saved_reg this_saved_regs[],
> +			      int regnum);
> +
> /* Reset the save regs cache, setting register values to -1.  */
> void trad_frame_reset_saved_regs (struct gdbarch *gdbarch,
> 				  struct trad_frame_saved_reg *regs);
> -- 
> 2.25.1
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH][AArch64] SVE/FPSIMD fixup for big endian
  2020-12-01 11:28 ` Alan Hayward
@ 2020-12-01 12:19   ` Luis Machado
  2020-12-01 17:38     ` Alan Hayward
  0 siblings, 1 reply; 13+ messages in thread
From: Luis Machado @ 2020-12-01 12:19 UTC (permalink / raw)
  To: Alan Hayward; +Cc: gdb-patches\@sourceware.org, nd

Hi,

Thanks for the review.

On 12/1/20 8:28 AM, Alan Hayward wrote:
> 
> 
>> On 30 Nov 2020, at 18:55, Luis Machado <luis.machado@linaro.org> wrote:
>>
>> The FPSIMD dump in signal frames and ptrace FPSIMD dump in the SVE context
>> structure follows the target endianness, whereas the SVE dumps are
>> endianness-independent (LE).
>>
>> Therefore, when the system is in BE mode, we need to reverse the bytes
>> for the FPSIMD data.
>>
>> Given the V registers are larger than 64-bit, I've added a way for value
>> bytes to be set, as opposed to passing a 64-bit fixed quantity. This fits
>> nicely with the unwinding *_got_bytes function and makes the trad-frame
>> more flexible and capable of saving larger registers.
>>
>> The memory for the bytes is allocated via the frame obstack, so it gets freed
>> after we're done inspecting the frame.
>>
>> gdb/ChangeLog:
>>
>> YYYY-MM-DD  Luis Machado  <luis.machado@linaro.org>
>>
>> 	* aarch64-linux-tdep.c (aarch64_linux_restore_vreg) New function.
>> 	(aarch64_linux_sigframe_init): Call aarch64_linux_restore_vreg.
>> 	* nat/aarch64-sve-linux-ptrace.c: Include endian.h.
>> 	(aarch64_maybe_swab128): New function.
>> 	(aarch64_sve_regs_copy_to_reg_buf)
>> 	(aarch64_sve_regs_copy_from_reg_buf): Adjust FPSIMD entries.
>> 	* trad-frame.c (trad_frame_reset_saved_regs): Initialize
>> 	the data field.
>> 	(TF_REG_VALUE_BYTES): New enum value.
>> 	(trad_frame_value_bytes_p): New function.
>> 	(trad_frame_set_value_bytes): New function.
>> 	(trad_frame_set_reg_value_bytes): New function.
>> 	(trad_frame_get_prev_register): Handle register values saved as bytes.
>> 	* trad-frame.h (trad_frame_set_reg_value_bytes): New prototype.
>> 	(struct trad_frame_saved_reg) <data>: New field.
>> 	(trad_frame_set_value_bytes): New prototype.
>> 	(trad_frame_value_bytes_p): New prototype.
>> ---
>> gdb/aarch64-linux-tdep.c           | 115 ++++++++++++++++++++++++-----
>> gdb/nat/aarch64-sve-linux-ptrace.c |  57 +++++++++++++-
>> gdb/trad-frame.c                   |  46 +++++++++++-
>> gdb/trad-frame.h                   |  19 +++++
>> 4 files changed, 213 insertions(+), 24 deletions(-)
>>
>> diff --git a/gdb/aarch64-linux-tdep.c b/gdb/aarch64-linux-tdep.c
>> index c9898bdafd..108f96be71 100644
>> --- a/gdb/aarch64-linux-tdep.c
>> +++ b/gdb/aarch64-linux-tdep.c
>> @@ -180,6 +180,94 @@ read_aarch64_ctx (CORE_ADDR ctx_addr, enum bfd_endian byte_order,
>>    return magic;
>> }
>>
>> +/* Given CACHE, use the trad_frame* functions to restore the FPSIMD
>> +   registers from a signal frame.
>> +
>> +   VREG_NUM is the number of the V register being restored, OFFSET is the
>> +   address containing the register value, BYTE_ORDER is the endianness and
>> +   HAS_SVE tells us if we have a valid SVE context or not.  */
>> +
>> +static void
>> +aarch64_linux_restore_vreg (struct trad_frame_cache *cache, int num_regs,
>> +			    int vreg_num, CORE_ADDR offset,
>> +			    enum bfd_endian byte_order, bool has_sve)
>> +{
>> +  /* WARNING: SIMD state is laid out in memory in target-endian format, while
>> +     SVE state is laid out in an endianness-independent format (LE).
>> +
>> +     So we have a couple cases to consider:
>> +
>> +     1 - If the target is big endian, then SIMD state is big endian,
>> +     requiring a byteswap.
>> +
>> +     2 - If the target is little endian, then SIMD state is little endian,
>> +     which matches the SVE format, so no byteswap is needed. */
>> +
> 
> With this function, we are only handling FPSIMD values, so no need to mention SVE.
> As it is now, it makes the has_sve parts confusing because they are being treated
> the same as the rest of the fpsimd.

Though we are handling FPSIMD, we still need to set at least one SVE 
pseudo-register (AARCH64_SVE_V0_REGNUM), hence why I decided to keep the 
warning. It is not really a SVE register, but a pseudo-register of the 
FPSIMD V register.

Do you still want to remove the SVE references in the warning?

What is the difference between AARCH64_SVE_V0_REGNUM and AARCH64_V0_REGNUM?

> 
> The same comment is fine when used elsewhere in the patch.
> 
> 
>> +  if (byte_order == BFD_ENDIAN_BIG)
>> +    {
>> +      gdb_byte buf[V_REGISTER_SIZE];
>> +
>> +      if (target_read_memory (offset, buf, V_REGISTER_SIZE) != 0)
>> +	{
>> +	  size_t size = V_REGISTER_SIZE/2;
>> +
>> +	  /* Read the two halves of the V register in reverse byte order.  */
>> +	  CORE_ADDR u64 = extract_unsigned_integer (buf, size,
>> +						    byte_order);
>> +	  CORE_ADDR l64 = extract_unsigned_integer (buf + size, size,
>> +						    byte_order);
>> +
>> +	  /* Copy the reversed bytes to the buffer.  */
>> +	  store_unsigned_integer (buf, size, BFD_ENDIAN_LITTLE, l64);
>> +	  store_unsigned_integer (buf + size , size, BFD_ENDIAN_LITTLE, u64);
>> +
>> +	  /* Now we can store the correct bytes for the V register.  */
>> +	  trad_frame_set_reg_value_bytes (cache, AARCH64_V0_REGNUM + vreg_num,
>> +					  buf, V_REGISTER_SIZE);
>> +	  trad_frame_set_reg_value_bytes (cache,
>> +					  num_regs + AARCH64_Q0_REGNUM
>> +					  + vreg_num, buf, Q_REGISTER_SIZE);
>> +	  trad_frame_set_reg_value_bytes (cache,
>> +					  num_regs + AARCH64_D0_REGNUM
>> +					  + vreg_num, buf, D_REGISTER_SIZE);
>> +	  trad_frame_set_reg_value_bytes (cache,
>> +					  num_regs + AARCH64_S0_REGNUM
>> +					  + vreg_num, buf, S_REGISTER_SIZE);
>> +	  trad_frame_set_reg_value_bytes (cache,
>> +					  num_regs + AARCH64_H0_REGNUM
>> +					  + vreg_num, buf, H_REGISTER_SIZE);
>> +	  trad_frame_set_reg_value_bytes (cache,
>> +					  num_regs + AARCH64_B0_REGNUM
>> +					  + vreg_num, buf, B_REGISTER_SIZE);
>> +
>> +	  if (has_sve)
>> +	    trad_frame_set_reg_value_bytes (cache,
>> +					    num_regs + AARCH64_SVE_V0_REGNUM
>> +					    + vreg_num, buf, V_REGISTER_SIZE);
>> +	}
>> +      return;
>> +    }
>> +
>> +  /* Little endian, just point at the address containing the register
>> +     value.  */
>> +  trad_frame_set_reg_addr (cache, AARCH64_V0_REGNUM + vreg_num, offset);
>> +  trad_frame_set_reg_addr (cache, num_regs + AARCH64_Q0_REGNUM + vreg_num,
>> +			   offset);
>> +  trad_frame_set_reg_addr (cache, num_regs + AARCH64_D0_REGNUM + vreg_num,
>> +			   offset);
>> +  trad_frame_set_reg_addr (cache, num_regs + AARCH64_S0_REGNUM + vreg_num,
>> +			   offset);
>> +  trad_frame_set_reg_addr (cache, num_regs + AARCH64_H0_REGNUM + vreg_num,
>> +			   offset);
>> +  trad_frame_set_reg_addr (cache, num_regs + AARCH64_B0_REGNUM + vreg_num,
>> +			   offset);
>> +
>> +  if (has_sve)
>> +    trad_frame_set_reg_addr (cache, num_regs + AARCH64_SVE_V0_REGNUM
>> +			     + vreg_num, offset);
>> +
>> +}
>> +
>> /* Implement the "init" method of struct tramp_frame.  */
>>
>> static void
>> @@ -332,27 +420,16 @@ aarch64_linux_sigframe_init (const struct tramp_frame *self,
>>
>>        /* If there was no SVE section then set up the V registers.  */
>>        if (sve_regs == 0)
>> -	for (int i = 0; i < 32; i++)
>> -	  {
>> -	    CORE_ADDR offset = (fpsimd + AARCH64_FPSIMD_V0_OFFSET
>> +	{
>> +	  for (int i = 0; i < 32; i++)
>> +	    {
>> +	      CORE_ADDR offset = (fpsimd + AARCH64_FPSIMD_V0_OFFSET
>> 				  + (i * AARCH64_FPSIMD_VREG_SIZE));
>>
>> -	    trad_frame_set_reg_addr (this_cache, AARCH64_V0_REGNUM + i, offset);
>> -	    trad_frame_set_reg_addr (this_cache,
>> -				     num_regs + AARCH64_Q0_REGNUM + i, offset);
>> -	    trad_frame_set_reg_addr (this_cache,
>> -				     num_regs + AARCH64_D0_REGNUM + i, offset);
>> -	    trad_frame_set_reg_addr (this_cache,
>> -				     num_regs + AARCH64_S0_REGNUM + i, offset);
>> -	    trad_frame_set_reg_addr (this_cache,
>> -				     num_regs + AARCH64_H0_REGNUM + i, offset);
>> -	    trad_frame_set_reg_addr (this_cache,
>> -				     num_regs + AARCH64_B0_REGNUM + i, offset);
>> -	    if (tdep->has_sve ())
>> -	      trad_frame_set_reg_addr (this_cache,
>> -				       num_regs + AARCH64_SVE_V0_REGNUM + i,
>> -				       offset);
>> -	  }
>> +	      aarch64_linux_restore_vreg (this_cache, num_regs, i, offset,
>> +					  byte_order, tdep->has_sve ());
>> +	    }
>> +	}
>>      }
>>
>>    trad_frame_set_id (this_cache, frame_id_build (sp, func));
>> diff --git a/gdb/nat/aarch64-sve-linux-ptrace.c b/gdb/nat/aarch64-sve-linux-ptrace.c
>> index 2ce90ccfd7..9ef5e91801 100644
>> --- a/gdb/nat/aarch64-sve-linux-ptrace.c
>> +++ b/gdb/nat/aarch64-sve-linux-ptrace.c
>> @@ -26,6 +26,7 @@
>> #include "arch/aarch64.h"
>> #include "gdbsupport/common-regcache.h"
>> #include "gdbsupport/byte-vector.h"
>> +#include <endian.h>
>>
>> /* See nat/aarch64-sve-linux-ptrace.h.  */
>>
>> @@ -142,6 +143,27 @@ aarch64_sve_get_sveregs (int tid)
>>    return buf;
>> }
>>
>> +/* If we are running in BE mode, convert the contents
>> +   of VALUE (a 16 byte buffer) to LE.  */
>> +
>> +static void
>> +aarch64_maybe_swab128 (gdb_byte *value)
>> +{
>> +  gdb_assert (value != nullptr);
>> +
>> +#if (__BYTE_ORDER == __BIG_ENDIAN)
>> +  gdb_byte copy[16];
>> +
>> +  /* Save the original value.  */
>> +  memcpy (copy, value, 16);
>> +
>> +  for (int i = 0; i < 15; i++)
>> +    value[i] = copy[15 - i];
>> +#else
>> +  /* Nothing to be done.  */
>> +#endif
>> +}
>> +
>> /* See nat/aarch64-sve-linux-ptrace.h.  */
>>
>> void
>> @@ -184,11 +206,22 @@ aarch64_sve_regs_copy_to_reg_buf (struct reg_buffer_common *reg_buf,
>>      }
>>    else
>>      {
>> +      /* WARNING: SIMD state is laid out in memory in target-endian format,
>> +	 while SVE state is laid out in an endianness-independent format (LE).
>> +
>> +	 So we have a couple cases to consider:
>> +
>> +	 1 - If the target is big endian, then SIMD state is big endian,
>> +	 requiring a byteswap.
>> +
>> +	 2 - If the target is little endian, then SIMD state is little endian,
>> +	 which matches the SVE format, so no byteswap is needed. */
>> +
>>        /* There is no SVE state yet - the register dump contains a fpsimd
>> 	 structure instead.  These registers still exist in the hardware, but
>> 	 the kernel has not yet initialised them, and so they will be null.  */
>>
>> -      char *zero_reg = (char *) alloca (SVE_PT_SVE_ZREG_SIZE (vq));
>> +      gdb_byte *zero_reg = (gdb_byte *) alloca (SVE_PT_SVE_ZREG_SIZE (vq));
>>        struct user_fpsimd_state *fpsimd
>> 	= (struct user_fpsimd_state *)(base + SVE_PT_FPSIMD_OFFSET);
>>
>> @@ -199,7 +232,9 @@ aarch64_sve_regs_copy_to_reg_buf (struct reg_buffer_common *reg_buf,
>>
>>        for (int i = 0; i < AARCH64_SVE_Z_REGS_NUM; i++)
>> 	{
>> -	  memcpy (zero_reg, &fpsimd->vregs[i], sizeof (__int128_t));
>> +	  memcpy (zero_reg, &fpsimd->vregs[i], 16);
>> +	  /* Handle big endian/little endian SIMD/SVE conversion.  */
>> +	  aarch64_maybe_swab128 (zero_reg);
> 
> I think we have a long standing bug here. zero_reg was meant to stay as the value 0. But then
> it got reused as a general temp buffer.
> It’s not shown in the diff, but we do:
> * memset zero_reg to 0
> * use zero_reg as a temp buffer for copying fpsimd values.
> * use zero_reg as value 0 for fpsr and fpcr.
> 
> The memset needs moving after using it for fpsimd. (maybe also rename zero_reg to buf?)

I'll address this.

> 
> Can we also reduce the number of memcpys - just byte swap vregs[i] directly into the zero_reg buffer?

The byte swap function only swaps things for __BYTE_ORDER == 
__BIG_ENDIAN, we would still need to memcpy the bytes to zero_reg. The 
number of memcpy's would be the same, no?

> 
> 
>> 	  reg_buf->raw_supply (AARCH64_SVE_Z0_REGNUM + i, zero_reg);
>> 	}
>>
>> @@ -240,7 +275,7 @@ aarch64_sve_regs_copy_from_reg_buf (const struct reg_buffer_common *reg_buf,
>> 	 kernel, which is why we try to avoid it.  */
>>
>>        bool has_sve_state = false;
>> -      char *zero_reg = (char *) alloca (SVE_PT_SVE_ZREG_SIZE (vq));
>> +      gdb_byte *zero_reg = (gdb_byte *) alloca (SVE_PT_SVE_ZREG_SIZE (vq));
>>        struct user_fpsimd_state *fpsimd
>> 	= (struct user_fpsimd_state *)(base + SVE_PT_FPSIMD_OFFSET);
>>
>> @@ -274,6 +309,18 @@ aarch64_sve_regs_copy_from_reg_buf (const struct reg_buffer_common *reg_buf,
>> 	 write out state and return.  */
>>        if (!has_sve_state)
>> 	{
>> +	  /* WARNING: SIMD state is laid out in memory in target-endian format,
>> +	     while SVE state is laid out in an endianness-independent format
>> +	     (LE).
>> +
>> +	     So we have a couple cases to consider:
>> +
>> +	     1 - If the target is big endian, then SIMD state is big endian,
>> +	     requiring a byteswap.
>> +
>> +	     2 - If the target is little endian, then SIMD state is little
>> +	     endian, which matches the SVE format, so no byteswap is needed. */
>> +
>> 	  /* The collects of the Z registers will overflow the size of a vreg.
>> 	     There is enough space in the structure to allow for this, but we
>> 	     cannot overflow into the next register as we might not be
>> @@ -285,7 +332,9 @@ aarch64_sve_regs_copy_from_reg_buf (const struct reg_buffer_common *reg_buf,
>> 		  == reg_buf->get_register_status (AARCH64_SVE_Z0_REGNUM + i))
>> 		{
>> 		  reg_buf->raw_collect (AARCH64_SVE_Z0_REGNUM + i, zero_reg);
>> -		  memcpy (&fpsimd->vregs[i], zero_reg, sizeof (__int128_t));
>> +		  /* Handle big endian/little endian SIMD/SVE conversion.  */
>> +		  aarch64_maybe_swab128 (zero_reg);
>> +		  memcpy (&fpsimd->vregs[i], zero_reg, 16);
>> 		}
>> 	    }
>>
>> diff --git a/gdb/trad-frame.c b/gdb/trad-frame.c
>> index a6a84790a9..8a1aa818ad 100644
>> --- a/gdb/trad-frame.c
>> +++ b/gdb/trad-frame.c
>> @@ -56,6 +56,7 @@ trad_frame_reset_saved_regs (struct gdbarch *gdbarch,
>>      {
>>        regs[regnum].realreg = regnum;
>>        regs[regnum].addr = -1;
>> +      regs[regnum].data = nullptr;
>>      }
>> }
>>
>> @@ -83,7 +84,7 @@ trad_frame_alloc_saved_regs (struct frame_info *this_frame)
>>    return trad_frame_alloc_saved_regs (gdbarch);
>> }
>>
>> -enum { TF_REG_VALUE = -1, TF_REG_UNKNOWN = -2 };
>> +enum { TF_REG_VALUE = -1, TF_REG_UNKNOWN = -2, TF_REG_VALUE_BYTES = -3 };
>>
>> int
>> trad_frame_value_p (struct trad_frame_saved_reg this_saved_regs[], int regnum)
>> @@ -106,6 +107,16 @@ trad_frame_realreg_p (struct trad_frame_saved_reg this_saved_regs[],
>> 	  && this_saved_regs[regnum].addr == -1);
>> }
>>
>> +/* See trad-frame.h.  */
>> +
>> +bool
>> +trad_frame_value_bytes_p (struct trad_frame_saved_reg this_saved_regs[],
>> +			  int regnum)
>> +{
>> +  return (this_saved_regs[regnum].realreg == TF_REG_VALUE_BYTES
>> +	  && this_saved_regs[regnum].data != nullptr);
>> +}
>> +
>> void
>> trad_frame_set_value (struct trad_frame_saved_reg this_saved_regs[],
>> 		      int regnum, LONGEST val)
>> @@ -224,6 +235,35 @@ trad_frame_set_unknown (struct trad_frame_saved_reg this_saved_regs[],
>>    this_saved_regs[regnum].addr = -1;
>> }
>>
>> +/* See trad-frame.h.  */
>> +
>> +void
>> +trad_frame_set_value_bytes (struct trad_frame_saved_reg this_saved_regs[],
>> +			    int regnum, const gdb_byte *bytes,
>> +			    size_t size)
>> +{
>> +  this_saved_regs[regnum].realreg = TF_REG_VALUE_BYTES;
>> +
>> +  /* Allocate the space and copy the data bytes.  */
>> +  this_saved_regs[regnum].data = FRAME_OBSTACK_CALLOC (size, gdb_byte);
> 
> Am I right to assume this means data will be automatically unallocated when
> the trad_frame_saved_reg goes out of scope?

Not when it goes out of scope, but when all the frame-related data is 
freed by GDB just before restarting inferior movement.

> 
> 
>> +  memcpy (this_saved_regs[regnum].data, bytes, size);
>> +}
>> +
>> +/* See trad-frame.h.  */
>> +
>> +void
>> +trad_frame_set_reg_value_bytes (struct trad_frame_cache *this_trad_cache,
>> +				int regnum, const gdb_byte *bytes,
>> +				size_t size)
>> +{
>> +  /* External interface for users of trad_frame_cache
>> +     (who cannot access the prev_regs object directly).  */
>> +  trad_frame_set_value_bytes (this_trad_cache->prev_regs, regnum, bytes,
>> +			      size);
>> +}
>> +
>> +
>> +
>> struct value *
>> trad_frame_get_prev_register (struct frame_info *this_frame,
>> 			      struct trad_frame_saved_reg this_saved_regs[],
>> @@ -240,6 +280,10 @@ trad_frame_get_prev_register (struct frame_info *this_frame,
>>      /* The register's value is available.  */
>>      return frame_unwind_got_constant (this_frame, regnum,
>> 				      this_saved_regs[regnum].addr);
>> +  else if (trad_frame_value_bytes_p (this_saved_regs, regnum))
>> +    /* The register's value is available as a sequence of bytes.  */
>> +    return frame_unwind_got_bytes (this_frame, regnum,
>> +				   this_saved_regs[regnum].data);
>>    else
>>      return frame_unwind_got_optimized (this_frame, regnum);
>> }
>> diff --git a/gdb/trad-frame.h b/gdb/trad-frame.h
>> index 7b5785616e..38db439579 100644
>> --- a/gdb/trad-frame.h
>> +++ b/gdb/trad-frame.h
>> @@ -52,6 +52,12 @@ void trad_frame_set_reg_regmap (struct trad_frame_cache *this_trad_cache,
>> void trad_frame_set_reg_value (struct trad_frame_cache *this_cache,
>> 			       int regnum, LONGEST val);
>>
>> +/* Given the cache in THIS_TRAD_CACHE, set the value of REGNUM to the bytes
>> +   contained in BYTES with size SIZE.  */
>> +void trad_frame_set_reg_value_bytes (struct trad_frame_cache *this_trad_cache,
>> +				     int regnum, const gdb_byte *bytes,
>> +				     size_t size);
>> +
>> struct value *trad_frame_get_register (struct trad_frame_cache *this_trad_cache,
>> 				       struct frame_info *this_frame,
>> 				       int regnum);
>> @@ -86,6 +92,8 @@ struct trad_frame_saved_reg
>> {
>>    LONGEST addr; /* A CORE_ADDR fits in a longest.  */
>>    int realreg;
>> +  /* Register data (for values that don't fit in ADDR).  */
>> +  gdb_byte *data;
>> };
>>
>> /* Encode REGNUM value in the trad-frame.  */
>> @@ -104,6 +112,12 @@ void trad_frame_set_addr (struct trad_frame_saved_reg this_trad_cache[],
>> void trad_frame_set_unknown (struct trad_frame_saved_reg this_saved_regs[],
>> 			     int regnum);
>>
>> +/* Encode REGNUM value in the trad-frame as a sequence of bytes.  This is
>> +   useful when the value is larger than what primitive types can hold.  */
>> +void trad_frame_set_value_bytes (struct trad_frame_saved_reg this_saved_regs[],
>> +				 int regnum, const gdb_byte *bytes,
>> +				 size_t size);
>> +
>> /* Convenience functions, return non-zero if the register has been
>>     encoded as specified.  */
>> int trad_frame_value_p (struct trad_frame_saved_reg this_saved_regs[],
>> @@ -113,6 +127,11 @@ int trad_frame_addr_p (struct trad_frame_saved_reg this_saved_regs[],
>> int trad_frame_realreg_p (struct trad_frame_saved_reg this_saved_regs[],
>> 			  int regnum);
>>
>> +/* Return TRUE if REGNUM is stored as a sequence of bytes, and FALSE
>> +   otherwise.  */
>> +bool trad_frame_value_bytes_p (struct trad_frame_saved_reg this_saved_regs[],
>> +			      int regnum);
>> +
>> /* Reset the save regs cache, setting register values to -1.  */
>> void trad_frame_reset_saved_regs (struct gdbarch *gdbarch,
>> 				  struct trad_frame_saved_reg *regs);
>> -- 
>> 2.25.1
>>
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH][AArch64] SVE/FPSIMD fixup for big endian
  2020-12-01 12:19   ` Luis Machado
@ 2020-12-01 17:38     ` Alan Hayward
  2020-12-01 18:40       ` Luis Machado
  0 siblings, 1 reply; 13+ messages in thread
From: Alan Hayward @ 2020-12-01 17:38 UTC (permalink / raw)
  To: Luis Machado; +Cc: gdb-patches\@sourceware.org, nd



> On 1 Dec 2020, at 12:19, Luis Machado <luis.machado@linaro.org> wrote:
> 
> Hi,
> 
> Thanks for the review.
> 
> On 12/1/20 8:28 AM, Alan Hayward wrote:
>>> On 30 Nov 2020, at 18:55, Luis Machado <luis.machado@linaro.org> wrote:
>>> 
>>> The FPSIMD dump in signal frames and ptrace FPSIMD dump in the SVE context
>>> structure follows the target endianness, whereas the SVE dumps are
>>> endianness-independent (LE).
>>> 
>>> Therefore, when the system is in BE mode, we need to reverse the bytes
>>> for the FPSIMD data.
>>> 
>>> Given the V registers are larger than 64-bit, I've added a way for value
>>> bytes to be set, as opposed to passing a 64-bit fixed quantity. This fits
>>> nicely with the unwinding *_got_bytes function and makes the trad-frame
>>> more flexible and capable of saving larger registers.
>>> 
>>> The memory for the bytes is allocated via the frame obstack, so it gets freed
>>> after we're done inspecting the frame.
>>> 
>>> gdb/ChangeLog:
>>> 
>>> YYYY-MM-DD  Luis Machado  <luis.machado@linaro.org>
>>> 
>>> 	* aarch64-linux-tdep.c (aarch64_linux_restore_vreg) New function.
>>> 	(aarch64_linux_sigframe_init): Call aarch64_linux_restore_vreg.
>>> 	* nat/aarch64-sve-linux-ptrace.c: Include endian.h.
>>> 	(aarch64_maybe_swab128): New function.
>>> 	(aarch64_sve_regs_copy_to_reg_buf)
>>> 	(aarch64_sve_regs_copy_from_reg_buf): Adjust FPSIMD entries.
>>> 	* trad-frame.c (trad_frame_reset_saved_regs): Initialize
>>> 	the data field.
>>> 	(TF_REG_VALUE_BYTES): New enum value.
>>> 	(trad_frame_value_bytes_p): New function.
>>> 	(trad_frame_set_value_bytes): New function.
>>> 	(trad_frame_set_reg_value_bytes): New function.
>>> 	(trad_frame_get_prev_register): Handle register values saved as bytes.
>>> 	* trad-frame.h (trad_frame_set_reg_value_bytes): New prototype.
>>> 	(struct trad_frame_saved_reg) <data>: New field.
>>> 	(trad_frame_set_value_bytes): New prototype.
>>> 	(trad_frame_value_bytes_p): New prototype.
>>> ---
>>> gdb/aarch64-linux-tdep.c           | 115 ++++++++++++++++++++++++-----
>>> gdb/nat/aarch64-sve-linux-ptrace.c |  57 +++++++++++++-
>>> gdb/trad-frame.c                   |  46 +++++++++++-
>>> gdb/trad-frame.h                   |  19 +++++
>>> 4 files changed, 213 insertions(+), 24 deletions(-)
>>> 
>>> diff --git a/gdb/aarch64-linux-tdep.c b/gdb/aarch64-linux-tdep.c
>>> index c9898bdafd..108f96be71 100644
>>> --- a/gdb/aarch64-linux-tdep.c
>>> +++ b/gdb/aarch64-linux-tdep.c
>>> @@ -180,6 +180,94 @@ read_aarch64_ctx (CORE_ADDR ctx_addr, enum bfd_endian byte_order,
>>>   return magic;
>>> }
>>> 
>>> +/* Given CACHE, use the trad_frame* functions to restore the FPSIMD
>>> +   registers from a signal frame.
>>> +
>>> +   VREG_NUM is the number of the V register being restored, OFFSET is the
>>> +   address containing the register value, BYTE_ORDER is the endianness and
>>> +   HAS_SVE tells us if we have a valid SVE context or not.  */
>>> +
>>> +static void
>>> +aarch64_linux_restore_vreg (struct trad_frame_cache *cache, int num_regs,
>>> +			    int vreg_num, CORE_ADDR offset,
>>> +			    enum bfd_endian byte_order, bool has_sve)
>>> +{
>>> +  /* WARNING: SIMD state is laid out in memory in target-endian format, while
>>> +     SVE state is laid out in an endianness-independent format (LE).
>>> +
>>> +     So we have a couple cases to consider:
>>> +
>>> +     1 - If the target is big endian, then SIMD state is big endian,
>>> +     requiring a byteswap.
>>> +
>>> +     2 - If the target is little endian, then SIMD state is little endian,
>>> +     which matches the SVE format, so no byteswap is needed. */
>>> +
>> With this function, we are only handling FPSIMD values, so no need to mention SVE.
>> As it is now, it makes the has_sve parts confusing because they are being treated
>> the same as the rest of the fpsimd.
> 
> Though we are handling FPSIMD, we still need to set at least one SVE pseudo-register (AARCH64_SVE_V0_REGNUM), hence why I decided to keep the warning. It is not really a SVE register, but a pseudo-register of the FPSIMD V register.
> 
> Do you still want to remove the SVE references in the warning?

For this function, I would say remove the SVE comment, as everything in the function
is being treated in the same way. But I’m not overly hung up on it.

> 
> What is the difference between AARCH64_SVE_V0_REGNUM and AARCH64_V0_REGNUM?

Z,V,Q,D,S,H,B registers all overlap with each other. It’s the same register,
just a different size.
In the tdep, when we only have neon, V is the real register, and Q,D,S,H,B are pseudos.
Then when we have sve, Z is the real register, and V,Q,D,S,H,B are pseudos.

AARCH64_SVE_Z0_REGNUM == AARCH64_V0_REGNUM.

And then AARCH64_SVE_V0_REGNUM represents the regnum of the V0 pseudo register
when SVE is enabled.

ARCH64_SVE_V0_REGNUM == AARCH64_B0_REGNUM + 32

Maybe eventually this should be updated to be similar to pauth. Have a single define
which requires a v0_reg_base value in the tdep.


> 
>> The same comment is fine when used elsewhere in the patch.
>>> +  if (byte_order == BFD_ENDIAN_BIG)
>>> +    {
>>> +      gdb_byte buf[V_REGISTER_SIZE];
>>> +
>>> +      if (target_read_memory (offset, buf, V_REGISTER_SIZE) != 0)
>>> +	{
>>> +	  size_t size = V_REGISTER_SIZE/2;
>>> +
>>> +	  /* Read the two halves of the V register in reverse byte order.  */
>>> +	  CORE_ADDR u64 = extract_unsigned_integer (buf, size,
>>> +						    byte_order);
>>> +	  CORE_ADDR l64 = extract_unsigned_integer (buf + size, size,
>>> +						    byte_order);
>>> +
>>> +	  /* Copy the reversed bytes to the buffer.  */
>>> +	  store_unsigned_integer (buf, size, BFD_ENDIAN_LITTLE, l64);
>>> +	  store_unsigned_integer (buf + size , size, BFD_ENDIAN_LITTLE, u64);
>>> +
>>> +	  /* Now we can store the correct bytes for the V register.  */
>>> +	  trad_frame_set_reg_value_bytes (cache, AARCH64_V0_REGNUM + vreg_num,
>>> +					  buf, V_REGISTER_SIZE);
>>> +	  trad_frame_set_reg_value_bytes (cache,
>>> +					  num_regs + AARCH64_Q0_REGNUM
>>> +					  + vreg_num, buf, Q_REGISTER_SIZE);
>>> +	  trad_frame_set_reg_value_bytes (cache,
>>> +					  num_regs + AARCH64_D0_REGNUM
>>> +					  + vreg_num, buf, D_REGISTER_SIZE);
>>> +	  trad_frame_set_reg_value_bytes (cache,
>>> +					  num_regs + AARCH64_S0_REGNUM
>>> +					  + vreg_num, buf, S_REGISTER_SIZE);
>>> +	  trad_frame_set_reg_value_bytes (cache,
>>> +					  num_regs + AARCH64_H0_REGNUM
>>> +					  + vreg_num, buf, H_REGISTER_SIZE);
>>> +	  trad_frame_set_reg_value_bytes (cache,
>>> +					  num_regs + AARCH64_B0_REGNUM
>>> +					  + vreg_num, buf, B_REGISTER_SIZE);
>>> +
>>> +	  if (has_sve)
>>> +	    trad_frame_set_reg_value_bytes (cache,
>>> +					    num_regs + AARCH64_SVE_V0_REGNUM
>>> +					    + vreg_num, buf, V_REGISTER_SIZE);
>>> +	}
>>> +      return;
>>> +    }
>>> +
>>> +  /* Little endian, just point at the address containing the register
>>> +     value.  */
>>> +  trad_frame_set_reg_addr (cache, AARCH64_V0_REGNUM + vreg_num, offset);
>>> +  trad_frame_set_reg_addr (cache, num_regs + AARCH64_Q0_REGNUM + vreg_num,
>>> +			   offset);
>>> +  trad_frame_set_reg_addr (cache, num_regs + AARCH64_D0_REGNUM + vreg_num,
>>> +			   offset);
>>> +  trad_frame_set_reg_addr (cache, num_regs + AARCH64_S0_REGNUM + vreg_num,
>>> +			   offset);
>>> +  trad_frame_set_reg_addr (cache, num_regs + AARCH64_H0_REGNUM + vreg_num,
>>> +			   offset);
>>> +  trad_frame_set_reg_addr (cache, num_regs + AARCH64_B0_REGNUM + vreg_num,
>>> +			   offset);
>>> +
>>> +  if (has_sve)
>>> +    trad_frame_set_reg_addr (cache, num_regs + AARCH64_SVE_V0_REGNUM
>>> +			     + vreg_num, offset);
>>> +
>>> +}
>>> +
>>> /* Implement the "init" method of struct tramp_frame.  */
>>> 
>>> static void
>>> @@ -332,27 +420,16 @@ aarch64_linux_sigframe_init (const struct tramp_frame *self,
>>> 
>>>       /* If there was no SVE section then set up the V registers.  */
>>>       if (sve_regs == 0)
>>> -	for (int i = 0; i < 32; i++)
>>> -	  {
>>> -	    CORE_ADDR offset = (fpsimd + AARCH64_FPSIMD_V0_OFFSET
>>> +	{
>>> +	  for (int i = 0; i < 32; i++)
>>> +	    {
>>> +	      CORE_ADDR offset = (fpsimd + AARCH64_FPSIMD_V0_OFFSET
>>> 				  + (i * AARCH64_FPSIMD_VREG_SIZE));
>>> 
>>> -	    trad_frame_set_reg_addr (this_cache, AARCH64_V0_REGNUM + i, offset);
>>> -	    trad_frame_set_reg_addr (this_cache,
>>> -				     num_regs + AARCH64_Q0_REGNUM + i, offset);
>>> -	    trad_frame_set_reg_addr (this_cache,
>>> -				     num_regs + AARCH64_D0_REGNUM + i, offset);
>>> -	    trad_frame_set_reg_addr (this_cache,
>>> -				     num_regs + AARCH64_S0_REGNUM + i, offset);
>>> -	    trad_frame_set_reg_addr (this_cache,
>>> -				     num_regs + AARCH64_H0_REGNUM + i, offset);
>>> -	    trad_frame_set_reg_addr (this_cache,
>>> -				     num_regs + AARCH64_B0_REGNUM + i, offset);
>>> -	    if (tdep->has_sve ())
>>> -	      trad_frame_set_reg_addr (this_cache,
>>> -				       num_regs + AARCH64_SVE_V0_REGNUM + i,
>>> -				       offset);
>>> -	  }
>>> +	      aarch64_linux_restore_vreg (this_cache, num_regs, i, offset,
>>> +					  byte_order, tdep->has_sve ());
>>> +	    }
>>> +	}
>>>     }
>>> 
>>>   trad_frame_set_id (this_cache, frame_id_build (sp, func));
>>> diff --git a/gdb/nat/aarch64-sve-linux-ptrace.c b/gdb/nat/aarch64-sve-linux-ptrace.c
>>> index 2ce90ccfd7..9ef5e91801 100644
>>> --- a/gdb/nat/aarch64-sve-linux-ptrace.c
>>> +++ b/gdb/nat/aarch64-sve-linux-ptrace.c
>>> @@ -26,6 +26,7 @@
>>> #include "arch/aarch64.h"
>>> #include "gdbsupport/common-regcache.h"
>>> #include "gdbsupport/byte-vector.h"
>>> +#include <endian.h>
>>> 
>>> /* See nat/aarch64-sve-linux-ptrace.h.  */
>>> 
>>> @@ -142,6 +143,27 @@ aarch64_sve_get_sveregs (int tid)
>>>   return buf;
>>> }
>>> 
>>> +/* If we are running in BE mode, convert the contents
>>> +   of VALUE (a 16 byte buffer) to LE.  */
>>> +
>>> +static void
>>> +aarch64_maybe_swab128 (gdb_byte *value)
>>> +{
>>> +  gdb_assert (value != nullptr);
>>> +
>>> +#if (__BYTE_ORDER == __BIG_ENDIAN)
>>> +  gdb_byte copy[16];
>>> +
>>> +  /* Save the original value.  */
>>> +  memcpy (copy, value, 16);
>>> +
>>> +  for (int i = 0; i < 15; i++)
>>> +    value[i] = copy[15 - i];
>>> +#else
>>> +  /* Nothing to be done.  */
>>> +#endif
>>> +}
>>> +
>>> /* See nat/aarch64-sve-linux-ptrace.h.  */
>>> 
>>> void
>>> @@ -184,11 +206,22 @@ aarch64_sve_regs_copy_to_reg_buf (struct reg_buffer_common *reg_buf,
>>>     }
>>>   else
>>>     {
>>> +      /* WARNING: SIMD state is laid out in memory in target-endian format,
>>> +	 while SVE state is laid out in an endianness-independent format (LE).
>>> +
>>> +	 So we have a couple cases to consider:
>>> +
>>> +	 1 - If the target is big endian, then SIMD state is big endian,
>>> +	 requiring a byteswap.
>>> +
>>> +	 2 - If the target is little endian, then SIMD state is little endian,
>>> +	 which matches the SVE format, so no byteswap is needed. */
>>> +
>>>       /* There is no SVE state yet - the register dump contains a fpsimd
>>> 	 structure instead.  These registers still exist in the hardware, but
>>> 	 the kernel has not yet initialised them, and so they will be null.  */
>>> 
>>> -      char *zero_reg = (char *) alloca (SVE_PT_SVE_ZREG_SIZE (vq));
>>> +      gdb_byte *zero_reg = (gdb_byte *) alloca (SVE_PT_SVE_ZREG_SIZE (vq));
>>>       struct user_fpsimd_state *fpsimd
>>> 	= (struct user_fpsimd_state *)(base + SVE_PT_FPSIMD_OFFSET);
>>> 
>>> @@ -199,7 +232,9 @@ aarch64_sve_regs_copy_to_reg_buf (struct reg_buffer_common *reg_buf,
>>> 
>>>       for (int i = 0; i < AARCH64_SVE_Z_REGS_NUM; i++)
>>> 	{
>>> -	  memcpy (zero_reg, &fpsimd->vregs[i], sizeof (__int128_t));
>>> +	  memcpy (zero_reg, &fpsimd->vregs[i], 16);
>>> +	  /* Handle big endian/little endian SIMD/SVE conversion.  */
>>> +	  aarch64_maybe_swab128 (zero_reg);
>> I think we have a long standing bug here. zero_reg was meant to stay as the value 0. But then
>> it got reused as a general temp buffer.
>> It’s not shown in the diff, but we do:
>> * memset zero_reg to 0
>> * use zero_reg as a temp buffer for copying fpsimd values.
>> * use zero_reg as value 0 for fpsr and fpcr.
>> The memset needs moving after using it for fpsimd. (maybe also rename zero_reg to buf?)
> 
> I'll address this.
> 
>> Can we also reduce the number of memcpys - just byte swap vregs[i] directly into the zero_reg buffer?
> 
> The byte swap function only swaps things for __BYTE_ORDER == __BIG_ENDIAN, we would still need to memcpy the bytes to zero_reg. The number of memcpy's would be the same, no?

For LE, we want to copy into zero_reg to ensure it is zero padded after 128bits, then call raw_supply.
So that’s all fine.

For BE, we could just byteswap directly from fpsimd->vregs[i] into zero_reg (?) No need for any memcpys.


> 
>>> 	  reg_buf->raw_supply (AARCH64_SVE_Z0_REGNUM + i, zero_reg);
>>> 	}
>>> 
>>> @@ -240,7 +275,7 @@ aarch64_sve_regs_copy_from_reg_buf (const struct reg_buffer_common *reg_buf,
>>> 	 kernel, which is why we try to avoid it.  */
>>> 
>>>       bool has_sve_state = false;
>>> -      char *zero_reg = (char *) alloca (SVE_PT_SVE_ZREG_SIZE (vq));
>>> +      gdb_byte *zero_reg = (gdb_byte *) alloca (SVE_PT_SVE_ZREG_SIZE (vq));
>>>       struct user_fpsimd_state *fpsimd
>>> 	= (struct user_fpsimd_state *)(base + SVE_PT_FPSIMD_OFFSET);
>>> 
>>> @@ -274,6 +309,18 @@ aarch64_sve_regs_copy_from_reg_buf (const struct reg_buffer_common *reg_buf,
>>> 	 write out state and return.  */
>>>       if (!has_sve_state)
>>> 	{
>>> +	  /* WARNING: SIMD state is laid out in memory in target-endian format,
>>> +	     while SVE state is laid out in an endianness-independent format
>>> +	     (LE).
>>> +
>>> +	     So we have a couple cases to consider:
>>> +
>>> +	     1 - If the target is big endian, then SIMD state is big endian,
>>> +	     requiring a byteswap.
>>> +
>>> +	     2 - If the target is little endian, then SIMD state is little
>>> +	     endian, which matches the SVE format, so no byteswap is needed. */
>>> +
>>> 	  /* The collects of the Z registers will overflow the size of a vreg.
>>> 	     There is enough space in the structure to allow for this, but we
>>> 	     cannot overflow into the next register as we might not be
>>> @@ -285,7 +332,9 @@ aarch64_sve_regs_copy_from_reg_buf (const struct reg_buffer_common *reg_buf,
>>> 		  == reg_buf->get_register_status (AARCH64_SVE_Z0_REGNUM + i))
>>> 		{
>>> 		  reg_buf->raw_collect (AARCH64_SVE_Z0_REGNUM + i, zero_reg);
>>> -		  memcpy (&fpsimd->vregs[i], zero_reg, sizeof (__int128_t));
>>> +		  /* Handle big endian/little endian SIMD/SVE conversion.  */
>>> +		  aarch64_maybe_swab128 (zero_reg);
>>> +		  memcpy (&fpsimd->vregs[i], zero_reg, 16);
>>> 		}
>>> 	    }
>>> 
>>> diff --git a/gdb/trad-frame.c b/gdb/trad-frame.c
>>> index a6a84790a9..8a1aa818ad 100644
>>> --- a/gdb/trad-frame.c
>>> +++ b/gdb/trad-frame.c
>>> @@ -56,6 +56,7 @@ trad_frame_reset_saved_regs (struct gdbarch *gdbarch,
>>>     {
>>>       regs[regnum].realreg = regnum;
>>>       regs[regnum].addr = -1;
>>> +      regs[regnum].data = nullptr;
>>>     }
>>> }
>>> 
>>> @@ -83,7 +84,7 @@ trad_frame_alloc_saved_regs (struct frame_info *this_frame)
>>>   return trad_frame_alloc_saved_regs (gdbarch);
>>> }
>>> 
>>> -enum { TF_REG_VALUE = -1, TF_REG_UNKNOWN = -2 };
>>> +enum { TF_REG_VALUE = -1, TF_REG_UNKNOWN = -2, TF_REG_VALUE_BYTES = -3 };
>>> 
>>> int
>>> trad_frame_value_p (struct trad_frame_saved_reg this_saved_regs[], int regnum)
>>> @@ -106,6 +107,16 @@ trad_frame_realreg_p (struct trad_frame_saved_reg this_saved_regs[],
>>> 	  && this_saved_regs[regnum].addr == -1);
>>> }
>>> 
>>> +/* See trad-frame.h.  */
>>> +
>>> +bool
>>> +trad_frame_value_bytes_p (struct trad_frame_saved_reg this_saved_regs[],
>>> +			  int regnum)
>>> +{
>>> +  return (this_saved_regs[regnum].realreg == TF_REG_VALUE_BYTES
>>> +	  && this_saved_regs[regnum].data != nullptr);
>>> +}
>>> +
>>> void
>>> trad_frame_set_value (struct trad_frame_saved_reg this_saved_regs[],
>>> 		      int regnum, LONGEST val)
>>> @@ -224,6 +235,35 @@ trad_frame_set_unknown (struct trad_frame_saved_reg this_saved_regs[],
>>>   this_saved_regs[regnum].addr = -1;
>>> }
>>> 
>>> +/* See trad-frame.h.  */
>>> +
>>> +void
>>> +trad_frame_set_value_bytes (struct trad_frame_saved_reg this_saved_regs[],
>>> +			    int regnum, const gdb_byte *bytes,
>>> +			    size_t size)
>>> +{
>>> +  this_saved_regs[regnum].realreg = TF_REG_VALUE_BYTES;
>>> +
>>> +  /* Allocate the space and copy the data bytes.  */
>>> +  this_saved_regs[regnum].data = FRAME_OBSTACK_CALLOC (size, gdb_byte);
>> Am I right to assume this means data will be automatically unallocated when
>> the trad_frame_saved_reg goes out of scope?
> 
> Not when it goes out of scope, but when all the frame-related data is freed by GDB just before restarting inferior movement.

Ok.

> 
>>> +  memcpy (this_saved_regs[regnum].data, bytes, size);
>>> +}
>>> +
>>> +/* See trad-frame.h.  */
>>> +
>>> +void
>>> +trad_frame_set_reg_value_bytes (struct trad_frame_cache *this_trad_cache,
>>> +				int regnum, const gdb_byte *bytes,
>>> +				size_t size)
>>> +{
>>> +  /* External interface for users of trad_frame_cache
>>> +     (who cannot access the prev_regs object directly).  */
>>> +  trad_frame_set_value_bytes (this_trad_cache->prev_regs, regnum, bytes,
>>> +			      size);
>>> +}
>>> +
>>> +
>>> +
>>> struct value *
>>> trad_frame_get_prev_register (struct frame_info *this_frame,
>>> 			      struct trad_frame_saved_reg this_saved_regs[],
>>> @@ -240,6 +280,10 @@ trad_frame_get_prev_register (struct frame_info *this_frame,
>>>     /* The register's value is available.  */
>>>     return frame_unwind_got_constant (this_frame, regnum,
>>> 				      this_saved_regs[regnum].addr);
>>> +  else if (trad_frame_value_bytes_p (this_saved_regs, regnum))
>>> +    /* The register's value is available as a sequence of bytes.  */
>>> +    return frame_unwind_got_bytes (this_frame, regnum,
>>> +				   this_saved_regs[regnum].data);
>>>   else
>>>     return frame_unwind_got_optimized (this_frame, regnum);
>>> }
>>> diff --git a/gdb/trad-frame.h b/gdb/trad-frame.h
>>> index 7b5785616e..38db439579 100644
>>> --- a/gdb/trad-frame.h
>>> +++ b/gdb/trad-frame.h
>>> @@ -52,6 +52,12 @@ void trad_frame_set_reg_regmap (struct trad_frame_cache *this_trad_cache,
>>> void trad_frame_set_reg_value (struct trad_frame_cache *this_cache,
>>> 			       int regnum, LONGEST val);
>>> 
>>> +/* Given the cache in THIS_TRAD_CACHE, set the value of REGNUM to the bytes
>>> +   contained in BYTES with size SIZE.  */
>>> +void trad_frame_set_reg_value_bytes (struct trad_frame_cache *this_trad_cache,
>>> +				     int regnum, const gdb_byte *bytes,
>>> +				     size_t size);
>>> +
>>> struct value *trad_frame_get_register (struct trad_frame_cache *this_trad_cache,
>>> 				       struct frame_info *this_frame,
>>> 				       int regnum);
>>> @@ -86,6 +92,8 @@ struct trad_frame_saved_reg
>>> {
>>>   LONGEST addr; /* A CORE_ADDR fits in a longest.  */
>>>   int realreg;
>>> +  /* Register data (for values that don't fit in ADDR).  */
>>> +  gdb_byte *data;
>>> };
>>> 
>>> /* Encode REGNUM value in the trad-frame.  */
>>> @@ -104,6 +112,12 @@ void trad_frame_set_addr (struct trad_frame_saved_reg this_trad_cache[],
>>> void trad_frame_set_unknown (struct trad_frame_saved_reg this_saved_regs[],
>>> 			     int regnum);
>>> 
>>> +/* Encode REGNUM value in the trad-frame as a sequence of bytes.  This is
>>> +   useful when the value is larger than what primitive types can hold.  */
>>> +void trad_frame_set_value_bytes (struct trad_frame_saved_reg this_saved_regs[],
>>> +				 int regnum, const gdb_byte *bytes,
>>> +				 size_t size);
>>> +
>>> /* Convenience functions, return non-zero if the register has been
>>>    encoded as specified.  */
>>> int trad_frame_value_p (struct trad_frame_saved_reg this_saved_regs[],
>>> @@ -113,6 +127,11 @@ int trad_frame_addr_p (struct trad_frame_saved_reg this_saved_regs[],
>>> int trad_frame_realreg_p (struct trad_frame_saved_reg this_saved_regs[],
>>> 			  int regnum);
>>> 
>>> +/* Return TRUE if REGNUM is stored as a sequence of bytes, and FALSE
>>> +   otherwise.  */
>>> +bool trad_frame_value_bytes_p (struct trad_frame_saved_reg this_saved_regs[],
>>> +			      int regnum);
>>> +
>>> /* Reset the save regs cache, setting register values to -1.  */
>>> void trad_frame_reset_saved_regs (struct gdbarch *gdbarch,
>>> 				  struct trad_frame_saved_reg *regs);
>>> -- 
>>> 2.25.1
>>> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH][AArch64] SVE/FPSIMD fixup for big endian
  2020-12-01 17:38     ` Alan Hayward
@ 2020-12-01 18:40       ` Luis Machado
  2020-12-02  9:07         ` Alan Hayward
  0 siblings, 1 reply; 13+ messages in thread
From: Luis Machado @ 2020-12-01 18:40 UTC (permalink / raw)
  To: Alan Hayward; +Cc: gdb-patches\@sourceware.org, nd

On 12/1/20 2:38 PM, Alan Hayward wrote:
> 
> 
>> On 1 Dec 2020, at 12:19, Luis Machado <luis.machado@linaro.org> wrote:
>>
>> Hi,
>>
>> Thanks for the review.
>>
>> On 12/1/20 8:28 AM, Alan Hayward wrote:
>>>> On 30 Nov 2020, at 18:55, Luis Machado <luis.machado@linaro.org> wrote:
>>>>
>>>> The FPSIMD dump in signal frames and ptrace FPSIMD dump in the SVE context
>>>> structure follows the target endianness, whereas the SVE dumps are
>>>> endianness-independent (LE).
>>>>
>>>> Therefore, when the system is in BE mode, we need to reverse the bytes
>>>> for the FPSIMD data.
>>>>
>>>> Given the V registers are larger than 64-bit, I've added a way for value
>>>> bytes to be set, as opposed to passing a 64-bit fixed quantity. This fits
>>>> nicely with the unwinding *_got_bytes function and makes the trad-frame
>>>> more flexible and capable of saving larger registers.
>>>>
>>>> The memory for the bytes is allocated via the frame obstack, so it gets freed
>>>> after we're done inspecting the frame.
>>>>
>>>> gdb/ChangeLog:
>>>>
>>>> YYYY-MM-DD  Luis Machado  <luis.machado@linaro.org>
>>>>
>>>> 	* aarch64-linux-tdep.c (aarch64_linux_restore_vreg) New function.
>>>> 	(aarch64_linux_sigframe_init): Call aarch64_linux_restore_vreg.
>>>> 	* nat/aarch64-sve-linux-ptrace.c: Include endian.h.
>>>> 	(aarch64_maybe_swab128): New function.
>>>> 	(aarch64_sve_regs_copy_to_reg_buf)
>>>> 	(aarch64_sve_regs_copy_from_reg_buf): Adjust FPSIMD entries.
>>>> 	* trad-frame.c (trad_frame_reset_saved_regs): Initialize
>>>> 	the data field.
>>>> 	(TF_REG_VALUE_BYTES): New enum value.
>>>> 	(trad_frame_value_bytes_p): New function.
>>>> 	(trad_frame_set_value_bytes): New function.
>>>> 	(trad_frame_set_reg_value_bytes): New function.
>>>> 	(trad_frame_get_prev_register): Handle register values saved as bytes.
>>>> 	* trad-frame.h (trad_frame_set_reg_value_bytes): New prototype.
>>>> 	(struct trad_frame_saved_reg) <data>: New field.
>>>> 	(trad_frame_set_value_bytes): New prototype.
>>>> 	(trad_frame_value_bytes_p): New prototype.
>>>> ---
>>>> gdb/aarch64-linux-tdep.c           | 115 ++++++++++++++++++++++++-----
>>>> gdb/nat/aarch64-sve-linux-ptrace.c |  57 +++++++++++++-
>>>> gdb/trad-frame.c                   |  46 +++++++++++-
>>>> gdb/trad-frame.h                   |  19 +++++
>>>> 4 files changed, 213 insertions(+), 24 deletions(-)
>>>>
>>>> diff --git a/gdb/aarch64-linux-tdep.c b/gdb/aarch64-linux-tdep.c
>>>> index c9898bdafd..108f96be71 100644
>>>> --- a/gdb/aarch64-linux-tdep.c
>>>> +++ b/gdb/aarch64-linux-tdep.c
>>>> @@ -180,6 +180,94 @@ read_aarch64_ctx (CORE_ADDR ctx_addr, enum bfd_endian byte_order,
>>>>    return magic;
>>>> }
>>>>
>>>> +/* Given CACHE, use the trad_frame* functions to restore the FPSIMD
>>>> +   registers from a signal frame.
>>>> +
>>>> +   VREG_NUM is the number of the V register being restored, OFFSET is the
>>>> +   address containing the register value, BYTE_ORDER is the endianness and
>>>> +   HAS_SVE tells us if we have a valid SVE context or not.  */
>>>> +
>>>> +static void
>>>> +aarch64_linux_restore_vreg (struct trad_frame_cache *cache, int num_regs,
>>>> +			    int vreg_num, CORE_ADDR offset,
>>>> +			    enum bfd_endian byte_order, bool has_sve)
>>>> +{
>>>> +  /* WARNING: SIMD state is laid out in memory in target-endian format, while
>>>> +     SVE state is laid out in an endianness-independent format (LE).
>>>> +
>>>> +     So we have a couple cases to consider:
>>>> +
>>>> +     1 - If the target is big endian, then SIMD state is big endian,
>>>> +     requiring a byteswap.
>>>> +
>>>> +     2 - If the target is little endian, then SIMD state is little endian,
>>>> +     which matches the SVE format, so no byteswap is needed. */
>>>> +
>>> With this function, we are only handling FPSIMD values, so no need to mention SVE.
>>> As it is now, it makes the has_sve parts confusing because they are being treated
>>> the same as the rest of the fpsimd.
>>
>> Though we are handling FPSIMD, we still need to set at least one SVE pseudo-register (AARCH64_SVE_V0_REGNUM), hence why I decided to keep the warning. It is not really a SVE register, but a pseudo-register of the FPSIMD V register.
>>
>> Do you still want to remove the SVE references in the warning?
> 
> For this function, I would say remove the SVE comment, as everything in the function
> is being treated in the same way. But I’m not overly hung up on it.
> 

It now reads...

   /* WARNING: SIMD state is laid out in memory in target-endian format.

      So we have a couple cases to consider:

      1 - If the target is big endian, then SIMD state is big endian, 
requiring a byteswap.

      2 - If the target is little endian, then SIMD state is little 
endian, so no byteswap is needed. */


>>
>> What is the difference between AARCH64_SVE_V0_REGNUM and AARCH64_V0_REGNUM?
> 
> Z,V,Q,D,S,H,B registers all overlap with each other. It’s the same register,
> just a different size.
> In the tdep, when we only have neon, V is the real register, and Q,D,S,H,B are pseudos.
> Then when we have sve, Z is the real register, and V,Q,D,S,H,B are pseudos.
> 
> AARCH64_SVE_Z0_REGNUM == AARCH64_V0_REGNUM.
> 
> And then AARCH64_SVE_V0_REGNUM represents the regnum of the V0 pseudo register
> when SVE is enabled.
> 
> ARCH64_SVE_V0_REGNUM == AARCH64_B0_REGNUM + 32
> 
> Maybe eventually this should be updated to be similar to pauth. Have a single define
> which requires a v0_reg_base value in the tdep.
> 
> 
>>
>>> The same comment is fine when used elsewhere in the patch.
>>>> +  if (byte_order == BFD_ENDIAN_BIG)
>>>> +    {
>>>> +      gdb_byte buf[V_REGISTER_SIZE];
>>>> +
>>>> +      if (target_read_memory (offset, buf, V_REGISTER_SIZE) != 0)
>>>> +	{
>>>> +	  size_t size = V_REGISTER_SIZE/2;
>>>> +
>>>> +	  /* Read the two halves of the V register in reverse byte order.  */
>>>> +	  CORE_ADDR u64 = extract_unsigned_integer (buf, size,
>>>> +						    byte_order);
>>>> +	  CORE_ADDR l64 = extract_unsigned_integer (buf + size, size,
>>>> +						    byte_order);
>>>> +
>>>> +	  /* Copy the reversed bytes to the buffer.  */
>>>> +	  store_unsigned_integer (buf, size, BFD_ENDIAN_LITTLE, l64);
>>>> +	  store_unsigned_integer (buf + size , size, BFD_ENDIAN_LITTLE, u64);
>>>> +
>>>> +	  /* Now we can store the correct bytes for the V register.  */
>>>> +	  trad_frame_set_reg_value_bytes (cache, AARCH64_V0_REGNUM + vreg_num,
>>>> +					  buf, V_REGISTER_SIZE);
>>>> +	  trad_frame_set_reg_value_bytes (cache,
>>>> +					  num_regs + AARCH64_Q0_REGNUM
>>>> +					  + vreg_num, buf, Q_REGISTER_SIZE);
>>>> +	  trad_frame_set_reg_value_bytes (cache,
>>>> +					  num_regs + AARCH64_D0_REGNUM
>>>> +					  + vreg_num, buf, D_REGISTER_SIZE);
>>>> +	  trad_frame_set_reg_value_bytes (cache,
>>>> +					  num_regs + AARCH64_S0_REGNUM
>>>> +					  + vreg_num, buf, S_REGISTER_SIZE);
>>>> +	  trad_frame_set_reg_value_bytes (cache,
>>>> +					  num_regs + AARCH64_H0_REGNUM
>>>> +					  + vreg_num, buf, H_REGISTER_SIZE);
>>>> +	  trad_frame_set_reg_value_bytes (cache,
>>>> +					  num_regs + AARCH64_B0_REGNUM
>>>> +					  + vreg_num, buf, B_REGISTER_SIZE);
>>>> +
>>>> +	  if (has_sve)
>>>> +	    trad_frame_set_reg_value_bytes (cache,
>>>> +					    num_regs + AARCH64_SVE_V0_REGNUM
>>>> +					    + vreg_num, buf, V_REGISTER_SIZE);
>>>> +	}
>>>> +      return;
>>>> +    }
>>>> +
>>>> +  /* Little endian, just point at the address containing the register
>>>> +     value.  */
>>>> +  trad_frame_set_reg_addr (cache, AARCH64_V0_REGNUM + vreg_num, offset);
>>>> +  trad_frame_set_reg_addr (cache, num_regs + AARCH64_Q0_REGNUM + vreg_num,
>>>> +			   offset);
>>>> +  trad_frame_set_reg_addr (cache, num_regs + AARCH64_D0_REGNUM + vreg_num,
>>>> +			   offset);
>>>> +  trad_frame_set_reg_addr (cache, num_regs + AARCH64_S0_REGNUM + vreg_num,
>>>> +			   offset);
>>>> +  trad_frame_set_reg_addr (cache, num_regs + AARCH64_H0_REGNUM + vreg_num,
>>>> +			   offset);
>>>> +  trad_frame_set_reg_addr (cache, num_regs + AARCH64_B0_REGNUM + vreg_num,
>>>> +			   offset);
>>>> +
>>>> +  if (has_sve)
>>>> +    trad_frame_set_reg_addr (cache, num_regs + AARCH64_SVE_V0_REGNUM
>>>> +			     + vreg_num, offset);
>>>> +
>>>> +}
>>>> +
>>>> /* Implement the "init" method of struct tramp_frame.  */
>>>>
>>>> static void
>>>> @@ -332,27 +420,16 @@ aarch64_linux_sigframe_init (const struct tramp_frame *self,
>>>>
>>>>        /* If there was no SVE section then set up the V registers.  */
>>>>        if (sve_regs == 0)
>>>> -	for (int i = 0; i < 32; i++)
>>>> -	  {
>>>> -	    CORE_ADDR offset = (fpsimd + AARCH64_FPSIMD_V0_OFFSET
>>>> +	{
>>>> +	  for (int i = 0; i < 32; i++)
>>>> +	    {
>>>> +	      CORE_ADDR offset = (fpsimd + AARCH64_FPSIMD_V0_OFFSET
>>>> 				  + (i * AARCH64_FPSIMD_VREG_SIZE));
>>>>
>>>> -	    trad_frame_set_reg_addr (this_cache, AARCH64_V0_REGNUM + i, offset);
>>>> -	    trad_frame_set_reg_addr (this_cache,
>>>> -				     num_regs + AARCH64_Q0_REGNUM + i, offset);
>>>> -	    trad_frame_set_reg_addr (this_cache,
>>>> -				     num_regs + AARCH64_D0_REGNUM + i, offset);
>>>> -	    trad_frame_set_reg_addr (this_cache,
>>>> -				     num_regs + AARCH64_S0_REGNUM + i, offset);
>>>> -	    trad_frame_set_reg_addr (this_cache,
>>>> -				     num_regs + AARCH64_H0_REGNUM + i, offset);
>>>> -	    trad_frame_set_reg_addr (this_cache,
>>>> -				     num_regs + AARCH64_B0_REGNUM + i, offset);
>>>> -	    if (tdep->has_sve ())
>>>> -	      trad_frame_set_reg_addr (this_cache,
>>>> -				       num_regs + AARCH64_SVE_V0_REGNUM + i,
>>>> -				       offset);
>>>> -	  }
>>>> +	      aarch64_linux_restore_vreg (this_cache, num_regs, i, offset,
>>>> +					  byte_order, tdep->has_sve ());
>>>> +	    }
>>>> +	}
>>>>      }
>>>>
>>>>    trad_frame_set_id (this_cache, frame_id_build (sp, func));
>>>> diff --git a/gdb/nat/aarch64-sve-linux-ptrace.c b/gdb/nat/aarch64-sve-linux-ptrace.c
>>>> index 2ce90ccfd7..9ef5e91801 100644
>>>> --- a/gdb/nat/aarch64-sve-linux-ptrace.c
>>>> +++ b/gdb/nat/aarch64-sve-linux-ptrace.c
>>>> @@ -26,6 +26,7 @@
>>>> #include "arch/aarch64.h"
>>>> #include "gdbsupport/common-regcache.h"
>>>> #include "gdbsupport/byte-vector.h"
>>>> +#include <endian.h>
>>>>
>>>> /* See nat/aarch64-sve-linux-ptrace.h.  */
>>>>
>>>> @@ -142,6 +143,27 @@ aarch64_sve_get_sveregs (int tid)
>>>>    return buf;
>>>> }
>>>>
>>>> +/* If we are running in BE mode, convert the contents
>>>> +   of VALUE (a 16 byte buffer) to LE.  */
>>>> +
>>>> +static void
>>>> +aarch64_maybe_swab128 (gdb_byte *value)
>>>> +{
>>>> +  gdb_assert (value != nullptr);
>>>> +
>>>> +#if (__BYTE_ORDER == __BIG_ENDIAN)
>>>> +  gdb_byte copy[16];
>>>> +
>>>> +  /* Save the original value.  */
>>>> +  memcpy (copy, value, 16);
>>>> +
>>>> +  for (int i = 0; i < 15; i++)
>>>> +    value[i] = copy[15 - i];
>>>> +#else
>>>> +  /* Nothing to be done.  */
>>>> +#endif
>>>> +}
>>>> +
>>>> /* See nat/aarch64-sve-linux-ptrace.h.  */
>>>>
>>>> void
>>>> @@ -184,11 +206,22 @@ aarch64_sve_regs_copy_to_reg_buf (struct reg_buffer_common *reg_buf,
>>>>      }
>>>>    else
>>>>      {
>>>> +      /* WARNING: SIMD state is laid out in memory in target-endian format,
>>>> +	 while SVE state is laid out in an endianness-independent format (LE).
>>>> +
>>>> +	 So we have a couple cases to consider:
>>>> +
>>>> +	 1 - If the target is big endian, then SIMD state is big endian,
>>>> +	 requiring a byteswap.
>>>> +
>>>> +	 2 - If the target is little endian, then SIMD state is little endian,
>>>> +	 which matches the SVE format, so no byteswap is needed. */
>>>> +
>>>>        /* There is no SVE state yet - the register dump contains a fpsimd
>>>> 	 structure instead.  These registers still exist in the hardware, but
>>>> 	 the kernel has not yet initialised them, and so they will be null.  */
>>>>
>>>> -      char *zero_reg = (char *) alloca (SVE_PT_SVE_ZREG_SIZE (vq));
>>>> +      gdb_byte *zero_reg = (gdb_byte *) alloca (SVE_PT_SVE_ZREG_SIZE (vq));
>>>>        struct user_fpsimd_state *fpsimd
>>>> 	= (struct user_fpsimd_state *)(base + SVE_PT_FPSIMD_OFFSET);
>>>>
>>>> @@ -199,7 +232,9 @@ aarch64_sve_regs_copy_to_reg_buf (struct reg_buffer_common *reg_buf,
>>>>
>>>>        for (int i = 0; i < AARCH64_SVE_Z_REGS_NUM; i++)
>>>> 	{
>>>> -	  memcpy (zero_reg, &fpsimd->vregs[i], sizeof (__int128_t));
>>>> +	  memcpy (zero_reg, &fpsimd->vregs[i], 16);
>>>> +	  /* Handle big endian/little endian SIMD/SVE conversion.  */
>>>> +	  aarch64_maybe_swab128 (zero_reg);
>>> I think we have a long standing bug here. zero_reg was meant to stay as the value 0. But then
>>> it got reused as a general temp buffer.
>>> It’s not shown in the diff, but we do:
>>> * memset zero_reg to 0
>>> * use zero_reg as a temp buffer for copying fpsimd values.
>>> * use zero_reg as value 0 for fpsr and fpcr.
>>> The memset needs moving after using it for fpsimd. (maybe also rename zero_reg to buf?)
>>
>> I'll address this.
>>
>>> Can we also reduce the number of memcpys - just byte swap vregs[i] directly into the zero_reg buffer?
>>
>> The byte swap function only swaps things for __BYTE_ORDER == __BIG_ENDIAN, we would still need to memcpy the bytes to zero_reg. The number of memcpy's would be the same, no?
> 
> For LE, we want to copy into zero_reg to ensure it is zero padded after 128bits, then call raw_supply.
> So that’s all fine.
> 
> For BE, we could just byteswap directly from fpsimd->vregs[i] into zero_reg (?) No need for any memcpys.
> 
> 

Are you suggesting rewriting aarch64_maybe_swab128 to take dst and src 
parameters, so we can byteswap from vregs[i] into zero_reg? Right now 
aarch64_maybe_swab128 only swaps things in place.

>>
>>>> 	  reg_buf->raw_supply (AARCH64_SVE_Z0_REGNUM + i, zero_reg);
>>>> 	}
>>>>
>>>> @@ -240,7 +275,7 @@ aarch64_sve_regs_copy_from_reg_buf (const struct reg_buffer_common *reg_buf,
>>>> 	 kernel, which is why we try to avoid it.  */
>>>>
>>>>        bool has_sve_state = false;
>>>> -      char *zero_reg = (char *) alloca (SVE_PT_SVE_ZREG_SIZE (vq));
>>>> +      gdb_byte *zero_reg = (gdb_byte *) alloca (SVE_PT_SVE_ZREG_SIZE (vq));
>>>>        struct user_fpsimd_state *fpsimd
>>>> 	= (struct user_fpsimd_state *)(base + SVE_PT_FPSIMD_OFFSET);
>>>>
>>>> @@ -274,6 +309,18 @@ aarch64_sve_regs_copy_from_reg_buf (const struct reg_buffer_common *reg_buf,
>>>> 	 write out state and return.  */
>>>>        if (!has_sve_state)
>>>> 	{
>>>> +	  /* WARNING: SIMD state is laid out in memory in target-endian format,
>>>> +	     while SVE state is laid out in an endianness-independent format
>>>> +	     (LE).
>>>> +
>>>> +	     So we have a couple cases to consider:
>>>> +
>>>> +	     1 - If the target is big endian, then SIMD state is big endian,
>>>> +	     requiring a byteswap.
>>>> +
>>>> +	     2 - If the target is little endian, then SIMD state is little
>>>> +	     endian, which matches the SVE format, so no byteswap is needed. */
>>>> +
>>>> 	  /* The collects of the Z registers will overflow the size of a vreg.
>>>> 	     There is enough space in the structure to allow for this, but we
>>>> 	     cannot overflow into the next register as we might not be
>>>> @@ -285,7 +332,9 @@ aarch64_sve_regs_copy_from_reg_buf (const struct reg_buffer_common *reg_buf,
>>>> 		  == reg_buf->get_register_status (AARCH64_SVE_Z0_REGNUM + i))
>>>> 		{
>>>> 		  reg_buf->raw_collect (AARCH64_SVE_Z0_REGNUM + i, zero_reg);
>>>> -		  memcpy (&fpsimd->vregs[i], zero_reg, sizeof (__int128_t));
>>>> +		  /* Handle big endian/little endian SIMD/SVE conversion.  */
>>>> +		  aarch64_maybe_swab128 (zero_reg);
>>>> +		  memcpy (&fpsimd->vregs[i], zero_reg, 16);
>>>> 		}
>>>> 	    }
>>>>
>>>> diff --git a/gdb/trad-frame.c b/gdb/trad-frame.c
>>>> index a6a84790a9..8a1aa818ad 100644
>>>> --- a/gdb/trad-frame.c
>>>> +++ b/gdb/trad-frame.c
>>>> @@ -56,6 +56,7 @@ trad_frame_reset_saved_regs (struct gdbarch *gdbarch,
>>>>      {
>>>>        regs[regnum].realreg = regnum;
>>>>        regs[regnum].addr = -1;
>>>> +      regs[regnum].data = nullptr;
>>>>      }
>>>> }
>>>>
>>>> @@ -83,7 +84,7 @@ trad_frame_alloc_saved_regs (struct frame_info *this_frame)
>>>>    return trad_frame_alloc_saved_regs (gdbarch);
>>>> }
>>>>
>>>> -enum { TF_REG_VALUE = -1, TF_REG_UNKNOWN = -2 };
>>>> +enum { TF_REG_VALUE = -1, TF_REG_UNKNOWN = -2, TF_REG_VALUE_BYTES = -3 };
>>>>
>>>> int
>>>> trad_frame_value_p (struct trad_frame_saved_reg this_saved_regs[], int regnum)
>>>> @@ -106,6 +107,16 @@ trad_frame_realreg_p (struct trad_frame_saved_reg this_saved_regs[],
>>>> 	  && this_saved_regs[regnum].addr == -1);
>>>> }
>>>>
>>>> +/* See trad-frame.h.  */
>>>> +
>>>> +bool
>>>> +trad_frame_value_bytes_p (struct trad_frame_saved_reg this_saved_regs[],
>>>> +			  int regnum)
>>>> +{
>>>> +  return (this_saved_regs[regnum].realreg == TF_REG_VALUE_BYTES
>>>> +	  && this_saved_regs[regnum].data != nullptr);
>>>> +}
>>>> +
>>>> void
>>>> trad_frame_set_value (struct trad_frame_saved_reg this_saved_regs[],
>>>> 		      int regnum, LONGEST val)
>>>> @@ -224,6 +235,35 @@ trad_frame_set_unknown (struct trad_frame_saved_reg this_saved_regs[],
>>>>    this_saved_regs[regnum].addr = -1;
>>>> }
>>>>
>>>> +/* See trad-frame.h.  */
>>>> +
>>>> +void
>>>> +trad_frame_set_value_bytes (struct trad_frame_saved_reg this_saved_regs[],
>>>> +			    int regnum, const gdb_byte *bytes,
>>>> +			    size_t size)
>>>> +{
>>>> +  this_saved_regs[regnum].realreg = TF_REG_VALUE_BYTES;
>>>> +
>>>> +  /* Allocate the space and copy the data bytes.  */
>>>> +  this_saved_regs[regnum].data = FRAME_OBSTACK_CALLOC (size, gdb_byte);
>>> Am I right to assume this means data will be automatically unallocated when
>>> the trad_frame_saved_reg goes out of scope?
>>
>> Not when it goes out of scope, but when all the frame-related data is freed by GDB just before restarting inferior movement.
> 
> Ok.
> 
>>
>>>> +  memcpy (this_saved_regs[regnum].data, bytes, size);
>>>> +}
>>>> +
>>>> +/* See trad-frame.h.  */
>>>> +
>>>> +void
>>>> +trad_frame_set_reg_value_bytes (struct trad_frame_cache *this_trad_cache,
>>>> +				int regnum, const gdb_byte *bytes,
>>>> +				size_t size)
>>>> +{
>>>> +  /* External interface for users of trad_frame_cache
>>>> +     (who cannot access the prev_regs object directly).  */
>>>> +  trad_frame_set_value_bytes (this_trad_cache->prev_regs, regnum, bytes,
>>>> +			      size);
>>>> +}
>>>> +
>>>> +
>>>> +
>>>> struct value *
>>>> trad_frame_get_prev_register (struct frame_info *this_frame,
>>>> 			      struct trad_frame_saved_reg this_saved_regs[],
>>>> @@ -240,6 +280,10 @@ trad_frame_get_prev_register (struct frame_info *this_frame,
>>>>      /* The register's value is available.  */
>>>>      return frame_unwind_got_constant (this_frame, regnum,
>>>> 				      this_saved_regs[regnum].addr);
>>>> +  else if (trad_frame_value_bytes_p (this_saved_regs, regnum))
>>>> +    /* The register's value is available as a sequence of bytes.  */
>>>> +    return frame_unwind_got_bytes (this_frame, regnum,
>>>> +				   this_saved_regs[regnum].data);
>>>>    else
>>>>      return frame_unwind_got_optimized (this_frame, regnum);
>>>> }
>>>> diff --git a/gdb/trad-frame.h b/gdb/trad-frame.h
>>>> index 7b5785616e..38db439579 100644
>>>> --- a/gdb/trad-frame.h
>>>> +++ b/gdb/trad-frame.h
>>>> @@ -52,6 +52,12 @@ void trad_frame_set_reg_regmap (struct trad_frame_cache *this_trad_cache,
>>>> void trad_frame_set_reg_value (struct trad_frame_cache *this_cache,
>>>> 			       int regnum, LONGEST val);
>>>>
>>>> +/* Given the cache in THIS_TRAD_CACHE, set the value of REGNUM to the bytes
>>>> +   contained in BYTES with size SIZE.  */
>>>> +void trad_frame_set_reg_value_bytes (struct trad_frame_cache *this_trad_cache,
>>>> +				     int regnum, const gdb_byte *bytes,
>>>> +				     size_t size);
>>>> +
>>>> struct value *trad_frame_get_register (struct trad_frame_cache *this_trad_cache,
>>>> 				       struct frame_info *this_frame,
>>>> 				       int regnum);
>>>> @@ -86,6 +92,8 @@ struct trad_frame_saved_reg
>>>> {
>>>>    LONGEST addr; /* A CORE_ADDR fits in a longest.  */
>>>>    int realreg;
>>>> +  /* Register data (for values that don't fit in ADDR).  */
>>>> +  gdb_byte *data;
>>>> };
>>>>
>>>> /* Encode REGNUM value in the trad-frame.  */
>>>> @@ -104,6 +112,12 @@ void trad_frame_set_addr (struct trad_frame_saved_reg this_trad_cache[],
>>>> void trad_frame_set_unknown (struct trad_frame_saved_reg this_saved_regs[],
>>>> 			     int regnum);
>>>>
>>>> +/* Encode REGNUM value in the trad-frame as a sequence of bytes.  This is
>>>> +   useful when the value is larger than what primitive types can hold.  */
>>>> +void trad_frame_set_value_bytes (struct trad_frame_saved_reg this_saved_regs[],
>>>> +				 int regnum, const gdb_byte *bytes,
>>>> +				 size_t size);
>>>> +
>>>> /* Convenience functions, return non-zero if the register has been
>>>>     encoded as specified.  */
>>>> int trad_frame_value_p (struct trad_frame_saved_reg this_saved_regs[],
>>>> @@ -113,6 +127,11 @@ int trad_frame_addr_p (struct trad_frame_saved_reg this_saved_regs[],
>>>> int trad_frame_realreg_p (struct trad_frame_saved_reg this_saved_regs[],
>>>> 			  int regnum);
>>>>
>>>> +/* Return TRUE if REGNUM is stored as a sequence of bytes, and FALSE
>>>> +   otherwise.  */
>>>> +bool trad_frame_value_bytes_p (struct trad_frame_saved_reg this_saved_regs[],
>>>> +			      int regnum);
>>>> +
>>>> /* Reset the save regs cache, setting register values to -1.  */
>>>> void trad_frame_reset_saved_regs (struct gdbarch *gdbarch,
>>>> 				  struct trad_frame_saved_reg *regs);
>>>> -- 
>>>> 2.25.1
>>>>
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH][AArch64] SVE/FPSIMD fixup for big endian
  2020-12-01 18:40       ` Luis Machado
@ 2020-12-02  9:07         ` Alan Hayward
  0 siblings, 0 replies; 13+ messages in thread
From: Alan Hayward @ 2020-12-02  9:07 UTC (permalink / raw)
  To: Luis Machado; +Cc: gdb-patches\@sourceware.org, nd



> On 1 Dec 2020, at 18:40, Luis Machado <luis.machado@linaro.org> wrote:
> 
> On 12/1/20 2:38 PM, Alan Hayward wrote:
>>> On 1 Dec 2020, at 12:19, Luis Machado <luis.machado@linaro.org> wrote:
>>> 
>>> Hi,
>>> 
>>> Thanks for the review.
>>> 
>>> On 12/1/20 8:28 AM, Alan Hayward wrote:
>>>>> On 30 Nov 2020, at 18:55, Luis Machado <luis.machado@linaro.org> wrote:
>>>>> 
>>>>> The FPSIMD dump in signal frames and ptrace FPSIMD dump in the SVE context
>>>>> structure follows the target endianness, whereas the SVE dumps are
>>>>> endianness-independent (LE).
>>>>> 
>>>>> Therefore, when the system is in BE mode, we need to reverse the bytes
>>>>> for the FPSIMD data.
>>>>> 
>>>>> Given the V registers are larger than 64-bit, I've added a way for value
>>>>> bytes to be set, as opposed to passing a 64-bit fixed quantity. This fits
>>>>> nicely with the unwinding *_got_bytes function and makes the trad-frame
>>>>> more flexible and capable of saving larger registers.
>>>>> 
>>>>> The memory for the bytes is allocated via the frame obstack, so it gets freed
>>>>> after we're done inspecting the frame.
>>>>> 
>>>>> gdb/ChangeLog:
>>>>> 
>>>>> YYYY-MM-DD  Luis Machado  <luis.machado@linaro.org>
>>>>> 
>>>>> 	* aarch64-linux-tdep.c (aarch64_linux_restore_vreg) New function.
>>>>> 	(aarch64_linux_sigframe_init): Call aarch64_linux_restore_vreg.
>>>>> 	* nat/aarch64-sve-linux-ptrace.c: Include endian.h.
>>>>> 	(aarch64_maybe_swab128): New function.
>>>>> 	(aarch64_sve_regs_copy_to_reg_buf)
>>>>> 	(aarch64_sve_regs_copy_from_reg_buf): Adjust FPSIMD entries.
>>>>> 	* trad-frame.c (trad_frame_reset_saved_regs): Initialize
>>>>> 	the data field.
>>>>> 	(TF_REG_VALUE_BYTES): New enum value.
>>>>> 	(trad_frame_value_bytes_p): New function.
>>>>> 	(trad_frame_set_value_bytes): New function.
>>>>> 	(trad_frame_set_reg_value_bytes): New function.
>>>>> 	(trad_frame_get_prev_register): Handle register values saved as bytes.
>>>>> 	* trad-frame.h (trad_frame_set_reg_value_bytes): New prototype.
>>>>> 	(struct trad_frame_saved_reg) <data>: New field.
>>>>> 	(trad_frame_set_value_bytes): New prototype.
>>>>> 	(trad_frame_value_bytes_p): New prototype.
>>>>> ---
>>>>> gdb/aarch64-linux-tdep.c           | 115 ++++++++++++++++++++++++-----
>>>>> gdb/nat/aarch64-sve-linux-ptrace.c |  57 +++++++++++++-
>>>>> gdb/trad-frame.c                   |  46 +++++++++++-
>>>>> gdb/trad-frame.h                   |  19 +++++
>>>>> 4 files changed, 213 insertions(+), 24 deletions(-)
>>>>> 
>>>>> diff --git a/gdb/aarch64-linux-tdep.c b/gdb/aarch64-linux-tdep.c
>>>>> index c9898bdafd..108f96be71 100644
>>>>> --- a/gdb/aarch64-linux-tdep.c
>>>>> +++ b/gdb/aarch64-linux-tdep.c
>>>>> @@ -180,6 +180,94 @@ read_aarch64_ctx (CORE_ADDR ctx_addr, enum bfd_endian byte_order,
>>>>>   return magic;
>>>>> }
>>>>> 
>>>>> +/* Given CACHE, use the trad_frame* functions to restore the FPSIMD
>>>>> +   registers from a signal frame.
>>>>> +
>>>>> +   VREG_NUM is the number of the V register being restored, OFFSET is the
>>>>> +   address containing the register value, BYTE_ORDER is the endianness and
>>>>> +   HAS_SVE tells us if we have a valid SVE context or not.  */
>>>>> +
>>>>> +static void
>>>>> +aarch64_linux_restore_vreg (struct trad_frame_cache *cache, int num_regs,
>>>>> +			    int vreg_num, CORE_ADDR offset,
>>>>> +			    enum bfd_endian byte_order, bool has_sve)
>>>>> +{
>>>>> +  /* WARNING: SIMD state is laid out in memory in target-endian format, while
>>>>> +     SVE state is laid out in an endianness-independent format (LE).
>>>>> +
>>>>> +     So we have a couple cases to consider:
>>>>> +
>>>>> +     1 - If the target is big endian, then SIMD state is big endian,
>>>>> +     requiring a byteswap.
>>>>> +
>>>>> +     2 - If the target is little endian, then SIMD state is little endian,
>>>>> +     which matches the SVE format, so no byteswap is needed. */
>>>>> +
>>>> With this function, we are only handling FPSIMD values, so no need to mention SVE.
>>>> As it is now, it makes the has_sve parts confusing because they are being treated
>>>> the same as the rest of the fpsimd.
>>> 
>>> Though we are handling FPSIMD, we still need to set at least one SVE pseudo-register (AARCH64_SVE_V0_REGNUM), hence why I decided to keep the warning. It is not really a SVE register, but a pseudo-register of the FPSIMD V register.
>>> 
>>> Do you still want to remove the SVE references in the warning?
>> For this function, I would say remove the SVE comment, as everything in the function
>> is being treated in the same way. But I’m not overly hung up on it.
> 
> It now reads...
> 
>  /* WARNING: SIMD state is laid out in memory in target-endian format.
> 
>     So we have a couple cases to consider:
> 
>     1 - If the target is big endian, then SIMD state is big endian, requiring a byteswap.
> 
>     2 - If the target is little endian, then SIMD state is little endian, so no byteswap is needed. */
> 
> 

Ok.

>>> 
>>> What is the difference between AARCH64_SVE_V0_REGNUM and AARCH64_V0_REGNUM?
>> Z,V,Q,D,S,H,B registers all overlap with each other. It’s the same register,
>> just a different size.
>> In the tdep, when we only have neon, V is the real register, and Q,D,S,H,B are pseudos.
>> Then when we have sve, Z is the real register, and V,Q,D,S,H,B are pseudos.
>> AARCH64_SVE_Z0_REGNUM == AARCH64_V0_REGNUM.
>> And then AARCH64_SVE_V0_REGNUM represents the regnum of the V0 pseudo register
>> when SVE is enabled.
>> ARCH64_SVE_V0_REGNUM == AARCH64_B0_REGNUM + 32
>> Maybe eventually this should be updated to be similar to pauth. Have a single define
>> which requires a v0_reg_base value in the tdep.
>>> 
>>>> The same comment is fine when used elsewhere in the patch.
>>>>> +  if (byte_order == BFD_ENDIAN_BIG)
>>>>> +    {
>>>>> +      gdb_byte buf[V_REGISTER_SIZE];
>>>>> +
>>>>> +      if (target_read_memory (offset, buf, V_REGISTER_SIZE) != 0)
>>>>> +	{
>>>>> +	  size_t size = V_REGISTER_SIZE/2;
>>>>> +
>>>>> +	  /* Read the two halves of the V register in reverse byte order.  */
>>>>> +	  CORE_ADDR u64 = extract_unsigned_integer (buf, size,
>>>>> +						    byte_order);
>>>>> +	  CORE_ADDR l64 = extract_unsigned_integer (buf + size, size,
>>>>> +						    byte_order);
>>>>> +
>>>>> +	  /* Copy the reversed bytes to the buffer.  */
>>>>> +	  store_unsigned_integer (buf, size, BFD_ENDIAN_LITTLE, l64);
>>>>> +	  store_unsigned_integer (buf + size , size, BFD_ENDIAN_LITTLE, u64);
>>>>> +
>>>>> +	  /* Now we can store the correct bytes for the V register.  */
>>>>> +	  trad_frame_set_reg_value_bytes (cache, AARCH64_V0_REGNUM + vreg_num,
>>>>> +					  buf, V_REGISTER_SIZE);
>>>>> +	  trad_frame_set_reg_value_bytes (cache,
>>>>> +					  num_regs + AARCH64_Q0_REGNUM
>>>>> +					  + vreg_num, buf, Q_REGISTER_SIZE);
>>>>> +	  trad_frame_set_reg_value_bytes (cache,
>>>>> +					  num_regs + AARCH64_D0_REGNUM
>>>>> +					  + vreg_num, buf, D_REGISTER_SIZE);
>>>>> +	  trad_frame_set_reg_value_bytes (cache,
>>>>> +					  num_regs + AARCH64_S0_REGNUM
>>>>> +					  + vreg_num, buf, S_REGISTER_SIZE);
>>>>> +	  trad_frame_set_reg_value_bytes (cache,
>>>>> +					  num_regs + AARCH64_H0_REGNUM
>>>>> +					  + vreg_num, buf, H_REGISTER_SIZE);
>>>>> +	  trad_frame_set_reg_value_bytes (cache,
>>>>> +					  num_regs + AARCH64_B0_REGNUM
>>>>> +					  + vreg_num, buf, B_REGISTER_SIZE);
>>>>> +
>>>>> +	  if (has_sve)
>>>>> +	    trad_frame_set_reg_value_bytes (cache,
>>>>> +					    num_regs + AARCH64_SVE_V0_REGNUM
>>>>> +					    + vreg_num, buf, V_REGISTER_SIZE);
>>>>> +	}
>>>>> +      return;
>>>>> +    }
>>>>> +
>>>>> +  /* Little endian, just point at the address containing the register
>>>>> +     value.  */
>>>>> +  trad_frame_set_reg_addr (cache, AARCH64_V0_REGNUM + vreg_num, offset);
>>>>> +  trad_frame_set_reg_addr (cache, num_regs + AARCH64_Q0_REGNUM + vreg_num,
>>>>> +			   offset);
>>>>> +  trad_frame_set_reg_addr (cache, num_regs + AARCH64_D0_REGNUM + vreg_num,
>>>>> +			   offset);
>>>>> +  trad_frame_set_reg_addr (cache, num_regs + AARCH64_S0_REGNUM + vreg_num,
>>>>> +			   offset);
>>>>> +  trad_frame_set_reg_addr (cache, num_regs + AARCH64_H0_REGNUM + vreg_num,
>>>>> +			   offset);
>>>>> +  trad_frame_set_reg_addr (cache, num_regs + AARCH64_B0_REGNUM + vreg_num,
>>>>> +			   offset);
>>>>> +
>>>>> +  if (has_sve)
>>>>> +    trad_frame_set_reg_addr (cache, num_regs + AARCH64_SVE_V0_REGNUM
>>>>> +			     + vreg_num, offset);
>>>>> +
>>>>> +}
>>>>> +
>>>>> /* Implement the "init" method of struct tramp_frame.  */
>>>>> 
>>>>> static void
>>>>> @@ -332,27 +420,16 @@ aarch64_linux_sigframe_init (const struct tramp_frame *self,
>>>>> 
>>>>>       /* If there was no SVE section then set up the V registers.  */
>>>>>       if (sve_regs == 0)
>>>>> -	for (int i = 0; i < 32; i++)
>>>>> -	  {
>>>>> -	    CORE_ADDR offset = (fpsimd + AARCH64_FPSIMD_V0_OFFSET
>>>>> +	{
>>>>> +	  for (int i = 0; i < 32; i++)
>>>>> +	    {
>>>>> +	      CORE_ADDR offset = (fpsimd + AARCH64_FPSIMD_V0_OFFSET
>>>>> 				  + (i * AARCH64_FPSIMD_VREG_SIZE));
>>>>> 
>>>>> -	    trad_frame_set_reg_addr (this_cache, AARCH64_V0_REGNUM + i, offset);
>>>>> -	    trad_frame_set_reg_addr (this_cache,
>>>>> -				     num_regs + AARCH64_Q0_REGNUM + i, offset);
>>>>> -	    trad_frame_set_reg_addr (this_cache,
>>>>> -				     num_regs + AARCH64_D0_REGNUM + i, offset);
>>>>> -	    trad_frame_set_reg_addr (this_cache,
>>>>> -				     num_regs + AARCH64_S0_REGNUM + i, offset);
>>>>> -	    trad_frame_set_reg_addr (this_cache,
>>>>> -				     num_regs + AARCH64_H0_REGNUM + i, offset);
>>>>> -	    trad_frame_set_reg_addr (this_cache,
>>>>> -				     num_regs + AARCH64_B0_REGNUM + i, offset);
>>>>> -	    if (tdep->has_sve ())
>>>>> -	      trad_frame_set_reg_addr (this_cache,
>>>>> -				       num_regs + AARCH64_SVE_V0_REGNUM + i,
>>>>> -				       offset);
>>>>> -	  }
>>>>> +	      aarch64_linux_restore_vreg (this_cache, num_regs, i, offset,
>>>>> +					  byte_order, tdep->has_sve ());
>>>>> +	    }
>>>>> +	}
>>>>>     }
>>>>> 
>>>>>   trad_frame_set_id (this_cache, frame_id_build (sp, func));
>>>>> diff --git a/gdb/nat/aarch64-sve-linux-ptrace.c b/gdb/nat/aarch64-sve-linux-ptrace.c
>>>>> index 2ce90ccfd7..9ef5e91801 100644
>>>>> --- a/gdb/nat/aarch64-sve-linux-ptrace.c
>>>>> +++ b/gdb/nat/aarch64-sve-linux-ptrace.c
>>>>> @@ -26,6 +26,7 @@
>>>>> #include "arch/aarch64.h"
>>>>> #include "gdbsupport/common-regcache.h"
>>>>> #include "gdbsupport/byte-vector.h"
>>>>> +#include <endian.h>
>>>>> 
>>>>> /* See nat/aarch64-sve-linux-ptrace.h.  */
>>>>> 
>>>>> @@ -142,6 +143,27 @@ aarch64_sve_get_sveregs (int tid)
>>>>>   return buf;
>>>>> }
>>>>> 
>>>>> +/* If we are running in BE mode, convert the contents
>>>>> +   of VALUE (a 16 byte buffer) to LE.  */
>>>>> +
>>>>> +static void
>>>>> +aarch64_maybe_swab128 (gdb_byte *value)
>>>>> +{
>>>>> +  gdb_assert (value != nullptr);
>>>>> +
>>>>> +#if (__BYTE_ORDER == __BIG_ENDIAN)
>>>>> +  gdb_byte copy[16];
>>>>> +
>>>>> +  /* Save the original value.  */
>>>>> +  memcpy (copy, value, 16);
>>>>> +
>>>>> +  for (int i = 0; i < 15; i++)
>>>>> +    value[i] = copy[15 - i];
>>>>> +#else
>>>>> +  /* Nothing to be done.  */
>>>>> +#endif
>>>>> +}
>>>>> +
>>>>> /* See nat/aarch64-sve-linux-ptrace.h.  */
>>>>> 
>>>>> void
>>>>> @@ -184,11 +206,22 @@ aarch64_sve_regs_copy_to_reg_buf (struct reg_buffer_common *reg_buf,
>>>>>     }
>>>>>   else
>>>>>     {
>>>>> +      /* WARNING: SIMD state is laid out in memory in target-endian format,
>>>>> +	 while SVE state is laid out in an endianness-independent format (LE).
>>>>> +
>>>>> +	 So we have a couple cases to consider:
>>>>> +
>>>>> +	 1 - If the target is big endian, then SIMD state is big endian,
>>>>> +	 requiring a byteswap.
>>>>> +
>>>>> +	 2 - If the target is little endian, then SIMD state is little endian,
>>>>> +	 which matches the SVE format, so no byteswap is needed. */
>>>>> +
>>>>>       /* There is no SVE state yet - the register dump contains a fpsimd
>>>>> 	 structure instead.  These registers still exist in the hardware, but
>>>>> 	 the kernel has not yet initialised them, and so they will be null.  */
>>>>> 
>>>>> -      char *zero_reg = (char *) alloca (SVE_PT_SVE_ZREG_SIZE (vq));
>>>>> +      gdb_byte *zero_reg = (gdb_byte *) alloca (SVE_PT_SVE_ZREG_SIZE (vq));
>>>>>       struct user_fpsimd_state *fpsimd
>>>>> 	= (struct user_fpsimd_state *)(base + SVE_PT_FPSIMD_OFFSET);
>>>>> 
>>>>> @@ -199,7 +232,9 @@ aarch64_sve_regs_copy_to_reg_buf (struct reg_buffer_common *reg_buf,
>>>>> 
>>>>>       for (int i = 0; i < AARCH64_SVE_Z_REGS_NUM; i++)
>>>>> 	{
>>>>> -	  memcpy (zero_reg, &fpsimd->vregs[i], sizeof (__int128_t));
>>>>> +	  memcpy (zero_reg, &fpsimd->vregs[i], 16);
>>>>> +	  /* Handle big endian/little endian SIMD/SVE conversion.  */
>>>>> +	  aarch64_maybe_swab128 (zero_reg);
>>>> I think we have a long standing bug here. zero_reg was meant to stay as the value 0. But then
>>>> it got reused as a general temp buffer.
>>>> It’s not shown in the diff, but we do:
>>>> * memset zero_reg to 0
>>>> * use zero_reg as a temp buffer for copying fpsimd values.
>>>> * use zero_reg as value 0 for fpsr and fpcr.
>>>> The memset needs moving after using it for fpsimd. (maybe also rename zero_reg to buf?)
>>> 
>>> I'll address this.
>>> 
>>>> Can we also reduce the number of memcpys - just byte swap vregs[i] directly into the zero_reg buffer?
>>> 
>>> The byte swap function only swaps things for __BYTE_ORDER == __BIG_ENDIAN, we would still need to memcpy the bytes to zero_reg. The number of memcpy's would be the same, no?
>> For LE, we want to copy into zero_reg to ensure it is zero padded after 128bits, then call raw_supply.
>> So that’s all fine.
>> For BE, we could just byteswap directly from fpsimd->vregs[i] into zero_reg (?) No need for any memcpys.
> 
> Are you suggesting rewriting aarch64_maybe_swab128 to take dst and src parameters, so we can byteswap from vregs[i] into zero_reg? Right now aarch64_maybe_swab128 only swaps things in place.

Yes, that’d probably be the best way.

> 
>>> 
>>>>> 	  reg_buf->raw_supply (AARCH64_SVE_Z0_REGNUM + i, zero_reg);
>>>>> 	}
>>>>> 
>>>>> @@ -240,7 +275,7 @@ aarch64_sve_regs_copy_from_reg_buf (const struct reg_buffer_common *reg_buf,
>>>>> 	 kernel, which is why we try to avoid it.  */
>>>>> 
>>>>>       bool has_sve_state = false;
>>>>> -      char *zero_reg = (char *) alloca (SVE_PT_SVE_ZREG_SIZE (vq));
>>>>> +      gdb_byte *zero_reg = (gdb_byte *) alloca (SVE_PT_SVE_ZREG_SIZE (vq));
>>>>>       struct user_fpsimd_state *fpsimd
>>>>> 	= (struct user_fpsimd_state *)(base + SVE_PT_FPSIMD_OFFSET);
>>>>> 
>>>>> @@ -274,6 +309,18 @@ aarch64_sve_regs_copy_from_reg_buf (const struct reg_buffer_common *reg_buf,
>>>>> 	 write out state and return.  */
>>>>>       if (!has_sve_state)
>>>>> 	{
>>>>> +	  /* WARNING: SIMD state is laid out in memory in target-endian format,
>>>>> +	     while SVE state is laid out in an endianness-independent format
>>>>> +	     (LE).
>>>>> +
>>>>> +	     So we have a couple cases to consider:
>>>>> +
>>>>> +	     1 - If the target is big endian, then SIMD state is big endian,
>>>>> +	     requiring a byteswap.
>>>>> +
>>>>> +	     2 - If the target is little endian, then SIMD state is little
>>>>> +	     endian, which matches the SVE format, so no byteswap is needed. */
>>>>> +
>>>>> 	  /* The collects of the Z registers will overflow the size of a vreg.
>>>>> 	     There is enough space in the structure to allow for this, but we
>>>>> 	     cannot overflow into the next register as we might not be
>>>>> @@ -285,7 +332,9 @@ aarch64_sve_regs_copy_from_reg_buf (const struct reg_buffer_common *reg_buf,
>>>>> 		  == reg_buf->get_register_status (AARCH64_SVE_Z0_REGNUM + i))
>>>>> 		{
>>>>> 		  reg_buf->raw_collect (AARCH64_SVE_Z0_REGNUM + i, zero_reg);
>>>>> -		  memcpy (&fpsimd->vregs[i], zero_reg, sizeof (__int128_t));
>>>>> +		  /* Handle big endian/little endian SIMD/SVE conversion.  */
>>>>> +		  aarch64_maybe_swab128 (zero_reg);
>>>>> +		  memcpy (&fpsimd->vregs[i], zero_reg, 16);
>>>>> 		}
>>>>> 	    }
>>>>> 
>>>>> diff --git a/gdb/trad-frame.c b/gdb/trad-frame.c
>>>>> index a6a84790a9..8a1aa818ad 100644
>>>>> --- a/gdb/trad-frame.c
>>>>> +++ b/gdb/trad-frame.c
>>>>> @@ -56,6 +56,7 @@ trad_frame_reset_saved_regs (struct gdbarch *gdbarch,
>>>>>     {
>>>>>       regs[regnum].realreg = regnum;
>>>>>       regs[regnum].addr = -1;
>>>>> +      regs[regnum].data = nullptr;
>>>>>     }
>>>>> }
>>>>> 
>>>>> @@ -83,7 +84,7 @@ trad_frame_alloc_saved_regs (struct frame_info *this_frame)
>>>>>   return trad_frame_alloc_saved_regs (gdbarch);
>>>>> }
>>>>> 
>>>>> -enum { TF_REG_VALUE = -1, TF_REG_UNKNOWN = -2 };
>>>>> +enum { TF_REG_VALUE = -1, TF_REG_UNKNOWN = -2, TF_REG_VALUE_BYTES = -3 };
>>>>> 
>>>>> int
>>>>> trad_frame_value_p (struct trad_frame_saved_reg this_saved_regs[], int regnum)
>>>>> @@ -106,6 +107,16 @@ trad_frame_realreg_p (struct trad_frame_saved_reg this_saved_regs[],
>>>>> 	  && this_saved_regs[regnum].addr == -1);
>>>>> }
>>>>> 
>>>>> +/* See trad-frame.h.  */
>>>>> +
>>>>> +bool
>>>>> +trad_frame_value_bytes_p (struct trad_frame_saved_reg this_saved_regs[],
>>>>> +			  int regnum)
>>>>> +{
>>>>> +  return (this_saved_regs[regnum].realreg == TF_REG_VALUE_BYTES
>>>>> +	  && this_saved_regs[regnum].data != nullptr);
>>>>> +}
>>>>> +
>>>>> void
>>>>> trad_frame_set_value (struct trad_frame_saved_reg this_saved_regs[],
>>>>> 		      int regnum, LONGEST val)
>>>>> @@ -224,6 +235,35 @@ trad_frame_set_unknown (struct trad_frame_saved_reg this_saved_regs[],
>>>>>   this_saved_regs[regnum].addr = -1;
>>>>> }
>>>>> 
>>>>> +/* See trad-frame.h.  */
>>>>> +
>>>>> +void
>>>>> +trad_frame_set_value_bytes (struct trad_frame_saved_reg this_saved_regs[],
>>>>> +			    int regnum, const gdb_byte *bytes,
>>>>> +			    size_t size)
>>>>> +{
>>>>> +  this_saved_regs[regnum].realreg = TF_REG_VALUE_BYTES;
>>>>> +
>>>>> +  /* Allocate the space and copy the data bytes.  */
>>>>> +  this_saved_regs[regnum].data = FRAME_OBSTACK_CALLOC (size, gdb_byte);
>>>> Am I right to assume this means data will be automatically unallocated when
>>>> the trad_frame_saved_reg goes out of scope?
>>> 
>>> Not when it goes out of scope, but when all the frame-related data is freed by GDB just before restarting inferior movement.
>> Ok.
>>> 
>>>>> +  memcpy (this_saved_regs[regnum].data, bytes, size);
>>>>> +}
>>>>> +
>>>>> +/* See trad-frame.h.  */
>>>>> +
>>>>> +void
>>>>> +trad_frame_set_reg_value_bytes (struct trad_frame_cache *this_trad_cache,
>>>>> +				int regnum, const gdb_byte *bytes,
>>>>> +				size_t size)
>>>>> +{
>>>>> +  /* External interface for users of trad_frame_cache
>>>>> +     (who cannot access the prev_regs object directly).  */
>>>>> +  trad_frame_set_value_bytes (this_trad_cache->prev_regs, regnum, bytes,
>>>>> +			      size);
>>>>> +}
>>>>> +
>>>>> +
>>>>> +
>>>>> struct value *
>>>>> trad_frame_get_prev_register (struct frame_info *this_frame,
>>>>> 			      struct trad_frame_saved_reg this_saved_regs[],
>>>>> @@ -240,6 +280,10 @@ trad_frame_get_prev_register (struct frame_info *this_frame,
>>>>>     /* The register's value is available.  */
>>>>>     return frame_unwind_got_constant (this_frame, regnum,
>>>>> 				      this_saved_regs[regnum].addr);
>>>>> +  else if (trad_frame_value_bytes_p (this_saved_regs, regnum))
>>>>> +    /* The register's value is available as a sequence of bytes.  */
>>>>> +    return frame_unwind_got_bytes (this_frame, regnum,
>>>>> +				   this_saved_regs[regnum].data);
>>>>>   else
>>>>>     return frame_unwind_got_optimized (this_frame, regnum);
>>>>> }
>>>>> diff --git a/gdb/trad-frame.h b/gdb/trad-frame.h
>>>>> index 7b5785616e..38db439579 100644
>>>>> --- a/gdb/trad-frame.h
>>>>> +++ b/gdb/trad-frame.h
>>>>> @@ -52,6 +52,12 @@ void trad_frame_set_reg_regmap (struct trad_frame_cache *this_trad_cache,
>>>>> void trad_frame_set_reg_value (struct trad_frame_cache *this_cache,
>>>>> 			       int regnum, LONGEST val);
>>>>> 
>>>>> +/* Given the cache in THIS_TRAD_CACHE, set the value of REGNUM to the bytes
>>>>> +   contained in BYTES with size SIZE.  */
>>>>> +void trad_frame_set_reg_value_bytes (struct trad_frame_cache *this_trad_cache,
>>>>> +				     int regnum, const gdb_byte *bytes,
>>>>> +				     size_t size);
>>>>> +
>>>>> struct value *trad_frame_get_register (struct trad_frame_cache *this_trad_cache,
>>>>> 				       struct frame_info *this_frame,
>>>>> 				       int regnum);
>>>>> @@ -86,6 +92,8 @@ struct trad_frame_saved_reg
>>>>> {
>>>>>   LONGEST addr; /* A CORE_ADDR fits in a longest.  */
>>>>>   int realreg;
>>>>> +  /* Register data (for values that don't fit in ADDR).  */
>>>>> +  gdb_byte *data;
>>>>> };
>>>>> 
>>>>> /* Encode REGNUM value in the trad-frame.  */
>>>>> @@ -104,6 +112,12 @@ void trad_frame_set_addr (struct trad_frame_saved_reg this_trad_cache[],
>>>>> void trad_frame_set_unknown (struct trad_frame_saved_reg this_saved_regs[],
>>>>> 			     int regnum);
>>>>> 
>>>>> +/* Encode REGNUM value in the trad-frame as a sequence of bytes.  This is
>>>>> +   useful when the value is larger than what primitive types can hold.  */
>>>>> +void trad_frame_set_value_bytes (struct trad_frame_saved_reg this_saved_regs[],
>>>>> +				 int regnum, const gdb_byte *bytes,
>>>>> +				 size_t size);
>>>>> +
>>>>> /* Convenience functions, return non-zero if the register has been
>>>>>    encoded as specified.  */
>>>>> int trad_frame_value_p (struct trad_frame_saved_reg this_saved_regs[],
>>>>> @@ -113,6 +127,11 @@ int trad_frame_addr_p (struct trad_frame_saved_reg this_saved_regs[],
>>>>> int trad_frame_realreg_p (struct trad_frame_saved_reg this_saved_regs[],
>>>>> 			  int regnum);
>>>>> 
>>>>> +/* Return TRUE if REGNUM is stored as a sequence of bytes, and FALSE
>>>>> +   otherwise.  */
>>>>> +bool trad_frame_value_bytes_p (struct trad_frame_saved_reg this_saved_regs[],
>>>>> +			      int regnum);
>>>>> +
>>>>> /* Reset the save regs cache, setting register values to -1.  */
>>>>> void trad_frame_reset_saved_regs (struct gdbarch *gdbarch,
>>>>> 				  struct trad_frame_saved_reg *regs);
>>>>> -- 
>>>>> 2.25.1
>>>>> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH,v2] SVE/FPSIMD fixup for big endian
  2020-11-30 18:55 [PATCH][AArch64] SVE/FPSIMD fixup for big endian Luis Machado
  2020-12-01 11:28 ` Alan Hayward
@ 2020-12-02 17:57 ` Luis Machado
  2020-12-03 17:35   ` Alan Hayward
  2020-12-04 14:22   ` Luis Machado
  1 sibling, 2 replies; 13+ messages in thread
From: Luis Machado @ 2020-12-02 17:57 UTC (permalink / raw)
  To: gdb-patches

Updates on v2:

- Reworked ptrace functions and made byteswap function more like a memcpy.
- Adjusted memset calls to zero register buffers.
- Adjusted comments.
- Moved V_REGISTER_SIZE constant definition.

The FPSIMD dump in signal frames and ptrace FPSIMD dump in the SVE context
structure follows the target endianness, whereas the SVE dumps are
endianness-independent (LE).

Therefore, when the system is in BE mode, we need to reverse the bytes
for the FPSIMD data.

Given the V registers are larger than 64-bit, I've added a way for value
bytes to be set, as opposed to passing a 64-bit fixed quantity. This fits
nicely with the unwinding *_got_bytes function and makes the trad-frame
more flexible and capable of saving larger registers.

The memory for the bytes is allocated via the frame obstack, so it gets freed
after we're done inspecting the frame.

gdb/ChangeLog:

YYYY-MM-DD  Luis Machado  <luis.machado@linaro.org>

	* aarch64-linux-tdep.c (aarch64_linux_restore_vreg) New function.
	(aarch64_linux_sigframe_init): Call aarch64_linux_restore_vreg.
	* aarch64-tdep.h (V_REGISTER_SIZE): Move to ...
	* arch/aarch64.h: ... here.
	* nat/aarch64-sve-linux-ptrace.c: Include endian.h.
	(aarch64_maybe_swab128): New function.
	(aarch64_sve_regs_copy_to_reg_buf)
	(aarch64_sve_regs_copy_from_reg_buf): Adjust FPSIMD entries.
	* trad-frame.c (trad_frame_reset_saved_regs): Initialize
	the data field.
	(TF_REG_VALUE_BYTES): New enum value.
	(trad_frame_value_bytes_p): New function.
	(trad_frame_set_value_bytes): New function.
	(trad_frame_set_reg_value_bytes): New function.
	(trad_frame_get_prev_register): Handle register values saved as bytes.
	* trad-frame.h (trad_frame_set_reg_value_bytes): New prototype.
	(struct trad_frame_saved_reg) <data>: New field.
	(trad_frame_set_value_bytes): New prototype.
	(trad_frame_value_bytes_p): New prototype.
---
 gdb/aarch64-linux-tdep.c           | 114 ++++++++++++++++++++++++-----
 gdb/aarch64-tdep.h                 |   1 -
 gdb/arch/aarch64.h                 |   2 +
 gdb/nat/aarch64-sve-linux-ptrace.c |  77 +++++++++++++++----
 gdb/trad-frame.c                   |  46 +++++++++++-
 gdb/trad-frame.h                   |  19 +++++
 6 files changed, 224 insertions(+), 35 deletions(-)

diff --git a/gdb/aarch64-linux-tdep.c b/gdb/aarch64-linux-tdep.c
index c9898bdafd..9429af0fe1 100644
--- a/gdb/aarch64-linux-tdep.c
+++ b/gdb/aarch64-linux-tdep.c
@@ -180,6 +180,93 @@ read_aarch64_ctx (CORE_ADDR ctx_addr, enum bfd_endian byte_order,
   return magic;
 }
 
+/* Given CACHE, use the trad_frame* functions to restore the FPSIMD
+   registers from a signal frame.
+
+   VREG_NUM is the number of the V register being restored, OFFSET is the
+   address containing the register value, BYTE_ORDER is the endianness and
+   HAS_SVE tells us if we have a valid SVE context or not.  */
+
+static void
+aarch64_linux_restore_vreg (struct trad_frame_cache *cache, int num_regs,
+			    int vreg_num, CORE_ADDR offset,
+			    enum bfd_endian byte_order, bool has_sve)
+{
+  /* WARNING: SIMD state is laid out in memory in target-endian format.
+
+     So we have a couple cases to consider:
+
+     1 - If the target is big endian, then SIMD state is big endian,
+     requiring a byteswap.
+
+     2 - If the target is little endian, then SIMD state is little endian, so
+     no byteswap is needed. */
+
+  if (byte_order == BFD_ENDIAN_BIG)
+    {
+      gdb_byte buf[V_REGISTER_SIZE];
+
+      if (target_read_memory (offset, buf, V_REGISTER_SIZE) != 0)
+	{
+	  size_t size = V_REGISTER_SIZE/2;
+
+	  /* Read the two halves of the V register in reverse byte order.  */
+	  CORE_ADDR u64 = extract_unsigned_integer (buf, size,
+						    byte_order);
+	  CORE_ADDR l64 = extract_unsigned_integer (buf + size, size,
+						    byte_order);
+
+	  /* Copy the reversed bytes to the buffer.  */
+	  store_unsigned_integer (buf, size, BFD_ENDIAN_LITTLE, l64);
+	  store_unsigned_integer (buf + size , size, BFD_ENDIAN_LITTLE, u64);
+
+	  /* Now we can store the correct bytes for the V register.  */
+	  trad_frame_set_reg_value_bytes (cache, AARCH64_V0_REGNUM + vreg_num,
+					  buf, V_REGISTER_SIZE);
+	  trad_frame_set_reg_value_bytes (cache,
+					  num_regs + AARCH64_Q0_REGNUM
+					  + vreg_num, buf, Q_REGISTER_SIZE);
+	  trad_frame_set_reg_value_bytes (cache,
+					  num_regs + AARCH64_D0_REGNUM
+					  + vreg_num, buf, D_REGISTER_SIZE);
+	  trad_frame_set_reg_value_bytes (cache,
+					  num_regs + AARCH64_S0_REGNUM
+					  + vreg_num, buf, S_REGISTER_SIZE);
+	  trad_frame_set_reg_value_bytes (cache,
+					  num_regs + AARCH64_H0_REGNUM
+					  + vreg_num, buf, H_REGISTER_SIZE);
+	  trad_frame_set_reg_value_bytes (cache,
+					  num_regs + AARCH64_B0_REGNUM
+					  + vreg_num, buf, B_REGISTER_SIZE);
+
+	  if (has_sve)
+	    trad_frame_set_reg_value_bytes (cache,
+					    num_regs + AARCH64_SVE_V0_REGNUM
+					    + vreg_num, buf, V_REGISTER_SIZE);
+	}
+      return;
+    }
+
+  /* Little endian, just point at the address containing the register
+     value.  */
+  trad_frame_set_reg_addr (cache, AARCH64_V0_REGNUM + vreg_num, offset);
+  trad_frame_set_reg_addr (cache, num_regs + AARCH64_Q0_REGNUM + vreg_num,
+			   offset);
+  trad_frame_set_reg_addr (cache, num_regs + AARCH64_D0_REGNUM + vreg_num,
+			   offset);
+  trad_frame_set_reg_addr (cache, num_regs + AARCH64_S0_REGNUM + vreg_num,
+			   offset);
+  trad_frame_set_reg_addr (cache, num_regs + AARCH64_H0_REGNUM + vreg_num,
+			   offset);
+  trad_frame_set_reg_addr (cache, num_regs + AARCH64_B0_REGNUM + vreg_num,
+			   offset);
+
+  if (has_sve)
+    trad_frame_set_reg_addr (cache, num_regs + AARCH64_SVE_V0_REGNUM
+			     + vreg_num, offset);
+
+}
+
 /* Implement the "init" method of struct tramp_frame.  */
 
 static void
@@ -332,27 +419,16 @@ aarch64_linux_sigframe_init (const struct tramp_frame *self,
 
       /* If there was no SVE section then set up the V registers.  */
       if (sve_regs == 0)
-	for (int i = 0; i < 32; i++)
-	  {
-	    CORE_ADDR offset = (fpsimd + AARCH64_FPSIMD_V0_OFFSET
+	{
+	  for (int i = 0; i < 32; i++)
+	    {
+	      CORE_ADDR offset = (fpsimd + AARCH64_FPSIMD_V0_OFFSET
 				  + (i * AARCH64_FPSIMD_VREG_SIZE));
 
-	    trad_frame_set_reg_addr (this_cache, AARCH64_V0_REGNUM + i, offset);
-	    trad_frame_set_reg_addr (this_cache,
-				     num_regs + AARCH64_Q0_REGNUM + i, offset);
-	    trad_frame_set_reg_addr (this_cache,
-				     num_regs + AARCH64_D0_REGNUM + i, offset);
-	    trad_frame_set_reg_addr (this_cache,
-				     num_regs + AARCH64_S0_REGNUM + i, offset);
-	    trad_frame_set_reg_addr (this_cache,
-				     num_regs + AARCH64_H0_REGNUM + i, offset);
-	    trad_frame_set_reg_addr (this_cache,
-				     num_regs + AARCH64_B0_REGNUM + i, offset);
-	    if (tdep->has_sve ())
-	      trad_frame_set_reg_addr (this_cache,
-				       num_regs + AARCH64_SVE_V0_REGNUM + i,
-				       offset);
-	  }
+	      aarch64_linux_restore_vreg (this_cache, num_regs, i, offset,
+					  byte_order, tdep->has_sve ());
+	    }
+	}
     }
 
   trad_frame_set_id (this_cache, frame_id_build (sp, func));
diff --git a/gdb/aarch64-tdep.h b/gdb/aarch64-tdep.h
index 81ce4d84b4..ba72a771cd 100644
--- a/gdb/aarch64-tdep.h
+++ b/gdb/aarch64-tdep.h
@@ -47,7 +47,6 @@ struct regset;
 #define H_REGISTER_SIZE  2
 #define S_REGISTER_SIZE  4
 #define D_REGISTER_SIZE  8
-#define V_REGISTER_SIZE 16
 #define Q_REGISTER_SIZE 16
 
 /* Total number of general (X) registers.  */
diff --git a/gdb/arch/aarch64.h b/gdb/arch/aarch64.h
index 857bb22b03..b75352461a 100644
--- a/gdb/arch/aarch64.h
+++ b/gdb/arch/aarch64.h
@@ -58,6 +58,8 @@ enum aarch64_regnum
   AARCH64_LAST_V_ARG_REGNUM = AARCH64_V0_REGNUM + 7
 };
 
+#define V_REGISTER_SIZE 16
+
 /* Pseudo register base numbers.  */
 #define AARCH64_Q0_REGNUM 0
 #define AARCH64_D0_REGNUM (AARCH64_Q0_REGNUM + AARCH64_D_REGISTER_COUNT)
diff --git a/gdb/nat/aarch64-sve-linux-ptrace.c b/gdb/nat/aarch64-sve-linux-ptrace.c
index 2ce90ccfd7..5f08995f87 100644
--- a/gdb/nat/aarch64-sve-linux-ptrace.c
+++ b/gdb/nat/aarch64-sve-linux-ptrace.c
@@ -26,6 +26,7 @@
 #include "arch/aarch64.h"
 #include "gdbsupport/common-regcache.h"
 #include "gdbsupport/byte-vector.h"
+#include <endian.h>
 
 /* See nat/aarch64-sve-linux-ptrace.h.  */
 
@@ -142,6 +143,24 @@ aarch64_sve_get_sveregs (int tid)
   return buf;
 }
 
+/* If we are running in BE mode, byteswap the contents
+   of SRC to DST for SIZE bytes.  Other, just copy the contents
+   from SRC to DST.  */
+
+static void
+aarch64_maybe_swab128 (gdb_byte *dst, const gdb_byte *src, size_t size)
+{
+  gdb_assert (src != nullptr && dst != nullptr);
+  gdb_assert (size > 1);
+
+#if (__BYTE_ORDER == __BIG_ENDIAN)
+  for (int i = 0; i < size - 1; i++)
+    dst[i] = src[size - i];
+#else
+  memcpy (dst, src, size);
+#endif
+}
+
 /* See nat/aarch64-sve-linux-ptrace.h.  */
 
 void
@@ -184,34 +203,50 @@ aarch64_sve_regs_copy_to_reg_buf (struct reg_buffer_common *reg_buf,
     }
   else
     {
+      /* WARNING: SIMD state is laid out in memory in target-endian format,
+	 while SVE state is laid out in an endianness-independent format (LE).
+
+	 So we have a couple cases to consider:
+
+	 1 - If the target is big endian, then SIMD state is big endian,
+	 requiring a byteswap.
+
+	 2 - If the target is little endian, then SIMD state is little endian,
+	 which matches the SVE format, so no byteswap is needed. */
+
       /* There is no SVE state yet - the register dump contains a fpsimd
 	 structure instead.  These registers still exist in the hardware, but
 	 the kernel has not yet initialised them, and so they will be null.  */
 
-      char *zero_reg = (char *) alloca (SVE_PT_SVE_ZREG_SIZE (vq));
+      gdb_byte *reg = (gdb_byte *) alloca (SVE_PT_SVE_ZREG_SIZE (vq));
       struct user_fpsimd_state *fpsimd
 	= (struct user_fpsimd_state *)(base + SVE_PT_FPSIMD_OFFSET);
 
+      /* Make sure we have a zeroed register buffer.  We will need the zero
+	 padding below.  */
+      memset (reg, 0, SVE_PT_SVE_ZREG_SIZE (vq));
+
       /* Copy across the V registers from fpsimd structure to the Z registers,
 	 ensuring the non overlapping state is set to null.  */
 
-      memset (zero_reg, 0, SVE_PT_SVE_ZREG_SIZE (vq));
-
       for (int i = 0; i < AARCH64_SVE_Z_REGS_NUM; i++)
 	{
-	  memcpy (zero_reg, &fpsimd->vregs[i], sizeof (__int128_t));
-	  reg_buf->raw_supply (AARCH64_SVE_Z0_REGNUM + i, zero_reg);
+	  /* Handle big endian/little endian SIMD/SVE conversion.  */
+	  aarch64_maybe_swab128 (reg, (const gdb_byte *) &fpsimd->vregs[i],
+				 V_REGISTER_SIZE);
+	  reg_buf->raw_supply (AARCH64_SVE_Z0_REGNUM + i, reg);
 	}
 
       reg_buf->raw_supply (AARCH64_FPSR_REGNUM, &fpsimd->fpsr);
       reg_buf->raw_supply (AARCH64_FPCR_REGNUM, &fpsimd->fpcr);
 
       /* Clear the SVE only registers.  */
+      memset (reg, 0, SVE_PT_SVE_ZREG_SIZE (vq));
 
       for (int i = 0; i < AARCH64_SVE_P_REGS_NUM; i++)
-	reg_buf->raw_supply (AARCH64_SVE_P0_REGNUM + i, zero_reg);
+	reg_buf->raw_supply (AARCH64_SVE_P0_REGNUM + i, reg);
 
-      reg_buf->raw_supply (AARCH64_SVE_FFR_REGNUM, zero_reg);
+      reg_buf->raw_supply (AARCH64_SVE_FFR_REGNUM, reg);
     }
 }
 
@@ -240,11 +275,11 @@ aarch64_sve_regs_copy_from_reg_buf (const struct reg_buffer_common *reg_buf,
 	 kernel, which is why we try to avoid it.  */
 
       bool has_sve_state = false;
-      char *zero_reg = (char *) alloca (SVE_PT_SVE_ZREG_SIZE (vq));
+      gdb_byte *reg = (gdb_byte *) alloca (SVE_PT_SVE_ZREG_SIZE (vq));
       struct user_fpsimd_state *fpsimd
 	= (struct user_fpsimd_state *)(base + SVE_PT_FPSIMD_OFFSET);
 
-      memset (zero_reg, 0, SVE_PT_SVE_ZREG_SIZE (vq));
+      memset (reg, 0, SVE_PT_SVE_ZREG_SIZE (vq));
 
       /* Check in the reg_buf if any of the Z registers are set after the
 	 first 128 bits, or if any of the other SVE registers are set.  */
@@ -252,7 +287,7 @@ aarch64_sve_regs_copy_from_reg_buf (const struct reg_buffer_common *reg_buf,
       for (int i = 0; i < AARCH64_SVE_Z_REGS_NUM; i++)
 	{
 	  has_sve_state |= reg_buf->raw_compare (AARCH64_SVE_Z0_REGNUM + i,
-						 zero_reg, sizeof (__int128_t));
+						 reg, sizeof (__int128_t));
 	  if (has_sve_state)
 	    break;
 	}
@@ -261,19 +296,31 @@ aarch64_sve_regs_copy_from_reg_buf (const struct reg_buffer_common *reg_buf,
 	for (int i = 0; i < AARCH64_SVE_P_REGS_NUM; i++)
 	  {
 	    has_sve_state |= reg_buf->raw_compare (AARCH64_SVE_P0_REGNUM + i,
-						   zero_reg, 0);
+						   reg, 0);
 	    if (has_sve_state)
 	      break;
 	  }
 
       if (!has_sve_state)
 	  has_sve_state |= reg_buf->raw_compare (AARCH64_SVE_FFR_REGNUM,
-						 zero_reg, 0);
+						 reg, 0);
 
       /* If no SVE state exists, then use the existing fpsimd structure to
 	 write out state and return.  */
       if (!has_sve_state)
 	{
+	  /* WARNING: SIMD state is laid out in memory in target-endian format,
+	     while SVE state is laid out in an endianness-independent format
+	     (LE).
+
+	     So we have a couple cases to consider:
+
+	     1 - If the target is big endian, then SIMD state is big endian,
+	     requiring a byteswap.
+
+	     2 - If the target is little endian, then SIMD state is little
+	     endian, which matches the SVE format, so no byteswap is needed. */
+
 	  /* The collects of the Z registers will overflow the size of a vreg.
 	     There is enough space in the structure to allow for this, but we
 	     cannot overflow into the next register as we might not be
@@ -284,8 +331,10 @@ aarch64_sve_regs_copy_from_reg_buf (const struct reg_buffer_common *reg_buf,
 	      if (REG_VALID
 		  == reg_buf->get_register_status (AARCH64_SVE_Z0_REGNUM + i))
 		{
-		  reg_buf->raw_collect (AARCH64_SVE_Z0_REGNUM + i, zero_reg);
-		  memcpy (&fpsimd->vregs[i], zero_reg, sizeof (__int128_t));
+		  reg_buf->raw_collect (AARCH64_SVE_Z0_REGNUM + i, reg);
+		  /* Handle big endian/little endian SIMD/SVE conversion.  */
+		  aarch64_maybe_swab128 ((gdb_byte *) &fpsimd->vregs[i], reg,
+					 V_REGISTER_SIZE);
 		}
 	    }
 
diff --git a/gdb/trad-frame.c b/gdb/trad-frame.c
index a6a84790a9..8a1aa818ad 100644
--- a/gdb/trad-frame.c
+++ b/gdb/trad-frame.c
@@ -56,6 +56,7 @@ trad_frame_reset_saved_regs (struct gdbarch *gdbarch,
     {
       regs[regnum].realreg = regnum;
       regs[regnum].addr = -1;
+      regs[regnum].data = nullptr;
     }
 }
 
@@ -83,7 +84,7 @@ trad_frame_alloc_saved_regs (struct frame_info *this_frame)
   return trad_frame_alloc_saved_regs (gdbarch);
 }
 
-enum { TF_REG_VALUE = -1, TF_REG_UNKNOWN = -2 };
+enum { TF_REG_VALUE = -1, TF_REG_UNKNOWN = -2, TF_REG_VALUE_BYTES = -3 };
 
 int
 trad_frame_value_p (struct trad_frame_saved_reg this_saved_regs[], int regnum)
@@ -106,6 +107,16 @@ trad_frame_realreg_p (struct trad_frame_saved_reg this_saved_regs[],
 	  && this_saved_regs[regnum].addr == -1);
 }
 
+/* See trad-frame.h.  */
+
+bool
+trad_frame_value_bytes_p (struct trad_frame_saved_reg this_saved_regs[],
+			  int regnum)
+{
+  return (this_saved_regs[regnum].realreg == TF_REG_VALUE_BYTES
+	  && this_saved_regs[regnum].data != nullptr);
+}
+
 void
 trad_frame_set_value (struct trad_frame_saved_reg this_saved_regs[],
 		      int regnum, LONGEST val)
@@ -224,6 +235,35 @@ trad_frame_set_unknown (struct trad_frame_saved_reg this_saved_regs[],
   this_saved_regs[regnum].addr = -1;
 }
 
+/* See trad-frame.h.  */
+
+void
+trad_frame_set_value_bytes (struct trad_frame_saved_reg this_saved_regs[],
+			    int regnum, const gdb_byte *bytes,
+			    size_t size)
+{
+  this_saved_regs[regnum].realreg = TF_REG_VALUE_BYTES;
+
+  /* Allocate the space and copy the data bytes.  */
+  this_saved_regs[regnum].data = FRAME_OBSTACK_CALLOC (size, gdb_byte);
+  memcpy (this_saved_regs[regnum].data, bytes, size);
+}
+
+/* See trad-frame.h.  */
+
+void
+trad_frame_set_reg_value_bytes (struct trad_frame_cache *this_trad_cache,
+				int regnum, const gdb_byte *bytes,
+				size_t size)
+{
+  /* External interface for users of trad_frame_cache
+     (who cannot access the prev_regs object directly).  */
+  trad_frame_set_value_bytes (this_trad_cache->prev_regs, regnum, bytes,
+			      size);
+}
+
+
+
 struct value *
 trad_frame_get_prev_register (struct frame_info *this_frame,
 			      struct trad_frame_saved_reg this_saved_regs[],
@@ -240,6 +280,10 @@ trad_frame_get_prev_register (struct frame_info *this_frame,
     /* The register's value is available.  */
     return frame_unwind_got_constant (this_frame, regnum,
 				      this_saved_regs[regnum].addr);
+  else if (trad_frame_value_bytes_p (this_saved_regs, regnum))
+    /* The register's value is available as a sequence of bytes.  */
+    return frame_unwind_got_bytes (this_frame, regnum,
+				   this_saved_regs[regnum].data);
   else
     return frame_unwind_got_optimized (this_frame, regnum);
 }
diff --git a/gdb/trad-frame.h b/gdb/trad-frame.h
index 7b5785616e..38db439579 100644
--- a/gdb/trad-frame.h
+++ b/gdb/trad-frame.h
@@ -52,6 +52,12 @@ void trad_frame_set_reg_regmap (struct trad_frame_cache *this_trad_cache,
 void trad_frame_set_reg_value (struct trad_frame_cache *this_cache,
 			       int regnum, LONGEST val);
 
+/* Given the cache in THIS_TRAD_CACHE, set the value of REGNUM to the bytes
+   contained in BYTES with size SIZE.  */
+void trad_frame_set_reg_value_bytes (struct trad_frame_cache *this_trad_cache,
+				     int regnum, const gdb_byte *bytes,
+				     size_t size);
+
 struct value *trad_frame_get_register (struct trad_frame_cache *this_trad_cache,
 				       struct frame_info *this_frame,
 				       int regnum);
@@ -86,6 +92,8 @@ struct trad_frame_saved_reg
 {
   LONGEST addr; /* A CORE_ADDR fits in a longest.  */
   int realreg;
+  /* Register data (for values that don't fit in ADDR).  */
+  gdb_byte *data;
 };
 
 /* Encode REGNUM value in the trad-frame.  */
@@ -104,6 +112,12 @@ void trad_frame_set_addr (struct trad_frame_saved_reg this_trad_cache[],
 void trad_frame_set_unknown (struct trad_frame_saved_reg this_saved_regs[],
 			     int regnum);
 
+/* Encode REGNUM value in the trad-frame as a sequence of bytes.  This is
+   useful when the value is larger than what primitive types can hold.  */
+void trad_frame_set_value_bytes (struct trad_frame_saved_reg this_saved_regs[],
+				 int regnum, const gdb_byte *bytes,
+				 size_t size);
+
 /* Convenience functions, return non-zero if the register has been
    encoded as specified.  */
 int trad_frame_value_p (struct trad_frame_saved_reg this_saved_regs[],
@@ -113,6 +127,11 @@ int trad_frame_addr_p (struct trad_frame_saved_reg this_saved_regs[],
 int trad_frame_realreg_p (struct trad_frame_saved_reg this_saved_regs[],
 			  int regnum);
 
+/* Return TRUE if REGNUM is stored as a sequence of bytes, and FALSE
+   otherwise.  */
+bool trad_frame_value_bytes_p (struct trad_frame_saved_reg this_saved_regs[],
+			      int regnum);
+
 /* Reset the save regs cache, setting register values to -1.  */
 void trad_frame_reset_saved_regs (struct gdbarch *gdbarch,
 				  struct trad_frame_saved_reg *regs);
-- 
2.25.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH,v2] SVE/FPSIMD fixup for big endian
  2020-12-02 17:57 ` [PATCH,v2] " Luis Machado
@ 2020-12-03 17:35   ` Alan Hayward
  2020-12-03 17:37     ` Luis Machado
  2020-12-04 14:22   ` Luis Machado
  1 sibling, 1 reply; 13+ messages in thread
From: Alan Hayward @ 2020-12-03 17:35 UTC (permalink / raw)
  To: Luis Machado; +Cc: gdb-patches\@sourceware.org, nd

My only nit would be that the other register defines in aarch64-tdep.h
could get moved across to arch/aarch64.h too. However, everything in the
file is for pseudos and dwarf. So, I think it’s fine as is.

It’s an ok from me.


Alan


> On 2 Dec 2020, at 17:57, Luis Machado <luis.machado@linaro.org> wrote:
> 
> Updates on v2:
> 
> - Reworked ptrace functions and made byteswap function more like a memcpy.
> - Adjusted memset calls to zero register buffers.
> - Adjusted comments.
> - Moved V_REGISTER_SIZE constant definition.
> 
> The FPSIMD dump in signal frames and ptrace FPSIMD dump in the SVE context
> structure follows the target endianness, whereas the SVE dumps are
> endianness-independent (LE).
> 
> Therefore, when the system is in BE mode, we need to reverse the bytes
> for the FPSIMD data.
> 
> Given the V registers are larger than 64-bit, I've added a way for value
> bytes to be set, as opposed to passing a 64-bit fixed quantity. This fits
> nicely with the unwinding *_got_bytes function and makes the trad-frame
> more flexible and capable of saving larger registers.
> 
> The memory for the bytes is allocated via the frame obstack, so it gets freed
> after we're done inspecting the frame.
> 
> gdb/ChangeLog:
> 
> YYYY-MM-DD  Luis Machado  <luis.machado@linaro.org>
> 
> 	* aarch64-linux-tdep.c (aarch64_linux_restore_vreg) New function.
> 	(aarch64_linux_sigframe_init): Call aarch64_linux_restore_vreg.
> 	* aarch64-tdep.h (V_REGISTER_SIZE): Move to ...
> 	* arch/aarch64.h: ... here.
> 	* nat/aarch64-sve-linux-ptrace.c: Include endian.h.
> 	(aarch64_maybe_swab128): New function.
> 	(aarch64_sve_regs_copy_to_reg_buf)
> 	(aarch64_sve_regs_copy_from_reg_buf): Adjust FPSIMD entries.
> 	* trad-frame.c (trad_frame_reset_saved_regs): Initialize
> 	the data field.
> 	(TF_REG_VALUE_BYTES): New enum value.
> 	(trad_frame_value_bytes_p): New function.
> 	(trad_frame_set_value_bytes): New function.
> 	(trad_frame_set_reg_value_bytes): New function.
> 	(trad_frame_get_prev_register): Handle register values saved as bytes.
> 	* trad-frame.h (trad_frame_set_reg_value_bytes): New prototype.
> 	(struct trad_frame_saved_reg) <data>: New field.
> 	(trad_frame_set_value_bytes): New prototype.
> 	(trad_frame_value_bytes_p): New prototype.
> ---
> gdb/aarch64-linux-tdep.c           | 114 ++++++++++++++++++++++++-----
> gdb/aarch64-tdep.h                 |   1 -
> gdb/arch/aarch64.h                 |   2 +
> gdb/nat/aarch64-sve-linux-ptrace.c |  77 +++++++++++++++----
> gdb/trad-frame.c                   |  46 +++++++++++-
> gdb/trad-frame.h                   |  19 +++++
> 6 files changed, 224 insertions(+), 35 deletions(-)
> 
> diff --git a/gdb/aarch64-linux-tdep.c b/gdb/aarch64-linux-tdep.c
> index c9898bdafd..9429af0fe1 100644
> --- a/gdb/aarch64-linux-tdep.c
> +++ b/gdb/aarch64-linux-tdep.c
> @@ -180,6 +180,93 @@ read_aarch64_ctx (CORE_ADDR ctx_addr, enum bfd_endian byte_order,
>   return magic;
> }
> 
> +/* Given CACHE, use the trad_frame* functions to restore the FPSIMD
> +   registers from a signal frame.
> +
> +   VREG_NUM is the number of the V register being restored, OFFSET is the
> +   address containing the register value, BYTE_ORDER is the endianness and
> +   HAS_SVE tells us if we have a valid SVE context or not.  */
> +
> +static void
> +aarch64_linux_restore_vreg (struct trad_frame_cache *cache, int num_regs,
> +			    int vreg_num, CORE_ADDR offset,
> +			    enum bfd_endian byte_order, bool has_sve)
> +{
> +  /* WARNING: SIMD state is laid out in memory in target-endian format.
> +
> +     So we have a couple cases to consider:
> +
> +     1 - If the target is big endian, then SIMD state is big endian,
> +     requiring a byteswap.
> +
> +     2 - If the target is little endian, then SIMD state is little endian, so
> +     no byteswap is needed. */
> +
> +  if (byte_order == BFD_ENDIAN_BIG)
> +    {
> +      gdb_byte buf[V_REGISTER_SIZE];
> +
> +      if (target_read_memory (offset, buf, V_REGISTER_SIZE) != 0)
> +	{
> +	  size_t size = V_REGISTER_SIZE/2;
> +
> +	  /* Read the two halves of the V register in reverse byte order.  */
> +	  CORE_ADDR u64 = extract_unsigned_integer (buf, size,
> +						    byte_order);
> +	  CORE_ADDR l64 = extract_unsigned_integer (buf + size, size,
> +						    byte_order);
> +
> +	  /* Copy the reversed bytes to the buffer.  */
> +	  store_unsigned_integer (buf, size, BFD_ENDIAN_LITTLE, l64);
> +	  store_unsigned_integer (buf + size , size, BFD_ENDIAN_LITTLE, u64);
> +
> +	  /* Now we can store the correct bytes for the V register.  */
> +	  trad_frame_set_reg_value_bytes (cache, AARCH64_V0_REGNUM + vreg_num,
> +					  buf, V_REGISTER_SIZE);
> +	  trad_frame_set_reg_value_bytes (cache,
> +					  num_regs + AARCH64_Q0_REGNUM
> +					  + vreg_num, buf, Q_REGISTER_SIZE);
> +	  trad_frame_set_reg_value_bytes (cache,
> +					  num_regs + AARCH64_D0_REGNUM
> +					  + vreg_num, buf, D_REGISTER_SIZE);
> +	  trad_frame_set_reg_value_bytes (cache,
> +					  num_regs + AARCH64_S0_REGNUM
> +					  + vreg_num, buf, S_REGISTER_SIZE);
> +	  trad_frame_set_reg_value_bytes (cache,
> +					  num_regs + AARCH64_H0_REGNUM
> +					  + vreg_num, buf, H_REGISTER_SIZE);
> +	  trad_frame_set_reg_value_bytes (cache,
> +					  num_regs + AARCH64_B0_REGNUM
> +					  + vreg_num, buf, B_REGISTER_SIZE);
> +
> +	  if (has_sve)
> +	    trad_frame_set_reg_value_bytes (cache,
> +					    num_regs + AARCH64_SVE_V0_REGNUM
> +					    + vreg_num, buf, V_REGISTER_SIZE);
> +	}
> +      return;
> +    }
> +
> +  /* Little endian, just point at the address containing the register
> +     value.  */
> +  trad_frame_set_reg_addr (cache, AARCH64_V0_REGNUM + vreg_num, offset);
> +  trad_frame_set_reg_addr (cache, num_regs + AARCH64_Q0_REGNUM + vreg_num,
> +			   offset);
> +  trad_frame_set_reg_addr (cache, num_regs + AARCH64_D0_REGNUM + vreg_num,
> +			   offset);
> +  trad_frame_set_reg_addr (cache, num_regs + AARCH64_S0_REGNUM + vreg_num,
> +			   offset);
> +  trad_frame_set_reg_addr (cache, num_regs + AARCH64_H0_REGNUM + vreg_num,
> +			   offset);
> +  trad_frame_set_reg_addr (cache, num_regs + AARCH64_B0_REGNUM + vreg_num,
> +			   offset);
> +
> +  if (has_sve)
> +    trad_frame_set_reg_addr (cache, num_regs + AARCH64_SVE_V0_REGNUM
> +			     + vreg_num, offset);
> +
> +}
> +
> /* Implement the "init" method of struct tramp_frame.  */
> 
> static void
> @@ -332,27 +419,16 @@ aarch64_linux_sigframe_init (const struct tramp_frame *self,
> 
>       /* If there was no SVE section then set up the V registers.  */
>       if (sve_regs == 0)
> -	for (int i = 0; i < 32; i++)
> -	  {
> -	    CORE_ADDR offset = (fpsimd + AARCH64_FPSIMD_V0_OFFSET
> +	{
> +	  for (int i = 0; i < 32; i++)
> +	    {
> +	      CORE_ADDR offset = (fpsimd + AARCH64_FPSIMD_V0_OFFSET
> 				  + (i * AARCH64_FPSIMD_VREG_SIZE));
> 
> -	    trad_frame_set_reg_addr (this_cache, AARCH64_V0_REGNUM + i, offset);
> -	    trad_frame_set_reg_addr (this_cache,
> -				     num_regs + AARCH64_Q0_REGNUM + i, offset);
> -	    trad_frame_set_reg_addr (this_cache,
> -				     num_regs + AARCH64_D0_REGNUM + i, offset);
> -	    trad_frame_set_reg_addr (this_cache,
> -				     num_regs + AARCH64_S0_REGNUM + i, offset);
> -	    trad_frame_set_reg_addr (this_cache,
> -				     num_regs + AARCH64_H0_REGNUM + i, offset);
> -	    trad_frame_set_reg_addr (this_cache,
> -				     num_regs + AARCH64_B0_REGNUM + i, offset);
> -	    if (tdep->has_sve ())
> -	      trad_frame_set_reg_addr (this_cache,
> -				       num_regs + AARCH64_SVE_V0_REGNUM + i,
> -				       offset);
> -	  }
> +	      aarch64_linux_restore_vreg (this_cache, num_regs, i, offset,
> +					  byte_order, tdep->has_sve ());
> +	    }
> +	}
>     }
> 
>   trad_frame_set_id (this_cache, frame_id_build (sp, func));
> diff --git a/gdb/aarch64-tdep.h b/gdb/aarch64-tdep.h
> index 81ce4d84b4..ba72a771cd 100644
> --- a/gdb/aarch64-tdep.h
> +++ b/gdb/aarch64-tdep.h
> @@ -47,7 +47,6 @@ struct regset;
> #define H_REGISTER_SIZE  2
> #define S_REGISTER_SIZE  4
> #define D_REGISTER_SIZE  8
> -#define V_REGISTER_SIZE 16
> #define Q_REGISTER_SIZE 16
> 
> /* Total number of general (X) registers.  */
> diff --git a/gdb/arch/aarch64.h b/gdb/arch/aarch64.h
> index 857bb22b03..b75352461a 100644
> --- a/gdb/arch/aarch64.h
> +++ b/gdb/arch/aarch64.h
> @@ -58,6 +58,8 @@ enum aarch64_regnum
>   AARCH64_LAST_V_ARG_REGNUM = AARCH64_V0_REGNUM + 7
> };
> 
> +#define V_REGISTER_SIZE 16
> +
> /* Pseudo register base numbers.  */
> #define AARCH64_Q0_REGNUM 0
> #define AARCH64_D0_REGNUM (AARCH64_Q0_REGNUM + AARCH64_D_REGISTER_COUNT)
> diff --git a/gdb/nat/aarch64-sve-linux-ptrace.c b/gdb/nat/aarch64-sve-linux-ptrace.c
> index 2ce90ccfd7..5f08995f87 100644
> --- a/gdb/nat/aarch64-sve-linux-ptrace.c
> +++ b/gdb/nat/aarch64-sve-linux-ptrace.c
> @@ -26,6 +26,7 @@
> #include "arch/aarch64.h"
> #include "gdbsupport/common-regcache.h"
> #include "gdbsupport/byte-vector.h"
> +#include <endian.h>
> 
> /* See nat/aarch64-sve-linux-ptrace.h.  */
> 
> @@ -142,6 +143,24 @@ aarch64_sve_get_sveregs (int tid)
>   return buf;
> }
> 
> +/* If we are running in BE mode, byteswap the contents
> +   of SRC to DST for SIZE bytes.  Other, just copy the contents
> +   from SRC to DST.  */
> +
> +static void
> +aarch64_maybe_swab128 (gdb_byte *dst, const gdb_byte *src, size_t size)
> +{
> +  gdb_assert (src != nullptr && dst != nullptr);
> +  gdb_assert (size > 1);
> +
> +#if (__BYTE_ORDER == __BIG_ENDIAN)
> +  for (int i = 0; i < size - 1; i++)
> +    dst[i] = src[size - i];
> +#else
> +  memcpy (dst, src, size);
> +#endif
> +}
> +
> /* See nat/aarch64-sve-linux-ptrace.h.  */
> 
> void
> @@ -184,34 +203,50 @@ aarch64_sve_regs_copy_to_reg_buf (struct reg_buffer_common *reg_buf,
>     }
>   else
>     {
> +      /* WARNING: SIMD state is laid out in memory in target-endian format,
> +	 while SVE state is laid out in an endianness-independent format (LE).
> +
> +	 So we have a couple cases to consider:
> +
> +	 1 - If the target is big endian, then SIMD state is big endian,
> +	 requiring a byteswap.
> +
> +	 2 - If the target is little endian, then SIMD state is little endian,
> +	 which matches the SVE format, so no byteswap is needed. */
> +
>       /* There is no SVE state yet - the register dump contains a fpsimd
> 	 structure instead.  These registers still exist in the hardware, but
> 	 the kernel has not yet initialised them, and so they will be null.  */
> 
> -      char *zero_reg = (char *) alloca (SVE_PT_SVE_ZREG_SIZE (vq));
> +      gdb_byte *reg = (gdb_byte *) alloca (SVE_PT_SVE_ZREG_SIZE (vq));
>       struct user_fpsimd_state *fpsimd
> 	= (struct user_fpsimd_state *)(base + SVE_PT_FPSIMD_OFFSET);
> 
> +      /* Make sure we have a zeroed register buffer.  We will need the zero
> +	 padding below.  */
> +      memset (reg, 0, SVE_PT_SVE_ZREG_SIZE (vq));
> +
>       /* Copy across the V registers from fpsimd structure to the Z registers,
> 	 ensuring the non overlapping state is set to null.  */
> 
> -      memset (zero_reg, 0, SVE_PT_SVE_ZREG_SIZE (vq));
> -
>       for (int i = 0; i < AARCH64_SVE_Z_REGS_NUM; i++)
> 	{
> -	  memcpy (zero_reg, &fpsimd->vregs[i], sizeof (__int128_t));
> -	  reg_buf->raw_supply (AARCH64_SVE_Z0_REGNUM + i, zero_reg);
> +	  /* Handle big endian/little endian SIMD/SVE conversion.  */
> +	  aarch64_maybe_swab128 (reg, (const gdb_byte *) &fpsimd->vregs[i],
> +				 V_REGISTER_SIZE);
> +	  reg_buf->raw_supply (AARCH64_SVE_Z0_REGNUM + i, reg);
> 	}
> 
>       reg_buf->raw_supply (AARCH64_FPSR_REGNUM, &fpsimd->fpsr);
>       reg_buf->raw_supply (AARCH64_FPCR_REGNUM, &fpsimd->fpcr);
> 
>       /* Clear the SVE only registers.  */
> +      memset (reg, 0, SVE_PT_SVE_ZREG_SIZE (vq));
> 
>       for (int i = 0; i < AARCH64_SVE_P_REGS_NUM; i++)
> -	reg_buf->raw_supply (AARCH64_SVE_P0_REGNUM + i, zero_reg);
> +	reg_buf->raw_supply (AARCH64_SVE_P0_REGNUM + i, reg);
> 
> -      reg_buf->raw_supply (AARCH64_SVE_FFR_REGNUM, zero_reg);
> +      reg_buf->raw_supply (AARCH64_SVE_FFR_REGNUM, reg);
>     }
> }
> 
> @@ -240,11 +275,11 @@ aarch64_sve_regs_copy_from_reg_buf (const struct reg_buffer_common *reg_buf,
> 	 kernel, which is why we try to avoid it.  */
> 
>       bool has_sve_state = false;
> -      char *zero_reg = (char *) alloca (SVE_PT_SVE_ZREG_SIZE (vq));
> +      gdb_byte *reg = (gdb_byte *) alloca (SVE_PT_SVE_ZREG_SIZE (vq));
>       struct user_fpsimd_state *fpsimd
> 	= (struct user_fpsimd_state *)(base + SVE_PT_FPSIMD_OFFSET);
> 
> -      memset (zero_reg, 0, SVE_PT_SVE_ZREG_SIZE (vq));
> +      memset (reg, 0, SVE_PT_SVE_ZREG_SIZE (vq));
> 
>       /* Check in the reg_buf if any of the Z registers are set after the
> 	 first 128 bits, or if any of the other SVE registers are set.  */
> @@ -252,7 +287,7 @@ aarch64_sve_regs_copy_from_reg_buf (const struct reg_buffer_common *reg_buf,
>       for (int i = 0; i < AARCH64_SVE_Z_REGS_NUM; i++)
> 	{
> 	  has_sve_state |= reg_buf->raw_compare (AARCH64_SVE_Z0_REGNUM + i,
> -						 zero_reg, sizeof (__int128_t));
> +						 reg, sizeof (__int128_t));
> 	  if (has_sve_state)
> 	    break;
> 	}
> @@ -261,19 +296,31 @@ aarch64_sve_regs_copy_from_reg_buf (const struct reg_buffer_common *reg_buf,
> 	for (int i = 0; i < AARCH64_SVE_P_REGS_NUM; i++)
> 	  {
> 	    has_sve_state |= reg_buf->raw_compare (AARCH64_SVE_P0_REGNUM + i,
> -						   zero_reg, 0);
> +						   reg, 0);
> 	    if (has_sve_state)
> 	      break;
> 	  }
> 
>       if (!has_sve_state)
> 	  has_sve_state |= reg_buf->raw_compare (AARCH64_SVE_FFR_REGNUM,
> -						 zero_reg, 0);
> +						 reg, 0);
> 
>       /* If no SVE state exists, then use the existing fpsimd structure to
> 	 write out state and return.  */
>       if (!has_sve_state)
> 	{
> +	  /* WARNING: SIMD state is laid out in memory in target-endian format,
> +	     while SVE state is laid out in an endianness-independent format
> +	     (LE).
> +
> +	     So we have a couple cases to consider:
> +
> +	     1 - If the target is big endian, then SIMD state is big endian,
> +	     requiring a byteswap.
> +
> +	     2 - If the target is little endian, then SIMD state is little
> +	     endian, which matches the SVE format, so no byteswap is needed. */
> +
> 	  /* The collects of the Z registers will overflow the size of a vreg.
> 	     There is enough space in the structure to allow for this, but we
> 	     cannot overflow into the next register as we might not be
> @@ -284,8 +331,10 @@ aarch64_sve_regs_copy_from_reg_buf (const struct reg_buffer_common *reg_buf,
> 	      if (REG_VALID
> 		  == reg_buf->get_register_status (AARCH64_SVE_Z0_REGNUM + i))
> 		{
> -		  reg_buf->raw_collect (AARCH64_SVE_Z0_REGNUM + i, zero_reg);
> -		  memcpy (&fpsimd->vregs[i], zero_reg, sizeof (__int128_t));
> +		  reg_buf->raw_collect (AARCH64_SVE_Z0_REGNUM + i, reg);
> +		  /* Handle big endian/little endian SIMD/SVE conversion.  */
> +		  aarch64_maybe_swab128 ((gdb_byte *) &fpsimd->vregs[i], reg,
> +					 V_REGISTER_SIZE);
> 		}
> 	    }
> 
> diff --git a/gdb/trad-frame.c b/gdb/trad-frame.c
> index a6a84790a9..8a1aa818ad 100644
> --- a/gdb/trad-frame.c
> +++ b/gdb/trad-frame.c
> @@ -56,6 +56,7 @@ trad_frame_reset_saved_regs (struct gdbarch *gdbarch,
>     {
>       regs[regnum].realreg = regnum;
>       regs[regnum].addr = -1;
> +      regs[regnum].data = nullptr;
>     }
> }
> 
> @@ -83,7 +84,7 @@ trad_frame_alloc_saved_regs (struct frame_info *this_frame)
>   return trad_frame_alloc_saved_regs (gdbarch);
> }
> 
> -enum { TF_REG_VALUE = -1, TF_REG_UNKNOWN = -2 };
> +enum { TF_REG_VALUE = -1, TF_REG_UNKNOWN = -2, TF_REG_VALUE_BYTES = -3 };
> 
> int
> trad_frame_value_p (struct trad_frame_saved_reg this_saved_regs[], int regnum)
> @@ -106,6 +107,16 @@ trad_frame_realreg_p (struct trad_frame_saved_reg this_saved_regs[],
> 	  && this_saved_regs[regnum].addr == -1);
> }
> 
> +/* See trad-frame.h.  */
> +
> +bool
> +trad_frame_value_bytes_p (struct trad_frame_saved_reg this_saved_regs[],
> +			  int regnum)
> +{
> +  return (this_saved_regs[regnum].realreg == TF_REG_VALUE_BYTES
> +	  && this_saved_regs[regnum].data != nullptr);
> +}
> +
> void
> trad_frame_set_value (struct trad_frame_saved_reg this_saved_regs[],
> 		      int regnum, LONGEST val)
> @@ -224,6 +235,35 @@ trad_frame_set_unknown (struct trad_frame_saved_reg this_saved_regs[],
>   this_saved_regs[regnum].addr = -1;
> }
> 
> +/* See trad-frame.h.  */
> +
> +void
> +trad_frame_set_value_bytes (struct trad_frame_saved_reg this_saved_regs[],
> +			    int regnum, const gdb_byte *bytes,
> +			    size_t size)
> +{
> +  this_saved_regs[regnum].realreg = TF_REG_VALUE_BYTES;
> +
> +  /* Allocate the space and copy the data bytes.  */
> +  this_saved_regs[regnum].data = FRAME_OBSTACK_CALLOC (size, gdb_byte);
> +  memcpy (this_saved_regs[regnum].data, bytes, size);
> +}
> +
> +/* See trad-frame.h.  */
> +
> +void
> +trad_frame_set_reg_value_bytes (struct trad_frame_cache *this_trad_cache,
> +				int regnum, const gdb_byte *bytes,
> +				size_t size)
> +{
> +  /* External interface for users of trad_frame_cache
> +     (who cannot access the prev_regs object directly).  */
> +  trad_frame_set_value_bytes (this_trad_cache->prev_regs, regnum, bytes,
> +			      size);
> +}
> +
> +
> +
> struct value *
> trad_frame_get_prev_register (struct frame_info *this_frame,
> 			      struct trad_frame_saved_reg this_saved_regs[],
> @@ -240,6 +280,10 @@ trad_frame_get_prev_register (struct frame_info *this_frame,
>     /* The register's value is available.  */
>     return frame_unwind_got_constant (this_frame, regnum,
> 				      this_saved_regs[regnum].addr);
> +  else if (trad_frame_value_bytes_p (this_saved_regs, regnum))
> +    /* The register's value is available as a sequence of bytes.  */
> +    return frame_unwind_got_bytes (this_frame, regnum,
> +				   this_saved_regs[regnum].data);
>   else
>     return frame_unwind_got_optimized (this_frame, regnum);
> }
> diff --git a/gdb/trad-frame.h b/gdb/trad-frame.h
> index 7b5785616e..38db439579 100644
> --- a/gdb/trad-frame.h
> +++ b/gdb/trad-frame.h
> @@ -52,6 +52,12 @@ void trad_frame_set_reg_regmap (struct trad_frame_cache *this_trad_cache,
> void trad_frame_set_reg_value (struct trad_frame_cache *this_cache,
> 			       int regnum, LONGEST val);
> 
> +/* Given the cache in THIS_TRAD_CACHE, set the value of REGNUM to the bytes
> +   contained in BYTES with size SIZE.  */
> +void trad_frame_set_reg_value_bytes (struct trad_frame_cache *this_trad_cache,
> +				     int regnum, const gdb_byte *bytes,
> +				     size_t size);
> +
> struct value *trad_frame_get_register (struct trad_frame_cache *this_trad_cache,
> 				       struct frame_info *this_frame,
> 				       int regnum);
> @@ -86,6 +92,8 @@ struct trad_frame_saved_reg
> {
>   LONGEST addr; /* A CORE_ADDR fits in a longest.  */
>   int realreg;
> +  /* Register data (for values that don't fit in ADDR).  */
> +  gdb_byte *data;
> };
> 
> /* Encode REGNUM value in the trad-frame.  */
> @@ -104,6 +112,12 @@ void trad_frame_set_addr (struct trad_frame_saved_reg this_trad_cache[],
> void trad_frame_set_unknown (struct trad_frame_saved_reg this_saved_regs[],
> 			     int regnum);
> 
> +/* Encode REGNUM value in the trad-frame as a sequence of bytes.  This is
> +   useful when the value is larger than what primitive types can hold.  */
> +void trad_frame_set_value_bytes (struct trad_frame_saved_reg this_saved_regs[],
> +				 int regnum, const gdb_byte *bytes,
> +				 size_t size);
> +
> /* Convenience functions, return non-zero if the register has been
>    encoded as specified.  */
> int trad_frame_value_p (struct trad_frame_saved_reg this_saved_regs[],
> @@ -113,6 +127,11 @@ int trad_frame_addr_p (struct trad_frame_saved_reg this_saved_regs[],
> int trad_frame_realreg_p (struct trad_frame_saved_reg this_saved_regs[],
> 			  int regnum);
> 
> +/* Return TRUE if REGNUM is stored as a sequence of bytes, and FALSE
> +   otherwise.  */
> +bool trad_frame_value_bytes_p (struct trad_frame_saved_reg this_saved_regs[],
> +			      int regnum);
> +
> /* Reset the save regs cache, setting register values to -1.  */
> void trad_frame_reset_saved_regs (struct gdbarch *gdbarch,
> 				  struct trad_frame_saved_reg *regs);
> -- 
> 2.25.1
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH,v2] SVE/FPSIMD fixup for big endian
  2020-12-03 17:35   ` Alan Hayward
@ 2020-12-03 17:37     ` Luis Machado
  0 siblings, 0 replies; 13+ messages in thread
From: Luis Machado @ 2020-12-03 17:37 UTC (permalink / raw)
  To: Alan Hayward; +Cc: gdb-patches\@sourceware.org, nd

On 12/3/20 2:35 PM, Alan Hayward wrote:
> My only nit would be that the other register defines in aarch64-tdep.h
> could get moved across to arch/aarch64.h too. However, everything in the
> file is for pseudos and dwarf. So, I think it’s fine as is.

Yeah. I thought about it, but then I didn't want to have unrelated 
changes in this particular fix. I could do it as a separate patch though.

> 
> It’s an ok from me.

Thanks for the review!

> 
> 
> Alan
> 
> 
>> On 2 Dec 2020, at 17:57, Luis Machado <luis.machado@linaro.org> wrote:
>>
>> Updates on v2:
>>
>> - Reworked ptrace functions and made byteswap function more like a memcpy.
>> - Adjusted memset calls to zero register buffers.
>> - Adjusted comments.
>> - Moved V_REGISTER_SIZE constant definition.
>>
>> The FPSIMD dump in signal frames and ptrace FPSIMD dump in the SVE context
>> structure follows the target endianness, whereas the SVE dumps are
>> endianness-independent (LE).
>>
>> Therefore, when the system is in BE mode, we need to reverse the bytes
>> for the FPSIMD data.
>>
>> Given the V registers are larger than 64-bit, I've added a way for value
>> bytes to be set, as opposed to passing a 64-bit fixed quantity. This fits
>> nicely with the unwinding *_got_bytes function and makes the trad-frame
>> more flexible and capable of saving larger registers.
>>
>> The memory for the bytes is allocated via the frame obstack, so it gets freed
>> after we're done inspecting the frame.
>>
>> gdb/ChangeLog:
>>
>> YYYY-MM-DD  Luis Machado  <luis.machado@linaro.org>
>>
>> 	* aarch64-linux-tdep.c (aarch64_linux_restore_vreg) New function.
>> 	(aarch64_linux_sigframe_init): Call aarch64_linux_restore_vreg.
>> 	* aarch64-tdep.h (V_REGISTER_SIZE): Move to ...
>> 	* arch/aarch64.h: ... here.
>> 	* nat/aarch64-sve-linux-ptrace.c: Include endian.h.
>> 	(aarch64_maybe_swab128): New function.
>> 	(aarch64_sve_regs_copy_to_reg_buf)
>> 	(aarch64_sve_regs_copy_from_reg_buf): Adjust FPSIMD entries.
>> 	* trad-frame.c (trad_frame_reset_saved_regs): Initialize
>> 	the data field.
>> 	(TF_REG_VALUE_BYTES): New enum value.
>> 	(trad_frame_value_bytes_p): New function.
>> 	(trad_frame_set_value_bytes): New function.
>> 	(trad_frame_set_reg_value_bytes): New function.
>> 	(trad_frame_get_prev_register): Handle register values saved as bytes.
>> 	* trad-frame.h (trad_frame_set_reg_value_bytes): New prototype.
>> 	(struct trad_frame_saved_reg) <data>: New field.
>> 	(trad_frame_set_value_bytes): New prototype.
>> 	(trad_frame_value_bytes_p): New prototype.
>> ---
>> gdb/aarch64-linux-tdep.c           | 114 ++++++++++++++++++++++++-----
>> gdb/aarch64-tdep.h                 |   1 -
>> gdb/arch/aarch64.h                 |   2 +
>> gdb/nat/aarch64-sve-linux-ptrace.c |  77 +++++++++++++++----
>> gdb/trad-frame.c                   |  46 +++++++++++-
>> gdb/trad-frame.h                   |  19 +++++
>> 6 files changed, 224 insertions(+), 35 deletions(-)
>>
>> diff --git a/gdb/aarch64-linux-tdep.c b/gdb/aarch64-linux-tdep.c
>> index c9898bdafd..9429af0fe1 100644
>> --- a/gdb/aarch64-linux-tdep.c
>> +++ b/gdb/aarch64-linux-tdep.c
>> @@ -180,6 +180,93 @@ read_aarch64_ctx (CORE_ADDR ctx_addr, enum bfd_endian byte_order,
>>    return magic;
>> }
>>
>> +/* Given CACHE, use the trad_frame* functions to restore the FPSIMD
>> +   registers from a signal frame.
>> +
>> +   VREG_NUM is the number of the V register being restored, OFFSET is the
>> +   address containing the register value, BYTE_ORDER is the endianness and
>> +   HAS_SVE tells us if we have a valid SVE context or not.  */
>> +
>> +static void
>> +aarch64_linux_restore_vreg (struct trad_frame_cache *cache, int num_regs,
>> +			    int vreg_num, CORE_ADDR offset,
>> +			    enum bfd_endian byte_order, bool has_sve)
>> +{
>> +  /* WARNING: SIMD state is laid out in memory in target-endian format.
>> +
>> +     So we have a couple cases to consider:
>> +
>> +     1 - If the target is big endian, then SIMD state is big endian,
>> +     requiring a byteswap.
>> +
>> +     2 - If the target is little endian, then SIMD state is little endian, so
>> +     no byteswap is needed. */
>> +
>> +  if (byte_order == BFD_ENDIAN_BIG)
>> +    {
>> +      gdb_byte buf[V_REGISTER_SIZE];
>> +
>> +      if (target_read_memory (offset, buf, V_REGISTER_SIZE) != 0)
>> +	{
>> +	  size_t size = V_REGISTER_SIZE/2;
>> +
>> +	  /* Read the two halves of the V register in reverse byte order.  */
>> +	  CORE_ADDR u64 = extract_unsigned_integer (buf, size,
>> +						    byte_order);
>> +	  CORE_ADDR l64 = extract_unsigned_integer (buf + size, size,
>> +						    byte_order);
>> +
>> +	  /* Copy the reversed bytes to the buffer.  */
>> +	  store_unsigned_integer (buf, size, BFD_ENDIAN_LITTLE, l64);
>> +	  store_unsigned_integer (buf + size , size, BFD_ENDIAN_LITTLE, u64);
>> +
>> +	  /* Now we can store the correct bytes for the V register.  */
>> +	  trad_frame_set_reg_value_bytes (cache, AARCH64_V0_REGNUM + vreg_num,
>> +					  buf, V_REGISTER_SIZE);
>> +	  trad_frame_set_reg_value_bytes (cache,
>> +					  num_regs + AARCH64_Q0_REGNUM
>> +					  + vreg_num, buf, Q_REGISTER_SIZE);
>> +	  trad_frame_set_reg_value_bytes (cache,
>> +					  num_regs + AARCH64_D0_REGNUM
>> +					  + vreg_num, buf, D_REGISTER_SIZE);
>> +	  trad_frame_set_reg_value_bytes (cache,
>> +					  num_regs + AARCH64_S0_REGNUM
>> +					  + vreg_num, buf, S_REGISTER_SIZE);
>> +	  trad_frame_set_reg_value_bytes (cache,
>> +					  num_regs + AARCH64_H0_REGNUM
>> +					  + vreg_num, buf, H_REGISTER_SIZE);
>> +	  trad_frame_set_reg_value_bytes (cache,
>> +					  num_regs + AARCH64_B0_REGNUM
>> +					  + vreg_num, buf, B_REGISTER_SIZE);
>> +
>> +	  if (has_sve)
>> +	    trad_frame_set_reg_value_bytes (cache,
>> +					    num_regs + AARCH64_SVE_V0_REGNUM
>> +					    + vreg_num, buf, V_REGISTER_SIZE);
>> +	}
>> +      return;
>> +    }
>> +
>> +  /* Little endian, just point at the address containing the register
>> +     value.  */
>> +  trad_frame_set_reg_addr (cache, AARCH64_V0_REGNUM + vreg_num, offset);
>> +  trad_frame_set_reg_addr (cache, num_regs + AARCH64_Q0_REGNUM + vreg_num,
>> +			   offset);
>> +  trad_frame_set_reg_addr (cache, num_regs + AARCH64_D0_REGNUM + vreg_num,
>> +			   offset);
>> +  trad_frame_set_reg_addr (cache, num_regs + AARCH64_S0_REGNUM + vreg_num,
>> +			   offset);
>> +  trad_frame_set_reg_addr (cache, num_regs + AARCH64_H0_REGNUM + vreg_num,
>> +			   offset);
>> +  trad_frame_set_reg_addr (cache, num_regs + AARCH64_B0_REGNUM + vreg_num,
>> +			   offset);
>> +
>> +  if (has_sve)
>> +    trad_frame_set_reg_addr (cache, num_regs + AARCH64_SVE_V0_REGNUM
>> +			     + vreg_num, offset);
>> +
>> +}
>> +
>> /* Implement the "init" method of struct tramp_frame.  */
>>
>> static void
>> @@ -332,27 +419,16 @@ aarch64_linux_sigframe_init (const struct tramp_frame *self,
>>
>>        /* If there was no SVE section then set up the V registers.  */
>>        if (sve_regs == 0)
>> -	for (int i = 0; i < 32; i++)
>> -	  {
>> -	    CORE_ADDR offset = (fpsimd + AARCH64_FPSIMD_V0_OFFSET
>> +	{
>> +	  for (int i = 0; i < 32; i++)
>> +	    {
>> +	      CORE_ADDR offset = (fpsimd + AARCH64_FPSIMD_V0_OFFSET
>> 				  + (i * AARCH64_FPSIMD_VREG_SIZE));
>>
>> -	    trad_frame_set_reg_addr (this_cache, AARCH64_V0_REGNUM + i, offset);
>> -	    trad_frame_set_reg_addr (this_cache,
>> -				     num_regs + AARCH64_Q0_REGNUM + i, offset);
>> -	    trad_frame_set_reg_addr (this_cache,
>> -				     num_regs + AARCH64_D0_REGNUM + i, offset);
>> -	    trad_frame_set_reg_addr (this_cache,
>> -				     num_regs + AARCH64_S0_REGNUM + i, offset);
>> -	    trad_frame_set_reg_addr (this_cache,
>> -				     num_regs + AARCH64_H0_REGNUM + i, offset);
>> -	    trad_frame_set_reg_addr (this_cache,
>> -				     num_regs + AARCH64_B0_REGNUM + i, offset);
>> -	    if (tdep->has_sve ())
>> -	      trad_frame_set_reg_addr (this_cache,
>> -				       num_regs + AARCH64_SVE_V0_REGNUM + i,
>> -				       offset);
>> -	  }
>> +	      aarch64_linux_restore_vreg (this_cache, num_regs, i, offset,
>> +					  byte_order, tdep->has_sve ());
>> +	    }
>> +	}
>>      }
>>
>>    trad_frame_set_id (this_cache, frame_id_build (sp, func));
>> diff --git a/gdb/aarch64-tdep.h b/gdb/aarch64-tdep.h
>> index 81ce4d84b4..ba72a771cd 100644
>> --- a/gdb/aarch64-tdep.h
>> +++ b/gdb/aarch64-tdep.h
>> @@ -47,7 +47,6 @@ struct regset;
>> #define H_REGISTER_SIZE  2
>> #define S_REGISTER_SIZE  4
>> #define D_REGISTER_SIZE  8
>> -#define V_REGISTER_SIZE 16
>> #define Q_REGISTER_SIZE 16
>>
>> /* Total number of general (X) registers.  */
>> diff --git a/gdb/arch/aarch64.h b/gdb/arch/aarch64.h
>> index 857bb22b03..b75352461a 100644
>> --- a/gdb/arch/aarch64.h
>> +++ b/gdb/arch/aarch64.h
>> @@ -58,6 +58,8 @@ enum aarch64_regnum
>>    AARCH64_LAST_V_ARG_REGNUM = AARCH64_V0_REGNUM + 7
>> };
>>
>> +#define V_REGISTER_SIZE 16
>> +
>> /* Pseudo register base numbers.  */
>> #define AARCH64_Q0_REGNUM 0
>> #define AARCH64_D0_REGNUM (AARCH64_Q0_REGNUM + AARCH64_D_REGISTER_COUNT)
>> diff --git a/gdb/nat/aarch64-sve-linux-ptrace.c b/gdb/nat/aarch64-sve-linux-ptrace.c
>> index 2ce90ccfd7..5f08995f87 100644
>> --- a/gdb/nat/aarch64-sve-linux-ptrace.c
>> +++ b/gdb/nat/aarch64-sve-linux-ptrace.c
>> @@ -26,6 +26,7 @@
>> #include "arch/aarch64.h"
>> #include "gdbsupport/common-regcache.h"
>> #include "gdbsupport/byte-vector.h"
>> +#include <endian.h>
>>
>> /* See nat/aarch64-sve-linux-ptrace.h.  */
>>
>> @@ -142,6 +143,24 @@ aarch64_sve_get_sveregs (int tid)
>>    return buf;
>> }
>>
>> +/* If we are running in BE mode, byteswap the contents
>> +   of SRC to DST for SIZE bytes.  Other, just copy the contents
>> +   from SRC to DST.  */
>> +
>> +static void
>> +aarch64_maybe_swab128 (gdb_byte *dst, const gdb_byte *src, size_t size)
>> +{
>> +  gdb_assert (src != nullptr && dst != nullptr);
>> +  gdb_assert (size > 1);
>> +
>> +#if (__BYTE_ORDER == __BIG_ENDIAN)
>> +  for (int i = 0; i < size - 1; i++)
>> +    dst[i] = src[size - i];
>> +#else
>> +  memcpy (dst, src, size);
>> +#endif
>> +}
>> +
>> /* See nat/aarch64-sve-linux-ptrace.h.  */
>>
>> void
>> @@ -184,34 +203,50 @@ aarch64_sve_regs_copy_to_reg_buf (struct reg_buffer_common *reg_buf,
>>      }
>>    else
>>      {
>> +      /* WARNING: SIMD state is laid out in memory in target-endian format,
>> +	 while SVE state is laid out in an endianness-independent format (LE).
>> +
>> +	 So we have a couple cases to consider:
>> +
>> +	 1 - If the target is big endian, then SIMD state is big endian,
>> +	 requiring a byteswap.
>> +
>> +	 2 - If the target is little endian, then SIMD state is little endian,
>> +	 which matches the SVE format, so no byteswap is needed. */
>> +
>>        /* There is no SVE state yet - the register dump contains a fpsimd
>> 	 structure instead.  These registers still exist in the hardware, but
>> 	 the kernel has not yet initialised them, and so they will be null.  */
>>
>> -      char *zero_reg = (char *) alloca (SVE_PT_SVE_ZREG_SIZE (vq));
>> +      gdb_byte *reg = (gdb_byte *) alloca (SVE_PT_SVE_ZREG_SIZE (vq));
>>        struct user_fpsimd_state *fpsimd
>> 	= (struct user_fpsimd_state *)(base + SVE_PT_FPSIMD_OFFSET);
>>
>> +      /* Make sure we have a zeroed register buffer.  We will need the zero
>> +	 padding below.  */
>> +      memset (reg, 0, SVE_PT_SVE_ZREG_SIZE (vq));
>> +
>>        /* Copy across the V registers from fpsimd structure to the Z registers,
>> 	 ensuring the non overlapping state is set to null.  */
>>
>> -      memset (zero_reg, 0, SVE_PT_SVE_ZREG_SIZE (vq));
>> -
>>        for (int i = 0; i < AARCH64_SVE_Z_REGS_NUM; i++)
>> 	{
>> -	  memcpy (zero_reg, &fpsimd->vregs[i], sizeof (__int128_t));
>> -	  reg_buf->raw_supply (AARCH64_SVE_Z0_REGNUM + i, zero_reg);
>> +	  /* Handle big endian/little endian SIMD/SVE conversion.  */
>> +	  aarch64_maybe_swab128 (reg, (const gdb_byte *) &fpsimd->vregs[i],
>> +				 V_REGISTER_SIZE);
>> +	  reg_buf->raw_supply (AARCH64_SVE_Z0_REGNUM + i, reg);
>> 	}
>>
>>        reg_buf->raw_supply (AARCH64_FPSR_REGNUM, &fpsimd->fpsr);
>>        reg_buf->raw_supply (AARCH64_FPCR_REGNUM, &fpsimd->fpcr);
>>
>>        /* Clear the SVE only registers.  */
>> +      memset (reg, 0, SVE_PT_SVE_ZREG_SIZE (vq));
>>
>>        for (int i = 0; i < AARCH64_SVE_P_REGS_NUM; i++)
>> -	reg_buf->raw_supply (AARCH64_SVE_P0_REGNUM + i, zero_reg);
>> +	reg_buf->raw_supply (AARCH64_SVE_P0_REGNUM + i, reg);
>>
>> -      reg_buf->raw_supply (AARCH64_SVE_FFR_REGNUM, zero_reg);
>> +      reg_buf->raw_supply (AARCH64_SVE_FFR_REGNUM, reg);
>>      }
>> }
>>
>> @@ -240,11 +275,11 @@ aarch64_sve_regs_copy_from_reg_buf (const struct reg_buffer_common *reg_buf,
>> 	 kernel, which is why we try to avoid it.  */
>>
>>        bool has_sve_state = false;
>> -      char *zero_reg = (char *) alloca (SVE_PT_SVE_ZREG_SIZE (vq));
>> +      gdb_byte *reg = (gdb_byte *) alloca (SVE_PT_SVE_ZREG_SIZE (vq));
>>        struct user_fpsimd_state *fpsimd
>> 	= (struct user_fpsimd_state *)(base + SVE_PT_FPSIMD_OFFSET);
>>
>> -      memset (zero_reg, 0, SVE_PT_SVE_ZREG_SIZE (vq));
>> +      memset (reg, 0, SVE_PT_SVE_ZREG_SIZE (vq));
>>
>>        /* Check in the reg_buf if any of the Z registers are set after the
>> 	 first 128 bits, or if any of the other SVE registers are set.  */
>> @@ -252,7 +287,7 @@ aarch64_sve_regs_copy_from_reg_buf (const struct reg_buffer_common *reg_buf,
>>        for (int i = 0; i < AARCH64_SVE_Z_REGS_NUM; i++)
>> 	{
>> 	  has_sve_state |= reg_buf->raw_compare (AARCH64_SVE_Z0_REGNUM + i,
>> -						 zero_reg, sizeof (__int128_t));
>> +						 reg, sizeof (__int128_t));
>> 	  if (has_sve_state)
>> 	    break;
>> 	}
>> @@ -261,19 +296,31 @@ aarch64_sve_regs_copy_from_reg_buf (const struct reg_buffer_common *reg_buf,
>> 	for (int i = 0; i < AARCH64_SVE_P_REGS_NUM; i++)
>> 	  {
>> 	    has_sve_state |= reg_buf->raw_compare (AARCH64_SVE_P0_REGNUM + i,
>> -						   zero_reg, 0);
>> +						   reg, 0);
>> 	    if (has_sve_state)
>> 	      break;
>> 	  }
>>
>>        if (!has_sve_state)
>> 	  has_sve_state |= reg_buf->raw_compare (AARCH64_SVE_FFR_REGNUM,
>> -						 zero_reg, 0);
>> +						 reg, 0);
>>
>>        /* If no SVE state exists, then use the existing fpsimd structure to
>> 	 write out state and return.  */
>>        if (!has_sve_state)
>> 	{
>> +	  /* WARNING: SIMD state is laid out in memory in target-endian format,
>> +	     while SVE state is laid out in an endianness-independent format
>> +	     (LE).
>> +
>> +	     So we have a couple cases to consider:
>> +
>> +	     1 - If the target is big endian, then SIMD state is big endian,
>> +	     requiring a byteswap.
>> +
>> +	     2 - If the target is little endian, then SIMD state is little
>> +	     endian, which matches the SVE format, so no byteswap is needed. */
>> +
>> 	  /* The collects of the Z registers will overflow the size of a vreg.
>> 	     There is enough space in the structure to allow for this, but we
>> 	     cannot overflow into the next register as we might not be
>> @@ -284,8 +331,10 @@ aarch64_sve_regs_copy_from_reg_buf (const struct reg_buffer_common *reg_buf,
>> 	      if (REG_VALID
>> 		  == reg_buf->get_register_status (AARCH64_SVE_Z0_REGNUM + i))
>> 		{
>> -		  reg_buf->raw_collect (AARCH64_SVE_Z0_REGNUM + i, zero_reg);
>> -		  memcpy (&fpsimd->vregs[i], zero_reg, sizeof (__int128_t));
>> +		  reg_buf->raw_collect (AARCH64_SVE_Z0_REGNUM + i, reg);
>> +		  /* Handle big endian/little endian SIMD/SVE conversion.  */
>> +		  aarch64_maybe_swab128 ((gdb_byte *) &fpsimd->vregs[i], reg,
>> +					 V_REGISTER_SIZE);
>> 		}
>> 	    }
>>
>> diff --git a/gdb/trad-frame.c b/gdb/trad-frame.c
>> index a6a84790a9..8a1aa818ad 100644
>> --- a/gdb/trad-frame.c
>> +++ b/gdb/trad-frame.c
>> @@ -56,6 +56,7 @@ trad_frame_reset_saved_regs (struct gdbarch *gdbarch,
>>      {
>>        regs[regnum].realreg = regnum;
>>        regs[regnum].addr = -1;
>> +      regs[regnum].data = nullptr;
>>      }
>> }
>>
>> @@ -83,7 +84,7 @@ trad_frame_alloc_saved_regs (struct frame_info *this_frame)
>>    return trad_frame_alloc_saved_regs (gdbarch);
>> }
>>
>> -enum { TF_REG_VALUE = -1, TF_REG_UNKNOWN = -2 };
>> +enum { TF_REG_VALUE = -1, TF_REG_UNKNOWN = -2, TF_REG_VALUE_BYTES = -3 };
>>
>> int
>> trad_frame_value_p (struct trad_frame_saved_reg this_saved_regs[], int regnum)
>> @@ -106,6 +107,16 @@ trad_frame_realreg_p (struct trad_frame_saved_reg this_saved_regs[],
>> 	  && this_saved_regs[regnum].addr == -1);
>> }
>>
>> +/* See trad-frame.h.  */
>> +
>> +bool
>> +trad_frame_value_bytes_p (struct trad_frame_saved_reg this_saved_regs[],
>> +			  int regnum)
>> +{
>> +  return (this_saved_regs[regnum].realreg == TF_REG_VALUE_BYTES
>> +	  && this_saved_regs[regnum].data != nullptr);
>> +}
>> +
>> void
>> trad_frame_set_value (struct trad_frame_saved_reg this_saved_regs[],
>> 		      int regnum, LONGEST val)
>> @@ -224,6 +235,35 @@ trad_frame_set_unknown (struct trad_frame_saved_reg this_saved_regs[],
>>    this_saved_regs[regnum].addr = -1;
>> }
>>
>> +/* See trad-frame.h.  */
>> +
>> +void
>> +trad_frame_set_value_bytes (struct trad_frame_saved_reg this_saved_regs[],
>> +			    int regnum, const gdb_byte *bytes,
>> +			    size_t size)
>> +{
>> +  this_saved_regs[regnum].realreg = TF_REG_VALUE_BYTES;
>> +
>> +  /* Allocate the space and copy the data bytes.  */
>> +  this_saved_regs[regnum].data = FRAME_OBSTACK_CALLOC (size, gdb_byte);
>> +  memcpy (this_saved_regs[regnum].data, bytes, size);
>> +}
>> +
>> +/* See trad-frame.h.  */
>> +
>> +void
>> +trad_frame_set_reg_value_bytes (struct trad_frame_cache *this_trad_cache,
>> +				int regnum, const gdb_byte *bytes,
>> +				size_t size)
>> +{
>> +  /* External interface for users of trad_frame_cache
>> +     (who cannot access the prev_regs object directly).  */
>> +  trad_frame_set_value_bytes (this_trad_cache->prev_regs, regnum, bytes,
>> +			      size);
>> +}
>> +
>> +
>> +
>> struct value *
>> trad_frame_get_prev_register (struct frame_info *this_frame,
>> 			      struct trad_frame_saved_reg this_saved_regs[],
>> @@ -240,6 +280,10 @@ trad_frame_get_prev_register (struct frame_info *this_frame,
>>      /* The register's value is available.  */
>>      return frame_unwind_got_constant (this_frame, regnum,
>> 				      this_saved_regs[regnum].addr);
>> +  else if (trad_frame_value_bytes_p (this_saved_regs, regnum))
>> +    /* The register's value is available as a sequence of bytes.  */
>> +    return frame_unwind_got_bytes (this_frame, regnum,
>> +				   this_saved_regs[regnum].data);
>>    else
>>      return frame_unwind_got_optimized (this_frame, regnum);
>> }
>> diff --git a/gdb/trad-frame.h b/gdb/trad-frame.h
>> index 7b5785616e..38db439579 100644
>> --- a/gdb/trad-frame.h
>> +++ b/gdb/trad-frame.h
>> @@ -52,6 +52,12 @@ void trad_frame_set_reg_regmap (struct trad_frame_cache *this_trad_cache,
>> void trad_frame_set_reg_value (struct trad_frame_cache *this_cache,
>> 			       int regnum, LONGEST val);
>>
>> +/* Given the cache in THIS_TRAD_CACHE, set the value of REGNUM to the bytes
>> +   contained in BYTES with size SIZE.  */
>> +void trad_frame_set_reg_value_bytes (struct trad_frame_cache *this_trad_cache,
>> +				     int regnum, const gdb_byte *bytes,
>> +				     size_t size);
>> +
>> struct value *trad_frame_get_register (struct trad_frame_cache *this_trad_cache,
>> 				       struct frame_info *this_frame,
>> 				       int regnum);
>> @@ -86,6 +92,8 @@ struct trad_frame_saved_reg
>> {
>>    LONGEST addr; /* A CORE_ADDR fits in a longest.  */
>>    int realreg;
>> +  /* Register data (for values that don't fit in ADDR).  */
>> +  gdb_byte *data;
>> };
>>
>> /* Encode REGNUM value in the trad-frame.  */
>> @@ -104,6 +112,12 @@ void trad_frame_set_addr (struct trad_frame_saved_reg this_trad_cache[],
>> void trad_frame_set_unknown (struct trad_frame_saved_reg this_saved_regs[],
>> 			     int regnum);
>>
>> +/* Encode REGNUM value in the trad-frame as a sequence of bytes.  This is
>> +   useful when the value is larger than what primitive types can hold.  */
>> +void trad_frame_set_value_bytes (struct trad_frame_saved_reg this_saved_regs[],
>> +				 int regnum, const gdb_byte *bytes,
>> +				 size_t size);
>> +
>> /* Convenience functions, return non-zero if the register has been
>>     encoded as specified.  */
>> int trad_frame_value_p (struct trad_frame_saved_reg this_saved_regs[],
>> @@ -113,6 +127,11 @@ int trad_frame_addr_p (struct trad_frame_saved_reg this_saved_regs[],
>> int trad_frame_realreg_p (struct trad_frame_saved_reg this_saved_regs[],
>> 			  int regnum);
>>
>> +/* Return TRUE if REGNUM is stored as a sequence of bytes, and FALSE
>> +   otherwise.  */
>> +bool trad_frame_value_bytes_p (struct trad_frame_saved_reg this_saved_regs[],
>> +			      int regnum);
>> +
>> /* Reset the save regs cache, setting register values to -1.  */
>> void trad_frame_reset_saved_regs (struct gdbarch *gdbarch,
>> 				  struct trad_frame_saved_reg *regs);
>> -- 
>> 2.25.1
>>
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH,v2] SVE/FPSIMD fixup for big endian
  2020-12-02 17:57 ` [PATCH,v2] " Luis Machado
  2020-12-03 17:35   ` Alan Hayward
@ 2020-12-04 14:22   ` Luis Machado
  2020-12-08 13:39     ` Luis Machado
  2020-12-08 16:10     ` Simon Marchi
  1 sibling, 2 replies; 13+ messages in thread
From: Luis Machado @ 2020-12-04 14:22 UTC (permalink / raw)
  To: gdb-patches

With the AArch64-specific parts approved, do the global maintainers have 
any comments on the trad-frame changes, or does it look good?

On 12/2/20 2:57 PM, Luis Machado wrote:
> Updates on v2:
> 
> - Reworked ptrace functions and made byteswap function more like a memcpy.
> - Adjusted memset calls to zero register buffers.
> - Adjusted comments.
> - Moved V_REGISTER_SIZE constant definition.
> 
> The FPSIMD dump in signal frames and ptrace FPSIMD dump in the SVE context
> structure follows the target endianness, whereas the SVE dumps are
> endianness-independent (LE).
> 
> Therefore, when the system is in BE mode, we need to reverse the bytes
> for the FPSIMD data.
> 
> Given the V registers are larger than 64-bit, I've added a way for value
> bytes to be set, as opposed to passing a 64-bit fixed quantity. This fits
> nicely with the unwinding *_got_bytes function and makes the trad-frame
> more flexible and capable of saving larger registers.
> 
> The memory for the bytes is allocated via the frame obstack, so it gets freed
> after we're done inspecting the frame.
> 
> gdb/ChangeLog:
> 
> YYYY-MM-DD  Luis Machado  <luis.machado@linaro.org>
> 
> 	* aarch64-linux-tdep.c (aarch64_linux_restore_vreg) New function.
> 	(aarch64_linux_sigframe_init): Call aarch64_linux_restore_vreg.
> 	* aarch64-tdep.h (V_REGISTER_SIZE): Move to ...
> 	* arch/aarch64.h: ... here.
> 	* nat/aarch64-sve-linux-ptrace.c: Include endian.h.
> 	(aarch64_maybe_swab128): New function.
> 	(aarch64_sve_regs_copy_to_reg_buf)
> 	(aarch64_sve_regs_copy_from_reg_buf): Adjust FPSIMD entries.
> 	* trad-frame.c (trad_frame_reset_saved_regs): Initialize
> 	the data field.
> 	(TF_REG_VALUE_BYTES): New enum value.
> 	(trad_frame_value_bytes_p): New function.
> 	(trad_frame_set_value_bytes): New function.
> 	(trad_frame_set_reg_value_bytes): New function.
> 	(trad_frame_get_prev_register): Handle register values saved as bytes.
> 	* trad-frame.h (trad_frame_set_reg_value_bytes): New prototype.
> 	(struct trad_frame_saved_reg) <data>: New field.
> 	(trad_frame_set_value_bytes): New prototype.
> 	(trad_frame_value_bytes_p): New prototype.
> ---
>   gdb/aarch64-linux-tdep.c           | 114 ++++++++++++++++++++++++-----
>   gdb/aarch64-tdep.h                 |   1 -
>   gdb/arch/aarch64.h                 |   2 +
>   gdb/nat/aarch64-sve-linux-ptrace.c |  77 +++++++++++++++----
>   gdb/trad-frame.c                   |  46 +++++++++++-
>   gdb/trad-frame.h                   |  19 +++++
>   6 files changed, 224 insertions(+), 35 deletions(-)
> 
> diff --git a/gdb/aarch64-linux-tdep.c b/gdb/aarch64-linux-tdep.c
> index c9898bdafd..9429af0fe1 100644
> --- a/gdb/aarch64-linux-tdep.c
> +++ b/gdb/aarch64-linux-tdep.c
> @@ -180,6 +180,93 @@ read_aarch64_ctx (CORE_ADDR ctx_addr, enum bfd_endian byte_order,
>     return magic;
>   }
>   
> +/* Given CACHE, use the trad_frame* functions to restore the FPSIMD
> +   registers from a signal frame.
> +
> +   VREG_NUM is the number of the V register being restored, OFFSET is the
> +   address containing the register value, BYTE_ORDER is the endianness and
> +   HAS_SVE tells us if we have a valid SVE context or not.  */
> +
> +static void
> +aarch64_linux_restore_vreg (struct trad_frame_cache *cache, int num_regs,
> +			    int vreg_num, CORE_ADDR offset,
> +			    enum bfd_endian byte_order, bool has_sve)
> +{
> +  /* WARNING: SIMD state is laid out in memory in target-endian format.
> +
> +     So we have a couple cases to consider:
> +
> +     1 - If the target is big endian, then SIMD state is big endian,
> +     requiring a byteswap.
> +
> +     2 - If the target is little endian, then SIMD state is little endian, so
> +     no byteswap is needed. */
> +
> +  if (byte_order == BFD_ENDIAN_BIG)
> +    {
> +      gdb_byte buf[V_REGISTER_SIZE];
> +
> +      if (target_read_memory (offset, buf, V_REGISTER_SIZE) != 0)
> +	{
> +	  size_t size = V_REGISTER_SIZE/2;
> +
> +	  /* Read the two halves of the V register in reverse byte order.  */
> +	  CORE_ADDR u64 = extract_unsigned_integer (buf, size,
> +						    byte_order);
> +	  CORE_ADDR l64 = extract_unsigned_integer (buf + size, size,
> +						    byte_order);
> +
> +	  /* Copy the reversed bytes to the buffer.  */
> +	  store_unsigned_integer (buf, size, BFD_ENDIAN_LITTLE, l64);
> +	  store_unsigned_integer (buf + size , size, BFD_ENDIAN_LITTLE, u64);
> +
> +	  /* Now we can store the correct bytes for the V register.  */
> +	  trad_frame_set_reg_value_bytes (cache, AARCH64_V0_REGNUM + vreg_num,
> +					  buf, V_REGISTER_SIZE);
> +	  trad_frame_set_reg_value_bytes (cache,
> +					  num_regs + AARCH64_Q0_REGNUM
> +					  + vreg_num, buf, Q_REGISTER_SIZE);
> +	  trad_frame_set_reg_value_bytes (cache,
> +					  num_regs + AARCH64_D0_REGNUM
> +					  + vreg_num, buf, D_REGISTER_SIZE);
> +	  trad_frame_set_reg_value_bytes (cache,
> +					  num_regs + AARCH64_S0_REGNUM
> +					  + vreg_num, buf, S_REGISTER_SIZE);
> +	  trad_frame_set_reg_value_bytes (cache,
> +					  num_regs + AARCH64_H0_REGNUM
> +					  + vreg_num, buf, H_REGISTER_SIZE);
> +	  trad_frame_set_reg_value_bytes (cache,
> +					  num_regs + AARCH64_B0_REGNUM
> +					  + vreg_num, buf, B_REGISTER_SIZE);
> +
> +	  if (has_sve)
> +	    trad_frame_set_reg_value_bytes (cache,
> +					    num_regs + AARCH64_SVE_V0_REGNUM
> +					    + vreg_num, buf, V_REGISTER_SIZE);
> +	}
> +      return;
> +    }
> +
> +  /* Little endian, just point at the address containing the register
> +     value.  */
> +  trad_frame_set_reg_addr (cache, AARCH64_V0_REGNUM + vreg_num, offset);
> +  trad_frame_set_reg_addr (cache, num_regs + AARCH64_Q0_REGNUM + vreg_num,
> +			   offset);
> +  trad_frame_set_reg_addr (cache, num_regs + AARCH64_D0_REGNUM + vreg_num,
> +			   offset);
> +  trad_frame_set_reg_addr (cache, num_regs + AARCH64_S0_REGNUM + vreg_num,
> +			   offset);
> +  trad_frame_set_reg_addr (cache, num_regs + AARCH64_H0_REGNUM + vreg_num,
> +			   offset);
> +  trad_frame_set_reg_addr (cache, num_regs + AARCH64_B0_REGNUM + vreg_num,
> +			   offset);
> +
> +  if (has_sve)
> +    trad_frame_set_reg_addr (cache, num_regs + AARCH64_SVE_V0_REGNUM
> +			     + vreg_num, offset);
> +
> +}
> +
>   /* Implement the "init" method of struct tramp_frame.  */
>   
>   static void
> @@ -332,27 +419,16 @@ aarch64_linux_sigframe_init (const struct tramp_frame *self,
>   
>         /* If there was no SVE section then set up the V registers.  */
>         if (sve_regs == 0)
> -	for (int i = 0; i < 32; i++)
> -	  {
> -	    CORE_ADDR offset = (fpsimd + AARCH64_FPSIMD_V0_OFFSET
> +	{
> +	  for (int i = 0; i < 32; i++)
> +	    {
> +	      CORE_ADDR offset = (fpsimd + AARCH64_FPSIMD_V0_OFFSET
>   				  + (i * AARCH64_FPSIMD_VREG_SIZE));
>   
> -	    trad_frame_set_reg_addr (this_cache, AARCH64_V0_REGNUM + i, offset);
> -	    trad_frame_set_reg_addr (this_cache,
> -				     num_regs + AARCH64_Q0_REGNUM + i, offset);
> -	    trad_frame_set_reg_addr (this_cache,
> -				     num_regs + AARCH64_D0_REGNUM + i, offset);
> -	    trad_frame_set_reg_addr (this_cache,
> -				     num_regs + AARCH64_S0_REGNUM + i, offset);
> -	    trad_frame_set_reg_addr (this_cache,
> -				     num_regs + AARCH64_H0_REGNUM + i, offset);
> -	    trad_frame_set_reg_addr (this_cache,
> -				     num_regs + AARCH64_B0_REGNUM + i, offset);
> -	    if (tdep->has_sve ())
> -	      trad_frame_set_reg_addr (this_cache,
> -				       num_regs + AARCH64_SVE_V0_REGNUM + i,
> -				       offset);
> -	  }
> +	      aarch64_linux_restore_vreg (this_cache, num_regs, i, offset,
> +					  byte_order, tdep->has_sve ());
> +	    }
> +	}
>       }
>   
>     trad_frame_set_id (this_cache, frame_id_build (sp, func));
> diff --git a/gdb/aarch64-tdep.h b/gdb/aarch64-tdep.h
> index 81ce4d84b4..ba72a771cd 100644
> --- a/gdb/aarch64-tdep.h
> +++ b/gdb/aarch64-tdep.h
> @@ -47,7 +47,6 @@ struct regset;
>   #define H_REGISTER_SIZE  2
>   #define S_REGISTER_SIZE  4
>   #define D_REGISTER_SIZE  8
> -#define V_REGISTER_SIZE 16
>   #define Q_REGISTER_SIZE 16
>   
>   /* Total number of general (X) registers.  */
> diff --git a/gdb/arch/aarch64.h b/gdb/arch/aarch64.h
> index 857bb22b03..b75352461a 100644
> --- a/gdb/arch/aarch64.h
> +++ b/gdb/arch/aarch64.h
> @@ -58,6 +58,8 @@ enum aarch64_regnum
>     AARCH64_LAST_V_ARG_REGNUM = AARCH64_V0_REGNUM + 7
>   };
>   
> +#define V_REGISTER_SIZE 16
> +
>   /* Pseudo register base numbers.  */
>   #define AARCH64_Q0_REGNUM 0
>   #define AARCH64_D0_REGNUM (AARCH64_Q0_REGNUM + AARCH64_D_REGISTER_COUNT)
> diff --git a/gdb/nat/aarch64-sve-linux-ptrace.c b/gdb/nat/aarch64-sve-linux-ptrace.c
> index 2ce90ccfd7..5f08995f87 100644
> --- a/gdb/nat/aarch64-sve-linux-ptrace.c
> +++ b/gdb/nat/aarch64-sve-linux-ptrace.c
> @@ -26,6 +26,7 @@
>   #include "arch/aarch64.h"
>   #include "gdbsupport/common-regcache.h"
>   #include "gdbsupport/byte-vector.h"
> +#include <endian.h>
>   
>   /* See nat/aarch64-sve-linux-ptrace.h.  */
>   
> @@ -142,6 +143,24 @@ aarch64_sve_get_sveregs (int tid)
>     return buf;
>   }
>   
> +/* If we are running in BE mode, byteswap the contents
> +   of SRC to DST for SIZE bytes.  Other, just copy the contents
> +   from SRC to DST.  */
> +
> +static void
> +aarch64_maybe_swab128 (gdb_byte *dst, const gdb_byte *src, size_t size)
> +{
> +  gdb_assert (src != nullptr && dst != nullptr);
> +  gdb_assert (size > 1);
> +
> +#if (__BYTE_ORDER == __BIG_ENDIAN)
> +  for (int i = 0; i < size - 1; i++)
> +    dst[i] = src[size - i];
> +#else
> +  memcpy (dst, src, size);
> +#endif
> +}
> +
>   /* See nat/aarch64-sve-linux-ptrace.h.  */
>   
>   void
> @@ -184,34 +203,50 @@ aarch64_sve_regs_copy_to_reg_buf (struct reg_buffer_common *reg_buf,
>       }
>     else
>       {
> +      /* WARNING: SIMD state is laid out in memory in target-endian format,
> +	 while SVE state is laid out in an endianness-independent format (LE).
> +
> +	 So we have a couple cases to consider:
> +
> +	 1 - If the target is big endian, then SIMD state is big endian,
> +	 requiring a byteswap.
> +
> +	 2 - If the target is little endian, then SIMD state is little endian,
> +	 which matches the SVE format, so no byteswap is needed. */
> +
>         /* There is no SVE state yet - the register dump contains a fpsimd
>   	 structure instead.  These registers still exist in the hardware, but
>   	 the kernel has not yet initialised them, and so they will be null.  */
>   
> -      char *zero_reg = (char *) alloca (SVE_PT_SVE_ZREG_SIZE (vq));
> +      gdb_byte *reg = (gdb_byte *) alloca (SVE_PT_SVE_ZREG_SIZE (vq));
>         struct user_fpsimd_state *fpsimd
>   	= (struct user_fpsimd_state *)(base + SVE_PT_FPSIMD_OFFSET);
>   
> +      /* Make sure we have a zeroed register buffer.  We will need the zero
> +	 padding below.  */
> +      memset (reg, 0, SVE_PT_SVE_ZREG_SIZE (vq));
> +
>         /* Copy across the V registers from fpsimd structure to the Z registers,
>   	 ensuring the non overlapping state is set to null.  */
>   
> -      memset (zero_reg, 0, SVE_PT_SVE_ZREG_SIZE (vq));
> -
>         for (int i = 0; i < AARCH64_SVE_Z_REGS_NUM; i++)
>   	{
> -	  memcpy (zero_reg, &fpsimd->vregs[i], sizeof (__int128_t));
> -	  reg_buf->raw_supply (AARCH64_SVE_Z0_REGNUM + i, zero_reg);
> +	  /* Handle big endian/little endian SIMD/SVE conversion.  */
> +	  aarch64_maybe_swab128 (reg, (const gdb_byte *) &fpsimd->vregs[i],
> +				 V_REGISTER_SIZE);
> +	  reg_buf->raw_supply (AARCH64_SVE_Z0_REGNUM + i, reg);
>   	}
>   
>         reg_buf->raw_supply (AARCH64_FPSR_REGNUM, &fpsimd->fpsr);
>         reg_buf->raw_supply (AARCH64_FPCR_REGNUM, &fpsimd->fpcr);
>   
>         /* Clear the SVE only registers.  */
> +      memset (reg, 0, SVE_PT_SVE_ZREG_SIZE (vq));
>   
>         for (int i = 0; i < AARCH64_SVE_P_REGS_NUM; i++)
> -	reg_buf->raw_supply (AARCH64_SVE_P0_REGNUM + i, zero_reg);
> +	reg_buf->raw_supply (AARCH64_SVE_P0_REGNUM + i, reg);
>   
> -      reg_buf->raw_supply (AARCH64_SVE_FFR_REGNUM, zero_reg);
> +      reg_buf->raw_supply (AARCH64_SVE_FFR_REGNUM, reg);
>       }
>   }
>   
> @@ -240,11 +275,11 @@ aarch64_sve_regs_copy_from_reg_buf (const struct reg_buffer_common *reg_buf,
>   	 kernel, which is why we try to avoid it.  */
>   
>         bool has_sve_state = false;
> -      char *zero_reg = (char *) alloca (SVE_PT_SVE_ZREG_SIZE (vq));
> +      gdb_byte *reg = (gdb_byte *) alloca (SVE_PT_SVE_ZREG_SIZE (vq));
>         struct user_fpsimd_state *fpsimd
>   	= (struct user_fpsimd_state *)(base + SVE_PT_FPSIMD_OFFSET);
>   
> -      memset (zero_reg, 0, SVE_PT_SVE_ZREG_SIZE (vq));
> +      memset (reg, 0, SVE_PT_SVE_ZREG_SIZE (vq));
>   
>         /* Check in the reg_buf if any of the Z registers are set after the
>   	 first 128 bits, or if any of the other SVE registers are set.  */
> @@ -252,7 +287,7 @@ aarch64_sve_regs_copy_from_reg_buf (const struct reg_buffer_common *reg_buf,
>         for (int i = 0; i < AARCH64_SVE_Z_REGS_NUM; i++)
>   	{
>   	  has_sve_state |= reg_buf->raw_compare (AARCH64_SVE_Z0_REGNUM + i,
> -						 zero_reg, sizeof (__int128_t));
> +						 reg, sizeof (__int128_t));
>   	  if (has_sve_state)
>   	    break;
>   	}
> @@ -261,19 +296,31 @@ aarch64_sve_regs_copy_from_reg_buf (const struct reg_buffer_common *reg_buf,
>   	for (int i = 0; i < AARCH64_SVE_P_REGS_NUM; i++)
>   	  {
>   	    has_sve_state |= reg_buf->raw_compare (AARCH64_SVE_P0_REGNUM + i,
> -						   zero_reg, 0);
> +						   reg, 0);
>   	    if (has_sve_state)
>   	      break;
>   	  }
>   
>         if (!has_sve_state)
>   	  has_sve_state |= reg_buf->raw_compare (AARCH64_SVE_FFR_REGNUM,
> -						 zero_reg, 0);
> +						 reg, 0);
>   
>         /* If no SVE state exists, then use the existing fpsimd structure to
>   	 write out state and return.  */
>         if (!has_sve_state)
>   	{
> +	  /* WARNING: SIMD state is laid out in memory in target-endian format,
> +	     while SVE state is laid out in an endianness-independent format
> +	     (LE).
> +
> +	     So we have a couple cases to consider:
> +
> +	     1 - If the target is big endian, then SIMD state is big endian,
> +	     requiring a byteswap.
> +
> +	     2 - If the target is little endian, then SIMD state is little
> +	     endian, which matches the SVE format, so no byteswap is needed. */
> +
>   	  /* The collects of the Z registers will overflow the size of a vreg.
>   	     There is enough space in the structure to allow for this, but we
>   	     cannot overflow into the next register as we might not be
> @@ -284,8 +331,10 @@ aarch64_sve_regs_copy_from_reg_buf (const struct reg_buffer_common *reg_buf,
>   	      if (REG_VALID
>   		  == reg_buf->get_register_status (AARCH64_SVE_Z0_REGNUM + i))
>   		{
> -		  reg_buf->raw_collect (AARCH64_SVE_Z0_REGNUM + i, zero_reg);
> -		  memcpy (&fpsimd->vregs[i], zero_reg, sizeof (__int128_t));
> +		  reg_buf->raw_collect (AARCH64_SVE_Z0_REGNUM + i, reg);
> +		  /* Handle big endian/little endian SIMD/SVE conversion.  */
> +		  aarch64_maybe_swab128 ((gdb_byte *) &fpsimd->vregs[i], reg,
> +					 V_REGISTER_SIZE);
>   		}
>   	    }
>   
> diff --git a/gdb/trad-frame.c b/gdb/trad-frame.c
> index a6a84790a9..8a1aa818ad 100644
> --- a/gdb/trad-frame.c
> +++ b/gdb/trad-frame.c
> @@ -56,6 +56,7 @@ trad_frame_reset_saved_regs (struct gdbarch *gdbarch,
>       {
>         regs[regnum].realreg = regnum;
>         regs[regnum].addr = -1;
> +      regs[regnum].data = nullptr;
>       }
>   }
>   
> @@ -83,7 +84,7 @@ trad_frame_alloc_saved_regs (struct frame_info *this_frame)
>     return trad_frame_alloc_saved_regs (gdbarch);
>   }
>   
> -enum { TF_REG_VALUE = -1, TF_REG_UNKNOWN = -2 };
> +enum { TF_REG_VALUE = -1, TF_REG_UNKNOWN = -2, TF_REG_VALUE_BYTES = -3 };
>   
>   int
>   trad_frame_value_p (struct trad_frame_saved_reg this_saved_regs[], int regnum)
> @@ -106,6 +107,16 @@ trad_frame_realreg_p (struct trad_frame_saved_reg this_saved_regs[],
>   	  && this_saved_regs[regnum].addr == -1);
>   }
>   
> +/* See trad-frame.h.  */
> +
> +bool
> +trad_frame_value_bytes_p (struct trad_frame_saved_reg this_saved_regs[],
> +			  int regnum)
> +{
> +  return (this_saved_regs[regnum].realreg == TF_REG_VALUE_BYTES
> +	  && this_saved_regs[regnum].data != nullptr);
> +}
> +
>   void
>   trad_frame_set_value (struct trad_frame_saved_reg this_saved_regs[],
>   		      int regnum, LONGEST val)
> @@ -224,6 +235,35 @@ trad_frame_set_unknown (struct trad_frame_saved_reg this_saved_regs[],
>     this_saved_regs[regnum].addr = -1;
>   }
>   
> +/* See trad-frame.h.  */
> +
> +void
> +trad_frame_set_value_bytes (struct trad_frame_saved_reg this_saved_regs[],
> +			    int regnum, const gdb_byte *bytes,
> +			    size_t size)
> +{
> +  this_saved_regs[regnum].realreg = TF_REG_VALUE_BYTES;
> +
> +  /* Allocate the space and copy the data bytes.  */
> +  this_saved_regs[regnum].data = FRAME_OBSTACK_CALLOC (size, gdb_byte);
> +  memcpy (this_saved_regs[regnum].data, bytes, size);
> +}
> +
> +/* See trad-frame.h.  */
> +
> +void
> +trad_frame_set_reg_value_bytes (struct trad_frame_cache *this_trad_cache,
> +				int regnum, const gdb_byte *bytes,
> +				size_t size)
> +{
> +  /* External interface for users of trad_frame_cache
> +     (who cannot access the prev_regs object directly).  */
> +  trad_frame_set_value_bytes (this_trad_cache->prev_regs, regnum, bytes,
> +			      size);
> +}
> +
> +
> +
>   struct value *
>   trad_frame_get_prev_register (struct frame_info *this_frame,
>   			      struct trad_frame_saved_reg this_saved_regs[],
> @@ -240,6 +280,10 @@ trad_frame_get_prev_register (struct frame_info *this_frame,
>       /* The register's value is available.  */
>       return frame_unwind_got_constant (this_frame, regnum,
>   				      this_saved_regs[regnum].addr);
> +  else if (trad_frame_value_bytes_p (this_saved_regs, regnum))
> +    /* The register's value is available as a sequence of bytes.  */
> +    return frame_unwind_got_bytes (this_frame, regnum,
> +				   this_saved_regs[regnum].data);
>     else
>       return frame_unwind_got_optimized (this_frame, regnum);
>   }
> diff --git a/gdb/trad-frame.h b/gdb/trad-frame.h
> index 7b5785616e..38db439579 100644
> --- a/gdb/trad-frame.h
> +++ b/gdb/trad-frame.h
> @@ -52,6 +52,12 @@ void trad_frame_set_reg_regmap (struct trad_frame_cache *this_trad_cache,
>   void trad_frame_set_reg_value (struct trad_frame_cache *this_cache,
>   			       int regnum, LONGEST val);
>   
> +/* Given the cache in THIS_TRAD_CACHE, set the value of REGNUM to the bytes
> +   contained in BYTES with size SIZE.  */
> +void trad_frame_set_reg_value_bytes (struct trad_frame_cache *this_trad_cache,
> +				     int regnum, const gdb_byte *bytes,
> +				     size_t size);
> +
>   struct value *trad_frame_get_register (struct trad_frame_cache *this_trad_cache,
>   				       struct frame_info *this_frame,
>   				       int regnum);
> @@ -86,6 +92,8 @@ struct trad_frame_saved_reg
>   {
>     LONGEST addr; /* A CORE_ADDR fits in a longest.  */
>     int realreg;
> +  /* Register data (for values that don't fit in ADDR).  */
> +  gdb_byte *data;
>   };
>   
>   /* Encode REGNUM value in the trad-frame.  */
> @@ -104,6 +112,12 @@ void trad_frame_set_addr (struct trad_frame_saved_reg this_trad_cache[],
>   void trad_frame_set_unknown (struct trad_frame_saved_reg this_saved_regs[],
>   			     int regnum);
>   
> +/* Encode REGNUM value in the trad-frame as a sequence of bytes.  This is
> +   useful when the value is larger than what primitive types can hold.  */
> +void trad_frame_set_value_bytes (struct trad_frame_saved_reg this_saved_regs[],
> +				 int regnum, const gdb_byte *bytes,
> +				 size_t size);
> +
>   /* Convenience functions, return non-zero if the register has been
>      encoded as specified.  */
>   int trad_frame_value_p (struct trad_frame_saved_reg this_saved_regs[],
> @@ -113,6 +127,11 @@ int trad_frame_addr_p (struct trad_frame_saved_reg this_saved_regs[],
>   int trad_frame_realreg_p (struct trad_frame_saved_reg this_saved_regs[],
>   			  int regnum);
>   
> +/* Return TRUE if REGNUM is stored as a sequence of bytes, and FALSE
> +   otherwise.  */
> +bool trad_frame_value_bytes_p (struct trad_frame_saved_reg this_saved_regs[],
> +			      int regnum);
> +
>   /* Reset the save regs cache, setting register values to -1.  */
>   void trad_frame_reset_saved_regs (struct gdbarch *gdbarch,
>   				  struct trad_frame_saved_reg *regs);
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH,v2] SVE/FPSIMD fixup for big endian
  2020-12-04 14:22   ` Luis Machado
@ 2020-12-08 13:39     ` Luis Machado
  2020-12-08 16:10     ` Simon Marchi
  1 sibling, 0 replies; 13+ messages in thread
From: Luis Machado @ 2020-12-08 13:39 UTC (permalink / raw)
  To: gdb-patches

Ping?

On 12/4/20 11:22 AM, Luis Machado wrote:
> With the AArch64-specific parts approved, do the global maintainers have 
> any comments on the trad-frame changes, or does it look good?
> 
> On 12/2/20 2:57 PM, Luis Machado wrote:
>> Updates on v2:
>>
>> - Reworked ptrace functions and made byteswap function more like a 
>> memcpy.
>> - Adjusted memset calls to zero register buffers.
>> - Adjusted comments.
>> - Moved V_REGISTER_SIZE constant definition.
>>
>> The FPSIMD dump in signal frames and ptrace FPSIMD dump in the SVE 
>> context
>> structure follows the target endianness, whereas the SVE dumps are
>> endianness-independent (LE).
>>
>> Therefore, when the system is in BE mode, we need to reverse the bytes
>> for the FPSIMD data.
>>
>> Given the V registers are larger than 64-bit, I've added a way for value
>> bytes to be set, as opposed to passing a 64-bit fixed quantity. This fits
>> nicely with the unwinding *_got_bytes function and makes the trad-frame
>> more flexible and capable of saving larger registers.
>>
>> The memory for the bytes is allocated via the frame obstack, so it 
>> gets freed
>> after we're done inspecting the frame.
>>
>> gdb/ChangeLog:
>>
>> YYYY-MM-DD  Luis Machado  <luis.machado@linaro.org>
>>
>>     * aarch64-linux-tdep.c (aarch64_linux_restore_vreg) New function.
>>     (aarch64_linux_sigframe_init): Call aarch64_linux_restore_vreg.
>>     * aarch64-tdep.h (V_REGISTER_SIZE): Move to ...
>>     * arch/aarch64.h: ... here.
>>     * nat/aarch64-sve-linux-ptrace.c: Include endian.h.
>>     (aarch64_maybe_swab128): New function.
>>     (aarch64_sve_regs_copy_to_reg_buf)
>>     (aarch64_sve_regs_copy_from_reg_buf): Adjust FPSIMD entries.
>>     * trad-frame.c (trad_frame_reset_saved_regs): Initialize
>>     the data field.
>>     (TF_REG_VALUE_BYTES): New enum value.
>>     (trad_frame_value_bytes_p): New function.
>>     (trad_frame_set_value_bytes): New function.
>>     (trad_frame_set_reg_value_bytes): New function.
>>     (trad_frame_get_prev_register): Handle register values saved as 
>> bytes.
>>     * trad-frame.h (trad_frame_set_reg_value_bytes): New prototype.
>>     (struct trad_frame_saved_reg) <data>: New field.
>>     (trad_frame_set_value_bytes): New prototype.
>>     (trad_frame_value_bytes_p): New prototype.
>> ---
>>   gdb/aarch64-linux-tdep.c           | 114 ++++++++++++++++++++++++-----
>>   gdb/aarch64-tdep.h                 |   1 -
>>   gdb/arch/aarch64.h                 |   2 +
>>   gdb/nat/aarch64-sve-linux-ptrace.c |  77 +++++++++++++++----
>>   gdb/trad-frame.c                   |  46 +++++++++++-
>>   gdb/trad-frame.h                   |  19 +++++
>>   6 files changed, 224 insertions(+), 35 deletions(-)
>>
>> diff --git a/gdb/aarch64-linux-tdep.c b/gdb/aarch64-linux-tdep.c
>> index c9898bdafd..9429af0fe1 100644
>> --- a/gdb/aarch64-linux-tdep.c
>> +++ b/gdb/aarch64-linux-tdep.c
>> @@ -180,6 +180,93 @@ read_aarch64_ctx (CORE_ADDR ctx_addr, enum 
>> bfd_endian byte_order,
>>     return magic;
>>   }
>> +/* Given CACHE, use the trad_frame* functions to restore the FPSIMD
>> +   registers from a signal frame.
>> +
>> +   VREG_NUM is the number of the V register being restored, OFFSET is 
>> the
>> +   address containing the register value, BYTE_ORDER is the 
>> endianness and
>> +   HAS_SVE tells us if we have a valid SVE context or not.  */
>> +
>> +static void
>> +aarch64_linux_restore_vreg (struct trad_frame_cache *cache, int 
>> num_regs,
>> +                int vreg_num, CORE_ADDR offset,
>> +                enum bfd_endian byte_order, bool has_sve)
>> +{
>> +  /* WARNING: SIMD state is laid out in memory in target-endian format.
>> +
>> +     So we have a couple cases to consider:
>> +
>> +     1 - If the target is big endian, then SIMD state is big endian,
>> +     requiring a byteswap.
>> +
>> +     2 - If the target is little endian, then SIMD state is little 
>> endian, so
>> +     no byteswap is needed. */
>> +
>> +  if (byte_order == BFD_ENDIAN_BIG)
>> +    {
>> +      gdb_byte buf[V_REGISTER_SIZE];
>> +
>> +      if (target_read_memory (offset, buf, V_REGISTER_SIZE) != 0)
>> +    {
>> +      size_t size = V_REGISTER_SIZE/2;
>> +
>> +      /* Read the two halves of the V register in reverse byte 
>> order.  */
>> +      CORE_ADDR u64 = extract_unsigned_integer (buf, size,
>> +                            byte_order);
>> +      CORE_ADDR l64 = extract_unsigned_integer (buf + size, size,
>> +                            byte_order);
>> +
>> +      /* Copy the reversed bytes to the buffer.  */
>> +      store_unsigned_integer (buf, size, BFD_ENDIAN_LITTLE, l64);
>> +      store_unsigned_integer (buf + size , size, BFD_ENDIAN_LITTLE, 
>> u64);
>> +
>> +      /* Now we can store the correct bytes for the V register.  */
>> +      trad_frame_set_reg_value_bytes (cache, AARCH64_V0_REGNUM + 
>> vreg_num,
>> +                      buf, V_REGISTER_SIZE);
>> +      trad_frame_set_reg_value_bytes (cache,
>> +                      num_regs + AARCH64_Q0_REGNUM
>> +                      + vreg_num, buf, Q_REGISTER_SIZE);
>> +      trad_frame_set_reg_value_bytes (cache,
>> +                      num_regs + AARCH64_D0_REGNUM
>> +                      + vreg_num, buf, D_REGISTER_SIZE);
>> +      trad_frame_set_reg_value_bytes (cache,
>> +                      num_regs + AARCH64_S0_REGNUM
>> +                      + vreg_num, buf, S_REGISTER_SIZE);
>> +      trad_frame_set_reg_value_bytes (cache,
>> +                      num_regs + AARCH64_H0_REGNUM
>> +                      + vreg_num, buf, H_REGISTER_SIZE);
>> +      trad_frame_set_reg_value_bytes (cache,
>> +                      num_regs + AARCH64_B0_REGNUM
>> +                      + vreg_num, buf, B_REGISTER_SIZE);
>> +
>> +      if (has_sve)
>> +        trad_frame_set_reg_value_bytes (cache,
>> +                        num_regs + AARCH64_SVE_V0_REGNUM
>> +                        + vreg_num, buf, V_REGISTER_SIZE);
>> +    }
>> +      return;
>> +    }
>> +
>> +  /* Little endian, just point at the address containing the register
>> +     value.  */
>> +  trad_frame_set_reg_addr (cache, AARCH64_V0_REGNUM + vreg_num, offset);
>> +  trad_frame_set_reg_addr (cache, num_regs + AARCH64_Q0_REGNUM + 
>> vreg_num,
>> +               offset);
>> +  trad_frame_set_reg_addr (cache, num_regs + AARCH64_D0_REGNUM + 
>> vreg_num,
>> +               offset);
>> +  trad_frame_set_reg_addr (cache, num_regs + AARCH64_S0_REGNUM + 
>> vreg_num,
>> +               offset);
>> +  trad_frame_set_reg_addr (cache, num_regs + AARCH64_H0_REGNUM + 
>> vreg_num,
>> +               offset);
>> +  trad_frame_set_reg_addr (cache, num_regs + AARCH64_B0_REGNUM + 
>> vreg_num,
>> +               offset);
>> +
>> +  if (has_sve)
>> +    trad_frame_set_reg_addr (cache, num_regs + AARCH64_SVE_V0_REGNUM
>> +                 + vreg_num, offset);
>> +
>> +}
>> +
>>   /* Implement the "init" method of struct tramp_frame.  */
>>   static void
>> @@ -332,27 +419,16 @@ aarch64_linux_sigframe_init (const struct 
>> tramp_frame *self,
>>         /* If there was no SVE section then set up the V registers.  */
>>         if (sve_regs == 0)
>> -    for (int i = 0; i < 32; i++)
>> -      {
>> -        CORE_ADDR offset = (fpsimd + AARCH64_FPSIMD_V0_OFFSET
>> +    {
>> +      for (int i = 0; i < 32; i++)
>> +        {
>> +          CORE_ADDR offset = (fpsimd + AARCH64_FPSIMD_V0_OFFSET
>>                     + (i * AARCH64_FPSIMD_VREG_SIZE));
>> -        trad_frame_set_reg_addr (this_cache, AARCH64_V0_REGNUM + i, 
>> offset);
>> -        trad_frame_set_reg_addr (this_cache,
>> -                     num_regs + AARCH64_Q0_REGNUM + i, offset);
>> -        trad_frame_set_reg_addr (this_cache,
>> -                     num_regs + AARCH64_D0_REGNUM + i, offset);
>> -        trad_frame_set_reg_addr (this_cache,
>> -                     num_regs + AARCH64_S0_REGNUM + i, offset);
>> -        trad_frame_set_reg_addr (this_cache,
>> -                     num_regs + AARCH64_H0_REGNUM + i, offset);
>> -        trad_frame_set_reg_addr (this_cache,
>> -                     num_regs + AARCH64_B0_REGNUM + i, offset);
>> -        if (tdep->has_sve ())
>> -          trad_frame_set_reg_addr (this_cache,
>> -                       num_regs + AARCH64_SVE_V0_REGNUM + i,
>> -                       offset);
>> -      }
>> +          aarch64_linux_restore_vreg (this_cache, num_regs, i, offset,
>> +                      byte_order, tdep->has_sve ());
>> +        }
>> +    }
>>       }
>>     trad_frame_set_id (this_cache, frame_id_build (sp, func));
>> diff --git a/gdb/aarch64-tdep.h b/gdb/aarch64-tdep.h
>> index 81ce4d84b4..ba72a771cd 100644
>> --- a/gdb/aarch64-tdep.h
>> +++ b/gdb/aarch64-tdep.h
>> @@ -47,7 +47,6 @@ struct regset;
>>   #define H_REGISTER_SIZE  2
>>   #define S_REGISTER_SIZE  4
>>   #define D_REGISTER_SIZE  8
>> -#define V_REGISTER_SIZE 16
>>   #define Q_REGISTER_SIZE 16
>>   /* Total number of general (X) registers.  */
>> diff --git a/gdb/arch/aarch64.h b/gdb/arch/aarch64.h
>> index 857bb22b03..b75352461a 100644
>> --- a/gdb/arch/aarch64.h
>> +++ b/gdb/arch/aarch64.h
>> @@ -58,6 +58,8 @@ enum aarch64_regnum
>>     AARCH64_LAST_V_ARG_REGNUM = AARCH64_V0_REGNUM + 7
>>   };
>> +#define V_REGISTER_SIZE 16
>> +
>>   /* Pseudo register base numbers.  */
>>   #define AARCH64_Q0_REGNUM 0
>>   #define AARCH64_D0_REGNUM (AARCH64_Q0_REGNUM + 
>> AARCH64_D_REGISTER_COUNT)
>> diff --git a/gdb/nat/aarch64-sve-linux-ptrace.c 
>> b/gdb/nat/aarch64-sve-linux-ptrace.c
>> index 2ce90ccfd7..5f08995f87 100644
>> --- a/gdb/nat/aarch64-sve-linux-ptrace.c
>> +++ b/gdb/nat/aarch64-sve-linux-ptrace.c
>> @@ -26,6 +26,7 @@
>>   #include "arch/aarch64.h"
>>   #include "gdbsupport/common-regcache.h"
>>   #include "gdbsupport/byte-vector.h"
>> +#include <endian.h>
>>   /* See nat/aarch64-sve-linux-ptrace.h.  */
>> @@ -142,6 +143,24 @@ aarch64_sve_get_sveregs (int tid)
>>     return buf;
>>   }
>> +/* If we are running in BE mode, byteswap the contents
>> +   of SRC to DST for SIZE bytes.  Other, just copy the contents
>> +   from SRC to DST.  */
>> +
>> +static void
>> +aarch64_maybe_swab128 (gdb_byte *dst, const gdb_byte *src, size_t size)
>> +{
>> +  gdb_assert (src != nullptr && dst != nullptr);
>> +  gdb_assert (size > 1);
>> +
>> +#if (__BYTE_ORDER == __BIG_ENDIAN)
>> +  for (int i = 0; i < size - 1; i++)
>> +    dst[i] = src[size - i];
>> +#else
>> +  memcpy (dst, src, size);
>> +#endif
>> +}
>> +
>>   /* See nat/aarch64-sve-linux-ptrace.h.  */
>>   void
>> @@ -184,34 +203,50 @@ aarch64_sve_regs_copy_to_reg_buf (struct 
>> reg_buffer_common *reg_buf,
>>       }
>>     else
>>       {
>> +      /* WARNING: SIMD state is laid out in memory in target-endian 
>> format,
>> +     while SVE state is laid out in an endianness-independent format 
>> (LE).
>> +
>> +     So we have a couple cases to consider:
>> +
>> +     1 - If the target is big endian, then SIMD state is big endian,
>> +     requiring a byteswap.
>> +
>> +     2 - If the target is little endian, then SIMD state is little 
>> endian,
>> +     which matches the SVE format, so no byteswap is needed. */
>> +
>>         /* There is no SVE state yet - the register dump contains a 
>> fpsimd
>>        structure instead.  These registers still exist in the 
>> hardware, but
>>        the kernel has not yet initialised them, and so they will be 
>> null.  */
>> -      char *zero_reg = (char *) alloca (SVE_PT_SVE_ZREG_SIZE (vq));
>> +      gdb_byte *reg = (gdb_byte *) alloca (SVE_PT_SVE_ZREG_SIZE (vq));
>>         struct user_fpsimd_state *fpsimd
>>       = (struct user_fpsimd_state *)(base + SVE_PT_FPSIMD_OFFSET);
>> +      /* Make sure we have a zeroed register buffer.  We will need 
>> the zero
>> +     padding below.  */
>> +      memset (reg, 0, SVE_PT_SVE_ZREG_SIZE (vq));
>> +
>>         /* Copy across the V registers from fpsimd structure to the Z 
>> registers,
>>        ensuring the non overlapping state is set to null.  */
>> -      memset (zero_reg, 0, SVE_PT_SVE_ZREG_SIZE (vq));
>> -
>>         for (int i = 0; i < AARCH64_SVE_Z_REGS_NUM; i++)
>>       {
>> -      memcpy (zero_reg, &fpsimd->vregs[i], sizeof (__int128_t));
>> -      reg_buf->raw_supply (AARCH64_SVE_Z0_REGNUM + i, zero_reg);
>> +      /* Handle big endian/little endian SIMD/SVE conversion.  */
>> +      aarch64_maybe_swab128 (reg, (const gdb_byte *) &fpsimd->vregs[i],
>> +                 V_REGISTER_SIZE);
>> +      reg_buf->raw_supply (AARCH64_SVE_Z0_REGNUM + i, reg);
>>       }
>>         reg_buf->raw_supply (AARCH64_FPSR_REGNUM, &fpsimd->fpsr);
>>         reg_buf->raw_supply (AARCH64_FPCR_REGNUM, &fpsimd->fpcr);
>>         /* Clear the SVE only registers.  */
>> +      memset (reg, 0, SVE_PT_SVE_ZREG_SIZE (vq));
>>         for (int i = 0; i < AARCH64_SVE_P_REGS_NUM; i++)
>> -    reg_buf->raw_supply (AARCH64_SVE_P0_REGNUM + i, zero_reg);
>> +    reg_buf->raw_supply (AARCH64_SVE_P0_REGNUM + i, reg);
>> -      reg_buf->raw_supply (AARCH64_SVE_FFR_REGNUM, zero_reg);
>> +      reg_buf->raw_supply (AARCH64_SVE_FFR_REGNUM, reg);
>>       }
>>   }
>> @@ -240,11 +275,11 @@ aarch64_sve_regs_copy_from_reg_buf (const struct 
>> reg_buffer_common *reg_buf,
>>        kernel, which is why we try to avoid it.  */
>>         bool has_sve_state = false;
>> -      char *zero_reg = (char *) alloca (SVE_PT_SVE_ZREG_SIZE (vq));
>> +      gdb_byte *reg = (gdb_byte *) alloca (SVE_PT_SVE_ZREG_SIZE (vq));
>>         struct user_fpsimd_state *fpsimd
>>       = (struct user_fpsimd_state *)(base + SVE_PT_FPSIMD_OFFSET);
>> -      memset (zero_reg, 0, SVE_PT_SVE_ZREG_SIZE (vq));
>> +      memset (reg, 0, SVE_PT_SVE_ZREG_SIZE (vq));
>>         /* Check in the reg_buf if any of the Z registers are set 
>> after the
>>        first 128 bits, or if any of the other SVE registers are set.  */
>> @@ -252,7 +287,7 @@ aarch64_sve_regs_copy_from_reg_buf (const struct 
>> reg_buffer_common *reg_buf,
>>         for (int i = 0; i < AARCH64_SVE_Z_REGS_NUM; i++)
>>       {
>>         has_sve_state |= reg_buf->raw_compare (AARCH64_SVE_Z0_REGNUM + i,
>> -                         zero_reg, sizeof (__int128_t));
>> +                         reg, sizeof (__int128_t));
>>         if (has_sve_state)
>>           break;
>>       }
>> @@ -261,19 +296,31 @@ aarch64_sve_regs_copy_from_reg_buf (const struct 
>> reg_buffer_common *reg_buf,
>>       for (int i = 0; i < AARCH64_SVE_P_REGS_NUM; i++)
>>         {
>>           has_sve_state |= reg_buf->raw_compare (AARCH64_SVE_P0_REGNUM 
>> + i,
>> -                           zero_reg, 0);
>> +                           reg, 0);
>>           if (has_sve_state)
>>             break;
>>         }
>>         if (!has_sve_state)
>>         has_sve_state |= reg_buf->raw_compare (AARCH64_SVE_FFR_REGNUM,
>> -                         zero_reg, 0);
>> +                         reg, 0);
>>         /* If no SVE state exists, then use the existing fpsimd 
>> structure to
>>        write out state and return.  */
>>         if (!has_sve_state)
>>       {
>> +      /* WARNING: SIMD state is laid out in memory in target-endian 
>> format,
>> +         while SVE state is laid out in an endianness-independent format
>> +         (LE).
>> +
>> +         So we have a couple cases to consider:
>> +
>> +         1 - If the target is big endian, then SIMD state is big endian,
>> +         requiring a byteswap.
>> +
>> +         2 - If the target is little endian, then SIMD state is little
>> +         endian, which matches the SVE format, so no byteswap is 
>> needed. */
>> +
>>         /* The collects of the Z registers will overflow the size of a 
>> vreg.
>>            There is enough space in the structure to allow for this, 
>> but we
>>            cannot overflow into the next register as we might not be
>> @@ -284,8 +331,10 @@ aarch64_sve_regs_copy_from_reg_buf (const struct 
>> reg_buffer_common *reg_buf,
>>             if (REG_VALID
>>             == reg_buf->get_register_status (AARCH64_SVE_Z0_REGNUM + i))
>>           {
>> -          reg_buf->raw_collect (AARCH64_SVE_Z0_REGNUM + i, zero_reg);
>> -          memcpy (&fpsimd->vregs[i], zero_reg, sizeof (__int128_t));
>> +          reg_buf->raw_collect (AARCH64_SVE_Z0_REGNUM + i, reg);
>> +          /* Handle big endian/little endian SIMD/SVE conversion.  */
>> +          aarch64_maybe_swab128 ((gdb_byte *) &fpsimd->vregs[i], reg,
>> +                     V_REGISTER_SIZE);
>>           }
>>           }
>> diff --git a/gdb/trad-frame.c b/gdb/trad-frame.c
>> index a6a84790a9..8a1aa818ad 100644
>> --- a/gdb/trad-frame.c
>> +++ b/gdb/trad-frame.c
>> @@ -56,6 +56,7 @@ trad_frame_reset_saved_regs (struct gdbarch *gdbarch,
>>       {
>>         regs[regnum].realreg = regnum;
>>         regs[regnum].addr = -1;
>> +      regs[regnum].data = nullptr;
>>       }
>>   }
>> @@ -83,7 +84,7 @@ trad_frame_alloc_saved_regs (struct frame_info 
>> *this_frame)
>>     return trad_frame_alloc_saved_regs (gdbarch);
>>   }
>> -enum { TF_REG_VALUE = -1, TF_REG_UNKNOWN = -2 };
>> +enum { TF_REG_VALUE = -1, TF_REG_UNKNOWN = -2, TF_REG_VALUE_BYTES = 
>> -3 };
>>   int
>>   trad_frame_value_p (struct trad_frame_saved_reg this_saved_regs[], 
>> int regnum)
>> @@ -106,6 +107,16 @@ trad_frame_realreg_p (struct trad_frame_saved_reg 
>> this_saved_regs[],
>>         && this_saved_regs[regnum].addr == -1);
>>   }
>> +/* See trad-frame.h.  */
>> +
>> +bool
>> +trad_frame_value_bytes_p (struct trad_frame_saved_reg this_saved_regs[],
>> +              int regnum)
>> +{
>> +  return (this_saved_regs[regnum].realreg == TF_REG_VALUE_BYTES
>> +      && this_saved_regs[regnum].data != nullptr);
>> +}
>> +
>>   void
>>   trad_frame_set_value (struct trad_frame_saved_reg this_saved_regs[],
>>                 int regnum, LONGEST val)
>> @@ -224,6 +235,35 @@ trad_frame_set_unknown (struct 
>> trad_frame_saved_reg this_saved_regs[],
>>     this_saved_regs[regnum].addr = -1;
>>   }
>> +/* See trad-frame.h.  */
>> +
>> +void
>> +trad_frame_set_value_bytes (struct trad_frame_saved_reg 
>> this_saved_regs[],
>> +                int regnum, const gdb_byte *bytes,
>> +                size_t size)
>> +{
>> +  this_saved_regs[regnum].realreg = TF_REG_VALUE_BYTES;
>> +
>> +  /* Allocate the space and copy the data bytes.  */
>> +  this_saved_regs[regnum].data = FRAME_OBSTACK_CALLOC (size, gdb_byte);
>> +  memcpy (this_saved_regs[regnum].data, bytes, size);
>> +}
>> +
>> +/* See trad-frame.h.  */
>> +
>> +void
>> +trad_frame_set_reg_value_bytes (struct trad_frame_cache 
>> *this_trad_cache,
>> +                int regnum, const gdb_byte *bytes,
>> +                size_t size)
>> +{
>> +  /* External interface for users of trad_frame_cache
>> +     (who cannot access the prev_regs object directly).  */
>> +  trad_frame_set_value_bytes (this_trad_cache->prev_regs, regnum, bytes,
>> +                  size);
>> +}
>> +
>> +
>> +
>>   struct value *
>>   trad_frame_get_prev_register (struct frame_info *this_frame,
>>                     struct trad_frame_saved_reg this_saved_regs[],
>> @@ -240,6 +280,10 @@ trad_frame_get_prev_register (struct frame_info 
>> *this_frame,
>>       /* The register's value is available.  */
>>       return frame_unwind_got_constant (this_frame, regnum,
>>                         this_saved_regs[regnum].addr);
>> +  else if (trad_frame_value_bytes_p (this_saved_regs, regnum))
>> +    /* The register's value is available as a sequence of bytes.  */
>> +    return frame_unwind_got_bytes (this_frame, regnum,
>> +                   this_saved_regs[regnum].data);
>>     else
>>       return frame_unwind_got_optimized (this_frame, regnum);
>>   }
>> diff --git a/gdb/trad-frame.h b/gdb/trad-frame.h
>> index 7b5785616e..38db439579 100644
>> --- a/gdb/trad-frame.h
>> +++ b/gdb/trad-frame.h
>> @@ -52,6 +52,12 @@ void trad_frame_set_reg_regmap (struct 
>> trad_frame_cache *this_trad_cache,
>>   void trad_frame_set_reg_value (struct trad_frame_cache *this_cache,
>>                      int regnum, LONGEST val);
>> +/* Given the cache in THIS_TRAD_CACHE, set the value of REGNUM to the 
>> bytes
>> +   contained in BYTES with size SIZE.  */
>> +void trad_frame_set_reg_value_bytes (struct trad_frame_cache 
>> *this_trad_cache,
>> +                     int regnum, const gdb_byte *bytes,
>> +                     size_t size);
>> +
>>   struct value *trad_frame_get_register (struct trad_frame_cache 
>> *this_trad_cache,
>>                          struct frame_info *this_frame,
>>                          int regnum);
>> @@ -86,6 +92,8 @@ struct trad_frame_saved_reg
>>   {
>>     LONGEST addr; /* A CORE_ADDR fits in a longest.  */
>>     int realreg;
>> +  /* Register data (for values that don't fit in ADDR).  */
>> +  gdb_byte *data;
>>   };
>>   /* Encode REGNUM value in the trad-frame.  */
>> @@ -104,6 +112,12 @@ void trad_frame_set_addr (struct 
>> trad_frame_saved_reg this_trad_cache[],
>>   void trad_frame_set_unknown (struct trad_frame_saved_reg 
>> this_saved_regs[],
>>                    int regnum);
>> +/* Encode REGNUM value in the trad-frame as a sequence of bytes.  
>> This is
>> +   useful when the value is larger than what primitive types can 
>> hold.  */
>> +void trad_frame_set_value_bytes (struct trad_frame_saved_reg 
>> this_saved_regs[],
>> +                 int regnum, const gdb_byte *bytes,
>> +                 size_t size);
>> +
>>   /* Convenience functions, return non-zero if the register has been
>>      encoded as specified.  */
>>   int trad_frame_value_p (struct trad_frame_saved_reg this_saved_regs[],
>> @@ -113,6 +127,11 @@ int trad_frame_addr_p (struct 
>> trad_frame_saved_reg this_saved_regs[],
>>   int trad_frame_realreg_p (struct trad_frame_saved_reg 
>> this_saved_regs[],
>>                 int regnum);
>> +/* Return TRUE if REGNUM is stored as a sequence of bytes, and FALSE
>> +   otherwise.  */
>> +bool trad_frame_value_bytes_p (struct trad_frame_saved_reg 
>> this_saved_regs[],
>> +                  int regnum);
>> +
>>   /* Reset the save regs cache, setting register values to -1.  */
>>   void trad_frame_reset_saved_regs (struct gdbarch *gdbarch,
>>                     struct trad_frame_saved_reg *regs);
>>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH,v2] SVE/FPSIMD fixup for big endian
  2020-12-04 14:22   ` Luis Machado
  2020-12-08 13:39     ` Luis Machado
@ 2020-12-08 16:10     ` Simon Marchi
  2020-12-08 19:22       ` Luis Machado
  1 sibling, 1 reply; 13+ messages in thread
From: Simon Marchi @ 2020-12-08 16:10 UTC (permalink / raw)
  To: Luis Machado, gdb-patches

On 2020-12-04 9:22 a.m., Luis Machado via Gdb-patches wrote:
> With the AArch64-specific parts approved, do the global maintainers have any comments on the trad-frame changes, or does it look good?

The changes in trad-frame look good to me.  If you feel like it, you could
change the bytes/size parameters in the functions' prototypes to use
gdb::array_view.

While reading your patch, I spotted some cleanups I'd like to see happening
to make the code cleaner/safer (ideas for future patches).  The first one
would be to make the fields private and add getters that assert that the kind
of value you are getting is indeed the one stored, a bit like I did for the
dynamic_prop structure.  That could catch some problems where we set value as
bytes but fetch it as a register number.

The second one is: instead of overloading the realreg field to hold the kind
of value that is stored in the object, and overloading the "addr" field for
both LONGEST values and addresses, it would be much clearer to have a kind
field + a union:

struct trad_frame_saved_reg
{
  trad_frame_saved_reg_kind kind;

  union
    {
      CORE_ADDR addr;
      LONGEST value;
      int regnum;
      gdb_byte *data;
    };
};

It would take less space, too.

Simon

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH,v2] SVE/FPSIMD fixup for big endian
  2020-12-08 16:10     ` Simon Marchi
@ 2020-12-08 19:22       ` Luis Machado
  0 siblings, 0 replies; 13+ messages in thread
From: Luis Machado @ 2020-12-08 19:22 UTC (permalink / raw)
  To: Simon Marchi, gdb-patches

On 12/8/20 1:10 PM, Simon Marchi wrote:
> On 2020-12-04 9:22 a.m., Luis Machado via Gdb-patches wrote:
>> With the AArch64-specific parts approved, do the global maintainers have any comments on the trad-frame changes, or does it look good?
> 
> The changes in trad-frame look good to me.  If you feel like it, you could
> change the bytes/size parameters in the functions' prototypes to use
> gdb::array_view.

I agree. The current interface is being abused a little. I'll push this 
fix first, then I'll do it as a separate patch. Might be useful for 
other targets as well, like ones that have larger vector types.

> 
> While reading your patch, I spotted some cleanups I'd like to see happening
> to make the code cleaner/safer (ideas for future patches).  The first one
> would be to make the fields private and add getters that assert that the kind
> of value you are getting is indeed the one stored, a bit like I did for the
> dynamic_prop structure.  That could catch some problems where we set value as
> bytes but fetch it as a register number.
> 
> The second one is: instead of overloading the realreg field to hold the kind
> of value that is stored in the object, and overloading the "addr" field for
> both LONGEST values and addresses, it would be much clearer to have a kind
> field + a union:
> 
> struct trad_frame_saved_reg
> {
>    trad_frame_saved_reg_kind kind;
> 
>    union
>      {
>        CORE_ADDR addr;
>        LONGEST value;
>        int regnum;
>        gdb_byte *data;
>      };
> };
> 
> It would take less space, too.
Thanks. That's good feedback. I'll take notes and will work on something 
to improve these structures/mechanisms.

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2020-12-08 19:22 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-30 18:55 [PATCH][AArch64] SVE/FPSIMD fixup for big endian Luis Machado
2020-12-01 11:28 ` Alan Hayward
2020-12-01 12:19   ` Luis Machado
2020-12-01 17:38     ` Alan Hayward
2020-12-01 18:40       ` Luis Machado
2020-12-02  9:07         ` Alan Hayward
2020-12-02 17:57 ` [PATCH,v2] " Luis Machado
2020-12-03 17:35   ` Alan Hayward
2020-12-03 17:37     ` Luis Machado
2020-12-04 14:22   ` Luis Machado
2020-12-08 13:39     ` Luis Machado
2020-12-08 16:10     ` Simon Marchi
2020-12-08 19:22       ` Luis Machado

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).