public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
* [PATCH 0/4] Add AMX support.
@ 2022-05-06 12:12 Felix Willgerodt
  2022-05-06 12:12 ` [PATCH 1/4] gdb: define int512 and uint512 as built-in types Felix Willgerodt
                   ` (3 more replies)
  0 siblings, 4 replies; 27+ messages in thread
From: Felix Willgerodt @ 2022-05-06 12:12 UTC (permalink / raw)
  To: gdb-patches

Hi all,

This is a series to add support for the new Advanced Matrix Extensions (AMX)
on x86 architectures. They add new registers that require modeling
in GDB and gdbserver.

Happy about any feedback!

Regards,
Felix

Aleksandar Paunovic (1):
  gdb: define int512 and uint512 as built-in types.

Felix Willgerodt (3):
  gdb, gdbserver: Add AMX registers.
  gdb, gdbserver: Allocate only a sane amount of buffer when fetching
    registers.
  gdb: Clear tilecfg.start_row for any PC modification.

 gdb/amd64-linux-nat.c                         |   2 +
 gdb/amd64-linux-tdep.c                        |  36 +-
 gdb/amd64-tdep.c                              | 214 +++++++++++-
 gdb/amd64-tdep.h                              |   2 +
 gdb/arch/amd64.c                              |   4 +
 gdb/doc/gdb.texinfo                           |  44 +++
 gdb/features/Makefile                         |   1 +
 gdb/features/i386/64bit-amx.c                 |  60 ++++
 gdb/features/i386/64bit-amx.xml               |  36 ++
 gdb/gdbtypes.c                                |   4 +
 gdb/gdbtypes.h                                |   2 +
 gdb/i386-linux-tdep.c                         |   2 +
 gdb/i386-linux-tdep.h                         |   2 +-
 gdb/i386-tdep.c                               | 321 +++++++++++++++++-
 gdb/i386-tdep.h                               | 122 ++++++-
 gdb/i387-tdep.c                               | 156 ++++++++-
 gdb/i387-tdep.h                               |   8 +
 gdb/target-descriptions.c                     |   6 +
 gdb/testsuite/gdb.arch/amd64-amx-corefile.exp | 113 ++++++
 gdb/testsuite/gdb.arch/amd64-amx-startrow.c   | 122 +++++++
 gdb/testsuite/gdb.arch/amd64-amx-startrow.exp |  91 +++++
 gdb/testsuite/gdb.arch/amd64-amx.c            | 173 ++++++++++
 gdb/testsuite/gdb.arch/amd64-amx.exp          | 231 +++++++++++++
 gdb/testsuite/lib/gdb.exp                     |  67 ++++
 gdbserver/i387-fp.cc                          |  70 +++-
 gdbserver/linux-amd64-ipa.cc                  |   2 +-
 gdbserver/linux-i386-ipa.cc                   |   2 +-
 gdbserver/linux-x86-low.cc                    |   4 +-
 gdbserver/linux-x86-tdesc.cc                  |   3 +
 gdbserver/linux-x86-tdesc.h                   |   3 +-
 gdbserver/server.h                            |   2 +-
 gdbsupport/tdesc.cc                           |   2 +
 gdbsupport/tdesc.h                            |   2 +
 gdbsupport/x86-xstate.h                       |  33 +-
 34 files changed, 1901 insertions(+), 41 deletions(-)
 create mode 100644 gdb/features/i386/64bit-amx.c
 create mode 100644 gdb/features/i386/64bit-amx.xml
 create mode 100644 gdb/testsuite/gdb.arch/amd64-amx-corefile.exp
 create mode 100644 gdb/testsuite/gdb.arch/amd64-amx-startrow.c
 create mode 100755 gdb/testsuite/gdb.arch/amd64-amx-startrow.exp
 create mode 100644 gdb/testsuite/gdb.arch/amd64-amx.c
 create mode 100755 gdb/testsuite/gdb.arch/amd64-amx.exp

-- 
2.34.1

Intel Deutschland GmbH
Registered Address: Am Campeon 10, 85579 Neubiberg, Germany
Tel: +49 89 99 8853-0, www.intel.de <http://www.intel.de>
Managing Directors: Christin Eisenschmid, Sharon Heck, Tiffany Doon Silva  
Chairperson of the Supervisory Board: Nicole Lau
Registered Office: Munich
Commercial Register: Amtsgericht Muenchen HRB 186928


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH 1/4] gdb: define int512 and uint512 as built-in types.
  2022-05-06 12:12 [PATCH 0/4] Add AMX support Felix Willgerodt
@ 2022-05-06 12:12 ` Felix Willgerodt
  2022-05-06 12:19   ` Eli Zaretskii
  2022-06-27 18:17   ` Pedro Alves
  2022-05-06 12:12 ` [PATCH 2/4] gdb, gdbserver: Add AMX registers Felix Willgerodt
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 27+ messages in thread
From: Felix Willgerodt @ 2022-05-06 12:12 UTC (permalink / raw)
  To: gdb-patches; +Cc: Aleksandar Paunovic

From: Aleksandar Paunovic <aleksandar.paunovic@intel.com>

Allow using int512 and uint512 as built-in types, particularly
for the target descriptions.
---
 gdb/doc/gdb.texinfo       | 2 ++
 gdb/gdbtypes.c            | 4 ++++
 gdb/gdbtypes.h            | 2 ++
 gdb/target-descriptions.c | 6 ++++++
 gdbsupport/tdesc.cc       | 2 ++
 gdbsupport/tdesc.h        | 2 ++
 6 files changed, 18 insertions(+)

diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
index 38ad2ac32b0..3972b85fe79 100644
--- a/gdb/doc/gdb.texinfo
+++ b/gdb/doc/gdb.texinfo
@@ -46457,6 +46457,7 @@ Boolean type, occupying a single bit.
 @itemx int32
 @itemx int64
 @itemx int128
+@itemx int512
 Signed integer types holding the specified number of bits.
 
 @item uint8
@@ -46465,6 +46466,7 @@ Signed integer types holding the specified number of bits.
 @itemx uint32
 @itemx uint64
 @itemx uint128
+@itemx uint512
 Unsigned integer types holding the specified number of bits.
 
 @item code_ptr
diff --git a/gdb/gdbtypes.c b/gdb/gdbtypes.c
index 2a51372a037..bba7e7bf288 100644
--- a/gdb/gdbtypes.c
+++ b/gdb/gdbtypes.c
@@ -6279,6 +6279,10 @@ gdbtypes_post_init (struct gdbarch *gdbarch)
     = arch_integer_type (gdbarch, 128, 0, "int128_t");
   builtin_type->builtin_uint128
     = arch_integer_type (gdbarch, 128, 1, "uint128_t");
+  builtin_type->builtin_int512
+    = arch_integer_type (gdbarch, 512, 0, "int512_t");
+  builtin_type->builtin_uint512
+    = arch_integer_type (gdbarch, 512, 1, "uint512_t");
 
   builtin_type->builtin_int8->set_instance_flags
     (builtin_type->builtin_int8->instance_flags ()
diff --git a/gdb/gdbtypes.h b/gdb/gdbtypes.h
index 7437e1db8ab..6df2df2f13d 100644
--- a/gdb/gdbtypes.h
+++ b/gdb/gdbtypes.h
@@ -2358,6 +2358,8 @@ struct builtin_type
   struct type *builtin_uint64;
   struct type *builtin_int128;
   struct type *builtin_uint128;
+  struct type *builtin_int512;
+  struct type *builtin_uint512;
 
   /* Wide character types.  */
   struct type *builtin_char16;
diff --git a/gdb/target-descriptions.c b/gdb/target-descriptions.c
index 85954ac2939..f56aa2b669e 100644
--- a/gdb/target-descriptions.c
+++ b/gdb/target-descriptions.c
@@ -89,6 +89,9 @@ make_gdb_type (struct gdbarch *gdbarch, struct tdesc_type *ttype)
 	case TDESC_TYPE_INT128:
 	  m_type = builtin_type (m_gdbarch)->builtin_int128;
 	  return;
+	case TDESC_TYPE_INT512:
+	  m_type = builtin_type (m_gdbarch)->builtin_int512;
+	  return;
 	case TDESC_TYPE_UINT8:
 	  m_type = builtin_type (m_gdbarch)->builtin_uint8;
 	  return;
@@ -104,6 +107,9 @@ make_gdb_type (struct gdbarch *gdbarch, struct tdesc_type *ttype)
 	case TDESC_TYPE_UINT128:
 	  m_type = builtin_type (m_gdbarch)->builtin_uint128;
 	  return;
+	case TDESC_TYPE_UINT512:
+	  m_type = builtin_type (m_gdbarch)->builtin_uint512;
+	  return;
 	case TDESC_TYPE_CODE_PTR:
 	  m_type = builtin_type (m_gdbarch)->builtin_func_ptr;
 	  return;
diff --git a/gdbsupport/tdesc.cc b/gdbsupport/tdesc.cc
index 4d41d0b168a..90280a69d5c 100644
--- a/gdbsupport/tdesc.cc
+++ b/gdbsupport/tdesc.cc
@@ -43,11 +43,13 @@ static tdesc_type_builtin tdesc_predefined_types[] =
   { "int32", TDESC_TYPE_INT32 },
   { "int64", TDESC_TYPE_INT64 },
   { "int128", TDESC_TYPE_INT128 },
+  { "int512", TDESC_TYPE_INT512 },
   { "uint8", TDESC_TYPE_UINT8 },
   { "uint16", TDESC_TYPE_UINT16 },
   { "uint32", TDESC_TYPE_UINT32 },
   { "uint64", TDESC_TYPE_UINT64 },
   { "uint128", TDESC_TYPE_UINT128 },
+  { "uint512", TDESC_TYPE_UINT512 },
   { "code_ptr", TDESC_TYPE_CODE_PTR },
   { "data_ptr", TDESC_TYPE_DATA_PTR },
   { "ieee_half", TDESC_TYPE_IEEE_HALF },
diff --git a/gdbsupport/tdesc.h b/gdbsupport/tdesc.h
index 403aa2c3d19..0e0f65e123b 100644
--- a/gdbsupport/tdesc.h
+++ b/gdbsupport/tdesc.h
@@ -161,11 +161,13 @@ enum tdesc_type_kind
   TDESC_TYPE_INT32,
   TDESC_TYPE_INT64,
   TDESC_TYPE_INT128,
+  TDESC_TYPE_INT512,
   TDESC_TYPE_UINT8,
   TDESC_TYPE_UINT16,
   TDESC_TYPE_UINT32,
   TDESC_TYPE_UINT64,
   TDESC_TYPE_UINT128,
+  TDESC_TYPE_UINT512,
   TDESC_TYPE_CODE_PTR,
   TDESC_TYPE_DATA_PTR,
   TDESC_TYPE_IEEE_HALF,
-- 
2.34.1

Intel Deutschland GmbH
Registered Address: Am Campeon 10, 85579 Neubiberg, Germany
Tel: +49 89 99 8853-0, www.intel.de <http://www.intel.de>
Managing Directors: Christin Eisenschmid, Sharon Heck, Tiffany Doon Silva  
Chairperson of the Supervisory Board: Nicole Lau
Registered Office: Munich
Commercial Register: Amtsgericht Muenchen HRB 186928


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH 2/4] gdb, gdbserver: Add AMX registers.
  2022-05-06 12:12 [PATCH 0/4] Add AMX support Felix Willgerodt
  2022-05-06 12:12 ` [PATCH 1/4] gdb: define int512 and uint512 as built-in types Felix Willgerodt
@ 2022-05-06 12:12 ` Felix Willgerodt
  2022-05-06 12:25   ` Eli Zaretskii
                     ` (2 more replies)
  2022-05-06 12:12 ` [PATCH 3/4] gdb, gdbserver: Allocate only a sane amount of buffer when fetching registers Felix Willgerodt
  2022-05-06 12:12 ` [PATCH 4/4] gdb: Clear tilecfg.start_row for any PC modification Felix Willgerodt
  3 siblings, 3 replies; 27+ messages in thread
From: Felix Willgerodt @ 2022-05-06 12:12 UTC (permalink / raw)
  To: gdb-patches

Advanced Matrix Extensions (AMX) adds one 64 byte TILECFG register and
eight 1024 byte tile registers TMM0, TMM1, ..., TMM7.  The tile registers
each represent a matrix, whose dimensions are configured via TILECFG.
In XSAVE, all tiles are represented in the 8kB TILEDATA section.

Future AMX platforms are free to add new palettes, which are
run-time configurable partitionings of the TILEDATA space.
Currently only palette 0 (initialized zero state) and palette 1 exist.
New palettes might change any of the following parameters, which are defined
in the palette_table (which can be accessed via CPUID):

define palette_table[id]:
	uint16_t total_tile_bytes
	uint16_t bytes_per_tile
	uint16_t bytes_per_row
	uint16_t max_names
	uint16_t max_rows

More information about AMX can be found in the Intel(R) Architecture
Instruction Set Extensions Programming Reference, May 2021.

The $tilecfg register is implemented as a pseudo register.  For convenience
it is partitioned as a struct, representing the single configuration options
as members.  It doesn't show reserved space, as structs can only contain
existing data types.  To also be able to show the full register, $tilecfg_raw
is implemented as a uint512 value.

The $tmm0-7 registers are also represented as pseudo registers.  This allows
to only show the actually configured matrix and to omit filling zeros, which
greatly increases readability on smaller matrices.  A raw $tiledata register
is implemented as the base for the pseudo registers.

When developing this we also considered updating the target description at
runtime to achieve the dynamic sizing.  This however would have required
extensive changes to the writing and reading from/to XSAVE.  And it wouldn't
work with gdbserver easily, as there currently is no infrastructure to keep
XML target descriptions in sync after the initial transfer.
---
 gdb/amd64-linux-nat.c                         |   2 +
 gdb/amd64-linux-tdep.c                        |  12 +-
 gdb/amd64-tdep.c                              | 214 ++++++++++++-
 gdb/amd64-tdep.h                              |   2 +
 gdb/arch/amd64.c                              |   4 +
 gdb/doc/gdb.texinfo                           |  42 +++
 gdb/features/Makefile                         |   1 +
 gdb/features/i386/64bit-amx.c                 |  60 ++++
 gdb/features/i386/64bit-amx.xml               |  36 +++
 gdb/i386-linux-tdep.c                         |   2 +
 gdb/i386-linux-tdep.h                         |   2 +-
 gdb/i386-tdep.c                               | 300 +++++++++++++++++-
 gdb/i386-tdep.h                               | 122 ++++++-
 gdb/i387-tdep.c                               | 137 +++++++-
 gdb/i387-tdep.h                               |   8 +
 gdb/testsuite/gdb.arch/amd64-amx-corefile.exp | 113 +++++++
 gdb/testsuite/gdb.arch/amd64-amx.c            | 173 ++++++++++
 gdb/testsuite/gdb.arch/amd64-amx.exp          | 231 ++++++++++++++
 gdb/testsuite/lib/gdb.exp                     |  67 ++++
 gdbserver/i387-fp.cc                          |  64 +++-
 gdbserver/linux-amd64-ipa.cc                  |   2 +-
 gdbserver/linux-i386-ipa.cc                   |   2 +-
 gdbserver/linux-x86-low.cc                    |   4 +-
 gdbserver/linux-x86-tdesc.cc                  |   3 +
 gdbserver/linux-x86-tdesc.h                   |   3 +-
 gdbserver/server.h                            |   2 +-
 gdbsupport/x86-xstate.h                       |  33 +-
 27 files changed, 1610 insertions(+), 31 deletions(-)
 create mode 100644 gdb/features/i386/64bit-amx.c
 create mode 100644 gdb/features/i386/64bit-amx.xml
 create mode 100644 gdb/testsuite/gdb.arch/amd64-amx-corefile.exp
 create mode 100644 gdb/testsuite/gdb.arch/amd64-amx.c
 create mode 100755 gdb/testsuite/gdb.arch/amd64-amx.exp

diff --git a/gdb/amd64-linux-nat.c b/gdb/amd64-linux-nat.c
index 3d28d7e1d57..23f8b6f3e70 100644
--- a/gdb/amd64-linux-nat.c
+++ b/gdb/amd64-linux-nat.c
@@ -85,6 +85,8 @@ static int amd64_linux_gregset32_reg_offset[] =
   -1, -1, -1, -1, -1, -1, -1, -1, /* k0 ... k7 (AVX512)  */
   -1, -1, -1, -1, -1, -1, -1, -1, /* zmm0 ... zmm7 (AVX512)  */
   -1,				  /* PKEYS register PKRU  */
+  -1,				  /* TILECFG register (AMX).  */
+  -1,	 			  /* TILEDATA registers tmm0 ... tmm7 (AMX).  */
   ORIG_RAX * 8			  /* "orig_eax"  */
 };
 \f
diff --git a/gdb/amd64-linux-tdep.c b/gdb/amd64-linux-tdep.c
index 0e5194fbeee..cda90de54c6 100644
--- a/gdb/amd64-linux-tdep.c
+++ b/gdb/amd64-linux-tdep.c
@@ -97,6 +97,8 @@ int amd64_linux_gregset_reg_offset[] =
   -1, -1, -1, -1, -1, -1, -1, -1,
   -1, -1, -1, -1, -1, -1, -1, -1,
   -1,				/* PKEYS register pkru  */
+  -1,				/* TILECFG register (AMX).  */
+  -1,			 	/* TILEDATA registers tmm0 ... tmm7 (AMX).  */
 
   /* End of hardware registers */
   21 * 8, 22 * 8,		      /* fs_base and gs_base.  */
@@ -1577,9 +1579,9 @@ const target_desc *
 amd64_linux_read_description (uint64_t xcr0_features_bit, bool is_x32)
 {
   static target_desc *amd64_linux_tdescs \
-    [2/*AVX*/][2/*MPX*/][2/*AVX512*/][2/*PKRU*/] = {};
+    [2/*AVX*/][2/*MPX*/][2/*AVX512*/][2/*PKRU*/][2/*AMX*/] = {};
   static target_desc *x32_linux_tdescs \
-    [2/*AVX*/][2/*AVX512*/][2/*PKRU*/] = {};
+    [2/*AVX*/][2/*AVX512*/][2/*PKRU*/][2/*AMX*/] = {};
 
   target_desc **tdesc;
 
@@ -1587,14 +1589,16 @@ amd64_linux_read_description (uint64_t xcr0_features_bit, bool is_x32)
     {
       tdesc = &x32_linux_tdescs[(xcr0_features_bit & X86_XSTATE_AVX) ? 1 : 0 ]
 	[(xcr0_features_bit & X86_XSTATE_AVX512) ? 1 : 0]
-	[(xcr0_features_bit & X86_XSTATE_PKRU) ? 1 : 0];
+	[(xcr0_features_bit & X86_XSTATE_PKRU) ? 1 : 0]
+	[(xcr0_features_bit & X86_XSTATE_AMX) ? 1 : 0];
     }
   else
     {
       tdesc = &amd64_linux_tdescs[(xcr0_features_bit & X86_XSTATE_AVX) ? 1 : 0]
 	[(xcr0_features_bit & X86_XSTATE_MPX) ? 1 : 0]
 	[(xcr0_features_bit & X86_XSTATE_AVX512) ? 1 : 0]
-	[(xcr0_features_bit & X86_XSTATE_PKRU) ? 1 : 0];
+	[(xcr0_features_bit & X86_XSTATE_PKRU) ? 1 : 0]
+	[(xcr0_features_bit & X86_XSTATE_AMX) ? 1 : 0];
     }
 
   if (*tdesc == NULL)
diff --git a/gdb/amd64-tdep.c b/gdb/amd64-tdep.c
index b95ab1e87b8..50347f2f6fa 100644
--- a/gdb/amd64-tdep.c
+++ b/gdb/amd64-tdep.c
@@ -156,6 +156,14 @@ static const char * const amd64_pkeys_names[] = {
     "pkru"
 };
 
+static const char *amd64_tilecfg_raw_names[] = {
+    "tilecfg_raw"
+};
+
+static const char *amd64_tiledata_names[] = {
+    "tiledata"
+};
+
 /* DWARF Register Number Mapping as defined in the System V psABI,
    section 3.6.  */
 
@@ -326,6 +334,19 @@ static const char * const amd64_dword_names[] =
   "eip"
 };
 
+/* Register names for tmm pseudo-registers.  */
+
+static const char *amd64_tmm_names[] = {
+    "tmm0", "tmm1", "tmm2", "tmm3",
+    "tmm4", "tmm5", "tmm6", "tmm7"
+};
+
+/* Register name for tilecfg pseudo-register.  */
+
+static const char *amd64_tilecfg_names[] = {
+    "tilecfg"
+};
+
 /* Return the name of register REGNUM.  */
 
 static const char *
@@ -334,6 +355,10 @@ amd64_pseudo_register_name (struct gdbarch *gdbarch, int regnum)
   i386_gdbarch_tdep *tdep = (i386_gdbarch_tdep *) gdbarch_tdep (gdbarch);
   if (i386_byte_regnum_p (gdbarch, regnum))
     return amd64_byte_names[regnum - tdep->al_regnum];
+  else if (i386_tilecfg_regnum_p (gdbarch, regnum))
+    return amd64_tilecfg_names[regnum - tdep->tilecfg_regnum];
+  else if (i386_tmm_regnum_p (gdbarch, regnum))
+    return amd64_tmm_names[regnum - tdep->tmm_regnum];
   else if (i386_zmm_regnum_p (gdbarch, regnum))
     return amd64_zmm_names[regnum - tdep->zmm0_regnum];
   else if (i386_ymm_regnum_p (gdbarch, regnum))
@@ -348,6 +373,91 @@ amd64_pseudo_register_name (struct gdbarch *gdbarch, int regnum)
     return i386_pseudo_register_name (gdbarch, regnum);
 }
 
+/* A helper function to re-size AMX pseudo registers during reads.  Copies
+   the contents from RAW_BUF to BUF and re-sizes the value.  */
+
+static void
+amd64_tmm_resize_read (const tilecfg_reg *tilecfg, const gdb_byte *raw_buf,
+		       gdb_byte *buf, value *result_value, const int tmmnum)
+{
+  uint16_t columns = 64;
+  uint8_t rows = 16;
+
+  if (tilecfg != nullptr)
+    {
+      columns = tilecfg->bytes_per_row (tmmnum);
+      rows = tilecfg->rows (tmmnum);
+      if (columns == 0)
+	columns = 64;
+      if (rows == 0)
+	rows = 16;
+    }
+
+  gdb_assert (TYPE_LENGTH (value_type (result_value)) >= rows * columns);
+
+  /* Copy each row from raw_buf into buf.  The rows are not consecutive
+     but they are on MAX_BYTES_PER_ROW * iRow position.  */
+  const gdb_byte *raw_buf_offset
+    = raw_buf + tmmnum * tilecfg->MAX_BYTES_PER_TILE;
+  for (uint8_t iRow = 0; iRow < rows; ++iRow)
+    {
+      memcpy (buf + columns * iRow,
+	      raw_buf_offset + tilecfg->MAX_BYTES_PER_ROW * iRow,
+	      columns);
+    }
+
+  /* Adjust the result_value.  The value is a union of matrices of different
+     types.  See i386_tmm_type ().  This iterates over each member and
+     adjusts the dimensions according to the type.  */
+  for (int i = 0; i < value_type (result_value)->num_fields (); ++i)
+    {
+      type *rows_type = value_type (result_value)->fields ()[i].m_type;
+      type *cols_type = rows_type->main_type->target_type;
+
+      /* Adjust array bit lengths.  */
+      rows_type->length = columns * rows;
+      cols_type->length = columns;
+
+      /* Adjust array dimensions.  */
+      rows_type->bounds ()->high.set_const_val (rows - 1);
+      int num_bytes = cols_type->main_type->target_type->length;
+      cols_type->bounds ()->high.set_const_val (columns / num_bytes - 1);
+    }
+}
+
+/* A helper function to re-size AMX pseudo registers during writes.  Copies
+   the contents from BUF to RAW_BUF.  */
+
+static void
+amd64_tmm_resize_write (const tilecfg_reg *tilecfg, gdb_byte *raw_buf,
+			const gdb_byte *buf, const int tmmnum)
+{
+  uint16_t columns = 64;
+  uint8_t rows = 16;
+
+  if (tilecfg != nullptr)
+    {
+      columns = tilecfg->bytes_per_row (tmmnum);
+      rows = tilecfg->rows (tmmnum);
+      if (columns == 0)
+	columns = 64;
+      if (rows == 0)
+	rows = 16;
+    }
+
+  /* Copy each row from buf into raw_buf.  BUF represents a tile as the user
+     would see it in the pseudo register type.  RAW_BUF represents the whole
+     tiledata section.  We therefore need to find the tile's position in
+     tiledata and find the right rows from there.  */
+  gdb_byte *raw_buf_offset = raw_buf + tmmnum * tilecfg->MAX_BYTES_PER_TILE;
+  for (uint8_t iRow = 0; iRow < rows; ++iRow)
+    {
+      memcpy (raw_buf_offset + tilecfg->MAX_BYTES_PER_ROW * iRow,
+	      buf + columns * iRow,
+	      columns);
+    }
+}
+
 static struct value *
 amd64_pseudo_register_read_value (struct gdbarch *gdbarch,
 				  readable_regcache *regcache,
@@ -401,6 +511,58 @@ amd64_pseudo_register_read_value (struct gdbarch *gdbarch,
 	mark_value_bytes_unavailable (result_value, 0,
 				      TYPE_LENGTH (value_type (result_value)));
     }
+  else if (i386_tilecfg_regnum_p (gdbarch, regnum))
+    {
+      /* Read tilecfg.  */
+      gdb_byte raw_buf[register_size (gdbarch, tdep->tilecfg_raw_regnum)];
+      register_status status = regcache->raw_read (tdep->tilecfg_raw_regnum,
+						   raw_buf);
+      if (status != REG_VALID)
+	{
+	  mark_value_bytes_unavailable (
+	    result_value, 0, TYPE_LENGTH (value_type (result_value)));
+	}
+      else
+	{
+	  /* Copy palette and start_row.  See tilecfg_type ().  */
+	  memcpy (buf, raw_buf, 2 * 1);
+	  /* Copy all colsb.  */
+	  memcpy (buf + 2, raw_buf + 16, 2 * 8);
+	  /* Copy all rows.  */
+	  memcpy (buf + 18, raw_buf + 48, 1 * 8);
+	}
+    }
+  else if (i386_tmm_regnum_p (gdbarch, regnum))
+    {
+      /* Read tilecfg.  */
+      gdb_byte tilecfg_buf[register_size (gdbarch, tdep->tilecfg_raw_regnum)];
+      register_status status = regcache->raw_read (tdep->tilecfg_raw_regnum,
+						   tilecfg_buf);
+      if (status != REG_VALID)
+	{
+	  mark_value_bytes_unavailable (
+	    result_value, 0, TYPE_LENGTH (value_type (result_value)));
+	}
+      else
+	{
+	  tilecfg_reg tilecfg{ tilecfg_buf };
+	  gdb_byte raw_buf[register_size (gdbarch, tdep->tiledata_regnum)];
+	  status = regcache->raw_read (tdep->tiledata_regnum, raw_buf);
+
+	  if (status != REG_VALID)
+	    {
+	      mark_value_bytes_unavailable (
+		result_value, 0, TYPE_LENGTH (value_type (result_value)));
+	    }
+	  else
+	    {
+	      /* Re-size value and copy data.  */
+	      amd64_tmm_resize_read (&tilecfg, raw_buf,
+				     buf, result_value,
+				     regnum - tdep->tmm_regnum);
+	    }
+	}
+    }
   else
     i386_pseudo_register_read_into_value (gdbarch, regcache, regnum,
 					  result_value);
@@ -455,6 +617,38 @@ amd64_pseudo_register_write (struct gdbarch *gdbarch,
       /* ... Write.  */
       regcache->raw_write (gpnum, raw_buf);
     }
+  else if (i386_tmm_regnum_p (gdbarch, regnum))
+    {
+      /* Read tilecfg.  */
+      gdb_byte tilecfg_buf[register_size (gdbarch,
+					  tdep->tilecfg_raw_regnum)];
+      register_status status = regcache->raw_read (tdep->tilecfg_raw_regnum,
+						   tilecfg_buf);
+
+      if (status != REG_VALID)
+	error (_("Could not read tilecfg to determine tmm dimensions."));
+
+      tilecfg_reg tilecfg{ tilecfg_buf };
+      gdb_byte raw_buf[register_size (gdbarch, tdep->tiledata_regnum)];
+      /* Modify tile.  */
+      regcache->raw_read (tdep->tiledata_regnum, raw_buf);
+      amd64_tmm_resize_write (&tilecfg, raw_buf, buf,
+			      regnum - tdep->tmm_regnum);
+      /* ... Write.  */
+      regcache->raw_write (tdep->tiledata_regnum, raw_buf);
+    }
+  else if (i386_tilecfg_regnum_p (gdbarch, regnum))
+    {
+      gdb_byte raw_buf[register_size (gdbarch, tdep->tilecfg_raw_regnum)]
+	= { 0 };
+      /* Copy palette and start_row.  See tilecfg_type ().  */
+      memcpy (raw_buf, buf, 2 * 1);
+      /* Copy all colsb.  */
+      memcpy (raw_buf + 16, buf + 2, 2 * 8);
+      /* Copy all rows.  */
+      memcpy (raw_buf + 48, buf + 18, 1 * 8);
+      regcache->raw_write (tdep->tilecfg_raw_regnum, raw_buf);
+    }
   else
     i386_pseudo_register_write (gdbarch, regcache, regnum, buf);
 }
@@ -3181,6 +3375,23 @@ amd64_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch,
       tdep->num_pkeys_regs = 1;
     }
 
+  if (tdesc_find_feature (tdesc, "org.gnu.gdb.i386.amx") != nullptr)
+    {
+      tdep->tilecfg_raw_register_names = amd64_tilecfg_raw_names;
+      tdep->tilecfg_raw_regnum = AMD64_AMX_TILECFG_RAW_REGNUM;
+      tdep->num_tilecfg_raw_regs = 1;
+
+      tdep->tilecfg_register_names = amd64_tilecfg_names;
+      tdep->num_tilecfg_regs = 1;
+
+      tdep->tiledata_register_names = amd64_tiledata_names;
+      tdep->tiledata_regnum = AMD64_AMX_TILEDATA_REGNUM;
+      tdep->num_tiledata_regs = 1;
+
+      tdep->tmm_register_names = amd64_tmm_names;
+      tdep->num_tmm_regs = 8;
+    }
+
   tdep->num_byte_regs = 20;
   tdep->num_word_regs = 16;
   tdep->num_dword_regs = 16;
@@ -3340,13 +3551,14 @@ const struct target_desc *
 amd64_target_description (uint64_t xcr0, bool segments)
 {
   static target_desc *amd64_tdescs \
-    [2/*AVX*/][2/*MPX*/][2/*AVX512*/][2/*PKRU*/][2/*segments*/] = {};
+    [2/*AVX*/][2/*MPX*/][2/*AVX512*/][2/*PKRU*/][2/*AMX*/][2/*segments*/] = {};
   target_desc **tdesc;
 
   tdesc = &amd64_tdescs[(xcr0 & X86_XSTATE_AVX) ? 1 : 0]
     [(xcr0 & X86_XSTATE_MPX) ? 1 : 0]
     [(xcr0 & X86_XSTATE_AVX512) ? 1 : 0]
     [(xcr0 & X86_XSTATE_PKRU) ? 1 : 0]
+    [(xcr0 & X86_XSTATE_AMX) ? 1 : 0]
     [segments ? 1 : 0];
 
   if (*tdesc == NULL)
diff --git a/gdb/amd64-tdep.h b/gdb/amd64-tdep.h
index c18766e71c4..3dd279bd0ae 100644
--- a/gdb/amd64-tdep.h
+++ b/gdb/amd64-tdep.h
@@ -79,6 +79,8 @@ enum amd64_regnum
   AMD64_ZMM0H_REGNUM,
   AMD64_ZMM31H_REGNUM = AMD64_ZMM0H_REGNUM + 31,
   AMD64_PKRU_REGNUM,
+  AMD64_AMX_TILECFG_RAW_REGNUM,
+  AMD64_AMX_TILEDATA_REGNUM,
   AMD64_FSBASE_REGNUM,
   AMD64_GSBASE_REGNUM
 };
diff --git a/gdb/arch/amd64.c b/gdb/arch/amd64.c
index 559f678d356..f0c350e8311 100644
--- a/gdb/arch/amd64.c
+++ b/gdb/arch/amd64.c
@@ -22,6 +22,7 @@
 
 #include "../features/i386/64bit-avx.c"
 #include "../features/i386/64bit-avx512.c"
+#include "../features/i386/64bit-amx.c"
 #include "../features/i386/64bit-core.c"
 #include "../features/i386/64bit-linux.c"
 #include "../features/i386/64bit-mpx.c"
@@ -75,5 +76,8 @@ amd64_create_target_description (uint64_t xcr0, bool is_x32, bool is_linux,
   if (xcr0 & X86_XSTATE_PKRU)
     regnum = create_feature_i386_pkeys (tdesc.get (), regnum);
 
+  if (xcr0 & X86_XSTATE_AMX)
+    regnum = create_feature_i386_64bit_amx (tdesc.get (), regnum);
+
   return tdesc.release ();
 }
diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
index 3972b85fe79..0bf1a022042 100644
--- a/gdb/doc/gdb.texinfo
+++ b/gdb/doc/gdb.texinfo
@@ -25659,6 +25659,43 @@ At this last step the value of bnd0 can be changed for investigation of bound
 violations caused along the execution of the call.  In order to know how to
 set the bound registers or bound table for the call consult the ABI.
 
+@subsubsection Intel @dfn{Advanced Matrix Extensions} (AMX).
+@cindex Advanced Matrix Extensions (AMX).
+
+Advanced Matrix Extensions (AMX) adds one 64 byte @samp{TILECFG} register and
+eight 1024 byte tile registers @samp{TMM0}, @samp{TMM1}, ..., @samp{TMM7}.
+The tile registers each represent a matrix, whose dimensions are configured via
+@samp{TILECFG}.  Future platforms might also partition the register area of
+8 * 1024 bytes between a different number of tiles.
+
+To present such big registers in a user friendly way, @value{GDBN} represents
+the @samp{TILECFG} and tile registers as pseudo registers.
+The @samp{TILECFG} is shown as a @code{struct}, omitting reserved bits.
+The full register can still be viewed using @samp{TILECFG_RAW}.  The tile
+registers are sized dynamically according to the configuration in
+@samp{TILECFG}.  For example:
+
+@smallexample
+	(gdb) print/x $tilecfg_raw
+	$1 = 0x203020000000000000000000000000000000000000000000000000000001000
+	10000c00000000000000000000000000000001
+	(gdb) print $tilecfg
+	$2 = @{palette = 0x1, start_row = 0x0, tile0.colsb = 0xc,
+	tile1.colsb = 0x10, tile2.colsb = 0x10, tile3.colsb = 0x0,
+	tile4.colsb = 0x0, tile5.colsb = 0x0, tile6.colsb = 0x0,
+	tile7.colsb = 0x0, tile0.rows = 0x2, tile1.rows = 0x3, tile2.rows = 0x2,
+	tile3.rows = 0x0, tile4.rows = 0x0, tile5.rows = 0x0, tile6.rows = 0x0,
+	tile7.rows = 0x0@}
+	(gdb) p $tmm0.m_int8
+	$3 = @{@{0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2@},	@{1, 1, 1, 1, 2, 2, 2,
+	2, 3, 3, 3, 3@}@}
+@end smallexample
+
+The raw register data for tiles can be seen in the register
+@samp{TILEDATA}, which represents the whole 8 * 1024 bytes available for tiles.
+Setting any pseudo register will result in changes in the corresponding raw
+register (e.g. @samp{TILECFG_RAW} and @samp{TILEDATA}).
+
 @node Alpha
 @subsection Alpha
 
@@ -46824,6 +46861,11 @@ targets.  It should contain the registers @samp{r0} through @samp{r31},
 @samp{pc}, and @samp{badv}.  Either the architectural names (@samp{r0},
 @samp{r1}, etc) can be used, or the ABI names (@samp{zero}, @samp{ra}, etc).
 
+The @samp{org.gnu.gdb.i386.amx} feature is optional.  It should
+describe one config user mode register @samp{tilecfg_raw} that has
+64 bytes and one @samp{tiledata} register that has 8192 bytes.
+All AMX registers are valid only for amd64.
+
 @node MicroBlaze Features
 @subsection MicroBlaze Features
 @cindex target descriptions, MicroBlaze features
diff --git a/gdb/features/Makefile b/gdb/features/Makefile
index 15d623c2681..302aef29a55 100644
--- a/gdb/features/Makefile
+++ b/gdb/features/Makefile
@@ -225,6 +225,7 @@ FEATURE_XMLFILES = aarch64-core.xml \
 	i386/64bit-mpx.xml \
 	i386/64bit-segments.xml \
 	i386/64bit-avx.xml \
+	i386/64bit-amx.xml \
 	i386/64bit-linux.xml \
 	i386/64bit-sse.xml \
 	i386/pkeys.xml \
diff --git a/gdb/features/i386/64bit-amx.c b/gdb/features/i386/64bit-amx.c
new file mode 100644
index 00000000000..a22e607cf0d
--- /dev/null
+++ b/gdb/features/i386/64bit-amx.c
@@ -0,0 +1,60 @@
+/* THIS FILE IS GENERATED.  -*- buffer-read-only: t -*- vi:set ro:
+  Original: 64bit-amx.xml */
+
+#include "gdbsupport/tdesc.h"
+
+static int
+create_feature_i386_64bit_amx (struct target_desc *result, long regnum)
+{
+  struct tdesc_feature *feature;
+
+  feature = tdesc_create_feature (result, "org.gnu.gdb.i386.amx");
+  tdesc_type *element_type;
+  element_type = tdesc_named_type (feature, "int8");
+  tdesc_create_vector (feature, "v_i8", element_type, 64);
+
+  element_type = tdesc_named_type (feature, "v_i8");
+  tdesc_create_vector (feature, "matrix_i8", element_type, 128);
+
+  element_type = tdesc_named_type (feature, "uint8");
+  tdesc_create_vector (feature, "v_ui8", element_type, 64);
+
+  element_type = tdesc_named_type (feature, "v_ui8");
+  tdesc_create_vector (feature, "matrix_ui8", element_type, 128);
+
+  element_type = tdesc_named_type (feature, "int32");
+  tdesc_create_vector (feature, "v_i32", element_type, 16);
+
+  element_type = tdesc_named_type (feature, "v_i32");
+  tdesc_create_vector (feature, "matrix_i32", element_type, 128);
+
+  element_type = tdesc_named_type (feature, "bfloat16");
+  tdesc_create_vector (feature, "v_bf16", element_type, 32);
+
+  element_type = tdesc_named_type (feature, "v_bf16");
+  tdesc_create_vector (feature, "matrix_bf16", element_type, 128);
+
+  element_type = tdesc_named_type (feature, "ieee_single");
+  tdesc_create_vector (feature, "v_fp32", element_type, 16);
+
+  element_type = tdesc_named_type (feature, "v_fp32");
+  tdesc_create_vector (feature, "matrix_fp32", element_type, 128);
+
+  tdesc_type_with_fields *type_with_fields;
+  type_with_fields = tdesc_create_union (feature, "tiledata_type");
+  tdesc_type *field_type;
+  field_type = tdesc_named_type (feature, "matrix_i8");
+  tdesc_add_field (type_with_fields, "m_int8", field_type);
+  field_type = tdesc_named_type (feature, "matrix_ui8");
+  tdesc_add_field (type_with_fields, "m_uint8", field_type);
+  field_type = tdesc_named_type (feature, "matrix_i32");
+  tdesc_add_field (type_with_fields, "m_int32", field_type);
+  field_type = tdesc_named_type (feature, "matrix_bf16");
+  tdesc_add_field (type_with_fields, "m_bf16", field_type);
+  field_type = tdesc_named_type (feature, "matrix_fp32");
+  tdesc_add_field (type_with_fields, "m_fp32", field_type);
+
+  tdesc_create_reg (feature, "tilecfg_raw", regnum++, 1, NULL, 512, "uint512");
+  tdesc_create_reg (feature, "tiledata", regnum++, 1, NULL, 65536, "tiledata_type");
+  return regnum;
+}
diff --git a/gdb/features/i386/64bit-amx.xml b/gdb/features/i386/64bit-amx.xml
new file mode 100644
index 00000000000..1e2662bca07
--- /dev/null
+++ b/gdb/features/i386/64bit-amx.xml
@@ -0,0 +1,36 @@
+<?xml version="1.0"?>
+<!-- Copyright (C) 2020-2022 Free Software Foundation, Inc.
+
+     Copying and distribution of this file, with or without modification,
+     are permitted in any medium without royalty provided the copyright
+     notice and this notice are preserved.  -->
+
+<!DOCTYPE feature SYSTEM "gdb-target.dtd">
+<feature name="org.gnu.gdb.i386.amx">
+  <reg name="tilecfg_raw" bitsize="512" type="uint512"/>
+
+  <vector id="v_i8" type="int8" count="64"/>
+  <vector id="matrix_i8" type="v_i8" count="128"/>
+
+  <vector id="v_ui8" type="uint8" count="64"/>
+  <vector id="matrix_ui8" type="v_ui8" count="128"/>
+
+  <vector id="v_i32" type="int32" count="16"/>
+  <vector id="matrix_i32" type="v_i32" count="128"/>
+
+  <vector id="v_bf16" type="bfloat16" count="32"/>
+  <vector id="matrix_bf16" type="v_bf16" count="128"/>
+
+  <vector id="v_fp32" type="ieee_single" count="16"/>
+  <vector id="matrix_fp32" type="v_fp32" count="128"/>
+
+  <union id="tiledata_type">
+    <field name="m_int8" type="matrix_i8"/>
+    <field name="m_uint8" type="matrix_ui8"/>
+    <field name="m_int32" type="matrix_i32"/>
+    <field name="m_bf16" type="matrix_bf16"/>
+    <field name="m_fp32" type="matrix_fp32"/>
+  </union>
+
+  <reg name="tiledata" bitsize="65536" type="tiledata_type"/>
+</feature>
diff --git a/gdb/i386-linux-tdep.c b/gdb/i386-linux-tdep.c
index 5d7f54194af..7b73417c7a8 100644
--- a/gdb/i386-linux-tdep.c
+++ b/gdb/i386-linux-tdep.c
@@ -611,6 +611,8 @@ int i386_linux_gregset_reg_offset[] =
   -1, -1, -1, -1, -1, -1, -1, -1, /* k0 ... k7 (AVX512)  */
   -1, -1, -1, -1, -1, -1, -1, -1, /* zmm0 ... zmm7 (AVX512)  */
   -1,				  /* PKRU register  */
+  -1,				  /* AMX register TILECFG.  */
+  -1,				  /* AMX TILEDATA registers: tmm0 ... tmm7.  */
   11 * 4,			  /* "orig_eax"  */
 };
 
diff --git a/gdb/i386-linux-tdep.h b/gdb/i386-linux-tdep.h
index 6b3555aa3ea..705c7bcd602 100644
--- a/gdb/i386-linux-tdep.h
+++ b/gdb/i386-linux-tdep.h
@@ -29,7 +29,7 @@
 /* Register number for the "orig_eax" pseudo-register.  If this
    pseudo-register contains a value >= 0 it is interpreted as the
    system call number that the kernel is supposed to restart.  */
-#define I386_LINUX_ORIG_EAX_REGNUM (I386_PKRU_REGNUM + 1)
+#define I386_LINUX_ORIG_EAX_REGNUM (I386_AMX_TILEDATA_REGNUM + 1)
 
 /* Total number of registers for GNU/Linux.  */
 #define I386_LINUX_NUM_REGS (I386_LINUX_ORIG_EAX_REGNUM + 1)
diff --git a/gdb/i386-tdep.c b/gdb/i386-tdep.c
index 8501e12e241..921b24ab60f 100644
--- a/gdb/i386-tdep.c
+++ b/gdb/i386-tdep.c
@@ -142,6 +142,31 @@ static const char * const i386_mmx_names[] =
   "mm4", "mm5", "mm6", "mm7"
 };
 
+/* Register names for AMX registers.  */
+
+static const char * const i386_tilecfg_raw_names[] =
+{
+  "tilecfg_raw"
+};
+
+static const char * const i386_tiledata_names[] =
+{
+  "tiledata"
+};
+
+/* Register names for AMX pseudo-registers.  */
+
+static const char * const i386_tilecfg_names[] =
+{
+  "tilecfg"
+};
+
+static const char * const i386_tmm_names[] =
+{
+  "tmm0", "tmm1", "tmm2", "tmm3",
+  "tmm4", "tmm5", "tmm6", "tmm7"
+};
+
 /* Register names for byte pseudo-registers.  */
 
 static const char * const i386_byte_names[] =
@@ -436,6 +461,36 @@ i386_pkru_regnum_p (struct gdbarch *gdbarch, int regnum)
   return regnum >= 0 && regnum < I387_NUM_PKEYS_REGS;
 }
 
+/* AMX tilecfg register?  */
+
+bool
+i386_tilecfg_regnum_p (struct gdbarch *gdbarch, int regnum)
+{
+  i386_gdbarch_tdep *tdep = (i386_gdbarch_tdep *) gdbarch_tdep (gdbarch);
+  int tilecfg_regnum = tdep->tilecfg_regnum;
+
+  if (tilecfg_regnum < 0)
+    return false;
+
+  regnum -= tilecfg_regnum;
+  return regnum >= 0 && regnum < I387_NUM_TILECFG_REGS;
+}
+
+/* AMX tmm register?  */
+
+bool
+i386_tmm_regnum_p (struct gdbarch *gdbarch, int regnum)
+{
+  i386_gdbarch_tdep *tdep = (i386_gdbarch_tdep *) gdbarch_tdep (gdbarch);
+  int tmm_regnum = tdep->tmm_regnum;
+
+  if (tmm_regnum < 0)
+    return false;
+
+  regnum -= tmm_regnum;
+  return regnum >= 0 && regnum < I387_NUM_TMM_REGS;
+}
+
 /* Return the name of register REGNUM, or the empty string if it is
    an anonymous register.  */
 
@@ -475,10 +530,37 @@ i386_pseudo_register_name (struct gdbarch *gdbarch, int regnum)
     return i386_byte_names[regnum - tdep->al_regnum];
   else if (i386_word_regnum_p (gdbarch, regnum))
     return i386_word_names[regnum - tdep->ax_regnum];
+  else if (i386_tmm_regnum_p (gdbarch, regnum))
+    return i386_tmm_names[regnum - tdep->tmm_regnum];
+  else if (i386_tilecfg_regnum_p (gdbarch, regnum))
+    return i386_tilecfg_names[regnum - tdep->tilecfg_regnum];
 
   internal_error (__FILE__, __LINE__, _("invalid regnum"));
 }
 
+/* AMX tilecfg_reg constructor.  */
+
+tilecfg_reg::tilecfg_reg (uint8_t *raw_tilecfg) : tilecfg_reg ()
+{
+  /* Use default values.  */
+  if (raw_tilecfg == nullptr)
+    return;
+
+  palette = raw_tilecfg[0];
+  start_row = raw_tilecfg[1];
+
+  /* Read TILECFG column and row values via pointers.
+     Columns are represented by 2 bytes and rows are represented
+     by 1 byte.  Column pointer which is *uint8_t needs to be converted
+     to *uint16_t pointer.  */
+  uint16_t *vec_col_pos
+      = reinterpret_cast<uint16_t *> (raw_tilecfg + COLUMN_MEMORY_OFFSET);
+  uint8_t *vec_row_pos = raw_tilecfg + ROW_MEMORY_OFFSET;
+
+  for (int i = 0; i < MAX_NAMES; i++)
+    columns_and_rows[i] = { vec_col_pos[i], vec_row_pos[i] };
+}
+
 /* Convert a dbx register number REG to the appropriate register
    number used by GDB.  */
 
@@ -3307,6 +3389,142 @@ i386_mmx_type (struct gdbarch *gdbarch)
   return tdep->i386_mmx_type;
 }
 
+/* Construct vector type for TMM registers.  */
+
+static struct type *
+i386_tmm_type (struct gdbarch *gdbarch)
+{
+  i386_gdbarch_tdep *tdep = (i386_gdbarch_tdep *) gdbarch_tdep (gdbarch);
+
+  if (!tdep->i386_tmm_type)
+    {
+      const struct builtin_type *bt = builtin_type (gdbarch);
+
+      uint8_t bytes_per_row = tilecfg_reg::MAX_BYTES_PER_ROW;
+      uint8_t max_rows = tilecfg_reg::MAX_ROWS;
+
+      /* The type we're building is this:  */
+#if 0
+      union __gdb_builtin_type_matrix1024i
+      {
+	int8_t m_int8[max_rows][bytes_per_row];
+	uint8_t m_uint8[max_rows][bytes_per_row];
+	int32_t m_int32[max_rows][bytes_per_row/4];
+	bfloat16_t m_bfloat16[max_rows][bytes_per_row/2];
+	float m_int32[max_rows][bytes_per_row/4];
+      };
+#endif
+
+      struct type *t;
+      t = arch_composite_type (gdbarch, "builtin_type_tile", TYPE_CODE_UNION);
+
+      append_composite_type_field (
+	  t, "m_int8",
+	  init_vector_type (init_vector_type (bt->builtin_int8, bytes_per_row),
+			    max_rows));
+
+      append_composite_type_field (
+	  t, "m_uint8",
+	  init_vector_type (
+	      init_vector_type (bt->builtin_uint8, bytes_per_row), max_rows));
+
+      append_composite_type_field (
+	  t, "m_int32",
+	  init_vector_type (
+	      init_vector_type (bt->builtin_int32, bytes_per_row / 4),
+	      max_rows));
+
+      append_composite_type_field (
+	  t, "m_bf16",
+	  init_vector_type (
+	      init_vector_type (bt->builtin_bfloat16, bytes_per_row / 2),
+	      max_rows));
+
+      append_composite_type_field (
+	  t, "m_fp32",
+	  init_vector_type (
+	      init_vector_type (bt->builtin_float, bytes_per_row / 4),
+	      max_rows));
+
+      t->set_is_vector (true);
+      t->set_name ("builtin_type_tile");
+      tdep->i386_tmm_type = t;
+    }
+
+  return tdep->i386_tmm_type;
+}
+
+/* Construct vector type for TILECFG registers.  */
+
+static struct type *
+i386_tilecfg_type (struct gdbarch *gdbarch)
+{
+  i386_gdbarch_tdep *tdep = (i386_gdbarch_tdep *) gdbarch_tdep (gdbarch);
+
+  if (!tdep->i386_tilecfg_type)
+    {
+      const struct builtin_type *bt = builtin_type (gdbarch);
+
+      /* The type we're building is this:  */
+#if 0
+      struct __gdb_builtin_type_tilecfg
+	{
+	  uint8_t palette;
+	  uint8_t start_row;
+	  // Bytes 2-15 reserved
+	  uint16_t tile0.colsb;
+	  uint16_t tile1.colsb;
+	  uint16_t tile2.colsb;
+	  uint16_t tile3.colsb;
+	  uint16_t tile4.colsb;
+	  uint16_t tile5.colsb;
+	  uint16_t tile6.colsb;
+	  uint16_t tile7.colsb;
+	  // Bytes 32-47 reserved
+	  uint8_t tile0.rows;
+	  uint8_t tile1.rows;
+	  uint8_t tile2.rows;
+	  uint8_t tile3.rows;
+	  uint8_t tile4.rows;
+	  uint8_t tile5.rows;
+	  uint8_t tile6.rows;
+	  uint8_t tile7.rows;
+	  // Bytes 56-63 reserved
+	};
+#endif
+
+    struct type *t;
+    t = arch_composite_type (gdbarch, "builtin_type_tilecfg",
+			     TYPE_CODE_STRUCT);
+
+    append_composite_type_field (t, "palette", bt->builtin_uint8);
+    append_composite_type_field (t, "start_row", bt->builtin_uint8);
+    /* Note: GDBs expression evaluation cannot handle naming these
+       tile0.colsb.  */
+    append_composite_type_field (t, "tile0_colsb", bt->builtin_uint16);
+    append_composite_type_field (t, "tile1_colsb", bt->builtin_uint16);
+    append_composite_type_field (t, "tile2_colsb", bt->builtin_uint16);
+    append_composite_type_field (t, "tile3_colsb", bt->builtin_uint16);
+    append_composite_type_field (t, "tile4_colsb", bt->builtin_uint16);
+    append_composite_type_field (t, "tile5_colsb", bt->builtin_uint16);
+    append_composite_type_field (t, "tile6_colsb", bt->builtin_uint16);
+    append_composite_type_field (t, "tile7_colsb", bt->builtin_uint16);
+    append_composite_type_field (t, "tile0_rows", bt->builtin_uint8);
+    append_composite_type_field (t, "tile1_rows", bt->builtin_uint8);
+    append_composite_type_field (t, "tile2_rows", bt->builtin_uint8);
+    append_composite_type_field (t, "tile3_rows", bt->builtin_uint8);
+    append_composite_type_field (t, "tile4_rows", bt->builtin_uint8);
+    append_composite_type_field (t, "tile5_rows", bt->builtin_uint8);
+    append_composite_type_field (t, "tile6_rows", bt->builtin_uint8);
+    append_composite_type_field (t, "tile7_rows", bt->builtin_uint8);
+
+    t->set_name ("builtin_type_tilecfg");
+    tdep->i386_tilecfg_type = t;
+  }
+
+  return tdep->i386_tilecfg_type;
+}
+
 /* Return the GDB type object for the "standard" data type of data in
    register REGNUM.  */
 
@@ -3323,6 +3541,10 @@ i386_pseudo_register_type (struct gdbarch *gdbarch, int regnum)
     return i386_ymm_type (gdbarch);
   else if (i386_zmm_regnum_p (gdbarch, regnum))
     return i386_zmm_type (gdbarch);
+  else if (i386_tmm_regnum_p (gdbarch, regnum))
+    return i386_tmm_type (gdbarch);
+  else if (i386_tilecfg_regnum_p (gdbarch, regnum))
+    return i386_tilecfg_type (gdbarch);
   else
     {
       const struct builtin_type *bt = builtin_type (gdbarch);
@@ -4557,7 +4779,8 @@ i386_register_reggroup_p (struct gdbarch *gdbarch, int regnum,
       ymm_regnum_p, ymmh_regnum_p, ymm_avx512_regnum_p, ymmh_avx512_regnum_p,
       bndr_regnum_p, bnd_regnum_p, zmm_regnum_p, zmmh_regnum_p,
       mpx_ctrl_regnum_p, xmm_avx512_regnum_p,
-      avx512_p, avx_p, sse_p, pkru_regnum_p;
+      avx512_p, avx_p, sse_p, pkru_regnum_p, tilecfg_regnum_p,
+      tmm_regnum_p;
 
   /* Don't include pseudo registers, except for MMX, in any register
      groups.  */
@@ -4584,6 +4807,8 @@ i386_register_reggroup_p (struct gdbarch *gdbarch, int regnum,
   ymm_regnum_p = i386_ymm_regnum_p (gdbarch, regnum);
   ymm_avx512_regnum_p = i386_ymm_avx512_regnum_p (gdbarch, regnum);
   zmm_regnum_p = i386_zmm_regnum_p (gdbarch, regnum);
+  tmm_regnum_p = i386_tmm_regnum_p (gdbarch, regnum);
+  tilecfg_regnum_p = i386_tilecfg_regnum_p (gdbarch, regnum);
 
   avx512_p = ((tdep->xcr0 & X86_XSTATE_AVX_AVX512_MASK)
 	      == X86_XSTATE_AVX_AVX512_MASK);
@@ -4597,7 +4822,7 @@ i386_register_reggroup_p (struct gdbarch *gdbarch, int regnum,
 	    || (zmm_regnum_p && avx512_p)
 	    || ((ymm_regnum_p || ymm_avx512_regnum_p) && avx_p)
 	    || ((xmm_regnum_p || xmm_avx512_regnum_p) && sse_p)
-	    || mxcsr_regnum_p);
+	    || mxcsr_regnum_p || tmm_regnum_p || tilecfg_regnum_p);
 
   fp_regnum_p = (i386_fp_regnum_p (gdbarch, regnum)
 		 || i386_fpc_regnum_p (gdbarch, regnum));
@@ -4647,7 +4872,9 @@ i386_register_reggroup_p (struct gdbarch *gdbarch, int regnum,
 	    && !mpx_ctrl_regnum_p
 	    && !zmm_regnum_p
 	    && !zmmh_regnum_p
-	    && !pkru_regnum_p);
+	    && !pkru_regnum_p
+	    && !tmm_regnum_p
+	    && !tilecfg_regnum_p);
 
   return default_register_reggroup_p (gdbarch, regnum, group);
 }
@@ -8249,7 +8476,8 @@ i386_validate_tdesc_p (i386_gdbarch_tdep *tdep,
   const struct tdesc_feature *feature_core;
 
   const struct tdesc_feature *feature_sse, *feature_avx, *feature_mpx,
-			     *feature_avx512, *feature_pkeys, *feature_segments;
+			     *feature_avx512, *feature_pkeys, *feature_segments,
+			     *feature_amx;
   int i, num_regs, valid_p;
 
   if (! tdesc_has_registers (tdesc))
@@ -8278,6 +8506,9 @@ i386_validate_tdesc_p (i386_gdbarch_tdep *tdep,
   /* Try PKEYS  */
   feature_pkeys = tdesc_find_feature (tdesc, "org.gnu.gdb.i386.pkeys");
 
+  /* Try AMX.  */
+  feature_amx = tdesc_find_feature (tdesc, "org.gnu.gdb.i386.amx");
+
   valid_p = 1;
 
   /* The XCR0 bits.  */
@@ -8410,6 +8641,35 @@ i386_validate_tdesc_p (i386_gdbarch_tdep *tdep,
 					    tdep->pkeys_register_names[i]);
     }
 
+  if (feature_amx != nullptr)
+    {
+      tdep->xcr0 |= X86_XSTATE_TILECFG;
+
+      if (tdep->tilecfg_raw_regnum < 0)
+	{
+	  tdep->tilecfg_raw_register_names = i386_tilecfg_raw_names;
+	  tdep->tilecfg_raw_regnum = I386_AMX_TILECFG_RAW_REGNUM;
+	  tdep->num_tilecfg_raw_regs = 1;
+	}
+
+      valid_p &= tdesc_numbered_register (feature_amx, tdesc_data,
+					  tdep->tilecfg_raw_regnum,
+					  tdep->tilecfg_raw_register_names[0]);
+
+      tdep->xcr0 |= X86_XSTATE_TILEDATA;
+
+      if (tdep->tiledata_regnum < 0)
+	{
+	  tdep->tiledata_register_names = i386_tiledata_names;
+	  tdep->tiledata_regnum = I386_AMX_TILEDATA_REGNUM;
+	  tdep->num_tiledata_regs = 1;
+	}
+
+      valid_p &= tdesc_numbered_register (feature_amx, tdesc_data,
+					  tdep->tiledata_regnum,
+					  tdep->tiledata_register_names[0]);
+    }
+
   return valid_p;
 }
 
@@ -8683,6 +8943,16 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   /* No segment base registers.  */
   tdep->fsbase_regnum = -1;
 
+  /* No AMX registers.  */
+  tdep->tilecfg_regnum = -1;
+  tdep->num_tilecfg_regs = 0;
+  tdep->tilecfg_raw_regnum = -1;
+  tdep->num_tilecfg_raw_regs = 0;
+  tdep->tmm_regnum = -1;
+  tdep->num_tmm_regs = 0;
+  tdep->tiledata_regnum = -1;
+  tdep->num_tiledata_regs = 0;
+
   tdesc_arch_data_up tdesc_data = tdesc_data_alloc ();
 
   set_gdbarch_relocate_instruction (gdbarch, i386_relocate_instruction);
@@ -8717,7 +8987,9 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
 					 + tdep->num_ymm_regs
 					 + num_bnd_cooked
 					 + tdep->num_ymm_avx512_regs
-					 + tdep->num_zmm_regs));
+					 + tdep->num_zmm_regs
+					 + tdep->num_tmm_regs
+					 + tdep->num_tilecfg_regs));
 
   /* Target description may be changed.  */
   tdesc = tdep->tdesc;
@@ -8769,6 +9041,24 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   else
     tdep->zmm0_regnum = -1;
 
+  if (tdep->num_tmm_regs != 0)
+    {
+      /* Support TMM pseudo-register if it is available.  */
+      tdep->tmm_regnum = mm0_regnum;
+      mm0_regnum += tdep->num_tmm_regs;
+    }
+  else
+    tdep->num_tmm_regs = -1;
+
+  if (tdep->num_tilecfg_regs != 0)
+    {
+      /* Support TMM pseudo-register if it is available.  */
+      tdep->tilecfg_regnum = mm0_regnum;
+      mm0_regnum += tdep->num_tilecfg_regs;
+    }
+  else
+    tdep->num_tmm_regs = -1;
+
   bnd0_regnum = mm0_regnum;
   if (tdep->num_mmx_regs != 0)
     {
diff --git a/gdb/i386-tdep.h b/gdb/i386-tdep.h
index a8067cf6b6c..d1162d866a1 100644
--- a/gdb/i386-tdep.h
+++ b/gdb/i386-tdep.h
@@ -23,6 +23,8 @@
 #include "gdbarch.h"
 #include "infrun.h"
 #include "expression.h"
+#include <vector>
+#include <utility>
 
 struct frame_info;
 struct gdbarch;
@@ -202,6 +204,28 @@ struct i386_gdbarch_tdep : gdbarch_tdep
   /* PKEYS register names.  */
   const char * const *pkeys_register_names = nullptr;
 
+  /* Register number for AMX tilecfg register, including pseudo register.  */
+  int tilecfg_regnum = 0;
+  int tilecfg_raw_regnum = 0;
+
+  /* Number of tilecfg registers, including pseudo register.  */
+  int num_tilecfg_regs = 0;
+  int num_tilecfg_raw_regs = 0;
+
+  /* Register number for AMX tmm register, including pseudo registers.  */
+  int tmm_regnum = 0;
+  int tiledata_regnum = 0;
+
+  /* Number of AMX tmm registers, including pseudo registers.  */
+  int num_tmm_regs = 0;
+  int num_tiledata_regs = 0;
+
+  /* AMX register names.  */
+  const char * const *tilecfg_raw_register_names = nullptr;
+  const char * const *tilecfg_register_names = nullptr;
+  const char * const *tmm_register_names = nullptr;
+  const char * const *tiledata_register_names = nullptr;
+
   /* Register number for %fsbase.  Set this to -1 to indicate the
      absence of segment base registers.  */
   int fsbase_regnum = 0;
@@ -241,6 +265,8 @@ struct i386_gdbarch_tdep : gdbarch_tdep
   struct type *i386_mmx_type = nullptr;
   struct type *i386_ymm_type = nullptr;
   struct type *i386_zmm_type = nullptr;
+  struct type *i386_tmm_type = nullptr;
+  struct type *i386_tilecfg_type = nullptr;
   struct type *i387_ext_type = nullptr;
   struct type *i386_bnd_type = nullptr;
 
@@ -303,6 +329,8 @@ enum i386_regnum
   I386_ZMM0H_REGNUM,		/* %zmm0h */
   I386_ZMM7H_REGNUM = I386_ZMM0H_REGNUM + 7,
   I386_PKRU_REGNUM,
+  I386_AMX_TILECFG_RAW_REGNUM,
+  I386_AMX_TILEDATA_REGNUM,
   I386_FSBASE_REGNUM,
   I386_GSBASE_REGNUM
 };
@@ -348,7 +376,7 @@ enum record_i386_regnum
 #define I386_NUM_REGS		(I386_GSBASE_REGNUM + 1)
 
 /* Size of the largest register.  */
-#define I386_MAX_REGISTER_SIZE	64
+#define I386_MAX_REGISTER_SIZE	8192
 
 /* Types for i386-specific registers.  */
 extern struct type *i387_ext_type (struct gdbarch *gdbarch);
@@ -366,6 +394,8 @@ extern int i386_k_regnum_p (struct gdbarch *gdbarch, int regnum);
 extern int i386_zmm_regnum_p (struct gdbarch *gdbarch, int regnum);
 extern int i386_zmmh_regnum_p (struct gdbarch *gdbarch, int regnum);
 extern bool i386_pkru_regnum_p (struct gdbarch *gdbarch, int regnum);
+extern bool i386_tilecfg_regnum_p (struct gdbarch *gdbarch, int regnum);
+extern bool i386_tmm_regnum_p (struct gdbarch *gdbarch, int regnum);
 
 extern const char *i386_pseudo_register_name (struct gdbarch *gdbarch,
 					      int regnum);
@@ -485,4 +515,94 @@ extern int i386_stap_is_single_operand (struct gdbarch *gdbarch,
 extern expr::operation_up i386_stap_parse_special_token
      (struct gdbarch *gdbarch, struct stap_parse_info *p);
 
+/* AMX utilities.  */
+
+/* TILECFG register.
+   0       palette
+   1       start_row
+   2-15    reserved, must be zero
+   16-17   tile0.colsb Tile 0 bytes per row.
+   18-19   tile1.colsb Tile 1 bytes per row.
+   20-21   tile2.colsb Tile 2 bytes per row.
+   ...     (sequence continues)
+   30-31   tile7.colsb Tile 7 bytes per row.
+   32-47   reserved, must be zero
+   48      tile0.rows Tile 0 rows.
+   49      tile1.rows Tile 1 rows.
+   50      tile2.rows Tile 2 rows.
+   ...     (sequence continues)
+   55      tile7.rows Tile 7 rows.
+   56-63   reserved, must be zero.  */
+
+/* TILECFG class representing the AMX Tilecfg register.  */
+
+class tilecfg_reg
+{
+public:
+  tilecfg_reg ()
+      : columns_and_rows (
+	  std::vector<std::pair<uint16_t, uint8_t>> (MAX_NAMES, { 0, 0 }))
+  {
+  }
+
+  /* Construct it from raw tilecfg data.  */
+  explicit tilecfg_reg (uint8_t *raw_tilecfg);
+
+  ~tilecfg_reg () noexcept = default;
+  tilecfg_reg (const tilecfg_reg &t) = default;
+  tilecfg_reg (tilecfg_reg &&t) noexcept = default;
+
+  tilecfg_reg &operator= (tilecfg_reg &&t) noexcept = default;
+
+  /* Get number of configured bytes per row for tile p.  */
+  inline uint16_t
+  bytes_per_row (uint8_t p) const
+  {
+    gdb_assert (columns_and_rows.size () > p);
+    return columns_and_rows[p].first;
+  }
+
+  /* Get number of configured rows for tile p.  */
+  inline uint8_t
+  rows (uint8_t p) const
+  {
+    gdb_assert (columns_and_rows.size () > p);
+    return columns_and_rows[p].second;
+  }
+
+  bool
+  operator== (const tilecfg_reg &t) const
+  {
+    return palette == t.palette && start_row == t.start_row
+	   && columns_and_rows == t.columns_and_rows;
+  }
+
+  bool
+  operator!= (const tilecfg_reg &t) const
+  {
+    return !(*this == t);
+  }
+
+  /* Offsets for reading from TILEDATA.  */
+  static const uint16_t COLUMN_MEMORY_OFFSET = 16;
+  static const uint16_t ROW_MEMORY_OFFSET = 48;
+
+  /* Maximum possible values for the current target.  */
+  static const uint16_t MAX_PALETTE = 1;
+  static const uint16_t MAX_NAMES = 8;
+  static const uint16_t MAX_ROWS = 16;
+  static const uint16_t MAX_BYTES_PER_ROW = 64;
+  static const uint16_t MAX_BYTES_PER_TILE = 1024;
+
+  /* Palette id entry.  */
+  uint8_t palette = 0;
+
+  /* start_row entry.  */
+  uint8_t start_row = 0;
+
+private:
+  /* This stores the colsb and rows entries.  */
+  std::vector<std::pair<uint16_t, uint8_t>> columns_and_rows;
+};
+
 #endif /* i386-tdep.h */
diff --git a/gdb/i387-tdep.c b/gdb/i387-tdep.c
index f056ea59347..38ffa3f967b 100644
--- a/gdb/i387-tdep.c
+++ b/gdb/i387-tdep.c
@@ -897,6 +897,21 @@ static int xsave_pkeys_offset[] =
 #define XSAVE_PKEYS_ADDR(tdep, xsave, regnum) \
   (xsave + xsave_pkeys_offset[regnum - I387_PKRU_REGNUM (tdep)])
 
+static int xsave_tilecfg_raw_offset[] =
+{
+  2752 + 0 * 64		/* tilecfg.  */
+};
+
+#define XSAVE_TILECFG_RAW_ADDR(tdep, xsave, regnum) \
+  (xsave + xsave_tilecfg_raw_offset[regnum - I387_TILECFG_RAW_REGNUM (tdep)])
+
+static int xsave_tiledata_offset[] =
+{
+  2816 + 0 * 8192	/* tiledata.  */
+};
+
+#define XSAVE_TILEDATA_ADDR(tdep, xsave, regnum) \
+  (xsave + xsave_tiledata_offset[regnum - I387_TILEDATA_REGNUM (tdep)])
 
 /* Extract from XSAVE a bitset of the features that are available on the
    target, but which have not yet been enabled.  */
@@ -949,8 +964,11 @@ i387_supply_xsave (struct regcache *regcache, int regnum,
       avx512_ymmh_avx512 = 0x40,
       avx512_xmm_avx512 = 0x80,
       pkeys = 0x100,
+      tilecfg = 0x200,
+      tiledata = 0x400,
       all = x87 | sse | avxh | mpx | avx512_k | avx512_zmm_h
-	    | avx512_ymmh_avx512 | avx512_xmm_avx512 | pkeys
+	    | avx512_ymmh_avx512 | avx512_xmm_avx512 | pkeys | tilecfg
+	    | tiledata
     } regclass;
 
   gdb_assert (regs != NULL);
@@ -959,6 +977,10 @@ i387_supply_xsave (struct regcache *regcache, int regnum,
 
   if (regnum == -1)
     regclass = all;
+  else if (regnum == I387_TILECFG_RAW_REGNUM (tdep))
+    regclass = tilecfg;
+  else if (regnum == I387_TILEDATA_REGNUM (tdep))
+    regclass = tiledata;
   else if (regnum >= I387_PKRU_REGNUM (tdep)
 	   && regnum < I387_PKEYSEND_REGNUM (tdep))
     regclass = pkeys;
@@ -1005,6 +1027,26 @@ i387_supply_xsave (struct regcache *regcache, int regnum,
     case none:
       break;
 
+    case tilecfg:
+      if ((clear_bv & X86_XSTATE_TILECFG))
+	regcache->raw_supply (regnum, zero);
+      else
+	{
+	  regcache->raw_supply (regnum,
+				XSAVE_TILECFG_RAW_ADDR (tdep, regs, regnum));
+	}
+      return;
+
+    case tiledata:
+      if ((clear_bv & X86_XSTATE_TILEDATA))
+	regcache->raw_supply (regnum, zero);
+      else
+	{
+	  regcache->raw_supply (regnum,
+				XSAVE_TILEDATA_ADDR (tdep, regs, regnum));
+	}
+      return;
+
     case pkeys:
       if ((clear_bv & X86_XSTATE_PKRU))
 	regcache->raw_supply (regnum, zero);
@@ -1177,6 +1219,32 @@ i387_supply_xsave (struct regcache *regcache, int regnum,
 	    }
 	}
 
+      /* Handle the tilecfg register.  */
+      if ((tdep->xcr0 & X86_XSTATE_TILECFG) != 0)
+	{
+	  if ((clear_bv & X86_XSTATE_TILECFG) != 0)
+	    regcache->raw_supply (I387_TILECFG_RAW_REGNUM (tdep), zero);
+	  else
+	    {
+	      i = I387_TILECFG_RAW_REGNUM (tdep);
+	      regcache->raw_supply (i,
+				    XSAVE_TILECFG_RAW_ADDR (tdep, regs, i));
+	    }
+	}
+
+      /* Handle the tiledata register.  */
+      if ((tdep->xcr0 & X86_XSTATE_TILEDATA) != 0)
+	{
+	  if ((clear_bv & X86_XSTATE_TILEDATA) != 0)
+	    regcache->raw_supply (I387_TILEDATA_REGNUM (tdep), zero);
+	  else
+	    {
+	      i = I387_TILEDATA_REGNUM (tdep);
+	      regcache->raw_supply (i,
+				    XSAVE_TILEDATA_ADDR (tdep, regs, i));
+	    }
+	}
+
       /* Handle the MPX registers.  */
       if ((tdep->xcr0 & X86_XSTATE_BNDREGS))
 	{
@@ -1369,8 +1437,11 @@ i387_collect_xsave (const struct regcache *regcache, int regnum,
       avx512_ymmh_avx512 = 0x80,
       avx512_xmm_avx512 = 0x100,
       pkeys = 0x200,
+      tilecfg = 0x400,
+      tiledata = 0x800,
       all = x87 | sse | avxh | mpx | avx512_k | avx512_zmm_h
-	    | avx512_ymmh_avx512 | avx512_xmm_avx512 | pkeys
+	    | avx512_ymmh_avx512 | avx512_xmm_avx512 | pkeys | tilecfg
+	    | tiledata
     } regclass;
 
   gdb_assert (tdep->st0_regnum >= I386_ST0_REGNUM);
@@ -1378,6 +1449,10 @@ i387_collect_xsave (const struct regcache *regcache, int regnum,
 
   if (regnum == -1)
     regclass = all;
+  else if (regnum == I387_TILECFG_RAW_REGNUM (tdep))
+    regclass = tilecfg;
+  else if (regnum == I387_TILEDATA_REGNUM (tdep))
+    regclass = tiledata;
   else if (regnum >= I387_PKRU_REGNUM (tdep)
 	   && regnum < I387_PKEYSEND_REGNUM (tdep))
     regclass = pkeys;
@@ -1442,6 +1517,18 @@ i387_collect_xsave (const struct regcache *regcache, int regnum,
      seem justified at this point.  */
   if (clear_bv)
     {
+      if ((clear_bv & X86_XSTATE_TILECFG))
+	{
+	  i = I387_TILECFG_RAW_REGNUM (tdep);
+	  memset (XSAVE_TILECFG_RAW_ADDR (tdep, regs, i), 0, 64);
+	}
+
+      if ((clear_bv & X86_XSTATE_TILEDATA))
+	{
+	  i = I387_TILEDATA_REGNUM (tdep);
+	  memset (XSAVE_TILEDATA_ADDR (tdep, regs, i), 0, 8192);
+	}
+
       if ((clear_bv & X86_XSTATE_PKRU))
 	for (i = I387_PKRU_REGNUM (tdep);
 	     i < I387_PKEYSEND_REGNUM (tdep); i++)
@@ -1517,6 +1604,32 @@ i387_collect_xsave (const struct regcache *regcache, int regnum,
 
   if (regclass == all)
     {
+      /* Check if the tilecfg register is changed.  */
+      if ((tdep->xcr0 & X86_XSTATE_TILECFG))
+	{
+	  i = I387_TILECFG_RAW_REGNUM (tdep);
+	  regcache->raw_collect (i, raw);
+	  p = XSAVE_TILECFG_RAW_ADDR (tdep, regs, i);
+	  if (memcmp (raw, p, 64) != 0)
+	    {
+	      xstate_bv |= X86_XSTATE_TILECFG;
+	      memcpy (p, raw, 64);
+	    }
+	}
+
+      /* Check if the tiledata register is changed.  */
+      if ((tdep->xcr0 & X86_XSTATE_TILEDATA))
+	{
+	  i = I387_TILEDATA_REGNUM (tdep);
+	  regcache->raw_collect (i, raw);
+	  p = XSAVE_TILEDATA_ADDR (tdep, regs, i);
+	  if (memcmp (raw, p, 8192) != 0)
+	    {
+	      xstate_bv |= X86_XSTATE_TILEDATA;
+	      memcpy (p, raw, 8192);
+	    }
+	}
+
       /* Check if any PKEYS registers are changed.  */
       if ((tdep->xcr0 & X86_XSTATE_PKRU))
 	for (i = I387_PKRU_REGNUM (tdep);
@@ -1686,6 +1799,26 @@ i387_collect_xsave (const struct regcache *regcache, int regnum,
 	  internal_error (__FILE__, __LINE__,
 			  _("invalid i387 regclass"));
 
+	case tilecfg:
+	  /* This is a tilecfg register.  */
+	  p = XSAVE_TILECFG_RAW_ADDR (tdep, regs, regnum);
+	  if (memcmp (raw, p, 64) != 0)
+	    {
+	      xstate_bv |= X86_XSTATE_TILECFG;
+	      memcpy (p, raw, 64);
+	    }
+	  break;
+
+	case tiledata:
+	  /* This is a tiledata register.  */
+	  p = XSAVE_TILEDATA_ADDR (tdep, regs, regnum);
+	  if (memcmp (raw, p, 8192) != 0)
+	    {
+	      xstate_bv |= X86_XSTATE_TILEDATA;
+	      memcpy (p, raw, 8192);
+	    }
+	  break;
+
 	case pkeys:
 	  /* This is a PKEYS register.  */
 	  p = XSAVE_PKEYS_ADDR (tdep, regs, regnum);
diff --git a/gdb/i387-tdep.h b/gdb/i387-tdep.h
index 698ff2ee206..c9af33fbba4 100644
--- a/gdb/i387-tdep.h
+++ b/gdb/i387-tdep.h
@@ -45,6 +45,10 @@ struct ui_file;
 #define I387_NUM_MPX_CTRL_REGS 2
 #define I387_NUM_K_REGS 8
 #define I387_NUM_PKEYS_REGS 1
+#define I387_NUM_TILECFG_REGS 1
+#define I387_NUM_TILECFG_RAW_REGS 1
+#define I387_NUM_TILEDATA_REGS 1
+#define I387_NUM_TMM_REGS 8
 
 #define I387_PKRU_REGNUM(tdep) ((tdep)->pkru_regnum)
 #define I387_K0_REGNUM(tdep) ((tdep)->k0_regnum)
@@ -52,6 +56,10 @@ struct ui_file;
 #define I387_ZMM0H_REGNUM(tdep) ((tdep)->zmm0h_regnum)
 #define I387_NUM_YMM_AVX512_REGS(tdep) ((tdep)->num_ymm_avx512_regs)
 #define I387_YMM16H_REGNUM(tdep) ((tdep)->ymm16h_regnum)
+#define I387_TILECFG_REGNUM(tdep) ((tdep)->tilecfg_regnum)
+#define I387_TILECFG_RAW_REGNUM(tdep) ((tdep)->tilecfg_raw_regnum)
+#define I387_TMM_REGNUM(tdep) ((tdep)->tmm_regnum)
+#define I387_TILEDATA_REGNUM(tdep) ((tdep)->tiledata_regnum)
 
 #define I387_FCTRL_REGNUM(tdep) (I387_ST0_REGNUM (tdep) + 8)
 #define I387_FSTAT_REGNUM(tdep) (I387_FCTRL_REGNUM (tdep) + 1)
diff --git a/gdb/testsuite/gdb.arch/amd64-amx-corefile.exp b/gdb/testsuite/gdb.arch/amd64-amx-corefile.exp
new file mode 100644
index 00000000000..750aad9c5d0
--- /dev/null
+++ b/gdb/testsuite/gdb.arch/amd64-amx-corefile.exp
@@ -0,0 +1,113 @@
+# Copyright 2022 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# Test AMX in core dumps.
+
+if { [skip_amx_tests] } {
+    unsupported "Target does not support AMX."
+    return -1
+}
+
+standard_testfile amd64-amx.c
+set gcorefile ${binfile}.gcore
+
+if { [prepare_for_testing "failed to prepare" ${testfile} ${srcfile} debug] } {
+    return -1
+}
+
+if { ![runto_main] } {
+    untested "could not run to main"
+    return -1
+}
+
+proc test_zeroed_tile {reg} {
+    gdb_test "print $reg.m_int8" \
+	"= \\{\\{0 <repeats 64 times>\\} <repeats 16 times>\\}"
+}
+
+set line1 [gdb_get_line_number "BP1"]
+gdb_breakpoint $line1
+gdb_continue_to_breakpoint "line1" ".*$srcfile:$line1.*"
+
+# Other corefile tests check and save the variables here to use them again
+# when the corefile is loaded.  Due to the complexity of the tiles, this
+# is not done here.
+
+if { ![gdb_gcore_cmd $gcorefile "save a corefile"] } {
+    return -1
+}
+
+# Now restart gdb and load the corefile.
+clean_restart ${binfile}
+gdb_test "core ${gcorefile}" \
+    "Core was generated by .*" "re-load generated corefile"
+
+gdb_test "print/x \$tilecfg_raw" \
+	"= 0x202020203020{43}80008000800100010000c0{31}1"
+gdb_test "print \$tilecfg" \
+    [join {"= \\{palette = 1" \
+	"start_row = 0" \
+	"tile0_colsb = 12" \
+	"tile1_colsb = 16" \
+	"tile2_colsb = 16" \
+	"tile3_colsb = 8" \
+	"tile4_colsb = 8" \
+	"tile5_colsb = 8" \
+	"tile6_colsb = 0" \
+	"tile7_colsb = 0" \
+	"tile0_rows = 2" \
+	"tile1_rows = 3" \
+	"tile2_rows = 2" \
+	"tile3_rows = 2" \
+	"tile4_rows = 2" \
+	"tile5_rows = 2" \
+	"tile6_rows = 0" \
+	"tile7_rows = 0\\}"} \
+	", "]
+
+gdb_test "print \$tmm0.m_uint8" \
+    [join {"= \\{\\{0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2\\}" \
+	"\\{1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3\\}\\}"} \
+	", "]
+
+gdb_test "print \$tmm1.m_uint8" \
+    [join {"= \\{\\{0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3\\}" \
+	"\\{1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4\\}" \
+	"\\{2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5\\}\\}"} \
+	", "]
+
+gdb_test "print \$tmm2.m_int32" \
+    [join {"= \\{\\{20, 32, 44, 56\\}" \
+	"\\{32, 56, 80, 104\\}\\}"} \
+	", "]
+
+gdb_test "print \$tmm3.m_bf16" \
+    [join {"= \\{\\{0, 0.125, 0.25, 0.375\\}" \
+	"\\{0.5, 0.625, 0.75, 0.875\\}\\}"} \
+	", "]
+
+gdb_test "print \$tmm4.m_fp32" \
+    [join {"= \\{\\{1, 1.125\\}" \
+	"\\{1.25, 1.375\\}\\}"} \
+	", "]
+
+gdb_test "print \$tmm5.m_int8" \
+    [join {"= \\{\\{-1, -1, -1, -1, 1, 1, 1, 1\\}" \
+	"\\{1, 1, 1, 1, -5, -5, -5, -5\\}\\}"} \
+	", "]
+
+for {set i 6} {$i < 8} {incr i} {
+    test_zeroed_tile "\$tmm$i"
+}
diff --git a/gdb/testsuite/gdb.arch/amd64-amx.c b/gdb/testsuite/gdb.arch/amd64-amx.c
new file mode 100644
index 00000000000..1926176e9e4
--- /dev/null
+++ b/gdb/testsuite/gdb.arch/amd64-amx.c
@@ -0,0 +1,173 @@
+/* Test program for AMX registers.
+
+   Copyright 2022 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include <immintrin.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <stdlib.h>
+
+#include <asm/prctl.h>
+#include <sys/syscall.h>
+#include <unistd.h>
+
+#define XFEATURE_XTILEDATA 18
+#define ARCH_REQ_XCOMP_PERM 0x1023
+
+#define TILE int
+
+#define N1 2
+#define K1 3
+#define M1 4
+
+#define N2 1
+#define K2 2
+#define M2 3
+
+uint8_t memA1[N1][4 * K1] = { { 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2 },
+			      { 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3 } };
+
+uint8_t memB1[K1][4 * M1] = {
+  { 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3 },
+  { 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4 },
+  { 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5 },
+};
+
+uint32_t memC1[N1][M1] = { 0 };
+
+uint8_t memA2[N2][4 * K2] = { { 5, 5, 5, 5, 6, 6, 6, 6 } };
+
+uint8_t memB2[K2][4 * M2] = { { 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2 },
+			      { 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3 } };
+
+uint32_t memC2[N2][M2] = { 0 };
+
+
+/* Data for type testing.  */
+
+int8_t int8_matrix[2][8] = { { -1, -1, -1, -1, 1, 1, 1, 1 },
+			     { 1, 1, 1, 1, -5, -5, -5, -5 } };
+
+float fp32_matrix[2][2] = { { 1.0, 1.125 },
+			    { 1.25, 1.375 } };
+
+
+/* This is the bf16 matrix.  But as bf16 is not really a valid data type
+in most/any compilers, and as using a library also isn't a good option,
+the bytes for this are calculated manually.  This needs to take endianess into
+account.
+
+_bfloat16 bf16_matrix[2][2 * 2] =
+{ { 0.0, 0.125, 0.25, 0.375 },
+  { 0.5, 0.625, 0.75, 0.875 }
+};
+
+_bfloat16 bf16_binary_matrix[2][2 * 2] =
+{
+  { 0000000000000000, 0000000000111110, 0011111010000000, 1100000000111110 },
+  { 0000000000111111, 0010000000111111, 0100000000111111, 0110000000111111 }
+};
+
+uint8_t bf16_binary_matrix[2][2 * 4] =
+{
+  { 00000000, 00000000, 00000000, 00111110, 10000000, 00111110, 11000000, 00111110 },
+  { 00000000, 00111111, 00100000, 00111111, 01000000, 00111111, 01100000, 00111111 }
+};
+*/
+
+uint8_t bf16_matrix[2][2 * 4] = { { 0, 0, 0, 62, 128, 62, 192, 62 },
+				  { 0, 63, 32, 63, 64, 63, 96, 63 } };
+
+void
+tfmaps_calc (int whichMatrix, int N, int K, int M)
+{
+  int strideA = 4 * K;
+  int strideB = 4 * M;
+  int strideC = 4 * M;
+
+  /* Configure.  */
+  struct tileconfig_t
+  {
+    uint8_t palette_id;
+    uint8_t startRow;
+    uint8_t reserved[14];
+    uint16_t cols[16];
+    uint8_t rows[16];
+  };
+
+  struct tileconfig_t tc = { 1 };
+
+  const TILE A = 0;
+  const TILE B = 1;
+  const TILE C = 2;
+
+  tc.rows[A] = N;
+  tc.cols[A] = K * 4;
+  tc.rows[B] = K;
+  tc.cols[B] = M * 4;
+  tc.rows[C] = N;
+  tc.cols[C] = M * 4;
+
+  /* Compute.  */
+  if (whichMatrix == 1)
+    {
+      tc.rows[3] = 2;
+      tc.cols[3] = 8;
+      tc.rows[4] = 2;
+      tc.cols[4] = 8;
+      tc.rows[5] = 2;
+      tc.cols[5] = 8;
+
+      _tile_loadconfig (&tc);
+
+      /* Load additional types for type testing.  */
+      _tile_loadd (3, bf16_matrix, 4 * 2);
+      _tile_loadd (4, fp32_matrix, 2 * 4);
+      _tile_loadd (5, int8_matrix, 4 * 2);
+
+     /* Computation.  */
+      _tile_loadd (A, memA1, strideA);
+      _tile_loadd (B, memB1, strideB);
+      _tile_dpbuud (C, A, B);
+      _tile_stored (C, memC1, strideC); /* BP1.  */
+    }
+  else
+    {
+      _tile_loadconfig (&tc);
+      _tile_loadd (A, memA2, strideA);
+      _tile_loadd (B, memB2, strideB);
+      _tile_dpbuud (C, A, B);
+      _tile_stored (C, memC2, strideC); /* BP2.  */
+    }
+
+  _tile_release (); /* BP3.  */
+}
+
+
+int
+main (int argc, char **argv)
+{
+  /* Ask the OS to configure AMX in xsave.  */
+  if (syscall (SYS_arch_prctl, ARCH_REQ_XCOMP_PERM, XFEATURE_XTILEDATA) != 0)
+    return -1;
+
+  tfmaps_calc (1, N1, K1, M1);
+  tfmaps_calc (2, N2, K2, M2);
+
+  return 0;
+}
diff --git a/gdb/testsuite/gdb.arch/amd64-amx.exp b/gdb/testsuite/gdb.arch/amd64-amx.exp
new file mode 100755
index 00000000000..ab0fc42ad8f
--- /dev/null
+++ b/gdb/testsuite/gdb.arch/amd64-amx.exp
@@ -0,0 +1,231 @@
+# Copyright 2022 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# This file tests basic AMX functionality.
+
+if { [skip_amx_tests] } {
+    unsupported "Target does not support AMX."
+    return -1
+}
+
+standard_testfile
+
+if { [prepare_for_testing "failed to prepare" ${testfile} ${srcfile} debug] } {
+    return -1
+}
+
+if { ![runto_main] } {
+    untested "could not run to main"
+    return -1
+}
+
+proc test_zeroed_tile {reg} {
+    gdb_test "print $reg.m_int8" \
+	"= \\{\\{0 <repeats 64 times>\\} <repeats 16 times>\\}"
+}
+
+set line1 [gdb_get_line_number "BP1"]
+set line2 [gdb_get_line_number "BP2"]
+gdb_breakpoint $line1
+gdb_breakpoint $line2
+
+# Registers should be displayed as zeroed before AMX enablement.
+with_test_prefix "Before AMX is enabled" {
+    gdb_test "print \$tilecfg_raw" "= 0"
+    for {set i 0} {$i < 8} {incr i} {
+	test_zeroed_tile "\$tmm$i"
+    }
+}
+
+
+# First matrix multiplication: matC1 = matA1 x matB1.
+gdb_continue_to_breakpoint "line1" ".*$srcfile:$line1.*"
+
+with_test_prefix "matC1" {
+    gdb_test "print/x \$tilecfg_raw" \
+	"= 0x202020203020{43}80008000800100010000c0{31}1"
+    gdb_test "print \$tilecfg" \
+	[join {"= \\{palette = 1" \
+	    "start_row = 0" \
+	    "tile0_colsb = 12" \
+	    "tile1_colsb = 16" \
+	    "tile2_colsb = 16" \
+	    "tile3_colsb = 8" \
+	    "tile4_colsb = 8" \
+	    "tile5_colsb = 8" \
+	    "tile6_colsb = 0" \
+	    "tile7_colsb = 0" \
+	    "tile0_rows = 2" \
+	    "tile1_rows = 3" \
+	    "tile2_rows = 2" \
+	    "tile3_rows = 2" \
+	    "tile4_rows = 2" \
+	    "tile5_rows = 2" \
+	    "tile6_rows = 0" \
+	    "tile7_rows = 0\\}"} \
+	    ", "]
+
+    gdb_test "print \$tmm0.m_uint8" \
+	[join {"= \\{\\{0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2\\}" \
+	    "\\{1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3\\}\\}"} \
+	    ", "]
+
+    gdb_test "print \$tmm1.m_uint8" \
+	[join {"= \\{\\{0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3\\}" \
+	    "\\{1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4\\}" \
+	    "\\{2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5\\}\\}"} \
+	    ", "]
+
+    gdb_test "print \$tmm2.m_int32" \
+	"= \\{\\{20, 32, 44, 56\\}, \\{32, 56, 80, 104\\}\\}"
+
+    gdb_test "print \$tmm3.m_bf16" \
+	[join {"= \\{\\{0, 0.125, 0.25, 0.375\\}" \
+	    "\\{0.5, 0.625, 0.75, 0.875\\}\\}"} \
+	    ", "]
+
+    gdb_test "print \$tmm4.m_fp32" "= \\{\\{1, 1.125\\}, \\{1.25, 1.375\\}\\}"
+
+    gdb_test "print \$tmm5.m_int8" \
+	[join {"= \\{\\{-1, -1, -1, -1, 1, 1, 1, 1\\}" \
+	    "\\{1, 1, 1, 1, -5, -5, -5, -5\\}\\}"} \
+	    ", "]
+
+    for {set i 6} {$i < 8} {incr i} {
+	test_zeroed_tile "\$tmm$i"
+    }
+}
+
+
+# Second matrix multiplication: matC2 = matA2 x matB2.
+gdb_continue_to_breakpoint "line2" ".*$srcfile:$line2.*"
+
+with_test_prefix "matC2" {
+    gdb_test "print/x \$tilecfg_raw" "= 0x102010{55}c000c00080{31}1"
+    gdb_test "print \$tilecfg" \
+	[join {"= \\{palette = 1" \
+	    "start_row = 0" \
+	    "tile0_colsb = 8" \
+	    "tile1_colsb = 12" \
+	    "tile2_colsb = 12" \
+	    "tile3_colsb = 0" \
+	    "tile4_colsb = 0" \
+	    "tile5_colsb = 0" \
+	    "tile6_colsb = 0" \
+	    "tile7_colsb = 0" \
+	    "tile0_rows = 1" \
+	    "tile1_rows = 2" \
+	    "tile2_rows = 1" \
+	    "tile3_rows = 0" \
+	    "tile4_rows = 0" \
+	    "tile5_rows = 0" \
+	    "tile6_rows = 0" \
+	    "tile7_rows = 0\\}"} \
+	    ", "]
+
+    gdb_test "print \$tmm0.m_int8" "= \\{\\{5, 5, 5, 5, 6, 6, 6, 6\\}\\}"
+
+    gdb_test "print \$tmm1.m_int8" \
+	[join {"= \\{\\{0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2\\}" \
+	    "\\{1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3\\}\\}"} \
+	    ", "]
+
+    gdb_test "print \$tmm2.m_int32" "= \\{\\{24, 68, 112\\}\\}"
+
+    for {set i 3} {$i < 8} {incr i} {
+	test_zeroed_tile "\$tmm$i"
+    }
+}
+
+
+# Test setting tiles.
+with_test_prefix "setting tiles" {
+    gdb_test_no_output "set \$tmm0.m_uint8\[0\]\[0\] = 1"
+    gdb_test "print \$tmm0.m_uint8" "= \\{\\{1, 5, 5, 5, 6, 6, 6, 6\\}\\}"
+
+    gdb_test_no_output "set \$tmm2.m_int32\[0\] = {1, 1, 1}"
+    gdb_test "print \$tmm2.m_int32" "= \\{\\{1, 1, 1\\}\\}"
+
+    gdb_test_no_output "set \$tmm0.m_bf16\[0\]\[0\] = 0.5"
+    gdb_test "print \$tmm0.m_bf16\[0\]\[0\]" "= 0.5"
+
+    gdb_test_no_output "set \$tmm0.m_fp32\[0\]\[0\] = 0.75"
+    gdb_test "print \$tmm0.m_fp32\[0\]\[0\]" "= 0.75"
+
+    gdb_test_no_output "set \$tmm0.m_int8\[0\]\[0\] = -1"
+    gdb_test "print \$tmm0.m_int8\[0\]\[0\]" "= -1"
+}
+
+set line3 [gdb_get_line_number "BP3"]
+gdb_breakpoint $line3
+gdb_continue_to_breakpoint "line3" ".*$srcfile:$line3.*"
+
+# Tilecfg modifications can lead to exceptions.  Hence, we wait with
+# testing it until after we are done with AMX computations.
+with_test_prefix "set tilecfg raw" {
+    gdb_test_no_output "set \$tilecfg_raw = 0x1"
+    gdb_test "print/x \$tilecfg_raw" "= 0x1"
+    gdb_test "print \$tilecfg" \
+	[join {"= \\{palette = 1" \
+	    "start_row = 0" \
+	    "tile0_colsb = 0" \
+	    "tile1_colsb = 0" \
+	    "tile2_colsb = 0" \
+	    "tile3_colsb = 0" \
+	    "tile4_colsb = 0" \
+	    "tile5_colsb = 0" \
+	    "tile6_colsb = 0" \
+	    "tile7_colsb = 0" \
+	    "tile0_rows = 0" \
+	    "tile1_rows = 0" \
+	    "tile2_rows = 0" \
+	    "tile3_rows = 0" \
+	    "tile4_rows = 0" \
+	    "tile5_rows = 0" \
+	    "tile6_rows = 0" \
+	    "tile7_rows = 0\\}"} \
+	    ", "]
+}
+
+with_test_prefix "set tilecfg" {
+    gdb_test_no_output "set \$tilecfg.palette = 0x2"
+    gdb_test_no_output "set \$tilecfg.start_row = 0x3"
+    gdb_test_no_output "set \$tilecfg.tile0_rows = 0x4"
+
+    gdb_test "print/x \$tilecfg_raw" "= 0x40{93}302"
+    gdb_test "print \$tilecfg" \
+	[join {"= \\{palette = 2" \
+	    "start_row = 3" \
+	    "tile0_colsb = 0" \
+	    "tile1_colsb = 0" \
+	    "tile2_colsb = 0" \
+	    "tile3_colsb = 0" \
+	    "tile4_colsb = 0" \
+	    "tile5_colsb = 0" \
+	    "tile6_colsb = 0" \
+	    "tile7_colsb = 0" \
+	    "tile0_rows = 4" \
+	    "tile1_rows = 0" \
+	    "tile2_rows = 0" \
+	    "tile3_rows = 0" \
+	    "tile4_rows = 0" \
+	    "tile5_rows = 0" \
+	    "tile6_rows = 0" \
+	    "tile7_rows = 0\\}"} \
+	    ", "]
+}
+
+gdb_test "continue" \
+    ".*\\\[Inferior $decimal \\\(process $decimal\\\) exited normally\\]"
diff --git a/gdb/testsuite/lib/gdb.exp b/gdb/testsuite/lib/gdb.exp
index 47cb2b23676..97664191244 100644
--- a/gdb/testsuite/lib/gdb.exp
+++ b/gdb/testsuite/lib/gdb.exp
@@ -3429,6 +3429,73 @@ gdb_caching_proc skip_tsx_tests {
     return $skip_tsx_tests
 }
 
+# Run a test on the target to see if it supports AMX.  Return 0 if so,
+# 1 if it does not.  Based on 'check_vmx_hw_available' from the GCC testsuite.
+
+gdb_caching_proc skip_amx_tests {
+    global srcdir subdir gdb_prompt inferior_exited_re
+
+    set me "skip_amx_tests"
+    if { ![istarget "x86_64-*-*"] } {
+	verbose "$me:  target does not support AMX, returning 1" 2
+	return 1
+    }
+
+    # Compile a test program.
+    set src {
+	#include <stdint.h>
+	#include <immintrin.h>
+
+	int
+	main ()
+	{
+	   struct tileconfig_t {
+	      uint8_t  palette_id;
+	      uint8_t  startRow;
+	      uint8_t  reserved[14];
+	      uint16_t cols[16];
+	      uint8_t  rows[16];
+	   };
+
+	   struct tileconfig_t tc = {1};
+	   _tile_loadconfig (&tc);
+	   _tile_release ();
+
+	   return 0;
+	}
+    }
+    if {![gdb_simple_compile $me $src executable]} {
+	return 1
+    }
+
+    # No error message, compilation succeeded so now run it via gdb.
+
+    gdb_exit
+    gdb_start
+    gdb_reinitialize_dir $srcdir/$subdir
+    gdb_load "$obj"
+    gdb_run_cmd
+    gdb_expect {
+	-re ".*Illegal instruction.*${gdb_prompt} $" {
+	    verbose -log "$me:  AMX hardware not detected."
+	    set skip_amx_tests 1
+	}
+	-re ".*$inferior_exited_re normally.*${gdb_prompt} $" {
+	    verbose -log "$me:  AMX hardware detected."
+	    set skip_amx_tests 0
+	}
+	default {
+	    warning "\n$me:  default case taken."
+	    set skip_amx_tests 1
+	}
+    }
+    gdb_exit
+    remote_file build delete $obj
+
+    verbose "$me:  returning $skip_amx_tests" 2
+    return $skip_amx_tests
+}
+
 # Run a test on the target to see if it supports avx512bf16.  Return 0 if so,
 # 1 if it does not.  Based on 'check_vmx_hw_available' from the GCC testsuite.
 
diff --git a/gdbserver/i387-fp.cc b/gdbserver/i387-fp.cc
index 674889674f1..2d22b0419f8 100644
--- a/gdbserver/i387-fp.cc
+++ b/gdbserver/i387-fp.cc
@@ -137,6 +137,14 @@ struct i387_xsave {
   /* Space for 1 32-bit PKRU register.  The HW XSTATE size for this feature is
      actually 64 bits, but WRPKRU/RDPKRU instructions ignore upper 32 bits.  */
   unsigned char pkru_space[8];
+
+  unsigned char reserved6[56];
+
+  /* Space for 1 TILECFG register with size of 64 bytes.  */
+  unsigned char tilecfg_space[64];
+
+  /* Space for 1 TILEDATA register, with size of 8192 bytes.  */
+  unsigned char tiledata_space[8192];
 };
 
 void
@@ -257,7 +265,7 @@ i387_cache_to_xsave (struct regcache *regcache, void *buf)
   unsigned long val, val2;
   unsigned long long xstate_bv = 0;
   unsigned long long clear_bv = 0;
-  char raw[64];
+  char raw[8192];
   char *p;
 
   /* Amd64 has 16 xmm regs; I386 has 8 xmm regs.  */
@@ -332,6 +340,12 @@ i387_cache_to_xsave (struct regcache *regcache, void *buf)
       if ((clear_bv & X86_XSTATE_PKRU))
 	for (i = 0; i < num_pkeys_registers; i++)
 	  memset (((char *) &fp->pkru_space[0]) + i * 4, 0, 4);
+
+      if (amd64 && (clear_bv & X86_XSTATE_AMX) != 0)
+	{
+	  memset (((char *) &fp->tilecfg_space[0]), 0, 64);
+	  memset (((char *) &fp->tiledata_space[0]), 0, 8192);
+	}
     }
 
   /* Check if any x87 registers are changed.  */
@@ -527,6 +541,30 @@ i387_cache_to_xsave (struct regcache *regcache, void *buf)
 	}
     }
 
+  /* Check if TILECFG register is changed (only for amd64).  */
+  if (amd64 && ((x86_xcr0 & X86_XSTATE_TILECFG) != 0))
+    {
+      collect_register_by_name (regcache, "tilecfg_raw", raw);
+      p = (char *) &fp->tilecfg_space;
+      if (memcmp (raw, p, 64) != 0)
+	{
+	  xstate_bv |= X86_XSTATE_TILECFG;
+	  memcpy (p, raw, 64);
+	}
+    }
+
+  /* Check if TILEDATA register is changed (only for amd64).  */
+  if (amd64 && ((x86_xcr0 & X86_XSTATE_TILEDATA) != 0))
+    {
+      collect_register_by_name (regcache, "tiledata", raw);
+      p = (char *) &fp->tiledata_space;
+      if (memcmp (raw, p, 8192) != 0)
+	{
+	  xstate_bv |= X86_XSTATE_TILEDATA;
+	  memcpy (p, raw, 8192);
+	}
+    }
+
   if ((x86_xcr0 & X86_XSTATE_SSE) || (x86_xcr0 & X86_XSTATE_AVX))
     {
       collect_register_by_name (regcache, "mxcsr", raw);
@@ -911,6 +949,30 @@ i387_xsave_to_cache (struct regcache *regcache, const void *buf)
 	}
     }
 
+  if (amd64 && (x86_xcr0 & X86_XSTATE_AMX) != 0)
+    {
+      /* When tilecfg is rewritten, the tiles are cleared.  Therefore,
+	 we need to check tilecfg and tiledata separately here.  */
+      int tilecfg_regnum = find_regno (regcache->tdesc, "tilecfg_raw");
+      int tiledata_regnum = find_regno (regcache->tdesc, "tiledata");
+
+      if ((clear_bv & X86_XSTATE_TILECFG) != 0)
+	supply_register_zeroed (regcache, tilecfg_regnum);
+      else
+	{
+	  p = (gdb_byte *) &fp->tilecfg_space[0];
+	  supply_register (regcache, tilecfg_regnum, p);
+	}
+
+      if ((clear_bv & X86_XSTATE_TILEDATA) != 0)
+	supply_register_zeroed (regcache, tiledata_regnum);
+      else
+	{
+	  p = (gdb_byte *) &fp->tiledata_space[0];
+	  supply_register (regcache, tiledata_regnum, p);
+	}
+    }
+
   if ((clear_bv & (X86_XSTATE_SSE | X86_XSTATE_AVX))
       == (X86_XSTATE_SSE | X86_XSTATE_AVX))
     {
diff --git a/gdbserver/linux-amd64-ipa.cc b/gdbserver/linux-amd64-ipa.cc
index bb89ba575bc..5a328733951 100644
--- a/gdbserver/linux-amd64-ipa.cc
+++ b/gdbserver/linux-amd64-ipa.cc
@@ -178,7 +178,7 @@ static uint64_t idx2mask[X86_TDESC_LAST] = {
   X86_XSTATE_MPX_MASK,
   X86_XSTATE_AVX_MPX_MASK,
   X86_XSTATE_AVX_AVX512_MASK,
-  X86_XSTATE_AVX_MPX_AVX512_PKU_MASK,
+  X86_XSTATE_AVX_MPX_AVX512_PKU_AMX_MASK
 };
 #endif
 
diff --git a/gdbserver/linux-i386-ipa.cc b/gdbserver/linux-i386-ipa.cc
index 88383a87b4c..7b85f61664a 100644
--- a/gdbserver/linux-i386-ipa.cc
+++ b/gdbserver/linux-i386-ipa.cc
@@ -253,7 +253,7 @@ static uint64_t idx2mask[X86_TDESC_LAST] = {
   X86_XSTATE_MPX_MASK,
   X86_XSTATE_AVX_MPX_MASK,
   X86_XSTATE_AVX_AVX512_MASK,
-  X86_XSTATE_AVX_MPX_AVX512_PKU_MASK,
+  X86_XSTATE_AVX_MPX_AVX512_PKU_AMX_MASK
 };
 
 /* Return target_desc to use for IPA, given the tdesc index passed by
diff --git a/gdbserver/linux-x86-low.cc b/gdbserver/linux-x86-low.cc
index d2b55f6f0d2..356b7204f77 100644
--- a/gdbserver/linux-x86-low.cc
+++ b/gdbserver/linux-x86-low.cc
@@ -245,7 +245,9 @@ static const int x86_64_regmap[] =
   -1, -1, -1, -1, -1, -1, -1, -1,
   -1, -1, -1, -1, -1, -1, -1, -1,
   -1, -1, -1, -1, -1, -1, -1, -1,
-  -1					/* pkru  */
+  -1,					/* pkru  */
+  -1,					/* AMX TILECFG register.  */
+  -1,					/* AMX TILEDATA: tmm0 ... tmm7.  */
 };
 
 #define X86_64_NUM_REGS (sizeof (x86_64_regmap) / sizeof (x86_64_regmap[0]))
diff --git a/gdbserver/linux-x86-tdesc.cc b/gdbserver/linux-x86-tdesc.cc
index b7cc307a764..c914ad6a13d 100644
--- a/gdbserver/linux-x86-tdesc.cc
+++ b/gdbserver/linux-x86-tdesc.cc
@@ -33,6 +33,9 @@
 static enum x86_linux_tdesc
 xcr0_to_tdesc_idx (uint64_t xcr0, bool is_x32)
 {
+  if (!is_x32 && ((xcr0 & X86_XSTATE_AMX) != 0))
+    return X86_TDESC_AVX_MPX_AVX512_PKU_AMX;
+
   if (xcr0 & X86_XSTATE_PKRU)
     {
       if (is_x32)
diff --git a/gdbserver/linux-x86-tdesc.h b/gdbserver/linux-x86-tdesc.h
index 4c7aebc2065..90ee8e19321 100644
--- a/gdbserver/linux-x86-tdesc.h
+++ b/gdbserver/linux-x86-tdesc.h
@@ -33,7 +33,8 @@ enum x86_linux_tdesc {
   X86_TDESC_AVX_MPX = 4,
   X86_TDESC_AVX_AVX512 = 5,
   X86_TDESC_AVX_MPX_AVX512_PKU = 6,
-  X86_TDESC_LAST = 7,
+  X86_TDESC_AVX_MPX_AVX512_PKU_AMX = 7,
+  X86_TDESC_LAST = 8,
 };
 
 #if defined __i386__ || !defined IN_PROCESS_AGENT
diff --git a/gdbserver/server.h b/gdbserver/server.h
index 6c64fe1ad80..2fe193c1c0c 100644
--- a/gdbserver/server.h
+++ b/gdbserver/server.h
@@ -105,7 +105,7 @@ extern int in_queued_stop_replies (ptid_t ptid);
 /* Buffer sizes for transferring memory, registers, etc.   Set to a constant
    value to accomodate multiple register formats.  This value must be at least
    as large as the largest register set supported by gdbserver.  */
-#define PBUFSIZ 18432
+#define PBUFSIZ 21416
 
 /* Definition for an unknown syscall, used basically in error-cases.  */
 #define UNKNOWN_SYSCALL (-1)
diff --git a/gdbsupport/x86-xstate.h b/gdbsupport/x86-xstate.h
index d4845243b4b..2452d1ae7a3 100644
--- a/gdbsupport/x86-xstate.h
+++ b/gdbsupport/x86-xstate.h
@@ -37,6 +37,11 @@
 
 #define X86_XSTATE_PKRU		(1ULL << 9)
 
+/* AMX adds two feature bits.  Both must be enabled.  */
+#define X86_XSTATE_TILECFG	(1ULL << 17)
+#define X86_XSTATE_TILEDATA	(1ULL << 18)
+#define X86_XSTATE_AMX		(X86_XSTATE_TILECFG | X86_XSTATE_TILEDATA)
+
 /* Supported mask and size of the extended state.  */
 #define X86_XSTATE_X87_MASK	X86_XSTATE_X87
 #define X86_XSTATE_SSE_MASK	(X86_XSTATE_X87 | X86_XSTATE_SSE)
@@ -44,19 +49,23 @@
 #define X86_XSTATE_MPX_MASK	(X86_XSTATE_SSE_MASK | X86_XSTATE_MPX)
 #define X86_XSTATE_AVX_MPX_MASK	(X86_XSTATE_AVX_MASK | X86_XSTATE_MPX)
 #define X86_XSTATE_AVX_AVX512_MASK	(X86_XSTATE_AVX_MASK | X86_XSTATE_AVX512)
-#define X86_XSTATE_AVX_MPX_AVX512_PKU_MASK 	(X86_XSTATE_AVX_MPX_MASK\
-					| X86_XSTATE_AVX512 | X86_XSTATE_PKRU)
+#define X86_XSTATE_AVX_MPX_AVX512_PKU_AMX_MASK 	(X86_XSTATE_AVX_MPX_MASK\
+					| X86_XSTATE_AVX512 | X86_XSTATE_PKRU\
+					| X86_XSTATE_AMX)
 
-#define X86_XSTATE_ALL_MASK		(X86_XSTATE_AVX_MPX_AVX512_PKU_MASK)
+#define X86_XSTATE_ALL_MASK		(X86_XSTATE_AVX_MPX_AVX512_PKU_AMX_MASK)
 
 
-#define X86_XSTATE_SSE_SIZE	576
-#define X86_XSTATE_AVX_SIZE	832
-#define X86_XSTATE_BNDREGS_SIZE	1024
-#define X86_XSTATE_BNDCFG_SIZE	1088
-#define X86_XSTATE_AVX512_SIZE	2688
-#define X86_XSTATE_PKRU_SIZE	2696
-#define X86_XSTATE_MAX_SIZE	2696
+/* Sizes in bytes.  */
+#define X86_XSTATE_SSE_SIZE		576
+#define X86_XSTATE_AVX_SIZE		832
+#define X86_XSTATE_BNDREGS_SIZE		1024
+#define X86_XSTATE_BNDCFG_SIZE		1088
+#define X86_XSTATE_AVX512_SIZE		2688
+#define X86_XSTATE_PKRU_SIZE		2696
+#define X86_XSTATE_TILECFG_SIZE		2816
+#define X86_XSTATE_TILEDATA_SIZE	11008
+#define X86_XSTATE_MAX_SIZE		11008
 
 
 /* In case one of the MPX XCR0 bits is set we consider we have MPX.  */
@@ -64,13 +73,15 @@
 #define HAS_AVX(XCR0) (((XCR0) & X86_XSTATE_AVX) != 0)
 #define HAS_AVX512(XCR0) (((XCR0) & X86_XSTATE_AVX512) != 0)
 #define HAS_PKRU(XCR0) (((XCR0) & X86_XSTATE_PKRU) != 0)
+#define HAS_AMX(XCR0) (((XCR0) & X86_XSTATE_AMX) != 0)
 
 /* Get I386 XSAVE extended state size.  */
 #define X86_XSTATE_SIZE(XCR0) \
+   (HAS_AMX (XCR0) ? X86_XSTATE_TILEDATA_SIZE : \
     (HAS_PKRU (XCR0) ? X86_XSTATE_PKRU_SIZE : \
      (HAS_AVX512 (XCR0) ? X86_XSTATE_AVX512_SIZE : \
       (HAS_MPX (XCR0) ? X86_XSTATE_BNDCFG_SIZE : \
-       (HAS_AVX (XCR0) ? X86_XSTATE_AVX_SIZE : X86_XSTATE_SSE_SIZE))))
+       (HAS_AVX (XCR0) ? X86_XSTATE_AVX_SIZE : X86_XSTATE_SSE_SIZE)))))
 
 /* Initial value for fctrl register, as defined in the X86 manual, and
    confirmed in the (Linux) kernel source.  When the x87 floating point
-- 
2.34.1

Intel Deutschland GmbH
Registered Address: Am Campeon 10, 85579 Neubiberg, Germany
Tel: +49 89 99 8853-0, www.intel.de <http://www.intel.de>
Managing Directors: Christin Eisenschmid, Sharon Heck, Tiffany Doon Silva  
Chairperson of the Supervisory Board: Nicole Lau
Registered Office: Munich
Commercial Register: Amtsgericht Muenchen HRB 186928


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH 3/4] gdb, gdbserver: Allocate only a sane amount of buffer when fetching registers.
  2022-05-06 12:12 [PATCH 0/4] Add AMX support Felix Willgerodt
  2022-05-06 12:12 ` [PATCH 1/4] gdb: define int512 and uint512 as built-in types Felix Willgerodt
  2022-05-06 12:12 ` [PATCH 2/4] gdb, gdbserver: Add AMX registers Felix Willgerodt
@ 2022-05-06 12:12 ` Felix Willgerodt
  2022-05-06 16:08   ` John Baldwin
  2022-06-27 18:30   ` Pedro Alves
  2022-05-06 12:12 ` [PATCH 4/4] gdb: Clear tilecfg.start_row for any PC modification Felix Willgerodt
  3 siblings, 2 replies; 27+ messages in thread
From: Felix Willgerodt @ 2022-05-06 12:12 UTC (permalink / raw)
  To: gdb-patches

A couple of functions blindly allocate a buffer of the size of
I386_MAX_REGISTER_SIZE.  With the addition of AMX, this size has increased
drastically from 64 bytes to 8192.  This changes these buffer allocations
to only use the actual amount needed, similar to how it is already done in
amd64-tdep.c (amd64_pseudo_register_read_value).

For the i387_collect_xsave and i387_cache_to_xsave functions any feedback is
welcome.  I opted to take the middle ground and only distinguish
between "AMX" and "Not-AMX".  That might be unnecessary optimization,
we could alternatively be okay with using an 8kB buffer unconditionally or
be okay with having many smaller buffer allocations.
---
 gdb/i386-tdep.c      | 21 ++++++++++++++++-----
 gdb/i387-tdep.c      | 19 ++++++++++++++-----
 gdbserver/i387-fp.cc |  8 +++++++-
 3 files changed, 37 insertions(+), 11 deletions(-)

diff --git a/gdb/i386-tdep.c b/gdb/i386-tdep.c
index 921b24ab60f..94106668e50 100644
--- a/gdb/i386-tdep.c
+++ b/gdb/i386-tdep.c
@@ -2944,7 +2944,7 @@ i386_extract_return_value (struct gdbarch *gdbarch, struct type *type,
 {
   i386_gdbarch_tdep *tdep = (i386_gdbarch_tdep *) gdbarch_tdep (gdbarch);
   int len = TYPE_LENGTH (type);
-  gdb_byte buf[I386_MAX_REGISTER_SIZE];
+  gdb_byte buf[register_size (gdbarch, I386_ST0_REGNUM)];
 
   /* _Float16 and _Float16 _Complex values are returned via xmm0.  */
   if (((type->code () == TYPE_CODE_FLT) && len == 2)
@@ -3006,7 +3006,7 @@ i386_store_return_value (struct gdbarch *gdbarch, struct type *type,
   if (type->code () == TYPE_CODE_FLT)
     {
       ULONGEST fstat;
-      gdb_byte buf[I386_MAX_REGISTER_SIZE];
+      gdb_byte buf[register_size (gdbarch, I386_ST0_REGNUM)];
 
       if (tdep->st0_regnum < 0)
 	{
@@ -3591,13 +3591,13 @@ i386_pseudo_register_read_into_value (struct gdbarch *gdbarch,
 				      int regnum,
 				      struct value *result_value)
 {
-  gdb_byte raw_buf[I386_MAX_REGISTER_SIZE];
   enum register_status status;
   gdb_byte *buf = value_contents_raw (result_value).data ();
 
   if (i386_mmx_regnum_p (gdbarch, regnum))
     {
       int fpnum = i386_mmx_regnum_to_fp_regnum (regcache, regnum);
+      gdb_byte raw_buf[register_size (gdbarch, regnum)];
 
       /* Extract (always little endian).  */
       status = regcache->raw_read (fpnum, raw_buf);
@@ -3613,6 +3613,7 @@ i386_pseudo_register_read_into_value (struct gdbarch *gdbarch,
       if (i386_bnd_regnum_p (gdbarch, regnum))
 	{
 	  regnum -= tdep->bnd0_regnum;
+	  gdb_byte raw_buf[register_size (gdbarch, I387_BND0R_REGNUM (tdep))];
 
 	  /* Extract (always little endian).  Read lower 128bits.  */
 	  status = regcache->raw_read (I387_BND0R_REGNUM (tdep) + regnum,
@@ -3636,6 +3637,7 @@ i386_pseudo_register_read_into_value (struct gdbarch *gdbarch,
       else if (i386_k_regnum_p (gdbarch, regnum))
 	{
 	  regnum -= tdep->k0_regnum;
+	  gdb_byte raw_buf[register_size (gdbarch, tdep->k0_regnum)];
 
 	  /* Extract (always little endian).  */
 	  status = regcache->raw_read (tdep->k0_regnum + regnum, raw_buf);
@@ -3647,6 +3649,7 @@ i386_pseudo_register_read_into_value (struct gdbarch *gdbarch,
       else if (i386_zmm_regnum_p (gdbarch, regnum))
 	{
 	  regnum -= tdep->zmm0_regnum;
+	  gdb_byte raw_buf[register_size (gdbarch, tdep->zmm0_regnum)];
 
 	  if (regnum < num_lower_zmm_regs)
 	    {
@@ -3698,6 +3701,7 @@ i386_pseudo_register_read_into_value (struct gdbarch *gdbarch,
       else if (i386_ymm_regnum_p (gdbarch, regnum))
 	{
 	  regnum -= tdep->ymm0_regnum;
+	  gdb_byte raw_buf[register_size (gdbarch, tdep->ymm0_regnum)];
 
 	  /* Extract (always little endian).  Read lower 128bits.  */
 	  status = regcache->raw_read (I387_XMM0_REGNUM (tdep) + regnum,
@@ -3717,6 +3721,8 @@ i386_pseudo_register_read_into_value (struct gdbarch *gdbarch,
       else if (i386_ymm_avx512_regnum_p (gdbarch, regnum))
 	{
 	  regnum -= tdep->ymm16_regnum;
+	  gdb_byte raw_buf[register_size (gdbarch, tdep->ymm0_regnum)];
+
 	  /* Extract (always little endian).  Read lower 128bits.  */
 	  status = regcache->raw_read (I387_XMM16_REGNUM (tdep) + regnum,
 				       raw_buf);
@@ -3735,6 +3741,7 @@ i386_pseudo_register_read_into_value (struct gdbarch *gdbarch,
       else if (i386_word_regnum_p (gdbarch, regnum))
 	{
 	  int gpnum = regnum - tdep->ax_regnum;
+	  gdb_byte raw_buf[register_size (gdbarch, gpnum)];
 
 	  /* Extract (always little endian).  */
 	  status = regcache->raw_read (gpnum, raw_buf);
@@ -3747,6 +3754,7 @@ i386_pseudo_register_read_into_value (struct gdbarch *gdbarch,
       else if (i386_byte_regnum_p (gdbarch, regnum))
 	{
 	  int gpnum = regnum - tdep->al_regnum;
+	  gdb_byte raw_buf[register_size (gdbarch, gpnum % 4)];
 
 	  /* Extract (always little endian).  We read both lower and
 	     upper registers.  */
@@ -3784,11 +3792,10 @@ void
 i386_pseudo_register_write (struct gdbarch *gdbarch, struct regcache *regcache,
 			    int regnum, const gdb_byte *buf)
 {
-  gdb_byte raw_buf[I386_MAX_REGISTER_SIZE];
-
   if (i386_mmx_regnum_p (gdbarch, regnum))
     {
       int fpnum = i386_mmx_regnum_to_fp_regnum (regcache, regnum);
+      gdb_byte raw_buf[register_size (gdbarch, regnum)];
 
       /* Read ...  */
       regcache->raw_read (fpnum, raw_buf);
@@ -3813,6 +3820,8 @@ i386_pseudo_register_write (struct gdbarch *gdbarch, struct regcache *regcache,
 	  upper = extract_unsigned_integer (buf + size, size, byte_order);
 
 	  /* Fetching register buffer.  */
+	  gdb_byte raw_buf[register_size (gdbarch,
+					  I387_BND0R_REGNUM (tdep) + regnum)];
 	  regcache->raw_read (I387_BND0R_REGNUM (tdep) + regnum,
 			      raw_buf);
 
@@ -3874,6 +3883,7 @@ i386_pseudo_register_write (struct gdbarch *gdbarch, struct regcache *regcache,
       else if (i386_word_regnum_p (gdbarch, regnum))
 	{
 	  int gpnum = regnum - tdep->ax_regnum;
+	  gdb_byte raw_buf[register_size (gdbarch, gpnum)];
 
 	  /* Read ...  */
 	  regcache->raw_read (gpnum, raw_buf);
@@ -3885,6 +3895,7 @@ i386_pseudo_register_write (struct gdbarch *gdbarch, struct regcache *regcache,
       else if (i386_byte_regnum_p (gdbarch, regnum))
 	{
 	  int gpnum = regnum - tdep->al_regnum;
+	  gdb_byte raw_buf[register_size (gdbarch, gpnum)];
 
 	  /* Read ...  We read both lower and upper registers.  */
 	  regcache->raw_read (gpnum % 4, raw_buf);
diff --git a/gdb/i387-tdep.c b/gdb/i387-tdep.c
index 38ffa3f967b..56239539402 100644
--- a/gdb/i387-tdep.c
+++ b/gdb/i387-tdep.c
@@ -350,7 +350,7 @@ i387_register_to_value (struct frame_info *frame, int regnum,
 			int *optimizedp, int *unavailablep)
 {
   struct gdbarch *gdbarch = get_frame_arch (frame);
-  gdb_byte from[I386_MAX_REGISTER_SIZE];
+  gdb_byte from[register_size (gdbarch, regnum)];
 
   gdb_assert (i386_fp_regnum_p (gdbarch, regnum));
 
@@ -384,7 +384,7 @@ i387_value_to_register (struct frame_info *frame, int regnum,
 			struct type *type, const gdb_byte *from)
 {
   struct gdbarch *gdbarch = get_frame_arch (frame);
-  gdb_byte to[I386_MAX_REGISTER_SIZE];
+  gdb_byte to[register_size (gdbarch, regnum)];
 
   gdb_assert (i386_fp_regnum_p (gdbarch, regnum));
 
@@ -1419,7 +1419,6 @@ i387_collect_xsave (const struct regcache *regcache, int regnum,
   enum bfd_endian byte_order = gdbarch_byte_order (gdbarch);
   i386_gdbarch_tdep *tdep = (i386_gdbarch_tdep *) gdbarch_tdep (gdbarch);
   gdb_byte *p, *regs = (gdb_byte *) xsave;
-  gdb_byte raw[I386_MAX_REGISTER_SIZE];
   ULONGEST initial_xstate_bv, clear_bv, xstate_bv = 0;
   unsigned int i;
   /* See the comment in i387_supply_xsave().  */
@@ -1604,6 +1603,14 @@ i387_collect_xsave (const struct regcache *regcache, int regnum,
 
   if (regclass == all)
     {
+      /* This used to blindly allocate I386_MAX_REGISTER_SIZE of space.
+	 With AMX that became a bit much to do unconditionally.  For now
+	 this seems to be the best trade-off between saving space and
+	 the performance penalty for adding individual allocations.  */
+      const uint32_t buf_size
+	  = (tdep->xcr0 & X86_XSTATE_TILEDATA) ? I386_MAX_REGISTER_SIZE : 64;
+      gdb_byte raw[buf_size];
+
       /* Check if the tilecfg register is changed.  */
       if ((tdep->xcr0 & X86_XSTATE_TILECFG))
 	{
@@ -1791,6 +1798,7 @@ i387_collect_xsave (const struct regcache *regcache, int regnum,
   else
     {
       /* Check if REGNUM is changed.  */
+      gdb_byte raw[register_size (gdbarch, regnum)];
       regcache->raw_collect (regnum, raw);
 
       switch (regclass)
@@ -1988,10 +1996,11 @@ i387_collect_xsave (const struct regcache *regcache, int regnum,
 	  }
 	else
 	  {
-	    int regsize;
+	    int regsize = register_size (gdbarch, i);
 
+	    gdb_byte raw[regsize];
 	    regcache->raw_collect (i, raw);
-	    regsize = regcache_register_size (regcache, i);
+
 	    p = FXSAVE_ADDR (tdep, regs, i);
 	    if (memcmp (raw, p, regsize))
 	      {
diff --git a/gdbserver/i387-fp.cc b/gdbserver/i387-fp.cc
index 2d22b0419f8..0b80d287a47 100644
--- a/gdbserver/i387-fp.cc
+++ b/gdbserver/i387-fp.cc
@@ -265,7 +265,6 @@ i387_cache_to_xsave (struct regcache *regcache, void *buf)
   unsigned long val, val2;
   unsigned long long xstate_bv = 0;
   unsigned long long clear_bv = 0;
-  char raw[8192];
   char *p;
 
   /* Amd64 has 16 xmm regs; I386 has 8 xmm regs.  */
@@ -348,6 +347,13 @@ i387_cache_to_xsave (struct regcache *regcache, void *buf)
 	}
     }
 
+  /* This used to blindly allocate 64 bytes of space.
+     With AMX that became a bit much to do unconditionally.  For now
+     this seems to be the best trade-off between saving space and
+     the performance penalty for adding individual allocations.  */
+  const uint32_t buf_size = (x86_xcr0 & X86_XSTATE_TILEDATA) ? 8192 : 64;
+  char raw[buf_size];
+
   /* Check if any x87 registers are changed.  */
   if ((x86_xcr0 & X86_XSTATE_X87))
     {
-- 
2.34.1

Intel Deutschland GmbH
Registered Address: Am Campeon 10, 85579 Neubiberg, Germany
Tel: +49 89 99 8853-0, www.intel.de <http://www.intel.de>
Managing Directors: Christin Eisenschmid, Sharon Heck, Tiffany Doon Silva  
Chairperson of the Supervisory Board: Nicole Lau
Registered Office: Munich
Commercial Register: Amtsgericht Muenchen HRB 186928


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH 4/4] gdb: Clear tilecfg.start_row for any PC modification.
  2022-05-06 12:12 [PATCH 0/4] Add AMX support Felix Willgerodt
                   ` (2 preceding siblings ...)
  2022-05-06 12:12 ` [PATCH 3/4] gdb, gdbserver: Allocate only a sane amount of buffer when fetching registers Felix Willgerodt
@ 2022-05-06 12:12 ` Felix Willgerodt
  2022-06-27 18:55   ` Pedro Alves
  3 siblings, 1 reply; 27+ messages in thread
From: Felix Willgerodt @ 2022-05-06 12:12 UTC (permalink / raw)
  To: gdb-patches

AMX tile instructions are restartable, e.g. on faults.  Tilecfg.start_row
is used to restart the interrupted instructions at the right row.
On inferior calls, jumps or any other PC modification, start_row needs
to be reset.  It binds to the current instruction and not to the one we
would start executing next in these cases.
---
 gdb/amd64-linux-tdep.c                        |  24 ++++
 gdb/testsuite/gdb.arch/amd64-amx-startrow.c   | 122 ++++++++++++++++++
 gdb/testsuite/gdb.arch/amd64-amx-startrow.exp |  91 +++++++++++++
 3 files changed, 237 insertions(+)
 create mode 100644 gdb/testsuite/gdb.arch/amd64-amx-startrow.c
 create mode 100755 gdb/testsuite/gdb.arch/amd64-amx-startrow.exp

diff --git a/gdb/amd64-linux-tdep.c b/gdb/amd64-linux-tdep.c
index cda90de54c6..65cce7f36ed 100644
--- a/gdb/amd64-linux-tdep.c
+++ b/gdb/amd64-linux-tdep.c
@@ -314,6 +314,30 @@ amd64_linux_write_pc (struct regcache *regcache, CORE_ADDR pc)
      within GDB.  In all other cases the system call will not be
      restarted.  */
   regcache_cooked_write_unsigned (regcache, AMD64_LINUX_ORIG_RAX_REGNUM, -1);
+
+  /* If we have interrupted a restart-able AMX instruction we should clear
+     start_row.  Any instructions we will now run should start at row 0.  */
+  i386_gdbarch_tdep *tdep
+      = (i386_gdbarch_tdep *) gdbarch_tdep (regcache->arch ());
+  if (tdep != nullptr && tdep->tilecfg_raw_regnum != -1)
+    {
+      gdb_byte tilecfg_buf[register_size (regcache->arch (),
+					  tdep->tilecfg_raw_regnum)];
+
+      if (regcache->raw_read (tdep->tilecfg_raw_regnum, tilecfg_buf)
+	  != REG_VALID)
+	{
+	  warning (_ ("Could not reset $tilecfg.start_row."));
+	  return;
+	}
+
+      /* start_row is the second byte.  */
+      if (tilecfg_buf[1] != 0)
+	{
+	  tilecfg_buf[1] = 0;
+	  regcache->raw_write (AMD64_AMX_TILECFG_RAW_REGNUM, tilecfg_buf);
+	}
+    }
 }
 
 /* Record all registers but IP register for process-record.  */
diff --git a/gdb/testsuite/gdb.arch/amd64-amx-startrow.c b/gdb/testsuite/gdb.arch/amd64-amx-startrow.c
new file mode 100644
index 00000000000..00650ac5683
--- /dev/null
+++ b/gdb/testsuite/gdb.arch/amd64-amx-startrow.c
@@ -0,0 +1,122 @@
+/* Test program for AMX startrow.
+
+   Copyright 2022 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include <errno.h>
+#include <immintrin.h>
+#include <malloc.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+
+#include <asm/prctl.h>
+#include <sys/mman.h>
+#include <sys/syscall.h>
+
+#define XFEATURE_XTILEDATA 18
+#define ARCH_REQ_XCOMP_PERM 0x1023
+
+/* To test infcalls.  */
+int
+square (int a, int b)
+{
+  int tmp;
+  tmp = a * b; /* BP2.  */
+  return tmp;
+}
+
+int
+main (int argc, char **argv)
+{
+  /* Ask the OS to configure AMX in xsave.  */
+  if (syscall (SYS_arch_prctl, ARCH_REQ_XCOMP_PERM, XFEATURE_XTILEDATA) != 0)
+    return -1;
+
+  /* Configure tiles.  */
+  struct tileconfig_t
+  {
+    uint8_t palette_id;
+    uint8_t startRow;
+    uint8_t reserved[14];
+    uint16_t cols[16];
+    uint8_t rows[16];
+  };
+
+  const int tmm0 = 0;
+
+  struct tileconfig_t tc = { 1 };
+
+  tc.rows[tmm0] = 16;
+  tc.cols[tmm0] = 64;
+
+  _tile_loadconfig (&tc);
+
+  const uint32_t memA1[16][16]
+    = { { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 },
+	{ 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31 },
+	{ 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47 },
+	{ 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63 },
+	{ 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79 },
+	{ 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95 },
+	{ 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109,
+	  110, 111 },
+	{ 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124,
+	  125, 126, 127 },
+	{ 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140,
+	  141, 142, 143 },
+	{ 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156,
+	  157, 158, 159 },
+	{ 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172,
+	  173, 174, 175 },
+	{ 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188,
+	  189, 190, 191 },
+	{ 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204,
+	  205, 206, 207 },
+	{ 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220,
+	  221, 222, 223 },
+	{ 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236,
+	  237, 238, 239 },
+	{ 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252,
+	  253, 254, 255 } };
+
+  /* Load tile that is stored over a page boundary.  */
+  const long page_size = sysconf (_SC_PAGESIZE);
+  if (page_size == -1)
+    return -1;
+
+  void *p;
+  int ret = posix_memalign (&p, page_size, 2 * page_size);
+  if (ret != 0)
+    return -1;
+
+  void *p2 = p + page_size;
+
+  memmove (p2 - 512, memA1, sizeof (memA1));
+
+  /* Protect the second page to produce a fault.  */
+  if (mprotect (p2, page_size, PROT_NONE) == -1)
+    return -1;
+
+  _tile_loadd (tmm0, p2 - 512, 64); /* BP1.  */
+
+  square (2, 2); /* Jump.  */
+  free (p);
+  return 0;
+}
diff --git a/gdb/testsuite/gdb.arch/amd64-amx-startrow.exp b/gdb/testsuite/gdb.arch/amd64-amx-startrow.exp
new file mode 100755
index 00000000000..201d4aaf767
--- /dev/null
+++ b/gdb/testsuite/gdb.arch/amd64-amx-startrow.exp
@@ -0,0 +1,91 @@
+# Copyright 2022 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# This file tests clearing of tilecfg.startrow in case it isn't empty.
+# If there is a fault, tileload and store instructions can be interrupted.
+# In that case startrow will point to the row on which they should be
+# continued.  In that case, inferior calls and jump commands should clear it.
+# This is tested by placing a tile over two memory pages, creating a page
+# fault.  Watchpoints that have hit will be delivered before the page fault.
+
+if { [skip_amx_tests] } {
+    unsupported "Target does not support AMX."
+    return -1
+}
+
+standard_testfile
+
+if { [prepare_for_testing "failed to prepare" ${testfile} ${srcfile} debug] } {
+    return -1
+}
+
+proc test_startrow {test} {
+    global gdb_prompt hex decimal srcfile
+
+    if { ![runto_main] } {
+	untested "could not run to main"
+	return -1
+    }
+
+    set line1 [gdb_get_line_number "BP1"]
+    set line2 [gdb_get_line_number "BP2"]
+    set line_jump [gdb_get_line_number "Jump"]
+    gdb_breakpoint $line1
+    gdb_breakpoint $line2
+
+    gdb_continue_to_breakpoint "line1" ".*$srcfile:$line1.*"
+
+    # Set a watchpoint on the first page, which is un-protected.
+    set watch_addr 0
+    gdb_test_multiple "p/x p2 - 8" "get watch_addr" {
+	-re -wrap "= ($hex)" {
+	    set watch_addr $expect_out(1,string)
+	    pass $gdb_test_name
+	}
+    }
+
+    # If we didn't get a watch_addr, it makes no sense to continue.
+    if { $watch_addr == 0 } {
+	return -1
+    }
+
+    gdb_test "rwatch *(int*) $watch_addr" \
+	"atchpoint $decimal: \\*\\(int\\*\\) $watch_addr"
+
+    gdb_test "continue" \
+	"Continuing.*atchpoint $decimal: \\*\\(int\\*\\) $watch_addr.*"
+
+    gdb_test "p \$tilecfg.start_row" "= \[1-9\]+" "print non-zero start_row"
+
+    if { $test == "jump" } {
+	# Test jump.
+	gdb_test "jump $line_jump" "Breakpoint $decimal, .*$srcfile:$line2.*"
+	gdb_test "p \$tilecfg.start_row" "= 0"
+    } else {
+	# Test infcall.
+	gdb_test "p square (2, 2)" "Breakpoint $decimal, .*$srcfile:$line2.*"
+	gdb_test "p \$tilecfg.start_row" "= 0"
+    }
+}
+
+with_test_prefix "infcall" {
+    test_startrow ""
+}
+
+clean_restart $binfile
+
+with_test_prefix "jump" {
+    test_startrow "jump"
+}
-- 
2.34.1

Intel Deutschland GmbH
Registered Address: Am Campeon 10, 85579 Neubiberg, Germany
Tel: +49 89 99 8853-0, www.intel.de <http://www.intel.de>
Managing Directors: Christin Eisenschmid, Sharon Heck, Tiffany Doon Silva  
Chairperson of the Supervisory Board: Nicole Lau
Registered Office: Munich
Commercial Register: Amtsgericht Muenchen HRB 186928


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 1/4] gdb: define int512 and uint512 as built-in types.
  2022-05-06 12:12 ` [PATCH 1/4] gdb: define int512 and uint512 as built-in types Felix Willgerodt
@ 2022-05-06 12:19   ` Eli Zaretskii
  2022-06-27 18:17   ` Pedro Alves
  1 sibling, 0 replies; 27+ messages in thread
From: Eli Zaretskii @ 2022-05-06 12:19 UTC (permalink / raw)
  To: Felix Willgerodt; +Cc: gdb-patches, aleksandar.paunovic

> Date: Fri,  6 May 2022 14:12:23 +0200
> From: Felix Willgerodt via Gdb-patches <gdb-patches@sourceware.org>
> Cc: Aleksandar Paunovic <aleksandar.paunovic@intel.com>
> 
> From: Aleksandar Paunovic <aleksandar.paunovic@intel.com>
> 
> Allow using int512 and uint512 as built-in types, particularly
> for the target descriptions.
> ---
>  gdb/doc/gdb.texinfo       | 2 ++
>  gdb/gdbtypes.c            | 4 ++++
>  gdb/gdbtypes.h            | 2 ++
>  gdb/target-descriptions.c | 6 ++++++
>  gdbsupport/tdesc.cc       | 2 ++
>  gdbsupport/tdesc.h        | 2 ++
>  6 files changed, 18 insertions(+)

The documentation part is trivially OK, thanks.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/4] gdb, gdbserver: Add AMX registers.
  2022-05-06 12:12 ` [PATCH 2/4] gdb, gdbserver: Add AMX registers Felix Willgerodt
@ 2022-05-06 12:25   ` Eli Zaretskii
  2022-05-11  8:14     ` Willgerodt, Felix
  2022-05-06 16:17   ` John Baldwin
  2022-06-27 18:12   ` Pedro Alves
  2 siblings, 1 reply; 27+ messages in thread
From: Eli Zaretskii @ 2022-05-06 12:25 UTC (permalink / raw)
  To: Felix Willgerodt; +Cc: gdb-patches

> Date: Fri,  6 May 2022 14:12:24 +0200
> From: Felix Willgerodt via Gdb-patches <gdb-patches@sourceware.org>
> 
> +@subsubsection Intel @dfn{Advanced Matrix Extensions} (AMX).
> +@cindex Advanced Matrix Extensions (AMX).

Please make @cindex entries start with a lower-case letter.
Otherwise, the index could be sorted differently in different locales.

> +Advanced Matrix Extensions (AMX) adds one 64 byte @samp{TILECFG} register and
                               ^^^
Please use @acronym{AMX} there.

> +eight 1024 byte tile registers @samp{TMM0}, @samp{TMM1}, ..., @samp{TMM7}.
                                                            ^^^
This should be @dots{}, not 3 literal period characters.  The former
will look better in print (and even the Info manual).

> +@smallexample
> +	(gdb) print/x $tilecfg_raw
> +	$1 = 0x203020000000000000000000000000000000000000000000000000000001000
> +	10000c00000000000000000000000000000001
> +	(gdb) print $tilecfg
> +	$2 = @{palette = 0x1, start_row = 0x0, tile0.colsb = 0xc,
> +	tile1.colsb = 0x10, tile2.colsb = 0x10, tile3.colsb = 0x0,
> +	tile4.colsb = 0x0, tile5.colsb = 0x0, tile6.colsb = 0x0,
> +	tile7.colsb = 0x0, tile0.rows = 0x2, tile1.rows = 0x3, tile2.rows = 0x2,
> +	tile3.rows = 0x0, tile4.rows = 0x0, tile5.rows = 0x0, tile6.rows = 0x0,
> +	tile7.rows = 0x0@}
> +	(gdb) p $tmm0.m_int8
> +	$3 = @{@{0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2@},	@{1, 1, 1, 1, 2, 2, 2,
> +	2, 3, 3, 3, 3@}@}
> +@end smallexample

Please avoid such long lines.  TeX cannot break lines in @example, so
we need to do that manually.

The documentation part is OK with the above fixed.

Thanks.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 3/4] gdb, gdbserver: Allocate only a sane amount of buffer when fetching registers.
  2022-05-06 12:12 ` [PATCH 3/4] gdb, gdbserver: Allocate only a sane amount of buffer when fetching registers Felix Willgerodt
@ 2022-05-06 16:08   ` John Baldwin
  2022-05-09  7:04     ` Willgerodt, Felix
  2022-06-27 18:30   ` Pedro Alves
  1 sibling, 1 reply; 27+ messages in thread
From: John Baldwin @ 2022-05-06 16:08 UTC (permalink / raw)
  To: Felix Willgerodt, gdb-patches

On 5/6/22 5:12 AM, Felix Willgerodt via Gdb-patches wrote:
> A couple of functions blindly allocate a buffer of the size of
> I386_MAX_REGISTER_SIZE.  With the addition of AMX, this size has increased
> drastically from 64 bytes to 8192.  This changes these buffer allocations
> to only use the actual amount needed, similar to how it is already done in
> amd64-tdep.c (amd64_pseudo_register_read_value).
> 
> For the i387_collect_xsave and i387_cache_to_xsave functions any feedback is
> welcome.  I opted to take the middle ground and only distinguish
> between "AMX" and "Not-AMX".  That might be unnecessary optimization,
> we could alternatively be okay with using an 8kB buffer unconditionally or
> be okay with having many smaller buffer allocations.

I think these changes make sense, but I think it might be nice to have some
sort of named constant (I386_NOAMX_REGISTER_SIZE perhaps) instead of a bare
64.

-- 
John Baldwin

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/4] gdb, gdbserver: Add AMX registers.
  2022-05-06 12:12 ` [PATCH 2/4] gdb, gdbserver: Add AMX registers Felix Willgerodt
  2022-05-06 12:25   ` Eli Zaretskii
@ 2022-05-06 16:17   ` John Baldwin
  2022-05-09  7:04     ` Willgerodt, Felix
  2022-06-27 18:12   ` Pedro Alves
  2 siblings, 1 reply; 27+ messages in thread
From: John Baldwin @ 2022-05-06 16:17 UTC (permalink / raw)
  To: Felix Willgerodt, gdb-patches

On 5/6/22 5:12 AM, Felix Willgerodt via Gdb-patches wrote:
> Advanced Matrix Extensions (AMX) adds one 64 byte TILECFG register and
> eight 1024 byte tile registers TMM0, TMM1, ..., TMM7.  The tile registers
> each represent a matrix, whose dimensions are configured via TILECFG.
> In XSAVE, all tiles are represented in the 8kB TILEDATA section.
> 
> Future AMX platforms are free to add new palettes, which are
> run-time configurable partitionings of the TILEDATA space.
> Currently only palette 0 (initialized zero state) and palette 1 exist.
> New palettes might change any of the following parameters, which are defined
> in the palette_table (which can be accessed via CPUID):
> 
> define palette_table[id]:
> 	uint16_t total_tile_bytes
> 	uint16_t bytes_per_tile
> 	uint16_t bytes_per_row
> 	uint16_t max_names
> 	uint16_t max_rows
> 
> More information about AMX can be found in the Intel(R) Architecture
> Instruction Set Extensions Programming Reference, May 2021.
> 
> The $tilecfg register is implemented as a pseudo register.  For convenience
> it is partitioned as a struct, representing the single configuration options
> as members.  It doesn't show reserved space, as structs can only contain
> existing data types.  To also be able to show the full register, $tilecfg_raw
> is implemented as a uint512 value.
> 
> The $tmm0-7 registers are also represented as pseudo registers.  This allows
> to only show the actually configured matrix and to omit filling zeros, which
> greatly increases readability on smaller matrices.  A raw $tiledata register
> is implemented as the base for the pseudo registers.
> 
> When developing this we also considered updating the target description at
> runtime to achieve the dynamic sizing.  This however would have required
> extensive changes to the writing and reading from/to XSAVE.  And it wouldn't
> work with gdbserver easily, as there currently is no infrastructure to keep
> XML target descriptions in sync after the initial transfer.
> ---
> diff --git a/gdb/i386-linux-tdep.h b/gdb/i386-linux-tdep.h
> index 6b3555aa3ea..705c7bcd602 100644
> --- a/gdb/i386-linux-tdep.h
> +++ b/gdb/i386-linux-tdep.h
> @@ -29,7 +29,7 @@
>   /* Register number for the "orig_eax" pseudo-register.  If this
>      pseudo-register contains a value >= 0 it is interpreted as the
>      system call number that the kernel is supposed to restart.  */
> -#define I386_LINUX_ORIG_EAX_REGNUM (I386_PKRU_REGNUM + 1)
> +#define I386_LINUX_ORIG_EAX_REGNUM (I386_AMX_TILEDATA_REGNUM + 1)
>   
>   /* Total number of registers for GNU/Linux.  */
>   #define I386_LINUX_NUM_REGS (I386_LINUX_ORIG_EAX_REGNUM + 1)
> diff --git a/gdb/i386-tdep.c b/gdb/i386-tdep.c
> index 8501e12e241..921b24ab60f 100644
> --- a/gdb/i386-tdep.c
> +++ b/gdb/i386-tdep.c
> @@ -3307,6 +3389,142 @@ i386_mmx_type (struct gdbarch *gdbarch)
>     return tdep->i386_mmx_type;
>   }
>   
> +/* Construct vector type for TMM registers.  */
> +
> +static struct type *
> +i386_tmm_type (struct gdbarch *gdbarch)
> +{
> +  i386_gdbarch_tdep *tdep = (i386_gdbarch_tdep *) gdbarch_tdep (gdbarch);
> +
> +  if (!tdep->i386_tmm_type)
> +    {
> +      const struct builtin_type *bt = builtin_type (gdbarch);
> +
> +      uint8_t bytes_per_row = tilecfg_reg::MAX_BYTES_PER_ROW;
> +      uint8_t max_rows = tilecfg_reg::MAX_ROWS;
> +
> +      /* The type we're building is this:  */
> +#if 0
> +      union __gdb_builtin_type_matrix1024i
> +      {
> +	int8_t m_int8[max_rows][bytes_per_row];
> +	uint8_t m_uint8[max_rows][bytes_per_row];
> +	int32_t m_int32[max_rows][bytes_per_row/4];
> +	bfloat16_t m_bfloat16[max_rows][bytes_per_row/2];
> +	float m_int32[max_rows][bytes_per_row/4];
> +      };
> +#endif

I think it might be better to put this inside of the comment instead of using #if 0
as a reader might think that code under #if 0 might be intended to be used in the
actual source under some circumstance (e.g. it was old code disabled but not
removed, or it is some kind of WIP that will be enabled in the future), but this
is clearly documentation that will never be compiled as part of GDB itself.

(And I think this #if 0 pattern is in some other places in the patch as well?)

-- 
John Baldwin

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH 2/4] gdb, gdbserver: Add AMX registers.
  2022-05-06 16:17   ` John Baldwin
@ 2022-05-09  7:04     ` Willgerodt, Felix
  2022-05-09 16:31       ` John Baldwin
  0 siblings, 1 reply; 27+ messages in thread
From: Willgerodt, Felix @ 2022-05-09  7:04 UTC (permalink / raw)
  To: John Baldwin, gdb-patches

> -----Original Message-----
> From: John Baldwin <jhb@FreeBSD.org>
> Sent: Freitag, 6. Mai 2022 18:18
> To: Willgerodt, Felix <felix.willgerodt@intel.com>; gdb-
> patches@sourceware.org
> Subject: Re: [PATCH 2/4] gdb, gdbserver: Add AMX registers.
> 
> On 5/6/22 5:12 AM, Felix Willgerodt via Gdb-patches wrote:
> > Advanced Matrix Extensions (AMX) adds one 64 byte TILECFG register and
> > eight 1024 byte tile registers TMM0, TMM1, ..., TMM7.  The tile registers
> > each represent a matrix, whose dimensions are configured via TILECFG.
> > In XSAVE, all tiles are represented in the 8kB TILEDATA section.
> >
> > Future AMX platforms are free to add new palettes, which are
> > run-time configurable partitionings of the TILEDATA space.
> > Currently only palette 0 (initialized zero state) and palette 1 exist.
> > New palettes might change any of the following parameters, which are
> defined
> > in the palette_table (which can be accessed via CPUID):
> >
> > define palette_table[id]:
> > 	uint16_t total_tile_bytes
> > 	uint16_t bytes_per_tile
> > 	uint16_t bytes_per_row
> > 	uint16_t max_names
> > 	uint16_t max_rows
> >
> > More information about AMX can be found in the Intel(R) Architecture
> > Instruction Set Extensions Programming Reference, May 2021.
> >
> > The $tilecfg register is implemented as a pseudo register.  For convenience
> > it is partitioned as a struct, representing the single configuration options
> > as members.  It doesn't show reserved space, as structs can only contain
> > existing data types.  To also be able to show the full register, $tilecfg_raw
> > is implemented as a uint512 value.
> >
> > The $tmm0-7 registers are also represented as pseudo registers.  This
> allows
> > to only show the actually configured matrix and to omit filling zeros, which
> > greatly increases readability on smaller matrices.  A raw $tiledata register
> > is implemented as the base for the pseudo registers.
> >
> > When developing this we also considered updating the target description
> at
> > runtime to achieve the dynamic sizing.  This however would have required
> > extensive changes to the writing and reading from/to XSAVE.  And it
> wouldn't
> > work with gdbserver easily, as there currently is no infrastructure to keep
> > XML target descriptions in sync after the initial transfer.
> > ---
> > diff --git a/gdb/i386-linux-tdep.h b/gdb/i386-linux-tdep.h
> > index 6b3555aa3ea..705c7bcd602 100644
> > --- a/gdb/i386-linux-tdep.h
> > +++ b/gdb/i386-linux-tdep.h
> > @@ -29,7 +29,7 @@
> >   /* Register number for the "orig_eax" pseudo-register.  If this
> >      pseudo-register contains a value >= 0 it is interpreted as the
> >      system call number that the kernel is supposed to restart.  */
> > -#define I386_LINUX_ORIG_EAX_REGNUM (I386_PKRU_REGNUM + 1)
> > +#define I386_LINUX_ORIG_EAX_REGNUM
> (I386_AMX_TILEDATA_REGNUM + 1)
> >
> >   /* Total number of registers for GNU/Linux.  */
> >   #define I386_LINUX_NUM_REGS (I386_LINUX_ORIG_EAX_REGNUM + 1)
> > diff --git a/gdb/i386-tdep.c b/gdb/i386-tdep.c
> > index 8501e12e241..921b24ab60f 100644
> > --- a/gdb/i386-tdep.c
> > +++ b/gdb/i386-tdep.c
> > @@ -3307,6 +3389,142 @@ i386_mmx_type (struct gdbarch *gdbarch)
> >     return tdep->i386_mmx_type;
> >   }
> >
> > +/* Construct vector type for TMM registers.  */
> > +
> > +static struct type *
> > +i386_tmm_type (struct gdbarch *gdbarch)
> > +{
> > +  i386_gdbarch_tdep *tdep = (i386_gdbarch_tdep *) gdbarch_tdep
> (gdbarch);
> > +
> > +  if (!tdep->i386_tmm_type)
> > +    {
> > +      const struct builtin_type *bt = builtin_type (gdbarch);
> > +
> > +      uint8_t bytes_per_row = tilecfg_reg::MAX_BYTES_PER_ROW;
> > +      uint8_t max_rows = tilecfg_reg::MAX_ROWS;
> > +
> > +      /* The type we're building is this:  */
> > +#if 0
> > +      union __gdb_builtin_type_matrix1024i
> > +      {
> > +	int8_t m_int8[max_rows][bytes_per_row];
> > +	uint8_t m_uint8[max_rows][bytes_per_row];
> > +	int32_t m_int32[max_rows][bytes_per_row/4];
> > +	bfloat16_t m_bfloat16[max_rows][bytes_per_row/2];
> > +	float m_int32[max_rows][bytes_per_row/4];
> > +      };
> > +#endif
> 
> I think it might be better to put this inside of the comment instead of using
> #if 0
> as a reader might think that code under #if 0 might be intended to be used in
> the
> actual source under some circumstance (e.g. it was old code disabled but not
> removed, or it is some kind of WIP that will be enabled in the future), but this
> is clearly documentation that will never be compiled as part of GDB itself.
> 
> (And I think this #if 0 pattern is in some other places in the patch as well?)
> 
> --
> John Baldwin

Thanks for the feedback. I understand your point. But all pseudo register type
functions in this file (e.g. i386_zmm_type, i386_zmm_type and i386_bnd_type)
use this style of "comment". I don't know the background and am a bit
reluctant to change the style in my patch series. I think that should be done
separately.

Felix
Intel Deutschland GmbH
Registered Address: Am Campeon 10, 85579 Neubiberg, Germany
Tel: +49 89 99 8853-0, www.intel.de <http://www.intel.de>
Managing Directors: Christin Eisenschmid, Sharon Heck, Tiffany Doon Silva  
Chairperson of the Supervisory Board: Nicole Lau
Registered Office: Munich
Commercial Register: Amtsgericht Muenchen HRB 186928

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH 3/4] gdb, gdbserver: Allocate only a sane amount of buffer when fetching registers.
  2022-05-06 16:08   ` John Baldwin
@ 2022-05-09  7:04     ` Willgerodt, Felix
  0 siblings, 0 replies; 27+ messages in thread
From: Willgerodt, Felix @ 2022-05-09  7:04 UTC (permalink / raw)
  To: John Baldwin, gdb-patches

> -----Original Message-----
> From: John Baldwin <jhb@FreeBSD.org>
> Sent: Freitag, 6. Mai 2022 18:09
> To: Willgerodt, Felix <felix.willgerodt@intel.com>; gdb-
> patches@sourceware.org
> Subject: Re: [PATCH 3/4] gdb, gdbserver: Allocate only a sane amount of
> buffer when fetching registers.
> 
> On 5/6/22 5:12 AM, Felix Willgerodt via Gdb-patches wrote:
> > A couple of functions blindly allocate a buffer of the size of
> > I386_MAX_REGISTER_SIZE.  With the addition of AMX, this size has
> increased
> > drastically from 64 bytes to 8192.  This changes these buffer allocations
> > to only use the actual amount needed, similar to how it is already done in
> > amd64-tdep.c (amd64_pseudo_register_read_value).
> >
> > For the i387_collect_xsave and i387_cache_to_xsave functions any
> feedback is
> > welcome.  I opted to take the middle ground and only distinguish
> > between "AMX" and "Not-AMX".  That might be unnecessary optimization,
> > we could alternatively be okay with using an 8kB buffer unconditionally or
> > be okay with having many smaller buffer allocations.
> 
> I think these changes make sense, but I think it might be nice to have some
> sort of named constant (I386_NOAMX_REGISTER_SIZE perhaps) instead of a
> bare
> 64.

Thanks for your feedback. I have implemented that locally, naming the define
I386_NO_AMX_REGISTER_SIZE.

Felix
Intel Deutschland GmbH
Registered Address: Am Campeon 10, 85579 Neubiberg, Germany
Tel: +49 89 99 8853-0, www.intel.de <http://www.intel.de>
Managing Directors: Christin Eisenschmid, Sharon Heck, Tiffany Doon Silva  
Chairperson of the Supervisory Board: Nicole Lau
Registered Office: Munich
Commercial Register: Amtsgericht Muenchen HRB 186928

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/4] gdb, gdbserver: Add AMX registers.
  2022-05-09  7:04     ` Willgerodt, Felix
@ 2022-05-09 16:31       ` John Baldwin
  0 siblings, 0 replies; 27+ messages in thread
From: John Baldwin @ 2022-05-09 16:31 UTC (permalink / raw)
  To: Willgerodt, Felix, gdb-patches

On 5/9/22 12:04 AM, Willgerodt, Felix wrote:
>> -----Original Message-----
>> From: John Baldwin <jhb@FreeBSD.org>
>> Sent: Freitag, 6. Mai 2022 18:18
>> To: Willgerodt, Felix <felix.willgerodt@intel.com>; gdb-
>> patches@sourceware.org
>> Subject: Re: [PATCH 2/4] gdb, gdbserver: Add AMX registers.
>>
>> On 5/6/22 5:12 AM, Felix Willgerodt via Gdb-patches wrote:
>>> Advanced Matrix Extensions (AMX) adds one 64 byte TILECFG register and
>>> eight 1024 byte tile registers TMM0, TMM1, ..., TMM7.  The tile registers
>>> each represent a matrix, whose dimensions are configured via TILECFG.
>>> In XSAVE, all tiles are represented in the 8kB TILEDATA section.
>>>
>>> Future AMX platforms are free to add new palettes, which are
>>> run-time configurable partitionings of the TILEDATA space.
>>> Currently only palette 0 (initialized zero state) and palette 1 exist.
>>> New palettes might change any of the following parameters, which are
>> defined
>>> in the palette_table (which can be accessed via CPUID):
>>>
>>> define palette_table[id]:
>>> 	uint16_t total_tile_bytes
>>> 	uint16_t bytes_per_tile
>>> 	uint16_t bytes_per_row
>>> 	uint16_t max_names
>>> 	uint16_t max_rows
>>>
>>> More information about AMX can be found in the Intel(R) Architecture
>>> Instruction Set Extensions Programming Reference, May 2021.
>>>
>>> The $tilecfg register is implemented as a pseudo register.  For convenience
>>> it is partitioned as a struct, representing the single configuration options
>>> as members.  It doesn't show reserved space, as structs can only contain
>>> existing data types.  To also be able to show the full register, $tilecfg_raw
>>> is implemented as a uint512 value.
>>>
>>> The $tmm0-7 registers are also represented as pseudo registers.  This
>> allows
>>> to only show the actually configured matrix and to omit filling zeros, which
>>> greatly increases readability on smaller matrices.  A raw $tiledata register
>>> is implemented as the base for the pseudo registers.
>>>
>>> When developing this we also considered updating the target description
>> at
>>> runtime to achieve the dynamic sizing.  This however would have required
>>> extensive changes to the writing and reading from/to XSAVE.  And it
>> wouldn't
>>> work with gdbserver easily, as there currently is no infrastructure to keep
>>> XML target descriptions in sync after the initial transfer.
>>> ---
>>> diff --git a/gdb/i386-linux-tdep.h b/gdb/i386-linux-tdep.h
>>> index 6b3555aa3ea..705c7bcd602 100644
>>> --- a/gdb/i386-linux-tdep.h
>>> +++ b/gdb/i386-linux-tdep.h
>>> @@ -29,7 +29,7 @@
>>>    /* Register number for the "orig_eax" pseudo-register.  If this
>>>       pseudo-register contains a value >= 0 it is interpreted as the
>>>       system call number that the kernel is supposed to restart.  */
>>> -#define I386_LINUX_ORIG_EAX_REGNUM (I386_PKRU_REGNUM + 1)
>>> +#define I386_LINUX_ORIG_EAX_REGNUM
>> (I386_AMX_TILEDATA_REGNUM + 1)
>>>
>>>    /* Total number of registers for GNU/Linux.  */
>>>    #define I386_LINUX_NUM_REGS (I386_LINUX_ORIG_EAX_REGNUM + 1)
>>> diff --git a/gdb/i386-tdep.c b/gdb/i386-tdep.c
>>> index 8501e12e241..921b24ab60f 100644
>>> --- a/gdb/i386-tdep.c
>>> +++ b/gdb/i386-tdep.c
>>> @@ -3307,6 +3389,142 @@ i386_mmx_type (struct gdbarch *gdbarch)
>>>      return tdep->i386_mmx_type;
>>>    }
>>>
>>> +/* Construct vector type for TMM registers.  */
>>> +
>>> +static struct type *
>>> +i386_tmm_type (struct gdbarch *gdbarch)
>>> +{
>>> +  i386_gdbarch_tdep *tdep = (i386_gdbarch_tdep *) gdbarch_tdep
>> (gdbarch);
>>> +
>>> +  if (!tdep->i386_tmm_type)
>>> +    {
>>> +      const struct builtin_type *bt = builtin_type (gdbarch);
>>> +
>>> +      uint8_t bytes_per_row = tilecfg_reg::MAX_BYTES_PER_ROW;
>>> +      uint8_t max_rows = tilecfg_reg::MAX_ROWS;
>>> +
>>> +      /* The type we're building is this:  */
>>> +#if 0
>>> +      union __gdb_builtin_type_matrix1024i
>>> +      {
>>> +	int8_t m_int8[max_rows][bytes_per_row];
>>> +	uint8_t m_uint8[max_rows][bytes_per_row];
>>> +	int32_t m_int32[max_rows][bytes_per_row/4];
>>> +	bfloat16_t m_bfloat16[max_rows][bytes_per_row/2];
>>> +	float m_int32[max_rows][bytes_per_row/4];
>>> +      };
>>> +#endif
>>
>> I think it might be better to put this inside of the comment instead of using
>> #if 0
>> as a reader might think that code under #if 0 might be intended to be used in
>> the
>> actual source under some circumstance (e.g. it was old code disabled but not
>> removed, or it is some kind of WIP that will be enabled in the future), but this
>> is clearly documentation that will never be compiled as part of GDB itself.
>>
>> (And I think this #if 0 pattern is in some other places in the patch as well?)
>>
>> --
>> John Baldwin
> 
> Thanks for the feedback. I understand your point. But all pseudo register type
> functions in this file (e.g. i386_zmm_type, i386_zmm_type and i386_bnd_type)
> use this style of "comment". I don't know the background and am a bit
> reluctant to change the style in my patch series. I think that should be done
> separately.

Oh, yes, sorry, if it is consistent with existing style then best to leave it
as-is.

-- 
John Baldwin

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH 2/4] gdb, gdbserver: Add AMX registers.
  2022-05-06 12:25   ` Eli Zaretskii
@ 2022-05-11  8:14     ` Willgerodt, Felix
  2022-05-11 11:41       ` Eli Zaretskii
  0 siblings, 1 reply; 27+ messages in thread
From: Willgerodt, Felix @ 2022-05-11  8:14 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gdb-patches

> -----Original Message-----
> From: Eli Zaretskii <eliz@gnu.org>
> Sent: Freitag, 6. Mai 2022 14:25
> To: Willgerodt, Felix <felix.willgerodt@intel.com>
> Cc: gdb-patches@sourceware.org
> Subject: Re: [PATCH 2/4] gdb, gdbserver: Add AMX registers.
> 
> > Date: Fri,  6 May 2022 14:12:24 +0200
> > From: Felix Willgerodt via Gdb-patches <gdb-patches@sourceware.org>
> >
> > +@subsubsection Intel @dfn{Advanced Matrix Extensions} (AMX).
> > +@cindex Advanced Matrix Extensions (AMX).
> 
> Please make @cindex entries start with a lower-case letter.
> Otherwise, the index could be sorted differently in different locales.
> 

I tried it, it just doesn't look right to me if I don't use capital letters.
(Writing "advanced Matrix Extensions" or "advanced matrix extensions".)

There are many index entries starting with capital letters, e.g. AArch64,
ARM or Ada. I see that Intel MPX is added as "Intel Memory Protection
Extensions (MPX)". Features of other vendors/architectures seem to
have similar formatting, like "AArch64 SVE" or "AArch64 Memory
Tagging Extension". Can I use the same formatting for AMX?
E.g. "Intel Advanced Memory Extensions (AMX)".


> > +Advanced Matrix Extensions (AMX) adds one 64 byte @samp{TILECFG}
> register and
>                                ^^^
> Please use @acronym{AMX} there.
> 

I have changed it locally.

> > +eight 1024 byte tile registers @samp{TMM0}, @samp{TMM1}, ...,
> @samp{TMM7}.
>                                                             ^^^
> This should be @dots{}, not 3 literal period characters.  The former
> will look better in print (and even the Info manual).
>

Same.
 
> > +@smallexample
> > +	(gdb) print/x $tilecfg_raw
> > +	$1 =
> 0x203020000000000000000000000000000000000000000000000000000001000
> > +	10000c00000000000000000000000000000001
> > +	(gdb) print $tilecfg
> > +	$2 = @{palette = 0x1, start_row = 0x0, tile0.colsb = 0xc,
> > +	tile1.colsb = 0x10, tile2.colsb = 0x10, tile3.colsb = 0x0,
> > +	tile4.colsb = 0x0, tile5.colsb = 0x0, tile6.colsb = 0x0,
> > +	tile7.colsb = 0x0, tile0.rows = 0x2, tile1.rows = 0x3, tile2.rows = 0x2,
> > +	tile3.rows = 0x0, tile4.rows = 0x0, tile5.rows = 0x0, tile6.rows = 0x0,
> > +	tile7.rows = 0x0@}
> > +	(gdb) p $tmm0.m_int8
> > +	$3 = @{@{0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2@},	@{1, 1, 1, 1, 2, 2, 2,
> > +	2, 3, 3, 3, 3@}@}
> > +@end smallexample
> 
> Please avoid such long lines.  TeX cannot break lines in @example, so
> we need to do that manually.
> 

Same.

Thanks,
Felix
Intel Deutschland GmbH
Registered Address: Am Campeon 10, 85579 Neubiberg, Germany
Tel: +49 89 99 8853-0, www.intel.de <http://www.intel.de>
Managing Directors: Christin Eisenschmid, Sharon Heck, Tiffany Doon Silva  
Chairperson of the Supervisory Board: Nicole Lau
Registered Office: Munich
Commercial Register: Amtsgericht Muenchen HRB 186928

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/4] gdb, gdbserver: Add AMX registers.
  2022-05-11  8:14     ` Willgerodt, Felix
@ 2022-05-11 11:41       ` Eli Zaretskii
  2022-06-27 18:16         ` Pedro Alves
  0 siblings, 1 reply; 27+ messages in thread
From: Eli Zaretskii @ 2022-05-11 11:41 UTC (permalink / raw)
  To: Willgerodt, Felix; +Cc: gdb-patches

> From: "Willgerodt, Felix" <felix.willgerodt@intel.com>
> CC: "gdb-patches@sourceware.org" <gdb-patches@sourceware.org>
> Date: Wed, 11 May 2022 08:14:26 +0000
> 
> > Please make @cindex entries start with a lower-case letter.
> > Otherwise, the index could be sorted differently in different locales.
> > 
> 
> I tried it, it just doesn't look right to me if I don't use capital letters.
> (Writing "advanced Matrix Extensions" or "advanced matrix extensions".)
> 
> There are many index entries starting with capital letters, e.g. AArch64,
> ARM or Ada. I see that Intel MPX is added as "Intel Memory Protection
> Extensions (MPX)". Features of other vendors/architectures seem to
> have similar formatting, like "AArch64 SVE" or "AArch64 Memory
> Tagging Extension". Can I use the same formatting for AMX?
> E.g. "Intel Advanced Memory Extensions (AMX)".

If you start with "Intel" (or another non-word), yes.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/4] gdb, gdbserver: Add AMX registers.
  2022-05-06 12:12 ` [PATCH 2/4] gdb, gdbserver: Add AMX registers Felix Willgerodt
  2022-05-06 12:25   ` Eli Zaretskii
  2022-05-06 16:17   ` John Baldwin
@ 2022-06-27 18:12   ` Pedro Alves
  2022-07-14 10:54     ` Willgerodt, Felix
  2022-08-08  9:15     ` Willgerodt, Felix
  2 siblings, 2 replies; 27+ messages in thread
From: Pedro Alves @ 2022-06-27 18:12 UTC (permalink / raw)
  To: Felix Willgerodt, gdb-patches

Hi Felix,

This largely looks good to me, though I have a couple questions.  See below.

On 2022-05-06 13:12, Felix Willgerodt via Gdb-patches wrote:

>  
> +/* A helper function to re-size AMX pseudo registers during reads.  Copies
> +   the contents from RAW_BUF to BUF and re-sizes the value.  */

I think this should say what does it mean when TILECFG is NULL.

> +
> +static void
> +amd64_tmm_resize_read (const tilecfg_reg *tilecfg, const gdb_byte *raw_buf,
> +		       gdb_byte *buf, value *result_value, const int tmmnum)
> +{
> +  uint16_t columns = 64;
> +  uint8_t rows = 16;
> +
> +  if (tilecfg != nullptr)
> +    {
> +      columns = tilecfg->bytes_per_row (tmmnum);
> +      rows = tilecfg->rows (tmmnum);
> +      if (columns == 0)
> +	columns = 64;
> +      if (rows == 0)
> +	rows = 16;
> +    }
> +
> +  gdb_assert (TYPE_LENGTH (value_type (result_value)) >= rows * columns);
> +
> +  /* Copy each row from raw_buf into buf.  The rows are not consecutive
> +     but they are on MAX_BYTES_PER_ROW * iRow position.  */
> +  const gdb_byte *raw_buf_offset
> +    = raw_buf + tmmnum * tilecfg->MAX_BYTES_PER_TILE;
> +  for (uint8_t iRow = 0; iRow < rows; ++iRow)
> +    {
> +      memcpy (buf + columns * iRow,
> +	      raw_buf_offset + tilecfg->MAX_BYTES_PER_ROW * iRow,
> +	      columns);
> +    }
> +
> +  /* Adjust the result_value.  The value is a union of matrices of different
> +     types.  See i386_tmm_type ().  This iterates over each member and
> +     adjusts the dimensions according to the type.  */
> +  for (int i = 0; i < value_type (result_value)->num_fields (); ++i)
> +    {
> +      type *rows_type = value_type (result_value)->fields ()[i].m_type;
> +      type *cols_type = rows_type->main_type->target_type;
> +
> +      /* Adjust array bit lengths.  */
> +      rows_type->length = columns * rows;
> +      cols_type->length = columns;
> +
> +      /* Adjust array dimensions.  */
> +      rows_type->bounds ()->high.set_const_val (rows - 1);
> +      int num_bytes = cols_type->main_type->target_type->length;
> +      cols_type->bounds ()->high.set_const_val (columns / num_bytes - 1);

Does any other target do in-place type rewriting like this?  That seems fishy.  What happens e.g.,
to values already in the value history that were recorded before the dimensions changed, for
instance?  Will they suddenly start re-printing differently / incorrectly with their type changed
behind their back?

Like:

 (gdb) print $reg  # some register or value mapped to a register that that ends up in the function above
 $1 = ...  # before type changes
 # something happens and the AMX type changes.
 (gdb) print $reg
 $2 = ...  # reflects type change
 (gdb) print $1
 $3 = ...  # what type does GDB use here?

Do the new tests cover something like this already?

This may likewise affect, e.g., watchpoints and displays.

I haven't traced the new code to check where do those types originally come from, but maybe it
would work to reuse/extend the vla support to make those types have dynamic length and
bounds (TYPE_DYNAMIC_LENGTH, DYN_PROP_BYTE_SIZE, etc.).

Or maybe just tweak these functions such that you create a new type instead of changing the
original type.  I don't know how frequently the array dimentions change and how open
ended the dimensions are, but caching the type keyed on row/col sizes may work well to
spare creating too many types, or actually creating them all the time.

> +    }
> +}
> +


>  static struct value *
>  amd64_pseudo_register_read_value (struct gdbarch *gdbarch,
>  				  readable_regcache *regcache,
> @@ -401,6 +511,58 @@ amd64_pseudo_register_read_value (struct gdbarch *gdbarch,
>  	mark_value_bytes_unavailable (result_value, 0,
>  				      TYPE_LENGTH (value_type (result_value)));
>      }
> +  else if (i386_tilecfg_regnum_p (gdbarch, regnum))
> +    {
> +      /* Read tilecfg.  */
> +      gdb_byte raw_buf[register_size (gdbarch, tdep->tilecfg_raw_regnum)];
> +      register_status status = regcache->raw_read (tdep->tilecfg_raw_regnum,
> +						   raw_buf);
> +      if (status != REG_VALID)
> +	{
> +	  mark_value_bytes_unavailable (
> +	    result_value, 0, TYPE_LENGTH (value_type (result_value)));

Formatting, "(" never ends a line.  This appears multiple times in the patch throughout.

> +	}
> +      else
> +	{
> +	  /* Copy palette and start_row.  See tilecfg_type ().  */
> +	  memcpy (buf, raw_buf, 2 * 1);
> +	  /* Copy all colsb.  */
> +	  memcpy (buf + 2, raw_buf + 16, 2 * 8);
> +	  /* Copy all rows.  */
> +	  memcpy (buf + 18, raw_buf + 48, 1 * 8);
> +	}
> +    }
> +  else if (i386_tmm_regnum_p (gdbarch, regnum))
> +    {
> +      /* Read tilecfg.  */
> +      gdb_byte tilecfg_buf[register_size (gdbarch, tdep->tilecfg_raw_regnum)];
> +      register_status status = regcache->raw_read (tdep->tilecfg_raw_regnum,
> +						   tilecfg_buf);
> +      if (status != REG_VALID)
> +	{
> +	  mark_value_bytes_unavailable (
> +	    result_value, 0, TYPE_LENGTH (value_type (result_value)));

Ditto.

> +	}
> +      else
> +	{
> +	  tilecfg_reg tilecfg{ tilecfg_buf };

GDB prefers using ()s for ctors.  Also space before '(' / '{'.  Thus:

	  tilecfg_reg tilecfg (tilecfg_buf);

> +	  gdb_byte raw_buf[register_size (gdbarch, tdep->tiledata_regnum)];
> +	  status = regcache->raw_read (tdep->tiledata_regnum, raw_buf);
> +
> +	  if (status != REG_VALID)
> +	    {
> +	      mark_value_bytes_unavailable (
> +		result_value, 0, TYPE_LENGTH (value_type (result_value)));

Formatting.  (I'll stop pointing these out.)

> +	    }
> +	  else
> +	    {
> +	      /* Re-size value and copy data.  */
> +	      amd64_tmm_resize_read (&tilecfg, raw_buf,
> +				     buf, result_value,
> +				     regnum - tdep->tmm_regnum);
> +	    }
> +	}
> +    }
>    else
>      i386_pseudo_register_read_into_value (gdbarch, regcache, regnum,
>  					  result_value);

>  
> +/* AMX tilecfg_reg constructor.  */
> +
> +tilecfg_reg::tilecfg_reg (uint8_t *raw_tilecfg) : tilecfg_reg ()
> +{

Please write:

tilecfg_reg::tilecfg_reg (uint8_t *raw_tilecfg)
  : tilecfg_reg ()
{

Though that is just calling the default ctor.  I don't think you need
to that explicitly here.

> +  /* Use default values.  */
> +  if (raw_tilecfg == nullptr)
> +    return;
> +
> +  palette = raw_tilecfg[0];
> +  start_row = raw_tilecfg[1];
> +
> +  /* Read TILECFG column and row values via pointers.
> +     Columns are represented by 2 bytes and rows are represented
> +     by 1 byte.  Column pointer which is *uint8_t needs to be converted
> +     to *uint16_t pointer.  */
> +  uint16_t *vec_col_pos
> +      = reinterpret_cast<uint16_t *> (raw_tilecfg + COLUMN_MEMORY_OFFSET);
> +  uint8_t *vec_row_pos = raw_tilecfg + ROW_MEMORY_OFFSET;
> +
> +  for (int i = 0; i < MAX_NAMES; i++)
> +    columns_and_rows[i] = { vec_col_pos[i], vec_row_pos[i] };

Where was columns_and_rows resized to MAX_NAMES?

> +}
> +
>  /* Convert a dbx register number REG to the appropriate register
>     number used by GDB.  */
>  
> @@ -3307,6 +3389,142 @@ i386_mmx_type (struct gdbarch *gdbarch)
>    return tdep->i386_mmx_type;
>  }
>  
> +/* Construct vector type for TMM registers.  */
> +
> +static struct type *
> +i386_tmm_type (struct gdbarch *gdbarch)
> +{
> +  i386_gdbarch_tdep *tdep = (i386_gdbarch_tdep *) gdbarch_tdep (gdbarch);
> +
> +  if (!tdep->i386_tmm_type)
> +    {
> +      const struct builtin_type *bt = builtin_type (gdbarch);
> +
> +      uint8_t bytes_per_row = tilecfg_reg::MAX_BYTES_PER_ROW;
> +      uint8_t max_rows = tilecfg_reg::MAX_ROWS;
> +
> +      /* The type we're building is this:  */
> +#if 0
> +      union __gdb_builtin_type_matrix1024i
> +      {
> +	int8_t m_int8[max_rows][bytes_per_row];
> +	uint8_t m_uint8[max_rows][bytes_per_row];
> +	int32_t m_int32[max_rows][bytes_per_row/4];
> +	bfloat16_t m_bfloat16[max_rows][bytes_per_row/2];
> +	float m_int32[max_rows][bytes_per_row/4];
> +      };
> +#endif
> +
> +      struct type *t;
> +      t = arch_composite_type (gdbarch, "builtin_type_tile", TYPE_CODE_UNION);
> +
> +      append_composite_type_field (

Formatting.  (same below.)

> +	  t, "m_int8",
> +	  init_vector_type (init_vector_type (bt->builtin_int8, bytes_per_row),
> +			    max_rows));
> +
> +      append_composite_type_field (
> +	  t, "m_uint8",
> +	  init_vector_type (
> +	      init_vector_type (bt->builtin_uint8, bytes_per_row), max_rows));
> +
> +      append_composite_type_field (
> +	  t, "m_int32",
> +	  init_vector_type (
> +	      init_vector_type (bt->builtin_int32, bytes_per_row / 4),
> +	      max_rows));
> +
> +      append_composite_type_field (
> +	  t, "m_bf16",
> +	  init_vector_type (
> +	      init_vector_type (bt->builtin_bfloat16, bytes_per_row / 2),
> +	      max_rows));
> +
> +      append_composite_type_field (
> +	  t, "m_fp32",
> +	  init_vector_type (
> +	      init_vector_type (bt->builtin_float, bytes_per_row / 4),
> +	      max_rows));
> +
> +      t->set_is_vector (true);
> +      t->set_name ("builtin_type_tile");
> +      tdep->i386_tmm_type = t;
> +    }
> +
> +  return tdep->i386_tmm_type;
> +}


> @@ -348,7 +376,7 @@ enum record_i386_regnum
>  #define I386_NUM_REGS		(I386_GSBASE_REGNUM + 1)
>  
>  /* Size of the largest register.  */
> -#define I386_MAX_REGISTER_SIZE	64
> +#define I386_MAX_REGISTER_SIZE	8192
>  

*Eek*

>  /* Types for i386-specific registers.  */
>  extern struct type *i387_ext_type (struct gdbarch *gdbarch);
> @@ -366,6 +394,8 @@ extern int i386_k_regnum_p (struct gdbarch *gdbarch, int regnum);
>  extern int i386_zmm_regnum_p (struct gdbarch *gdbarch, int regnum);
>  extern int i386_zmmh_regnum_p (struct gdbarch *gdbarch, int regnum);
>  extern bool i386_pkru_regnum_p (struct gdbarch *gdbarch, int regnum);
> +extern bool i386_tilecfg_regnum_p (struct gdbarch *gdbarch, int regnum);
> +extern bool i386_tmm_regnum_p (struct gdbarch *gdbarch, int regnum);
>  
>  extern const char *i386_pseudo_register_name (struct gdbarch *gdbarch,
>  					      int regnum);
> @@ -485,4 +515,94 @@ extern int i386_stap_is_single_operand (struct gdbarch *gdbarch,
>  extern expr::operation_up i386_stap_parse_special_token
>       (struct gdbarch *gdbarch, struct stap_parse_info *p);
>  
> +/* AMX utilities.  */
> +
> +/* TILECFG register.
> +   0       palette
> +   1       start_row
> +   2-15    reserved, must be zero
> +   16-17   tile0.colsb Tile 0 bytes per row.
> +   18-19   tile1.colsb Tile 1 bytes per row.
> +   20-21   tile2.colsb Tile 2 bytes per row.
> +   ...     (sequence continues)
> +   30-31   tile7.colsb Tile 7 bytes per row.
> +   32-47   reserved, must be zero
> +   48      tile0.rows Tile 0 rows.
> +   49      tile1.rows Tile 1 rows.
> +   50      tile2.rows Tile 2 rows.
> +   ...     (sequence continues)
> +   55      tile7.rows Tile 7 rows.
> +   56-63   reserved, must be zero.  */
> +
> +/* TILECFG class representing the AMX Tilecfg register.  */
> +
> +class tilecfg_reg
> +{
> +public:
> +  tilecfg_reg ()
> +      : columns_and_rows (
> +	  std::vector<std::pair<uint16_t, uint8_t>> (MAX_NAMES, { 0, 0 }))

Would it be possible to use a struct instead of a pair?  Pairs are kind of horrible
for readability.  Some proper field names would be much clearer than "first" and "second".
std::pair is great for general purpose templates, not so much otherwise.

Also, how about doing this columns_and_rows initialization where the
field is declared?

> +  {
> +  }
> +
> +  /* Construct it from raw tilecfg data.  */
> +  explicit tilecfg_reg (uint8_t *raw_tilecfg);

As pointed out above, the implementation of this ctor doesn't seem to resize the
columns_and_rows vector, and just accesses the elements straight away:

   for (int i = 0; i < MAX_NAMES; i++)
     columns_and_rows[i] = { vec_col_pos[i], vec_row_pos[i] };
 }

Moving the initialization to the member declaration would fix it,
but maybe I'm missing something.

> +
> +private:
> +  /* This stores the colsb and rows entries.  */

I guess "colsb" is a typo for "cols"?

> +  std::vector<std::pair<uint16_t, uint8_t>> columns_and_rows;
> +};
> +

> +set line1 [gdb_get_line_number "BP1"]
> +set line2 [gdb_get_line_number "BP2"]
> +gdb_breakpoint $line1
> +gdb_breakpoint $line2
> +
> +# Registers should be displayed as zeroed before AMX enablement.
> +with_test_prefix "Before AMX is enabled" {

Lowercase "Before".

> +    gdb_test "print \$tilecfg_raw" "= 0"
> +    for {set i 0} {$i < 8} {incr i} {
> +	test_zeroed_tile "\$tmm$i"
> +    }
> +}
> +
> +

> +gdb_test "continue" \
> +    ".*\\\[Inferior $decimal \\\(process $decimal\\\) exited normally\\]"

gdb_continue_to_end

> diff --git a/gdb/testsuite/lib/gdb.exp b/gdb/testsuite/lib/gdb.exp
> index 47cb2b23676..97664191244 100644
> --- a/gdb/testsuite/lib/gdb.exp
> +++ b/gdb/testsuite/lib/gdb.exp
> @@ -3429,6 +3429,73 @@ gdb_caching_proc skip_tsx_tests {
>      return $skip_tsx_tests
>  }
>  
> +# Run a test on the target to see if it supports AMX.  Return 0 if so,
> +# 1 if it does not.  Based on 'check_vmx_hw_available' from the GCC testsuite.
> +
> +gdb_caching_proc skip_amx_tests {
> +    global srcdir subdir gdb_prompt inferior_exited_re
> +
> +    set me "skip_amx_tests"
> +    if { ![istarget "x86_64-*-*"] } {
> +	verbose "$me:  target does not support AMX, returning 1" 2

Suprious double space after "me:".

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/4] gdb, gdbserver: Add AMX registers.
  2022-05-11 11:41       ` Eli Zaretskii
@ 2022-06-27 18:16         ` Pedro Alves
  2022-06-27 18:24           ` Eli Zaretskii
  0 siblings, 1 reply; 27+ messages in thread
From: Pedro Alves @ 2022-06-27 18:16 UTC (permalink / raw)
  To: Eli Zaretskii, Willgerodt, Felix; +Cc: gdb-patches

On 2022-05-11 12:41, Eli Zaretskii via Gdb-patches wrote:
>> From: "Willgerodt, Felix" <felix.willgerodt@intel.com>
>> CC: "gdb-patches@sourceware.org" <gdb-patches@sourceware.org>
>> Date: Wed, 11 May 2022 08:14:26 +0000
>>
>>> Please make @cindex entries start with a lower-case letter.
>>> Otherwise, the index could be sorted differently in different locales.
>>>
>>
>> I tried it, it just doesn't look right to me if I don't use capital letters.
>> (Writing "advanced Matrix Extensions" or "advanced matrix extensions".)
>>
>> There are many index entries starting with capital letters, e.g. AArch64,
>> ARM or Ada. I see that Intel MPX is added as "Intel Memory Protection
>> Extensions (MPX)". Features of other vendors/architectures seem to
>> have similar formatting, like "AArch64 SVE" or "AArch64 Memory
>> Tagging Extension". Can I use the same formatting for AMX?
>> E.g. "Intel Advanced Memory Extensions (AMX)".
> 
> If you start with "Intel" (or another non-word), yes.

Hi Eli,

I'm curious about this.  What is different between "Intel" and "Advanced" here,
wrt to locale, since they are both upper case?

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 1/4] gdb: define int512 and uint512 as built-in types.
  2022-05-06 12:12 ` [PATCH 1/4] gdb: define int512 and uint512 as built-in types Felix Willgerodt
  2022-05-06 12:19   ` Eli Zaretskii
@ 2022-06-27 18:17   ` Pedro Alves
  1 sibling, 0 replies; 27+ messages in thread
From: Pedro Alves @ 2022-06-27 18:17 UTC (permalink / raw)
  To: Felix Willgerodt, gdb-patches; +Cc: Aleksandar Paunovic

On 2022-05-06 13:12, Felix Willgerodt via Gdb-patches wrote:
> From: Aleksandar Paunovic <aleksandar.paunovic@intel.com>
> 
> Allow using int512 and uint512 as built-in types, particularly
> for the target descriptions.

This is OK.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/4] gdb, gdbserver: Add AMX registers.
  2022-06-27 18:16         ` Pedro Alves
@ 2022-06-27 18:24           ` Eli Zaretskii
  2022-06-27 19:15             ` Pedro Alves
  0 siblings, 1 reply; 27+ messages in thread
From: Eli Zaretskii @ 2022-06-27 18:24 UTC (permalink / raw)
  To: Pedro Alves; +Cc: felix.willgerodt, gdb-patches

> Date: Mon, 27 Jun 2022 19:16:33 +0100
> Cc: gdb-patches@sourceware.org
> From: Pedro Alves <pedro@palves.net>
> 
> On 2022-05-11 12:41, Eli Zaretskii via Gdb-patches wrote:
> >> From: "Willgerodt, Felix" <felix.willgerodt@intel.com>
> >> CC: "gdb-patches@sourceware.org" <gdb-patches@sourceware.org>
> >> Date: Wed, 11 May 2022 08:14:26 +0000
> >>
> >>> Please make @cindex entries start with a lower-case letter.
> >>> Otherwise, the index could be sorted differently in different locales.
> >>>
> >>
> >> I tried it, it just doesn't look right to me if I don't use capital letters.
> >> (Writing "advanced Matrix Extensions" or "advanced matrix extensions".)
> >>
> >> There are many index entries starting with capital letters, e.g. AArch64,
> >> ARM or Ada. I see that Intel MPX is added as "Intel Memory Protection
> >> Extensions (MPX)". Features of other vendors/architectures seem to
> >> have similar formatting, like "AArch64 SVE" or "AArch64 Memory
> >> Tagging Extension". Can I use the same formatting for AMX?
> >> E.g. "Intel Advanced Memory Extensions (AMX)".
> > 
> > If you start with "Intel" (or another non-word), yes.
> 
> Hi Eli,
> 
> I'm curious about this.  What is different between "Intel" and "Advanced" here,
> wrt to locale, since they are both upper case?

The locale doesn't matter, but people do.  "Intel" is not a word, so
it doesn't really matter in what place in the sort order it winds up,
as long as we always spell it with the capital "I".  But if "Advanced"
is in one place and "advanced" is in a very different place, people
may have trouble finding one or the other, if they look up stuff in
alphabetical order.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 3/4] gdb, gdbserver: Allocate only a sane amount of buffer when fetching registers.
  2022-05-06 12:12 ` [PATCH 3/4] gdb, gdbserver: Allocate only a sane amount of buffer when fetching registers Felix Willgerodt
  2022-05-06 16:08   ` John Baldwin
@ 2022-06-27 18:30   ` Pedro Alves
  1 sibling, 0 replies; 27+ messages in thread
From: Pedro Alves @ 2022-06-27 18:30 UTC (permalink / raw)
  To: Felix Willgerodt, gdb-patches

On 2022-05-06 13:12, Felix Willgerodt via Gdb-patches wrote:
> A couple of functions blindly allocate a buffer of the size of
> I386_MAX_REGISTER_SIZE.  With the addition of AMX, this size has increased
> drastically from 64 bytes to 8192.  This changes these buffer allocations
> to only use the actual amount needed, similar to how it is already done in
> amd64-tdep.c (amd64_pseudo_register_read_value).
> 
> For the i387_collect_xsave and i387_cache_to_xsave functions any feedback is
> welcome.  I opted to take the middle ground and only distinguish
> between "AMX" and "Not-AMX".  That might be unnecessary optimization,
> we could alternatively be okay with using an 8kB buffer unconditionally or
> be okay with having many smaller buffer allocations.

Seems fine, with the bare sizes issue John pointed out, addressed.  

VLAs are actually not part of standard C++, but seems like no compiler complains so
we can keep ignoring that...

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 4/4] gdb: Clear tilecfg.start_row for any PC modification.
  2022-05-06 12:12 ` [PATCH 4/4] gdb: Clear tilecfg.start_row for any PC modification Felix Willgerodt
@ 2022-06-27 18:55   ` Pedro Alves
  0 siblings, 0 replies; 27+ messages in thread
From: Pedro Alves @ 2022-06-27 18:55 UTC (permalink / raw)
  To: Felix Willgerodt, gdb-patches

Hi Felix,

The GDB change LGTM.  

But just to be clear, it is always OK to reset start_row, even if we didn't stop for a tile
instruction fault?  It won't ever lead to inferior corruption, for example?  I'm thinking of how
for instance, we always write to the PC after a breakpoint, to adjust it by -1, so a breakpoint on
a tile insn will always lead to resetting start_row AFAICT.

Some comments on the testcase below.

On 2022-05-06 13:12, Felix Willgerodt via Gdb-patches wrote:
> AMX tile instructions are restartable, e.g. on faults.  Tilecfg.start_row
> is used to restart the interrupted instructions at the right row.
> On inferior calls, jumps or any other PC modification, start_row needs
> to be reset.  It binds to the current instruction and not to the one we
> would start executing next in these cases.
> ---
>  gdb/amd64-linux-tdep.c                        |  24 ++++
>  gdb/testsuite/gdb.arch/amd64-amx-startrow.c   | 122 ++++++++++++++++++
>  gdb/testsuite/gdb.arch/amd64-amx-startrow.exp |  91 +++++++++++++
>  3 files changed, 237 insertions(+)
>  create mode 100644 gdb/testsuite/gdb.arch/amd64-amx-startrow.c
>  create mode 100755 gdb/testsuite/gdb.arch/amd64-amx-startrow.exp
> 
> diff --git a/gdb/amd64-linux-tdep.c b/gdb/amd64-linux-tdep.c
> index cda90de54c6..65cce7f36ed 100644
> --- a/gdb/amd64-linux-tdep.c
> +++ b/gdb/amd64-linux-tdep.c
> @@ -314,6 +314,30 @@ amd64_linux_write_pc (struct regcache *regcache, CORE_ADDR pc)
>       within GDB.  In all other cases the system call will not be
>       restarted.  */
>    regcache_cooked_write_unsigned (regcache, AMD64_LINUX_ORIG_RAX_REGNUM, -1);
> +
> +  /* If we have interrupted a restart-able AMX instruction we should clear

Odd hyphenization given "restartable" is a word.  (and you used it in the commit log.)

> +     start_row.  Any instructions we will now run should start at row 0.  */
> +  i386_gdbarch_tdep *tdep
> +      = (i386_gdbarch_tdep *) gdbarch_tdep (regcache->arch ());
> +  if (tdep != nullptr && tdep->tilecfg_raw_regnum != -1)
> +    {
> +      gdb_byte tilecfg_buf[register_size (regcache->arch (),
> +					  tdep->tilecfg_raw_regnum)];
> +
> +      if (regcache->raw_read (tdep->tilecfg_raw_regnum, tilecfg_buf)
> +	  != REG_VALID)
> +	{
> +	  warning (_ ("Could not reset $tilecfg.start_row."));
> +	  return;
> +	}
> +
> +      /* start_row is the second byte.  */
> +      if (tilecfg_buf[1] != 0)
> +	{
> +	  tilecfg_buf[1] = 0;
> +	  regcache->raw_write (AMD64_AMX_TILECFG_RAW_REGNUM, tilecfg_buf);
> +	}
> +    }
>  }
>  


> +standard_testfile
> +
> +if { [prepare_for_testing "failed to prepare" ${testfile} ${srcfile} debug] } {
> +    return -1
> +}

I'd use build_executable instead here.  See below for why.

> +
> +proc test_startrow {test} {
> +    global gdb_prompt hex decimal srcfile
> +
> +    if { ![runto_main] } {
> +	untested "could not run to main"
> +	return -1
> +    }
> +
> +    set line1 [gdb_get_line_number "BP1"]
> +    set line2 [gdb_get_line_number "BP2"]
> +    set line_jump [gdb_get_line_number "Jump"]
> +    gdb_breakpoint $line1
> +    gdb_breakpoint $line2
> +
> +    gdb_continue_to_breakpoint "line1" ".*$srcfile:$line1.*"
> +
> +    # Set a watchpoint on the first page, which is un-protected.

un-protected -> unprotected ?

> +    set watch_addr 0
> +    gdb_test_multiple "p/x p2 - 8" "get watch_addr" {
> +	-re -wrap "= ($hex)" {
> +	    set watch_addr $expect_out(1,string)
> +	    pass $gdb_test_name
> +	}
> +    }

This could be instead:

  set watch_addr [get_valueof "/x" "p2 - 8" 0 "get watch_addr"]

> +
> +    # If we didn't get a watch_addr, it makes no sense to continue.
> +    if { $watch_addr == 0 } {
> +	return -1
> +    }
> +
> +    gdb_test "rwatch *(int*) $watch_addr" \
> +	"atchpoint $decimal: \\*\\(int\\*\\) $watch_addr"

This needs an explicit test name to avoid having addresses leak to gdb.sum results.

> +
> +    gdb_test "continue" \
> +	"Continuing.*atchpoint $decimal: \\*\\(int\\*\\) $watch_addr.*"
> +
> +    gdb_test "p \$tilecfg.start_row" "= \[1-9\]+" "print non-zero start_row"
> +
> +    if { $test == "jump" } {
> +	# Test jump.
> +	gdb_test "jump $line_jump" "Breakpoint $decimal, .*$srcfile:$line2.*"

Likewise, better avoid line numbers in gdb.sum.

Please double check other tests your adding for addresses and line numbers
in gdb.sum.

> +	gdb_test "p \$tilecfg.start_row" "= 0"
> +    } else {
> +	# Test infcall.
> +	gdb_test "p square (2, 2)" "Breakpoint $decimal, .*$srcfile:$line2.*"
> +	gdb_test "p \$tilecfg.start_row" "= 0"
> +    }
> +}
> +
> +with_test_prefix "infcall" {
> +    test_startrow ""
> +}
> +
> +clean_restart $binfile
> +
> +with_test_prefix "jump" {
> +    test_startrow "jump"
> +}

I'd write this instead:

foreach_with_prefix test {"infcall" "jump"} {
    clean_restart $binfile
    test_startrow $test
}

and this is the reason for using build_executable instead of prepare_for_testing
at the top -- because this loop calls clean_restart itself.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/4] gdb, gdbserver: Add AMX registers.
  2022-06-27 18:24           ` Eli Zaretskii
@ 2022-06-27 19:15             ` Pedro Alves
  2022-06-28 12:09               ` Eli Zaretskii
  0 siblings, 1 reply; 27+ messages in thread
From: Pedro Alves @ 2022-06-27 19:15 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: felix.willgerodt, gdb-patches

On 2022-06-27 19:24, Eli Zaretskii wrote:
>> Date: Mon, 27 Jun 2022 19:16:33 +0100
>> Cc: gdb-patches@sourceware.org
>> From: Pedro Alves <pedro@palves.net>
>>
>> On 2022-05-11 12:41, Eli Zaretskii via Gdb-patches wrote:
>>>> From: "Willgerodt, Felix" <felix.willgerodt@intel.com>
>>>> CC: "gdb-patches@sourceware.org" <gdb-patches@sourceware.org>
>>>> Date: Wed, 11 May 2022 08:14:26 +0000
>>>>
>>>>> Please make @cindex entries start with a lower-case letter.
>>>>> Otherwise, the index could be sorted differently in different locales.
>>>>>
>>>>
>>>> I tried it, it just doesn't look right to me if I don't use capital letters.
>>>> (Writing "advanced Matrix Extensions" or "advanced matrix extensions".)
>>>>
>>>> There are many index entries starting with capital letters, e.g. AArch64,
>>>> ARM or Ada. I see that Intel MPX is added as "Intel Memory Protection
>>>> Extensions (MPX)". Features of other vendors/architectures seem to
>>>> have similar formatting, like "AArch64 SVE" or "AArch64 Memory
>>>> Tagging Extension". Can I use the same formatting for AMX?
>>>> E.g. "Intel Advanced Memory Extensions (AMX)".
>>>
>>> If you start with "Intel" (or another non-word), yes.
>>
>> Hi Eli,
>>
>> I'm curious about this.  What is different between "Intel" and "Advanced" here,
>> wrt to locale, since they are both upper case?
> 
> The locale doesn't matter, but people do.  "Intel" is not a word, so
> it doesn't really matter in what place in the sort order it winds up,
> as long as we always spell it with the capital "I".  But if "Advanced"
> is in one place and "advanced" is in a very different place, people
> may have trouble finding one or the other, if they look up stuff in
> alphabetical order.
> 

Given the manual is written in US English, I wonder why we let locale influence
sorting order.  I mean, shouldn't we be forcing locale to LANC=C or some such when
generating the manual, to be sure the sections are always sorted the same way?
(At least the html manual sorts the concept index ignoring case for me, and I see
the same in the docs copy in the gdb website, so I assume that's the order we want.)

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/4] gdb, gdbserver: Add AMX registers.
  2022-06-27 19:15             ` Pedro Alves
@ 2022-06-28 12:09               ` Eli Zaretskii
  2022-06-28 13:35                 ` Pedro Alves
  0 siblings, 1 reply; 27+ messages in thread
From: Eli Zaretskii @ 2022-06-28 12:09 UTC (permalink / raw)
  To: Pedro Alves; +Cc: felix.willgerodt, gdb-patches

> Date: Mon, 27 Jun 2022 20:15:17 +0100
> Cc: felix.willgerodt@intel.com, gdb-patches@sourceware.org
> From: Pedro Alves <pedro@palves.net>
> 
> Given the manual is written in US English, I wonder why we let locale influence
> sorting order.  I mean, shouldn't we be forcing locale to LANC=C or some such when
> generating the manual, to be sure the sections are always sorted the same way?

Doing this means we cannot include any non-ASCII text in the manual,
ever.  Not even mention names of people whose names include non-ASCII
characters.

Also, I think doing that means the Unicode characters produced by
makeinfo from the likes of @result, @print, @error, etc. will be
replaced by their ASCII equivalents -- do we really want that?

In sum, this would be an unusual thing to do, as GNU manuals go.
There is actually in recent years an urge to produce UTF-8 encoded
manuals, not go back to plain ASCII.

> (At least the html manual sorts the concept index ignoring case for me, and I see
> the same in the docs copy in the gdb website, so I assume that's the order we want.)

The purpose of this convention to let everyone produce a manual with
the same order, regardless of the locale in which the manual is
produced.  That manuals on the site, which are produced in en_US, do
TRT doesn't surprise me at all...

The burden in practice is not too heavy, since only a handful of our
index entries use company names (or any proper names, for that matter).

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/4] gdb, gdbserver: Add AMX registers.
  2022-06-28 12:09               ` Eli Zaretskii
@ 2022-06-28 13:35                 ` Pedro Alves
  0 siblings, 0 replies; 27+ messages in thread
From: Pedro Alves @ 2022-06-28 13:35 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: felix.willgerodt, gdb-patches

On 2022-06-28 13:09, Eli Zaretskii wrote:
>> Date: Mon, 27 Jun 2022 20:15:17 +0100
>> Cc: felix.willgerodt@intel.com, gdb-patches@sourceware.org
>> From: Pedro Alves <pedro@palves.net>
>>
>> Given the manual is written in US English, I wonder why we let locale influence
>> sorting order.  I mean, shouldn't we be forcing locale to LANC=C or some such when
>> generating the manual, to be sure the sections are always sorted the same way?
> 
> Doing this means we cannot include any non-ASCII text in the manual,
> ever.  Not even mention names of people whose names include non-ASCII
> characters.
> 

Are you sure?  I hacked in some non-ASCII characters, and did:

 cd gdb/doc
 LANG=C make
 LANG=C make html
 
and the non-ASCII characters were not lost, in either the texinfo manual nor the
html manual.

I also tried LC_COLLATE=C, LC_ALL=C.

> Also, I think doing that means the Unicode characters produced by
> makeinfo from the likes of @result, @print, @error, etc. will be
> replaced by their ASCII equivalents -- do we really want that?
> 
> In sum, this would be an unusual thing to do, as GNU manuals go.
> There is actually in recent years an urge to produce UTF-8 encoded
> manuals, not go back to plain ASCII.

That was just an example, I did say "or some such".  Could be LANG=C.UTF8 instead,
or LC_COLLATE=en_US.UTF-8, or some other setting, maybe even some texinfo flag
or something.  I don't know what influences texinfo's sorting behavior exactly.

Actually, I am surprised that LANG/LC_COLLATE/LC_ALL=C doesn't make the cindex
entries sort uppercase before lowercase.  I wonder what else one needs to do
to reproduce such an order without resorting to custom collations.

> 
>> (At least the html manual sorts the concept index ignoring case for me, and I see
>> the same in the docs copy in the gdb website, so I assume that's the order we want.)
> 
> The purpose of this convention to let everyone produce a manual with
> the same order, regardless of the locale in which the manual is
> produced.  That manuals on the site, which are produced in en_US, do
> TRT doesn't surprise me at all...
> 
> The burden in practice is not too heavy, since only a handful of our
> index entries use company names (or any proper names, for that matter).
> 

In the case at and, it forced the prepending of "Intel" to the name, so
someone that wouldn't find "Advanced" because they were looking for it
near the lowercase "a..." entries won't find it either.  It just seems a
little weird to me, to not be able to use uppercase in acronyms, since they
are names.  But as you say, it's not much of a burden, assuming we can
find some prefix word, like "Intel".  Myself, I don't think I actually read the index
entries sequentially, ever -- I instead hit the search function in the browser and type
what I'm looking for, so whether "advanced" is the first word or the second, doesn't
really matter.  I guess it might matter more for printed versions of the manual.  But
then in such case, I don't know whether I'd remember to look up under "Intel" rather
than "Advanced", and would probably end up just skimming the whole cindex for AMX.

Anyhow...  Thanks for the clarifications.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH 2/4] gdb, gdbserver: Add AMX registers.
  2022-06-27 18:12   ` Pedro Alves
@ 2022-07-14 10:54     ` Willgerodt, Felix
  2022-07-15 11:51       ` Willgerodt, Felix
  2022-08-08  9:15     ` Willgerodt, Felix
  1 sibling, 1 reply; 27+ messages in thread
From: Willgerodt, Felix @ 2022-07-14 10:54 UTC (permalink / raw)
  To: Pedro Alves, gdb-patches

> -----Original Message-----
> From: Pedro Alves <pedro@palves.net>
> Sent: Montag, 27. Juni 2022 20:12
> To: Willgerodt, Felix <felix.willgerodt@intel.com>; gdb-
> patches@sourceware.org
> Subject: Re: [PATCH 2/4] gdb, gdbserver: Add AMX registers.
> 
> Hi Felix,
> 
> This largely looks good to me, though I have a couple questions.  See below.
> 

Hi Pedro,

Thanks for your review. Sorry for taking so long to reply, see my comments
below.

> On 2022-05-06 13:12, Felix Willgerodt via Gdb-patches wrote:
> 
> >
> > +/* A helper function to re-size AMX pseudo registers during reads.  Copies
> > +   the contents from RAW_BUF to BUF and re-sizes the value.  */
> 
> I think this should say what does it mean when TILECFG is NULL.

The next version will add a sentence.

> > +
> > +static void
> > +amd64_tmm_resize_read (const tilecfg_reg *tilecfg, const gdb_byte
> *raw_buf,
> > +		       gdb_byte *buf, value *result_value, const int tmmnum)
> > +{
> > +  uint16_t columns = 64;
> > +  uint8_t rows = 16;
> > +
> > +  if (tilecfg != nullptr)
> > +    {
> > +      columns = tilecfg->bytes_per_row (tmmnum);
> > +      rows = tilecfg->rows (tmmnum);
> > +      if (columns == 0)
> > +	columns = 64;
> > +      if (rows == 0)
> > +	rows = 16;
> > +    }
> > +
> > +  gdb_assert (TYPE_LENGTH (value_type (result_value)) >= rows *
> columns);
> > +
> > +  /* Copy each row from raw_buf into buf.  The rows are not consecutive
> > +     but they are on MAX_BYTES_PER_ROW * iRow position.  */
> > +  const gdb_byte *raw_buf_offset
> > +    = raw_buf + tmmnum * tilecfg->MAX_BYTES_PER_TILE;
> > +  for (uint8_t iRow = 0; iRow < rows; ++iRow)
> > +    {
> > +      memcpy (buf + columns * iRow,
> > +	      raw_buf_offset + tilecfg->MAX_BYTES_PER_ROW * iRow,
> > +	      columns);
> > +    }
> > +
> > +  /* Adjust the result_value.  The value is a union of matrices of different
> > +     types.  See i386_tmm_type ().  This iterates over each member and
> > +     adjusts the dimensions according to the type.  */
> > +  for (int i = 0; i < value_type (result_value)->num_fields (); ++i)
> > +    {
> > +      type *rows_type = value_type (result_value)->fields ()[i].m_type;
> > +      type *cols_type = rows_type->main_type->target_type;
> > +
> > +      /* Adjust array bit lengths.  */
> > +      rows_type->length = columns * rows;
> > +      cols_type->length = columns;
> > +
> > +      /* Adjust array dimensions.  */
> > +      rows_type->bounds ()->high.set_const_val (rows - 1);
> > +      int num_bytes = cols_type->main_type->target_type->length;
> > +      cols_type->bounds ()->high.set_const_val (columns / num_bytes - 1);
> 
> Does any other target do in-place type rewriting like this?

I am not aware of anyone else that has done this exactly. ARM SVE has the
easier case of having only a vector, that you can just cut off or extend at the end.


>  That seems fishy.
> What happens e.g.,
> to values already in the value history that were recorded before the
> dimensions changed, for
> instance?  Will they suddenly start re-printing differently / incorrectly with
> their type changed
> behind their back?
> 
> Like:
> 
>  (gdb) print $reg  # some register or value mapped to a register that that ends
> up in the function above
>  $1 = ...  # before type changes
>  # something happens and the AMX type changes.
>  (gdb) print $reg
>  $2 = ...  # reflects type change
>  (gdb) print $1
>  $3 = ...  # what type does GDB use here?
> 
> Do the new tests cover something like this already?

No they don't cover this. A tilecfg change flushes the tmm register though.
When I set the tilecfg manually in GDB, indeed $1 changes as well.

	(gdb) p $tmm0.m_int8
	$1 = {{5, 5, 5, 5, 6, 6, 6, 6}}
	(gdb) p $tilecfg.tile0_colsb
	$2 = 8
	(gdb) p $tilecfg.tile0_colsb = 4
	$3 = 4
	(gdb) p $tmm0.m_int8
	$5 = {{5, 5, 5, 5}}
	(gdb) p $1
	$6 = {{5, 5, 5, 5}}

Good catch, I didn't think of this. We should fix that.

> This may likewise affect, e.g., watchpoints and displays.
> 
> I haven't traced the new code to check where do those types originally come
> from, but maybe it
> would work to reuse/extend the vla support to make those types have
> dynamic length and
> bounds (TYPE_DYNAMIC_LENGTH, DYN_PROP_BYTE_SIZE, etc.).

I have looked a bit at the dynamic length for types now, but that doesn't
seem to account for dimensions, just (byte) length or rank.
Or at least I don't see how we could use it here.

> Or maybe just tweak these functions such that you create a new type
> instead of changing the
> original type.  I don't know how frequently the array dimentions change and
> how open
> ended the dimensions are, but caching the type keyed on row/col sizes may
> work well to
> spare creating too many types, or actually creating them all the time.

I tried implementing this approach a while ago (without any type caching).
Having a i386_tmm_type() accept dimensions, creating the type directly.
And returning that instead of the manual resize.
The problem was that in value.c:value_fetch_lazy_register(), gdb just
copies the contents of NEW_VAL to VAL, assuming the same
type/length/dimensions. The "old" VAL comes from
findvar.c:value_of_register_lazy(), where it is fetched using regcache.c:register_type().
Which looks at regcache_descr->register_type.
In regcache.c, I see this old comment:

  /* Lay out the register cache.

     NOTE: cagney/2002-05-22: Only register_type () is used when
     constructing the register cache.  It is assumed that the
     register's raw size, virtual size and type length are all the
     same.  */

(What even is a virtual size?)

I struggle to figure out how to best address this.
Maybe allowing for multiple entries per register in the register_type table in regcache?
Not sure how much effort that is or if there are any other implications.

Or I could call gdbarch_register_type in regcache.c:register_type() again?
Maybe only conditionally, if the register_type was marked with a dynamic property?
Indicating that it can change at runtime and only the arch can figure it out.
But would that even solve the "$1 issue"?

I am really happy about any pointers.

> > +    }
> > +}
> > +
> 
> 
> >  static struct value *
> >  amd64_pseudo_register_read_value (struct gdbarch *gdbarch,
> >  				  readable_regcache *regcache,
> > @@ -401,6 +511,58 @@ amd64_pseudo_register_read_value (struct
> gdbarch *gdbarch,
> >  	mark_value_bytes_unavailable (result_value, 0,
> >  				      TYPE_LENGTH (value_type
> (result_value)));
> >      }
> > +  else if (i386_tilecfg_regnum_p (gdbarch, regnum))
> > +    {
> > +      /* Read tilecfg.  */
> > +      gdb_byte raw_buf[register_size (gdbarch, tdep-
> >tilecfg_raw_regnum)];
> > +      register_status status = regcache->raw_read (tdep-
> >tilecfg_raw_regnum,
> > +						   raw_buf);
> > +      if (status != REG_VALID)
> > +	{
> > +	  mark_value_bytes_unavailable (
> > +	    result_value, 0, TYPE_LENGTH (value_type (result_value)));
> 
> Formatting, "(" never ends a line.  This appears multiple times in the patch
> throughout.

I have addressed all occurrences locally, will be fixed in the next revision.

> > +	}
> > +      else
> > +	{
> > +	  /* Copy palette and start_row.  See tilecfg_type ().  */
> > +	  memcpy (buf, raw_buf, 2 * 1);
> > +	  /* Copy all colsb.  */
> > +	  memcpy (buf + 2, raw_buf + 16, 2 * 8);
> > +	  /* Copy all rows.  */
> > +	  memcpy (buf + 18, raw_buf + 48, 1 * 8);
> > +	}
> > +    }
> > +  else if (i386_tmm_regnum_p (gdbarch, regnum))
> > +    {
> > +      /* Read tilecfg.  */
> > +      gdb_byte tilecfg_buf[register_size (gdbarch, tdep-
> >tilecfg_raw_regnum)];
> > +      register_status status = regcache->raw_read (tdep-
> >tilecfg_raw_regnum,
> > +						   tilecfg_buf);
> > +      if (status != REG_VALID)
> > +	{
> > +	  mark_value_bytes_unavailable (
> > +	    result_value, 0, TYPE_LENGTH (value_type (result_value)));
> 
> Ditto.

Same.

> > +	}
> > +      else
> > +	{
> > +	  tilecfg_reg tilecfg{ tilecfg_buf };
> 
> GDB prefers using ()s for ctors.  Also space before '(' / '{'.  Thus:
> 
> 	  tilecfg_reg tilecfg (tilecfg_buf);

Same.

> > +	  gdb_byte raw_buf[register_size (gdbarch, tdep-
> >tiledata_regnum)];
> > +	  status = regcache->raw_read (tdep->tiledata_regnum, raw_buf);
> > +
> > +	  if (status != REG_VALID)
> > +	    {
> > +	      mark_value_bytes_unavailable (
> > +		result_value, 0, TYPE_LENGTH (value_type (result_value)));
> 
> Formatting.  (I'll stop pointing these out.)

Same.

> > +	    }
> > +	  else
> > +	    {
> > +	      /* Re-size value and copy data.  */
> > +	      amd64_tmm_resize_read (&tilecfg, raw_buf,
> > +				     buf, result_value,
> > +				     regnum - tdep->tmm_regnum);
> > +	    }
> > +	}
> > +    }
> >    else
> >      i386_pseudo_register_read_into_value (gdbarch, regcache, regnum,
> >  					  result_value);
> 
> >
> > +/* AMX tilecfg_reg constructor.  */
> > +
> > +tilecfg_reg::tilecfg_reg (uint8_t *raw_tilecfg) : tilecfg_reg ()
> > +{
> 
> Please write:
> 
> tilecfg_reg::tilecfg_reg (uint8_t *raw_tilecfg)
>   : tilecfg_reg ()
> {
>
> Though that is just calling the default ctor.  I don't think you need
> to that explicitly here.

I formatted it the way you suggested. But we can't drop the tilecfg_reg ()
(at least in this revision). That would make GDB segfault.
It is needed to initialize (and size) columns_and_rows.
We can only change it to an explicit initialization.

> > +  /* Use default values.  */
> > +  if (raw_tilecfg == nullptr)
> > +    return;
> > +
> > +  palette = raw_tilecfg[0];
> > +  start_row = raw_tilecfg[1];
> > +
> > +  /* Read TILECFG column and row values via pointers.
> > +     Columns are represented by 2 bytes and rows are represented
> > +     by 1 byte.  Column pointer which is *uint8_t needs to be converted
> > +     to *uint16_t pointer.  */
> > +  uint16_t *vec_col_pos
> > +      = reinterpret_cast<uint16_t *> (raw_tilecfg +
> COLUMN_MEMORY_OFFSET);
> > +  uint8_t *vec_row_pos = raw_tilecfg + ROW_MEMORY_OFFSET;
> > +
> > +  for (int i = 0; i < MAX_NAMES; i++)
> > +    columns_and_rows[i] = { vec_col_pos[i], vec_row_pos[i] };
> 
> Where was columns_and_rows resized to MAX_NAMES?

In the default ctor, that is always called for the ctor.
That is why I need the initializer line with the explicit call above.

> > +}
> > +
> >  /* Convert a dbx register number REG to the appropriate register
> >     number used by GDB.  */
> >
> > @@ -3307,6 +3389,142 @@ i386_mmx_type (struct gdbarch *gdbarch)
> >    return tdep->i386_mmx_type;
> >  }
> >
> > +/* Construct vector type for TMM registers.  */
> > +
> > +static struct type *
> > +i386_tmm_type (struct gdbarch *gdbarch)
> > +{
> > +  i386_gdbarch_tdep *tdep = (i386_gdbarch_tdep *) gdbarch_tdep
> (gdbarch);
> > +
> > +  if (!tdep->i386_tmm_type)
> > +    {
> > +      const struct builtin_type *bt = builtin_type (gdbarch);
> > +
> > +      uint8_t bytes_per_row = tilecfg_reg::MAX_BYTES_PER_ROW;
> > +      uint8_t max_rows = tilecfg_reg::MAX_ROWS;
> > +
> > +      /* The type we're building is this:  */
> > +#if 0
> > +      union __gdb_builtin_type_matrix1024i
> > +      {
> > +	int8_t m_int8[max_rows][bytes_per_row];
> > +	uint8_t m_uint8[max_rows][bytes_per_row];
> > +	int32_t m_int32[max_rows][bytes_per_row/4];
> > +	bfloat16_t m_bfloat16[max_rows][bytes_per_row/2];
> > +	float m_int32[max_rows][bytes_per_row/4];
> > +      };
> > +#endif
> > +
> > +      struct type *t;
> > +      t = arch_composite_type (gdbarch, "builtin_type_tile",
> TYPE_CODE_UNION);
> > +
> > +      append_composite_type_field (
> 
> Formatting.  (same below.)

I struggle a bit with formatting these lines nicely. I changed them
so that no line ever ends on "(".

> > +	  t, "m_int8",
> > +	  init_vector_type (init_vector_type (bt->builtin_int8,
> bytes_per_row),
> > +			    max_rows));
> > +
> > +      append_composite_type_field (
> > +	  t, "m_uint8",
> > +	  init_vector_type (
> > +	      init_vector_type (bt->builtin_uint8, bytes_per_row),
> max_rows));
> > +
> > +      append_composite_type_field (
> > +	  t, "m_int32",
> > +	  init_vector_type (
> > +	      init_vector_type (bt->builtin_int32, bytes_per_row / 4),
> > +	      max_rows));
> > +
> > +      append_composite_type_field (
> > +	  t, "m_bf16",
> > +	  init_vector_type (
> > +	      init_vector_type (bt->builtin_bfloat16, bytes_per_row / 2),
> > +	      max_rows));
> > +
> > +      append_composite_type_field (
> > +	  t, "m_fp32",
> > +	  init_vector_type (
> > +	      init_vector_type (bt->builtin_float, bytes_per_row / 4),
> > +	      max_rows));
> > +
> > +      t->set_is_vector (true);
> > +      t->set_name ("builtin_type_tile");
> > +      tdep->i386_tmm_type = t;
> > +    }
> > +
> > +  return tdep->i386_tmm_type;
> > +}
> 
> 
> > @@ -348,7 +376,7 @@ enum record_i386_regnum
> >  #define I386_NUM_REGS		(I386_GSBASE_REGNUM + 1)
> >
> >  /* Size of the largest register.  */
> > -#define I386_MAX_REGISTER_SIZE	64
> > +#define I386_MAX_REGISTER_SIZE	8192
> >
> 
> *Eek*

Sorry. That is why Patch 4 exists ;)

> >  /* Types for i386-specific registers.  */
> >  extern struct type *i387_ext_type (struct gdbarch *gdbarch);
> > @@ -366,6 +394,8 @@ extern int i386_k_regnum_p (struct gdbarch
> *gdbarch, int regnum);
> >  extern int i386_zmm_regnum_p (struct gdbarch *gdbarch, int regnum);
> >  extern int i386_zmmh_regnum_p (struct gdbarch *gdbarch, int regnum);
> >  extern bool i386_pkru_regnum_p (struct gdbarch *gdbarch, int regnum);
> > +extern bool i386_tilecfg_regnum_p (struct gdbarch *gdbarch, int
> regnum);
> > +extern bool i386_tmm_regnum_p (struct gdbarch *gdbarch, int regnum);
> >
> >  extern const char *i386_pseudo_register_name (struct gdbarch *gdbarch,
> >  					      int regnum);
> > @@ -485,4 +515,94 @@ extern int i386_stap_is_single_operand (struct
> gdbarch *gdbarch,
> >  extern expr::operation_up i386_stap_parse_special_token
> >       (struct gdbarch *gdbarch, struct stap_parse_info *p);
> >
> > +/* AMX utilities.  */
> > +
> > +/* TILECFG register.
> > +   0       palette
> > +   1       start_row
> > +   2-15    reserved, must be zero
> > +   16-17   tile0.colsb Tile 0 bytes per row.
> > +   18-19   tile1.colsb Tile 1 bytes per row.
> > +   20-21   tile2.colsb Tile 2 bytes per row.
> > +   ...     (sequence continues)
> > +   30-31   tile7.colsb Tile 7 bytes per row.
> > +   32-47   reserved, must be zero
> > +   48      tile0.rows Tile 0 rows.
> > +   49      tile1.rows Tile 1 rows.
> > +   50      tile2.rows Tile 2 rows.
> > +   ...     (sequence continues)
> > +   55      tile7.rows Tile 7 rows.
> > +   56-63   reserved, must be zero.  */
> > +
> > +/* TILECFG class representing the AMX Tilecfg register.  */
> > +
> > +class tilecfg_reg
> > +{
> > +public:
> > +  tilecfg_reg ()
> > +      : columns_and_rows (
> > +	  std::vector<std::pair<uint16_t, uint8_t>> (MAX_NAMES, { 0, 0 }))
> 
> Would it be possible to use a struct instead of a pair?  Pairs are kind of
> horrible
> for readability.  Some proper field names would be much clearer than "first"
> and "second".
> std::pair is great for general purpose templates, not so much otherwise.

I do agree, yet here it is not too bad to read IMO.
The pair has the advantage of not having to write more operators and ctors.
As they are already defined for pairs. Not for a vector of structs.
We need to be able to compare them.
That said, I won't insist if you still want it to be a struct. It is straightforward.

> Also, how about doing this columns_and_rows initialization where the
> field is declared?

I don't see much wrong with the current way. I can also add an explicit
initializer instead of the default ctor call. I don't think I can add it in the
class definition where it is declared. That doesn't compile.

> > +  {
> > +  }
> > +
> > +  /* Construct it from raw tilecfg data.  */
> > +  explicit tilecfg_reg (uint8_t *raw_tilecfg);
> 
> As pointed out above, the implementation of this ctor doesn't seem to resize
> the
> columns_and_rows vector, and just accesses the elements straight away:
> 
>    for (int i = 0; i < MAX_NAMES; i++)
>      columns_and_rows[i] = { vec_col_pos[i], vec_row_pos[i] };
>  }
> 
> Moving the initialization to the member declaration would fix it,
> but maybe I'm missing something.

See above. The default ctor is called before the ctor.

> > +
> > +private:
> > +  /* This stores the colsb and rows entries.  */
> 
> I guess "colsb" is a typo for "cols"?

It isn't. It comes from the Intel Documentation wording. Indicating byte
columns I suppose. I will try to unify it. Probably more confusing than helping.

> > +  std::vector<std::pair<uint16_t, uint8_t>> columns_and_rows;
> > +};
> > +
> 
> > +set line1 [gdb_get_line_number "BP1"]
> > +set line2 [gdb_get_line_number "BP2"]
> > +gdb_breakpoint $line1
> > +gdb_breakpoint $line2
> > +
> > +# Registers should be displayed as zeroed before AMX enablement.
> > +with_test_prefix "Before AMX is enabled" {
> 
> Lowercase "Before".

Done

> > +    gdb_test "print \$tilecfg_raw" "= 0"
> > +    for {set i 0} {$i < 8} {incr i} {
> > +	test_zeroed_tile "\$tmm$i"
> > +    }
> > +}
> > +
> > +
> 
> > +gdb_test "continue" \
> > +    ".*\\\[Inferior $decimal \\\(process $decimal\\\) exited normally\\]"
> 
> gdb_continue_to_end

This one would be a good and obvious suggestion. But for me this
makes the test fail with the native-gdbserver board.

(gdb) PASS: gdb.arch/amd64-amx.exp: set tilecfg: print $tilecfg
continue^M
Continuing.^M
Remote connection closed^M
(gdb) FAIL: gdb.arch/amd64-amx.exp: continue until exit

This seems like a bug with gdb_continue_to_end.
I looked at the function and couldn't quickly figure out how to best
address this.

> > diff --git a/gdb/testsuite/lib/gdb.exp b/gdb/testsuite/lib/gdb.exp
> > index 47cb2b23676..97664191244 100644
> > --- a/gdb/testsuite/lib/gdb.exp
> > +++ b/gdb/testsuite/lib/gdb.exp
> > @@ -3429,6 +3429,73 @@ gdb_caching_proc skip_tsx_tests {
> >      return $skip_tsx_tests
> >  }
> >
> > +# Run a test on the target to see if it supports AMX.  Return 0 if so,
> > +# 1 if it does not.  Based on 'check_vmx_hw_available' from the GCC
> testsuite.
> > +
> > +gdb_caching_proc skip_amx_tests {
> > +    global srcdir subdir gdb_prompt inferior_exited_re
> > +
> > +    set me "skip_amx_tests"
> > +    if { ![istarget "x86_64-*-*"] } {
> > +	verbose "$me:  target does not support AMX, returning 1" 2
> 
> Suprious double space after "me:".

This isn't spurious, but the style of all these skip_feature_tests functions.
Not that I like the style, but I would prefer to keep it consistent.

Thanks,
Felix
Intel Deutschland GmbH
Registered Address: Am Campeon 10, 85579 Neubiberg, Germany
Tel: +49 89 99 8853-0, www.intel.de <http://www.intel.de>
Managing Directors: Christin Eisenschmid, Sharon Heck, Tiffany Doon Silva  
Chairperson of the Supervisory Board: Nicole Lau
Registered Office: Munich
Commercial Register: Amtsgericht Muenchen HRB 186928

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH 2/4] gdb, gdbserver: Add AMX registers.
  2022-07-14 10:54     ` Willgerodt, Felix
@ 2022-07-15 11:51       ` Willgerodt, Felix
  0 siblings, 0 replies; 27+ messages in thread
From: Willgerodt, Felix @ 2022-07-15 11:51 UTC (permalink / raw)
  To: Willgerodt, Felix, Pedro Alves, gdb-patches

> > > +gdb_test "continue" \
> > > +    ".*\\\[Inferior $decimal \\\(process $decimal\\\) exited normally\\]"
> >
> > gdb_continue_to_end
> 
> This one would be a good and obvious suggestion. But for me this
> makes the test fail with the native-gdbserver board.
> 
> (gdb) PASS: gdb.arch/amd64-amx.exp: set tilecfg: print $tilecfg
> continue^M
> Continuing.^M
> Remote connection closed^M
> (gdb) FAIL: gdb.arch/amd64-amx.exp: continue until exit
> 
> This seems like a bug with gdb_continue_to_end.
> I looked at the function and couldn't quickly figure out how to best
> address this.

Sorry, I didn't look closely enough yesterday. This also happens with my
previous test as well. So it is obviously not a bug with
gdb_continue_to_end.

I am still debugging this, I never saw this before a recent rebase.
It seems like gdbserver is exiting too early, I am not even hitting
the BPs for the second matrix branch. 

Regards,
Felix
Intel Deutschland GmbH
Registered Address: Am Campeon 10, 85579 Neubiberg, Germany
Tel: +49 89 99 8853-0, www.intel.de <http://www.intel.de>
Managing Directors: Christin Eisenschmid, Sharon Heck, Tiffany Doon Silva  
Chairperson of the Supervisory Board: Nicole Lau
Registered Office: Munich
Commercial Register: Amtsgericht Muenchen HRB 186928

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH 2/4] gdb, gdbserver: Add AMX registers.
  2022-06-27 18:12   ` Pedro Alves
  2022-07-14 10:54     ` Willgerodt, Felix
@ 2022-08-08  9:15     ` Willgerodt, Felix
  2022-08-08 17:16       ` John Baldwin
  1 sibling, 1 reply; 27+ messages in thread
From: Willgerodt, Felix @ 2022-08-08  9:15 UTC (permalink / raw)
  To: Pedro Alves, gdb-patches

> -----Original Message-----
> From: Willgerodt, Felix
> Sent: Donnerstag, 14. Juli 2022 12:55
> To: Pedro Alves <pedro@palves.net>; gdb-patches@sourceware.org
> Subject: RE: [PATCH 2/4] gdb, gdbserver: Add AMX registers.
> 
> > -----Original Message-----
> > From: Pedro Alves <pedro@palves.net>
> > Sent: Montag, 27. Juni 2022 20:12
> > To: Willgerodt, Felix <felix.willgerodt@intel.com>; gdb-
> > patches@sourceware.org
> > Subject: Re: [PATCH 2/4] gdb, gdbserver: Add AMX registers.
> >
> > Hi Felix,
> >
> > This largely looks good to me, though I have a couple questions.  See
> below.
> >
> 
> Hi Pedro,
> 
> Thanks for your review. Sorry for taking so long to reply, see my comments
> below.
> 
> > On 2022-05-06 13:12, Felix Willgerodt via Gdb-patches wrote:
> >
> > >
> > > +/* A helper function to re-size AMX pseudo registers during reads.
> Copies
> > > +   the contents from RAW_BUF to BUF and re-sizes the value.  */
> >
> > I think this should say what does it mean when TILECFG is NULL.
> 
> The next version will add a sentence.
> 
> > > +
> > > +static void
> > > +amd64_tmm_resize_read (const tilecfg_reg *tilecfg, const gdb_byte
> > *raw_buf,
> > > +		       gdb_byte *buf, value *result_value, const int tmmnum)
> > > +{
> > > +  uint16_t columns = 64;
> > > +  uint8_t rows = 16;
> > > +
> > > +  if (tilecfg != nullptr)
> > > +    {
> > > +      columns = tilecfg->bytes_per_row (tmmnum);
> > > +      rows = tilecfg->rows (tmmnum);
> > > +      if (columns == 0)
> > > +	columns = 64;
> > > +      if (rows == 0)
> > > +	rows = 16;
> > > +    }
> > > +
> > > +  gdb_assert (TYPE_LENGTH (value_type (result_value)) >= rows *
> > columns);
> > > +
> > > +  /* Copy each row from raw_buf into buf.  The rows are not
> consecutive
> > > +     but they are on MAX_BYTES_PER_ROW * iRow position.  */
> > > +  const gdb_byte *raw_buf_offset
> > > +    = raw_buf + tmmnum * tilecfg->MAX_BYTES_PER_TILE;
> > > +  for (uint8_t iRow = 0; iRow < rows; ++iRow)
> > > +    {
> > > +      memcpy (buf + columns * iRow,
> > > +	      raw_buf_offset + tilecfg->MAX_BYTES_PER_ROW * iRow,
> > > +	      columns);
> > > +    }
> > > +
> > > +  /* Adjust the result_value.  The value is a union of matrices of different
> > > +     types.  See i386_tmm_type ().  This iterates over each member and
> > > +     adjusts the dimensions according to the type.  */
> > > +  for (int i = 0; i < value_type (result_value)->num_fields (); ++i)
> > > +    {
> > > +      type *rows_type = value_type (result_value)->fields ()[i].m_type;
> > > +      type *cols_type = rows_type->main_type->target_type;
> > > +
> > > +      /* Adjust array bit lengths.  */
> > > +      rows_type->length = columns * rows;
> > > +      cols_type->length = columns;
> > > +
> > > +      /* Adjust array dimensions.  */
> > > +      rows_type->bounds ()->high.set_const_val (rows - 1);
> > > +      int num_bytes = cols_type->main_type->target_type->length;
> > > +      cols_type->bounds ()->high.set_const_val (columns / num_bytes -
> 1);
> >
> > Does any other target do in-place type rewriting like this?
> 
> I am not aware of anyone else that has done this exactly. ARM SVE has the
> easier case of having only a vector, that you can just cut off or extend at the
> end.
> 
> 
> >  That seems fishy.
> > What happens e.g.,
> > to values already in the value history that were recorded before the
> > dimensions changed, for
> > instance?  Will they suddenly start re-printing differently / incorrectly with
> > their type changed
> > behind their back?
> >
> > Like:
> >
> >  (gdb) print $reg  # some register or value mapped to a register that that
> ends
> > up in the function above
> >  $1 = ...  # before type changes
> >  # something happens and the AMX type changes.
> >  (gdb) print $reg
> >  $2 = ...  # reflects type change
> >  (gdb) print $1
> >  $3 = ...  # what type does GDB use here?
> >
> > Do the new tests cover something like this already?
> 
> No they don't cover this. A tilecfg change flushes the tmm register though.
> When I set the tilecfg manually in GDB, indeed $1 changes as well.
> 
> 	(gdb) p $tmm0.m_int8
> 	$1 = {{5, 5, 5, 5, 6, 6, 6, 6}}
> 	(gdb) p $tilecfg.tile0_colsb
> 	$2 = 8
> 	(gdb) p $tilecfg.tile0_colsb = 4
> 	$3 = 4
> 	(gdb) p $tmm0.m_int8
> 	$5 = {{5, 5, 5, 5}}
> 	(gdb) p $1
> 	$6 = {{5, 5, 5, 5}}
> 
> Good catch, I didn't think of this. We should fix that.
> 
> > This may likewise affect, e.g., watchpoints and displays.
> >
> > I haven't traced the new code to check where do those types originally
> come
> > from, but maybe it
> > would work to reuse/extend the vla support to make those types have
> > dynamic length and
> > bounds (TYPE_DYNAMIC_LENGTH, DYN_PROP_BYTE_SIZE, etc.).
> 
> I have looked a bit at the dynamic length for types now, but that doesn't
> seem to account for dimensions, just (byte) length or rank.
> Or at least I don't see how we could use it here.
> 
> > Or maybe just tweak these functions such that you create a new type
> > instead of changing the
> > original type.  I don't know how frequently the array dimentions change
> and
> > how open
> > ended the dimensions are, but caching the type keyed on row/col sizes
> may
> > work well to
> > spare creating too many types, or actually creating them all the time.
> 
> I tried implementing this approach a while ago (without any type caching).
> Having a i386_tmm_type() accept dimensions, creating the type directly.
> And returning that instead of the manual resize.
> The problem was that in value.c:value_fetch_lazy_register(), gdb just
> copies the contents of NEW_VAL to VAL, assuming the same
> type/length/dimensions. The "old" VAL comes from
> findvar.c:value_of_register_lazy(), where it is fetched using
> regcache.c:register_type().
> Which looks at regcache_descr->register_type.
> In regcache.c, I see this old comment:
> 
>   /* Lay out the register cache.
> 
>      NOTE: cagney/2002-05-22: Only register_type () is used when
>      constructing the register cache.  It is assumed that the
>      register's raw size, virtual size and type length are all the
>      same.  */
> 
> (What even is a virtual size?)
> 
> I struggle to figure out how to best address this.
> Maybe allowing for multiple entries per register in the register_type table in
> regcache?
> Not sure how much effort that is or if there are any other implications.
> 
> Or I could call gdbarch_register_type in regcache.c:register_type() again?
> Maybe only conditionally, if the register_type was marked with a dynamic
> property?
> Indicating that it can change at runtime and only the arch can figure it out.
> But would that even solve the "$1 issue"?
> 
> I am really happy about any pointers.


Hi Pedro,

Did you get a chance to look at this again? I did find a fix for the
issue you pointed out. But I am not sure if my approach is right.

Basically my fix avoids using the type caching for some pseudo regs:

--- a/gdb/regcache.c
+++ b/gdb/regcache.c
@@ -160,7 +160,14 @@ register_type (struct gdbarch *gdbarch, int regnum)
   struct regcache_descr *descr = regcache_descr (gdbarch);
 
   gdb_assert (regnum >= 0 && regnum < descr->nr_cooked_registers);
-  return descr->register_type[regnum];
+
+  /* Some architectures have variable length vector pseudo registers,
+     whose type needs to be re-evaluated at runtime.  */
+  struct type *t = descr->register_type[regnum];
+  if (gdbarch_num_regs (gdbarch) < regnum && t->is_vector ())
+    t = gdbarch_register_type (gdbarch, regnum);
+
+  return t;
 }

I tried to have it like this first:

+  if (gdbarch_num_regs (gdbarch) < regnum && TYPE_DYNAMIC_LENGTH(t))

However a dynamic property needs to be objfile owned (see
gdbtypes.c:add_dyn_prop). Which seems wrong for register types.
Then again, I am not sure if is_vector() would be considered an acceptable
condition.

Would this approach (disabling type caching for certain cases) be good enough?
With this approach I can avoid the "on-the-fly" type resizing in my current patches
and fix the $1 problem.

Thanks,
Felix


Intel Deutschland GmbH
Registered Address: Am Campeon 10, 85579 Neubiberg, Germany
Tel: +49 89 99 8853-0, www.intel.de <http://www.intel.de>
Managing Directors: Christin Eisenschmid, Sharon Heck, Tiffany Doon Silva  
Chairperson of the Supervisory Board: Nicole Lau
Registered Office: Munich
Commercial Register: Amtsgericht Muenchen HRB 186928

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/4] gdb, gdbserver: Add AMX registers.
  2022-08-08  9:15     ` Willgerodt, Felix
@ 2022-08-08 17:16       ` John Baldwin
  0 siblings, 0 replies; 27+ messages in thread
From: John Baldwin @ 2022-08-08 17:16 UTC (permalink / raw)
  To: Willgerodt, Felix, Pedro Alves, gdb-patches

On 8/8/22 2:15 AM, Willgerodt, Felix via Gdb-patches wrote:
>> -----Original Message-----
>> From: Willgerodt, Felix
>> Sent: Donnerstag, 14. Juli 2022 12:55
>> To: Pedro Alves <pedro@palves.net>; gdb-patches@sourceware.org
>> Subject: RE: [PATCH 2/4] gdb, gdbserver: Add AMX registers.
>>
>>> -----Original Message-----
>>> From: Pedro Alves <pedro@palves.net>
>>> Sent: Montag, 27. Juni 2022 20:12
>>> To: Willgerodt, Felix <felix.willgerodt@intel.com>; gdb-
>>> patches@sourceware.org
>>> Subject: Re: [PATCH 2/4] gdb, gdbserver: Add AMX registers.
>>>
>>> Hi Felix,
>>>
>>> This largely looks good to me, though I have a couple questions.  See
>> below.
>>>
>>
>> Hi Pedro,
>>
>> Thanks for your review. Sorry for taking so long to reply, see my comments
>> below.
>>
>>> On 2022-05-06 13:12, Felix Willgerodt via Gdb-patches wrote:
>>>
>>>>
>>>> +/* A helper function to re-size AMX pseudo registers during reads.
>> Copies
>>>> +   the contents from RAW_BUF to BUF and re-sizes the value.  */
>>>
>>> I think this should say what does it mean when TILECFG is NULL.
>>
>> The next version will add a sentence.
>>
>>>> +
>>>> +static void
>>>> +amd64_tmm_resize_read (const tilecfg_reg *tilecfg, const gdb_byte
>>> *raw_buf,
>>>> +		       gdb_byte *buf, value *result_value, const int tmmnum)
>>>> +{
>>>> +  uint16_t columns = 64;
>>>> +  uint8_t rows = 16;
>>>> +
>>>> +  if (tilecfg != nullptr)
>>>> +    {
>>>> +      columns = tilecfg->bytes_per_row (tmmnum);
>>>> +      rows = tilecfg->rows (tmmnum);
>>>> +      if (columns == 0)
>>>> +	columns = 64;
>>>> +      if (rows == 0)
>>>> +	rows = 16;
>>>> +    }
>>>> +
>>>> +  gdb_assert (TYPE_LENGTH (value_type (result_value)) >= rows *
>>> columns);
>>>> +
>>>> +  /* Copy each row from raw_buf into buf.  The rows are not
>> consecutive
>>>> +     but they are on MAX_BYTES_PER_ROW * iRow position.  */
>>>> +  const gdb_byte *raw_buf_offset
>>>> +    = raw_buf + tmmnum * tilecfg->MAX_BYTES_PER_TILE;
>>>> +  for (uint8_t iRow = 0; iRow < rows; ++iRow)
>>>> +    {
>>>> +      memcpy (buf + columns * iRow,
>>>> +	      raw_buf_offset + tilecfg->MAX_BYTES_PER_ROW * iRow,
>>>> +	      columns);
>>>> +    }
>>>> +
>>>> +  /* Adjust the result_value.  The value is a union of matrices of different
>>>> +     types.  See i386_tmm_type ().  This iterates over each member and
>>>> +     adjusts the dimensions according to the type.  */
>>>> +  for (int i = 0; i < value_type (result_value)->num_fields (); ++i)
>>>> +    {
>>>> +      type *rows_type = value_type (result_value)->fields ()[i].m_type;
>>>> +      type *cols_type = rows_type->main_type->target_type;
>>>> +
>>>> +      /* Adjust array bit lengths.  */
>>>> +      rows_type->length = columns * rows;
>>>> +      cols_type->length = columns;
>>>> +
>>>> +      /* Adjust array dimensions.  */
>>>> +      rows_type->bounds ()->high.set_const_val (rows - 1);
>>>> +      int num_bytes = cols_type->main_type->target_type->length;
>>>> +      cols_type->bounds ()->high.set_const_val (columns / num_bytes -
>> 1);
>>>
>>> Does any other target do in-place type rewriting like this?
>>
>> I am not aware of anyone else that has done this exactly. ARM SVE has the
>> easier case of having only a vector, that you can just cut off or extend at the
>> end.
>>
>>
>>>   That seems fishy.
>>> What happens e.g.,
>>> to values already in the value history that were recorded before the
>>> dimensions changed, for
>>> instance?  Will they suddenly start re-printing differently / incorrectly with
>>> their type changed
>>> behind their back?
>>>
>>> Like:
>>>
>>>   (gdb) print $reg  # some register or value mapped to a register that that
>> ends
>>> up in the function above
>>>   $1 = ...  # before type changes
>>>   # something happens and the AMX type changes.
>>>   (gdb) print $reg
>>>   $2 = ...  # reflects type change
>>>   (gdb) print $1
>>>   $3 = ...  # what type does GDB use here?
>>>
>>> Do the new tests cover something like this already?
>>
>> No they don't cover this. A tilecfg change flushes the tmm register though.
>> When I set the tilecfg manually in GDB, indeed $1 changes as well.
>>
>> 	(gdb) p $tmm0.m_int8
>> 	$1 = {{5, 5, 5, 5, 6, 6, 6, 6}}
>> 	(gdb) p $tilecfg.tile0_colsb
>> 	$2 = 8
>> 	(gdb) p $tilecfg.tile0_colsb = 4
>> 	$3 = 4
>> 	(gdb) p $tmm0.m_int8
>> 	$5 = {{5, 5, 5, 5}}
>> 	(gdb) p $1
>> 	$6 = {{5, 5, 5, 5}}
>>
>> Good catch, I didn't think of this. We should fix that.
>>
>>> This may likewise affect, e.g., watchpoints and displays.
>>>
>>> I haven't traced the new code to check where do those types originally
>> come
>>> from, but maybe it
>>> would work to reuse/extend the vla support to make those types have
>>> dynamic length and
>>> bounds (TYPE_DYNAMIC_LENGTH, DYN_PROP_BYTE_SIZE, etc.).
>>
>> I have looked a bit at the dynamic length for types now, but that doesn't
>> seem to account for dimensions, just (byte) length or rank.
>> Or at least I don't see how we could use it here.
>>
>>> Or maybe just tweak these functions such that you create a new type
>>> instead of changing the
>>> original type.  I don't know how frequently the array dimentions change
>> and
>>> how open
>>> ended the dimensions are, but caching the type keyed on row/col sizes
>> may
>>> work well to
>>> spare creating too many types, or actually creating them all the time.
>>
>> I tried implementing this approach a while ago (without any type caching).
>> Having a i386_tmm_type() accept dimensions, creating the type directly.
>> And returning that instead of the manual resize.
>> The problem was that in value.c:value_fetch_lazy_register(), gdb just
>> copies the contents of NEW_VAL to VAL, assuming the same
>> type/length/dimensions. The "old" VAL comes from
>> findvar.c:value_of_register_lazy(), where it is fetched using
>> regcache.c:register_type().
>> Which looks at regcache_descr->register_type.
>> In regcache.c, I see this old comment:
>>
>>    /* Lay out the register cache.
>>
>>       NOTE: cagney/2002-05-22: Only register_type () is used when
>>       constructing the register cache.  It is assumed that the
>>       register's raw size, virtual size and type length are all the
>>       same.  */
>>
>> (What even is a virtual size?)
>>
>> I struggle to figure out how to best address this.
>> Maybe allowing for multiple entries per register in the register_type table in
>> regcache?
>> Not sure how much effort that is or if there are any other implications.
>>
>> Or I could call gdbarch_register_type in regcache.c:register_type() again?
>> Maybe only conditionally, if the register_type was marked with a dynamic
>> property?
>> Indicating that it can change at runtime and only the arch can figure it out.
>> But would that even solve the "$1 issue"?
>>
>> I am really happy about any pointers.
> 
> 
> Hi Pedro,
> 
> Did you get a chance to look at this again? I did find a fix for the
> issue you pointed out. But I am not sure if my approach is right.
> 
> Basically my fix avoids using the type caching for some pseudo regs:
> 
> --- a/gdb/regcache.c
> +++ b/gdb/regcache.c
> @@ -160,7 +160,14 @@ register_type (struct gdbarch *gdbarch, int regnum)
>     struct regcache_descr *descr = regcache_descr (gdbarch);
>   
>     gdb_assert (regnum >= 0 && regnum < descr->nr_cooked_registers);
> -  return descr->register_type[regnum];
> +
> +  /* Some architectures have variable length vector pseudo registers,
> +     whose type needs to be re-evaluated at runtime.  */
> +  struct type *t = descr->register_type[regnum];
> +  if (gdbarch_num_regs (gdbarch) < regnum && t->is_vector ())
> +    t = gdbarch_register_type (gdbarch, regnum);
> +
> +  return t;
>   }
> 
> I tried to have it like this first:
> 
> +  if (gdbarch_num_regs (gdbarch) < regnum && TYPE_DYNAMIC_LENGTH(t))
> 
> However a dynamic property needs to be objfile owned (see
> gdbtypes.c:add_dyn_prop). Which seems wrong for register types.
> Then again, I am not sure if is_vector() would be considered an acceptable
> condition.
> 
> Would this approach (disabling type caching for certain cases) be good enough?
> With this approach I can avoid the "on-the-fly" type resizing in my current patches
> and fix the $1 problem.
> 
> Thanks,
> Felix

I do think that if TYPE_DYNAMIC_LENGTH can't be used that it is probably
worth adding an explicit type flag for this case rather than overloading
is_vector.

-- 
John Baldwin

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2022-08-08 17:16 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-05-06 12:12 [PATCH 0/4] Add AMX support Felix Willgerodt
2022-05-06 12:12 ` [PATCH 1/4] gdb: define int512 and uint512 as built-in types Felix Willgerodt
2022-05-06 12:19   ` Eli Zaretskii
2022-06-27 18:17   ` Pedro Alves
2022-05-06 12:12 ` [PATCH 2/4] gdb, gdbserver: Add AMX registers Felix Willgerodt
2022-05-06 12:25   ` Eli Zaretskii
2022-05-11  8:14     ` Willgerodt, Felix
2022-05-11 11:41       ` Eli Zaretskii
2022-06-27 18:16         ` Pedro Alves
2022-06-27 18:24           ` Eli Zaretskii
2022-06-27 19:15             ` Pedro Alves
2022-06-28 12:09               ` Eli Zaretskii
2022-06-28 13:35                 ` Pedro Alves
2022-05-06 16:17   ` John Baldwin
2022-05-09  7:04     ` Willgerodt, Felix
2022-05-09 16:31       ` John Baldwin
2022-06-27 18:12   ` Pedro Alves
2022-07-14 10:54     ` Willgerodt, Felix
2022-07-15 11:51       ` Willgerodt, Felix
2022-08-08  9:15     ` Willgerodt, Felix
2022-08-08 17:16       ` John Baldwin
2022-05-06 12:12 ` [PATCH 3/4] gdb, gdbserver: Allocate only a sane amount of buffer when fetching registers Felix Willgerodt
2022-05-06 16:08   ` John Baldwin
2022-05-09  7:04     ` Willgerodt, Felix
2022-06-27 18:30   ` Pedro Alves
2022-05-06 12:12 ` [PATCH 4/4] gdb: Clear tilecfg.start_row for any PC modification Felix Willgerodt
2022-06-27 18:55   ` Pedro Alves

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).