public inbox for gdb-patches@sourceware.org
* PATCH: 1/6: Add AVX support
@ 2010-03-04 18:02 H.J. Lu
  2010-03-04 18:05 ` PATCH: 2/6: Add AVX support (Update document) H.J. Lu
                   ` (2 more replies)
  0 siblings, 3 replies; 115+ messages in thread
From: H.J. Lu @ 2010-03-04 18:02 UTC (permalink / raw)
  To: GDB

AVX registers are saved and restored via the XSAVE extended state. The
extended control register 0, XCR0 (the XFEATURE_ENABLED_MASK register),
determines which states (x87, SSE, AVX, ...) are supported in the XSAVE
extended state.  XCR0 can be read with the new "xgetbv" instruction.
The xstate_bv field at byte offset 512 in the XSAVE extended state
indicates which states are in use by the current process. If a
feature bit is clear, the corresponding registers should be read as
0. If we update a register, we should set the corresponding feature
bit.
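
The xstate_bv rule above can be sketched as follows. This is only an
illustration, not code from the patches: the helper names are
hypothetical, the buffer is assumed to be a raw XSAVE area in the
standard layout (AVX state at bytes 576..831), and the host is assumed
to be little endian, as on x86.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Feature bits, as used in XCR0 and xstate_bv.  */
#define XSTATE_X87 (1ULL << 0)
#define XSTATE_SSE (1ULL << 1)
#define XSTATE_AVX (1ULL << 2)

#define XSTATE_BV_OFFSET 512	/* xstate_bv, start of the XSAVE header.  */
#define AVX_STATE_OFFSET 576	/* Upper halves of the YMM registers.  */
#define AVX_STATE_SIZE   256	/* 16 registers x 16 bytes.  */

/* Hypothetical helper: fetch xstate_bv from a raw XSAVE buffer.  */
uint64_t
xsave_get_xstate_bv (const unsigned char *xsave)
{
  uint64_t bv;

  assert (xsave != NULL);
  memcpy (&bv, xsave + XSTATE_BV_OFFSET, sizeof bv);
  return bv;
}

/* Hypothetical helper: read the upper 128 bits of YMM register N.
   If the AVX bit is clear, the register reads as zero.  */
void
xsave_read_ymmh (const unsigned char *xsave, int n, unsigned char out[16])
{
  if (xsave_get_xstate_bv (xsave) & XSTATE_AVX)
    memcpy (out, xsave + AVX_STATE_OFFSET + 16 * n, 16);
  else
    memset (out, 0, 16);
}

/* Hypothetical helper: write the upper 128 bits of YMM register N
   and set the AVX feature bit.  If the bit was clear, the rest of
   the AVX area is zeroed first so stale bytes do not become valid.  */
void
xsave_write_ymmh (unsigned char *xsave, int n, const unsigned char in[16])
{
  uint64_t bv = xsave_get_xstate_bv (xsave);

  if (!(bv & XSTATE_AVX))
    memset (xsave + AVX_STATE_OFFSET, 0, AVX_STATE_SIZE);
  memcpy (xsave + AVX_STATE_OFFSET + 16 * n, in, 16);

  bv |= XSTATE_AVX;
  memcpy (xsave + XSTATE_BV_OFFSET, &bv, sizeof bv);
}
```

Zeroing the AVX area on the first write is a choice made for this
sketch; the patches handle it in i387_collect_xsave.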

We added PTRACE_GETREGSET and PTRACE_SETREGSET to the Linux kernel to
fetch and store AVX registers with ptrace. The Linux kernel also stores
XCR0 in the first 8 bytes of the software-usable bytes, starting at
byte offset 464.
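
As a rough sketch of how a native debugger might use this interface:
the function names below are hypothetical, TID is assumed to be a
thread that is already attached and stopped, and the 576-byte probe
size (legacy area plus XSAVE header) is an assumption that follows the
layout described above.

```c
#include <assert.h>
#include <elf.h>		/* NT_X86_XSTATE */
#include <stdint.h>
#include <string.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/uio.h>		/* struct iovec */

#ifndef NT_X86_XSTATE
#define NT_X86_XSTATE 0x202
#endif
#ifndef PTRACE_GETREGSET
#define PTRACE_GETREGSET 0x4204
#endif

#define XSAVE_XCR0_OFFSET 464	/* First software-usable bytes.  */
#define XSTATE_PROBE_SIZE 576	/* Legacy area plus XSAVE header.  */

/* Hypothetical helper: extract XCR0 from a raw XSAVE buffer as
   filled in by the kernel (host byte order; native debugging).  */
uint64_t
xsave_buf_xcr0 (const void *xsave)
{
  uint64_t xcr0;

  assert (xsave != NULL);
  memcpy (&xcr0, (const unsigned char *) xsave + XSAVE_XCR0_OFFSET,
	  sizeof xcr0);
  return xcr0;
}

/* Hypothetical helper: read XCR0 of the stopped thread TID.
   Returns 0 on success, -1 if PTRACE_GETREGSET is unavailable
   or the request fails.  */
int
read_xcr0_via_ptrace (pid_t tid, uint64_t *xcr0)
{
  uint64_t buf[XSTATE_PROBE_SIZE / sizeof (uint64_t)];
  struct iovec iov;

  iov.iov_base = buf;
  iov.iov_len = sizeof buf;
  if (ptrace (PTRACE_GETREGSET, tid,
	      (void *) (uintptr_t) NT_X86_XSTATE, &iov) < 0)
    return -1;

  *xcr0 = xsave_buf_xcr0 (buf);
  return 0;
}
```

A failed request here is treated as "no XSAVE support", so the caller
can fall back to the existing FXSAVE path.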

There are 6 patches in total to add AVX support for Linux. The first
patch, which provides the AVX XML target descriptions, is at

http://sourceware.org/ml/gdb-patches/2010-03/msg00092.html

They support:

1. Backward compatibility. If AVX isn't supported, SSE will be used.
2. Forward compatibility. If new state beyond AVX is supported in
the XSAVE extended state, only AVX state will be used.
3. XMM pseudo registers. When AVX is available, $xmmX can be used
to access the lower 128 bits of $ymmX.
4. Remote gdb protocol extension. GDB will send

x86:xstate=BYTES:xcr0=VALUE

in the qSupported request packet to indicate that GDB supports the x86
XSAVE extended state. BYTES specifies the maximum size in bytes of the
x86 XSAVE extended state GDB supports. VALUE specifies the maximum value
of XCR0 GDB supports.  Gdbserver will select the best target description
supported by GDB, based on BYTES and VALUE. An older gdbserver will
always return the SSE target.
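
To make the protocol extension in item 4 concrete, here is a sketch of
how a stub could parse the feature string and pick a description. The
function and enum names are hypothetical, and the selection policy
(AVX only if both sides enable x87, SSE and AVX and GDB accepts at
least 832 bytes) is an assumption for illustration, not necessarily
what gdbserver does. strtoul with base 0 is used because BYTES and
VALUE may be hexadecimal or decimal; it is more lenient than a strict
parser (for example, it accepts leading whitespace).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define XSTATE_AVX_MASK 0x7ULL	/* x87 | SSE | AVX */
#define XSTATE_AVX_SIZE 832

enum x86_tdesc { X86_TDESC_SSE, X86_TDESC_AVX };

/* Hypothetical helper: parse "x86:xstate=BYTES:xcr0=VALUE".
   Returns 1 on success, 0 if FEATURE is something else.  */
int
parse_xstate_feature (const char *feature, unsigned long *bytes,
		      uint64_t *xcr0)
{
  static const char prefix[] = "x86:xstate=";
  static const char sep[] = ":xcr0=";
  char *end;

  assert (feature != NULL);
  if (strncmp (feature, prefix, sizeof prefix - 1) != 0)
    return 0;
  feature += sizeof prefix - 1;

  *bytes = strtoul (feature, &end, 0);
  if (end == feature || strncmp (end, sep, sizeof sep - 1) != 0)
    return 0;
  feature = end + sizeof sep - 1;

  *xcr0 = strtoull (feature, &end, 0);
  return end != feature && *end == '\0';
}

/* Hypothetical policy: choose AVX only if GDB and the target both
   enable x87, SSE and AVX, and GDB accepts a large enough state.  */
enum x86_tdesc
select_tdesc (unsigned long gdb_bytes, uint64_t gdb_xcr0,
	      uint64_t target_xcr0)
{
  uint64_t common = gdb_xcr0 & target_xcr0;

  if ((common & XSTATE_AVX_MASK) == XSTATE_AVX_MASK
      && gdb_bytes >= XSTATE_AVX_SIZE)
    return X86_TDESC_AVX;
  return X86_TDESC_SSE;
}
```

Masking with the common bits is what gives the forward compatibility
described in item 2: state bits beyond AVX are simply ignored.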

To support AVX on other OSes, the following changes are needed:

1. Kernel support to get/set the XSAVE extended state.
2. Provide a target to_read_description that returns the SSE or AVX
target description.
3. Update gdbarch_core_read_description to return the SSE or AVX target
description based on the contents of the core dump.
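
For item 3, the core dump decision can be sketched like this. It
assumes the core file carries the XSAVE area as a note whose raw bytes
and size are already in hand; the function names are hypothetical. A
note too small to hold AVX state is treated as SSE only, and XCR0 is
decoded byte by byte so the result does not depend on the byte order
of the host examining the core.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define XSTATE_SSE_MASK   0x3ULL	/* x87 | SSE */
#define XSTATE_AVX_MASK   0x7ULL	/* x87 | SSE | AVX */
#define XSTATE_AVX_SIZE   832
#define XSAVE_XCR0_OFFSET 464

/* Hypothetical helper: return XCR0 for a core dump.  NOTE is the
   raw XSAVE note contents, or NULL (with SIZE 0) if the core file
   has no such note.  */
uint64_t
core_note_xcr0 (const unsigned char *note, size_t size)
{
  uint64_t xcr0 = 0;
  int i;

  assert (note != NULL || size == 0);

  /* No note, or a note too small for AVX state: SSE only.  */
  if (note == NULL || size < XSTATE_AVX_SIZE)
    return XSTATE_SSE_MASK;

  /* XCR0 is stored little endian by the x86 target.  */
  for (i = 7; i >= 0; i--)
    xcr0 = (xcr0 << 8) | note[XSAVE_XCR0_OFFSET + i];
  return xcr0;
}

/* Hypothetical helper: non-zero if the AVX description applies.  */
int
core_note_has_avx (const unsigned char *note, size_t size)
{
  return ((core_note_xcr0 (note, size) & XSTATE_AVX_MASK)
	  == XSTATE_AVX_MASK);
}
```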


H.J.


* PATCH: 2/6: Add AVX support (Update document)
  2010-03-04 18:02 PATCH: 1/6: Add AVX support H.J. Lu
@ 2010-03-04 18:05 ` H.J. Lu
  2010-03-04 18:06   ` PATCH: 3/6: Add AVX support (i386 changes) H.J. Lu
                     ` (3 more replies)
  2010-03-04 19:09 ` PATCH: 1/6: Add AVX support Daniel Jacobowitz
  2010-03-06 22:16 ` PATCH: 0/6 [2nd try]: " H.J. Lu
  2 siblings, 4 replies; 115+ messages in thread
From: H.J. Lu @ 2010-03-04 18:05 UTC (permalink / raw)
  To: GDB

Hi,

This patch updates the documentation for AVX support.  OK to install?

Thanks.


H.J.
---
2010-03-03  H.J. Lu  <hongjiu.lu@intel.com>

	* gdb.texinfo (General Query Packets): Document x86:xstate
	extension in gdb remote protocol.
	(i386 Features): Add org.gnu.gdb.i386.avx.

diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
index 6bb7d52..9bb79ae 100644
--- a/gdb/doc/gdb.texinfo
+++ b/gdb/doc/gdb.texinfo
@@ -30250,6 +30250,16 @@ extensions to the remote protocol.  @value{GDBN} does not use such
 extensions unless the stub also reports that it supports them by
 including @samp{multiprocess+} in its @samp{qSupported} reply.
 @xref{multiprocess extensions}, for details.
+
+@item x86:xstate=@var{bytes}:xcr0=@var{value}
+This feature indicates that @value{GDBN} supports x86 XSAVE extended
+state. @var{bytes} specifies the maximum size in bytes of x86 XSAVE
+extended state @value{GDBN} supports. @var{value} specifies the
+maximum value of the extended control register 0 (the
+XFEATURE_ENABLED_MASK register) @value{GDBN} supports.  The stub should
+select the best target description supported by @value{GDBN}, based on
+@var{bytes} and @var{value}.  @var{bytes} and @var{value} are encoded
+as @sc{ascii} strings, in either hexadecimal or decimal notation.
 @end table
 
 Stubs should ignore any unknown values for
@@ -33320,8 +33330,7 @@ targets.  It should describe the following registers:
 
 The register sets may be different, depending on the target.
 
-The @samp{org.gnu.gdb.i386.sse} feature is required.  It should
-describe registers:
+The @samp{org.gnu.gdb.i386.sse} feature should describe registers:
 
 @itemize @minus
 @item
@@ -33332,6 +33341,20 @@ describe registers:
 @samp{mxcsr}
 @end itemize
 
+The @samp{org.gnu.gdb.i386.avx} feature should describe registers:
+
+@itemize @minus
+@item
+@samp{ymm0} through @samp{ymm7} for i386
+@item
+@samp{ymm0} through @samp{ymm15} for amd64
+@item 
+@samp{mxcsr}
+@end itemize
+
+Exactly one of the @samp{org.gnu.gdb.i386.sse} and
+@samp{org.gnu.gdb.i386.avx} features is required.
+
 The @samp{org.gnu.gdb.i386.linux} feature is optional.  It should
 describe a single register, @samp{orig_eax}.
 


* PATCH: 3/6: Add AVX support (i386 changes)
  2010-03-04 18:05 ` PATCH: 2/6: Add AVX support (Update document) H.J. Lu
@ 2010-03-04 18:06   ` H.J. Lu
  2010-03-06 22:21     ` PATCH: 3/6 [2nd try]: " H.J. Lu
  2010-03-04 18:08   ` PATCH: 4/6: Add AVX support (amd64 changes) H.J. Lu
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 115+ messages in thread
From: H.J. Lu @ 2010-03-04 18:06 UTC (permalink / raw)
  To: GDB

Hi,

Here are the i386 changes to support AVX. OK to install?

Thanks.


H.J.
----
2010-03-04  H.J. Lu  <hongjiu.lu@intel.com>

	* i386-darwin-tdep.c (i386_darwin_init_abi): Replace num_xmm_regs
	with num_vector_regs.

	* i386-linux-nat.c: Include "regset.h", "elf/common.h" and
	<sys/uio.h>.
	(xstate_size): New.
	(xstate_size_n_of_int64): Likewise.
	(fetch_xstateregs): Likewise.
	(store_xstateregs): Likewise.
	(i386_linux_fetch_inferior_registers): Support XSAVE extended
	state.
	(i386_linux_store_inferior_registers): Likewise.
	(i386_linux_read_description): Check and enable AVX target
	descriptions.

	* i386-linux-tdep.c: Include "regset.h", "i387-tdep.h",
	"i386-xstate.h" and "features/i386/i386-avx-linux.c".
	(i386_linux_regset_sections): Make it global.  Add
	".reg-xstate".
	(i386_linux_update_xstateregset): New.
	(i386_linux_core_read_xcr0): Likewise.
	(i386_linux_core_read_description): Check and enable AVX target
	description.
	(i386_linux_init_abi): Set xsave_xcr0_offset.
	(_initialize_i386_linux_tdep): Call
	initialize_tdesc_i386_avx_linux.

	* i386-linux-tdep.h (i386_linux_core_read_xcr0): New.
	(tdesc_i386_avx_linux): Likewise.
	(i386_linux_regset_sections): Likewise.
	(i386_linux_update_xstateregset): Likewise.
	(I386_LINUX_XSAVE_XCR0_OFFSET): Likewise.

	* i386-nto-tdep.c (i386nto_register_area): Replace
	I387_XMM0_REGNUM with I387_VECTOR0_REGNUM.

	* i386-tdep.c: Include "i386-xstate.h" and
	"features/i386/i386-avx.c".
	(i386_register_names): Renamed to ...
	(i386_sse_register_names): This.
	(i386_avx_register_names): New.
	(i386_xmm_names): Likewise.
	(i386_xmm_regnum_p): Likewise.
	(i386_vector_regnum_p): Likewise.
	(i386_supply_xstateregset): Likewise.
	(i386_collect_xstateregset): Likewise.
	(i386_xmm_type): Likewise.
	(i386_sse_regnum_p): Removed.
	(i386_mxcsr_regnum_p): Replace I387_XMM0_REGNUM with
	I387_VECTOR0_REGNUM.
	(i386_dbx_reg_to_regnum): Likewise.
	(i386_pseudo_register_name): Support pseudo XMM registers.
	(i386_pseudo_register_type): Likewise.
	(i386_pseudo_register_read): Likewise.
	(i386_pseudo_register_write): Likewise.
	(i386_register_reggroup_p): Likewise.
	(i386_regset_from_core_section): Support .reg-xstate section.
	(i386_go32_init_abi): Replace num_xmm_regs with num_vector_regs.
	(i386_validate_tdesc_p): Check org.gnu.gdb.i386.avx feature.
	Set xcr0.
	(i386_gdbarch_init): Set xstateregset to NULL.  Replace
	num_xmm_regs with num_vector_regs.  Set num_xmm_regs.  Add
	num_xmm_regs to set_gdbarch_num_pseudo_regs.  Call
	set_gdbarch_qsupported.
	(_initialize_i386_tdep): Call initialize_tdesc_i386_avx.

	* i386-tdep.h (gdbarch_tdep): Add xstateregset, xmm0_regnum,
	num_vector_regs, xcr0, xsave_xcr0_offset and i386_xmm_type.
	(I386_MAX_REGISTER_SIZE): Changed to 32.
	(i386_xmm_regnum_p): New.

	* common/i386-xstate.h: New.
	* config/i386/nm-linux-xstate.h: Likewise.
	* config/i386/nm-linux64.h: Likewise.

	* config/i386/linux64.mh (NAT_FILE): Set to nm-linux64.h.

	* config/i386/nm-linux.h: Include "config/i386/nm-linux-xstate.h".
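
The pseudo XMM register handling named in the ChangeLog above
(i386_pseudo_register_read and i386_pseudo_register_write) comes down
to accessing the low 16 bytes of a 32-byte raw YMM register. The sketch
below shows that idea on a toy register file; struct toy_regcache and
the function names are hypothetical stand-ins, not GDB's regcache API.

```c
#include <assert.h>
#include <string.h>

#define YMM_SIZE 32
#define XMM_SIZE 16
#define NUM_YMM  8		/* ymm0 through ymm7 on i386.  */

/* Toy stand-in for a register cache holding raw YMM registers.  */
struct toy_regcache
{
  unsigned char ymm[NUM_YMM][YMM_SIZE];
};

/* Read pseudo register $xmmN: the low 128 bits of $ymmN, which are
   the first 16 bytes because x86 is little endian.  */
void
pseudo_xmm_read (const struct toy_regcache *rc, int n,
		 unsigned char out[XMM_SIZE])
{
  assert (n >= 0 && n < NUM_YMM);
  memcpy (out, rc->ymm[n], XMM_SIZE);
}

/* Write pseudo register $xmmN with read-modify-write, so the upper
   128 bits of $ymmN are preserved.  */
void
pseudo_xmm_write (struct toy_regcache *rc, int n,
		  const unsigned char in[XMM_SIZE])
{
  unsigned char raw[YMM_SIZE];

  assert (n >= 0 && n < NUM_YMM);
  memcpy (raw, rc->ymm[n], YMM_SIZE);	/* Read ...  */
  memcpy (raw, in, XMM_SIZE);		/* ... modify ...  */
  memcpy (rc->ymm[n], raw, YMM_SIZE);	/* ... write.  */
}
```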

diff --git a/gdb/common/i386-xstate.h b/gdb/common/i386-xstate.h
new file mode 100644
index 0000000..8089e10
--- /dev/null
+++ b/gdb/common/i386-xstate.h
@@ -0,0 +1,53 @@
+/* Common code for i386 XSAVE extended state.
+
+   Copyright (C) 2010 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#ifndef I386_XSTATE_H
+#define I386_XSTATE_H 1
+
+/* The extended state feature bits.  */
+#define bit_I386_XSTATE_X87		(1ULL << 0)
+#define bit_I386_XSTATE_SSE		(1ULL << 1)
+#define bit_I386_XSTATE_AVX		(1ULL << 2)
+
+/* Supported mask and size of the extended state.  */
+#define I386_XSTATE_SSE_MASK	\
+  (bit_I386_XSTATE_X87 | bit_I386_XSTATE_SSE)
+#define I386_XSTATE_AVX_MASK	\
+  (I386_XSTATE_SSE_MASK | bit_I386_XSTATE_AVX)
+#define I386_XSTATE_MAX_MASK	\
+  I386_XSTATE_AVX_MASK
+
+#define I386_XSTATE_SSE_MASK_STRING	"0x3"
+#define I386_XSTATE_AVX_MASK_STRING	"0x7"
+#define I386_XSTATE_MAX_MASK_STRING	"0x7"
+
+#define I386_XSTATE_SSE_SIZE		576
+#define I386_XSTATE_AVX_SIZE		832
+#define I386_XSTATE_MAX_SIZE		832
+
+#define I386_XSTATE_SSE_SIZE_STRING	"576"
+#define I386_XSTATE_AVX_SIZE_STRING	"832"
+#define I386_XSTATE_MAX_SIZE_STRING	"832"
+
+/* Get I386 XSAVE extended state size.  */
+#define I386_XSTATE_SIZE(XCR0)	\
+  (((XCR0) & bit_I386_XSTATE_AVX) != 0 \
+   ? I386_XSTATE_AVX_SIZE : I386_XSTATE_SSE_SIZE)
+
+#endif /* I386_XSTATE_H */
diff --git a/gdb/config/i386/linux64.mh b/gdb/config/i386/linux64.mh
index 19f3be0..99a5042 100644
--- a/gdb/config/i386/linux64.mh
+++ b/gdb/config/i386/linux64.mh
@@ -2,7 +2,7 @@
 NATDEPFILES= inf-ptrace.o fork-child.o \
 	i386-nat.o amd64-nat.o amd64-linux-nat.o linux-nat.o \
 	proc-service.o linux-thread-db.o linux-fork.o
-NAT_FILE= config/nm-linux.h
+NAT_FILE= nm-linux64.h
 
 # The dynamically loaded libthread_db needs access to symbols in the
 # gdb executable.
diff --git a/gdb/config/i386/nm-linux-xstate.h b/gdb/config/i386/nm-linux-xstate.h
new file mode 100644
index 0000000..0dbf9e5
--- /dev/null
+++ b/gdb/config/i386/nm-linux-xstate.h
@@ -0,0 +1,33 @@
+/* Native XSAVE extended state support for GNU/Linux x86.
+
+   Copyright 2010 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#ifndef	NM_LINUX_XSTATE_H
+#define	NM_LINUX_XSTATE_H
+
+#include "i386-xstate.h"
+
+#ifndef PTRACE_GETREGSET
+#define PTRACE_GETREGSET	0x4204
+#endif
+
+#ifndef PTRACE_SETREGSET
+#define PTRACE_SETREGSET	0x4205
+#endif
+
+#endif	/* NM_LINUX_XSTATE_H */
diff --git a/gdb/config/i386/nm-linux.h b/gdb/config/i386/nm-linux.h
index 10db309..fab8a0d 100644
--- a/gdb/config/i386/nm-linux.h
+++ b/gdb/config/i386/nm-linux.h
@@ -23,6 +23,7 @@
 #define NM_LINUX_H
 
 #include "config/nm-linux.h"
+#include "config/i386/nm-linux-xstate.h"
 
 #ifdef HAVE_PTRACE_GETFPXREGS
 /* Include register set support for the SSE registers.  */
diff --git a/gdb/config/i386/nm-linux64.h b/gdb/config/i386/nm-linux64.h
new file mode 100644
index 0000000..75220d6
--- /dev/null
+++ b/gdb/config/i386/nm-linux64.h
@@ -0,0 +1,26 @@
+/* Native support for GNU/Linux x86-64.
+
+   Copyright 2010 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#ifndef NM_LINUX64_H
+#define NM_LINUX64_H
+
+#include "config/nm-linux.h"
+#include "config/i386/nm-linux-xstate.h"
+
+#endif /* nm-linux64.h */
diff --git a/gdb/i386-darwin-tdep.c b/gdb/i386-darwin-tdep.c
index 25a5e50..5b2bc7e 100644
--- a/gdb/i386-darwin-tdep.c
+++ b/gdb/i386-darwin-tdep.c
@@ -253,7 +253,7 @@ i386_darwin_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch)
   struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
 
   /* We support the SSE registers.  */
-  tdep->num_xmm_regs = I386_NUM_XREGS - 1;
+  tdep->num_vector_regs = I386_NUM_XREGS - 1;
   set_gdbarch_num_regs (gdbarch, I386_SSE_NUM_REGS);
 
   dwarf2_frame_set_signal_frame_p (gdbarch, darwin_dwarf_signal_frame_p);
diff --git a/gdb/i386-linux-nat.c b/gdb/i386-linux-nat.c
index 31b9086..fa6ea20 100644
--- a/gdb/i386-linux-nat.c
+++ b/gdb/i386-linux-nat.c
@@ -23,11 +23,14 @@
 #include "inferior.h"
 #include "gdbcore.h"
 #include "regcache.h"
+#include "regset.h"
 #include "target.h"
 #include "linux-nat.h"
 
 #include "gdb_assert.h"
 #include "gdb_string.h"
+#include "elf/common.h"
+#include <sys/uio.h>
 #include <sys/ptrace.h>
 #include <sys/user.h>
 #include <sys/procfs.h>
@@ -69,6 +72,16 @@
 
 /* Defines ps_err_e, struct ps_prochandle.  */
 #include "gdb_proc_service.h"
+
+/* The extended state size in bytes.  */
+static unsigned int xstate_size;
+
+/* The extended state size in units of int64.  We use an array of int64
+   for better alignment.  */
+static unsigned int xstate_size_n_of_int64;
+
+/* Does the current host support PTRACE_GETREGSET?  */
+static int have_ptrace_getregset = -1;
 \f
 
 /* The register sets used in GNU/Linux ELF core-dumps are identical to
@@ -355,6 +368,57 @@ static void store_fpregs (const struct regcache *regcache, int tid, int regno) {
 
 /* Transfering floating-point and SSE registers to and from GDB.  */
 
+/* Fetch all registers covered by the PTRACE_GETREGSET request from
+   process/thread TID and store their values in GDB's register array.
+   Return non-zero if successful, zero otherwise.  */
+
+static int
+fetch_xstateregs (struct regcache *regcache, int tid)
+{
+  unsigned long long xstateregs[xstate_size_n_of_int64];
+  struct iovec iov;
+
+  if (!have_ptrace_getregset)
+    return 0;
+
+  iov.iov_base = xstateregs;
+  iov.iov_len = xstate_size;
+  if (ptrace (PTRACE_GETREGSET, tid, (unsigned int) NT_X86_XSTATE,
+	      (int) &iov) < 0)
+    perror_with_name (_("Couldn't read extended state status"));
+
+  i387_supply_xsave (regcache, -1, xstateregs);
+  return 1;
+}
+
+/* Store all valid registers in GDB's register array covered by the
+   PTRACE_SETREGSET request into the process/thread specified by TID.
+   Return non-zero if successful, zero otherwise.  */
+
+static int
+store_xstateregs (const struct regcache *regcache, int tid, int regno)
+{
+  unsigned long long xstateregs[xstate_size_n_of_int64];
+  struct iovec iov;
+
+  if (!have_ptrace_getregset)
+    return 0;
+  
+  iov.iov_base = xstateregs;
+  iov.iov_len = xstate_size;
+  if (ptrace (PTRACE_GETREGSET, tid, (unsigned int) NT_X86_XSTATE,
+	      (int) &iov) < 0)
+    perror_with_name (_("Couldn't read extended state status"));
+
+  i387_collect_xsave (regcache, regno, xstateregs, 0);
+
+  if (ptrace (PTRACE_SETREGSET, tid, (unsigned int) NT_X86_XSTATE,
+	      (int) &iov) < 0)
+    perror_with_name (_("Couldn't write extended state status"));
+
+  return 1;
+}
+
 #ifdef HAVE_PTRACE_GETFPXREGS
 
 /* Fill GDB's register array with the floating-point and SSE register
@@ -489,6 +553,8 @@ i386_linux_fetch_inferior_registers (struct target_ops *ops,
 	  return;
 	}
 
+      if (fetch_xstateregs (regcache, tid))
+	return;
       if (fetch_fpxregs (regcache, tid))
 	return;
       fetch_fpregs (regcache, tid);
@@ -503,6 +569,8 @@ i386_linux_fetch_inferior_registers (struct target_ops *ops,
 
   if (GETFPXREGS_SUPPLIES (regno))
     {
+      if (fetch_xstateregs (regcache, tid))
+	return;
       if (fetch_fpxregs (regcache, tid))
 	return;
 
@@ -553,6 +621,8 @@ i386_linux_store_inferior_registers (struct target_ops *ops,
   if (regno == -1)
     {
       store_regs (regcache, tid, regno);
+      if (store_xstateregs (regcache, tid, regno))
+	return;
       if (store_fpxregs (regcache, tid, regno))
 	return;
       store_fpregs (regcache, tid, regno);
@@ -567,6 +637,8 @@ i386_linux_store_inferior_registers (struct target_ops *ops,
 
   if (GETFPXREGS_SUPPLIES (regno))
     {
+      if (store_xstateregs (regcache, tid, regno))
+	return;
       if (store_fpxregs (regcache, tid, regno))
 	return;
 
@@ -858,7 +930,49 @@ i386_linux_child_post_startup_inferior (ptid_t ptid)
 static const struct target_desc *
 i386_linux_read_description (struct target_ops *ops)
 {
-  return tdesc_i386_linux;
+  static unsigned long long xcr0;
+
+  if (have_ptrace_getregset == -1)
+    {
+      int tid;
+      unsigned long long xstateregs[(I386_XSTATE_SSE_SIZE
+				     / sizeof (long long))];
+      struct iovec iov;
+
+      /* GNU/Linux LWP ID's are process ID's.  */
+      tid = TIDGET (inferior_ptid);
+      if (tid == 0)
+	tid = PIDGET (inferior_ptid); /* Not a threaded program.  */
+
+      iov.iov_base = xstateregs;
+      iov.iov_len = I386_XSTATE_SSE_SIZE;
+
+      /* Check if PTRACE_GETREGSET works.  */
+      if (ptrace (PTRACE_GETREGSET, tid,
+		  (unsigned int) NT_X86_XSTATE, (long) &iov) < 0)
+	have_ptrace_getregset = 0;
+      else
+	{
+	  have_ptrace_getregset = 1;
+
+	  /* Get XCR0 from XSAVE extended state.  */
+	  xcr0 = xstateregs[(I386_LINUX_XSAVE_XCR0_OFFSET
+			     / sizeof (long long))];
+
+	  xstate_size = I386_XSTATE_SIZE (xcr0);
+	  xstate_size_n_of_int64 = xstate_size / sizeof (long long);
+	}
+
+      i386_linux_update_xstateregset (i386_linux_regset_sections,
+				      xstate_size);
+    }
+
+  /* Check the native XCR0 only if PTRACE_GETREGSET is available.  */
+  if (have_ptrace_getregset
+      && (xcr0 & I386_XSTATE_AVX_MASK) == I386_XSTATE_AVX_MASK)
+    return tdesc_i386_avx_linux;
+  else
+    return tdesc_i386_linux;
 }
 
 void
diff --git a/gdb/i386-linux-tdep.c b/gdb/i386-linux-tdep.c
index b23c109..2eb0eae 100644
--- a/gdb/i386-linux-tdep.c
+++ b/gdb/i386-linux-tdep.c
@@ -23,6 +23,7 @@
 #include "frame.h"
 #include "value.h"
 #include "regcache.h"
+#include "regset.h"
 #include "inferior.h"
 #include "osabi.h"
 #include "reggroups.h"
@@ -36,9 +37,11 @@
 #include "solib-svr4.h"
 #include "symtab.h"
 #include "arch-utils.h"
-#include "regset.h"
 #include "xml-syscall.h"
 
+#include "i387-tdep.h"
+#include "i386-xstate.h"
+
 /* The syscall's XML filename for i386.  */
 #define XML_SYSCALL_FILENAME_I386 "syscalls/i386-linux.xml"
 
@@ -47,13 +50,15 @@
 #include <stdint.h>
 
 #include "features/i386/i386-linux.c"
+#include "features/i386/i386-avx-linux.c"
 
 /* Supported register note sections.  */
-static struct core_regset_section i386_linux_regset_sections[] =
+struct core_regset_section i386_linux_regset_sections[] =
 {
   { ".reg", 144, "general-purpose" },
   { ".reg2", 108, "floating-point" },
   { ".reg-xfp", 512, "extended floating-point" },
+  { ".reg-xstate", 0, "XSAVE extended state" },
   { NULL, 0 }
 };
 
@@ -560,6 +565,66 @@ static int i386_linux_sc_reg_offset[] =
   0 * 4				/* %gs */
 };
 
+/* Update XSAVE extended state register note section.  */
+
+void
+i386_linux_update_xstateregset
+  (struct core_regset_section *regset_sections, unsigned int xstate_size)
+{
+  int i;
+
+  /* Update the XSAVE extended state register note section for "gcore".
+     Disable it if its size is 0.  */
+  for (i = 0; regset_sections[i].sect_name != NULL; i++)
+    if (strcmp (regset_sections[i].sect_name, ".reg-xstate") == 0)
+      {
+	if (xstate_size)
+	  regset_sections[i].size = xstate_size;
+	else
+	  regset_sections[i].sect_name = NULL;
+	break;
+      }
+}
+
+/* Get XSAVE extended state xcr0 from core dump.  */
+
+unsigned long long
+i386_linux_core_read_xcr0 (struct gdbarch *gdbarch,
+			   struct target_ops *target, bfd *abfd)
+{
+  asection *xstate = bfd_get_section_by_name (abfd, ".reg-xstate");
+  unsigned long long xcr0;
+
+  if (xstate)
+    {
+      size_t size = bfd_section_size (abfd, xstate);
+
+      gdb_assert (size >= I386_XSTATE_SSE_SIZE);
+
+      /* Check extended state size.  */
+      if (size < I386_XSTATE_AVX_SIZE)
+	xcr0 = I386_XSTATE_SSE_MASK;
+      else
+	{
+	  char contents[8];
+
+	  if (! bfd_get_section_contents (abfd, xstate, contents,
+					  (file_ptr) I386_LINUX_XSAVE_XCR0_OFFSET,
+					  8))
+	    {
+	      warning (_("Couldn't read `xcr0' bytes from `.reg-xstate' section in core file."));
+	      return 0;
+	    }
+
+	  xcr0 = bfd_get_64 (abfd, contents);
+	}
+    }
+  else
+    xcr0 = I386_XSTATE_SSE_MASK;
+
+  return xcr0;
+}
+
 /* Get Linux/x86 target description from core dump.  */
 
 static const struct target_desc *
@@ -568,12 +633,17 @@ i386_linux_core_read_description (struct gdbarch *gdbarch,
 				  bfd *abfd)
 {
   asection *section = bfd_get_section_by_name (abfd, ".reg2");
+  unsigned long long xcr0;
 
   if (section == NULL)
     return NULL;
 
   /* Linux/i386.  */
-  return tdesc_i386_linux;
+  xcr0 = i386_linux_core_read_xcr0 (gdbarch, target, abfd);
+  if ((xcr0 & I386_XSTATE_AVX_MASK) == I386_XSTATE_AVX_MASK)
+    return tdesc_i386_avx_linux;
+  else
+    return tdesc_i386_linux;
 }
 
 static void
@@ -623,6 +693,8 @@ i386_linux_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch)
   tdep->sc_reg_offset = i386_linux_sc_reg_offset;
   tdep->sc_num_regs = ARRAY_SIZE (i386_linux_sc_reg_offset);
 
+  tdep->xsave_xcr0_offset = I386_LINUX_XSAVE_XCR0_OFFSET;
+
   set_gdbarch_process_record (gdbarch, i386_process_record);
   set_gdbarch_process_record_signal (gdbarch, i386_linux_record_signal);
 
@@ -840,4 +912,5 @@ _initialize_i386_linux_tdep (void)
 
   /* Initialize the Linux target description  */
   initialize_tdesc_i386_linux ();
+  initialize_tdesc_i386_avx_linux ();
 }
diff --git a/gdb/i386-linux-tdep.h b/gdb/i386-linux-tdep.h
index 11f7295..d29f0f8 100644
--- a/gdb/i386-linux-tdep.h
+++ b/gdb/i386-linux-tdep.h
@@ -35,7 +35,40 @@
 /* Total number of registers for GNU/Linux.  */
 #define I386_LINUX_NUM_REGS (I386_LINUX_ORIG_EAX_REGNUM + 1)
 
+/* Get XSAVE extended state xcr0 from core dump.  */
+extern unsigned long long i386_linux_core_read_xcr0
+  (struct gdbarch *gdbarch, struct target_ops *target, bfd *abfd);
+
 /* Linux target description.  */
 extern struct target_desc *tdesc_i386_linux;
+extern struct target_desc *tdesc_i386_avx_linux;
+
+/* Supported register note sections.  */
+extern struct core_regset_section i386_linux_regset_sections[];
+
+/* Update XSAVE extended state register note section.  */
+extern void i386_linux_update_xstateregset
+  (struct core_regset_section *regset_sections, unsigned int xstate_size);
+
+/* Format of XSAVE extended state is:
+ 	struct
+	{
+	  fxsave_bytes[0..463]
+	  sw_usable_bytes[464..511]
+	  xstate_hdr_bytes[512..575]
+	  avx_bytes[576..831]
+	  future_state etc
+	};
+
+  Same memory layout will be used for the coredump NT_X86_XSTATE
+  representing the XSAVE extended state registers.
+
+  The first 8 bytes of sw_usable_bytes, [464..471], are set to the OS
+  enabled state mask, which is the same as the 64-bit mask returned by
+  xgetbv for XCR0.  We can use this mask, together with the mask saved
+  in the xstate_hdr bytes, to determine which states the processor/OS
+  supports and which states are in use or initialized for the
+  particular process/thread.  */
+#define I386_LINUX_XSAVE_XCR0_OFFSET 464
 
 #endif /* i386-linux-tdep.h */
diff --git a/gdb/i386-nto-tdep.c b/gdb/i386-nto-tdep.c
index 09c55e2..cbf24cf 100644
--- a/gdb/i386-nto-tdep.c
+++ b/gdb/i386-nto-tdep.c
@@ -158,7 +158,7 @@ i386nto_register_area (struct gdbarch *gdbarch,
 			 && regno <= I387_FOP_REGNUM (tdep));
       int st_reg = (regno >= I387_ST0_REGNUM (tdep)
 		    && regno < I387_ST0_REGNUM (tdep) + 8);
-      int xmm_reg = (regno >= I387_XMM0_REGNUM (tdep)
+      int xmm_reg = (regno >= I387_VECTOR0_REGNUM (tdep)
 		     && regno < I387_MXCSR_REGNUM (tdep));
 
       if (nto_cpuinfo_valid && nto_cpuinfo_flags | X86_CPU_FXSR)
@@ -194,7 +194,7 @@ i386nto_register_area (struct gdbarch *gdbarch,
 	      /* XMM registers.  */
 	      regsize = 16;
 	      off_adjust = 160;
-	      regno_base = I387_XMM0_REGNUM (tdep);
+	      regno_base = I387_VECTOR0_REGNUM (tdep);
 	    }
 	  else if (regno == I387_MXCSR_REGNUM (tdep))
 	    {
diff --git a/gdb/i386-tdep.c b/gdb/i386-tdep.c
index 05afa56..7959c40 100644
--- a/gdb/i386-tdep.c
+++ b/gdb/i386-tdep.c
@@ -50,15 +50,17 @@
 
 #include "i386-tdep.h"
 #include "i387-tdep.h"
+#include "i386-xstate.h"
 
 #include "record.h"
 #include <stdint.h>
 
 #include "features/i386/i386.c"
+#include "features/i386/i386-avx.c"
 
 /* Register names.  */
 
-static const char *i386_register_names[] =
+static const char *i386_sse_register_names[] =
 {
   "eax",   "ecx",    "edx",   "ebx",
   "esp",   "ebp",    "esi",   "edi",
@@ -73,6 +75,21 @@ static const char *i386_register_names[] =
   "mxcsr"
 };
 
+static const char *i386_avx_register_names[] =
+{
+  "eax",   "ecx",    "edx",   "ebx",
+  "esp",   "ebp",    "esi",   "edi",
+  "eip",   "eflags", "cs",    "ss",
+  "ds",    "es",     "fs",    "gs",
+  "st0",   "st1",    "st2",   "st3",
+  "st4",   "st5",    "st6",   "st7",
+  "fctrl", "fstat",  "ftag",  "fiseg",
+  "fioff", "foseg",  "fooff", "fop",
+  "ymm0",  "ymm1",   "ymm2",  "ymm3",
+  "ymm4",  "ymm5",   "ymm6",  "ymm7",
+  "mxcsr"
+};
+
 /* Register names for MMX pseudo-registers.  */
 
 static const char *i386_mmx_names[] =
@@ -97,6 +114,13 @@ static const char *i386_word_names[] =
   "sp", "bp", "si", "di"
 };
 
+/* Register names for XMM pseudo-registers.  */
+
+static const char *i386_xmm_names[] =
+{
+  "xmm0", "xmm1", "xmm2", "xmm3", "xmm4", "xmm5", "xmm6", "xmm7"
+};
+
 /* MMX register?  */
 
 static int
@@ -149,18 +173,32 @@ i386_dword_regnum_p (struct gdbarch *gdbarch, int regnum)
   return regnum >= 0 && regnum < tdep->num_dword_regs;
 }
 
-/* SSE register?  */
+int
+i386_xmm_regnum_p (struct gdbarch *gdbarch, int regnum)
+{
+  struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+  int xmm0_regnum = tdep->xmm0_regnum;
+
+  if (xmm0_regnum < 0)
+    return 0;
+
+  regnum -= xmm0_regnum;
+  return regnum >= 0 && regnum < tdep->num_xmm_regs;
+}
+
+/* Vector register?  */
 
 static int
-i386_sse_regnum_p (struct gdbarch *gdbarch, int regnum)
+i386_vector_regnum_p (struct gdbarch *gdbarch, int regnum)
 {
   struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+  int num_vector_regs = I387_NUM_VECTOR_REGS (tdep);
 
-  if (I387_NUM_XMM_REGS (tdep) == 0)
+  if (num_vector_regs == 0)
     return 0;
 
-  return (I387_XMM0_REGNUM (tdep) <= regnum
-	  && regnum < I387_MXCSR_REGNUM (tdep));
+  regnum -= I387_VECTOR0_REGNUM (tdep);
+  return regnum >= 0 && regnum < num_vector_regs;
 }
 
 static int
@@ -168,7 +206,7 @@ i386_mxcsr_regnum_p (struct gdbarch *gdbarch, int regnum)
 {
   struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
 
-  if (I387_NUM_XMM_REGS (tdep) == 0)
+  if (I387_NUM_VECTOR_REGS (tdep) == 0)
     return 0;
 
   return (regnum == I387_MXCSR_REGNUM (tdep));
@@ -197,7 +235,7 @@ i386_fpc_regnum_p (struct gdbarch *gdbarch, int regnum)
     return 0;
 
   return (I387_FCTRL_REGNUM (tdep) <= regnum 
-	  && regnum < I387_XMM0_REGNUM (tdep));
+	  && regnum < I387_VECTOR0_REGNUM (tdep));
 }
 
 /* Return the name of register REGNUM.  */
@@ -208,6 +246,8 @@ i386_pseudo_register_name (struct gdbarch *gdbarch, int regnum)
   struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
   if (i386_mmx_regnum_p (gdbarch, regnum))
     return i386_mmx_names[regnum - I387_MM0_REGNUM (tdep)];
+  else if (i386_xmm_regnum_p (gdbarch, regnum))
+    return i386_xmm_names[regnum - tdep->xmm0_regnum];
   else if (i386_byte_regnum_p (gdbarch, regnum))
     return i386_byte_names[regnum - tdep->al_regnum];
   else if (i386_word_regnum_p (gdbarch, regnum))
@@ -244,8 +284,8 @@ i386_dbx_reg_to_regnum (struct gdbarch *gdbarch, int reg)
     }
   else if (reg >= 21 && reg <= 28)
     {
-      /* SSE registers.  */
-      return reg - 21 + I387_XMM0_REGNUM (tdep);
+      /* Vector registers.  */
+      return reg - 21 + I387_VECTOR0_REGNUM (tdep);
     }
   else if (reg >= 29 && reg <= 36)
     {
@@ -2183,6 +2223,58 @@ i387_ext_type (struct gdbarch *gdbarch)
   return tdep->i387_ext_type;
 }
 
+/* Construct vector type for pseudo XMM registers.  We can't use
+   tdesc_find_type since XMM isn't described in target description.  */
+
+static struct type *
+i386_xmm_type (struct gdbarch *gdbarch)
+{
+  struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+
+  if (!tdep->i386_xmm_type)
+    {
+      const struct builtin_type *bt = builtin_type (gdbarch);
+
+      /* The type we're building is this: */
+#if 0
+      union __gdb_builtin_type_vec128i
+      {
+        int128_t uint128;
+        int64_t v2_int64[2];
+        int32_t v4_int32[4];
+        int16_t v8_int16[8];
+        int8_t v16_int8[16];
+        double v2_double[2];
+        float v4_float[4];
+      };
+#endif
+
+      struct type *t;
+
+      t = arch_composite_type (gdbarch,
+			       "__gdb_builtin_type_vec128i", TYPE_CODE_UNION);
+      append_composite_type_field (t, "v4_float",
+				   init_vector_type (bt->builtin_float, 4));
+      append_composite_type_field (t, "v2_double",
+				   init_vector_type (bt->builtin_double, 2));
+      append_composite_type_field (t, "v16_int8",
+				   init_vector_type (bt->builtin_int8, 16));
+      append_composite_type_field (t, "v8_int16",
+				   init_vector_type (bt->builtin_int16, 8));
+      append_composite_type_field (t, "v4_int32",
+				   init_vector_type (bt->builtin_int32, 4));
+      append_composite_type_field (t, "v2_int64",
+				   init_vector_type (bt->builtin_int64, 2));
+      append_composite_type_field (t, "uint128", bt->builtin_int128);
+
+      TYPE_VECTOR (t) = 1;
+      TYPE_NAME (t) = "builtin_type_vec128i";
+      tdep->i386_xmm_type = t;
+    }
+
+  return tdep->i386_xmm_type;
+}
+
 /* Construct vector type for MMX registers.  */
 static struct type *
 i386_mmx_type (struct gdbarch *gdbarch)
@@ -2233,6 +2325,8 @@ i386_pseudo_register_type (struct gdbarch *gdbarch, int regnum)
 {
   if (i386_mmx_regnum_p (gdbarch, regnum))
     return i386_mmx_type (gdbarch);
+  else if (i386_xmm_regnum_p (gdbarch, regnum))
+    return i386_xmm_type (gdbarch);
   else
     {
       const struct builtin_type *bt = builtin_type (gdbarch);
@@ -2284,7 +2378,16 @@ i386_pseudo_register_read (struct gdbarch *gdbarch, struct regcache *regcache,
     {
       struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
 
-      if (i386_word_regnum_p (gdbarch, regnum))
+      if (i386_xmm_regnum_p (gdbarch, regnum))
+	{
+	  int vecnum = (I387_VECTOR0_REGNUM (tdep)
+			+ regnum - tdep->xmm0_regnum);
+
+	  /* Extract (always little endian).  */
+	  regcache_raw_read (regcache, vecnum, raw_buf);
+	  memcpy (buf, raw_buf, 16);
+	}
+      else if (i386_word_regnum_p (gdbarch, regnum))
 	{
 	  int gpnum = regnum - tdep->ax_regnum;
 
@@ -2333,7 +2436,19 @@ i386_pseudo_register_write (struct gdbarch *gdbarch, struct regcache *regcache,
     {
       struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
 
-      if (i386_word_regnum_p (gdbarch, regnum))
+      if (i386_xmm_regnum_p (gdbarch, regnum))
+	{
+	  int vecnum = (I387_VECTOR0_REGNUM (tdep)
+			+ regnum - tdep->xmm0_regnum);
+
+	  /* Read ...  */
+	  regcache_raw_read (regcache, vecnum, raw_buf);
+	  /* ... Modify ... (always little endian).  */
+	  memcpy (raw_buf, buf, 16);
+	  /* ... Write.  */
+	  regcache_raw_write (regcache, vecnum, raw_buf);
+	}
+      else if (i386_word_regnum_p (gdbarch, regnum))
 	{
 	  int gpnum = regnum - tdep->ax_regnum;
 
@@ -2580,6 +2695,28 @@ i386_collect_fpregset (const struct regset *regset,
   i387_collect_fsave (regcache, regnum, fpregs);
 }
 
+/* Similar to i386_supply_fpregset, but use XSAVE extended state.  */
+
+static void
+i386_supply_xstateregset (const struct regset *regset,
+			  struct regcache *regcache, int regnum,
+			  const void *xstateregs, size_t len)
+{
+  const struct gdbarch_tdep *tdep = gdbarch_tdep (regset->arch);
+  i387_supply_xsave (regcache, regnum, xstateregs);
+}
+
+/* Similar to i386_collect_fpregset, but use XSAVE extended state.  */
+
+static void
+i386_collect_xstateregset (const struct regset *regset,
+			   const struct regcache *regcache,
+			   int regnum, void *xstateregs, size_t len)
+{
+  const struct gdbarch_tdep *tdep = gdbarch_tdep (regset->arch);
+  i387_collect_xsave (regcache, regnum, xstateregs, 1);
+}
+
 /* Return the appropriate register set for the core section identified
    by SECT_NAME and SECT_SIZE.  */
 
@@ -2607,6 +2744,16 @@ i386_regset_from_core_section (struct gdbarch *gdbarch,
       return tdep->fpregset;
     }
 
+  if (strcmp (sect_name, ".reg-xstate") == 0)
+    {
+      if (tdep->xstateregset == NULL)
+	tdep->xstateregset = regset_alloc (gdbarch,
+					   i386_supply_xstateregset,
+					   i386_collect_xstateregset);
+
+      return tdep->xstateregset;
+    }
+
   return NULL;
 }
 \f
@@ -2757,7 +2904,7 @@ i386_go32_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch)
   tdep->jb_pc_offset = 36;
 
   /* DJGPP does not support the SSE registers.  */
-  tdep->num_xmm_regs = 0;
+  tdep->num_vector_regs = 0;
   set_gdbarch_num_regs (gdbarch, I386_NUM_GREGS + I387_NUM_REGS);
 
   /* Native compiler is GCC, which uses the SVR4 register numbering
@@ -2800,8 +2947,9 @@ int
 i386_register_reggroup_p (struct gdbarch *gdbarch, int regnum,
 			  struct reggroup *group)
 {
-  int sse_regnum_p, fp_regnum_p, mmx_regnum_p, byte_regnum_p,
-      word_regnum_p, dword_regnum_p;
+  const struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+  int vector_regnum_p, fp_regnum_p, mmx_regnum_p, byte_regnum_p,
+      word_regnum_p, dword_regnum_p, xmm_regnum_p;
 
   /* Don't include pseudo registers, except for MMX, in any register
      groups.  */
@@ -2821,12 +2969,20 @@ i386_register_reggroup_p (struct gdbarch *gdbarch, int regnum,
   if (group == i386_mmx_reggroup)
     return mmx_regnum_p;
 
-  sse_regnum_p = (i386_sse_regnum_p (gdbarch, regnum)
-		  || i386_mxcsr_regnum_p (gdbarch, regnum));
-  if (group == i386_sse_reggroup)
-    return sse_regnum_p;
+  vector_regnum_p = (i386_vector_regnum_p (gdbarch, regnum)
+		     || i386_mxcsr_regnum_p (gdbarch, regnum));
   if (group == vector_reggroup)
-    return mmx_regnum_p || sse_regnum_p;
+    return mmx_regnum_p || vector_regnum_p;
+
+  xmm_regnum_p = i386_xmm_regnum_p (gdbarch, regnum);
+  if (group == i386_sse_reggroup)
+    return (xmm_regnum_p
+	    || (vector_regnum_p
+		&& ((tdep->xcr0 & I386_XSTATE_AVX_MASK)
+		    == I386_XSTATE_SSE_MASK)));
+
+  if (xmm_regnum_p)
+    return 0;
 
   fp_regnum_p = (i386_fp_regnum_p (gdbarch, regnum)
 		 || i386_fpc_regnum_p (gdbarch, regnum));
@@ -2835,8 +2991,9 @@ i386_register_reggroup_p (struct gdbarch *gdbarch, int regnum,
 
   if (group == general_reggroup)
     return (!fp_regnum_p
+	    && !vector_regnum_p
 	    && !mmx_regnum_p
-	    && !sse_regnum_p
+	    && !xmm_regnum_p
 	    && !byte_regnum_p
 	    && !word_regnum_p
 	    && !dword_regnum_p);
@@ -5651,6 +5808,7 @@ i386_validate_tdesc_p (struct gdbarch_tdep *tdep,
   const struct target_desc *tdesc = tdep->tdesc;
   const struct tdesc_feature *feature_core, *feature_vector;
   int i, num_regs, valid_p;
+  unsigned long long xcr0;
 
   if (! tdesc_has_registers (tdesc))
     return 0;
@@ -5658,8 +5816,16 @@ i386_validate_tdesc_p (struct gdbarch_tdep *tdep,
   /* Get core registers.  */
   feature_core = tdesc_find_feature (tdesc, "org.gnu.gdb.i386.core");
 
-  /* Get SSE registers.  */
+  /* Try SSE registers first.  */
   feature_vector = tdesc_find_feature (tdesc, "org.gnu.gdb.i386.sse");
+  if (feature_vector)
+    xcr0 = I386_XSTATE_SSE_MASK;
+  else
+    {
+      /* Try AVX registers.  */
+      feature_vector = tdesc_find_feature (tdesc, "org.gnu.gdb.i386.avx");
+      xcr0 = I386_XSTATE_AVX_MASK;
+    }
 
   if (feature_core == NULL || feature_vector == NULL)
     return 0;
@@ -5672,11 +5838,14 @@ i386_validate_tdesc_p (struct gdbarch_tdep *tdep,
 					tdep->register_names[i]);
 
   /* Need to include %mxcsr, so add one.  */
-  num_regs += tdep->num_xmm_regs + 1;
+  num_regs += tdep->num_vector_regs + 1;
   for (; i < num_regs; i++)
     valid_p &= tdesc_numbered_register (feature_vector, tdesc_data, i,
 					tdep->register_names[i]);
 
+  /* The XCR0 bits.  */
+  tdep->xcr0 = xcr0;
+
   return valid_p;
 }
 
@@ -5689,6 +5858,7 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   struct tdesc_arch_data *tdesc_data;
   const struct target_desc *tdesc;
   int mm0_regnum;
+  int xmm0_regnum;
 
   /* If there is already a candidate, use it.  */
   arches = gdbarch_list_lookup_by_info (arches, &info);
@@ -5709,10 +5879,12 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   tdep->fpregset = NULL;
   tdep->sizeof_fpregset = I387_SIZEOF_FSAVE;
 
+  tdep->xstateregset = NULL;
+
   /* The default settings include the FPU registers, the MMX registers
      and the SSE registers.  This can be overridden for a specific ABI
      by adjusting the members `st0_regnum', `mm0_regnum' and
-     `num_xmm_regs' of `struct gdbarch_tdep', otherwise the registers
+     `num_vector_regs' of `struct gdbarch_tdep', otherwise the registers
      will show up in the output of "info all-registers".  Ideally we
      should try to autodetect whether they are available, such that we
      can prevent "info all-registers" from displaying registers that
@@ -5726,7 +5898,7 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   tdep->st0_regnum = I386_ST0_REGNUM;
 
   /* I386_NUM_XREGS includes %mxcsr, so substract one.  */
-  tdep->num_xmm_regs = I386_NUM_XREGS - 1;
+  tdep->num_vector_regs = I386_NUM_XREGS - 1;
 
   tdep->jb_pc_offset = -1;
   tdep->struct_return = pcc_struct_return;
@@ -5738,6 +5910,8 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   tdep->sc_pc_offset = -1;
   tdep->sc_sp_offset = -1;
 
+  tdep->xsave_xcr0_offset = -1;
+
   tdep->record_regmap = i386_record_regmap;
 
   /* The format used for `long double' on almost all i386 targets is
@@ -5865,7 +6039,17 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   tdep->tdesc = tdesc;
 
   tdep->num_core_regs = I386_NUM_GREGS + I387_NUM_REGS;
-  tdep->register_names = i386_register_names;
+
+  if (tdesc_find_feature (tdesc, "org.gnu.gdb.i386.sse") != NULL)
+    {
+      tdep->register_names = i386_sse_register_names;
+      tdep->num_xmm_regs = 0;
+    }
+  else
+    {
+      tdep->register_names = i386_avx_register_names;
+      tdep->num_xmm_regs = 8;
+    }
 
   tdep->num_byte_regs = 8;
   tdep->num_word_regs = 8;
@@ -5883,7 +6067,8 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   set_gdbarch_num_pseudo_regs (gdbarch, (tdep->num_byte_regs
 					 + tdep->num_word_regs
 					 + tdep->num_dword_regs
-					 + tdep->num_mmx_regs));
+					 + tdep->num_mmx_regs
+					 + tdep->num_xmm_regs));
 
   /* Target description may be changed.  */
   tdesc = tdep->tdesc;
@@ -5905,16 +6090,30 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   tdep->al_regnum = gdbarch_num_regs (gdbarch);
   tdep->ax_regnum = tdep->al_regnum + tdep->num_byte_regs;
 
-  mm0_regnum = tdep->ax_regnum + tdep->num_word_regs;
+  xmm0_regnum = tdep->ax_regnum + tdep->num_word_regs;
   if (tdep->num_dword_regs)
     {
       /* Support dword pseudo-registesr if it hasn't been disabled,  */
-      tdep->eax_regnum = mm0_regnum;
-      mm0_regnum = tdep->eax_regnum + tdep->num_dword_regs;
+      tdep->num_dword_regs = tdep->num_dword_regs;
+      tdep->eax_regnum = xmm0_regnum;
+      xmm0_regnum = tdep->eax_regnum + tdep->num_dword_regs;
     }
   else
     tdep->eax_regnum = -1;
 
+  if (tdep->num_xmm_regs)
+    {
+      /* Support XMM pseudo-registers if they are available.  */
+      tdep->num_xmm_regs = tdep->num_xmm_regs;
+      tdep->xmm0_regnum = xmm0_regnum;
+      mm0_regnum = tdep->xmm0_regnum + tdep->num_xmm_regs;
+    }
+  else
+    {
+      tdep->xmm0_regnum = -1;
+      mm0_regnum = xmm0_regnum;
+    }
+
   if (tdep->num_mmx_regs != 0)
     {
       /* Support MMX pseudo-registesr if MMX hasn't been disabled,  */
@@ -5940,6 +6139,12 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   set_gdbarch_fast_tracepoint_valid_at (gdbarch,
 					i386_fast_tracepoint_valid_at);
 
+  /* Specify XSAVE extended state support with the largest extended
+     state size and XCR0.  */
+  set_gdbarch_qsupported (gdbarch,
+			  "x86:xstate=" I386_XSTATE_MAX_SIZE_STRING
+			  ":xcr0=" I386_XSTATE_MAX_MASK_STRING);
+
   return gdbarch;
 }
 
@@ -5997,4 +6202,5 @@ is \"default\"."),
 
   /* Initialize the standard target descriptions.  */
   initialize_tdesc_i386 ();
+  initialize_tdesc_i386_avx ();
 }
diff --git a/gdb/i386-tdep.h b/gdb/i386-tdep.h
index 72c634e..2746fb4 100644
--- a/gdb/i386-tdep.h
+++ b/gdb/i386-tdep.h
@@ -42,7 +42,7 @@ struct regcache;
    determines the register number at which the FPU data registers
    start.  The number of FPU data and control registers is the same
    for both architectures.  The number of SSE registers however,
-   differs and is determined by the num_xmm_regs member of `struct
+   differs and is determined by the num_vector_regs member of `struct
    gdbarch_tdep'.  */
 
 /* Convention for returning structures.  */
@@ -109,6 +109,9 @@ struct gdbarch_tdep
   struct regset *fpregset;
   size_t sizeof_fpregset;
 
+  /* XSAVE extended state.  */
+  struct regset *xstateregset;
+
   /* Register number for %st(0).  The register numbers for the other
      registers follow from this one.  Set this to -1 to indicate the
      absence of an FPU.  */
@@ -121,6 +124,13 @@ struct gdbarch_tdep
      of MMX support.  */
   int mm0_regnum;
 
+  /* Number of pseudo XMM registers.  */
+  int num_xmm_regs;
+
+  /* Register number for %xmm0.  Set this to -1 to indicate the absence
+     of pseudo XMM register support.  */
+  int xmm0_regnum;
+
   /* Number of byte registers.  */
   int num_byte_regs;
 
@@ -143,8 +153,16 @@ struct gdbarch_tdep
   /* Number of core registers.  */
   int num_core_regs;
 
-  /* Number of SSE registers.  */
-  int num_xmm_regs;
+  /* Number of vector registers.  */
+  int num_vector_regs;
+
+  /* Bits of the extended control register 0 (the XFEATURE_ENABLED_MASK
+     register), excluding the x87 bit, which are supported by this
+     gdb.  */
+  unsigned long long xcr0;
+
+  /* Offset of XCR0 in XSAVE extended state.  */
+  int xsave_xcr0_offset;
 
   /* Register names.  */
   const char **register_names;
@@ -182,6 +200,7 @@ struct gdbarch_tdep
 
   /* ISA-specific data types.  */
   struct type *i386_mmx_type;
+  struct type *i386_xmm_type;
   struct type *i387_ext_type;
 
   /* Process record/replay target.  */
@@ -267,7 +286,7 @@ enum record_i386_regnum
 #define I386_SSE_NUM_REGS	(I386_MXCSR_REGNUM + 1)
 
 /* Size of the largest register.  */
-#define I386_MAX_REGISTER_SIZE	16
+#define I386_MAX_REGISTER_SIZE	32
 
 /* Types for i386-specific registers.  */
 extern struct type *i387_ext_type (struct gdbarch *gdbarch);
@@ -276,6 +295,7 @@ extern struct type *i387_ext_type (struct gdbarch *gdbarch);
 extern int i386_byte_regnum_p (struct gdbarch *gdbarch, int regnum);
 extern int i386_word_regnum_p (struct gdbarch *gdbarch, int regnum);
 extern int i386_dword_regnum_p (struct gdbarch *gdbarch, int regnum);
+extern int i386_xmm_regnum_p (struct gdbarch *gdbarch, int regnum);
 
 extern const char *i386_pseudo_register_name (struct gdbarch *gdbarch,
 					      int regnum);

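The XMM pseudo-register handling in i386_pseudo_register_read and
i386_pseudo_register_write above can be illustrated outside GDB with a
minimal sketch.  It assumes a raw YMM register is a 32-byte
little-endian buffer whose low 16 bytes are the XMM view.  The type
ymm_raw and the functions xmm_pseudo_read and xmm_pseudo_write are
hypothetical names used only for illustration; they are not GDB APIs.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { YMM_SIZE = 32, XMM_SIZE = 16 };

/* Hypothetical stand-in for one raw YMM register in a register cache.  */
typedef struct { uint8_t bytes[YMM_SIZE]; } ymm_raw;

/* Read the pseudo XMM register: copy the low 128 bits of the YMM
   register into BUF.  */
static void
xmm_pseudo_read (const ymm_raw *ymm, uint8_t *buf)
{
  assert (ymm != NULL && buf != NULL);
  memcpy (buf, ymm->bytes, XMM_SIZE);
}

/* Write the pseudo XMM register: read the whole YMM register, replace
   its low 128 bits with BUF, and write the result back.  The upper
   128 bits are preserved.  */
static void
xmm_pseudo_write (ymm_raw *ymm, const uint8_t *buf)
{
  uint8_t raw_buf[YMM_SIZE];

  assert (ymm != NULL && buf != NULL);
  memcpy (raw_buf, ymm->bytes, YMM_SIZE);	/* Read ...  */
  memcpy (raw_buf, buf, XMM_SIZE);		/* ... Modify ...  */
  memcpy (ymm->bytes, raw_buf, YMM_SIZE);	/* ... Write.  */
}
```

The read-modify-write sequence is what keeps a store to $xmmN from
clobbering the upper half of $ymmN.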

* PATCH: 4/6: Add AVX support (amd64 changes)
  2010-03-04 18:05 ` PATCH: 2/6: Add AVX support (Update document) H.J. Lu
  2010-03-04 18:06   ` PATCH: 3/6: Add AVX support (i386 changes) H.J. Lu
@ 2010-03-04 18:08   ` H.J. Lu
  2010-03-04 18:09     ` PATCH: 5/6: Add AVX support (i387 changes) H.J. Lu
  2010-03-06 22:21     ` PATCH: 4/6 [2nd try]: Add AVX support (amd64 changes) H.J. Lu
  2010-03-05 10:33   ` PATCH: 2/6: Add AVX support (Update document) Eli Zaretskii
  2010-03-06 22:19   ` PATCH: 2/6 [2nd try]: " H.J. Lu
  3 siblings, 2 replies; 115+ messages in thread
From: H.J. Lu @ 2010-03-04 18:08 UTC (permalink / raw)
  To: GDB

Hi,

Here are the amd64 changes to support AVX.  OK to install?

Thanks.


H.J.
---
2010-03-04  H.J. Lu  <hongjiu.lu@intel.com>

	* amd64-linux-nat.c: Include "regset.h", "elf/common.h" and
	<sys/uio.h>.
	(xstate_size): New.
	(xstate_size_n_of_int64): Likewise.
	(have_ptrace_getregset): Likewise.
	(amd64_linux_fetch_inferior_registers): Support PTRACE_GETREGSET.
	(amd64_linux_store_inferior_registers): Support PTRACE_GETREGSET
	and PTRACE_SETREGSET.
	(amd64_linux_read_description): Check and enable AVX target
	descriptions.

	* amd64-linux-tdep.c: Include "regset.h", "i386-linux-tdep.h"
	and "features/i386/amd64-avx-linux.c".
	(amd64_linux_regset_sections): New.
	(amd64_linux_core_read_description): Check and enable AVX
	target description.
	(amd64_linux_init_abi): Set xsave_xcr0_offset.  Call
	set_gdbarch_core_regset_sections.
	(_initialize_amd64_linux_tdep): Call
	initialize_tdesc_amd64_avx_linux.

	* amd64-linux-tdep.h (tdesc_amd64_avx_linux): New.
	(amd64_linux_regset_sections): Likewise.

	* amd64-tdep.c: Include "features/i386/amd64-avx.c".
	(amd64_register_names): Renamed to ...
	(amd64_sse_register_names): This.
	(amd64_avx_register_names): New.
	(amd64_xmm_names): Likewise.
	(amd64_supply_xstateregset): Likewise.
	(amd64_collect_xstateregset): Likewise.
	(amd64_supply_xsave): Likewise.
	(amd64_collect_xsave): Likewise.
	(AMD64_NUM_REGS): Updated.
	(amd64_pseudo_register_name): Support pseudo XMM registers.
	(amd64_regset_from_core_section): Support .reg-xstate section.
	(amd64_init_abi): Set num_xmm_regs, register_names and
	num_vector_regs.
	(_initialize_amd64_tdep): Call initialize_tdesc_amd64_avx.

	* amd64-tdep.h (amd64_supply_xsave): New.
	(amd64_collect_xsave): Likewise.

diff --git a/gdb/amd64-linux-nat.c b/gdb/amd64-linux-nat.c
index b9d5833..4a79891 100644
--- a/gdb/amd64-linux-nat.c
+++ b/gdb/amd64-linux-nat.c
@@ -23,11 +23,14 @@
 #include "inferior.h"
 #include "gdbcore.h"
 #include "regcache.h"
+#include "regset.h"
 #include "linux-nat.h"
 #include "amd64-linux-tdep.h"
 
 #include "gdb_assert.h"
 #include "gdb_string.h"
+#include "elf/common.h"
+#include <sys/uio.h>
 #include <sys/ptrace.h>
 #include <sys/debugreg.h>
 #include <sys/syscall.h>
@@ -52,6 +55,16 @@
 #include "amd64-nat.h"
 #include "i386-nat.h"
 
+/* The extended state size in bytes.  */
+static unsigned int xstate_size;
+
+/* The extended state size in units of int64.  We use an array of
+   int64 for better alignment.  */
+static unsigned int xstate_size_n_of_int64;
+
+/* Does the current host support PTRACE_GETREGSET?  */
+static int have_ptrace_getregset = -1;
+
 /* Mapping between the general-purpose registers in GNU/Linux x86-64
    `struct user' format and GDB's register cache layout.  */
 
@@ -183,10 +196,26 @@ amd64_linux_fetch_inferior_registers (struct target_ops *ops,
     {
       elf_fpregset_t fpregs;
 
-      if (ptrace (PTRACE_GETFPREGS, tid, 0, (long) &fpregs) < 0)
-	perror_with_name (_("Couldn't get floating point status"));
+      if (have_ptrace_getregset)
+	{
+	  unsigned long long xstateregs[xstate_size_n_of_int64];
+	  struct iovec iov;
+
+	  iov.iov_base = xstateregs;
+	  iov.iov_len = xstate_size;
+	  if (ptrace (PTRACE_GETREGSET, tid,
+		      (unsigned int) NT_X86_XSTATE, (long) &iov) < 0)
+	    perror_with_name (_("Couldn't get extended state status"));
+
+	  amd64_supply_xsave (regcache, -1, xstateregs);
+	}
+      else
+	{
+	  if (ptrace (PTRACE_GETFPREGS, tid, 0, (long) &fpregs) < 0)
+	    perror_with_name (_("Couldn't get floating point status"));
 
-      amd64_supply_fxsave (regcache, -1, &fpregs);
+	  amd64_supply_fxsave (regcache, -1, &fpregs);
+	}
     }
 }
 
@@ -226,15 +255,33 @@ amd64_linux_store_inferior_registers (struct target_ops *ops,
     {
       elf_fpregset_t fpregs;
 
-      if (ptrace (PTRACE_GETFPREGS, tid, 0, (long) &fpregs) < 0)
-	perror_with_name (_("Couldn't get floating point status"));
+      if (have_ptrace_getregset)
+	{
+	  unsigned long long xstateregs[xstate_size_n_of_int64];
+	  struct iovec iov;
 
-      amd64_collect_fxsave (regcache, regnum, &fpregs);
+	  iov.iov_base = xstateregs;
+	  iov.iov_len = xstate_size;
+	  if (ptrace (PTRACE_GETREGSET, tid,
+		      (unsigned int) NT_X86_XSTATE, (long) &iov) < 0)
+	    perror_with_name (_("Couldn't get extended state status"));
 
-      if (ptrace (PTRACE_SETFPREGS, tid, 0, (long) &fpregs) < 0)
-	perror_with_name (_("Couldn't write floating point status"));
+	  amd64_collect_xsave (regcache, regnum, xstateregs, 0);
 
-      return;
+	  if (ptrace (PTRACE_SETREGSET, tid,
+		      (unsigned int) NT_X86_XSTATE, (long) &iov) < 0)
+	    perror_with_name (_("Couldn't write extended state status"));
+	}
+      else
+	{
+	  if (ptrace (PTRACE_GETFPREGS, tid, 0, (long) &fpregs) < 0)
+	    perror_with_name (_("Couldn't get floating point status"));
+
+	  amd64_collect_fxsave (regcache, regnum, &fpregs);
+
+	  if (ptrace (PTRACE_SETFPREGS, tid, 0, (long) &fpregs) < 0)
+	    perror_with_name (_("Couldn't write floating point status"));
+	}
     }
 }
 \f
@@ -688,6 +735,8 @@ amd64_linux_read_description (struct target_ops *ops)
 {
   unsigned long cs;
   int tid;
+  int is_64bit;
+  static unsigned long long xcr0;
 
   /* GNU/Linux LWP ID's are process ID's.  */
   tid = TIDGET (inferior_ptid);
@@ -701,10 +750,53 @@ amd64_linux_read_description (struct target_ops *ops)
   if (errno != 0)
     perror_with_name (_("Couldn't get CS register"));
 
-  if (cs == AMD64_LINUX_USER64_CS)
-    return tdesc_amd64_linux;
+  is_64bit = cs == AMD64_LINUX_USER64_CS;
+
+  if (have_ptrace_getregset == -1)
+    {
+      unsigned long long xstateregs[(I386_XSTATE_SSE_SIZE
+				     / sizeof (long long))];
+      struct iovec iov;
+
+      iov.iov_base = xstateregs;
+      iov.iov_len = I386_XSTATE_SSE_SIZE;
+
+      /* Check if PTRACE_GETREGSET works.  */
+      if (ptrace (PTRACE_GETREGSET, tid,
+		  (unsigned int) NT_X86_XSTATE, (long) &iov) < 0)
+	have_ptrace_getregset = 0;
+      else
+	{
+	  have_ptrace_getregset = 1;
+
+	  /* Get XCR0 from XSAVE extended state.  */
+	  xcr0 = xstateregs[(I386_LINUX_XSAVE_XCR0_OFFSET
+			     / sizeof (long long))];
+
+	  xstate_size = I386_XSTATE_SIZE (xcr0);
+	  xstate_size_n_of_int64 = xstate_size / sizeof (long long);
+	}
+
+      i386_linux_update_xstateregset (amd64_linux_regset_sections,
+				      xstate_size);
+    }
+
+  /* Check the native XCR0 only if PTRACE_GETREGSET is available.  */
+  if (have_ptrace_getregset
+      && (xcr0 & I386_XSTATE_AVX_MASK) == I386_XSTATE_AVX_MASK)
+    {
+      if (is_64bit)
+	return tdesc_amd64_avx_linux;
+      else
+	return tdesc_i386_avx_linux;
+    }
   else
-    return tdesc_i386_linux;
+    {
+      if (is_64bit)
+	return tdesc_amd64_linux;
+      else
+	return tdesc_i386_linux;
+    }
 }
 
 /* Provide a prototype to silence -Wmissing-prototypes.  */
diff --git a/gdb/amd64-linux-tdep.c b/gdb/amd64-linux-tdep.c
index 4ad6dc9..51722bf 100644
--- a/gdb/amd64-linux-tdep.c
+++ b/gdb/amd64-linux-tdep.c
@@ -28,7 +28,9 @@
 #include "symtab.h"
 #include "gdbtypes.h"
 #include "reggroups.h"
+#include "regset.h"
 #include "amd64-linux-tdep.h"
+#include "i386-linux-tdep.h"
 #include "linux-tdep.h"
 
 #include "gdb_string.h"
@@ -38,6 +40,7 @@
 #include "xml-syscall.h"
 
 #include "features/i386/amd64-linux.c"
+#include "features/i386/amd64-avx-linux.c"
 
 /* The syscall's XML filename for i386.  */
 #define XML_SYSCALL_FILENAME_AMD64 "syscalls/amd64-linux.xml"
@@ -45,6 +48,15 @@
 #include "record.h"
 #include "linux-record.h"
 
+/* Supported register note sections.  */
+struct core_regset_section amd64_linux_regset_sections[] =
+{
+  { ".reg", 144, "general-purpose" },
+  { ".reg2", 512, "floating-point" },
+  { ".reg-xstate", 0, "XSAVE extended state" },
+  { NULL, 0 }
+};
+
 /* Mapping between the general-purpose registers in `struct user'
    format and GDB's register cache layout.  */
 
@@ -1250,12 +1262,17 @@ amd64_linux_core_read_description (struct gdbarch *gdbarch,
 				  bfd *abfd)
 {
   asection *section = bfd_get_section_by_name (abfd, ".reg2");
+  unsigned long long xcr0;
 
   if (section == NULL)
     return NULL;
 
   /* Linux/x86-64.  */
-  return tdesc_amd64_linux;
+  xcr0 = i386_linux_core_read_xcr0 (gdbarch, target, abfd);
+  if ((xcr0 & I386_XSTATE_AVX_MASK) == I386_XSTATE_AVX_MASK)
+    return tdesc_amd64_avx_linux;
+  else
+    return tdesc_amd64_linux;
 }
 
 static void
@@ -1297,6 +1314,8 @@ amd64_linux_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch)
   tdep->sc_reg_offset = amd64_linux_sc_reg_offset;
   tdep->sc_num_regs = ARRAY_SIZE (amd64_linux_sc_reg_offset);
 
+  tdep->xsave_xcr0_offset = I386_LINUX_XSAVE_XCR0_OFFSET;
+
   /* GNU/Linux uses SVR4-style shared libraries.  */
   set_solib_svr4_fetch_link_map_offsets
     (gdbarch, svr4_lp64_fetch_link_map_offsets);
@@ -1318,6 +1337,9 @@ amd64_linux_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch)
   /* GNU/Linux uses SVR4-style shared libraries.  */
   set_gdbarch_skip_trampoline_code (gdbarch, find_solib_trampoline_target);
 
+  /* Install supported register note sections.  */
+  set_gdbarch_core_regset_sections (gdbarch, amd64_linux_regset_sections);
+
   set_gdbarch_core_read_description (gdbarch,
 				     amd64_linux_core_read_description);
 
@@ -1517,4 +1539,5 @@ _initialize_amd64_linux_tdep (void)
 
   /* Initialize the Linux target description  */
   initialize_tdesc_amd64_linux ();
+  initialize_tdesc_amd64_avx_linux ();
 }
diff --git a/gdb/amd64-linux-tdep.h b/gdb/amd64-linux-tdep.h
index 33316fb..78d9744 100644
--- a/gdb/amd64-linux-tdep.h
+++ b/gdb/amd64-linux-tdep.h
@@ -33,6 +33,10 @@
 
 /* Linux target description.  */
 extern struct target_desc *tdesc_amd64_linux;
+extern struct target_desc *tdesc_amd64_avx_linux;
+
+/* Supported register note sections.  */
+extern struct core_regset_section amd64_linux_regset_sections[];
 
 /* Enum that defines the syscall identifiers for amd64 linux.
    Used for process record/replay, these will be translated into
diff --git a/gdb/amd64-tdep.c b/gdb/amd64-tdep.c
index 8c41a8a..1a6cd80 100644
--- a/gdb/amd64-tdep.c
+++ b/gdb/amd64-tdep.c
@@ -43,6 +43,7 @@
 #include "i387-tdep.h"
 
 #include "features/i386/amd64.c"
+#include "features/i386/amd64-avx.c"
 
 /* Note that the AMD64 architecture was previously known as x86-64.
    The latter is (forever) engraved into the canonical system name as
@@ -53,7 +54,7 @@
 
 /* Register information.  */
 
-static const char *amd64_register_names[] = 
+static const char *amd64_sse_register_names[] = 
 {
   "rax", "rbx", "rcx", "rdx", "rsi", "rdi", "rbp", "rsp",
 
@@ -71,8 +72,26 @@ static const char *amd64_register_names[] =
   "mxcsr",
 };
 
+static const char *amd64_avx_register_names[] = 
+{
+  "rax", "rbx", "rcx", "rdx", "rsi", "rdi", "rbp", "rsp",
+
+  /* %r8 is indeed register number 8.  */
+  "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15",
+  "rip", "eflags", "cs", "ss", "ds", "es", "fs", "gs",
+
+  /* %st0 is register number 24.  */
+  "st0", "st1", "st2", "st3", "st4", "st5", "st6", "st7",
+  "fctrl", "fstat", "ftag", "fiseg", "fioff", "foseg", "fooff", "fop",
+
+  /* %ymm0 is register number 40.  */
+  "ymm0", "ymm1", "ymm2", "ymm3", "ymm4", "ymm5", "ymm6", "ymm7",
+  "ymm8", "ymm9", "ymm10", "ymm11", "ymm12", "ymm13", "ymm14", "ymm15",
+  "mxcsr"
+};
+
 /* Total number of registers.  */
-#define AMD64_NUM_REGS	ARRAY_SIZE (amd64_register_names)
+#define AMD64_NUM_REGS	ARRAY_SIZE (amd64_sse_register_names)
 
 /* The registers used to pass integer arguments during a function call.  */
 static int amd64_dummy_call_integer_regs[] =
@@ -234,6 +253,14 @@ static const char *amd64_dword_names[] =
   "r8d", "r9d", "r10d", "r11d", "r12d", "r13d", "r14d", "r15d"
 };
 
+/* Register names for XMM pseudo-registers.  */
+
+static const char *amd64_xmm_names[] =
+{
+  "xmm0", "xmm1", "xmm2", "xmm3", "xmm4", "xmm5", "xmm6", "xmm7",
+  "xmm8", "xmm9", "xmm10", "xmm11", "xmm12", "xmm13", "xmm14", "xmm15"
+};
+
 /* Return the name of register REGNUM.  */
 
 static const char *
@@ -242,6 +269,8 @@ amd64_pseudo_register_name (struct gdbarch *gdbarch, int regnum)
   struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
   if (i386_byte_regnum_p (gdbarch, regnum))
     return amd64_byte_names[regnum - tdep->al_regnum];
+  else if (i386_xmm_regnum_p (gdbarch, regnum))
+    return amd64_xmm_names[regnum - tdep->xmm0_regnum];
   else if (i386_word_regnum_p (gdbarch, regnum))
     return amd64_word_names[regnum - tdep->ax_regnum];
   else if (i386_dword_regnum_p (gdbarch, regnum))
@@ -2148,6 +2177,28 @@ amd64_collect_fpregset (const struct regset *regset,
   amd64_collect_fxsave (regcache, regnum, fpregs);
 }
 
+/* Similar to amd64_supply_fpregset, but use XSAVE extended state.  */
+
+static void
+amd64_supply_xstateregset (const struct regset *regset,
+			   struct regcache *regcache, int regnum,
+			   const void *xstateregs, size_t len)
+{
+  const struct gdbarch_tdep *tdep = gdbarch_tdep (regset->arch);
+  amd64_supply_xsave (regcache, regnum, xstateregs);
+}
+
+/* Similar to amd64_collect_fpregset, but use XSAVE extended state.  */
+
+static void
+amd64_collect_xstateregset (const struct regset *regset,
+			    const struct regcache *regcache,
+			    int regnum, void *xstateregs, size_t len)
+{
+  const struct gdbarch_tdep *tdep = gdbarch_tdep (regset->arch);
+  amd64_collect_xsave (regcache, regnum, xstateregs, 1);
+}
+
 /* Return the appropriate register set for the core section identified
    by SECT_NAME and SECT_SIZE.  */
 
@@ -2166,6 +2217,16 @@ amd64_regset_from_core_section (struct gdbarch *gdbarch,
       return tdep->fpregset;
     }
 
+  if (strcmp (sect_name, ".reg-xstate") == 0)
+    {
+      if (tdep->xstateregset == NULL)
+	tdep->xstateregset = regset_alloc (gdbarch,
+					   amd64_supply_xstateregset,
+					   amd64_collect_xstateregset);
+
+      return tdep->xstateregset;
+    }
+
   return i386_regset_from_core_section (gdbarch, sect_name, sect_size);
 }
 \f
@@ -2226,7 +2287,17 @@ amd64_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch)
   tdep->tdesc = tdesc;
 
   tdep->num_core_regs = AMD64_NUM_GREGS + I387_NUM_REGS;
-  tdep->register_names = amd64_register_names;
+
+  if (tdesc_find_feature (tdesc, "org.gnu.gdb.i386.sse") != NULL)
+    {
+      tdep->register_names = amd64_sse_register_names;
+      tdep->num_xmm_regs = 0;
+    }
+  else
+    {
+      tdep->register_names = amd64_avx_register_names;
+      tdep->num_xmm_regs = 16;
+    }
 
   tdep->num_byte_regs = 16;
   tdep->num_word_regs = 16;
@@ -2243,7 +2314,7 @@ amd64_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch)
 
   /* AMD64 has an FPU and 16 SSE registers.  */
   tdep->st0_regnum = AMD64_ST0_REGNUM;
-  tdep->num_xmm_regs = 16;
+  tdep->num_vector_regs = 16;
 
   /* This is what all the fuss is about.  */
   set_gdbarch_long_bit (gdbarch, 64);
@@ -2321,6 +2392,7 @@ void
 _initialize_amd64_tdep (void)
 {
   initialize_tdesc_amd64 ();
+  initialize_tdesc_amd64_avx ();
 }
 \f
 
@@ -2356,6 +2428,30 @@ amd64_supply_fxsave (struct regcache *regcache, int regnum,
     }
 }
 
+/* Similar to amd64_supply_fxsave, but use XSAVE extended state.  */
+
+void
+amd64_supply_xsave (struct regcache *regcache, int regnum,
+		    const void *xsave)
+{
+  struct gdbarch *gdbarch = get_regcache_arch (regcache);
+  struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+
+  i387_supply_xsave (regcache, regnum, xsave);
+
+  if (xsave && gdbarch_ptr_bit (gdbarch) == 64)
+    {
+      const gdb_byte *regs = xsave;
+
+      if (regnum == -1 || regnum == I387_FISEG_REGNUM (tdep))
+	regcache_raw_supply (regcache, I387_FISEG_REGNUM (tdep),
+			     regs + 12);
+      if (regnum == -1 || regnum == I387_FOSEG_REGNUM (tdep))
+	regcache_raw_supply (regcache, I387_FOSEG_REGNUM (tdep),
+			     regs + 20);
+    }
+}
+
 /* Fill register REGNUM (if it is a floating-point or SSE register) in
    *FXSAVE with the value from REGCACHE.  If REGNUM is -1, do this for
    all registers.  This function doesn't touch any of the reserved
@@ -2379,3 +2475,26 @@ amd64_collect_fxsave (const struct regcache *regcache, int regnum,
 	regcache_raw_collect (regcache, I387_FOSEG_REGNUM (tdep), regs + 20);
     }
 }
+
+/* Similar to amd64_collect_fxsave, but use XSAVE extended state.  */
+
+void
+amd64_collect_xsave (const struct regcache *regcache, int regnum,
+		     void *xsave, int gcore)
+{
+  struct gdbarch *gdbarch = get_regcache_arch (regcache);
+  struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+  gdb_byte *regs = xsave;
+
+  i387_collect_xsave (regcache, regnum, xsave, gcore);
+
+  if (gdbarch_ptr_bit (gdbarch) == 64)
+    {
+      if (regnum == -1 || regnum == I387_FISEG_REGNUM (tdep))
+	regcache_raw_collect (regcache, I387_FISEG_REGNUM (tdep),
+			      regs + 12);
+      if (regnum == -1 || regnum == I387_FOSEG_REGNUM (tdep))
+	regcache_raw_collect (regcache, I387_FOSEG_REGNUM (tdep),
+			      regs + 20);
+    }
+}
diff --git a/gdb/amd64-tdep.h b/gdb/amd64-tdep.h
index 363479c..4dccb4f 100644
--- a/gdb/amd64-tdep.h
+++ b/gdb/amd64-tdep.h
@@ -91,6 +91,10 @@ extern struct type *amd64_register_type (struct gdbarch *gdbarch, int regnum);
 extern void amd64_supply_fxsave (struct regcache *regcache, int regnum,
 				 const void *fxsave);
 
+/* Similar to amd64_supply_fxsave, but use XSAVE extended state.  */
+extern void amd64_supply_xsave (struct regcache *regcache, int regnum,
+				const void *xsave);
+
 /* Fill register REGNUM (if it is a floating-point or SSE register) in
    *FXSAVE with the value from REGCACHE.  If REGNUM is -1, do this for
    all registers.  This function doesn't touch any of the reserved
@@ -99,6 +103,10 @@ extern void amd64_supply_fxsave (struct regcache *regcache, int regnum,
 extern void amd64_collect_fxsave (const struct regcache *regcache, int regnum,
 				  void *fxsave);
 
+/* Similar to amd64_collect_fxsave, but use XSAVE extended state.  */
+extern void amd64_collect_xsave (const struct regcache *regcache,
+				 int regnum, void *xsave, int gcore);
+
 void amd64_classify (struct type *type, enum amd64_reg_class class[2]);
 
 \f


* PATCH: 5/6: Add AVX support (i387 changes)
  2010-03-04 18:08   ` PATCH: 4/6: Add AVX support (amd64 changes) H.J. Lu
@ 2010-03-04 18:09     ` H.J. Lu
  2010-03-04 18:10       ` PATCH: 6/6: Add AVX support (gdbserver changes) H.J. Lu
                         ` (2 more replies)
  2010-03-06 22:21     ` PATCH: 4/6 [2nd try]: Add AVX support (amd64 changes) H.J. Lu
  1 sibling, 3 replies; 115+ messages in thread
From: H.J. Lu @ 2010-03-04 18:09 UTC (permalink / raw)
  To: GDB

Hi,

Here are the i387 changes to support AVX.  OK to install?

Thanks.


H.J.
---
2010-03-03  H.J. Lu  <hongjiu.lu@intel.com>

	* i387-tdep.c: Include "i386-xstate.h".
	(i387_supply_fsave): Replace I387_XMM0_REGNUM with
	I387_VECTOR0_REGNUM.
	(i387_collect_fsave): Likewise.
	(i387_supply_fxsave): Replace I387_XMM0_REGNUM with
	I387_VECTOR0_REGNUM.  Replace num_xmm_regs with num_vector_regs.
	Check tdep->xcr0 for AVX.
	(i387_collect_fxsave): Likewise.
	(xsave_sse_offset): New.
	(XSAVE_XSTATE_BV_ADDR): Likewise.
	(XSAVE_SSE_ADDR): Likewise.
	(xsave_avxh_offset): Likewise.
	(XSAVE_AVXH_ADDR): Likewise.
	(i387_supply_xsave): Likewise.
	(i387_collect_xsave): Likewise.

	* i387-tdep.h (I387_NUM_XMM_REGS): Renamed to ...
	(I387_NUM_VECTOR_REGS): This.
	(I387_XMM0_REGNUM): Renamed to ...
	(I387_VECTOR0_REGNUM): This.
	(I387_MXCSR_REGNUM): Updated.
	(i387_supply_xsave): New.
	(i387_collect_xsave): Likewise.
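
For reference, the XSAVE layout that xsave_sse_offset,
XSAVE_XSTATE_BV_ADDR and xsave_avxh_offset encode can be sketched as a
small standalone C fragment.  This is only an illustration, not code
from the patch: the sketch_* names are hypothetical, and the feature
bit values (x87 = bit 0, SSE = bit 1, AVX = bit 2) are assumed to match
what "i386-xstate.h" provides.  Unlike the patch, the sketch masks XCR0
to its low byte explicitly.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names; the byte offsets are the ones used in the patch.  */
enum
{
  SKETCH_SSE_BASE = 160,	/* %xmm0, 16 bytes per register.  */
  SKETCH_XSTATE_BV = 512,	/* xstate_bv field.  */
  SKETCH_AVXH_BASE = 576,	/* Upper 128 bits of %ymm0.  */
  SKETCH_XSAVE_SIZE = 576 + 16 * 16
};

/* Assumed feature bits in XCR0 and xstate_bv.  */
#define SKETCH_X87 (1u << 0)
#define SKETCH_SSE (1u << 1)
#define SKETCH_AVX (1u << 2)

/* Byte offset of the lower 128 bits of vector register N.  */
static unsigned int
sketch_sse_offset (int n)
{
  assert (n >= 0 && n < 16);
  return SKETCH_SSE_BASE + n * 16;
}

/* Byte offset of the upper 128 bits of vector register N.  */
static unsigned int
sketch_avxh_offset (int n)
{
  assert (n >= 0 && n < 16);
  return SKETCH_AVXH_BASE + n * 16;
}

/* States enabled in XCR0 whose bit in xstate_bv is clear.  Their
   registers must be read as zero.  */
static unsigned int
sketch_clear_bv (const uint8_t *xsave, uint64_t xcr0)
{
  unsigned int in_use = xsave[SKETCH_XSTATE_BV];

  return ~in_use & (unsigned int) (xcr0 & 0xff);
}

/* Assemble the 256-bit value of %ymmN into OUT from its two halves,
   zeroing each half whose state is not marked in use.  */
static void
sketch_read_ymm (const uint8_t *xsave, uint64_t xcr0, int n,
		 uint8_t out[32])
{
  unsigned int clear_bv = sketch_clear_bv (xsave, xcr0);

  if (clear_bv & SKETCH_SSE)
    memset (out, 0, 16);
  else
    memcpy (out, xsave + sketch_sse_offset (n), 16);

  if (clear_bv & SKETCH_AVX)
    memset (out + 16, 0, 16);
  else
    memcpy (out + 16, xsave + sketch_avxh_offset (n), 16);
}
```

With XCR0 = 7 and only the x87 and SSE bits set in xstate_bv,
sketch_read_ymm copies the saved lower half and zeroes the upper half,
which mirrors the per-half handling in i387_supply_xsave.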

diff --git a/gdb/i387-tdep.c b/gdb/i387-tdep.c
index 3fb5b56..1f4547d 100644
--- a/gdb/i387-tdep.c
+++ b/gdb/i387-tdep.c
@@ -34,6 +34,7 @@
 
 #include "i386-tdep.h"
 #include "i387-tdep.h"
+#include "i386-xstate.h"
 
 /* Print the floating point number specified by RAW.  */
 
@@ -398,7 +399,7 @@ i387_supply_fsave (struct regcache *regcache, int regnum, const void *fsave)
 
   gdb_assert (tdep->st0_regnum >= I386_ST0_REGNUM);
 
-  for (i = I387_ST0_REGNUM (tdep); i < I387_XMM0_REGNUM (tdep); i++)
+  for (i = I387_ST0_REGNUM (tdep); i < I387_VECTOR0_REGNUM (tdep); i++)
     if (regnum == -1 || regnum == i)
       {
 	if (fsave == NULL)
@@ -425,7 +426,7 @@ i387_supply_fsave (struct regcache *regcache, int regnum, const void *fsave)
       }
 
   /* Provide dummy values for the SSE registers.  */
-  for (i = I387_XMM0_REGNUM (tdep); i < I387_MXCSR_REGNUM (tdep); i++)
+  for (i = I387_VECTOR0_REGNUM (tdep); i < I387_MXCSR_REGNUM (tdep); i++)
     if (regnum == -1 || regnum == i)
       regcache_raw_supply (regcache, i, NULL);
   if (regnum == -1 || regnum == I387_MXCSR_REGNUM (tdep))
@@ -451,7 +452,7 @@ i387_collect_fsave (const struct regcache *regcache, int regnum, void *fsave)
 
   gdb_assert (tdep->st0_regnum >= I386_ST0_REGNUM);
 
-  for (i = I387_ST0_REGNUM (tdep); i < I387_XMM0_REGNUM (tdep); i++)
+  for (i = I387_ST0_REGNUM (tdep); i < I387_VECTOR0_REGNUM (tdep); i++)
     if (regnum == -1 || regnum == i)
       {
 	/* Most of the FPU control registers occupy only 16 bits in
@@ -541,9 +542,11 @@ i387_supply_fxsave (struct regcache *regcache, int regnum, const void *fxsave)
   struct gdbarch_tdep *tdep = gdbarch_tdep (get_regcache_arch (regcache));
   const gdb_byte *regs = fxsave;
   int i;
+  gdb_byte raw[I386_MAX_REGISTER_SIZE];
+  const gdb_byte *xmm;
 
   gdb_assert (tdep->st0_regnum >= I386_ST0_REGNUM);
-  gdb_assert (tdep->num_xmm_regs > 0);
+  gdb_assert (tdep->num_vector_regs > 0);
 
   for (i = I387_ST0_REGNUM (tdep); i < I387_MXCSR_REGNUM (tdep); i++)
     if (regnum == -1 || regnum == i)
@@ -556,7 +559,7 @@ i387_supply_fxsave (struct regcache *regcache, int regnum, const void *fxsave)
 
 	/* Most of the FPU control registers occupy only 16 bits in
 	   the fxsave area.  Give those a special treatment.  */
-	if (i >= I387_FCTRL_REGNUM (tdep) && i < I387_XMM0_REGNUM (tdep)
+	if (i >= I387_FCTRL_REGNUM (tdep) && i < I387_VECTOR0_REGNUM (tdep)
 	    && i != I387_FIOFF_REGNUM (tdep) && i != I387_FOOFF_REGNUM (tdep))
 	  {
 	    gdb_byte val[4];
@@ -600,7 +603,17 @@ i387_supply_fxsave (struct regcache *regcache, int regnum, const void *fxsave)
 	    regcache_raw_supply (regcache, i, val);
 	  }
 	else
-	  regcache_raw_supply (regcache, i, FXSAVE_ADDR (tdep, regs, i));
+	  {
+	    if ((tdep->xcr0 & I386_XSTATE_AVX_MASK)
+		== I386_XSTATE_AVX_MASK)
+	      {
+		memcpy (raw, FXSAVE_ADDR (tdep, regs, i), 16);
+		xmm = raw;
+	      }
+	    else
+	      xmm = FXSAVE_ADDR (tdep, regs, i);
+	    regcache_raw_supply (regcache, i, xmm);
+	  }
       }
 
   if (regnum == I387_MXCSR_REGNUM (tdep) || regnum == -1)
@@ -624,16 +637,18 @@ i387_collect_fxsave (const struct regcache *regcache, int regnum, void *fxsave)
   struct gdbarch_tdep *tdep = gdbarch_tdep (get_regcache_arch (regcache));
   gdb_byte *regs = fxsave;
   int i;
+  gdb_byte raw[I386_MAX_REGISTER_SIZE];
+  gdb_byte *xmm;
 
   gdb_assert (tdep->st0_regnum >= I386_ST0_REGNUM);
-  gdb_assert (tdep->num_xmm_regs > 0);
+  gdb_assert (tdep->num_vector_regs > 0);
 
   for (i = I387_ST0_REGNUM (tdep); i < I387_MXCSR_REGNUM (tdep); i++)
     if (regnum == -1 || regnum == i)
       {
 	/* Most of the FPU control registers occupy only 16 bits in
            the fxsave area.  Give those a special treatment.  */
-	if (i >= I387_FCTRL_REGNUM (tdep) && i < I387_XMM0_REGNUM (tdep)
+	if (i >= I387_FCTRL_REGNUM (tdep) && i < I387_VECTOR0_REGNUM (tdep)
 	    && i != I387_FIOFF_REGNUM (tdep) && i != I387_FOOFF_REGNUM (tdep))
 	  {
 	    gdb_byte buf[4];
@@ -669,7 +684,465 @@ i387_collect_fxsave (const struct regcache *regcache, int regnum, void *fxsave)
 	    memcpy (FXSAVE_ADDR (tdep, regs, i), buf, 2);
 	  }
 	else
+	  {
+	    if ((tdep->xcr0 & I386_XSTATE_AVX_MASK)
+		== I386_XSTATE_AVX_MASK)
+	      xmm = raw;
+	    else
+	      xmm = FXSAVE_ADDR (tdep, regs, i);
+	    regcache_raw_collect (regcache, i, xmm);
+	    /* Store only the lower 128 bits of %ymm in the fxsave area.  */
+	    if (xmm == raw)
+	      memcpy (FXSAVE_ADDR (tdep, regs, i), raw, 16);
+	  }
+      }
+
+  if (regnum == I387_MXCSR_REGNUM (tdep) || regnum == -1)
+    regcache_raw_collect (regcache, I387_MXCSR_REGNUM (tdep),
+			  FXSAVE_MXCSR_ADDR (regs));
+}
+
+/* At xsave_sse_offset[REGNUM] you'll find the offset to the location in
+   the SSE register data structure used by the "xsave" instruction where
+   GDB register REGNUM is stored.  */
+
+static int xsave_sse_offset[] =
+{
+  160 + 0 * 16,		/* %xmm0 through ...  */
+  160 + 1 * 16,
+  160 + 2 * 16,
+  160 + 3 * 16,
+  160 + 4 * 16,
+  160 + 5 * 16,
+  160 + 6 * 16,
+  160 + 7 * 16,
+  160 + 8 * 16,
+  160 + 9 * 16,
+  160 + 10 * 16,
+  160 + 11 * 16,
+  160 + 12 * 16,
+  160 + 13 * 16,
+  160 + 14 * 16,
+  160 + 15 * 16,	/* ... %xmm15 (128 bits each).  */
+};
+
+/* `xstate_bv' is at byte offset 512.  */
+#define XSAVE_XSTATE_BV_ADDR(xsave) (xsave + 512)
+
+#define XSAVE_SSE_ADDR(tdep, xsave, regnum) \
+  (xsave + xsave_sse_offset[regnum - I387_VECTOR0_REGNUM (tdep)])
+
+/* At xsave_avxh_offset[REGNUM] you'll find the offset to the location in
+   the upper 128bit of AVX register data structure used by the "xsave"
+   instruction where GDB register REGNUM is stored.  */
+
+static int xsave_avxh_offset[] =
+{
+  576 + 0 * 16,		/* Upper 128bit of %ymm0 through ...  */
+  576 + 1 * 16,
+  576 + 2 * 16,
+  576 + 3 * 16,
+  576 + 4 * 16,
+  576 + 5 * 16,
+  576 + 6 * 16,
+  576 + 7 * 16,
+  576 + 8 * 16,
+  576 + 9 * 16,
+  576 + 10 * 16,
+  576 + 11 * 16,
+  576 + 12 * 16,
+  576 + 13 * 16,
+  576 + 14 * 16,
+  576 + 15 * 16,	/* Upper 128bit of ... %ymm15 (128 bits each).  */
+};
+
+#define XSAVE_AVXH_ADDR(tdep, xsave, regnum) \
+  (xsave + xsave_avxh_offset[regnum - I387_VECTOR0_REGNUM (tdep)])
+
+/* Similar to i387_supply_fxsave, but use XSAVE extended state.  */
+
+void
+i387_supply_xsave (struct regcache *regcache, int regnum,
+		   const void *xsave)
+{
+  struct gdbarch_tdep *tdep = gdbarch_tdep (get_regcache_arch (regcache));
+  const gdb_byte *regs = xsave;
+  int i;
+  unsigned int clear_bv;
+  gdb_byte raw[I386_MAX_REGISTER_SIZE];
+  const gdb_byte *p;
+
+  gdb_assert (tdep->st0_regnum >= I386_ST0_REGNUM);
+  gdb_assert (tdep->num_vector_regs > 0);
+
+  if (regs != NULL
+      && (regnum == -1
+	  || (regnum >= I387_VECTOR0_REGNUM (tdep)
+	      && regnum < I387_MXCSR_REGNUM (tdep))
+	  || (regnum >= I387_ST0_REGNUM (tdep)
+	      && regnum < I387_FCTRL_REGNUM (tdep))))
+    {
+      /* Get `xstate_bv'.  */
+      const gdb_byte *xstate_bv_p = XSAVE_XSTATE_BV_ADDR (regs);
+
+      /* The supported bits in `xstate_bv' fit in 1 byte.  Clear the
+	 part of a register whose bit in `xstate_bv' is zero.  */
+      clear_bv = (~(*xstate_bv_p)) & tdep->xcr0;
+    }
+  else
+    clear_bv = 0;
+
+  for (i = I387_ST0_REGNUM (tdep); i < I387_MXCSR_REGNUM (tdep); i++)
+    if (regnum == -1 || regnum == i)
+      {
+	if (regs == NULL)
+	  {
+	    regcache_raw_supply (regcache, i, NULL);
+	    continue;
+	  }
+
+	/* Most of the FPU control registers occupy only 16 bits in
+	   the xsave extended state.  Give those a special treatment.  */
+	if (i >= I387_FCTRL_REGNUM (tdep)
+	    && i < I387_VECTOR0_REGNUM (tdep)
+	    && i != I387_FIOFF_REGNUM (tdep)
+	    && i != I387_FOOFF_REGNUM (tdep))
+	  {
+	    gdb_byte val[4];
+
+	    memcpy (val, FXSAVE_ADDR (tdep, regs, i), 2);
+	    val[2] = val[3] = 0;
+	    if (i == I387_FOP_REGNUM (tdep))
+	      val[1] &= ((1 << 3) - 1);
+	    else if (i == I387_FTAG_REGNUM (tdep))
+	      {
+		/* The fxsave area contains a simplified version of
+		   the tag word.  We have to look at the actual 80-bit
+		   FP data to recreate the traditional i387 tag word.  */
+
+		unsigned long ftag = 0;
+		int fpreg;
+		int top;
+
+		top = ((FXSAVE_ADDR (tdep, regs,
+				     I387_FSTAT_REGNUM (tdep)))[1] >> 3);
+		top &= 0x7;
+
+		for (fpreg = 7; fpreg >= 0; fpreg--)
+		  {
+		    int tag;
+
+		    if (val[0] & (1 << fpreg))
+		      {
+			int regnum = (fpreg + 8 - top) % 8 
+				       + I387_ST0_REGNUM (tdep);
+			tag = i387_tag (FXSAVE_ADDR (tdep, regs, regnum));
+		      }
+		    else
+		      tag = 3;		/* Empty */
+
+		    ftag |= tag << (2 * fpreg);
+		  }
+		val[0] = ftag & 0xff;
+		val[1] = (ftag >> 8) & 0xff;
+	      }
+	    regcache_raw_supply (regcache, i, val);
+	  }
+	else if (i < I387_VECTOR0_REGNUM (tdep))
+	  {
+	    if (i < I387_FCTRL_REGNUM (tdep)
+		&& (clear_bv & bit_I386_XSTATE_X87))
+	      p = NULL;
+	    else
+	      p = FXSAVE_ADDR (tdep, regs, i);
+	    regcache_raw_supply (regcache, i, p);
+	  }
+	else
+	  {
+	    if ((tdep->xcr0 & I386_XSTATE_AVX_MASK)
+		== I386_XSTATE_AVX_MASK)
+	      {
+		if ((clear_bv & (bit_I386_XSTATE_SSE | bit_I386_XSTATE_AVX))
+		    == (bit_I386_XSTATE_SSE | bit_I386_XSTATE_AVX))
+		  p = NULL;
+		else
+		  {
+		    p = raw;
+		    if ((clear_bv & bit_I386_XSTATE_SSE))
+		      memset (raw, 0, 16);
+		    else
+		      memcpy (raw, XSAVE_SSE_ADDR (tdep, regs, i), 16);
+		    if ((clear_bv & bit_I386_XSTATE_AVX))
+		      memset (raw + 16, 0, 16);
+		    else
+		      memcpy (raw + 16, XSAVE_AVXH_ADDR (tdep, regs, i),
+			      16);
+		  }
+	      }
+	    else
+	      {
+		if ((clear_bv & bit_I386_XSTATE_SSE))
+		  p = NULL;
+		else
+		  p = XSAVE_SSE_ADDR (tdep, regs, i);
+	      }
+	    regcache_raw_supply (regcache, i, p);
+	  }
+      }
+
+  if (regnum == I387_MXCSR_REGNUM (tdep) || regnum == -1)
+    {
+      if (regs == NULL)
+	regcache_raw_supply (regcache, I387_MXCSR_REGNUM (tdep), NULL);
+      else
+	regcache_raw_supply (regcache, I387_MXCSR_REGNUM (tdep),
+			     FXSAVE_MXCSR_ADDR (regs));
+    }
+}
+
+/* Similar to i387_collect_fxsave, but use XSAVE extended state.  */
+
+void
+i387_collect_xsave (const struct regcache *regcache, int regnum,
+		    void *xsave, int gcore)
+{
+  struct gdbarch_tdep *tdep = gdbarch_tdep (get_regcache_arch (regcache));
+  gdb_byte *regs = xsave;
+  int i;
+  gdb_byte raw[I386_MAX_REGISTER_SIZE];
+
+  gdb_assert (tdep->st0_regnum >= I386_ST0_REGNUM);
+  gdb_assert (tdep->num_vector_regs > 0);
+
+  if (gcore)
+    {
+      /* Update XCR0 and `xstate_bv' with XCR0 for gcore.  */
+      if (tdep->xsave_xcr0_offset != -1)
+	memcpy (regs + tdep->xsave_xcr0_offset, &tdep->xcr0, 8);
+      memcpy (XSAVE_XSTATE_BV_ADDR (regs), &tdep->xcr0, 8);
+    }
+  else
+    {
+      enum
+	{
+	  none = 0x0,
+	  check = 0x1,
+	  x87 = 0x2 | check,
+	  vector = 0x4 | check,
+	  all = 0x8 | check
+	} regclass;
+
+      if (regnum == -1)
+	regclass = all;
+      else if (regnum >= I387_VECTOR0_REGNUM (tdep)
+	       && regnum < I387_MXCSR_REGNUM (tdep))
+	regclass = vector;
+      else if (regnum >= I387_ST0_REGNUM (tdep)
+	       && regnum < I387_FCTRL_REGNUM (tdep))
+	regclass = x87;
+      else
+	regclass = none;
+
+      if ((regclass & check))
+	{
+	  gdb_byte *xstate_bv_p = XSAVE_XSTATE_BV_ADDR (regs);
+	  int num_vector_regs;
+	  unsigned int xstate_bv = 0;
+	  /* The supported bits in `xstate_bv' fit in 1 byte.  */
+	  unsigned int clear_bv = (~(*xstate_bv_p)) & tdep->xcr0;
+
+	  /* Clear the x87 and vector register parts whose bit in
+	     `xstate_bv' is zero.  */
+	  if (clear_bv)
+	    {
+	      i = I387_VECTOR0_REGNUM (tdep);
+	      num_vector_regs = I387_NUM_VECTOR_REGS (tdep);
+	      for (; num_vector_regs; num_vector_regs--, i++)
+		{
+		  if ((clear_bv & bit_I386_XSTATE_AVX))
+		    memset (XSAVE_AVXH_ADDR (tdep, regs, i), 0, 16);
+		  if ((clear_bv & bit_I386_XSTATE_SSE))
+		    memset (XSAVE_SSE_ADDR (tdep, regs, i), 0, 16);
+		}
+
+	      if ((clear_bv & bit_I386_XSTATE_X87))
+		for (i = I387_ST0_REGNUM (tdep);
+		     i < I387_FCTRL_REGNUM (tdep); i++)
+		  memset (FXSAVE_ADDR (tdep, regs, i), 0, 10);
+	    }
+
+	  if (regclass == all)
+	    {
+	      i = I387_VECTOR0_REGNUM (tdep);
+	      num_vector_regs = I387_NUM_VECTOR_REGS (tdep);
+
+	      if ((tdep->xcr0 & I386_XSTATE_AVX_MASK)
+		  == I386_XSTATE_AVX_MASK)
+		{
+		  /* Check if any AVX registers are changed.  */
+		  for (; num_vector_regs; num_vector_regs--, i++)
+		    {
+		      regcache_raw_read ((struct regcache *) regcache,
+					 i, raw);
+		      if (memcmp (raw + 16,
+				  XSAVE_AVXH_ADDR (tdep, regs, i), 16))
+			xstate_bv |= bit_I386_XSTATE_AVX;
+		      if (memcmp (raw, XSAVE_SSE_ADDR (tdep, regs, i), 16))
+			xstate_bv |= bit_I386_XSTATE_SSE;
+
+		      if (xstate_bv
+			  == (bit_I386_XSTATE_AVX | bit_I386_XSTATE_SSE))
+			break;
+		    }
+		}
+	      else
+		{
+		  /* Check if any SSE registers are changed.  */
+		  for (; num_vector_regs; num_vector_regs--, i++)
+		    {
+		      regcache_raw_read ((struct regcache *) regcache,
+					 i, raw);
+		      if (memcmp (raw, XSAVE_SSE_ADDR (tdep, regs, i), 16))
+			{
+			  xstate_bv |= bit_I386_XSTATE_SSE;
+			  break;
+			}
+		    }
+		}
+
+	      /* Check if any X87 registers are changed.  */
+	      for (i = I387_ST0_REGNUM (tdep);
+		   i < I387_FCTRL_REGNUM (tdep); i++)
+		{
+		  regcache_raw_read ((struct regcache *) regcache, i, raw);
+		  if (memcmp (raw, FXSAVE_ADDR (tdep, regs, i), 10))
+		    {
+		      xstate_bv |= bit_I386_XSTATE_X87;
+		      break;
+		    }
+		}
+	    }
+	  else
+	    {
+	      /* Check if REGNUM is changed.  */
+	      regcache_raw_read ((struct regcache *) regcache, regnum, raw);
+
+	      if (regclass == x87)
+		{
+		  /* This is an x87 register.  */
+		  if (memcmp (raw, FXSAVE_ADDR (tdep, regs, regnum), 10))
+		    xstate_bv |= bit_I386_XSTATE_X87;
+		}
+	      else
+		{
+		  /* This is an SSE/AVX register.  */
+		  if ((tdep->xcr0 & I386_XSTATE_AVX_MASK)
+		      == I386_XSTATE_AVX_MASK)
+		    {
+		      if (memcmp (raw + 16,
+				  XSAVE_AVXH_ADDR (tdep, regs, regnum), 16))
+			xstate_bv |= bit_I386_XSTATE_AVX;
+		    }
+
+		  if (memcmp (raw, XSAVE_SSE_ADDR (tdep, regs, regnum), 16))
+		    xstate_bv |= bit_I386_XSTATE_SSE;
+		}
+	    }
+
+	  /* Update the corresponding bits in `xstate_bv' if any x87 or
+	     SSE/AVX registers are changed.  */
+	  if (xstate_bv)
+	    {
+	      /* The supported bits in `xstate_bv' fit in 1 byte.  */
+	      *xstate_bv_p |= (gdb_byte) xstate_bv;
+
+	      /* Update REGNUM and return.  */
+	      if (regclass != all)
+		{
+		  if (regclass == x87)
+		    {
+		      /* x87 register.  */
+		      memcpy (FXSAVE_ADDR (tdep, regs, regnum), raw, 10);
+		    }
+		  else
+		    {
+		      /* SSE/AVX register.  */
+		      if ((xstate_bv & bit_I386_XSTATE_AVX))
+			memcpy (XSAVE_AVXH_ADDR (tdep, regs, regnum),
+				raw + 16, 16);
+		      if ((xstate_bv & bit_I386_XSTATE_SSE))
+			memcpy (XSAVE_SSE_ADDR (tdep, regs, regnum), raw, 16);
+		    }
+		  return;
+		}
+	    }
+	  else
+	    {
+	      /* Return if REGNUM isn't changed.  */
+	      if (regclass != all)
+		return;
+	    }
+	}
+    }
+
+  for (i = I387_ST0_REGNUM (tdep); i < I387_MXCSR_REGNUM (tdep); i++)
+    if (regnum == -1 || regnum == i)
+      {
+	/* Most of the FPU control registers occupy only 16 bits in
+	   the xsave extended state.  Give those a special treatment.  */
+	if (i >= I387_FCTRL_REGNUM (tdep)
+	    && i < I387_VECTOR0_REGNUM (tdep)
+	    && i != I387_FIOFF_REGNUM (tdep)
+	    && i != I387_FOOFF_REGNUM (tdep))
+	  {
+	    gdb_byte buf[4];
+
+	    regcache_raw_collect (regcache, i, buf);
+
+	    if (i == I387_FOP_REGNUM (tdep))
+	      {
+		/* The opcode occupies only 11 bits.  Make sure we
+                   don't touch the other bits.  */
+		buf[1] &= ((1 << 3) - 1);
+		buf[1] |= ((FXSAVE_ADDR (tdep, regs, i))[1] & ~((1 << 3) - 1));
+	      }
+	    else if (i == I387_FTAG_REGNUM (tdep))
+	      {
+		/* Converting back is much easier.  */
+
+		unsigned short ftag;
+		int fpreg;
+
+		ftag = (buf[1] << 8) | buf[0];
+		buf[0] = 0;
+		buf[1] = 0;
+
+		for (fpreg = 7; fpreg >= 0; fpreg--)
+		  {
+		    int tag = (ftag >> (fpreg * 2)) & 3;
+
+		    if (tag != 3)
+		      buf[0] |= (1 << fpreg);
+		  }
+	      }
+	    memcpy (FXSAVE_ADDR (tdep, regs, i), buf, 2);
+	  }
+	else if (i < I387_VECTOR0_REGNUM (tdep))
 	  regcache_raw_collect (regcache, i, FXSAVE_ADDR (tdep, regs, i));
+	else
+	  {
+	    if ((tdep->xcr0 & I386_XSTATE_AVX_MASK)
+		== I386_XSTATE_AVX_MASK)
+	      {
+		regcache_raw_collect (regcache, i, raw);
+		memcpy (XSAVE_SSE_ADDR (tdep, regs, i), raw, 16);
+		memcpy (XSAVE_AVXH_ADDR (tdep, regs, i),
+			raw + 16, 16);
+	      }
+	    else
+	      regcache_raw_collect (regcache, i,
+				    XSAVE_SSE_ADDR (tdep, regs, i));
+	  }
       }
 
   if (regnum == I387_MXCSR_REGNUM (tdep) || regnum == -1)
diff --git a/gdb/i387-tdep.h b/gdb/i387-tdep.h
index 645eb91..f867a1f 100644
--- a/gdb/i387-tdep.h
+++ b/gdb/i387-tdep.h
@@ -31,7 +31,7 @@ struct ui_file;
 #define I387_NUM_REGS	16
 
 #define I387_ST0_REGNUM(tdep) ((tdep)->st0_regnum)
-#define I387_NUM_XMM_REGS(tdep) ((tdep)->num_xmm_regs)
+#define I387_NUM_VECTOR_REGS(tdep) ((tdep)->num_vector_regs)
 #define I387_MM0_REGNUM(tdep) ((tdep)->mm0_regnum)
 
 #define I387_FCTRL_REGNUM(tdep) (I387_ST0_REGNUM (tdep) + 8)
@@ -42,9 +42,9 @@ struct ui_file;
 #define I387_FOSEG_REGNUM(tdep) (I387_FCTRL_REGNUM (tdep) + 5)
 #define I387_FOOFF_REGNUM(tdep) (I387_FCTRL_REGNUM (tdep) + 6)
 #define I387_FOP_REGNUM(tdep) (I387_FCTRL_REGNUM (tdep) + 7)
-#define I387_XMM0_REGNUM(tdep) (I387_ST0_REGNUM (tdep) + 16)
+#define I387_VECTOR0_REGNUM(tdep) (I387_ST0_REGNUM (tdep) + 16)
 #define I387_MXCSR_REGNUM(tdep) \
-  (I387_XMM0_REGNUM (tdep) + I387_NUM_XMM_REGS (tdep))
+  (I387_VECTOR0_REGNUM (tdep) + I387_NUM_VECTOR_REGS (tdep))
 
 /* Print out the i387 floating point state.  */
 
@@ -99,6 +99,11 @@ extern void i387_collect_fsave (const struct regcache *regcache, int regnum,
 extern void i387_supply_fxsave (struct regcache *regcache, int regnum,
 				const void *fxsave);
 
+/* Similar to i387_supply_fxsave, but use XSAVE extended state.  */
+
+extern void i387_supply_xsave (struct regcache *regcache, int regnum,
+			       const void *xsave);
+
 /* Fill register REGNUM (if it is a floating-point or SSE register) in
    *FXSAVE with the value from REGCACHE.  If REGNUM is -1, do this for
    all registers.  This function doesn't touch any of the reserved
@@ -107,6 +112,11 @@ extern void i387_supply_fxsave (struct regcache *regcache, int regnum,
 extern void i387_collect_fxsave (const struct regcache *regcache, int regnum,
 				 void *fxsave);
 
+/* Similar to i387_collect_fxsave, but use XSAVE extended state.  */
+
+extern void i387_collect_xsave (const struct regcache *regcache,
+				int regnum, void *xsave, int gcore);
+
 /* Prepare the FPU stack in REGCACHE for a function return.  */
 
 extern void i387_return_value (struct gdbarch *gdbarch,
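
The tag-word step in i387_collect_xsave above can be looked at in
isolation.  The sketch below assumes the usual x87 encoding, where each
register has a 2-bit tag in the full 16-bit tag word and the value 3
means "empty", and reduces it to the 8-bit form kept in the fxsave/xsave
area (one bit per register, set when the register is not empty).  The
function name is hypothetical and is not part of the patch.

```c
#include <stdint.h>

/* Reduce the traditional 16-bit i387 tag word FTAG to the abridged
   8-bit form: bit N is set when physical register N is not empty.  */
static uint8_t
sketch_abridge_ftag (uint16_t ftag)
{
  uint8_t abridged = 0;
  int fpreg;

  for (fpreg = 7; fpreg >= 0; fpreg--)
    {
      int tag = (ftag >> (fpreg * 2)) & 3;

      if (tag != 3)		/* 3 means empty.  */
	abridged |= (uint8_t) (1u << fpreg);
    }

  return abridged;
}
```

Going the other way is the harder direction, which is why
i387_supply_xsave has to inspect the 80-bit register contents with
i387_tag to rebuild the full tag word.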


* PATCH: 6/6: Add AVX support (gdbserver changes)
  2010-03-04 18:09     ` PATCH: 5/6: Add AVX support (i387 changes) H.J. Lu
@ 2010-03-04 18:10       ` H.J. Lu
  2010-03-06 22:23         ` PATCH: 6/6 [2nd try]: " H.J. Lu
  2010-03-05  3:20       ` PATCH: 5/6: Add AVX support (i387 changes) Hui Zhu
  2010-03-06 22:22       ` PATCH: 5/6 [2nd try]: " H.J. Lu
  2 siblings, 1 reply; 115+ messages in thread
From: H.J. Lu @ 2010-03-04 18:10 UTC (permalink / raw)
  To: GDB

Hi,

Here are the gdbserver changes to support AVX.  OK to install?

Thanks.


H.J.
---
2010-03-02  H.J. Lu  <hongjiu.lu@intel.com>

	* Makefile.in (clean): Updated.
	(i386-avx.o): New.
	(i386-avx.c): Likewise.
	(i386-avx-linux.o): Likewise.
	(i386-avx-linux.c): Likewise.
	(amd64-avx.o): Likewise.
	(amd64-avx.c): Likewise.
	(amd64-avx-linux.o): Likewise.
	(amd64-avx-linux.c): Likewise.

	* configure.srv (srv_i386_regobj): Add i386-avx.o.
	(srv_i386_linux_regobj): Add i386-avx-linux.o.
	(srv_amd64_regobj): Add amd64-avx.o.
	(srv_amd64_linux_regobj): Add amd64-avx-linux.o.
	(srv_i386_32bit_xmlfiles): Add i386/32bit-avx.xml.
	(srv_i386_64bit_xmlfiles): Add i386/64bit-avx.xml.
	(srv_i386_xmlfiles): Add i386/i386-avx.xml.
	(srv_amd64_xmlfiles): Add i386/amd64-avx.xml.
	(srv_i386_linux_xmlfiles): Add i386/i386-avx-linux.xml.
	(srv_amd64_linux_xmlfiles): Add i386/amd64-avx-linux.xml.

	* i387-fp.c: Include "i386-xstate.h".
	(i387_xsave): New.
	(i387_cache_to_xsave): Likewise.
	(i387_xsave_to_cache): Likewise.
	(x86_xcr0): Likewise.

	* i387-fp.h (i387_cache_to_xsave): Likewise.
	(i387_xsave_to_cache): Likewise.
	(x86_xcr0): Likewise.

	* linux-arm-low.c (target_regsets): Initialize nt_type to 0.
	* linux-crisv32-low.c (target_regsets): Likewise.
	* linux-m68k-low.c (target_regsets): Likewise.
	* linux-mips-low.c (target_regsets): Likewise.
	* linux-ppc-low.c (target_regsets): Likewise.
	* linux-s390-low.c (target_regsets): Likewise.
	* linux-sh-low.c (target_regsets): Likewise.
	* linux-sparc-low.c (target_regsets): Likewise.
	* linux-xtensa-low.c (target_regsets): Likewise.

	* linux-low.c: Include <sys/uio.h>.
	(regsets_fetch_inferior_registers): Support nt_type.
	(regsets_store_inferior_registers): Likewise.
	(linux_process_qsupported): New.
	(linux_target_ops): Add linux_process_qsupported.

	* linux-low.h (regset_info): Add nt_type.
	(linux_target_ops): Add process_qsupported.

	* linux-x86-low.c: Include "i386-xstate.h", "elf/common.h" and
	<sys/uio.h>.
	(init_registers_i386_avx_linux): New.
	(init_registers_amd64_avx_linux): Likewise.
	(PTRACE_GETREGSET): Likewise.
	(PTRACE_SETREGSET): Likewise.
	(x86_fill_xstateregset): Likewise.
	(x86_store_xstateregset): Likewise.
	(x86_linux_process_qsupported): Likewise.
	(target_regsets): Add NT_X86_XSTATE entry and initialize nt_type.
	(the_low_target): Add x86_linux_process_qsupported.

	* server.c (use_xml): New.
	(get_features_xml): Don't use XML file if use_xml is 0.
	(handle_query): Call target_process_qsupported.

	* server.h (use_xml): New.

	* target.h (target_ops): Add process_qsupported.
	(target_process_qsupported): New.
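
As a sanity check on the new struct i387_xsave in i387-fp.c, the field
offsets can be verified against the layout described in the cover
letter (XCR0 in the software usable bytes at offset 464, xstate_bv at
offset 512, upper YMM halves at offset 576).  The sketch below is a
standalone copy of the struct, not the gdbserver code itself: it uses
fixed-width types in place of unsigned short/int/long long, which
assumes the usual sizes on Linux x86, and the sketch_* names are
hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Standalone copy of the layout, for illustration only.  */
struct sketch_i387_xsave
{
  uint16_t fctrl, fstat, ftag, fop;
  uint32_t fioff;
  uint16_t fiseg, pad1;
  uint32_t fooff;
  uint16_t foseg, pad2;
  uint32_t mxcsr, mxcsr_mask;
  uint8_t st_space[128];	/* Eight 80-bit values in 128-bit slots.  */
  uint8_t xmm_space[256];	/* Lower 128 bits of the vector registers.  */
  uint8_t reserved1[48];
  uint64_t xcr0;		/* Stored by the Linux kernel.  */
  uint8_t reserved2[40];
  uint64_t xstate_bv;
  uint8_t reserved3[56];
  uint8_t ymmh_space[256];	/* Upper 128 bits of the YMM registers.  */
};

/* Return nonzero if the fields land on the offsets the patch relies
   on.  */
static int
sketch_layout_ok (void)
{
  return (offsetof (struct sketch_i387_xsave, st_space) == 32
	  && offsetof (struct sketch_i387_xsave, xmm_space) == 160
	  && offsetof (struct sketch_i387_xsave, xcr0) == 464
	  && offsetof (struct sketch_i387_xsave, xstate_bv) == 512
	  && offsetof (struct sketch_i387_xsave, ymmh_space) == 576);
}
```

The same offsets appear on the GDB side as xsave_sse_offset (160),
XSAVE_XSTATE_BV_ADDR (512) and xsave_avxh_offset (576), so both sides
are expected to agree on the buffer layout.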

diff --git a/gdb/gdbserver/Makefile.in b/gdb/gdbserver/Makefile.in
index 7fecced..2ec9784 100644
--- a/gdb/gdbserver/Makefile.in
+++ b/gdb/gdbserver/Makefile.in
@@ -217,6 +217,8 @@ clean:
 	rm -f powerpc-isa205-vsx64l.c
 	rm -f s390-linux32.c s390-linux64.c s390x-linux64.c
 	rm -f xml-builtin.c stamp-xml
+	rm -f i386-avx.c i386-avx-linux.c
+	rm -f amd64-avx.c amd64-avx-linux.c
 
 maintainer-clean realclean distclean: clean
 	rm -f nm.h tm.h xm.h config.status config.h stamp-h config.log
@@ -351,6 +353,12 @@ i386.c : $(srcdir)/../regformats/i386/i386.dat $(regdat_sh)
 i386-linux.o : i386-linux.c $(regdef_h)
 i386-linux.c : $(srcdir)/../regformats/i386/i386-linux.dat $(regdat_sh)
 	$(SHELL) $(regdat_sh) $(srcdir)/../regformats/i386/i386-linux.dat i386-linux.c
+i386-avx.o : i386-avx.c $(regdef_h)
+i386-avx.c : $(srcdir)/../regformats/i386/i386-avx.dat $(regdat_sh)
+	$(SHELL) $(regdat_sh) $(srcdir)/../regformats/i386/i386-avx.dat i386-avx.c
+i386-avx-linux.o : i386-avx-linux.c $(regdef_h)
+i386-avx-linux.c : $(srcdir)/../regformats/i386/i386-avx-linux.dat $(regdat_sh)
+	$(SHELL) $(regdat_sh) $(srcdir)/../regformats/i386/i386-avx-linux.dat i386-avx-linux.c
 reg-ia64.o : reg-ia64.c $(regdef_h)
 reg-ia64.c : $(srcdir)/../regformats/reg-ia64.dat $(regdat_sh)
 	$(SHELL) $(regdat_sh) $(srcdir)/../regformats/reg-ia64.dat reg-ia64.c
@@ -438,6 +446,12 @@ amd64.c : $(srcdir)/../regformats/i386/amd64.dat $(regdat_sh)
 amd64-linux.o : amd64-linux.c $(regdef_h)
 amd64-linux.c : $(srcdir)/../regformats/i386/amd64-linux.dat $(regdat_sh)
 	$(SHELL) $(regdat_sh) $(srcdir)/../regformats/i386/amd64-linux.dat amd64-linux.c
+amd64-avx.o : amd64-avx.c $(regdef_h)
+amd64-avx.c : $(srcdir)/../regformats/i386/amd64-avx.dat $(regdat_sh)
+	$(SHELL) $(regdat_sh) $(srcdir)/../regformats/i386/amd64-avx.dat amd64-avx.c
+amd64-avx-linux.o : amd64-avx-linux.c $(regdef_h)
+amd64-avx-linux.c : $(srcdir)/../regformats/i386/amd64-avx-linux.dat $(regdat_sh)
+	$(SHELL) $(regdat_sh) $(srcdir)/../regformats/i386/amd64-avx-linux.dat amd64-avx-linux.c
 reg-xtensa.o : reg-xtensa.c $(regdef_h)
 reg-xtensa.c : $(srcdir)/../regformats/reg-xtensa.dat $(regdat_sh)
 	$(SHELL) $(regdat_sh) $(srcdir)/../regformats/reg-xtensa.dat reg-xtensa.c
diff --git a/gdb/gdbserver/configure.srv b/gdb/gdbserver/configure.srv
index e5818cd..a2f4323 100644
--- a/gdb/gdbserver/configure.srv
+++ b/gdb/gdbserver/configure.srv
@@ -22,17 +22,17 @@
 # Default hostio_last_error implementation
 srv_hostio_err_objs="hostio-errno.o"
 
-srv_i386_regobj=i386.o
-srv_i386_linux_regobj=i386-linux.o
-srv_amd64_regobj=amd64.o
-srv_amd64_linux_regobj=amd64-linux.o
+srv_i386_regobj="i386.o i386-avx.o"
+srv_i386_linux_regobj="i386-linux.o i386-avx-linux.o"
+srv_amd64_regobj="amd64.o amd64-avx.o"
+srv_amd64_linux_regobj="amd64-linux.o amd64-avx-linux.o"
 
-srv_i386_32bit_xmlfiles="i386/32bit-core.xml i386/32bit-sse.xml"
-srv_i386_64bit_xmlfiles="i386/64bit-core.xml i386/64bit-sse.xml"
-srv_i386_xmlfiles="i386/i386.xml $srv_i386_32bit_xmlfiles"
-srv_amd64_xmlfiles="i386/amd64.xml $srv_i386_64bit_xmlfiles"
-srv_i386_linux_xmlfiles="i386/i386-linux.xml i386/32bit-linux.xml $srv_i386_32bit_xmlfiles"
-srv_amd64_linux_xmlfiles="i386/amd64-linux.xml i386/64bit-linux.xml $srv_i386_64bit_xmlfiles"
+srv_i386_32bit_xmlfiles="i386/32bit-core.xml i386/32bit-sse.xml i386/32bit-avx.xml"
+srv_i386_64bit_xmlfiles="i386/64bit-core.xml i386/64bit-sse.xml i386/64bit-avx.xml"
+srv_i386_xmlfiles="i386/i386.xml i386/i386-avx.xml $srv_i386_32bit_xmlfiles"
+srv_amd64_xmlfiles="i386/amd64.xml i386/amd64-avx.xml $srv_i386_64bit_xmlfiles"
+srv_i386_linux_xmlfiles="i386/i386-linux.xml i386/i386-avx-linux.xml i386/32bit-linux.xml $srv_i386_32bit_xmlfiles"
+srv_amd64_linux_xmlfiles="i386/amd64-linux.xml i386/amd64-avx-linux.xml i386/64bit-linux.xml $srv_i386_64bit_xmlfiles"
 
 # Input is taken from the "${target}" variable.
 
diff --git a/gdb/gdbserver/i387-fp.c b/gdb/gdbserver/i387-fp.c
index 7ef4ba3..08fb79a 100644
--- a/gdb/gdbserver/i387-fp.c
+++ b/gdb/gdbserver/i387-fp.c
@@ -19,6 +19,7 @@
 
 #include "server.h"
 #include "i387-fp.h"
+#include "i386-xstate.h"
 
 int num_xmm_registers = 8;
 
@@ -72,6 +73,46 @@ struct i387_fxsave {
   unsigned char xmm_space[256];
 };
 
+struct i387_xsave {
+  /* All these are only sixteen bits, plus padding, except for fop (which
+     is only eleven bits), and fooff / fioff (which are 32 bits each).  */
+  unsigned short fctrl;
+  unsigned short fstat;
+  unsigned short ftag;
+  unsigned short fop;
+  unsigned int fioff;
+  unsigned short fiseg;
+  unsigned short pad1;
+  unsigned int fooff;
+  unsigned short foseg;
+  unsigned short pad12;
+
+  unsigned int mxcsr;
+  unsigned int mxcsr_mask;
+
+  /* Space for eight 80-bit FP values in 128-bit spaces.  */
+  unsigned char st_space[128];
+
+  /* Space for eight 128-bit XMM values, or 16 on x86-64.  */
+  unsigned char xmm_space[256];
+
+  unsigned char reserved1[48];
+
+  /* The extended control register 0 (the XFEATURE_ENABLED_MASK
+     register).  */
+  unsigned long long xcr0;
+
+  unsigned char reserved2[40];
+
+  /* The XSTATE_BV bit vector.  */
+  unsigned long long xstate_bv;
+
+  unsigned char reserved3[56];
+
+  /* Space for eight upper 128-bit YMM values, or 16 on x86-64.  */
+  unsigned char ymmh_space[256];
+};
+
 void
 i387_cache_to_fsave (struct regcache *regcache, void *buf)
 {
@@ -199,6 +240,141 @@ i387_cache_to_fxsave (struct regcache *regcache, void *buf)
   fp->foseg = val;
 }
 
+void
+i387_cache_to_xsave (struct regcache *regcache, void *buf)
+{
+  struct i387_xsave *fp = (struct i387_xsave *) buf;
+  int i;
+  int st0_regnum = find_regno ("st0");
+  int xmm0_regnum;
+  unsigned long val, val2;
+  unsigned int clear_bv;
+  unsigned long long xstate_bv = 0;
+  char raw[32];
+
+  if ((x86_xcr0 & I386_XSTATE_AVX_MASK) == I386_XSTATE_AVX_MASK)
+    xmm0_regnum = find_regno ("ymm0");
+  else
+    xmm0_regnum = find_regno ("xmm0");
+
+  /* The supported bits in `xstate_bv' fit in 1 byte.  Compute the
+     states enabled in XCR0 whose bit in `xstate_bv' is zero.  */
+  clear_bv = (~fp->xstate_bv) & x86_xcr0;
+
+  /* Clear part in x87 and vector registers if its bit in `xstate_bv'
+     is zero.  */
+  if (clear_bv)
+    {
+      if ((clear_bv & bit_I386_XSTATE_X87))
+	for (i = 0; i < 8; i++)
+	  memset (((char *) &fp->st_space[0]) + i * 16, 0, 10);
+
+      if ((clear_bv & bit_I386_XSTATE_SSE))
+	for (i = 0; i < num_xmm_registers; i++) 
+	  memset (((char *) &fp->xmm_space[0]) + i * 16, 0, 16);
+
+      if ((clear_bv & bit_I386_XSTATE_AVX))
+	for (i = 0; i < num_xmm_registers; i++) 
+	  memset (((char *) &fp->ymmh_space[0]) + i * 16, 0, 16);
+    }
+
+  /* Check if any x87 registers are changed.  */
+  for (i = 0; i < 8; i++)
+    {
+      collect_register (regcache, i + st0_regnum, raw);
+      if (memcmp (raw, ((char *) &fp->st_space[0]) + i * 16, 10))
+	{
+	  xstate_bv |= bit_I386_XSTATE_X87;
+	  break;
+	}
+    }
+
+  /* Check if any SSE/AVX registers are changed.  */
+  if ((x86_xcr0 & I386_XSTATE_AVX_MASK) == I386_XSTATE_AVX_MASK)
+    {
+      for (i = 0; i < num_xmm_registers; i++) 
+	{
+	  collect_register (regcache, i + xmm0_regnum, raw);
+	  if (memcmp (raw, ((char *) &fp->xmm_space[0]) + i * 16, 16))
+	     xstate_bv |= bit_I386_XSTATE_SSE;
+	  if (memcmp (raw + 16, ((char *) &fp->ymmh_space[0]) + i * 16, 16))
+	     xstate_bv |= bit_I386_XSTATE_AVX;
+	  if (xstate_bv == (bit_I386_XSTATE_AVX | bit_I386_XSTATE_SSE))
+	    break;
+	}
+    }
+  else
+    {
+      for (i = 0; i < num_xmm_registers; i++) 
+	{
+	  collect_register (regcache, i + xmm0_regnum, raw);
+	  if (memcmp (raw, ((char *) &fp->xmm_space[0]) + i * 16, 16))
+	    {
+	      xstate_bv |= bit_I386_XSTATE_SSE;
+	      break;
+	    }
+	}
+    }
+
+  /* Update the corresponding bits in xstate_bv if any x87 or SSE/AVX
+     registers are changed.  */
+  fp->xstate_bv |= xstate_bv;
+
+  for (i = 0; i < 8; i++)
+    collect_register (regcache, i + st0_regnum,
+		      ((char *) &fp->st_space[0]) + i * 16);
+
+  if ((x86_xcr0 & I386_XSTATE_AVX_MASK) == I386_XSTATE_AVX_MASK)
+    {
+      for (i = 0; i < num_xmm_registers; i++)
+	{
+	  collect_register (regcache, i + xmm0_regnum, raw);
+	  memcpy (((char *) &fp->xmm_space[0]) + i * 16, raw, 16);
+	  memcpy (((char *) &fp->ymmh_space[0]) + i * 16, raw + 16, 16);
+	}
+    }
+  else
+    {
+      for (i = 0; i < num_xmm_registers; i++)
+	collect_register (regcache, i + xmm0_regnum,
+			  ((char *) &fp->xmm_space[0]) + i * 16);
+    }
+
+  collect_register_by_name (regcache, "fioff", &fp->fioff);
+  collect_register_by_name (regcache, "fooff", &fp->fooff);
+  collect_register_by_name (regcache, "mxcsr", &fp->mxcsr);
+
+  /* This one's 11 bits... */
+  collect_register_by_name (regcache, "fop", &val2);
+  fp->fop = (val2 & 0x7FF) | (fp->fop & 0xF800);
+
+  /* Some registers are 16-bit.  */
+  collect_register_by_name (regcache, "fctrl", &val);
+  fp->fctrl = val;
+
+  collect_register_by_name (regcache, "fstat", &val);
+  fp->fstat = val;
+
+  /* Convert to the simplified tag form stored in fxsave data.  */
+  collect_register_by_name (regcache, "ftag", &val);
+  val &= 0xFFFF;
+  val2 = 0;
+  for (i = 7; i >= 0; i--)
+    {
+      int tag = (val >> (i * 2)) & 3;
+
+      if (tag != 3)
+	val2 |= (1 << i);
+    }
+  fp->ftag = val2;
+
+  collect_register_by_name (regcache, "fiseg", &val);
+  fp->fiseg = val;
+
+  collect_register_by_name (regcache, "foseg", &val);
+  fp->foseg = val;
+}
+
 static int
 i387_ftag (struct i387_fxsave *fp, int regno)
 {
@@ -296,3 +472,104 @@ i387_fxsave_to_cache (struct regcache *regcache, const void *buf)
   val = (fp->fop) & 0x7FF;
   supply_register_by_name (regcache, "fop", &val);
 }
+
+void
+i387_xsave_to_cache (struct regcache *regcache, const void *buf)
+{
+  struct i387_xsave *fp = (struct i387_xsave *) buf;
+  struct i387_fxsave *fxp = (struct i387_fxsave *) buf;
+  int i, top;
+  int st0_regnum = find_regno ("st0");
+  int xmm0_regnum;
+  unsigned long val;
+  unsigned int clear_bv;
+  char raw[32];
+  char *p;
+
+  /* The supported bits in `xstate_bv' fit in 1 byte.  Clear the part
+     of a vector register whose bit in xstate_bv is zero.  */
+  clear_bv = (~fp->xstate_bv) & x86_xcr0;
+
+  for (i = 0; i < 8; i++)
+    {
+      if ((clear_bv & bit_I386_XSTATE_X87))
+	p = NULL;
+      else
+	p = ((char *) &fp->st_space[0]) + i * 16;
+      supply_register (regcache, i + st0_regnum, p);
+    }
+
+  if ((x86_xcr0 & I386_XSTATE_AVX_MASK) == I386_XSTATE_AVX_MASK)
+    {
+      xmm0_regnum = find_regno ("ymm0");
+      for (i = 0; i < num_xmm_registers; i++)
+	{
+	  if ((clear_bv & (bit_I386_XSTATE_SSE | bit_I386_XSTATE_AVX))
+	       == (bit_I386_XSTATE_SSE | bit_I386_XSTATE_AVX))
+	    p = NULL;
+	  else
+	    {
+	      p = raw;
+	      if ((clear_bv & bit_I386_XSTATE_SSE))
+		memset (raw, 0, 16);
+	      else
+		memcpy (raw, ((char *) &fp->xmm_space[0]) + i * 16, 16);
+	      if ((clear_bv & bit_I386_XSTATE_AVX))
+		memset (raw + 16, 0, 16);
+	      else
+		memcpy (raw + 16, ((char *) &fp->ymmh_space[0]) + i * 16,
+			16);
+	    }
+	  supply_register (regcache, i + xmm0_regnum, p);
+	}
+    }
+  else
+    {
+      xmm0_regnum = find_regno ("xmm0");
+      for (i = 0; i < num_xmm_registers; i++)
+	{
+	  if ((clear_bv & bit_I386_XSTATE_SSE))
+	    p = NULL;
+	  else
+	    p = ((char *) &fp->xmm_space[0]) + i * 16;
+	  supply_register (regcache, i + xmm0_regnum, p);
+	}
+    }
+
+  supply_register_by_name (regcache, "fioff", &fp->fioff);
+  supply_register_by_name (regcache, "fooff", &fp->fooff);
+  supply_register_by_name (regcache, "mxcsr", &fp->mxcsr);
+
+  /* Some registers are 16-bit.  */
+  val = fp->fctrl & 0xFFFF;
+  supply_register_by_name (regcache, "fctrl", &val);
+
+  val = fp->fstat & 0xFFFF;
+  supply_register_by_name (regcache, "fstat", &val);
+
+  /* Generate the form of ftag data that GDB expects.  */
+  top = (fp->fstat >> 11) & 0x7;
+  val = 0;
+  for (i = 7; i >= 0; i--)
+    {
+      int tag;
+      if (fp->ftag & (1 << i))
+	tag = i387_ftag (fxp, (i + 8 - top) % 8);
+      else
+	tag = 3;
+      val |= tag << (2 * i);
+    }
+  supply_register_by_name (regcache, "ftag", &val);
+
+  val = fp->fiseg & 0xFFFF;
+  supply_register_by_name (regcache, "fiseg", &val);
+
+  val = fp->foseg & 0xFFFF;
+  supply_register_by_name (regcache, "foseg", &val);
+
+  val = (fp->fop) & 0x7FF;
+  supply_register_by_name (regcache, "fop", &val);
+}
+
+/* Default to SSE.  */
+unsigned long long x86_xcr0 = I386_XSTATE_SSE_MASK;
diff --git a/gdb/gdbserver/i387-fp.h b/gdb/gdbserver/i387-fp.h
index d1e0681..ed1a322 100644
--- a/gdb/gdbserver/i387-fp.h
+++ b/gdb/gdbserver/i387-fp.h
@@ -26,6 +26,11 @@ void i387_fsave_to_cache (struct regcache *regcache, const void *buf);
 void i387_cache_to_fxsave (struct regcache *regcache, void *buf);
 void i387_fxsave_to_cache (struct regcache *regcache, const void *buf);
 
+void i387_cache_to_xsave (struct regcache *regcache, void *buf);
+void i387_xsave_to_cache (struct regcache *regcache, const void *buf);
+
+extern unsigned long long x86_xcr0;
+
 extern int num_xmm_registers;
 
 #endif /* I387_FP_H */
diff --git a/gdb/gdbserver/linux-arm-low.c b/gdb/gdbserver/linux-arm-low.c
index 54668f8..32bd7bb 100644
--- a/gdb/gdbserver/linux-arm-low.c
+++ b/gdb/gdbserver/linux-arm-low.c
@@ -354,16 +354,16 @@ arm_arch_setup (void)
 }
 
 struct regset_info target_regsets[] = {
-  { PTRACE_GETREGS, PTRACE_SETREGS, 18 * 4,
+  { PTRACE_GETREGS, PTRACE_SETREGS, 0, 18 * 4,
     GENERAL_REGS,
     arm_fill_gregset, arm_store_gregset },
-  { PTRACE_GETWMMXREGS, PTRACE_SETWMMXREGS, 16 * 8 + 6 * 4,
+  { PTRACE_GETWMMXREGS, PTRACE_SETWMMXREGS, 0, 16 * 8 + 6 * 4,
     EXTENDED_REGS,
     arm_fill_wmmxregset, arm_store_wmmxregset },
-  { PTRACE_GETVFPREGS, PTRACE_SETVFPREGS, 32 * 8 + 4,
+  { PTRACE_GETVFPREGS, PTRACE_SETVFPREGS, 0, 32 * 8 + 4,
     EXTENDED_REGS,
     arm_fill_vfpregset, arm_store_vfpregset },
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 struct linux_target_ops the_low_target = {
diff --git a/gdb/gdbserver/linux-crisv32-low.c b/gdb/gdbserver/linux-crisv32-low.c
index 6ba48b6..d426c32 100644
--- a/gdb/gdbserver/linux-crisv32-low.c
+++ b/gdb/gdbserver/linux-crisv32-low.c
@@ -365,9 +365,9 @@ cris_store_gregset (const void *buf)
 typedef unsigned long elf_gregset_t[cris_num_regs];
 
 struct regset_info target_regsets[] = {
-  { PTRACE_GETREGS, PTRACE_SETREGS, sizeof (elf_gregset_t),
+  { PTRACE_GETREGS, PTRACE_SETREGS, 0, sizeof (elf_gregset_t),
     GENERAL_REGS, cris_fill_gregset, cris_store_gregset },
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 struct linux_target_ops the_low_target = {
diff --git a/gdb/gdbserver/linux-low.c b/gdb/gdbserver/linux-low.c
index 262a1df..3802f9b 100644
--- a/gdb/gdbserver/linux-low.c
+++ b/gdb/gdbserver/linux-low.c
@@ -39,6 +39,7 @@
 #include <dirent.h>
 #include <sys/stat.h>
 #include <sys/vfs.h>
+#include <sys/uio.h>
 #ifndef ELFMAG0
 /* Don't include <linux/elf.h> here.  If it got included by gdb_proc_service.h
    then ELFMAG0 will have been defined.  If it didn't get included by
@@ -2280,14 +2281,15 @@ regsets_fetch_inferior_registers (struct regcache *regcache)
   struct regset_info *regset;
   int saw_general_regs = 0;
   int pid;
+  struct iovec iov;
 
   regset = target_regsets;
 
   pid = lwpid_of (get_thread_lwp (current_inferior));
   while (regset->size >= 0)
     {
-      void *buf;
-      int res;
+      void *buf, *data;
+      int nt_type, res;
 
       if (regset->size == 0 || disabled_regsets[regset - target_regsets])
 	{
@@ -2296,10 +2298,21 @@ regsets_fetch_inferior_registers (struct regcache *regcache)
 	}
 
       buf = xmalloc (regset->size);
+
+      nt_type = regset->nt_type;
+      if (nt_type)
+	{
+	  iov.iov_base = buf;
+	  iov.iov_len = regset->size;
+	  data = (void *) &iov;
+	}
+      else
+	data = buf;
+
 #ifndef __sparc__
-      res = ptrace (regset->get_request, pid, 0, buf);
+      res = ptrace (regset->get_request, pid, nt_type, data);
 #else
-      res = ptrace (regset->get_request, pid, buf, 0);
+      res = ptrace (regset->get_request, pid, data, nt_type);
 #endif
       if (res < 0)
 	{
@@ -2337,14 +2350,15 @@ regsets_store_inferior_registers (struct regcache *regcache)
   struct regset_info *regset;
   int saw_general_regs = 0;
   int pid;
+  struct iovec iov;
 
   regset = target_regsets;
 
   pid = lwpid_of (get_thread_lwp (current_inferior));
   while (regset->size >= 0)
     {
-      void *buf;
-      int res;
+      void *buf, *data;
+      int nt_type, res;
 
       if (regset->size == 0 || disabled_regsets[regset - target_regsets])
 	{
@@ -2357,10 +2371,21 @@ regsets_store_inferior_registers (struct regcache *regcache)
       /* First fill the buffer with the current register set contents,
 	 in case there are any items in the kernel's regset that are
 	 not in gdbserver's regcache.  */
+
+      nt_type = regset->nt_type;
+      if (nt_type)
+	{
+	  iov.iov_base = buf;
+	  iov.iov_len = regset->size;
+	  data = (void *) &iov;
+	}
+      else
+	data = buf;
+
 #ifndef __sparc__
-      res = ptrace (regset->get_request, pid, 0, buf);
+      res = ptrace (regset->get_request, pid, nt_type, data);
 #else
-      res = ptrace (regset->get_request, pid, buf, 0);
+      res = ptrace (regset->get_request, pid, data, nt_type);
 #endif
 
       if (res == 0)
@@ -2370,9 +2395,9 @@ regsets_store_inferior_registers (struct regcache *regcache)
 
 	  /* Only now do we write the register set.  */
 #ifndef __sparc__
-	  res = ptrace (regset->set_request, pid, 0, buf);
+	  res = ptrace (regset->set_request, pid, nt_type, data);
 #else
-	  res = ptrace (regset->set_request, pid, buf, 0);
+	  res = ptrace (regset->set_request, pid, data, nt_type);
 #endif
 	}
 
@@ -3433,6 +3458,13 @@ linux_core_of_thread (ptid_t ptid)
   return core;
 }
 
+static void
+linux_process_qsupported (const char *query)
+{
+  if (the_low_target.process_qsupported != NULL)
+    the_low_target.process_qsupported (query);
+}
+
 static struct target_ops linux_target_ops = {
   linux_create_inferior,
   linux_attach,
@@ -3476,7 +3508,8 @@ static struct target_ops linux_target_ops = {
 #else
   NULL,
 #endif
-  linux_core_of_thread
+  linux_core_of_thread,
+  linux_process_qsupported
 };
 
 static void
diff --git a/gdb/gdbserver/linux-low.h b/gdb/gdbserver/linux-low.h
index 82ad00c..57e7adb 100644
--- a/gdb/gdbserver/linux-low.h
+++ b/gdb/gdbserver/linux-low.h
@@ -35,6 +35,9 @@ enum regset_type {
 struct regset_info
 {
   int get_request, set_request;
+  /* If NT_TYPE isn't 0, it will be passed to ptrace as the 3rd
+     argument and the 4th argument should be "const struct iovec *".  */
+  int nt_type;
   int size;
   enum regset_type type;
   regset_fill_func fill_function;
@@ -111,6 +114,9 @@ struct linux_target_ops
 
   /* Hook to call prior to resuming a thread.  */
   void (*prepare_to_resume) (struct lwp_info *);
+
+  /* Hook to support target specific qSupported.  */
+  void (*process_qsupported) (const char *);
 };
 
 extern struct linux_target_ops the_low_target;
diff --git a/gdb/gdbserver/linux-m68k-low.c b/gdb/gdbserver/linux-m68k-low.c
index 14e3864..6c98bb1 100644
--- a/gdb/gdbserver/linux-m68k-low.c
+++ b/gdb/gdbserver/linux-m68k-low.c
@@ -112,14 +112,14 @@ m68k_store_fpregset (struct regcache *regcache, const void *buf)
 
 struct regset_info target_regsets[] = {
 #ifdef HAVE_PTRACE_GETREGS
-  { PTRACE_GETREGS, PTRACE_SETREGS, sizeof (elf_gregset_t),
+  { PTRACE_GETREGS, PTRACE_SETREGS, 0, sizeof (elf_gregset_t),
     GENERAL_REGS,
     m68k_fill_gregset, m68k_store_gregset },
-  { PTRACE_GETFPREGS, PTRACE_SETFPREGS, sizeof (elf_fpregset_t),
+  { PTRACE_GETFPREGS, PTRACE_SETFPREGS, 0, sizeof (elf_fpregset_t),
     FP_REGS,
     m68k_fill_fpregset, m68k_store_fpregset },
 #endif /* HAVE_PTRACE_GETREGS */
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 static const unsigned char m68k_breakpoint[] = { 0x4E, 0x4F };
diff --git a/gdb/gdbserver/linux-mips-low.c b/gdb/gdbserver/linux-mips-low.c
index 70f6700..1c04b2e 100644
--- a/gdb/gdbserver/linux-mips-low.c
+++ b/gdb/gdbserver/linux-mips-low.c
@@ -343,12 +343,12 @@ mips_store_fpregset (struct regcache *regcache, const void *buf)
 
 struct regset_info target_regsets[] = {
 #ifdef HAVE_PTRACE_GETREGS
-  { PTRACE_GETREGS, PTRACE_SETREGS, 38 * 8, GENERAL_REGS,
+  { PTRACE_GETREGS, PTRACE_SETREGS, 0, 38 * 8, GENERAL_REGS,
     mips_fill_gregset, mips_store_gregset },
-  { PTRACE_GETFPREGS, PTRACE_SETFPREGS, 33 * 8, FP_REGS,
+  { PTRACE_GETFPREGS, PTRACE_SETFPREGS, 0, 33 * 8, FP_REGS,
     mips_fill_fpregset, mips_store_fpregset },
 #endif /* HAVE_PTRACE_GETREGS */
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 struct linux_target_ops the_low_target = {
diff --git a/gdb/gdbserver/linux-ppc-low.c b/gdb/gdbserver/linux-ppc-low.c
index 10a1309..0dab604 100644
--- a/gdb/gdbserver/linux-ppc-low.c
+++ b/gdb/gdbserver/linux-ppc-low.c
@@ -593,14 +593,14 @@ struct regset_info target_regsets[] = {
      fetch them every time, but still fall back to PTRACE_PEEKUSER for the
      general registers.  Some kernels support these, but not the newer
      PPC_PTRACE_GETREGS.  */
-  { PTRACE_GETVSXREGS, PTRACE_SETVSXREGS, SIZEOF_VSXREGS, EXTENDED_REGS,
+  { PTRACE_GETVSXREGS, PTRACE_SETVSXREGS, 0, SIZEOF_VSXREGS, EXTENDED_REGS,
   ppc_fill_vsxregset, ppc_store_vsxregset },
-  { PTRACE_GETVRREGS, PTRACE_SETVRREGS, SIZEOF_VRREGS, EXTENDED_REGS,
+  { PTRACE_GETVRREGS, PTRACE_SETVRREGS, 0, SIZEOF_VRREGS, EXTENDED_REGS,
     ppc_fill_vrregset, ppc_store_vrregset },
-  { PTRACE_GETEVRREGS, PTRACE_SETEVRREGS, 32 * 4 + 8 + 4, EXTENDED_REGS,
+  { PTRACE_GETEVRREGS, PTRACE_SETEVRREGS, 0, 32 * 4 + 8 + 4, EXTENDED_REGS,
     ppc_fill_evrregset, ppc_store_evrregset },
-  { 0, 0, 0, GENERAL_REGS, ppc_fill_gregset, NULL },
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, 0, GENERAL_REGS, ppc_fill_gregset, NULL },
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 struct linux_target_ops the_low_target = {
diff --git a/gdb/gdbserver/linux-s390-low.c b/gdb/gdbserver/linux-s390-low.c
index 5460f57..eb865dc 100644
--- a/gdb/gdbserver/linux-s390-low.c
+++ b/gdb/gdbserver/linux-s390-low.c
@@ -181,8 +181,8 @@ static void s390_fill_gregset (struct regcache *regcache, void *buf)
 }
 
 struct regset_info target_regsets[] = {
-  { 0, 0, 0, GENERAL_REGS, s390_fill_gregset, NULL },
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, 0, GENERAL_REGS, s390_fill_gregset, NULL },
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 
diff --git a/gdb/gdbserver/linux-sh-low.c b/gdb/gdbserver/linux-sh-low.c
index 9d27e7f..87a0dd2 100644
--- a/gdb/gdbserver/linux-sh-low.c
+++ b/gdb/gdbserver/linux-sh-low.c
@@ -104,8 +104,8 @@ static void sh_fill_gregset (struct regcache *regcache, void *buf)
 }
 
 struct regset_info target_regsets[] = {
-  { 0, 0, 0, GENERAL_REGS, sh_fill_gregset, NULL },
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, 0, GENERAL_REGS, sh_fill_gregset, NULL },
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 struct linux_target_ops the_low_target = {
diff --git a/gdb/gdbserver/linux-sparc-low.c b/gdb/gdbserver/linux-sparc-low.c
index 0bb5f2f..e0bfe81 100644
--- a/gdb/gdbserver/linux-sparc-low.c
+++ b/gdb/gdbserver/linux-sparc-low.c
@@ -260,13 +260,13 @@ sparc_reinsert_addr (void)
 
 
 struct regset_info target_regsets[] = {
-  { PTRACE_GETREGS, PTRACE_SETREGS, sizeof (elf_gregset_t),
+  { PTRACE_GETREGS, PTRACE_SETREGS, 0, sizeof (elf_gregset_t),
     GENERAL_REGS,
     sparc_fill_gregset, sparc_store_gregset },
-  { PTRACE_GETFPREGS, PTRACE_SETFPREGS, sizeof (fpregset_t),
+  { PTRACE_GETFPREGS, PTRACE_SETFPREGS, 0, sizeof (fpregset_t),
     FP_REGS,
     sparc_fill_fpregset, sparc_store_fpregset },
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 struct linux_target_ops the_low_target = {
diff --git a/gdb/gdbserver/linux-x86-low.c b/gdb/gdbserver/linux-x86-low.c
index 496baa2..2297108 100644
--- a/gdb/gdbserver/linux-x86-low.c
+++ b/gdb/gdbserver/linux-x86-low.c
@@ -24,6 +24,8 @@
 #include "linux-low.h"
 #include "i387-fp.h"
 #include "i386-low.h"
+#include "i386-xstate.h"
+#include "elf/common.h"
 
 #include "gdb_proc_service.h"
 
@@ -31,10 +33,24 @@
 void init_registers_i386_linux (void);
 /* Defined in auto-generated file amd64-linux.c.  */
 void init_registers_amd64_linux (void);
+/* Defined in auto-generated file i386-avx-linux.c.  */
+void init_registers_i386_avx_linux (void);
+/* Defined in auto-generated file amd64-avx-linux.c.  */
+void init_registers_amd64_avx_linux (void);
 
 #include <sys/reg.h>
 #include <sys/procfs.h>
 #include <sys/ptrace.h>
+#include <sys/uio.h>
+
+#ifndef PTRACE_GETREGSET
+#define PTRACE_GETREGSET	0x4204
+#endif
+
+#ifndef PTRACE_SETREGSET
+#define PTRACE_SETREGSET	0x4205
+#endif
+
 
 #ifndef PTRACE_GET_THREAD_AREA
 #define PTRACE_GET_THREAD_AREA 25
@@ -252,6 +268,18 @@ x86_store_fpxregset (struct regcache *regcache, const void *buf)
 
 #endif
 
+static void
+x86_fill_xstateregset (struct regcache *regcache, void *buf)
+{
+  i387_cache_to_xsave (regcache, buf);
+}
+
+static void
+x86_store_xstateregset (struct regcache *regcache, const void *buf)
+{
+  i387_xsave_to_cache (regcache, buf);
+}
+
 /* ??? The non-biarch i386 case stores all the i387 regs twice.
    Once in i387_.*fsave.* and once in i387_.*fxsave.*.
    This is, presumably, to handle the case where PTRACE_[GS]ETFPXREGS
@@ -264,21 +292,28 @@ x86_store_fpxregset (struct regcache *regcache, const void *buf)
 struct regset_info target_regsets[] =
 {
 #ifdef HAVE_PTRACE_GETREGS
-  { PTRACE_GETREGS, PTRACE_SETREGS, sizeof (elf_gregset_t),
+  { PTRACE_GETREGS, PTRACE_SETREGS, 0, sizeof (elf_gregset_t),
     GENERAL_REGS,
     x86_fill_gregset, x86_store_gregset },
+  { PTRACE_GETREGSET, PTRACE_SETREGSET, NT_X86_XSTATE, 0,
+# ifdef __x86_64__
+    FP_REGS,
+# else
+    EXTENDED_REGS,
+# endif
+    x86_fill_xstateregset, x86_store_xstateregset },
 # ifndef __x86_64__
 #  ifdef HAVE_PTRACE_GETFPXREGS
-  { PTRACE_GETFPXREGS, PTRACE_SETFPXREGS, sizeof (elf_fpxregset_t),
+  { PTRACE_GETFPXREGS, PTRACE_SETFPXREGS, 0, sizeof (elf_fpxregset_t),
     EXTENDED_REGS,
     x86_fill_fpxregset, x86_store_fpxregset },
 #  endif
 # endif
-  { PTRACE_GETFPREGS, PTRACE_SETFPREGS, sizeof (elf_fpregset_t),
+  { PTRACE_GETFPREGS, PTRACE_SETFPREGS, 0, sizeof (elf_fpregset_t),
     FP_REGS,
     x86_fill_fpregset, x86_store_fpregset },
 #endif /* HAVE_PTRACE_GETREGS */
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 static CORE_ADDR
@@ -772,6 +807,86 @@ x86_siginfo_fixup (struct siginfo *native, void *inf, int direction)
   return 0;
 }
 \f
+/* Process qSupported query:
+
+   x86:xstate=SIZE:xcr0=MASK
+
+   Update the buffer size for PTRACE_GETREGSET.  */
+
+static void
+x86_linux_process_qsupported (const char *query)
+{
+  int size = 0;
+  int pid;
+  unsigned long long xstateregs[I386_XSTATE_SSE_SIZE / sizeof (long long)];
+  struct iovec iov;
+  unsigned long long xcr0;
+
+  /* Return if gdb doesn't provide XCR0 info.   */
+  if (query == NULL)
+    {
+      use_xml = 0;
+      return;
+    }
+
+  xcr0 = 0;
+  if (strncmp (query, "x86:xstate=", 11) == 0)
+    {
+      char *p;
+
+      size = strtol (query + 11, &p, 0);
+      if (p != (query + 11))
+	{
+	  if (strncmp (p, ":xcr0=", 6) == 0)
+	    {
+	      xcr0 = strtoull (p + 6, NULL, 0);
+	      use_xml = 1;
+	    }
+	}
+    }
+
+  /* Check if XSAVE extended state is supported.  */
+  pid = pid_of (get_thread_lwp (current_inferior));
+
+  iov.iov_base = xstateregs;
+  iov.iov_len = I386_XSTATE_SSE_SIZE;
+
+  /* Check if PTRACE_GETREGSET works.  */
+  if (ptrace (PTRACE_GETREGSET, pid,
+	      (unsigned int) NT_X86_XSTATE, (long) &iov) == 0)
+    {
+      struct regset_info *regset;
+
+      /* Support only those XSAVE extended states supported by both gdb
+	 and host.  Get XCR0 from XSAVE extended state at byte 464.  */
+      xcr0 &= xstateregs[464 / sizeof (long long)];
+
+      /* Use PTRACE_GETREGSET if it is available.  */
+      for (regset = target_regsets;
+	   regset->fill_function != NULL; regset++)
+	if (regset->get_request == PTRACE_GETREGSET)
+	  regset->size = I386_XSTATE_SIZE (xcr0);
+	else if (regset->type != GENERAL_REGS)
+	  regset->size = 0;
+
+      /* AVX is the highest feature we support.  */
+      if ((xcr0 & I386_XSTATE_AVX_MASK) == I386_XSTATE_AVX_MASK)
+	{
+	  x86_xcr0 = xcr0;
+
+#ifdef __x86_64__
+	  /* I386 has 8 xmm regs.  */
+	  if (num_xmm_registers == 8)
+	    init_registers_i386_avx_linux ();
+	  else
+	    init_registers_amd64_avx_linux ();
+#else
+	  init_registers_i386_avx_linux ();
+#endif
+	}
+    }
+}
+
 /* Initialize gdbserver for the architecture of the inferior.  */
 
 static void
@@ -850,5 +964,6 @@ struct linux_target_ops the_low_target =
   x86_siginfo_fixup,
   x86_linux_new_process,
   x86_linux_new_thread,
-  x86_linux_prepare_to_resume
+  x86_linux_prepare_to_resume,
+  x86_linux_process_qsupported 
 };
diff --git a/gdb/gdbserver/linux-xtensa-low.c b/gdb/gdbserver/linux-xtensa-low.c
index c5ed351..8d0e73a 100644
--- a/gdb/gdbserver/linux-xtensa-low.c
+++ b/gdb/gdbserver/linux-xtensa-low.c
@@ -131,13 +131,13 @@ xtensa_store_xtregset (struct regcache *regcache, const void *buf)
 }
 
 struct regset_info target_regsets[] = {
-  { PTRACE_GETREGS, PTRACE_SETREGS, sizeof (elf_gregset_t),
+  { PTRACE_GETREGS, PTRACE_SETREGS, 0, sizeof (elf_gregset_t),
     GENERAL_REGS,
     xtensa_fill_gregset, xtensa_store_gregset },
-  { PTRACE_GETXTREGS, PTRACE_SETXTREGS, XTENSA_ELF_XTREG_SIZE,
+  { PTRACE_GETXTREGS, PTRACE_SETXTREGS, 0, XTENSA_ELF_XTREG_SIZE,
     EXTENDED_REGS,
     xtensa_fill_xtregset, xtensa_store_xtregset },
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 #if XCHAL_HAVE_BE
diff --git a/gdb/gdbserver/server.c b/gdb/gdbserver/server.c
index a03f877..6e46a7a 100644
--- a/gdb/gdbserver/server.c
+++ b/gdb/gdbserver/server.c
@@ -32,6 +32,13 @@
 #include <malloc.h>
 #endif
 
+int use_xml =
+#ifdef USE_XML
+  1;
+#else
+  0;
+#endif
+
 ptid_t cont_thread;
 ptid_t general_thread;
 ptid_t step_thread;
@@ -474,20 +481,19 @@ get_features_xml (const char *annex)
 	annex = gdbserver_xmltarget;
     }
 
-#ifdef USE_XML
-  {
-    extern const char *const xml_builtin[][2];
-    int i;
+  if (use_xml)
+    {
+      extern const char *const xml_builtin[][2];
+      int i;
 
-    /* Look for the annex.  */
-    for (i = 0; xml_builtin[i][0] != NULL; i++)
-      if (strcmp (annex, xml_builtin[i][0]) == 0)
-	break;
+      /* Look for the annex.  */
+      for (i = 0; xml_builtin[i][0] != NULL; i++)
+	if (strcmp (annex, xml_builtin[i][0]) == 0)
+	  break;
 
-    if (xml_builtin[i][0] != NULL)
-      return xml_builtin[i][1];
-  }
-#endif
+      if (xml_builtin[i][0] != NULL)
+	return xml_builtin[i][1];
+    }
 
   return NULL;
 }
@@ -1236,6 +1242,9 @@ handle_query (char *own_buf, int packet_len, int *new_packet_len_p)
     {
       char *p = &own_buf[10];
 
+      /* Start processing qSupported packet.  */
+      target_process_qsupported (NULL);
+
       /* Process each feature being provided by GDB.  The first
 	 feature will follow a ':', and latter features will follow
 	 ';'.  */
@@ -1251,6 +1260,8 @@ handle_query (char *own_buf, int packet_len, int *new_packet_len_p)
 		if (target_supports_multi_process ())
 		  multi_process = 1;
 	      }
+	    else if (strncmp (p, "x86:xstate=", 11) == 0)
+	      target_process_qsupported (p);
 	  }
 
       sprintf (own_buf, "PacketSize=%x;QPassSignals+", PBUFSIZ - 1);
diff --git a/gdb/gdbserver/server.h b/gdb/gdbserver/server.h
index f46ee60..a9cd024 100644
--- a/gdb/gdbserver/server.h
+++ b/gdb/gdbserver/server.h
@@ -22,6 +22,8 @@
 
 #include "config.h"
 
+extern int use_xml;
+
 #ifdef __MINGW32CE__
 #include "wincecompat.h"
 #endif
diff --git a/gdb/gdbserver/target.h b/gdb/gdbserver/target.h
index ac68652..6109b1c 100644
--- a/gdb/gdbserver/target.h
+++ b/gdb/gdbserver/target.h
@@ -286,6 +286,9 @@ struct target_ops
 
   /* Returns the core given a thread, or -1 if not known.  */
   int (*core_of_thread) (ptid_t);
+
+  /* Target specific qSupported support.  */
+  void (*process_qsupported) (const char *);
 };
 
 extern struct target_ops *the_target;
@@ -326,6 +329,10 @@ void set_target_ops (struct target_ops *);
   (the_target->supports_multi_process ? \
    (*the_target->supports_multi_process) () : 0)
 
+#define target_process_qsupported(query) \
+  (the_target->process_qsupported ? \
+   (*the_target->process_qsupported) (query) : (void) 0)
+
 /* Start non-stop mode, returns 0 on success, -1 on failure.   */
 
 int start_non_stop (int nonstop);
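
For readers following the qSupported change, here is a minimal standalone
sketch of parsing one "x86:xstate=SIZE:xcr0=MASK" feature.  It is not part
of the patch: the function name parse_xstate_feature is hypothetical, and it
only mirrors the strtol/strtoull approach used in
x86_linux_process_qsupported above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not from the patch.  Parse one qSupported
   feature of the form "x86:xstate=SIZE:xcr0=MASK".  Return 1 on
   success, 0 if the feature is absent or malformed.  *SIZE and *XCR0
   are only meaningful when 1 is returned.  */
int
parse_xstate_feature (const char *feature, long *size,
		      unsigned long long *xcr0)
{
  static const char prefix[] = "x86:xstate=";
  static const char xcr0_tag[] = ":xcr0=";
  const char *start;
  char *end;

  assert (size != NULL && xcr0 != NULL);

  if (feature == NULL
      || strncmp (feature, prefix, sizeof (prefix) - 1) != 0)
    return 0;

  /* SIZE follows the prefix; base 0 accepts decimal or 0x hex.  */
  start = feature + sizeof (prefix) - 1;
  *size = strtol (start, &end, 0);
  if (end == start)
    return 0;			/* No digits after "xstate=".  */

  if (strncmp (end, xcr0_tag, sizeof (xcr0_tag) - 1) != 0)
    return 0;

  /* MASK follows ":xcr0="; parsing stops at the next ';' or NUL.  */
  start = end + sizeof (xcr0_tag) - 1;
  *xcr0 = strtoull (start, &end, 0);
  return end != start;
}
```

With an illustrative input such as "x86:xstate=832:xcr0=0x7" the helper
reports a size of 832 and a mask of 0x7; a feature without digits after
"xstate=" is rejected rather than treated as zero.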

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PATCH: 1/6: Add AVX support
  2010-03-04 18:02 PATCH: 1/6: Add AVX support H.J. Lu
  2010-03-04 18:05 ` PATCH: 2/6: Add AVX support (Update document) H.J. Lu
@ 2010-03-04 19:09 ` Daniel Jacobowitz
  2010-03-04 19:29   ` H.J. Lu
  2010-03-06 22:16 ` PATCH: 0/6 [2nd try]: " H.J. Lu
  2 siblings, 1 reply; 115+ messages in thread
From: Daniel Jacobowitz @ 2010-03-04 19:09 UTC (permalink / raw)
  To: H.J. Lu; +Cc: GDB

On Thu, Mar 04, 2010 at 10:02:19AM -0800, H.J. Lu wrote:
> 4. Remote gdb protocol extension. GDB will send
> 
> x86:xstate=BYTES:xcr0=VALUE
> 
> in qSupported request packet to indicate that GDB supports x86 XSAVE
> extended state. BYTES specifies the maximum size in bytes of x86 XSAVE
> extended state GDB supports. VALUE specifies the maximum value of XCR0
> GDB supports.  Gdbserver will select the best target description
> supported by GDB, based on BYTES and VALUE. The older gdbserver will
> always return SSE target.

The whole point of target descriptions, and the thing we've been going
to so much trouble to implement for the past month, is that this
negotiation is not supposed to be needed.  Why does it matter what GDB
supports?  If there are new registers that GDB does not know about,
in the target description supplied by gdbserver, then GDB will not use
them specially for debug info or function calls.  But it will
otherwise handle them fine.

-- 
Daniel Jacobowitz
CodeSourcery

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PATCH: 1/6: Add AVX support
  2010-03-04 19:09 ` PATCH: 1/6: Add AVX support Daniel Jacobowitz
@ 2010-03-04 19:29   ` H.J. Lu
  2010-03-04 19:47     ` Daniel Jacobowitz
  0 siblings, 1 reply; 115+ messages in thread
From: H.J. Lu @ 2010-03-04 19:29 UTC (permalink / raw)
  To: H.J. Lu, GDB

On Thu, Mar 4, 2010 at 11:09 AM, Daniel Jacobowitz <dan@codesourcery.com> wrote:
> On Thu, Mar 04, 2010 at 10:02:19AM -0800, H.J. Lu wrote:
>> 4. Remote gdb protocol extension. GDB will send
>>
>> x86:xstate=BYTES:xcr0=VALUE
>>
>> in qSupported request packet to indicate that GDB supports x86 XSAVE
>> extended state. BYTES specifies the maximum size in bytes of x86 XSAVE
>> extended state GDB supports. VALUE specifies the maximum value of XCR0
>> GDB supports.  Gdbserver will select the best target description
>> supported by GDB, based on BYTES and VALUE. The older gdbserver will
>> always return SSE target.
>
> The whole point of target descriptions, and the thing we've been going
> to so much trouble to implement for the past month, is that this
> negotiation is not supposed to be needed.  Why does it matter what GDB
> supports?  If there are new registers that GDB does not know about,
> in the target description supplied by gdbserver, then GDB will not use
> them specially for debug info or function calls.  But it will
> otherwise handle them fine.
>

AVX registers aren't new registers, on top of SSE registers. AVX
registers are the super set of SSE registers. XMM0 is the alias
of the lower 128bit of YMM0. So we have either SSE target or
AVX target, depending on the processor/OS. We may have
remote gdb stub on an AVX processor/OS.  But gdb, which
the stub is talking to, may not support AVX at all. If the stub
sends the AVX target description to gdb, gdb won't understand it
and will fail.

This may not be a serious issue today since the new stub with
x86 XML target descriptions can only talk to gdb with x86 XML
support, which very likely supports AVX XML. But between my x86
XML checkin and AVX checkin, there are some snapshots of
gdb, which only supports SSE XML. If gdbserver sends AVX
XML to them, those gdbs will fail.

In the future, we will add more states to XSAVE extended state
and we will extend gdb to support them. Gdbserver should
send the XML target, which is supported by gdb, to gdb so
that the new x86 gdb stub can talk to older gdb with x86
XML support. The XCR0 bits in the qSupported request packet
are used for this purpose to provide forward compatibility.
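
As a rough sketch of the selection described above (the names select_tdesc,
tdesc_kind and XSTATE_* are hypothetical, chosen for illustration; only the
XCR0 bit positions for x87, SSE and AVX come from the architecture), the
stub could intersect the XCR0 value advertised by gdb with the one reported
by the host and fall back to SSE unless both sides cover the full AVX mask:

```c
/* XCR0 feature bits: bit 0 is x87, bit 1 is SSE, bit 2 is AVX.  */
#define XSTATE_X87 (1ULL << 0)
#define XSTATE_SSE (1ULL << 1)
#define XSTATE_AVX (1ULL << 2)

#define XSTATE_SSE_MASK (XSTATE_X87 | XSTATE_SSE)
#define XSTATE_AVX_MASK (XSTATE_SSE_MASK | XSTATE_AVX)

/* Hypothetical tag for the description the stub would send.  */
enum tdesc_kind { TDESC_SSE, TDESC_AVX };

/* Pick the richest description supported by both gdb and the host.
   Unknown higher bits on either side are ignored, so a newer host
   still yields AVX for a gdb that only knows AVX.  */
enum tdesc_kind
select_tdesc (unsigned long long gdb_xcr0, unsigned long long host_xcr0)
{
  unsigned long long common = gdb_xcr0 & host_xcr0;

  if ((common & XSTATE_AVX_MASK) == XSTATE_AVX_MASK)
    return TDESC_AVX;

  /* Default to SSE, which is also what an older gdbserver returns.  */
  return TDESC_SSE;
}
```

An older gdb that sends no XCR0 at all would be treated as gdb_xcr0 == 0
here and get the SSE description.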


-- 
H.J.

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PATCH: 1/6: Add AVX support
  2010-03-04 19:29   ` H.J. Lu
@ 2010-03-04 19:47     ` Daniel Jacobowitz
  2010-03-04 21:27       ` H.J. Lu
  0 siblings, 1 reply; 115+ messages in thread
From: Daniel Jacobowitz @ 2010-03-04 19:47 UTC (permalink / raw)
  To: H.J. Lu; +Cc: GDB

On Thu, Mar 04, 2010 at 11:29:38AM -0800, H.J. Lu wrote:
> AVX registers aren't new registers, on top of SSE registers. AVX
> registers are the super set of SSE registers. XMM0 is the alias
> of the lower 128bit of YMM0. So we have either SSE target or
> AVX target, depending on the processor/OS. We may have
> remote gdb stub on an AVX processor/OS.  But gdb, which
> the stub is talking to, may not support AVX at all. If the stub
> sends the AVX target description to gdb, gdb won't understand it
> and will fail.

No, it will fail to display SSE.  Core debugging should still be
possible, and the newly added registers will be visible too.  If
that's not the case, fix GDB to function with the SSE registers
missing.

The goal of the target description language is to communicate the
entire target to GDB.  If you're not putting enough in the description
for an arbitrary XML-capable GDB to function, then you need to rethink
how you've written the description.

I do not think that having GDB negotiate with the target for an
intermediate target description is a good idea.  Also, it is not hard
to upgrade GDB on the host.  It can be hard to upgrade the target, of
course.

-- 
Daniel Jacobowitz
CodeSourcery

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PATCH: 1/6: Add AVX support
  2010-03-04 19:47     ` Daniel Jacobowitz
@ 2010-03-04 21:27       ` H.J. Lu
  2010-03-04 21:34         ` Nathan Froyd
  2010-03-04 21:47         ` Daniel Jacobowitz
  0 siblings, 2 replies; 115+ messages in thread
From: H.J. Lu @ 2010-03-04 21:27 UTC (permalink / raw)
  To: H.J. Lu, GDB

On Thu, Mar 4, 2010 at 11:46 AM, Daniel Jacobowitz <dan@codesourcery.com> wrote:
> On Thu, Mar 04, 2010 at 11:29:38AM -0800, H.J. Lu wrote:
>> AVX registers aren't new registers, on top of SSE registers. AVX
>> registers are the super set of SSE registers. XMM0 is the alias
>> of the lower 128bit of YMM0. So we have either SSE target or
>> AVX target, depending on the processor/OS. We may have
>> remote gdb stub on an AVX processor/OS.  But gdb, which
>> the stub is talking to, may not support AVX at all. If the stub
>> sends the AVX target description to gdb, gdb won't understand it
>> and will fail.
>
> No, it will fail to display SSE.  Core debugging should still be
> possible, and the newly added registers will be visible too.  If
> that's not the case, fix GDB to function with the SSE registers
> missing.

Your description only works for truly NEW registers, which
AVX registers aren't.  AVX registers are actually the old SSE
registers with different names.

> The goal of the target description language is to communicate the
> entire target to GDB.  If you're not putting enough in the description
> for an arbitrary XML-capable GDB to function, then you need to rethink
> how you've written the description.
>
> I do not think that having GDB negotiate with the target for an
> intermediate target description is a good idea.  Also, it is not hard
> to upgrade GDB on the host.  It can be hard to upgrade the target, of
> course.
>

Gdb stub can only send XML target description with register names
which gdb can pass to tdesc_numbered_register. If gdb doesn't
know the new register names, it won't work.


-- 
H.J.

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PATCH: 1/6: Add AVX support
  2010-03-04 21:27       ` H.J. Lu
@ 2010-03-04 21:34         ` Nathan Froyd
  2010-03-04 21:41           ` H.J. Lu
  2010-03-04 21:47         ` Daniel Jacobowitz
  1 sibling, 1 reply; 115+ messages in thread
From: Nathan Froyd @ 2010-03-04 21:34 UTC (permalink / raw)
  To: H.J. Lu; +Cc: GDB

On Thu, Mar 04, 2010 at 01:27:09PM -0800, H.J. Lu wrote:
> On Thu, Mar 4, 2010 at 11:46 AM, Daniel Jacobowitz <dan@codesourcery.com> wrote:
> > No, it will fail to display SSE.  Core debugging should still be
> > possible, and the newly added registers will be visible too.  If
> > that's not the case, fix GDB to function with the SSE registers
> > missing.
> 
> Your description only works for truly NEW registers, which
> AVX registers aren't.  AVX registers are actually the old SSE
> registers with different names.

You can make "wide" registers like this work; the PPC backend does this
for the SPE registers, where the lower 32 bits function as the normal
PPC registers and the upper 32 bits are sent as separate "registers".
GDB then synthesizes the complete register out of both parts.
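
To make the split-register idea concrete for the AVX case, here is a
minimal sketch, assuming the stub sends the low 128 bits and the high
128 bits of each wide register as two separate raw registers. The
function names and sizes below are hypothetical illustrations, not
code from the PPC backend or from the patch under review.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sizes: two 16-byte halves forming one 32-byte register.  */
enum { HALF_BYTES = 16, FULL_BYTES = 2 * HALF_BYTES };

/* Assemble the full pseudo register from the two raw halves that the
   stub transmitted separately.  */
static void
wide_pseudo_read (const uint8_t *low_raw, const uint8_t *high_raw,
                  uint8_t *full)
{
  assert (low_raw != NULL && high_raw != NULL && full != NULL);
  memcpy (full, low_raw, HALF_BYTES);
  memcpy (full + HALF_BYTES, high_raw, HALF_BYTES);
}

/* Writing the pseudo register splits it back into the two raw halves.  */
static void
wide_pseudo_write (const uint8_t *full, uint8_t *low_raw, uint8_t *high_raw)
{
  assert (full != NULL && low_raw != NULL && high_raw != NULL);
  memcpy (low_raw, full, HALF_BYTES);
  memcpy (high_raw, full + HALF_BYTES, HALF_BYTES);
}
```

Under this scheme a debugger that knows nothing about the high halves
would still see the low raw registers under their existing names.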

-Nathan

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PATCH: 1/6: Add AVX support
  2010-03-04 21:34         ` Nathan Froyd
@ 2010-03-04 21:41           ` H.J. Lu
  2010-03-04 21:59             ` Nathan Froyd
  0 siblings, 1 reply; 115+ messages in thread
From: H.J. Lu @ 2010-03-04 21:41 UTC (permalink / raw)
  To: Nathan Froyd; +Cc: GDB

On Thu, Mar 4, 2010 at 1:34 PM, Nathan Froyd <froydnj@codesourcery.com> wrote:
> On Thu, Mar 04, 2010 at 01:27:09PM -0800, H.J. Lu wrote:
>> On Thu, Mar 4, 2010 at 11:46 AM, Daniel Jacobowitz <dan@codesourcery.com> wrote:
>> > No, it will fail to display SSE.  Core debugging should still be
>> > possible, and the newly added registers will be visible too.  If
>> > that's not the case, fix GDB to function with the SSE registers
>> > missing.
>>
>> Your description only works for truly NEW registers, which
>> AVX registers aren't.  AVX registers are actually the old SSE
>> registers with different names.
>
> You can make "wide" registers like this work; the PPC backend does this
> for the SPE registers, where the lower 32 bits function as the normal
> PPC registers and the upper 32 bits are sent as separate "registers".
> GDB then synthesizes the complete register out of both parts.
>

I am not familiar with SPE. How does it work with native SPE
gdb? Does it support old registers with new names?


-- 
H.J.

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PATCH: 1/6: Add AVX support
  2010-03-04 21:27       ` H.J. Lu
  2010-03-04 21:34         ` Nathan Froyd
@ 2010-03-04 21:47         ` Daniel Jacobowitz
  2010-03-05  2:06           ` H.J. Lu
  1 sibling, 1 reply; 115+ messages in thread
From: Daniel Jacobowitz @ 2010-03-04 21:47 UTC (permalink / raw)
  To: H.J. Lu; +Cc: GDB

On Thu, Mar 04, 2010 at 01:27:09PM -0800, H.J. Lu wrote:
> > No, it will fail to display SSE.  Core debugging should still be
> > possible, and the newly added registers will be visible too.  If
> > that's not the case, fix GDB to function with the SSE registers
> > missing.
> 
> Your description only works for truly NEW registers, which
> AVX registers aren't.  AVX registers are actually the old SSE
> registers with different names.

I'm trying to get you to think about compatibility in the
descriptions, instead of separately in the remote protocol.
There are always ways to solve it.  For instance, you could present
both the AVX registers and the hypothetical newer, larger registers as
separate things.  As long as the P packet is implemented, which it is,
GDB should work OK if modifying one register changes another.
I don't know if there's an example of this in the GDB sources, but I
have one in my tree; there's $sp, $sp_user, and $sp_system registers,
and $sp is the same as one of the other two depending on processor
mode.  But they're all visible.

Another solution is to define new registers which correspond to the
added bits, and have a sufficiently recent GDB synthesize the combined
registers from the AVX registers and the new bits.  This, for
instance, is how the Power E500 registers are handled
(rs6000/power-spe.xml).

-- 
Daniel Jacobowitz
CodeSourcery

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PATCH: 1/6: Add AVX support
  2010-03-04 21:41           ` H.J. Lu
@ 2010-03-04 21:59             ` Nathan Froyd
  0 siblings, 0 replies; 115+ messages in thread
From: Nathan Froyd @ 2010-03-04 21:59 UTC (permalink / raw)
  To: H.J. Lu; +Cc: GDB

On Thu, Mar 04, 2010 at 01:41:05PM -0800, H.J. Lu wrote:
> On Thu, Mar 4, 2010 at 1:34 PM, Nathan Froyd <froydnj@codesourcery.com> wrote:
> > On Thu, Mar 04, 2010 at 01:27:09PM -0800, H.J. Lu wrote:
> >> Your description only works for truly NEW registers, which
> >> AVX registers aren't.  AVX registers are actually the old SSE
> >> registers with different names.
> >
> > You can make "wide" registers like this work; the PPC backend does this
> > for the SPE registers, where the lower 32 bits function as the normal
> > PPC registers and the upper 32 bits are sent as separate "registers".
> > GDB then synthesizes the complete register out of both parts.
> 
> I am not familiar with SPE. How does it work with native SPE
> gdb? Does it support old registers with new names?

I assume it works just fine; I've never taken the chance to see how it
works natively, but it works just fine remotely.  Of course, the native
bits have to be written with an understanding of how the high parts of
the registers work.

I don't understand your second question.

-Nathan

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PATCH: 1/6: Add AVX support
  2010-03-04 21:47         ` Daniel Jacobowitz
@ 2010-03-05  2:06           ` H.J. Lu
  2010-03-05  7:29             ` Mark Kettenis
  0 siblings, 1 reply; 115+ messages in thread
From: H.J. Lu @ 2010-03-05  2:06 UTC (permalink / raw)
  To: H.J. Lu, GDB

On Thu, Mar 4, 2010 at 1:47 PM, Daniel Jacobowitz <dan@codesourcery.com> wrote:
> On Thu, Mar 04, 2010 at 01:27:09PM -0800, H.J. Lu wrote:
>> > No, it will fail to display SSE.  Core debugging should still be
>> > possible, and the newly added registers will be visible too.  If
>> > that's not the case, fix GDB to function with the SSE registers
>> > missing.
>>
>> Your description only works for truly NEW registers, which
>> AVX registers aren't.  AVX registers are actually the old SSE
>> registers with different names.
>
> I'm trying to get you to think about compatibility in the
> descriptions, instead of separately in the remote protocol.
> There are always ways to solve it.  For instance, you could present
> both the AVX registers and the hypothetical newer, larger registers as
> separate things.  As long as the P packet is implemented, which it is,
> GDB should work OK if modifying one register changes another.
> I don't know if there's an example of this in the GDB sources, but I
> have one in my tree; there's $sp, $sp_user, and $sp_system registers,
> and $sp is the same as one of the other two depending on processor
> mode.  But they're all visible.
>
> Another solution is to define new registers which correspond to the
> added bits, and have a sufficiently recent GDB synthesize the combined
> registers from the AVX registers and the new bits.  This, for
> instance, is how the Power E500 registers are handled
> (rs6000/power-spe.xml).
>

OK, I will try the SPE approach. It will take a while.

Thanks.


-- 
H.J.

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PATCH: 5/6: Add AVX support (i387 changes)
  2010-03-04 18:09     ` PATCH: 5/6: Add AVX support (i387 changes) H.J. Lu
  2010-03-04 18:10       ` PATCH: 6/6: Add AVX support (gdbserver changes) H.J. Lu
@ 2010-03-05  3:20       ` Hui Zhu
  2010-03-05  3:54         ` H.J. Lu
  2010-03-06 22:22       ` PATCH: 5/6 [2nd try]: " H.J. Lu
  2 siblings, 1 reply; 115+ messages in thread
From: Hui Zhu @ 2010-03-05  3:20 UTC (permalink / raw)
  To: H.J. Lu; +Cc: GDB

-#define I387_XMM0_REGNUM(tdep) (I387_ST0_REGNUM (tdep) + 16)
+#define I387_VECTOR0_REGNUM(tdep) (I387_ST0_REGNUM (tdep) + 16)

Do we need this name change?

Thanks,
Hui

On Fri, Mar 5, 2010 at 02:09, H.J. Lu <hongjiu.lu@intel.com> wrote:
> Hi,
>
> Here are i387 changes to support AVX.  OK to install?
>
> Thanks.
>
>
> H.J.
> ---
> 2010-03-03  H.J. Lu  <hongjiu.lu@intel.com>
>
>        * i387-tdep.c: Include "i386-xstate.h".
>        (i387_supply_fsave): Replace I387_XMM0_REGNUM with
>        I387_VECTOR0_REGNUM.
>        (i387_collect_fsave): Likewise.
>        (i387_supply_fxsave): Replace I387_XMM0_REGNUM with
>        I387_VECTOR0_REGNUM.  Replace num_xmm_regs with num_vector_regs.
>        Check tdep->xcr0 for AVX.
>        (i387_collect_fxsave): Likewise.
>        (xsave_sse_offset): New.
>        (XSAVE_XSTATE_BV_ADDR): Likewise.
>        (XSAVE_SSE_ADDR): Likewise.
>        (xsave_avxh_offset): Likewise.
>        (XSAVE_AVXH_ADDR): Likewise.
>        (i387_supply_xsave): Likewise.
>        (i387_collect_xsave): Likewise.
>
>        * i387-tdep.h (I387_NUM_XMM_REGS): Renamed to ...
>        (I387_NUM_VECTOR_REGS): This.
>        (I387_XMM0_REGNUM): Renamed to ...
>        (I387_VECTOR0_REGNUM): This.
>        (I387_MXCSR_REGNUM): Updated.
>        (i387_supply_xsave): New.
>        (i387_collect_xsave): Likewise.
>
> diff --git a/gdb/i387-tdep.c b/gdb/i387-tdep.c
> index 3fb5b56..1f4547d 100644
> --- a/gdb/i387-tdep.c
> +++ b/gdb/i387-tdep.c
> @@ -34,6 +34,7 @@
>
>  #include "i386-tdep.h"
>  #include "i387-tdep.h"
> +#include "i386-xstate.h"
>
>  /* Print the floating point number specified by RAW.  */
>
> @@ -398,7 +399,7 @@ i387_supply_fsave (struct regcache *regcache, int regnum, const void *fsave)
>
>   gdb_assert (tdep->st0_regnum >= I386_ST0_REGNUM);
>
> -  for (i = I387_ST0_REGNUM (tdep); i < I387_XMM0_REGNUM (tdep); i++)
> +  for (i = I387_ST0_REGNUM (tdep); i < I387_VECTOR0_REGNUM (tdep); i++)
>     if (regnum == -1 || regnum == i)
>       {
>        if (fsave == NULL)
> @@ -425,7 +426,7 @@ i387_supply_fsave (struct regcache *regcache, int regnum, const void *fsave)
>       }
>
>   /* Provide dummy values for the SSE registers.  */
> -  for (i = I387_XMM0_REGNUM (tdep); i < I387_MXCSR_REGNUM (tdep); i++)
> +  for (i = I387_VECTOR0_REGNUM (tdep); i < I387_MXCSR_REGNUM (tdep); i++)
>     if (regnum == -1 || regnum == i)
>       regcache_raw_supply (regcache, i, NULL);
>   if (regnum == -1 || regnum == I387_MXCSR_REGNUM (tdep))
> @@ -451,7 +452,7 @@ i387_collect_fsave (const struct regcache *regcache, int regnum, void *fsave)
>
>   gdb_assert (tdep->st0_regnum >= I386_ST0_REGNUM);
>
> -  for (i = I387_ST0_REGNUM (tdep); i < I387_XMM0_REGNUM (tdep); i++)
> +  for (i = I387_ST0_REGNUM (tdep); i < I387_VECTOR0_REGNUM (tdep); i++)
>     if (regnum == -1 || regnum == i)
>       {
>        /* Most of the FPU control registers occupy only 16 bits in
> @@ -541,9 +542,11 @@ i387_supply_fxsave (struct regcache *regcache, int regnum, const void *fxsave)
>   struct gdbarch_tdep *tdep = gdbarch_tdep (get_regcache_arch (regcache));
>   const gdb_byte *regs = fxsave;
>   int i;
> +  gdb_byte raw[I386_MAX_REGISTER_SIZE];
> +  const gdb_byte *xmm;
>
>   gdb_assert (tdep->st0_regnum >= I386_ST0_REGNUM);
> -  gdb_assert (tdep->num_xmm_regs > 0);
> +  gdb_assert (tdep->num_vector_regs > 0);
>
>   for (i = I387_ST0_REGNUM (tdep); i < I387_MXCSR_REGNUM (tdep); i++)
>     if (regnum == -1 || regnum == i)
> @@ -556,7 +559,7 @@ i387_supply_fxsave (struct regcache *regcache, int regnum, const void *fxsave)
>
>        /* Most of the FPU control registers occupy only 16 bits in
>           the fxsave area.  Give those a special treatment.  */
> -       if (i >= I387_FCTRL_REGNUM (tdep) && i < I387_XMM0_REGNUM (tdep)
> +       if (i >= I387_FCTRL_REGNUM (tdep) && i < I387_VECTOR0_REGNUM (tdep)
>            && i != I387_FIOFF_REGNUM (tdep) && i != I387_FOOFF_REGNUM (tdep))
>          {
>            gdb_byte val[4];
> @@ -600,7 +603,17 @@ i387_supply_fxsave (struct regcache *regcache, int regnum, const void *fxsave)
>            regcache_raw_supply (regcache, i, val);
>          }
>        else
> -         regcache_raw_supply (regcache, i, FXSAVE_ADDR (tdep, regs, i));
> +         {
> +           if ((tdep->xcr0 & I386_XSTATE_AVX_MASK)
> +               == I386_XSTATE_AVX_MASK)
> +             {
> +               memcpy (raw, FXSAVE_ADDR (tdep, regs, i), 16);
> +               xmm = raw;
> +             }
> +           else
> +             xmm = FXSAVE_ADDR (tdep, regs, i);
> +           regcache_raw_supply (regcache, i, xmm);
> +         }
>       }
>
>   if (regnum == I387_MXCSR_REGNUM (tdep) || regnum == -1)
> @@ -624,16 +637,18 @@ i387_collect_fxsave (const struct regcache *regcache, int regnum, void *fxsave)
>   struct gdbarch_tdep *tdep = gdbarch_tdep (get_regcache_arch (regcache));
>   gdb_byte *regs = fxsave;
>   int i;
> +  gdb_byte raw[I386_MAX_REGISTER_SIZE];
> +  gdb_byte *xmm;
>
>   gdb_assert (tdep->st0_regnum >= I386_ST0_REGNUM);
> -  gdb_assert (tdep->num_xmm_regs > 0);
> +  gdb_assert (tdep->num_vector_regs > 0);
>
>   for (i = I387_ST0_REGNUM (tdep); i < I387_MXCSR_REGNUM (tdep); i++)
>     if (regnum == -1 || regnum == i)
>       {
>        /* Most of the FPU control registers occupy only 16 bits in
>            the fxsave area.  Give those a special treatment.  */
> -       if (i >= I387_FCTRL_REGNUM (tdep) && i < I387_XMM0_REGNUM (tdep)
> +       if (i >= I387_FCTRL_REGNUM (tdep) && i < I387_VECTOR0_REGNUM (tdep)
>            && i != I387_FIOFF_REGNUM (tdep) && i != I387_FOOFF_REGNUM (tdep))
>          {
>            gdb_byte buf[4];
> @@ -669,7 +684,465 @@ i387_collect_fxsave (const struct regcache *regcache, int regnum, void *fxsave)
>            memcpy (FXSAVE_ADDR (tdep, regs, i), buf, 2);
>          }
>        else
> +         {
> +           if ((tdep->xcr0 & I386_XSTATE_AVX_MASK)
> +               == I386_XSTATE_AVX_MASK)
> +             {
> +               memcpy (raw, FXSAVE_ADDR (tdep, regs, i), 16);
> +               xmm = raw;
> +             }
> +           else
> +             xmm = FXSAVE_ADDR (tdep, regs, i);
> +           regcache_raw_collect (regcache, i, xmm);
> +         }
> +      }
> +
> +  if (regnum == I387_MXCSR_REGNUM (tdep) || regnum == -1)
> +    regcache_raw_collect (regcache, I387_MXCSR_REGNUM (tdep),
> +                         FXSAVE_MXCSR_ADDR (regs));
> +}
> +
> +/* At xsave_sse_offset[REGNUM] you'll find the offset to the location in
> +   the SSE register data structure used by the "xsave" instruction where
> +   GDB register REGNUM is stored.  */
> +
> +static int xsave_sse_offset[] =
> +{
> +  160 + 0 * 16,                /* %xmm0 through ...  */
> +  160 + 1 * 16,
> +  160 + 2 * 16,
> +  160 + 3 * 16,
> +  160 + 4 * 16,
> +  160 + 5 * 16,
> +  160 + 6 * 16,
> +  160 + 7 * 16,
> +  160 + 8 * 16,
> +  160 + 9 * 16,
> +  160 + 10 * 16,
> +  160 + 11 * 16,
> +  160 + 12 * 16,
> +  160 + 13 * 16,
> +  160 + 14 * 16,
> +  160 + 15 * 16,       /* ... %xmm15 (128 bits each).  */
> +};
> +
> +/* `xstate_bv' is at byte offset 512.  */
> +#define XSAVE_XSTATE_BV_ADDR(xsave) (xsave + 512)
> +
> +#define XSAVE_SSE_ADDR(tdep, xsave, regnum) \
> +  (xsave + xsave_sse_offset[regnum - I387_VECTOR0_REGNUM (tdep)])
> +
> +/* At xsave_avxh_offset[REGNUM] you'll find the offset to the location in
> +   the upper 128bit of AVX register data structure used by the "xsave"
> +   instruction where GDB register REGNUM is stored.  */
> +
> +static int xsave_avxh_offset[] =
> +{
> +  576 + 0 * 16,                /* Upper 128bit of %ymm0 through ...  */
> +  576 + 1 * 16,
> +  576 + 2 * 16,
> +  576 + 3 * 16,
> +  576 + 4 * 16,
> +  576 + 5 * 16,
> +  576 + 6 * 16,
> +  576 + 7 * 16,
> +  576 + 8 * 16,
> +  576 + 9 * 16,
> +  576 + 10 * 16,
> +  576 + 11 * 16,
> +  576 + 12 * 16,
> +  576 + 13 * 16,
> +  576 + 14 * 16,
> +  576 + 15 * 16,       /* Upper 128bit of ... %ymm15 (128 bits each).  */
> +};
> +
> +#define XSAVE_AVXH_ADDR(tdep, xsave, regnum) \
> +  (xsave + xsave_avxh_offset[regnum - I387_VECTOR0_REGNUM (tdep)])
> +
> +/* Similar to i387_supply_fxsave, but use XSAVE extended state.  */
> +
> +void
> +i387_supply_xsave (struct regcache *regcache, int regnum,
> +                  const void *xsave)
> +{
> +  struct gdbarch_tdep *tdep = gdbarch_tdep (get_regcache_arch (regcache));
> +  const gdb_byte *regs = xsave;
> +  int i;
> +  unsigned int clear_bv;
> +  gdb_byte raw[I386_MAX_REGISTER_SIZE];
> +  const gdb_byte *p;
> +
> +  gdb_assert (tdep->st0_regnum >= I386_ST0_REGNUM);
> +  gdb_assert (tdep->num_vector_regs > 0);
> +
> +  if (regs != NULL
> +      && (regnum == -1
> +         || (regnum >= I387_VECTOR0_REGNUM (tdep)
> +             && regnum < I387_MXCSR_REGNUM (tdep))
> +         || (regnum >= I387_ST0_REGNUM (tdep)
> +             && regnum < I387_FCTRL_REGNUM (tdep))))
> +    {
> +      /* Get `xstate_bv'.  */
> +      const gdb_byte *xstate_bv_p = XSAVE_XSTATE_BV_ADDR (regs);
> +
> +      /* The supported bits in `xstate_bv' fit in 1 byte.  Clear the
> +        registers of each component whose bit in `xstate_bv' is zero.  */
> +      clear_bv = (~(*xstate_bv_p)) & tdep->xcr0;
> +    }
> +  else
> +    clear_bv = 0;
> +
> +  for (i = I387_ST0_REGNUM (tdep); i < I387_MXCSR_REGNUM (tdep); i++)
> +    if (regnum == -1 || regnum == i)
> +      {
> +       if (regs == NULL)
> +         {
> +           regcache_raw_supply (regcache, i, NULL);
> +           continue;
> +         }
> +
> +       /* Most of the FPU control registers occupy only 16 bits in
> +          the xsave extended state.  Give those a special treatment.  */
> +       if (i >= I387_FCTRL_REGNUM (tdep)
> +           && i < I387_VECTOR0_REGNUM (tdep)
> +           && i != I387_FIOFF_REGNUM (tdep)
> +           && i != I387_FOOFF_REGNUM (tdep))
> +         {
> +           gdb_byte val[4];
> +
> +           memcpy (val, FXSAVE_ADDR (tdep, regs, i), 2);
> +           val[2] = val[3] = 0;
> +           if (i == I387_FOP_REGNUM (tdep))
> +             val[1] &= ((1 << 3) - 1);
> +           else if (i == I387_FTAG_REGNUM (tdep))
> +             {
> +               /* The fxsave area contains a simplified version of
> +                  the tag word.  We have to look at the actual 80-bit
> +                  FP data to recreate the traditional i387 tag word.  */
> +
> +               unsigned long ftag = 0;
> +               int fpreg;
> +               int top;
> +
> +               top = ((FXSAVE_ADDR (tdep, regs,
> +                                    I387_FSTAT_REGNUM (tdep)))[1] >> 3);
> +               top &= 0x7;
> +
> +               for (fpreg = 7; fpreg >= 0; fpreg--)
> +                 {
> +                   int tag;
> +
> +                   if (val[0] & (1 << fpreg))
> +                     {
> +                       int regnum = (fpreg + 8 - top) % 8
> +                                      + I387_ST0_REGNUM (tdep);
> +                       tag = i387_tag (FXSAVE_ADDR (tdep, regs, regnum));
> +                     }
> +                   else
> +                     tag = 3;          /* Empty */
> +
> +                   ftag |= tag << (2 * fpreg);
> +                 }
> +               val[0] = ftag & 0xff;
> +               val[1] = (ftag >> 8) & 0xff;
> +             }
> +           regcache_raw_supply (regcache, i, val);
> +         }
> +       else if (i < I387_VECTOR0_REGNUM (tdep))
> +         {
> +           if (i < I387_FCTRL_REGNUM (tdep)
> +               && (clear_bv & bit_I386_XSTATE_X87))
> +             p = NULL;
> +           else
> +             p = FXSAVE_ADDR (tdep, regs, i);
> +           regcache_raw_supply (regcache, i, p);
> +         }
> +       else
> +         {
> +           if ((tdep->xcr0 & I386_XSTATE_AVX_MASK)
> +               == I386_XSTATE_AVX_MASK)
> +             {
> +               if ((clear_bv & (bit_I386_XSTATE_SSE | bit_I386_XSTATE_AVX))
> +                   == (bit_I386_XSTATE_SSE | bit_I386_XSTATE_AVX))
> +                 p = NULL;
> +               else
> +                 {
> +                   p = raw;
> +                   if ((clear_bv & bit_I386_XSTATE_SSE))
> +                     memset (raw, 0, 16);
> +                   else
> +                     memcpy (raw, XSAVE_SSE_ADDR (tdep, regs, i), 16);
> +                   if ((clear_bv & bit_I386_XSTATE_AVX))
> +                     memset (raw + 16, 0, 16);
> +                   else
> +                     memcpy (raw + 16, XSAVE_AVXH_ADDR (tdep, regs, i),
> +                             16);
> +                 }
> +             }
> +           else
> +             {
> +               if ((clear_bv & bit_I386_XSTATE_SSE))
> +                 p = NULL;
> +               else
> +                 p = XSAVE_SSE_ADDR (tdep, regs, i);
> +             }
> +           regcache_raw_supply (regcache, i, p);
> +         }
> +      }
> +
> +  if (regnum == I387_MXCSR_REGNUM (tdep) || regnum == -1)
> +    {
> +      if (regs == NULL)
> +       regcache_raw_supply (regcache, I387_MXCSR_REGNUM (tdep), NULL);
> +      else
> +       regcache_raw_supply (regcache, I387_MXCSR_REGNUM (tdep),
> +                            FXSAVE_MXCSR_ADDR (regs));
> +    }
> +}
> +
> +/* Similar to i387_collect_fxsave, but use XSAVE extended state.  */
> +
> +void
> +i387_collect_xsave (const struct regcache *regcache, int regnum,
> +                   void *xsave, int gcore)
> +{
> +  struct gdbarch_tdep *tdep = gdbarch_tdep (get_regcache_arch (regcache));
> +  gdb_byte *regs = xsave;
> +  int i;
> +  gdb_byte raw[I386_MAX_REGISTER_SIZE];
> +
> +  gdb_assert (tdep->st0_regnum >= I386_ST0_REGNUM);
> +  gdb_assert (tdep->num_vector_regs > 0);
> +
> +  if (gcore)
> +    {
> +      /* Update XCR0 and `xstate_bv' with XCR0 for gcore.  */
> +      if (tdep->xsave_xcr0_offset != -1)
> +       memcpy (regs + tdep->xsave_xcr0_offset, &tdep->xcr0, 8);
> +      memcpy (XSAVE_XSTATE_BV_ADDR (regs), &tdep->xcr0, 8);
> +    }
> +  else
> +    {
> +      enum
> +       {
> +         none = 0x0,
> +         check = 0x1,
> +         x87 = 0x2 | check,
> +         vector = 0x4 | check,
> +         all = 0x8 | check
> +       } regclass;
> +
> +      if (regnum == -1)
> +       regclass = all;
> +      else if (regnum >= I387_VECTOR0_REGNUM (tdep)
> +              && regnum < I387_MXCSR_REGNUM (tdep))
> +       regclass = vector;
> +      else if (regnum >= I387_ST0_REGNUM (tdep)
> +              && regnum < I387_FCTRL_REGNUM (tdep))
> +       regclass = x87;
> +      else
> +       regclass = none;
> +
> +      if ((regclass & check))
> +       {
> +         gdb_byte *xstate_bv_p = XSAVE_XSTATE_BV_ADDR (regs);
> +         int num_vector_regs;
> +         unsigned int xstate_bv = 0;
> +         /* The supported bits in `xstate_bv' fit in 1 byte.  */
> +         unsigned int clear_bv = (~(*xstate_bv_p)) & tdep->xcr0;
> +
> +         /* Clear the registers of each component whose bit in
> +            `xstate_bv' is zero.  */
> +         if (clear_bv)
> +           {
> +             i = I387_VECTOR0_REGNUM (tdep);
> +             num_vector_regs = I387_NUM_VECTOR_REGS (tdep);
> +             for (; num_vector_regs; num_vector_regs--, i++)
> +               {
> +                 if ((clear_bv & bit_I386_XSTATE_AVX))
> +                   memset (XSAVE_AVXH_ADDR (tdep, regs, i), 0, 16);
> +                 if ((clear_bv & bit_I386_XSTATE_SSE))
> +                   memset (XSAVE_SSE_ADDR (tdep, regs, i), 0, 16);
> +               }
> +
> +             if ((clear_bv & bit_I386_XSTATE_X87))
> +               for (i = I387_ST0_REGNUM (tdep);
> +                    i < I387_FCTRL_REGNUM (tdep); i++)
> +                 memset (FXSAVE_ADDR (tdep, regs, i), 0, 10);
> +           }
> +
> +         if (regclass == all)
> +           {
> +             i = I387_VECTOR0_REGNUM (tdep);
> +             num_vector_regs = I387_NUM_VECTOR_REGS (tdep);
> +
> +             if ((tdep->xcr0 & I386_XSTATE_AVX_MASK)
> +                 == I386_XSTATE_AVX_MASK)
> +               {
> +                 /* Check if any AVX registers are changed.  */
> +                 for (; num_vector_regs; num_vector_regs--, i++)
> +                   {
> +                     regcache_raw_read ((struct regcache *) regcache,
> +                                        i, raw);
> +                     if (memcmp (raw + 16,
> +                                 XSAVE_AVXH_ADDR (tdep, regs, i), 16))
> +                       xstate_bv |= bit_I386_XSTATE_AVX;
> +                     if (memcmp (raw, XSAVE_SSE_ADDR (tdep, regs, i), 16))
> +                       xstate_bv |= bit_I386_XSTATE_SSE;
> +
> +                     if (xstate_bv
> +                         == (bit_I386_XSTATE_AVX | bit_I386_XSTATE_SSE))
> +                       break;
> +                   }
> +               }
> +             else
> +               {
> +                 /* Check if any SSE registers are changed.  */
> +                 for (; num_vector_regs; num_vector_regs--, i++)
> +                   {
> +                     regcache_raw_read ((struct regcache *) regcache,
> +                                        i, raw);
> +                     if (memcmp (raw, XSAVE_SSE_ADDR (tdep, regs, i), 16))
> +                       {
> +                         xstate_bv |= bit_I386_XSTATE_SSE;
> +                         break;
> +                       }
> +                   }
> +               }
> +
> +             /* Check if any X87 registers are changed.  */
> +             for (i = I387_ST0_REGNUM (tdep);
> +                  i < I387_FCTRL_REGNUM (tdep); i++)
> +               {
> +                 regcache_raw_read ((struct regcache *) regcache, i, raw);
> +                 if (memcmp (raw, FXSAVE_ADDR (tdep, regs, i), 10))
> +                   {
> +                     xstate_bv |= bit_I386_XSTATE_X87;
> +                     break;
> +                   }
> +               }
> +           }
> +         else
> +           {
> +             /* Check if REGNUM is changed.  */
> +             regcache_raw_read ((struct regcache *) regcache, regnum, raw);
> +
> +             if (regclass == x87)
> +               {
> +                 /* This is an x87 register.  */
> +                 if (memcmp (raw, FXSAVE_ADDR (tdep, regs, regnum), 10))
> +                   xstate_bv |= bit_I386_XSTATE_X87;
> +               }
> +             else
> +               {
> +                 /* This is an SSE/AVX register.  */
> +                 if ((tdep->xcr0 & I386_XSTATE_AVX_MASK)
> +                     == I386_XSTATE_AVX_MASK)
> +                   {
> +                     if (memcmp (raw + 16,
> +                                 XSAVE_AVXH_ADDR (tdep, regs, regnum), 16))
> +                       xstate_bv |= bit_I386_XSTATE_AVX;
> +                   }
> +
> +                 if (memcmp (raw, XSAVE_SSE_ADDR (tdep, regs, regnum), 16))
> +                   xstate_bv |= bit_I386_XSTATE_SSE;
> +               }
> +           }
> +
> +         /* Update the corresponding bits in `xstate_bv' if any
> +            x87/SSE/AVX registers are changed.  */
> +         if (xstate_bv)
> +           {
> +             /* The supported bits in `xstate_bv' fit in 1 byte.  */
> +             *xstate_bv_p |= (gdb_byte) xstate_bv;
> +
> +             /* Update REGNUM and return.  */
> +             if (regclass != all)
> +               {
> +                 if (regclass == x87)
> +                   {
> +                     /* x87 register.  */
> +                     memcpy (FXSAVE_ADDR (tdep, regs, regnum), raw, 10);
> +                   }
> +                 else
> +                   {
> +                     /* SSE/AVX register.  */
> +                     if ((xstate_bv & bit_I386_XSTATE_AVX))
> +                       memcpy (XSAVE_AVXH_ADDR (tdep, regs, regnum),
> +                               raw + 16, 16);
> +                     if ((xstate_bv & bit_I386_XSTATE_SSE))
> +                       memcpy (XSAVE_SSE_ADDR (tdep, regs, regnum), raw, 16);
> +                   }
> +                 return;
> +               }
> +           }
> +         else
> +           {
> +             /* Return if REGNUM isn't changed.  */
> +             if (regclass != all)
> +               return;
> +           }
> +       }
> +    }
> +
> +  for (i = I387_ST0_REGNUM (tdep); i < I387_MXCSR_REGNUM (tdep); i++)
> +    if (regnum == -1 || regnum == i)
> +      {
> +       /* Most of the FPU control registers occupy only 16 bits in
> +          the xsave extended state.  Give those a special treatment.  */
> +       if (i >= I387_FCTRL_REGNUM (tdep)
> +           && i < I387_VECTOR0_REGNUM (tdep)
> +           && i != I387_FIOFF_REGNUM (tdep)
> +           && i != I387_FOOFF_REGNUM (tdep))
> +         {
> +           gdb_byte buf[4];
> +
> +           regcache_raw_collect (regcache, i, buf);
> +
> +           if (i == I387_FOP_REGNUM (tdep))
> +             {
> +               /* The opcode occupies only 11 bits.  Make sure we
> +                   don't touch the other bits.  */
> +               buf[1] &= ((1 << 3) - 1);
> +               buf[1] |= ((FXSAVE_ADDR (tdep, regs, i))[1] & ~((1 << 3) - 1));
> +             }
> +           else if (i == I387_FTAG_REGNUM (tdep))
> +             {
> +               /* Converting back is much easier.  */
> +
> +               unsigned short ftag;
> +               int fpreg;
> +
> +               ftag = (buf[1] << 8) | buf[0];
> +               buf[0] = 0;
> +               buf[1] = 0;
> +
> +               for (fpreg = 7; fpreg >= 0; fpreg--)
> +                 {
> +                   int tag = (ftag >> (fpreg * 2)) & 3;
> +
> +                   if (tag != 3)
> +                     buf[0] |= (1 << fpreg);
> +                 }
> +             }
> +           memcpy (FXSAVE_ADDR (tdep, regs, i), buf, 2);
> +         }
> +       else if (i < I387_VECTOR0_REGNUM (tdep))
>          regcache_raw_collect (regcache, i, FXSAVE_ADDR (tdep, regs, i));
> +       else
> +         {
> +           if ((tdep->xcr0 & I386_XSTATE_AVX_MASK)
> +               == I386_XSTATE_AVX_MASK)
> +             {
> +               regcache_raw_collect (regcache, i, raw);
> +               memcpy (XSAVE_SSE_ADDR (tdep, regs, i), raw, 16);
> +               memcpy (XSAVE_AVXH_ADDR (tdep, regs, i),
> +                       raw + 16, 16);
> +             }
> +           else
> +             regcache_raw_collect (regcache, i,
> +                                   XSAVE_SSE_ADDR (tdep, regs, i));
> +         }
>       }
>
>   if (regnum == I387_MXCSR_REGNUM (tdep) || regnum == -1)
> diff --git a/gdb/i387-tdep.h b/gdb/i387-tdep.h
> index 645eb91..f867a1f 100644
> --- a/gdb/i387-tdep.h
> +++ b/gdb/i387-tdep.h
> @@ -31,7 +31,7 @@ struct ui_file;
>  #define I387_NUM_REGS  16
>
>  #define I387_ST0_REGNUM(tdep) ((tdep)->st0_regnum)
> -#define I387_NUM_XMM_REGS(tdep) ((tdep)->num_xmm_regs)
> +#define I387_NUM_VECTOR_REGS(tdep) ((tdep)->num_vector_regs)
>  #define I387_MM0_REGNUM(tdep) ((tdep)->mm0_regnum)
>
>  #define I387_FCTRL_REGNUM(tdep) (I387_ST0_REGNUM (tdep) + 8)
> @@ -42,9 +42,9 @@ struct ui_file;
>  #define I387_FOSEG_REGNUM(tdep) (I387_FCTRL_REGNUM (tdep) + 5)
>  #define I387_FOOFF_REGNUM(tdep) (I387_FCTRL_REGNUM (tdep) + 6)
>  #define I387_FOP_REGNUM(tdep) (I387_FCTRL_REGNUM (tdep) + 7)
> -#define I387_XMM0_REGNUM(tdep) (I387_ST0_REGNUM (tdep) + 16)
> +#define I387_VECTOR0_REGNUM(tdep) (I387_ST0_REGNUM (tdep) + 16)
>  #define I387_MXCSR_REGNUM(tdep) \
> -  (I387_XMM0_REGNUM (tdep) + I387_NUM_XMM_REGS (tdep))
> +  (I387_VECTOR0_REGNUM (tdep) + I387_NUM_VECTOR_REGS (tdep))
>
>  /* Print out the i387 floating point state.  */
>
> @@ -99,6 +99,11 @@ extern void i387_collect_fsave (const struct regcache *regcache, int regnum,
>  extern void i387_supply_fxsave (struct regcache *regcache, int regnum,
>                                const void *fxsave);
>
> +/* Similar to i387_supply_fxsave, but use XSAVE extended state.  */
> +
> +extern void i387_supply_xsave (struct regcache *regcache, int regnum,
> +                              const void *xsave);
> +
>  /* Fill register REGNUM (if it is a floating-point or SSE register) in
>    *FXSAVE with the value from REGCACHE.  If REGNUM is -1, do this for
>    all registers.  This function doesn't touch any of the reserved
> @@ -107,6 +112,11 @@ extern void i387_supply_fxsave (struct regcache *regcache, int regnum,
>  extern void i387_collect_fxsave (const struct regcache *regcache, int regnum,
>                                 void *fxsave);
>
> +/* Similar to i387_collect_fxsave, but use XSAVE extended state.  */
> +
> +extern void i387_collect_xsave (const struct regcache *regcache,
> +                               int regnum, void *xsave, int gcore);
> +
>  /* Prepare the FPU stack in REGCACHE for a function return.  */
>
>  extern void i387_return_value (struct gdbarch *gdbarch,
>


* Re: PATCH: 5/6: Add AVX support (i387 changes)
  2010-03-05  3:20       ` PATCH: 5/6: Add AVX support (i387 changes) Hui Zhu
@ 2010-03-05  3:54         ` H.J. Lu
  0 siblings, 0 replies; 115+ messages in thread
From: H.J. Lu @ 2010-03-05  3:54 UTC (permalink / raw)
  To: Hui Zhu; +Cc: GDB

On Thu, Mar 4, 2010 at 7:19 PM, Hui Zhu <teawater@gmail.com> wrote:
> -#define I387_XMM0_REGNUM(tdep) (I387_ST0_REGNUM (tdep) + 16)
> +#define I387_VECTOR0_REGNUM(tdep) (I387_ST0_REGNUM (tdep) + 16)
>
> We need this name change?
>

I will change it.

-- 
H.J.


* Re: PATCH: 1/6: Add AVX support
  2010-03-05  2:06           ` H.J. Lu
@ 2010-03-05  7:29             ` Mark Kettenis
  0 siblings, 0 replies; 115+ messages in thread
From: Mark Kettenis @ 2010-03-05  7:29 UTC (permalink / raw)
  To: hjl.tools; +Cc: hjl.tools, gdb-patches

> Date: Thu, 4 Mar 2010 18:06:11 -0800
> From: "H.J. Lu" <hjl.tools@gmail.com>
> 
> On Thu, Mar 4, 2010 at 1:47 PM, Daniel Jacobowitz <dan@codesourcery.com> wrote:
> > On Thu, Mar 04, 2010 at 01:27:09PM -0800, H.J. Lu wrote:
> >> > No, it will fail to display SSE.  Core debugging should still be
> >> > possible, and the newly added registers will be visible too.  If
> >> > that's not the case, fix GDB to function with the SSE registers
> >> > missing.
> >>
> >> Your description only works for truly NEW registers, which
> >> AVX registers aren't.  AVX registers are actually the old SSE
> >> registers with different names.
> >
> > I'm trying to get you to think about compatibility in the
> > descriptions, instead of separately in the remote protocol.
> > There are always ways to solve it.  For instance, you could present
> > both the AVX registers and the hypothetical newer, larger registers as
> > separate things.  As long as the P packet is implemented, which it is,
> > GDB should work OK if modifying one register changes another.
> > I don't know if there's an example of this in the GDB sources, but I
> > have one in my tree; there's $sp, $sp_user, and $sp_system registers,
> > and $sp is the same as one of the other two depending on processor
> > mode.  But they're all visible.
> >
> > Another solution is to define new registers which correspond to the
> > added bits, and have a sufficiently recent GDB synthesize the combined
> > registers from the AVX registers and the new bits.  This, for
> > instance, is how the Power E500 registers are handled
> > (rs6000/power-spe.xml).
> >
> 
> OK, I will try SPE approach. It will take a while.

Wait, please.  I'll reply later when I have a bit more time, but I
don't think this will be a good idea.


* Re: PATCH: 2/6: Add AVX support (Update document)
  2010-03-04 18:05 ` PATCH: 2/6: Add AVX support (Update document) H.J. Lu
  2010-03-04 18:06   ` PATCH: 3/6: Add AVX support (i386 changes) H.J. Lu
  2010-03-04 18:08   ` PATCH: 4/6: Add AVX support (amd64 changes) H.J. Lu
@ 2010-03-05 10:33   ` Eli Zaretskii
  2010-03-05 14:08     ` H.J. Lu
  2010-03-06 22:19   ` PATCH: 2/6 [2nd try]: " H.J. Lu
  3 siblings, 1 reply; 115+ messages in thread
From: Eli Zaretskii @ 2010-03-05 10:33 UTC (permalink / raw)
  To: H.J. Lu; +Cc: gdb-patches

> Date: Thu, 4 Mar 2010 10:04:08 -0800
> From: "H.J. Lu" <hongjiu.lu@intel.com>
> 
> This patch updates document for AVX support.  OK to install?

Is it still relevant?

I will review this anyway, in the hope that it will help you submit
the fixed patch down the way.

> +@item x86:xstate=@var{bytes}:xcr0=@var{value}
> +This feature indicates that @value{GDBN} supports x86 XSAVE extended

It will look prettier in print if you use @sc{xsave} instead of
XSAVE.  (The result in the Info manual is the same.)

> +state. @var{bytes} specifies the maximum size in bytes of x86 XSAVE
        ^^
Two spaces between sentences, please (here and elsewhere in your
patch).

> +extended state @value{GDBN} supports. @var{value} specifies the
> +maximum value of the extended control register 0 (the
> +XFEATURE_ENABLED_MASK register) @value{GDBN} supports.  The stub should

XFEATURE_ENABLED_MASK is a C symbol, so it should be in @code.

Thanks.


* Re: PATCH: 2/6: Add AVX support (Update document)
  2010-03-05 10:33   ` PATCH: 2/6: Add AVX support (Update document) Eli Zaretskii
@ 2010-03-05 14:08     ` H.J. Lu
  0 siblings, 0 replies; 115+ messages in thread
From: H.J. Lu @ 2010-03-05 14:08 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gdb-patches

On Fri, Mar 5, 2010 at 2:33 AM, Eli Zaretskii <eliz@gnu.org> wrote:
>> Date: Thu, 4 Mar 2010 10:04:08 -0800
>> From: "H.J. Lu" <hongjiu.lu@intel.com>
>>
>> This patch updates document for AVX support.  OK to install?
>
> Is it still relevant?
>
> I will review this anyway, in the hope that it will help you submit
> the fixed patch down the way.

Thanks. I will keep it in mind.

>> +@item x86:xstate=@var{bytes}:xcr0=@var{value}
>> +This feature indicates that @value{GDBN} supports x86 XSAVE extended
>
> It will look prettier in print if you use @sc{xsave} instead of
> XSAVE.  (The result in the Info manual is the same.)
>
>> +state. @var{bytes} specifies the maximum size in bytes of x86 XSAVE
>        ^^
> Two spaces between sentences, please (here and elsewhere in your
> patch).
>
>> +extended state @value{GDBN} supports. @var{value} specifies the
>> +maximum value of the extended control register 0 (the
>> +XFEATURE_ENABLED_MASK register) @value{GDBN} supports.  The stub should
>
> XFEATURE_ENABLED_MASK is a C symbol, so it should be in @code.
>
> Thanks.
>



-- 
H.J.


* PATCH: 0/6 [2nd try]: Add AVX support
  2010-03-04 18:02 PATCH: 1/6: Add AVX support H.J. Lu
  2010-03-04 18:05 ` PATCH: 2/6: Add AVX support (Update document) H.J. Lu
  2010-03-04 19:09 ` PATCH: 1/6: Add AVX support Daniel Jacobowitz
@ 2010-03-06 22:16 ` H.J. Lu
  2010-03-06 22:18   ` PATCH: 1/6 [2nd try]: Add AVX support (AVX XML files) H.J. Lu
                     ` (3 more replies)
  2 siblings, 4 replies; 115+ messages in thread
From: H.J. Lu @ 2010-03-06 22:16 UTC (permalink / raw)
  To: GDB

AVX registers are saved and restored via the XSAVE extended state.  The
extended control register 0, XCR0 (the XFEATURE_ENABLED_MASK register),
determines which states (x87, SSE, AVX, ...) are supported
in the XSAVE extended state.  XCR0 can be read with the new "xgetbv"
instruction.  The xstate_bv field at byte offset 512 in the XSAVE
extended state indicates which states the current process is using.  If
a feature bit is clear, the corresponding registers should be read as
0.  If we update a register, we should set the corresponding feature
bit in the xstate_bv field.

We added PTRACE_GETREGSET and PTRACE_SETREGSET to the Linux kernel to
fetch and store AVX registers with ptrace.  The Linux kernel also stores
XCR0 in the first 8 bytes of the software-usable bytes, starting at
byte offset 464.
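
As a rough illustration of the layout described above, here is a minimal
sketch in C.  Only the two byte offsets (464 and 512) come from this
message; the bit assignments are the architectural XCR0 bits, and every
helper name is hypothetical and not part of the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Offsets quoted in the description above.  */
#define XSAVE_SW_XCR0_OFFSET   464  /* Linux: copy of XCR0 in the software-usable bytes.  */
#define XSAVE_XSTATE_BV_OFFSET 512  /* xstate_bv, first field after the FXSAVE area.  */

/* Architectural feature bits shared by XCR0 and xstate_bv.  */
#define XSTATE_X87 (1ULL << 0)
#define XSTATE_SSE (1ULL << 1)
#define XSTATE_AVX (1ULL << 2)

/* Hypothetical helper: read a 64-bit little-endian field at OFFSET.  */
static uint64_t
xsave_read_u64 (const unsigned char *xsave, size_t offset)
{
  uint64_t value = 0;
  int i;

  for (i = 7; i >= 0; i--)
    value = (value << 8) | xsave[offset + i];
  return value;
}

/* Hypothetical helper: write a 64-bit little-endian field at OFFSET.  */
static void
xsave_write_u64 (unsigned char *xsave, size_t offset, uint64_t value)
{
  int i;

  for (i = 0; i < 8; i++)
    xsave[offset + i] = (unsigned char) (value >> (8 * i));
}

/* XCR0 as saved by the Linux kernel in the software-usable bytes.  */
static uint64_t
xsave_kernel_xcr0 (const unsigned char *xsave)
{
  return xsave_read_u64 (xsave, XSAVE_SW_XCR0_OFFSET);
}

/* Nonzero if FEATURE is marked in xstate_bv.  When this returns zero,
   the registers covered by FEATURE should be read as 0.  */
static int
xsave_feature_in_use (const unsigned char *xsave, uint64_t feature)
{
  return (xsave_read_u64 (xsave, XSAVE_XSTATE_BV_OFFSET) & feature) != 0;
}

/* Set FEATURE in xstate_bv after one of its registers has been updated.  */
static void
xsave_mark_feature_used (unsigned char *xsave, uint64_t feature)
{
  uint64_t bv = xsave_read_u64 (xsave, XSAVE_XSTATE_BV_OFFSET);

  xsave_write_u64 (xsave, XSAVE_XSTATE_BV_OFFSET, bv | feature);
}
```

A caller would check xsave_feature_in_use before supplying a register
and call xsave_mark_feature_used after collecting one into the buffer.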

There are 6 patches in total to add AVX support for Linux.  They support:

1. The upper 128 bits of the YMM registers are added for AVX support.
They are hidden from users.  GDB combines each XMM register,
%xmmX, with the upper 128-bit half, %ymmXh, and presents the whole 256-bit
YMM register, %ymmX, as a pseudo register to users.
2. Backward compatibility.  If AVX isn't supported, SSE will be used.
3. Forward compatibility.  If new state beyond AVX is supported in
the XSAVE extended state, only AVX state will be used.
4. Remote GDB protocol extension.  GDB will send "x86=xml" in the qSupported
request packet to indicate that GDB supports x86 XML target descriptions.
The GDB stub will send an x86 XML target description if it sees "x86=xml"
in the qSupported request packet.
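
For item 4, the stub-side check could look roughly like the sketch
below.  It assumes the usual qSupported layout of semicolon-separated
entries after "qSupported:"; the function name is hypothetical and this
is not the code in the gdbserver patch.

```c
#include <string.h>

/* Return 1 if the qSupported packet PACKET lists FEATURE (for example
   "x86=xml") as one of its semicolon-separated entries, 0 otherwise.  */
static int
qsupported_has_feature (const char *packet, const char *feature)
{
  static const char prefix[] = "qSupported";
  size_t flen = strlen (feature);
  const char *p;

  if (strncmp (packet, prefix, sizeof prefix - 1) != 0)
    return 0;

  p = packet + sizeof prefix - 1;
  if (*p != ':')
    return 0;			/* Bare "qSupported" without a feature list.  */
  p++;

  while (*p != '\0')
    {
      size_t len = strcspn (p, ";");	/* Length of the current entry.  */

      if (len == flen && strncmp (p, feature, flen) == 0)
	return 1;
      p += len;
      if (*p == ';')
	p++;
    }
  return 0;
}
```

Comparing whole entries rather than using strstr avoids a false match
on a longer entry that merely starts with the same text.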

One advantage of this approach is that the XSAVE extended state actually
stores each YMM register as an XMM register plus an upper YMM half.  It
is easy and natural to access them as %xmmX and %ymmXh internally.  We
just need to hide %ymmXh from users.
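
The combination of %xmmX and %ymmXh into the %ymmX pseudo register can
be sketched as below.  This only illustrates the byte layout (x86 is
little-endian, so the XMM half comes first); the function names are
hypothetical and the real pseudo register read/write hooks in the
patches go through the regcache instead of plain buffers.

```c
#include <string.h>

#define XMM_SIZE 16		/* Bytes in %xmmX and in %ymmXh.  */
#define YMM_SIZE 32		/* Bytes in the %ymmX pseudo register.  */

/* Build the 256-bit pseudo register YMM from its two raw halves:
   bytes 0..15 come from XMM, bytes 16..31 from YMMH.  */
static void
ymm_pseudo_read (const unsigned char *xmm, const unsigned char *ymmh,
		 unsigned char *ymm)
{
  memcpy (ymm, xmm, XMM_SIZE);
  memcpy (ymm + XMM_SIZE, ymmh, XMM_SIZE);
}

/* Split a 256-bit pseudo register value YMM back into the two raw
   registers that are actually stored.  */
static void
ymm_pseudo_write (const unsigned char *ymm, unsigned char *xmm,
		  unsigned char *ymmh)
{
  memcpy (xmm, ymm, XMM_SIZE);
  memcpy (ymmh, ymm + XMM_SIZE, XMM_SIZE);
}
```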

To support AVX on other OSes, the following changes are needed:

1. Kernel support to get/set the XSAVE extended state.
2. Handle the 8 (32-bit) or 16 (64-bit) upper YMM registers.
3. Provide a target to_read_description method that returns the SSE or
AVX target description.
4. Update gdbarch_core_read_description to return the SSE or AVX target
description based on the contents of the core dump.
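
The choice between the SSE and AVX descriptions in steps 3 and 4, and
the backward/forward compatibility rules listed earlier, might be
expressed as in the sketch below.  The enum and function names are
hypothetical; only the XCR0 bit positions are architectural.

```c
#include <stdint.h>

/* Architectural XCR0 feature bits.  */
#define XSTATE_X87 (1ULL << 0)
#define XSTATE_SSE (1ULL << 1)
#define XSTATE_AVX (1ULL << 2)

/* Hypothetical stand-ins for the real target descriptions.  */
enum x86_tdesc_kind
{
  X86_TDESC_SSE,
  X86_TDESC_AVX
};

/* Pick a target description from XCR0.  Bits above AVX are ignored, so
   a newer CPU still gets the AVX description (forward compatible), and
   a CPU or kernel without AVX enabled falls back to SSE (backward
   compatible).  */
static enum x86_tdesc_kind
x86_select_tdesc (uint64_t xcr0)
{
  const uint64_t avx_mask = XSTATE_X87 | XSTATE_SSE | XSTATE_AVX;

  if ((xcr0 & avx_mask) == avx_mask)
    return X86_TDESC_AVX;
  return X86_TDESC_SSE;
}
```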



H.J.


* PATCH: 1/6 [2nd try]: Add AVX support (AVX XML files)
  2010-03-06 22:16 ` PATCH: 0/6 [2nd try]: " H.J. Lu
@ 2010-03-06 22:18   ` H.J. Lu
  2010-03-07 14:16   ` PATCH: 0/6 [2nd try]: Add AVX support Mark Kettenis
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 115+ messages in thread
From: H.J. Lu @ 2010-03-06 22:18 UTC (permalink / raw)
  To: GDB

Hi,

This patch adds AVX XML files.  OK to install?

Thanks.



H.J.
---
2010-03-06  H.J. Lu  <hongjiu.lu@intel.com>

	* config/djgpp/fnchange.lst: Add x86 AVX XML files.

	* features/Makefile (WHICH): Add i386/i386-avx,
	i386/i386-avx-linux, i386/amd64-avx and i386/amd64-avx-linux.
	(i386/i386-avx-expedite): New.
	(i386/i386-avx-linux-expedite): Likewise.
	(i386/amd64-avx-expedite): Likewise.
	(i386/amd64-avx-linux-expedite): Likewise.
	($(outdir)/i386/i386-avx.dat): New dependency.
	($(outdir)/i386/i386-avx-linux.dat): Likewise.
	($(outdir)/i386/amd64-avx.dat): Likewise.
	($(outdir)/i386/amd64-avx-linux.dat): Likewise.

	* features/i386/32bit-avx.xml: New.
	* features/i386/64bit-avx.xml: Likewise.
	* features/i386/i386-avx-linux.c: Likewise.
	* features/i386/i386-avx-linux.xml: Likewise.
	* features/i386/i386-avx.c: Likewise.
	* features/i386/i386-avx.xml: Likewise.
	* features/i386/amd64-avx-linux.c: Likewise.
	* features/i386/amd64-avx-linux.xml: Likewise.
	* features/i386/amd64-avx.c: Likewise.
	* features/i386/amd64-avx.xml: Likewise.
	* regformats/i386/i386-avx-linux.dat: Likewise.
	* regformats/i386/i386-avx.dat: Likewise.
	* regformats/i386/amd64-avx-linux.dat: Likewise.
	* regformats/i386/amd64-avx.dat: Likewise.

diff --git a/gdb/config/djgpp/fnchange.lst b/gdb/config/djgpp/fnchange.lst
index 3982f1d..7bec57d 100644
--- a/gdb/config/djgpp/fnchange.lst
+++ b/gdb/config/djgpp/fnchange.lst
@@ -228,6 +228,14 @@
 @V@/gdb/features/rs6000/powerpc-vsx64l.xml @V@/gdb/features/rs6000/ppc-v64l.xml
 @V@/gdb/features/rs6000/powerpc-cell32l.xml @V@/gdb/features/rs6000/ppc-c32l.xml
 @V@/gdb/features/rs6000/powerpc-cell64l.xml @V@/gdb/features/rs6000/ppc-c64l.xml
+@V@/gdb/features/i386/amd64-avx-linux.c @V@/gdb/features/i386/a64-al.c
+@V@/gdb/features/i386/amd64-avx.c @V@/gdb/features/i386/a64-a.c
+@V@/gdb/features/i386/amd64-avx-linux.xml @V@/gdb/features/i386/a64-al.xml
+@V@/gdb/features/i386/amd64-avx.xml @V@/gdb/features/i386/a64-a.xml
+@V@/gdb/features/i386/i386-avx-linux.c @V@/gdb/features/i386/i32-al.c
+@V@/gdb/features/i386/i386-avx.c @V@/gdb/features/i386/i32-a.c
+@V@/gdb/features/i386/i386-avx-linux.xml @V@/gdb/features/i386/i32-al.xml
+@V@/gdb/features/i386/i386-avx.xml @V@/gdb/features/i386/i32-a.xml
 @V@/gdb/f-exp.tab.c @V@/gdb/f-exp_tab.c
 @V@/gdb/gdbserver/linux-cris-low.c @V@/gdb/gdbserver/lx-cris.c
 @V@/gdb/gdbserver/linux-crisv32-low.c @V@/gdb/gdbserver/lx-cris32.c
diff --git a/gdb/features/Makefile b/gdb/features/Makefile
index b00800c..1166582 100644
--- a/gdb/features/Makefile
+++ b/gdb/features/Makefile
@@ -33,6 +33,8 @@
 WHICH = arm-with-iwmmxt arm-with-vfpv2 arm-with-vfpv3 arm-with-neon \
 	i386/i386 i386/i386-linux \
 	i386/amd64 i386/amd64-linux \
+	i386/i386-avx i386/i386-avx-linux \
+	i386/amd64-avx i386/amd64-avx-linux \
 	mips-linux mips64-linux \
 	rs6000/powerpc-32l rs6000/powerpc-altivec32l rs6000/powerpc-e500l \
 	rs6000/powerpc-64l rs6000/powerpc-altivec64l rs6000/powerpc-vsx32l \
@@ -45,6 +47,10 @@ i386/i386-expedite = ebp,esp,eip
 i386/i386-linux-expedite = ebp,esp,eip
 i386/amd64-expedite = rbp,rsp,rip
 i386/amd64-linux-expedite = rbp,rsp,rip
+i386/i386-avx-expedite = ebp,esp,eip
+i386/i386-avx-linux-expedite = ebp,esp,eip
+i386/amd64-avx-expedite = rbp,rsp,rip
+i386/amd64-avx-linux-expedite = rbp,rsp,rip
 mips-expedite = r29,pc
 mips64-expedite = r29,pc
 powerpc-expedite = r1,pc
@@ -90,3 +96,9 @@ $(outdir)/i386/i386-linux.dat: i386/32bit-core.xml i386/32bit-sse.xml \
 $(outdir)/i386/amd64.dat: i386/64bit-core.xml i386/64bit-sse.xml
 $(outdir)/i386/amd64-linux.dat: i386/64bit-core.xml i386/64bit-sse.xml \
 			        i386/64bit-linux.xml
+$(outdir)/i386/i386-avx.dat: i386/32bit-core.xml i386/32bit-avx.xml
+$(outdir)/i386/i386-avx-linux.dat: i386/32bit-core.xml i386/32bit-avx.xml \
+			       i386/32bit-linux.xml
+$(outdir)/i386/amd64-avx.dat: i386/64bit-core.xml i386/64bit-avx.xml
+$(outdir)/i386/amd64-avx-linux.dat: i386/64bit-core.xml i386/64bit-avx.xml \
+				    i386/64bit-linux.xml
diff --git a/gdb/features/i386/32bit-avx.xml b/gdb/features/i386/32bit-avx.xml
new file mode 100644
index 0000000..8e8213e
--- /dev/null
+++ b/gdb/features/i386/32bit-avx.xml
@@ -0,0 +1,18 @@
+<?xml version="1.0"?>
+<!-- Copyright (C) 2010 Free Software Foundation, Inc.
+
+     Copying and distribution of this file, with or without modification,
+     are permitted in any medium without royalty provided the copyright
+     notice and this notice are preserved.  -->
+
+<!DOCTYPE feature SYSTEM "gdb-target.dtd">
+<feature name="org.gnu.gdb.i386.avx">
+  <reg name="ymm0h" bitsize="128" type="uint128"/>
+  <reg name="ymm1h" bitsize="128" type="uint128"/>
+  <reg name="ymm2h" bitsize="128" type="uint128"/>
+  <reg name="ymm3h" bitsize="128" type="uint128"/>
+  <reg name="ymm4h" bitsize="128" type="uint128"/>
+  <reg name="ymm5h" bitsize="128" type="uint128"/>
+  <reg name="ymm6h" bitsize="128" type="uint128"/>
+  <reg name="ymm7h" bitsize="128" type="uint128"/>
+</feature>
diff --git a/gdb/features/i386/64bit-avx.xml b/gdb/features/i386/64bit-avx.xml
new file mode 100644
index 0000000..7827e72
--- /dev/null
+++ b/gdb/features/i386/64bit-avx.xml
@@ -0,0 +1,26 @@
+<?xml version="1.0"?>
+<!-- Copyright (C) 2010 Free Software Foundation, Inc.
+
+     Copying and distribution of this file, with or without modification,
+     are permitted in any medium without royalty provided the copyright
+     notice and this notice are preserved.  -->
+
+<!DOCTYPE feature SYSTEM "gdb-target.dtd">
+<feature name="org.gnu.gdb.i386.avx">
+  <reg name="ymm0h" bitsize="128" type="uint128"/>
+  <reg name="ymm1h" bitsize="128" type="uint128"/>
+  <reg name="ymm2h" bitsize="128" type="uint128"/>
+  <reg name="ymm3h" bitsize="128" type="uint128"/>
+  <reg name="ymm4h" bitsize="128" type="uint128"/>
+  <reg name="ymm5h" bitsize="128" type="uint128"/>
+  <reg name="ymm6h" bitsize="128" type="uint128"/>
+  <reg name="ymm7h" bitsize="128" type="uint128"/>
+  <reg name="ymm8h" bitsize="128" type="uint128"/>
+  <reg name="ymm9h" bitsize="128" type="uint128"/>
+  <reg name="ymm10h" bitsize="128" type="uint128"/>
+  <reg name="ymm11h" bitsize="128" type="uint128"/>
+  <reg name="ymm12h" bitsize="128" type="uint128"/>
+  <reg name="ymm13h" bitsize="128" type="uint128"/>
+  <reg name="ymm14h" bitsize="128" type="uint128"/>
+  <reg name="ymm15h" bitsize="128" type="uint128"/>
+</feature>
diff --git a/gdb/features/i386/Makefile b/gdb/features/i386/Makefile
new file mode 100644
index 0000000..f94beb6
--- /dev/null
+++ b/gdb/features/i386/Makefile
@@ -0,0 +1,5 @@
+include Makefile
+
+XMLTOC = $(addsuffix .xml, $(filter i386/%, $(WHICH)))
+CFILES = $(patsubst %.xml,%.c,$(XMLTOC))
+cfiles: $(CFILES)
diff --git a/gdb/features/i386/amd64-avx-linux.c b/gdb/features/i386/amd64-avx-linux.c
new file mode 100644
index 0000000..73392d3
--- /dev/null
+++ b/gdb/features/i386/amd64-avx-linux.c
@@ -0,0 +1,171 @@
+/* THIS FILE IS GENERATED.  Original: amd64-avx-linux.xml */
+
+#include "defs.h"
+#include "osabi.h"
+#include "target-descriptions.h"
+
+struct target_desc *tdesc_amd64_avx_linux;
+static void
+initialize_tdesc_amd64_avx_linux (void)
+{
+  struct target_desc *result = allocate_target_description ();
+  struct tdesc_feature *feature;
+  struct tdesc_type *field_type, *type;
+
+  set_tdesc_architecture (result, bfd_scan_arch ("i386:x86-64"));
+
+  set_tdesc_osabi (result, osabi_from_tdesc_string ("GNU/Linux"));
+
+  feature = tdesc_create_feature (result, "org.gnu.gdb.i386.core");
+  field_type = tdesc_create_flags (feature, "i386_eflags", 4);
+  tdesc_add_flag (field_type, 0, "CF");
+  tdesc_add_flag (field_type, 1, "");
+  tdesc_add_flag (field_type, 2, "PF");
+  tdesc_add_flag (field_type, 4, "AF");
+  tdesc_add_flag (field_type, 6, "ZF");
+  tdesc_add_flag (field_type, 7, "SF");
+  tdesc_add_flag (field_type, 8, "TF");
+  tdesc_add_flag (field_type, 9, "IF");
+  tdesc_add_flag (field_type, 10, "DF");
+  tdesc_add_flag (field_type, 11, "OF");
+  tdesc_add_flag (field_type, 14, "NT");
+  tdesc_add_flag (field_type, 16, "RF");
+  tdesc_add_flag (field_type, 17, "VM");
+  tdesc_add_flag (field_type, 18, "AC");
+  tdesc_add_flag (field_type, 19, "VIF");
+  tdesc_add_flag (field_type, 20, "VIP");
+  tdesc_add_flag (field_type, 21, "ID");
+
+  tdesc_create_reg (feature, "rax", 0, 1, NULL, 64, "int64");
+  tdesc_create_reg (feature, "rbx", 1, 1, NULL, 64, "int64");
+  tdesc_create_reg (feature, "rcx", 2, 1, NULL, 64, "int64");
+  tdesc_create_reg (feature, "rdx", 3, 1, NULL, 64, "int64");
+  tdesc_create_reg (feature, "rsi", 4, 1, NULL, 64, "int64");
+  tdesc_create_reg (feature, "rdi", 5, 1, NULL, 64, "int64");
+  tdesc_create_reg (feature, "rbp", 6, 1, NULL, 64, "data_ptr");
+  tdesc_create_reg (feature, "rsp", 7, 1, NULL, 64, "data_ptr");
+  tdesc_create_reg (feature, "r8", 8, 1, NULL, 64, "int64");
+  tdesc_create_reg (feature, "r9", 9, 1, NULL, 64, "int64");
+  tdesc_create_reg (feature, "r10", 10, 1, NULL, 64, "int64");
+  tdesc_create_reg (feature, "r11", 11, 1, NULL, 64, "int64");
+  tdesc_create_reg (feature, "r12", 12, 1, NULL, 64, "int64");
+  tdesc_create_reg (feature, "r13", 13, 1, NULL, 64, "int64");
+  tdesc_create_reg (feature, "r14", 14, 1, NULL, 64, "int64");
+  tdesc_create_reg (feature, "r15", 15, 1, NULL, 64, "int64");
+  tdesc_create_reg (feature, "rip", 16, 1, NULL, 64, "code_ptr");
+  tdesc_create_reg (feature, "eflags", 17, 1, NULL, 32, "i386_eflags");
+  tdesc_create_reg (feature, "cs", 18, 1, NULL, 32, "int32");
+  tdesc_create_reg (feature, "ss", 19, 1, NULL, 32, "int32");
+  tdesc_create_reg (feature, "ds", 20, 1, NULL, 32, "int32");
+  tdesc_create_reg (feature, "es", 21, 1, NULL, 32, "int32");
+  tdesc_create_reg (feature, "fs", 22, 1, NULL, 32, "int32");
+  tdesc_create_reg (feature, "gs", 23, 1, NULL, 32, "int32");
+  tdesc_create_reg (feature, "st0", 24, 1, NULL, 80, "i387_ext");
+  tdesc_create_reg (feature, "st1", 25, 1, NULL, 80, "i387_ext");
+  tdesc_create_reg (feature, "st2", 26, 1, NULL, 80, "i387_ext");
+  tdesc_create_reg (feature, "st3", 27, 1, NULL, 80, "i387_ext");
+  tdesc_create_reg (feature, "st4", 28, 1, NULL, 80, "i387_ext");
+  tdesc_create_reg (feature, "st5", 29, 1, NULL, 80, "i387_ext");
+  tdesc_create_reg (feature, "st6", 30, 1, NULL, 80, "i387_ext");
+  tdesc_create_reg (feature, "st7", 31, 1, NULL, 80, "i387_ext");
+  tdesc_create_reg (feature, "fctrl", 32, 1, "float", 32, "int");
+  tdesc_create_reg (feature, "fstat", 33, 1, "float", 32, "int");
+  tdesc_create_reg (feature, "ftag", 34, 1, "float", 32, "int");
+  tdesc_create_reg (feature, "fiseg", 35, 1, "float", 32, "int");
+  tdesc_create_reg (feature, "fioff", 36, 1, "float", 32, "int");
+  tdesc_create_reg (feature, "foseg", 37, 1, "float", 32, "int");
+  tdesc_create_reg (feature, "fooff", 38, 1, "float", 32, "int");
+  tdesc_create_reg (feature, "fop", 39, 1, "float", 32, "int");
+
+  feature = tdesc_create_feature (result, "org.gnu.gdb.i386.sse");
+  field_type = tdesc_named_type (feature, "ieee_single");
+  tdesc_create_vector (feature, "v4f", field_type, 4);
+
+  field_type = tdesc_named_type (feature, "ieee_double");
+  tdesc_create_vector (feature, "v2d", field_type, 2);
+
+  field_type = tdesc_named_type (feature, "int8");
+  tdesc_create_vector (feature, "v16i8", field_type, 16);
+
+  field_type = tdesc_named_type (feature, "int16");
+  tdesc_create_vector (feature, "v8i16", field_type, 8);
+
+  field_type = tdesc_named_type (feature, "int32");
+  tdesc_create_vector (feature, "v4i32", field_type, 4);
+
+  field_type = tdesc_named_type (feature, "int64");
+  tdesc_create_vector (feature, "v2i64", field_type, 2);
+
+  type = tdesc_create_union (feature, "vec128");
+  field_type = tdesc_named_type (feature, "v4f");
+  tdesc_add_field (type, "v4_float", field_type);
+  field_type = tdesc_named_type (feature, "v2d");
+  tdesc_add_field (type, "v2_double", field_type);
+  field_type = tdesc_named_type (feature, "v16i8");
+  tdesc_add_field (type, "v16_int8", field_type);
+  field_type = tdesc_named_type (feature, "v8i16");
+  tdesc_add_field (type, "v8_int16", field_type);
+  field_type = tdesc_named_type (feature, "v4i32");
+  tdesc_add_field (type, "v4_int32", field_type);
+  field_type = tdesc_named_type (feature, "v2i64");
+  tdesc_add_field (type, "v2_int64", field_type);
+  field_type = tdesc_named_type (feature, "uint128");
+  tdesc_add_field (type, "uint128", field_type);
+
+  field_type = tdesc_create_flags (feature, "i386_mxcsr", 4);
+  tdesc_add_flag (field_type, 0, "IE");
+  tdesc_add_flag (field_type, 1, "DE");
+  tdesc_add_flag (field_type, 2, "ZE");
+  tdesc_add_flag (field_type, 3, "OE");
+  tdesc_add_flag (field_type, 4, "UE");
+  tdesc_add_flag (field_type, 5, "PE");
+  tdesc_add_flag (field_type, 6, "DAZ");
+  tdesc_add_flag (field_type, 7, "IM");
+  tdesc_add_flag (field_type, 8, "DM");
+  tdesc_add_flag (field_type, 9, "ZM");
+  tdesc_add_flag (field_type, 10, "OM");
+  tdesc_add_flag (field_type, 11, "UM");
+  tdesc_add_flag (field_type, 12, "PM");
+  tdesc_add_flag (field_type, 15, "FZ");
+
+  tdesc_create_reg (feature, "xmm0", 40, 1, NULL, 128, "vec128");
+  tdesc_create_reg (feature, "xmm1", 41, 1, NULL, 128, "vec128");
+  tdesc_create_reg (feature, "xmm2", 42, 1, NULL, 128, "vec128");
+  tdesc_create_reg (feature, "xmm3", 43, 1, NULL, 128, "vec128");
+  tdesc_create_reg (feature, "xmm4", 44, 1, NULL, 128, "vec128");
+  tdesc_create_reg (feature, "xmm5", 45, 1, NULL, 128, "vec128");
+  tdesc_create_reg (feature, "xmm6", 46, 1, NULL, 128, "vec128");
+  tdesc_create_reg (feature, "xmm7", 47, 1, NULL, 128, "vec128");
+  tdesc_create_reg (feature, "xmm8", 48, 1, NULL, 128, "vec128");
+  tdesc_create_reg (feature, "xmm9", 49, 1, NULL, 128, "vec128");
+  tdesc_create_reg (feature, "xmm10", 50, 1, NULL, 128, "vec128");
+  tdesc_create_reg (feature, "xmm11", 51, 1, NULL, 128, "vec128");
+  tdesc_create_reg (feature, "xmm12", 52, 1, NULL, 128, "vec128");
+  tdesc_create_reg (feature, "xmm13", 53, 1, NULL, 128, "vec128");
+  tdesc_create_reg (feature, "xmm14", 54, 1, NULL, 128, "vec128");
+  tdesc_create_reg (feature, "xmm15", 55, 1, NULL, 128, "vec128");
+  tdesc_create_reg (feature, "mxcsr", 56, 1, "vector", 32, "i386_mxcsr");
+
+  feature = tdesc_create_feature (result, "org.gnu.gdb.i386.linux");
+  tdesc_create_reg (feature, "orig_rax", 57, 1, NULL, 64, "int");
+
+  feature = tdesc_create_feature (result, "org.gnu.gdb.i386.avx");
+  tdesc_create_reg (feature, "ymm0h", 58, 1, NULL, 128, "uint128");
+  tdesc_create_reg (feature, "ymm1h", 59, 1, NULL, 128, "uint128");
+  tdesc_create_reg (feature, "ymm2h", 60, 1, NULL, 128, "uint128");
+  tdesc_create_reg (feature, "ymm3h", 61, 1, NULL, 128, "uint128");
+  tdesc_create_reg (feature, "ymm4h", 62, 1, NULL, 128, "uint128");
+  tdesc_create_reg (feature, "ymm5h", 63, 1, NULL, 128, "uint128");
+  tdesc_create_reg (feature, "ymm6h", 64, 1, NULL, 128, "uint128");
+  tdesc_create_reg (feature, "ymm7h", 65, 1, NULL, 128, "uint128");
+  tdesc_create_reg (feature, "ymm8h", 66, 1, NULL, 128, "uint128");
+  tdesc_create_reg (feature, "ymm9h", 67, 1, NULL, 128, "uint128");
+  tdesc_create_reg (feature, "ymm10h", 68, 1, NULL, 128, "uint128");
+  tdesc_create_reg (feature, "ymm11h", 69, 1, NULL, 128, "uint128");
+  tdesc_create_reg (feature, "ymm12h", 70, 1, NULL, 128, "uint128");
+  tdesc_create_reg (feature, "ymm13h", 71, 1, NULL, 128, "uint128");
+  tdesc_create_reg (feature, "ymm14h", 72, 1, NULL, 128, "uint128");
+  tdesc_create_reg (feature, "ymm15h", 73, 1, NULL, 128, "uint128");
+
+  tdesc_amd64_avx_linux = result;
+}
diff --git a/gdb/features/i386/amd64-avx-linux.xml b/gdb/features/i386/amd64-avx-linux.xml
new file mode 100644
index 0000000..9812ded
--- /dev/null
+++ b/gdb/features/i386/amd64-avx-linux.xml
@@ -0,0 +1,18 @@
+<?xml version="1.0"?>
+<!-- Copyright (C) 2010 Free Software Foundation, Inc.
+
+     Copying and distribution of this file, with or without modification,
+     are permitted in any medium without royalty provided the copyright
+     notice and this notice are preserved.  -->
+
+<!-- AMD64 with AVX - Includes Linux-only special "register".  -->
+
+<!DOCTYPE target SYSTEM "gdb-target.dtd">
+<target>
+  <architecture>i386:x86-64</architecture>
+  <osabi>GNU/Linux</osabi>
+  <xi:include href="64bit-core.xml"/>
+  <xi:include href="64bit-sse.xml"/>
+  <xi:include href="64bit-linux.xml"/>
+  <xi:include href="64bit-avx.xml"/>
+</target>
diff --git a/gdb/features/i386/amd64-avx.c b/gdb/features/i386/amd64-avx.c
new file mode 100644
index 0000000..05c60ff
--- /dev/null
+++ b/gdb/features/i386/amd64-avx.c
@@ -0,0 +1,166 @@
+/* THIS FILE IS GENERATED.  Original: amd64-avx.xml */
+
+#include "defs.h"
+#include "osabi.h"
+#include "target-descriptions.h"
+
+struct target_desc *tdesc_amd64_avx;
+static void
+initialize_tdesc_amd64_avx (void)
+{
+  struct target_desc *result = allocate_target_description ();
+  struct tdesc_feature *feature;
+  struct tdesc_type *field_type, *type;
+
+  set_tdesc_architecture (result, bfd_scan_arch ("i386:x86-64"));
+
+  feature = tdesc_create_feature (result, "org.gnu.gdb.i386.core");
+  field_type = tdesc_create_flags (feature, "i386_eflags", 4);
+  tdesc_add_flag (field_type, 0, "CF");
+  tdesc_add_flag (field_type, 1, "");
+  tdesc_add_flag (field_type, 2, "PF");
+  tdesc_add_flag (field_type, 4, "AF");
+  tdesc_add_flag (field_type, 6, "ZF");
+  tdesc_add_flag (field_type, 7, "SF");
+  tdesc_add_flag (field_type, 8, "TF");
+  tdesc_add_flag (field_type, 9, "IF");
+  tdesc_add_flag (field_type, 10, "DF");
+  tdesc_add_flag (field_type, 11, "OF");
+  tdesc_add_flag (field_type, 14, "NT");
+  tdesc_add_flag (field_type, 16, "RF");
+  tdesc_add_flag (field_type, 17, "VM");
+  tdesc_add_flag (field_type, 18, "AC");
+  tdesc_add_flag (field_type, 19, "VIF");
+  tdesc_add_flag (field_type, 20, "VIP");
+  tdesc_add_flag (field_type, 21, "ID");
+
+  tdesc_create_reg (feature, "rax", 0, 1, NULL, 64, "int64");
+  tdesc_create_reg (feature, "rbx", 1, 1, NULL, 64, "int64");
+  tdesc_create_reg (feature, "rcx", 2, 1, NULL, 64, "int64");
+  tdesc_create_reg (feature, "rdx", 3, 1, NULL, 64, "int64");
+  tdesc_create_reg (feature, "rsi", 4, 1, NULL, 64, "int64");
+  tdesc_create_reg (feature, "rdi", 5, 1, NULL, 64, "int64");
+  tdesc_create_reg (feature, "rbp", 6, 1, NULL, 64, "data_ptr");
+  tdesc_create_reg (feature, "rsp", 7, 1, NULL, 64, "data_ptr");
+  tdesc_create_reg (feature, "r8", 8, 1, NULL, 64, "int64");
+  tdesc_create_reg (feature, "r9", 9, 1, NULL, 64, "int64");
+  tdesc_create_reg (feature, "r10", 10, 1, NULL, 64, "int64");
+  tdesc_create_reg (feature, "r11", 11, 1, NULL, 64, "int64");
+  tdesc_create_reg (feature, "r12", 12, 1, NULL, 64, "int64");
+  tdesc_create_reg (feature, "r13", 13, 1, NULL, 64, "int64");
+  tdesc_create_reg (feature, "r14", 14, 1, NULL, 64, "int64");
+  tdesc_create_reg (feature, "r15", 15, 1, NULL, 64, "int64");
+  tdesc_create_reg (feature, "rip", 16, 1, NULL, 64, "code_ptr");
+  tdesc_create_reg (feature, "eflags", 17, 1, NULL, 32, "i386_eflags");
+  tdesc_create_reg (feature, "cs", 18, 1, NULL, 32, "int32");
+  tdesc_create_reg (feature, "ss", 19, 1, NULL, 32, "int32");
+  tdesc_create_reg (feature, "ds", 20, 1, NULL, 32, "int32");
+  tdesc_create_reg (feature, "es", 21, 1, NULL, 32, "int32");
+  tdesc_create_reg (feature, "fs", 22, 1, NULL, 32, "int32");
+  tdesc_create_reg (feature, "gs", 23, 1, NULL, 32, "int32");
+  tdesc_create_reg (feature, "st0", 24, 1, NULL, 80, "i387_ext");
+  tdesc_create_reg (feature, "st1", 25, 1, NULL, 80, "i387_ext");
+  tdesc_create_reg (feature, "st2", 26, 1, NULL, 80, "i387_ext");
+  tdesc_create_reg (feature, "st3", 27, 1, NULL, 80, "i387_ext");
+  tdesc_create_reg (feature, "st4", 28, 1, NULL, 80, "i387_ext");
+  tdesc_create_reg (feature, "st5", 29, 1, NULL, 80, "i387_ext");
+  tdesc_create_reg (feature, "st6", 30, 1, NULL, 80, "i387_ext");
+  tdesc_create_reg (feature, "st7", 31, 1, NULL, 80, "i387_ext");
+  tdesc_create_reg (feature, "fctrl", 32, 1, "float", 32, "int");
+  tdesc_create_reg (feature, "fstat", 33, 1, "float", 32, "int");
+  tdesc_create_reg (feature, "ftag", 34, 1, "float", 32, "int");
+  tdesc_create_reg (feature, "fiseg", 35, 1, "float", 32, "int");
+  tdesc_create_reg (feature, "fioff", 36, 1, "float", 32, "int");
+  tdesc_create_reg (feature, "foseg", 37, 1, "float", 32, "int");
+  tdesc_create_reg (feature, "fooff", 38, 1, "float", 32, "int");
+  tdesc_create_reg (feature, "fop", 39, 1, "float", 32, "int");
+
+  feature = tdesc_create_feature (result, "org.gnu.gdb.i386.sse");
+  field_type = tdesc_named_type (feature, "ieee_single");
+  tdesc_create_vector (feature, "v4f", field_type, 4);
+
+  field_type = tdesc_named_type (feature, "ieee_double");
+  tdesc_create_vector (feature, "v2d", field_type, 2);
+
+  field_type = tdesc_named_type (feature, "int8");
+  tdesc_create_vector (feature, "v16i8", field_type, 16);
+
+  field_type = tdesc_named_type (feature, "int16");
+  tdesc_create_vector (feature, "v8i16", field_type, 8);
+
+  field_type = tdesc_named_type (feature, "int32");
+  tdesc_create_vector (feature, "v4i32", field_type, 4);
+
+  field_type = tdesc_named_type (feature, "int64");
+  tdesc_create_vector (feature, "v2i64", field_type, 2);
+
+  type = tdesc_create_union (feature, "vec128");
+  field_type = tdesc_named_type (feature, "v4f");
+  tdesc_add_field (type, "v4_float", field_type);
+  field_type = tdesc_named_type (feature, "v2d");
+  tdesc_add_field (type, "v2_double", field_type);
+  field_type = tdesc_named_type (feature, "v16i8");
+  tdesc_add_field (type, "v16_int8", field_type);
+  field_type = tdesc_named_type (feature, "v8i16");
+  tdesc_add_field (type, "v8_int16", field_type);
+  field_type = tdesc_named_type (feature, "v4i32");
+  tdesc_add_field (type, "v4_int32", field_type);
+  field_type = tdesc_named_type (feature, "v2i64");
+  tdesc_add_field (type, "v2_int64", field_type);
+  field_type = tdesc_named_type (feature, "uint128");
+  tdesc_add_field (type, "uint128", field_type);
+
+  field_type = tdesc_create_flags (feature, "i386_mxcsr", 4);
+  tdesc_add_flag (field_type, 0, "IE");
+  tdesc_add_flag (field_type, 1, "DE");
+  tdesc_add_flag (field_type, 2, "ZE");
+  tdesc_add_flag (field_type, 3, "OE");
+  tdesc_add_flag (field_type, 4, "UE");
+  tdesc_add_flag (field_type, 5, "PE");
+  tdesc_add_flag (field_type, 6, "DAZ");
+  tdesc_add_flag (field_type, 7, "IM");
+  tdesc_add_flag (field_type, 8, "DM");
+  tdesc_add_flag (field_type, 9, "ZM");
+  tdesc_add_flag (field_type, 10, "OM");
+  tdesc_add_flag (field_type, 11, "UM");
+  tdesc_add_flag (field_type, 12, "PM");
+  tdesc_add_flag (field_type, 15, "FZ");
+
+  tdesc_create_reg (feature, "xmm0", 40, 1, NULL, 128, "vec128");
+  tdesc_create_reg (feature, "xmm1", 41, 1, NULL, 128, "vec128");
+  tdesc_create_reg (feature, "xmm2", 42, 1, NULL, 128, "vec128");
+  tdesc_create_reg (feature, "xmm3", 43, 1, NULL, 128, "vec128");
+  tdesc_create_reg (feature, "xmm4", 44, 1, NULL, 128, "vec128");
+  tdesc_create_reg (feature, "xmm5", 45, 1, NULL, 128, "vec128");
+  tdesc_create_reg (feature, "xmm6", 46, 1, NULL, 128, "vec128");
+  tdesc_create_reg (feature, "xmm7", 47, 1, NULL, 128, "vec128");
+  tdesc_create_reg (feature, "xmm8", 48, 1, NULL, 128, "vec128");
+  tdesc_create_reg (feature, "xmm9", 49, 1, NULL, 128, "vec128");
+  tdesc_create_reg (feature, "xmm10", 50, 1, NULL, 128, "vec128");
+  tdesc_create_reg (feature, "xmm11", 51, 1, NULL, 128, "vec128");
+  tdesc_create_reg (feature, "xmm12", 52, 1, NULL, 128, "vec128");
+  tdesc_create_reg (feature, "xmm13", 53, 1, NULL, 128, "vec128");
+  tdesc_create_reg (feature, "xmm14", 54, 1, NULL, 128, "vec128");
+  tdesc_create_reg (feature, "xmm15", 55, 1, NULL, 128, "vec128");
+  tdesc_create_reg (feature, "mxcsr", 56, 1, "vector", 32, "i386_mxcsr");
+
+  feature = tdesc_create_feature (result, "org.gnu.gdb.i386.avx");
+  tdesc_create_reg (feature, "ymm0h", 57, 1, NULL, 128, "uint128");
+  tdesc_create_reg (feature, "ymm1h", 58, 1, NULL, 128, "uint128");
+  tdesc_create_reg (feature, "ymm2h", 59, 1, NULL, 128, "uint128");
+  tdesc_create_reg (feature, "ymm3h", 60, 1, NULL, 128, "uint128");
+  tdesc_create_reg (feature, "ymm4h", 61, 1, NULL, 128, "uint128");
+  tdesc_create_reg (feature, "ymm5h", 62, 1, NULL, 128, "uint128");
+  tdesc_create_reg (feature, "ymm6h", 63, 1, NULL, 128, "uint128");
+  tdesc_create_reg (feature, "ymm7h", 64, 1, NULL, 128, "uint128");
+  tdesc_create_reg (feature, "ymm8h", 65, 1, NULL, 128, "uint128");
+  tdesc_create_reg (feature, "ymm9h", 66, 1, NULL, 128, "uint128");
+  tdesc_create_reg (feature, "ymm10h", 67, 1, NULL, 128, "uint128");
+  tdesc_create_reg (feature, "ymm11h", 68, 1, NULL, 128, "uint128");
+  tdesc_create_reg (feature, "ymm12h", 69, 1, NULL, 128, "uint128");
+  tdesc_create_reg (feature, "ymm13h", 70, 1, NULL, 128, "uint128");
+  tdesc_create_reg (feature, "ymm14h", 71, 1, NULL, 128, "uint128");
+  tdesc_create_reg (feature, "ymm15h", 72, 1, NULL, 128, "uint128");
+
+  tdesc_amd64_avx = result;
+}
diff --git a/gdb/features/i386/amd64-avx.xml b/gdb/features/i386/amd64-avx.xml
new file mode 100644
index 0000000..c62088f
--- /dev/null
+++ b/gdb/features/i386/amd64-avx.xml
@@ -0,0 +1,16 @@
+<?xml version="1.0"?>
+<!-- Copyright (C) 2010 Free Software Foundation, Inc.
+
+     Copying and distribution of this file, with or without modification,
+     are permitted in any medium without royalty provided the copyright
+     notice and this notice are preserved.  -->
+
+<!-- AMD64 with AVX -->
+
+<!DOCTYPE target SYSTEM "gdb-target.dtd">
+<target>
+  <architecture>i386:x86-64</architecture>
+  <xi:include href="64bit-core.xml"/>
+  <xi:include href="64bit-sse.xml"/>
+  <xi:include href="64bit-avx.xml"/>
+</target>
diff --git a/gdb/features/i386/i386-avx-linux.c b/gdb/features/i386/i386-avx-linux.c
new file mode 100644
index 0000000..1aa939b
--- /dev/null
+++ b/gdb/features/i386/i386-avx-linux.c
@@ -0,0 +1,147 @@
+/* THIS FILE IS GENERATED.  Original: i386-avx-linux.xml */
+
+#include "defs.h"
+#include "osabi.h"
+#include "target-descriptions.h"
+
+struct target_desc *tdesc_i386_avx_linux;
+static void
+initialize_tdesc_i386_avx_linux (void)
+{
+  struct target_desc *result = allocate_target_description ();
+  struct tdesc_feature *feature;
+  struct tdesc_type *field_type, *type;
+
+  set_tdesc_architecture (result, bfd_scan_arch ("i386"));
+
+  set_tdesc_osabi (result, osabi_from_tdesc_string ("GNU/Linux"));
+
+  feature = tdesc_create_feature (result, "org.gnu.gdb.i386.core");
+  field_type = tdesc_create_flags (feature, "i386_eflags", 4);
+  tdesc_add_flag (field_type, 0, "CF");
+  tdesc_add_flag (field_type, 1, "");
+  tdesc_add_flag (field_type, 2, "PF");
+  tdesc_add_flag (field_type, 4, "AF");
+  tdesc_add_flag (field_type, 6, "ZF");
+  tdesc_add_flag (field_type, 7, "SF");
+  tdesc_add_flag (field_type, 8, "TF");
+  tdesc_add_flag (field_type, 9, "IF");
+  tdesc_add_flag (field_type, 10, "DF");
+  tdesc_add_flag (field_type, 11, "OF");
+  tdesc_add_flag (field_type, 14, "NT");
+  tdesc_add_flag (field_type, 16, "RF");
+  tdesc_add_flag (field_type, 17, "VM");
+  tdesc_add_flag (field_type, 18, "AC");
+  tdesc_add_flag (field_type, 19, "VIF");
+  tdesc_add_flag (field_type, 20, "VIP");
+  tdesc_add_flag (field_type, 21, "ID");
+
+  tdesc_create_reg (feature, "eax", 0, 1, NULL, 32, "int32");
+  tdesc_create_reg (feature, "ecx", 1, 1, NULL, 32, "int32");
+  tdesc_create_reg (feature, "edx", 2, 1, NULL, 32, "int32");
+  tdesc_create_reg (feature, "ebx", 3, 1, NULL, 32, "int32");
+  tdesc_create_reg (feature, "esp", 4, 1, NULL, 32, "data_ptr");
+  tdesc_create_reg (feature, "ebp", 5, 1, NULL, 32, "data_ptr");
+  tdesc_create_reg (feature, "esi", 6, 1, NULL, 32, "int32");
+  tdesc_create_reg (feature, "edi", 7, 1, NULL, 32, "int32");
+  tdesc_create_reg (feature, "eip", 8, 1, NULL, 32, "code_ptr");
+  tdesc_create_reg (feature, "eflags", 9, 1, NULL, 32, "i386_eflags");
+  tdesc_create_reg (feature, "cs", 10, 1, NULL, 32, "int32");
+  tdesc_create_reg (feature, "ss", 11, 1, NULL, 32, "int32");
+  tdesc_create_reg (feature, "ds", 12, 1, NULL, 32, "int32");
+  tdesc_create_reg (feature, "es", 13, 1, NULL, 32, "int32");
+  tdesc_create_reg (feature, "fs", 14, 1, NULL, 32, "int32");
+  tdesc_create_reg (feature, "gs", 15, 1, NULL, 32, "int32");
+  tdesc_create_reg (feature, "st0", 16, 1, NULL, 80, "i387_ext");
+  tdesc_create_reg (feature, "st1", 17, 1, NULL, 80, "i387_ext");
+  tdesc_create_reg (feature, "st2", 18, 1, NULL, 80, "i387_ext");
+  tdesc_create_reg (feature, "st3", 19, 1, NULL, 80, "i387_ext");
+  tdesc_create_reg (feature, "st4", 20, 1, NULL, 80, "i387_ext");
+  tdesc_create_reg (feature, "st5", 21, 1, NULL, 80, "i387_ext");
+  tdesc_create_reg (feature, "st6", 22, 1, NULL, 80, "i387_ext");
+  tdesc_create_reg (feature, "st7", 23, 1, NULL, 80, "i387_ext");
+  tdesc_create_reg (feature, "fctrl", 24, 1, "float", 32, "int");
+  tdesc_create_reg (feature, "fstat", 25, 1, "float", 32, "int");
+  tdesc_create_reg (feature, "ftag", 26, 1, "float", 32, "int");
+  tdesc_create_reg (feature, "fiseg", 27, 1, "float", 32, "int");
+  tdesc_create_reg (feature, "fioff", 28, 1, "float", 32, "int");
+  tdesc_create_reg (feature, "foseg", 29, 1, "float", 32, "int");
+  tdesc_create_reg (feature, "fooff", 30, 1, "float", 32, "int");
+  tdesc_create_reg (feature, "fop", 31, 1, "float", 32, "int");
+
+  feature = tdesc_create_feature (result, "org.gnu.gdb.i386.sse");
+  field_type = tdesc_named_type (feature, "ieee_single");
+  tdesc_create_vector (feature, "v4f", field_type, 4);
+
+  field_type = tdesc_named_type (feature, "ieee_double");
+  tdesc_create_vector (feature, "v2d", field_type, 2);
+
+  field_type = tdesc_named_type (feature, "int8");
+  tdesc_create_vector (feature, "v16i8", field_type, 16);
+
+  field_type = tdesc_named_type (feature, "int16");
+  tdesc_create_vector (feature, "v8i16", field_type, 8);
+
+  field_type = tdesc_named_type (feature, "int32");
+  tdesc_create_vector (feature, "v4i32", field_type, 4);
+
+  field_type = tdesc_named_type (feature, "int64");
+  tdesc_create_vector (feature, "v2i64", field_type, 2);
+
+  type = tdesc_create_union (feature, "vec128");
+  field_type = tdesc_named_type (feature, "v4f");
+  tdesc_add_field (type, "v4_float", field_type);
+  field_type = tdesc_named_type (feature, "v2d");
+  tdesc_add_field (type, "v2_double", field_type);
+  field_type = tdesc_named_type (feature, "v16i8");
+  tdesc_add_field (type, "v16_int8", field_type);
+  field_type = tdesc_named_type (feature, "v8i16");
+  tdesc_add_field (type, "v8_int16", field_type);
+  field_type = tdesc_named_type (feature, "v4i32");
+  tdesc_add_field (type, "v4_int32", field_type);
+  field_type = tdesc_named_type (feature, "v2i64");
+  tdesc_add_field (type, "v2_int64", field_type);
+  field_type = tdesc_named_type (feature, "uint128");
+  tdesc_add_field (type, "uint128", field_type);
+
+  field_type = tdesc_create_flags (feature, "i386_mxcsr", 4);
+  tdesc_add_flag (field_type, 0, "IE");
+  tdesc_add_flag (field_type, 1, "DE");
+  tdesc_add_flag (field_type, 2, "ZE");
+  tdesc_add_flag (field_type, 3, "OE");
+  tdesc_add_flag (field_type, 4, "UE");
+  tdesc_add_flag (field_type, 5, "PE");
+  tdesc_add_flag (field_type, 6, "DAZ");
+  tdesc_add_flag (field_type, 7, "IM");
+  tdesc_add_flag (field_type, 8, "DM");
+  tdesc_add_flag (field_type, 9, "ZM");
+  tdesc_add_flag (field_type, 10, "OM");
+  tdesc_add_flag (field_type, 11, "UM");
+  tdesc_add_flag (field_type, 12, "PM");
+  tdesc_add_flag (field_type, 15, "FZ");
+
+  tdesc_create_reg (feature, "xmm0", 32, 1, NULL, 128, "vec128");
+  tdesc_create_reg (feature, "xmm1", 33, 1, NULL, 128, "vec128");
+  tdesc_create_reg (feature, "xmm2", 34, 1, NULL, 128, "vec128");
+  tdesc_create_reg (feature, "xmm3", 35, 1, NULL, 128, "vec128");
+  tdesc_create_reg (feature, "xmm4", 36, 1, NULL, 128, "vec128");
+  tdesc_create_reg (feature, "xmm5", 37, 1, NULL, 128, "vec128");
+  tdesc_create_reg (feature, "xmm6", 38, 1, NULL, 128, "vec128");
+  tdesc_create_reg (feature, "xmm7", 39, 1, NULL, 128, "vec128");
+  tdesc_create_reg (feature, "mxcsr", 40, 1, "vector", 32, "i386_mxcsr");
+
+  feature = tdesc_create_feature (result, "org.gnu.gdb.i386.linux");
+  tdesc_create_reg (feature, "orig_eax", 41, 1, NULL, 32, "int");
+
+  feature = tdesc_create_feature (result, "org.gnu.gdb.i386.avx");
+  tdesc_create_reg (feature, "ymm0h", 42, 1, NULL, 128, "uint128");
+  tdesc_create_reg (feature, "ymm1h", 43, 1, NULL, 128, "uint128");
+  tdesc_create_reg (feature, "ymm2h", 44, 1, NULL, 128, "uint128");
+  tdesc_create_reg (feature, "ymm3h", 45, 1, NULL, 128, "uint128");
+  tdesc_create_reg (feature, "ymm4h", 46, 1, NULL, 128, "uint128");
+  tdesc_create_reg (feature, "ymm5h", 47, 1, NULL, 128, "uint128");
+  tdesc_create_reg (feature, "ymm6h", 48, 1, NULL, 128, "uint128");
+  tdesc_create_reg (feature, "ymm7h", 49, 1, NULL, 128, "uint128");
+
+  tdesc_i386_avx_linux = result;
+}
diff --git a/gdb/features/i386/i386-avx-linux.xml b/gdb/features/i386/i386-avx-linux.xml
new file mode 100644
index 0000000..7cbb730
--- /dev/null
+++ b/gdb/features/i386/i386-avx-linux.xml
@@ -0,0 +1,18 @@
+<?xml version="1.0"?>
+<!-- Copyright (C) 2010 Free Software Foundation, Inc.
+
+     Copying and distribution of this file, with or without modification,
+     are permitted in any medium without royalty provided the copyright
+     notice and this notice are preserved.  -->
+
+<!-- I386 with AVX - Includes Linux-only special "register".  -->
+
+<!DOCTYPE target SYSTEM "gdb-target.dtd">
+<target>
+  <architecture>i386</architecture>
+  <osabi>GNU/Linux</osabi>
+  <xi:include href="32bit-core.xml"/>
+  <xi:include href="32bit-sse.xml"/>
+  <xi:include href="32bit-linux.xml"/>
+  <xi:include href="32bit-avx.xml"/>
+</target>
diff --git a/gdb/features/i386/i386-avx.c b/gdb/features/i386/i386-avx.c
new file mode 100644
index 0000000..1e74ed5
--- /dev/null
+++ b/gdb/features/i386/i386-avx.c
@@ -0,0 +1,142 @@
+/* THIS FILE IS GENERATED.  Original: i386-avx.xml */
+
+#include "defs.h"
+#include "osabi.h"
+#include "target-descriptions.h"
+
+struct target_desc *tdesc_i386_avx;
+static void
+initialize_tdesc_i386_avx (void)
+{
+  struct target_desc *result = allocate_target_description ();
+  struct tdesc_feature *feature;
+  struct tdesc_type *field_type, *type;
+
+  set_tdesc_architecture (result, bfd_scan_arch ("i386"));
+
+  feature = tdesc_create_feature (result, "org.gnu.gdb.i386.core");
+  field_type = tdesc_create_flags (feature, "i386_eflags", 4);
+  tdesc_add_flag (field_type, 0, "CF");
+  tdesc_add_flag (field_type, 1, "");
+  tdesc_add_flag (field_type, 2, "PF");
+  tdesc_add_flag (field_type, 4, "AF");
+  tdesc_add_flag (field_type, 6, "ZF");
+  tdesc_add_flag (field_type, 7, "SF");
+  tdesc_add_flag (field_type, 8, "TF");
+  tdesc_add_flag (field_type, 9, "IF");
+  tdesc_add_flag (field_type, 10, "DF");
+  tdesc_add_flag (field_type, 11, "OF");
+  tdesc_add_flag (field_type, 14, "NT");
+  tdesc_add_flag (field_type, 16, "RF");
+  tdesc_add_flag (field_type, 17, "VM");
+  tdesc_add_flag (field_type, 18, "AC");
+  tdesc_add_flag (field_type, 19, "VIF");
+  tdesc_add_flag (field_type, 20, "VIP");
+  tdesc_add_flag (field_type, 21, "ID");
+
+  tdesc_create_reg (feature, "eax", 0, 1, NULL, 32, "int32");
+  tdesc_create_reg (feature, "ecx", 1, 1, NULL, 32, "int32");
+  tdesc_create_reg (feature, "edx", 2, 1, NULL, 32, "int32");
+  tdesc_create_reg (feature, "ebx", 3, 1, NULL, 32, "int32");
+  tdesc_create_reg (feature, "esp", 4, 1, NULL, 32, "data_ptr");
+  tdesc_create_reg (feature, "ebp", 5, 1, NULL, 32, "data_ptr");
+  tdesc_create_reg (feature, "esi", 6, 1, NULL, 32, "int32");
+  tdesc_create_reg (feature, "edi", 7, 1, NULL, 32, "int32");
+  tdesc_create_reg (feature, "eip", 8, 1, NULL, 32, "code_ptr");
+  tdesc_create_reg (feature, "eflags", 9, 1, NULL, 32, "i386_eflags");
+  tdesc_create_reg (feature, "cs", 10, 1, NULL, 32, "int32");
+  tdesc_create_reg (feature, "ss", 11, 1, NULL, 32, "int32");
+  tdesc_create_reg (feature, "ds", 12, 1, NULL, 32, "int32");
+  tdesc_create_reg (feature, "es", 13, 1, NULL, 32, "int32");
+  tdesc_create_reg (feature, "fs", 14, 1, NULL, 32, "int32");
+  tdesc_create_reg (feature, "gs", 15, 1, NULL, 32, "int32");
+  tdesc_create_reg (feature, "st0", 16, 1, NULL, 80, "i387_ext");
+  tdesc_create_reg (feature, "st1", 17, 1, NULL, 80, "i387_ext");
+  tdesc_create_reg (feature, "st2", 18, 1, NULL, 80, "i387_ext");
+  tdesc_create_reg (feature, "st3", 19, 1, NULL, 80, "i387_ext");
+  tdesc_create_reg (feature, "st4", 20, 1, NULL, 80, "i387_ext");
+  tdesc_create_reg (feature, "st5", 21, 1, NULL, 80, "i387_ext");
+  tdesc_create_reg (feature, "st6", 22, 1, NULL, 80, "i387_ext");
+  tdesc_create_reg (feature, "st7", 23, 1, NULL, 80, "i387_ext");
+  tdesc_create_reg (feature, "fctrl", 24, 1, "float", 32, "int");
+  tdesc_create_reg (feature, "fstat", 25, 1, "float", 32, "int");
+  tdesc_create_reg (feature, "ftag", 26, 1, "float", 32, "int");
+  tdesc_create_reg (feature, "fiseg", 27, 1, "float", 32, "int");
+  tdesc_create_reg (feature, "fioff", 28, 1, "float", 32, "int");
+  tdesc_create_reg (feature, "foseg", 29, 1, "float", 32, "int");
+  tdesc_create_reg (feature, "fooff", 30, 1, "float", 32, "int");
+  tdesc_create_reg (feature, "fop", 31, 1, "float", 32, "int");
+
+  feature = tdesc_create_feature (result, "org.gnu.gdb.i386.sse");
+  field_type = tdesc_named_type (feature, "ieee_single");
+  tdesc_create_vector (feature, "v4f", field_type, 4);
+
+  field_type = tdesc_named_type (feature, "ieee_double");
+  tdesc_create_vector (feature, "v2d", field_type, 2);
+
+  field_type = tdesc_named_type (feature, "int8");
+  tdesc_create_vector (feature, "v16i8", field_type, 16);
+
+  field_type = tdesc_named_type (feature, "int16");
+  tdesc_create_vector (feature, "v8i16", field_type, 8);
+
+  field_type = tdesc_named_type (feature, "int32");
+  tdesc_create_vector (feature, "v4i32", field_type, 4);
+
+  field_type = tdesc_named_type (feature, "int64");
+  tdesc_create_vector (feature, "v2i64", field_type, 2);
+
+  type = tdesc_create_union (feature, "vec128");
+  field_type = tdesc_named_type (feature, "v4f");
+  tdesc_add_field (type, "v4_float", field_type);
+  field_type = tdesc_named_type (feature, "v2d");
+  tdesc_add_field (type, "v2_double", field_type);
+  field_type = tdesc_named_type (feature, "v16i8");
+  tdesc_add_field (type, "v16_int8", field_type);
+  field_type = tdesc_named_type (feature, "v8i16");
+  tdesc_add_field (type, "v8_int16", field_type);
+  field_type = tdesc_named_type (feature, "v4i32");
+  tdesc_add_field (type, "v4_int32", field_type);
+  field_type = tdesc_named_type (feature, "v2i64");
+  tdesc_add_field (type, "v2_int64", field_type);
+  field_type = tdesc_named_type (feature, "uint128");
+  tdesc_add_field (type, "uint128", field_type);
+
+  field_type = tdesc_create_flags (feature, "i386_mxcsr", 4);
+  tdesc_add_flag (field_type, 0, "IE");
+  tdesc_add_flag (field_type, 1, "DE");
+  tdesc_add_flag (field_type, 2, "ZE");
+  tdesc_add_flag (field_type, 3, "OE");
+  tdesc_add_flag (field_type, 4, "UE");
+  tdesc_add_flag (field_type, 5, "PE");
+  tdesc_add_flag (field_type, 6, "DAZ");
+  tdesc_add_flag (field_type, 7, "IM");
+  tdesc_add_flag (field_type, 8, "DM");
+  tdesc_add_flag (field_type, 9, "ZM");
+  tdesc_add_flag (field_type, 10, "OM");
+  tdesc_add_flag (field_type, 11, "UM");
+  tdesc_add_flag (field_type, 12, "PM");
+  tdesc_add_flag (field_type, 15, "FZ");
+
+  tdesc_create_reg (feature, "xmm0", 32, 1, NULL, 128, "vec128");
+  tdesc_create_reg (feature, "xmm1", 33, 1, NULL, 128, "vec128");
+  tdesc_create_reg (feature, "xmm2", 34, 1, NULL, 128, "vec128");
+  tdesc_create_reg (feature, "xmm3", 35, 1, NULL, 128, "vec128");
+  tdesc_create_reg (feature, "xmm4", 36, 1, NULL, 128, "vec128");
+  tdesc_create_reg (feature, "xmm5", 37, 1, NULL, 128, "vec128");
+  tdesc_create_reg (feature, "xmm6", 38, 1, NULL, 128, "vec128");
+  tdesc_create_reg (feature, "xmm7", 39, 1, NULL, 128, "vec128");
+  tdesc_create_reg (feature, "mxcsr", 40, 1, "vector", 32, "i386_mxcsr");
+
+  feature = tdesc_create_feature (result, "org.gnu.gdb.i386.avx");
+  tdesc_create_reg (feature, "ymm0h", 41, 1, NULL, 128, "uint128");
+  tdesc_create_reg (feature, "ymm1h", 42, 1, NULL, 128, "uint128");
+  tdesc_create_reg (feature, "ymm2h", 43, 1, NULL, 128, "uint128");
+  tdesc_create_reg (feature, "ymm3h", 44, 1, NULL, 128, "uint128");
+  tdesc_create_reg (feature, "ymm4h", 45, 1, NULL, 128, "uint128");
+  tdesc_create_reg (feature, "ymm5h", 46, 1, NULL, 128, "uint128");
+  tdesc_create_reg (feature, "ymm6h", 47, 1, NULL, 128, "uint128");
+  tdesc_create_reg (feature, "ymm7h", 48, 1, NULL, 128, "uint128");
+
+  tdesc_i386_avx = result;
+}
diff --git a/gdb/features/i386/i386-avx.xml b/gdb/features/i386/i386-avx.xml
new file mode 100644
index 0000000..b8f59c0
--- /dev/null
+++ b/gdb/features/i386/i386-avx.xml
@@ -0,0 +1,16 @@
+<?xml version="1.0"?>
+<!-- Copyright (C) 2010 Free Software Foundation, Inc.
+
+     Copying and distribution of this file, with or without modification,
+     are permitted in any medium without royalty provided the copyright
+     notice and this notice are preserved.  -->
+
+<!-- I386 with AVX -->
+
+<!DOCTYPE target SYSTEM "gdb-target.dtd">
+<target>
+  <architecture>i386</architecture>
+  <xi:include href="32bit-core.xml"/>
+  <xi:include href="32bit-sse.xml"/>
+  <xi:include href="32bit-avx.xml"/>
+</target>
diff --git a/gdb/regformats/i386/amd64-avx-linux.dat b/gdb/regformats/i386/amd64-avx-linux.dat
new file mode 100644
index 0000000..4491313
--- /dev/null
+++ b/gdb/regformats/i386/amd64-avx-linux.dat
@@ -0,0 +1,78 @@
+# DO NOT EDIT: generated from i386/amd64-avx-linux.xml
+name:amd64_avx_linux
+xmltarget:amd64-avx-linux.xml
+expedite:rbp,rsp,rip
+64:rax
+64:rbx
+64:rcx
+64:rdx
+64:rsi
+64:rdi
+64:rbp
+64:rsp
+64:r8
+64:r9
+64:r10
+64:r11
+64:r12
+64:r13
+64:r14
+64:r15
+64:rip
+32:eflags
+32:cs
+32:ss
+32:ds
+32:es
+32:fs
+32:gs
+80:st0
+80:st1
+80:st2
+80:st3
+80:st4
+80:st5
+80:st6
+80:st7
+32:fctrl
+32:fstat
+32:ftag
+32:fiseg
+32:fioff
+32:foseg
+32:fooff
+32:fop
+128:xmm0
+128:xmm1
+128:xmm2
+128:xmm3
+128:xmm4
+128:xmm5
+128:xmm6
+128:xmm7
+128:xmm8
+128:xmm9
+128:xmm10
+128:xmm11
+128:xmm12
+128:xmm13
+128:xmm14
+128:xmm15
+32:mxcsr
+64:orig_rax
+128:ymm0h
+128:ymm1h
+128:ymm2h
+128:ymm3h
+128:ymm4h
+128:ymm5h
+128:ymm6h
+128:ymm7h
+128:ymm8h
+128:ymm9h
+128:ymm10h
+128:ymm11h
+128:ymm12h
+128:ymm13h
+128:ymm14h
+128:ymm15h
diff --git a/gdb/regformats/i386/amd64-avx.dat b/gdb/regformats/i386/amd64-avx.dat
new file mode 100644
index 0000000..62cfdd7
--- /dev/null
+++ b/gdb/regformats/i386/amd64-avx.dat
@@ -0,0 +1,77 @@
+# DO NOT EDIT: generated from i386/amd64-avx.xml
+name:amd64_avx
+xmltarget:amd64-avx.xml
+expedite:rbp,rsp,rip
+64:rax
+64:rbx
+64:rcx
+64:rdx
+64:rsi
+64:rdi
+64:rbp
+64:rsp
+64:r8
+64:r9
+64:r10
+64:r11
+64:r12
+64:r13
+64:r14
+64:r15
+64:rip
+32:eflags
+32:cs
+32:ss
+32:ds
+32:es
+32:fs
+32:gs
+80:st0
+80:st1
+80:st2
+80:st3
+80:st4
+80:st5
+80:st6
+80:st7
+32:fctrl
+32:fstat
+32:ftag
+32:fiseg
+32:fioff
+32:foseg
+32:fooff
+32:fop
+128:xmm0
+128:xmm1
+128:xmm2
+128:xmm3
+128:xmm4
+128:xmm5
+128:xmm6
+128:xmm7
+128:xmm8
+128:xmm9
+128:xmm10
+128:xmm11
+128:xmm12
+128:xmm13
+128:xmm14
+128:xmm15
+32:mxcsr
+128:ymm0h
+128:ymm1h
+128:ymm2h
+128:ymm3h
+128:ymm4h
+128:ymm5h
+128:ymm6h
+128:ymm7h
+128:ymm8h
+128:ymm9h
+128:ymm10h
+128:ymm11h
+128:ymm12h
+128:ymm13h
+128:ymm14h
+128:ymm15h
diff --git a/gdb/regformats/i386/i386-avx-linux.dat b/gdb/regformats/i386/i386-avx-linux.dat
new file mode 100644
index 0000000..e9eb951
--- /dev/null
+++ b/gdb/regformats/i386/i386-avx-linux.dat
@@ -0,0 +1,54 @@
+# DO NOT EDIT: generated from i386/i386-avx-linux.xml
+name:i386_avx_linux
+xmltarget:i386-avx-linux.xml
+expedite:ebp,esp,eip
+32:eax
+32:ecx
+32:edx
+32:ebx
+32:esp
+32:ebp
+32:esi
+32:edi
+32:eip
+32:eflags
+32:cs
+32:ss
+32:ds
+32:es
+32:fs
+32:gs
+80:st0
+80:st1
+80:st2
+80:st3
+80:st4
+80:st5
+80:st6
+80:st7
+32:fctrl
+32:fstat
+32:ftag
+32:fiseg
+32:fioff
+32:foseg
+32:fooff
+32:fop
+128:xmm0
+128:xmm1
+128:xmm2
+128:xmm3
+128:xmm4
+128:xmm5
+128:xmm6
+128:xmm7
+32:mxcsr
+32:orig_eax
+128:ymm0h
+128:ymm1h
+128:ymm2h
+128:ymm3h
+128:ymm4h
+128:ymm5h
+128:ymm6h
+128:ymm7h
diff --git a/gdb/regformats/i386/i386-avx.dat b/gdb/regformats/i386/i386-avx.dat
new file mode 100644
index 0000000..4a19a2b
--- /dev/null
+++ b/gdb/regformats/i386/i386-avx.dat
@@ -0,0 +1,53 @@
+# DO NOT EDIT: generated from i386/i386-avx.xml
+name:i386_avx
+xmltarget:i386-avx.xml
+expedite:ebp,esp,eip
+32:eax
+32:ecx
+32:edx
+32:ebx
+32:esp
+32:ebp
+32:esi
+32:edi
+32:eip
+32:eflags
+32:cs
+32:ss
+32:ds
+32:es
+32:fs
+32:gs
+80:st0
+80:st1
+80:st2
+80:st3
+80:st4
+80:st5
+80:st6
+80:st7
+32:fctrl
+32:fstat
+32:ftag
+32:fiseg
+32:fioff
+32:foseg
+32:fooff
+32:fop
+128:xmm0
+128:xmm1
+128:xmm2
+128:xmm3
+128:xmm4
+128:xmm5
+128:xmm6
+128:xmm7
+32:mxcsr
+128:ymm0h
+128:ymm1h
+128:ymm2h
+128:ymm3h
+128:ymm4h
+128:ymm5h
+128:ymm6h
+128:ymm7h
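
The regformats files above list one register per line as BITS:NAME, after a few header lines (name:, xmltarget:, expedite:).  As a rough illustration only, the sketch below sums the register widths in such a file.  The helper name regformat_total_bits is hypothetical and not part of this patch, and the assumption that registers are packed back to back in the listed order is mine.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Sum the widths, in bits, of all "BITS:NAME" register lines in a
   regformats-style buffer.  Header lines ("name:", "xmltarget:",
   "expedite:") and '#' comments are skipped because they do not start
   with a decimal width.  Hypothetical helper, not part of the patch.  */
size_t
regformat_total_bits (const char *text)
{
  size_t total = 0;

  assert (text != NULL);
  while (*text != '\0')
    {
      const char *eol = strchr (text, '\n');
      size_t len = eol != NULL ? (size_t) (eol - text) : strlen (text);

      if (len > 0 && isdigit ((unsigned char) text[0]))
        {
          char *end;
          unsigned long bits = strtoul (text, &end, 10);

          /* Only count the line if the digits are followed by ':'.  */
          if (*end == ':')
            total += bits;
        }

      /* Advance to the start of the next line, if there is one.  */
      text += len;
      if (*text == '\n')
        text++;
    }
  return total;
}
```

Dividing the result by 8 gives the size in bytes of the register block under the packing assumption stated above.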

* PATCH: 2/6 [2nd try]: Add AVX support (Update document)
  2010-03-04 18:05 ` PATCH: 2/6: Add AVX support (Update document) H.J. Lu
                     ` (2 preceding siblings ...)
  2010-03-05 10:33   ` PATCH: 2/6: Add AVX support (Update document) Eli Zaretskii
@ 2010-03-06 22:19   ` H.J. Lu
  2010-03-12 11:11     ` Eli Zaretskii
                       ` (3 more replies)
  3 siblings, 4 replies; 115+ messages in thread
From: H.J. Lu @ 2010-03-06 22:19 UTC (permalink / raw)
  To: GDB

Hi,

This patch updates the documentation for AVX support.  OK to install?
 
Thanks.


H.J.
---
2010-03-06  H.J. Lu  <hongjiu.lu@intel.com>

	* gdb.texinfo (General Query Packets): Document x86=xml.
	(i386 Features): Add org.gnu.gdb.i386.avx.
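
The org.gnu.gdb.i386.avx feature documented here describes only the upper 128 bits of each YMM register (ymmNh); the lower 128 bits are the existing XMM register.  A minimal sketch of how the two raw halves relate to a full 256-bit value follows.  The function names are hypothetical and the byte layout assumes x86 little-endian order; this is not the code GDB uses.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { XMM_BYTES = 16, YMM_BYTES = 32 };

/* Build a 256-bit YMM value from its two raw 128-bit halves.  Assumes
   little-endian x86 byte order: bytes 0..15 are the low half (the XMM
   register), bytes 16..31 the upper half (the ymmNh register).  */
void
ymm_join (const uint8_t xmm[XMM_BYTES], const uint8_t ymmh[XMM_BYTES],
          uint8_t ymm[YMM_BYTES])
{
  assert (xmm != NULL && ymmh != NULL && ymm != NULL);
  memcpy (ymm, xmm, XMM_BYTES);
  memcpy (ymm + XMM_BYTES, ymmh, XMM_BYTES);
}

/* Inverse operation: split a 256-bit value into the XMM part and the
   ymmNh part, as a write to a full YMM register would need to do.  */
void
ymm_split (const uint8_t ymm[YMM_BYTES], uint8_t xmm[XMM_BYTES],
           uint8_t ymmh[XMM_BYTES])
{
  assert (xmm != NULL && ymmh != NULL && ymm != NULL);
  memcpy (xmm, ymm, XMM_BYTES);
  memcpy (ymmh, ymm + XMM_BYTES, XMM_BYTES);
}
```

Keeping only the upper halves in the target description means a stub without AVX can omit the feature entirely while the XMM registers stay unchanged.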

diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
index 41b11b6..64bc707 100644
--- a/gdb/doc/gdb.texinfo
+++ b/gdb/doc/gdb.texinfo
@@ -30268,6 +30268,11 @@ extensions to the remote protocol.  @value{GDBN} does not use such
 extensions unless the stub also reports that it supports them by
 including @samp{multiprocess+} in its @samp{qSupported} reply.
 @xref{multiprocess extensions}, for details.
+
+@item x86=xml
+This feature indicates that @value{GDBN} supports x86 XML target
+descriptions.  If the stub sees @samp{x86=xml}, it can send @value{GDBN}
+the x86 XML target description.
 @end table
 
 Stubs should ignore any unknown values for
@@ -33350,6 +33355,16 @@ describe registers:
 @samp{mxcsr}
 @end itemize
 
+The @samp{org.gnu.gdb.i386.avx} feature is optional.  It should
+describe the upper 128 bits of @sc{ymm} registers:
+
+@itemize @minus
+@item
+@samp{ymm0h} through @samp{ymm7h} for i386
+@item
+@samp{ymm0h} through @samp{ymm15h} for amd64
+@end itemize
+
 The @samp{org.gnu.gdb.i386.linux} feature is optional.  It should
 describe a single register, @samp{orig_eax}.
 

* PATCH: 3/6 [2nd try]: Add AVX support (i386 changes)
  2010-03-04 18:06   ` PATCH: 3/6: Add AVX support (i386 changes) H.J. Lu
@ 2010-03-06 22:21     ` H.J. Lu
  2010-03-07 21:32       ` H.J. Lu
                         ` (2 more replies)
  0 siblings, 3 replies; 115+ messages in thread
From: H.J. Lu @ 2010-03-06 22:21 UTC (permalink / raw)
  To: GDB

Hi,

Here are the i386 changes to support AVX.  OK to install?
 
Thanks.


H.J.
---
2010-03-04  H.J. Lu  <hongjiu.lu@intel.com>

	* i386-linux-nat.c: Include "regset.h", "elf/common.h" and
	<sys/uio.h>.
	(xstate_size): New.
	(xstate_size_n_of_int64): Likewise.
	(fetch_xstateregs): Likewise.
	(store_xstateregs): Likewise.
	(GETXSTATEREGS_SUPPLIES): Likewise.
	(regmap): Include 8 upper YMM registers.
	(i386_linux_fetch_inferior_registers): Support XSAVE extended
	state.
	(i386_linux_store_inferior_registers): Likewise.
	(i386_linux_read_description): Check and enable AVX target
	descriptions.

	* i386-linux-tdep.c: Include "regset.h", "i387-tdep.h",
	"i386-xstate.h" and "features/i386/i386-avx-linux.c".
	(i386_linux_regset_sections): Make it global.  Add
	".reg-xstate".
	(i386_linux_gregset_reg_offset): Include 8 upper YMM registers.
	(i386_linux_update_xstateregset): New.
	(i386_linux_core_read_xcr0): Likewise.
	(i386_linux_core_read_description): Check and enable AVX target
	description.
	(i386_linux_init_abi): Set xsave_xcr0_offset.
	(_initialize_i386_linux_tdep): Call
	initialize_tdesc_i386_avx_linux.

	* i386-linux-tdep.h (I386_LINUX_ORIG_EAX_REGNUM): Replace
	I386_SSE_NUM_REGS with I386_AVX_NUM_REGS.
	(i386_linux_core_read_xcr0): New.
	(tdesc_i386_avx_linux): Likewise.
	(i386_linux_regset_sections): Likewise.
	(i386_linux_update_xstateregset): Likewise.
	(I386_LINUX_XSAVE_XCR0_OFFSET): Likewise.

	* i386-tdep.c: Include "i386-xstate.h" and
	"features/i386/i386-avx.c".
	(i386_ymm_names): New.
	(i386_ymmh_names): Likewise.
	(i386_ymmh_regnum_p): Likewise.
	(i386_ymm_regnum_p): Likewise.
	(i386_xmm_regnum_p): Likewise.
	(i386_register_name): Likewise.
	(i386_ymm_type): Likewise.
	(i386_supply_xstateregset): Likewise.
	(i386_collect_xstateregset): Likewise.
	(i386_sse_regnum_p): Removed.
	(i386_pseudo_register_name): Support pseudo YMM registers.
	(i386_pseudo_register_type): Likewise.
	(i386_pseudo_register_read): Likewise.
	(i386_pseudo_register_write): Likewise.
	(i386_regset_from_core_section): Support .reg-xstate section.
	(i386_register_reggroup_p): Support upper YMM and YMM registers.
	(i386_validate_tdesc_p): Support org.gnu.gdb.i386.avx feature.
	Set ymmh_register_names, num_ymm_regs, ymm0h_regnum and xcr0.
	(i386_gdbarch_init): Set xstateregset.  Set xsave_xcr0_offset. 
	Call set_gdbarch_register_name.  Replace I386_SSE_NUM_REGS with
	I386_AVX_NUM_REGS.  Set ymmh_register_names, ymm0h_regnum and
	num_ymm_regs.  Add num_ymm_regs to set_gdbarch_num_pseudo_regs.
	Set ymm0_regnum.  Call set_gdbarch_qsupported.
	(_initialize_i386_tdep): Call initialize_tdesc_i386_avx.

	* i386-tdep.h (gdbarch_tdep): Add xstateregset, ymm0_regnum,
	xcr0, xsave_xcr0_offset, ymm0h_regnum, ymmh_register_names and
	i386_ymm_type.
	(i386_regnum): Add I386_YMM0H_REGNUM and I386_YMM7H_REGNUM.
	(I386_AVX_NUM_REGS): New.
	(i386_ymm_regnum_p): Likewise.
	(i386_ymmh_regnum_p): Likewise.

	* common/i386-xstate.h: New.
	* config/i386/nm-linux-xstate.h: Likewise.
	* config/i386/nm-linux64.h: Likewise.

	* config/i386/linux64.mh (NAT_FILE): Set to nm-linux64.h.

	* config/i386/nm-linux.h: Include "config/i386/nm-linux-xstate.h".
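
For readers following the XCR0 handling, here is a minimal sketch of how the constants in common/i386-xstate.h might be combined with the XCR0 copy that the kernel stores at byte offset 464 of the XSAVE area.  The macros are copied from the patch below; the demo_* names and DEMO_XSAVE_XCR0_OFFSET are hypothetical, and the little-endian read is an assumption of the sketch rather than the code in this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copied from common/i386-xstate.h in this patch.  */
#define bit_I386_XSTATE_X87 (1ULL << 0)
#define bit_I386_XSTATE_SSE (1ULL << 1)
#define bit_I386_XSTATE_AVX (1ULL << 2)
#define I386_XSTATE_SSE_MASK (bit_I386_XSTATE_X87 | bit_I386_XSTATE_SSE)
#define I386_XSTATE_AVX_MASK (I386_XSTATE_SSE_MASK | bit_I386_XSTATE_AVX)
#define I386_XSTATE_MAX_MASK I386_XSTATE_AVX_MASK
#define I386_XSTATE_SSE_SIZE 576
#define I386_XSTATE_AVX_SIZE 832
#define I386_XSTATE_SIZE(XCR0) \
  (((XCR0) & bit_I386_XSTATE_AVX) != 0 \
   ? I386_XSTATE_AVX_SIZE : I386_XSTATE_SSE_SIZE)

/* Byte offset of the kernel's XCR0 copy in the software-usable bytes of
   the XSAVE area.  Illustrative name for this sketch only.  */
#define DEMO_XSAVE_XCR0_OFFSET 464

/* Read the 8-byte little-endian XCR0 value out of a raw XSAVE buffer.  */
uint64_t
demo_read_xcr0 (const uint8_t *xsave, size_t len)
{
  uint64_t xcr0 = 0;

  assert (xsave != NULL && len >= DEMO_XSAVE_XCR0_OFFSET + 8);
  for (int i = 7; i >= 0; i--)
    xcr0 = (xcr0 << 8) | xsave[DEMO_XSAVE_XCR0_OFFSET + i];
  return xcr0;
}

/* Drop feature bits beyond AVX, so that newer state is ignored.  */
uint64_t
demo_usable_xcr0 (uint64_t xcr0)
{
  return xcr0 & I386_XSTATE_MAX_MASK;
}

/* Size in bytes of the XSAVE area needed for the given XCR0.  */
unsigned int
demo_xstate_size (uint64_t xcr0)
{
  return I386_XSTATE_SIZE (xcr0);
}
```

With these helpers an XCR0 of 0x3 (x87 and SSE) yields 576 bytes and 0x7 (x87, SSE and AVX) yields 832 bytes, while any higher bits are masked off before the size is chosen.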

diff --git a/gdb/common/i386-xstate.h b/gdb/common/i386-xstate.h
new file mode 100644
index 0000000..3548103
--- /dev/null
+++ b/gdb/common/i386-xstate.h
@@ -0,0 +1,45 @@
+/* Common code for i386 XSAVE extended state.
+
+   Copyright (C) 2010 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#ifndef I386_XSTATE_H
+#define I386_XSTATE_H 1
+
+/* The extended state feature bits.  */
+#define bit_I386_XSTATE_X87		(1ULL << 0)
+#define bit_I386_XSTATE_SSE		(1ULL << 1)
+#define bit_I386_XSTATE_AVX		(1ULL << 2)
+
+/* Supported mask and size of the extended state.  */
+#define I386_XSTATE_SSE_MASK	\
+  (bit_I386_XSTATE_X87 | bit_I386_XSTATE_SSE)
+#define I386_XSTATE_AVX_MASK	\
+  (I386_XSTATE_SSE_MASK | bit_I386_XSTATE_AVX)
+#define I386_XSTATE_MAX_MASK	\
+  I386_XSTATE_AVX_MASK
+
+#define I386_XSTATE_SSE_SIZE		576
+#define I386_XSTATE_AVX_SIZE		832
+#define I386_XSTATE_MAX_SIZE		832
+
+/* Get I386 XSAVE extended state size.  */
+#define I386_XSTATE_SIZE(XCR0)	\
+  (((XCR0) & bit_I386_XSTATE_AVX) != 0 \
+   ? I386_XSTATE_AVX_SIZE : I386_XSTATE_SSE_SIZE)
+
+#endif /* I386_XSTATE_H */
diff --git a/gdb/config/i386/linux64.mh b/gdb/config/i386/linux64.mh
index 19f3be0..99a5042 100644
--- a/gdb/config/i386/linux64.mh
+++ b/gdb/config/i386/linux64.mh
@@ -2,7 +2,7 @@
 NATDEPFILES= inf-ptrace.o fork-child.o \
 	i386-nat.o amd64-nat.o amd64-linux-nat.o linux-nat.o \
 	proc-service.o linux-thread-db.o linux-fork.o
-NAT_FILE= config/nm-linux.h
+NAT_FILE= nm-linux64.h
 
 # The dynamically loaded libthread_db needs access to symbols in the
 # gdb executable.
diff --git a/gdb/config/i386/nm-linux-xstate.h b/gdb/config/i386/nm-linux-xstate.h
new file mode 100644
index 0000000..0dbf9e5
--- /dev/null
+++ b/gdb/config/i386/nm-linux-xstate.h
@@ -0,0 +1,33 @@
+/* Native XSAVE extended state support for GNU/Linux x86.
+
+   Copyright 2010 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#ifndef	NM_LINUX_XSTATE_H
+#define	NM_LINUX_XSTATE_H
+
+#include "i386-xstate.h"
+
+#ifndef PTRACE_GETREGSET
+#define PTRACE_GETREGSET	0x4204
+#endif
+
+#ifndef PTRACE_SETREGSET
+#define PTRACE_SETREGSET	0x4205
+#endif
+
+#endif	/* NM_LINUX_XSTATE_H */
diff --git a/gdb/config/i386/nm-linux.h b/gdb/config/i386/nm-linux.h
index 10db309..fab8a0d 100644
--- a/gdb/config/i386/nm-linux.h
+++ b/gdb/config/i386/nm-linux.h
@@ -23,6 +23,7 @@
 #define NM_LINUX_H
 
 #include "config/nm-linux.h"
+#include "config/i386/nm-linux-xstate.h"
 
 #ifdef HAVE_PTRACE_GETFPXREGS
 /* Include register set support for the SSE registers.  */
diff --git a/gdb/config/i386/nm-linux64.h b/gdb/config/i386/nm-linux64.h
new file mode 100644
index 0000000..75220d6
--- /dev/null
+++ b/gdb/config/i386/nm-linux64.h
@@ -0,0 +1,26 @@
+/* Native support for GNU/Linux x86-64.
+
+   Copyright 2010 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#ifndef NM_LINUX64_H
+#define NM_LINUX64_H
+
+#include "config/nm-linux.h"
+#include "config/i386/nm-linux-xstate.h"
+
+#endif /* nm-linux64.h */
diff --git a/gdb/i386-linux-nat.c b/gdb/i386-linux-nat.c
index 31b9086..344c814 100644
--- a/gdb/i386-linux-nat.c
+++ b/gdb/i386-linux-nat.c
@@ -23,11 +23,14 @@
 #include "inferior.h"
 #include "gdbcore.h"
 #include "regcache.h"
+#include "regset.h"
 #include "target.h"
 #include "linux-nat.h"
 
 #include "gdb_assert.h"
 #include "gdb_string.h"
+#include "elf/common.h"
+#include <sys/uio.h>
 #include <sys/ptrace.h>
 #include <sys/user.h>
 #include <sys/procfs.h>
@@ -69,6 +72,16 @@
 
 /* Defines ps_err_e, struct ps_prochandle.  */
 #include "gdb_proc_service.h"
+
+/* The extended state size in bytes.  */
+static unsigned int xstate_size;
+
+/* The extended state size in units of int64.  We use an array of
+   int64 for better alignment.  */
+static unsigned int xstate_size_n_of_int64;
+
+/* Does the current host support PTRACE_GETREGSET?  */
+static int have_ptrace_getregset = -1;
 \f
 
 /* The register sets used in GNU/Linux ELF core-dumps are identical to
@@ -98,6 +111,8 @@ static int regmap[] =
   -1, -1, -1, -1,		/* xmm0, xmm1, xmm2, xmm3 */
   -1, -1, -1, -1,		/* xmm4, xmm5, xmm6, xmm6 */
   -1,				/* mxcsr */
+  -1, -1, -1, -1,		/* ymm0h, ymm1h, ymm2h, ymm3h */
+  -1, -1, -1, -1,		/* ymm4h, ymm5h, ymm6h, ymm7h */
   ORIG_EAX
 };
 
@@ -110,6 +125,9 @@ static int regmap[] =
 #define GETFPXREGS_SUPPLIES(regno) \
   (I386_ST0_REGNUM <= (regno) && (regno) < I386_SSE_NUM_REGS)
 
+#define GETXSTATEREGS_SUPPLIES(regno) \
+  (I386_ST0_REGNUM <= (regno) && (regno) < I386_AVX_NUM_REGS)
+
 /* Does the current host support the GETREGS request?  */
 int have_ptrace_getregs =
 #ifdef HAVE_PTRACE_GETREGS
@@ -355,6 +373,57 @@ static void store_fpregs (const struct regcache *regcache, int tid, int regno) {
 
 /* Transfering floating-point and SSE registers to and from GDB.  */
 
+/* Fetch all registers covered by the PTRACE_GETREGSET request from
+   process/thread TID and store their values in GDB's register array.
+   Return non-zero if successful, zero otherwise.  */
+
+static int
+fetch_xstateregs (struct regcache *regcache, int tid)
+{
+  unsigned long long xstateregs[xstate_size_n_of_int64];
+  struct iovec iov;
+
+  if (!have_ptrace_getregset)
+    return 0;
+
+  iov.iov_base = xstateregs;
+  iov.iov_len = xstate_size;
+  if (ptrace (PTRACE_GETREGSET, tid, (unsigned int) NT_X86_XSTATE,
+	      (int) &iov) < 0)
+    perror_with_name (_("Couldn't read extended state status"));
+
+  i387_supply_xsave (regcache, -1, xstateregs);
+  return 1;
+}
+
+/* Store all valid registers in GDB's register array covered by the
+   PTRACE_SETREGSET request into the process/thread specified by TID.
+   Return non-zero if successful, zero otherwise.  */
+
+static int
+store_xstateregs (const struct regcache *regcache, int tid, int regno)
+{
+  unsigned long long xstateregs[xstate_size_n_of_int64];
+  struct iovec iov;
+
+  if (!have_ptrace_getregset)
+    return 0;
+  
+  iov.iov_base = xstateregs;
+  iov.iov_len = xstate_size;
+  if (ptrace (PTRACE_GETREGSET, tid, (unsigned int) NT_X86_XSTATE,
+	      (int) &iov) < 0)
+    perror_with_name (_("Couldn't read extended state status"));
+
+  i387_collect_xsave (regcache, regno, xstateregs, 0);
+
+  if (ptrace (PTRACE_SETREGSET, tid, (unsigned int) NT_X86_XSTATE,
+	      (int) &iov) < 0)
+    perror_with_name (_("Couldn't write extended state status"));
+
+  return 1;
+}
+
 #ifdef HAVE_PTRACE_GETFPXREGS
 
 /* Fill GDB's register array with the floating-point and SSE register
@@ -489,6 +558,8 @@ i386_linux_fetch_inferior_registers (struct target_ops *ops,
 	  return;
 	}
 
+      if (fetch_xstateregs (regcache, tid))
+	return;
       if (fetch_fpxregs (regcache, tid))
 	return;
       fetch_fpregs (regcache, tid);
@@ -501,6 +572,12 @@ i386_linux_fetch_inferior_registers (struct target_ops *ops,
       return;
     }
 
+  if (GETXSTATEREGS_SUPPLIES (regno))
+    {
+      if (fetch_xstateregs (regcache, tid))
+	return;
+    }
+
   if (GETFPXREGS_SUPPLIES (regno))
     {
       if (fetch_fpxregs (regcache, tid))
@@ -553,6 +630,8 @@ i386_linux_store_inferior_registers (struct target_ops *ops,
   if (regno == -1)
     {
       store_regs (regcache, tid, regno);
+      if (store_xstateregs (regcache, tid, regno))
+	return;
       if (store_fpxregs (regcache, tid, regno))
 	return;
       store_fpregs (regcache, tid, regno);
@@ -565,6 +644,12 @@ i386_linux_store_inferior_registers (struct target_ops *ops,
       return;
     }
 
+  if (GETXSTATEREGS_SUPPLIES (regno))
+    {
+      if (store_xstateregs (regcache, tid, regno))
+	return;
+    }
+
   if (GETFPXREGS_SUPPLIES (regno))
     {
       if (store_fpxregs (regcache, tid, regno))
@@ -858,7 +943,49 @@ i386_linux_child_post_startup_inferior (ptid_t ptid)
 static const struct target_desc *
 i386_linux_read_description (struct target_ops *ops)
 {
-  return tdesc_i386_linux;
+  static unsigned long long xcr0;
+
+  if (have_ptrace_getregset == -1)
+    {
+      int tid;
+      unsigned long long xstateregs[(I386_XSTATE_SSE_SIZE
+				     / sizeof (long long))];
+      struct iovec iov;
+
+      /* GNU/Linux LWP ID's are process ID's.  */
+      tid = TIDGET (inferior_ptid);
+      if (tid == 0)
+	tid = PIDGET (inferior_ptid); /* Not a threaded program.  */
+
+      iov.iov_base = xstateregs;
+      iov.iov_len = I386_XSTATE_SSE_SIZE;
+
+      /* Check if PTRACE_GETREGSET works.  */
+      if (ptrace (PTRACE_GETREGSET, tid,
+		  (unsigned int) NT_X86_XSTATE, (long) &iov) < 0)
+	have_ptrace_getregset = 0;
+      else
+	{
+	  have_ptrace_getregset = 1;
+
+	  /* Get XCR0 from XSAVE extended state.  */
+	  xcr0 = xstateregs[(I386_LINUX_XSAVE_XCR0_OFFSET
+			     / sizeof (long long))];
+
+	  xstate_size = I386_XSTATE_SIZE (xcr0);
+	  xstate_size_n_of_int64 = xstate_size / sizeof (long long);
+	}
+
+      i386_linux_update_xstateregset (i386_linux_regset_sections,
+				      xstate_size);
+    }
+
+  /* Check the native XCR0 only if PTRACE_GETREGSET is available.  */
+  if (have_ptrace_getregset
+      && (xcr0 & I386_XSTATE_AVX_MASK) == I386_XSTATE_AVX_MASK)
+    return tdesc_i386_avx_linux;
+  else
+    return tdesc_i386_linux;
 }
 
 void
diff --git a/gdb/i386-linux-tdep.c b/gdb/i386-linux-tdep.c
index b23c109..66ecf84 100644
--- a/gdb/i386-linux-tdep.c
+++ b/gdb/i386-linux-tdep.c
@@ -23,6 +23,7 @@
 #include "frame.h"
 #include "value.h"
 #include "regcache.h"
+#include "regset.h"
 #include "inferior.h"
 #include "osabi.h"
 #include "reggroups.h"
@@ -36,9 +37,11 @@
 #include "solib-svr4.h"
 #include "symtab.h"
 #include "arch-utils.h"
-#include "regset.h"
 #include "xml-syscall.h"
 
+#include "i387-tdep.h"
+#include "i386-xstate.h"
+
 /* The syscall's XML filename for i386.  */
 #define XML_SYSCALL_FILENAME_I386 "syscalls/i386-linux.xml"
 
@@ -47,13 +50,15 @@
 #include <stdint.h>
 
 #include "features/i386/i386-linux.c"
+#include "features/i386/i386-avx-linux.c"
 
 /* Supported register note sections.  */
-static struct core_regset_section i386_linux_regset_sections[] =
+struct core_regset_section i386_linux_regset_sections[] =
 {
   { ".reg", 144, "general-purpose" },
   { ".reg2", 108, "floating-point" },
   { ".reg-xfp", 512, "extended floating-point" },
+  { ".reg-xstate", 0, "XSAVE extended state" },
   { NULL, 0 }
 };
 
@@ -533,6 +538,7 @@ static int i386_linux_gregset_reg_offset[] =
   -1, -1, -1, -1, -1, -1, -1, -1,
   -1, -1, -1, -1, -1, -1, -1, -1,
   -1,
+  -1, -1, -1, -1, -1, -1, -1, -1,
   11 * 4			/* "orig_eax" */
 };
 
@@ -560,6 +566,66 @@ static int i386_linux_sc_reg_offset[] =
   0 * 4				/* %gs */
 };
 
+/* Update XSAVE extended state register note section.  */
+
+void
+i386_linux_update_xstateregset
+  (struct core_regset_section *regset_sections, unsigned int xstate_size)
+{
+  int i;
+
+  /* Update the XSAVE extended state register note section for "gcore".
+     Disable it if its size is 0.  */
+  for (i = 0; regset_sections[i].sect_name != NULL; i++)
+    if (strcmp (regset_sections[i].sect_name, ".reg-xstate") == 0)
+      {
+	if (xstate_size)
+	  regset_sections[i].size = xstate_size;
+	else
+	  regset_sections[i].sect_name = NULL;
+	break;
+      }
+}
+
+/* Get XSAVE extended state xcr0 from core dump.  */
+
+unsigned long long
+i386_linux_core_read_xcr0 (struct gdbarch *gdbarch,
+			   struct target_ops *target, bfd *abfd)
+{
+  asection *xstate = bfd_get_section_by_name (abfd, ".reg-xstate");
+  unsigned long long xcr0;
+
+  if (xstate)
+    {
+      size_t size = bfd_section_size (abfd, xstate);
+
+      gdb_assert (size >= I386_XSTATE_SSE_SIZE);
+
+      /* Check extended state size.  */
+      if (size < I386_XSTATE_AVX_SIZE)
+	xcr0 = I386_XSTATE_SSE_MASK;
+      else
+	{
+	  char contents[8];
+
+	  if (! bfd_get_section_contents (abfd, xstate, contents,
+					  (file_ptr) I386_LINUX_XSAVE_XCR0_OFFSET,
+					  8))
+	    {
+	      warning (_("Couldn't read `xcr0' bytes from `.reg-xstate' section in core file."));
+	      return 0;
+	    }
+
+	  xcr0 = bfd_get_64 (abfd, contents);
+	}
+    }
+  else
+    xcr0 = I386_XSTATE_SSE_MASK;
+
+  return xcr0;
+}
+
 /* Get Linux/x86 target description from core dump.  */
 
 static const struct target_desc *
@@ -568,12 +634,17 @@ i386_linux_core_read_description (struct gdbarch *gdbarch,
 				  bfd *abfd)
 {
   asection *section = bfd_get_section_by_name (abfd, ".reg2");
+  unsigned long long xcr0;
 
   if (section == NULL)
     return NULL;
 
   /* Linux/i386.  */
-  return tdesc_i386_linux;
+  xcr0 = i386_linux_core_read_xcr0 (gdbarch, target, abfd);
+  if ((xcr0 & I386_XSTATE_AVX_MASK) == I386_XSTATE_AVX_MASK)
+    return tdesc_i386_avx_linux;
+  else
+    return tdesc_i386_linux;
 }
 
 static void
@@ -623,6 +694,8 @@ i386_linux_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch)
   tdep->sc_reg_offset = i386_linux_sc_reg_offset;
   tdep->sc_num_regs = ARRAY_SIZE (i386_linux_sc_reg_offset);
 
+  tdep->xsave_xcr0_offset = I386_LINUX_XSAVE_XCR0_OFFSET;
+
   set_gdbarch_process_record (gdbarch, i386_process_record);
   set_gdbarch_process_record_signal (gdbarch, i386_linux_record_signal);
 
@@ -840,4 +913,5 @@ _initialize_i386_linux_tdep (void)
 
   /* Initialize the Linux target description  */
   initialize_tdesc_i386_linux ();
+  initialize_tdesc_i386_avx_linux ();
 }
diff --git a/gdb/i386-linux-tdep.h b/gdb/i386-linux-tdep.h
index 11f7295..8881fea 100644
--- a/gdb/i386-linux-tdep.h
+++ b/gdb/i386-linux-tdep.h
@@ -30,12 +30,45 @@
 /* Register number for the "orig_eax" pseudo-register.  If this
    pseudo-register contains a value >= 0 it is interpreted as the
    system call number that the kernel is supposed to restart.  */
-#define I386_LINUX_ORIG_EAX_REGNUM I386_SSE_NUM_REGS
+#define I386_LINUX_ORIG_EAX_REGNUM I386_AVX_NUM_REGS
 
 /* Total number of registers for GNU/Linux.  */
 #define I386_LINUX_NUM_REGS (I386_LINUX_ORIG_EAX_REGNUM + 1)
 
+/* Get XSAVE extended state xcr0 from core dump.  */
+extern unsigned long long i386_linux_core_read_xcr0
+  (struct gdbarch *gdbarch, struct target_ops *target, bfd *abfd);
+
 /* Linux target description.  */
 extern struct target_desc *tdesc_i386_linux;
+extern struct target_desc *tdesc_i386_avx_linux;
+
+/* Supported register note sections.  */
+extern struct core_regset_section i386_linux_regset_sections[];
+
+/* Update XSAVE extended state register note section.  */
+extern void i386_linux_update_xstateregset
+  (struct core_regset_section *regset_sections, unsigned int xstate_size);
+
+/* Format of XSAVE extended state is:
+ 	struct
+	{
+	  fxsave_bytes[0..463]
+	  sw_usable_bytes[464..511]
+	  xstate_hdr_bytes[512..575]
+	  avx_bytes[576..831]
+	  future_state etc
+	};
+
+  The same memory layout is used for the coredump NT_X86_XSTATE note
+  representing the XSAVE extended state registers.
+
+  The first 8 bytes of the sw_usable_bytes[464..471] are set to the
+  OS-enabled state mask, which is the same as the 64-bit mask returned
+  by xgetbv for XCR0.  We can use this mask, together with the mask
+  saved in the xstate_hdr bytes, to determine which states the
+  processor/OS supports and which states are in use (initialized) for
+  the particular process/thread.  */
+#define I386_LINUX_XSAVE_XCR0_OFFSET 464
 
 #endif /* i386-linux-tdep.h */
diff --git a/gdb/i386-tdep.c b/gdb/i386-tdep.c
index 05afa56..8ced34a 100644
--- a/gdb/i386-tdep.c
+++ b/gdb/i386-tdep.c
@@ -50,11 +50,13 @@
 
 #include "i386-tdep.h"
 #include "i387-tdep.h"
+#include "i386-xstate.h"
 
 #include "record.h"
 #include <stdint.h>
 
 #include "features/i386/i386.c"
+#include "features/i386/i386-avx.c"
 
 /* Register names.  */
 
@@ -73,6 +75,18 @@ static const char *i386_register_names[] =
   "mxcsr"
 };
 
+static const char *i386_ymm_names[] =
+{
+  "ymm0",  "ymm1",   "ymm2",  "ymm3",
+  "ymm4",  "ymm5",   "ymm6",  "ymm7",
+};
+
+static const char *i386_ymmh_names[] =
+{
+  "ymm0h",  "ymm1h",   "ymm2h",  "ymm3h",
+  "ymm4h",  "ymm5h",   "ymm6h",  "ymm7h",
+};
+
 /* Register names for MMX pseudo-registers.  */
 
 static const char *i386_mmx_names[] =
@@ -149,18 +163,47 @@ i386_dword_regnum_p (struct gdbarch *gdbarch, int regnum)
   return regnum >= 0 && regnum < tdep->num_dword_regs;
 }
 
+int
+i386_ymmh_regnum_p (struct gdbarch *gdbarch, int regnum)
+{
+  struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+  int ymm0h_regnum = tdep->ymm0h_regnum;
+
+  if (ymm0h_regnum < 0)
+    return 0;
+
+  regnum -= ymm0h_regnum;
+  return regnum >= 0 && regnum < tdep->num_ymm_regs;
+}
+
+/* AVX register?  */
+
+int
+i386_ymm_regnum_p (struct gdbarch *gdbarch, int regnum)
+{
+  struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+  int ymm0_regnum = tdep->ymm0_regnum;
+
+  if (ymm0_regnum < 0)
+    return 0;
+
+  regnum -= ymm0_regnum;
+  return regnum >= 0 && regnum < tdep->num_ymm_regs;
+}
+
 /* SSE register?  */
 
 static int
-i386_sse_regnum_p (struct gdbarch *gdbarch, int regnum)
+i386_xmm_regnum_p (struct gdbarch *gdbarch, int regnum)
 {
   struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+  int num_xmm_regs = I387_NUM_XMM_REGS (tdep);
 
-  if (I387_NUM_XMM_REGS (tdep) == 0)
+  if (num_xmm_regs == 0)
     return 0;
 
-  return (I387_XMM0_REGNUM (tdep) <= regnum
-	  && regnum < I387_MXCSR_REGNUM (tdep));
+  regnum -= I387_XMM0_REGNUM (tdep);
+  return regnum >= 0 && regnum < num_xmm_regs;
 }
 
 static int
@@ -200,6 +243,19 @@ i386_fpc_regnum_p (struct gdbarch *gdbarch, int regnum)
 	  && regnum < I387_XMM0_REGNUM (tdep));
 }
 
+/* Return the name of register REGNUM, or the empty string if it is
+   an anonymous register.  */
+
+static const char *
+i386_register_name (struct gdbarch *gdbarch, int regnum)
+{
+  /* Hide the upper YMM registers.  */
+  if (i386_ymmh_regnum_p (gdbarch, regnum))
+    return "";
+
+  return tdesc_register_name (gdbarch, regnum);
+}
+
 /* Return the name of register REGNUM.  */
 
 const char *
@@ -208,6 +264,8 @@ i386_pseudo_register_name (struct gdbarch *gdbarch, int regnum)
   struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
   if (i386_mmx_regnum_p (gdbarch, regnum))
     return i386_mmx_names[regnum - I387_MM0_REGNUM (tdep)];
+  else if (i386_ymm_regnum_p (gdbarch, regnum))
+    return i386_ymm_names[regnum - tdep->ymm0_regnum];
   else if (i386_byte_regnum_p (gdbarch, regnum))
     return i386_byte_names[regnum - tdep->al_regnum];
   else if (i386_word_regnum_p (gdbarch, regnum))
@@ -2183,6 +2241,59 @@ i387_ext_type (struct gdbarch *gdbarch)
   return tdep->i387_ext_type;
 }
 
+/* Construct vector type for pseudo YMM registers.  We can't use
+   tdesc_find_type since YMM isn't described in target description.  */
+
+static struct type *
+i386_ymm_type (struct gdbarch *gdbarch)
+{
+  struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+
+  if (!tdep->i386_ymm_type)
+    {
+      const struct builtin_type *bt = builtin_type (gdbarch);
+
+      /* The type we're building is this: */
+#if 0
+      union __gdb_builtin_type_vec256i
+      {
+        int128_t v2_int128[2];
+        int64_t v4_int64[4];
+        int32_t v8_int32[8];
+        int16_t v16_int16[16];
+        int8_t v32_int8[32];
+        double v4_double[4];
+        float v8_float[8];
+      };
+#endif
+
+      struct type *t;
+
+      t = arch_composite_type (gdbarch,
+			       "__gdb_builtin_type_vec256i", TYPE_CODE_UNION);
+      append_composite_type_field (t, "v8_float",
+				   init_vector_type (bt->builtin_float, 8));
+      append_composite_type_field (t, "v4_double",
+				   init_vector_type (bt->builtin_double, 4));
+      append_composite_type_field (t, "v32_int8",
+				   init_vector_type (bt->builtin_int8, 32));
+      append_composite_type_field (t, "v16_int16",
+				   init_vector_type (bt->builtin_int16, 16));
+      append_composite_type_field (t, "v8_int32",
+				   init_vector_type (bt->builtin_int32, 8));
+      append_composite_type_field (t, "v4_int64",
+				   init_vector_type (bt->builtin_int64, 4));
+      append_composite_type_field (t, "v2_int128",
+				   init_vector_type (bt->builtin_int128, 2));
+
+      TYPE_VECTOR (t) = 1;
+      TYPE_NAME (t) = "builtin_type_vec256i";
+      tdep->i386_ymm_type = t;
+    }
+
+  return tdep->i386_ymm_type;
+}
+
 /* Construct vector type for MMX registers.  */
 static struct type *
 i386_mmx_type (struct gdbarch *gdbarch)
@@ -2233,6 +2344,8 @@ i386_pseudo_register_type (struct gdbarch *gdbarch, int regnum)
 {
   if (i386_mmx_regnum_p (gdbarch, regnum))
     return i386_mmx_type (gdbarch);
+  else if (i386_ymm_regnum_p (gdbarch, regnum))
+    return i386_ymm_type (gdbarch);
   else
     {
       const struct builtin_type *bt = builtin_type (gdbarch);
@@ -2284,7 +2397,22 @@ i386_pseudo_register_read (struct gdbarch *gdbarch, struct regcache *regcache,
     {
       struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
 
-      if (i386_word_regnum_p (gdbarch, regnum))
+      if (i386_ymm_regnum_p (gdbarch, regnum))
+	{
+	  regnum -= tdep->ymm0_regnum;
+
+	  /* Extract (always little endian).  Read lower 16 bytes.  */
+	  regcache_raw_read (regcache,
+			     I387_XMM0_REGNUM (tdep) + regnum,
+			     raw_buf);
+	  memcpy (buf, raw_buf, 16);
+	  /* Read upper 16 bytes.  */
+	  regcache_raw_read (regcache,
+			     tdep->ymm0h_regnum + regnum,
+			     raw_buf);
+	  memcpy (buf + 16, raw_buf, 16);
+	}
+      else if (i386_word_regnum_p (gdbarch, regnum))
 	{
 	  int gpnum = regnum - tdep->ax_regnum;
 
@@ -2333,7 +2461,20 @@ i386_pseudo_register_write (struct gdbarch *gdbarch, struct regcache *regcache,
     {
       struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
 
-      if (i386_word_regnum_p (gdbarch, regnum))
+      if (i386_ymm_regnum_p (gdbarch, regnum))
+	{
+	  regnum -= tdep->ymm0_regnum;
+
+	  /* ... Write lower 16 bytes.  */
+	  regcache_raw_write (regcache,
+			     I387_XMM0_REGNUM (tdep) + regnum,
+			     buf);
+	  /* ... Write upper 16 bytes.  */
+	  regcache_raw_write (regcache,
+			     tdep->ymm0h_regnum + regnum,
+			     buf + 16);
+	}
+      else if (i386_word_regnum_p (gdbarch, regnum))
 	{
 	  int gpnum = regnum - tdep->ax_regnum;
 
@@ -2580,6 +2721,28 @@ i386_collect_fpregset (const struct regset *regset,
   i387_collect_fsave (regcache, regnum, fpregs);
 }
 
+/* Similar to i386_supply_fpregset, but use XSAVE extended state.  */
+
+static void
+i386_supply_xstateregset (const struct regset *regset,
+			  struct regcache *regcache, int regnum,
+			  const void *xstateregs, size_t len)
+{
+  const struct gdbarch_tdep *tdep = gdbarch_tdep (regset->arch);
+  i387_supply_xsave (regcache, regnum, xstateregs);
+}
+
+/* Similar to i386_collect_fpregset, but use XSAVE extended state.  */
+
+static void
+i386_collect_xstateregset (const struct regset *regset,
+			   const struct regcache *regcache,
+			   int regnum, void *xstateregs, size_t len)
+{
+  const struct gdbarch_tdep *tdep = gdbarch_tdep (regset->arch);
+  i387_collect_xsave (regcache, regnum, xstateregs, 1);
+}
+
 /* Return the appropriate register set for the core section identified
    by SECT_NAME and SECT_SIZE.  */
 
@@ -2607,6 +2770,16 @@ i386_regset_from_core_section (struct gdbarch *gdbarch,
       return tdep->fpregset;
     }
 
+  if (strcmp (sect_name, ".reg-xstate") == 0)
+    {
+      if (tdep->xstateregset == NULL)
+	tdep->xstateregset = regset_alloc (gdbarch,
+					   i386_supply_xstateregset,
+					   i386_collect_xstateregset);
+
+      return tdep->xstateregset;
+    }
+
   return NULL;
 }
 \f
@@ -2800,46 +2973,60 @@ int
 i386_register_reggroup_p (struct gdbarch *gdbarch, int regnum,
 			  struct reggroup *group)
 {
-  int sse_regnum_p, fp_regnum_p, mmx_regnum_p, byte_regnum_p,
-      word_regnum_p, dword_regnum_p;
+  const struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+  int fp_regnum_p, mmx_regnum_p, xmm_regnum_p, mxcsr_regnum_p,
+      ymm_regnum_p, ymmh_regnum_p;
 
   /* Don't include pseudo registers, except for MMX, in any register
      groups.  */
-  byte_regnum_p = i386_byte_regnum_p (gdbarch, regnum);
-  if (byte_regnum_p)
+  if (i386_byte_regnum_p (gdbarch, regnum))
     return 0;
 
-  word_regnum_p = i386_word_regnum_p (gdbarch, regnum);
-  if (word_regnum_p)
+  if (i386_word_regnum_p (gdbarch, regnum))
     return 0;
 
-  dword_regnum_p = i386_dword_regnum_p (gdbarch, regnum);
-  if (dword_regnum_p)
+  if (i386_dword_regnum_p (gdbarch, regnum))
     return 0;
 
   mmx_regnum_p = i386_mmx_regnum_p (gdbarch, regnum);
   if (group == i386_mmx_reggroup)
     return mmx_regnum_p;
 
-  sse_regnum_p = (i386_sse_regnum_p (gdbarch, regnum)
-		  || i386_mxcsr_regnum_p (gdbarch, regnum));
+  xmm_regnum_p = i386_xmm_regnum_p (gdbarch, regnum);
+  mxcsr_regnum_p = i386_mxcsr_regnum_p (gdbarch, regnum);
   if (group == i386_sse_reggroup)
-    return sse_regnum_p;
+    return xmm_regnum_p || mxcsr_regnum_p;
+
+  ymm_regnum_p = i386_ymm_regnum_p (gdbarch, regnum);
   if (group == vector_reggroup)
-    return mmx_regnum_p || sse_regnum_p;
+    return (mmx_regnum_p
+	    || ymm_regnum_p
+	    || mxcsr_regnum_p
+	    || (xmm_regnum_p
+		&& ((tdep->xcr0 & I386_XSTATE_AVX_MASK)
+		    == I386_XSTATE_SSE_MASK)));
 
   fp_regnum_p = (i386_fp_regnum_p (gdbarch, regnum)
 		 || i386_fpc_regnum_p (gdbarch, regnum));
   if (group == float_reggroup)
     return fp_regnum_p;
 
+  /* For "info reg all", don't include upper YMM registers or XMM
+     registers when AVX is supported.  */
+  ymmh_regnum_p = i386_ymmh_regnum_p (gdbarch, regnum);
+  if (group == all_reggroup
+      && ((xmm_regnum_p
+	   && (tdep->xcr0 & bit_I386_XSTATE_AVX))
+	  || ymmh_regnum_p))
+    return 0;
+
   if (group == general_reggroup)
     return (!fp_regnum_p
 	    && !mmx_regnum_p
-	    && !sse_regnum_p
-	    && !byte_regnum_p
-	    && !word_regnum_p
-	    && !dword_regnum_p);
+	    && !mxcsr_regnum_p
+	    && !xmm_regnum_p
+	    && !ymm_regnum_p
+	    && !ymmh_regnum_p);
 
   return default_register_reggroup_p (gdbarch, regnum, group);
 }
@@ -5649,7 +5836,8 @@ i386_validate_tdesc_p (struct gdbarch_tdep *tdep,
 		       struct tdesc_arch_data *tdesc_data)
 {
   const struct target_desc *tdesc = tdep->tdesc;
-  const struct tdesc_feature *feature_core, *feature_vector;
+  const struct tdesc_feature *feature_core;
+  const struct tdesc_feature *feature_sse, *feature_avx;
   int i, num_regs, valid_p;
 
   if (! tdesc_has_registers (tdesc))
@@ -5659,13 +5847,37 @@ i386_validate_tdesc_p (struct gdbarch_tdep *tdep,
   feature_core = tdesc_find_feature (tdesc, "org.gnu.gdb.i386.core");
 
   /* Get SSE registers.  */
-  feature_vector = tdesc_find_feature (tdesc, "org.gnu.gdb.i386.sse");
+  feature_sse = tdesc_find_feature (tdesc, "org.gnu.gdb.i386.sse");
 
-  if (feature_core == NULL || feature_vector == NULL)
+  if (feature_core == NULL || feature_sse == NULL)
     return 0;
 
+  /* Try AVX registers.  */
+  feature_avx = tdesc_find_feature (tdesc, "org.gnu.gdb.i386.avx");
+
   valid_p = 1;
 
+  /* The XCR0 bits.  */
+  if (feature_avx)
+    {
+      tdep->xcr0 = I386_XSTATE_AVX_MASK;
+
+      /* It may have been set by ABI-specific code.  */
+      if (tdep->num_ymm_regs == 0)
+	{
+	  tdep->ymmh_register_names = i386_ymmh_names;
+	  tdep->num_ymm_regs = 8;
+	  tdep->ymm0h_regnum = I386_YMM0H_REGNUM;
+	}
+
+      for (i = 0; i < tdep->num_ymm_regs; i++)
+	valid_p &= tdesc_numbered_register (feature_avx, tdesc_data,
+					    tdep->ymm0h_regnum + i,
+					    tdep->ymmh_register_names[i]);
+    }
+  else
+    tdep->xcr0 = I386_XSTATE_SSE_MASK;
+
   num_regs = tdep->num_core_regs;
   for (i = 0; i < num_regs; i++)
     valid_p &= tdesc_numbered_register (feature_core, tdesc_data, i,
@@ -5674,7 +5886,7 @@ i386_validate_tdesc_p (struct gdbarch_tdep *tdep,
   /* Need to include %mxcsr, so add one.  */
   num_regs += tdep->num_xmm_regs + 1;
   for (; i < num_regs; i++)
-    valid_p &= tdesc_numbered_register (feature_vector, tdesc_data, i,
+    valid_p &= tdesc_numbered_register (feature_sse, tdesc_data, i,
 					tdep->register_names[i]);
 
   return valid_p;
@@ -5689,6 +5901,7 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   struct tdesc_arch_data *tdesc_data;
   const struct target_desc *tdesc;
   int mm0_regnum;
+  int ymm0_regnum;
 
   /* If there is already a candidate, use it.  */
   arches = gdbarch_list_lookup_by_info (arches, &info);
@@ -5709,6 +5922,8 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   tdep->fpregset = NULL;
   tdep->sizeof_fpregset = I387_SIZEOF_FSAVE;
 
+  tdep->xstateregset = NULL;
+
   /* The default settings include the FPU registers, the MMX registers
      and the SSE registers.  This can be overridden for a specific ABI
      by adjusting the members `st0_regnum', `mm0_regnum' and
@@ -5738,6 +5953,8 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   tdep->sc_pc_offset = -1;
   tdep->sc_sp_offset = -1;
 
+  tdep->xsave_xcr0_offset = -1;
+
   tdep->record_regmap = i386_record_regmap;
 
   /* The format used for `long double' on almost all i386 targets is
@@ -5854,9 +6071,13 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   set_tdesc_pseudo_register_type (gdbarch, i386_pseudo_register_type);
   set_tdesc_pseudo_register_name (gdbarch, i386_pseudo_register_name);
 
-  /* The default ABI includes general-purpose registers, 
-     floating-point registers, and the SSE registers.  */
-  set_gdbarch_num_regs (gdbarch, I386_SSE_NUM_REGS);
+  /* Override the normal target description method to make the AVX
+     upper halves anonymous.  */
+  set_gdbarch_register_name (gdbarch, i386_register_name);
+
+  /* The default ABI includes general-purpose registers, floating-point
+     registers, the SSE registers and the upper AVX registers.  */
+  set_gdbarch_num_regs (gdbarch, I386_AVX_NUM_REGS);
 
   /* Get the x86 target description from INFO.  */
   tdesc = info.target_desc;
@@ -5867,10 +6088,15 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   tdep->num_core_regs = I386_NUM_GREGS + I387_NUM_REGS;
   tdep->register_names = i386_register_names;
 
+  /* No upper YMM registers.  */
+  tdep->ymmh_register_names = NULL;
+  tdep->ymm0h_regnum = -1;
+
   tdep->num_byte_regs = 8;
   tdep->num_word_regs = 8;
   tdep->num_dword_regs = 0;
   tdep->num_mmx_regs = 8;
+  tdep->num_ymm_regs = 0;
 
   tdesc_data = tdesc_data_alloc ();
 
@@ -5878,24 +6104,25 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   info.tdep_info = (void *) tdesc_data;
   gdbarch_init_osabi (info, gdbarch);
 
+  if (!i386_validate_tdesc_p (tdep, tdesc_data))
+    {
+      tdesc_data_cleanup (tdesc_data);
+      xfree (tdep);
+      gdbarch_free (gdbarch);
+      return NULL;
+    }
+
   /* Wire in pseudo registers.  Number of pseudo registers may be
      changed.  */
   set_gdbarch_num_pseudo_regs (gdbarch, (tdep->num_byte_regs
 					 + tdep->num_word_regs
 					 + tdep->num_dword_regs
-					 + tdep->num_mmx_regs));
+					 + tdep->num_mmx_regs
+					 + tdep->num_ymm_regs));
 
   /* Target description may be changed.  */
   tdesc = tdep->tdesc;
 
-  if (!i386_validate_tdesc_p (tdep, tdesc_data))
-    {
-      tdesc_data_cleanup (tdesc_data);
-      xfree (tdep);
-      gdbarch_free (gdbarch);
-      return NULL;
-    }
-
   tdesc_use_registers (gdbarch, tdesc, tdesc_data);
 
   /* Override gdbarch_register_reggroup_p set in tdesc_use_registers.  */
@@ -5905,16 +6132,26 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   tdep->al_regnum = gdbarch_num_regs (gdbarch);
   tdep->ax_regnum = tdep->al_regnum + tdep->num_byte_regs;
 
-  mm0_regnum = tdep->ax_regnum + tdep->num_word_regs;
+  ymm0_regnum = tdep->ax_regnum + tdep->num_word_regs;
   if (tdep->num_dword_regs)
     {
       /* Support dword pseudo-registesr if it hasn't been disabled,  */
-      tdep->eax_regnum = mm0_regnum;
-      mm0_regnum = tdep->eax_regnum + tdep->num_dword_regs;
+      tdep->eax_regnum = ymm0_regnum;
+      ymm0_regnum += tdep->num_dword_regs;
     }
   else
     tdep->eax_regnum = -1;
 
+  mm0_regnum = ymm0_regnum;
+  if (tdep->num_ymm_regs)
+    {
+      /* Support YMM pseudo-registers if they are available.  */
+      tdep->ymm0_regnum = ymm0_regnum;
+      mm0_regnum += tdep->num_ymm_regs;
+    }
+  else
+    tdep->ymm0_regnum = -1;
+
   if (tdep->num_mmx_regs != 0)
     {
       /* Support MMX pseudo-registesr if MMX hasn't been disabled,  */
@@ -5940,6 +6177,9 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   set_gdbarch_fast_tracepoint_valid_at (gdbarch,
 					i386_fast_tracepoint_valid_at);
 
+  /* Tell remote stub that we support XML target description.  */
+  set_gdbarch_qsupported (gdbarch, "x86=xml");
+
   return gdbarch;
 }
 
@@ -5997,4 +6237,5 @@ is \"default\"."),
 
   /* Initialize the standard target descriptions.  */
   initialize_tdesc_i386 ();
+  initialize_tdesc_i386_avx ();
 }
diff --git a/gdb/i386-tdep.h b/gdb/i386-tdep.h
index 72c634e..659e909 100644
--- a/gdb/i386-tdep.h
+++ b/gdb/i386-tdep.h
@@ -109,6 +109,9 @@ struct gdbarch_tdep
   struct regset *fpregset;
   size_t sizeof_fpregset;
 
+  /* XSAVE extended state.  */
+  struct regset *xstateregset;
+
   /* Register number for %st(0).  The register numbers for the other
      registers follow from this one.  Set this to -1 to indicate the
      absence of an FPU.  */
@@ -121,6 +124,13 @@ struct gdbarch_tdep
      of MMX support.  */
   int mm0_regnum;
 
+  /* Number of pseudo YMM registers.  */
+  int num_ymm_regs;
+
+  /* Register number for %ymm0.  Set this to -1 to indicate the absence
+     of pseudo YMM register support.  */
+  int ymm0_regnum;
+
   /* Number of byte registers.  */
   int num_byte_regs;
 
@@ -146,9 +156,24 @@ struct gdbarch_tdep
   /* Number of SSE registers.  */
   int num_xmm_regs;
 
+  /* Bits of the extended control register 0 (the XFEATURE_ENABLED_MASK
+     register), excluding the x87 bit, which are supported by this gdb.
+   */
+  unsigned long long xcr0;
+
+  /* Offset of XCR0 in XSAVE extended state.  */
+  int xsave_xcr0_offset;
+
   /* Register names.  */
   const char **register_names;
 
+  /* Register number for %ymm0h.  Set this to -1 to indicate the absence
+     of upper YMM register support.  */
+  int ymm0h_regnum;
+
+  /* Upper YMM register names.  Only used for tdesc_numbered_register.  */
+  const char **ymmh_register_names;
+
   /* Target description.  */
   const struct target_desc *tdesc;
 
@@ -182,6 +207,7 @@ struct gdbarch_tdep
 
   /* ISA-specific data types.  */
   struct type *i386_mmx_type;
+  struct type *i386_ymm_type;
   struct type *i387_ext_type;
 
   /* Process record/replay target.  */
@@ -228,7 +254,9 @@ enum i386_regnum
   I386_FS_REGNUM,		/* %fs */
   I386_GS_REGNUM,		/* %gs */
   I386_ST0_REGNUM,		/* %st(0) */
-  I386_MXCSR_REGNUM = 40	/* %mxcsr */ 
+  I386_MXCSR_REGNUM = 40,	/* %mxcsr */ 
+  I386_YMM0H_REGNUM,		/* %ymm0h */
+  I386_YMM7H_REGNUM = I386_YMM0H_REGNUM + 7
 };
 
 /* Register numbers of RECORD_REGMAP.  */
@@ -265,6 +293,7 @@ enum record_i386_regnum
 #define I386_NUM_XREGS  9
 
 #define I386_SSE_NUM_REGS	(I386_MXCSR_REGNUM + 1)
+#define I386_AVX_NUM_REGS	(I386_YMM7H_REGNUM + 1)
 
 /* Size of the largest register.  */
 #define I386_MAX_REGISTER_SIZE	16
@@ -276,6 +305,8 @@ extern struct type *i387_ext_type (struct gdbarch *gdbarch);
 extern int i386_byte_regnum_p (struct gdbarch *gdbarch, int regnum);
 extern int i386_word_regnum_p (struct gdbarch *gdbarch, int regnum);
 extern int i386_dword_regnum_p (struct gdbarch *gdbarch, int regnum);
+extern int i386_ymm_regnum_p (struct gdbarch *gdbarch, int regnum);
+extern int i386_ymmh_regnum_p (struct gdbarch *gdbarch, int regnum);
 
 extern const char *i386_pseudo_register_name (struct gdbarch *gdbarch,
 					      int regnum);
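
A note on the two new predicates declared above: the range check that i386_ymmh_regnum_p presumably performs can be sketched as follows.  This is an illustration only, not code from the patch.  struct ymm_layout and ymmh_regnum_p are hypothetical stand-ins for gdbarch_tdep and the real predicate; the only facts taken from the patch are that ymm0h_regnum is -1 when upper YMM support is absent and that num_ymm_regs gives the register count.

```c
#include <assert.h>

/* Hypothetical stand-in for the two gdbarch_tdep fields used here.  */
struct ymm_layout
{
  int ymm0h_regnum;	/* Register number of %ymm0h, or -1 if absent.  */
  int num_ymm_regs;	/* Number of YMM registers.  */
};

/* Return non-zero if REGNUM names one of the upper YMM halves.  */
static int
ymmh_regnum_p (const struct ymm_layout *layout, int regnum)
{
  assert (layout != 0);

  /* A negative base means there is no upper YMM register support.  */
  if (layout->ymm0h_regnum < 0)
    return 0;

  regnum -= layout->ymm0h_regnum;
  return regnum >= 0 && regnum < layout->num_ymm_regs;
}
```

With the i386 numbering from enum i386_regnum (I386_MXCSR_REGNUM = 40, so %ymm0h is 41 and %ymm7h is 48), such a check would accept 41 through 48 and reject 40 and 49.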

^ permalink raw reply	[flat|nested] 115+ messages in thread

* PATCH: 4/6 [2nd try]: Add AVX support (amd64 changes)
  2010-03-04 18:08   ` PATCH: 4/6: Add AVX support (amd64 changes) H.J. Lu
  2010-03-04 18:09     ` PATCH: 5/6: Add AVX support (i387 changes) H.J. Lu
@ 2010-03-06 22:21     ` H.J. Lu
  2010-03-07 21:33       ` H.J. Lu
  1 sibling, 1 reply; 115+ messages in thread
From: H.J. Lu @ 2010-03-06 22:21 UTC (permalink / raw)
  To: GDB

Hi,

Here are the amd64 changes to support AVX.  OK to install?

Thanks.


H.J.
----
2010-03-06  H.J. Lu  <hongjiu.lu@intel.com>

	* amd64-linux-nat.c: Include "regset.h", "elf/common.h" and
	<sys/uio.h>.
	(xstate_size): New.
	(xstate_size_n_of_int64): Likewise.
	(have_ptrace_getregset): Likewise.
	(amd64_linux_gregset64_reg_offset): Include 16 upper YMM
	registers.
	(amd64_linux_gregset32_reg_offset): Include 8 upper YMM
	registers.
	(amd64_linux_fetch_inferior_registers): Support PTRACE_GETREGSET.
	(amd64_linux_store_inferior_registers): Likewise.
	(amd64_linux_read_description): Check and enable AVX target
	descriptions.

	* amd64-linux-tdep.c: Include "regset.h", "i386-linux-tdep.h"
	and "features/i386/amd64-avx-linux.c".
	(amd64_linux_regset_sections): New.
	(amd64_linux_core_read_description): Check and enable AVX
	target description.
	(amd64_linux_init_abi): Set xsave_xcr0_offset.  Call
	set_gdbarch_core_regset_sections.
	(_initialize_amd64_linux_tdep): Call
	initialize_tdesc_amd64_avx_linux.

	* amd64-linux-tdep.h (AMD64_LINUX_ORIG_RAX_REGNUM): Replace
	AMD64_MXCSR_REGNUM with AMD64_YMM15H_REGNUM.
	(tdesc_amd64_avx_linux): New.
	(amd64_linux_regset_sections): Likewise.

	* amd64-tdep.c: Include "features/i386/amd64-avx.c".
	(amd64_ymm_names): New.
	(amd64_ymmh_names): Likewise.
	(amd64_register_name): Likewise.
	(amd64_supply_xstateregset): Likewise.
	(amd64_collect_xstateregset): Likewise.
	(amd64_supply_xsave): Likewise.
	(amd64_collect_xsave): Likewise.
	(AMD64_NUM_REGS): Removed.
	(amd64_pseudo_register_name): Support pseudo YMM registers.
	(amd64_regset_from_core_section): Support .reg-xstate section.
	(amd64_init_abi): Set ymmh_register_names, num_ymm_regs
	and ymm0h_regnum.  Call set_gdbarch_register_name.
	(_initialize_amd64_tdep): Call initialize_tdesc_amd64_avx.

	* amd64-tdep.h (amd64_regnum): Add AMD64_YMM0H_REGNUM and
	AMD64_YMM15H_REGNUM.
	(AMD64_NUM_REGS): New.
	(amd64_supply_xsave): Likewise.
	(amd64_collect_xsave): Likewise.
	(amd64_register_name): Removed.
	(amd64_register_type): Likewise.
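
For reviewers, the target description selection in amd64_linux_read_description can be summarized by the sketch below.  It is not code from the patch.  The names XSAVE_XCR0_OFFSET, XSTATE_*, xsave_read_xcr0, xstate_size_for and pick_tdesc are hypothetical, and the values (XCR0 at byte offset 464, feature bits 0 to 2, sizes 576 and 832 bytes) are assumptions drawn from the cover letter and the XSAVE layout, not from i386-xstate.h, which is not part of this message.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed values; the patch takes them from i386-xstate.h and
   I386_LINUX_XSAVE_XCR0_OFFSET.  */
#define XSAVE_XCR0_OFFSET 464
#define XSTATE_X87 (1ULL << 0)
#define XSTATE_SSE (1ULL << 1)
#define XSTATE_AVX (1ULL << 2)
#define XSTATE_AVX_MASK (XSTATE_X87 | XSTATE_SSE | XSTATE_AVX)
#define XSTATE_SSE_SIZE 576
#define XSTATE_AVX_SIZE 832

/* Read the XCR0 copy that the Linux kernel stores in the software
   usable bytes of the XSAVE area.  */
static uint64_t
xsave_read_xcr0 (const unsigned char *xsave)
{
  uint64_t xcr0;

  assert (xsave != NULL);
  memcpy (&xcr0, xsave + XSAVE_XCR0_OFFSET, sizeof xcr0);
  return xcr0;
}

/* Size in bytes of the extended state described by XCR0.  */
static unsigned int
xstate_size_for (uint64_t xcr0)
{
  return (xcr0 & XSTATE_AVX) ? XSTATE_AVX_SIZE : XSTATE_SSE_SIZE;
}

/* Name the target description to use.  HAVE_GETREGSET is 1 only if
   PTRACE_GETREGSET was probed successfully.  */
static const char *
pick_tdesc (int have_getregset, uint64_t xcr0, int is_64bit)
{
  int avx = (have_getregset == 1
	     && (xcr0 & XSTATE_AVX_MASK) == XSTATE_AVX_MASK);

  if (avx)
    return is_64bit ? "amd64-avx-linux" : "i386-avx-linux";
  return is_64bit ? "amd64-linux" : "i386-linux";
}
```

The sketch requires all three feature bits before choosing an AVX description, mirroring the comparison against I386_XSTATE_AVX_MASK in the patch, so a host without PTRACE_GETREGSET falls back to the SSE descriptions.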

diff --git a/gdb/amd64-linux-nat.c b/gdb/amd64-linux-nat.c
index b9d5833..4af1112 100644
--- a/gdb/amd64-linux-nat.c
+++ b/gdb/amd64-linux-nat.c
@@ -23,11 +23,14 @@
 #include "inferior.h"
 #include "gdbcore.h"
 #include "regcache.h"
+#include "regset.h"
 #include "linux-nat.h"
 #include "amd64-linux-tdep.h"
 
 #include "gdb_assert.h"
 #include "gdb_string.h"
+#include "elf/common.h"
+#include <sys/uio.h>
 #include <sys/ptrace.h>
 #include <sys/debugreg.h>
 #include <sys/syscall.h>
@@ -52,6 +55,16 @@
 #include "amd64-nat.h"
 #include "i386-nat.h"
 
+/* The extended state size in bytes.  */
+static unsigned int xstate_size;
+
+/* The extended state size in units of int64.  We use an array of int64
+   for better alignment.  */
+static unsigned int xstate_size_n_of_int64;
+
+/* Does the current host support PTRACE_GETREGSET?  */
+static int have_ptrace_getregset = -1;
+
 /* Mapping between the general-purpose registers in GNU/Linux x86-64
    `struct user' format and GDB's register cache layout.  */
 
@@ -73,6 +86,8 @@ static int amd64_linux_gregset64_reg_offset[] =
   -1, -1, -1, -1, -1, -1, -1, -1,
   -1, -1, -1, -1, -1, -1, -1, -1,
   -1, -1, -1, -1, -1, -1, -1, -1, -1,
+  -1, -1, -1, -1, -1, -1, -1, -1,
+  -1, -1, -1, -1, -1, -1, -1, -1,
   ORIG_RAX * 8
 };
 \f
@@ -99,6 +114,7 @@ static int amd64_linux_gregset32_reg_offset[] =
   -1, -1, -1, -1, -1, -1, -1, -1,
   -1, -1, -1, -1, -1, -1, -1, -1,
   -1, -1, -1, -1, -1, -1, -1, -1, -1,
+  -1, -1, -1, -1, -1, -1, -1, -1,
   ORIG_RAX * 8			/* "orig_eax" */
 };
 \f
@@ -183,10 +199,26 @@ amd64_linux_fetch_inferior_registers (struct target_ops *ops,
     {
       elf_fpregset_t fpregs;
 
-      if (ptrace (PTRACE_GETFPREGS, tid, 0, (long) &fpregs) < 0)
-	perror_with_name (_("Couldn't get floating point status"));
+      if (have_ptrace_getregset > 0)
+	{
+	  unsigned long long xstateregs[xstate_size_n_of_int64];
+	  struct iovec iov;
+
+	  iov.iov_base = xstateregs;
+	  iov.iov_len = xstate_size;
+	  if (ptrace (PTRACE_GETREGSET, tid,
+		      (unsigned int) NT_X86_XSTATE, (long) &iov) < 0)
+	    perror_with_name (_("Couldn't get extended state status"));
 
-      amd64_supply_fxsave (regcache, -1, &fpregs);
+	  amd64_supply_xsave (regcache, -1, xstateregs);
+	}
+      else
+	{
+	  if (ptrace (PTRACE_GETFPREGS, tid, 0, (long) &fpregs) < 0)
+	    perror_with_name (_("Couldn't get floating point status"));
+
+	  amd64_supply_fxsave (regcache, -1, &fpregs);
+	}
     }
 }
 
@@ -226,15 +258,33 @@ amd64_linux_store_inferior_registers (struct target_ops *ops,
     {
       elf_fpregset_t fpregs;
 
-      if (ptrace (PTRACE_GETFPREGS, tid, 0, (long) &fpregs) < 0)
-	perror_with_name (_("Couldn't get floating point status"));
+      if (have_ptrace_getregset > 0)
+	{
+	  unsigned long long xstateregs[xstate_size_n_of_int64];
+	  struct iovec iov;
+
+	  iov.iov_base = xstateregs;
+	  iov.iov_len = xstate_size;
+	  if (ptrace (PTRACE_GETREGSET, tid,
+		      (unsigned int) NT_X86_XSTATE, (long) &iov) < 0)
+	    perror_with_name (_("Couldn't get extended state status"));
 
-      amd64_collect_fxsave (regcache, regnum, &fpregs);
+	  amd64_collect_xsave (regcache, regnum, xstateregs, 0);
+
+	  if (ptrace (PTRACE_SETREGSET, tid,
+		      (unsigned int) NT_X86_XSTATE, (long) &iov) < 0)
+	    perror_with_name (_("Couldn't write extended state status"));
+	}
+      else
+	{
+	  if (ptrace (PTRACE_GETFPREGS, tid, 0, (long) &fpregs) < 0)
+	    perror_with_name (_("Couldn't get floating point status"));
 
-      if (ptrace (PTRACE_SETFPREGS, tid, 0, (long) &fpregs) < 0)
-	perror_with_name (_("Couldn't write floating point status"));
+	  amd64_collect_fxsave (regcache, regnum, &fpregs);
 
-      return;
+	  if (ptrace (PTRACE_SETFPREGS, tid, 0, (long) &fpregs) < 0)
+	    perror_with_name (_("Couldn't write floating point status"));
+	}
     }
 }
 \f
@@ -688,6 +738,8 @@ amd64_linux_read_description (struct target_ops *ops)
 {
   unsigned long cs;
   int tid;
+  int is_64bit;
+  static unsigned long long xcr0;
 
   /* GNU/Linux LWP ID's are process ID's.  */
   tid = TIDGET (inferior_ptid);
@@ -701,10 +753,53 @@ amd64_linux_read_description (struct target_ops *ops)
   if (errno != 0)
     perror_with_name (_("Couldn't get CS register"));
 
-  if (cs == AMD64_LINUX_USER64_CS)
-    return tdesc_amd64_linux;
+  is_64bit = cs == AMD64_LINUX_USER64_CS;
+
+  if (have_ptrace_getregset == -1)
+    {
+      unsigned long long xstateregs[(I386_XSTATE_SSE_SIZE
+				     / sizeof (long long))];
+      struct iovec iov;
+
+      iov.iov_base = xstateregs;
+      iov.iov_len = I386_XSTATE_SSE_SIZE;
+
+      /* Check if PTRACE_GETREGSET works.  */
+      if (ptrace (PTRACE_GETREGSET, tid,
+		  (unsigned int) NT_X86_XSTATE, (long) &iov) < 0)
+	have_ptrace_getregset = 0;
+      else
+	{
+	  have_ptrace_getregset = 1;
+
+	  /* Get XCR0 from XSAVE extended state.  */
+	  xcr0 = xstateregs[(I386_LINUX_XSAVE_XCR0_OFFSET
+			     / sizeof (long long))];
+
+	  xstate_size = I386_XSTATE_SIZE (xcr0);
+	  xstate_size_n_of_int64 = xstate_size / sizeof (long long);
+	}
+
+      i386_linux_update_xstateregset (amd64_linux_regset_sections,
+				      xstate_size);
+    }
+
+  /* Check the native XCR0 only if PTRACE_GETREGSET is available.  */
+  if (have_ptrace_getregset
+      && (xcr0 & I386_XSTATE_AVX_MASK) == I386_XSTATE_AVX_MASK)
+    {
+      if (is_64bit)
+	return tdesc_amd64_avx_linux;
+      else
+	return tdesc_i386_avx_linux;
+    }
   else
-    return tdesc_i386_linux;
+    {
+      if (is_64bit)
+	return tdesc_amd64_linux;
+      else
+	return tdesc_i386_linux;
+    }
 }
 
 /* Provide a prototype to silence -Wmissing-prototypes.  */
diff --git a/gdb/amd64-linux-tdep.c b/gdb/amd64-linux-tdep.c
index 4ad6dc9..51722bf 100644
--- a/gdb/amd64-linux-tdep.c
+++ b/gdb/amd64-linux-tdep.c
@@ -28,7 +28,9 @@
 #include "symtab.h"
 #include "gdbtypes.h"
 #include "reggroups.h"
+#include "regset.h"
 #include "amd64-linux-tdep.h"
+#include "i386-linux-tdep.h"
 #include "linux-tdep.h"
 
 #include "gdb_string.h"
@@ -38,6 +40,7 @@
 #include "xml-syscall.h"
 
 #include "features/i386/amd64-linux.c"
+#include "features/i386/amd64-avx-linux.c"
 
 /* The syscall's XML filename for i386.  */
 #define XML_SYSCALL_FILENAME_AMD64 "syscalls/amd64-linux.xml"
@@ -45,6 +48,15 @@
 #include "record.h"
 #include "linux-record.h"
 
+/* Supported register note sections.  */
+struct core_regset_section amd64_linux_regset_sections[] =
+{
+  { ".reg", 144, "general-purpose" },
+  { ".reg2", 512, "floating-point" },
+  { ".reg-xstate", 0, "XSAVE extended state" },
+  { NULL, 0 }
+};
+
 /* Mapping between the general-purpose registers in `struct user'
    format and GDB's register cache layout.  */
 
@@ -1250,12 +1262,17 @@ amd64_linux_core_read_description (struct gdbarch *gdbarch,
 				  bfd *abfd)
 {
   asection *section = bfd_get_section_by_name (abfd, ".reg2");
+  unsigned long long xcr0;
 
   if (section == NULL)
     return NULL;
 
   /* Linux/x86-64.  */
-  return tdesc_amd64_linux;
+  xcr0 = i386_linux_core_read_xcr0 (gdbarch, target, abfd);
+  if ((xcr0 & I386_XSTATE_AVX_MASK) == I386_XSTATE_AVX_MASK)
+    return tdesc_amd64_avx_linux;
+  else
+    return tdesc_amd64_linux;
 }
 
 static void
@@ -1297,6 +1314,8 @@ amd64_linux_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch)
   tdep->sc_reg_offset = amd64_linux_sc_reg_offset;
   tdep->sc_num_regs = ARRAY_SIZE (amd64_linux_sc_reg_offset);
 
+  tdep->xsave_xcr0_offset = I386_LINUX_XSAVE_XCR0_OFFSET;
+
   /* GNU/Linux uses SVR4-style shared libraries.  */
   set_solib_svr4_fetch_link_map_offsets
     (gdbarch, svr4_lp64_fetch_link_map_offsets);
@@ -1318,6 +1337,9 @@ amd64_linux_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch)
   /* GNU/Linux uses SVR4-style shared libraries.  */
   set_gdbarch_skip_trampoline_code (gdbarch, find_solib_trampoline_target);
 
+  /* Install supported register note sections.  */
+  set_gdbarch_core_regset_sections (gdbarch, amd64_linux_regset_sections);
+
   set_gdbarch_core_read_description (gdbarch,
 				     amd64_linux_core_read_description);
 
@@ -1517,4 +1539,5 @@ _initialize_amd64_linux_tdep (void)
 
   /* Initialize the Linux target description  */
   initialize_tdesc_amd64_linux ();
+  initialize_tdesc_amd64_avx_linux ();
 }
diff --git a/gdb/amd64-linux-tdep.h b/gdb/amd64-linux-tdep.h
index 33316fb..734f117 100644
--- a/gdb/amd64-linux-tdep.h
+++ b/gdb/amd64-linux-tdep.h
@@ -26,13 +26,17 @@
 /* Register number for the "orig_rax" register.  If this register
    contains a value >= 0 it is interpreted as the system call number
    that the kernel is supposed to restart.  */
-#define AMD64_LINUX_ORIG_RAX_REGNUM (AMD64_MXCSR_REGNUM + 1)
+#define AMD64_LINUX_ORIG_RAX_REGNUM (AMD64_YMM15H_REGNUM + 1)
 
 /* Total number of registers for GNU/Linux.  */
 #define AMD64_LINUX_NUM_REGS (AMD64_LINUX_ORIG_RAX_REGNUM + 1)
 
 /* Linux target description.  */
 extern struct target_desc *tdesc_amd64_linux;
+extern struct target_desc *tdesc_amd64_avx_linux;
+
+/* Supported register note sections.  */
+extern struct core_regset_section amd64_linux_regset_sections[];
 
 /* Enum that defines the syscall identifiers for amd64 linux.
    Used for process record/replay, these will be translated into
diff --git a/gdb/amd64-tdep.c b/gdb/amd64-tdep.c
index 8c41a8a..0c93125 100644
--- a/gdb/amd64-tdep.c
+++ b/gdb/amd64-tdep.c
@@ -43,6 +43,7 @@
 #include "i387-tdep.h"
 
 #include "features/i386/amd64.c"
+#include "features/i386/amd64-avx.c"
 
 /* Note that the AMD64 architecture was previously known as x86-64.
    The latter is (forever) engraved into the canonical system name as
@@ -71,8 +72,21 @@ static const char *amd64_register_names[] =
   "mxcsr",
 };
 
-/* Total number of registers.  */
-#define AMD64_NUM_REGS	ARRAY_SIZE (amd64_register_names)
+static const char *amd64_ymm_names[] = 
+{
+  "ymm0", "ymm1", "ymm2", "ymm3",
+  "ymm4", "ymm5", "ymm6", "ymm7",
+  "ymm8", "ymm9", "ymm10", "ymm11",
+  "ymm12", "ymm13", "ymm14", "ymm15"
+};
+
+static const char *amd64_ymmh_names[] = 
+{
+  "ymm0h", "ymm1h", "ymm2h", "ymm3h",
+  "ymm4h", "ymm5h", "ymm6h", "ymm7h",
+  "ymm8h", "ymm9h", "ymm10h", "ymm11h",
+  "ymm12h", "ymm13h", "ymm14h", "ymm15h"
+};
 
 /* The registers used to pass integer arguments during a function call.  */
 static int amd64_dummy_call_integer_regs[] =
@@ -234,6 +248,19 @@ static const char *amd64_dword_names[] =
   "r8d", "r9d", "r10d", "r11d", "r12d", "r13d", "r14d", "r15d"
 };
 
+/* Return the name of register REGNUM, or the empty string if it is
+   an anonymous register. */
+
+static const char *
+amd64_register_name (struct gdbarch *gdbarch, int regnum)
+{
+  /* Hide the upper YMM registers.  */
+  if (i386_ymmh_regnum_p (gdbarch, regnum))
+    return "";
+
+  return tdesc_register_name (gdbarch, regnum);
+}
+
 /* Return the name of register REGNUM.  */
 
 static const char *
@@ -242,6 +269,8 @@ amd64_pseudo_register_name (struct gdbarch *gdbarch, int regnum)
   struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
   if (i386_byte_regnum_p (gdbarch, regnum))
     return amd64_byte_names[regnum - tdep->al_regnum];
+  else if (i386_ymm_regnum_p (gdbarch, regnum))
+    return amd64_ymm_names[regnum - tdep->ymm0_regnum];
   else if (i386_word_regnum_p (gdbarch, regnum))
     return amd64_word_names[regnum - tdep->ax_regnum];
   else if (i386_dword_regnum_p (gdbarch, regnum))
@@ -2148,6 +2177,28 @@ amd64_collect_fpregset (const struct regset *regset,
   amd64_collect_fxsave (regcache, regnum, fpregs);
 }
 
+/* Similar to amd64_supply_fpregset, but use XSAVE extended state.  */
+
+static void
+amd64_supply_xstateregset (const struct regset *regset,
+			   struct regcache *regcache, int regnum,
+			   const void *xstateregs, size_t len)
+{
+  const struct gdbarch_tdep *tdep = gdbarch_tdep (regset->arch);
+  amd64_supply_xsave (regcache, regnum, xstateregs);
+}
+
+/* Similar to amd64_collect_fpregset, but use XSAVE extended state.  */
+
+static void
+amd64_collect_xstateregset (const struct regset *regset,
+			    const struct regcache *regcache,
+			    int regnum, void *xstateregs, size_t len)
+{
+  const struct gdbarch_tdep *tdep = gdbarch_tdep (regset->arch);
+  amd64_collect_xsave (regcache, regnum, xstateregs, 1);
+}
+
 /* Return the appropriate register set for the core section identified
    by SECT_NAME and SECT_SIZE.  */
 
@@ -2166,6 +2217,16 @@ amd64_regset_from_core_section (struct gdbarch *gdbarch,
       return tdep->fpregset;
     }
 
+  if (strcmp (sect_name, ".reg-xstate") == 0)
+    {
+      if (tdep->xstateregset == NULL)
+	tdep->xstateregset = regset_alloc (gdbarch,
+					   amd64_supply_xstateregset,
+					   amd64_collect_xstateregset);
+
+      return tdep->xstateregset;
+    }
+
   return i386_regset_from_core_section (gdbarch, sect_name, sect_size);
 }
 \f
@@ -2228,6 +2289,13 @@ amd64_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch)
   tdep->num_core_regs = AMD64_NUM_GREGS + I387_NUM_REGS;
   tdep->register_names = amd64_register_names;
 
+  if (tdesc_find_feature (tdesc, "org.gnu.gdb.i386.avx") != NULL)
+    {
+      tdep->ymmh_register_names = amd64_ymmh_names;
+      tdep->num_ymm_regs = 16;
+      tdep->ymm0h_regnum = AMD64_YMM0H_REGNUM;
+    }
+
   tdep->num_byte_regs = 16;
   tdep->num_word_regs = 16;
   tdep->num_dword_regs = 16;
@@ -2241,6 +2309,8 @@ amd64_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch)
 
   set_tdesc_pseudo_register_name (gdbarch, amd64_pseudo_register_name);
 
+  set_gdbarch_register_name (gdbarch, amd64_register_name);
+
   /* AMD64 has an FPU and 16 SSE registers.  */
   tdep->st0_regnum = AMD64_ST0_REGNUM;
   tdep->num_xmm_regs = 16;
@@ -2321,6 +2391,7 @@ void
 _initialize_amd64_tdep (void)
 {
   initialize_tdesc_amd64 ();
+  initialize_tdesc_amd64_avx ();
 }
 \f
 
@@ -2356,6 +2427,30 @@ amd64_supply_fxsave (struct regcache *regcache, int regnum,
     }
 }
 
+/* Similar to amd64_supply_fxsave, but use XSAVE extended state.  */
+
+void
+amd64_supply_xsave (struct regcache *regcache, int regnum,
+		    const void *xsave)
+{
+  struct gdbarch *gdbarch = get_regcache_arch (regcache);
+  struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+
+  i387_supply_xsave (regcache, regnum, xsave);
+
+  if (xsave && gdbarch_ptr_bit (gdbarch) == 64)
+    {
+      const gdb_byte *regs = xsave;
+
+      if (regnum == -1 || regnum == I387_FISEG_REGNUM (tdep))
+	regcache_raw_supply (regcache, I387_FISEG_REGNUM (tdep),
+			     regs + 12);
+      if (regnum == -1 || regnum == I387_FOSEG_REGNUM (tdep))
+	regcache_raw_supply (regcache, I387_FOSEG_REGNUM (tdep),
+			     regs + 20);
+    }
+}
+
 /* Fill register REGNUM (if it is a floating-point or SSE register) in
    *FXSAVE with the value from REGCACHE.  If REGNUM is -1, do this for
    all registers.  This function doesn't touch any of the reserved
@@ -2379,3 +2474,26 @@ amd64_collect_fxsave (const struct regcache *regcache, int regnum,
 	regcache_raw_collect (regcache, I387_FOSEG_REGNUM (tdep), regs + 20);
     }
 }
+
+/* Similar to amd64_collect_fxsave, but use XSAVE extended state.  */
+
+void
+amd64_collect_xsave (const struct regcache *regcache, int regnum,
+		     void *xsave, int gcore)
+{
+  struct gdbarch *gdbarch = get_regcache_arch (regcache);
+  struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+  gdb_byte *regs = xsave;
+
+  i387_collect_xsave (regcache, regnum, xsave, gcore);
+
+  if (gdbarch_ptr_bit (gdbarch) == 64)
+    {
+      if (regnum == -1 || regnum == I387_FISEG_REGNUM (tdep))
+	regcache_raw_collect (regcache, I387_FISEG_REGNUM (tdep),
+			      regs + 12);
+      if (regnum == -1 || regnum == I387_FOSEG_REGNUM (tdep))
+	regcache_raw_collect (regcache, I387_FOSEG_REGNUM (tdep),
+			      regs + 20);
+    }
+}
diff --git a/gdb/amd64-tdep.h b/gdb/amd64-tdep.h
index 363479c..9f07dda 100644
--- a/gdb/amd64-tdep.h
+++ b/gdb/amd64-tdep.h
@@ -61,12 +61,16 @@ enum amd64_regnum
   AMD64_FSTAT_REGNUM = AMD64_ST0_REGNUM + 9,
   AMD64_XMM0_REGNUM = 40,	/* %xmm0 */
   AMD64_XMM1_REGNUM,		/* %xmm1 */
-  AMD64_MXCSR_REGNUM = AMD64_XMM0_REGNUM + 16
+  AMD64_MXCSR_REGNUM = AMD64_XMM0_REGNUM + 16,
+  AMD64_YMM0H_REGNUM,		/* %ymm0h */
+  AMD64_YMM15H_REGNUM = AMD64_YMM0H_REGNUM + 15
 };
 
 /* Number of general purpose registers.  */
 #define AMD64_NUM_GREGS		24
 
+#define AMD64_NUM_REGS		(AMD64_YMM15H_REGNUM + 1)
+
 extern struct displaced_step_closure *amd64_displaced_step_copy_insn
   (struct gdbarch *gdbarch, CORE_ADDR from, CORE_ADDR to,
    struct regcache *regs);
@@ -77,12 +81,6 @@ extern void amd64_displaced_step_fixup (struct gdbarch *gdbarch,
 
 extern void amd64_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch);
 
-/* Functions from amd64-tdep.c which may be needed on architectures
-   with extra registers.  */
-
-extern const char *amd64_register_name (struct gdbarch *gdbarch, int regnum);
-extern struct type *amd64_register_type (struct gdbarch *gdbarch, int regnum);
-
 /* Fill register REGNUM in REGCACHE with the appropriate
    floating-point or SSE register value from *FXSAVE.  If REGNUM is
    -1, do this for all registers.  This function masks off any of the
@@ -91,6 +89,10 @@ extern struct type *amd64_register_type (struct gdbarch *gdbarch, int regnum);
 extern void amd64_supply_fxsave (struct regcache *regcache, int regnum,
 				 const void *fxsave);
 
+/* Similar to amd64_supply_fxsave, but use XSAVE extended state.  */
+extern void amd64_supply_xsave (struct regcache *regcache, int regnum,
+				const void *xsave);
+
 /* Fill register REGNUM (if it is a floating-point or SSE register) in
    *FXSAVE with the value from REGCACHE.  If REGNUM is -1, do this for
    all registers.  This function doesn't touch any of the reserved
@@ -99,6 +101,10 @@ extern void amd64_supply_fxsave (struct regcache *regcache, int regnum,
 extern void amd64_collect_fxsave (const struct regcache *regcache, int regnum,
 				  void *fxsave);
 
+/* Similar to amd64_collect_fxsave, but use XSAVE extended state.  */
+extern void amd64_collect_xsave (const struct regcache *regcache,
+				 int regnum, void *xsave, int gcore);
+
 void amd64_classify (struct type *type, enum amd64_reg_class class[2]);
 
 \f

^ permalink raw reply	[flat|nested] 115+ messages in thread

* PATCH: 5/6 [2nd try]: Add AVX support (i387 changes)
  2010-03-04 18:09     ` PATCH: 5/6: Add AVX support (i387 changes) H.J. Lu
  2010-03-04 18:10       ` PATCH: 6/6: Add AVX support (gdbserver changes) H.J. Lu
  2010-03-05  3:20       ` PATCH: 5/6: Add AVX support (i387 changes) Hui Zhu
@ 2010-03-06 22:22       ` H.J. Lu
  2010-03-12 17:24         ` H.J. Lu
  2010-03-27 15:08         ` PATCH: 5/6 [2nd " Mark Kettenis
  2 siblings, 2 replies; 115+ messages in thread
From: H.J. Lu @ 2010-03-06 22:22 UTC (permalink / raw)
  To: GDB

Hi,

Here are the i387 changes to support AVX.  OK to install?
 
Thanks.


H.J.
---
2010-03-06  H.J. Lu  <hongjiu.lu@intel.com>

	* i387-tdep.c: Include "i386-xstate.h".
	(XSAVE_XSTATE_BV_ADDR): New.
	(xsave_avxh_offset): Likewise.
	(XSAVE_AVXH_ADDR): Likewise.
	(i387_supply_xsave): Likewise.
	(i387_collect_xsave): Likewise.

	* i387-tdep.h (I387_NUM_YMM_REGS): New.
	(I387_YMM0H_REGNUM): Likewise.
	(I387_YMMENDH_REGNUM): Likewise.
	(i387_supply_xsave): Likewise.
	(i387_collect_xsave): Likewise.
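
The central rule in i387_supply_xsave is that a register class reads as zero when its bit in xstate_bv is clear.  The sketch below applies that rule to a single YMM register.  It is an illustration, not code from the patch.  The helper read_ymm and the constant names are hypothetical; the offsets 512 (xstate_bv) and 576 (upper YMM halves) are the ones used in the patch, and 160 is assumed as the offset of %xmm0 in the FXSAVE legacy area.  Unlike the patch, the sketch does not mask the bits with tdep->xcr0.

```c
#include <assert.h>
#include <string.h>

#define FXSAVE_XMM_OFFSET 160		/* Assumed offset of %xmm0.  */
#define XSAVE_XSTATE_BV_OFFSET 512	/* `xstate_bv' field.  */
#define XSAVE_AVXH_OFFSET 576		/* Upper 128 bits of %ymm0.  */
#define XSTATE_SSE (1u << 1)
#define XSTATE_AVX (1u << 2)

/* Copy the 256-bit value of %ymmN from the XSAVE area into OUT.  A
   half whose feature bit is clear in `xstate_bv' reads as zero.  */
static void
read_ymm (const unsigned char *xsave, unsigned int n,
	  unsigned char out[32])
{
  unsigned int xstate_bv;

  assert (n < 16);

  /* The x87, SSE and AVX bits all live in the low byte.  */
  xstate_bv = xsave[XSAVE_XSTATE_BV_OFFSET];

  memset (out, 0, 32);
  if (xstate_bv & XSTATE_SSE)
    memcpy (out, xsave + FXSAVE_XMM_OFFSET + n * 16, 16);
  if (xstate_bv & XSTATE_AVX)
    memcpy (out + 16, xsave + XSAVE_AVXH_OFFSET + n * 16, 16);
}
```

This is also why $xmmN can be presented as the lower 128 bits of $ymmN: the two halves live in separate parts of the XSAVE area and are joined only when the pseudo register is read.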

diff --git a/gdb/i387-tdep.c b/gdb/i387-tdep.c
index 3fb5b56..197af7f 100644
--- a/gdb/i387-tdep.c
+++ b/gdb/i387-tdep.c
@@ -34,6 +34,7 @@
 
 #include "i386-tdep.h"
 #include "i387-tdep.h"
+#include "i386-xstate.h"
 
 /* Print the floating point number specified by RAW.  */
 
@@ -677,6 +678,518 @@ i387_collect_fxsave (const struct regcache *regcache, int regnum, void *fxsave)
 			  FXSAVE_MXCSR_ADDR (regs));
 }
 
+/* `xstate_bv' is at byte offset 512.  */
+#define XSAVE_XSTATE_BV_ADDR(xsave) (xsave + 512)
+
+/* At xsave_avxh_offset[REGNUM] you'll find the offset to the location in
+   the data structure used by the "xsave" instruction where the upper
+   128 bits of GDB AVX register REGNUM are stored.  */
+
+static int xsave_avxh_offset[] =
+{
+  576 + 0 * 16,		/* Upper 128bit of %ymm0 through ...  */
+  576 + 1 * 16,
+  576 + 2 * 16,
+  576 + 3 * 16,
+  576 + 4 * 16,
+  576 + 5 * 16,
+  576 + 6 * 16,
+  576 + 7 * 16,
+  576 + 8 * 16,
+  576 + 9 * 16,
+  576 + 10 * 16,
+  576 + 11 * 16,
+  576 + 12 * 16,
+  576 + 13 * 16,
+  576 + 14 * 16,
+  576 + 15 * 16		/* Upper 128bit of ... %ymm15 (128 bits each).  */
+};
+
+#define XSAVE_AVXH_ADDR(tdep, xsave, regnum) \
+  (xsave + xsave_avxh_offset[regnum - I387_YMM0H_REGNUM (tdep)])
+
+/* Similar to i387_supply_fxsave, but use XSAVE extended state.  */
+
+void
+i387_supply_xsave (struct regcache *regcache, int regnum,
+		   const void *xsave)
+{
+  struct gdbarch_tdep *tdep = gdbarch_tdep (get_regcache_arch (regcache));
+  const gdb_byte *regs = xsave;
+  int i;
+  unsigned int clear_bv;
+  const gdb_byte *p;
+  enum
+    {
+      none = 0x0,
+      x87 = 0x1,
+      sse = 0x2,
+      avxh = 0x4,
+      all = x87 | sse | avxh
+    } regclass;
+
+  gdb_assert (tdep->st0_regnum >= I386_ST0_REGNUM);
+  gdb_assert (tdep->num_xmm_regs > 0);
+
+  if (regnum == -1)
+    regclass = all;
+  else if (regnum >= I387_YMM0H_REGNUM (tdep)
+	   && regnum < I387_YMMENDH_REGNUM (tdep))
+    regclass = avxh;
+  else if (regnum >= I387_XMM0_REGNUM(tdep)
+	   && regnum < I387_MXCSR_REGNUM (tdep))
+    regclass = sse;
+  else if (regnum >= I387_ST0_REGNUM (tdep)
+	   && regnum < I387_FCTRL_REGNUM (tdep))
+    regclass = x87;
+  else
+    regclass = none;
+
+  if (regs != NULL && regclass != none)
+    {
+      /* Get `xstate_bv'.  */
+      const gdb_byte *xstate_bv_p = XSAVE_XSTATE_BV_ADDR (regs);
+
+      /* The supported bits in `xstate_bv' fit in 1 byte.  Clear a part of
+	 the vector registers if its bit in xstate_bv is zero.  */
+      clear_bv = (~(*xstate_bv_p)) & tdep->xcr0;
+    }
+  else
+    clear_bv = I386_XSTATE_MAX_MASK;
+
+  switch (regclass)
+    {
+    case none:
+      break;
+
+    case avxh:
+      if ((clear_bv & bit_I386_XSTATE_AVX))
+	p = NULL;
+      else
+	p = XSAVE_AVXH_ADDR (tdep, regs, regnum);
+      regcache_raw_supply (regcache, regnum, p);
+      return;
+
+    case sse:
+      if ((clear_bv & bit_I386_XSTATE_SSE))
+	p = NULL;
+      else
+	p = FXSAVE_ADDR (tdep, regs, regnum);
+      regcache_raw_supply (regcache, regnum, p);
+      return;
+
+    case x87:
+      if ((clear_bv & bit_I386_XSTATE_X87))
+	p = NULL;
+      else
+	p = FXSAVE_ADDR (tdep, regs, regnum);
+      regcache_raw_supply (regcache, regnum, p);
+      return;
+
+    case all:
+      /* Handle the upper YMM registers.  */
+      if ((tdep->xcr0 & bit_I386_XSTATE_AVX))
+	{
+	  if ((clear_bv & bit_I386_XSTATE_AVX))
+	    p = NULL;
+	  else
+	    p = regs;
+
+	  for (i = I387_YMM0H_REGNUM (tdep);
+	       i < I387_YMMENDH_REGNUM (tdep); i++)
+	    {
+	      if (p != NULL)
+		p = XSAVE_AVXH_ADDR (tdep, regs, i);
+	      regcache_raw_supply (regcache, i, p);
+	    }
+	}
+
+      /* Handle the XMM registers.  */
+      if ((tdep->xcr0 & bit_I386_XSTATE_SSE))
+	{
+	  if ((clear_bv & bit_I386_XSTATE_SSE))
+	    p = NULL;
+	  else
+	    p = regs;
+
+	  for (i = I387_XMM0_REGNUM (tdep);
+	       i < I387_MXCSR_REGNUM (tdep); i++)
+	    {
+	      if (p != NULL)
+		p = FXSAVE_ADDR (tdep, regs, i);
+	      regcache_raw_supply (regcache, i, p);
+	    }
+	}
+
+      /* Handle the x87 registers.  */
+      if ((tdep->xcr0 & bit_I386_XSTATE_X87))
+	{
+	  if ((clear_bv & bit_I386_XSTATE_X87))
+	    p = NULL;
+	  else
+	    p = regs;
+
+	  for (i = I387_ST0_REGNUM (tdep);
+	       i < I387_FCTRL_REGNUM (tdep); i++)
+	    {
+	      if (p != NULL)
+		p = FXSAVE_ADDR (tdep, regs, i);
+	      regcache_raw_supply (regcache, i, p);
+	    }
+	}
+      break;
+    }
+
+  /* Only handle x87 control registers.  */
+  for (i = I387_FCTRL_REGNUM (tdep); i < I387_XMM0_REGNUM (tdep); i++)
+    if (regnum == -1 || regnum == i)
+      {
+	if (regs == NULL)
+	  {
+	    regcache_raw_supply (regcache, i, NULL);
+	    continue;
+	  }
+
+	/* Most of the FPU control registers occupy only 16 bits in
+	   the xsave extended state.  Give those a special treatment.  */
+	if (i != I387_FIOFF_REGNUM (tdep)
+	    && i != I387_FOOFF_REGNUM (tdep))
+	  {
+	    gdb_byte val[4];
+
+	    memcpy (val, FXSAVE_ADDR (tdep, regs, i), 2);
+	    val[2] = val[3] = 0;
+	    if (i == I387_FOP_REGNUM (tdep))
+	      val[1] &= ((1 << 3) - 1);
+	    else if (i == I387_FTAG_REGNUM (tdep))
+	      {
+		/* The fxsave area contains a simplified version of
+		   the tag word.  We have to look at the actual 80-bit
+		   FP data to recreate the traditional i387 tag word.  */
+
+		unsigned long ftag = 0;
+		int fpreg;
+		int top;
+
+		top = ((FXSAVE_ADDR (tdep, regs,
+				     I387_FSTAT_REGNUM (tdep)))[1] >> 3);
+		top &= 0x7;
+
+		for (fpreg = 7; fpreg >= 0; fpreg--)
+		  {
+		    int tag;
+
+		    if (val[0] & (1 << fpreg))
+		      {
+			int regnum = (fpreg + 8 - top) % 8 
+				       + I387_ST0_REGNUM (tdep);
+			tag = i387_tag (FXSAVE_ADDR (tdep, regs, regnum));
+		      }
+		    else
+		      tag = 3;		/* Empty */
+
+		    ftag |= tag << (2 * fpreg);
+		  }
+		val[0] = ftag & 0xff;
+		val[1] = (ftag >> 8) & 0xff;
+	      }
+	    regcache_raw_supply (regcache, i, val);
+	  }
+	else 
+	  regcache_raw_supply (regcache, i, FXSAVE_ADDR (tdep, regs, i));
+      }
+
+  if (regnum == I387_MXCSR_REGNUM (tdep) || regnum == -1)
+    {
+      p = regs == NULL ? NULL : FXSAVE_MXCSR_ADDR (regs);
+      regcache_raw_supply (regcache, I387_MXCSR_REGNUM (tdep), p);
+    }
+}
+
+/* Similar to i387_collect_fxsave, but use XSAVE extended state.  */
+
+void
+i387_collect_xsave (const struct regcache *regcache, int regnum,
+		    void *xsave, int gcore)
+{
+  struct gdbarch_tdep *tdep = gdbarch_tdep (get_regcache_arch (regcache));
+  gdb_byte *regs = xsave;
+  int i;
+  enum
+    {
+      none = 0x0,
+      check = 0x1,
+      x87 = 0x2 | check,
+      sse = 0x4 | check,
+      avxh = 0x8 | check,
+      all = x87 | sse | avxh
+    } regclass;
+
+  gdb_assert (tdep->st0_regnum >= I386_ST0_REGNUM);
+  gdb_assert (tdep->num_xmm_regs > 0);
+
+  if (regnum == -1)
+    regclass = all;
+  else if (regnum >= I387_YMM0H_REGNUM (tdep)
+	   && regnum < I387_YMMENDH_REGNUM (tdep))
+    regclass = avxh;
+  else if (regnum >= I387_XMM0_REGNUM(tdep)
+	   && regnum < I387_MXCSR_REGNUM (tdep))
+    regclass = sse;
+  else if (regnum >= I387_ST0_REGNUM (tdep)
+	   && regnum < I387_FCTRL_REGNUM (tdep))
+    regclass = x87;
+  else
+    regclass = none;
+
+  if (gcore)
+    {
+      /* Update XCR0 and `xstate_bv' with XCR0 for gcore.  */
+      if (tdep->xsave_xcr0_offset != -1)
+	memcpy (regs + tdep->xsave_xcr0_offset, &tdep->xcr0, 8);
+      memcpy (XSAVE_XSTATE_BV_ADDR (regs), &tdep->xcr0, 8);
+
+      switch (regclass)
+	{
+	default:
+	  abort ();
+
+	case all:
+	  /* Handle the upper YMM registers.  */
+	  if ((tdep->xcr0 & bit_I386_XSTATE_AVX))
+	    for (i = I387_YMM0H_REGNUM (tdep);
+		 i < I387_YMMENDH_REGNUM (tdep); i++)
+	      regcache_raw_collect (regcache, i,
+				    XSAVE_AVXH_ADDR (tdep, regs, i));
+
+	  /* Handle the XMM registers.  */
+	  if ((tdep->xcr0 & bit_I386_XSTATE_SSE))
+	    for (i = I387_XMM0_REGNUM (tdep);
+		 i < I387_MXCSR_REGNUM (tdep); i++)
+	      regcache_raw_collect (regcache, i,
+				    FXSAVE_ADDR (tdep, regs, i));
+
+	  /* Handle the x87 registers.  */
+	  if ((tdep->xcr0 & bit_I386_XSTATE_X87))
+	    for (i = I387_ST0_REGNUM (tdep);
+		 i < I387_FCTRL_REGNUM (tdep); i++)
+	      regcache_raw_collect (regcache, i,
+				    FXSAVE_ADDR (tdep, regs, i));
+	  break;
+
+	case x87:
+	  regcache_raw_collect (regcache, regnum,
+				FXSAVE_ADDR (tdep, regs, regnum));
+	  return;
+
+	case sse:
+	  regcache_raw_collect (regcache, regnum,
+				FXSAVE_ADDR (tdep, regs, regnum));
+	  return;
+
+	case avxh:
+	  regcache_raw_collect (regcache, regnum,
+				XSAVE_AVXH_ADDR (tdep, regs, regnum));
+	  return;
+	}
+    }
+  else
+    {
+      if ((regclass & check))
+	{
+	  gdb_byte raw[I386_MAX_REGISTER_SIZE];
+	  gdb_byte *xstate_bv_p = XSAVE_XSTATE_BV_ADDR (regs);
+	  unsigned int xstate_bv = 0;
+	  /* The supported bits in `xstate_bv' are 1 byte.  */
+	  unsigned int clear_bv = (~(*xstate_bv_p)) & tdep->xcr0;
+	  gdb_byte *p;
+
+	  /* Clear register set if its bit in xstate_bv is zero.  */
+	  if (clear_bv)
+	    {
+	      if ((clear_bv & bit_I386_XSTATE_AVX))
+		for (i = I387_YMM0H_REGNUM (tdep);
+		     i < I387_YMMENDH_REGNUM (tdep); i++)
+		  memset (XSAVE_AVXH_ADDR (tdep, regs, i), 0, 16);
+
+	      if ((clear_bv & bit_I386_XSTATE_SSE))
+		for (i = I387_XMM0_REGNUM (tdep);
+		     i < I387_MXCSR_REGNUM (tdep); i++)
+		  memset (FXSAVE_ADDR (tdep, regs, i), 0, 16);
+
+	      if ((clear_bv & bit_I386_XSTATE_X87))
+		for (i = I387_ST0_REGNUM (tdep);
+		     i < I387_FCTRL_REGNUM (tdep); i++)
+		  memset (FXSAVE_ADDR (tdep, regs, i), 0, 10);
+	    }
+
+	  if (regclass == all)
+	    {
+	      /* Check if any upper YMM registers are changed.  */
+	      if ((tdep->xcr0 & bit_I386_XSTATE_AVX))
+		for (i = I387_YMM0H_REGNUM (tdep);
+		     i < I387_YMMENDH_REGNUM (tdep); i++)
+		  {
+		    regcache_raw_collect (regcache, i, raw);
+		    p = XSAVE_AVXH_ADDR (tdep, regs, i);
+		    if (memcmp (raw, p, 16))
+		      {
+			xstate_bv |= bit_I386_XSTATE_AVX;
+			memcpy (p, raw, 16);
+		      }
+		  }
+
+	      /* Check if any SSE registers are changed.  */
+	      if ((tdep->xcr0 & bit_I386_XSTATE_SSE))
+		for (i = I387_XMM0_REGNUM (tdep);
+		     i < I387_MXCSR_REGNUM (tdep); i++)
+		  {
+		    regcache_raw_collect (regcache, i, raw);
+		    p = FXSAVE_ADDR (tdep, regs, i);
+		    if (memcmp (raw, p, 16))
+		      {
+			xstate_bv |= bit_I386_XSTATE_SSE;
+			memcpy (p, raw, 16);
+		      }
+		  }
+
+	      /* Check if any X87 registers are changed.  */
+	      if ((tdep->xcr0 & bit_I386_XSTATE_X87))
+		for (i = I387_ST0_REGNUM (tdep);
+		     i < I387_FCTRL_REGNUM (tdep); i++)
+		  {
+		    regcache_raw_collect (regcache, i, raw);
+		    p = FXSAVE_ADDR (tdep, regs, i);
+		    if (memcmp (raw, p, 10))
+		      {
+			xstate_bv |= bit_I386_XSTATE_X87;
+			memcpy (p, raw, 10);
+		      }
+		  }
+	    }
+	  else
+	    {
+	      /* Check if REGNUM is changed.  */
+	      regcache_raw_collect (regcache, regnum, raw);
+
+	      switch (regclass)
+		{
+		default:
+		  abort ();
+
+		case avxh:
+		  /* This is an upper YMM register.  */
+		  p = XSAVE_AVXH_ADDR (tdep, regs, regnum);
+		  if (memcmp (raw, p, 16))
+		    {
+		      xstate_bv |= bit_I386_XSTATE_AVX;
+		      memcpy (p, raw, 16);
+		    }
+		  break;
+
+		case sse:
+		  /* This is an SSE register.  */
+		  p = FXSAVE_ADDR (tdep, regs, regnum);
+		  if (memcmp (raw, p, 16))
+		    {
+		      xstate_bv |= bit_I386_XSTATE_SSE;
+		      memcpy (p, raw, 16);
+		    }
+		  break;
+
+		case x87:
+		  /* This is an x87 register.  */
+		  p = FXSAVE_ADDR (tdep, regs, regnum);
+		  if (memcmp (raw, p, 10))
+		    {
+		      xstate_bv |= bit_I386_XSTATE_X87;
+		      memcpy (p, raw, 10);
+		    }
+		  break;
+		}
+	    }
+
+	  /* Update the corresponding bits in `xstate_bv' if any x87/SSE/AVX
+	     registers are changed.  */
+	  if (xstate_bv)
+	    {
+	      /* The supported bits in `xstate_bv' are 1 byte.  */
+	      *xstate_bv_p |= (gdb_byte) xstate_bv;
+
+	      switch (regclass)
+		{
+		default:
+		  abort ();
+
+		case all:
+		  break;
+
+		case x87:
+		case sse:
+		case avxh:
+		  /* Register REGNUM has been updated.  Return.  */
+		  return;
+		}
+	    }
+	  else
+	    {
+	      /* Return if REGNUM isn't changed.  */
+	      if (regclass != all)
+		return;
+	    }
+	}
+    }
+
+  /* Only handle x87 control registers.  */
+  for (i = I387_FCTRL_REGNUM (tdep); i < I387_XMM0_REGNUM (tdep); i++)
+    if (regnum == -1 || regnum == i)
+      {
+	/* Most of the FPU control registers occupy only 16 bits in
+	   the xsave extended state.  Give those a special treatment.  */
+	if (i != I387_FIOFF_REGNUM (tdep)
+	    && i != I387_FOOFF_REGNUM (tdep))
+	  {
+	    gdb_byte buf[4];
+
+	    regcache_raw_collect (regcache, i, buf);
+
+	    if (i == I387_FOP_REGNUM (tdep))
+	      {
+		/* The opcode occupies only 11 bits.  Make sure we
+                   don't touch the other bits.  */
+		buf[1] &= ((1 << 3) - 1);
+		buf[1] |= ((FXSAVE_ADDR (tdep, regs, i))[1] & ~((1 << 3) - 1));
+	      }
+	    else if (i == I387_FTAG_REGNUM (tdep))
+	      {
+		/* Converting back is much easier.  */
+
+		unsigned short ftag;
+		int fpreg;
+
+		ftag = (buf[1] << 8) | buf[0];
+		buf[0] = 0;
+		buf[1] = 0;
+
+		for (fpreg = 7; fpreg >= 0; fpreg--)
+		  {
+		    int tag = (ftag >> (fpreg * 2)) & 3;
+
+		    if (tag != 3)
+		      buf[0] |= (1 << fpreg);
+		  }
+	      }
+	    memcpy (FXSAVE_ADDR (tdep, regs, i), buf, 2);
+	  }
+	else
+	  regcache_raw_collect (regcache, i, FXSAVE_ADDR (tdep, regs, i));
+      }
+
+  if (regnum == I387_MXCSR_REGNUM (tdep) || regnum == -1)
+    regcache_raw_collect (regcache, I387_MXCSR_REGNUM (tdep),
+			  FXSAVE_MXCSR_ADDR (regs));
+}
+
 /* Recreate the FTW (tag word) valid bits from the 80-bit FP data in
    *RAW.  */
 
diff --git a/gdb/i387-tdep.h b/gdb/i387-tdep.h
index 645eb91..976fa11 100644
--- a/gdb/i387-tdep.h
+++ b/gdb/i387-tdep.h
@@ -33,6 +33,8 @@ struct ui_file;
 #define I387_ST0_REGNUM(tdep) ((tdep)->st0_regnum)
 #define I387_NUM_XMM_REGS(tdep) ((tdep)->num_xmm_regs)
 #define I387_MM0_REGNUM(tdep) ((tdep)->mm0_regnum)
+#define I387_NUM_YMM_REGS(tdep) ((tdep)->num_ymm_regs)
+#define I387_YMM0H_REGNUM(tdep) ((tdep)->ymm0h_regnum)
 
 #define I387_FCTRL_REGNUM(tdep) (I387_ST0_REGNUM (tdep) + 8)
 #define I387_FSTAT_REGNUM(tdep) (I387_FCTRL_REGNUM (tdep) + 1)
@@ -45,6 +47,8 @@ struct ui_file;
 #define I387_XMM0_REGNUM(tdep) (I387_ST0_REGNUM (tdep) + 16)
 #define I387_MXCSR_REGNUM(tdep) \
   (I387_XMM0_REGNUM (tdep) + I387_NUM_XMM_REGS (tdep))
+#define I387_YMMENDH_REGNUM(tdep) \
+  (I387_YMM0H_REGNUM (tdep) + I387_NUM_YMM_REGS (tdep))
 
 /* Print out the i387 floating point state.  */
 
@@ -99,6 +103,11 @@ extern void i387_collect_fsave (const struct regcache *regcache, int regnum,
 extern void i387_supply_fxsave (struct regcache *regcache, int regnum,
 				const void *fxsave);
 
+/* Similar to i387_supply_fxsave, but use XSAVE extended state.  */
+
+extern void i387_supply_xsave (struct regcache *regcache, int regnum,
+			       const void *xsave);
+
 /* Fill register REGNUM (if it is a floating-point or SSE register) in
    *FXSAVE with the value from REGCACHE.  If REGNUM is -1, do this for
    all registers.  This function doesn't touch any of the reserved
@@ -107,6 +116,11 @@ extern void i387_supply_fxsave (struct regcache *regcache, int regnum,
 extern void i387_collect_fxsave (const struct regcache *regcache, int regnum,
 				 void *fxsave);
 
+/* Similar to i387_collect_fxsave, but use XSAVE extended state.  */
+
+extern void i387_collect_xsave (const struct regcache *regcache,
+				int regnum, void *xsave, int gcore);
+
 /* Prepare the FPU stack in REGCACHE for a function return.  */
 
 extern void i387_return_value (struct gdbarch *gdbarch,

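The xstate_bv bookkeeping done by i387_collect_xsave in the patch above
can be sketched in isolation.  This is only a minimal illustration and
not code from the patch: the helper names xstate_clear_mask and
xstate_update_slot and the SKETCH_ prefixed macros are hypothetical.
The only facts assumed are the architectural feature bits (x87 is
bit 0, SSE is bit 1, AVX is bit 2).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Architectural feature bits shared by XCR0 and xstate_bv.  */
#define SKETCH_XSTATE_X87 (1ULL << 0)
#define SKETCH_XSTATE_SSE (1ULL << 1)
#define SKETCH_XSTATE_AVX (1ULL << 2)

/* Hypothetical helper: return the features that are enabled in XCR0
   but whose bit is clear in XSTATE_BV.  Their registers must be
   treated as zero.  */
static uint64_t
xstate_clear_mask (uint64_t xstate_bv, uint64_t xcr0)
{
  return ~xstate_bv & xcr0;
}

/* Hypothetical helper: store NEW_VAL (LEN bytes) into SLOT only if
   it differs.  Return FEATURE if SLOT was modified, 0 otherwise, so
   the caller can OR the result into xstate_bv.  */
static uint64_t
xstate_update_slot (unsigned char *slot, const unsigned char *new_val,
		    size_t len, uint64_t feature)
{
  assert (slot != NULL && new_val != NULL);

  if (memcmp (slot, new_val, len) == 0)
    return 0;

  memcpy (slot, new_val, len);
  return feature;
}
```

A caller would first zero every register set named by
xstate_clear_mask, then OR the results of xstate_update_slot into
xstate_bv, so a feature bit is only set when a register really changed.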
^ permalink raw reply	[flat|nested] 115+ messages in thread

* PATCH: 6/6 [2nd try]: Add AVX support (gdbserver changes)
  2010-03-04 18:10       ` PATCH: 6/6: Add AVX support (gdbserver changes) H.J. Lu
@ 2010-03-06 22:23         ` H.J. Lu
  2010-03-12 17:25           ` H.J. Lu
  0 siblings, 1 reply; 115+ messages in thread
From: H.J. Lu @ 2010-03-06 22:23 UTC (permalink / raw)
  To: GDB

Hi,

Here are the gdbserver changes to support AVX.  OK to install?

Thanks.


H.J.
----
2010-03-06  H.J. Lu  <hongjiu.lu@intel.com>

	* Makefile.in (clean): Updated.
	(i386-avx.o): New.
	(i386-avx.c): Likewise.
	(i386-avx-linux.o): Likewise.
	(i386-avx-linux.c): Likewise.
	(amd64-avx.o): Likewise.
	(amd64-avx.c): Likewise.
	(amd64-avx-linux.o): Likewise.
	(amd64-avx-linux.c): Likewise.

	* configure.srv (srv_i386_regobj): Add i386-avx.o.
	(srv_i386_linux_regobj): Add i386-avx-linux.o.
	(srv_amd64_regobj): Add amd64-avx.o.
	(srv_amd64_linux_regobj): Add amd64-avx-linux.o.
	(srv_i386_32bit_xmlfiles): Add i386/32bit-avx.xml.
	(srv_i386_64bit_xmlfiles): Add i386/64bit-avx.xml.
	(srv_i386_xmlfiles): Add i386/i386-avx.xml.
	(srv_amd64_xmlfiles): Add i386/amd64-avx.xml.
	(srv_i386_linux_xmlfiles): Add i386/i386-avx-linux.xml.
	(srv_amd64_linux_xmlfiles): Add i386/amd64-avx-linux.xml.

	* i387-fp.c: Include "i386-xstate.h".
	(i387_xsave): New.
	(i387_cache_to_xsave): Likewise.
	(i387_xsave_to_cache): Likewise.
	(x86_xcr0): Likewise.

	* i387-fp.h (i387_cache_to_xsave): Likewise.
	(i387_xsave_to_cache): Likewise.
	(x86_xcr0): Likewise.

	* linux-arm-low.c (target_regsets): Initialize nt_type to 0.
	* linux-crisv32-low.c (target_regsets): Likewise.
	* linux-m68k-low.c (target_regsets): Likewise.
	* linux-mips-low.c (target_regsets): Likewise.
	* linux-ppc-low.c (target_regsets): Likewise.
	* linux-s390-low.c (target_regsets): Likewise.
	* linux-sh-low.c (target_regsets): Likewise.
	* linux-sparc-low.c (target_regsets): Likewise.
	* linux-xtensa-low.c (target_regsets): Likewise.

	* linux-low.c: Include <sys/uio.h>.
	(regsets_fetch_inferior_registers): Support nt_type.
	(regsets_store_inferior_registers): Likewise.
	(linux_process_qsupported): New.
	(linux_target_ops): Add linux_process_qsupported.

	* linux-low.h (regset_info): Add nt_type.
	(linux_target_ops): Add process_qsupported.

	* linux-x86-low.c: Include "i386-xstate.h", "elf/common.h" and
	<sys/uio.h>.
	(init_registers_i386_avx_linux): New.
	(init_registers_amd64_avx_linux): Likewise.
	(PTRACE_GETREGSET): Likewise.
	(PTRACE_SETREGSET): Likewise.
	(x86_fill_xstateregset): Likewise.
	(x86_store_xstateregset): Likewise.
	(x86_linux_process_qsupported): Likewise.
	(target_regsets): Add NT_X86_XSTATE entry and initialize nt_type.
	(the_low_target): Add x86_linux_process_qsupported.

	* server.c (use_xml): New.
	(get_features_xml): Don't use XML file if use_xml is 0.
	(handle_query): Call target_process_qsupported.

	* server.h (use_xml): New.

	* target.h (target_ops): Add process_qsupported.
	(target_process_qsupported): New.
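
For orientation, the byte offsets that the new i387_xsave structure in
i387-fp.c relies on can be checked with a small standalone sketch.  It
is an illustration under stated assumptions, not part of the patch: the
structure and field names below are hypothetical, and the first 464
bytes are left opaque instead of being broken into x87 and SSE fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Sketch of the XSAVE area as used for x87, SSE and AVX state.  The
   field names are hypothetical; only the offsets matter.  */
struct xsave_layout_sketch
{
  uint8_t legacy[464];		/* FXSAVE image up to the software bytes.  */
  uint64_t sw_xcr0;		/* Offset 464: XCR0 as stored by Linux.  */
  uint8_t sw_rest[40];		/* Remaining software-usable bytes.  */
  uint64_t xstate_bv;		/* Offset 512: start of the XSAVE header.  */
  uint8_t header_rest[56];	/* Rest of the 64-byte header.  */
  uint8_t ymmh_space[256];	/* Offset 576: upper halves of YMM0-15.  */
};

_Static_assert (offsetof (struct xsave_layout_sketch, sw_xcr0) == 464,
		"XCR0 copy must sit at byte 464");
_Static_assert (offsetof (struct xsave_layout_sketch, xstate_bv) == 512,
		"xstate_bv must sit at byte 512");
_Static_assert (offsetof (struct xsave_layout_sketch, ymmh_space) == 576,
		"upper YMM halves must sit at byte 576");

/* Read xstate_bv from a raw XSAVE buffer without assuming that the
   buffer is suitably aligned.  */
static uint64_t
xsave_read_xstate_bv (const void *xsave)
{
  uint64_t bv;

  assert (xsave != NULL);
  memcpy (&bv, (const unsigned char *) xsave
	       + offsetof (struct xsave_layout_sketch, xstate_bv),
	  sizeof bv);
  return bv;
}
```

The static assertions fail at compile time if a padding array is
resized by mistake, which is a cheap way to keep such a structure in
step with the hardware layout.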

diff --git a/gdb/gdbserver/Makefile.in b/gdb/gdbserver/Makefile.in
index 7fecced..2ec9784 100644
--- a/gdb/gdbserver/Makefile.in
+++ b/gdb/gdbserver/Makefile.in
@@ -217,6 +217,8 @@ clean:
 	rm -f powerpc-isa205-vsx64l.c
 	rm -f s390-linux32.c s390-linux64.c s390x-linux64.c
 	rm -f xml-builtin.c stamp-xml
+	rm -f i386-avx.c i386-avx-linux.c
+	rm -f amd64-avx.c amd64-avx-linux.c
 
 maintainer-clean realclean distclean: clean
 	rm -f nm.h tm.h xm.h config.status config.h stamp-h config.log
@@ -351,6 +353,12 @@ i386.c : $(srcdir)/../regformats/i386/i386.dat $(regdat_sh)
 i386-linux.o : i386-linux.c $(regdef_h)
 i386-linux.c : $(srcdir)/../regformats/i386/i386-linux.dat $(regdat_sh)
 	$(SHELL) $(regdat_sh) $(srcdir)/../regformats/i386/i386-linux.dat i386-linux.c
+i386-avx.o : i386-avx.c $(regdef_h)
+i386-avx.c : $(srcdir)/../regformats/i386/i386-avx.dat $(regdat_sh)
+	$(SHELL) $(regdat_sh) $(srcdir)/../regformats/i386/i386-avx.dat i386-avx.c
+i386-avx-linux.o : i386-avx-linux.c $(regdef_h)
+i386-avx-linux.c : $(srcdir)/../regformats/i386/i386-avx-linux.dat $(regdat_sh)
+	$(SHELL) $(regdat_sh) $(srcdir)/../regformats/i386/i386-avx-linux.dat i386-avx-linux.c
 reg-ia64.o : reg-ia64.c $(regdef_h)
 reg-ia64.c : $(srcdir)/../regformats/reg-ia64.dat $(regdat_sh)
 	$(SHELL) $(regdat_sh) $(srcdir)/../regformats/reg-ia64.dat reg-ia64.c
@@ -438,6 +446,12 @@ amd64.c : $(srcdir)/../regformats/i386/amd64.dat $(regdat_sh)
 amd64-linux.o : amd64-linux.c $(regdef_h)
 amd64-linux.c : $(srcdir)/../regformats/i386/amd64-linux.dat $(regdat_sh)
 	$(SHELL) $(regdat_sh) $(srcdir)/../regformats/i386/amd64-linux.dat amd64-linux.c
+amd64-avx.o : amd64-avx.c $(regdef_h)
+amd64-avx.c : $(srcdir)/../regformats/i386/amd64-avx.dat $(regdat_sh)
+	$(SHELL) $(regdat_sh) $(srcdir)/../regformats/i386/amd64-avx.dat amd64-avx.c
+amd64-avx-linux.o : amd64-avx-linux.c $(regdef_h)
+amd64-avx-linux.c : $(srcdir)/../regformats/i386/amd64-avx-linux.dat $(regdat_sh)
+	$(SHELL) $(regdat_sh) $(srcdir)/../regformats/i386/amd64-avx-linux.dat amd64-avx-linux.c
 reg-xtensa.o : reg-xtensa.c $(regdef_h)
 reg-xtensa.c : $(srcdir)/../regformats/reg-xtensa.dat $(regdat_sh)
 	$(SHELL) $(regdat_sh) $(srcdir)/../regformats/reg-xtensa.dat reg-xtensa.c
diff --git a/gdb/gdbserver/configure.srv b/gdb/gdbserver/configure.srv
index e5818cd..a2f4323 100644
--- a/gdb/gdbserver/configure.srv
+++ b/gdb/gdbserver/configure.srv
@@ -22,17 +22,17 @@
 # Default hostio_last_error implementation
 srv_hostio_err_objs="hostio-errno.o"
 
-srv_i386_regobj=i386.o
-srv_i386_linux_regobj=i386-linux.o
-srv_amd64_regobj=amd64.o
-srv_amd64_linux_regobj=amd64-linux.o
+srv_i386_regobj="i386.o i386-avx.o"
+srv_i386_linux_regobj="i386-linux.o i386-avx-linux.o"
+srv_amd64_regobj="amd64.o amd64-avx.o"
+srv_amd64_linux_regobj="amd64-linux.o amd64-avx-linux.o"
 
-srv_i386_32bit_xmlfiles="i386/32bit-core.xml i386/32bit-sse.xml"
-srv_i386_64bit_xmlfiles="i386/64bit-core.xml i386/64bit-sse.xml"
-srv_i386_xmlfiles="i386/i386.xml $srv_i386_32bit_xmlfiles"
-srv_amd64_xmlfiles="i386/amd64.xml $srv_i386_64bit_xmlfiles"
-srv_i386_linux_xmlfiles="i386/i386-linux.xml i386/32bit-linux.xml $srv_i386_32bit_xmlfiles"
-srv_amd64_linux_xmlfiles="i386/amd64-linux.xml i386/64bit-linux.xml $srv_i386_64bit_xmlfiles"
+srv_i386_32bit_xmlfiles="i386/32bit-core.xml i386/32bit-sse.xml i386/32bit-avx.xml"
+srv_i386_64bit_xmlfiles="i386/64bit-core.xml i386/64bit-sse.xml i386/64bit-avx.xml"
+srv_i386_xmlfiles="i386/i386.xml i386/i386-avx.xml $srv_i386_32bit_xmlfiles"
+srv_amd64_xmlfiles="i386/amd64.xml i386/amd64-avx.xml $srv_i386_64bit_xmlfiles"
+srv_i386_linux_xmlfiles="i386/i386-linux.xml i386/i386-avx-linux.xml i386/32bit-linux.xml $srv_i386_32bit_xmlfiles"
+srv_amd64_linux_xmlfiles="i386/amd64-linux.xml i386/amd64-avx-linux.xml i386/64bit-linux.xml $srv_i386_64bit_xmlfiles"
 
 # Input is taken from the "${target}" variable.
 
diff --git a/gdb/gdbserver/i387-fp.c b/gdb/gdbserver/i387-fp.c
index 7ef4ba3..3e60882 100644
--- a/gdb/gdbserver/i387-fp.c
+++ b/gdb/gdbserver/i387-fp.c
@@ -19,6 +19,7 @@
 
 #include "server.h"
 #include "i387-fp.h"
+#include "i386-xstate.h"
 
 int num_xmm_registers = 8;
 
@@ -72,6 +73,46 @@ struct i387_fxsave {
   unsigned char xmm_space[256];
 };
 
+struct i387_xsave {
+  /* All these are only sixteen bits, plus padding, except for fop (which
+     is only eleven bits), and fooff / fioff (which are 32 bits each).  */
+  unsigned short fctrl;
+  unsigned short fstat;
+  unsigned short ftag;
+  unsigned short fop;
+  unsigned int fioff;
+  unsigned short fiseg;
+  unsigned short pad1;
+  unsigned int fooff;
+  unsigned short foseg;
+  unsigned short pad2;
+
+  unsigned int mxcsr;
+  unsigned int mxcsr_mask;
+
+  /* Space for eight 80-bit FP values in 128-bit spaces.  */
+  unsigned char st_space[128];
+
+  /* Space for eight 128-bit XMM values, or 16 on x86-64.  */
+  unsigned char xmm_space[256];
+
+  unsigned char reserved1[48];
+
+  /* The extended control register 0 (the XFEATURE_ENABLED_MASK
+     register).  */
+  unsigned long long xcr0;
+
+  unsigned char reserved2[40];
+
+  /* The XSTATE_BV bit vector.  */
+  unsigned long long xstate_bv;
+
+  unsigned char reserved3[56];
+
+  /* Space for eight upper 128-bit YMM values, or 16 on x86-64.  */
+  unsigned char ymmh_space[256];
+};
+
 void
 i387_cache_to_fsave (struct regcache *regcache, void *buf)
 {
@@ -199,6 +240,128 @@ i387_cache_to_fxsave (struct regcache *regcache, void *buf)
   fp->foseg = val;
 }
 
+void
+i387_cache_to_xsave (struct regcache *regcache, void *buf)
+{
+  struct i387_xsave *fp = (struct i387_xsave *) buf;
+  int i;
+  unsigned long val, val2;
+  unsigned int clear_bv;
+  unsigned long long xstate_bv = 0;
+  char raw[16];
+  char *p;
+
+  /* The supported bits in `xstate_bv' are 1 byte.  Clear part in
+     vector registers if its bit in xstate_bv is zero.  */
+  clear_bv = (~fp->xstate_bv) & x86_xcr0;
+
+  /* Clear part in x87 and vector registers if its bit in xstate_bv is
+     zero.  */
+  if (clear_bv)
+    {
+      if ((clear_bv & bit_I386_XSTATE_X87))
+	for (i = 0; i < 8; i++)
+	  memset (((char *) &fp->st_space[0]) + i * 16, 0, 10);
+
+      if ((clear_bv & bit_I386_XSTATE_SSE))
+	for (i = 0; i < num_xmm_registers; i++) 
+	  memset (((char *) &fp->xmm_space[0]) + i * 16, 0, 16);
+
+      if ((clear_bv & bit_I386_XSTATE_AVX))
+	for (i = 0; i < num_xmm_registers; i++) 
+	  memset (((char *) &fp->ymmh_space[0]) + i * 16, 0, 16);
+    }
+
+  /* Check if any x87 registers are changed.  */
+  if ((x86_xcr0 & bit_I386_XSTATE_X87))
+    {
+      int st0_regnum = find_regno ("st0");
+
+      for (i = 0; i < 8; i++)
+	{
+	  collect_register (regcache, i + st0_regnum, raw);
+	  p = ((char *) &fp->st_space[0]) + i * 16;
+	  if (memcmp (raw, p, 10))
+	    {
+	      xstate_bv |= bit_I386_XSTATE_X87;
+	      memcpy (p, raw, 10);
+	    }
+	}
+    }
+
+  /* Check if any SSE registers are changed.  */
+  if ((x86_xcr0 & bit_I386_XSTATE_SSE))
+    {
+      int xmm0_regnum = find_regno ("xmm0");
+
+      for (i = 0; i < num_xmm_registers; i++) 
+	{
+	  collect_register (regcache, i + xmm0_regnum, raw);
+	  p = ((char *) &fp->xmm_space[0]) + i * 16;
+	  if (memcmp (raw, p, 16))
+	    {
+	      xstate_bv |= bit_I386_XSTATE_SSE;
+	      memcpy (p, raw, 16);
+	    }
+	}
+    }
+
+  /* Check if any AVX registers are changed.  */
+  if ((x86_xcr0 & bit_I386_XSTATE_AVX))
+    {
+      int ymm0h_regnum = find_regno ("ymm0h");
+
+      for (i = 0; i < num_xmm_registers; i++) 
+	{
+	  collect_register (regcache, i + ymm0h_regnum, raw);
+	  p = ((char *) &fp->ymmh_space[0]) + i * 16;
+	  if (memcmp (raw, p, 16))
+	    {
+	      xstate_bv |= bit_I386_XSTATE_AVX;
+	      memcpy (p, raw, 16);
+	    }
+	}
+    }
+
+  /* Update the corresponding bits in xstate_bv if any x87/SSE/AVX
+     registers are changed.  */
+  fp->xstate_bv |= xstate_bv;
+
+  collect_register_by_name (regcache, "fioff", &fp->fioff);
+  collect_register_by_name (regcache, "fooff", &fp->fooff);
+  collect_register_by_name (regcache, "mxcsr", &fp->mxcsr);
+
+  /* This one's 11 bits... */
+  collect_register_by_name (regcache, "fop", &val2);
+  fp->fop = (val2 & 0x7FF) | (fp->fop & 0xF800);
+
+  /* Some registers are 16-bit.  */
+  collect_register_by_name (regcache, "fctrl", &val);
+  fp->fctrl = val;
+
+  collect_register_by_name (regcache, "fstat", &val);
+  fp->fstat = val;
+
+  /* Convert to the simplified tag form stored in fxsave data.  */
+  collect_register_by_name (regcache, "ftag", &val);
+  val &= 0xFFFF;
+  val2 = 0;
+  for (i = 7; i >= 0; i--)
+    {
+      int tag = (val >> (i * 2)) & 3;
+
+      if (tag != 3)
+	val2 |= (1 << i);
+    }
+  fp->ftag = val2;
+
+  collect_register_by_name (regcache, "fiseg", &val);
+  fp->fiseg = val;
+
+  collect_register_by_name (regcache, "foseg", &val);
+  fp->foseg = val;
+}
+
 static int
 i387_ftag (struct i387_fxsave *fp, int regno)
 {
@@ -296,3 +459,107 @@ i387_fxsave_to_cache (struct regcache *regcache, const void *buf)
   val = (fp->fop) & 0x7FF;
   supply_register_by_name (regcache, "fop", &val);
 }
+
+void
+i387_xsave_to_cache (struct regcache *regcache, const void *buf)
+{
+  struct i387_xsave *fp = (struct i387_xsave *) buf;
+  struct i387_fxsave *fxp = (struct i387_fxsave *) buf;
+  int i, top;
+  unsigned long val;
+  unsigned int clear_bv;
+  char *p;
+
+  /* The supported bits in `xstate_bv' are 1 byte.  Clear part in
+     vector registers if its bit in xstate_bv is zero.  */
+  clear_bv = (~fp->xstate_bv) & x86_xcr0;
+
+  /* Supply the x87 registers.  */
+  if ((x86_xcr0 & bit_I386_XSTATE_X87))
+    {
+      int st0_regnum = find_regno ("st0");
+
+      if ((clear_bv & bit_I386_XSTATE_X87))
+	p = NULL;
+      else
+	p = (char *) buf;
+
+      for (i = 0; i < 8; i++)
+	{
+	  if (p)
+	    p = ((char *) &fp->st_space[0]) + i * 16;
+	  supply_register (regcache, i + st0_regnum, p);
+	}
+    }
+
+  if ((x86_xcr0 & bit_I386_XSTATE_SSE))
+    {
+      int xmm0_regnum = find_regno ("xmm0");
+
+      if ((clear_bv & bit_I386_XSTATE_SSE))
+	p = NULL;
+      else
+	p = (char *) buf;
+
+      for (i = 0; i < num_xmm_registers; i++)
+	{
+	  if (p)
+	    p = ((char *) &fp->xmm_space[0]) + i * 16;
+	  supply_register (regcache, i + xmm0_regnum, p);
+	}
+    }
+
+  if ((x86_xcr0 & bit_I386_XSTATE_AVX))
+    {
+      int ymm0h_regnum = find_regno ("ymm0h");
+
+      if ((clear_bv & bit_I386_XSTATE_AVX))
+	p = NULL;
+      else
+	p = (char *) buf;
+
+      for (i = 0; i < num_xmm_registers; i++)
+	{
+	  if (p)
+	    p = ((char *) &fp->ymmh_space[0]) + i * 16;
+	  supply_register (regcache, i + ymm0h_regnum, p);
+	}
+    }
+
+  supply_register_by_name (regcache, "fioff", &fp->fioff);
+  supply_register_by_name (regcache, "fooff", &fp->fooff);
+  supply_register_by_name (regcache, "mxcsr", &fp->mxcsr);
+
+  /* Some registers are 16-bit.  */
+  val = fp->fctrl & 0xFFFF;
+  supply_register_by_name (regcache, "fctrl", &val);
+
+  val = fp->fstat & 0xFFFF;
+  supply_register_by_name (regcache, "fstat", &val);
+
+  /* Generate the form of ftag data that GDB expects.  */
+  top = (fp->fstat >> 11) & 0x7;
+  val = 0;
+  for (i = 7; i >= 0; i--)
+    {
+      int tag;
+      if (fp->ftag & (1 << i))
+	tag = i387_ftag (fxp, (i + 8 - top) % 8);
+      else
+	tag = 3;
+      val |= tag << (2 * i);
+    }
+  supply_register_by_name (regcache, "ftag", &val);
+
+  val = fp->fiseg & 0xFFFF;
+  supply_register_by_name (regcache, "fiseg", &val);
+
+  val = fp->foseg & 0xFFFF;
+  supply_register_by_name (regcache, "foseg", &val);
+
+  val = (fp->fop) & 0x7FF;
+  supply_register_by_name (regcache, "fop", &val);
+}
+
+/* Default to SSE.  */
+unsigned long long x86_xcr0 = I386_XSTATE_SSE_MASK;
diff --git a/gdb/gdbserver/i387-fp.h b/gdb/gdbserver/i387-fp.h
index d1e0681..ed1a322 100644
--- a/gdb/gdbserver/i387-fp.h
+++ b/gdb/gdbserver/i387-fp.h
@@ -26,6 +26,11 @@ void i387_fsave_to_cache (struct regcache *regcache, const void *buf);
 void i387_cache_to_fxsave (struct regcache *regcache, void *buf);
 void i387_fxsave_to_cache (struct regcache *regcache, const void *buf);
 
+void i387_cache_to_xsave (struct regcache *regcache, void *buf);
+void i387_xsave_to_cache (struct regcache *regcache, const void *buf);
+
+extern unsigned long long x86_xcr0;
+
 extern int num_xmm_registers;
 
 #endif /* I387_FP_H */
diff --git a/gdb/gdbserver/linux-arm-low.c b/gdb/gdbserver/linux-arm-low.c
index 54668f8..32bd7bb 100644
--- a/gdb/gdbserver/linux-arm-low.c
+++ b/gdb/gdbserver/linux-arm-low.c
@@ -354,16 +354,16 @@ arm_arch_setup (void)
 }
 
 struct regset_info target_regsets[] = {
-  { PTRACE_GETREGS, PTRACE_SETREGS, 18 * 4,
+  { PTRACE_GETREGS, PTRACE_SETREGS, 0, 18 * 4,
     GENERAL_REGS,
     arm_fill_gregset, arm_store_gregset },
-  { PTRACE_GETWMMXREGS, PTRACE_SETWMMXREGS, 16 * 8 + 6 * 4,
+  { PTRACE_GETWMMXREGS, PTRACE_SETWMMXREGS, 0, 16 * 8 + 6 * 4,
     EXTENDED_REGS,
     arm_fill_wmmxregset, arm_store_wmmxregset },
-  { PTRACE_GETVFPREGS, PTRACE_SETVFPREGS, 32 * 8 + 4,
+  { PTRACE_GETVFPREGS, PTRACE_SETVFPREGS, 0, 32 * 8 + 4,
     EXTENDED_REGS,
     arm_fill_vfpregset, arm_store_vfpregset },
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 struct linux_target_ops the_low_target = {
diff --git a/gdb/gdbserver/linux-crisv32-low.c b/gdb/gdbserver/linux-crisv32-low.c
index 6ba48b6..d426c32 100644
--- a/gdb/gdbserver/linux-crisv32-low.c
+++ b/gdb/gdbserver/linux-crisv32-low.c
@@ -365,9 +365,9 @@ cris_store_gregset (const void *buf)
 typedef unsigned long elf_gregset_t[cris_num_regs];
 
 struct regset_info target_regsets[] = {
-  { PTRACE_GETREGS, PTRACE_SETREGS, sizeof (elf_gregset_t),
+  { PTRACE_GETREGS, PTRACE_SETREGS, 0, sizeof (elf_gregset_t),
     GENERAL_REGS, cris_fill_gregset, cris_store_gregset },
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 struct linux_target_ops the_low_target = {
diff --git a/gdb/gdbserver/linux-low.c b/gdb/gdbserver/linux-low.c
index 6499ca7..4edb152 100644
--- a/gdb/gdbserver/linux-low.c
+++ b/gdb/gdbserver/linux-low.c
@@ -39,6 +39,7 @@
 #include <dirent.h>
 #include <sys/stat.h>
 #include <sys/vfs.h>
+#include <sys/uio.h>
 #ifndef ELFMAG0
 /* Don't include <linux/elf.h> here.  If it got included by gdb_proc_service.h
    then ELFMAG0 will have been defined.  If it didn't get included by
@@ -2281,14 +2282,15 @@ regsets_fetch_inferior_registers (struct regcache *regcache)
   struct regset_info *regset;
   int saw_general_regs = 0;
   int pid;
+  struct iovec iov;
 
   regset = target_regsets;
 
   pid = lwpid_of (get_thread_lwp (current_inferior));
   while (regset->size >= 0)
     {
-      void *buf;
-      int res;
+      void *buf, *data;
+      int nt_type, res;
 
       if (regset->size == 0 || disabled_regsets[regset - target_regsets])
 	{
@@ -2297,10 +2299,21 @@ regsets_fetch_inferior_registers (struct regcache *regcache)
 	}
 
       buf = xmalloc (regset->size);
+
+      nt_type = regset->nt_type;
+      if (nt_type)
+	{
+	  iov.iov_base = buf;
+	  iov.iov_len = regset->size;
+	  data = (void *) &iov;
+	}
+      else
+	data = buf;
+
 #ifndef __sparc__
-      res = ptrace (regset->get_request, pid, 0, buf);
+      res = ptrace (regset->get_request, pid, nt_type, data);
 #else
-      res = ptrace (regset->get_request, pid, buf, 0);
+      res = ptrace (regset->get_request, pid, data, nt_type);
 #endif
       if (res < 0)
 	{
@@ -2338,14 +2351,15 @@ regsets_store_inferior_registers (struct regcache *regcache)
   struct regset_info *regset;
   int saw_general_regs = 0;
   int pid;
+  struct iovec iov;
 
   regset = target_regsets;
 
   pid = lwpid_of (get_thread_lwp (current_inferior));
   while (regset->size >= 0)
     {
-      void *buf;
-      int res;
+      void *buf, *data;
+      int nt_type, res;
 
       if (regset->size == 0 || disabled_regsets[regset - target_regsets])
 	{
@@ -2358,10 +2372,21 @@ regsets_store_inferior_registers (struct regcache *regcache)
       /* First fill the buffer with the current register set contents,
 	 in case there are any items in the kernel's regset that are
 	 not in gdbserver's regcache.  */
+
+      nt_type = regset->nt_type;
+      if (nt_type)
+	{
+	  iov.iov_base = buf;
+	  iov.iov_len = regset->size;
+	  data = (void *) &iov;
+	}
+      else
+	data = buf;
+
 #ifndef __sparc__
-      res = ptrace (regset->get_request, pid, 0, buf);
+      res = ptrace (regset->get_request, pid, nt_type, data);
 #else
-      res = ptrace (regset->get_request, pid, buf, 0);
+      res = ptrace (regset->get_request, pid, data, nt_type);
 #endif
 
       if (res == 0)
@@ -2371,9 +2396,9 @@ regsets_store_inferior_registers (struct regcache *regcache)
 
 	  /* Only now do we write the register set.  */
 #ifndef __sparc__
-	  res = ptrace (regset->set_request, pid, 0, buf);
+	  res = ptrace (regset->set_request, pid, nt_type, data);
 #else
-	  res = ptrace (regset->set_request, pid, buf, 0);
+	  res = ptrace (regset->set_request, pid, data, nt_type);
 #endif
 	}
 
@@ -3434,6 +3459,13 @@ linux_core_of_thread (ptid_t ptid)
   return core;
 }
 
+static void
+linux_process_qsupported (const char *query)
+{
+  if (the_low_target.process_qsupported != NULL)
+    the_low_target.process_qsupported (query);
+}
+
 static struct target_ops linux_target_ops = {
   linux_create_inferior,
   linux_attach,
@@ -3477,7 +3509,8 @@ static struct target_ops linux_target_ops = {
 #else
   NULL,
 #endif
-  linux_core_of_thread
+  linux_core_of_thread,
+  linux_process_qsupported
 };
 
 static void
diff --git a/gdb/gdbserver/linux-low.h b/gdb/gdbserver/linux-low.h
index 82ad00c..57e7adb 100644
--- a/gdb/gdbserver/linux-low.h
+++ b/gdb/gdbserver/linux-low.h
@@ -35,6 +35,9 @@ enum regset_type {
 struct regset_info
 {
   int get_request, set_request;
+  /* If NT_TYPE isn't 0, it will be passed to ptrace as the 3rd
+     argument and the 4th argument should be "const struct iovec *".  */
+  int nt_type;
   int size;
   enum regset_type type;
   regset_fill_func fill_function;
@@ -111,6 +114,9 @@ struct linux_target_ops
 
   /* Hook to call prior to resuming a thread.  */
   void (*prepare_to_resume) (struct lwp_info *);
+
+  /* Hook to support target specific qSupported.  */
+  void (*process_qsupported) (const char *);
 };
 
 extern struct linux_target_ops the_low_target;
diff --git a/gdb/gdbserver/linux-m68k-low.c b/gdb/gdbserver/linux-m68k-low.c
index 14e3864..6c98bb1 100644
--- a/gdb/gdbserver/linux-m68k-low.c
+++ b/gdb/gdbserver/linux-m68k-low.c
@@ -112,14 +112,14 @@ m68k_store_fpregset (struct regcache *regcache, const void *buf)
 
 struct regset_info target_regsets[] = {
 #ifdef HAVE_PTRACE_GETREGS
-  { PTRACE_GETREGS, PTRACE_SETREGS, sizeof (elf_gregset_t),
+  { PTRACE_GETREGS, PTRACE_SETREGS, 0, sizeof (elf_gregset_t),
     GENERAL_REGS,
     m68k_fill_gregset, m68k_store_gregset },
-  { PTRACE_GETFPREGS, PTRACE_SETFPREGS, sizeof (elf_fpregset_t),
+  { PTRACE_GETFPREGS, PTRACE_SETFPREGS, 0, sizeof (elf_fpregset_t),
     FP_REGS,
     m68k_fill_fpregset, m68k_store_fpregset },
 #endif /* HAVE_PTRACE_GETREGS */
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 static const unsigned char m68k_breakpoint[] = { 0x4E, 0x4F };
diff --git a/gdb/gdbserver/linux-mips-low.c b/gdb/gdbserver/linux-mips-low.c
index 70f6700..1c04b2e 100644
--- a/gdb/gdbserver/linux-mips-low.c
+++ b/gdb/gdbserver/linux-mips-low.c
@@ -343,12 +343,12 @@ mips_store_fpregset (struct regcache *regcache, const void *buf)
 
 struct regset_info target_regsets[] = {
 #ifdef HAVE_PTRACE_GETREGS
-  { PTRACE_GETREGS, PTRACE_SETREGS, 38 * 8, GENERAL_REGS,
+  { PTRACE_GETREGS, PTRACE_SETREGS, 0, 38 * 8, GENERAL_REGS,
     mips_fill_gregset, mips_store_gregset },
-  { PTRACE_GETFPREGS, PTRACE_SETFPREGS, 33 * 8, FP_REGS,
+  { PTRACE_GETFPREGS, PTRACE_SETFPREGS, 0, 33 * 8, FP_REGS,
     mips_fill_fpregset, mips_store_fpregset },
 #endif /* HAVE_PTRACE_GETREGS */
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 struct linux_target_ops the_low_target = {
diff --git a/gdb/gdbserver/linux-ppc-low.c b/gdb/gdbserver/linux-ppc-low.c
index 10a1309..0dab604 100644
--- a/gdb/gdbserver/linux-ppc-low.c
+++ b/gdb/gdbserver/linux-ppc-low.c
@@ -593,14 +593,14 @@ struct regset_info target_regsets[] = {
      fetch them every time, but still fall back to PTRACE_PEEKUSER for the
      general registers.  Some kernels support these, but not the newer
      PPC_PTRACE_GETREGS.  */
-  { PTRACE_GETVSXREGS, PTRACE_SETVSXREGS, SIZEOF_VSXREGS, EXTENDED_REGS,
+  { PTRACE_GETVSXREGS, PTRACE_SETVSXREGS, 0, SIZEOF_VSXREGS, EXTENDED_REGS,
   ppc_fill_vsxregset, ppc_store_vsxregset },
-  { PTRACE_GETVRREGS, PTRACE_SETVRREGS, SIZEOF_VRREGS, EXTENDED_REGS,
+  { PTRACE_GETVRREGS, PTRACE_SETVRREGS, 0, SIZEOF_VRREGS, EXTENDED_REGS,
     ppc_fill_vrregset, ppc_store_vrregset },
-  { PTRACE_GETEVRREGS, PTRACE_SETEVRREGS, 32 * 4 + 8 + 4, EXTENDED_REGS,
+  { PTRACE_GETEVRREGS, PTRACE_SETEVRREGS, 0, 32 * 4 + 8 + 4, EXTENDED_REGS,
     ppc_fill_evrregset, ppc_store_evrregset },
-  { 0, 0, 0, GENERAL_REGS, ppc_fill_gregset, NULL },
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, 0, GENERAL_REGS, ppc_fill_gregset, NULL },
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 struct linux_target_ops the_low_target = {
diff --git a/gdb/gdbserver/linux-s390-low.c b/gdb/gdbserver/linux-s390-low.c
index 5460f57..eb865dc 100644
--- a/gdb/gdbserver/linux-s390-low.c
+++ b/gdb/gdbserver/linux-s390-low.c
@@ -181,8 +181,8 @@ static void s390_fill_gregset (struct regcache *regcache, void *buf)
 }
 
 struct regset_info target_regsets[] = {
-  { 0, 0, 0, GENERAL_REGS, s390_fill_gregset, NULL },
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, 0, GENERAL_REGS, s390_fill_gregset, NULL },
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 
diff --git a/gdb/gdbserver/linux-sh-low.c b/gdb/gdbserver/linux-sh-low.c
index 9d27e7f..87a0dd2 100644
--- a/gdb/gdbserver/linux-sh-low.c
+++ b/gdb/gdbserver/linux-sh-low.c
@@ -104,8 +104,8 @@ static void sh_fill_gregset (struct regcache *regcache, void *buf)
 }
 
 struct regset_info target_regsets[] = {
-  { 0, 0, 0, GENERAL_REGS, sh_fill_gregset, NULL },
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, 0, GENERAL_REGS, sh_fill_gregset, NULL },
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 struct linux_target_ops the_low_target = {
diff --git a/gdb/gdbserver/linux-sparc-low.c b/gdb/gdbserver/linux-sparc-low.c
index 0bb5f2f..e0bfe81 100644
--- a/gdb/gdbserver/linux-sparc-low.c
+++ b/gdb/gdbserver/linux-sparc-low.c
@@ -260,13 +260,13 @@ sparc_reinsert_addr (void)
 
 
 struct regset_info target_regsets[] = {
-  { PTRACE_GETREGS, PTRACE_SETREGS, sizeof (elf_gregset_t),
+  { PTRACE_GETREGS, PTRACE_SETREGS, 0, sizeof (elf_gregset_t),
     GENERAL_REGS,
     sparc_fill_gregset, sparc_store_gregset },
-  { PTRACE_GETFPREGS, PTRACE_SETFPREGS, sizeof (fpregset_t),
+  { PTRACE_GETFPREGS, PTRACE_SETFPREGS, 0, sizeof (fpregset_t),
     FP_REGS,
     sparc_fill_fpregset, sparc_store_fpregset },
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 struct linux_target_ops the_low_target = {
diff --git a/gdb/gdbserver/linux-x86-low.c b/gdb/gdbserver/linux-x86-low.c
index 496baa2..b9981ec 100644
--- a/gdb/gdbserver/linux-x86-low.c
+++ b/gdb/gdbserver/linux-x86-low.c
@@ -24,6 +24,8 @@
 #include "linux-low.h"
 #include "i387-fp.h"
 #include "i386-low.h"
+#include "i386-xstate.h"
+#include "elf/common.h"
 
 #include "gdb_proc_service.h"
 
@@ -31,10 +33,24 @@
 void init_registers_i386_linux (void);
 /* Defined in auto-generated file amd64-linux.c.  */
 void init_registers_amd64_linux (void);
+/* Defined in auto-generated file i386-avx-linux.c.  */
+void init_registers_i386_avx_linux (void);
+/* Defined in auto-generated file amd64-avx-linux.c.  */
+void init_registers_amd64_avx_linux (void);
 
 #include <sys/reg.h>
 #include <sys/procfs.h>
 #include <sys/ptrace.h>
+#include <sys/uio.h>
+
+#ifndef PTRACE_GETREGSET
+#define PTRACE_GETREGSET	0x4204
+#endif
+
+#ifndef PTRACE_SETREGSET
+#define PTRACE_SETREGSET	0x4205
+#endif
+
 
 #ifndef PTRACE_GET_THREAD_AREA
 #define PTRACE_GET_THREAD_AREA 25
@@ -252,6 +268,18 @@ x86_store_fpxregset (struct regcache *regcache, const void *buf)
 
 #endif
 
+static void
+x86_fill_xstateregset (struct regcache *regcache, void *buf)
+{
+  i387_cache_to_xsave (regcache, buf);
+}
+
+static void
+x86_store_xstateregset (struct regcache *regcache, const void *buf)
+{
+  i387_xsave_to_cache (regcache, buf);
+}
+
 /* ??? The non-biarch i386 case stores all the i387 regs twice.
    Once in i387_.*fsave.* and once in i387_.*fxsave.*.
    This is, presumably, to handle the case where PTRACE_[GS]ETFPXREGS
@@ -264,21 +292,28 @@ x86_store_fpxregset (struct regcache *regcache, const void *buf)
 struct regset_info target_regsets[] =
 {
 #ifdef HAVE_PTRACE_GETREGS
-  { PTRACE_GETREGS, PTRACE_SETREGS, sizeof (elf_gregset_t),
+  { PTRACE_GETREGS, PTRACE_SETREGS, 0, sizeof (elf_gregset_t),
     GENERAL_REGS,
     x86_fill_gregset, x86_store_gregset },
+  { PTRACE_GETREGSET, PTRACE_SETREGSET, NT_X86_XSTATE, 0,
+# ifdef __x86_64__
+    FP_REGS,
+# else
+    EXTENDED_REGS,
+# endif
+    x86_fill_xstateregset, x86_store_xstateregset },
 # ifndef __x86_64__
 #  ifdef HAVE_PTRACE_GETFPXREGS
-  { PTRACE_GETFPXREGS, PTRACE_SETFPXREGS, sizeof (elf_fpxregset_t),
+  { PTRACE_GETFPXREGS, PTRACE_SETFPXREGS, 0, sizeof (elf_fpxregset_t),
     EXTENDED_REGS,
     x86_fill_fpxregset, x86_store_fpxregset },
 #  endif
 # endif
-  { PTRACE_GETFPREGS, PTRACE_SETFPREGS, sizeof (elf_fpregset_t),
+  { PTRACE_GETFPREGS, PTRACE_SETFPREGS, 0, sizeof (elf_fpregset_t),
     FP_REGS,
     x86_fill_fpregset, x86_store_fpregset },
 #endif /* HAVE_PTRACE_GETREGS */
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 static CORE_ADDR
@@ -772,6 +807,65 @@ x86_siginfo_fixup (struct siginfo *native, void *inf, int direction)
   return 0;
 }
 \f
+/* Process qSupported query, "x86=xml".  Update the buffer size for
+   PTRACE_GETREGSET.  */
+
+static void
+x86_linux_process_qsupported (const char *query)
+{
+  int pid;
+  unsigned long long xstateregs[I386_XSTATE_SSE_SIZE / sizeof (long long)];
+  struct iovec iov;
+
+  /* Return if gdb doesn't support XML.   */
+  if (query == NULL || strcmp (query, "x86=xml") != 0)
+    {
+      use_xml = 0;
+      return;
+    }
+
+  /* Check if XSAVE extended state is supported.  */
+  pid = pid_of (get_thread_lwp (current_inferior));
+
+  iov.iov_base = xstateregs;
+  iov.iov_len = I386_XSTATE_SSE_SIZE;
+
+  /* Check if PTRACE_GETREGSET works.  */
+  if (ptrace (PTRACE_GETREGSET, pid,
+	      (unsigned int) NT_X86_XSTATE, (long) &iov) == 0)
+    {
+      struct regset_info *regset;
+      unsigned long long xcr0;
+
+      /* Get XCR0 from XSAVE extended state at byte 464.  */
+      xcr0 = xstateregs[464 / sizeof (long long)];
+
+      /* Use PTRACE_GETREGSET if it is available.  */
+      for (regset = target_regsets;
+	   regset->fill_function != NULL; regset++)
+	if (regset->get_request == PTRACE_GETREGSET)
+	  regset->size = I386_XSTATE_SIZE (xcr0);
+	else if (regset->type != GENERAL_REGS)
+	  regset->size = 0;
+
+      /* AVX is the highest feature we support.  */
+      if ((xcr0 & I386_XSTATE_AVX_MASK) == I386_XSTATE_AVX_MASK)
+	{
+	  x86_xcr0 = xcr0;
+
+#ifdef __x86_64__
+	  /* I386 has 8 xmm regs.  */
+	  if (num_xmm_registers == 8)
+	    init_registers_i386_avx_linux ();
+	  else
+	    init_registers_amd64_avx_linux ();
+#else
+	  init_registers_i386_avx_linux ();
+#endif
+	}
+    }
+}
+
 /* Initialize gdbserver for the architecture of the inferior.  */
 
 static void
@@ -850,5 +944,6 @@ struct linux_target_ops the_low_target =
   x86_siginfo_fixup,
   x86_linux_new_process,
   x86_linux_new_thread,
-  x86_linux_prepare_to_resume
+  x86_linux_prepare_to_resume,
+  x86_linux_process_qsupported
 };
diff --git a/gdb/gdbserver/linux-xtensa-low.c b/gdb/gdbserver/linux-xtensa-low.c
index c5ed351..8d0e73a 100644
--- a/gdb/gdbserver/linux-xtensa-low.c
+++ b/gdb/gdbserver/linux-xtensa-low.c
@@ -131,13 +131,13 @@ xtensa_store_xtregset (struct regcache *regcache, const void *buf)
 }
 
 struct regset_info target_regsets[] = {
-  { PTRACE_GETREGS, PTRACE_SETREGS, sizeof (elf_gregset_t),
+  { PTRACE_GETREGS, PTRACE_SETREGS, 0, sizeof (elf_gregset_t),
     GENERAL_REGS,
     xtensa_fill_gregset, xtensa_store_gregset },
-  { PTRACE_GETXTREGS, PTRACE_SETXTREGS, XTENSA_ELF_XTREG_SIZE,
+  { PTRACE_GETXTREGS, PTRACE_SETXTREGS, 0, XTENSA_ELF_XTREG_SIZE,
     EXTENDED_REGS,
     xtensa_fill_xtregset, xtensa_store_xtregset },
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 #if XCHAL_HAVE_BE
diff --git a/gdb/gdbserver/server.c b/gdb/gdbserver/server.c
index a03f877..6e46a7a 100644
--- a/gdb/gdbserver/server.c
+++ b/gdb/gdbserver/server.c
@@ -32,6 +32,13 @@
 #include <malloc.h>
 #endif
 
+int use_xml =
+#ifdef USE_XML
+  1;
+#else
+  0;
+#endif
+
 ptid_t cont_thread;
 ptid_t general_thread;
 ptid_t step_thread;
@@ -474,20 +481,19 @@ get_features_xml (const char *annex)
 	annex = gdbserver_xmltarget;
     }
 
-#ifdef USE_XML
-  {
-    extern const char *const xml_builtin[][2];
-    int i;
+  if (use_xml)
+    {
+      extern const char *const xml_builtin[][2];
+      int i;
 
-    /* Look for the annex.  */
-    for (i = 0; xml_builtin[i][0] != NULL; i++)
-      if (strcmp (annex, xml_builtin[i][0]) == 0)
-	break;
+      /* Look for the annex.  */
+      for (i = 0; xml_builtin[i][0] != NULL; i++)
+	if (strcmp (annex, xml_builtin[i][0]) == 0)
+	  break;
 
-    if (xml_builtin[i][0] != NULL)
-      return xml_builtin[i][1];
-  }
-#endif
+      if (xml_builtin[i][0] != NULL)
+	return xml_builtin[i][1];
+    }
 
   return NULL;
 }
@@ -1236,6 +1242,9 @@ handle_query (char *own_buf, int packet_len, int *new_packet_len_p)
     {
       char *p = &own_buf[10];
 
+      /* Start processing qSupported packet.  */
+      target_process_qsupported (NULL);
+
       /* Process each feature being provided by GDB.  The first
 	 feature will follow a ':', and latter features will follow
 	 ';'.  */
@@ -1251,6 +1260,8 @@ handle_query (char *own_buf, int packet_len, int *new_packet_len_p)
 		if (target_supports_multi_process ())
 		  multi_process = 1;
 	      }
+	    else if (strcmp (p, "x86=xml") == 0)
+	      target_process_qsupported (p);
 	  }
 
       sprintf (own_buf, "PacketSize=%x;QPassSignals+", PBUFSIZ - 1);
diff --git a/gdb/gdbserver/server.h b/gdb/gdbserver/server.h
index f46ee60..a9cd024 100644
--- a/gdb/gdbserver/server.h
+++ b/gdb/gdbserver/server.h
@@ -22,6 +22,8 @@
 
 #include "config.h"
 
+extern int use_xml;
+
 #ifdef __MINGW32CE__
 #include "wincecompat.h"
 #endif
diff --git a/gdb/gdbserver/target.h b/gdb/gdbserver/target.h
index ac68652..6109b1c 100644
--- a/gdb/gdbserver/target.h
+++ b/gdb/gdbserver/target.h
@@ -286,6 +286,9 @@ struct target_ops
 
   /* Returns the core given a thread, or -1 if not known.  */
   int (*core_of_thread) (ptid_t);
+
+  /* Target specific qSupported support.  */
+  void (*process_qsupported) (const char *);
 };
 
 extern struct target_ops *the_target;
@@ -326,6 +329,10 @@ void set_target_ops (struct target_ops *);
   (the_target->supports_multi_process ? \
    (*the_target->supports_multi_process) () : 0)
 
+#define target_process_qsupported(query) \
+  do { if (the_target->process_qsupported) \
+	 the_target->process_qsupported (query); } while (0)
+
 /* Start non-stop mode, returns 0 on success, -1 on failure.   */
 
 int start_non_stop (int nonstop);
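
For reference, the XCR0 probe in x86_linux_process_qsupported can be sketched in isolation as below. This is only an illustration of the layout described in the cover letter (XCR0 copy at byte 464, xstate_bv at byte 512). The XSTATE_* and XSAVE_* names, and the helper functions, are stand-ins invented for this sketch; they mirror the I386_XSTATE_* macros used in the patch but are not copied from i386-xstate.h. The 576 and 832 byte sizes assume the standard XSAVE layout (512-byte legacy area, 64-byte header, 256 bytes of upper YMM state).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed feature bits and buffer sizes; stand-ins for the
   I386_XSTATE_* macros, not copied from i386-xstate.h.  */
#define XSTATE_X87	(1ULL << 0)
#define XSTATE_SSE	(1ULL << 1)
#define XSTATE_AVX	(1ULL << 2)
#define XSTATE_AVX_MASK	(XSTATE_X87 | XSTATE_SSE | XSTATE_AVX)
#define XSTATE_SSE_SIZE	576
#define XSTATE_AVX_SIZE	832

/* Byte offsets given in the cover letter.  */
#define XSAVE_XCR0_OFFSET	464
#define XSAVE_XSTATE_BV_OFFSET	512

/* Read a 64-bit field from the XSAVE buffer without aliasing issues.  */
static uint64_t
xsave_read_u64 (const unsigned char *buf, size_t offset)
{
  uint64_t value;

  memcpy (&value, buf + offset, sizeof value);
  return value;
}

/* Buffer size to request from PTRACE_GETREGSET for this XCR0.  */
static size_t
xstate_size (uint64_t xcr0)
{
  return (xcr0 & XSTATE_AVX) != 0 ? XSTATE_AVX_SIZE : XSTATE_SSE_SIZE;
}

/* Nonzero if x87, SSE and AVX state are all enabled.  */
static int
xstate_has_avx (uint64_t xcr0)
{
  return (xcr0 & XSTATE_AVX_MASK) == XSTATE_AVX_MASK;
}

/* Nonzero if the registers of FEATURE are live; a cleared xstate_bv
   bit means they must be read as 0.  */
static int
xstate_feature_live (const unsigned char *buf, uint64_t feature)
{
  return (xsave_read_u64 (buf, XSAVE_XSTATE_BV_OFFSET) & feature) != 0;
}
```

A caller would first fetch XSTATE_SSE_SIZE bytes with PTRACE_GETREGSET, pass the buffer to xsave_read_u64 at XSAVE_XCR0_OFFSET, and then use xstate_size to pick the buffer size for later requests.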


* Re: PATCH: 0/6 [2nd try]: Add AVX support
  2010-03-06 22:16 ` PATCH: 0/6 [2nd try]: " H.J. Lu
  2010-03-06 22:18   ` PATCH: 1/6 [2nd try]: Add AVX support (AVX XML files) H.J. Lu
@ 2010-03-07 14:16   ` Mark Kettenis
  2010-03-07 14:37     ` H.J. Lu
  2010-03-27 16:16   ` Daniel Jacobowitz
  2010-03-29  0:16   ` PATCH: 0/6 [3nd " H.J. Lu
  3 siblings, 1 reply; 115+ messages in thread
From: Mark Kettenis @ 2010-03-07 14:16 UTC (permalink / raw)
  To: hjl.tools; +Cc: gdb-patches

> Date: Sat, 6 Mar 2010 14:16:34 -0800
> From: "H.J. Lu" <hongjiu.lu@intel.com>
> 
> AVX registers are saved and restored via the XSAVE extended state. The
> extended control register 0 (the XFEATURE_ENABLED_MASK register), XCR0,
> is used to determine which states, x87, SSE, AVX, ... are supported
> in the XSAVE extended state.  XCR0 can be read with the new "xgetbv"
> instruction.  The xstate_bv field at byte offset 512 in the XSAVE
> extended state indicates what states the current process is in. If
> the feature bit is cleared, the corresponding registers should be read as
> 0. If we update a register, we should set the corresponding feature
> bit in the xstate_bv field.
> 
> We added PTRACE_GETREGSET and PTRACE_SETREGSET to Linux kernel to
> fetch and store AVX registers with ptrace. Linux kernel also stores
> XCR0 at the first 8 bytes of the software usable bytes, starting at
> byte offset 464.
> 
> There are total 6 patches to add AVX support for Linux.  They support:
> 
> 1. The upper 128bit YMM registers are added for AVX support. The upper
> 128bit YMM registers are hidden from users. Gdb combines XMM register,
> %xmmX, with 128bit YMM register, %ymmXh, and present the whole 256bit
> YMM register, %ymmX, as pseudo register to users.
> 2. Backward compatible. If AVX isn't supported, SSE will be used.
> 3. Forward compatible. If new state beyond AVX is supported in
> the XSAVE extended state, only AVX state will be used.
> 4. Remote gdb protocol extension. GDB will send "x86=xml" in qSupported
> request packet to indicate that GDB supports x86 XML target desciption.
> The gdb stub will send x86 XML target desciption if it sees "x86=xml"
> in qSupported request packet.
> 
> One advantage of this approach is YMM registers are actually stored as
> XMM registers and upper YMM registers in the XSAVE extended state.  It
> is easy and natural to access them as %xmmX and %ymmXh internally.  We
> just need to hide %ymmXh from users.
> 
> To support AVX on other OSes, the following changes are needed:
> 
> 1. Kernel support to get/set the XSAVE extended state.
> 2. Handle 8/16 upper YMM registers.
> 3. Provide target to_read_description to return SSE or AVX target
> description.
> 4. Update gdbarch_core_read_description to return SSE or AVX target
> description based on contents of core dump.

Wait; there is something important missing here.  How are the new %ymm
registers referred to in debug info?  The AMD64 SysV psABI defines the
DWARF Register Number Mapping, but the 0.99.4 draft copy I have
doesn't define any mappings for the %ymm registers.  What mapping does
GCC use?


* Re: PATCH: 0/6 [2nd try]: Add AVX support
  2010-03-07 14:16   ` PATCH: 0/6 [2nd try]: Add AVX support Mark Kettenis
@ 2010-03-07 14:37     ` H.J. Lu
  2010-03-07 16:31       ` H.J. Lu
  0 siblings, 1 reply; 115+ messages in thread
From: H.J. Lu @ 2010-03-07 14:37 UTC (permalink / raw)
  To: Mark Kettenis; +Cc: gdb-patches

On Sun, Mar 7, 2010 at 6:16 AM, Mark Kettenis <mark.kettenis@xs4all.nl> wrote:
>> Date: Sat, 6 Mar 2010 14:16:34 -0800
>> From: "H.J. Lu" <hongjiu.lu@intel.com>
>>
>> AVX registers are saved and restored via the XSAVE extended state. The
>> extended control register 0 (the XFEATURE_ENABLED_MASK register), XCR0,
>> is used to determine which states, x87, SSE, AVX, ... are supported
>> in the XSAVE extended state.  XCR0 can be read with the new "xgetbv"
>> instruction.  The xstate_bv field at byte offset 512 in the XSAVE
>> extended state indicates what states the current process is in. If
>> the feature bit is cleared, the corresponding registers should be read as
>> 0. If we update a register, we should set the corresponding feature
>> bit in the xstate_bv field.
>>
>> We added PTRACE_GETREGSET and PTRACE_SETREGSET to Linux kernel to
>> fetch and store AVX registers with ptrace. Linux kernel also stores
>> XCR0 at the first 8 bytes of the software usable bytes, starting at
>> byte offset 464.
>>
>> There are total 6 patches to add AVX support for Linux.  They support:
>>
>> 1. The upper 128bit YMM registers are added for AVX support. The upper
>> 128bit YMM registers are hidden from users. Gdb combines XMM register,
>> %xmmX, with 128bit YMM register, %ymmXh, and present the whole 256bit
>> YMM register, %ymmX, as pseudo register to users.
>> 2. Backward compatible. If AVX isn't supported, SSE will be used.
>> 3. Forward compatible. If new state beyond AVX is supported in
>> the XSAVE extended state, only AVX state will be used.
>> 4. Remote gdb protocol extension. GDB will send "x86=xml" in qSupported
>> request packet to indicate that GDB supports x86 XML target desciption.
>> The gdb stub will send x86 XML target desciption if it sees "x86=xml"
>> in qSupported request packet.
>>
>> One advantage of this approach is YMM registers are actually stored as
>> XMM registers and upper YMM registers in the XSAVE extended state.  It
>> is easy and natural to access them as %xmmX and %ymmXh internally.  We
>> just need to hide %ymmXh from users.
>>
>> To support AVX on other OSes, the following changes are needed:
>>
>> 1. Kernel support to get/set the XSAVE extended state.
>> 2. Handle 8/16 upper YMM registers.
>> 3. Provide target to_read_description to return SSE or AVX target
>> description.
>> 4. Update gdbarch_core_read_description to return SSE or AVX target
>> description based on contents of core dump.
>
> Wait; there is something important missing here.  How are the new %ymm
> registers referred to in debug info?  The AMD64 SysV psABI defines the
> DWARF register Number Mapping, but the 0.99.4 draft copy I have
> doesn't define any mappings for the %ymm registers.  What mapping does
> GCC use?
>

In gcc, XMM and YMM registers have the same register number. They map
to the same DWARF register with different sizes.  Since XMM and YMM
registers are caller-saved, they don't appear in unwind info. So, using
the same DWARF register with different sizes for XMM/YMM registers isn't
a problem.



-- 
H.J.


* Re: PATCH: 0/6 [2nd try]: Add AVX support
  2010-03-07 14:37     ` H.J. Lu
@ 2010-03-07 16:31       ` H.J. Lu
  2010-03-07 16:40         ` H.J. Lu
  0 siblings, 1 reply; 115+ messages in thread
From: H.J. Lu @ 2010-03-07 16:31 UTC (permalink / raw)
  To: Mark Kettenis; +Cc: gdb-patches

On Sun, Mar 7, 2010 at 6:37 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Sun, Mar 7, 2010 at 6:16 AM, Mark Kettenis <mark.kettenis@xs4all.nl> wrote:
>>> Date: Sat, 6 Mar 2010 14:16:34 -0800
>>> From: "H.J. Lu" <hongjiu.lu@intel.com>
>>>
>>> AVX registers are saved and restored via the XSAVE extended state. The
>>> extended control register 0 (the XFEATURE_ENABLED_MASK register), XCR0,
>>> is used to determine which states, x87, SSE, AVX, ... are supported
>>> in the XSAVE extended state.  XCR0 can be read with the new "xgetbv"
>>> instruction.  The xstate_bv field at byte offset 512 in the XSAVE
>>> extended state indicates what states the current process is in. If
>>> the feature bit is cleared, the corresponding registers should be read as
>>> 0. If we update a register, we should set the corresponding feature
>>> bit in the xstate_bv field.
>>>
>>> We added PTRACE_GETREGSET and PTRACE_SETREGSET to Linux kernel to
>>> fetch and store AVX registers with ptrace. Linux kernel also stores
>>> XCR0 at the first 8 bytes of the software usable bytes, starting at
>>> byte offset 464.
>>>
>>> There are total 6 patches to add AVX support for Linux.  They support:
>>>
>>> 1. The upper 128bit YMM registers are added for AVX support. The upper
>>> 128bit YMM registers are hidden from users. Gdb combines XMM register,
>>> %xmmX, with 128bit YMM register, %ymmXh, and present the whole 256bit
>>> YMM register, %ymmX, as pseudo register to users.
>>> 2. Backward compatible. If AVX isn't supported, SSE will be used.
>>> 3. Forward compatible. If new state beyond AVX is supported in
>>> the XSAVE extended state, only AVX state will be used.
>>> 4. Remote gdb protocol extension. GDB will send "x86=xml" in qSupported
>>> request packet to indicate that GDB supports x86 XML target desciption.
>>> The gdb stub will send x86 XML target desciption if it sees "x86=xml"
>>> in qSupported request packet.
>>>
>>> One advantage of this approach is YMM registers are actually stored as
>>> XMM registers and upper YMM registers in the XSAVE extended state.  It
>>> is easy and natural to access them as %xmmX and %ymmXh internally.  We
>>> just need to hide %ymmXh from users.
>>>
>>> To support AVX on other OSes, the following changes are needed:
>>>
>>> 1. Kernel support to get/set the XSAVE extended state.
>>> 2. Handle 8/16 upper YMM registers.
>>> 3. Provide target to_read_description to return SSE or AVX target
>>> description.
>>> 4. Update gdbarch_core_read_description to return SSE or AVX target
>>> description based on contents of core dump.
>>
>> Wait; there is something important missing here.  How are the new %ymm
>> registers referred to in debug info?  The AMD64 SysV psABI defines the
>> DWARF register Number Mapping, but the 0.99.4 draft copy I have
>> doesn't define any mappings for the %ymm registers.  What mapping does
>> GCC use?
>>
>
> In gcc, XMM and YMM registers have the same register number. They map
> to be the same DWARF register with different sizes.  Since XMM and YMM
> registers are caller-saved, they don't appear in unwind info. So, the same
> DWARF register with different sizes for XMM/YMM registers isn't a problem.
>
>

Yes, there is a problem. amd64_dwarf_reg_to_regnum needs to map a 256-bit
register to YMM. How do other arches solve this?
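
To make the problem concrete, here is a rough sketch of a size-aware mapping. The DWARF numbers 17 to 32 for %xmm0 to %xmm15 are from the AMD64 psABI; the gdb-side register numbers (XMM0_REGNUM, YMM0_REGNUM) and the function name are hypothetical, chosen only for illustration. The point is that the DWARF number alone cannot select between XMM and YMM, so the size has to come from somewhere else.

```c
#include <stddef.h>

/* DWARF numbers for %xmm0..%xmm15 per the AMD64 psABI.  */
enum { DWARF_XMM0 = 17, DWARF_XMM15 = 32 };

/* Hypothetical internal register numbers, for illustration only.  */
enum { XMM0_REGNUM = 40, YMM0_REGNUM = 100 };

/* Map a DWARF vector register plus the size of the value stored in
   it to an internal register number.  Returns -1 if DWARF_REG is not
   a vector register or SIZE fits neither XMM nor YMM.  */
static int
dwarf_vector_reg_to_regnum (int dwarf_reg, size_t size)
{
  int index;

  if (dwarf_reg < DWARF_XMM0 || dwarf_reg > DWARF_XMM15)
    return -1;

  index = dwarf_reg - DWARF_XMM0;

  if (size == 32)
    return YMM0_REGNUM + index;	/* 256-bit value: use %ymmN.  */
  if (size <= 16)
    return XMM0_REGNUM + index;	/* Up to 128 bits: use %xmmN.  */
  return -1;
}
```

The existing hook only receives the DWARF register number, so a lookup like this has no place to get SIZE from without an interface change.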


-- 
H.J.


* Re: PATCH: 0/6 [2nd try]: Add AVX support
  2010-03-07 16:31       ` H.J. Lu
@ 2010-03-07 16:40         ` H.J. Lu
  2010-03-07 17:04           ` H.J. Lu
                             ` (2 more replies)
  0 siblings, 3 replies; 115+ messages in thread
From: H.J. Lu @ 2010-03-07 16:40 UTC (permalink / raw)
  To: Mark Kettenis; +Cc: gdb-patches

On Sun, Mar 7, 2010 at 8:31 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Sun, Mar 7, 2010 at 6:37 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> On Sun, Mar 7, 2010 at 6:16 AM, Mark Kettenis <mark.kettenis@xs4all.nl> wrote:
>>>> Date: Sat, 6 Mar 2010 14:16:34 -0800
>>>> From: "H.J. Lu" <hongjiu.lu@intel.com>
>>>>
>>>> AVX registers are saved and restored via the XSAVE extended state. The
>>>> extended control register 0 (the XFEATURE_ENABLED_MASK register), XCR0,
>>>> is used to determine which states, x87, SSE, AVX, ... are supported
>>>> in the XSAVE extended state.  XCR0 can be read with the new "xgetbv"
>>>> instruction.  The xstate_bv field at byte offset 512 in the XSAVE
>>>> extended state indicates what states the current process is in. If
>>>> the feature bit is cleared, the corresponding registers should be read as
>>>> 0. If we update a register, we should set the corresponding feature
>>>> bit in the xstate_bv field.
>>>>
>>>> We added PTRACE_GETREGSET and PTRACE_SETREGSET to Linux kernel to
>>>> fetch and store AVX registers with ptrace. Linux kernel also stores
>>>> XCR0 at the first 8 bytes of the software usable bytes, starting at
>>>> byte offset 464.
>>>>
>>>> There are total 6 patches to add AVX support for Linux.  They support:
>>>>
>>>> 1. The upper 128bit YMM registers are added for AVX support. The upper
>>>> 128bit YMM registers are hidden from users. Gdb combines XMM register,
>>>> %xmmX, with 128bit YMM register, %ymmXh, and present the whole 256bit
>>>> YMM register, %ymmX, as pseudo register to users.
>>>> 2. Backward compatible. If AVX isn't supported, SSE will be used.
>>>> 3. Forward compatible. If new state beyond AVX is supported in
>>>> the XSAVE extended state, only AVX state will be used.
>>>> 4. Remote gdb protocol extension. GDB will send "x86=xml" in qSupported
>>>> request packet to indicate that GDB supports x86 XML target desciption.
>>>> The gdb stub will send x86 XML target desciption if it sees "x86=xml"
>>>> in qSupported request packet.
>>>>
>>>> One advantage of this approach is YMM registers are actually stored as
>>>> XMM registers and upper YMM registers in the XSAVE extended state.  It
>>>> is easy and natural to access them as %xmmX and %ymmXh internally.  We
>>>> just need to hide %ymmXh from users.
>>>>
>>>> To support AVX on other OSes, the following changes are needed:
>>>>
>>>> 1. Kernel support to get/set the XSAVE extended state.
>>>> 2. Handle 8/16 upper YMM registers.
>>>> 3. Provide target to_read_description to return SSE or AVX target
>>>> description.
>>>> 4. Update gdbarch_core_read_description to return SSE or AVX target
>>>> description based on contents of core dump.
>>>
>>> Wait; there is something important missing here.  How are the new %ymm
>>> registers referred to in debug info?  The AMD64 SysV psABI defines the
>>> DWARF register Number Mapping, but the 0.99.4 draft copy I have
>>> doesn't define any mappings for the %ymm registers.  What mapping does
>>> GCC use?
>>>
>>
>> In gcc, XMM and YMM registers have the same register number. They map
>> to be the same DWARF register with different sizes.  Since XMM and YMM
>> registers are caller-saved, they don't appear in unwind info. So, the same
>> DWARF register with different sizes for XMM/YMM registers isn't a problem.
>>
>>
>
> Yes, there is a problem. amd64_dwarf_reg_to_regnum needs to map 256bit
> register to YMM. How do other arches solve this?
>

My first approach works here since XMM and YMM registers have the same
register numbers.  We can solve it in one of two ways:

1. Give YMM registers different DWARF register numbers,
which is an incompatible ABI change.
2. Implement YMM registers as a superset of XMM registers, which
is my first approach.

Thanks Mark for pointing out this issue.
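
A minimal sketch of alternative 2, with made-up names: the raw register is the full 256-bit %ymmN, and %xmmN is a pseudo register that reads and writes its low 128 bits. The struct and function names are hypothetical and are not gdb's regcache API. Leaving the upper half untouched on an %xmmN write is a choice made for this sketch.

```c
#include <stdint.h>
#include <string.h>

/* Raw 256-bit register; bytes 0..15 are the low (XMM) half on
   little-endian x86.  */
struct ymm_reg
{
  uint8_t bytes[32];
};

/* Read pseudo register %xmmN: the low 128 bits of %ymmN.  */
static void
xmm_pseudo_read (const struct ymm_reg *ymm, uint8_t out[16])
{
  memcpy (out, ymm->bytes, 16);
}

/* Write pseudo register %xmmN: replace the low 128 bits and leave
   the upper 128 bits of %ymmN unchanged.  */
static void
xmm_pseudo_write (struct ymm_reg *ymm, const uint8_t in[16])
{
  memcpy (ymm->bytes, in, 16);
}
```

With this layout a single DWARF register number maps to one raw register, and the value size only decides how many of its bytes are used.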


-- 
H.J.


* Re: PATCH: 0/6 [2nd try]: Add AVX support
  2010-03-07 16:40         ` H.J. Lu
@ 2010-03-07 17:04           ` H.J. Lu
  2010-03-07 17:39             ` H.J. Lu
  2010-03-07 19:10           ` Nathan Froyd
  2010-03-07 20:29           ` Mark Kettenis
  2 siblings, 1 reply; 115+ messages in thread
From: H.J. Lu @ 2010-03-07 17:04 UTC (permalink / raw)
  To: Mark Kettenis; +Cc: gdb-patches

On Sun, Mar 7, 2010 at 8:40 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Sun, Mar 7, 2010 at 8:31 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> On Sun, Mar 7, 2010 at 6:37 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>> On Sun, Mar 7, 2010 at 6:16 AM, Mark Kettenis <mark.kettenis@xs4all.nl> wrote:
>>>>> Date: Sat, 6 Mar 2010 14:16:34 -0800
>>>>> From: "H.J. Lu" <hongjiu.lu@intel.com>
>>>>>
>>>>> AVX registers are saved and restored via the XSAVE extended state. The
>>>>> extended control register 0 (the XFEATURE_ENABLED_MASK register), XCR0,
>>>>> is used to determine which states, x87, SSE, AVX, ... are supported
>>>>> in the XSAVE extended state.  XCR0 can be read with the new "xgetbv"
>>>>> instruction.  The xstate_bv field at byte offset 512 in the XSAVE
>>>>> extended state indicates what states the current process is in. If
>>>>> the feature bit is cleared, the corresponding registers should be read as
>>>>> 0. If we update a register, we should set the corresponding feature
>>>>> bit in the xstate_bv field.
>>>>>
>>>>> We added PTRACE_GETREGSET and PTRACE_SETREGSET to Linux kernel to
>>>>> fetch and store AVX registers with ptrace. Linux kernel also stores
>>>>> XCR0 at the first 8 bytes of the software usable bytes, starting at
>>>>> byte offset 464.
>>>>>
>>>>> There are total 6 patches to add AVX support for Linux.  They support:
>>>>>
>>>>> 1. The upper 128bit YMM registers are added for AVX support. The upper
>>>>> 128bit YMM registers are hidden from users. Gdb combines XMM register,
>>>>> %xmmX, with 128bit YMM register, %ymmXh, and present the whole 256bit
>>>>> YMM register, %ymmX, as pseudo register to users.
>>>>> 2. Backward compatible. If AVX isn't supported, SSE will be used.
>>>>> 3. Forward compatible. If new state beyond AVX is supported in
>>>>> the XSAVE extended state, only AVX state will be used.
>>>>> 4. Remote gdb protocol extension. GDB will send "x86=xml" in qSupported
>>>>> request packet to indicate that GDB supports x86 XML target desciption.
>>>>> The gdb stub will send x86 XML target desciption if it sees "x86=xml"
>>>>> in qSupported request packet.
>>>>>
>>>>> One advantage of this approach is YMM registers are actually stored as
>>>>> XMM registers and upper YMM registers in the XSAVE extended state.  It
>>>>> is easy and natural to access them as %xmmX and %ymmXh internally.  We
>>>>> just need to hide %ymmXh from users.
>>>>>
>>>>> To support AVX on other OSes, the following changes are needed:
>>>>>
>>>>> 1. Kernel support to get/set the XSAVE extended state.
>>>>> 2. Handle 8/16 upper YMM registers.
>>>>> 3. Provide target to_read_description to return SSE or AVX target
>>>>> description.
>>>>> 4. Update gdbarch_core_read_description to return SSE or AVX target
>>>>> description based on contents of core dump.
>>>>
>>>> Wait; there is something important missing here.  How are the new %ymm
>>>> registers referred to in debug info?  The AMD64 SysV psABI defines the
>>>> DWARF register Number Mapping, but the 0.99.4 draft copy I have
>>>> doesn't define any mappings for the %ymm registers.  What mapping does
>>>> GCC use?
>>>>
>>>
>>> In gcc, XMM and YMM registers have the same register number. They map
>>> to be the same DWARF register with different sizes.  Since XMM and YMM
>>> registers are caller-saved, they don't appear in unwind info. So, the same
>>> DWARF register with different sizes for XMM/YMM registers isn't a problem.
>>>
>>>
>>
>> Yes, there is a problem. amd64_dwarf_reg_to_regnum needs to map 256bit
>> register to YMM. How do other arches solve this?
>>
>
> My first approach works here since XMM and YMM register have the same
> register numbers.  We can solve it with 2 alternatives:
>
> 1. Give a different DWARF register number for YMM register,
> which is an incompatible ABI change.
> 2. Implement YMM registers as a super set of XMM registers, which
> is my first approach.
>
> Thanks Mark for pointing out this issue.

Or I can provide i386_value_from_register.


-- 
H.J.


* Re: PATCH: 0/6 [2nd try]: Add AVX support
  2010-03-07 17:04           ` H.J. Lu
@ 2010-03-07 17:39             ` H.J. Lu
  2010-03-07 20:00               ` Mark Kettenis
  0 siblings, 1 reply; 115+ messages in thread
From: H.J. Lu @ 2010-03-07 17:39 UTC (permalink / raw)
  To: Mark Kettenis; +Cc: gdb-patches

On Sun, Mar 7, 2010 at 9:04 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Sun, Mar 7, 2010 at 8:40 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> On Sun, Mar 7, 2010 at 8:31 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>> On Sun, Mar 7, 2010 at 6:37 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>>> On Sun, Mar 7, 2010 at 6:16 AM, Mark Kettenis <mark.kettenis@xs4all.nl> wrote:
>>>>>> Date: Sat, 6 Mar 2010 14:16:34 -0800
>>>>>> From: "H.J. Lu" <hongjiu.lu@intel.com>
>>>>>>
>>>>>> AVX registers are saved and restored via the XSAVE extended state. The
>>>>>> extended control register 0 (the XFEATURE_ENABLED_MASK register), XCR0,
>>>>>> is used to determine which states, x87, SSE, AVX, ... are supported
>>>>>> in the XSAVE extended state.  XCR0 can be read with the new "xgetbv"
>>>>>> instruction.  The xstate_bv field at byte offset 512 in the XSAVE
>>>>>> extended state indicates what states the current process is in. If
>>>>>> the feature bit is cleared, the corresponding registers should be read as
>>>>>> 0. If we update a register, we should set the corresponding feature
>>>>>> bit in the xstate_bv field.
>>>>>>
>>>>>> We added PTRACE_GETREGSET and PTRACE_SETREGSET to Linux kernel to
>>>>>> fetch and store AVX registers with ptrace. Linux kernel also stores
>>>>>> XCR0 at the first 8 bytes of the software usable bytes, starting at
>>>>>> byte offset 464.
>>>>>>
>>>>>> There are total 6 patches to add AVX support for Linux.  They support:
>>>>>>
>>>>>> 1. The upper 128bit YMM registers are added for AVX support. The upper
>>>>>> 128bit YMM registers are hidden from users. Gdb combines XMM register,
>>>>>> %xmmX, with 128bit YMM register, %ymmXh, and present the whole 256bit
>>>>>> YMM register, %ymmX, as pseudo register to users.
>>>>>> 2. Backward compatible. If AVX isn't supported, SSE will be used.
>>>>>> 3. Forward compatible. If new state beyond AVX is supported in
>>>>>> the XSAVE extended state, only AVX state will be used.
>>>>>> 4. Remote gdb protocol extension. GDB will send "x86=xml" in qSupported
>>>>>> request packet to indicate that GDB supports x86 XML target desciption.
>>>>>> The gdb stub will send x86 XML target desciption if it sees "x86=xml"
>>>>>> in qSupported request packet.
>>>>>>
>>>>>> One advantage of this approach is YMM registers are actually stored as
>>>>>> XMM registers and upper YMM registers in the XSAVE extended state.  It
>>>>>> is easy and natural to access them as %xmmX and %ymmXh internally.  We
>>>>>> just need to hide %ymmXh from users.
>>>>>>
>>>>>> To support AVX on other OSes, the following changes are needed:
>>>>>>
>>>>>> 1. Kernel support to get/set the XSAVE extended state.
>>>>>> 2. Handle 8/16 upper YMM registers.
>>>>>> 3. Provide target to_read_description to return SSE or AVX target
>>>>>> description.
>>>>>> 4. Update gdbarch_core_read_description to return SSE or AVX target
>>>>>> description based on contents of core dump.
>>>>>
>>>>> Wait; there is something important missing here.  How are the new %ymm
>>>>> registers referred to in debug info?  The AMD64 SysV psABI defines the
>>>>> DWARF register Number Mapping, but the 0.99.4 draft copy I have
>>>>> doesn't define any mappings for the %ymm registers.  What mapping does
>>>>> GCC use?
>>>>>
>>>>
>>>> In gcc, XMM and YMM registers have the same register number. They map
>>>> to be the same DWARF register with different sizes.  Since XMM and YMM
>>>> registers are caller-saved, they don't appear in unwind info. So, the same
>>>> DWARF register with different sizes for XMM/YMM registers isn't a problem.
>>>>
>>>>
>>>
>>> Yes, there is a problem. amd64_dwarf_reg_to_regnum needs to map 256bit
>>> register to YMM. How do other arches solve this?
>>>
>>
>> My first approach works here since XMM and YMM register have the same
>> register numbers.  We can solve it with 2 alternatives:
>>
>> 1. Give a different DWARF register number for YMM register,
>> which is an incompatible ABI change.
>> 2. Implement YMM registers as a super set of XMM registers, which
>> is my first approach.
>>
>> Thanks Mark for pointing out this issue.
>
> Or I can provide i386_value_from_register.
>

It doesn't work on x86 since i386_value_from_register will change
regnum. Can we change

typedef struct value * (gdbarch_value_from_register_ftype) (struct
type *type, int regnum, struct frame_info *frame);

to

typedef struct value * (gdbarch_value_from_register_ftype) (struct
type *type, int *regnum, struct frame_info *frame);

to support updating regnum?

Thanks.


-- 
H.J.

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PATCH: 0/6 [2nd try]: Add AVX support
  2010-03-07 16:40         ` H.J. Lu
  2010-03-07 17:04           ` H.J. Lu
@ 2010-03-07 19:10           ` Nathan Froyd
  2010-03-07 19:49             ` Mark Kettenis
  2010-03-07 20:29           ` Mark Kettenis
  2 siblings, 1 reply; 115+ messages in thread
From: Nathan Froyd @ 2010-03-07 19:10 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Mark Kettenis, gdb-patches

On Sun, Mar 07, 2010 at 08:40:10AM -0800, H.J. Lu wrote:
> My first approach works here since XMM and YMM register have the same
> register numbers.  We can solve it with 2 alternatives:
> 
> 1. Give a different DWARF register number for YMM register,
> which is an incompatible ABI change.
> 2. Implement YMM registers as a super set of XMM registers, which
> is my first approach.

The third alternative--again, what's adopted for the PPC SPE 64-bit
registers--is to give %ymmNh their own DWARF register numbers.  I
suppose it's also ABI-incompatible, but it seems like it fits with your
approach much better than either of the above alternatives.

-Nathan

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PATCH: 0/6 [2nd try]: Add AVX support
  2010-03-07 19:10           ` Nathan Froyd
@ 2010-03-07 19:49             ` Mark Kettenis
  2010-03-07 21:07               ` Nathan Froyd
  0 siblings, 1 reply; 115+ messages in thread
From: Mark Kettenis @ 2010-03-07 19:49 UTC (permalink / raw)
  To: froydnj; +Cc: hjl.tools, mark.kettenis, gdb-patches

> Date: Sun, 7 Mar 2010 11:09:59 -0800
> From: Nathan Froyd <froydnj@codesourcery.com>
> 
> On Sun, Mar 07, 2010 at 08:40:10AM -0800, H.J. Lu wrote:
> > My first approach works here since XMM and YMM register have the same
> > register numbers.  We can solve it with 2 alternatives:
> > 
> > 1. Give a different DWARF register number for YMM register,
> > which is an incompatible ABI change.
> > 2. Implement YMM registers as a super set of XMM registers, which
> > is my first approach.
> 
> The third alternative--again, what's adopted for the PPC SPE 64-bit
> registers--is to give %ymmNh their own DWARF register numbers.  I
> suppose it's also ABI-incompatible, but it seems like it fits with your
> approach much better than either of the above alternatives.

I don't think that would be a good idea.  It means you can't refer to
something stored in a %ymmN register with a single register number.
The compiler will have to use a more complicated expression (which may
not be possible for older debug info formats) for these.  As a result,
things like "info address" become rather useless.

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PATCH: 0/6 [2nd try]: Add AVX support
  2010-03-07 17:39             ` H.J. Lu
@ 2010-03-07 20:00               ` Mark Kettenis
  0 siblings, 0 replies; 115+ messages in thread
From: Mark Kettenis @ 2010-03-07 20:00 UTC (permalink / raw)
  To: hjl.tools; +Cc: gdb-patches

> Date: Sun, 7 Mar 2010 09:39:44 -0800
> From: "H.J. Lu" <hjl.tools@gmail.com>
> 
> On Sun, Mar 7, 2010 at 9:04 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> > On Sun, Mar 7, 2010 at 8:40 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> >> On Sun, Mar 7, 2010 at 8:31 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> >>> On Sun, Mar 7, 2010 at 6:37 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> >>>> On Sun, Mar 7, 2010 at 6:16 AM, Mark Kettenis <mark.kettenis@xs4all.nl> wrote:
> >>>>>> Date: Sat, 6 Mar 2010 14:16:34 -0800
> >>>>>> From: "H.J. Lu" <hongjiu.lu@intel.com>
> >>>>>>
> >>>>>> AVX registers are saved and restored via the XSAVE extended state. The
> >>>>>> extended control register 0 (the XFEATURE_ENABLED_MASK register), XCR0,
> >>>>>> is used to determine which states, x87, SSE, AVX, ... are supported
> >>>>>> in the XSAVE extended state.  XCR0 can be read with the new "xgetbv"
> >>>>>> instruction.  The xstate_bv field at byte offset 512 in the XSAVE
> >>>>>> extended state indicates what states the current process is in. If
> >>>>>> the feature bit is cleared, the corresponding registers should be read as
> >>>>>> 0. If we update a register, we should set the corresponding feature
> >>>>>> bit in the xstate_bv field.
> >>>>>>
> >>>>>> We added PTRACE_GETREGSET and PTRACE_SETREGSET to Linux kernel to
> >>>>>> fetch and store AVX registers with ptrace. Linux kernel also stores
> >>>>>> XCR0 at the first 8 bytes of the software usable bytes, starting at
> >>>>>> byte offset 464.
> >>>>>>
> >>>>>> There are total 6 patches to add AVX support for Linux.  They support:
> >>>>>>
> >>>>>> 1. The upper 128bit YMM registers are added for AVX support. The upper
> >>>>>> 128bit YMM registers are hidden from users. Gdb combines XMM register,
> >>>>>> %xmmX, with 128bit YMM register, %ymmXh, and present the whole 256bit
> >>>>>> YMM register, %ymmX, as pseudo register to users.
> >>>>>> 2. Backward compatible. If AVX isn't supported, SSE will be used.
> >>>>>> 3. Forward compatible. If new state beyond AVX is supported in
> >>>>>> the XSAVE extended state, only AVX state will be used.
> >>>>>> 4. Remote gdb protocol extension. GDB will send "x86=xml" in qSupported
> >>>>>> request packet to indicate that GDB supports x86 XML target description.
> >>>>>> The gdb stub will send x86 XML target description if it sees "x86=xml"
> >>>>>> in qSupported request packet.
> >>>>>>
> >>>>>> One advantage of this approach is YMM registers are actually stored as
> >>>>>> XMM registers and upper YMM registers in the XSAVE extended state.  It
> >>>>>> is easy and natural to access them as %xmmX and %ymmXh internally.  We
> >>>>>> just need to hide %ymmXh from users.
> >>>>>>
> >>>>>> To support AVX on other OSes, the following changes are needed:
> >>>>>>
> >>>>>> 1. Kernel support to get/set the XSAVE extended state.
> >>>>>> 2. Handle 8/16 upper YMM registers.
> >>>>>> 3. Provide target to_read_description to return SSE or AVX target
> >>>>>> description.
> >>>>>> 4. Update gdbarch_core_read_description to return SSE or AVX target
> >>>>>> description based on contents of core dump.
> >>>>>
> >>>>> Wait; there is something important missing here.  How are the new %ymm
> >>>>> registers referred to in debug info?  The AMD64 SysV psABI defines the
> >>>>> DWARF register Number Mapping, but the 0.99.4 draft copy I have
> >>>>> doesn't define any mappings for the %ymm registers.  What mapping does
> >>>>> GCC use?
> >>>>>
> >>>>
> >>>> In gcc, XMM and YMM registers have the same register number. They map
> >>>> to be the same DWARF register with different sizes.  Since XMM and YMM
> >>>> registers are caller-saved, they don't appear in unwind info. So, the same
> >>>> DWARF register with different sizes for XMM/YMM registers isn't a problem.
> >>>>
> >>>>
> >>>
> >>> Yes, there is a problem. amd64_dwarf_reg_to_regnum needs to map 256bit
> >>> register to YMM. How do other arches solve this?
> >>>
> >>
> >> My first approach works here since XMM and YMM register have the same
> >> register numbers.  We can solve it with 2 alternatives:
> >>
> >> 1. Give a different DWARF register number for YMM register,
> >> which is an incompatible ABI change.
> >> 2. Implement YMM registers as a super set of XMM registers, which
> >> is my first approach.
> >>
> >> Thanks Mark for pointing out this issue.
> >
> > Or I can provide i386_value_from_register.

Probably a bad idea.  That function really is only intended to convert
values that have a completely different bit-representation in the
register.

> It doesn't work on x86 since i386_value_from_register will change
> regnum. Can we change
> 
> typedef struct value * (gdbarch_value_from_register_ftype) (struct
> type *type, int regnum, struct frame_info *frame);
> 
> to
> 
> typedef struct value * (gdbarch_value_from_register_ftype) (struct
> type *type, int *regnum, struct frame_info *frame);
> 
> to support updating regnum?

I'd rather not.

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PATCH: 0/6 [2nd try]: Add AVX support
  2010-03-07 16:40         ` H.J. Lu
  2010-03-07 17:04           ` H.J. Lu
  2010-03-07 19:10           ` Nathan Froyd
@ 2010-03-07 20:29           ` Mark Kettenis
  2010-03-07 21:04             ` H.J. Lu
  2 siblings, 1 reply; 115+ messages in thread
From: Mark Kettenis @ 2010-03-07 20:29 UTC (permalink / raw)
  To: hjl.tools; +Cc: gdb-patches

> Date: Sun, 7 Mar 2010 08:40:10 -0800
> From: "H.J. Lu" <hjl.tools@gmail.com>
> 
> On Sun, Mar 7, 2010 at 8:31 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> > On Sun, Mar 7, 2010 at 6:37 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> >> On Sun, Mar 7, 2010 at 6:16 AM, Mark Kettenis <mark.kettenis@xs4all.nl> wrote:
> >>>> Date: Sat, 6 Mar 2010 14:16:34 -0800
> >>>> From: "H.J. Lu" <hongjiu.lu@intel.com>
> >>>>
> >>>> AVX registers are saved and restored via the XSAVE extended state. The
> >>>> extended control register 0 (the XFEATURE_ENABLED_MASK register), XCR0,
> >>>> is used to determine which states, x87, SSE, AVX, ... are supported
> >>>> in the XSAVE extended state.  XCR0 can be read with the new "xgetbv"
> >>>> instruction.  The xstate_bv field at byte offset 512 in the XSAVE
> >>>> extended state indicates what states the current process is in. If
> >>>> the feature bit is cleared, the corresponding registers should be read as
> >>>> 0. If we update a register, we should set the corresponding feature
> >>>> bit in the xstate_bv field.
> >>>>
> >>>> We added PTRACE_GETREGSET and PTRACE_SETREGSET to Linux kernel to
> >>>> fetch and store AVX registers with ptrace. Linux kernel also stores
> >>>> XCR0 at the first 8 bytes of the software usable bytes, starting at
> >>>> byte offset 464.
> >>>>
> >>>> There are total 6 patches to add AVX support for Linux.  They support:
> >>>>
> >>>> 1. The upper 128bit YMM registers are added for AVX support. The upper
> >>>> 128bit YMM registers are hidden from users. Gdb combines XMM register,
> >>>> %xmmX, with 128bit YMM register, %ymmXh, and present the whole 256bit
> >>>> YMM register, %ymmX, as pseudo register to users.
> >>>> 2. Backward compatible. If AVX isn't supported, SSE will be used.
> >>>> 3. Forward compatible. If new state beyond AVX is supported in
> >>>> the XSAVE extended state, only AVX state will be used.
> >>>> 4. Remote gdb protocol extension. GDB will send "x86=xml" in qSupported
> >>>> request packet to indicate that GDB supports x86 XML target description.
> >>>> The gdb stub will send x86 XML target description if it sees "x86=xml"
> >>>> in qSupported request packet.
> >>>>
> >>>> One advantage of this approach is YMM registers are actually stored as
> >>>> XMM registers and upper YMM registers in the XSAVE extended state.  It
> >>>> is easy and natural to access them as %xmmX and %ymmXh internally.  We
> >>>> just need to hide %ymmXh from users.
> >>>>
> >>>> To support AVX on other OSes, the following changes are needed:
> >>>>
> >>>> 1. Kernel support to get/set the XSAVE extended state.
> >>>> 2. Handle 8/16 upper YMM registers.
> >>>> 3. Provide target to_read_description to return SSE or AVX target
> >>>> description.
> >>>> 4. Update gdbarch_core_read_description to return SSE or AVX target
> >>>> description based on contents of core dump.
> >>>
> >>> Wait; there is something important missing here.  How are the new %ymm
> >>> registers referred to in debug info?  The AMD64 SysV psABI defines the
> >>> DWARF register Number Mapping, but the 0.99.4 draft copy I have
> >>> doesn't define any mappings for the %ymm registers.  What mapping does
> >>> GCC use?
> >>>
> >>
> >> In gcc, XMM and YMM registers have the same register number. They map
> >> to be the same DWARF register with different sizes.  Since XMM and YMM
> >> registers are caller-saved, they don't appear in unwind info. So, the same
> >> DWARF register with different sizes for XMM/YMM registers isn't a problem.
> >>
> >>
> >
> > Yes, there is a problem. amd64_dwarf_reg_to_regnum needs to map 256bit
> > register to YMM. How do other arches solve this?

A possible solution here is to simply always map %xmmN onto %ymmN if
the target supports AVX.  This'll make "info address" say that a
128-bit vector variable lives in %ymmN instead of %xmmN, but that
wouldn't really be a lie, would it?  The only problem with this
approach is that it will break cases where the debug info refers to a
variable living in consecutive %xmm (128-bit) registers using (only)
the register number of the first %xmm register.  This shouldn't happen
with DWARF2, but might happen with older debug formats like stabs.
Not necessarily a serious problem; at least nothing I care about still
uses stabs.

Should be a simple matter of returning the %ymm pseudo register number
if tdep->num_ymm_regs > 0.
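
A minimal sketch of that mapping, for illustration only.  It assumes
the AMD64 psABI numbering where DWARF registers 17..32 are
%xmm0..%xmm15; struct tdep_sketch and vector_dwarf_reg_to_regnum are
hypothetical stand-ins, not the real gdbarch_tdep or the function in
the patch:

```c
/* Hypothetical stand-in for the few gdbarch_tdep fields involved.  */
struct tdep_sketch
{
  int xmm0_regnum;   /* Raw register number of %xmm0.  */
  int ymm0_regnum;   /* Pseudo register number of %ymm0.  */
  int num_ymm_regs;  /* 0 when the target has no AVX.  */
};

/* AMD64 psABI DWARF numbers for %xmm0 and %xmm15.  */
enum { DWARF_XMM0 = 17, DWARF_XMM15 = 32 };

/* Map a DWARF vector register number to a GDB register number.
   Return the %ymmN pseudo register when AVX is available, the
   %xmmN raw register otherwise, and -1 for non-vector numbers.  */
static int
vector_dwarf_reg_to_regnum (const struct tdep_sketch *tdep, int dwarf_reg)
{
  int n;

  if (dwarf_reg < DWARF_XMM0 || dwarf_reg > DWARF_XMM15)
    return -1;

  n = dwarf_reg - DWARF_XMM0;
  if (tdep->num_ymm_regs > 0 && n < tdep->num_ymm_regs)
    return tdep->ymm0_regnum + n;
  return tdep->xmm0_regnum + n;
}
```

With this shape, a 128-bit variable is simply read from the low half
of the %ymmN pseudo register.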

> My first approach works here since XMM and YMM register have the same
> register numbers.  We can solve it with 2 alternatives:

Works or does not work?

> 1. Give a different DWARF register number for YMM register,
> which is an incompatible ABI change.

That really is the only viable option if you want to disambiguate the
%ymm registers from the %xmm registers.

> 2. Implement YMM registers as a super set of XMM registers, which
> is my first approach.

I don't think this really solves anything.  You'll still have issues
with values stored in consecutive registers.

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PATCH: 0/6 [2nd try]: Add AVX support
  2010-03-07 20:29           ` Mark Kettenis
@ 2010-03-07 21:04             ` H.J. Lu
  0 siblings, 0 replies; 115+ messages in thread
From: H.J. Lu @ 2010-03-07 21:04 UTC (permalink / raw)
  To: Mark Kettenis; +Cc: gdb-patches

On Sun, Mar 7, 2010 at 12:28 PM, Mark Kettenis <mark.kettenis@xs4all.nl> wrote:
>> Date: Sun, 7 Mar 2010 08:40:10 -0800
>> From: "H.J. Lu" <hjl.tools@gmail.com>
>>
>> On Sun, Mar 7, 2010 at 8:31 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> > On Sun, Mar 7, 2010 at 6:37 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> >> On Sun, Mar 7, 2010 at 6:16 AM, Mark Kettenis <mark.kettenis@xs4all.nl> wrote:
>> >>>> Date: Sat, 6 Mar 2010 14:16:34 -0800
>> >>>> From: "H.J. Lu" <hongjiu.lu@intel.com>
>> >>>>
>> >>>> AVX registers are saved and restored via the XSAVE extended state. The
>> >>>> extended control register 0 (the XFEATURE_ENABLED_MASK register), XCR0,
>> >>>> is used to determine which states, x87, SSE, AVX, ... are supported
>> >>>> in the XSAVE extended state.  XCR0 can be read with the new "xgetbv"
>> >>>> instruction.  The xstate_bv field at byte offset 512 in the XSAVE
>> >>>> extended state indicates what states the current process is in. If
>> >>>> the feature bit is cleared, the corresponding registers should be read as
>> >>>> 0. If we update a register, we should set the corresponding feature
>> >>>> bit in the xstate_bv field.
>> >>>>
>> >>>> We added PTRACE_GETREGSET and PTRACE_SETREGSET to Linux kernel to
>> >>>> fetch and store AVX registers with ptrace. Linux kernel also stores
>> >>>> XCR0 at the first 8 bytes of the software usable bytes, starting at
>> >>>> byte offset 464.
>> >>>>
>> >>>> There are total 6 patches to add AVX support for Linux.  They support:
>> >>>>
>> >>>> 1. The upper 128bit YMM registers are added for AVX support. The upper
>> >>>> 128bit YMM registers are hidden from users. Gdb combines XMM register,
>> >>>> %xmmX, with 128bit YMM register, %ymmXh, and present the whole 256bit
>> >>>> YMM register, %ymmX, as pseudo register to users.
>> >>>> 2. Backward compatible. If AVX isn't supported, SSE will be used.
>> >>>> 3. Forward compatible. If new state beyond AVX is supported in
>> >>>> the XSAVE extended state, only AVX state will be used.
>> >>>> 4. Remote gdb protocol extension. GDB will send "x86=xml" in qSupported
>> >>>> request packet to indicate that GDB supports x86 XML target description.
>> >>>> The gdb stub will send x86 XML target description if it sees "x86=xml"
>> >>>> in qSupported request packet.
>> >>>>
>> >>>> One advantage of this approach is YMM registers are actually stored as
>> >>>> XMM registers and upper YMM registers in the XSAVE extended state.  It
>> >>>> is easy and natural to access them as %xmmX and %ymmXh internally.  We
>> >>>> just need to hide %ymmXh from users.
>> >>>>
>> >>>> To support AVX on other OSes, the following changes are needed:
>> >>>>
>> >>>> 1. Kernel support to get/set the XSAVE extended state.
>> >>>> 2. Handle 8/16 upper YMM registers.
>> >>>> 3. Provide target to_read_description to return SSE or AVX target
>> >>>> description.
>> >>>> 4. Update gdbarch_core_read_description to return SSE or AVX target
>> >>>> description based on contents of core dump.
>> >>>
>> >>> Wait; there is something important missing here.  How are the new %ymm
>> >>> registers referred to in debug info?  The AMD64 SysV psABI defines the
>> >>> DWARF register Number Mapping, but the 0.99.4 draft copy I have
>> >>> doesn't define any mappings for the %ymm registers.  What mapping does
>> >>> GCC use?
>> >>>
>> >>
>> >> In gcc, XMM and YMM registers have the same register number. They map
>> >> to be the same DWARF register with different sizes.  Since XMM and YMM
>> >> registers are caller-saved, they don't appear in unwind info. So, the same
>> >> DWARF register with different sizes for XMM/YMM registers isn't a problem.
>> >>
>> >>
>> >
>> > Yes, there is a problem. amd64_dwarf_reg_to_regnum needs to map 256bit
>> > register to YMM. How do other arches solve this?
>
> A possible solution here is to simply always map %xmmN onto %ymmN if
> the target supports AVX.  This'll make "info address" say that a
> 128-bit vector variable lives in %ymmN instead of %xmmN, but that
> wouldn't really be a lie, would it?  The only problem with this
> approach is that it will break cases where the debug info refers to a
> variable living in consecutive %xmm (128-bit) registers using (only)
> the register number of the first %xmm register.  This shouldn't happen
> with DWARF2, but might happen with older debug formats like stabs.
> Not necessarily a serious problem; at least nothing I care about still
> uses stabs.
>
> Should be a simple matter of returning the %ymm pseudo register number
> if tdep->num_ymm_regs > 0.

Yes, it works.

Thanks.

-- 
H.J.

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PATCH: 0/6 [2nd try]: Add AVX support
  2010-03-07 19:49             ` Mark Kettenis
@ 2010-03-07 21:07               ` Nathan Froyd
  2010-03-07 21:17                 ` H.J. Lu
  0 siblings, 1 reply; 115+ messages in thread
From: Nathan Froyd @ 2010-03-07 21:07 UTC (permalink / raw)
  To: Mark Kettenis; +Cc: hjl.tools, gdb-patches

On Sun, Mar 07, 2010 at 08:46:34PM +0100, Mark Kettenis wrote:
> > From: Nathan Froyd <froydnj@codesourcery.com>
> > The third alternative--again, what's adopted for the PPC SPE 64-bit
> > registers--is to give %ymmNh their own DWARF register numbers.  I
> > suppose it's also ABI-incompatible, but it seems like it fits with your
> > approach much better than either of the above alternatives.
> 
> I don't think that would be a good idea.  It means you can't refer to
> something stored in a %ymmN register with a single register number.

Sure you can.  GDB knows to merge %ymmNh with %xmmN if it needs to to
make %ymmN, which will have only one register number.
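
To illustrate the merge, here is a rough sketch.  It assumes the low
128 bits (%xmmN) occupy the first 16 bytes of the 256-bit value and
the hidden upper half (%ymmNh) the last 16; the function names are
made up for the example and are not GDB's pseudo register hooks:

```c
#include <string.h>

enum { XMM_BYTES = 16, YMM_BYTES = 32 };

/* Build the 256-bit %ymmN value from its two raw halves.  */
static void
ymm_from_halves (const unsigned char *xmm, const unsigned char *ymmh,
                 unsigned char *ymm)
{
  memcpy (ymm, xmm, XMM_BYTES);               /* Low 128 bits.  */
  memcpy (ymm + XMM_BYTES, ymmh, XMM_BYTES);  /* High 128 bits.  */
}

/* Split a 256-bit %ymmN value back into its two raw halves.  */
static void
ymm_to_halves (const unsigned char *ymm, unsigned char *xmm,
               unsigned char *ymmh)
{
  memcpy (xmm, ymm, XMM_BYTES);
  memcpy (ymmh, ymm + XMM_BYTES, XMM_BYTES);
}
```

Reading the pseudo register uses the first function; writing it uses
the second, so each raw half is updated separately.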

> The compiler will have to use a more complicated expression (which may
> not be possible for older debug info formats) for these.  As a result,
> things like "info address" become rather useless.

I suppose the importance of that depends on how important supporting
older debug info formats is.  I haven't looked into whether the SPE bits
are well supported under non-DWARF formats.

-Nathan

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PATCH: 0/6 [2nd try]: Add AVX support
  2010-03-07 21:07               ` Nathan Froyd
@ 2010-03-07 21:17                 ` H.J. Lu
  0 siblings, 0 replies; 115+ messages in thread
From: H.J. Lu @ 2010-03-07 21:17 UTC (permalink / raw)
  To: Nathan Froyd; +Cc: Mark Kettenis, gdb-patches

On Sun, Mar 7, 2010 at 1:07 PM, Nathan Froyd <froydnj@codesourcery.com> wrote:
> On Sun, Mar 07, 2010 at 08:46:34PM +0100, Mark Kettenis wrote:
>> > From: Nathan Froyd <froydnj@codesourcery.com>
>> > The third alternative--again, what's adopted for the PPC SPE 64-bit
>> > registers--is to give %ymmNh their own DWARF register numbers.  I
>> > suppose it's also ABI-incompatible, but it seems like it fits with your
>> > approach much better than either of the above alternatives.
>>
>> I don't think that would be a good idea.  It means you can't refer to
>> something stored in a %ymmN register with a single register number.
>
> Sure you can.  GDB knows to merge %ymmNh with %xmmN if it needs to to
> make %ymmN, which will have only one register number.
>

We aren't going to give %ymmNh their own DWARF register numbers
in gcc 4.4. Mark's idea of using %ymmN register number for %xmmN
if AVX is available seems to work fine.


-- 
H.J.

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PATCH: 3/6 [2nd try]: Add AVX support (i386 changes)
  2010-03-06 22:21     ` PATCH: 3/6 [2nd try]: " H.J. Lu
@ 2010-03-07 21:32       ` H.J. Lu
  2010-03-11 22:37         ` Mark Kettenis
  2010-03-12 16:49       ` H.J. Lu
  2010-03-27 15:48       ` PATCH: 3/6 [2nd " Mark Kettenis
  2 siblings, 1 reply; 115+ messages in thread
From: H.J. Lu @ 2010-03-07 21:32 UTC (permalink / raw)
  To: GDB

On Sat, Mar 06, 2010 at 02:20:37PM -0800, H.J. Lu wrote:
> Hi,
> 
> Here are i386 changes to support AVX. OK to install?
>  

Here is the updated patch to change i386_dbx_reg_to_regnum to return
%ymmN register number for %xmmN if AVX is available.  Any comments?
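
For anyone reading the patch without the cover letter at hand, here is
a small standalone sketch (not part of the patch) of how the offsets
and sizes fit together.  The constants repeat the values used in this
series: XCR0 at byte offset 464 (Linux), xstate_bv at byte offset 512,
and 576/832 bytes for the SSE/AVX extended state.  The helper names
are made up for the example:

```c
#include <stdint.h>
#include <string.h>

#define XSTATE_X87  (1ULL << 0)
#define XSTATE_SSE  (1ULL << 1)
#define XSTATE_AVX  (1ULL << 2)

#define XSTATE_SSE_SIZE  576
#define XSTATE_AVX_SIZE  832

#define XSAVE_XCR0_OFFSET       464  /* Linux stores XCR0 here.  */
#define XSAVE_XSTATE_BV_OFFSET  512  /* Start of the XSAVE header.  */

/* Read a 64-bit field at OFF; memcpy avoids alignment assumptions.  */
static uint64_t
xsave_read_u64 (const unsigned char *xsave, size_t off)
{
  uint64_t v;

  memcpy (&v, xsave + off, sizeof v);
  return v;
}

/* XCR0 as saved by the kernel in the software usable bytes.  */
static uint64_t
xsave_xcr0 (const unsigned char *xsave)
{
  return xsave_read_u64 (xsave, XSAVE_XCR0_OFFSET);
}

/* Extended state size to use for a given XCR0.  */
static unsigned int
xstate_size_for (uint64_t xcr0)
{
  return (xcr0 & XSTATE_AVX) != 0 ? XSTATE_AVX_SIZE : XSTATE_SSE_SIZE;
}

/* Non-zero if the upper YMM halves in XSAVE hold live data.  When
   the bit is clear the registers should be read as 0.  */
static int
xsave_avx_in_use (const unsigned char *xsave)
{
  return (xsave_read_u64 (xsave, XSAVE_XSTATE_BV_OFFSET) & XSTATE_AVX) != 0;
}
```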

Thanks.


H.J.
---
2010-03-07  H.J. Lu  <hongjiu.lu@intel.com>

	* i386-linux-nat.c: Include "regset.h", "elf/common.h" and
	<sys/uio.h>.
	(xstate_size): New.
	(xstate_size_n_of_int64): Likewise.
	(fetch_xstateregs): Likewise.
	(store_xstateregs): Likewise.
	(GETXSTATEREGS_SUPPLIES): Likewise.
	(regmap): Include 8 upper YMM registers.
	(i386_linux_fetch_inferior_registers): Support XSAVE extended
	state.
	(i386_linux_store_inferior_registers): Likewise.
	(i386_linux_read_description): Check and enable AVX target
	descriptions.

	* i386-linux-tdep.c: Include "regset.h", "i387-tdep.h",
	"i386-xstate.h" and "features/i386/i386-avx-linux.c".
	(i386_linux_regset_sections): Make it global.  Add
	".reg-xstate".
	(i386_linux_gregset_reg_offset): Include 8 upper YMM registers.
	(i386_linux_update_xstateregset): New.
	(i386_linux_core_read_xcr0): Likewise.
	(i386_linux_core_read_description): Check and enable AVX target
	description.
	(i386_linux_init_abi): Set xsave_xcr0_offset.
	(_initialize_i386_linux_tdep): Call
	initialize_tdesc_i386_avx_linux.

	* i386-linux-tdep.h (I386_LINUX_ORIG_EAX_REGNUM): Replace
	I386_SSE_NUM_REGS with I386_AVX_NUM_REGS.
	(i386_linux_core_read_xcr0): New.
	(tdesc_i386_avx_linux): Likewise.
	(i386_linux_regset_sections): Likewise.
	(i386_linux_update_xstateregset): Likewise.
	(I386_LINUX_XSAVE_XCR0_OFFSET): Likewise.

	* i386-tdep.c: Include "i386-xstate.h" and
	"features/i386/i386-avx.c".
	(i386_ymm_names): New.
	(i386_ymmh_names): Likewise.
	(i386_ymmh_regnum_p): Likewise.
	(i386_ymm_regnum_p): Likewise.
	(i386_xmm_regnum_p): Likewise.
	(i386_register_name): Likewise.
	(i386_ymm_type): Likewise.
	(i386_supply_xstateregset): Likewise.
	(i386_collect_xstateregset): Likewise.
	(i386_sse_regnum_p): Removed.
	(i386_pseudo_register_name): Support pseudo YMM registers.
	(i386_pseudo_register_type): Likewise.
	(i386_pseudo_register_read): Likewise.
	(i386_pseudo_register_write): Likewise.
	(i386_dbx_reg_to_regnum): Return %ymmN register number for
	%xmmN if AVX is available.
	(i386_regset_from_core_section): Support .reg-xstate section.
	(i386_register_reggroup_p): Support upper YMM and YMM registers.
	(i386_validate_tdesc_p): Support org.gnu.gdb.i386.avx feature.
	Set ymmh_register_names, num_ymm_regs, ymm0h_regnum and xcr0.
	(i386_gdbarch_init): Set xstateregset.  Set xsave_xcr0_offset. 
	Call set_gdbarch_register_name.  Replace I386_SSE_NUM_REGS with
	I386_AVX_NUM_REGS.  Set ymmh_register_names, ymm0h_regnum and
	num_ymm_regs.  Add num_ymm_regs to set_gdbarch_num_pseudo_regs.
	Set ymm0_regnum.  Call set_gdbarch_qsupported.
	(_initialize_i386_tdep): Call initialize_tdesc_i386_avx.

	* i386-tdep.h (gdbarch_tdep): Add xstateregset, ymm0_regnum,
	xcr0, xsave_xcr0_offset, ymm0h_regnum, ymmh_register_names and
	i386_ymm_type.
	(i386_regnum): Add I386_YMM0H_REGNUM, and I386_YMM7H_REGNUM.
	(I386_AVX_NUM_REGS): New.
	(i386_xmm_regnum_p): Likewise.
	(i386_ymm_regnum_p): Likewise.
	(i386_ymmh_regnum_p): Likewise.

	* common/i386-xstate.h: New.
	* config/i386/nm-linux-xstate.h: Likewise.
	* config/i386/nm-linux64.h: Likewise.

	* config/i386/linux64.mh (NAT_FILE): Set to nm-linux64.h.

	* config/i386/nm-linux.h: Include "config/i386/nm-linux-xstate.h".

diff --git a/gdb/common/i386-xstate.h b/gdb/common/i386-xstate.h
new file mode 100644
index 0000000..3548103
--- /dev/null
+++ b/gdb/common/i386-xstate.h
@@ -0,0 +1,45 @@
+/* Common code for i386 XSAVE extended state.
+
+   Copyright (C) 2010 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#ifndef I386_XSTATE_H
+#define I386_XSTATE_H 1
+
+/* The extended state feature bits.  */
+#define bit_I386_XSTATE_X87		(1ULL << 0)
+#define bit_I386_XSTATE_SSE		(1ULL << 1)
+#define bit_I386_XSTATE_AVX		(1ULL << 2)
+
+/* Supported mask and size of the extended state.  */
+#define I386_XSTATE_SSE_MASK	\
+  (bit_I386_XSTATE_X87 | bit_I386_XSTATE_SSE)
+#define I386_XSTATE_AVX_MASK	\
+  (I386_XSTATE_SSE_MASK | bit_I386_XSTATE_AVX)
+#define I386_XSTATE_MAX_MASK	\
+  I386_XSTATE_AVX_MASK
+
+#define I386_XSTATE_SSE_SIZE		576
+#define I386_XSTATE_AVX_SIZE		832
+#define I386_XSTATE_MAX_SIZE		832
+
+/* Get I386 XSAVE extended state size.  */
+#define I386_XSTATE_SIZE(XCR0)	\
+  (((XCR0) & bit_I386_XSTATE_AVX) != 0 \
+   ? I386_XSTATE_AVX_SIZE : I386_XSTATE_SSE_SIZE)
+
+#endif /* I386_XSTATE_H */
diff --git a/gdb/config/i386/linux64.mh b/gdb/config/i386/linux64.mh
index 19f3be0..99a5042 100644
--- a/gdb/config/i386/linux64.mh
+++ b/gdb/config/i386/linux64.mh
@@ -2,7 +2,7 @@
 NATDEPFILES= inf-ptrace.o fork-child.o \
 	i386-nat.o amd64-nat.o amd64-linux-nat.o linux-nat.o \
 	proc-service.o linux-thread-db.o linux-fork.o
-NAT_FILE= config/nm-linux.h
+NAT_FILE= nm-linux64.h
 
 # The dynamically loaded libthread_db needs access to symbols in the
 # gdb executable.
diff --git a/gdb/config/i386/nm-linux-xstate.h b/gdb/config/i386/nm-linux-xstate.h
new file mode 100644
index 0000000..0dbf9e5
--- /dev/null
+++ b/gdb/config/i386/nm-linux-xstate.h
@@ -0,0 +1,33 @@
+/* Native XSAVE extended state support for GNU/Linux x86.
+
+   Copyright 2010 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#ifndef	NM_LINUX_XSTATE_H
+#define	NM_LINUX_XSTATE_H
+
+#include "i386-xstate.h"
+
+#ifndef PTRACE_GETREGSET
+#define PTRACE_GETREGSET	0x4204
+#endif
+
+#ifndef PTRACE_SETREGSET
+#define PTRACE_SETREGSET	0x4205
+#endif
+
+#endif	/* NM_LINUX_XSTATE_H */
diff --git a/gdb/config/i386/nm-linux.h b/gdb/config/i386/nm-linux.h
index 10db309..fab8a0d 100644
--- a/gdb/config/i386/nm-linux.h
+++ b/gdb/config/i386/nm-linux.h
@@ -23,6 +23,7 @@
 #define NM_LINUX_H
 
 #include "config/nm-linux.h"
+#include "config/i386/nm-linux-xstate.h"
 
 #ifdef HAVE_PTRACE_GETFPXREGS
 /* Include register set support for the SSE registers.  */
diff --git a/gdb/config/i386/nm-linux64.h b/gdb/config/i386/nm-linux64.h
new file mode 100644
index 0000000..75220d6
--- /dev/null
+++ b/gdb/config/i386/nm-linux64.h
@@ -0,0 +1,26 @@
+/* Native support for GNU/Linux x86-64.
+
+   Copyright 2010 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#ifndef NM_LINUX64_H
+#define NM_LINUX64_H
+
+#include "config/nm-linux.h"
+#include "config/i386/nm-linux-xstate.h"
+
+#endif /* nm-linux64.h */
diff --git a/gdb/i386-linux-nat.c b/gdb/i386-linux-nat.c
index 31b9086..344c814 100644
--- a/gdb/i386-linux-nat.c
+++ b/gdb/i386-linux-nat.c
@@ -23,11 +23,14 @@
 #include "inferior.h"
 #include "gdbcore.h"
 #include "regcache.h"
+#include "regset.h"
 #include "target.h"
 #include "linux-nat.h"
 
 #include "gdb_assert.h"
 #include "gdb_string.h"
+#include "elf/common.h"
+#include <sys/uio.h>
 #include <sys/ptrace.h>
 #include <sys/user.h>
 #include <sys/procfs.h>
@@ -69,6 +72,16 @@
 
 /* Defines ps_err_e, struct ps_prochandle.  */
 #include "gdb_proc_service.h"
+
+/* The extended state size in bytes.  */
+static unsigned int xstate_size;
+
+/* The extended state size in units of int64.  We use an array of int64
+   for better alignment.  */
+static unsigned int xstate_size_n_of_int64;
+
+/* Does the current host support PTRACE_GETREGSET?  */
+static int have_ptrace_getregset = -1;
 \f
 
 /* The register sets used in GNU/Linux ELF core-dumps are identical to
@@ -98,6 +111,8 @@ static int regmap[] =
   -1, -1, -1, -1,		/* xmm0, xmm1, xmm2, xmm3 */
   -1, -1, -1, -1,		/* xmm4, xmm5, xmm6, xmm6 */
   -1,				/* mxcsr */
+  -1, -1, -1, -1,		/* ymm0h, ymm1h, ymm2h, ymm3h */
+  -1, -1, -1, -1,		/* ymm4h, ymm5h, ymm6h, ymm7h */
   ORIG_EAX
 };
 
@@ -110,6 +125,9 @@ static int regmap[] =
 #define GETFPXREGS_SUPPLIES(regno) \
   (I386_ST0_REGNUM <= (regno) && (regno) < I386_SSE_NUM_REGS)
 
+#define GETXSTATEREGS_SUPPLIES(regno) \
+  (I386_ST0_REGNUM <= (regno) && (regno) < I386_AVX_NUM_REGS)
+
 /* Does the current host support the GETREGS request?  */
 int have_ptrace_getregs =
 #ifdef HAVE_PTRACE_GETREGS
@@ -355,6 +373,57 @@ static void store_fpregs (const struct regcache *regcache, int tid, int regno) {
 
 /* Transfering floating-point and SSE registers to and from GDB.  */
 
+/* Fetch all registers covered by the PTRACE_GETREGSET request from
+   process/thread TID and store their values in GDB's register array.
+   Return non-zero if successful, zero otherwise.  */
+
+static int
+fetch_xstateregs (struct regcache *regcache, int tid)
+{
+  unsigned long long xstateregs[xstate_size_n_of_int64];
+  struct iovec iov;
+
+  if (!have_ptrace_getregset)
+    return 0;
+
+  iov.iov_base = xstateregs;
+  iov.iov_len = xstate_size;
+  if (ptrace (PTRACE_GETREGSET, tid, (unsigned int) NT_X86_XSTATE,
+	      (int) &iov) < 0)
+    perror_with_name (_("Couldn't read extended state status"));
+
+  i387_supply_xsave (regcache, -1, xstateregs);
+  return 1;
+}
+
+/* Store all valid registers in GDB's register array covered by the
+   PTRACE_SETREGSET request into the process/thread specified by TID.
+   Return non-zero if successful, zero otherwise.  */
+
+static int
+store_xstateregs (const struct regcache *regcache, int tid, int regno)
+{
+  unsigned long long xstateregs[xstate_size_n_of_int64];
+  struct iovec iov;
+
+  if (!have_ptrace_getregset)
+    return 0;
+  
+  iov.iov_base = xstateregs;
+  iov.iov_len = xstate_size;
+  if (ptrace (PTRACE_GETREGSET, tid, (unsigned int) NT_X86_XSTATE,
+	      (int) &iov) < 0)
+    perror_with_name (_("Couldn't read extended state status"));
+
+  i387_collect_xsave (regcache, regno, xstateregs, 0);
+
+  if (ptrace (PTRACE_SETREGSET, tid, (unsigned int) NT_X86_XSTATE,
+	      (int) &iov) < 0)
+    perror_with_name (_("Couldn't write extended state status"));
+
+  return 1;
+}
+
 #ifdef HAVE_PTRACE_GETFPXREGS
 
 /* Fill GDB's register array with the floating-point and SSE register
@@ -489,6 +558,8 @@ i386_linux_fetch_inferior_registers (struct target_ops *ops,
 	  return;
 	}
 
+      if (fetch_xstateregs (regcache, tid))
+	return;
       if (fetch_fpxregs (regcache, tid))
 	return;
       fetch_fpregs (regcache, tid);
@@ -501,6 +572,12 @@ i386_linux_fetch_inferior_registers (struct target_ops *ops,
       return;
     }
 
+  if (GETXSTATEREGS_SUPPLIES (regno))
+    {
+      if (fetch_xstateregs (regcache, tid))
+	return;
+    }
+
   if (GETFPXREGS_SUPPLIES (regno))
     {
       if (fetch_fpxregs (regcache, tid))
@@ -553,6 +630,8 @@ i386_linux_store_inferior_registers (struct target_ops *ops,
   if (regno == -1)
     {
       store_regs (regcache, tid, regno);
+      if (store_xstateregs (regcache, tid, regno))
+	return;
       if (store_fpxregs (regcache, tid, regno))
 	return;
       store_fpregs (regcache, tid, regno);
@@ -565,6 +644,12 @@ i386_linux_store_inferior_registers (struct target_ops *ops,
       return;
     }
 
+  if (GETXSTATEREGS_SUPPLIES (regno))
+    {
+      if (store_xstateregs (regcache, tid, regno))
+	return;
+    }
+
   if (GETFPXREGS_SUPPLIES (regno))
     {
       if (store_fpxregs (regcache, tid, regno))
@@ -858,7 +943,49 @@ i386_linux_child_post_startup_inferior (ptid_t ptid)
 static const struct target_desc *
 i386_linux_read_description (struct target_ops *ops)
 {
-  return tdesc_i386_linux;
+  static unsigned long long xcr0;
+
+  if (have_ptrace_getregset == -1)
+    {
+      int tid;
+      unsigned long long xstateregs[(I386_XSTATE_SSE_SIZE
+				     / sizeof (long long))];
+      struct iovec iov;
+
+      /* GNU/Linux LWP ID's are process ID's.  */
+      tid = TIDGET (inferior_ptid);
+      if (tid == 0)
+	tid = PIDGET (inferior_ptid); /* Not a threaded program.  */
+
+      iov.iov_base = xstateregs;
+      iov.iov_len = I386_XSTATE_SSE_SIZE;
+
+      /* Check if PTRACE_GETREGSET works.  */
+      if (ptrace (PTRACE_GETREGSET, tid,
+		  (unsigned int) NT_X86_XSTATE, (long) &iov) < 0)
+	have_ptrace_getregset = 0;
+      else
+	{
+	  have_ptrace_getregset = 1;
+
+	  /* Get XCR0 from XSAVE extended state.  */
+	  xcr0 = xstateregs[(I386_LINUX_XSAVE_XCR0_OFFSET
+			     / sizeof (long long))];
+
+	  xstate_size = I386_XSTATE_SIZE (xcr0);
+	  xstate_size_n_of_int64 = xstate_size / sizeof (long long);
+	}
+
+      i386_linux_update_xstateregset (i386_linux_regset_sections,
+				      xstate_size);
+    }
+
+  /* Check the native XCR0 only if PTRACE_GETREGSET is available.  */
+  if (have_ptrace_getregset
+      && (xcr0 & I386_XSTATE_AVX_MASK) == I386_XSTATE_AVX_MASK)
+    return tdesc_i386_avx_linux;
+  else
+    return tdesc_i386_linux;
 }
 
 void
diff --git a/gdb/i386-linux-tdep.c b/gdb/i386-linux-tdep.c
index b23c109..66ecf84 100644
--- a/gdb/i386-linux-tdep.c
+++ b/gdb/i386-linux-tdep.c
@@ -23,6 +23,7 @@
 #include "frame.h"
 #include "value.h"
 #include "regcache.h"
+#include "regset.h"
 #include "inferior.h"
 #include "osabi.h"
 #include "reggroups.h"
@@ -36,9 +37,11 @@
 #include "solib-svr4.h"
 #include "symtab.h"
 #include "arch-utils.h"
-#include "regset.h"
 #include "xml-syscall.h"
 
+#include "i387-tdep.h"
+#include "i386-xstate.h"
+
 /* The syscall's XML filename for i386.  */
 #define XML_SYSCALL_FILENAME_I386 "syscalls/i386-linux.xml"
 
@@ -47,13 +50,15 @@
 #include <stdint.h>
 
 #include "features/i386/i386-linux.c"
+#include "features/i386/i386-avx-linux.c"
 
 /* Supported register note sections.  */
-static struct core_regset_section i386_linux_regset_sections[] =
+struct core_regset_section i386_linux_regset_sections[] =
 {
   { ".reg", 144, "general-purpose" },
   { ".reg2", 108, "floating-point" },
   { ".reg-xfp", 512, "extended floating-point" },
+  { ".reg-xstate", 0, "XSAVE extended state" },
   { NULL, 0 }
 };
 
@@ -533,6 +538,7 @@ static int i386_linux_gregset_reg_offset[] =
   -1, -1, -1, -1, -1, -1, -1, -1,
   -1, -1, -1, -1, -1, -1, -1, -1,
   -1,
+  -1, -1, -1, -1, -1, -1, -1, -1,
   11 * 4			/* "orig_eax" */
 };
 
@@ -560,6 +566,66 @@ static int i386_linux_sc_reg_offset[] =
   0 * 4				/* %gs */
 };
 
+/* Update XSAVE extended state register note section.  */
+
+void
+i386_linux_update_xstateregset
+  (struct core_regset_section *regset_sections, unsigned int xstate_size)
+{
+  int i;
+
+  /* Update the XSAVE extended state register note section for "gcore".
+     Disable it if its size is 0.  */
+  for (i = 0; regset_sections[i].sect_name != NULL; i++)
+    if (strcmp (regset_sections[i].sect_name, ".reg-xstate") == 0)
+      {
+	if (xstate_size)
+	  regset_sections[i].size = xstate_size;
+	else
+	  regset_sections[i].sect_name = NULL;
+	break;
+      }
+}
+
+/* Get XSAVE extended state xcr0 from core dump.  */
+
+unsigned long long
+i386_linux_core_read_xcr0 (struct gdbarch *gdbarch,
+			   struct target_ops *target, bfd *abfd)
+{
+  asection *xstate = bfd_get_section_by_name (abfd, ".reg-xstate");
+  unsigned long long xcr0;
+
+  if (xstate)
+    {
+      size_t size = bfd_section_size (abfd, xstate);
+
+      gdb_assert (size >= I386_XSTATE_SSE_SIZE);
+
+      /* Check extended state size.  */
+      if (size < I386_XSTATE_AVX_SIZE)
+	xcr0 = I386_XSTATE_SSE_MASK;
+      else
+	{
+	  char contents[8];
+
+	  if (! bfd_get_section_contents (abfd, xstate, contents,
+					  (file_ptr) I386_LINUX_XSAVE_XCR0_OFFSET,
+					  8))
+	    {
+	      warning (_("Couldn't read `xcr0' bytes from `.reg-xstate' section in core file."));
+	      return 0;
+	    }
+
+	  xcr0 = bfd_get_64 (abfd, contents);
+	}
+    }
+  else
+    xcr0 = I386_XSTATE_SSE_MASK;
+
+  return xcr0;
+}
+
 /* Get Linux/x86 target description from core dump.  */
 
 static const struct target_desc *
@@ -568,12 +634,17 @@ i386_linux_core_read_description (struct gdbarch *gdbarch,
 				  bfd *abfd)
 {
   asection *section = bfd_get_section_by_name (abfd, ".reg2");
+  unsigned long long xcr0;
 
   if (section == NULL)
     return NULL;
 
   /* Linux/i386.  */
-  return tdesc_i386_linux;
+  xcr0 = i386_linux_core_read_xcr0 (gdbarch, target, abfd);
+  if ((xcr0 & I386_XSTATE_AVX_MASK) == I386_XSTATE_AVX_MASK)
+    return tdesc_i386_avx_linux;
+  else
+    return tdesc_i386_linux;
 }
 
 static void
@@ -623,6 +694,8 @@ i386_linux_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch)
   tdep->sc_reg_offset = i386_linux_sc_reg_offset;
   tdep->sc_num_regs = ARRAY_SIZE (i386_linux_sc_reg_offset);
 
+  tdep->xsave_xcr0_offset = I386_LINUX_XSAVE_XCR0_OFFSET;
+
   set_gdbarch_process_record (gdbarch, i386_process_record);
   set_gdbarch_process_record_signal (gdbarch, i386_linux_record_signal);
 
@@ -840,4 +913,5 @@ _initialize_i386_linux_tdep (void)
 
   /* Initialize the Linux target description  */
   initialize_tdesc_i386_linux ();
+  initialize_tdesc_i386_avx_linux ();
 }
diff --git a/gdb/i386-linux-tdep.h b/gdb/i386-linux-tdep.h
index 11f7295..8881fea 100644
--- a/gdb/i386-linux-tdep.h
+++ b/gdb/i386-linux-tdep.h
@@ -30,12 +30,45 @@
 /* Register number for the "orig_eax" pseudo-register.  If this
    pseudo-register contains a value >= 0 it is interpreted as the
    system call number that the kernel is supposed to restart.  */
-#define I386_LINUX_ORIG_EAX_REGNUM I386_SSE_NUM_REGS
+#define I386_LINUX_ORIG_EAX_REGNUM I386_AVX_NUM_REGS
 
 /* Total number of registers for GNU/Linux.  */
 #define I386_LINUX_NUM_REGS (I386_LINUX_ORIG_EAX_REGNUM + 1)
 
+/* Get XSAVE extended state xcr0 from core dump.  */
+extern unsigned long long i386_linux_core_read_xcr0
+  (struct gdbarch *gdbarch, struct target_ops *target, bfd *abfd);
+
 /* Linux target description.  */
 extern struct target_desc *tdesc_i386_linux;
+extern struct target_desc *tdesc_i386_avx_linux;
+
+/* Supported register note sections.  */
+extern struct core_regset_section i386_linux_regset_sections[];
+
+/* Update XSAVE extended state register note section.  */
+extern void i386_linux_update_xstateregset
+  (struct core_regset_section *regset_sections, unsigned int xstate_size);
+
+/* Format of XSAVE extended state is:
+ 	struct
+	{
+	  fxsave_bytes[0..463]
+	  sw_usable_bytes[464..511]
+	  xstate_hdr_bytes[512..575]
+	  avx_bytes[576..831]
+	  future_state etc
+	};
+
+  The same memory layout is used for the core dump NT_X86_XSTATE note
+  representing the XSAVE extended state registers.
+
+  The first 8 bytes of the sw_usable_bytes, [464..471], are set to the
+  OS-enabled state mask, which is the same as the 64-bit mask returned by
+  xgetbv for XCR0.  We can use this mask, together with the mask saved in
+  the xstate_hdr bytes, to interpret which states the processor/OS
+  supports and which states are in use or initialized for the particular
+  process/thread.  */
+#define I386_LINUX_XSAVE_XCR0_OFFSET 464
 
 #endif /* i386-linux-tdep.h */
diff --git a/gdb/i386-tdep.c b/gdb/i386-tdep.c
index 05afa56..02b4157 100644
--- a/gdb/i386-tdep.c
+++ b/gdb/i386-tdep.c
@@ -50,11 +50,13 @@
 
 #include "i386-tdep.h"
 #include "i387-tdep.h"
+#include "i386-xstate.h"
 
 #include "record.h"
 #include <stdint.h>
 
 #include "features/i386/i386.c"
+#include "features/i386/i386-avx.c"
 
 /* Register names.  */
 
@@ -73,6 +75,18 @@ static const char *i386_register_names[] =
   "mxcsr"
 };
 
+static const char *i386_ymm_names[] =
+{
+  "ymm0",  "ymm1",   "ymm2",  "ymm3",
+  "ymm4",  "ymm5",   "ymm6",  "ymm7",
+};
+
+static const char *i386_ymmh_names[] =
+{
+  "ymm0h",  "ymm1h",   "ymm2h",  "ymm3h",
+  "ymm4h",  "ymm5h",   "ymm6h",  "ymm7h",
+};
+
 /* Register names for MMX pseudo-registers.  */
 
 static const char *i386_mmx_names[] =
@@ -149,18 +163,47 @@ i386_dword_regnum_p (struct gdbarch *gdbarch, int regnum)
   return regnum >= 0 && regnum < tdep->num_dword_regs;
 }
 
+int
+i386_ymmh_regnum_p (struct gdbarch *gdbarch, int regnum)
+{
+  struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+  int ymm0h_regnum = tdep->ymm0h_regnum;
+
+  if (ymm0h_regnum < 0)
+    return 0;
+
+  regnum -= ymm0h_regnum;
+  return regnum >= 0 && regnum < tdep->num_ymm_regs;
+}
+
+/* AVX register?  */
+
+int
+i386_ymm_regnum_p (struct gdbarch *gdbarch, int regnum)
+{
+  struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+  int ymm0_regnum = tdep->ymm0_regnum;
+
+  if (ymm0_regnum < 0)
+    return 0;
+
+  regnum -= ymm0_regnum;
+  return regnum >= 0 && regnum < tdep->num_ymm_regs;
+}
+
 /* SSE register?  */
 
-static int
-i386_sse_regnum_p (struct gdbarch *gdbarch, int regnum)
+int
+i386_xmm_regnum_p (struct gdbarch *gdbarch, int regnum)
 {
   struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+  int num_xmm_regs = I387_NUM_XMM_REGS (tdep);
 
-  if (I387_NUM_XMM_REGS (tdep) == 0)
+  if (num_xmm_regs == 0)
     return 0;
 
-  return (I387_XMM0_REGNUM (tdep) <= regnum
-	  && regnum < I387_MXCSR_REGNUM (tdep));
+  regnum -= I387_XMM0_REGNUM (tdep);
+  return regnum >= 0 && regnum < num_xmm_regs;
 }
 
 static int
@@ -200,6 +243,19 @@ i386_fpc_regnum_p (struct gdbarch *gdbarch, int regnum)
 	  && regnum < I387_XMM0_REGNUM (tdep));
 }
 
+/* Return the name of register REGNUM, or the empty string if it is
+   an anonymous register.  */
+
+static const char *
+i386_register_name (struct gdbarch *gdbarch, int regnum)
+{
+  /* Hide the upper YMM registers.  */
+  if (i386_ymmh_regnum_p (gdbarch, regnum))
+    return "";
+
+  return tdesc_register_name (gdbarch, regnum);
+}
+
 /* Return the name of register REGNUM.  */
 
 const char *
@@ -208,6 +264,8 @@ i386_pseudo_register_name (struct gdbarch *gdbarch, int regnum)
   struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
   if (i386_mmx_regnum_p (gdbarch, regnum))
     return i386_mmx_names[regnum - I387_MM0_REGNUM (tdep)];
+  else if (i386_ymm_regnum_p (gdbarch, regnum))
+    return i386_ymm_names[regnum - tdep->ymm0_regnum];
   else if (i386_byte_regnum_p (gdbarch, regnum))
     return i386_byte_names[regnum - tdep->al_regnum];
   else if (i386_word_regnum_p (gdbarch, regnum))
@@ -245,7 +303,13 @@ i386_dbx_reg_to_regnum (struct gdbarch *gdbarch, int reg)
   else if (reg >= 21 && reg <= 28)
     {
       /* SSE registers.  */
-      return reg - 21 + I387_XMM0_REGNUM (tdep);
+      int ymm0_regnum = tdep->ymm0_regnum;
+
+      if (ymm0_regnum >= 0
+	  && i386_xmm_regnum_p (gdbarch, reg))
+	return reg - 21 + ymm0_regnum;
+      else
+	return reg - 21 + I387_XMM0_REGNUM (tdep);
     }
   else if (reg >= 29 && reg <= 36)
     {
@@ -2183,6 +2247,59 @@ i387_ext_type (struct gdbarch *gdbarch)
   return tdep->i387_ext_type;
 }
 
+/* Construct vector type for pseudo YMM registers.  We can't use
+   tdesc_find_type since YMM isn't described in target description.  */
+
+static struct type *
+i386_ymm_type (struct gdbarch *gdbarch)
+{
+  struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+
+  if (!tdep->i386_ymm_type)
+    {
+      const struct builtin_type *bt = builtin_type (gdbarch);
+
+      /* The type we're building is this: */
+#if 0
+      union __gdb_builtin_type_vec256i
+      {
+        int128_t v2_int128[2];
+        int64_t v4_int64[4];
+        int32_t v8_int32[8];
+        int16_t v16_int16[16];
+        int8_t v32_int8[32];
+        double v4_double[4];
+        float v8_float[8];
+      };
+#endif
+
+      struct type *t;
+
+      t = arch_composite_type (gdbarch,
+			       "__gdb_builtin_type_vec256i", TYPE_CODE_UNION);
+      append_composite_type_field (t, "v8_float",
+				   init_vector_type (bt->builtin_float, 8));
+      append_composite_type_field (t, "v4_double",
+				   init_vector_type (bt->builtin_double, 4));
+      append_composite_type_field (t, "v32_int8",
+				   init_vector_type (bt->builtin_int8, 32));
+      append_composite_type_field (t, "v16_int16",
+				   init_vector_type (bt->builtin_int16, 16));
+      append_composite_type_field (t, "v8_int32",
+				   init_vector_type (bt->builtin_int32, 8));
+      append_composite_type_field (t, "v4_int64",
+				   init_vector_type (bt->builtin_int64, 4));
+      append_composite_type_field (t, "v2_int128",
+				   init_vector_type (bt->builtin_int128, 2));
+
+      TYPE_VECTOR (t) = 1;
+      TYPE_NAME (t) = "builtin_type_vec256i";
+      tdep->i386_ymm_type = t;
+    }
+
+  return tdep->i386_ymm_type;
+}
+
 /* Construct vector type for MMX registers.  */
 static struct type *
 i386_mmx_type (struct gdbarch *gdbarch)
@@ -2233,6 +2350,8 @@ i386_pseudo_register_type (struct gdbarch *gdbarch, int regnum)
 {
   if (i386_mmx_regnum_p (gdbarch, regnum))
     return i386_mmx_type (gdbarch);
+  else if (i386_ymm_regnum_p (gdbarch, regnum))
+    return i386_ymm_type (gdbarch);
   else
     {
       const struct builtin_type *bt = builtin_type (gdbarch);
@@ -2284,7 +2403,22 @@ i386_pseudo_register_read (struct gdbarch *gdbarch, struct regcache *regcache,
     {
       struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
 
-      if (i386_word_regnum_p (gdbarch, regnum))
+      if (i386_ymm_regnum_p (gdbarch, regnum))
+	{
+	  regnum -= tdep->ymm0_regnum;
+
+	  /* Extract (always little endian).  Read lower 16 bytes.  */
+	  regcache_raw_read (regcache,
+			     I387_XMM0_REGNUM (tdep) + regnum,
+			     raw_buf);
+	  memcpy (buf, raw_buf, 16);
+	  /* Read upper 16 bytes.  */
+	  regcache_raw_read (regcache,
+			     tdep->ymm0h_regnum + regnum,
+			     raw_buf);
+	  memcpy (buf + 16, raw_buf, 16);
+	}
+      else if (i386_word_regnum_p (gdbarch, regnum))
 	{
 	  int gpnum = regnum - tdep->ax_regnum;
 
@@ -2333,7 +2467,20 @@ i386_pseudo_register_write (struct gdbarch *gdbarch, struct regcache *regcache,
     {
       struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
 
-      if (i386_word_regnum_p (gdbarch, regnum))
+      if (i386_ymm_regnum_p (gdbarch, regnum))
+	{
+	  regnum -= tdep->ymm0_regnum;
+
+	  /* ... Write lower 16 bytes.  */
+	  regcache_raw_write (regcache,
+			     I387_XMM0_REGNUM (tdep) + regnum,
+			     buf);
+	  /* ... Write upper 16 bytes.  */
+	  regcache_raw_write (regcache,
+			     tdep->ymm0h_regnum + regnum,
+			     buf + 16);
+	}
+      else if (i386_word_regnum_p (gdbarch, regnum))
 	{
 	  int gpnum = regnum - tdep->ax_regnum;
 
@@ -2580,6 +2727,28 @@ i386_collect_fpregset (const struct regset *regset,
   i387_collect_fsave (regcache, regnum, fpregs);
 }
 
+/* Similar to i386_supply_fpregset, but use XSAVE extended state.  */
+
+static void
+i386_supply_xstateregset (const struct regset *regset,
+			  struct regcache *regcache, int regnum,
+			  const void *xstateregs, size_t len)
+{
+  const struct gdbarch_tdep *tdep = gdbarch_tdep (regset->arch);
+  i387_supply_xsave (regcache, regnum, xstateregs);
+}
+
+/* Similar to i386_collect_fpregset, but use XSAVE extended state.  */
+
+static void
+i386_collect_xstateregset (const struct regset *regset,
+			   const struct regcache *regcache,
+			   int regnum, void *xstateregs, size_t len)
+{
+  const struct gdbarch_tdep *tdep = gdbarch_tdep (regset->arch);
+  i387_collect_xsave (regcache, regnum, xstateregs, 1);
+}
+
 /* Return the appropriate register set for the core section identified
    by SECT_NAME and SECT_SIZE.  */
 
@@ -2607,6 +2776,16 @@ i386_regset_from_core_section (struct gdbarch *gdbarch,
       return tdep->fpregset;
     }
 
+  if (strcmp (sect_name, ".reg-xstate") == 0)
+    {
+      if (tdep->xstateregset == NULL)
+	tdep->xstateregset = regset_alloc (gdbarch,
+					   i386_supply_xstateregset,
+					   i386_collect_xstateregset);
+
+      return tdep->xstateregset;
+    }
+
   return NULL;
 }
 \f
@@ -2800,46 +2979,60 @@ int
 i386_register_reggroup_p (struct gdbarch *gdbarch, int regnum,
 			  struct reggroup *group)
 {
-  int sse_regnum_p, fp_regnum_p, mmx_regnum_p, byte_regnum_p,
-      word_regnum_p, dword_regnum_p;
+  const struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+  int fp_regnum_p, mmx_regnum_p, xmm_regnum_p, mxcsr_regnum_p,
+      ymm_regnum_p, ymmh_regnum_p;
 
   /* Don't include pseudo registers, except for MMX, in any register
      groups.  */
-  byte_regnum_p = i386_byte_regnum_p (gdbarch, regnum);
-  if (byte_regnum_p)
+  if (i386_byte_regnum_p (gdbarch, regnum))
     return 0;
 
-  word_regnum_p = i386_word_regnum_p (gdbarch, regnum);
-  if (word_regnum_p)
+  if (i386_word_regnum_p (gdbarch, regnum))
     return 0;
 
-  dword_regnum_p = i386_dword_regnum_p (gdbarch, regnum);
-  if (dword_regnum_p)
+  if (i386_dword_regnum_p (gdbarch, regnum))
     return 0;
 
   mmx_regnum_p = i386_mmx_regnum_p (gdbarch, regnum);
   if (group == i386_mmx_reggroup)
     return mmx_regnum_p;
 
-  sse_regnum_p = (i386_sse_regnum_p (gdbarch, regnum)
-		  || i386_mxcsr_regnum_p (gdbarch, regnum));
+  xmm_regnum_p = i386_xmm_regnum_p (gdbarch, regnum);
+  mxcsr_regnum_p = i386_mxcsr_regnum_p (gdbarch, regnum);
   if (group == i386_sse_reggroup)
-    return sse_regnum_p;
+    return xmm_regnum_p || mxcsr_regnum_p;
+
+  ymm_regnum_p = i386_ymm_regnum_p (gdbarch, regnum);
   if (group == vector_reggroup)
-    return mmx_regnum_p || sse_regnum_p;
+    return (mmx_regnum_p
+	    || ymm_regnum_p
+	    || mxcsr_regnum_p
+	    || (xmm_regnum_p
+		&& ((tdep->xcr0 & I386_XSTATE_AVX_MASK)
+		    == I386_XSTATE_SSE_MASK)));
 
   fp_regnum_p = (i386_fp_regnum_p (gdbarch, regnum)
 		 || i386_fpc_regnum_p (gdbarch, regnum));
   if (group == float_reggroup)
     return fp_regnum_p;
 
+  /* For "info reg all", don't include upper YMM registers nor XMM
+     registers when AVX is supported.  */
+  ymmh_regnum_p = i386_ymmh_regnum_p (gdbarch, regnum);
+  if (group == all_reggroup
+      && ((xmm_regnum_p
+	   && (tdep->xcr0 & bit_I386_XSTATE_AVX))
+	  || ymmh_regnum_p))
+    return 0;
+
   if (group == general_reggroup)
     return (!fp_regnum_p
 	    && !mmx_regnum_p
-	    && !sse_regnum_p
-	    && !byte_regnum_p
-	    && !word_regnum_p
-	    && !dword_regnum_p);
+	    && !mxcsr_regnum_p
+	    && !xmm_regnum_p
+	    && !ymm_regnum_p
+	    && !ymmh_regnum_p);
 
   return default_register_reggroup_p (gdbarch, regnum, group);
 }
@@ -5649,7 +5842,8 @@ i386_validate_tdesc_p (struct gdbarch_tdep *tdep,
 		       struct tdesc_arch_data *tdesc_data)
 {
   const struct target_desc *tdesc = tdep->tdesc;
-  const struct tdesc_feature *feature_core, *feature_vector;
+  const struct tdesc_feature *feature_core;
+  const struct tdesc_feature *feature_sse, *feature_avx;
   int i, num_regs, valid_p;
 
   if (! tdesc_has_registers (tdesc))
@@ -5659,13 +5853,37 @@ i386_validate_tdesc_p (struct gdbarch_tdep *tdep,
   feature_core = tdesc_find_feature (tdesc, "org.gnu.gdb.i386.core");
 
   /* Get SSE registers.  */
-  feature_vector = tdesc_find_feature (tdesc, "org.gnu.gdb.i386.sse");
+  feature_sse = tdesc_find_feature (tdesc, "org.gnu.gdb.i386.sse");
 
-  if (feature_core == NULL || feature_vector == NULL)
+  if (feature_core == NULL || feature_sse == NULL)
     return 0;
 
+  /* Try AVX registers.  */
+  feature_avx = tdesc_find_feature (tdesc, "org.gnu.gdb.i386.avx");
+
   valid_p = 1;
 
+  /* The XCR0 bits.  */
+  if (feature_avx)
+    {
+      tdep->xcr0 = I386_XSTATE_AVX_MASK;
+
+      /* It may have been set by ABI-specific code.  */
+      if (tdep->num_ymm_regs == 0)
+	{
+	  tdep->ymmh_register_names = i386_ymmh_names;
+	  tdep->num_ymm_regs = 8;
+	  tdep->ymm0h_regnum = I386_YMM0H_REGNUM;
+	}
+
+      for (i = 0; i < tdep->num_ymm_regs; i++)
+	valid_p &= tdesc_numbered_register (feature_avx, tdesc_data,
+					    tdep->ymm0h_regnum + i,
+					    tdep->ymmh_register_names[i]);
+    }
+  else
+    tdep->xcr0 = I386_XSTATE_SSE_MASK;
+
   num_regs = tdep->num_core_regs;
   for (i = 0; i < num_regs; i++)
     valid_p &= tdesc_numbered_register (feature_core, tdesc_data, i,
@@ -5674,7 +5892,7 @@ i386_validate_tdesc_p (struct gdbarch_tdep *tdep,
   /* Need to include %mxcsr, so add one.  */
   num_regs += tdep->num_xmm_regs + 1;
   for (; i < num_regs; i++)
-    valid_p &= tdesc_numbered_register (feature_vector, tdesc_data, i,
+    valid_p &= tdesc_numbered_register (feature_sse, tdesc_data, i,
 					tdep->register_names[i]);
 
   return valid_p;
@@ -5689,6 +5907,7 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   struct tdesc_arch_data *tdesc_data;
   const struct target_desc *tdesc;
   int mm0_regnum;
+  int ymm0_regnum;
 
   /* If there is already a candidate, use it.  */
   arches = gdbarch_list_lookup_by_info (arches, &info);
@@ -5709,6 +5928,8 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   tdep->fpregset = NULL;
   tdep->sizeof_fpregset = I387_SIZEOF_FSAVE;
 
+  tdep->xstateregset = NULL;
+
   /* The default settings include the FPU registers, the MMX registers
      and the SSE registers.  This can be overridden for a specific ABI
      by adjusting the members `st0_regnum', `mm0_regnum' and
@@ -5738,6 +5959,8 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   tdep->sc_pc_offset = -1;
   tdep->sc_sp_offset = -1;
 
+  tdep->xsave_xcr0_offset = -1;
+
   tdep->record_regmap = i386_record_regmap;
 
   /* The format used for `long double' on almost all i386 targets is
@@ -5854,9 +6077,13 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   set_tdesc_pseudo_register_type (gdbarch, i386_pseudo_register_type);
   set_tdesc_pseudo_register_name (gdbarch, i386_pseudo_register_name);
 
-  /* The default ABI includes general-purpose registers, 
-     floating-point registers, and the SSE registers.  */
-  set_gdbarch_num_regs (gdbarch, I386_SSE_NUM_REGS);
+  /* Override the normal target description method to make the AVX
+     upper halves anonymous.  */
+  set_gdbarch_register_name (gdbarch, i386_register_name);
+
+  /* The default ABI includes general-purpose registers, floating-point
+     registers, the SSE registers and the upper AVX registers.  */
+  set_gdbarch_num_regs (gdbarch, I386_AVX_NUM_REGS);
 
   /* Get the x86 target description from INFO.  */
   tdesc = info.target_desc;
@@ -5867,10 +6094,15 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   tdep->num_core_regs = I386_NUM_GREGS + I387_NUM_REGS;
   tdep->register_names = i386_register_names;
 
+  /* No upper YMM registers.  */
+  tdep->ymmh_register_names = NULL;
+  tdep->ymm0h_regnum = -1;
+
   tdep->num_byte_regs = 8;
   tdep->num_word_regs = 8;
   tdep->num_dword_regs = 0;
   tdep->num_mmx_regs = 8;
+  tdep->num_ymm_regs = 0;
 
   tdesc_data = tdesc_data_alloc ();
 
@@ -5878,24 +6110,25 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   info.tdep_info = (void *) tdesc_data;
   gdbarch_init_osabi (info, gdbarch);
 
+  if (!i386_validate_tdesc_p (tdep, tdesc_data))
+    {
+      tdesc_data_cleanup (tdesc_data);
+      xfree (tdep);
+      gdbarch_free (gdbarch);
+      return NULL;
+    }
+
   /* Wire in pseudo registers.  Number of pseudo registers may be
      changed.  */
   set_gdbarch_num_pseudo_regs (gdbarch, (tdep->num_byte_regs
 					 + tdep->num_word_regs
 					 + tdep->num_dword_regs
-					 + tdep->num_mmx_regs));
+					 + tdep->num_mmx_regs
+					 + tdep->num_ymm_regs));
 
   /* Target description may be changed.  */
   tdesc = tdep->tdesc;
 
-  if (!i386_validate_tdesc_p (tdep, tdesc_data))
-    {
-      tdesc_data_cleanup (tdesc_data);
-      xfree (tdep);
-      gdbarch_free (gdbarch);
-      return NULL;
-    }
-
   tdesc_use_registers (gdbarch, tdesc, tdesc_data);
 
   /* Override gdbarch_register_reggroup_p set in tdesc_use_registers.  */
@@ -5905,16 +6138,26 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   tdep->al_regnum = gdbarch_num_regs (gdbarch);
   tdep->ax_regnum = tdep->al_regnum + tdep->num_byte_regs;
 
-  mm0_regnum = tdep->ax_regnum + tdep->num_word_regs;
+  ymm0_regnum = tdep->ax_regnum + tdep->num_word_regs;
   if (tdep->num_dword_regs)
     {
       /* Support dword pseudo-registesr if it hasn't been disabled,  */
-      tdep->eax_regnum = mm0_regnum;
-      mm0_regnum = tdep->eax_regnum + tdep->num_dword_regs;
+      tdep->eax_regnum = ymm0_regnum;
+      ymm0_regnum += tdep->num_dword_regs;
     }
   else
     tdep->eax_regnum = -1;
 
+  mm0_regnum = ymm0_regnum;
+  if (tdep->num_ymm_regs)
+    {
+      /* Support YMM pseudo-registers if they are available.  */
+      tdep->ymm0_regnum = ymm0_regnum;
+      mm0_regnum += tdep->num_ymm_regs;
+    }
+  else
+    tdep->ymm0_regnum = -1;
+
   if (tdep->num_mmx_regs != 0)
     {
       /* Support MMX pseudo-registesr if MMX hasn't been disabled,  */
@@ -5940,6 +6183,9 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   set_gdbarch_fast_tracepoint_valid_at (gdbarch,
 					i386_fast_tracepoint_valid_at);
 
+  /* Tell remote stub that we support XML target description.  */
+  set_gdbarch_qsupported (gdbarch, "x86=xml");
+
   return gdbarch;
 }
 
@@ -5997,4 +6243,5 @@ is \"default\"."),
 
   /* Initialize the standard target descriptions.  */
   initialize_tdesc_i386 ();
+  initialize_tdesc_i386_avx ();
 }
diff --git a/gdb/i386-tdep.h b/gdb/i386-tdep.h
index 72c634e..1ce9d8c 100644
--- a/gdb/i386-tdep.h
+++ b/gdb/i386-tdep.h
@@ -109,6 +109,9 @@ struct gdbarch_tdep
   struct regset *fpregset;
   size_t sizeof_fpregset;
 
+  /* XSAVE extended state.  */
+  struct regset *xstateregset;
+
   /* Register number for %st(0).  The register numbers for the other
      registers follow from this one.  Set this to -1 to indicate the
      absence of an FPU.  */
@@ -121,6 +124,13 @@ struct gdbarch_tdep
      of MMX support.  */
   int mm0_regnum;
 
+  /* Number of pseudo YMM registers.  */
+  int num_ymm_regs;
+
+  /* Register number for %ymm0.  Set this to -1 to indicate the absence
+     of pseudo YMM register support.  */
+  int ymm0_regnum;
+
   /* Number of byte registers.  */
   int num_byte_regs;
 
@@ -146,9 +156,24 @@ struct gdbarch_tdep
   /* Number of SSE registers.  */
   int num_xmm_regs;
 
+  /* Bits of the extended control register 0 (the XFEATURE_ENABLED_MASK
+     register), excluding the x87 bit, which are supported by this gdb.
+   */
+  unsigned long long xcr0;
+
+  /* Offset of XCR0 in XSAVE extended state.  */
+  int xsave_xcr0_offset;
+
   /* Register names.  */
   const char **register_names;
 
+  /* Register number for %ymm0h.  Set this to -1 to indicate the absence
+     of upper YMM register support.  */
+  int ymm0h_regnum;
+
+  /* Upper YMM register names.  Only used for tdesc_numbered_register.  */
+  const char **ymmh_register_names;
+
   /* Target description.  */
   const struct target_desc *tdesc;
 
@@ -182,6 +207,7 @@ struct gdbarch_tdep
 
   /* ISA-specific data types.  */
   struct type *i386_mmx_type;
+  struct type *i386_ymm_type;
   struct type *i387_ext_type;
 
   /* Process record/replay target.  */
@@ -228,7 +254,9 @@ enum i386_regnum
   I386_FS_REGNUM,		/* %fs */
   I386_GS_REGNUM,		/* %gs */
   I386_ST0_REGNUM,		/* %st(0) */
-  I386_MXCSR_REGNUM = 40	/* %mxcsr */ 
+  I386_MXCSR_REGNUM = 40,	/* %mxcsr */ 
+  I386_YMM0H_REGNUM,		/* %ymm0h */
+  I386_YMM7H_REGNUM = I386_YMM0H_REGNUM + 7
 };
 
 /* Register numbers of RECORD_REGMAP.  */
@@ -265,6 +293,7 @@ enum record_i386_regnum
 #define I386_NUM_XREGS  9
 
 #define I386_SSE_NUM_REGS	(I386_MXCSR_REGNUM + 1)
+#define I386_AVX_NUM_REGS	(I386_YMM7H_REGNUM + 1)
 
 /* Size of the largest register.  */
 #define I386_MAX_REGISTER_SIZE	16
@@ -276,6 +305,9 @@ extern struct type *i387_ext_type (struct gdbarch *gdbarch);
 extern int i386_byte_regnum_p (struct gdbarch *gdbarch, int regnum);
 extern int i386_word_regnum_p (struct gdbarch *gdbarch, int regnum);
 extern int i386_dword_regnum_p (struct gdbarch *gdbarch, int regnum);
+extern int i386_xmm_regnum_p (struct gdbarch *gdbarch, int regnum);
+extern int i386_ymm_regnum_p (struct gdbarch *gdbarch, int regnum);
+extern int i386_ymmh_regnum_p (struct gdbarch *gdbarch, int regnum);
 
 extern const char *i386_pseudo_register_name (struct gdbarch *gdbarch,
 					      int regnum);


* Re: PATCH: 4/6 [2nd try]: Add AVX support (amd64 changes)
  2010-03-06 22:21     ` PATCH: 4/6 [2nd try]: Add AVX support (amd64 changes) H.J. Lu
@ 2010-03-07 21:33       ` H.J. Lu
  2010-03-12 17:01         ` H.J. Lu
  0 siblings, 1 reply; 115+ messages in thread
From: H.J. Lu @ 2010-03-07 21:33 UTC (permalink / raw)
  To: GDB

On Sat, Mar 06, 2010 at 02:21:22PM -0800, H.J. Lu wrote:
> Hi,
> 
> Here are the amd64 changes to support AVX.  OK to install?
> 

Here is the updated patch to change amd64_dwarf_reg_to_regnum to return
%ymmN register number for %xmmN if AVX is available.  Any comments?
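
As a rough, standalone illustration of that remapping (not the patch code itself): the sketch below assumes a hypothetical layout in which %xmm0 is raw register 40 and there are 16 XMM registers. The names XMM0_REGNUM, NUM_XMM_REGS, is_xmm_regnum and remap_xmm_to_ymm are made up for the example; in the patch the same arithmetic uses tdep->ymm0_regnum, i386_xmm_regnum_p and I387_XMM0_REGNUM.

```c
/* Hypothetical register layout, for illustration only.  */
enum { XMM0_REGNUM = 40, NUM_XMM_REGS = 16 };

/* Return nonzero if REGNUM is one of the raw XMM registers.  */
static int
is_xmm_regnum (int regnum)
{
  return regnum >= XMM0_REGNUM && regnum < XMM0_REGNUM + NUM_XMM_REGS;
}

/* If pseudo YMM registers exist (YMM0_REGNUM >= 0) and REGNUM is a raw
   XMM register, return the matching pseudo %ymmN number.  Otherwise
   return REGNUM unchanged.  */
static int
remap_xmm_to_ymm (int regnum, int ymm0_regnum)
{
  if (ymm0_regnum >= 0 && is_xmm_regnum (regnum))
    regnum += ymm0_regnum - XMM0_REGNUM;
  return regnum;
}
```

With ymm0_regnum set to -1 (no AVX), every register number passes through untouched, which matches the backward-compatible behaviour described for the series.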

Thanks.


H.J.
----
2010-03-07  H.J. Lu  <hongjiu.lu@intel.com>

	* amd64-linux-nat.c: Include "regset.h", "elf/common.h" and
	<sys/uio.h>.
	(xstate_size): New.
	(xstate_size_n_of_int64): Likewise.
	(have_ptrace_getregset): Likewise.
	(amd64_linux_gregset64_reg_offset): Include 16 upper YMM
	registers.
	(amd64_linux_gregset32_reg_offset): Include 8 upper YMM
	registers.
	(amd64_linux_fetch_inferior_registers): Support PTRACE_GETREGSET.
	(amd64_linux_store_inferior_registers): Likewise.
	(amd64_linux_read_description): Check and enable AVX target
	descriptions.

	* amd64-linux-tdep.c: Include "regset.h", "i386-linux-tdep.h"
	and "features/i386/amd64-avx-linux.c".
	(amd64_linux_regset_sections): New.
	(amd64_linux_core_read_description): Check and enable AVX
	target description.
	(amd64_linux_init_abi): Set xsave_xcr0_offset.  Call
	set_gdbarch_core_regset_sections.
	(_initialize_amd64_linux_tdep): Call
	initialize_tdesc_amd64_avx_linux.

	* amd64-linux-tdep.h (AMD64_LINUX_ORIG_RAX_REGNUM): Replace
	AMD64_MXCSR_REGNUM with AMD64_YMM15H_REGNUM.
	(tdesc_amd64_avx_linux): New.
	(amd64_linux_regset_sections): Likewise.

	* amd64-tdep.c: Include "features/i386/amd64-avx.c".
	(amd64_ymm_names): New.
	(amd64_ymmh_names): Likewise.
	(amd64_register_name): Likewise.
	(amd64_supply_xstateregset): Likewise.
	(amd64_collect_xstateregset): Likewise.
	(amd64_supply_xsave): Likewise.
	(amd64_collect_xsave): Likewise.
	(AMD64_NUM_REGS): Removed.
	(amd64_dwarf_reg_to_regnum): Return %ymmN register number for
	%xmmN if AVX is available.
	(amd64_pseudo_register_name): Support pseudo YMM registers.
	(amd64_regset_from_core_section): Support .reg-xstate section.
	(amd64_init_abi): Set ymmh_register_names, num_ymm_regs
	and ymm0h_regnum.  Call set_gdbarch_register_name.
	(_initialize_amd64_tdep): Call initialize_tdesc_amd64_avx.

	* amd64-tdep.h (amd64_regnum): Add AMD64_YMM0H_REGNUM and
	AMD64_YMM15H_REGNUM.
	(AMD64_NUM_REGS): New.
	(amd64_supply_xsave): Likewise.
	(amd64_collect_xsave): Likewise.
	(amd64_register_name): Removed.
	(amd64_register_type): Likewise.

diff --git a/gdb/amd64-linux-nat.c b/gdb/amd64-linux-nat.c
index b9d5833..4af1112 100644
--- a/gdb/amd64-linux-nat.c
+++ b/gdb/amd64-linux-nat.c
@@ -23,11 +23,14 @@
 #include "inferior.h"
 #include "gdbcore.h"
 #include "regcache.h"
+#include "regset.h"
 #include "linux-nat.h"
 #include "amd64-linux-tdep.h"
 
 #include "gdb_assert.h"
 #include "gdb_string.h"
+#include "elf/common.h"
+#include <sys/uio.h>
 #include <sys/ptrace.h>
 #include <sys/debugreg.h>
 #include <sys/syscall.h>
@@ -52,6 +55,16 @@
 #include "amd64-nat.h"
 #include "i386-nat.h"
 
+/* The extended state size in bytes.  */
+static unsigned int xstate_size;
+
+/* The extended state size in unit of int64.  We use array of int64 for
+   better alignment.  */
+static unsigned int xstate_size_n_of_int64;
+
+/* Does the current host support PTRACE_GETREGSET?  */
+static int have_ptrace_getregset = -1;
+
 /* Mapping between the general-purpose registers in GNU/Linux x86-64
    `struct user' format and GDB's register cache layout.  */
 
@@ -73,6 +86,8 @@ static int amd64_linux_gregset64_reg_offset[] =
   -1, -1, -1, -1, -1, -1, -1, -1,
   -1, -1, -1, -1, -1, -1, -1, -1,
   -1, -1, -1, -1, -1, -1, -1, -1, -1,
+  -1, -1, -1, -1, -1, -1, -1, -1,
+  -1, -1, -1, -1, -1, -1, -1, -1,
   ORIG_RAX * 8
 };
 \f
@@ -99,6 +114,7 @@ static int amd64_linux_gregset32_reg_offset[] =
   -1, -1, -1, -1, -1, -1, -1, -1,
   -1, -1, -1, -1, -1, -1, -1, -1,
   -1, -1, -1, -1, -1, -1, -1, -1, -1,
+  -1, -1, -1, -1, -1, -1, -1, -1,
   ORIG_RAX * 8			/* "orig_eax" */
 };
 \f
@@ -183,10 +199,26 @@ amd64_linux_fetch_inferior_registers (struct target_ops *ops,
     {
       elf_fpregset_t fpregs;
 
-      if (ptrace (PTRACE_GETFPREGS, tid, 0, (long) &fpregs) < 0)
-	perror_with_name (_("Couldn't get floating point status"));
+      if (have_ptrace_getregset)
+	{
+	  unsigned long long xstateregs[xstate_size_n_of_int64];
+	  struct iovec iov;
+
+	  iov.iov_base = xstateregs;
+	  iov.iov_len = xstate_size;
+	  if (ptrace (PTRACE_GETREGSET, tid,
+		      (unsigned int) NT_X86_XSTATE, (long) &iov) < 0)
+	    perror_with_name (_("Couldn't get extended state status"));
 
-      amd64_supply_fxsave (regcache, -1, &fpregs);
+	  amd64_supply_xsave (regcache, -1, xstateregs);
+	}
+      else
+	{
+	  if (ptrace (PTRACE_GETFPREGS, tid, 0, (long) &fpregs) < 0)
+	    perror_with_name (_("Couldn't get floating point status"));
+
+	  amd64_supply_fxsave (regcache, -1, &fpregs);
+	}
     }
 }
 
@@ -226,15 +258,33 @@ amd64_linux_store_inferior_registers (struct target_ops *ops,
     {
       elf_fpregset_t fpregs;
 
-      if (ptrace (PTRACE_GETFPREGS, tid, 0, (long) &fpregs) < 0)
-	perror_with_name (_("Couldn't get floating point status"));
+      if (have_ptrace_getregset)
+	{
+	  unsigned long long xstateregs[xstate_size_n_of_int64];
+	  struct iovec iov;
+
+	  iov.iov_base = xstateregs;
+	  iov.iov_len = xstate_size;
+	  if (ptrace (PTRACE_GETREGSET, tid,
+		      (unsigned int) NT_X86_XSTATE, (long) &iov) < 0)
+	    perror_with_name (_("Couldn't get extended state status"));
 
-      amd64_collect_fxsave (regcache, regnum, &fpregs);
+	  amd64_collect_xsave (regcache, regnum, xstateregs, 0);
+
+	  if (ptrace (PTRACE_SETREGSET, tid,
+		      (unsigned int) NT_X86_XSTATE, (long) &iov) < 0)
+	    perror_with_name (_("Couldn't write extended state status"));
+	}
+      else
+	{
+	  if (ptrace (PTRACE_GETFPREGS, tid, 0, (long) &fpregs) < 0)
+	    perror_with_name (_("Couldn't get floating point status"));
 
-      if (ptrace (PTRACE_SETFPREGS, tid, 0, (long) &fpregs) < 0)
-	perror_with_name (_("Couldn't write floating point status"));
+	  amd64_collect_fxsave (regcache, regnum, &fpregs);
 
-      return;
+	  if (ptrace (PTRACE_SETFPREGS, tid, 0, (long) &fpregs) < 0)
+	    perror_with_name (_("Couldn't write floating point status"));
+	}
     }
 }
 \f
@@ -688,6 +738,8 @@ amd64_linux_read_description (struct target_ops *ops)
 {
   unsigned long cs;
   int tid;
+  int is_64bit;
+  static unsigned long long xcr0;
 
   /* GNU/Linux LWP ID's are process ID's.  */
   tid = TIDGET (inferior_ptid);
@@ -701,10 +753,53 @@ amd64_linux_read_description (struct target_ops *ops)
   if (errno != 0)
     perror_with_name (_("Couldn't get CS register"));
 
-  if (cs == AMD64_LINUX_USER64_CS)
-    return tdesc_amd64_linux;
+  is_64bit = cs == AMD64_LINUX_USER64_CS;
+
+  if (have_ptrace_getregset == -1)
+    {
+      unsigned long long xstateregs[(I386_XSTATE_SSE_SIZE
+				     / sizeof (long long))];
+      struct iovec iov;
+
+      iov.iov_base = xstateregs;
+      iov.iov_len = I386_XSTATE_SSE_SIZE;
+
+      /* Check if PTRACE_GETREGSET works.  */
+      if (ptrace (PTRACE_GETREGSET, tid,
+		  (unsigned int) NT_X86_XSTATE, (long) &iov) < 0)
+	have_ptrace_getregset = 0;
+      else
+	{
+	  have_ptrace_getregset = 1;
+
+	  /* Get XCR0 from XSAVE extended state.  */
+	  xcr0 = xstateregs[(I386_LINUX_XSAVE_XCR0_OFFSET
+			     / sizeof (long long))];
+
+	  xstate_size = I386_XSTATE_SIZE (xcr0);
+	  xstate_size_n_of_int64 = xstate_size / sizeof (long long);
+	}
+
+      i386_linux_update_xstateregset (amd64_linux_regset_sections,
+				      xstate_size);
+    }
+
+  /* Check the native XCR0 only if PTRACE_GETREGSET is available.  */
+  if (have_ptrace_getregset
+      && (xcr0 & I386_XSTATE_AVX_MASK) == I386_XSTATE_AVX_MASK)
+    {
+      if (is_64bit)
+	return tdesc_amd64_avx_linux;
+      else
+	return tdesc_i386_avx_linux;
+    }
   else
-    return tdesc_i386_linux;
+    {
+      if (is_64bit)
+	return tdesc_amd64_linux;
+      else
+	return tdesc_i386_linux;
+    }
 }
 
 /* Provide a prototype to silence -Wmissing-prototypes.  */
diff --git a/gdb/amd64-linux-tdep.c b/gdb/amd64-linux-tdep.c
index 4ad6dc9..51722bf 100644
--- a/gdb/amd64-linux-tdep.c
+++ b/gdb/amd64-linux-tdep.c
@@ -28,7 +28,9 @@
 #include "symtab.h"
 #include "gdbtypes.h"
 #include "reggroups.h"
+#include "regset.h"
 #include "amd64-linux-tdep.h"
+#include "i386-linux-tdep.h"
 #include "linux-tdep.h"
 
 #include "gdb_string.h"
@@ -38,6 +40,7 @@
 #include "xml-syscall.h"
 
 #include "features/i386/amd64-linux.c"
+#include "features/i386/amd64-avx-linux.c"
 
 /* The syscall's XML filename for i386.  */
 #define XML_SYSCALL_FILENAME_AMD64 "syscalls/amd64-linux.xml"
@@ -45,6 +48,15 @@
 #include "record.h"
 #include "linux-record.h"
 
+/* Supported register note sections.  */
+struct core_regset_section amd64_linux_regset_sections[] =
+{
+  { ".reg", 144, "general-purpose" },
+  { ".reg2", 512, "floating-point" },
+  { ".reg-xstate", 0, "XSAVE extended state" },
+  { NULL, 0 }
+};
+
 /* Mapping between the general-purpose registers in `struct user'
    format and GDB's register cache layout.  */
 
@@ -1250,12 +1262,17 @@ amd64_linux_core_read_description (struct gdbarch *gdbarch,
 				  bfd *abfd)
 {
   asection *section = bfd_get_section_by_name (abfd, ".reg2");
+  unsigned long long xcr0;
 
   if (section == NULL)
     return NULL;
 
   /* Linux/x86-64.  */
-  return tdesc_amd64_linux;
+  xcr0 = i386_linux_core_read_xcr0 (gdbarch, target, abfd);
+  if ((xcr0 & I386_XSTATE_AVX_MASK) == I386_XSTATE_AVX_MASK)
+    return tdesc_amd64_avx_linux;
+  else
+    return tdesc_amd64_linux;
 }
 
 static void
@@ -1297,6 +1314,8 @@ amd64_linux_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch)
   tdep->sc_reg_offset = amd64_linux_sc_reg_offset;
   tdep->sc_num_regs = ARRAY_SIZE (amd64_linux_sc_reg_offset);
 
+  tdep->xsave_xcr0_offset = I386_LINUX_XSAVE_XCR0_OFFSET;
+
   /* GNU/Linux uses SVR4-style shared libraries.  */
   set_solib_svr4_fetch_link_map_offsets
     (gdbarch, svr4_lp64_fetch_link_map_offsets);
@@ -1318,6 +1337,9 @@ amd64_linux_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch)
   /* GNU/Linux uses SVR4-style shared libraries.  */
   set_gdbarch_skip_trampoline_code (gdbarch, find_solib_trampoline_target);
 
+  /* Install supported register note sections.  */
+  set_gdbarch_core_regset_sections (gdbarch, amd64_linux_regset_sections);
+
   set_gdbarch_core_read_description (gdbarch,
 				     amd64_linux_core_read_description);
 
@@ -1517,4 +1539,5 @@ _initialize_amd64_linux_tdep (void)
 
   /* Initialize the Linux target description  */
   initialize_tdesc_amd64_linux ();
+  initialize_tdesc_amd64_avx_linux ();
 }
diff --git a/gdb/amd64-linux-tdep.h b/gdb/amd64-linux-tdep.h
index 33316fb..734f117 100644
--- a/gdb/amd64-linux-tdep.h
+++ b/gdb/amd64-linux-tdep.h
@@ -26,13 +26,17 @@
 /* Register number for the "orig_rax" register.  If this register
    contains a value >= 0 it is interpreted as the system call number
    that the kernel is supposed to restart.  */
-#define AMD64_LINUX_ORIG_RAX_REGNUM (AMD64_MXCSR_REGNUM + 1)
+#define AMD64_LINUX_ORIG_RAX_REGNUM (AMD64_YMM15H_REGNUM + 1)
 
 /* Total number of registers for GNU/Linux.  */
 #define AMD64_LINUX_NUM_REGS (AMD64_LINUX_ORIG_RAX_REGNUM + 1)
 
 /* Linux target description.  */
 extern struct target_desc *tdesc_amd64_linux;
+extern struct target_desc *tdesc_amd64_avx_linux;
+
+/* Supported register note sections.  */
+extern struct core_regset_section amd64_linux_regset_sections[];
 
 /* Enum that defines the syscall identifiers for amd64 linux.
    Used for process record/replay, these will be translated into
diff --git a/gdb/amd64-tdep.c b/gdb/amd64-tdep.c
index 8c41a8a..2f6e725 100644
--- a/gdb/amd64-tdep.c
+++ b/gdb/amd64-tdep.c
@@ -43,6 +43,7 @@
 #include "i387-tdep.h"
 
 #include "features/i386/amd64.c"
+#include "features/i386/amd64-avx.c"
 
 /* Note that the AMD64 architecture was previously known as x86-64.
    The latter is (forever) engraved into the canonical system name as
@@ -71,8 +72,21 @@ static const char *amd64_register_names[] =
   "mxcsr",
 };
 
-/* Total number of registers.  */
-#define AMD64_NUM_REGS	ARRAY_SIZE (amd64_register_names)
+static const char *amd64_ymm_names[] = 
+{
+  "ymm0", "ymm1", "ymm2", "ymm3",
+  "ymm4", "ymm5", "ymm6", "ymm7",
+  "ymm8", "ymm9", "ymm10", "ymm11",
+  "ymm12", "ymm13", "ymm14", "ymm15"
+};
+
+static const char *amd64_ymmh_names[] = 
+{
+  "ymm0h", "ymm1h", "ymm2h", "ymm3h",
+  "ymm4h", "ymm5h", "ymm6h", "ymm7h",
+  "ymm8h", "ymm9h", "ymm10h", "ymm11h",
+  "ymm12h", "ymm13h", "ymm14h", "ymm15h"
+};
 
 /* The registers used to pass integer arguments during a function call.  */
 static int amd64_dummy_call_integer_regs[] =
@@ -163,6 +177,8 @@ static const int amd64_dwarf_regmap_len =
 static int
 amd64_dwarf_reg_to_regnum (struct gdbarch *gdbarch, int reg)
 {
+  struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+  int ymm0_regnum = tdep->ymm0_regnum;
   int regnum = -1;
 
   if (reg >= 0 && reg < amd64_dwarf_regmap_len)
@@ -170,6 +186,9 @@ amd64_dwarf_reg_to_regnum (struct gdbarch *gdbarch, int reg)
 
   if (regnum == -1)
     warning (_("Unmapped DWARF Register #%d encountered."), reg);
+  else if (ymm0_regnum >= 0
+	   && i386_xmm_regnum_p (gdbarch, regnum))
+    regnum += ymm0_regnum - I387_XMM0_REGNUM (tdep);
 
   return regnum;
 }
@@ -234,6 +253,19 @@ static const char *amd64_dword_names[] =
   "r8d", "r9d", "r10d", "r11d", "r12d", "r13d", "r14d", "r15d"
 };
 
+/* Return the name of register REGNUM, or the empty string if it is
+   an anonymous register. */
+
+static const char *
+amd64_register_name (struct gdbarch *gdbarch, int regnum)
+{
+  /* Hide the upper YMM registers.  */
+  if (i386_ymmh_regnum_p (gdbarch, regnum))
+    return "";
+
+  return tdesc_register_name (gdbarch, regnum);
+}
+
 /* Return the name of register REGNUM.  */
 
 static const char *
@@ -242,6 +274,8 @@ amd64_pseudo_register_name (struct gdbarch *gdbarch, int regnum)
   struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
   if (i386_byte_regnum_p (gdbarch, regnum))
     return amd64_byte_names[regnum - tdep->al_regnum];
+  else if (i386_ymm_regnum_p (gdbarch, regnum))
+    return amd64_ymm_names[regnum - tdep->ymm0_regnum];
   else if (i386_word_regnum_p (gdbarch, regnum))
     return amd64_word_names[regnum - tdep->ax_regnum];
   else if (i386_dword_regnum_p (gdbarch, regnum))
@@ -2148,6 +2182,28 @@ amd64_collect_fpregset (const struct regset *regset,
   amd64_collect_fxsave (regcache, regnum, fpregs);
 }
 
+/* Similar to amd64_supply_fpregset, but use XSAVE extended state.  */
+
+static void
+amd64_supply_xstateregset (const struct regset *regset,
+			   struct regcache *regcache, int regnum,
+			   const void *xstateregs, size_t len)
+{
+  const struct gdbarch_tdep *tdep = gdbarch_tdep (regset->arch);
+  amd64_supply_xsave (regcache, regnum, xstateregs);
+}
+
+/* Similar to amd64_collect_fpregset, but use XSAVE extended state.  */
+
+static void
+amd64_collect_xstateregset (const struct regset *regset,
+			    const struct regcache *regcache,
+			    int regnum, void *xstateregs, size_t len)
+{
+  const struct gdbarch_tdep *tdep = gdbarch_tdep (regset->arch);
+  amd64_collect_xsave (regcache, regnum, xstateregs, 1);
+}
+
 /* Return the appropriate register set for the core section identified
    by SECT_NAME and SECT_SIZE.  */
 
@@ -2166,6 +2222,16 @@ amd64_regset_from_core_section (struct gdbarch *gdbarch,
       return tdep->fpregset;
     }
 
+  if (strcmp (sect_name, ".reg-xstate") == 0)
+    {
+      if (tdep->xstateregset == NULL)
+	tdep->xstateregset = regset_alloc (gdbarch,
+					   amd64_supply_xstateregset,
+					   amd64_collect_xstateregset);
+
+      return tdep->xstateregset;
+    }
+
   return i386_regset_from_core_section (gdbarch, sect_name, sect_size);
 }
 \f
@@ -2228,6 +2294,13 @@ amd64_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch)
   tdep->num_core_regs = AMD64_NUM_GREGS + I387_NUM_REGS;
   tdep->register_names = amd64_register_names;
 
+  if (tdesc_find_feature (tdesc, "org.gnu.gdb.i386.avx") != NULL)
+    {
+      tdep->ymmh_register_names = amd64_ymmh_names;
+      tdep->num_ymm_regs = 16;
+      tdep->ymm0h_regnum = AMD64_YMM0H_REGNUM;
+    }
+
   tdep->num_byte_regs = 16;
   tdep->num_word_regs = 16;
   tdep->num_dword_regs = 16;
@@ -2241,6 +2314,8 @@ amd64_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch)
 
   set_tdesc_pseudo_register_name (gdbarch, amd64_pseudo_register_name);
 
+  set_gdbarch_register_name (gdbarch, amd64_register_name);
+
   /* AMD64 has an FPU and 16 SSE registers.  */
   tdep->st0_regnum = AMD64_ST0_REGNUM;
   tdep->num_xmm_regs = 16;
@@ -2321,6 +2396,7 @@ void
 _initialize_amd64_tdep (void)
 {
   initialize_tdesc_amd64 ();
+  initialize_tdesc_amd64_avx ();
 }
 \f
 
@@ -2356,6 +2432,30 @@ amd64_supply_fxsave (struct regcache *regcache, int regnum,
     }
 }
 
+/* Similar to amd64_supply_fxsave, but use XSAVE extended state.  */
+
+void
+amd64_supply_xsave (struct regcache *regcache, int regnum,
+		    const void *xsave)
+{
+  struct gdbarch *gdbarch = get_regcache_arch (regcache);
+  struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+
+  i387_supply_xsave (regcache, regnum, xsave);
+
+  if (xsave && gdbarch_ptr_bit (gdbarch) == 64)
+    {
+      const gdb_byte *regs = xsave;
+
+      if (regnum == -1 || regnum == I387_FISEG_REGNUM (tdep))
+	regcache_raw_supply (regcache, I387_FISEG_REGNUM (tdep),
+			     regs + 12);
+      if (regnum == -1 || regnum == I387_FOSEG_REGNUM (tdep))
+	regcache_raw_supply (regcache, I387_FOSEG_REGNUM (tdep),
+			     regs + 20);
+    }
+}
+
 /* Fill register REGNUM (if it is a floating-point or SSE register) in
    *FXSAVE with the value from REGCACHE.  If REGNUM is -1, do this for
    all registers.  This function doesn't touch any of the reserved
@@ -2379,3 +2479,26 @@ amd64_collect_fxsave (const struct regcache *regcache, int regnum,
 	regcache_raw_collect (regcache, I387_FOSEG_REGNUM (tdep), regs + 20);
     }
 }
+
+/* Similar to amd64_collect_fxsave, but use XSAVE extended state.  */
+
+void
+amd64_collect_xsave (const struct regcache *regcache, int regnum,
+		     void *xsave, int gcore)
+{
+  struct gdbarch *gdbarch = get_regcache_arch (regcache);
+  struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+  gdb_byte *regs = xsave;
+
+  i387_collect_xsave (regcache, regnum, xsave, gcore);
+
+  if (gdbarch_ptr_bit (gdbarch) == 64)
+    {
+      if (regnum == -1 || regnum == I387_FISEG_REGNUM (tdep))
+	regcache_raw_collect (regcache, I387_FISEG_REGNUM (tdep),
+			      regs + 12);
+      if (regnum == -1 || regnum == I387_FOSEG_REGNUM (tdep))
+	regcache_raw_collect (regcache, I387_FOSEG_REGNUM (tdep),
+			      regs + 20);
+    }
+}
diff --git a/gdb/amd64-tdep.h b/gdb/amd64-tdep.h
index 363479c..9f07dda 100644
--- a/gdb/amd64-tdep.h
+++ b/gdb/amd64-tdep.h
@@ -61,12 +61,16 @@ enum amd64_regnum
   AMD64_FSTAT_REGNUM = AMD64_ST0_REGNUM + 9,
   AMD64_XMM0_REGNUM = 40,	/* %xmm0 */
   AMD64_XMM1_REGNUM,		/* %xmm1 */
-  AMD64_MXCSR_REGNUM = AMD64_XMM0_REGNUM + 16
+  AMD64_MXCSR_REGNUM = AMD64_XMM0_REGNUM + 16,
+  AMD64_YMM0H_REGNUM,		/* %ymm0h */
+  AMD64_YMM15H_REGNUM = AMD64_YMM0H_REGNUM + 15
 };
 
 /* Number of general purpose registers.  */
 #define AMD64_NUM_GREGS		24
 
+#define AMD64_NUM_REGS		(AMD64_YMM15H_REGNUM + 1)
+
 extern struct displaced_step_closure *amd64_displaced_step_copy_insn
   (struct gdbarch *gdbarch, CORE_ADDR from, CORE_ADDR to,
    struct regcache *regs);
@@ -77,12 +81,6 @@ extern void amd64_displaced_step_fixup (struct gdbarch *gdbarch,
 
 extern void amd64_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch);
 
-/* Functions from amd64-tdep.c which may be needed on architectures
-   with extra registers.  */
-
-extern const char *amd64_register_name (struct gdbarch *gdbarch, int regnum);
-extern struct type *amd64_register_type (struct gdbarch *gdbarch, int regnum);
-
 /* Fill register REGNUM in REGCACHE with the appropriate
    floating-point or SSE register value from *FXSAVE.  If REGNUM is
    -1, do this for all registers.  This function masks off any of the
@@ -91,6 +89,10 @@ extern struct type *amd64_register_type (struct gdbarch *gdbarch, int regnum);
 extern void amd64_supply_fxsave (struct regcache *regcache, int regnum,
 				 const void *fxsave);
 
+/* Similar to amd64_supply_fxsave, but use XSAVE extended state.  */
+extern void amd64_supply_xsave (struct regcache *regcache, int regnum,
+				const void *xsave);
+
 /* Fill register REGNUM (if it is a floating-point or SSE register) in
    *FXSAVE with the value from REGCACHE.  If REGNUM is -1, do this for
    all registers.  This function doesn't touch any of the reserved
@@ -99,6 +101,10 @@ extern void amd64_supply_fxsave (struct regcache *regcache, int regnum,
 extern void amd64_collect_fxsave (const struct regcache *regcache, int regnum,
 				  void *fxsave);
 
+/* Similar to amd64_collect_fxsave, but use XSAVE extended state.  */
+extern void amd64_collect_xsave (const struct regcache *regcache,
+				 int regnum, void *xsave, int gcore);
+
 void amd64_classify (struct type *type, enum amd64_reg_class class[2]);
 
 \f


* Re: PATCH: 3/6 [2nd try]: Add AVX support (i386 changes)
  2010-03-07 21:32       ` H.J. Lu
@ 2010-03-11 22:37         ` Mark Kettenis
  2010-03-12  0:00           ` H.J. Lu
  0 siblings, 1 reply; 115+ messages in thread
From: Mark Kettenis @ 2010-03-11 22:37 UTC (permalink / raw)
  To: hjl.tools; +Cc: gdb-patches

> Date: Sun, 7 Mar 2010 13:31:53 -0800
> From: "H.J. Lu" <hongjiu.lu@intel.com>
> 
> On Sat, Mar 06, 2010 at 02:20:37PM -0800, H.J. Lu wrote:
> > Hi,
> > 
> > Here are i386 changes to support AVX. OK to install?
> >  
> 
> Here is the updated patch to change i386_dbx_reg_to_regnum to return
> %ymmN register number for %xmmN if AVX is available.  Any comments?

Can't find the time to review the complete diff.  So here's a partial
review.

The generic -tdep.c bits look reasonable, although I have a few nits.  I'm
less happy with the Linux -tdep.c and -nat.c bits though.

> 2010-03-07  H.J. Lu  <hongjiu.lu@intel.com>
> 
> 	* i386-linux-nat.c: Include "regset.h", "elf/common.h" and
> 	<sys/uio.h>.
> 	(xstate_size): New.
> 	(xstate_size_n_of_int64): Likewise.
> 	(fetch_xstateregs): Likewise.
> 	(store_xstateregs): Likewise.
> 	(GETXSTATEREGS_SUPPLIES): Likewise.
> 	(regmap): Include 8 upper YMM registers.
> 	(i386_linux_fetch_inferior_registers): Support XSAVE extended
> 	state.
> 	(i386_linux_store_inferior_registers): Likewise.
> 	(i386_linux_read_description): Check and enable AVX target
> 	descriptions.
> 
> 	* i386-linux-tdep.c: Include "regset.h", "i387-tdep.h",
> 	"i386-xstate.h" and "features/i386/i386-avx-linux.c".
> 	(i386_linux_regset_sections): Make it global.  Add
> 	".reg-xstate".
> 	(i386_linux_gregset_reg_offset): Include 8 upper YMM registers.
> 	(i386_linux_update_xstateregset): New.
> 	(i386_linux_core_read_xcr0): Likewise.
> 	(i386_linux_core_read_description): Check and enable AVX target
> 	description.
> 	(i386_linux_init_abi): Set xsave_xcr0_offset.
> 	(_initialize_i386_linux_tdep): Call
> 	initialize_tdesc_i386_avx_linux.
> 
> 	* i386-linux-tdep.h (I386_LINUX_ORIG_EAX_REGNUM): Replace
> 	I386_SSE_NUM_REGS with I386_AVX_NUM_REGS.
> 	(i386_linux_core_read_xcr0): New.
> 	(tdesc_i386_avx_linux): Likewise.
> 	(i386_linux_regset_sections): Likewise.
> 	(i386_linux_update_xstateregset): Likewise.
> 	(I386_LINUX_XSAVE_XCR0_OFFSET): Likewise.
> 
> 	* i386-tdep.c: Include "i386-xstate.h" and
> 	"features/i386/i386-avx.c".
> 	(i386_ymm_names): New.
> 	(i386_ymmh_names): Likewise.
> 	(i386_ymmh_regnum_p): Likewise.
> 	(i386_ymm_regnum_p): Likewise.
> 	(i386_xmm_regnum_p): Likewise.
> 	(i386_register_name): Likewise.
> 	(i386_ymm_type): Likewise.
> 	(i386_supply_xstateregset): Likewise.
> 	(i386_collect_xstateregset): Likewise.
> 	(i386_sse_regnum_p): Removed.
> 	(i386_pseudo_register_name): Support pseudo YMM registers.
> 	(i386_pseudo_register_type): Likewise.
> 	(i386_pseudo_register_read): Likewise.
> 	(i386_pseudo_register_write): Likewise.
> 	(i386_dbx_reg_to_regnum): Return %ymmN register number for
> 	%xmmN if AVX is available.
> 	(i386_regset_from_core_section): Support .reg-xstate section.
> 	(i386_register_reggroup_p): Support upper YMM and YMM registers.
> 	(i386_validate_tdesc_p): Support org.gnu.gdb.i386.avx feature.
> 	Set ymmh_register_names, num_ymm_regs, ymm0h_regnum and xcr0.
> 	(i386_gdbarch_init): Set xstateregset.  Set xsave_xcr0_offset. 
> 	Call set_gdbarch_register_name.  Replace I386_SSE_NUM_REGS with
> 	I386_AVX_NUM_REGS.  Set ymmh_register_names, ymm0h_regnum and
> 	num_ymm_regs.  Add num_ymm_regs to set_gdbarch_num_pseudo_regs.
> 	Set ymm0_regnum.  Call set_gdbarch_qsupported.
> 	(_initialize_i386_tdep): Call initialize_tdesc_i386_avx.
> 
> 	* i386-tdep.h (gdbarch_tdep): Add xstateregset, ymm0_regnum,
> 	xcr0, xsave_xcr0_offset, ymm0h_regnum, ymmh_register_names and
> 	i386_ymm_type.
> 	(i386_regnum): Add I386_YMM0H_REGNUM, and I386_YMM7H_REGNUM.
> 	(I386_AVX_NUM_REGS): New.
> 	(i386_xmm_regnum_p): Likewise.
> 	(i386_ymm_regnum_p): Likewise.
> 	(i386_ymmh_regnum_p): Likewise.
> 
> 	* common/i386-xstate.h: New.
> 	* config/i386/nm-linux-xstate.h: Likewise.
> 	* config/i386/nm-linux64.h: Likewise.
> 
> 	* config/i386/linux64.mh (NAT_FILE): Set to nm-linux64.h.
> 
> 	* config/i386/nm-linux.h: Include "config/i386/nm-linux-xstate.h".
> 
> diff --git a/gdb/common/i386-xstate.h b/gdb/common/i386-xstate.h
> new file mode 100644
> index 0000000..3548103
> --- /dev/null
> +++ b/gdb/common/i386-xstate.h
> @@ -0,0 +1,45 @@
> +/* Common code for i386 XSAVE extended state.
> +
> +   Copyright (C) 2010 Free Software Foundation, Inc.
> +
> +   This file is part of GDB.
> +
> +   This program is free software; you can redistribute it and/or modify
> +   it under the terms of the GNU General Public License as published by
> +   the Free Software Foundation; either version 3 of the License, or
> +   (at your option) any later version.
> +
> +   This program is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +   GNU General Public License for more details.
> +
> +   You should have received a copy of the GNU General Public License
> +   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
> +
> +#ifndef I386_XSTATE_H
> +#define I386_XSTATE_H 1
> +
> +/* The extended state feature bits.  */
> +#define bit_I386_XSTATE_X87		(1ULL << 0)
> +#define bit_I386_XSTATE_SSE		(1ULL << 1)
> +#define bit_I386_XSTATE_AVX		(1ULL << 2)

#define's should be uppercase; please drop the bit_-prefix
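
For concreteness, this is roughly what I have in mind. It is only a sketch: the exact spellings I386_XSTATE_X87, I386_XSTATE_SSE and I386_XSTATE_AVX are a suggestion, and the values and sizes are copied from the quoted header.

```c
/* The extended state feature bits, without the bit_ prefix.  */
#define I386_XSTATE_X87		(1ULL << 0)
#define I386_XSTATE_SSE		(1ULL << 1)
#define I386_XSTATE_AVX		(1ULL << 2)

/* Supported masks of the extended state.  */
#define I386_XSTATE_SSE_MASK	(I386_XSTATE_X87 | I386_XSTATE_SSE)
#define I386_XSTATE_AVX_MASK	(I386_XSTATE_SSE_MASK | I386_XSTATE_AVX)

/* Sizes in bytes, as given in the quoted header.  */
#define I386_XSTATE_SSE_SIZE	576
#define I386_XSTATE_AVX_SIZE	832

/* Size of the XSAVE extended state for a given XCR0 value.  */
#define I386_XSTATE_SIZE(XCR0)	\
  (((XCR0) & I386_XSTATE_AVX) != 0 \
   ? I386_XSTATE_AVX_SIZE : I386_XSTATE_SSE_SIZE)
```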

> +
> +/* Supported mask and size of the extended state.  */
> +#define I386_XSTATE_SSE_MASK	\
> +  (bit_I386_XSTATE_X87 | bit_I386_XSTATE_SSE)
> +#define I386_XSTATE_AVX_MASK	\
> +  (I386_XSTATE_SSE_MASK | bit_I386_XSTATE_AVX)
> +#define I386_XSTATE_MAX_MASK	\
> +  I386_XSTATE_AVX_MASK
> +
> +#define I386_XSTATE_SSE_SIZE		576
> +#define I386_XSTATE_AVX_SIZE		832
> +#define I386_XSTATE_MAX_SIZE		832
> +
> +/* Get I386 XSAVE extended state size.  */
> +#define I386_XSTATE_SIZE(XCR0)	\
> +  (((XCR0) & bit_I386_XSTATE_AVX) != 0 \
> +   ? I386_XSTATE_AVX_SIZE : I386_XSTATE_SSE_SIZE)
> +
> +#endif /* I386_XSTATE_H */


Please don't introduce new nm-xxx.h files.  We've been trying to get
rid of them for years and they shouldn't be necessary.

> diff --git a/gdb/config/i386/linux64.mh b/gdb/config/i386/linux64.mh
> index 19f3be0..99a5042 100644
> --- a/gdb/config/i386/linux64.mh
> +++ b/gdb/config/i386/linux64.mh
> @@ -2,7 +2,7 @@
>  NATDEPFILES= inf-ptrace.o fork-child.o \
>  	i386-nat.o amd64-nat.o amd64-linux-nat.o linux-nat.o \
>  	proc-service.o linux-thread-db.o linux-fork.o
> -NAT_FILE= config/nm-linux.h
> +NAT_FILE= nm-linux64.h
>  
>  # The dynamically loaded libthread_db needs access to symbols in the
>  # gdb executable.
> diff --git a/gdb/config/i386/nm-linux-xstate.h b/gdb/config/i386/nm-linux-xstate.h
> new file mode 100644
> index 0000000..0dbf9e5
> --- /dev/null
> +++ b/gdb/config/i386/nm-linux-xstate.h
> @@ -0,0 +1,33 @@
> +/* Native XSAVE extended state support for GNU/Linux x86.
> +
> +   Copyright 2010 Free Software Foundation, Inc.
> +
> +   This file is part of GDB.
> +
> +   This program is free software; you can redistribute it and/or modify
> +   it under the terms of the GNU General Public License as published by
> +   the Free Software Foundation; either version 3 of the License, or
> +   (at your option) any later version.
> +
> +   This program is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +   GNU General Public License for more details.
> +
> +   You should have received a copy of the GNU General Public License
> +   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
> +
> +#ifndef	NM_LINUX_XSTATE_H
> +#define	NM_LINUX_XSTATE_H
> +
> +#include "i386-xstate.h"
> +
> +#ifndef PTRACE_GETREGSET
> +#define PTRACE_GETREGSET	0x4204
> +#endif
> +
> +#ifndef PTRACE_SETREGSET
> +#define PTRACE_SETREGSET	0x4205
> +#endif
> +
> +#endif	/* NM_LINUX_XSTATE_H */

Do we really have to hardcode constants like this in GDB?  They should
be available through kernel/libc headers.  Are Drepper and Torvalds
still fighting over that issue?

> diff --git a/gdb/i386-linux-nat.c b/gdb/i386-linux-nat.c
> index 31b9086..344c814 100644
> --- a/gdb/i386-linux-nat.c
> +++ b/gdb/i386-linux-nat.c
> @@ -23,11 +23,14 @@
>  #include "inferior.h"
>  #include "gdbcore.h"
>  #include "regcache.h"
> +#include "regset.h"
>  #include "target.h"
>  #include "linux-nat.h"
>  
>  #include "gdb_assert.h"
>  #include "gdb_string.h"
> +#include "elf/common.h"
> +#include <sys/uio.h>
>  #include <sys/ptrace.h>
>  #include <sys/user.h>
>  #include <sys/procfs.h>
> @@ -69,6 +72,16 @@
>  
>  /* Defines ps_err_e, struct ps_prochandle.  */
>  #include "gdb_proc_service.h"
> +
> +/* The extended state size in bytes.  */
> +static unsigned int xstate_size;
> +
> +/* The extended state size in unit of int64.  We use array of int64 for
> +   better alignment.  */
> +static unsigned int xstate_size_n_of_int64;

Does alignment really matter?  I'd rather do without this additional
complication.

> +/* Does the current host support PTRACE_GETREGSET?  */
> +static int have_ptrace_getregset = -1;
>  \f
>  
>  /* The register sets used in GNU/Linux ELF core-dumps are identical to
> @@ -98,6 +111,8 @@ static int regmap[] =
>    -1, -1, -1, -1,		/* xmm0, xmm1, xmm2, xmm3 */
>    -1, -1, -1, -1,		/* xmm4, xmm5, xmm6, xmm6 */
>    -1,				/* mxcsr */
> +  -1, -1, -1, -1,		/* ymm0h, ymm1h, ymm2h, ymm3h */
> +  -1, -1, -1, -1,		/* ymm4h, ymm5h, ymm6h, ymm7h */
>    ORIG_EAX
>  };
>  
> @@ -110,6 +125,9 @@ static int regmap[] =
>  #define GETFPXREGS_SUPPLIES(regno) \
>    (I386_ST0_REGNUM <= (regno) && (regno) < I386_SSE_NUM_REGS)
>  
> +#define GETXSTATEREGS_SUPPLIES(regno) \
> +  (I386_ST0_REGNUM <= (regno) && (regno) < I386_AVX_NUM_REGS)
> +
>  /* Does the current host support the GETREGS request?  */
>  int have_ptrace_getregs =
>  #ifdef HAVE_PTRACE_GETREGS
> @@ -355,6 +373,57 @@ static void store_fpregs (const struct regcache *regcache, int tid, int regno) {
>  
>  /* Transfering floating-point and SSE registers to and from GDB.  */
>  
> +/* Fetch all registers covered by the PTRACE_GETREGSET request from
> +   process/thread TID and store their values in GDB's register array.
> +   Return non-zero if successful, zero otherwise.  */
> +
> +static int
> +fetch_xstateregs (struct regcache *regcache, int tid)
> +{
> +  unsigned long long xstateregs[xstate_size_n_of_int64];
> +  struct iovec iov;
> +
> +  if (!have_ptrace_getregset)
> +    return 0;
> +
> +  iov.iov_base = xstateregs;
> +  iov.iov_len = xstate_size;
> +  if (ptrace (PTRACE_GETREGSET, tid, (unsigned int) NT_X86_XSTATE,
> +	      (int) &iov) < 0)

This can't be right!

> +    perror_with_name (_("Couldn't read extended state status"));
> +
> +  i387_supply_xsave (regcache, -1, xstateregs);
> +  return 1;
> +}
> +
> +/* Store all valid registers in GDB's register array covered by the
> +   PTRACE_SETREGSET request into the process/thread specified by TID.
> +   Return non-zero if successful, zero otherwise.  */
> +
> +static int
> +store_xstateregs (const struct regcache *regcache, int tid, int regno)
> +{
> +  unsigned long long xstateregs[xstate_size_n_of_int64];

I think it is better to use I386_XSTATE_MAX_SIZE here.

> +  struct iovec iov;
> +
> +  if (!have_ptrace_getregset)
> +    return 0;
> +  
> +  iov.iov_base = xstateregs;
> +  iov.iov_len = xstate_size;
> +  if (ptrace (PTRACE_GETREGSET, tid, (unsigned int) NT_X86_XSTATE,
> +	      (int) &iov) < 0)
> +    perror_with_name (_("Couldn't read extended state status"));

This can't be right either!

> +  i387_collect_xsave (regcache, regno, xstateregs, 0);
> +
> +  if (ptrace (PTRACE_SETREGSET, tid, (unsigned int) NT_X86_XSTATE,
> +	      (int) &iov) < 0)
> +    perror_with_name (_("Couldn't write extended state status"));
> +
> +  return 1;
> +}
> +
>  #ifdef HAVE_PTRACE_GETFPXREGS
>  
>  /* Fill GDB's register array with the floating-point and SSE register
> @@ -489,6 +558,8 @@ i386_linux_fetch_inferior_registers (struct target_ops *ops,
>  	  return;
>  	}
>  
> +      if (fetch_xstateregs (regcache, tid))
> +	return;
>        if (fetch_fpxregs (regcache, tid))
>  	return;
>        fetch_fpregs (regcache, tid);
> @@ -501,6 +572,12 @@ i386_linux_fetch_inferior_registers (struct target_ops *ops,
>        return;
>      }
>  
> +  if (GETXSTATEREGS_SUPPLIES (regno))
> +    {
> +      if (fetch_xstateregs (regcache, tid))
> +	return;
> +    }
> +
>    if (GETFPXREGS_SUPPLIES (regno))
>      {
>        if (fetch_fpxregs (regcache, tid))
> @@ -553,6 +630,8 @@ i386_linux_store_inferior_registers (struct target_ops *ops,
>    if (regno == -1)
>      {
>        store_regs (regcache, tid, regno);
> +      if (store_xstateregs (regcache, tid, regno))
> +	return;
>        if (store_fpxregs (regcache, tid, regno))
>  	return;
>        store_fpregs (regcache, tid, regno);
> @@ -565,6 +644,12 @@ i386_linux_store_inferior_registers (struct target_ops *ops,
>        return;
>      }
>  
> +  if (GETXSTATEREGS_SUPPLIES (regno))
> +    {
> +      if (store_xstateregs (regcache, tid, regno))
> +	return;
> +    }
> +
>    if (GETFPXREGS_SUPPLIES (regno))
>      {
>        if (store_fpxregs (regcache, tid, regno))
> @@ -858,7 +943,49 @@ i386_linux_child_post_startup_inferior (ptid_t ptid)
>  static const struct target_desc *
>  i386_linux_read_description (struct target_ops *ops)
>  {
> -  return tdesc_i386_linux;
> +  static unsigned long long xcr0;

Is it really ok, to cache this?  Will the Linux kernel always return
the same value for every process?

> +  if (have_ptrace_getregset == -1)
> +    {
> +      int tid;
> +      unsigned long long xstateregs[(I386_XSTATE_SSE_SIZE
> +				     / sizeof (long long))];
> +      struct iovec iov;
> +
> +      /* GNU/Linux LWP ID's are process ID's.  */
> +      tid = TIDGET (inferior_ptid);
> +      if (tid == 0)
> +	tid = PIDGET (inferior_ptid); /* Not a threaded program.  */
> +
> +      iov.iov_base = xstateregs;
> +      iov.iov_len = I386_XSTATE_SSE_SIZE;
> +
> +      /* Check if PTRACE_GETREGSET works.  */
> +      if (ptrace (PTRACE_GETREGSET, tid,
> +		  (unsigned int) NT_X86_XSTATE, (long) &iov) < 0)
> +	have_ptrace_getregset = 0;
> +      else
> +	{
> +	  have_ptrace_getregset = 1;
> +
> +	  /* Get XCR0 from XSAVE extended state.  */
> +	  xcr0 = xstateregs[(I386_LINUX_XSAVE_XCR0_OFFSET
> +			     / sizeof (long long))];
> +
> +	  xstate_size = I386_XSTATE_SIZE (xcr0);
> +	  xstate_size_n_of_int64 = xstate_size / sizeof (long long);
> +	}
> +
> +      i386_linux_update_xstateregset (i386_linux_regset_sections,
> +				      xstate_size);
> +    }
> +
> +  /* Check the native XCR0 only if PTRACE_GETREGSET is available.  */
> +  if (have_ptrace_getregset
> +      && (xcr0 & I386_XSTATE_AVX_MASK) == I386_XSTATE_AVX_MASK)
> +    return tdesc_i386_avx_linux;
> +  else
> +    return tdesc_i386_linux;
>  }
>  
>  void

Time for me to zzz now; hopefully I'll be able to review the remainder
on Saturday.

* Re: PATCH: 3/6 [2nd try]: Add AVX support (i386 changes)
  2010-03-11 22:37         ` Mark Kettenis
@ 2010-03-12  0:00           ` H.J. Lu
  2010-03-27 14:55             ` Mark Kettenis
  0 siblings, 1 reply; 115+ messages in thread
From: H.J. Lu @ 2010-03-12  0:00 UTC (permalink / raw)
  To: Mark Kettenis; +Cc: gdb-patches

On Thu, Mar 11, 2010 at 2:37 PM, Mark Kettenis <mark.kettenis@xs4all.nl> wrote:
>> Date: Sun, 7 Mar 2010 13:31:53 -0800
>> From: "H.J. Lu" <hongjiu.lu@intel.com>
>>
>> On Sat, Mar 06, 2010 at 02:20:37PM -0800, H.J. Lu wrote:
>> > Hi,
>> >
>> > Here are i386 changes to support AVX. OK to install?
>> >
>>
>> Here is the updated patch to change i386_dbx_reg_to_regnum to return
>> %ymmN register number for %xmmN if AVX is available.  Any comments?
>
> Can't find the time to review the complete diff.  So here's a partial
> review.
>
> The generic -tdep.c bits look reasonable, although I have a few nits.  I'm
> less happy with the Linux -tdep.c and -nat.c bits though.
>
...

>> +
>> +/* The extended state feature bits.  */
>> +#define bit_I386_XSTATE_X87          (1ULL << 0)
>> +#define bit_I386_XSTATE_SSE          (1ULL << 1)
>> +#define bit_I386_XSTATE_AVX          (1ULL << 2)
>
> #define's should be uppercase; please drop the bit_-prefix

I will make the change.

>> +
>> +/* Supported mask and size of the extended state.  */
>> +#define I386_XSTATE_SSE_MASK \
>> +  (bit_I386_XSTATE_X87 | bit_I386_XSTATE_SSE)
>> +#define I386_XSTATE_AVX_MASK \
>> +  (I386_XSTATE_SSE_MASK | bit_I386_XSTATE_AVX)
>> +#define I386_XSTATE_MAX_MASK \
>> +  I386_XSTATE_AVX_MASK
>> +
>> +#define I386_XSTATE_SSE_SIZE         576
>> +#define I386_XSTATE_AVX_SIZE         832
>> +#define I386_XSTATE_MAX_SIZE         832
>> +
>> +/* Get I386 XSAVE extended state size.  */
>> +#define I386_XSTATE_SIZE(XCR0)       \
>> +  (((XCR0) & bit_I386_XSTATE_AVX) != 0 \
>> +   ? I386_XSTATE_AVX_SIZE : I386_XSTATE_SSE_SIZE)
>> +
>> +#endif /* I386_XSTATE_H */
>
>
> Please don't introduce new nm-xxx.h files.  We've been trying to get
> rid of them for years and they shouldn't be necessary.

I will remove it.


>> +
>> +#include "i386-xstate.h"
>> +
>> +#ifndef PTRACE_GETREGSET
>> +#define PTRACE_GETREGSET     0x4204
>> +#endif
>> +
>> +#ifndef PTRACE_SETREGSET
>> +#define PTRACE_SETREGSET     0x4205
>> +#endif
>> +
>> +#endif       /* NM_LINUX_XSTATE_H */
>
> Do we really have to hardcode constants like this in GDB?  They should
> be available through kernel/libc headers.  Are Drepper and Torvalds
> still fighting over that issue?

They are in Linux kernel 2.6.34-rc1. Do we enable gdb support only
with the new kernel/glibc headers? I compiled gdb on RHEL4 and it
works fine.  There are:

#ifndef PTRACE_GET_THREAD_AREA
#define PTRACE_GET_THREAD_AREA 25
 ...
#ifndef PTRACE_ARCH_PRCTL
#define PTRACE_ARCH_PRCTL      30

in amd64-linux-nat.c.


>> +
>> +/* The extended state size in unit of int64.  We use array of int64 for
>> +   better alignment.  */
>> +static unsigned int xstate_size_n_of_int64;
>
> Does alignment really matter?  I'd rather do without this additional
> complication.

"xcr0" is a 64bit value.  It is nice to use array of uint64 to access it.
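
As an aside on the alignment question: the 64-bit value can also be read
out of a plain byte buffer with memcpy, which makes no alignment demands
on the buffer.  A minimal sketch, assuming the kernel's XCR0 copy sits at
byte offset 464 as the patch description says; `read_xcr0` and
`XSAVE_XCR0_OFFSET` are hypothetical names, not part of the patch:

```c
#include <stdint.h>
#include <string.h>

/* Assumed byte offset of the kernel's XCR0 copy inside the XSAVE
   area (start of the software-usable bytes).  */
#define XSAVE_XCR0_OFFSET 464

/* Hypothetical helper: read XCR0 from a raw XSAVE buffer in host
   byte order.  memcpy makes no alignment demands on XSAVE.  */
static uint64_t
read_xcr0 (const unsigned char *xsave)
{
  uint64_t xcr0;

  memcpy (&xcr0, xsave + XSAVE_XCR0_OFFSET, sizeof xcr0);
  return xcr0;
}
```

With a helper like this the buffer could be a plain array of bytes, and
the separate size-in-int64 variable would not be needed.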

>> +static int
>> +fetch_xstateregs (struct regcache *regcache, int tid)
>> +{
>> +  unsigned long long xstateregs[xstate_size_n_of_int64];
>> +  struct iovec iov;
>> +
>> +  if (!have_ptrace_getregset)
>> +    return 0;
>> +
>> +  iov.iov_base = xstateregs;
>> +  iov.iov_len = xstate_size;
>> +  if (ptrace (PTRACE_GETREGSET, tid, (unsigned int) NT_X86_XSTATE,
>> +           (int) &iov) < 0)
>
> This can't be right!

Why? That is the kernel interface in 2.6.34-rc1.

>> +    perror_with_name (_("Couldn't read extended state status"));
>> +
>> +  i387_supply_xsave (regcache, -1, xstateregs);
>> +  return 1;
>> +}
>> +
>> +/* Store all valid registers in GDB's register array covered by the
>> +   PTRACE_SETREGSET request into the process/thread specified by TID.
>> +   Return non-zero if successful, zero otherwise.  */
>> +
>> +static int
>> +store_xstateregs (const struct regcache *regcache, int tid, int regno)
>> +{
>> +  unsigned long long xstateregs[xstate_size_n_of_int64];
>
> I think it is better to use I386_XSTATE_MAX_SIZE here.

That is how the kernel interface works.  Whatever value I386_XSTATE_MAX_SIZE
has today won't be the same tomorrow; we will increase it in the coming
years.  But the same gdb binary will keep working, since the kernel only
copies the number of bytes specified in iov.iov_len, which is all gdb
cares about or needs.
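
One way the two positions might be reconciled, sketched under the
assumption stated above that the kernel copies at most iov_len bytes:
keep a fixed buffer of the largest size this build knows about, and set
iov_len from XCR0.  The names below (`xstate_size_for`,
`setup_xstate_iov`, `XSTATE_*`) are illustrative only; the sizes are the
ones from i386-xstate.h in the patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

#define XSTATE_AVX_BIT   (1ULL << 2)
#define XSTATE_SSE_SIZE  576
#define XSTATE_AVX_SIZE  832
#define XSTATE_MAX_SIZE  XSTATE_AVX_SIZE

/* Size in bytes of the XSAVE area this build understands for XCR0.  */
static size_t
xstate_size_for (uint64_t xcr0)
{
  return (xcr0 & XSTATE_AVX_BIT) != 0 ? XSTATE_AVX_SIZE : XSTATE_SSE_SIZE;
}

/* Point IOV at BUF, which must hold at least XSTATE_MAX_SIZE bytes,
   and request only as many bytes as XCR0 calls for.  */
static void
setup_xstate_iov (struct iovec *iov, void *buf, uint64_t xcr0)
{
  iov->iov_base = buf;
  iov->iov_len = xstate_size_for (xcr0);
}
```

The resulting iovec would then be handed to ptrace (PTRACE_GETREGSET, ...)
as a pointer; casting its address to int only works where pointers and
int have the same width.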

>> +  struct iovec iov;
>> +
>> +  if (!have_ptrace_getregset)
>> +    return 0;
>> +
>> +  iov.iov_base = xstateregs;
>> +  iov.iov_len = xstate_size;
>> +  if (ptrace (PTRACE_GETREGSET, tid, (unsigned int) NT_X86_XSTATE,
>> +           (int) &iov) < 0)
>> +    perror_with_name (_("Couldn't read extended state status"));
>
> This can't be right either!
>

>>        if (store_fpxregs (regcache, tid, regno))
>> @@ -858,7 +943,49 @@ i386_linux_child_post_startup_inferior (ptid_t ptid)
>>  static const struct target_desc *
>>  i386_linux_read_description (struct target_ops *ops)
>>  {
>> -  return tdesc_i386_linux;
>> +  static unsigned long long xcr0;
>
> Is it really ok, to cache this?  Will the Linux kernel always return
> the same value for every process?

xcr0 is a processor value and will be the same for all processes.
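
For reference, the selection in i386_linux_read_description boils down to
a mask test on the cached value.  A small sketch with hypothetical names
(`choose_tdesc`, `enum tdesc_kind`); the bit values follow i386-xstate.h
in the patch:

```c
#include <stdint.h>

#define XSTATE_X87       (1ULL << 0)
#define XSTATE_SSE       (1ULL << 1)
#define XSTATE_AVX       (1ULL << 2)
#define XSTATE_AVX_MASK  (XSTATE_X87 | XSTATE_SSE | XSTATE_AVX)

enum tdesc_kind { TDESC_SSE, TDESC_AVX };

/* Pick the AVX description only when PTRACE_GETREGSET works and all
   of the x87, SSE and AVX bits are set in XCR0; otherwise fall back
   to the plain SSE description.  */
static enum tdesc_kind
choose_tdesc (int have_getregset, uint64_t xcr0)
{
  if (have_getregset && (xcr0 & XSTATE_AVX_MASK) == XSTATE_AVX_MASK)
    return TDESC_AVX;
  return TDESC_SSE;
}
```

Bits above AVX are ignored by the test, so a future XCR0 with extra
state enabled would still select the AVX description.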

>
> Time for me to zzz now; hopefully I'll be able to review the remainder
> on Saturday.
>

Thanks.

-- 
H.J.

* Re: PATCH: 2/6 [2nd try]: Add AVX support (Update document)
  2010-03-06 22:19   ` PATCH: 2/6 [2nd try]: " H.J. Lu
@ 2010-03-12 11:11     ` Eli Zaretskii
  2010-03-12 14:17       ` H.J. Lu
  2010-03-12 15:27     ` Eli Zaretskii
                       ` (2 subsequent siblings)
  3 siblings, 1 reply; 115+ messages in thread
From: Eli Zaretskii @ 2010-03-12 11:11 UTC (permalink / raw)
  To: H.J. Lu; +Cc: gdb-patches

> Date: Sat, 6 Mar 2010 14:19:46 -0800
> From: "H.J. Lu" <hongjiu.lu@intel.com>
> 
> This patch updates document for AVX support.  OK to install?

I'm sorry, I'm not sure if you are waiting for me to review this,
given the discussions that followed.  I understood that you are going
to rework this patch and resubmit, is that right?

* Re: PATCH: 2/6 [2nd try]: Add AVX support (Update document)
  2010-03-12 11:11     ` Eli Zaretskii
@ 2010-03-12 14:17       ` H.J. Lu
  2010-03-12 15:28         ` Eli Zaretskii
  0 siblings, 1 reply; 115+ messages in thread
From: H.J. Lu @ 2010-03-12 14:17 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gdb-patches

On Fri, Mar 12, 2010 at 3:10 AM, Eli Zaretskii <eliz@gnu.org> wrote:
>> Date: Sat, 6 Mar 2010 14:19:46 -0800
>> From: "H.J. Lu" <hongjiu.lu@intel.com>
>>
>> This patch updates document for AVX support.  OK to install?
>
> I'm sorry, I'm not sure if you are waiting for me to review this,
> given the discussions that followed.  I understood that you are going
> to rework this patch and resubmit, is that right?
>

I will update my patches. But so far I don't need to update
document changes.

Thanks.


-- 
H.J.

* Re: PATCH: 2/6 [2nd try]: Add AVX support (Update document)
  2010-03-06 22:19   ` PATCH: 2/6 [2nd try]: " H.J. Lu
  2010-03-12 11:11     ` Eli Zaretskii
@ 2010-03-12 15:27     ` Eli Zaretskii
  2010-03-12 16:46     ` H.J. Lu
  2010-03-29  0:18     ` PATCH: 2/6 [3rd " H.J. Lu
  3 siblings, 0 replies; 115+ messages in thread
From: Eli Zaretskii @ 2010-03-12 15:27 UTC (permalink / raw)
  To: H.J. Lu; +Cc: gdb-patches

> Date: Sat, 6 Mar 2010 14:19:46 -0800
> From: "H.J. Lu" <hongjiu.lu@intel.com>
> 
> This patch updates document for AVX support.

Thanks.

> +This feature indicates that @value{GDBN} supports x86 XML target
> +description.

"supports the x86 XML target description."

> +The @samp{org.gnu.gdb.i386.avx} feature is optional. It should
                                                      ^^
Two spaces, please.

> +describe the upper 128bit of @sc{ymm} registers:

"the upper 128 bits"

Okay with those changes.

* Re: PATCH: 2/6 [2nd try]: Add AVX support (Update document)
  2010-03-12 14:17       ` H.J. Lu
@ 2010-03-12 15:28         ` Eli Zaretskii
  0 siblings, 0 replies; 115+ messages in thread
From: Eli Zaretskii @ 2010-03-12 15:28 UTC (permalink / raw)
  To: H.J. Lu; +Cc: gdb-patches

> Date: Fri, 12 Mar 2010 06:17:39 -0800
> From: "H.J. Lu" <hjl.tools@gmail.com>
> Cc: gdb-patches@sourceware.org
> 
> On Fri, Mar 12, 2010 at 3:10 AM, Eli Zaretskii <eliz@gnu.org> wrote:
> >> Date: Sat, 6 Mar 2010 14:19:46 -0800
> >> From: "H.J. Lu" <hongjiu.lu@intel.com>
> >>
> >> This patch updates document for AVX support.  OK to install?
> >
> > I'm sorry, I'm not sure if you are waiting for me to review this,
> > given the discussions that followed.  I understood that you are going
> > to rework this patch and resubmit, is that right?
> >
> 
> I will update my patches. But so far I don't need to update
> document changes.

Okay, reviewed and commented.

* Re: PATCH: 2/6 [2nd try]: Add AVX support (Update document)
  2010-03-06 22:19   ` PATCH: 2/6 [2nd try]: " H.J. Lu
  2010-03-12 11:11     ` Eli Zaretskii
  2010-03-12 15:27     ` Eli Zaretskii
@ 2010-03-12 16:46     ` H.J. Lu
  2010-03-12 18:15       ` Eli Zaretskii
  2010-03-29  0:18     ` PATCH: 2/6 [3rd " H.J. Lu
  3 siblings, 1 reply; 115+ messages in thread
From: H.J. Lu @ 2010-03-12 16:46 UTC (permalink / raw)
  To: GDB

On Sat, Mar 06, 2010 at 02:19:46PM -0800, H.J. Lu wrote:
> Hi,
> 
> This patch updates document for AVX support.  OK to install?
>  
> Thanks.
> 
> 
> H.J.
> ---
> 2010-03-06  H.J. Lu  <hongjiu.lu@intel.com>
> 
> 	* gdb.texinfo (General Query Packets): Document x86=xml.
> 	(i386 Features): Add org.gnu.gdb.i386.avx.
> 

Here is the updated patch.


H.J.
---
2010-03-12  H.J. Lu  <hongjiu.lu@intel.com>

	* gdb.texinfo (General Query Packets): Document x86=xml.
	(i386 Features): Add org.gnu.gdb.i386.avx.

diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
index a1f3a78..a8f4b2e 100644
--- a/gdb/doc/gdb.texinfo
+++ b/gdb/doc/gdb.texinfo
@@ -30274,6 +30274,11 @@ extensions to the remote protocol.  @value{GDBN} does not use such
 extensions unless the stub also reports that it supports them by
 including @samp{multiprocess+} in its @samp{qSupported} reply.
 @xref{multiprocess extensions}, for details.
+
+@item x86=xml
+This feature indicates that @value{GDBN} supports the x86 XML
+target description.  If the stub sees @samp{x86=xml}, it can send
+@value{GDBN} the x86 XML target description.
 @end table
 
 Stubs should ignore any unknown values for
@@ -33356,6 +33361,17 @@ describe registers:
 @samp{mxcsr}
 @end itemize
 
+The @samp{org.gnu.gdb.i386.avx} feature is optional.  It should
+describe the upper 128 bits of @sc{ymm} registers:
+
+@itemize @minus
+@item
+@samp{ymm0h} through @samp{ymm7h} for i386
+@item
+@samp{ymm0h} through @samp{ymm15h} for amd64
+@item 
+@end itemize
+
 The @samp{org.gnu.gdb.i386.linux} feature is optional.  It should
 describe a single register, @samp{orig_eax}.
 

* Re: PATCH: 3/6 [2nd try]: Add AVX support (i386 changes)
  2010-03-06 22:21     ` PATCH: 3/6 [2nd try]: " H.J. Lu
  2010-03-07 21:32       ` H.J. Lu
@ 2010-03-12 16:49       ` H.J. Lu
  2010-03-13  1:38         ` H.J. Lu
  2010-03-29  1:11         ` PATCH: 3/6 [3rd " H.J. Lu
  2010-03-27 15:48       ` PATCH: 3/6 [2nd " Mark Kettenis
  2 siblings, 2 replies; 115+ messages in thread
From: H.J. Lu @ 2010-03-12 16:49 UTC (permalink / raw)
  To: GDB

On Sat, Mar 06, 2010 at 02:20:37PM -0800, H.J. Lu wrote:
> Hi,
> 
> Here are i386 changes to support AVX. OK to install?
>  
> Thanks.
> 

Here is the updated patch. Any comments/suggestions?

Thanks.


H.J.
--
2010-03-12  H.J. Lu  <hongjiu.lu@intel.com>

	* i386-linux-nat.c: Include "regset.h", "elf/common.h",
	<sys/uio.h> and "i386-xstate.h".
	(PTRACE_GETREGSET): New.
	(PTRACE_SETREGSET): Likewise.
	(xstate_size): Likewise.
	(xstate_size_n_of_int64): Likewise.
	(fetch_xstateregs): Likewise.
	(store_xstateregs): Likewise.
	(GETXSTATEREGS_SUPPLIES): Likewise.
	(regmap): Include 8 upper YMM registers.
	(i386_linux_fetch_inferior_registers): Support XSAVE extended
	state.
	(i386_linux_store_inferior_registers): Likewise.
	(i386_linux_read_description): Check and enable AVX target
	descriptions.

	* i386-linux-tdep.c: Include "regset.h", "i387-tdep.h",
	"i386-xstate.h" and "features/i386/i386-avx-linux.c".
	(i386_linux_regset_sections): Make it global.  Add
	".reg-xstate".
	(i386_linux_gregset_reg_offset): Include 8 upper YMM registers.
	(i386_linux_update_xstateregset): New.
	(i386_linux_core_read_xcr0): Likewise.
	(i386_linux_core_read_description): Check and enable AVX target
	description.
	(i386_linux_init_abi): Set xsave_xcr0_offset.
	(_initialize_i386_linux_tdep): Call
	initialize_tdesc_i386_avx_linux.

	* i386-linux-tdep.h (I386_LINUX_ORIG_EAX_REGNUM): Replace
	I386_SSE_NUM_REGS with I386_AVX_NUM_REGS.
	(i386_linux_core_read_xcr0): New.
	(tdesc_i386_avx_linux): Likewise.
	(i386_linux_regset_sections): Likewise.
	(i386_linux_update_xstateregset): Likewise.
	(I386_LINUX_XSAVE_XCR0_OFFSET): Likewise.

	* i386-tdep.c: Include "i386-xstate.h" and
	"features/i386/i386-avx.c".
	(i386_ymm_names): New.
	(i386_ymmh_names): Likewise.
	(i386_ymmh_regnum_p): Likewise.
	(i386_ymm_regnum_p): Likewise.
	(i386_xmm_regnum_p): Likewise.
	(i386_register_name): Likewise.
	(i386_ymm_type): Likewise.
	(i386_supply_xstateregset): Likewise.
	(i386_collect_xstateregset): Likewise.
	(i386_sse_regnum_p): Removed.
	(i386_pseudo_register_name): Support pseudo YMM registers.
	(i386_pseudo_register_type): Likewise.
	(i386_pseudo_register_read): Likewise.
	(i386_pseudo_register_write): Likewise.
	(i386_dbx_reg_to_regnum): Return %ymmN register number for
	%xmmN if AVX is available.
	(i386_regset_from_core_section): Support .reg-xstate section.
	(i386_register_reggroup_p): Support upper YMM and YMM registers.
	(i386_validate_tdesc_p): Support org.gnu.gdb.i386.avx feature.
	Set ymmh_register_names, num_ymm_regs, ymm0h_regnum and xcr0.
	(i386_gdbarch_init): Set xstateregset.  Set xsave_xcr0_offset. 
	Call set_gdbarch_register_name.  Replace I386_SSE_NUM_REGS with
	I386_AVX_NUM_REGS.  Set ymmh_register_names, ymm0h_regnum and
	num_ymm_regs.  Add num_ymm_regs to set_gdbarch_num_pseudo_regs.
	Set ymm0_regnum.  Call set_gdbarch_qsupported.
	(_initialize_i386_tdep): Call initialize_tdesc_i386_avx.

	* i386-tdep.h (gdbarch_tdep): Add xstateregset, ymm0_regnum,
	xcr0, xsave_xcr0_offset, ymm0h_regnum, ymmh_register_names and
	i386_ymm_type.
	(i386_regnum): Add I386_YMM0H_REGNUM, and I386_YMM7H_REGNUM.
	(I386_AVX_NUM_REGS): New.
	(i386_xmm_regnum_p): Likewise.
	(i386_ymm_regnum_p): Likewise.
	(i386_ymmh_regnum_p): Likewise.

	* common/i386-xstate.h: New.

diff --git a/gdb/common/i386-xstate.h b/gdb/common/i386-xstate.h
new file mode 100644
index 0000000..f047d35
--- /dev/null
+++ b/gdb/common/i386-xstate.h
@@ -0,0 +1,40 @@
+/* Common code for i386 XSAVE extended state.
+
+   Copyright (C) 2010 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#ifndef I386_XSTATE_H
+#define I386_XSTATE_H 1
+
+/* The extended state feature bits.  */
+#define I386_XSTATE_X87		(1ULL << 0)
+#define I386_XSTATE_SSE		(1ULL << 1)
+#define I386_XSTATE_AVX		(1ULL << 2)
+
+/* Supported mask and size of the extended state.  */
+#define I386_XSTATE_SSE_MASK	(I386_XSTATE_X87 | I386_XSTATE_SSE)
+#define I386_XSTATE_AVX_MASK	(I386_XSTATE_SSE_MASK | I386_XSTATE_AVX)
+
+#define I386_XSTATE_SSE_SIZE	576
+#define I386_XSTATE_AVX_SIZE	832
+
+/* Get I386 XSAVE extended state size.  */
+#define I386_XSTATE_SIZE(XCR0)	\
+  (((XCR0) & I386_XSTATE_AVX) != 0 \
+   ? I386_XSTATE_AVX_SIZE : I386_XSTATE_SSE_SIZE)
+
+#endif /* I386_XSTATE_H */
diff --git a/gdb/i386-linux-nat.c b/gdb/i386-linux-nat.c
index 31b9086..e70716a 100644
--- a/gdb/i386-linux-nat.c
+++ b/gdb/i386-linux-nat.c
@@ -23,11 +23,14 @@
 #include "inferior.h"
 #include "gdbcore.h"
 #include "regcache.h"
+#include "regset.h"
 #include "target.h"
 #include "linux-nat.h"
 
 #include "gdb_assert.h"
 #include "gdb_string.h"
+#include "elf/common.h"
+#include <sys/uio.h>
 #include <sys/ptrace.h>
 #include <sys/user.h>
 #include <sys/procfs.h>
@@ -69,6 +72,26 @@
 
 /* Defines ps_err_e, struct ps_prochandle.  */
 #include "gdb_proc_service.h"
+
+#include "i386-xstate.h"
+
+#ifndef PTRACE_GETREGSET
+#define PTRACE_GETREGSET	0x4204
+#endif
+
+#ifndef PTRACE_SETREGSET
+#define PTRACE_SETREGSET	0x4205
+#endif
+
+/* The extended state size in bytes.  */
+static unsigned int xstate_size;
+
+/* The extended state size in unit of int64.  We use array of int64 for
+   better alignment.  */
+static unsigned int xstate_size_n_of_int64;
+
+/* Does the current host support PTRACE_GETREGSET?  */
+static int have_ptrace_getregset = -1;
 \f
 
 /* The register sets used in GNU/Linux ELF core-dumps are identical to
@@ -98,6 +121,8 @@ static int regmap[] =
   -1, -1, -1, -1,		/* xmm0, xmm1, xmm2, xmm3 */
   -1, -1, -1, -1,		/* xmm4, xmm5, xmm6, xmm6 */
   -1,				/* mxcsr */
+  -1, -1, -1, -1,		/* ymm0h, ymm1h, ymm2h, ymm3h */
+  -1, -1, -1, -1,		/* ymm4h, ymm5h, ymm6h, ymm7h */
   ORIG_EAX
 };
 
@@ -110,6 +135,9 @@ static int regmap[] =
 #define GETFPXREGS_SUPPLIES(regno) \
   (I386_ST0_REGNUM <= (regno) && (regno) < I386_SSE_NUM_REGS)
 
+#define GETXSTATEREGS_SUPPLIES(regno) \
+  (I386_ST0_REGNUM <= (regno) && (regno) < I386_AVX_NUM_REGS)
+
 /* Does the current host support the GETREGS request?  */
 int have_ptrace_getregs =
 #ifdef HAVE_PTRACE_GETREGS
@@ -355,6 +383,57 @@ static void store_fpregs (const struct regcache *regcache, int tid, int regno) {
 
 /* Transfering floating-point and SSE registers to and from GDB.  */
 
+/* Fetch all registers covered by the PTRACE_GETREGSET request from
+   process/thread TID and store their values in GDB's register array.
+   Return non-zero if successful, zero otherwise.  */
+
+static int
+fetch_xstateregs (struct regcache *regcache, int tid)
+{
+  unsigned long long xstateregs[xstate_size_n_of_int64];
+  struct iovec iov;
+
+  if (!have_ptrace_getregset)
+    return 0;
+
+  iov.iov_base = xstateregs;
+  iov.iov_len = xstate_size;
+  if (ptrace (PTRACE_GETREGSET, tid, (unsigned int) NT_X86_XSTATE,
+	      (int) &iov) < 0)
+    perror_with_name (_("Couldn't read extended state status"));
+
+  i387_supply_xsave (regcache, -1, xstateregs);
+  return 1;
+}
+
+/* Store all valid registers in GDB's register array covered by the
+   PTRACE_SETREGSET request into the process/thread specified by TID.
+   Return non-zero if successful, zero otherwise.  */
+
+static int
+store_xstateregs (const struct regcache *regcache, int tid, int regno)
+{
+  unsigned long long xstateregs[xstate_size_n_of_int64];
+  struct iovec iov;
+
+  if (!have_ptrace_getregset)
+    return 0;
+  
+  iov.iov_base = xstateregs;
+  iov.iov_len = xstate_size;
+  if (ptrace (PTRACE_GETREGSET, tid, (unsigned int) NT_X86_XSTATE,
+	      (int) &iov) < 0)
+    perror_with_name (_("Couldn't read extended state status"));
+
+  i387_collect_xsave (regcache, regno, xstateregs, 0);
+
+  if (ptrace (PTRACE_SETREGSET, tid, (unsigned int) NT_X86_XSTATE,
+	      (int) &iov) < 0)
+    perror_with_name (_("Couldn't write extended state status"));
+
+  return 1;
+}
+
 #ifdef HAVE_PTRACE_GETFPXREGS
 
 /* Fill GDB's register array with the floating-point and SSE register
@@ -489,6 +568,8 @@ i386_linux_fetch_inferior_registers (struct target_ops *ops,
 	  return;
 	}
 
+      if (fetch_xstateregs (regcache, tid))
+	return;
       if (fetch_fpxregs (regcache, tid))
 	return;
       fetch_fpregs (regcache, tid);
@@ -501,6 +582,12 @@ i386_linux_fetch_inferior_registers (struct target_ops *ops,
       return;
     }
 
+  if (GETXSTATEREGS_SUPPLIES (regno))
+    {
+      if (fetch_xstateregs (regcache, tid))
+	return;
+    }
+
   if (GETFPXREGS_SUPPLIES (regno))
     {
       if (fetch_fpxregs (regcache, tid))
@@ -553,6 +640,8 @@ i386_linux_store_inferior_registers (struct target_ops *ops,
   if (regno == -1)
     {
       store_regs (regcache, tid, regno);
+      if (store_xstateregs (regcache, tid, regno))
+	return;
       if (store_fpxregs (regcache, tid, regno))
 	return;
       store_fpregs (regcache, tid, regno);
@@ -565,6 +654,12 @@ i386_linux_store_inferior_registers (struct target_ops *ops,
       return;
     }
 
+  if (GETXSTATEREGS_SUPPLIES (regno))
+    {
+      if (store_xstateregs (regcache, tid, regno))
+	return;
+    }
+
   if (GETFPXREGS_SUPPLIES (regno))
     {
       if (store_fpxregs (regcache, tid, regno))
@@ -858,7 +953,49 @@ i386_linux_child_post_startup_inferior (ptid_t ptid)
 static const struct target_desc *
 i386_linux_read_description (struct target_ops *ops)
 {
-  return tdesc_i386_linux;
+  static unsigned long long xcr0;
+
+  if (have_ptrace_getregset == -1)
+    {
+      int tid;
+      unsigned long long xstateregs[(I386_XSTATE_SSE_SIZE
+				     / sizeof (long long))];
+      struct iovec iov;
+
+      /* GNU/Linux LWP ID's are process ID's.  */
+      tid = TIDGET (inferior_ptid);
+      if (tid == 0)
+	tid = PIDGET (inferior_ptid); /* Not a threaded program.  */
+
+      iov.iov_base = xstateregs;
+      iov.iov_len = I386_XSTATE_SSE_SIZE;
+
+      /* Check if PTRACE_GETREGSET works.  */
+      if (ptrace (PTRACE_GETREGSET, tid,
+		  (unsigned int) NT_X86_XSTATE, (long) &iov) < 0)
+	have_ptrace_getregset = 0;
+      else
+	{
+	  have_ptrace_getregset = 1;
+
+	  /* Get XCR0 from XSAVE extended state.  */
+	  xcr0 = xstateregs[(I386_LINUX_XSAVE_XCR0_OFFSET
+			     / sizeof (long long))];
+
+	  xstate_size = I386_XSTATE_SIZE (xcr0);
+	  xstate_size_n_of_int64 = xstate_size / sizeof (long long);
+	}
+
+      i386_linux_update_xstateregset (i386_linux_regset_sections,
+				      xstate_size);
+    }
+
+  /* Check the native XCR0 only if PTRACE_GETREGSET is available.  */
+  if (have_ptrace_getregset
+      && (xcr0 & I386_XSTATE_AVX_MASK) == I386_XSTATE_AVX_MASK)
+    return tdesc_i386_avx_linux;
+  else
+    return tdesc_i386_linux;
 }
 
 void
diff --git a/gdb/i386-linux-tdep.c b/gdb/i386-linux-tdep.c
index b23c109..66ecf84 100644
--- a/gdb/i386-linux-tdep.c
+++ b/gdb/i386-linux-tdep.c
@@ -23,6 +23,7 @@
 #include "frame.h"
 #include "value.h"
 #include "regcache.h"
+#include "regset.h"
 #include "inferior.h"
 #include "osabi.h"
 #include "reggroups.h"
@@ -36,9 +37,11 @@
 #include "solib-svr4.h"
 #include "symtab.h"
 #include "arch-utils.h"
-#include "regset.h"
 #include "xml-syscall.h"
 
+#include "i387-tdep.h"
+#include "i386-xstate.h"
+
 /* The syscall's XML filename for i386.  */
 #define XML_SYSCALL_FILENAME_I386 "syscalls/i386-linux.xml"
 
@@ -47,13 +50,15 @@
 #include <stdint.h>
 
 #include "features/i386/i386-linux.c"
+#include "features/i386/i386-avx-linux.c"
 
 /* Supported register note sections.  */
-static struct core_regset_section i386_linux_regset_sections[] =
+struct core_regset_section i386_linux_regset_sections[] =
 {
   { ".reg", 144, "general-purpose" },
   { ".reg2", 108, "floating-point" },
   { ".reg-xfp", 512, "extended floating-point" },
+  { ".reg-xstate", 0, "XSAVE extended state" },
   { NULL, 0 }
 };
 
@@ -533,6 +538,7 @@ static int i386_linux_gregset_reg_offset[] =
   -1, -1, -1, -1, -1, -1, -1, -1,
   -1, -1, -1, -1, -1, -1, -1, -1,
   -1,
+  -1, -1, -1, -1, -1, -1, -1, -1,
   11 * 4			/* "orig_eax" */
 };
 
@@ -560,6 +566,66 @@ static int i386_linux_sc_reg_offset[] =
   0 * 4				/* %gs */
 };
 
+/* Update XSAVE extended state register note section.  */
+
+void
+i386_linux_update_xstateregset
+  (struct core_regset_section *regset_sections, unsigned int xstate_size)
+{
+  int i;
+
+  /* Update the XSAVE extended state register note section for "gcore".
+     Disable it if its size is 0.  */
+  for (i = 0; regset_sections[i].sect_name != NULL; i++)
+    if (strcmp (regset_sections[i].sect_name, ".reg-xstate") == 0)
+      {
+	if (xstate_size)
+	  regset_sections[i].size = xstate_size;
+	else
+	  regset_sections[i].sect_name = NULL;
+	break;
+      }
+}
+
+/* Get XSAVE extended state xcr0 from core dump.  */
+
+unsigned long long
+i386_linux_core_read_xcr0 (struct gdbarch *gdbarch,
+			   struct target_ops *target, bfd *abfd)
+{
+  asection *xstate = bfd_get_section_by_name (abfd, ".reg-xstate");
+  unsigned long long xcr0;
+
+  if (xstate)
+    {
+      size_t size = bfd_section_size (abfd, xstate);
+
+      gdb_assert (size >= I386_XSTATE_SSE_SIZE);
+
+      /* Check extended state size.  */
+      if (size < I386_XSTATE_AVX_SIZE)
+	xcr0 = I386_XSTATE_SSE_MASK;
+      else
+	{
+	  char contents[8];
+
+	  if (! bfd_get_section_contents (abfd, xstate, contents,
+					  (file_ptr) I386_LINUX_XSAVE_XCR0_OFFSET,
+					  8))
+	    {
+	      warning (_("Couldn't read `xcr0' bytes from `.reg-xstate' section in core file."));
+	      return 0;
+	    }
+
+	  xcr0 = bfd_get_64 (abfd, contents);
+	}
+    }
+  else
+    xcr0 = I386_XSTATE_SSE_MASK;
+
+  return xcr0;
+}
+
 /* Get Linux/x86 target description from core dump.  */
 
 static const struct target_desc *
@@ -568,12 +634,17 @@ i386_linux_core_read_description (struct gdbarch *gdbarch,
 				  bfd *abfd)
 {
   asection *section = bfd_get_section_by_name (abfd, ".reg2");
+  unsigned long long xcr0;
 
   if (section == NULL)
     return NULL;
 
   /* Linux/i386.  */
-  return tdesc_i386_linux;
+  xcr0 = i386_linux_core_read_xcr0 (gdbarch, target, abfd);
+  if ((xcr0 & I386_XSTATE_AVX_MASK) == I386_XSTATE_AVX_MASK)
+    return tdesc_i386_avx_linux;
+  else
+    return tdesc_i386_linux;
 }
 
 static void
@@ -623,6 +694,8 @@ i386_linux_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch)
   tdep->sc_reg_offset = i386_linux_sc_reg_offset;
   tdep->sc_num_regs = ARRAY_SIZE (i386_linux_sc_reg_offset);
 
+  tdep->xsave_xcr0_offset = I386_LINUX_XSAVE_XCR0_OFFSET;
+
   set_gdbarch_process_record (gdbarch, i386_process_record);
   set_gdbarch_process_record_signal (gdbarch, i386_linux_record_signal);
 
@@ -840,4 +913,5 @@ _initialize_i386_linux_tdep (void)
 
   /* Initialize the Linux target description  */
   initialize_tdesc_i386_linux ();
+  initialize_tdesc_i386_avx_linux ();
 }
diff --git a/gdb/i386-linux-tdep.h b/gdb/i386-linux-tdep.h
index 11f7295..8881fea 100644
--- a/gdb/i386-linux-tdep.h
+++ b/gdb/i386-linux-tdep.h
@@ -30,12 +30,45 @@
 /* Register number for the "orig_eax" pseudo-register.  If this
    pseudo-register contains a value >= 0 it is interpreted as the
    system call number that the kernel is supposed to restart.  */
-#define I386_LINUX_ORIG_EAX_REGNUM I386_SSE_NUM_REGS
+#define I386_LINUX_ORIG_EAX_REGNUM I386_AVX_NUM_REGS
 
 /* Total number of registers for GNU/Linux.  */
 #define I386_LINUX_NUM_REGS (I386_LINUX_ORIG_EAX_REGNUM + 1)
 
+/* Get XSAVE extended state xcr0 from core dump.  */
+extern unsigned long long i386_linux_core_read_xcr0
+  (struct gdbarch *gdbarch, struct target_ops *target, bfd *abfd);
+
 /* Linux target description.  */
 extern struct target_desc *tdesc_i386_linux;
+extern struct target_desc *tdesc_i386_avx_linux;
+
+/* Supported register note sections.  */
+extern struct core_regset_section i386_linux_regset_sections[];
+
+/* Update XSAVE extended state register note section.  */
+extern void i386_linux_update_xstateregset
+  (struct core_regset_section *regset_sections, unsigned int xstate_size);
+
+/* Format of XSAVE extended state is:
+ 	struct
+	{
+	  fxsave_bytes[0..463]
+	  sw_usable_bytes[464..511]
+	  xstate_hdr_bytes[512..575]
+	  avx_bytes[576..831]
+	  future_state etc
+	};
+
+  The same memory layout is used for the NT_X86_XSTATE core dump note
+  representing the XSAVE extended state registers.
+
+  The first 8 bytes of sw_usable_bytes, [464..471], are set to the
+  OS-enabled state mask, which is the same as the 64-bit XCR0 value
+  returned by xgetbv.  We can use this mask, together with the mask
+  saved in the xstate_hdr bytes, to determine which states the
+  processor/OS supports and which states are in use (initialized) for
+  the particular process/thread.  */
+#define I386_LINUX_XSAVE_XCR0_OFFSET 464
 
 #endif /* i386-linux-tdep.h */
diff --git a/gdb/i386-tdep.c b/gdb/i386-tdep.c
index 83275ac..8a5f06a 100644
--- a/gdb/i386-tdep.c
+++ b/gdb/i386-tdep.c
@@ -50,11 +50,13 @@
 
 #include "i386-tdep.h"
 #include "i387-tdep.h"
+#include "i386-xstate.h"
 
 #include "record.h"
 #include <stdint.h>
 
 #include "features/i386/i386.c"
+#include "features/i386/i386-avx.c"
 
 /* Register names.  */
 
@@ -73,6 +75,18 @@ static const char *i386_register_names[] =
   "mxcsr"
 };
 
+static const char *i386_ymm_names[] =
+{
+  "ymm0",  "ymm1",   "ymm2",  "ymm3",
+  "ymm4",  "ymm5",   "ymm6",  "ymm7",
+};
+
+static const char *i386_ymmh_names[] =
+{
+  "ymm0h",  "ymm1h",   "ymm2h",  "ymm3h",
+  "ymm4h",  "ymm5h",   "ymm6h",  "ymm7h",
+};
+
 /* Register names for MMX pseudo-registers.  */
 
 static const char *i386_mmx_names[] =
@@ -149,18 +163,47 @@ i386_dword_regnum_p (struct gdbarch *gdbarch, int regnum)
   return regnum >= 0 && regnum < tdep->num_dword_regs;
 }
 
+int
+i386_ymmh_regnum_p (struct gdbarch *gdbarch, int regnum)
+{
+  struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+  int ymm0h_regnum = tdep->ymm0h_regnum;
+
+  if (ymm0h_regnum < 0)
+    return 0;
+
+  regnum -= ymm0h_regnum;
+  return regnum >= 0 && regnum < tdep->num_ymm_regs;
+}
+
+/* AVX register?  */
+
+int
+i386_ymm_regnum_p (struct gdbarch *gdbarch, int regnum)
+{
+  struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+  int ymm0_regnum = tdep->ymm0_regnum;
+
+  if (ymm0_regnum < 0)
+    return 0;
+
+  regnum -= ymm0_regnum;
+  return regnum >= 0 && regnum < tdep->num_ymm_regs;
+}
+
 /* SSE register?  */
 
-static int
-i386_sse_regnum_p (struct gdbarch *gdbarch, int regnum)
+int
+i386_xmm_regnum_p (struct gdbarch *gdbarch, int regnum)
 {
   struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+  int num_xmm_regs = I387_NUM_XMM_REGS (tdep);
 
-  if (I387_NUM_XMM_REGS (tdep) == 0)
+  if (num_xmm_regs == 0)
     return 0;
 
-  return (I387_XMM0_REGNUM (tdep) <= regnum
-	  && regnum < I387_MXCSR_REGNUM (tdep));
+  regnum -= I387_XMM0_REGNUM (tdep);
+  return regnum >= 0 && regnum < num_xmm_regs;
 }
 
 static int
@@ -200,6 +243,19 @@ i386_fpc_regnum_p (struct gdbarch *gdbarch, int regnum)
 	  && regnum < I387_XMM0_REGNUM (tdep));
 }
 
+/* Return the name of register REGNUM, or the empty string if it is
+   an anonymous register.  */
+
+static const char *
+i386_register_name (struct gdbarch *gdbarch, int regnum)
+{
+  /* Hide the upper YMM registers.  */
+  if (i386_ymmh_regnum_p (gdbarch, regnum))
+    return "";
+
+  return tdesc_register_name (gdbarch, regnum);
+}
+
 /* Return the name of register REGNUM.  */
 
 const char *
@@ -208,6 +264,8 @@ i386_pseudo_register_name (struct gdbarch *gdbarch, int regnum)
   struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
   if (i386_mmx_regnum_p (gdbarch, regnum))
     return i386_mmx_names[regnum - I387_MM0_REGNUM (tdep)];
+  else if (i386_ymm_regnum_p (gdbarch, regnum))
+    return i386_ymm_names[regnum - tdep->ymm0_regnum];
   else if (i386_byte_regnum_p (gdbarch, regnum))
     return i386_byte_names[regnum - tdep->al_regnum];
   else if (i386_word_regnum_p (gdbarch, regnum))
@@ -245,7 +303,13 @@ i386_dbx_reg_to_regnum (struct gdbarch *gdbarch, int reg)
   else if (reg >= 21 && reg <= 28)
     {
       /* SSE registers.  */
-      return reg - 21 + I387_XMM0_REGNUM (tdep);
+      int ymm0_regnum = tdep->ymm0_regnum;
+
+      if (ymm0_regnum >= 0
+	  && i386_xmm_regnum_p (gdbarch, reg))
+	return reg - 21 + ymm0_regnum;
+      else
+	return reg - 21 + I387_XMM0_REGNUM (tdep);
     }
   else if (reg >= 29 && reg <= 36)
     {
@@ -2183,6 +2247,59 @@ i387_ext_type (struct gdbarch *gdbarch)
   return tdep->i387_ext_type;
 }
 
+/* Construct vector type for pseudo YMM registers.  We can't use
+   tdesc_find_type since YMM isn't described in target description.  */
+
+static struct type *
+i386_ymm_type (struct gdbarch *gdbarch)
+{
+  struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+
+  if (!tdep->i386_ymm_type)
+    {
+      const struct builtin_type *bt = builtin_type (gdbarch);
+
+      /* The type we're building is this: */
+#if 0
+      union __gdb_builtin_type_vec256i
+      {
+        int128_t v2_int128[2];
+        int64_t v4_int64[4];
+        int32_t v8_int32[8];
+        int16_t v16_int16[16];
+        int8_t v32_int8[32];
+        double v4_double[4];
+        float v8_float[8];
+      };
+#endif
+
+      struct type *t;
+
+      t = arch_composite_type (gdbarch,
+			       "__gdb_builtin_type_vec256i", TYPE_CODE_UNION);
+      append_composite_type_field (t, "v8_float",
+				   init_vector_type (bt->builtin_float, 8));
+      append_composite_type_field (t, "v4_double",
+				   init_vector_type (bt->builtin_double, 4));
+      append_composite_type_field (t, "v32_int8",
+				   init_vector_type (bt->builtin_int8, 32));
+      append_composite_type_field (t, "v16_int16",
+				   init_vector_type (bt->builtin_int16, 16));
+      append_composite_type_field (t, "v8_int32",
+				   init_vector_type (bt->builtin_int32, 8));
+      append_composite_type_field (t, "v4_int64",
+				   init_vector_type (bt->builtin_int64, 4));
+      append_composite_type_field (t, "v2_int128",
+				   init_vector_type (bt->builtin_int128, 2));
+
+      TYPE_VECTOR (t) = 1;
+      TYPE_NAME (t) = "builtin_type_vec256i";
+      tdep->i386_ymm_type = t;
+    }
+
+  return tdep->i386_ymm_type;
+}
+
 /* Construct vector type for MMX registers.  */
 static struct type *
 i386_mmx_type (struct gdbarch *gdbarch)
@@ -2233,6 +2350,8 @@ i386_pseudo_register_type (struct gdbarch *gdbarch, int regnum)
 {
   if (i386_mmx_regnum_p (gdbarch, regnum))
     return i386_mmx_type (gdbarch);
+  else if (i386_ymm_regnum_p (gdbarch, regnum))
+    return i386_ymm_type (gdbarch);
   else
     {
       const struct builtin_type *bt = builtin_type (gdbarch);
@@ -2284,7 +2403,22 @@ i386_pseudo_register_read (struct gdbarch *gdbarch, struct regcache *regcache,
     {
       struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
 
-      if (i386_word_regnum_p (gdbarch, regnum))
+      if (i386_ymm_regnum_p (gdbarch, regnum))
+	{
+	  regnum -= tdep->ymm0_regnum;
+
+	  /* Extract (always little endian).  Read the lower 16 bytes.  */
+	  regcache_raw_read (regcache,
+			     I387_XMM0_REGNUM (tdep) + regnum,
+			     raw_buf);
+	  memcpy (buf, raw_buf, 16);
+	  /* Read the upper 16 bytes.  */
+	  regcache_raw_read (regcache,
+			     tdep->ymm0h_regnum + regnum,
+			     raw_buf);
+	  memcpy (buf + 16, raw_buf, 16);
+	}
+      else if (i386_word_regnum_p (gdbarch, regnum))
 	{
 	  int gpnum = regnum - tdep->ax_regnum;
 
@@ -2333,7 +2467,20 @@ i386_pseudo_register_write (struct gdbarch *gdbarch, struct regcache *regcache,
     {
       struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
 
-      if (i386_word_regnum_p (gdbarch, regnum))
+      if (i386_ymm_regnum_p (gdbarch, regnum))
+	{
+	  regnum -= tdep->ymm0_regnum;
+
+	  /* Write the lower 16 bytes.  */
+	  regcache_raw_write (regcache,
+			     I387_XMM0_REGNUM (tdep) + regnum,
+			     buf);
+	  /* Write the upper 16 bytes.  */
+	  regcache_raw_write (regcache,
+			     tdep->ymm0h_regnum + regnum,
+			     buf + 16);
+	}
+      else if (i386_word_regnum_p (gdbarch, regnum))
 	{
 	  int gpnum = regnum - tdep->ax_regnum;
 
@@ -2580,6 +2727,28 @@ i386_collect_fpregset (const struct regset *regset,
   i387_collect_fsave (regcache, regnum, fpregs);
 }
 
+/* Similar to i386_supply_fpregset, but use XSAVE extended state.  */
+
+static void
+i386_supply_xstateregset (const struct regset *regset,
+			  struct regcache *regcache, int regnum,
+			  const void *xstateregs, size_t len)
+{
+  const struct gdbarch_tdep *tdep = gdbarch_tdep (regset->arch);
+  i387_supply_xsave (regcache, regnum, xstateregs);
+}
+
+/* Similar to i386_collect_fpregset, but use XSAVE extended state.  */
+
+static void
+i386_collect_xstateregset (const struct regset *regset,
+			   const struct regcache *regcache,
+			   int regnum, void *xstateregs, size_t len)
+{
+  const struct gdbarch_tdep *tdep = gdbarch_tdep (regset->arch);
+  i387_collect_xsave (regcache, regnum, xstateregs, 1);
+}
+
 /* Return the appropriate register set for the core section identified
    by SECT_NAME and SECT_SIZE.  */
 
@@ -2607,6 +2776,16 @@ i386_regset_from_core_section (struct gdbarch *gdbarch,
       return tdep->fpregset;
     }
 
+  if (strcmp (sect_name, ".reg-xstate") == 0)
+    {
+      if (tdep->xstateregset == NULL)
+	tdep->xstateregset = regset_alloc (gdbarch,
+					   i386_supply_xstateregset,
+					   i386_collect_xstateregset);
+
+      return tdep->xstateregset;
+    }
+
   return NULL;
 }
 \f
@@ -2800,46 +2979,60 @@ int
 i386_register_reggroup_p (struct gdbarch *gdbarch, int regnum,
 			  struct reggroup *group)
 {
-  int sse_regnum_p, fp_regnum_p, mmx_regnum_p, byte_regnum_p,
-      word_regnum_p, dword_regnum_p;
+  const struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+  int fp_regnum_p, mmx_regnum_p, xmm_regnum_p, mxcsr_regnum_p,
+      ymm_regnum_p, ymmh_regnum_p;
 
   /* Don't include pseudo registers, except for MMX, in any register
      groups.  */
-  byte_regnum_p = i386_byte_regnum_p (gdbarch, regnum);
-  if (byte_regnum_p)
+  if (i386_byte_regnum_p (gdbarch, regnum))
     return 0;
 
-  word_regnum_p = i386_word_regnum_p (gdbarch, regnum);
-  if (word_regnum_p)
+  if (i386_word_regnum_p (gdbarch, regnum))
     return 0;
 
-  dword_regnum_p = i386_dword_regnum_p (gdbarch, regnum);
-  if (dword_regnum_p)
+  if (i386_dword_regnum_p (gdbarch, regnum))
     return 0;
 
   mmx_regnum_p = i386_mmx_regnum_p (gdbarch, regnum);
   if (group == i386_mmx_reggroup)
     return mmx_regnum_p;
 
-  sse_regnum_p = (i386_sse_regnum_p (gdbarch, regnum)
-		  || i386_mxcsr_regnum_p (gdbarch, regnum));
+  xmm_regnum_p = i386_xmm_regnum_p (gdbarch, regnum);
+  mxcsr_regnum_p = i386_mxcsr_regnum_p (gdbarch, regnum);
   if (group == i386_sse_reggroup)
-    return sse_regnum_p;
+    return xmm_regnum_p || mxcsr_regnum_p;
+
+  ymm_regnum_p = i386_ymm_regnum_p (gdbarch, regnum);
   if (group == vector_reggroup)
-    return mmx_regnum_p || sse_regnum_p;
+    return (mmx_regnum_p
+	    || ymm_regnum_p
+	    || mxcsr_regnum_p
+	    || (xmm_regnum_p
+		&& ((tdep->xcr0 & I386_XSTATE_AVX_MASK)
+		    == I386_XSTATE_SSE_MASK)));
 
   fp_regnum_p = (i386_fp_regnum_p (gdbarch, regnum)
 		 || i386_fpc_regnum_p (gdbarch, regnum));
   if (group == float_reggroup)
     return fp_regnum_p;
 
+  /* For "info reg all", include neither the upper YMM registers nor
+     the XMM registers when AVX is supported.  */
+  ymmh_regnum_p = i386_ymmh_regnum_p (gdbarch, regnum);
+  if (group == all_reggroup
+      && ((xmm_regnum_p
+	   && (tdep->xcr0 & I386_XSTATE_AVX))
+	  || ymmh_regnum_p))
+    return 0;
+
   if (group == general_reggroup)
     return (!fp_regnum_p
 	    && !mmx_regnum_p
-	    && !sse_regnum_p
-	    && !byte_regnum_p
-	    && !word_regnum_p
-	    && !dword_regnum_p);
+	    && !mxcsr_regnum_p
+	    && !xmm_regnum_p
+	    && !ymm_regnum_p
+	    && !ymmh_regnum_p);
 
   return default_register_reggroup_p (gdbarch, regnum, group);
 }
@@ -5652,7 +5845,8 @@ i386_validate_tdesc_p (struct gdbarch_tdep *tdep,
 		       struct tdesc_arch_data *tdesc_data)
 {
   const struct target_desc *tdesc = tdep->tdesc;
-  const struct tdesc_feature *feature_core, *feature_vector;
+  const struct tdesc_feature *feature_core;
+  const struct tdesc_feature *feature_sse, *feature_avx;
   int i, num_regs, valid_p;
 
   if (! tdesc_has_registers (tdesc))
@@ -5662,13 +5856,37 @@ i386_validate_tdesc_p (struct gdbarch_tdep *tdep,
   feature_core = tdesc_find_feature (tdesc, "org.gnu.gdb.i386.core");
 
   /* Get SSE registers.  */
-  feature_vector = tdesc_find_feature (tdesc, "org.gnu.gdb.i386.sse");
+  feature_sse = tdesc_find_feature (tdesc, "org.gnu.gdb.i386.sse");
 
-  if (feature_core == NULL || feature_vector == NULL)
+  if (feature_core == NULL || feature_sse == NULL)
     return 0;
 
+  /* Try AVX registers.  */
+  feature_avx = tdesc_find_feature (tdesc, "org.gnu.gdb.i386.avx");
+
   valid_p = 1;
 
+  /* The XCR0 bits.  */
+  if (feature_avx)
+    {
+      tdep->xcr0 = I386_XSTATE_AVX_MASK;
+
+      /* These may already have been set by ABI-specific code.  */
+      if (tdep->num_ymm_regs == 0)
+	{
+	  tdep->ymmh_register_names = i386_ymmh_names;
+	  tdep->num_ymm_regs = 8;
+	  tdep->ymm0h_regnum = I386_YMM0H_REGNUM;
+	}
+
+      for (i = 0; i < tdep->num_ymm_regs; i++)
+	valid_p &= tdesc_numbered_register (feature_avx, tdesc_data,
+					    tdep->ymm0h_regnum + i,
+					    tdep->ymmh_register_names[i]);
+    }
+  else
+    tdep->xcr0 = I386_XSTATE_SSE_MASK;
+
   num_regs = tdep->num_core_regs;
   for (i = 0; i < num_regs; i++)
     valid_p &= tdesc_numbered_register (feature_core, tdesc_data, i,
@@ -5677,7 +5895,7 @@ i386_validate_tdesc_p (struct gdbarch_tdep *tdep,
   /* Need to include %mxcsr, so add one.  */
   num_regs += tdep->num_xmm_regs + 1;
   for (; i < num_regs; i++)
-    valid_p &= tdesc_numbered_register (feature_vector, tdesc_data, i,
+    valid_p &= tdesc_numbered_register (feature_sse, tdesc_data, i,
 					tdep->register_names[i]);
 
   return valid_p;
@@ -5692,6 +5910,7 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   struct tdesc_arch_data *tdesc_data;
   const struct target_desc *tdesc;
   int mm0_regnum;
+  int ymm0_regnum;
 
   /* If there is already a candidate, use it.  */
   arches = gdbarch_list_lookup_by_info (arches, &info);
@@ -5712,6 +5931,8 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   tdep->fpregset = NULL;
   tdep->sizeof_fpregset = I387_SIZEOF_FSAVE;
 
+  tdep->xstateregset = NULL;
+
   /* The default settings include the FPU registers, the MMX registers
      and the SSE registers.  This can be overridden for a specific ABI
      by adjusting the members `st0_regnum', `mm0_regnum' and
@@ -5741,6 +5962,8 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   tdep->sc_pc_offset = -1;
   tdep->sc_sp_offset = -1;
 
+  tdep->xsave_xcr0_offset = -1;
+
   tdep->record_regmap = i386_record_regmap;
 
   /* The format used for `long double' on almost all i386 targets is
@@ -5857,9 +6080,13 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   set_tdesc_pseudo_register_type (gdbarch, i386_pseudo_register_type);
   set_tdesc_pseudo_register_name (gdbarch, i386_pseudo_register_name);
 
-  /* The default ABI includes general-purpose registers, 
-     floating-point registers, and the SSE registers.  */
-  set_gdbarch_num_regs (gdbarch, I386_SSE_NUM_REGS);
+  /* Override the normal target description method to make the AVX
+     upper halves anonymous.  */
+  set_gdbarch_register_name (gdbarch, i386_register_name);
+
+  /* The default ABI includes general-purpose registers, floating-point
+     registers, the SSE registers and the upper AVX registers.  */
+  set_gdbarch_num_regs (gdbarch, I386_AVX_NUM_REGS);
 
   /* Get the x86 target description from INFO.  */
   tdesc = info.target_desc;
@@ -5870,10 +6097,15 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   tdep->num_core_regs = I386_NUM_GREGS + I387_NUM_REGS;
   tdep->register_names = i386_register_names;
 
+  /* No upper YMM registers.  */
+  tdep->ymmh_register_names = NULL;
+  tdep->ymm0h_regnum = -1;
+
   tdep->num_byte_regs = 8;
   tdep->num_word_regs = 8;
   tdep->num_dword_regs = 0;
   tdep->num_mmx_regs = 8;
+  tdep->num_ymm_regs = 0;
 
   tdesc_data = tdesc_data_alloc ();
 
@@ -5881,24 +6113,25 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   info.tdep_info = (void *) tdesc_data;
   gdbarch_init_osabi (info, gdbarch);
 
+  if (!i386_validate_tdesc_p (tdep, tdesc_data))
+    {
+      tdesc_data_cleanup (tdesc_data);
+      xfree (tdep);
+      gdbarch_free (gdbarch);
+      return NULL;
+    }
+
   /* Wire in pseudo registers.  Number of pseudo registers may be
      changed.  */
   set_gdbarch_num_pseudo_regs (gdbarch, (tdep->num_byte_regs
 					 + tdep->num_word_regs
 					 + tdep->num_dword_regs
-					 + tdep->num_mmx_regs));
+					 + tdep->num_mmx_regs
+					 + tdep->num_ymm_regs));
 
   /* Target description may be changed.  */
   tdesc = tdep->tdesc;
 
-  if (!i386_validate_tdesc_p (tdep, tdesc_data))
-    {
-      tdesc_data_cleanup (tdesc_data);
-      xfree (tdep);
-      gdbarch_free (gdbarch);
-      return NULL;
-    }
-
   tdesc_use_registers (gdbarch, tdesc, tdesc_data);
 
   /* Override gdbarch_register_reggroup_p set in tdesc_use_registers.  */
@@ -5908,16 +6141,26 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   tdep->al_regnum = gdbarch_num_regs (gdbarch);
   tdep->ax_regnum = tdep->al_regnum + tdep->num_byte_regs;
 
-  mm0_regnum = tdep->ax_regnum + tdep->num_word_regs;
+  ymm0_regnum = tdep->ax_regnum + tdep->num_word_regs;
   if (tdep->num_dword_regs)
     {
       /* Support dword pseudo-registesr if it hasn't been disabled,  */
-      tdep->eax_regnum = mm0_regnum;
-      mm0_regnum = tdep->eax_regnum + tdep->num_dword_regs;
+      tdep->eax_regnum = ymm0_regnum;
+      ymm0_regnum += tdep->num_dword_regs;
     }
   else
     tdep->eax_regnum = -1;
 
+  mm0_regnum = ymm0_regnum;
+  if (tdep->num_ymm_regs)
+    {
+      /* Support YMM pseudo-registers if they are available.  */
+      tdep->ymm0_regnum = ymm0_regnum;
+      mm0_regnum += tdep->num_ymm_regs;
+    }
+  else
+    tdep->ymm0_regnum = -1;
+
   if (tdep->num_mmx_regs != 0)
     {
       /* Support MMX pseudo-registesr if MMX hasn't been disabled,  */
@@ -5943,6 +6186,9 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   set_gdbarch_fast_tracepoint_valid_at (gdbarch,
 					i386_fast_tracepoint_valid_at);
 
+  /* Tell remote stub that we support XML target description.  */
+  set_gdbarch_qsupported (gdbarch, "x86=xml");
+
   return gdbarch;
 }
 
@@ -6000,4 +6246,5 @@ is \"default\"."),
 
   /* Initialize the standard target descriptions.  */
   initialize_tdesc_i386 ();
+  initialize_tdesc_i386_avx ();
 }
diff --git a/gdb/i386-tdep.h b/gdb/i386-tdep.h
index 72c634e..1ce9d8c 100644
--- a/gdb/i386-tdep.h
+++ b/gdb/i386-tdep.h
@@ -109,6 +109,9 @@ struct gdbarch_tdep
   struct regset *fpregset;
   size_t sizeof_fpregset;
 
+  /* XSAVE extended state.  */
+  struct regset *xstateregset;
+
   /* Register number for %st(0).  The register numbers for the other
      registers follow from this one.  Set this to -1 to indicate the
      absence of an FPU.  */
@@ -121,6 +124,13 @@ struct gdbarch_tdep
      of MMX support.  */
   int mm0_regnum;
 
+  /* Number of pseudo YMM registers.  */
+  int num_ymm_regs;
+
+  /* Register number for %ymm0.  Set this to -1 to indicate the absence
+     of pseudo YMM register support.  */
+  int ymm0_regnum;
+
   /* Number of byte registers.  */
   int num_byte_regs;
 
@@ -146,9 +156,24 @@ struct gdbarch_tdep
   /* Number of SSE registers.  */
   int num_xmm_regs;
 
+  /* Bits of the extended control register 0 (the XFEATURE_ENABLED_MASK
+     register), excluding the x87 bit, which are supported by this gdb.
+   */
+  unsigned long long xcr0;
+
+  /* Offset of XCR0 in XSAVE extended state.  */
+  int xsave_xcr0_offset;
+
   /* Register names.  */
   const char **register_names;
 
+  /* Register number for %ymm0h.  Set this to -1 to indicate the absence
+     of upper YMM register support.  */
+  int ymm0h_regnum;
+
+  /* Upper YMM register names.  Only used for tdesc_numbered_register.  */
+  const char **ymmh_register_names;
+
   /* Target description.  */
   const struct target_desc *tdesc;
 
@@ -182,6 +207,7 @@ struct gdbarch_tdep
 
   /* ISA-specific data types.  */
   struct type *i386_mmx_type;
+  struct type *i386_ymm_type;
   struct type *i387_ext_type;
 
   /* Process record/replay target.  */
@@ -228,7 +254,9 @@ enum i386_regnum
   I386_FS_REGNUM,		/* %fs */
   I386_GS_REGNUM,		/* %gs */
   I386_ST0_REGNUM,		/* %st(0) */
-  I386_MXCSR_REGNUM = 40	/* %mxcsr */ 
+  I386_MXCSR_REGNUM = 40,	/* %mxcsr */ 
+  I386_YMM0H_REGNUM,		/* %ymm0h */
+  I386_YMM7H_REGNUM = I386_YMM0H_REGNUM + 7
 };
 
 /* Register numbers of RECORD_REGMAP.  */
@@ -265,6 +293,7 @@ enum record_i386_regnum
 #define I386_NUM_XREGS  9
 
 #define I386_SSE_NUM_REGS	(I386_MXCSR_REGNUM + 1)
+#define I386_AVX_NUM_REGS	(I386_YMM7H_REGNUM + 1)
 
 /* Size of the largest register.  */
 #define I386_MAX_REGISTER_SIZE	16
@@ -276,6 +305,9 @@ extern struct type *i387_ext_type (struct gdbarch *gdbarch);
 extern int i386_byte_regnum_p (struct gdbarch *gdbarch, int regnum);
 extern int i386_word_regnum_p (struct gdbarch *gdbarch, int regnum);
 extern int i386_dword_regnum_p (struct gdbarch *gdbarch, int regnum);
+extern int i386_xmm_regnum_p (struct gdbarch *gdbarch, int regnum);
+extern int i386_ymm_regnum_p (struct gdbarch *gdbarch, int regnum);
+extern int i386_ymmh_regnum_p (struct gdbarch *gdbarch, int regnum);
 
 extern const char *i386_pseudo_register_name (struct gdbarch *gdbarch,
 					      int regnum);
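
For readers following the i386_pseudo_register_read and
i386_pseudo_register_write hunks above, here is a minimal standalone sketch
of how a 256-bit %ymmN pseudo register is assembled from two 128-bit raw
registers. It is not GDB code: `struct raw_regs`, `ymm_read` and `ymm_write`
are hypothetical stand-ins for the regcache and its raw read/write calls,
and the register count of 8 is taken from the i386 part of the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NUM_YMM_REGS 8	/* %ymm0..%ymm7 on i386, as in the patch.  */
#define HALF_SIZE 16	/* Bytes in %xmmN and in the upper half %ymmNh.  */

/* Hypothetical stand-in for the raw register cache: the XMM registers
   and the anonymous upper YMM halves are separate raw registers.  */
struct raw_regs
{
  uint8_t xmm[NUM_YMM_REGS][HALF_SIZE];
  uint8_t ymmh[NUM_YMM_REGS][HALF_SIZE];
};

/* Read pseudo register %ymmN into BUF (32 bytes).  The lower 16 bytes
   come from %xmmN, the upper 16 bytes from %ymmNh.  */
static void
ymm_read (const struct raw_regs *regs, int n, uint8_t *buf)
{
  assert (n >= 0 && n < NUM_YMM_REGS);
  memcpy (buf, regs->xmm[n], HALF_SIZE);
  memcpy (buf + HALF_SIZE, regs->ymmh[n], HALF_SIZE);
}

/* Write BUF (32 bytes) to pseudo register %ymmN by splitting it into
   the two raw registers.  */
static void
ymm_write (struct raw_regs *regs, int n, const uint8_t *buf)
{
  assert (n >= 0 && n < NUM_YMM_REGS);
  memcpy (regs->xmm[n], buf, HALF_SIZE);
  memcpy (regs->ymmh[n], buf + HALF_SIZE, HALF_SIZE);
}
```

Because the lower half is stored in the XMM raw register itself, a write to
%ymmN also changes the value seen through %xmmN, which is consistent with
$xmmX giving access to the lower 128 bits of $ymmX.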


* Re: PATCH: 4/6 [2nd try]: Add AVX support (amd64 changes)
  2010-03-07 21:33       ` H.J. Lu
@ 2010-03-12 17:01         ` H.J. Lu
  2010-03-13  1:38           ` H.J. Lu
  2010-03-29  1:07           ` PATCH: 4/6 [3rd " H.J. Lu
  0 siblings, 2 replies; 115+ messages in thread
From: H.J. Lu @ 2010-03-12 17:01 UTC (permalink / raw)
  To: GDB

On Sun, Mar 07, 2010 at 01:33:04PM -0800, H.J. Lu wrote:
> On Sat, Mar 06, 2010 at 02:21:22PM -0800, H.J. Lu wrote:
> > Hi,
> > 
> > Here are the amd64 changes to support AVX.  OK to install?
> > 
> 

Hi,

Here is the updated patch. Any comments/suggestions?

Thanks.


H.J.
---
2010-03-12  H.J. Lu  <hongjiu.lu@intel.com>

	* amd64-linux-nat.c: Include "regset.h", "elf/common.h",
	<sys/uio.h> and "i386-xstate.h".
	(PTRACE_GETREGSET): New.
	(PTRACE_SETREGSET): Likewise.
	(xstate_size): Likewise.
	(xstate_size_n_of_int64): Likewise.
	(have_ptrace_getregset): Likewise.
	(amd64_linux_gregset64_reg_offset): Include 16 upper YMM
	registers.
	(amd64_linux_gregset32_reg_offset): Include 8 upper YMM
	registers.
	(amd64_linux_fetch_inferior_registers): Support PTRACE_GETREGSET.
	(amd64_linux_store_inferior_registers): Likewise.
	(amd64_linux_read_description): Check and enable AVX target
	descriptions.

	* amd64-linux-tdep.c: Include "regset.h", "i386-linux-tdep.h"
	and "features/i386/amd64-avx-linux.c".
	(amd64_linux_regset_sections): New.
	(amd64_linux_core_read_description): Check and enable AVX
	target description.
	(amd64_linux_init_abi): Set xsave_xcr0_offset.  Call
	set_gdbarch_core_regset_sections.
	(_initialize_amd64_linux_tdep): Call
	initialize_tdesc_amd64_avx_linux.

	* amd64-linux-tdep.h (AMD64_LINUX_ORIG_RAX_REGNUM): Replace
	AMD64_MXCSR_REGNUM with AMD64_YMM15H_REGNUM.
	(tdesc_amd64_avx_linux): New.
	(amd64_linux_regset_sections): Likewise.

	* amd64-tdep.c: Include "features/i386/amd64-avx.c".
	(amd64_ymm_names): New.
	(amd64_ymmh_names): Likewise.
	(amd64_register_name): Likewise.
	(amd64_supply_xstateregset): Likewise.
	(amd64_collect_xstateregset): Likewise.
	(amd64_supply_xsave): Likewise.
	(amd64_collect_xsave): Likewise.
	(AMD64_NUM_REGS): Removed.
	(amd64_dwarf_reg_to_regnum): Return %ymmN register number for
	%xmmN if AVX is available.
	(amd64_pseudo_register_name): Support pseudo YMM registers.
	(amd64_regset_from_core_section): Support .reg-xstate section.
	(amd64_init_abi): Set ymmh_register_names, num_ymm_regs
	and ymm0h_regnum.  Call set_gdbarch_register_name.
	(_initialize_amd64_tdep): Call initialize_tdesc_amd64_avx.

	* amd64-tdep.h (amd64_regnum): Add AMD64_YMM0H_REGNUM and
	AMD64_YMM15H_REGNUM.
	(AMD64_NUM_REGS): New.
	(amd64_supply_xsave): Likewise.
	(amd64_collect_xsave): Likewise.
	(amd64_register_name): Removed.
	(amd64_register_type): Likewise.
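
The selection logic summarized by the amd64_linux_read_description and
amd64_linux_core_read_description entries can be sketched in isolation as
below. This is only an illustration under stated assumptions: the mask bits
(x87 = bit 0, SSE = bit 1, AVX = bit 2) and the sizes 576 and 832 are assumed
to match i386-xstate.h, the offset 464 is I386_LINUX_XSAVE_XCR0_OFFSET, and
the names `xsave_read_xcr0`, `xstate_size_for`, `select_tdesc` and
`enum tdesc_kind` are hypothetical. XCR0 is decoded explicitly as little
endian here, whereas the native code indexes an array of 64-bit words on a
little-endian host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values, intended to mirror i386-xstate.h.  */
#define XSTATE_X87 (1ULL << 0)
#define XSTATE_SSE (1ULL << 1)
#define XSTATE_AVX (1ULL << 2)
#define XSTATE_SSE_MASK (XSTATE_X87 | XSTATE_SSE)
#define XSTATE_AVX_MASK (XSTATE_SSE_MASK | XSTATE_AVX)
#define XSTATE_SSE_SIZE 576
#define XSTATE_AVX_SIZE 832

/* Linux stores XCR0 in the first 8 software-usable bytes.  */
#define XSAVE_XCR0_OFFSET 464

/* Hypothetical labels for the four target descriptions.  */
enum tdesc_kind
{
  TDESC_I386, TDESC_I386_AVX, TDESC_AMD64, TDESC_AMD64_AVX
};

/* Decode the little-endian XCR0 value from an XSAVE buffer.  */
static uint64_t
xsave_read_xcr0 (const uint8_t *xsave, size_t size)
{
  uint64_t xcr0 = 0;
  int i;

  assert (size >= XSTATE_SSE_SIZE);
  for (i = 7; i >= 0; i--)
    xcr0 = (xcr0 << 8) | xsave[XSAVE_XCR0_OFFSET + i];
  return xcr0;
}

/* Size in bytes of the extended state implied by XCR0.  */
static unsigned int
xstate_size_for (uint64_t xcr0)
{
  return (xcr0 & XSTATE_AVX) != 0 ? XSTATE_AVX_SIZE : XSTATE_SSE_SIZE;
}

/* Pick a target description: AVX only if x87, SSE and AVX are all set.  */
static enum tdesc_kind
select_tdesc (uint64_t xcr0, int is_64bit)
{
  int avx = (xcr0 & XSTATE_AVX_MASK) == XSTATE_AVX_MASK;

  if (avx)
    return is_64bit ? TDESC_AMD64_AVX : TDESC_I386_AVX;
  return is_64bit ? TDESC_AMD64 : TDESC_I386;
}
```

With these assumed values, an XCR0 of 7 selects an AVX description and a
buffer of 832 bytes (104 64-bit words), while an XCR0 of 3 falls back to the
SSE description and 576 bytes.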

diff --git a/gdb/amd64-linux-nat.c b/gdb/amd64-linux-nat.c
index b9d5833..344aeff 100644
--- a/gdb/amd64-linux-nat.c
+++ b/gdb/amd64-linux-nat.c
@@ -23,11 +23,14 @@
 #include "inferior.h"
 #include "gdbcore.h"
 #include "regcache.h"
+#include "regset.h"
 #include "linux-nat.h"
 #include "amd64-linux-tdep.h"
 
 #include "gdb_assert.h"
 #include "gdb_string.h"
+#include "elf/common.h"
+#include <sys/uio.h>
 #include <sys/ptrace.h>
 #include <sys/debugreg.h>
 #include <sys/syscall.h>
@@ -51,6 +54,25 @@
 #include "i386-linux-tdep.h"
 #include "amd64-nat.h"
 #include "i386-nat.h"
+#include "i386-xstate.h"
+
+#ifndef PTRACE_GETREGSET
+#define PTRACE_GETREGSET	0x4204
+#endif
+
+#ifndef PTRACE_SETREGSET
+#define PTRACE_SETREGSET	0x4205
+#endif
+
+/* The extended state size in bytes.  */
+static unsigned int xstate_size;
+
+/* The extended state size in units of int64.  We use an array of
+   int64 for better alignment.  */
+static unsigned int xstate_size_n_of_int64;
+
+/* Does the current host support PTRACE_GETREGSET?  */
+static int have_ptrace_getregset = -1;
 
 /* Mapping between the general-purpose registers in GNU/Linux x86-64
    `struct user' format and GDB's register cache layout.  */
@@ -73,6 +95,8 @@ static int amd64_linux_gregset64_reg_offset[] =
   -1, -1, -1, -1, -1, -1, -1, -1,
   -1, -1, -1, -1, -1, -1, -1, -1,
   -1, -1, -1, -1, -1, -1, -1, -1, -1,
+  -1, -1, -1, -1, -1, -1, -1, -1,
+  -1, -1, -1, -1, -1, -1, -1, -1,
   ORIG_RAX * 8
 };
 \f
@@ -99,6 +123,7 @@ static int amd64_linux_gregset32_reg_offset[] =
   -1, -1, -1, -1, -1, -1, -1, -1,
   -1, -1, -1, -1, -1, -1, -1, -1,
   -1, -1, -1, -1, -1, -1, -1, -1, -1,
+  -1, -1, -1, -1, -1, -1, -1, -1,
   ORIG_RAX * 8			/* "orig_eax" */
 };
 \f
@@ -183,10 +208,26 @@ amd64_linux_fetch_inferior_registers (struct target_ops *ops,
     {
       elf_fpregset_t fpregs;
 
-      if (ptrace (PTRACE_GETFPREGS, tid, 0, (long) &fpregs) < 0)
-	perror_with_name (_("Couldn't get floating point status"));
+      if (have_ptrace_getregset)
+	{
+	  unsigned long long xstateregs[xstate_size_n_of_int64];
+	  struct iovec iov;
 
-      amd64_supply_fxsave (regcache, -1, &fpregs);
+	  iov.iov_base = xstateregs;
+	  iov.iov_len = xstate_size;
+	  if (ptrace (PTRACE_GETREGSET, tid,
+		      (unsigned int) NT_X86_XSTATE, (long) &iov) < 0)
+	    perror_with_name (_("Couldn't get extended state status"));
+
+	  amd64_supply_xsave (regcache, -1, xstateregs);
+	}
+      else
+	{
+	  if (ptrace (PTRACE_GETFPREGS, tid, 0, (long) &fpregs) < 0)
+	    perror_with_name (_("Couldn't get floating point status"));
+
+	  amd64_supply_fxsave (regcache, -1, &fpregs);
+	}
     }
 }
 
@@ -226,15 +267,33 @@ amd64_linux_store_inferior_registers (struct target_ops *ops,
     {
       elf_fpregset_t fpregs;
 
-      if (ptrace (PTRACE_GETFPREGS, tid, 0, (long) &fpregs) < 0)
-	perror_with_name (_("Couldn't get floating point status"));
+      if (have_ptrace_getregset)
+	{
+	  unsigned long long xstateregs[xstate_size_n_of_int64];
+	  struct iovec iov;
+
+	  iov.iov_base = xstateregs;
+	  iov.iov_len = xstate_size;
+	  if (ptrace (PTRACE_GETREGSET, tid,
+		      (unsigned int) NT_X86_XSTATE, (long) &iov) < 0)
+	    perror_with_name (_("Couldn't get extended state status"));
 
-      amd64_collect_fxsave (regcache, regnum, &fpregs);
+	  amd64_collect_xsave (regcache, regnum, xstateregs, 0);
+
+	  if (ptrace (PTRACE_SETREGSET, tid,
+		      (unsigned int) NT_X86_XSTATE, (long) &iov) < 0)
+	    perror_with_name (_("Couldn't write extended state status"));
+	}
+      else
+	{
+	  if (ptrace (PTRACE_GETFPREGS, tid, 0, (long) &fpregs) < 0)
+	    perror_with_name (_("Couldn't get floating point status"));
 
-      if (ptrace (PTRACE_SETFPREGS, tid, 0, (long) &fpregs) < 0)
-	perror_with_name (_("Couldn't write floating point status"));
+	  amd64_collect_fxsave (regcache, regnum, &fpregs);
 
-      return;
+	  if (ptrace (PTRACE_SETFPREGS, tid, 0, (long) &fpregs) < 0)
+	    perror_with_name (_("Couldn't write floating point status"));
+	}
     }
 }
 \f
@@ -688,6 +747,8 @@ amd64_linux_read_description (struct target_ops *ops)
 {
   unsigned long cs;
   int tid;
+  int is_64bit;
+  static unsigned long long xcr0;
 
   /* GNU/Linux LWP ID's are process ID's.  */
   tid = TIDGET (inferior_ptid);
@@ -701,10 +762,53 @@ amd64_linux_read_description (struct target_ops *ops)
   if (errno != 0)
     perror_with_name (_("Couldn't get CS register"));
 
-  if (cs == AMD64_LINUX_USER64_CS)
-    return tdesc_amd64_linux;
+  is_64bit = cs == AMD64_LINUX_USER64_CS;
+
+  if (have_ptrace_getregset == -1)
+    {
+      unsigned long long xstateregs[(I386_XSTATE_SSE_SIZE
+				     / sizeof (long long))];
+      struct iovec iov;
+
+      iov.iov_base = xstateregs;
+      iov.iov_len = I386_XSTATE_SSE_SIZE;
+
+      /* Check if PTRACE_GETREGSET works.  */
+      if (ptrace (PTRACE_GETREGSET, tid,
+		  (unsigned int) NT_X86_XSTATE, (long) &iov) < 0)
+	have_ptrace_getregset = 0;
+      else
+	{
+	  have_ptrace_getregset = 1;
+
+	  /* Get XCR0 from XSAVE extended state.  */
+	  xcr0 = xstateregs[(I386_LINUX_XSAVE_XCR0_OFFSET
+			     / sizeof (long long))];
+
+	  xstate_size = I386_XSTATE_SIZE (xcr0);
+	  xstate_size_n_of_int64 = xstate_size / sizeof (long long);
+	}
+
+      i386_linux_update_xstateregset (amd64_linux_regset_sections,
+				      xstate_size);
+    }
+
+  /* Check the native XCR0 only if PTRACE_GETREGSET is available.  */
+  if (have_ptrace_getregset
+      && (xcr0 & I386_XSTATE_AVX_MASK) == I386_XSTATE_AVX_MASK)
+    {
+      if (is_64bit)
+	return tdesc_amd64_avx_linux;
+      else
+	return tdesc_i386_avx_linux;
+    }
   else
-    return tdesc_i386_linux;
+    {
+      if (is_64bit)
+	return tdesc_amd64_linux;
+      else
+	return tdesc_i386_linux;
+    }
 }
 
 /* Provide a prototype to silence -Wmissing-prototypes.  */
diff --git a/gdb/amd64-linux-tdep.c b/gdb/amd64-linux-tdep.c
index 4ad6dc9..3473926 100644
--- a/gdb/amd64-linux-tdep.c
+++ b/gdb/amd64-linux-tdep.c
@@ -28,8 +28,11 @@
 #include "symtab.h"
 #include "gdbtypes.h"
 #include "reggroups.h"
+#include "regset.h"
 #include "amd64-linux-tdep.h"
+#include "i386-linux-tdep.h"
 #include "linux-tdep.h"
+#include "i386-xstate.h"
 
 #include "gdb_string.h"
 
@@ -38,6 +41,7 @@
 #include "xml-syscall.h"
 
 #include "features/i386/amd64-linux.c"
+#include "features/i386/amd64-avx-linux.c"
 
 /* The syscall's XML filename for i386.  */
 #define XML_SYSCALL_FILENAME_AMD64 "syscalls/amd64-linux.xml"
@@ -45,6 +49,15 @@
 #include "record.h"
 #include "linux-record.h"
 
+/* Supported register note sections.  */
+struct core_regset_section amd64_linux_regset_sections[] =
+{
+  { ".reg", 144, "general-purpose" },
+  { ".reg2", 512, "floating-point" },
+  { ".reg-xstate", 0, "XSAVE extended state" },
+  { NULL, 0 }
+};
+
 /* Mapping between the general-purpose registers in `struct user'
    format and GDB's register cache layout.  */
 
@@ -1250,12 +1263,17 @@ amd64_linux_core_read_description (struct gdbarch *gdbarch,
 				  bfd *abfd)
 {
   asection *section = bfd_get_section_by_name (abfd, ".reg2");
+  unsigned long long xcr0;
 
   if (section == NULL)
     return NULL;
 
   /* Linux/x86-64.  */
-  return tdesc_amd64_linux;
+  xcr0 = i386_linux_core_read_xcr0 (gdbarch, target, abfd);
+  if ((xcr0 & I386_XSTATE_AVX_MASK) == I386_XSTATE_AVX_MASK)
+    return tdesc_amd64_avx_linux;
+  else
+    return tdesc_amd64_linux;
 }
 
 static void
@@ -1297,6 +1315,8 @@ amd64_linux_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch)
   tdep->sc_reg_offset = amd64_linux_sc_reg_offset;
   tdep->sc_num_regs = ARRAY_SIZE (amd64_linux_sc_reg_offset);
 
+  tdep->xsave_xcr0_offset = I386_LINUX_XSAVE_XCR0_OFFSET;
+
   /* GNU/Linux uses SVR4-style shared libraries.  */
   set_solib_svr4_fetch_link_map_offsets
     (gdbarch, svr4_lp64_fetch_link_map_offsets);
@@ -1318,6 +1338,9 @@ amd64_linux_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch)
   /* GNU/Linux uses SVR4-style shared libraries.  */
   set_gdbarch_skip_trampoline_code (gdbarch, find_solib_trampoline_target);
 
+  /* Install supported register note sections.  */
+  set_gdbarch_core_regset_sections (gdbarch, amd64_linux_regset_sections);
+
   set_gdbarch_core_read_description (gdbarch,
 				     amd64_linux_core_read_description);
 
@@ -1517,4 +1540,5 @@ _initialize_amd64_linux_tdep (void)
 
   /* Initialize the Linux target description  */
   initialize_tdesc_amd64_linux ();
+  initialize_tdesc_amd64_avx_linux ();
 }
diff --git a/gdb/amd64-linux-tdep.h b/gdb/amd64-linux-tdep.h
index 33316fb..734f117 100644
--- a/gdb/amd64-linux-tdep.h
+++ b/gdb/amd64-linux-tdep.h
@@ -26,13 +26,17 @@
 /* Register number for the "orig_rax" register.  If this register
    contains a value >= 0 it is interpreted as the system call number
    that the kernel is supposed to restart.  */
-#define AMD64_LINUX_ORIG_RAX_REGNUM (AMD64_MXCSR_REGNUM + 1)
+#define AMD64_LINUX_ORIG_RAX_REGNUM (AMD64_YMM15H_REGNUM + 1)
 
 /* Total number of registers for GNU/Linux.  */
 #define AMD64_LINUX_NUM_REGS (AMD64_LINUX_ORIG_RAX_REGNUM + 1)
 
 /* Linux target description.  */
 extern struct target_desc *tdesc_amd64_linux;
+extern struct target_desc *tdesc_amd64_avx_linux;
+
+/* Supported register note sections.  */
+extern struct core_regset_section amd64_linux_regset_sections[];
 
 /* Enum that defines the syscall identifiers for amd64 linux.
    Used for process record/replay, these will be translated into
diff --git a/gdb/amd64-tdep.c b/gdb/amd64-tdep.c
index e5cfa71..aa4acfb 100644
--- a/gdb/amd64-tdep.c
+++ b/gdb/amd64-tdep.c
@@ -43,6 +43,7 @@
 #include "i387-tdep.h"
 
 #include "features/i386/amd64.c"
+#include "features/i386/amd64-avx.c"
 
 /* Note that the AMD64 architecture was previously known as x86-64.
    The latter is (forever) engraved into the canonical system name as
@@ -71,8 +72,21 @@ static const char *amd64_register_names[] =
   "mxcsr",
 };
 
-/* Total number of registers.  */
-#define AMD64_NUM_REGS	ARRAY_SIZE (amd64_register_names)
+static const char *amd64_ymm_names[] = 
+{
+  "ymm0", "ymm1", "ymm2", "ymm3",
+  "ymm4", "ymm5", "ymm6", "ymm7",
+  "ymm8", "ymm9", "ymm10", "ymm11",
+  "ymm12", "ymm13", "ymm14", "ymm15"
+};
+
+static const char *amd64_ymmh_names[] = 
+{
+  "ymm0h", "ymm1h", "ymm2h", "ymm3h",
+  "ymm4h", "ymm5h", "ymm6h", "ymm7h",
+  "ymm8h", "ymm9h", "ymm10h", "ymm11h",
+  "ymm12h", "ymm13h", "ymm14h", "ymm15h"
+};
 
 /* The registers used to pass integer arguments during a function call.  */
 static int amd64_dummy_call_integer_regs[] =
@@ -163,6 +177,8 @@ static const int amd64_dwarf_regmap_len =
 static int
 amd64_dwarf_reg_to_regnum (struct gdbarch *gdbarch, int reg)
 {
+  struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+  int ymm0_regnum = tdep->ymm0_regnum;
   int regnum = -1;
 
   if (reg >= 0 && reg < amd64_dwarf_regmap_len)
@@ -170,6 +186,9 @@ amd64_dwarf_reg_to_regnum (struct gdbarch *gdbarch, int reg)
 
   if (regnum == -1)
     warning (_("Unmapped DWARF Register #%d encountered."), reg);
+  else if (ymm0_regnum >= 0
+	   && i386_xmm_regnum_p (gdbarch, regnum))
+    regnum += ymm0_regnum - I387_XMM0_REGNUM (tdep);
 
   return regnum;
 }
@@ -234,6 +253,19 @@ static const char *amd64_dword_names[] =
   "r8d", "r9d", "r10d", "r11d", "r12d", "r13d", "r14d", "r15d"
 };
 
+/* Return the name of register REGNUM, or the empty string if it is
+   an anonymous register. */
+
+static const char *
+amd64_register_name (struct gdbarch *gdbarch, int regnum)
+{
+  /* Hide the upper YMM registers.  */
+  if (i386_ymmh_regnum_p (gdbarch, regnum))
+    return "";
+
+  return tdesc_register_name (gdbarch, regnum);
+}
+
 /* Return the name of register REGNUM.  */
 
 static const char *
@@ -242,6 +274,8 @@ amd64_pseudo_register_name (struct gdbarch *gdbarch, int regnum)
   struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
   if (i386_byte_regnum_p (gdbarch, regnum))
     return amd64_byte_names[regnum - tdep->al_regnum];
+  else if (i386_ymm_regnum_p (gdbarch, regnum))
+    return amd64_ymm_names[regnum - tdep->ymm0_regnum];
   else if (i386_word_regnum_p (gdbarch, regnum))
     return amd64_word_names[regnum - tdep->ax_regnum];
   else if (i386_dword_regnum_p (gdbarch, regnum))
@@ -2148,6 +2182,28 @@ amd64_collect_fpregset (const struct regset *regset,
   amd64_collect_fxsave (regcache, regnum, fpregs);
 }
 
+/* Similar to amd64_supply_fpregset, but use XSAVE extended state.  */
+
+static void
+amd64_supply_xstateregset (const struct regset *regset,
+			   struct regcache *regcache, int regnum,
+			   const void *xstateregs, size_t len)
+{
+  const struct gdbarch_tdep *tdep = gdbarch_tdep (regset->arch);
+  amd64_supply_xsave (regcache, regnum, xstateregs);
+}
+
+/* Similar to amd64_collect_fpregset, but use XSAVE extended state.  */
+
+static void
+amd64_collect_xstateregset (const struct regset *regset,
+			    const struct regcache *regcache,
+			    int regnum, void *xstateregs, size_t len)
+{
+  const struct gdbarch_tdep *tdep = gdbarch_tdep (regset->arch);
+  amd64_collect_xsave (regcache, regnum, xstateregs, 1);
+}
+
 /* Return the appropriate register set for the core section identified
    by SECT_NAME and SECT_SIZE.  */
 
@@ -2166,6 +2222,16 @@ amd64_regset_from_core_section (struct gdbarch *gdbarch,
       return tdep->fpregset;
     }
 
+  if (strcmp (sect_name, ".reg-xstate") == 0)
+    {
+      if (tdep->xstateregset == NULL)
+	tdep->xstateregset = regset_alloc (gdbarch,
+					   amd64_supply_xstateregset,
+					   amd64_collect_xstateregset);
+
+      return tdep->xstateregset;
+    }
+
   return i386_regset_from_core_section (gdbarch, sect_name, sect_size);
 }
 \f
@@ -2228,6 +2294,13 @@ amd64_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch)
   tdep->num_core_regs = AMD64_NUM_GREGS + I387_NUM_REGS;
   tdep->register_names = amd64_register_names;
 
+  if (tdesc_find_feature (tdesc, "org.gnu.gdb.i386.avx") != NULL)
+    {
+      tdep->ymmh_register_names = amd64_ymmh_names;
+      tdep->num_ymm_regs = 16;
+      tdep->ymm0h_regnum = AMD64_YMM0H_REGNUM;
+    }
+
   tdep->num_byte_regs = 16;
   tdep->num_word_regs = 16;
   tdep->num_dword_regs = 16;
@@ -2241,6 +2314,8 @@ amd64_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch)
 
   set_tdesc_pseudo_register_name (gdbarch, amd64_pseudo_register_name);
 
+  set_gdbarch_register_name (gdbarch, amd64_register_name);
+
   /* AMD64 has an FPU and 16 SSE registers.  */
   tdep->st0_regnum = AMD64_ST0_REGNUM;
   tdep->num_xmm_regs = 16;
@@ -2321,6 +2396,7 @@ void
 _initialize_amd64_tdep (void)
 {
   initialize_tdesc_amd64 ();
+  initialize_tdesc_amd64_avx ();
 }
 \f
 
@@ -2356,6 +2432,30 @@ amd64_supply_fxsave (struct regcache *regcache, int regnum,
     }
 }
 
+/* Similar to amd64_supply_fxsave, but use XSAVE extended state.  */
+
+void
+amd64_supply_xsave (struct regcache *regcache, int regnum,
+		    const void *xsave)
+{
+  struct gdbarch *gdbarch = get_regcache_arch (regcache);
+  struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+
+  i387_supply_xsave (regcache, regnum, xsave);
+
+  if (xsave && gdbarch_ptr_bit (gdbarch) == 64)
+    {
+      const gdb_byte *regs = xsave;
+
+      if (regnum == -1 || regnum == I387_FISEG_REGNUM (tdep))
+	regcache_raw_supply (regcache, I387_FISEG_REGNUM (tdep),
+			     regs + 12);
+      if (regnum == -1 || regnum == I387_FOSEG_REGNUM (tdep))
+	regcache_raw_supply (regcache, I387_FOSEG_REGNUM (tdep),
+			     regs + 20);
+    }
+}
+
 /* Fill register REGNUM (if it is a floating-point or SSE register) in
    *FXSAVE with the value from REGCACHE.  If REGNUM is -1, do this for
    all registers.  This function doesn't touch any of the reserved
@@ -2379,3 +2479,26 @@ amd64_collect_fxsave (const struct regcache *regcache, int regnum,
 	regcache_raw_collect (regcache, I387_FOSEG_REGNUM (tdep), regs + 20);
     }
 }
+
+/* Similar to amd64_collect_fxsave, but use XSAVE extended state.  */
+
+void
+amd64_collect_xsave (const struct regcache *regcache, int regnum,
+		     void *xsave, int gcore)
+{
+  struct gdbarch *gdbarch = get_regcache_arch (regcache);
+  struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+  gdb_byte *regs = xsave;
+
+  i387_collect_xsave (regcache, regnum, xsave, gcore);
+
+  if (gdbarch_ptr_bit (gdbarch) == 64)
+    {
+      if (regnum == -1 || regnum == I387_FISEG_REGNUM (tdep))
+	regcache_raw_collect (regcache, I387_FISEG_REGNUM (tdep),
+			      regs + 12);
+      if (regnum == -1 || regnum == I387_FOSEG_REGNUM (tdep))
+	regcache_raw_collect (regcache, I387_FOSEG_REGNUM (tdep),
+			      regs + 20);
+    }
+}
diff --git a/gdb/amd64-tdep.h b/gdb/amd64-tdep.h
index 363479c..9f07dda 100644
--- a/gdb/amd64-tdep.h
+++ b/gdb/amd64-tdep.h
@@ -61,12 +61,16 @@ enum amd64_regnum
   AMD64_FSTAT_REGNUM = AMD64_ST0_REGNUM + 9,
   AMD64_XMM0_REGNUM = 40,	/* %xmm0 */
   AMD64_XMM1_REGNUM,		/* %xmm1 */
-  AMD64_MXCSR_REGNUM = AMD64_XMM0_REGNUM + 16
+  AMD64_MXCSR_REGNUM = AMD64_XMM0_REGNUM + 16,
+  AMD64_YMM0H_REGNUM,		/* %ymm0h */
+  AMD64_YMM15H_REGNUM = AMD64_YMM0H_REGNUM + 15
 };
 
 /* Number of general purpose registers.  */
 #define AMD64_NUM_GREGS		24
 
+#define AMD64_NUM_REGS		(AMD64_YMM15H_REGNUM + 1)
+
 extern struct displaced_step_closure *amd64_displaced_step_copy_insn
   (struct gdbarch *gdbarch, CORE_ADDR from, CORE_ADDR to,
    struct regcache *regs);
@@ -77,12 +81,6 @@ extern void amd64_displaced_step_fixup (struct gdbarch *gdbarch,
 
 extern void amd64_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch);
 
-/* Functions from amd64-tdep.c which may be needed on architectures
-   with extra registers.  */
-
-extern const char *amd64_register_name (struct gdbarch *gdbarch, int regnum);
-extern struct type *amd64_register_type (struct gdbarch *gdbarch, int regnum);
-
 /* Fill register REGNUM in REGCACHE with the appropriate
    floating-point or SSE register value from *FXSAVE.  If REGNUM is
    -1, do this for all registers.  This function masks off any of the
@@ -91,6 +89,10 @@ extern struct type *amd64_register_type (struct gdbarch *gdbarch, int regnum);
 extern void amd64_supply_fxsave (struct regcache *regcache, int regnum,
 				 const void *fxsave);
 
+/* Similar to amd64_supply_fxsave, but use XSAVE extended state.  */
+extern void amd64_supply_xsave (struct regcache *regcache, int regnum,
+				const void *xsave);
+
 /* Fill register REGNUM (if it is a floating-point or SSE register) in
    *FXSAVE with the value from REGCACHE.  If REGNUM is -1, do this for
    all registers.  This function doesn't touch any of the reserved
@@ -99,6 +101,10 @@ extern void amd64_supply_fxsave (struct regcache *regcache, int regnum,
 extern void amd64_collect_fxsave (const struct regcache *regcache, int regnum,
 				  void *fxsave);
 
+/* Similar to amd64_collect_fxsave, but use XSAVE extended state.  */
+extern void amd64_collect_xsave (const struct regcache *regcache,
+				 int regnum, void *xsave, int gcore);
+
 void amd64_classify (struct type *type, enum amd64_reg_class class[2]);
 
 \f

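For reference, the description selection done by
amd64_linux_read_description above can be sketched standalone.  This is
only an illustration, not code from the patch: the SKETCH_* names and
both helper functions are hypothetical, the mask is assumed to be
x87 | SSE | AVX (XCR0 bits 0, 1 and 2), and offset 464 is the location
where the Linux kernel stores XCR0 in the software usable bytes.

```c
#include <stdint.h>
#include <string.h>

/* XCR0 state-component bits: bit 0 = x87, bit 1 = SSE, bit 2 = AVX.  */
#define SKETCH_XSTATE_X87 (1ULL << 0)
#define SKETCH_XSTATE_SSE (1ULL << 1)
#define SKETCH_XSTATE_AVX (1ULL << 2)

/* Assumption: the AVX mask requires x87, SSE and AVX together.  */
#define SKETCH_XSTATE_AVX_MASK \
  (SKETCH_XSTATE_X87 | SKETCH_XSTATE_SSE | SKETCH_XSTATE_AVX)

/* Linux stores XCR0 in the first 8 software usable bytes.  */
#define SKETCH_XCR0_OFFSET 464

/* Hypothetical stand-ins for the four target descriptions.  */
enum sketch_tdesc
{
  SKETCH_I386, SKETCH_AMD64, SKETCH_I386_AVX, SKETCH_AMD64_AVX
};

/* Read XCR0 out of an XSAVE buffer fetched with PTRACE_GETREGSET.  */
static uint64_t
sketch_read_xcr0 (const unsigned char *xsave)
{
  uint64_t xcr0;

  memcpy (&xcr0, xsave + SKETCH_XCR0_OFFSET, sizeof xcr0);
  return xcr0;
}

/* Pick a description: AVX only if PTRACE_GETREGSET works and every
   bit of the AVX mask is set in XCR0; otherwise fall back to SSE.  */
static enum sketch_tdesc
sketch_select_tdesc (int have_getregset, uint64_t xcr0, int is_64bit)
{
  if (have_getregset
      && (xcr0 & SKETCH_XSTATE_AVX_MASK) == SKETCH_XSTATE_AVX_MASK)
    return is_64bit ? SKETCH_AMD64_AVX : SKETCH_I386_AVX;

  return is_64bit ? SKETCH_AMD64 : SKETCH_I386;
}
```

Comparing against the whole mask rather than testing the AVX bit alone
gives the backward-compatible fallback: a kernel without
PTRACE_GETREGSET, or an XCR0 without AVX, selects the SSE description.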
^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PATCH: 5/6 [2nd try]: Add AVX support (i387 changes)
  2010-03-06 22:22       ` PATCH: 5/6 [2nd try]: " H.J. Lu
@ 2010-03-12 17:24         ` H.J. Lu
  2010-04-07 16:57           ` PATCH: 5/6 [3rd " H.J. Lu
  2010-03-27 15:08         ` PATCH: 5/6 [2nd " Mark Kettenis
  1 sibling, 1 reply; 115+ messages in thread
From: H.J. Lu @ 2010-03-12 17:24 UTC (permalink / raw)
  To: GDB

On Sat, Mar 06, 2010 at 02:22:12PM -0800, H.J. Lu wrote:
> Hi,
> 
> Here are i387 changes to support AVX.  OK to install?
>  
> Thanks.
> 

Here is the updated patch.  Any comments/suggestions?
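
To illustrate how i387_supply_xsave treats xstate_bv, here is a minimal
standalone sketch.  It is not code from the patch: the sketch_* names
are hypothetical, and only the offsets (xstate_bv at byte 512, upper
YMM halves starting at byte 576, 16 bytes each) are taken from it.  A
component that is enabled in XCR0 but whose xstate_bv bit is clear is
read as zero.

```c
#include <stdint.h>
#include <string.h>

#define SKETCH_XSTATE_AVX 0x4		/* XCR0 / xstate_bv bit 2.  */
#define SKETCH_XSTATE_BV_OFFSET 512	/* `xstate_bv' byte offset.  */
#define SKETCH_AVXH_OFFSET 576		/* Upper 128 bits of %ymm0.  */

/* Return the components enabled in XCR0 whose xstate_bv bit is
   clear.  Only the low byte of xstate_bv is consulted, as in the
   patch.  */
static unsigned int
sketch_clear_bv (const unsigned char *xsave, uint64_t xcr0)
{
  unsigned int bv = xsave[SKETCH_XSTATE_BV_OFFSET];

  return ~bv & (unsigned int) xcr0;
}

/* Copy the upper 128 bits of %ymmN into OUT, or zeros if the AVX
   state is not present in the XSAVE buffer.  */
static void
sketch_supply_ymmh (const unsigned char *xsave, uint64_t xcr0,
		    int n, unsigned char out[16])
{
  if (sketch_clear_bv (xsave, xcr0) & SKETCH_XSTATE_AVX)
    memset (out, 0, 16);
  else
    memcpy (out, xsave + SKETCH_AVXH_OFFSET + n * 16, 16);
}
```

With xcr0 = 7 and xstate_bv = 3, sketch_clear_bv returns 4, so the
upper halves are supplied as zero even if the buffer holds stale bytes.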

Thanks.


H.J.
---
2010-03-12  H.J. Lu  <hongjiu.lu@intel.com>

	* i387-tdep.c: Include "i386-xstate.h".
	(XSAVE_XSTATE_BV_ADDR): New.
	(xsave_avxh_offset): Likewise.
	(XSAVE_AVXH_ADDR): Likewise.
	(i387_supply_xsave): Likewise.
	(i387_collect_xsave): Likewise.

	* i387-tdep.h (I387_NUM_YMM_REGS): New.
	(I387_YMM0H_REGNUM): Likewise.
	(I387_YMMENDH_REGNUM): Likewise.
	(i387_supply_xsave): Likewise.
	(i387_collect_xsave): Likewise.

diff --git a/gdb/i387-tdep.c b/gdb/i387-tdep.c
index 3fb5b56..66e2167 100644
--- a/gdb/i387-tdep.c
+++ b/gdb/i387-tdep.c
@@ -34,6 +34,7 @@
 
 #include "i386-tdep.h"
 #include "i387-tdep.h"
+#include "i386-xstate.h"
 
 /* Print the floating point number specified by RAW.  */
 
@@ -677,6 +678,518 @@ i387_collect_fxsave (const struct regcache *regcache, int regnum, void *fxsave)
 			  FXSAVE_MXCSR_ADDR (regs));
 }
 
+/* `xstate_bv' is at byte offset 512.  */
+#define XSAVE_XSTATE_BV_ADDR(xsave) (xsave + 512)
+
+/* xsave_avxh_offset[N] is the byte offset, within the data structure
+   used by the "xsave" instruction, of the upper 128 bits of %ymmN
+   (GDB register I387_YMM0H_REGNUM + N).  */
+
+static int xsave_avxh_offset[] =
+{
+  576 + 0 * 16,		/* Upper 128bit of %ymm0 through ...  */
+  576 + 1 * 16,
+  576 + 2 * 16,
+  576 + 3 * 16,
+  576 + 4 * 16,
+  576 + 5 * 16,
+  576 + 6 * 16,
+  576 + 7 * 16,
+  576 + 8 * 16,
+  576 + 9 * 16,
+  576 + 10 * 16,
+  576 + 11 * 16,
+  576 + 12 * 16,
+  576 + 13 * 16,
+  576 + 14 * 16,
+  576 + 15 * 16		/* Upper 128bit of ... %ymm15 (128 bits each).  */
+};
+
+#define XSAVE_AVXH_ADDR(tdep, xsave, regnum) \
+  (xsave + xsave_avxh_offset[regnum - I387_YMM0H_REGNUM (tdep)])
+
+/* Similar to i387_supply_fxsave, but use XSAVE extended state.  */
+
+void
+i387_supply_xsave (struct regcache *regcache, int regnum,
+		   const void *xsave)
+{
+  struct gdbarch_tdep *tdep = gdbarch_tdep (get_regcache_arch (regcache));
+  const gdb_byte *regs = xsave;
+  int i;
+  unsigned int clear_bv;
+  const gdb_byte *p;
+  enum
+    {
+      none = 0x0,
+      x87 = 0x1,
+      sse = 0x2,
+      avxh = 0x4,
+      all = x87 | sse | avxh
+    } regclass;
+
+  gdb_assert (tdep->st0_regnum >= I386_ST0_REGNUM);
+  gdb_assert (tdep->num_xmm_regs > 0);
+
+  if (regnum == -1)
+    regclass = all;
+  else if (regnum >= I387_YMM0H_REGNUM (tdep)
+	   && regnum < I387_YMMENDH_REGNUM (tdep))
+    regclass = avxh;
+  else if (regnum >= I387_XMM0_REGNUM (tdep)
+	   && regnum < I387_MXCSR_REGNUM (tdep))
+    regclass = sse;
+  else if (regnum >= I387_ST0_REGNUM (tdep)
+	   && regnum < I387_FCTRL_REGNUM (tdep))
+    regclass = x87;
+  else
+    regclass = none;
+
+  if (regs != NULL && regclass != none)
+    {
+      /* Get `xstate_bv'.  */
+      const gdb_byte *xstate_bv_p = XSAVE_XSTATE_BV_ADDR (regs);
+
+      /* The supported bits in `xstate_bv' are 1 byte.  Clear part in
+	 vector registers if its bit in `xstate_bv' is zero.  */
+      clear_bv = (~(*xstate_bv_p)) & tdep->xcr0;
+    }
+  else
+    clear_bv = I386_XSTATE_AVX_MASK;
+
+  switch (regclass)
+    {
+    case none:
+      break;
+
+    case avxh:
+      if ((clear_bv & I386_XSTATE_AVX))
+	p = NULL;
+      else
+	p = XSAVE_AVXH_ADDR (tdep, regs, regnum);
+      regcache_raw_supply (regcache, regnum, p);
+      return;
+
+    case sse:
+      if ((clear_bv & I386_XSTATE_SSE))
+	p = NULL;
+      else
+	p = FXSAVE_ADDR (tdep, regs, regnum);
+      regcache_raw_supply (regcache, regnum, p);
+      return;
+
+    case x87:
+      if ((clear_bv & I386_XSTATE_X87))
+	p = NULL;
+      else
+	p = FXSAVE_ADDR (tdep, regs, regnum);
+      regcache_raw_supply (regcache, regnum, p);
+      return;
+
+    case all:
+      /* Handle the upper YMM registers.  */
+      if ((tdep->xcr0 & I386_XSTATE_AVX))
+	{
+	  if ((clear_bv & I386_XSTATE_AVX))
+	    p = NULL;
+	  else
+	    p = regs;
+
+	  for (i = I387_YMM0H_REGNUM (tdep);
+	       i < I387_YMMENDH_REGNUM (tdep); i++)
+	    {
+	      if (p != NULL)
+		p = XSAVE_AVXH_ADDR (tdep, regs, i);
+	      regcache_raw_supply (regcache, i, p);
+	    }
+	}
+
+      /* Handle the XMM registers.  */
+      if ((tdep->xcr0 & I386_XSTATE_SSE))
+	{
+	  if ((clear_bv & I386_XSTATE_SSE))
+	    p = NULL;
+	  else
+	    p = regs;
+
+	  for (i = I387_XMM0_REGNUM (tdep);
+	       i < I387_MXCSR_REGNUM (tdep); i++)
+	    {
+	      if (p != NULL)
+		p = FXSAVE_ADDR (tdep, regs, i);
+	      regcache_raw_supply (regcache, i, p);
+	    }
+	}
+
+      /* Handle the x87 registers.  */
+      if ((tdep->xcr0 & I386_XSTATE_X87))
+	{
+	  if ((clear_bv & I386_XSTATE_X87))
+	    p = NULL;
+	  else
+	    p = regs;
+
+	  for (i = I387_ST0_REGNUM (tdep);
+	       i < I387_FCTRL_REGNUM (tdep); i++)
+	    {
+	      if (p != NULL)
+		p = FXSAVE_ADDR (tdep, regs, i);
+	      regcache_raw_supply (regcache, i, p);
+	    }
+	}
+      break;
+    }
+
+  /* Only handle x87 control registers.  */
+  for (i = I387_FCTRL_REGNUM (tdep); i < I387_XMM0_REGNUM (tdep); i++)
+    if (regnum == -1 || regnum == i)
+      {
+	if (regs == NULL)
+	  {
+	    regcache_raw_supply (regcache, i, NULL);
+	    continue;
+	  }
+
+	/* Most of the FPU control registers occupy only 16 bits in
+	   the xsave extended state.  Give those a special treatment.  */
+	if (i != I387_FIOFF_REGNUM (tdep)
+	    && i != I387_FOOFF_REGNUM (tdep))
+	  {
+	    gdb_byte val[4];
+
+	    memcpy (val, FXSAVE_ADDR (tdep, regs, i), 2);
+	    val[2] = val[3] = 0;
+	    if (i == I387_FOP_REGNUM (tdep))
+	      val[1] &= ((1 << 3) - 1);
+	    else if (i == I387_FTAG_REGNUM (tdep))
+	      {
+		/* The fxsave area contains a simplified version of
+		   the tag word.  We have to look at the actual 80-bit
+		   FP data to recreate the traditional i387 tag word.  */
+
+		unsigned long ftag = 0;
+		int fpreg;
+		int top;
+
+		top = ((FXSAVE_ADDR (tdep, regs,
+				     I387_FSTAT_REGNUM (tdep)))[1] >> 3);
+		top &= 0x7;
+
+		for (fpreg = 7; fpreg >= 0; fpreg--)
+		  {
+		    int tag;
+
+		    if (val[0] & (1 << fpreg))
+		      {
+			int regnum = (fpreg + 8 - top) % 8 
+				       + I387_ST0_REGNUM (tdep);
+			tag = i387_tag (FXSAVE_ADDR (tdep, regs, regnum));
+		      }
+		    else
+		      tag = 3;		/* Empty */
+
+		    ftag |= tag << (2 * fpreg);
+		  }
+		val[0] = ftag & 0xff;
+		val[1] = (ftag >> 8) & 0xff;
+	      }
+	    regcache_raw_supply (regcache, i, val);
+	  }
+	else 
+	  regcache_raw_supply (regcache, i, FXSAVE_ADDR (tdep, regs, i));
+      }
+
+  if (regnum == I387_MXCSR_REGNUM (tdep) || regnum == -1)
+    {
+      p = regs == NULL ? NULL : FXSAVE_MXCSR_ADDR (regs);
+      regcache_raw_supply (regcache, I387_MXCSR_REGNUM (tdep), p);
+    }
+}
+
+/* Similar to i387_collect_fxsave, but use XSAVE extended state.  */
+
+void
+i387_collect_xsave (const struct regcache *regcache, int regnum,
+		    void *xsave, int gcore)
+{
+  struct gdbarch_tdep *tdep = gdbarch_tdep (get_regcache_arch (regcache));
+  gdb_byte *regs = xsave;
+  int i;
+  enum
+    {
+      none = 0x0,
+      check = 0x1,
+      x87 = 0x2 | check,
+      sse = 0x4 | check,
+      avxh = 0x8 | check,
+      all = x87 | sse | avxh
+    } regclass;
+
+  gdb_assert (tdep->st0_regnum >= I386_ST0_REGNUM);
+  gdb_assert (tdep->num_xmm_regs > 0);
+
+  if (regnum == -1)
+    regclass = all;
+  else if (regnum >= I387_YMM0H_REGNUM (tdep)
+	   && regnum < I387_YMMENDH_REGNUM (tdep))
+    regclass = avxh;
+  else if (regnum >= I387_XMM0_REGNUM (tdep)
+	   && regnum < I387_MXCSR_REGNUM (tdep))
+    regclass = sse;
+  else if (regnum >= I387_ST0_REGNUM (tdep)
+	   && regnum < I387_FCTRL_REGNUM (tdep))
+    regclass = x87;
+  else
+    regclass = none;
+
+  if (gcore)
+    {
+      /* Update XCR0 and `xstate_bv' with XCR0 for gcore.  */
+      if (tdep->xsave_xcr0_offset != -1)
+	memcpy (regs + tdep->xsave_xcr0_offset, &tdep->xcr0, 8);
+      memcpy (XSAVE_XSTATE_BV_ADDR (regs), &tdep->xcr0, 8);
+
+      switch (regclass)
+	{
+	default:
+	  abort ();
+
+	case all:
+	  /* Handle the upper YMM registers.  */
+	  if ((tdep->xcr0 & I386_XSTATE_AVX))
+	    for (i = I387_YMM0H_REGNUM (tdep);
+		 i < I387_YMMENDH_REGNUM (tdep); i++)
+	      regcache_raw_collect (regcache, i,
+				    XSAVE_AVXH_ADDR (tdep, regs, i));
+
+	  /* Handle the XMM registers.  */
+	  if ((tdep->xcr0 & I386_XSTATE_SSE))
+	    for (i = I387_XMM0_REGNUM (tdep);
+		 i < I387_MXCSR_REGNUM (tdep); i++)
+	      regcache_raw_collect (regcache, i,
+				    FXSAVE_ADDR (tdep, regs, i));
+
+	  /* Handle the x87 registers.  */
+	  if ((tdep->xcr0 & I386_XSTATE_X87))
+	    for (i = I387_ST0_REGNUM (tdep);
+		 i < I387_FCTRL_REGNUM (tdep); i++)
+	      regcache_raw_collect (regcache, i,
+				    FXSAVE_ADDR (tdep, regs, i));
+	  break;
+
+	case x87:
+	  regcache_raw_collect (regcache, regnum,
+				FXSAVE_ADDR (tdep, regs, regnum));
+	  return;
+
+	case sse:
+	  regcache_raw_collect (regcache, regnum,
+				FXSAVE_ADDR (tdep, regs, regnum));
+	  return;
+
+	case avxh:
+	  regcache_raw_collect (regcache, regnum,
+				XSAVE_AVXH_ADDR (tdep, regs, regnum));
+	  return;
+	}
+    }
+  else
+    {
+      if ((regclass & check))
+	{
+	  gdb_byte raw[I386_MAX_REGISTER_SIZE];
+	  gdb_byte *xstate_bv_p = XSAVE_XSTATE_BV_ADDR (regs);
+	  unsigned int xstate_bv = 0;
+	  /* The supported bits in `xstate_bv' are 1 byte.  */
+	  unsigned int clear_bv = (~(*xstate_bv_p)) & tdep->xcr0;
+	  gdb_byte *p;
+
+	  /* Clear register set if its bit in `xstate_bv' is zero.  */
+	  if (clear_bv)
+	    {
+	      if ((clear_bv & I386_XSTATE_AVX))
+		for (i = I387_YMM0H_REGNUM (tdep);
+		     i < I387_YMMENDH_REGNUM (tdep); i++)
+		  memset (XSAVE_AVXH_ADDR (tdep, regs, i), 0, 16);
+
+	      if ((clear_bv & I386_XSTATE_SSE))
+		for (i = I387_XMM0_REGNUM (tdep);
+		     i < I387_MXCSR_REGNUM (tdep); i++)
+		  memset (FXSAVE_ADDR (tdep, regs, i), 0, 16);
+
+	      if ((clear_bv & I386_XSTATE_X87))
+		for (i = I387_ST0_REGNUM (tdep);
+		     i < I387_FCTRL_REGNUM (tdep); i++)
+		  memset (FXSAVE_ADDR (tdep, regs, i), 0, 10);
+	    }
+
+	  if (regclass == all)
+	    {
+	      /* Check if any upper YMM registers are changed.  */
+	      if ((tdep->xcr0 & I386_XSTATE_AVX))
+		for (i = I387_YMM0H_REGNUM (tdep);
+		     i < I387_YMMENDH_REGNUM (tdep); i++)
+		  {
+		    regcache_raw_collect (regcache, i, raw);
+		    p = XSAVE_AVXH_ADDR (tdep, regs, i);
+		    if (memcmp (raw, p, 16))
+		      {
+			xstate_bv |= I386_XSTATE_AVX;
+			memcpy (p, raw, 16);
+		      }
+		  }
+
+	      /* Check if any SSE registers are changed.  */
+	      if ((tdep->xcr0 & I386_XSTATE_SSE))
+		for (i = I387_XMM0_REGNUM (tdep);
+		     i < I387_MXCSR_REGNUM (tdep); i++)
+		  {
+		    regcache_raw_collect (regcache, i, raw);
+		    p = FXSAVE_ADDR (tdep, regs, i);
+		    if (memcmp (raw, p, 16))
+		      {
+			xstate_bv |= I386_XSTATE_SSE;
+			memcpy (p, raw, 16);
+		      }
+		  }
+
+	      /* Check if any X87 registers are changed.  */
+	      if ((tdep->xcr0 & I386_XSTATE_X87))
+		for (i = I387_ST0_REGNUM (tdep);
+		     i < I387_FCTRL_REGNUM (tdep); i++)
+		  {
+		    regcache_raw_collect (regcache, i, raw);
+		    p = FXSAVE_ADDR (tdep, regs, i);
+		    if (memcmp (raw, p, 10))
+		      {
+			xstate_bv |= I386_XSTATE_X87;
+			memcpy (p, raw, 10);
+		      }
+		  }
+	    }
+	  else
+	    {
+	      /* Check if REGNUM is changed.  */
+	      regcache_raw_collect (regcache, regnum, raw);
+
+	      switch (regclass)
+		{
+		default:
+		  abort ();
+
+		case avxh:
+		  /* This is an upper YMM register.  */
+		  p = XSAVE_AVXH_ADDR (tdep, regs, regnum);
+		  if (memcmp (raw, p, 16))
+		    {
+		      xstate_bv |= I386_XSTATE_AVX;
+		      memcpy (p, raw, 16);
+		    }
+		  break;
+
+		case sse:
+		  /* This is an SSE register.  */
+		  p = FXSAVE_ADDR (tdep, regs, regnum);
+		  if (memcmp (raw, p, 16))
+		    {
+		      xstate_bv |= I386_XSTATE_SSE;
+		      memcpy (p, raw, 16);
+		    }
+		  break;
+
+		case x87:
+		  /* This is an x87 register.  */
+		  p = FXSAVE_ADDR (tdep, regs, regnum);
+		  if (memcmp (raw, p, 10))
+		    {
+		      xstate_bv |= I386_XSTATE_X87;
+		      memcpy (p, raw, 10);
+		    }
+		  break;
+		}
+	    }
+
+	  /* Update the corresponding bits in `xstate_bv' if any x87/SSE/AVX
+	     registers are changed.  */
+	  if (xstate_bv)
+	    {
+	      /* The supported bits in `xstate_bv' are 1 byte.  */
+	      *xstate_bv_p |= (gdb_byte) xstate_bv;
+
+	      switch (regclass)
+		{
+		default:
+		  abort ();
+
+		case all:
+		  break;
+
+		case x87:
+		case sse:
+		case avxh:
+		  /* Register REGNUM has been updated.  Return.  */
+		  return;
+		}
+	    }
+	  else
+	    {
+	      /* Return if REGNUM isn't changed.  */
+	      if (regclass != all)
+		return;
+	    }
+	}
+    }
+
+  /* Only handle x87 control registers.  */
+  for (i = I387_FCTRL_REGNUM (tdep); i < I387_XMM0_REGNUM (tdep); i++)
+    if (regnum == -1 || regnum == i)
+      {
+	/* Most of the FPU control registers occupy only 16 bits in
+	   the xsave extended state.  Give those a special treatment.  */
+	if (i != I387_FIOFF_REGNUM (tdep)
+	    && i != I387_FOOFF_REGNUM (tdep))
+	  {
+	    gdb_byte buf[4];
+
+	    regcache_raw_collect (regcache, i, buf);
+
+	    if (i == I387_FOP_REGNUM (tdep))
+	      {
+		/* The opcode occupies only 11 bits.  Make sure we
+                   don't touch the other bits.  */
+		buf[1] &= ((1 << 3) - 1);
+		buf[1] |= ((FXSAVE_ADDR (tdep, regs, i))[1] & ~((1 << 3) - 1));
+	      }
+	    else if (i == I387_FTAG_REGNUM (tdep))
+	      {
+		/* Converting back is much easier.  */
+
+		unsigned short ftag;
+		int fpreg;
+
+		ftag = (buf[1] << 8) | buf[0];
+		buf[0] = 0;
+		buf[1] = 0;
+
+		for (fpreg = 7; fpreg >= 0; fpreg--)
+		  {
+		    int tag = (ftag >> (fpreg * 2)) & 3;
+
+		    if (tag != 3)
+		      buf[0] |= (1 << fpreg);
+		  }
+	      }
+	    memcpy (FXSAVE_ADDR (tdep, regs, i), buf, 2);
+	  }
+	else
+	  regcache_raw_collect (regcache, i, FXSAVE_ADDR (tdep, regs, i));
+      }
+
+  if (regnum == I387_MXCSR_REGNUM (tdep) || regnum == -1)
+    regcache_raw_collect (regcache, I387_MXCSR_REGNUM (tdep),
+			  FXSAVE_MXCSR_ADDR (regs));
+}
+
 /* Recreate the FTW (tag word) valid bits from the 80-bit FP data in
    *RAW.  */
 
diff --git a/gdb/i387-tdep.h b/gdb/i387-tdep.h
index 645eb91..976fa11 100644
--- a/gdb/i387-tdep.h
+++ b/gdb/i387-tdep.h
@@ -33,6 +33,8 @@ struct ui_file;
 #define I387_ST0_REGNUM(tdep) ((tdep)->st0_regnum)
 #define I387_NUM_XMM_REGS(tdep) ((tdep)->num_xmm_regs)
 #define I387_MM0_REGNUM(tdep) ((tdep)->mm0_regnum)
+#define I387_NUM_YMM_REGS(tdep) ((tdep)->num_ymm_regs)
+#define I387_YMM0H_REGNUM(tdep) ((tdep)->ymm0h_regnum)
 
 #define I387_FCTRL_REGNUM(tdep) (I387_ST0_REGNUM (tdep) + 8)
 #define I387_FSTAT_REGNUM(tdep) (I387_FCTRL_REGNUM (tdep) + 1)
@@ -45,6 +47,8 @@ struct ui_file;
 #define I387_XMM0_REGNUM(tdep) (I387_ST0_REGNUM (tdep) + 16)
 #define I387_MXCSR_REGNUM(tdep) \
   (I387_XMM0_REGNUM (tdep) + I387_NUM_XMM_REGS (tdep))
+#define I387_YMMENDH_REGNUM(tdep) \
+  (I387_YMM0H_REGNUM (tdep) + I387_NUM_YMM_REGS (tdep))
 
 /* Print out the i387 floating point state.  */
 
@@ -99,6 +103,11 @@ extern void i387_collect_fsave (const struct regcache *regcache, int regnum,
 extern void i387_supply_fxsave (struct regcache *regcache, int regnum,
 				const void *fxsave);
 
+/* Similar to i387_supply_fxsave, but use XSAVE extended state.  */
+
+extern void i387_supply_xsave (struct regcache *regcache, int regnum,
+			       const void *xsave);
+
 /* Fill register REGNUM (if it is a floating-point or SSE register) in
    *FXSAVE with the value from REGCACHE.  If REGNUM is -1, do this for
    all registers.  This function doesn't touch any of the reserved
@@ -107,6 +116,11 @@ extern void i387_supply_fxsave (struct regcache *regcache, int regnum,
 extern void i387_collect_fxsave (const struct regcache *regcache, int regnum,
 				 void *fxsave);
 
+/* Similar to i387_collect_fxsave, but use XSAVE extended state.  */
+
+extern void i387_collect_xsave (const struct regcache *regcache,
+				int regnum, void *xsave, int gcore);
+
 /* Prepare the FPU stack in REGCACHE for a function return.  */
 
 extern void i387_return_value (struct gdbarch *gdbarch,

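The non-gcore path of i387_collect_xsave above boils down to "clear
what the buffer says is absent, compare, then copy and mark".  A
minimal standalone sketch for one upper YMM half follows.  It is not
code from the patch: the sketch_* names are hypothetical, and AVX is
assumed to be enabled in XCR0.

```c
#include <string.h>

#define SKETCH_XSTATE_AVX 0x4		/* xstate_bv bit 2.  */
#define SKETCH_XSTATE_BV_OFFSET 512	/* `xstate_bv' byte offset.  */
#define SKETCH_AVXH_OFFSET 576		/* Upper 128 bits of %ymm0.  */
#define SKETCH_NUM_YMM 16

/* Store RAW as the upper 128 bits of %ymmN in XSAVE.  Return 1 if
   the buffer was changed, 0 if the value was already there.  */
static int
sketch_collect_ymmh (unsigned char *xsave, int n,
		     const unsigned char raw[16])
{
  unsigned char *bv = xsave + SKETCH_XSTATE_BV_OFFSET;
  unsigned char *p = xsave + SKETCH_AVXH_OFFSET + n * 16;

  /* If the AVX bit is clear the state reads as zero, so zero the
     whole area first; stale bytes must not take part in the
     comparison.  */
  if (!(*bv & SKETCH_XSTATE_AVX))
    memset (xsave + SKETCH_AVXH_OFFSET, 0, SKETCH_NUM_YMM * 16);

  if (memcmp (raw, p, 16) == 0)
    return 0;

  /* The register changed: copy it and set the feature bit so the
     new value is not ignored.  */
  memcpy (p, raw, 16);
  *bv |= SKETCH_XSTATE_AVX;
  return 1;
}
```

Writing zero to a register whose component bit is clear therefore
leaves xstate_bv untouched, since the register already reads as zero.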
^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PATCH: 6/6 [2nd try]: Add AVX support (gdbserver changes)
  2010-03-06 22:23         ` PATCH: 6/6 [2nd try]: " H.J. Lu
@ 2010-03-12 17:25           ` H.J. Lu
  2010-03-27 16:07             ` Daniel Jacobowitz
  2010-03-29  1:09             ` PATCH: 6/6 [3rd " H.J. Lu
  0 siblings, 2 replies; 115+ messages in thread
From: H.J. Lu @ 2010-03-12 17:25 UTC (permalink / raw)
  To: GDB

On Sat, Mar 06, 2010 at 02:22:50PM -0800, H.J. Lu wrote:
> Hi,
> 
> Here are gdbserver changes to support AVX.  OK to install?
> 
> Thanks.
> 
> 

Here is the updated patch.  Any comments/suggestions?
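
To illustrate the qSupported handling on the gdbserver side, here is a
minimal standalone sketch of parsing the "x86:xstate=BYTES:xcr0=VALUE"
feature.  It is not code from the patch: both function names are
hypothetical, and encoding BYTES and VALUE as hexadecimal is an
assumption of this sketch.  The 832-byte threshold is 576 + 16 * 16,
the end of the upper YMM area in the XSAVE extended state.

```c
#include <stdio.h>

/* Parse one qSupported feature of the form
   "x86:xstate=BYTES:xcr0=VALUE".  Return 1 and fill in *BYTES and
   *XCR0 on success, 0 if FEATURE is something else.
   Assumption: both fields are hexadecimal.  */
static int
sketch_parse_x86_xstate (const char *feature, unsigned long *bytes,
			 unsigned long long *xcr0)
{
  int consumed = 0;

  if (sscanf (feature, "x86:xstate=%lx:xcr0=%llx%n",
	      bytes, xcr0, &consumed) != 2)
    return 0;

  /* Reject trailing garbage after VALUE.  */
  return feature[consumed] == '\0';
}

/* Return 1 if GDB advertised enough room and the x87, SSE and AVX
   bits, so that an AVX target description may be selected.  */
static int
sketch_gdb_supports_avx (unsigned long bytes, unsigned long long xcr0)
{
  return bytes >= 832 && (xcr0 & 0x7) == 0x7;
}
```

A gdbserver that never sees this feature in qSupported keeps using the
SSE description, which is what makes the extension safe with older GDB.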


H.J.
---
2010-03-12  H.J. Lu  <hongjiu.lu@intel.com>

	* Makefile.in (clean): Updated.
	(i386-avx.o): New.
	(i386-avx.c): Likewise.
	(i386-avx-linux.o): Likewise.
	(i386-avx-linux.c): Likewise.
	(amd64-avx.o): Likewise.
	(amd64-avx.c): Likewise.
	(amd64-avx-linux.o): Likewise.
	(amd64-avx-linux.c): Likewise.

	* configure.srv (srv_i386_regobj): Add i386-avx.o.
	(srv_i386_linux_regobj): Add i386-avx-linux.o.
	(srv_amd64_regobj): Add amd64-avx.o.
	(srv_amd64_linux_regobj): Add amd64-avx-linux.o.
	(srv_i386_32bit_xmlfiles): Add i386/32bit-avx.xml.
	(srv_i386_64bit_xmlfiles): Add i386/64bit-avx.xml.
	(srv_i386_xmlfiles): Add i386/i386-avx.xml.
	(srv_amd64_xmlfiles): Add i386/amd64-avx.xml.
	(srv_i386_linux_xmlfiles): Add i386/i386-avx-linux.xml.
	(srv_amd64_linux_xmlfiles): Add i386/amd64-avx-linux.xml.

	* i387-fp.c: Include "i386-xstate.h".
	(i387_xsave): New.
	(i387_cache_to_xsave): Likewise.
	(i387_xsave_to_cache): Likewise.
	(x86_xcr0): Likewise.

	* i387-fp.h (i387_cache_to_xsave): Likewise.
	(i387_xsave_to_cache): Likewise.
	(x86_xcr0): Likewise.

	* linux-arm-low.c (target_regsets): Initialize nt_type to 0.
	* linux-crisv32-low.c (target_regsets): Likewise.
	* linux-m68k-low.c (target_regsets): Likewise.
	* linux-mips-low.c (target_regsets): Likewise.
	* linux-ppc-low.c (target_regsets): Likewise.
	* linux-s390-low.c (target_regsets): Likewise.
	* linux-sh-low.c (target_regsets): Likewise.
	* linux-sparc-low.c (target_regsets): Likewise.
	* linux-xtensa-low.c (target_regsets): Likewise.

	* linux-low.c: Include <sys/uio.h>.
	(regsets_fetch_inferior_registers): Support nt_type.
	(regsets_store_inferior_registers): Likewise.
	(linux_process_qsupported): New.
	(linux_target_ops): Add linux_process_qsupported.

	* linux-low.h (regset_info): Add nt_type.
	(linux_target_ops): Add process_qsupported.

	* linux-x86-low.c: Include "i386-xstate.h", "elf/common.h" and
	<sys/uio.h>.
	(init_registers_i386_avx_linux): New.
	(init_registers_amd64_avx_linux): Likewise.
	(PTRACE_GETREGSET): Likewise.
	(PTRACE_SETREGSET): Likewise.
	(x86_fill_xstateregset): Likewise.
	(x86_store_xstateregset): Likewise.
	(x86_linux_process_qsupported): Likewise.
	(target_regsets): Add NT_X86_XSTATE entry and initialize nt_type.
	(the_low_target): Add x86_linux_process_qsupported.

	* server.c (use_xml): New.
	(get_features_xml): Don't use XML file if use_xml is 0.
	(handle_query): Call target_process_qsupported.

	* server.h (use_xml): New.

	* target.h (target_ops): Add process_qsupported.
	(target_process_qsupported): New.
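
The offsets this patch relies on (XCR0 at byte 464, xstate_bv at byte 512, the upper YMM halves from byte 576) can be cross-checked with a minimal sketch. The struct below is a hypothetical mirror of the i387_xsave struct added to i387-fp.c in the diff that follows; the name xsave_layout and the fixed-width types are assumptions made so the check does not depend on the host ABI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of struct i387_xsave, for offset checking only.  */
struct xsave_layout
{
  uint16_t fctrl, fstat, ftag, fop;
  uint32_t fioff;
  uint16_t fiseg, pad1;
  uint32_t fooff;
  uint16_t foseg, pad12;
  uint32_t mxcsr, mxcsr_mask;
  uint8_t st_space[128];    /* Eight x87 registers, 16 bytes each.  */
  uint8_t xmm_space[256];   /* Up to sixteen XMM registers.  */
  uint8_t reserved1[48];
  uint64_t xcr0;            /* Stored by Linux in the software-usable bytes.  */
  uint8_t reserved2[40];
  uint64_t xstate_bv;       /* First field of the XSAVE header.  */
  uint8_t reserved3[56];
  uint8_t ymmh_space[256];  /* Upper halves of up to sixteen YMM registers.  */
};

/* Compile-time checks against the offsets quoted in the patch.  */
_Static_assert (offsetof (struct xsave_layout, xcr0) == 464,
                "XCR0 copy is expected at byte 464");
_Static_assert (offsetof (struct xsave_layout, xstate_bv) == 512,
                "xstate_bv is expected at byte 512");
_Static_assert (offsetof (struct xsave_layout, ymmh_space) == 576,
                "upper YMM halves are expected at byte 576");
```

If any padding byte count in the struct were off, the file would fail to compile, which is cheaper than finding out through a corrupted register dump.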

diff --git a/gdb/gdbserver/Makefile.in b/gdb/gdbserver/Makefile.in
index 7fecced..2ec9784 100644
--- a/gdb/gdbserver/Makefile.in
+++ b/gdb/gdbserver/Makefile.in
@@ -217,6 +217,8 @@ clean:
 	rm -f powerpc-isa205-vsx64l.c
 	rm -f s390-linux32.c s390-linux64.c s390x-linux64.c
 	rm -f xml-builtin.c stamp-xml
+	rm -f i386-avx.c i386-avx-linux.c
+	rm -f amd64-avx.c amd64-avx-linux.c
 
 maintainer-clean realclean distclean: clean
 	rm -f nm.h tm.h xm.h config.status config.h stamp-h config.log
@@ -351,6 +353,12 @@ i386.c : $(srcdir)/../regformats/i386/i386.dat $(regdat_sh)
 i386-linux.o : i386-linux.c $(regdef_h)
 i386-linux.c : $(srcdir)/../regformats/i386/i386-linux.dat $(regdat_sh)
 	$(SHELL) $(regdat_sh) $(srcdir)/../regformats/i386/i386-linux.dat i386-linux.c
+i386-avx.o : i386-avx.c $(regdef_h)
+i386-avx.c : $(srcdir)/../regformats/i386/i386-avx.dat $(regdat_sh)
+	$(SHELL) $(regdat_sh) $(srcdir)/../regformats/i386/i386-avx.dat i386-avx.c
+i386-avx-linux.o : i386-avx-linux.c $(regdef_h)
+i386-avx-linux.c : $(srcdir)/../regformats/i386/i386-avx-linux.dat $(regdat_sh)
+	$(SHELL) $(regdat_sh) $(srcdir)/../regformats/i386/i386-avx-linux.dat i386-avx-linux.c
 reg-ia64.o : reg-ia64.c $(regdef_h)
 reg-ia64.c : $(srcdir)/../regformats/reg-ia64.dat $(regdat_sh)
 	$(SHELL) $(regdat_sh) $(srcdir)/../regformats/reg-ia64.dat reg-ia64.c
@@ -438,6 +446,12 @@ amd64.c : $(srcdir)/../regformats/i386/amd64.dat $(regdat_sh)
 amd64-linux.o : amd64-linux.c $(regdef_h)
 amd64-linux.c : $(srcdir)/../regformats/i386/amd64-linux.dat $(regdat_sh)
 	$(SHELL) $(regdat_sh) $(srcdir)/../regformats/i386/amd64-linux.dat amd64-linux.c
+amd64-avx.o : amd64-avx.c $(regdef_h)
+amd64-avx.c : $(srcdir)/../regformats/i386/amd64-avx.dat $(regdat_sh)
+	$(SHELL) $(regdat_sh) $(srcdir)/../regformats/i386/amd64-avx.dat amd64-avx.c
+amd64-avx-linux.o : amd64-avx-linux.c $(regdef_h)
+amd64-avx-linux.c : $(srcdir)/../regformats/i386/amd64-avx-linux.dat $(regdat_sh)
+	$(SHELL) $(regdat_sh) $(srcdir)/../regformats/i386/amd64-avx-linux.dat amd64-avx-linux.c
 reg-xtensa.o : reg-xtensa.c $(regdef_h)
 reg-xtensa.c : $(srcdir)/../regformats/reg-xtensa.dat $(regdat_sh)
 	$(SHELL) $(regdat_sh) $(srcdir)/../regformats/reg-xtensa.dat reg-xtensa.c
diff --git a/gdb/gdbserver/configure.srv b/gdb/gdbserver/configure.srv
index e5818cd..a2f4323 100644
--- a/gdb/gdbserver/configure.srv
+++ b/gdb/gdbserver/configure.srv
@@ -22,17 +22,17 @@
 # Default hostio_last_error implementation
 srv_hostio_err_objs="hostio-errno.o"
 
-srv_i386_regobj=i386.o
-srv_i386_linux_regobj=i386-linux.o
-srv_amd64_regobj=amd64.o
-srv_amd64_linux_regobj=amd64-linux.o
+srv_i386_regobj="i386.o i386-avx.o"
+srv_i386_linux_regobj="i386-linux.o i386-avx-linux.o"
+srv_amd64_regobj="amd64.o amd64-avx.o"
+srv_amd64_linux_regobj="amd64-linux.o amd64-avx-linux.o"
 
-srv_i386_32bit_xmlfiles="i386/32bit-core.xml i386/32bit-sse.xml"
-srv_i386_64bit_xmlfiles="i386/64bit-core.xml i386/64bit-sse.xml"
-srv_i386_xmlfiles="i386/i386.xml $srv_i386_32bit_xmlfiles"
-srv_amd64_xmlfiles="i386/amd64.xml $srv_i386_64bit_xmlfiles"
-srv_i386_linux_xmlfiles="i386/i386-linux.xml i386/32bit-linux.xml $srv_i386_32bit_xmlfiles"
-srv_amd64_linux_xmlfiles="i386/amd64-linux.xml i386/64bit-linux.xml $srv_i386_64bit_xmlfiles"
+srv_i386_32bit_xmlfiles="i386/32bit-core.xml i386/32bit-sse.xml i386/32bit-avx.xml"
+srv_i386_64bit_xmlfiles="i386/64bit-core.xml i386/64bit-sse.xml i386/64bit-avx.xml"
+srv_i386_xmlfiles="i386/i386.xml i386/i386-avx.xml $srv_i386_32bit_xmlfiles"
+srv_amd64_xmlfiles="i386/amd64.xml i386/amd64-avx.xml $srv_i386_64bit_xmlfiles"
+srv_i386_linux_xmlfiles="i386/i386-linux.xml i386/i386-avx-linux.xml i386/32bit-linux.xml $srv_i386_32bit_xmlfiles"
+srv_amd64_linux_xmlfiles="i386/amd64-linux.xml i386/amd64-avx-linux.xml i386/64bit-linux.xml $srv_i386_64bit_xmlfiles"
 
 # Input is taken from the "${target}" variable.
 
diff --git a/gdb/gdbserver/i387-fp.c b/gdb/gdbserver/i387-fp.c
index 7ef4ba3..5461022 100644
--- a/gdb/gdbserver/i387-fp.c
+++ b/gdb/gdbserver/i387-fp.c
@@ -19,6 +19,7 @@
 
 #include "server.h"
 #include "i387-fp.h"
+#include "i386-xstate.h"
 
 int num_xmm_registers = 8;
 
@@ -72,6 +73,46 @@ struct i387_fxsave {
   unsigned char xmm_space[256];
 };
 
+struct i387_xsave {
+  /* All these are only sixteen bits, plus padding, except for fop (which
+     is only eleven bits), and fooff / fioff (which are 32 bits each).  */
+  unsigned short fctrl;
+  unsigned short fstat;
+  unsigned short ftag;
+  unsigned short fop;
+  unsigned int fioff;
+  unsigned short fiseg;
+  unsigned short pad1;
+  unsigned int fooff;
+  unsigned short foseg;
+  unsigned short pad12;
+
+  unsigned int mxcsr;
+  unsigned int mxcsr_mask;
+
+  /* Space for eight 80-bit FP values in 128-bit spaces.  */
+  unsigned char st_space[128];
+
+  /* Space for eight 128-bit XMM values, or 16 on x86-64.  */
+  unsigned char xmm_space[256];
+
+  unsigned char reserved1[48];
+
+  /* The extended control register 0 (the XFEATURE_ENABLED_MASK
+     register).  */
+  unsigned long long xcr0;
+
+  unsigned char reserved2[40];
+
+  /* The XSTATE_BV bit vector.  */
+  unsigned long long xstate_bv;
+
+  unsigned char reserved3[56];
+
+  /* Space for eight upper 128-bit YMM values, or 16 on x86-64.  */
+  unsigned char ymmh_space[256];
+};
+
 void
 i387_cache_to_fsave (struct regcache *regcache, void *buf)
 {
@@ -199,6 +240,128 @@ i387_cache_to_fxsave (struct regcache *regcache, void *buf)
   fp->foseg = val;
 }
 
+void
+i387_cache_to_xsave (struct regcache *regcache, void *buf)
+{
+  struct i387_xsave *fp = (struct i387_xsave *) buf;
+  int i;
+  unsigned long val, val2;
+  unsigned int clear_bv;
+  unsigned long long xstate_bv = 0;
+  char raw[16];
+  char *p;
+
+  /* The supported bits in `xstate_bv' are 1 byte.  Clear part in
+     vector registers if its bit in xstate_bv is zero.  */
+  clear_bv = (~fp->xstate_bv) & x86_xcr0;
+
+  /* Clear part in x87 and vector registers if its bit in xstate_bv is
+     zero.  */
+  if (clear_bv)
+    {
+      if ((clear_bv & I386_XSTATE_X87))
+	for (i = 0; i < 8; i++)
+	  memset (((char *) &fp->st_space[0]) + i * 16, 0, 10);
+
+      if ((clear_bv & I386_XSTATE_SSE))
+	for (i = 0; i < num_xmm_registers; i++) 
+	  memset (((char *) &fp->xmm_space[0]) + i * 16, 0, 16);
+
+      if ((clear_bv & I386_XSTATE_AVX))
+	for (i = 0; i < num_xmm_registers; i++) 
+	  memset (((char *) &fp->ymmh_space[0]) + i * 16, 0, 16);
+    }
+
+  /* Check if any x87 registers are changed.  */
+  if ((x86_xcr0 & I386_XSTATE_X87))
+    {
+      int st0_regnum = find_regno ("st0");
+
+      for (i = 0; i < 8; i++)
+	{
+	  collect_register (regcache, i + st0_regnum, raw);
+	  p = ((char *) &fp->st_space[0]) + i * 16;
+	  if (memcmp (raw, p, 10))
+	    {
+	      xstate_bv |= I386_XSTATE_X87;
+	      memcpy (p, raw, 10);
+	    }
+	}
+    }
+
+  /* Check if any SSE registers are changed.  */
+  if ((x86_xcr0 & I386_XSTATE_SSE))
+    {
+      int xmm0_regnum = find_regno ("xmm0");
+
+      for (i = 0; i < num_xmm_registers; i++) 
+	{
+	  collect_register (regcache, i + xmm0_regnum, raw);
+	  p = ((char *) &fp->xmm_space[0]) + i * 16;
+	  if (memcmp (raw, p, 16))
+	    {
+	      xstate_bv |= I386_XSTATE_SSE;
+	      memcpy (p, raw, 16);
+	    }
+	}
+    }
+
+  /* Check if any AVX registers are changed.  */
+  if ((x86_xcr0 & I386_XSTATE_AVX))
+    {
+      int ymm0h_regnum = find_regno ("ymm0h");
+
+      for (i = 0; i < num_xmm_registers; i++) 
+	{
+	  collect_register (regcache, i + ymm0h_regnum, raw);
+	  p = ((char *) &fp->ymmh_space[0]) + i * 16;
+	  if (memcmp (raw, p, 16))
+	    {
+	      xstate_bv |= I386_XSTATE_AVX;
+	      memcpy (p, raw, 16);
+	    }
+	}
+    }
+
+  /* Update the corresponding bits in xstate_bv if any SSE/AVX
+     registers are changed.  */
+  fp->xstate_bv |= xstate_bv;
+
+  collect_register_by_name (regcache, "fioff", &fp->fioff);
+  collect_register_by_name (regcache, "fooff", &fp->fooff);
+  collect_register_by_name (regcache, "mxcsr", &fp->mxcsr);
+
+  /* This one's 11 bits... */
+  collect_register_by_name (regcache, "fop", &val2);
+  fp->fop = (val2 & 0x7FF) | (fp->fop & 0xF800);
+
+  /* Some registers are 16-bit.  */
+  collect_register_by_name (regcache, "fctrl", &val);
+  fp->fctrl = val;
+
+  collect_register_by_name (regcache, "fstat", &val);
+  fp->fstat = val;
+
+  /* Convert to the simplified tag form stored in fxsave data.  */
+  collect_register_by_name (regcache, "ftag", &val);
+  val &= 0xFFFF;
+  val2 = 0;
+  for (i = 7; i >= 0; i--)
+    {
+      int tag = (val >> (i * 2)) & 3;
+
+      if (tag != 3)
+	val2 |= (1 << i);
+    }
+  fp->ftag = val2;
+
+  collect_register_by_name (regcache, "fiseg", &val);
+  fp->fiseg = val;
+
+  collect_register_by_name (regcache, "foseg", &val);
+  fp->foseg = val;
+}
+
 static int
 i387_ftag (struct i387_fxsave *fp, int regno)
 {
@@ -296,3 +459,107 @@ i387_fxsave_to_cache (struct regcache *regcache, const void *buf)
   val = (fp->fop) & 0x7FF;
   supply_register_by_name (regcache, "fop", &val);
 }
+
+void
+i387_xsave_to_cache (struct regcache *regcache, const void *buf)
+{
+  struct i387_xsave *fp = (struct i387_xsave *) buf;
+  struct i387_fxsave *fxp = (struct i387_fxsave *) buf;
+  int i, top;
+  unsigned long val;
+  unsigned int clear_bv;
+  char *p;
+
+  /* The supported bits in `xstate_bv' are 1 byte.  Clear part in
+     vector registers if its bit in xstate_bv is zero.  */
+  clear_bv = (~fp->xstate_bv) & x86_xcr0;
+
+  /* Check if any x87 registers are changed.  */
+  if ((x86_xcr0 & I386_XSTATE_X87))
+    {
+      int st0_regnum = find_regno ("st0");
+
+      if ((clear_bv & I386_XSTATE_X87))
+	p = NULL;
+      else
+	p = (char *) buf;
+
+      for (i = 0; i < 8; i++)
+	{
+	  if (p)
+	    p = ((char *) &fp->st_space[0]) + i * 16;
+	  supply_register (regcache, i + st0_regnum, p);
+	}
+    }
+
+  if ((x86_xcr0 & I386_XSTATE_SSE))
+    {
+      int xmm0_regnum = find_regno ("xmm0");
+
+      if ((clear_bv & I386_XSTATE_SSE))
+	p = NULL;
+      else
+	p = (char *) buf;
+
+      for (i = 0; i < num_xmm_registers; i++)
+	{
+	  if (p)
+	    p = ((char *) &fp->xmm_space[0]) + i * 16;
+	  supply_register (regcache, i + xmm0_regnum, p);
+	}
+    }
+
+  if ((x86_xcr0 & I386_XSTATE_AVX))
+    {
+      int ymm0h_regnum = find_regno ("ymm0h");
+
+      if ((clear_bv & I386_XSTATE_AVX))
+	p = NULL;
+      else
+	p = (char *) buf;
+
+      for (i = 0; i < num_xmm_registers; i++)
+	{
+	  if (p)
+	    p = ((char *) &fp->ymmh_space[0]) + i * 16;
+	  supply_register (regcache, i + ymm0h_regnum, p);
+	}
+    }
+
+  supply_register_by_name (regcache, "fioff", &fp->fioff);
+  supply_register_by_name (regcache, "fooff", &fp->fooff);
+  supply_register_by_name (regcache, "mxcsr", &fp->mxcsr);
+
+  /* Some registers are 16-bit.  */
+  val = fp->fctrl & 0xFFFF;
+  supply_register_by_name (regcache, "fctrl", &val);
+
+  val = fp->fstat & 0xFFFF;
+  supply_register_by_name (regcache, "fstat", &val);
+
+  /* Generate the form of ftag data that GDB expects.  */
+  top = (fp->fstat >> 11) & 0x7;
+  val = 0;
+  for (i = 7; i >= 0; i--)
+    {
+      int tag;
+      if (fp->ftag & (1 << i))
+	tag = i387_ftag (fxp, (i + 8 - top) % 8);
+      else
+	tag = 3;
+      val |= tag << (2 * i);
+    }
+  supply_register_by_name (regcache, "ftag", &val);
+
+  val = fp->fiseg & 0xFFFF;
+  supply_register_by_name (regcache, "fiseg", &val);
+
+  val = fp->foseg & 0xFFFF;
+  supply_register_by_name (regcache, "foseg", &val);
+
+  val = (fp->fop) & 0x7FF;
+  supply_register_by_name (regcache, "fop", &val);
+}
+
+/* Default to SSE.  */
+unsigned long long x86_xcr0 = I386_XSTATE_SSE_MASK;
diff --git a/gdb/gdbserver/i387-fp.h b/gdb/gdbserver/i387-fp.h
index d1e0681..ed1a322 100644
--- a/gdb/gdbserver/i387-fp.h
+++ b/gdb/gdbserver/i387-fp.h
@@ -26,6 +26,11 @@ void i387_fsave_to_cache (struct regcache *regcache, const void *buf);
 void i387_cache_to_fxsave (struct regcache *regcache, void *buf);
 void i387_fxsave_to_cache (struct regcache *regcache, const void *buf);
 
+void i387_cache_to_xsave (struct regcache *regcache, void *buf);
+void i387_xsave_to_cache (struct regcache *regcache, const void *buf);
+
+extern unsigned long long x86_xcr0;
+
 extern int num_xmm_registers;
 
 #endif /* I387_FP_H */
diff --git a/gdb/gdbserver/linux-arm-low.c b/gdb/gdbserver/linux-arm-low.c
index 54668f8..32bd7bb 100644
--- a/gdb/gdbserver/linux-arm-low.c
+++ b/gdb/gdbserver/linux-arm-low.c
@@ -354,16 +354,16 @@ arm_arch_setup (void)
 }
 
 struct regset_info target_regsets[] = {
-  { PTRACE_GETREGS, PTRACE_SETREGS, 18 * 4,
+  { PTRACE_GETREGS, PTRACE_SETREGS, 0, 18 * 4,
     GENERAL_REGS,
     arm_fill_gregset, arm_store_gregset },
-  { PTRACE_GETWMMXREGS, PTRACE_SETWMMXREGS, 16 * 8 + 6 * 4,
+  { PTRACE_GETWMMXREGS, PTRACE_SETWMMXREGS, 0, 16 * 8 + 6 * 4,
     EXTENDED_REGS,
     arm_fill_wmmxregset, arm_store_wmmxregset },
-  { PTRACE_GETVFPREGS, PTRACE_SETVFPREGS, 32 * 8 + 4,
+  { PTRACE_GETVFPREGS, PTRACE_SETVFPREGS, 0, 32 * 8 + 4,
     EXTENDED_REGS,
     arm_fill_vfpregset, arm_store_vfpregset },
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 struct linux_target_ops the_low_target = {
diff --git a/gdb/gdbserver/linux-crisv32-low.c b/gdb/gdbserver/linux-crisv32-low.c
index 6ba48b6..d426c32 100644
--- a/gdb/gdbserver/linux-crisv32-low.c
+++ b/gdb/gdbserver/linux-crisv32-low.c
@@ -365,9 +365,9 @@ cris_store_gregset (const void *buf)
 typedef unsigned long elf_gregset_t[cris_num_regs];
 
 struct regset_info target_regsets[] = {
-  { PTRACE_GETREGS, PTRACE_SETREGS, sizeof (elf_gregset_t),
+  { PTRACE_GETREGS, PTRACE_SETREGS, 0, sizeof (elf_gregset_t),
     GENERAL_REGS, cris_fill_gregset, cris_store_gregset },
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 struct linux_target_ops the_low_target = {
diff --git a/gdb/gdbserver/linux-low.c b/gdb/gdbserver/linux-low.c
index 6499ca7..4edb152 100644
--- a/gdb/gdbserver/linux-low.c
+++ b/gdb/gdbserver/linux-low.c
@@ -39,6 +39,7 @@
 #include <dirent.h>
 #include <sys/stat.h>
 #include <sys/vfs.h>
+#include <sys/uio.h>
 #ifndef ELFMAG0
 /* Don't include <linux/elf.h> here.  If it got included by gdb_proc_service.h
    then ELFMAG0 will have been defined.  If it didn't get included by
@@ -2281,14 +2282,15 @@ regsets_fetch_inferior_registers (struct regcache *regcache)
   struct regset_info *regset;
   int saw_general_regs = 0;
   int pid;
+  struct iovec iov;
 
   regset = target_regsets;
 
   pid = lwpid_of (get_thread_lwp (current_inferior));
   while (regset->size >= 0)
     {
-      void *buf;
-      int res;
+      void *buf, *data;
+      int nt_type, res;
 
       if (regset->size == 0 || disabled_regsets[regset - target_regsets])
 	{
@@ -2297,10 +2299,21 @@ regsets_fetch_inferior_registers (struct regcache *regcache)
 	}
 
       buf = xmalloc (regset->size);
+
+      nt_type = regset->nt_type;
+      if (nt_type)
+	{
+	  iov.iov_base = buf;
+	  iov.iov_len = regset->size;
+	  data = (void *) &iov;
+	}
+      else
+	data = buf;
+
 #ifndef __sparc__
-      res = ptrace (regset->get_request, pid, 0, buf);
+      res = ptrace (regset->get_request, pid, nt_type, data);
 #else
-      res = ptrace (regset->get_request, pid, buf, 0);
+      res = ptrace (regset->get_request, pid, data, nt_type);
 #endif
       if (res < 0)
 	{
@@ -2338,14 +2351,15 @@ regsets_store_inferior_registers (struct regcache *regcache)
   struct regset_info *regset;
   int saw_general_regs = 0;
   int pid;
+  struct iovec iov;
 
   regset = target_regsets;
 
   pid = lwpid_of (get_thread_lwp (current_inferior));
   while (regset->size >= 0)
     {
-      void *buf;
-      int res;
+      void *buf, *data;
+      int nt_type, res;
 
       if (regset->size == 0 || disabled_regsets[regset - target_regsets])
 	{
@@ -2358,10 +2372,21 @@ regsets_store_inferior_registers (struct regcache *regcache)
       /* First fill the buffer with the current register set contents,
 	 in case there are any items in the kernel's regset that are
 	 not in gdbserver's regcache.  */
+
+      nt_type = regset->nt_type;
+      if (nt_type)
+	{
+	  iov.iov_base = buf;
+	  iov.iov_len = regset->size;
+	  data = (void *) &iov;
+	}
+      else
+	data = buf;
+
 #ifndef __sparc__
-      res = ptrace (regset->get_request, pid, 0, buf);
+      res = ptrace (regset->get_request, pid, nt_type, data);
 #else
-      res = ptrace (regset->get_request, pid, buf, 0);
+      res = ptrace (regset->get_request, pid, data, nt_type);
 #endif
 
       if (res == 0)
@@ -2371,9 +2396,9 @@ regsets_store_inferior_registers (struct regcache *regcache)
 
 	  /* Only now do we write the register set.  */
 #ifndef __sparc__
-	  res = ptrace (regset->set_request, pid, 0, buf);
+	  res = ptrace (regset->set_request, pid, nt_type, data);
 #else
-	  res = ptrace (regset->set_request, pid, buf, 0);
+	  res = ptrace (regset->set_request, pid, data, nt_type);
 #endif
 	}
 
@@ -3434,6 +3459,13 @@ linux_core_of_thread (ptid_t ptid)
   return core;
 }
 
+static void
+linux_process_qsupported (const char *query)
+{
+  if (the_low_target.process_qsupported != NULL)
+    the_low_target.process_qsupported (query);
+}
+
 static struct target_ops linux_target_ops = {
   linux_create_inferior,
   linux_attach,
@@ -3477,7 +3509,8 @@ static struct target_ops linux_target_ops = {
 #else
   NULL,
 #endif
-  linux_core_of_thread
+  linux_core_of_thread,
+  linux_process_qsupported
 };
 
 static void
diff --git a/gdb/gdbserver/linux-low.h b/gdb/gdbserver/linux-low.h
index 82ad00c..57e7adb 100644
--- a/gdb/gdbserver/linux-low.h
+++ b/gdb/gdbserver/linux-low.h
@@ -35,6 +35,9 @@ enum regset_type {
 struct regset_info
 {
   int get_request, set_request;
+  /* If NT_TYPE isn't 0, it will be passed to ptrace as the 3rd
+     argument and the 4th argument should be "const struct iovec *".  */
+  int nt_type;
   int size;
   enum regset_type type;
   regset_fill_func fill_function;
@@ -111,6 +114,9 @@ struct linux_target_ops
 
   /* Hook to call prior to resuming a thread.  */
   void (*prepare_to_resume) (struct lwp_info *);
+
+  /* Hook to support target specific qSupported.  */
+  void (*process_qsupported) (const char *);
 };
 
 extern struct linux_target_ops the_low_target;
diff --git a/gdb/gdbserver/linux-m68k-low.c b/gdb/gdbserver/linux-m68k-low.c
index 14e3864..6c98bb1 100644
--- a/gdb/gdbserver/linux-m68k-low.c
+++ b/gdb/gdbserver/linux-m68k-low.c
@@ -112,14 +112,14 @@ m68k_store_fpregset (struct regcache *regcache, const void *buf)
 
 struct regset_info target_regsets[] = {
 #ifdef HAVE_PTRACE_GETREGS
-  { PTRACE_GETREGS, PTRACE_SETREGS, sizeof (elf_gregset_t),
+  { PTRACE_GETREGS, PTRACE_SETREGS, 0, sizeof (elf_gregset_t),
     GENERAL_REGS,
     m68k_fill_gregset, m68k_store_gregset },
-  { PTRACE_GETFPREGS, PTRACE_SETFPREGS, sizeof (elf_fpregset_t),
+  { PTRACE_GETFPREGS, PTRACE_SETFPREGS, 0, sizeof (elf_fpregset_t),
     FP_REGS,
     m68k_fill_fpregset, m68k_store_fpregset },
 #endif /* HAVE_PTRACE_GETREGS */
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 static const unsigned char m68k_breakpoint[] = { 0x4E, 0x4F };
diff --git a/gdb/gdbserver/linux-mips-low.c b/gdb/gdbserver/linux-mips-low.c
index 70f6700..1c04b2e 100644
--- a/gdb/gdbserver/linux-mips-low.c
+++ b/gdb/gdbserver/linux-mips-low.c
@@ -343,12 +343,12 @@ mips_store_fpregset (struct regcache *regcache, const void *buf)
 
 struct regset_info target_regsets[] = {
 #ifdef HAVE_PTRACE_GETREGS
-  { PTRACE_GETREGS, PTRACE_SETREGS, 38 * 8, GENERAL_REGS,
+  { PTRACE_GETREGS, PTRACE_SETREGS, 0, 38 * 8, GENERAL_REGS,
     mips_fill_gregset, mips_store_gregset },
-  { PTRACE_GETFPREGS, PTRACE_SETFPREGS, 33 * 8, FP_REGS,
+  { PTRACE_GETFPREGS, PTRACE_SETFPREGS, 0, 33 * 8, FP_REGS,
     mips_fill_fpregset, mips_store_fpregset },
 #endif /* HAVE_PTRACE_GETREGS */
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 struct linux_target_ops the_low_target = {
diff --git a/gdb/gdbserver/linux-ppc-low.c b/gdb/gdbserver/linux-ppc-low.c
index 10a1309..0dab604 100644
--- a/gdb/gdbserver/linux-ppc-low.c
+++ b/gdb/gdbserver/linux-ppc-low.c
@@ -593,14 +593,14 @@ struct regset_info target_regsets[] = {
      fetch them every time, but still fall back to PTRACE_PEEKUSER for the
      general registers.  Some kernels support these, but not the newer
      PPC_PTRACE_GETREGS.  */
-  { PTRACE_GETVSXREGS, PTRACE_SETVSXREGS, SIZEOF_VSXREGS, EXTENDED_REGS,
+  { PTRACE_GETVSXREGS, PTRACE_SETVSXREGS, 0, SIZEOF_VSXREGS, EXTENDED_REGS,
   ppc_fill_vsxregset, ppc_store_vsxregset },
-  { PTRACE_GETVRREGS, PTRACE_SETVRREGS, SIZEOF_VRREGS, EXTENDED_REGS,
+  { PTRACE_GETVRREGS, PTRACE_SETVRREGS, 0, SIZEOF_VRREGS, EXTENDED_REGS,
     ppc_fill_vrregset, ppc_store_vrregset },
-  { PTRACE_GETEVRREGS, PTRACE_SETEVRREGS, 32 * 4 + 8 + 4, EXTENDED_REGS,
+  { PTRACE_GETEVRREGS, PTRACE_SETEVRREGS, 0, 32 * 4 + 8 + 4, EXTENDED_REGS,
     ppc_fill_evrregset, ppc_store_evrregset },
-  { 0, 0, 0, GENERAL_REGS, ppc_fill_gregset, NULL },
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, 0, GENERAL_REGS, ppc_fill_gregset, NULL },
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 struct linux_target_ops the_low_target = {
diff --git a/gdb/gdbserver/linux-s390-low.c b/gdb/gdbserver/linux-s390-low.c
index 5460f57..eb865dc 100644
--- a/gdb/gdbserver/linux-s390-low.c
+++ b/gdb/gdbserver/linux-s390-low.c
@@ -181,8 +181,8 @@ static void s390_fill_gregset (struct regcache *regcache, void *buf)
 }
 
 struct regset_info target_regsets[] = {
-  { 0, 0, 0, GENERAL_REGS, s390_fill_gregset, NULL },
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, 0, GENERAL_REGS, s390_fill_gregset, NULL },
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 
diff --git a/gdb/gdbserver/linux-sh-low.c b/gdb/gdbserver/linux-sh-low.c
index 9d27e7f..87a0dd2 100644
--- a/gdb/gdbserver/linux-sh-low.c
+++ b/gdb/gdbserver/linux-sh-low.c
@@ -104,8 +104,8 @@ static void sh_fill_gregset (struct regcache *regcache, void *buf)
 }
 
 struct regset_info target_regsets[] = {
-  { 0, 0, 0, GENERAL_REGS, sh_fill_gregset, NULL },
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, 0, GENERAL_REGS, sh_fill_gregset, NULL },
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 struct linux_target_ops the_low_target = {
diff --git a/gdb/gdbserver/linux-sparc-low.c b/gdb/gdbserver/linux-sparc-low.c
index 0bb5f2f..e0bfe81 100644
--- a/gdb/gdbserver/linux-sparc-low.c
+++ b/gdb/gdbserver/linux-sparc-low.c
@@ -260,13 +260,13 @@ sparc_reinsert_addr (void)
 
 
 struct regset_info target_regsets[] = {
-  { PTRACE_GETREGS, PTRACE_SETREGS, sizeof (elf_gregset_t),
+  { PTRACE_GETREGS, PTRACE_SETREGS, 0, sizeof (elf_gregset_t),
     GENERAL_REGS,
     sparc_fill_gregset, sparc_store_gregset },
-  { PTRACE_GETFPREGS, PTRACE_SETFPREGS, sizeof (fpregset_t),
+  { PTRACE_GETFPREGS, PTRACE_SETFPREGS, 0, sizeof (fpregset_t),
     FP_REGS,
     sparc_fill_fpregset, sparc_store_fpregset },
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 struct linux_target_ops the_low_target = {
diff --git a/gdb/gdbserver/linux-x86-low.c b/gdb/gdbserver/linux-x86-low.c
index 496baa2..b9981ec 100644
--- a/gdb/gdbserver/linux-x86-low.c
+++ b/gdb/gdbserver/linux-x86-low.c
@@ -24,6 +24,8 @@
 #include "linux-low.h"
 #include "i387-fp.h"
 #include "i386-low.h"
+#include "i386-xstate.h"
+#include "elf/common.h"
 
 #include "gdb_proc_service.h"
 
@@ -31,10 +33,24 @@
 void init_registers_i386_linux (void);
 /* Defined in auto-generated file amd64-linux.c.  */
 void init_registers_amd64_linux (void);
+/* Defined in auto-generated file i386-avx-linux.c.  */
+void init_registers_i386_avx_linux (void);
+/* Defined in auto-generated file amd64-avx-linux.c.  */
+void init_registers_amd64_avx_linux (void);
 
 #include <sys/reg.h>
 #include <sys/procfs.h>
 #include <sys/ptrace.h>
+#include <sys/uio.h>
+
+#ifndef PTRACE_GETREGSET
+#define PTRACE_GETREGSET	0x4204
+#endif
+
+#ifndef PTRACE_SETREGSET
+#define PTRACE_SETREGSET	0x4205
+#endif
+
 
 #ifndef PTRACE_GET_THREAD_AREA
 #define PTRACE_GET_THREAD_AREA 25
@@ -252,6 +268,18 @@ x86_store_fpxregset (struct regcache *regcache, const void *buf)
 
 #endif
 
+static void
+x86_fill_xstateregset (struct regcache *regcache, void *buf)
+{
+  i387_cache_to_xsave (regcache, buf);
+}
+
+static void
+x86_store_xstateregset (struct regcache *regcache, const void *buf)
+{
+  i387_xsave_to_cache (regcache, buf);
+}
+
 /* ??? The non-biarch i386 case stores all the i387 regs twice.
    Once in i387_.*fsave.* and once in i387_.*fxsave.*.
    This is, presumably, to handle the case where PTRACE_[GS]ETFPXREGS
@@ -264,21 +292,28 @@ x86_store_fpxregset (struct regcache *regcache, const void *buf)
 struct regset_info target_regsets[] =
 {
 #ifdef HAVE_PTRACE_GETREGS
-  { PTRACE_GETREGS, PTRACE_SETREGS, sizeof (elf_gregset_t),
+  { PTRACE_GETREGS, PTRACE_SETREGS, 0, sizeof (elf_gregset_t),
     GENERAL_REGS,
     x86_fill_gregset, x86_store_gregset },
+  { PTRACE_GETREGSET, PTRACE_SETREGSET, NT_X86_XSTATE, 0,
+# ifdef __x86_64__
+    FP_REGS,
+# else
+    EXTENDED_REGS,
+# endif
+    x86_fill_xstateregset, x86_store_xstateregset },
 # ifndef __x86_64__
 #  ifdef HAVE_PTRACE_GETFPXREGS
-  { PTRACE_GETFPXREGS, PTRACE_SETFPXREGS, sizeof (elf_fpxregset_t),
+  { PTRACE_GETFPXREGS, PTRACE_SETFPXREGS, 0, sizeof (elf_fpxregset_t),
     EXTENDED_REGS,
     x86_fill_fpxregset, x86_store_fpxregset },
 #  endif
 # endif
-  { PTRACE_GETFPREGS, PTRACE_SETFPREGS, sizeof (elf_fpregset_t),
+  { PTRACE_GETFPREGS, PTRACE_SETFPREGS, 0, sizeof (elf_fpregset_t),
     FP_REGS,
     x86_fill_fpregset, x86_store_fpregset },
 #endif /* HAVE_PTRACE_GETREGS */
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 static CORE_ADDR
@@ -772,6 +807,65 @@ x86_siginfo_fixup (struct siginfo *native, void *inf, int direction)
   return 0;
 }
 \f
+/* Process qSupported query, "x86=xml".  Update the buffer size for
+   PTRACE_GETREGSET.  */
+
+static void
+x86_linux_process_qsupported (const char *query)
+{
+  int pid;
+  unsigned long long xstateregs[I386_XSTATE_SSE_SIZE / sizeof (long long)];
+  struct iovec iov;
+
+  /* Return if gdb doesn't support XML.   */
+  if (query == NULL || strcmp (query, "x86=xml") != 0)
+    {
+      use_xml = 0;
+      return;
+    }
+
+  /* Check if XSAVE extended state is supported.  */
+  pid = pid_of (get_thread_lwp (current_inferior));
+
+  iov.iov_base = xstateregs;
+  iov.iov_len = I386_XSTATE_SSE_SIZE;
+
+  /* Check if PTRACE_GETREGSET works.  */
+  if (ptrace (PTRACE_GETREGSET, pid,
+	      (unsigned int) NT_X86_XSTATE, (long) &iov) == 0)
+    {
+      struct regset_info *regset;
+      unsigned long long xcr0;
+
+      /* Get XCR0 from XSAVE extended state at byte 464.  */
+      xcr0 = xstateregs[464 / sizeof (long long)];
+
+      /* Use PTRACE_GETREGSET if it is available.  */
+      for (regset = target_regsets;
+	   regset->fill_function != NULL; regset++)
+	if (regset->get_request == PTRACE_GETREGSET)
+	  regset->size = I386_XSTATE_SIZE (xcr0);
+	else if (regset->type != GENERAL_REGS)
+	  regset->size = 0;
+
+      /* AVX is the highest feature we support.  */
+      if ((xcr0 & I386_XSTATE_AVX_MASK) == I386_XSTATE_AVX_MASK)
+	{
+	  x86_xcr0 = xcr0;
+
+#ifdef __x86_64__
+	  /* I386 has 8 xmm regs.  */
+	  if (num_xmm_registers == 8)
+	    init_registers_i386_avx_linux ();
+	  else
+	    init_registers_amd64_avx_linux ();
+#else
+	  init_registers_i386_avx_linux ();
+#endif
+	}
+    }
+}
+
 /* Initialize gdbserver for the architecture of the inferior.  */
 
 static void
@@ -850,5 +944,6 @@ struct linux_target_ops the_low_target =
   x86_siginfo_fixup,
   x86_linux_new_process,
   x86_linux_new_thread,
-  x86_linux_prepare_to_resume
+  x86_linux_prepare_to_resume,
+  x86_linux_process_qsupported 
 };
diff --git a/gdb/gdbserver/linux-xtensa-low.c b/gdb/gdbserver/linux-xtensa-low.c
index c5ed351..8d0e73a 100644
--- a/gdb/gdbserver/linux-xtensa-low.c
+++ b/gdb/gdbserver/linux-xtensa-low.c
@@ -131,13 +131,13 @@ xtensa_store_xtregset (struct regcache *regcache, const void *buf)
 }
 
 struct regset_info target_regsets[] = {
-  { PTRACE_GETREGS, PTRACE_SETREGS, sizeof (elf_gregset_t),
+  { PTRACE_GETREGS, PTRACE_SETREGS, 0, sizeof (elf_gregset_t),
     GENERAL_REGS,
     xtensa_fill_gregset, xtensa_store_gregset },
-  { PTRACE_GETXTREGS, PTRACE_SETXTREGS, XTENSA_ELF_XTREG_SIZE,
+  { PTRACE_GETXTREGS, PTRACE_SETXTREGS, 0, XTENSA_ELF_XTREG_SIZE,
     EXTENDED_REGS,
     xtensa_fill_xtregset, xtensa_store_xtregset },
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 #if XCHAL_HAVE_BE
diff --git a/gdb/gdbserver/server.c b/gdb/gdbserver/server.c
index a03f877..6e46a7a 100644
--- a/gdb/gdbserver/server.c
+++ b/gdb/gdbserver/server.c
@@ -32,6 +32,13 @@
 #include <malloc.h>
 #endif
 
+int use_xml =
+#ifdef USE_XML
+  1;
+#else
+  0;
+#endif
+
 ptid_t cont_thread;
 ptid_t general_thread;
 ptid_t step_thread;
@@ -474,20 +481,19 @@ get_features_xml (const char *annex)
 	annex = gdbserver_xmltarget;
     }
 
-#ifdef USE_XML
-  {
-    extern const char *const xml_builtin[][2];
-    int i;
+  if (use_xml)
+    {
+      extern const char *const xml_builtin[][2];
+      int i;
 
-    /* Look for the annex.  */
-    for (i = 0; xml_builtin[i][0] != NULL; i++)
-      if (strcmp (annex, xml_builtin[i][0]) == 0)
-	break;
+      /* Look for the annex.  */
+      for (i = 0; xml_builtin[i][0] != NULL; i++)
+	if (strcmp (annex, xml_builtin[i][0]) == 0)
+	  break;
 
-    if (xml_builtin[i][0] != NULL)
-      return xml_builtin[i][1];
-  }
-#endif
+      if (xml_builtin[i][0] != NULL)
+	return xml_builtin[i][1];
+    }
 
   return NULL;
 }
@@ -1236,6 +1242,9 @@ handle_query (char *own_buf, int packet_len, int *new_packet_len_p)
     {
       char *p = &own_buf[10];
 
+      /* Start processing qSupported packet.  */
+      target_process_qsupported (NULL);
+
       /* Process each feature being provided by GDB.  The first
 	 feature will follow a ':', and latter features will follow
 	 ';'.  */
@@ -1251,6 +1260,8 @@ handle_query (char *own_buf, int packet_len, int *new_packet_len_p)
 		if (target_supports_multi_process ())
 		  multi_process = 1;
 	      }
+	    else if (strcmp (p, "x86=xml") == 0)
+	      target_process_qsupported (p);
 	  }
 
       sprintf (own_buf, "PacketSize=%x;QPassSignals+", PBUFSIZ - 1);
diff --git a/gdb/gdbserver/server.h b/gdb/gdbserver/server.h
index f46ee60..a9cd024 100644
--- a/gdb/gdbserver/server.h
+++ b/gdb/gdbserver/server.h
@@ -22,6 +22,8 @@
 
 #include "config.h"
 
+extern int use_xml;
+
 #ifdef __MINGW32CE__
 #include "wincecompat.h"
 #endif
diff --git a/gdb/gdbserver/target.h b/gdb/gdbserver/target.h
index ac68652..6109b1c 100644
--- a/gdb/gdbserver/target.h
+++ b/gdb/gdbserver/target.h
@@ -286,6 +286,9 @@ struct target_ops
 
   /* Returns the core given a thread, or -1 if not known.  */
   int (*core_of_thread) (ptid_t);
+
+  /* Target specific qSupported support.  */
+  void (*process_qsupported) (const char *);
 };
 
 extern struct target_ops *the_target;
@@ -326,6 +329,10 @@ void set_target_ops (struct target_ops *);
   (the_target->supports_multi_process ? \
    (*the_target->supports_multi_process) () : 0)
 
+#define target_process_qsupported(query) \
+  if (the_target->process_qsupported) \
+    the_target->process_qsupported (query)
+
 /* Start non-stop mode, returns 0 on success, -1 on failure.   */
 
 int start_non_stop (int nonstop);
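
The xstate_bv handling in i387_cache_to_xsave and i387_xsave_to_cache above comes down to one mask: a component that XCR0 enables but xstate_bv does not mark as live must be read as zero. A minimal sketch of that rule follows. The XSTATE_* constants are assumptions that follow the XCR0 bit assignment (x87 = bit 0, SSE = bit 1, AVX = bit 2); the real I386_XSTATE_* definitions live in i386-xstate.h, which is not part of this message, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed feature bits, mirroring the XCR0 layout.  */
#define XSTATE_X87 (1ULL << 0)
#define XSTATE_SSE (1ULL << 1)
#define XSTATE_AVX (1ULL << 2)

/* Components that are enabled in XCR0 but not live in xstate_bv.
   Their registers must be treated as all zeros.  */
static inline uint64_t
xstate_clear_mask (uint64_t xstate_bv, uint64_t xcr0)
{
  return ~xstate_bv & xcr0;
}

/* After writing a register of COMPONENT into the buffer, mark the
   component live so the kernel restores it.  */
static inline uint64_t
xstate_mark_live (uint64_t xstate_bv, uint64_t component)
{
  return xstate_bv | component;
}
```

For example, with XCR0 = x87|SSE|AVX and xstate_bv = x87|SSE, the clear mask is XSTATE_AVX, so the upper YMM halves are supplied as zero; storing to any ymmNh register then sets the AVX bit again.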


* Re: PATCH: 2/6 [2nd try]: Add AVX support (Update document)
  2010-03-12 16:46     ` H.J. Lu
@ 2010-03-12 18:15       ` Eli Zaretskii
  0 siblings, 0 replies; 115+ messages in thread
From: Eli Zaretskii @ 2010-03-12 18:15 UTC (permalink / raw)
  To: H.J. Lu; +Cc: gdb-patches

> Date: Fri, 12 Mar 2010 08:46:27 -0800
> From: "H.J. Lu" <hongjiu.lu@intel.com>
> 
> On Sat, Mar 06, 2010 at 02:19:46PM -0800, H.J. Lu wrote:
> > Hi,
> > 
> > This patch updates document for AVX support.  OK to install?
> >  
> > Thanks.
> > 
> > 
> > H.J.
> > ---
> > 2010-03-06  H.J. Lu  <hongjiu.lu@intel.com>
> > 
> > 	* gdb.texinfo (General Query Packets): Document x86=xml.
> > 	(i386 Features): Add org.gnu.gdb.i386.avx.
> > 
> 
> Here is the updated patch,

Thanks, it's okay with me.


* Re: PATCH: 3/6 [2nd try]: Add AVX support (i386 changes)
  2010-03-12 16:49       ` H.J. Lu
@ 2010-03-13  1:38         ` H.J. Lu
  2010-03-29  1:11         ` PATCH: 3/6 [3rd " H.J. Lu
  1 sibling, 0 replies; 115+ messages in thread
From: H.J. Lu @ 2010-03-13  1:38 UTC (permalink / raw)
  To: H.J. Lu; +Cc: GDB

On Fri, Mar 12, 2010 at 08:49:30AM -0800, H.J. Lu wrote:
> On Sat, Mar 06, 2010 at 02:20:37PM -0800, H.J. Lu wrote:
> > Hi,
> > 
> > Here are i386 changes to support AVX. OK to install?
> >  
> > Thanks.
> > 
> 
> Here is the updated patch. Any comments/suggestions?
> 

Here is the updated patch which removes xstate_size_n_of_int64.
Any comments/suggestions?

Thanks.


H.J.
---
2010-03-12  H.J. Lu  <hongjiu.lu@intel.com>

	* i386-linux-nat.c: Include "regset.h", "elf/common.h",
	<sys/uio.h> and "i386-xstate.h".
	(PTRACE_GETREGSET): New.
	(PTRACE_SETREGSET): Likewise.
	(xstate_size): Likewise.
	(fetch_xstateregs): Likewise.
	(store_xstateregs): Likewise.
	(GETXSTATEREGS_SUPPLIES): Likewise.
	(regmap): Include 8 upper YMM registers.
	(i386_linux_fetch_inferior_registers): Support XSAVE extended
	state.
	(i386_linux_store_inferior_registers): Likewise.
	(i386_linux_read_description): Check and enable AVX target
	descriptions.

	* i386-linux-tdep.c: Include "regset.h", "i387-tdep.h",
	"i386-xstate.h" and "features/i386/i386-avx-linux.c".
	(i386_linux_regset_sections): Make it global.  Add
	".reg-xstate".
	(i386_linux_gregset_reg_offset): Include 8 upper YMM registers.
	(i386_linux_update_xstateregset): New.
	(i386_linux_core_read_xcr0): Likewise.
	(i386_linux_core_read_description): Check and enable AVX target
	description.
	(i386_linux_init_abi): Set xsave_xcr0_offset.
	(_initialize_i386_linux_tdep): Call
	initialize_tdesc_i386_avx_linux.

	* i386-linux-tdep.h (I386_LINUX_ORIG_EAX_REGNUM): Replace
	I386_SSE_NUM_REGS with I386_AVX_NUM_REGS.
	(i386_linux_core_read_xcr0): New.
	(tdesc_i386_avx_linux): Likewise.
	(i386_linux_regset_sections): Likewise.
	(i386_linux_update_xstateregset): Likewise.
	(I386_LINUX_XSAVE_XCR0_OFFSET): Likewise.

	* i386-tdep.c: Include "i386-xstate.h" and
	"features/i386/i386-avx.c".
	(i386_ymm_names): New.
	(i386_ymmh_names): Likewise.
	(i386_ymmh_regnum_p): Likewise.
	(i386_ymm_regnum_p): Likewise.
	(i386_xmm_regnum_p): Likewise.
	(i386_register_name): Likewise.
	(i386_ymm_type): Likewise.
	(i386_supply_xstateregset): Likewise.
	(i386_collect_xstateregset): Likewise.
	(i386_sse_regnum_p): Removed.
	(i386_pseudo_register_name): Support pseudo YMM registers.
	(i386_pseudo_register_type): Likewise.
	(i386_pseudo_register_read): Likewise.
	(i386_pseudo_register_write): Likewise.
	(i386_dbx_reg_to_regnum): Return %ymmN register number for
	%xmmN if AVX is available.
	(i386_regset_from_core_section): Support .reg-xstate section.
	(i386_register_reggroup_p): Support upper YMM and YMM registers.
	(i386_validate_tdesc_p): Support org.gnu.gdb.i386.avx feature.
	Set ymmh_register_names, num_ymm_regs, ymm0h_regnum and xcr0.
	(i386_gdbarch_init): Set xstateregset.  Set xsave_xcr0_offset. 
	Call set_gdbarch_register_name.  Replace I386_SSE_NUM_REGS with
	I386_AVX_NUM_REGS.  Set ymmh_register_names, ymm0h_regnum and
	num_ymm_regs.  Add num_ymm_regs to set_gdbarch_num_pseudo_regs.
	Set ymm0_regnum.  Call set_gdbarch_qsupported.
	(_initialize_i386_tdep): Call initialize_tdesc_i386_avx.

	* i386-tdep.h (gdbarch_tdep): Add xstateregset, ymm0_regnum,
	xcr0, xsave_xcr0_offset, ymm0h_regnum, ymmh_register_names and
	i386_ymm_type.
	(i386_regnum): Add I386_YMM0H_REGNUM and I386_YMM7H_REGNUM.
	(I386_AVX_NUM_REGS): New.
	(i386_xmm_regnum_p): Likewise.
	(i386_ymm_regnum_p): Likewise.
	(i386_ymmh_regnum_p): Likewise.

	* common/i386-xstate.h: New.
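
As a reading aid for the diff that follows, here is a minimal,
self-contained sketch of how the XSAVE buffer is interpreted: XCR0 is read
from the software-usable bytes at offset 464, xstate_bv from the header at
offset 512, and the buffer size is derived from XCR0.  The I386_* constants
are copied from this patch; the helper names (read_le64, xsave_read_xcr0,
xsave_read_xstate_bv, xsave_has_avx) and XSAVE_XSTATE_BV_OFFSET are
hypothetical and exist only for illustration.

```c
#include <stdint.h>

/* Constants copied from i386-xstate.h and i386-linux-tdep.h in this
   patch.  */
#define I386_XSTATE_X87		(1ULL << 0)
#define I386_XSTATE_SSE		(1ULL << 1)
#define I386_XSTATE_AVX		(1ULL << 2)
#define I386_XSTATE_SSE_MASK	(I386_XSTATE_X87 | I386_XSTATE_SSE)
#define I386_XSTATE_AVX_MASK	(I386_XSTATE_SSE_MASK | I386_XSTATE_AVX)
#define I386_XSTATE_SSE_SIZE	576
#define I386_XSTATE_AVX_SIZE	832
#define I386_XSTATE_SIZE(XCR0)	\
  (((XCR0) & I386_XSTATE_AVX) != 0 \
   ? I386_XSTATE_AVX_SIZE : I386_XSTATE_SSE_SIZE)
#define I386_LINUX_XSAVE_XCR0_OFFSET 464

/* Hypothetical name: offset of xstate_bv, the first field of the
   XSAVE header.  */
#define XSAVE_XSTATE_BV_OFFSET 512

/* Hypothetical helper: decode a little-endian 64-bit value.  */
static uint64_t
read_le64 (const unsigned char *p)
{
  uint64_t v = 0;
  int i;

  for (i = 7; i >= 0; i--)
    v = (v << 8) | p[i];
  return v;
}

/* XCR0 as Linux stores it in the software-usable bytes.  */
static uint64_t
xsave_read_xcr0 (const unsigned char *xsave)
{
  return read_le64 (xsave + I386_LINUX_XSAVE_XCR0_OFFSET);
}

/* Which states are currently in use for this thread.  */
static uint64_t
xsave_read_xstate_bv (const unsigned char *xsave)
{
  return read_le64 (xsave + XSAVE_XSTATE_BV_OFFSET);
}

/* Non-zero if x87, SSE and AVX are all enabled, which is the test the
   patch uses to pick the AVX target description.  */
static int
xsave_has_avx (uint64_t xcr0)
{
  return (xcr0 & I386_XSTATE_AVX_MASK) == I386_XSTATE_AVX_MASK;
}
```

The byte-wise decode avoids depending on host endianness or alignment; the
native code in the patch instead indexes an array of unsigned long long,
which is equivalent on a little-endian x86 host.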

diff --git a/gdb/common/i386-xstate.h b/gdb/common/i386-xstate.h
new file mode 100644
index 0000000..f047d35
--- /dev/null
+++ b/gdb/common/i386-xstate.h
@@ -0,0 +1,40 @@
+/* Common code for i386 XSAVE extended state.
+
+   Copyright (C) 2010 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#ifndef I386_XSTATE_H
+#define I386_XSTATE_H 1
+
+/* The extended state feature bits.  */
+#define I386_XSTATE_X87		(1ULL << 0)
+#define I386_XSTATE_SSE		(1ULL << 1)
+#define I386_XSTATE_AVX		(1ULL << 2)
+
+/* Supported mask and size of the extended state.  */
+#define I386_XSTATE_SSE_MASK	(I386_XSTATE_X87 | I386_XSTATE_SSE)
+#define I386_XSTATE_AVX_MASK	(I386_XSTATE_SSE_MASK | I386_XSTATE_AVX)
+
+#define I386_XSTATE_SSE_SIZE	576
+#define I386_XSTATE_AVX_SIZE	832
+
+/* Get I386 XSAVE extended state size.  */
+#define I386_XSTATE_SIZE(XCR0)	\
+  (((XCR0) & I386_XSTATE_AVX) != 0 \
+   ? I386_XSTATE_AVX_SIZE : I386_XSTATE_SSE_SIZE)
+
+#endif /* I386_XSTATE_H */
diff --git a/gdb/i386-linux-nat.c b/gdb/i386-linux-nat.c
index 31b9086..a724956 100644
--- a/gdb/i386-linux-nat.c
+++ b/gdb/i386-linux-nat.c
@@ -23,11 +23,14 @@
 #include "inferior.h"
 #include "gdbcore.h"
 #include "regcache.h"
+#include "regset.h"
 #include "target.h"
 #include "linux-nat.h"
 
 #include "gdb_assert.h"
 #include "gdb_string.h"
+#include "elf/common.h"
+#include <sys/uio.h>
 #include <sys/ptrace.h>
 #include <sys/user.h>
 #include <sys/procfs.h>
@@ -69,6 +72,22 @@
 
 /* Defines ps_err_e, struct ps_prochandle.  */
 #include "gdb_proc_service.h"
+
+#include "i386-xstate.h"
+
+#ifndef PTRACE_GETREGSET
+#define PTRACE_GETREGSET	0x4204
+#endif
+
+#ifndef PTRACE_SETREGSET
+#define PTRACE_SETREGSET	0x4205
+#endif
+
+/* The extended state size in bytes.  */
+static unsigned int xstate_size;
+
+/* Does the current host support PTRACE_GETREGSET?  */
+static int have_ptrace_getregset = -1;
 \f
 
 /* The register sets used in GNU/Linux ELF core-dumps are identical to
@@ -98,6 +117,8 @@ static int regmap[] =
   -1, -1, -1, -1,		/* xmm0, xmm1, xmm2, xmm3 */
   -1, -1, -1, -1,		/* xmm4, xmm5, xmm6, xmm6 */
   -1,				/* mxcsr */
+  -1, -1, -1, -1,		/* ymm0h, ymm1h, ymm2h, ymm3h */
+  -1, -1, -1, -1,		/* ymm4h, ymm5h, ymm6h, ymm7h */
   ORIG_EAX
 };
 
@@ -110,6 +131,9 @@ static int regmap[] =
 #define GETFPXREGS_SUPPLIES(regno) \
   (I386_ST0_REGNUM <= (regno) && (regno) < I386_SSE_NUM_REGS)
 
+#define GETXSTATEREGS_SUPPLIES(regno) \
+  (I386_ST0_REGNUM <= (regno) && (regno) < I386_AVX_NUM_REGS)
+
 /* Does the current host support the GETREGS request?  */
 int have_ptrace_getregs =
 #ifdef HAVE_PTRACE_GETREGS
@@ -355,6 +379,57 @@ static void store_fpregs (const struct regcache *regcache, int tid, int regno) {
 
 /* Transfering floating-point and SSE registers to and from GDB.  */
 
+/* Fetch all registers covered by the PTRACE_GETREGSET request from
+   process/thread TID and store their values in GDB's register array.
+   Return non-zero if successful, zero otherwise.  */
+
+static int
+fetch_xstateregs (struct regcache *regcache, int tid)
+{
+  char xstateregs[xstate_size];
+  struct iovec iov;
+
+  if (!have_ptrace_getregset)
+    return 0;
+
+  iov.iov_base = xstateregs;
+  iov.iov_len = xstate_size;
+  if (ptrace (PTRACE_GETREGSET, tid, (unsigned int) NT_X86_XSTATE,
+	      (int) &iov) < 0)
+    perror_with_name (_("Couldn't read extended state status"));
+
+  i387_supply_xsave (regcache, -1, xstateregs);
+  return 1;
+}
+
+/* Store all valid registers in GDB's register array covered by the
+   PTRACE_SETREGSET request into the process/thread specified by TID.
+   Return non-zero if successful, zero otherwise.  */
+
+static int
+store_xstateregs (const struct regcache *regcache, int tid, int regno)
+{
+  char xstateregs[xstate_size];
+  struct iovec iov;
+
+  if (!have_ptrace_getregset)
+    return 0;
+  
+  iov.iov_base = xstateregs;
+  iov.iov_len = xstate_size;
+  if (ptrace (PTRACE_GETREGSET, tid, (unsigned int) NT_X86_XSTATE,
+	      (int) &iov) < 0)
+    perror_with_name (_("Couldn't read extended state status"));
+
+  i387_collect_xsave (regcache, regno, xstateregs, 0);
+
+  if (ptrace (PTRACE_SETREGSET, tid, (unsigned int) NT_X86_XSTATE,
+	      (int) &iov) < 0)
+    perror_with_name (_("Couldn't write extended state status"));
+
+  return 1;
+}
+
 #ifdef HAVE_PTRACE_GETFPXREGS
 
 /* Fill GDB's register array with the floating-point and SSE register
@@ -489,6 +564,8 @@ i386_linux_fetch_inferior_registers (struct target_ops *ops,
 	  return;
 	}
 
+      if (fetch_xstateregs (regcache, tid))
+	return;
       if (fetch_fpxregs (regcache, tid))
 	return;
       fetch_fpregs (regcache, tid);
@@ -501,6 +578,12 @@ i386_linux_fetch_inferior_registers (struct target_ops *ops,
       return;
     }
 
+  if (GETXSTATEREGS_SUPPLIES (regno))
+    {
+      if (fetch_xstateregs (regcache, tid))
+	return;
+    }
+
   if (GETFPXREGS_SUPPLIES (regno))
     {
       if (fetch_fpxregs (regcache, tid))
@@ -553,6 +636,8 @@ i386_linux_store_inferior_registers (struct target_ops *ops,
   if (regno == -1)
     {
       store_regs (regcache, tid, regno);
+      if (store_xstateregs (regcache, tid, regno))
+	return;
       if (store_fpxregs (regcache, tid, regno))
 	return;
       store_fpregs (regcache, tid, regno);
@@ -565,6 +650,12 @@ i386_linux_store_inferior_registers (struct target_ops *ops,
       return;
     }
 
+  if (GETXSTATEREGS_SUPPLIES (regno))
+    {
+      if (store_xstateregs (regcache, tid, regno))
+	return;
+    }
+
   if (GETFPXREGS_SUPPLIES (regno))
     {
       if (store_fpxregs (regcache, tid, regno))
@@ -858,7 +949,48 @@ i386_linux_child_post_startup_inferior (ptid_t ptid)
 static const struct target_desc *
 i386_linux_read_description (struct target_ops *ops)
 {
-  return tdesc_i386_linux;
+  static unsigned long long xcr0;
+
+  if (have_ptrace_getregset == -1)
+    {
+      int tid;
+      unsigned long long xstateregs[(I386_XSTATE_SSE_SIZE
+				     / sizeof (long long))];
+      struct iovec iov;
+
+      /* GNU/Linux LWP ID's are process ID's.  */
+      tid = TIDGET (inferior_ptid);
+      if (tid == 0)
+	tid = PIDGET (inferior_ptid); /* Not a threaded program.  */
+
+      iov.iov_base = xstateregs;
+      iov.iov_len = I386_XSTATE_SSE_SIZE;
+
+      /* Check if PTRACE_GETREGSET works.  */
+      if (ptrace (PTRACE_GETREGSET, tid,
+		  (unsigned int) NT_X86_XSTATE, (long) &iov) < 0)
+	have_ptrace_getregset = 0;
+      else
+	{
+	  have_ptrace_getregset = 1;
+
+	  /* Get XCR0 from XSAVE extended state.  */
+	  xcr0 = xstateregs[(I386_LINUX_XSAVE_XCR0_OFFSET
+			     / sizeof (long long))];
+
+	  xstate_size = I386_XSTATE_SIZE (xcr0);
+	}
+
+      i386_linux_update_xstateregset (i386_linux_regset_sections,
+				      xstate_size);
+    }
+
+  /* Check the native XCR0 only if PTRACE_GETREGSET is available.  */
+  if (have_ptrace_getregset
+      && (xcr0 & I386_XSTATE_AVX_MASK) == I386_XSTATE_AVX_MASK)
+    return tdesc_i386_avx_linux;
+  else
+    return tdesc_i386_linux;
 }
 
 void
diff --git a/gdb/i386-linux-tdep.c b/gdb/i386-linux-tdep.c
index b23c109..66ecf84 100644
--- a/gdb/i386-linux-tdep.c
+++ b/gdb/i386-linux-tdep.c
@@ -23,6 +23,7 @@
 #include "frame.h"
 #include "value.h"
 #include "regcache.h"
+#include "regset.h"
 #include "inferior.h"
 #include "osabi.h"
 #include "reggroups.h"
@@ -36,9 +37,11 @@
 #include "solib-svr4.h"
 #include "symtab.h"
 #include "arch-utils.h"
-#include "regset.h"
 #include "xml-syscall.h"
 
+#include "i387-tdep.h"
+#include "i386-xstate.h"
+
 /* The syscall's XML filename for i386.  */
 #define XML_SYSCALL_FILENAME_I386 "syscalls/i386-linux.xml"
 
@@ -47,13 +50,15 @@
 #include <stdint.h>
 
 #include "features/i386/i386-linux.c"
+#include "features/i386/i386-avx-linux.c"
 
 /* Supported register note sections.  */
-static struct core_regset_section i386_linux_regset_sections[] =
+struct core_regset_section i386_linux_regset_sections[] =
 {
   { ".reg", 144, "general-purpose" },
   { ".reg2", 108, "floating-point" },
   { ".reg-xfp", 512, "extended floating-point" },
+  { ".reg-xstate", 0, "XSAVE extended state" },
   { NULL, 0 }
 };
 
@@ -533,6 +538,7 @@ static int i386_linux_gregset_reg_offset[] =
   -1, -1, -1, -1, -1, -1, -1, -1,
   -1, -1, -1, -1, -1, -1, -1, -1,
   -1,
+  -1, -1, -1, -1, -1, -1, -1, -1,
   11 * 4			/* "orig_eax" */
 };
 
@@ -560,6 +566,66 @@ static int i386_linux_sc_reg_offset[] =
   0 * 4				/* %gs */
 };
 
+/* Update XSAVE extended state register note section.  */
+
+void
+i386_linux_update_xstateregset
+  (struct core_regset_section *regset_sections, unsigned int xstate_size)
+{
+  int i;
+
+  /* Update the XSAVE extended state register note section for "gcore".
+     Disable it if its size is 0.  */
+  for (i = 0; regset_sections[i].sect_name != NULL; i++)
+    if (strcmp (regset_sections[i].sect_name, ".reg-xstate") == 0)
+      {
+	if (xstate_size)
+	  regset_sections[i].size = xstate_size;
+	else
+	  regset_sections[i].sect_name = NULL;
+	break;
+      }
+}
+
+/* Get XSAVE extended state xcr0 from core dump.  */
+
+unsigned long long
+i386_linux_core_read_xcr0 (struct gdbarch *gdbarch,
+			   struct target_ops *target, bfd *abfd)
+{
+  asection *xstate = bfd_get_section_by_name (abfd, ".reg-xstate");
+  unsigned long long xcr0;
+
+  if (xstate)
+    {
+      size_t size = bfd_section_size (abfd, xstate);
+
+      gdb_assert (size >= I386_XSTATE_SSE_SIZE);
+
+      /* Check extended state size.  */
+      if (size < I386_XSTATE_AVX_SIZE)
+	xcr0 = I386_XSTATE_SSE_MASK;
+      else
+	{
+	  char contents[8];
+
+	  if (! bfd_get_section_contents (abfd, xstate, contents,
+					  (file_ptr) I386_LINUX_XSAVE_XCR0_OFFSET,
+					  8))
+	    {
+	      warning (_("Couldn't read `xcr0' bytes from `.reg-xstate' section in core file."));
+	      return 0;
+	    }
+
+	  xcr0 = bfd_get_64 (abfd, contents);
+	}
+    }
+  else
+    xcr0 = I386_XSTATE_SSE_MASK;
+
+  return xcr0;
+}
+
 /* Get Linux/x86 target description from core dump.  */
 
 static const struct target_desc *
@@ -568,12 +634,17 @@ i386_linux_core_read_description (struct gdbarch *gdbarch,
 				  bfd *abfd)
 {
   asection *section = bfd_get_section_by_name (abfd, ".reg2");
+  unsigned long long xcr0;
 
   if (section == NULL)
     return NULL;
 
   /* Linux/i386.  */
-  return tdesc_i386_linux;
+  xcr0 = i386_linux_core_read_xcr0 (gdbarch, target, abfd);
+  if ((xcr0 & I386_XSTATE_AVX_MASK) == I386_XSTATE_AVX_MASK)
+    return tdesc_i386_avx_linux;
+  else
+    return tdesc_i386_linux;
 }
 
 static void
@@ -623,6 +694,8 @@ i386_linux_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch)
   tdep->sc_reg_offset = i386_linux_sc_reg_offset;
   tdep->sc_num_regs = ARRAY_SIZE (i386_linux_sc_reg_offset);
 
+  tdep->xsave_xcr0_offset = I386_LINUX_XSAVE_XCR0_OFFSET;
+
   set_gdbarch_process_record (gdbarch, i386_process_record);
   set_gdbarch_process_record_signal (gdbarch, i386_linux_record_signal);
 
@@ -840,4 +913,5 @@ _initialize_i386_linux_tdep (void)
 
   /* Initialize the Linux target description  */
   initialize_tdesc_i386_linux ();
+  initialize_tdesc_i386_avx_linux ();
 }
diff --git a/gdb/i386-linux-tdep.h b/gdb/i386-linux-tdep.h
index 11f7295..8881fea 100644
--- a/gdb/i386-linux-tdep.h
+++ b/gdb/i386-linux-tdep.h
@@ -30,12 +30,45 @@
 /* Register number for the "orig_eax" pseudo-register.  If this
    pseudo-register contains a value >= 0 it is interpreted as the
    system call number that the kernel is supposed to restart.  */
-#define I386_LINUX_ORIG_EAX_REGNUM I386_SSE_NUM_REGS
+#define I386_LINUX_ORIG_EAX_REGNUM I386_AVX_NUM_REGS
 
 /* Total number of registers for GNU/Linux.  */
 #define I386_LINUX_NUM_REGS (I386_LINUX_ORIG_EAX_REGNUM + 1)
 
+/* Get XSAVE extended state xcr0 from core dump.  */
+extern unsigned long long i386_linux_core_read_xcr0
+  (struct gdbarch *gdbarch, struct target_ops *target, bfd *abfd);
+
 /* Linux target description.  */
 extern struct target_desc *tdesc_i386_linux;
+extern struct target_desc *tdesc_i386_avx_linux;
+
+/* Supported register note sections.  */
+extern struct core_regset_section i386_linux_regset_sections[];
+
+/* Update XSAVE extended state register note section.  */
+extern void i386_linux_update_xstateregset
+  (struct core_regset_section *regset_sections, unsigned int xstate_size);
+
+/* Format of XSAVE extended state is:
+ 	struct
+	{
+	  fxsave_bytes[0..463]
+	  sw_usable_bytes[464..511]
+	  xstate_hdr_bytes[512..575]
+	  avx_bytes[576..831]
+	  future_state etc
+	};
+
+  The same memory layout is used for the coredump NT_X86_XSTATE note
+  representing the XSAVE extended state registers.
+
+  The first 8 bytes of the sw_usable_bytes[464..471] are set to the
+  OS-enabled state mask, which is the same as the 64-bit mask returned by
+  xgetbv for XCR0.  We can use this mask, as well as the mask saved in the
+  xstate_hdr bytes, to interpret which states the processor/OS supports
+  and which states are in use/initialized for the particular
+  process/thread.  */
+#define I386_LINUX_XSAVE_XCR0_OFFSET 464
 
 #endif /* i386-linux-tdep.h */
diff --git a/gdb/i386-tdep.c b/gdb/i386-tdep.c
index 83275ac..8a5f06a 100644
--- a/gdb/i386-tdep.c
+++ b/gdb/i386-tdep.c
@@ -50,11 +50,13 @@
 
 #include "i386-tdep.h"
 #include "i387-tdep.h"
+#include "i386-xstate.h"
 
 #include "record.h"
 #include <stdint.h>
 
 #include "features/i386/i386.c"
+#include "features/i386/i386-avx.c"
 
 /* Register names.  */
 
@@ -73,6 +75,18 @@ static const char *i386_register_names[] =
   "mxcsr"
 };
 
+static const char *i386_ymm_names[] =
+{
+  "ymm0",  "ymm1",   "ymm2",  "ymm3",
+  "ymm4",  "ymm5",   "ymm6",  "ymm7",
+};
+
+static const char *i386_ymmh_names[] =
+{
+  "ymm0h",  "ymm1h",   "ymm2h",  "ymm3h",
+  "ymm4h",  "ymm5h",   "ymm6h",  "ymm7h",
+};
+
 /* Register names for MMX pseudo-registers.  */
 
 static const char *i386_mmx_names[] =
@@ -149,18 +163,47 @@ i386_dword_regnum_p (struct gdbarch *gdbarch, int regnum)
   return regnum >= 0 && regnum < tdep->num_dword_regs;
 }
 
+int
+i386_ymmh_regnum_p (struct gdbarch *gdbarch, int regnum)
+{
+  struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+  int ymm0h_regnum = tdep->ymm0h_regnum;
+
+  if (ymm0h_regnum < 0)
+    return 0;
+
+  regnum -= ymm0h_regnum;
+  return regnum >= 0 && regnum < tdep->num_ymm_regs;
+}
+
+/* AVX register?  */
+
+int
+i386_ymm_regnum_p (struct gdbarch *gdbarch, int regnum)
+{
+  struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+  int ymm0_regnum = tdep->ymm0_regnum;
+
+  if (ymm0_regnum < 0)
+    return 0;
+
+  regnum -= ymm0_regnum;
+  return regnum >= 0 && regnum < tdep->num_ymm_regs;
+}
+
 /* SSE register?  */
 
-static int
-i386_sse_regnum_p (struct gdbarch *gdbarch, int regnum)
+int
+i386_xmm_regnum_p (struct gdbarch *gdbarch, int regnum)
 {
   struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+  int num_xmm_regs = I387_NUM_XMM_REGS (tdep);
 
-  if (I387_NUM_XMM_REGS (tdep) == 0)
+  if (num_xmm_regs == 0)
     return 0;
 
-  return (I387_XMM0_REGNUM (tdep) <= regnum
-	  && regnum < I387_MXCSR_REGNUM (tdep));
+  regnum -= I387_XMM0_REGNUM (tdep);
+  return regnum >= 0 && regnum < num_xmm_regs;
 }
 
 static int
@@ -200,6 +243,19 @@ i386_fpc_regnum_p (struct gdbarch *gdbarch, int regnum)
 	  && regnum < I387_XMM0_REGNUM (tdep));
 }
 
+/* Return the name of register REGNUM, or the empty string if it is
+   an anonymous register.  */
+
+static const char *
+i386_register_name (struct gdbarch *gdbarch, int regnum)
+{
+  /* Hide the upper YMM registers.  */
+  if (i386_ymmh_regnum_p (gdbarch, regnum))
+    return "";
+
+  return tdesc_register_name (gdbarch, regnum);
+}
+
 /* Return the name of register REGNUM.  */
 
 const char *
@@ -208,6 +264,8 @@ i386_pseudo_register_name (struct gdbarch *gdbarch, int regnum)
   struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
   if (i386_mmx_regnum_p (gdbarch, regnum))
     return i386_mmx_names[regnum - I387_MM0_REGNUM (tdep)];
+  else if (i386_ymm_regnum_p (gdbarch, regnum))
+    return i386_ymm_names[regnum - tdep->ymm0_regnum];
   else if (i386_byte_regnum_p (gdbarch, regnum))
     return i386_byte_names[regnum - tdep->al_regnum];
   else if (i386_word_regnum_p (gdbarch, regnum))
@@ -245,7 +303,13 @@ i386_dbx_reg_to_regnum (struct gdbarch *gdbarch, int reg)
   else if (reg >= 21 && reg <= 28)
     {
       /* SSE registers.  */
-      return reg - 21 + I387_XMM0_REGNUM (tdep);
+      int ymm0_regnum = tdep->ymm0_regnum;
+
+      if (ymm0_regnum >= 0
+	  && i386_xmm_regnum_p (gdbarch, reg))
+	return reg - 21 + ymm0_regnum;
+      else
+	return reg - 21 + I387_XMM0_REGNUM (tdep);
     }
   else if (reg >= 29 && reg <= 36)
     {
@@ -2183,6 +2247,59 @@ i387_ext_type (struct gdbarch *gdbarch)
   return tdep->i387_ext_type;
 }
 
+/* Construct vector type for pseudo YMM registers.  We can't use
+   tdesc_find_type since YMM isn't described in target description.  */
+
+static struct type *
+i386_ymm_type (struct gdbarch *gdbarch)
+{
+  struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+
+  if (!tdep->i386_ymm_type)
+    {
+      const struct builtin_type *bt = builtin_type (gdbarch);
+
+      /* The type we're building is this: */
+#if 0
+      union __gdb_builtin_type_vec256i
+      {
+        int128_t v2_int128[2];
+        int64_t v4_int64[4];
+        int32_t v8_int32[8];
+        int16_t v16_int16[16];
+        int8_t v32_int8[32];
+        double v4_double[4];
+        float v8_float[8];
+      };
+#endif
+
+      struct type *t;
+
+      t = arch_composite_type (gdbarch,
+			       "__gdb_builtin_type_vec256i", TYPE_CODE_UNION);
+      append_composite_type_field (t, "v8_float",
+				   init_vector_type (bt->builtin_float, 8));
+      append_composite_type_field (t, "v4_double",
+				   init_vector_type (bt->builtin_double, 4));
+      append_composite_type_field (t, "v32_int8",
+				   init_vector_type (bt->builtin_int8, 32));
+      append_composite_type_field (t, "v16_int16",
+				   init_vector_type (bt->builtin_int16, 16));
+      append_composite_type_field (t, "v8_int32",
+				   init_vector_type (bt->builtin_int32, 8));
+      append_composite_type_field (t, "v4_int64",
+				   init_vector_type (bt->builtin_int64, 4));
+      append_composite_type_field (t, "v2_int128",
+				   init_vector_type (bt->builtin_int128, 2));
+
+      TYPE_VECTOR (t) = 1;
+      TYPE_NAME (t) = "builtin_type_vec256i";
+      tdep->i386_ymm_type = t;
+    }
+
+  return tdep->i386_ymm_type;
+}
+
 /* Construct vector type for MMX registers.  */
 static struct type *
 i386_mmx_type (struct gdbarch *gdbarch)
@@ -2233,6 +2350,8 @@ i386_pseudo_register_type (struct gdbarch *gdbarch, int regnum)
 {
   if (i386_mmx_regnum_p (gdbarch, regnum))
     return i386_mmx_type (gdbarch);
+  else if (i386_ymm_regnum_p (gdbarch, regnum))
+    return i386_ymm_type (gdbarch);
   else
     {
       const struct builtin_type *bt = builtin_type (gdbarch);
@@ -2284,7 +2403,22 @@ i386_pseudo_register_read (struct gdbarch *gdbarch, struct regcache *regcache,
     {
       struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
 
-      if (i386_word_regnum_p (gdbarch, regnum))
+      if (i386_ymm_regnum_p (gdbarch, regnum))
+	{
+	  regnum -= tdep->ymm0_regnum;
+
+	  /* Extract (always little endian).  Read the lower 16 bytes.  */
+	  regcache_raw_read (regcache,
+			     I387_XMM0_REGNUM (tdep) + regnum,
+			     raw_buf);
+	  memcpy (buf, raw_buf, 16);
+	  /* Read the upper 16 bytes.  */
+	  regcache_raw_read (regcache,
+			     tdep->ymm0h_regnum + regnum,
+			     raw_buf);
+	  memcpy (buf + 16, raw_buf, 16);
+	}
+      else if (i386_word_regnum_p (gdbarch, regnum))
 	{
 	  int gpnum = regnum - tdep->ax_regnum;
 
@@ -2333,7 +2467,20 @@ i386_pseudo_register_write (struct gdbarch *gdbarch, struct regcache *regcache,
     {
       struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
 
-      if (i386_word_regnum_p (gdbarch, regnum))
+      if (i386_ymm_regnum_p (gdbarch, regnum))
+	{
+	  regnum -= tdep->ymm0_regnum;
+
+	  /* ... Write the lower 16 bytes.  */
+	  regcache_raw_write (regcache,
+			     I387_XMM0_REGNUM (tdep) + regnum,
+			     buf);
+	  /* ... Write the upper 16 bytes.  */
+	  regcache_raw_write (regcache,
+			     tdep->ymm0h_regnum + regnum,
+			     buf + 16);
+	}
+      else if (i386_word_regnum_p (gdbarch, regnum))
 	{
 	  int gpnum = regnum - tdep->ax_regnum;
 
@@ -2580,6 +2727,28 @@ i386_collect_fpregset (const struct regset *regset,
   i387_collect_fsave (regcache, regnum, fpregs);
 }
 
+/* Similar to i386_supply_fpregset, but use XSAVE extended state.  */
+
+static void
+i386_supply_xstateregset (const struct regset *regset,
+			  struct regcache *regcache, int regnum,
+			  const void *xstateregs, size_t len)
+{
+  const struct gdbarch_tdep *tdep = gdbarch_tdep (regset->arch);
+  i387_supply_xsave (regcache, regnum, xstateregs);
+}
+
+/* Similar to i386_collect_fpregset, but use XSAVE extended state.  */
+
+static void
+i386_collect_xstateregset (const struct regset *regset,
+			   const struct regcache *regcache,
+			   int regnum, void *xstateregs, size_t len)
+{
+  const struct gdbarch_tdep *tdep = gdbarch_tdep (regset->arch);
+  i387_collect_xsave (regcache, regnum, xstateregs, 1);
+}
+
 /* Return the appropriate register set for the core section identified
    by SECT_NAME and SECT_SIZE.  */
 
@@ -2607,6 +2776,16 @@ i386_regset_from_core_section (struct gdbarch *gdbarch,
       return tdep->fpregset;
     }
 
+  if (strcmp (sect_name, ".reg-xstate") == 0)
+    {
+      if (tdep->xstateregset == NULL)
+	tdep->xstateregset = regset_alloc (gdbarch,
+					   i386_supply_xstateregset,
+					   i386_collect_xstateregset);
+
+      return tdep->xstateregset;
+    }
+
   return NULL;
 }
 \f
@@ -2800,46 +2979,60 @@ int
 i386_register_reggroup_p (struct gdbarch *gdbarch, int regnum,
 			  struct reggroup *group)
 {
-  int sse_regnum_p, fp_regnum_p, mmx_regnum_p, byte_regnum_p,
-      word_regnum_p, dword_regnum_p;
+  const struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+  int fp_regnum_p, mmx_regnum_p, xmm_regnum_p, mxcsr_regnum_p,
+      ymm_regnum_p, ymmh_regnum_p;
 
   /* Don't include pseudo registers, except for MMX, in any register
      groups.  */
-  byte_regnum_p = i386_byte_regnum_p (gdbarch, regnum);
-  if (byte_regnum_p)
+  if (i386_byte_regnum_p (gdbarch, regnum))
     return 0;
 
-  word_regnum_p = i386_word_regnum_p (gdbarch, regnum);
-  if (word_regnum_p)
+  if (i386_word_regnum_p (gdbarch, regnum))
     return 0;
 
-  dword_regnum_p = i386_dword_regnum_p (gdbarch, regnum);
-  if (dword_regnum_p)
+  if (i386_dword_regnum_p (gdbarch, regnum))
     return 0;
 
   mmx_regnum_p = i386_mmx_regnum_p (gdbarch, regnum);
   if (group == i386_mmx_reggroup)
     return mmx_regnum_p;
 
-  sse_regnum_p = (i386_sse_regnum_p (gdbarch, regnum)
-		  || i386_mxcsr_regnum_p (gdbarch, regnum));
+  xmm_regnum_p = i386_xmm_regnum_p (gdbarch, regnum);
+  mxcsr_regnum_p = i386_mxcsr_regnum_p (gdbarch, regnum);
   if (group == i386_sse_reggroup)
-    return sse_regnum_p;
+    return xmm_regnum_p || mxcsr_regnum_p;
+
+  ymm_regnum_p = i386_ymm_regnum_p (gdbarch, regnum);
   if (group == vector_reggroup)
-    return mmx_regnum_p || sse_regnum_p;
+    return (mmx_regnum_p
+	    || ymm_regnum_p
+	    || mxcsr_regnum_p
+	    || (xmm_regnum_p
+		&& ((tdep->xcr0 & I386_XSTATE_AVX_MASK)
+		    == I386_XSTATE_SSE_MASK)));
 
   fp_regnum_p = (i386_fp_regnum_p (gdbarch, regnum)
 		 || i386_fpc_regnum_p (gdbarch, regnum));
   if (group == float_reggroup)
     return fp_regnum_p;
 
+  /* For "info reg all", include neither the upper YMM registers nor
+     the XMM registers when AVX is supported.  */
+  ymmh_regnum_p = i386_ymmh_regnum_p (gdbarch, regnum);
+  if (group == all_reggroup
+      && ((xmm_regnum_p
+	   && (tdep->xcr0 & I386_XSTATE_AVX))
+	  || ymmh_regnum_p))
+    return 0;
+
   if (group == general_reggroup)
     return (!fp_regnum_p
 	    && !mmx_regnum_p
-	    && !sse_regnum_p
-	    && !byte_regnum_p
-	    && !word_regnum_p
-	    && !dword_regnum_p);
+	    && !mxcsr_regnum_p
+	    && !xmm_regnum_p
+	    && !ymm_regnum_p
+	    && !ymmh_regnum_p);
 
   return default_register_reggroup_p (gdbarch, regnum, group);
 }
@@ -5652,7 +5845,8 @@ i386_validate_tdesc_p (struct gdbarch_tdep *tdep,
 		       struct tdesc_arch_data *tdesc_data)
 {
   const struct target_desc *tdesc = tdep->tdesc;
-  const struct tdesc_feature *feature_core, *feature_vector;
+  const struct tdesc_feature *feature_core;
+  const struct tdesc_feature *feature_sse, *feature_avx;
   int i, num_regs, valid_p;
 
   if (! tdesc_has_registers (tdesc))
@@ -5662,13 +5856,37 @@ i386_validate_tdesc_p (struct gdbarch_tdep *tdep,
   feature_core = tdesc_find_feature (tdesc, "org.gnu.gdb.i386.core");
 
   /* Get SSE registers.  */
-  feature_vector = tdesc_find_feature (tdesc, "org.gnu.gdb.i386.sse");
+  feature_sse = tdesc_find_feature (tdesc, "org.gnu.gdb.i386.sse");
 
-  if (feature_core == NULL || feature_vector == NULL)
+  if (feature_core == NULL || feature_sse == NULL)
     return 0;
 
+  /* Try AVX registers.  */
+  feature_avx = tdesc_find_feature (tdesc, "org.gnu.gdb.i386.avx");
+
   valid_p = 1;
 
+  /* The XCR0 bits.  */
+  if (feature_avx)
+    {
+      tdep->xcr0 = I386_XSTATE_AVX_MASK;
+
+      /* It may have been set by ABI-specific code.  */
+      if (tdep->num_ymm_regs == 0)
+	{
+	  tdep->ymmh_register_names = i386_ymmh_names;
+	  tdep->num_ymm_regs = 8;
+	  tdep->ymm0h_regnum = I386_YMM0H_REGNUM;
+	}
+
+      for (i = 0; i < tdep->num_ymm_regs; i++)
+	valid_p &= tdesc_numbered_register (feature_avx, tdesc_data,
+					    tdep->ymm0h_regnum + i,
+					    tdep->ymmh_register_names[i]);
+    }
+  else
+    tdep->xcr0 = I386_XSTATE_SSE_MASK;
+
   num_regs = tdep->num_core_regs;
   for (i = 0; i < num_regs; i++)
     valid_p &= tdesc_numbered_register (feature_core, tdesc_data, i,
@@ -5677,7 +5895,7 @@ i386_validate_tdesc_p (struct gdbarch_tdep *tdep,
   /* Need to include %mxcsr, so add one.  */
   num_regs += tdep->num_xmm_regs + 1;
   for (; i < num_regs; i++)
-    valid_p &= tdesc_numbered_register (feature_vector, tdesc_data, i,
+    valid_p &= tdesc_numbered_register (feature_sse, tdesc_data, i,
 					tdep->register_names[i]);
 
   return valid_p;
@@ -5692,6 +5910,7 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   struct tdesc_arch_data *tdesc_data;
   const struct target_desc *tdesc;
   int mm0_regnum;
+  int ymm0_regnum;
 
   /* If there is already a candidate, use it.  */
   arches = gdbarch_list_lookup_by_info (arches, &info);
@@ -5712,6 +5931,8 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   tdep->fpregset = NULL;
   tdep->sizeof_fpregset = I387_SIZEOF_FSAVE;
 
+  tdep->xstateregset = NULL;
+
   /* The default settings include the FPU registers, the MMX registers
      and the SSE registers.  This can be overridden for a specific ABI
      by adjusting the members `st0_regnum', `mm0_regnum' and
@@ -5741,6 +5962,8 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   tdep->sc_pc_offset = -1;
   tdep->sc_sp_offset = -1;
 
+  tdep->xsave_xcr0_offset = -1;
+
   tdep->record_regmap = i386_record_regmap;
 
   /* The format used for `long double' on almost all i386 targets is
@@ -5857,9 +6080,13 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   set_tdesc_pseudo_register_type (gdbarch, i386_pseudo_register_type);
   set_tdesc_pseudo_register_name (gdbarch, i386_pseudo_register_name);
 
-  /* The default ABI includes general-purpose registers, 
-     floating-point registers, and the SSE registers.  */
-  set_gdbarch_num_regs (gdbarch, I386_SSE_NUM_REGS);
+  /* Override the normal target description method to make the AVX
+     upper halves anonymous.  */
+  set_gdbarch_register_name (gdbarch, i386_register_name);
+
+  /* The default ABI includes general-purpose registers, floating-point
+     registers, the SSE registers and the upper AVX registers.  */
+  set_gdbarch_num_regs (gdbarch, I386_AVX_NUM_REGS);
 
   /* Get the x86 target description from INFO.  */
   tdesc = info.target_desc;
@@ -5870,10 +6097,15 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   tdep->num_core_regs = I386_NUM_GREGS + I387_NUM_REGS;
   tdep->register_names = i386_register_names;
 
+  /* No upper YMM registers.  */
+  tdep->ymmh_register_names = NULL;
+  tdep->ymm0h_regnum = -1;
+
   tdep->num_byte_regs = 8;
   tdep->num_word_regs = 8;
   tdep->num_dword_regs = 0;
   tdep->num_mmx_regs = 8;
+  tdep->num_ymm_regs = 0;
 
   tdesc_data = tdesc_data_alloc ();
 
@@ -5881,24 +6113,25 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   info.tdep_info = (void *) tdesc_data;
   gdbarch_init_osabi (info, gdbarch);
 
+  if (!i386_validate_tdesc_p (tdep, tdesc_data))
+    {
+      tdesc_data_cleanup (tdesc_data);
+      xfree (tdep);
+      gdbarch_free (gdbarch);
+      return NULL;
+    }
+
   /* Wire in pseudo registers.  Number of pseudo registers may be
      changed.  */
   set_gdbarch_num_pseudo_regs (gdbarch, (tdep->num_byte_regs
 					 + tdep->num_word_regs
 					 + tdep->num_dword_regs
-					 + tdep->num_mmx_regs));
+					 + tdep->num_mmx_regs
+					 + tdep->num_ymm_regs));
 
   /* Target description may be changed.  */
   tdesc = tdep->tdesc;
 
-  if (!i386_validate_tdesc_p (tdep, tdesc_data))
-    {
-      tdesc_data_cleanup (tdesc_data);
-      xfree (tdep);
-      gdbarch_free (gdbarch);
-      return NULL;
-    }
-
   tdesc_use_registers (gdbarch, tdesc, tdesc_data);
 
   /* Override gdbarch_register_reggroup_p set in tdesc_use_registers.  */
@@ -5908,16 +6141,26 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   tdep->al_regnum = gdbarch_num_regs (gdbarch);
   tdep->ax_regnum = tdep->al_regnum + tdep->num_byte_regs;
 
-  mm0_regnum = tdep->ax_regnum + tdep->num_word_regs;
+  ymm0_regnum = tdep->ax_regnum + tdep->num_word_regs;
   if (tdep->num_dword_regs)
     {
       /* Support dword pseudo-registesr if it hasn't been disabled,  */
-      tdep->eax_regnum = mm0_regnum;
-      mm0_regnum = tdep->eax_regnum + tdep->num_dword_regs;
+      tdep->eax_regnum = ymm0_regnum;
+      ymm0_regnum += tdep->num_dword_regs;
     }
   else
     tdep->eax_regnum = -1;
 
+  mm0_regnum = ymm0_regnum;
+  if (tdep->num_ymm_regs)
+    {
+      /* Support YMM pseudo-registers if they are available.  */
+      tdep->ymm0_regnum = ymm0_regnum;
+      mm0_regnum += tdep->num_ymm_regs;
+    }
+  else
+    tdep->ymm0_regnum = -1;
+
   if (tdep->num_mmx_regs != 0)
     {
       /* Support MMX pseudo-registesr if MMX hasn't been disabled,  */
@@ -5943,6 +6186,9 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   set_gdbarch_fast_tracepoint_valid_at (gdbarch,
 					i386_fast_tracepoint_valid_at);
 
+  /* Tell remote stub that we support XML target description.  */
+  set_gdbarch_qsupported (gdbarch, "x86=xml");
+
   return gdbarch;
 }
 
@@ -6000,4 +6246,5 @@ is \"default\"."),
 
   /* Initialize the standard target descriptions.  */
   initialize_tdesc_i386 ();
+  initialize_tdesc_i386_avx ();
 }
diff --git a/gdb/i386-tdep.h b/gdb/i386-tdep.h
index 72c634e..1ce9d8c 100644
--- a/gdb/i386-tdep.h
+++ b/gdb/i386-tdep.h
@@ -109,6 +109,9 @@ struct gdbarch_tdep
   struct regset *fpregset;
   size_t sizeof_fpregset;
 
+  /* XSAVE extended state.  */
+  struct regset *xstateregset;
+
   /* Register number for %st(0).  The register numbers for the other
      registers follow from this one.  Set this to -1 to indicate the
      absence of an FPU.  */
@@ -121,6 +124,13 @@ struct gdbarch_tdep
      of MMX support.  */
   int mm0_regnum;
 
+  /* Number of pseudo YMM registers.  */
+  int num_ymm_regs;
+
+  /* Register number for %ymm0.  Set this to -1 to indicate the absence
+     of pseudo YMM register support.  */
+  int ymm0_regnum;
+
   /* Number of byte registers.  */
   int num_byte_regs;
 
@@ -146,9 +156,24 @@ struct gdbarch_tdep
   /* Number of SSE registers.  */
   int num_xmm_regs;
 
+  /* Bits of the extended control register 0 (the XFEATURE_ENABLED_MASK
+     register), excluding the x87 bit, which are supported by this gdb.
+   */
+  unsigned long long xcr0;
+
+  /* Offset of XCR0 in XSAVE extended state.  */
+  int xsave_xcr0_offset;
+
   /* Register names.  */
   const char **register_names;
 
+  /* Register number for %ymm0h.  Set this to -1 to indicate the absence
+     of upper YMM register support.  */
+  int ymm0h_regnum;
+
+  /* Upper YMM register names.  Only used for tdesc_numbered_register.  */
+  const char **ymmh_register_names;
+
   /* Target description.  */
   const struct target_desc *tdesc;
 
@@ -182,6 +207,7 @@ struct gdbarch_tdep
 
   /* ISA-specific data types.  */
   struct type *i386_mmx_type;
+  struct type *i386_ymm_type;
   struct type *i387_ext_type;
 
   /* Process record/replay target.  */
@@ -228,7 +254,9 @@ enum i386_regnum
   I386_FS_REGNUM,		/* %fs */
   I386_GS_REGNUM,		/* %gs */
   I386_ST0_REGNUM,		/* %st(0) */
-  I386_MXCSR_REGNUM = 40	/* %mxcsr */ 
+  I386_MXCSR_REGNUM = 40,	/* %mxcsr */ 
+  I386_YMM0H_REGNUM,		/* %ymm0h */
+  I386_YMM7H_REGNUM = I386_YMM0H_REGNUM + 7
 };
 
 /* Register numbers of RECORD_REGMAP.  */
@@ -265,6 +293,7 @@ enum record_i386_regnum
 #define I386_NUM_XREGS  9
 
 #define I386_SSE_NUM_REGS	(I386_MXCSR_REGNUM + 1)
+#define I386_AVX_NUM_REGS	(I386_YMM7H_REGNUM + 1)
 
 /* Size of the largest register.  */
 #define I386_MAX_REGISTER_SIZE	16
@@ -276,6 +305,9 @@ extern struct type *i387_ext_type (struct gdbarch *gdbarch);
 extern int i386_byte_regnum_p (struct gdbarch *gdbarch, int regnum);
 extern int i386_word_regnum_p (struct gdbarch *gdbarch, int regnum);
 extern int i386_dword_regnum_p (struct gdbarch *gdbarch, int regnum);
+extern int i386_xmm_regnum_p (struct gdbarch *gdbarch, int regnum);
+extern int i386_ymm_regnum_p (struct gdbarch *gdbarch, int regnum);
+extern int i386_ymmh_regnum_p (struct gdbarch *gdbarch, int regnum);
 
 extern const char *i386_pseudo_register_name (struct gdbarch *gdbarch,
 					      int regnum);

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PATCH: 4/6 [2nd try]: Add AVX support (amd64 changes)
  2010-03-12 17:01         ` H.J. Lu
@ 2010-03-13  1:38           ` H.J. Lu
  2010-03-29  1:07           ` PATCH: 4/6 [3rd " H.J. Lu
  1 sibling, 0 replies; 115+ messages in thread
From: H.J. Lu @ 2010-03-13  1:38 UTC (permalink / raw)
  To: GDB

On Fri, Mar 12, 2010 at 09:00:51AM -0800, H.J. Lu wrote:
> On Sun, Mar 07, 2010 at 01:33:04PM -0800, H.J. Lu wrote:
> > On Sat, Mar 06, 2010 at 02:21:22PM -0800, H.J. Lu wrote:
> > > Hi,
> > > 
> > > Here are the amd64 changes to support AVX.  OK to install?
> > > 
> > 
> 
> Hi,
> 
> Here is the updated patch. Any comments/suggestions?
> 
> Thanks.
> 

Here is the updated patch which removes xstate_size_n_of_int64.
Any comments/suggestions?

Thanks.


H.J.
---
2010-03-12  H.J. Lu  <hongjiu.lu@intel.com>

	* amd64-linux-nat.c: Include "regset.h", "elf/common.h",
	<sys/uio.h> and "i386-xstate.h".
	(PTRACE_GETREGSET): New.
	(PTRACE_SETREGSET): Likewise.
	(xstate_size): Likewise.
	(have_ptrace_getregset): Likewise.
	(amd64_linux_gregset64_reg_offset): Include 16 upper YMM
	registers.
	(amd64_linux_gregset32_reg_offset): Include 8 upper YMM
	registers.
	(amd64_linux_fetch_inferior_registers): Support PTRACE_GETREGSET.
	(amd64_linux_store_inferior_registers): Likewise.
	(amd64_linux_read_description): Check and enable AVX target
	descriptions.

	* amd64-linux-tdep.c: Include "regset.h", "i386-linux-tdep.h"
	and "features/i386/amd64-avx-linux.c".
	(amd64_linux_regset_sections): New.
	(amd64_linux_core_read_description): Check and enable AVX
	target description.
	(amd64_linux_init_abi): Set xsave_xcr0_offset.  Call
	set_gdbarch_core_regset_sections.
	(_initialize_amd64_linux_tdep): Call
	initialize_tdesc_amd64_avx_linux.

	* amd64-linux-tdep.h (AMD64_LINUX_ORIG_RAX_REGNUM): Replace
	AMD64_MXCSR_REGNUM with AMD64_YMM15H_REGNUM.
	(tdesc_amd64_avx_linux): New.
	(amd64_linux_regset_sections): Likewise.

	* amd64-tdep.c: Include "features/i386/amd64-avx.c".
	(amd64_ymm_names): New.
	(amd64_ymmh_names): Likewise.
	(amd64_register_name): Likewise.
	(amd64_supply_xstateregset): Likewise.
	(amd64_collect_xstateregset): Likewise.
	(amd64_supply_xsave): Likewise.
	(amd64_collect_xsave): Likewise.
	(AMD64_NUM_REGS): Removed.
	(amd64_dwarf_reg_to_regnum): Return %ymmN register number for
	%xmmN if AVX is available.
	(amd64_pseudo_register_name): Support pseudo YMM registers.
	(amd64_regset_from_core_section): Support .reg-xstate section.
	(amd64_init_abi): Set ymmh_register_names, num_ymm_regs
	and ymm0h_regnum.  Call set_gdbarch_register_name.
	(_initialize_amd64_tdep): Call initialize_tdesc_amd64_avx.

	* amd64-tdep.h (amd64_regnum): Add AMD64_YMM0H_REGNUM and
	AMD64_YMM15H_REGNUM.
	(AMD64_NUM_REGS): New.
	(amd64_supply_xsave): Likewise.
	(amd64_collect_xsave): Likewise.
	(amd64_register_name): Removed.
	(amd64_register_type): Likewise.

diff --git a/gdb/amd64-linux-nat.c b/gdb/amd64-linux-nat.c
index b9d5833..481014d 100644
--- a/gdb/amd64-linux-nat.c
+++ b/gdb/amd64-linux-nat.c
@@ -23,11 +23,14 @@
 #include "inferior.h"
 #include "gdbcore.h"
 #include "regcache.h"
+#include "regset.h"
 #include "linux-nat.h"
 #include "amd64-linux-tdep.h"
 
 #include "gdb_assert.h"
 #include "gdb_string.h"
+#include "elf/common.h"
+#include <sys/uio.h>
 #include <sys/ptrace.h>
 #include <sys/debugreg.h>
 #include <sys/syscall.h>
@@ -51,6 +54,21 @@
 #include "i386-linux-tdep.h"
 #include "amd64-nat.h"
 #include "i386-nat.h"
+#include "i386-xstate.h"
+
+#ifndef PTRACE_GETREGSET
+#define PTRACE_GETREGSET	0x4204
+#endif
+
+#ifndef PTRACE_SETREGSET
+#define PTRACE_SETREGSET	0x4205
+#endif
+
+/* The extended state size in bytes.  */
+static unsigned int xstate_size;
+
+/* Does the current host support PTRACE_GETREGSET?  */
+static int have_ptrace_getregset = -1;
 
 /* Mapping between the general-purpose registers in GNU/Linux x86-64
    `struct user' format and GDB's register cache layout.  */
@@ -73,6 +91,8 @@ static int amd64_linux_gregset64_reg_offset[] =
   -1, -1, -1, -1, -1, -1, -1, -1,
   -1, -1, -1, -1, -1, -1, -1, -1,
   -1, -1, -1, -1, -1, -1, -1, -1, -1,
+  -1, -1, -1, -1, -1, -1, -1, -1,
+  -1, -1, -1, -1, -1, -1, -1, -1,
   ORIG_RAX * 8
 };
 \f
@@ -99,6 +119,7 @@ static int amd64_linux_gregset32_reg_offset[] =
   -1, -1, -1, -1, -1, -1, -1, -1,
   -1, -1, -1, -1, -1, -1, -1, -1,
   -1, -1, -1, -1, -1, -1, -1, -1, -1,
+  -1, -1, -1, -1, -1, -1, -1, -1,
   ORIG_RAX * 8			/* "orig_eax" */
 };
 \f
@@ -183,10 +204,26 @@ amd64_linux_fetch_inferior_registers (struct target_ops *ops,
     {
       elf_fpregset_t fpregs;
 
-      if (ptrace (PTRACE_GETFPREGS, tid, 0, (long) &fpregs) < 0)
-	perror_with_name (_("Couldn't get floating point status"));
+      if (have_ptrace_getregset)
+	{
+	  char xstateregs[xstate_size];
+	  struct iovec iov;
+
+	  iov.iov_base = xstateregs;
+	  iov.iov_len = xstate_size;
+	  if (ptrace (PTRACE_GETREGSET, tid,
+		      (unsigned int) NT_X86_XSTATE, (long) &iov) < 0)
+	    perror_with_name (_("Couldn't get extended state status"));
 
-      amd64_supply_fxsave (regcache, -1, &fpregs);
+	  amd64_supply_xsave (regcache, -1, xstateregs);
+	}
+      else
+	{
+	  if (ptrace (PTRACE_GETFPREGS, tid, 0, (long) &fpregs) < 0)
+	    perror_with_name (_("Couldn't get floating point status"));
+
+	  amd64_supply_fxsave (regcache, -1, &fpregs);
+	}
     }
 }
 
@@ -226,15 +263,33 @@ amd64_linux_store_inferior_registers (struct target_ops *ops,
     {
       elf_fpregset_t fpregs;
 
-      if (ptrace (PTRACE_GETFPREGS, tid, 0, (long) &fpregs) < 0)
-	perror_with_name (_("Couldn't get floating point status"));
+      if (have_ptrace_getregset)
+	{
+	  char xstateregs[xstate_size];
+	  struct iovec iov;
+
+	  iov.iov_base = xstateregs;
+	  iov.iov_len = xstate_size;
+	  if (ptrace (PTRACE_GETREGSET, tid,
+		      (unsigned int) NT_X86_XSTATE, (long) &iov) < 0)
+	    perror_with_name (_("Couldn't get extended state status"));
 
-      amd64_collect_fxsave (regcache, regnum, &fpregs);
+	  amd64_collect_xsave (regcache, regnum, xstateregs, 0);
 
-      if (ptrace (PTRACE_SETFPREGS, tid, 0, (long) &fpregs) < 0)
-	perror_with_name (_("Couldn't write floating point status"));
+	  if (ptrace (PTRACE_SETREGSET, tid,
+		      (unsigned int) NT_X86_XSTATE, (long) &iov) < 0)
+	    perror_with_name (_("Couldn't write extended state status"));
+	}
+      else
+	{
+	  if (ptrace (PTRACE_GETFPREGS, tid, 0, (long) &fpregs) < 0)
+	    perror_with_name (_("Couldn't get floating point status"));
 
-      return;
+	  amd64_collect_fxsave (regcache, regnum, &fpregs);
+
+	  if (ptrace (PTRACE_SETFPREGS, tid, 0, (long) &fpregs) < 0)
+	    perror_with_name (_("Couldn't write floating point status"));
+	}
     }
 }
 \f
@@ -688,6 +743,8 @@ amd64_linux_read_description (struct target_ops *ops)
 {
   unsigned long cs;
   int tid;
+  int is_64bit;
+  static unsigned long long xcr0;
 
   /* GNU/Linux LWP ID's are process ID's.  */
   tid = TIDGET (inferior_ptid);
@@ -701,10 +758,52 @@ amd64_linux_read_description (struct target_ops *ops)
   if (errno != 0)
     perror_with_name (_("Couldn't get CS register"));
 
-  if (cs == AMD64_LINUX_USER64_CS)
-    return tdesc_amd64_linux;
+  is_64bit = cs == AMD64_LINUX_USER64_CS;
+
+  if (have_ptrace_getregset == -1)
+    {
+      unsigned long long xstateregs[(I386_XSTATE_SSE_SIZE
+				     / sizeof (long long))];
+      struct iovec iov;
+
+      iov.iov_base = xstateregs;
+      iov.iov_len = I386_XSTATE_SSE_SIZE;
+
+      /* Check if PTRACE_GETREGSET works.  */
+      if (ptrace (PTRACE_GETREGSET, tid,
+		  (unsigned int) NT_X86_XSTATE, (long) &iov) < 0)
+	have_ptrace_getregset = 0;
+      else
+	{
+	  have_ptrace_getregset = 1;
+
+	  /* Get XCR0 from XSAVE extended state.  */
+	  xcr0 = xstateregs[(I386_LINUX_XSAVE_XCR0_OFFSET
+			     / sizeof (long long))];
+
+	  xstate_size = I386_XSTATE_SIZE (xcr0);
+	}
+
+      i386_linux_update_xstateregset (amd64_linux_regset_sections,
+				      xstate_size);
+    }
+
+  /* Check the native XCR0 only if PTRACE_GETREGSET is available.  */
+  if (have_ptrace_getregset
+      && (xcr0 & I386_XSTATE_AVX_MASK) == I386_XSTATE_AVX_MASK)
+    {
+      if (is_64bit)
+	return tdesc_amd64_avx_linux;
+      else
+	return tdesc_i386_avx_linux;
+    }
   else
-    return tdesc_i386_linux;
+    {
+      if (is_64bit)
+	return tdesc_amd64_linux;
+      else
+	return tdesc_i386_linux;
+    }
 }
 
 /* Provide a prototype to silence -Wmissing-prototypes.  */
diff --git a/gdb/amd64-linux-tdep.c b/gdb/amd64-linux-tdep.c
index 4ad6dc9..3473926 100644
--- a/gdb/amd64-linux-tdep.c
+++ b/gdb/amd64-linux-tdep.c
@@ -28,8 +28,11 @@
 #include "symtab.h"
 #include "gdbtypes.h"
 #include "reggroups.h"
+#include "regset.h"
 #include "amd64-linux-tdep.h"
+#include "i386-linux-tdep.h"
 #include "linux-tdep.h"
+#include "i386-xstate.h"
 
 #include "gdb_string.h"
 
@@ -38,6 +41,7 @@
 #include "xml-syscall.h"
 
 #include "features/i386/amd64-linux.c"
+#include "features/i386/amd64-avx-linux.c"
 
 /* The syscall's XML filename for i386.  */
 #define XML_SYSCALL_FILENAME_AMD64 "syscalls/amd64-linux.xml"
@@ -45,6 +49,15 @@
 #include "record.h"
 #include "linux-record.h"
 
+/* Supported register note sections.  */
+struct core_regset_section amd64_linux_regset_sections[] =
+{
+  { ".reg", 144, "general-purpose" },
+  { ".reg2", 512, "floating-point" },
+  { ".reg-xstate", 0, "XSAVE extended state" },
+  { NULL, 0 }
+};
+
 /* Mapping between the general-purpose registers in `struct user'
    format and GDB's register cache layout.  */
 
@@ -1250,12 +1263,17 @@ amd64_linux_core_read_description (struct gdbarch *gdbarch,
 				  bfd *abfd)
 {
   asection *section = bfd_get_section_by_name (abfd, ".reg2");
+  unsigned long long xcr0;
 
   if (section == NULL)
     return NULL;
 
   /* Linux/x86-64.  */
-  return tdesc_amd64_linux;
+  xcr0 = i386_linux_core_read_xcr0 (gdbarch, target, abfd);
+  if ((xcr0 & I386_XSTATE_AVX_MASK) == I386_XSTATE_AVX_MASK)
+    return tdesc_amd64_avx_linux;
+  else
+    return tdesc_amd64_linux;
 }
 
 static void
@@ -1297,6 +1315,8 @@ amd64_linux_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch)
   tdep->sc_reg_offset = amd64_linux_sc_reg_offset;
   tdep->sc_num_regs = ARRAY_SIZE (amd64_linux_sc_reg_offset);
 
+  tdep->xsave_xcr0_offset = I386_LINUX_XSAVE_XCR0_OFFSET;
+
   /* GNU/Linux uses SVR4-style shared libraries.  */
   set_solib_svr4_fetch_link_map_offsets
     (gdbarch, svr4_lp64_fetch_link_map_offsets);
@@ -1318,6 +1338,9 @@ amd64_linux_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch)
   /* GNU/Linux uses SVR4-style shared libraries.  */
   set_gdbarch_skip_trampoline_code (gdbarch, find_solib_trampoline_target);
 
+  /* Install supported register note sections.  */
+  set_gdbarch_core_regset_sections (gdbarch, amd64_linux_regset_sections);
+
   set_gdbarch_core_read_description (gdbarch,
 				     amd64_linux_core_read_description);
 
@@ -1517,4 +1540,5 @@ _initialize_amd64_linux_tdep (void)
 
   /* Initialize the Linux target description  */
   initialize_tdesc_amd64_linux ();
+  initialize_tdesc_amd64_avx_linux ();
 }
diff --git a/gdb/amd64-linux-tdep.h b/gdb/amd64-linux-tdep.h
index 33316fb..734f117 100644
--- a/gdb/amd64-linux-tdep.h
+++ b/gdb/amd64-linux-tdep.h
@@ -26,13 +26,17 @@
 /* Register number for the "orig_rax" register.  If this register
    contains a value >= 0 it is interpreted as the system call number
    that the kernel is supposed to restart.  */
-#define AMD64_LINUX_ORIG_RAX_REGNUM (AMD64_MXCSR_REGNUM + 1)
+#define AMD64_LINUX_ORIG_RAX_REGNUM (AMD64_YMM15H_REGNUM + 1)
 
 /* Total number of registers for GNU/Linux.  */
 #define AMD64_LINUX_NUM_REGS (AMD64_LINUX_ORIG_RAX_REGNUM + 1)
 
 /* Linux target description.  */
 extern struct target_desc *tdesc_amd64_linux;
+extern struct target_desc *tdesc_amd64_avx_linux;
+
+/* Supported register note sections.  */
+extern struct core_regset_section amd64_linux_regset_sections[];
 
 /* Enum that defines the syscall identifiers for amd64 linux.
    Used for process record/replay, these will be translated into
diff --git a/gdb/amd64-tdep.c b/gdb/amd64-tdep.c
index e5cfa71..aa4acfb 100644
--- a/gdb/amd64-tdep.c
+++ b/gdb/amd64-tdep.c
@@ -43,6 +43,7 @@
 #include "i387-tdep.h"
 
 #include "features/i386/amd64.c"
+#include "features/i386/amd64-avx.c"
 
 /* Note that the AMD64 architecture was previously known as x86-64.
    The latter is (forever) engraved into the canonical system name as
@@ -71,8 +72,21 @@ static const char *amd64_register_names[] =
   "mxcsr",
 };
 
-/* Total number of registers.  */
-#define AMD64_NUM_REGS	ARRAY_SIZE (amd64_register_names)
+static const char *amd64_ymm_names[] = 
+{
+  "ymm0", "ymm1", "ymm2", "ymm3",
+  "ymm4", "ymm5", "ymm6", "ymm7",
+  "ymm8", "ymm9", "ymm10", "ymm11",
+  "ymm12", "ymm13", "ymm14", "ymm15"
+};
+
+static const char *amd64_ymmh_names[] = 
+{
+  "ymm0h", "ymm1h", "ymm2h", "ymm3h",
+  "ymm4h", "ymm5h", "ymm6h", "ymm7h",
+  "ymm8h", "ymm9h", "ymm10h", "ymm11h",
+  "ymm12h", "ymm13h", "ymm14h", "ymm15h"
+};
 
 /* The registers used to pass integer arguments during a function call.  */
 static int amd64_dummy_call_integer_regs[] =
@@ -163,6 +177,8 @@ static const int amd64_dwarf_regmap_len =
 static int
 amd64_dwarf_reg_to_regnum (struct gdbarch *gdbarch, int reg)
 {
+  struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+  int ymm0_regnum = tdep->ymm0_regnum;
   int regnum = -1;
 
   if (reg >= 0 && reg < amd64_dwarf_regmap_len)
@@ -170,6 +186,9 @@ amd64_dwarf_reg_to_regnum (struct gdbarch *gdbarch, int reg)
 
   if (regnum == -1)
     warning (_("Unmapped DWARF Register #%d encountered."), reg);
+  else if (ymm0_regnum >= 0
+	   && i386_xmm_regnum_p (gdbarch, regnum))
+    regnum += ymm0_regnum - I387_XMM0_REGNUM (tdep);
 
   return regnum;
 }
@@ -234,6 +253,19 @@ static const char *amd64_dword_names[] =
   "r8d", "r9d", "r10d", "r11d", "r12d", "r13d", "r14d", "r15d"
 };
 
+/* Return the name of register REGNUM, or the empty string if it is
+   an anonymous register. */
+
+static const char *
+amd64_register_name (struct gdbarch *gdbarch, int regnum)
+{
+  /* Hide the upper YMM registers.  */
+  if (i386_ymmh_regnum_p (gdbarch, regnum))
+    return "";
+
+  return tdesc_register_name (gdbarch, regnum);
+}
+
 /* Return the name of register REGNUM.  */
 
 static const char *
@@ -242,6 +274,8 @@ amd64_pseudo_register_name (struct gdbarch *gdbarch, int regnum)
   struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
   if (i386_byte_regnum_p (gdbarch, regnum))
     return amd64_byte_names[regnum - tdep->al_regnum];
+  else if (i386_ymm_regnum_p (gdbarch, regnum))
+    return amd64_ymm_names[regnum - tdep->ymm0_regnum];
   else if (i386_word_regnum_p (gdbarch, regnum))
     return amd64_word_names[regnum - tdep->ax_regnum];
   else if (i386_dword_regnum_p (gdbarch, regnum))
@@ -2148,6 +2182,28 @@ amd64_collect_fpregset (const struct regset *regset,
   amd64_collect_fxsave (regcache, regnum, fpregs);
 }
 
+/* Similar to amd64_supply_fpregset, but use XSAVE extended state.  */
+
+static void
+amd64_supply_xstateregset (const struct regset *regset,
+			   struct regcache *regcache, int regnum,
+			   const void *xstateregs, size_t len)
+{
+  const struct gdbarch_tdep *tdep = gdbarch_tdep (regset->arch);
+  amd64_supply_xsave (regcache, regnum, xstateregs);
+}
+
+/* Similar to amd64_collect_fpregset, but use XSAVE extended state.  */
+
+static void
+amd64_collect_xstateregset (const struct regset *regset,
+			    const struct regcache *regcache,
+			    int regnum, void *xstateregs, size_t len)
+{
+  const struct gdbarch_tdep *tdep = gdbarch_tdep (regset->arch);
+  amd64_collect_xsave (regcache, regnum, xstateregs, 1);
+}
+
 /* Return the appropriate register set for the core section identified
    by SECT_NAME and SECT_SIZE.  */
 
@@ -2166,6 +2222,16 @@ amd64_regset_from_core_section (struct gdbarch *gdbarch,
       return tdep->fpregset;
     }
 
+  if (strcmp (sect_name, ".reg-xstate") == 0)
+    {
+      if (tdep->xstateregset == NULL)
+	tdep->xstateregset = regset_alloc (gdbarch,
+					   amd64_supply_xstateregset,
+					   amd64_collect_xstateregset);
+
+      return tdep->xstateregset;
+    }
+
   return i386_regset_from_core_section (gdbarch, sect_name, sect_size);
 }
 \f
@@ -2228,6 +2294,13 @@ amd64_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch)
   tdep->num_core_regs = AMD64_NUM_GREGS + I387_NUM_REGS;
   tdep->register_names = amd64_register_names;
 
+  if (tdesc_find_feature (tdesc, "org.gnu.gdb.i386.avx") != NULL)
+    {
+      tdep->ymmh_register_names = amd64_ymmh_names;
+      tdep->num_ymm_regs = 16;
+      tdep->ymm0h_regnum = AMD64_YMM0H_REGNUM;
+    }
+
   tdep->num_byte_regs = 16;
   tdep->num_word_regs = 16;
   tdep->num_dword_regs = 16;
@@ -2241,6 +2314,8 @@ amd64_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch)
 
   set_tdesc_pseudo_register_name (gdbarch, amd64_pseudo_register_name);
 
+  set_gdbarch_register_name (gdbarch, amd64_register_name);
+
   /* AMD64 has an FPU and 16 SSE registers.  */
   tdep->st0_regnum = AMD64_ST0_REGNUM;
   tdep->num_xmm_regs = 16;
@@ -2321,6 +2396,7 @@ void
 _initialize_amd64_tdep (void)
 {
   initialize_tdesc_amd64 ();
+  initialize_tdesc_amd64_avx ();
 }
 \f
 
@@ -2356,6 +2432,30 @@ amd64_supply_fxsave (struct regcache *regcache, int regnum,
     }
 }
 
+/* Similar to amd64_supply_fxsave, but use XSAVE extended state.  */
+
+void
+amd64_supply_xsave (struct regcache *regcache, int regnum,
+		    const void *xsave)
+{
+  struct gdbarch *gdbarch = get_regcache_arch (regcache);
+  struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+
+  i387_supply_xsave (regcache, regnum, xsave);
+
+  if (xsave && gdbarch_ptr_bit (gdbarch) == 64)
+    {
+      const gdb_byte *regs = xsave;
+
+      if (regnum == -1 || regnum == I387_FISEG_REGNUM (tdep))
+	regcache_raw_supply (regcache, I387_FISEG_REGNUM (tdep),
+			     regs + 12);
+      if (regnum == -1 || regnum == I387_FOSEG_REGNUM (tdep))
+	regcache_raw_supply (regcache, I387_FOSEG_REGNUM (tdep),
+			     regs + 20);
+    }
+}
+
 /* Fill register REGNUM (if it is a floating-point or SSE register) in
    *FXSAVE with the value from REGCACHE.  If REGNUM is -1, do this for
    all registers.  This function doesn't touch any of the reserved
@@ -2379,3 +2479,26 @@ amd64_collect_fxsave (const struct regcache *regcache, int regnum,
 	regcache_raw_collect (regcache, I387_FOSEG_REGNUM (tdep), regs + 20);
     }
 }
+
+/* Similar to amd64_collect_fxsave, but use XSAVE extended state.  */
+
+void
+amd64_collect_xsave (const struct regcache *regcache, int regnum,
+		     void *xsave, int gcore)
+{
+  struct gdbarch *gdbarch = get_regcache_arch (regcache);
+  struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+  gdb_byte *regs = xsave;
+
+  i387_collect_xsave (regcache, regnum, xsave, gcore);
+
+  if (gdbarch_ptr_bit (gdbarch) == 64)
+    {
+      if (regnum == -1 || regnum == I387_FISEG_REGNUM (tdep))
+	regcache_raw_collect (regcache, I387_FISEG_REGNUM (tdep),
+			      regs + 12);
+      if (regnum == -1 || regnum == I387_FOSEG_REGNUM (tdep))
+	regcache_raw_collect (regcache, I387_FOSEG_REGNUM (tdep),
+			      regs + 20);
+    }
+}
diff --git a/gdb/amd64-tdep.h b/gdb/amd64-tdep.h
index 363479c..9f07dda 100644
--- a/gdb/amd64-tdep.h
+++ b/gdb/amd64-tdep.h
@@ -61,12 +61,16 @@ enum amd64_regnum
   AMD64_FSTAT_REGNUM = AMD64_ST0_REGNUM + 9,
   AMD64_XMM0_REGNUM = 40,	/* %xmm0 */
   AMD64_XMM1_REGNUM,		/* %xmm1 */
-  AMD64_MXCSR_REGNUM = AMD64_XMM0_REGNUM + 16
+  AMD64_MXCSR_REGNUM = AMD64_XMM0_REGNUM + 16,
+  AMD64_YMM0H_REGNUM,		/* %ymm0h */
+  AMD64_YMM15H_REGNUM = AMD64_YMM0H_REGNUM + 15
 };
 
 /* Number of general purpose registers.  */
 #define AMD64_NUM_GREGS		24
 
+#define AMD64_NUM_REGS		(AMD64_YMM15H_REGNUM + 1)
+
 extern struct displaced_step_closure *amd64_displaced_step_copy_insn
   (struct gdbarch *gdbarch, CORE_ADDR from, CORE_ADDR to,
    struct regcache *regs);
@@ -77,12 +81,6 @@ extern void amd64_displaced_step_fixup (struct gdbarch *gdbarch,
 
 extern void amd64_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch);
 
-/* Functions from amd64-tdep.c which may be needed on architectures
-   with extra registers.  */
-
-extern const char *amd64_register_name (struct gdbarch *gdbarch, int regnum);
-extern struct type *amd64_register_type (struct gdbarch *gdbarch, int regnum);
-
 /* Fill register REGNUM in REGCACHE with the appropriate
    floating-point or SSE register value from *FXSAVE.  If REGNUM is
    -1, do this for all registers.  This function masks off any of the
@@ -91,6 +89,10 @@ extern struct type *amd64_register_type (struct gdbarch *gdbarch, int regnum);
 extern void amd64_supply_fxsave (struct regcache *regcache, int regnum,
 				 const void *fxsave);
 
+/* Similar to amd64_supply_fxsave, but use XSAVE extended state.  */
+extern void amd64_supply_xsave (struct regcache *regcache, int regnum,
+				const void *xsave);
+
 /* Fill register REGNUM (if it is a floating-point or SSE register) in
    *FXSAVE with the value from REGCACHE.  If REGNUM is -1, do this for
    all registers.  This function doesn't touch any of the reserved
@@ -99,6 +101,10 @@ extern void amd64_supply_fxsave (struct regcache *regcache, int regnum,
 extern void amd64_collect_fxsave (const struct regcache *regcache, int regnum,
 				  void *fxsave);
 
+/* Similar to amd64_collect_fxsave, but use XSAVE extended state.  */
+extern void amd64_collect_xsave (const struct regcache *regcache,
+				 int regnum, void *xsave, int gcore);
+
 void amd64_classify (struct type *type, enum amd64_reg_class class[2]);
 
 \f

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PATCH: 3/6 [2nd try]: Add AVX support (i386 changes)
  2010-03-12  0:00           ` H.J. Lu
@ 2010-03-27 14:55             ` Mark Kettenis
  2010-03-27 15:30               ` Daniel Jacobowitz
  2010-03-27 15:33               ` H.J. Lu
  0 siblings, 2 replies; 115+ messages in thread
From: Mark Kettenis @ 2010-03-27 14:55 UTC (permalink / raw)
  To: hjl.tools; +Cc: gdb-patches

> Date: Thu, 11 Mar 2010 16:00:05 -0800
> From: "H.J. Lu" <hjl.tools@gmail.com>
>
> >> +
> >> +#include "i386-xstate.h"
> >> +
> >> +#ifndef PTRACE_GETREGSET
> >> +#define PTRACE_GETREGSET     0x4204
> >> +#endif
> >> +
> >> +#ifndef PTRACE_SETREGSET
> >> +#define PTRACE_SETREGSET     0x4205
> >> +#endif
> >> +
> >> +#endif       /* NM_LINUX_XSTATE_H */
> >
> > Do we really have to hardcode constants like this in GDB?  They should
> > > be available through kernel/libc headers.  Are Drepper and Torvalds
> > still fighting over that issue?
> 
> They are in Linux kernel 2.6.34-rc1. Do we enable gdb support only
> with the new kernel/glibc headers? I compiled gdb on RHEL4 and it
> works fine.  There are:
> 
> #ifndef PTRACE_GET_THREAD_AREA
> #define PTRACE_GET_THREAD_AREA 25
>  ...
> #ifndef PTRACE_ARCH_PRCTL
> #define PTRACE_ARCH_PRCTL      30
> 
> in amd64-linux-nat.c.

Yes, we have done that in the past, but I think we should stop adding
#defines like that.  

> >> +
> >> +/* The extended state size in unit of int64.  We use array of int64 for
> >> +   better alignment.  */
> >> +static unsigned int xstate_size_n_of_int64;
> >
> > Does alignment really matter?  I'd rather do without this additional
> > complication.
> 
> "xcr0" is a 64bit value.  It is nice to use array of uint64 to access it.

But there are also 32-bit, 128-bit and 256-bit fields in the xstate.
Therefore I think that typing it as an array of 64-bit values is
misleading.
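
To illustrate the point, here is a minimal sketch (not code from the
patch) of reading XCR0 out of a plain byte buffer with memcpy, so the
buffer does not need to be typed as an array of 64-bit values.  The
names xsave_read_xcr0 and xcr0_has_avx are hypothetical.  The offset
464 comes from the patch description; the XCR0 bit layout (bit 0 x87,
bit 1 SSE, bit 2 AVX) is an assumption stated in the comments.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Per the patch description, Linux stores XCR0 in the first 8
   software-usable bytes of the XSAVE area, at byte offset 464.  */
#define XSAVE_XCR0_OFFSET 464

/* Assumed XCR0 feature bits: bit 0 = x87, bit 1 = SSE, bit 2 = AVX.  */
#define XSTATE_X87 (UINT64_C (1) << 0)
#define XSTATE_SSE (UINT64_C (1) << 1)
#define XSTATE_AVX (UINT64_C (1) << 2)
#define XSTATE_AVX_MASK (XSTATE_X87 | XSTATE_SSE | XSTATE_AVX)

/* Read XCR0 from the XSAVE buffer XSAVE of LEN bytes.  memcpy avoids
   any alignment requirement on the buffer; the value is read in host
   byte order, which matches the target for native x86 debugging.  */
uint64_t
xsave_read_xcr0 (const unsigned char *xsave, size_t len)
{
  uint64_t xcr0;

  assert (len >= XSAVE_XCR0_OFFSET + sizeof xcr0);
  memcpy (&xcr0, xsave + XSAVE_XCR0_OFFSET, sizeof xcr0);
  return xcr0;
}

/* Return non-zero if XCR0 enables every state needed for AVX.  */
int
xcr0_has_avx (uint64_t xcr0)
{
  return (xcr0 & XSTATE_AVX_MASK) == XSTATE_AVX_MASK;
}
```

With something like this the buffer can stay an array of bytes, and
the 32-bit, 128-bit and 256-bit fields can be accessed the same way.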

> >> +static int
> >> +fetch_xstateregs (struct regcache *regcache, int tid)
> >> +{
> >> +  unsigned long long xstateregs[xstate_size_n_of_int64];
> >> +  struct iovec iov;
> >> +
> >> +  if (!have_ptrace_getregset)
> >> +    return 0;
> >> +
> >> +  iov.iov_base = xstateregs;
> >> +  iov.iov_len = xstate_size;
> >> +  if (ptrace (PTRACE_GETREGSET, tid, (unsigned int) NT_X86_XSTATE,
> >> +           (int) &iov) < 0)
> >
> > This can't be right!
> 
> Why? That is the kernel interface in 2.6.34-rc1.

Well, at least your usage of casts here and further on in the code is
inconsistent.  But casting a pointer to an int acts as a red flag to
me.  Given that the userland prototype for ptrace(2) is:

extern long int ptrace (enum __ptrace_request __request, ...) __THROW;

I believe those casts shouldn't be necessary.

> >> +    perror_with_name (_("Couldn't read extended state status"));
> >> +
> >> +  i387_supply_xsave (regcache, -1, xstateregs);
> >> +  return 1;
> >> +}
> >> +
> >> +/* Store all valid registers in GDB's register array covered by the
> >> +   PTRACE_SETREGSET request into the process/thread specified by TID.
> >> +   Return non-zero if successful, zero otherwise.  */
> >> +
> >> +static int
> >> +store_xstateregs (const struct regcache *regcache, int tid, int regno)
> >> +{
> >> +  unsigned long long xstateregs[xstate_size_n_of_int64];
> >
> > I think it is better to use I386_XSTATE_MAX_SIZE here.
> 
> That is how the kernel interface works.  Whatever value
> I386_XSTATE_MAX_SIZE is today won't be the same tomorrow. We will
> increase it in the coming years. But the same gdb binary will work
> fine since kernel will only copy number of bytes specified in
> iov.iov_len, which is all gdb cares/needs.

Yes, you'll need to raise I386_XSTATE_MAX_SIZE whenever the kernel
gains support for different/larger xstates.  But I don't see a problem
with that, since you'll have to make changes to GDB to support those
variants anyway.  That reminds me:

> >> +  struct iovec iov;
> >> +
> >> +  if (!have_ptrace_getregset)
> >> +    return 0;
> >> +
> >> +  iov.iov_base = xstateregs;
> >> +  iov.iov_len = xstate_size;

You probably should set iov.iov_len to sizeof(xstateregs) here.
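
A minimal sketch of that suggestion, not the patch's actual code: the
buffer is a fixed-size array and iov_len is taken from the array itself,
so the length handed to the kernel can never exceed the storage behind
it.  XSTATE_MAX_SIZE, struct xstate_buf and xstate_iov_init are
hypothetical names; the value 832 is an assumption based on the XSAVE
layout discussed in this thread (576 bytes up to the end of the header,
plus 16 upper YMM halves of 16 bytes each).

```c
#include <stddef.h>
#include <sys/uio.h>

/* Assumed upper bound on the XSAVE area GDB understands; it stands in
   for I386_XSTATE_MAX_SIZE and would grow with new state components.  */
#define XSTATE_MAX_SIZE (576 + 16 * 16)

/* Hypothetical wrapper so that sizeof always reports the array size.  */
struct xstate_buf
{
  unsigned char bytes[XSTATE_MAX_SIZE];
};

/* Point IOV at BUF and take the length from the buffer itself.  */
void
xstate_iov_init (struct iovec *iov, struct xstate_buf *buf)
{
  iov->iov_base = buf->bytes;
  iov->iov_len = sizeof buf->bytes;
}
```

With this shape the kernel may still copy fewer bytes than iov_len; the
point is only that it cannot be asked to copy more than the buffer holds.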

> >>        if (store_fpxregs (regcache, tid, regno))
> >> @@ -858,7 +943,49 @@ i386_linux_child_post_startup_inferior (ptid_t ptid)
> >>  static const struct target_desc *
> >>  i386_linux_read_description (struct target_ops *ops)
> >>  {
> >> -  return tdesc_i386_linux;
> >> +  static unsigned long long xcr0;
> >
> > Is it really ok, to cache this?  Will the Linux kernel always return
> > the same value for every process?
> 
> xcr0 is a processor value and will be the same for all processes.

ok; but could you change this to uint64_t?

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PATCH: 5/6 [2nd try]: Add AVX support (i387 changes)
  2010-03-06 22:22       ` PATCH: 5/6 [2nd try]: " H.J. Lu
  2010-03-12 17:24         ` H.J. Lu
@ 2010-03-27 15:08         ` Mark Kettenis
  2010-03-27 15:15           ` H.J. Lu
  1 sibling, 1 reply; 115+ messages in thread
From: Mark Kettenis @ 2010-03-27 15:08 UTC (permalink / raw)
  To: hjl.tools; +Cc: gdb-patches

> Date: Sat, 6 Mar 2010 14:22:12 -0800
> From: "H.J. Lu" <hongjiu.lu@intel.com>
> 
> Hi,
> 
> Here are i387 changes to support AVX.  OK to install?

I can't help thinking that the i387_supply_xsave/i387_collect_xsave
functions can be written in a simpler way, but I guess for now this is
acceptable.  Hope you don't mind if I rewrite that logic at some
point though.
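
For readers following the quoted patch, the decision it makes for each
register class can be sketched roughly as follows.  This is a
simplification under stated assumptions, not a proposed replacement:
classify, enum reg_source and the XSTATE_* macros are hypothetical
names, and the bit positions are the architectural XCR0 bits (bit 0 is
x87, bit 1 is SSE, bit 2 is AVX).

```c
#include <stdint.h>

#define XSTATE_X87 (1u << 0)
#define XSTATE_SSE (1u << 1)
#define XSTATE_AVX (1u << 2)

enum reg_source
{
  REG_UNSUPPORTED,	/* Feature not enabled in XCR0.  */
  REG_ZERO,		/* Enabled, but xstate_bv says "initial state".  */
  REG_FROM_BUFFER	/* Enabled and present in the XSAVE area.  */
};

/* Decide where the registers guarded by FEATURE should come from.  */
enum reg_source
classify (uint64_t xcr0, uint64_t xstate_bv, uint64_t feature)
{
  /* Features the OS enabled but the process has not touched yet.  */
  uint64_t clear_bv = ~xstate_bv & xcr0;

  if ((xcr0 & feature) == 0)
    return REG_UNSUPPORTED;
  if ((clear_bv & feature) != 0)
    return REG_ZERO;
  return REG_FROM_BUFFER;
}
```

For example, with xcr0 = 7 and xstate_bv = 3 the upper YMM halves are
enabled but still in their initial state, so they would be supplied as
zeros rather than read from the buffer.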

> H.J.
> ---
> 2010-03-06  H.J. Lu  <hongjiu.lu@intel.com>
> 
> 	* i387-tdep.c: Include "i386-xstate.h".
> 	(XSAVE_XSTATE_BV_ADDR): New.
> 	(xsave_avxh_offset): Likewise.
> 	(XSAVE_AVXH_ADDR): Likewise.
> 	(i387_supply_xsave): Likewise.
> 	(i387_collect_xsave): Likewise.
> 
> 	* i387-tdep.h (I387_NUM_YMM_REGS): New.
> 	(I387_YMM0H_REGNUM): Likewise.
> 	(I387_YMMENDH_REGNUM): Likewise.
> 	(i387_supply_xsave): Likewise.
> 	(i387_collect_xsave): Likewise.
> 
> diff --git a/gdb/i387-tdep.c b/gdb/i387-tdep.c
> index 3fb5b56..197af7f 100644
> --- a/gdb/i387-tdep.c
> +++ b/gdb/i387-tdep.c
> @@ -34,6 +34,7 @@
>  
>  #include "i386-tdep.h"
>  #include "i387-tdep.h"
> +#include "i386-xstate.h"
>  
>  /* Print the floating point number specified by RAW.  */
>  
> @@ -677,6 +678,518 @@ i387_collect_fxsave (const struct regcache *regcache, int regnum, void *fxsave)
>  			  FXSAVE_MXCSR_ADDR (regs));
>  }
>  
> +/* `xstate_bv' is at byte offset 512.  */
> +#define XSAVE_XSTATE_BV_ADDR(xsave) (xsave + 512)
> +
> +/* At xsave_avxh_offset[REGNUM] you'll find the offset to the location in
> +   the upper 128bit of AVX register data structure used by the "xsave"
> +   instruction where GDB register REGNUM is stored.  */
> +
> +static int xsave_avxh_offset[] =
> +{
> +  576 + 0 * 16,		/* Upper 128bit of %ymm0 through ...  */
> +  576 + 1 * 16,
> +  576 + 2 * 16,
> +  576 + 3 * 16,
> +  576 + 4 * 16,
> +  576 + 5 * 16,
> +  576 + 6 * 16,
> +  576 + 7 * 16,
> +  576 + 8 * 16,
> +  576 + 9 * 16,
> +  576 + 10 * 16,
> +  576 + 11 * 16,
> +  576 + 12 * 16,
> +  576 + 13 * 16,
> +  576 + 14 * 16,
> +  576 + 15 * 16		/* Upper 128bit of ... %ymm15 (128 bits each).  */
> +};
> +
> +#define XSAVE_AVXH_ADDR(tdep, xsave, regnum) \
> +  (xsave + xsave_avxh_offset[regnum - I387_YMM0H_REGNUM (tdep)])
> +
> +/* Similar to i387_supply_fxsave, but use XSAVE extended state.  */
> +
> +void
> +i387_supply_xsave (struct regcache *regcache, int regnum,
> +		   const void *xsave)
> +{
> +  struct gdbarch_tdep *tdep = gdbarch_tdep (get_regcache_arch (regcache));
> +  const gdb_byte *regs = xsave;
> +  int i;
> +  unsigned int clear_bv;
> +  const gdb_byte *p;
> +  enum
> +    {
> +      none = 0x0,
> +      x87 = 0x1,
> +      sse = 0x2,
> +      avxh = 0x4,
> +      all = x87 | sse | avxh
> +    } regclass;
> +
> +  gdb_assert (tdep->st0_regnum >= I386_ST0_REGNUM);
> +  gdb_assert (tdep->num_xmm_regs > 0);
> +
> +  if (regnum == -1)
> +    regclass = all;
> +  else if (regnum >= I387_YMM0H_REGNUM (tdep)
> +	   && regnum < I387_YMMENDH_REGNUM (tdep))
> +    regclass = avxh;
> +  else if (regnum >= I387_XMM0_REGNUM(tdep)
> +	   && regnum < I387_MXCSR_REGNUM (tdep))
> +    regclass = sse;
> +  else if (regnum >= I387_ST0_REGNUM (tdep)
> +	   && regnum < I387_FCTRL_REGNUM (tdep))
> +    regclass = x87;
> +  else
> +    regclass = none;
> +
> +  if (regs != NULL && regclass != none)
> +    {
> +      /* Get `xstate_bv'.  */
> +      const gdb_byte *xstate_bv_p = XSAVE_XSTATE_BV_ADDR (regs);
> +
> +      /* The supported bits in `xstate_bv' are 1 byte.  Clear part in
> +	 vector registers if its bit in xstate_bv is zero.  */
> +      clear_bv = (~(*xstate_bv_p)) & tdep->xcr0;
> +    }
> +  else
> +    clear_bv = I386_XSTATE_MAX_MASK;
> +
> +  switch (regclass)
> +    {
> +    case none:
> +      break;
> +
> +    case avxh:
> +      if ((clear_bv & bit_I386_XSTATE_AVX))
> +	p = NULL;
> +      else
> +	p = XSAVE_AVXH_ADDR (tdep, regs, regnum);
> +      regcache_raw_supply (regcache, regnum, p);
> +      return;
> +
> +    case sse:
> +      if ((clear_bv & bit_I386_XSTATE_SSE))
> +	p = NULL;
> +      else
> +	p = FXSAVE_ADDR (tdep, regs, regnum);
> +      regcache_raw_supply (regcache, regnum, p);
> +      return;
> +
> +    case x87:
> +      if ((clear_bv & bit_I386_XSTATE_X87))
> +	p = NULL;
> +      else
> +	p = FXSAVE_ADDR (tdep, regs, regnum);
> +      regcache_raw_supply (regcache, regnum, p);
> +      return;
> +
> +    case all:
> +      /* Handle the upper YMM registers.  */
> +      if ((tdep->xcr0 & bit_I386_XSTATE_AVX))
> +	{
> +	  if ((clear_bv & bit_I386_XSTATE_AVX))
> +	    p = NULL;
> +	  else
> +	    p = regs;
> +
> +	  for (i = I387_YMM0H_REGNUM (tdep);
> +	       i < I387_YMMENDH_REGNUM (tdep); i++)
> +	    {
> +	      if (p != NULL)
> +		p = XSAVE_AVXH_ADDR (tdep, regs, i);
> +	      regcache_raw_supply (regcache, i, p);
> +	    }
> +	}
> +
> +      /* Handle the XMM registers.  */
> +      if ((tdep->xcr0 & bit_I386_XSTATE_SSE))
> +	{
> +	  if ((clear_bv & bit_I386_XSTATE_SSE))
> +	    p = NULL;
> +	  else
> +	    p = regs;
> +
> +	  for (i = I387_XMM0_REGNUM (tdep);
> +	       i < I387_MXCSR_REGNUM (tdep); i++)
> +	    {
> +	      if (p != NULL)
> +		p = FXSAVE_ADDR (tdep, regs, i);
> +	      regcache_raw_supply (regcache, i, p);
> +	    }
> +	}
> +
> +      /* Handle the x87 registers.  */
> +      if ((tdep->xcr0 & bit_I386_XSTATE_X87))
> +	{
> +	  if ((clear_bv & bit_I386_XSTATE_X87))
> +	    p = NULL;
> +	  else
> +	    p = regs;
> +
> +	  for (i = I387_ST0_REGNUM (tdep);
> +	       i < I387_FCTRL_REGNUM (tdep); i++)
> +	    {
> +	      if (p != NULL)
> +		p = FXSAVE_ADDR (tdep, regs, i);
> +	      regcache_raw_supply (regcache, i, p);
> +	    }
> +	}
> +      break;
> +    }
> +
> +  /* Only handle x87 control registers.  */
> +  for (i = I387_FCTRL_REGNUM (tdep); i < I387_XMM0_REGNUM (tdep); i++)
> +    if (regnum == -1 || regnum == i)
> +      {
> +	if (regs == NULL)
> +	  {
> +	    regcache_raw_supply (regcache, i, NULL);
> +	    continue;
> +	  }
> +
> +	/* Most of the FPU control registers occupy only 16 bits in
> +	   the xsave extended state.  Give those a special treatment.  */
> +	if (i != I387_FIOFF_REGNUM (tdep)
> +	    && i != I387_FOOFF_REGNUM (tdep))
> +	  {
> +	    gdb_byte val[4];
> +
> +	    memcpy (val, FXSAVE_ADDR (tdep, regs, i), 2);
> +	    val[2] = val[3] = 0;
> +	    if (i == I387_FOP_REGNUM (tdep))
> +	      val[1] &= ((1 << 3) - 1);
> +	    else if (i== I387_FTAG_REGNUM (tdep))
> +	      {
> +		/* The fxsave area contains a simplified version of
> +		   the tag word.  We have to look at the actual 80-bit
> +		   FP data to recreate the traditional i387 tag word.  */
> +
> +		unsigned long ftag = 0;
> +		int fpreg;
> +		int top;
> +
> +		top = ((FXSAVE_ADDR (tdep, regs,
> +				     I387_FSTAT_REGNUM (tdep)))[1] >> 3);
> +		top &= 0x7;
> +
> +		for (fpreg = 7; fpreg >= 0; fpreg--)
> +		  {
> +		    int tag;
> +
> +		    if (val[0] & (1 << fpreg))
> +		      {
> +			int regnum = (fpreg + 8 - top) % 8 
> +				       + I387_ST0_REGNUM (tdep);
> +			tag = i387_tag (FXSAVE_ADDR (tdep, regs, regnum));
> +		      }
> +		    else
> +		      tag = 3;		/* Empty */
> +
> +		    ftag |= tag << (2 * fpreg);
> +		  }
> +		val[0] = ftag & 0xff;
> +		val[1] = (ftag >> 8) & 0xff;
> +	      }
> +	    regcache_raw_supply (regcache, i, val);
> +	  }
> +	else 
> +	  regcache_raw_supply (regcache, i, FXSAVE_ADDR (tdep, regs, i));
> +      }
> +
> +  if (regnum == I387_MXCSR_REGNUM (tdep) || regnum == -1)
> +    {
> +      p = regs == NULL ? NULL : FXSAVE_MXCSR_ADDR (regs);
> +      regcache_raw_supply (regcache, I387_MXCSR_REGNUM (tdep), p);
> +    }
> +}
> +
> +/* Similar to i387_collect_fxsave, but use XSAVE extended state.  */
> +
> +void
> +i387_collect_xsave (const struct regcache *regcache, int regnum,
> +		    void *xsave, int gcore)
> +{
> +  struct gdbarch_tdep *tdep = gdbarch_tdep (get_regcache_arch (regcache));
> +  gdb_byte *regs = xsave;
> +  int i;
> +  enum
> +    {
> +      none = 0x0,
> +      check = 0x1,
> +      x87 = 0x2 | check,
> +      sse = 0x4 | check,
> +      avxh = 0x8 | check,
> +      all = x87 | sse | avxh
> +    } regclass;
> +
> +  gdb_assert (tdep->st0_regnum >= I386_ST0_REGNUM);
> +  gdb_assert (tdep->num_xmm_regs > 0);
> +
> +  if (regnum == -1)
> +    regclass = all;
> +  else if (regnum >= I387_YMM0H_REGNUM (tdep)
> +	   && regnum < I387_YMMENDH_REGNUM (tdep))
> +    regclass = avxh;
> +  else if (regnum >= I387_XMM0_REGNUM(tdep)
> +	   && regnum < I387_MXCSR_REGNUM (tdep))
> +    regclass = sse;
> +  else if (regnum >= I387_ST0_REGNUM (tdep)
> +	   && regnum < I387_FCTRL_REGNUM (tdep))
> +    regclass = x87;
> +  else
> +    regclass = none;
> +
> +  if (gcore)
> +    {
> +      /* Update XCR0 and `xstate_bv' with XCR0 for gcore.  */
> +      if (tdep->xsave_xcr0_offset != -1)
> +	memcpy (regs + tdep->xsave_xcr0_offset, &tdep->xcr0, 8);
> +      memcpy (XSAVE_XSTATE_BV_ADDR (regs), &tdep->xcr0, 8);
> +
> +      switch (regclass)
> +	{
> +	default:
> +	  abort ();
> +
> +	case all:
> +	  /* Handle the upper YMM registers.  */
> +	  if ((tdep->xcr0 & bit_I386_XSTATE_AVX))
> +	    for (i = I387_YMM0H_REGNUM (tdep);
> +		 i < I387_YMMENDH_REGNUM (tdep); i++)
> +	      regcache_raw_collect (regcache, i,
> +				    XSAVE_AVXH_ADDR (tdep, regs, i));
> +
> +	  /* Handle the XMM registers.  */
> +	  if ((tdep->xcr0 & bit_I386_XSTATE_SSE))
> +	    for (i = I387_XMM0_REGNUM (tdep);
> +		 i < I387_MXCSR_REGNUM (tdep); i++)
> +	      regcache_raw_collect (regcache, i,
> +				    FXSAVE_ADDR (tdep, regs, i));
> +
> +	  /* Handle the x87 registers.  */
> +	  if ((tdep->xcr0 & bit_I386_XSTATE_X87))
> +	    for (i = I387_ST0_REGNUM (tdep);
> +		 i < I387_FCTRL_REGNUM (tdep); i++)
> +	      regcache_raw_collect (regcache, i,
> +				    FXSAVE_ADDR (tdep, regs, i));
> +	  break;
> +
> +	case x87:
> +	  regcache_raw_collect (regcache, regnum,
> +				FXSAVE_ADDR (tdep, regs, regnum));
> +	  return;
> +
> +	case sse:
> +	  regcache_raw_collect (regcache, regnum,
> +				FXSAVE_ADDR (tdep, regs, regnum));
> +	  return;
> +
> +	case avxh:
> +	  regcache_raw_collect (regcache, regnum,
> +				XSAVE_AVXH_ADDR (tdep, regs, regnum));
> +	  return;
> +	}
> +    }
> +  else
> +    {
> +      if ((regclass & check))
> +	{
> +	  gdb_byte raw[I386_MAX_REGISTER_SIZE];
> +	  gdb_byte *xstate_bv_p = XSAVE_XSTATE_BV_ADDR (regs);
> +	  unsigned int xstate_bv = 0;
> +	  /* The supported bits in `xstate_bv' are 1 byte.  */
> +	  unsigned int clear_bv = (~(*xstate_bv_p)) & tdep->xcr0;
> +	  gdb_byte *p;
> +
> +	  /* Clear register set if its bit in xstate_bv is zero.  */
> +	  if (clear_bv)
> +	    {
> +	      if ((clear_bv & bit_I386_XSTATE_AVX))
> +		for (i = I387_YMM0H_REGNUM (tdep);
> +		     i < I387_YMMENDH_REGNUM (tdep); i++)
> +		  memset (XSAVE_AVXH_ADDR (tdep, regs, i), 0, 16);
> +
> +	      if ((clear_bv & bit_I386_XSTATE_SSE))
> +		for (i = I387_XMM0_REGNUM (tdep);
> +		     i < I387_MXCSR_REGNUM (tdep); i++)
> +		  memset (FXSAVE_ADDR (tdep, regs, i), 0, 16);
> +
> +	      if ((clear_bv & bit_I386_XSTATE_X87))
> +		for (i = I387_ST0_REGNUM (tdep);
> +		     i < I387_FCTRL_REGNUM (tdep); i++)
> +		  memset (FXSAVE_ADDR (tdep, regs, i), 0, 10);
> +	    }
> +
> +	  if (regclass == all)
> +	    {
> +	      /* Check if any upper YMM registers are changed.  */
> +	      if ((tdep->xcr0 & bit_I386_XSTATE_AVX))
> +		for (i = I387_YMM0H_REGNUM (tdep);
> +		     i < I387_YMMENDH_REGNUM (tdep); i++)
> +		  {
> +		    regcache_raw_collect (regcache, i, raw);
> +		    p = XSAVE_AVXH_ADDR (tdep, regs, i);
> +		    if (memcmp (raw, p, 16))
> +		      {
> +			xstate_bv |= bit_I386_XSTATE_AVX;
> +			memcpy (p, raw, 16);
> +		      }
> +		  }
> +
> +	      /* Check if any SSE registers are changed.  */
> +	      if ((tdep->xcr0 & bit_I386_XSTATE_SSE))
> +		for (i = I387_XMM0_REGNUM (tdep);
> +		     i < I387_MXCSR_REGNUM (tdep); i++)
> +		  {
> +		    regcache_raw_collect (regcache, i, raw);
> +		    p = FXSAVE_ADDR (tdep, regs, i);
> +		    if (memcmp (raw, p, 16))
> +		      {
> +			xstate_bv |= bit_I386_XSTATE_SSE;
> +			memcpy (p, raw, 16);
> +		      }
> +		  }
> +
> +	      /* Check if any X87 registers are changed.  */
> +	      if ((tdep->xcr0 & bit_I386_XSTATE_X87))
> +		for (i = I387_ST0_REGNUM (tdep);
> +		     i < I387_FCTRL_REGNUM (tdep); i++)
> +		  {
> +		    regcache_raw_collect (regcache, i, raw);
> +		    p = FXSAVE_ADDR (tdep, regs, i);
> +		    if (memcmp (raw, p, 10))
> +		      {
> +			xstate_bv |= bit_I386_XSTATE_X87;
> +			memcpy (p, raw, 10);
> +		      }
> +		  }
> +	    }
> +	  else
> +	    {
> +	      /* Check if REGNUM is changed.  */
> +	      regcache_raw_collect (regcache, regnum, raw);
> +
> +	      switch (regclass)
> +		{
> +		default:
> +		  abort ();
> +
> +		case avxh:
> +		  /* This is an upper YMM register.  */
> +		  p = XSAVE_AVXH_ADDR (tdep, regs, regnum);
> +		  if (memcmp (raw, p, 16))
> +		    {
> +		      xstate_bv |= bit_I386_XSTATE_AVX;
> +		      memcpy (p, raw, 16);
> +		    }
> +		  break;
> +
> +		case sse:
> +		  /* This is an SSE register.  */
> +		  p = FXSAVE_ADDR (tdep, regs, regnum);
> +		  if (memcmp (raw, p, 16))
> +		    {
> +		      xstate_bv |= bit_I386_XSTATE_SSE;
> +		      memcpy (p, raw, 16);
> +		    }
> +		  break;
> +
> +		case x87:
> +		  /* This is an x87 register.  */
> +		  p = FXSAVE_ADDR (tdep, regs, regnum);
> +		  if (memcmp (raw, p, 10))
> +		    {
> +		      xstate_bv |= bit_I386_XSTATE_X87;
> +		      memcpy (p, raw, 10);
> +		    }
> +		  break;
> +		}
> +	    }
> +
> +	  /* Update the corresponding bits in `xstate_bv' if any SSE/AVX
> +	     registers are changed.  */
> +	  if (xstate_bv)
> +	    {
> +	      /* The supported bits in `xstate_bv' are 1 byte.  */
> +	      *xstate_bv_p |= (gdb_byte) xstate_bv;
> +
> +	      switch (regclass)
> +		{
> +		default:
> +		  abort ();
> +
> +		case all:
> +		  break;
> +
> +		case x87:
> +		case sse:
> +		case avxh:
> +		  /* Register REGNUM has been updated.  Return.  */
> +		  return;
> +		}
> +	    }
> +	  else
> +	    {
> +	      /* Return if REGNUM isn't changed.  */
> +	      if (regclass != all)
> +		return;
> +	    }
> +	}
> +    }
> +
> +  /* Only handle x87 control registers.  */
> +  for (i = I387_FCTRL_REGNUM (tdep); i < I387_XMM0_REGNUM (tdep); i++)
> +    if (regnum == -1 || regnum == i)
> +      {
> +	/* Most of the FPU control registers occupy only 16 bits in
> +	   the xsave extended state.  Give those a special treatment.  */
> +	if (i != I387_FIOFF_REGNUM (tdep)
> +	    && i != I387_FOOFF_REGNUM (tdep))
> +	  {
> +	    gdb_byte buf[4];
> +
> +	    regcache_raw_collect (regcache, i, buf);
> +
> +	    if (i == I387_FOP_REGNUM (tdep))
> +	      {
> +		/* The opcode occupies only 11 bits.  Make sure we
> +                   don't touch the other bits.  */
> +		buf[1] &= ((1 << 3) - 1);
> +		buf[1] |= ((FXSAVE_ADDR (tdep, regs, i))[1] & ~((1 << 3) - 1));
> +	      }
> +	    else if (i == I387_FTAG_REGNUM (tdep))
> +	      {
> +		/* Converting back is much easier.  */
> +
> +		unsigned short ftag;
> +		int fpreg;
> +
> +		ftag = (buf[1] << 8) | buf[0];
> +		buf[0] = 0;
> +		buf[1] = 0;
> +
> +		for (fpreg = 7; fpreg >= 0; fpreg--)
> +		  {
> +		    int tag = (ftag >> (fpreg * 2)) & 3;
> +
> +		    if (tag != 3)
> +		      buf[0] |= (1 << fpreg);
> +		  }
> +	      }
> +	    memcpy (FXSAVE_ADDR (tdep, regs, i), buf, 2);
> +	  }
> +	else
> +	  regcache_raw_collect (regcache, i, FXSAVE_ADDR (tdep, regs, i));
> +      }
> +
> +  if (regnum == I387_MXCSR_REGNUM (tdep) || regnum == -1)
> +    regcache_raw_collect (regcache, I387_MXCSR_REGNUM (tdep),
> +			  FXSAVE_MXCSR_ADDR (regs));
> +}
> +
>  /* Recreate the FTW (tag word) valid bits from the 80-bit FP data in
>     *RAW.  */
>  
> diff --git a/gdb/i387-tdep.h b/gdb/i387-tdep.h
> index 645eb91..976fa11 100644
> --- a/gdb/i387-tdep.h
> +++ b/gdb/i387-tdep.h
> @@ -33,6 +33,8 @@ struct ui_file;
>  #define I387_ST0_REGNUM(tdep) ((tdep)->st0_regnum)
>  #define I387_NUM_XMM_REGS(tdep) ((tdep)->num_xmm_regs)
>  #define I387_MM0_REGNUM(tdep) ((tdep)->mm0_regnum)
> +#define I387_NUM_YMM_REGS(tdep) ((tdep)->num_ymm_regs)
> +#define I387_YMM0H_REGNUM(tdep) ((tdep)->ymm0h_regnum)
>  
>  #define I387_FCTRL_REGNUM(tdep) (I387_ST0_REGNUM (tdep) + 8)
>  #define I387_FSTAT_REGNUM(tdep) (I387_FCTRL_REGNUM (tdep) + 1)
> @@ -45,6 +47,8 @@ struct ui_file;
>  #define I387_XMM0_REGNUM(tdep) (I387_ST0_REGNUM (tdep) + 16)
>  #define I387_MXCSR_REGNUM(tdep) \
>    (I387_XMM0_REGNUM (tdep) + I387_NUM_XMM_REGS (tdep))
> +#define I387_YMMENDH_REGNUM(tdep) \
> +  (I387_YMM0H_REGNUM (tdep) + I387_NUM_YMM_REGS (tdep))
>  
>  /* Print out the i387 floating point state.  */
>  
> @@ -99,6 +103,11 @@ extern void i387_collect_fsave (const struct regcache *regcache, int regnum,
>  extern void i387_supply_fxsave (struct regcache *regcache, int regnum,
>  				const void *fxsave);
>  
> +/* Similar to i387_supply_fxsave, but use XSAVE extended state.  */
> +
> +extern void i387_supply_xsave (struct regcache *regcache, int regnum,
> +			       const void *xsave);
> +
>  /* Fill register REGNUM (if it is a floating-point or SSE register) in
>     *FXSAVE with the value from REGCACHE.  If REGNUM is -1, do this for
>     all registers.  This function doesn't touch any of the reserved
> @@ -107,6 +116,11 @@ extern void i387_supply_fxsave (struct regcache *regcache, int regnum,
>  extern void i387_collect_fxsave (const struct regcache *regcache, int regnum,
>  				 void *fxsave);
>  
> +/* Similar to i387_collect_fxsave, but use XSAVE extended state.  */
> +
> +extern void i387_collect_xsave (const struct regcache *regcache,
> +				int regnum, void *xsave, int gcore);
> +
>  /* Prepare the FPU stack in REGCACHE for a function return.  */
>  
>  extern void i387_return_value (struct gdbarch *gdbarch,
> 

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PATCH: 5/6 [2nd try]: Add AVX support (i387 changes)
  2010-03-27 15:08         ` PATCH: 5/6 [2nd " Mark Kettenis
@ 2010-03-27 15:15           ` H.J. Lu
  0 siblings, 0 replies; 115+ messages in thread
From: H.J. Lu @ 2010-03-27 15:15 UTC (permalink / raw)
  To: Mark Kettenis; +Cc: gdb-patches

On Sat, Mar 27, 2010 at 8:08 AM, Mark Kettenis <mark.kettenis@xs4all.nl> wrote:
>> Date: Sat, 6 Mar 2010 14:22:12 -0800
>> From: "H.J. Lu" <hongjiu.lu@intel.com>
>>
>> Hi,
>>
>> Here are i387 changes to support AVX.  OK to install?
>
> I can't help thinking that the i387_supply_xsave/i387_collect_xsave
> functions can be written in a simpler way, but I guess for now this is
> acceptable.  Hope you don't mind if I rewrite that logic at some
> point though.
>

That is fine with me as long as the new code is extensible and follows
the processor architecture specification.

BTW, the version in gdbserver/i387-fp.c is much simpler.

Thanks.


-- 
H.J.

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PATCH: 3/6 [2nd try]: Add AVX support (i386 changes)
  2010-03-27 14:55             ` Mark Kettenis
@ 2010-03-27 15:30               ` Daniel Jacobowitz
  2010-03-27 16:05                 ` Mark Kettenis
  2010-03-27 15:33               ` H.J. Lu
  1 sibling, 1 reply; 115+ messages in thread
From: Daniel Jacobowitz @ 2010-03-27 15:30 UTC (permalink / raw)
  To: Mark Kettenis; +Cc: hjl.tools, gdb-patches

On Sat, Mar 27, 2010 at 03:54:56PM +0100, Mark Kettenis wrote:
> Yes, we have done that in the past, but I think we should stop adding
> #defines like that.  

I disagree.  The values are available in the kernel and glibc headers,
sure.  But it takes years before a new define in linux/ptrace.h is
widely available, and it is not uncommon for new kernels to enter use
faster than that.  This way, if someone builds a new GDB with AVX
support, and installs a new kernel with AVX support, they don't come
to us and ask why AVX support isn't in their GDB.

[I agree with everything else in Mark's review.]

-- 
Daniel Jacobowitz
CodeSourcery

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PATCH: 3/6 [2nd try]: Add AVX support (i386 changes)
  2010-03-27 14:55             ` Mark Kettenis
  2010-03-27 15:30               ` Daniel Jacobowitz
@ 2010-03-27 15:33               ` H.J. Lu
  2010-03-27 16:09                 ` Mark Kettenis
  1 sibling, 1 reply; 115+ messages in thread
From: H.J. Lu @ 2010-03-27 15:33 UTC (permalink / raw)
  To: Mark Kettenis; +Cc: gdb-patches

On Sat, Mar 27, 2010 at 7:54 AM, Mark Kettenis <mark.kettenis@xs4all.nl> wrote:
>> Date: Thu, 11 Mar 2010 16:00:05 -0800
>> From: "H.J. Lu" <hjl.tools@gmail.com>
>>
>> >> +
>> >> +#include "i386-xstate.h"
>> >> +
>> >> +#ifndef PTRACE_GETREGSET
>> >> +#define PTRACE_GETREGSET     0x4204
>> >> +#endif
>> >> +
>> >> +#ifndef PTRACE_SETREGSET
>> >> +#define PTRACE_SETREGSET     0x4205
>> >> +#endif
>> >> +
>> >> +#endif       /* NM_LINUX_XSTATE_H */
>> >
>> > Do we really have to hardcode constants like this in GDB?  They should
>> > be available in through kernel/libc headers.  Are Drepper and Torvalds
>> > still fighting over that issue?
>>
>> They are in Linux kernel 2.6.34-rc1. Do we enable gdb support only
>> with the new kernel/glibc headers? I compiled gdb on RHEL4 and it
>> works fine.  There are:
>>
>> #ifndef PTRACE_GET_THREAD_AREA
>> #define PTRACE_GET_THREAD_AREA 25
>>  ...
>> #ifndef PTRACE_ARCH_PRCTL
>> #define PTRACE_ARCH_PRCTL      30
>>
>> in amd64-linux-nat.c.
>
> Yes, we have done that in the past, but I think we should stop adding
> #defines like that.

AVX gdb support only needs PTRACE_GETREGSET/PTRACE_SETREGSET,
which are fixed constants. I don't think we should require new kernel/glibc
header files for AVX support. I can change it to

#ifdef PTRACE_GETREGSET
#if PTRACE_GETREGSET != 0x4204
# error PTRACE_GETREGSET != 0x4204
#endif
#else
#define PTRACE_GETREGSET        0x4204
#endif


>> >> +
>> >> +/* The extended state size in unit of int64.  We use array of int64 for
>> >> +   better alignment.  */
>> >> +static unsigned int xstate_size_n_of_int64;
>> >
>> > Does alignment really matter?  I'd rather do without this additional
>> > complication.
>>
>> "xcr0" is a 64bit value.  It is nice to use array of uint64 to access it.
>
> But there are also 32-bit, 128-bit and 256-bit fields in the xstate.
> Therefore I think that typing it as an array of 64-bit values is
> misleading.

I will change it.

>> >> +static int
>> >> +fetch_xstateregs (struct regcache *regcache, int tid)
>> >> +{
>> >> +  unsigned long long xstateregs[xstate_size_n_of_int64];
>> >> +  struct iovec iov;
>> >> +
>> >> +  if (!have_ptrace_getregset)
>> >> +    return 0;
>> >> +
>> >> +  iov.iov_base = xstateregs;
>> >> +  iov.iov_len = xstate_size;
>> >> +  if (ptrace (PTRACE_GETREGSET, tid, (unsigned int) NT_X86_XSTATE,
>> >> +           (int) &iov) < 0)
>> >
>> > This can't be right!
>>
>> Why? That is the kernel interface in 2.6.34-rc1.
>
> Well, at least your usage of casts here and further on in the code is
> inconsistent.  But casting a pointer to an int acts as a red flag to
> me.  Given that the userland prototype for ptrace(2) is:
>
> extern long int ptrace (enum __ptrace_request __request, ...) __THROW;
>
> I believe those casts shouldn't be necessary.

I will change it.

>> >> +    perror_with_name (_("Couldn't read extended state status"));
>> >> +
>> >> +  i387_supply_xsave (regcache, -1, xstateregs);
>> >> +  return 1;
>> >> +}
>> >> +
>> >> +/* Store all valid registers in GDB's register array covered by the
>> >> +   PTRACE_SETREGSET request into the process/thread specified by TID.
>> >> +   Return non-zero if successful, zero otherwise.  */
>> >> +
>> >> +static int
>> >> +store_xstateregs (const struct regcache *regcache, int tid, int regno)
>> >> +{
>> >> +  unsigned long long xstateregs[xstate_size_n_of_int64];
>> >
>> > I think it is better to use I386_XSTATE_MAX_SIZE here.
>>
>> That is how the kernel interface works.  Whatever value
>> I386_XSTATE_MAX_SIZE is today won't be the same tomorrow. We will
>> increase it in the coming years. But the same gdb binary will work
>> fine since kernel will only copy number of bytes specified in
>> iov.iov_len, which is all gdb cares/needs.
>
> Yes, you'll need to raise I386_XSTATE_MAX_SIZE whenever the kernel
> gains support for different/larger xstates.  But I don't see a problem
> with that, since you'll have to make changes to GDB to support those
> variants anyway.  That reminds me:

I will remove I386_XSTATE_MAX_SIZE since it isn't needed by the kernel.

>> >> +  struct iovec iov;
>> >> +
>> >> +  if (!have_ptrace_getregset)
>> >> +    return 0;
>> >> +
>> >> +  iov.iov_base = xstateregs;
>> >> +  iov.iov_len = xstate_size;
>
> You probably should set iov.iov_len to sizeof(xstateregs) here.

I will make the change.

>> >>        if (store_fpxregs (regcache, tid, regno))
>> >> @@ -858,7 +943,49 @@ i386_linux_child_post_startup_inferior (ptid_t ptid)
>> >>  static const struct target_desc *
>> >>  i386_linux_read_description (struct target_ops *ops)
>> >>  {
>> >> -  return tdesc_i386_linux;
>> >> +  static unsigned long long xcr0;
>> >
>> > Is it really ok, to cache this?  Will the Linux kernel always return
>> > the same value for every process?
>>
>> xcr0 is a processor value and will be the same for all processes.
>
> ok; but could you change this to uint64_t?
>

I will make the change.

Thanks.


-- 
H.J.

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PATCH: 3/6 [2nd try]: Add AVX support (i386 changes)
  2010-03-06 22:21     ` PATCH: 3/6 [2nd try]: " H.J. Lu
  2010-03-07 21:32       ` H.J. Lu
  2010-03-12 16:49       ` H.J. Lu
@ 2010-03-27 15:48       ` Mark Kettenis
  2010-03-28  1:37         ` H.J. Lu
  2 siblings, 1 reply; 115+ messages in thread
From: Mark Kettenis @ 2010-03-27 15:48 UTC (permalink / raw)
  To: hjl.tools; +Cc: gdb-patches

> Date: Sat, 6 Mar 2010 14:20:37 -0800
> From: "H.J. Lu" <hongjiu.lu@intel.com>
> 
> Hi,
> 
> Here are i386 changes to support AVX. OK to install?

OK, here's a review of the remainder of this part of the diff.  I'll
wait with reviewing the amd64 bits until we've got the i386 part
right, since a lot of what I'll say about i386 will also apply to
amd64.  OK?

> diff --git a/gdb/i386-linux-tdep.c b/gdb/i386-linux-tdep.c
> index b23c109..66ecf84 100644
> --- a/gdb/i386-linux-tdep.c
> +++ b/gdb/i386-linux-tdep.c
> @@ -23,6 +23,7 @@
>  #include "frame.h"
>  #include "value.h"
>  #include "regcache.h"
> +#include "regset.h"
>  #include "inferior.h"
>  #include "osabi.h"
>  #include "reggroups.h"
> @@ -36,9 +37,11 @@
>  #include "solib-svr4.h"
>  #include "symtab.h"
>  #include "arch-utils.h"
> -#include "regset.h"
>  #include "xml-syscall.h"
>  
> +#include "i387-tdep.h"
> +#include "i386-xstate.h"
> +
>  /* The syscall's XML filename for i386.  */
>  #define XML_SYSCALL_FILENAME_I386 "syscalls/i386-linux.xml"
>  
> @@ -47,13 +50,15 @@
>  #include <stdint.h>
>  
>  #include "features/i386/i386-linux.c"
> +#include "features/i386/i386-avx-linux.c"
>  
>  /* Supported register note sections.  */
> -static struct core_regset_section i386_linux_regset_sections[] =
> +struct core_regset_section i386_linux_regset_sections[] =

Why do you make this non-static?

>  {
>    { ".reg", 144, "general-purpose" },
>    { ".reg2", 108, "floating-point" },
>    { ".reg-xfp", 512, "extended floating-point" },
> +  { ".reg-xstate", 0, "XSAVE extended state" },
>    { NULL, 0 }
>  };
> @@ -560,6 +566,66 @@ static int i386_linux_sc_reg_offset[] =
>    0 * 4				/* %gs */
>  };
>  
> +/* Update XSAVE extended state register note section.  */
> +
> +void
> +i386_linux_update_xstateregset
> +  (struct core_regset_section *regset_sections, unsigned int xstate_size)
> +{
> +  int i;
> +
> +  /* Update the XSAVE extended state register note section for "gcore".
> +     Disable it if its size is 0.  */
> +  for (i = 0; regset_sections[i].sect_name != NULL; i++)
> +    if (strcmp (regset_sections[i].sect_name, ".reg-xstate") == 0)
> +      {
> +	if (xstate_size)
> +	  regset_sections[i].size = xstate_size;
> +	else
> +	  regset_sections[i].sect_name = NULL;
> +	break;
> +      }
> +}

What will happen if you have a single GDB connected to two different
remote targets, one with AVX support and one without?
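
One way to read the concern, sketched under assumptions: the update
function edits a single table shared by every target, so disabling the
".reg-xstate" entry for a non-AVX target is permanent.  A later AVX
target can no longer find the entry to set its size.  struct
regset_section, update_xstate_section and count_sections are
hypothetical stand-ins that mirror the quoted code; they are not GDB's
definitions.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for struct core_regset_section.  */
struct regset_section
{
  const char *sect_name;
  int size;
};

/* Number of entries before the NULL-name terminator.  */
size_t
count_sections (const struct regset_section *s)
{
  size_t n = 0;

  while (s[n].sect_name != NULL)
    n++;
  return n;
}

/* Same logic as the quoted i386_linux_update_xstateregset.  */
void
update_xstate_section (struct regset_section *s, unsigned int xstate_size)
{
  size_t i;

  for (i = 0; s[i].sect_name != NULL; i++)
    if (strcmp (s[i].sect_name, ".reg-xstate") == 0)
      {
	if (xstate_size)
	  s[i].size = xstate_size;
	else
	  s[i].sect_name = NULL;	/* Also terminates the table here.  */
	break;
      }
}
```

Calling update_xstate_section with size 0 and then with a non-zero size
on the same table leaves the entry disabled, which is the two-target
scenario raised above.  Keeping a per-gdbarch copy of the table would be
one way around it.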

> +/* Get XSAVE extended state xcr0 from core dump.  */
> +
> +unsigned long long
> +i386_linux_core_read_xcr0 (struct gdbarch *gdbarch,
> +			   struct target_ops *target, bfd *abfd)

If you follow my advice about using uint64_t for xcr0, the return value
will have to be adjusted.

> +{
> +  asection *xstate = bfd_get_section_by_name (abfd, ".reg-xstate");
> +  unsigned long long xcr0;
> +
> +  if (xstate)
> +    {
> +      size_t size = bfd_section_size (abfd, xstate);
> +
> +      gdb_assert (size >= I386_XSTATE_SSE_SIZE);

Isn't a gdb_assert() here a bit harsh?  What happens if you simply return 0?
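
A minimal sketch of the "return 0" alternative, working on a plain byte
buffer instead of a BFD section.  All names and constants here are
assumptions for illustration: read_xcr0 is hypothetical, the sizes 576
and 832 and the offset 464 are taken from the XSAVE layout described in
this thread, and the mask value 3 assumes the x87 and SSE bits of XCR0.

```c
#include <stddef.h>
#include <stdint.h>

#define XSTATE_SSE_SIZE 576	/* Legacy area plus XSAVE header.  */
#define XSTATE_AVX_SIZE 832	/* ... plus the upper YMM halves.  */
#define XSTATE_SSE_MASK 0x3	/* x87 | SSE.  */
#define XSAVE_XCR0_OFFSET 464	/* Start of the software-usable bytes.  */

/* Return the XCR0 value stored in XSTATE, or 0 if the section is too
   small to be a valid XSAVE area.  */
uint64_t
read_xcr0 (const unsigned char *xstate, size_t size)
{
  uint64_t xcr0 = 0;
  int i;

  if (size < XSTATE_SSE_SIZE)
    return 0;			/* Malformed core file: report "unknown".  */
  if (size < XSTATE_AVX_SIZE)
    return XSTATE_SSE_MASK;

  /* x86 is little-endian; assemble the value byte by byte so the
     code does not depend on host byte order or alignment.  */
  for (i = 0; i < 8; i++)
    xcr0 |= (uint64_t) xstate[XSAVE_XCR0_OFFSET + i] << (8 * i);
  return xcr0;
}
```

A caller that tests the result against the AVX mask would then fall back
to the plain description when handed a truncated section, instead of
aborting the debugger.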

> +      /* Check extended state size.  */
> +      if (size < I386_XSTATE_AVX_SIZE)
> +	xcr0 = I386_XSTATE_SSE_MASK;
> +      else
> +	{
> +	  char contents[8];
> +
> +	  if (! bfd_get_section_contents (abfd, xstate, contents,
> +					  (file_ptr) I386_LINUX_XSAVE_XCR0_OFFSET,
> +					  8))

Is that cast really necessary?

>  /* Get Linux/x86 target description from core dump.  */
>  
>  static const struct target_desc *
> @@ -568,12 +634,17 @@ i386_linux_core_read_description (struct gdbarch *gdbarch,
>  				  bfd *abfd)
>  {
>    asection *section = bfd_get_section_by_name (abfd, ".reg2");
> +  unsigned long long xcr0;
>  
>    if (section == NULL)
>      return NULL;
>  
>    /* Linux/i386.  */
> -  return tdesc_i386_linux;
> +  xcr0 = i386_linux_core_read_xcr0 (gdbarch, target, abfd);
> +  if ((xcr0 & I386_XSTATE_AVX_MASK) == I386_XSTATE_AVX_MASK)
> +    return tdesc_i386_avx_linux;
> +  else
> +    return tdesc_i386_linux;
>  }
>  
>  static void
> @@ -623,6 +694,8 @@ i386_linux_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch)
>    tdep->sc_reg_offset = i386_linux_sc_reg_offset;
>    tdep->sc_num_regs = ARRAY_SIZE (i386_linux_sc_reg_offset);
>  
> +  tdep->xsave_xcr0_offset = I386_LINUX_XSAVE_XCR0_OFFSET;
> +
>    set_gdbarch_process_record (gdbarch, i386_process_record);
>    set_gdbarch_process_record_signal (gdbarch, i386_linux_record_signal);
>  
> @@ -840,4 +913,5 @@ _initialize_i386_linux_tdep (void)
>  
>    /* Initialize the Linux target description  */
>    initialize_tdesc_i386_linux ();
> +  initialize_tdesc_i386_avx_linux ();
>  }
> diff --git a/gdb/i386-linux-tdep.h b/gdb/i386-linux-tdep.h
> index 11f7295..8881fea 100644
> --- a/gdb/i386-linux-tdep.h
> +++ b/gdb/i386-linux-tdep.h
> @@ -30,12 +30,45 @@
>  /* Register number for the "orig_eax" pseudo-register.  If this
>     pseudo-register contains a value >= 0 it is interpreted as the
>     system call number that the kernel is supposed to restart.  */
> -#define I386_LINUX_ORIG_EAX_REGNUM I386_SSE_NUM_REGS
> +#define I386_LINUX_ORIG_EAX_REGNUM I386_AVX_NUM_REGS
>  
>  /* Total number of registers for GNU/Linux.  */
>  #define I386_LINUX_NUM_REGS (I386_LINUX_ORIG_EAX_REGNUM + 1)
>  
> +/* Get XSAVE extended state xcr0 from core dump.  */
> +extern unsigned long long i386_linux_core_read_xcr0
> +  (struct gdbarch *gdbarch, struct target_ops *target, bfd *abfd);
> +
>  /* Linux target description.  */
>  extern struct target_desc *tdesc_i386_linux;
> +extern struct target_desc *tdesc_i386_avx_linux;
> +
> +/* Supported register note sections.  */
> +extern struct core_regset_section i386_linux_regset_sections[];
> +
> +/* Update XSAVE extended state register note section.  */
> +extern void i386_linux_update_xstateregset
> +  (struct core_regset_section *regset_sections, unsigned int xstate_size);
> +
> +/* Format of XSAVE extended state is:
> + 	struct
> +	{
> +	  fxsave_bytes[0..463]
> +	  sw_usable_bytes[464..511]
> +	  xstate_hdr_bytes[512..575]
> +	  avx_bytes[576..831]
> +	  future_state etc
> +	};
> +
> +  Same memory layout will be used for the coredump NT_X86_XSTATE
> +  representing the XSAVE extended state registers.
> +
> +  The first 8 bytes of the sw_usable_bytes[464..467] is set to OS enabled
> +  enabled state mask,  which is same as the 64bit mask returned by the
> +  xgetbv's XCR0). We can use this mask as well as the mask saved in the
> +  xstate_hdr bytes to interpret what states the processor/OS supports and
> +  what state is in, used/initialized conditions, for the particular
> +  process/thread.  */

Can you ask a native English speaker to look at this comment?

> diff --git a/gdb/i386-tdep.c b/gdb/i386-tdep.c
> index 05afa56..8ced34a 100644
> --- a/gdb/i386-tdep.c
> +++ b/gdb/i386-tdep.c
> @@ -2183,6 +2241,59 @@ i387_ext_type (struct gdbarch *gdbarch)
>    return tdep->i387_ext_type;
>  }
>  
> +/* Construct vector type for pseudo XMM registers.  We can't use
> +   tdesc_find_type since XMM isn't described in target description.  */

I'm confused here.  If you have a non-AVX target, why do you need a 256-bit vector type?

> +static struct type *
> +i386_ymm_type (struct gdbarch *gdbarch)
> +{
> +  struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
> +
> +  if (!tdep->i386_ymm_type)
> +    {
> +      const struct builtin_type *bt = builtin_type (gdbarch);
> +
> +      /* The type we're building is this: */
> +#if 0
> +      union __gdb_builtin_type_vec256i
> +      {
> +        int128_t uint128[2];
> +        int64_t v2_int64[4];
> +        int32_t v4_int32[8];
> +        int16_t v8_int16[16];
> +        int8_t v16_int8[32];
> +        double v2_double[4];
> +        float v4_float[8];
> +      };
> +#endif
> +
> +      struct type *t;
> +
> +      t = arch_composite_type (gdbarch,
> +			       "__gdb_builtin_type_vec256i", TYPE_CODE_UNION);
> +      append_composite_type_field (t, "v8_float",
> +				   init_vector_type (bt->builtin_float, 8));
> +      append_composite_type_field (t, "v4_double",
> +				   init_vector_type (bt->builtin_double, 4));
> +      append_composite_type_field (t, "v32_int8",
> +				   init_vector_type (bt->builtin_int8, 32));
> +      append_composite_type_field (t, "v16_int16",
> +				   init_vector_type (bt->builtin_int16, 16));
> +      append_composite_type_field (t, "v8_int32",
> +				   init_vector_type (bt->builtin_int32, 8));
> +      append_composite_type_field (t, "v4_int64",
> +				   init_vector_type (bt->builtin_int64, 4));
> +      append_composite_type_field (t, "v2_int128",
> +				   init_vector_type (bt->builtin_int128, 2));
> +
> +      TYPE_VECTOR (t) = 1;
> +      TYPE_NAME (t) = "builtin_type_vec128i";
> +      tdep->i386_ymm_type = t;
> +    }
> +
> +  return tdep->i386_ymm_type;
> +}
> +
>  /* Construct vector type for MMX registers.  */
>  static struct type *
>  i386_mmx_type (struct gdbarch *gdbarch)
> @@ -2233,6 +2344,8 @@ i386_pseudo_register_type (struct gdbarch *gdbarch, int regnum)
>  {
>    if (i386_mmx_regnum_p (gdbarch, regnum))
>      return i386_mmx_type (gdbarch);
> +  else if (i386_ymm_regnum_p (gdbarch, regnum))
> +    return i386_ymm_type (gdbarch);
>    else
>      {
>        const struct builtin_type *bt = builtin_type (gdbarch);
> @@ -2284,7 +2397,22 @@ i386_pseudo_register_read (struct gdbarch *gdbarch, struct regcache *regcache,
>      {
>        struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
>  
> -      if (i386_word_regnum_p (gdbarch, regnum))
> +      if (i386_ymm_regnum_p (gdbarch, regnum))
> +	{
> +	  regnum -= tdep->ymm0_regnum;
> +
> +	  /* Extract (always little endian).  Read lower 16byte. */
> +	  regcache_raw_read (regcache,
> +			     I387_XMM0_REGNUM (tdep) + regnum,
> +			     raw_buf);
> +	  memcpy (buf, raw_buf, 16);
> +	  /* Read upper 16byte.  */
> +	  regcache_raw_read (regcache,
> +			     tdep->ymm0h_regnum + regnum,
> +			     raw_buf);
> +	  memcpy (buf + 16, raw_buf, 16);
> +	}
> +      else if (i386_word_regnum_p (gdbarch, regnum))
>  	{
>  	  int gpnum = regnum - tdep->ax_regnum;
>  
> @@ -2333,7 +2461,20 @@ i386_pseudo_register_write (struct gdbarch *gdbarch, struct regcache *regcache,
>      {
>        struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
>  
> -      if (i386_word_regnum_p (gdbarch, regnum))
> +      if (i386_ymm_regnum_p (gdbarch, regnum))
> +	{
> +	  regnum -= tdep->ymm0_regnum;
> +
> +	  /* ... Write lower 16byte.  */
> +	  regcache_raw_write (regcache,
> +			     I387_XMM0_REGNUM (tdep) + regnum,
> +			     buf);
> +	  /* ... Write upper 16byte.  */
> +	  regcache_raw_write (regcache,
> +			     tdep->ymm0h_regnum + regnum,
> +			     buf + 16);

Could you change the comments here to say 128-bit instead of 16byte?
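
For reference, the read and write hunks above boil down to the
following.  This is a standalone sketch with plain byte buffers standing
in for the regcache; the function names are made up for illustration.

```c
#include <string.h>

/* Build the 256-bit pseudo register from its two raw halves.  */
static void
ymm_read (const unsigned char xmm[16], const unsigned char ymmh[16],
          unsigned char ymm[32])
{
  memcpy (ymm, xmm, 16);        /* Lower 128 bits come from %xmmN.  */
  memcpy (ymm + 16, ymmh, 16);  /* Upper 128 bits come from ymmNh.  */
}

/* Split a 256-bit value back into the two raw halves.  */
static void
ymm_write (const unsigned char ymm[32], unsigned char xmm[16],
           unsigned char ymmh[16])
{
  memcpy (xmm, ymm, 16);        /* Lower 128 bits go to %xmmN.  */
  memcpy (ymmh, ymm + 16, 16);  /* Upper 128 bits go to ymmNh.  */
}
```

The byte order needs no conversion because the target is always little
endian, so the lower half is simply the first 16 bytes of the buffer.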

> @@ -5649,7 +5836,8 @@ i386_validate_tdesc_p (struct gdbarch_tdep *tdep,
>  		       struct tdesc_arch_data *tdesc_data)
>  {
>    const struct target_desc *tdesc = tdep->tdesc;
> -  const struct tdesc_feature *feature_core, *feature_vector;
> +  const struct tdesc_feature *feature_core;
> +  const struct tdesc_feature *feature_sse, *feature_avx;
>    int i, num_regs, valid_p;
>  
>    if (! tdesc_has_registers (tdesc))
> @@ -5659,13 +5847,37 @@ i386_validate_tdesc_p (struct gdbarch_tdep *tdep,
>    feature_core = tdesc_find_feature (tdesc, "org.gnu.gdb.i386.core");
>  
>    /* Get SSE registers.  */
> -  feature_vector = tdesc_find_feature (tdesc, "org.gnu.gdb.i386.sse");
> +  feature_sse = tdesc_find_feature (tdesc, "org.gnu.gdb.i386.sse");
>  
> -  if (feature_core == NULL || feature_vector == NULL)
> +  if (feature_core == NULL || feature_sse == NULL)
>      return 0;
>  
> +  /* Try AVX registers.  */
> +  feature_avx = tdesc_find_feature (tdesc, "org.gnu.gdb.i386.avx");
> +
>    valid_p = 1;
>  
> +  /* The XCR0 bits.  */
> +  if (feature_avx)
> +    {
> +      tdep->xcr0 = I386_XSTATE_AVX_MASK;
> +
> +      /* It may be set by ABI-specific.  */

Sorry, but this comment makes no sense to me.

> @@ -5854,9 +6071,13 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
>    set_tdesc_pseudo_register_type (gdbarch, i386_pseudo_register_type);
>    set_tdesc_pseudo_register_name (gdbarch, i386_pseudo_register_name);
>  
> -  /* The default ABI includes general-purpose registers, 
> -     floating-point registers, and the SSE registers.  */
> -  set_gdbarch_num_regs (gdbarch, I386_SSE_NUM_REGS);
> +  /* Override the normal target description method to make the AVX
> +     upper halves anonymous.  */
> +  set_gdbarch_register_name (gdbarch, i386_register_name);
> +
> +  /* The default ABI includes general-purpose registers, floating-point
> +     registers, the SSE registers and the upper AVX registers.  */
> +  set_gdbarch_num_regs (gdbarch, I386_AVX_NUM_REGS);

Isn't it better to leave the AVX registers out of the default target,
and only provide them if we're talking to a target (native or remote)
that indicates it supports them?

> @@ -5940,6 +6177,9 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
>    set_gdbarch_fast_tracepoint_valid_at (gdbarch,
>  					i386_fast_tracepoint_valid_at);
>  
> +  /* Tell remote stub that we support XML target description.  */
> +  set_gdbarch_qsupported (gdbarch, "x86=xml");

> @@ -146,9 +156,24 @@ struct gdbarch_tdep
>    /* Number of SSE registers.  */
>    int num_xmm_regs;
>  
> +  /* Bits of the extended control register 0 (the XFEATURE_ENABLED_MASK
> +     register), excluding the x87 bit, which are supported by this gdb.
> +   */
> +  unsigned long long xcr0;

GDB should be capitalized.

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PATCH: 3/6 [2nd try]: Add AVX support (i386 changes)
  2010-03-27 15:30               ` Daniel Jacobowitz
@ 2010-03-27 16:05                 ` Mark Kettenis
  0 siblings, 0 replies; 115+ messages in thread
From: Mark Kettenis @ 2010-03-27 16:05 UTC (permalink / raw)
  To: dan; +Cc: hjl.tools, gdb-patches

> Date: Sat, 27 Mar 2010 11:30:22 -0400
> From: Daniel Jacobowitz <dan@codesourcery.com>
> 
> On Sat, Mar 27, 2010 at 03:54:56PM +0100, Mark Kettenis wrote:
> > Yes, we have done that in the past, but I think we should stop adding
> > #defines like that.  
> 
> I disagree.  The values are available in the kernel and glibc headers,
> sure.  But it takes years before a new define in linux/ptrace.h is
> widely available, and it is not uncommon for new kernels to enter use
> faster than that.  This way, if someone builds a new GDB with AVX
> support, and installs a new kernel with AVX support, they don't come
> to us and ask why AVX support isn't in their GDB.

I'm amazed that this is still an issue in Linux-land.  Ah well, I'll
let you guys have it your way.  Don't act surprised if I put in a
snide remark of some sort about it ;).

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PATCH: 6/6 [2nd try]: Add AVX support (gdbserver changes)
  2010-03-12 17:25           ` H.J. Lu
@ 2010-03-27 16:07             ` Daniel Jacobowitz
  2010-03-28  1:11               ` H.J. Lu
  2010-03-29  1:09             ` PATCH: 6/6 [3rd " H.J. Lu
  1 sibling, 1 reply; 115+ messages in thread
From: Daniel Jacobowitz @ 2010-03-27 16:07 UTC (permalink / raw)
  To: H.J. Lu; +Cc: GDB

On Fri, Mar 12, 2010 at 09:25:41AM -0800, H.J. Lu wrote:
> On Sat, Mar 06, 2010 at 02:22:50PM -0800, H.J. Lu wrote:
> > Hi,
> > 
> > Here are gdbserver changes to support AVX.  OK to install?
> > 
> > Thanks.
> > 
> > 
> 
> Here is the updated patch.  Any comments/suggestions?

I guess you haven't tested this one :-)  You may want to add an AVX
test to the testsuite, if it's not too much trouble.  You're checking
for the "x86=xml" feature in the target, but only calling the target
method for "x86:xstate=...".  I don't see how it could work.

The problem we're solving by modifying qSupported is that older
versions of GDB, which do not support XML registers at all, assume
a specific layout for the g/G packet.  Newer versions, which do
support XML, will use whatever the target supplies.  So, you only want
the target to supply the registers via XML if GDB will understand
them.  Is that accurate?

If that's the scope of the problem, then how about we handle
this in a way we can reuse for other targets?  That doesn't have
to change the implementation; just rename the feature to
"xmlRegisters+".
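
To make that concrete, here is roughly the check the stub would do on
the incoming packet.  Treat it as a sketch only: the function name is
invented, and "xmlRegisters+" is just the spelling proposed above, not a
settled protocol name.

```c
#include <string.h>

/* Return 1 if the semicolon-separated feature list in PACKET contains
   exactly FEATURE (for example "xmlRegisters+"), 0 otherwise.  */
static int
qsupported_has_feature (const char *packet, const char *feature)
{
  static const char prefix[] = "qSupported:";
  size_t flen = strlen (feature);
  const char *p;

  /* A bare "qSupported" with no feature list matches nothing.  */
  if (strncmp (packet, prefix, sizeof prefix - 1) != 0)
    return 0;

  p = packet + sizeof prefix - 1;
  while (*p != '\0')
    {
      size_t len = strcspn (p, ";");

      /* Compare whole items only, so "xmlRegisters+x" does not match.  */
      if (len == flen && strncmp (p, feature, len) == 0)
        return 1;

      p += len;
      if (*p == ';')
        p++;
    }

  return 0;
}
```

An older GDB never sends the feature, the check fails, and the stub
keeps using the fixed g/G layout that GDB expects.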

> @@ -264,21 +292,28 @@ x86_store_fpxregset (struct regcache *regcache, const void *buf)
>  struct regset_info target_regsets[] =
>  {
>  #ifdef HAVE_PTRACE_GETREGS
> -  { PTRACE_GETREGS, PTRACE_SETREGS, sizeof (elf_gregset_t),
> +  { PTRACE_GETREGS, PTRACE_SETREGS, 0, sizeof (elf_gregset_t),
>      GENERAL_REGS,
>      x86_fill_gregset, x86_store_gregset },
> +  { PTRACE_GETREGSET, PTRACE_SETREGSET, NT_X86_XSTATE, 0,
> +# ifdef __x86_64__
> +    FP_REGS,
> +# else
> +    EXTENDED_REGS,
> +# endif
> +    x86_fill_xstateregset, x86_store_xstateregset },

What's this #ifdef for?  I don't think anything checks FP_REGS vs
EXTENDED_REGS.

> +int use_xml =
> +#ifdef USE_XML
> +  1;
> +#else
> +  0;
> +#endif
> +

I know this is just a style nit, but please do:

#ifndef USE_XML
# define USE_XML 0
#endif
int use_xml = USE_XML;

> -#ifdef USE_XML
> -  {
> -    extern const char *const xml_builtin[][2];
> -    int i;
> +  if (use_xml)
> +    {
> +      extern const char *const xml_builtin[][2];
> +      int i;
>  
> -    /* Look for the annex.  */
> -    for (i = 0; xml_builtin[i][0] != NULL; i++)
> -      if (strcmp (annex, xml_builtin[i][0]) == 0)
> -	break;
> +      /* Look for the annex.  */
> +      for (i = 0; xml_builtin[i][0] != NULL; i++)
> +	if (strcmp (annex, xml_builtin[i][0]) == 0)
> +	  break;
>  
> -    if (xml_builtin[i][0] != NULL)
> -      return xml_builtin[i][1];
> -  }
> -#endif
> +      if (xml_builtin[i][0] != NULL)
> +	return xml_builtin[i][1];
> +    }
>  
>    return NULL;
>  }

Has anything arranged for xml_builtin to be defined if !defined(USE_XML)?
That is what the #ifdef is actually for.

I am not convinced any of the fiddling of use_xml is necessary or does
what you want it to do.  xml_builtin is for returning static files,
i.e. those included using xi:include or referenced via
setting gdbserver_xmltarget.  The register cache files set
gdbserver_xmltarget which is above this check.  Have you tested
gdbserver with and without AVX?  What does it do?

I think it'll work if you remove use_xml, and leave USE_XML alone.  If
GDB does not support XML, you can adjust gdbserver_xmltarget to report
just the architecture and OSABI the way it did before you added
register XML files.

-- 
Daniel Jacobowitz
CodeSourcery

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PATCH: 3/6 [2nd try]: Add AVX support (i386 changes)
  2010-03-27 15:33               ` H.J. Lu
@ 2010-03-27 16:09                 ` Mark Kettenis
  2010-03-28  1:39                   ` H.J. Lu
  0 siblings, 1 reply; 115+ messages in thread
From: Mark Kettenis @ 2010-03-27 16:09 UTC (permalink / raw)
  To: hjl.tools; +Cc: gdb-patches


> Date: Sat, 27 Mar 2010 08:33:01 -0700
> From: "H.J. Lu" <hjl.tools@gmail.com>
> >> Date: Thu, 11 Mar 2010 16:00:05 -0800
> >> From: "H.J. Lu" <hjl.tools@gmail.com>
> >>
> >> >> +
> >> >> +#include "i386-xstate.h"
> >> >> +
> >> >> +#ifndef PTRACE_GETREGSET
> >> >> +#define PTRACE_GETREGSET     0x4204
> >> >> +#endif
> >> >> +
> >> >> +#ifndef PTRACE_SETREGSET
> >> >> +#define PTRACE_SETREGSET     0x4205
> >> >> +#endif
> >> >> +
> >> >> +#endif       /* NM_LINUX_XSTATE_H */
> >> >
> >> > Do we really have to hardcode constants like this in GDB?  They should
> >> > be available in through kernel/libc headers.  Are Drepper and Torvalds
> >> > still fighting over that issue?
> >>
> >> They are in Linux kernel 2.6.34-rc1. Do we enable gdb support only
> >> with the new kernel/glibc headers? I compiled gdb on RHEL4 and it
> >> works fine.  There are:
> >>
> >> #ifndef PTRACE_GET_THREAD_AREA
> >> #define PTRACE_GET_THREAD_AREA 25
> >>  ...
> >> #ifndef PTRACE_ARCH_PRCTL
> >> #define PTRACE_ARCH_PRCTL      30
> >>
> >> in amd64-linux-nat.c.
> >
> > Yes, we have done that in the past, but I think we should stop adding
> > #defines like that.
> 
> AVX gdb support only needs PTRACE_GETREGSET/PTRACE_SETREGSET,
> which are fixed constants. I don't think we should require new kernel/glibc
> header files for AVX support. I can change it to
> 
> #ifdef PTRACE_GETREGSET
> #if PTRACE_GETREGSET != 0x4204
> # error PTRACE_GETREGSET != 0x4204
> #endif
> #else
> #define PTRACE_GETREGSET        0x4204
> #endif

Ugh, no.  That's even worse.  Let's leave it as it was in your diff.

> >> >> +    perror_with_name (_("Couldn't read extended state status"));
> >> >> +
> >> >> +  i387_supply_xsave (regcache, -1, xstateregs);
> >> >> +  return 1;
> >> >> +}
> >> >> +
> >> >> +/* Store all valid registers in GDB's register array covered by the
> >> >> +   PTRACE_SETREGSET request into the process/thread specified by TID.
> >> >> +   Return non-zero if successful, zero otherwise.  */
> >> >> +
> >> >> +static int
> >> >> +store_xstateregs (const struct regcache *regcache, int tid, int regno)
> >> >> +{
> >> >> +  unsigned long long xstateregs[xstate_size_n_of_int64];
> >> >
> >> > I think it is better to use I386_XSTATE_MAX_SIZE here.
> >>
> >> That is how the kernel interface works.  Whatever value
> >> I386_XSTATE_MAX_SIZE is today won't be the same tomorrow. We will
> >> increase it in the coming years. But the same gdb binary will work
> >> fine since kernel will only copy number of bytes specified in
> >> iov.iov_len, which is all gdb cares/needs.
> >
> > Yes, you'll need to raise I386_XSTATE_MAX_SIZE whenever the kernel
> > gains support for different/larger xstates.  But I don't see a problem
> > with that, since you'll have to make changes to GDB to support those
> > variants anyway.  That reminds me:
> 
> I will remove I386_XSTATE_MAX_SIZE since it isn't needed by kernel.

Huh?  You're missing the point here.  GDB is supposed to be written in
C90, which doesn't support variable-length arrays.  So you need a
compile-time constant to size the xstateregs array.  And
I386_XSTATE_MAX_SIZE fits the bill there perfectly.
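
To spell it out, here is a standalone sketch.  The value given to
I386_XSTATE_MAX_SIZE, the stand-in iovec struct and the fake_getregset
function are all assumptions made so that the example compiles without
kernel headers; the real code would call ptrace with PTRACE_GETREGSET.

```c
#include <stddef.h>
#include <string.h>

/* Assumed upper bound; the real value would live in i386-xstate.h.  */
#define I386_XSTATE_MAX_SIZE 832

/* Minimal stand-in for struct iovec.  */
struct xstate_iovec
{
  void *iov_base;
  size_t iov_len;
};

/* Hypothetical stand-in for the kernel side of PTRACE_GETREGSET: copy
   at most iov_len bytes and store the number actually copied.  */
static void
fake_getregset (struct xstate_iovec *iov, const unsigned char *state,
                size_t state_size)
{
  size_t n = state_size < iov->iov_len ? state_size : iov->iov_len;

  memcpy (iov->iov_base, state, n);
  iov->iov_len = n;
}

/* Fetch XSTATE_SIZE bytes (a run-time value) through a buffer whose
   size is a compile-time constant.  Return the number of bytes copied
   to OUT, or 0 if XSTATE_SIZE exceeds what this build supports.  */
static size_t
fetch_xstate (unsigned char *out, size_t xstate_size,
              const unsigned char *state, size_t state_size)
{
  unsigned char xstateregs[I386_XSTATE_MAX_SIZE];
  struct xstate_iovec iov;

  if (xstate_size > sizeof xstateregs)
    return 0;

  iov.iov_base = xstateregs;
  iov.iov_len = xstate_size;    /* Only this many bytes are requested.  */
  fake_getregset (&iov, state, state_size);

  memcpy (out, xstateregs, iov.iov_len);
  return iov.iov_len;
}
```

The array is sized by the constant, while iov_len still carries the
run-time size, so the kernel interface is used exactly as before.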

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PATCH: 0/6 [2nd try]: Add AVX support
  2010-03-06 22:16 ` PATCH: 0/6 [2nd try]: " H.J. Lu
  2010-03-06 22:18   ` PATCH: 1/6 [2nd try]: Add AVX support (AVX XML files) H.J. Lu
  2010-03-07 14:16   ` PATCH: 0/6 [2nd try]: Add AVX support Mark Kettenis
@ 2010-03-27 16:16   ` Daniel Jacobowitz
  2010-03-29  0:16   ` PATCH: 0/6 [3nd " H.J. Lu
  3 siblings, 0 replies; 115+ messages in thread
From: Daniel Jacobowitz @ 2010-03-27 16:16 UTC (permalink / raw)
  To: H.J. Lu; +Cc: GDB, Mark Kettenis

I've skimmed all the patches and commented on the gdbserver parts.
Other than that, I don't have any comments that Mark hasn't raised
already.  Thanks for revising.

-- 
Daniel Jacobowitz
CodeSourcery

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PATCH: 6/6 [2nd try]: Add AVX support (gdbserver changes)
  2010-03-27 16:07             ` Daniel Jacobowitz
@ 2010-03-28  1:11               ` H.J. Lu
  2010-03-28  7:55                 ` Pedro Alves
  2010-03-28 16:39                 ` Daniel Jacobowitz
  0 siblings, 2 replies; 115+ messages in thread
From: H.J. Lu @ 2010-03-28  1:11 UTC (permalink / raw)
  To: GDB

On Sat, Mar 27, 2010 at 9:07 AM, Daniel Jacobowitz <dan@codesourcery.com> wrote:
> On Fri, Mar 12, 2010 at 09:25:41AM -0800, H.J. Lu wrote:
>> On Sat, Mar 06, 2010 at 02:22:50PM -0800, H.J. Lu wrote:
>> > Hi,
>> >
>> > Here are gdbserver changes to support AVX.  OK to install?
>> >
>> > Thanks.
>> >
>> >
>>
>> Here is the updated patch.  Any comments/suggestions?
>
> I guess you haven't tested this one :-)  You may want to add an AVX
> test to the testsuite, if it's not too much trouble.  You're checking
> for the "x86=xml" feature in the target, but only calling the target
> method for "x86:xstate=...".  I don't see how it could work.
>
> The problem we're solving by modifying qSupported is that older
> versions of GDB, which do not support XML registers at all, assume
> a specific layout for the g/G packet.  Newer versions, which do
> support XML, will use whatever the target supplies.  So, you only want
> the target to supply the registers via XML if GDB will understand
> them.  Is that accurate?

Yes.

> If that's the scope of the problem, then how about we handle
> this in a way we can reuse for other targets?  That doesn't have
> to change the implementation; just rename the feature to
> "xmlRegisters+".

I will make the change.

>> @@ -264,21 +292,28 @@ x86_store_fpxregset (struct regcache *regcache, const void *buf)
>>  struct regset_info target_regsets[] =
>>  {
>>  #ifdef HAVE_PTRACE_GETREGS
>> -  { PTRACE_GETREGS, PTRACE_SETREGS, sizeof (elf_gregset_t),
>> +  { PTRACE_GETREGS, PTRACE_SETREGS, 0, sizeof (elf_gregset_t),
>>      GENERAL_REGS,
>>      x86_fill_gregset, x86_store_gregset },
>> +  { PTRACE_GETREGSET, PTRACE_SETREGSET, NT_X86_XSTATE, 0,
>> +# ifdef __x86_64__
>> +    FP_REGS,
>> +# else
>> +    EXTENDED_REGS,
>> +# endif
>> +    x86_fill_xstateregset, x86_store_xstateregset },
>
> What's this #ifdef for?  I don't think anything checks FP_REGS vs
> EXTENDED_REGS.

I was just following the current format, where the SSE register set is
marked with EXTENDED_REGS for i386 and FP_REGS for x86-64. I don't mind
changing it to either of them for both i386 and x86-64. Just let me
know which one I should use.

>> +int use_xml =
>> +#ifdef USE_XML
>> +  1;
>> +#else
>> +  0;
>> +#endif
>> +
>
> I know this is just a style nit, but please do:
>
> #ifndef USE_XML
> # define USE_XML 0
> #endif
> int use_xml = USE_XML;

I will make the change.

>> -#ifdef USE_XML
>> -  {
>> -    extern const char *const xml_builtin[][2];
>> -    int i;
>> +  if (use_xml)
>> +    {
>> +      extern const char *const xml_builtin[][2];
>> +      int i;
>>
>> -    /* Look for the annex.  */
>> -    for (i = 0; xml_builtin[i][0] != NULL; i++)
>> -      if (strcmp (annex, xml_builtin[i][0]) == 0)
>> -     break;
>> +      /* Look for the annex.  */
>> +      for (i = 0; xml_builtin[i][0] != NULL; i++)
>> +     if (strcmp (annex, xml_builtin[i][0]) == 0)
>> +       break;
>>
>> -    if (xml_builtin[i][0] != NULL)
>> -      return xml_builtin[i][1];
>> -  }
>> -#endif
>> +      if (xml_builtin[i][0] != NULL)
>> +     return xml_builtin[i][1];
>> +    }
>>
>>    return NULL;
>>  }
>
> Has anything arranged for xml_builtin to be defined if !defined(USE_XML)?
> That is what the #ifdef is actually for.
>
> I am not convinced any of the fiddling of use_xml is necessary or does
> what you want it to do.  xml_builtin is for returning static files,
> i.e. those included using xi:include or referenced via
> setting gdbserver_xmltarget.  The register cache files set
> gdbserver_xmltarget which is above this check.  Have you tested
> gdbserver with and without AVX?  What does it do?

Yes, I have tested them. The logic is in x86_linux_process_qsupported
which will set XML target to AVX if AVX is supported.

> I think it'll work if you remove use_xml, and leave USE_XML alone.  If
> GDB does not support XML, you can adjust gdbserver_xmltarget to report
> just the architecture and OSABI the way it did before you added
> register XML files.
>

I don't know how gdbserver_xmltarget should be set if gdb doesn't support
XML. My current approach is to turn off XML support at run-time even if
USE_XML is 1 when gdb doesn't support XML.

Can you show me an example of how to properly turn off XML via
gdbserver_xmltarget?

Thanks.


-- 
H.J.

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PATCH: 3/6 [2nd try]: Add AVX support (i386 changes)
  2010-03-27 15:48       ` PATCH: 3/6 [2nd " Mark Kettenis
@ 2010-03-28  1:37         ` H.J. Lu
  2010-03-28 11:55           ` Mark Kettenis
  0 siblings, 1 reply; 115+ messages in thread
From: H.J. Lu @ 2010-03-28  1:37 UTC (permalink / raw)
  To: Mark Kettenis; +Cc: gdb-patches

On Sat, Mar 27, 2010 at 8:47 AM, Mark Kettenis <mark.kettenis@xs4all.nl> wrote:
>> Date: Sat, 6 Mar 2010 14:20:37 -0800
>> From: "H.J. Lu" <hongjiu.lu@intel.com>
>>
>> Hi,
>>
>> Here are i386 changes to support AVX. OK to install?
>
> OK, here's a review of the remainder of this part of the diff.  I'll
> wait with reviewing the amd64 bits until we've got the i386 part
> right, since a lot of what I'll say about i386 will also apply to
> amd64.  OK?

That is fine.

>> diff --git a/gdb/i386-linux-tdep.c b/gdb/i386-linux-tdep.c
>> index b23c109..66ecf84 100644
>> --- a/gdb/i386-linux-tdep.c
>> +++ b/gdb/i386-linux-tdep.c
>> +#include "i387-tdep.h"
>> +#include "i386-xstate.h"
>> +
>>  /* The syscall's XML filename for i386.  */
>>  #define XML_SYSCALL_FILENAME_I386 "syscalls/i386-linux.xml"
>>
>> @@ -47,13 +50,15 @@
>>  #include <stdint.h>
>>
>>  #include "features/i386/i386-linux.c"
>> +#include "features/i386/i386-avx-linux.c"
>>
>>  /* Supported register note sections.  */
>> -static struct core_regset_section i386_linux_regset_sections[] =
>> +struct core_regset_section i386_linux_regset_sections[] =
>
> Why do you make this non-static?

I need to change the size of the .reg-xstate section from i386-linux-nat.c.

>>  {
>>    { ".reg", 144, "general-purpose" },
>>    { ".reg2", 108, "floating-point" },
>>    { ".reg-xfp", 512, "extended floating-point" },
>> +  { ".reg-xstate", 0, "XSAVE extended state" },
>>    { NULL, 0 }
>>  };
>> @@ -560,6 +566,66 @@ static int i386_linux_sc_reg_offset[] =
>>    0 * 4                              /* %gs */
>>  };
>>
>> +/* Update XSAVE extended state register note section.  */
>> +
>> +void
>> +i386_linux_update_xstateregset
>> +  (struct core_regset_section *regset_sections, unsigned int xstate_size)
>> +{
>> +  int i;
>> +
>> +  /* Update the XSAVE extended state register note section for "gcore".
>> +     Disable it if its size is 0.  */
>> +  for (i = 0; regset_sections[i].sect_name != NULL; i++)
>> +    if (strcmp (regset_sections[i].sect_name, ".reg-xstate") == 0)
>> +      {
>> +     if (xstate_size)
>> +       regset_sections[i].size = xstate_size;
>> +     else
>> +       regset_sections[i].sect_name = NULL;
>> +     break;
>> +      }
>> +}
>
> What will happen if you have a single GDB connected to two different
> remote targets, one with AVX support and one without?

The size of the .reg-xstate section is used only for native gcore and
won't be used for remote targets.

>> +/* Get XSAVE extended state xcr0 from core dump.  */
>> +
>> +unsigned long long
>> +i386_linux_core_read_xcr0 (struct gdbarch *gdbarch,
>> +                        struct target_ops *target, bfd *abfd)
>
> If you follow my advice about using uint64_t for xcr0, the return value
> will have to be adjusted.

I will make the change.

>> +{
>> +  asection *xstate = bfd_get_section_by_name (abfd, ".reg-xstate");
>> +  unsigned long long xcr0;
>> +
>> +  if (xstate)
>> +    {
>> +      size_t size = bfd_section_size (abfd, xstate);
>> +
>> +      gdb_assert (size >= I386_XSTATE_SSE_SIZE);
>
> Isn't a gdb_assert() here a bit harsh?  What happens if you simply return 0?

I will remove it. If the size < I386_XSTATE_SSE_SIZE, a warning will be issued
and 0 will be returned.

>> +      /* Check extended state size.  */
>> +      if (size < I386_XSTATE_AVX_SIZE)
>> +     xcr0 = I386_XSTATE_SSE_MASK;
>> +      else
>> +     {
>> +       char contents[8];
>> +
>> +       if (! bfd_get_section_contents (abfd, xstate, contents,
>> +                                       (file_ptr) I386_LINUX_XSAVE_XCR0_OFFSET,
>> +                                       8))
>
> Is that cast really necessary?

I was just following tradition. Most bfd_get_section_contents calls have
a (file_ptr) cast. It may be there to avoid 32-bit vs. 64-bit VMA warnings.

>> +  Same memory layout will be used for the coredump NT_X86_XSTATE
>> +  representing the XSAVE extended state registers.
>> +
>> +  The first 8 bytes of the sw_usable_bytes[464..467] is set to OS enabled
>> +  enabled state mask,  which is same as the 64bit mask returned by the
>> +  xgetbv's XCR0). We can use this mask as well as the mask saved in the
>> +  xstate_hdr bytes to interpret what states the processor/OS supports and
>> +  what state is in, used/initialized conditions, for the particular
>> +  process/thread.  */
>
> Can you ask a native English speaker to look at this comment?

I will see what I can do.

>> diff --git a/gdb/i386-tdep.c b/gdb/i386-tdep.c
>> index 05afa56..8ced34a 100644
>> --- a/gdb/i386-tdep.c
>> +++ b/gdb/i386-tdep.c
>> @@ -2183,6 +2241,59 @@ i387_ext_type (struct gdbarch *gdbarch)
>>    return tdep->i387_ext_type;
>>  }
>>
>> +/* Construct vector type for pseudo XMM registers.  We can't use
>> +   tdesc_find_type since XMM isn't described in target description.  */
>
> I'm confused here.  If you have a non-AVX target, why do you need a 256-bit vector type?

i386_ymm_type is only called from

  else if (i386_ymm_regnum_p (gdbarch, regnum))
    return i386_ymm_type (gdbarch);

It won't be called if you have a non-AVX target.

>> +static struct type *
>> +i386_ymm_type (struct gdbarch *gdbarch)
>> +{
..
>>    if (i386_mmx_regnum_p (gdbarch, regnum))
>>      return i386_mmx_type (gdbarch);
>> +  else if (i386_ymm_regnum_p (gdbarch, regnum))
>> +    return i386_ymm_type (gdbarch);
>>    else
...
>> +       /* ... Write lower 16byte.  */
>> +       regcache_raw_write (regcache,
>> +                          I387_XMM0_REGNUM (tdep) + regnum,
>> +                          buf);
>> +       /* ... Write upper 16byte.  */
>> +       regcache_raw_write (regcache,
>> +                          tdep->ymm0h_regnum + regnum,
>> +                          buf + 16);
>
> Could you change the comments here to say 128-bit instead of 16byte?

I will make the change.

>> @@ -5649,7 +5836,8 @@ i386_validate_tdesc_p (struct gdbarch_tdep *tdep,
>>                      struct tdesc_arch_data *tdesc_data)
>>  {
>>    const struct target_desc *tdesc = tdep->tdesc;
>> -  const struct tdesc_feature *feature_core, *feature_vector;
>> +  const struct tdesc_feature *feature_core;
>> +  const struct tdesc_feature *feature_sse, *feature_avx;
>>    int i, num_regs, valid_p;
>>
>>    if (! tdesc_has_registers (tdesc))
>> @@ -5659,13 +5847,37 @@ i386_validate_tdesc_p (struct gdbarch_tdep *tdep,
>>    feature_core = tdesc_find_feature (tdesc, "org.gnu.gdb.i386.core");
>>
>>    /* Get SSE registers.  */
>> -  feature_vector = tdesc_find_feature (tdesc, "org.gnu.gdb.i386.sse");
>> +  feature_sse = tdesc_find_feature (tdesc, "org.gnu.gdb.i386.sse");
>>
>> -  if (feature_core == NULL || feature_vector == NULL)
>> +  if (feature_core == NULL || feature_sse == NULL)
>>      return 0;
>>
>> +  /* Try AVX registers.  */
>> +  feature_avx = tdesc_find_feature (tdesc, "org.gnu.gdb.i386.avx");
>> +
>>    valid_p = 1;
>>
>> +  /* The XCR0 bits.  */
>> +  if (feature_avx)
>> +    {
>> +      tdep->xcr0 = I386_XSTATE_AVX_MASK;
>> +
>> +      /* It may be set by ABI-specific.  */
>
> Sorry, but this comment makes no sense to me.

I will update it.

>> @@ -5854,9 +6071,13 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
>>    set_tdesc_pseudo_register_type (gdbarch, i386_pseudo_register_type);
>>    set_tdesc_pseudo_register_name (gdbarch, i386_pseudo_register_name);
>>
>> -  /* The default ABI includes general-purpose registers,
>> -     floating-point registers, and the SSE registers.  */
>> -  set_gdbarch_num_regs (gdbarch, I386_SSE_NUM_REGS);
>> +  /* Override the normal target description method to make the AVX
>> +     upper halves anonymous.  */
>> +  set_gdbarch_register_name (gdbarch, i386_register_name);
>> +
>> +  /* The default ABI includes general-purpose registers, floating-point
>> +     registers, the SSE registers and the upper AVX registers.  */
>> +  set_gdbarch_num_regs (gdbarch, I386_AVX_NUM_REGS);
>
> Isn't it better to leave the AVX registers out of the default target,
> and only provide them if we're talking to a target (native or remote)
> that indicates it supports them?

That is set to a value high enough to support AVX. The actual number
of registers will be set properly later. See:

http://sourceware.org/ml/gdb-patches/2010-02/msg00709.html

>> @@ -5940,6 +6177,9 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
>>    set_gdbarch_fast_tracepoint_valid_at (gdbarch,
>>                                       i386_fast_tracepoint_valid_at);
>>
>> +  /* Tell remote stub that we support XML target description.  */
>> +  set_gdbarch_qsupported (gdbarch, "x86=xml");
>
>> @@ -146,9 +156,24 @@ struct gdbarch_tdep
>>    /* Number of SSE registers.  */
>>    int num_xmm_regs;
>>
>> +  /* Bits of the extended control register 0 (the XFEATURE_ENABLED_MASK
>> +     register), excluding the x87 bit, which are supported by this gdb.
>> +   */
>> +  unsigned long long xcr0;
>
> GDB should be capitalized.
>

I will make the change.

Thanks.

-- 
H.J.

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PATCH: 3/6 [2nd try]: Add AVX support (i386 changes)
  2010-03-27 16:09                 ` Mark Kettenis
@ 2010-03-28  1:39                   ` H.J. Lu
  0 siblings, 0 replies; 115+ messages in thread
From: H.J. Lu @ 2010-03-28  1:39 UTC (permalink / raw)
  To: Mark Kettenis; +Cc: gdb-patches

On Sat, Mar 27, 2010 at 9:09 AM, Mark Kettenis <mark.kettenis@xs4all.nl> wrote:
>> >> >> +    perror_with_name (_("Couldn't read extended state status"));
>> >> >> +
>> >> >> +  i387_supply_xsave (regcache, -1, xstateregs);
>> >> >> +  return 1;
>> >> >> +}
>> >> >> +
>> >> >> +/* Store all valid registers in GDB's register array covered by the
>> >> >> +   PTRACE_SETREGSET request into the process/thread specified by TID.
>> >> >> +   Return non-zero if successful, zero otherwise.  */
>> >> >> +
>> >> >> +static int
>> >> >> +store_xstateregs (const struct regcache *regcache, int tid, int regno)
>> >> >> +{
>> >> >> +  unsigned long long xstateregs[xstate_size_n_of_int64];
>> >> >
>> >> > I think it is better to use I386_XSTATE_MAX_SIZE here.
>> >>
>> >> That is how the kernel interface works.  Whatever value
>> >> I386_XSTATE_MAX_SIZE is today won't be the same tomorrow. We will
>> >> increase it in the coming years. But the same gdb binary will work
>> >> fine since kernel will only copy number of bytes specified in
>> >> iov.iov_len, which is all gdb cares/needs.
>> >
>> > Yes, you'll need to raise I386_XSTATE_MAX_SIZE whenever the kernel
>> > gains support for different/larger xstates.  But I don't see a problem
>> > with that, since you'll have to make changes to GDB to support those
>> > variants anyway.  That reminds me:
>>
>> I will remove I386_XSTATE_MAX_SIZE since it isn't needed by kernel.
>
> Huh?  You're missing the point here.  GDB is supposed to be written in
> C90, which doesn't support variable-length arrays.  So you need a
> compile-time constant to size the xstateregs array.  And
> I386_XSTATE_MAX_SIZE fits the bill there perfectly.
>

I will make the change.
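
A minimal sketch of the fixed-size approach, not the patch itself. The
SKETCH_ names are hypothetical, and the 832-byte maximum is an assumption
(512 bytes of legacy area, 64 bytes of header, 256 bytes for the upper YMM
halves). The array is sized by a compile-time constant, while iov_len
still limits how many bytes the kernel would copy:

```c
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>

/* Hypothetical stand-in for I386_XSTATE_MAX_SIZE (assumed value).  */
#define SKETCH_XSTATE_MAX_SIZE 832

/* Point IOV at BUF, which must hold SKETCH_XSTATE_MAX_SIZE bytes, and
   ask for WANTED bytes.  Return 1 on success, 0 if WANTED is larger
   than the buffer.  */
int
sketch_prepare_xstate_iov (struct iovec *iov, unsigned long long *buf,
                           size_t wanted)
{
  if (wanted > SKETCH_XSTATE_MAX_SIZE)
    return 0;
  memset (buf, 0, SKETCH_XSTATE_MAX_SIZE);
  iov->iov_base = buf;
  iov->iov_len = wanted;
  return 1;
}

/* Example caller: the array size is a constant expression, so no
   variable-length array is needed.  Returns the number of bytes that
   would be requested, or 0 if the size does not fit.  */
size_t
sketch_example (size_t kernel_xstate_size)
{
  unsigned long long xstateregs[SKETCH_XSTATE_MAX_SIZE
                                / sizeof (unsigned long long)];
  struct iovec iov;

  if (!sketch_prepare_xstate_iov (&iov, xstateregs, kernel_xstate_size))
    return 0;
  /* A real caller would now pass &iov to the PTRACE_GETREGSET request.  */
  return iov.iov_len;
}
```

The cost is that a kernel reporting a larger state than the constant has
to be rejected or truncated explicitly, which is the trade-off discussed
above.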

Thanks.

-- 
H.J.

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PATCH: 6/6 [2nd try]: Add AVX support (gdbserver changes)
  2010-03-28  1:11               ` H.J. Lu
@ 2010-03-28  7:55                 ` Pedro Alves
  2010-03-28 14:56                   ` H.J. Lu
  2010-03-28 16:40                   ` Daniel Jacobowitz
  2010-03-28 16:39                 ` Daniel Jacobowitz
  1 sibling, 2 replies; 115+ messages in thread
From: Pedro Alves @ 2010-03-28  7:55 UTC (permalink / raw)
  To: gdb-patches; +Cc: H.J. Lu

On Sunday 28 March 2010 02:11:31, H.J. Lu wrote:
> > I guess you haven't tested this one :-)  You may want to add an AVX
> > test to the testsuite, if it's not too much trouble.  You're checking
> > for the "x86=xml" feature in the target, but only calling the target
> > method for "x86:xstate=...".  I don't see how it could work.
> >
> > The problem we're solving by modifying qSupported is that older
> > versions of GDB, which do not support XML registers at all, assume
> > a specific layout for the g/G packet.  Newer versions, which do
> > support XML, will use whatever the target supplies.  So, you only want
> > the target to supply the registers via XML if GDB will understand
> > them.  Is that accurate?
> 
> Yes,
> 
> > If that's the scope of the problem, then how about we handle
> > this in a way we can reuse for other targets?  That doesn't have
> > to change the implementation; just rename the feature to
> > "xmlRegisters+".
> 
> I will make the change.

This (and the gdbarch_qsupported mechanism) worries me multi-arch
design wise.  There's a bootstrapping problem here.  GDB sends qSupported
to the target before knowing the target's target description.  The target
sends the target description based on qSupported.
As is, things only work correctly, when GDB already somehow knows the
arch is some sort of x86 _before_ connecting to the target.  That's
usually true if you give GDB a binary, but may not be true in some
use cases.

As a matter of example, if you have, say, a PPC --enable-targets=all
GDB build, and you simply do:

 $ gdb
 (gdb) tar rem :9999

to connect to an x86 Linux gdbserver, then, the x86 target will not
be sending the registers target description, because GDB wouldn't
be sending the "x86=xml" feature (the target_gdbarch would be
set to something not-x86 early in the connection, at the point
gdbarch_qsupported is called).  With the "xmlRegisters+" change,
it would be slightly even worse, as GDB would be sending a generic
"xmlRegisters+", meaning "Hello target, I understand xml register
descriptions for your arch", but, at a point when it may be
mistaken what is the target's arch, and the target would
have no way of knowing that.

It seems to me that GDB should be sending "x86=xml" or something
similar to the target unconditionally of whatever target_gdbarch is
before having fetched the target description.

What do you guys think?  Did I miss this being discussed?

-- 
Pedro Alves

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PATCH: 3/6 [2nd try]: Add AVX support (i386 changes)
  2010-03-28  1:37         ` H.J. Lu
@ 2010-03-28 11:55           ` Mark Kettenis
  2010-03-28 14:25             ` H.J. Lu
  0 siblings, 1 reply; 115+ messages in thread
From: Mark Kettenis @ 2010-03-28 11:55 UTC (permalink / raw)
  To: hjl.tools; +Cc: gdb-patches

> Date: Sat, 27 Mar 2010 18:37:41 -0700
> From: "H.J. Lu" <hjl.tools@gmail.com>
>
> >> diff --git a/gdb/i386-linux-tdep.c b/gdb/i386-linux-tdep.c
> >> index b23c109..66ecf84 100644
> >> --- a/gdb/i386-linux-tdep.c
> >> +++ b/gdb/i386-linux-tdep.c
> >> +#include "i387-tdep.h"
> >> +#include "i386-xstate.h"
> >> +
> >>  /* The syscall's XML filename for i386.  */
> >>  #define XML_SYSCALL_FILENAME_I386 "syscalls/i386-linux.xml"
> >>
> >> @@ -47,13 +50,15 @@
> >>  #include <stdint.h>
> >>
> >>  #include "features/i386/i386-linux.c"
> >> +#include "features/i386/i386-avx-linux.c"
> >>
> >>  /* Supported register note sections.  */
> >> -static struct core_regset_section i386_linux_regset_sections[] =
> >> +struct core_regset_section i386_linux_regset_sections[] =
> >
> > Why do you make this non-static?
> 
> I need to change size of .reg-xstate section from i386-linux-nat.c.

But then, why do you have the i386_linux_update_xstateregset() function
if you still need to pass the array itself around?

Anyway, how about setting the size of the .reg-xstate to
I386_XSTATE_SSE_SIZE unconditionally?  Tools will look at xcr0 value
encoded in there to determine what information in there is valid, so
dumping a little bit more than strictly necessary shouldn't be a
problem.

It would simplify things a bit.  Less code is good!
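
As a rough sketch of how a tool could make that decision from the note
contents: the function below is hypothetical, and the constants are
assumptions that mirror the description earlier in this thread (XCR0 in
the first 8 bytes of the software-usable area at byte offset 464; bit 0
for x87, bit 1 for SSE, bit 2 for AVX; 832 bytes for a state that
includes AVX).

```c
#include <stddef.h>

/* Assumed XCR0 feature bits and layout constants (hypothetical names).  */
#define SKETCH_XSTATE_X87 (1ULL << 0)
#define SKETCH_XSTATE_SSE (1ULL << 1)
#define SKETCH_XSTATE_AVX (1ULL << 2)
#define SKETCH_XSTATE_SSE_MASK (SKETCH_XSTATE_X87 | SKETCH_XSTATE_SSE)
#define SKETCH_XSTATE_AVX_SIZE 832
#define SKETCH_XCR0_OFFSET 464

/* Return the XCR0 value recorded in an XSAVE image of SIZE bytes.  An
   image too small to hold AVX state is treated as x87 + SSE only.  */
unsigned long long
sketch_xcr0_from_xsave (const unsigned char *xsave, size_t size)
{
  unsigned long long xcr0 = 0;
  int i;

  if (size < SKETCH_XSTATE_AVX_SIZE)
    return SKETCH_XSTATE_SSE_MASK;

  /* Decode 8 little-endian bytes, most significant byte first.  */
  for (i = 7; i >= 0; i--)
    xcr0 = (xcr0 << 8) | xsave[SKETCH_XCR0_OFFSET + i];
  return xcr0;
}
```

A reader of the note would then test (xcr0 & SKETCH_XSTATE_AVX) before
trusting the upper YMM halves.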

> >>  {
> >>    { ".reg", 144, "general-purpose" },
> >>    { ".reg2", 108, "floating-point" },
> >>    { ".reg-xfp", 512, "extended floating-point" },
> >> +  { ".reg-xstate", 0, "XSAVE extended state" },
> >>    { NULL, 0 }
> >>  };
> >> @@ -560,6 +566,66 @@ static int i386_linux_sc_reg_offset[] =
> >>    0 * 4                              /* %gs */
> >>  };
> >>
> >> +/* Update XSAVE extended state register note section.  */
> >> +
> >> +void
> >> +i386_linux_update_xstateregset
> >> +  (struct core_regset_section *regset_sections, unsigned int xstate_size)
> >> +{
> >> +  int i;
> >> +
> >> +  /* Update the XSAVE extended state register note section for "gcore".
> >> +     Disable it if its size is 0.  */
> >> +  for (i = 0; regset_sections[i].sect_name != NULL; i++)
> >> +    if (strcmp (regset_sections[i].sect_name, ".reg-xstate") == 0)
> >> +      {
> >> +     if (xstate_size)
> >> +       regset_sections[i].size = xstate_size;
> >> +     else
> >> +       regset_sections[i].sect_name = NULL;
> >> +     break;
> >> +      }
> >> +}
> >
> > What will happen if you have a single GDB connected to two different
> > remote targets, one with AVX support and one without?
> 
> The size of .reg-xstate section is used only for native gcore and
> won't be used for remote targets.

Ugh, yes you're right, gcore is a native-only feature.

> >> +      /* Check extended state size.  */
> >> +      if (size < I386_XSTATE_AVX_SIZE)
> >> +     xcr0 = I386_XSTATE_SSE_MASK;
> >> +      else
> >> +     {
> >> +       char contents[8];
> >> +
> >> +       if (! bfd_get_section_contents (abfd, xstate, contents,
> >> +                                       (file_ptr) I386_LINUX_XSAVE_XCR0_OFFSET,
> >> +                                       8))
> >
> > Is that cast really necessary?
> 
> I just follow the tradition. Most of bfd_get_section_contents calls have
> (file_ptr) cast. It may be used to avoid 32bit vs 64bit VMA warning.

Please don't use casts when they're not absolutely necessary; they
tend to hide bugs.

> >> diff --git a/gdb/i386-tdep.c b/gdb/i386-tdep.c
> >> index 05afa56..8ced34a 100644
> >> --- a/gdb/i386-tdep.c
> >> +++ b/gdb/i386-tdep.c
> >> @@ -2183,6 +2241,59 @@ i387_ext_type (struct gdbarch *gdbarch)
> >>    return tdep->i387_ext_type;
> >>  }
> >>
> >> +/* Construct vector type for pseudo XMM registers.  We can't use
> >> +   tdesc_find_type since XMM isn't described in target description.  */
> >
> > I'm confused here.  If you have a non-AVX target, why do you need a 256-bit vector type?
> 
> i386_ymm_type is only called from
> 
>   else if (i386_ymm_regnum_p (gdbarch, regnum))
>     return i386_ymm_type (gdbarch);
> 
> It won't be called if you have a non-AVX target.

Sorry; that confuses me even more.  Let me try to explain again what
puzzles me.  The pseudo XMM registers are 128-bit, so why are you
building a 256-bit type?  Is the problem simply that the comment is
wrong and you're talking about pseudo YMM registers here?
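
For reference, a minimal sketch of the relationship in question, assuming
the 256-bit pseudo register is assembled from a 128-bit XMM raw register
(low half) and a separate raw register holding the upper 128 bits. The
function name and the byte-array representation are hypothetical:

```c
#include <string.h>

/* Build the 32-byte YMM value from the 16-byte XMM register (low half)
   and the 16-byte upper half.  On little-endian x86 the low half comes
   first in memory.  */
void
sketch_compose_ymm (const unsigned char xmm[16],
                    const unsigned char ymmh[16],
                    unsigned char ymm[32])
{
  memcpy (ymm, xmm, 16);
  memcpy (ymm + 16, ymmh, 16);
}
```

Under that assumption, only the combined value needs a 256-bit vector
type; the XMM registers keep their existing 128-bit type.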

> >> @@ -5854,9 +6071,13 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
> >>    set_tdesc_pseudo_register_type (gdbarch, i386_pseudo_register_type);
> >>    set_tdesc_pseudo_register_name (gdbarch, i386_pseudo_register_name);
> >>
> >> -  /* The default ABI includes general-purpose registers,
> >> -     floating-point registers, and the SSE registers.  */
> >> -  set_gdbarch_num_regs (gdbarch, I386_SSE_NUM_REGS);
> >> +  /* Override the normal target description method to make the AVX
> >> +     upper halves anonymous.  */
> >> +  set_gdbarch_register_name (gdbarch, i386_register_name);
> >> +
> >> +  /* The default ABI includes general-purpose registers, floating-point
> >> +     registers, the SSE registers and the upper AVX registers.  */
> >> +  set_gdbarch_num_regs (gdbarch, I386_AVX_NUM_REGS);
> >
> > Isn't it better to leave the AVX registers out of the default target,
> > and only provide them if we're talking to a target (native or remote)
> > that indicates it supports them?
> 
> That is set to a value high enough to support AVX. The actual number
> of registers will be set properly later. See:

OK, then please adjust the comment to say something like:

    /* Even though the default ABI only includes general-purpose registers,
       floating-point registers and the SSE registers, we have to leave a
       gap for the upper AVX registers.  */

Thanks,

Mark

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PATCH: 3/6 [2nd try]: Add AVX support (i386 changes)
  2010-03-28 11:55           ` Mark Kettenis
@ 2010-03-28 14:25             ` H.J. Lu
  2010-03-29 20:32               ` Mark Kettenis
  0 siblings, 1 reply; 115+ messages in thread
From: H.J. Lu @ 2010-03-28 14:25 UTC (permalink / raw)
  To: Mark Kettenis; +Cc: gdb-patches

On Sun, Mar 28, 2010 at 4:55 AM, Mark Kettenis <mark.kettenis@xs4all.nl> wrote:
>> Date: Sat, 27 Mar 2010 18:37:41 -0700
>> From: "H.J. Lu" <hjl.tools@gmail.com>
>>
>> >> diff --git a/gdb/i386-linux-tdep.c b/gdb/i386-linux-tdep.c
>> >> index b23c109..66ecf84 100644
>> >> --- a/gdb/i386-linux-tdep.c
>> >> +++ b/gdb/i386-linux-tdep.c
>> >> +#include "i387-tdep.h"
>> >> +#include "i386-xstate.h"
>> >> +
>> >>  /* The syscall's XML filename for i386.  */
>> >>  #define XML_SYSCALL_FILENAME_I386 "syscalls/i386-linux.xml"
>> >>
>> >> @@ -47,13 +50,15 @@
>> >>  #include <stdint.h>
>> >>
>> >>  #include "features/i386/i386-linux.c"
>> >> +#include "features/i386/i386-avx-linux.c"
>> >>
>> >>  /* Supported register note sections.  */
>> >> -static struct core_regset_section i386_linux_regset_sections[] =
>> >> +struct core_regset_section i386_linux_regset_sections[] =
>> >
>> > Why do you make this non-static?
>>
>> I need to change size of .reg-xstate section from i386-linux-nat.c.
>
> But then, why do you have the i386_linux_update_xstateregset() function
> if you still need to pass the array itself around?

i386-linux-nat.c calls i386_linux_update_xstateregset with
i386_linux_regset_sections, and amd64-linux-nat.c calls
i386_linux_update_xstateregset with amd64_linux_regset_sections.
If I don't make amd64_linux_regset_sections and
i386_linux_regset_sections global, I have to write both
i386_linux_update_xstateregset and amd64_linux_update_xstateregset.
The only difference between the two functions would be
amd64_linux_regset_sections vs. i386_linux_regset_sections.

> Anyway, how about setting the size of the .reg-xstate to
> I386_XSTATE_SSE_SIZE unconditionally?  Tools will look at xcr0 value
> encoded in there to determine what information in there is valid, so
> dumping a little bit more than strictly necessary shouldn't be a
> problem.

That will make the code more complex since the generic gcore
implementation will have to adjust section size based on XCR0.
But if it is what is required, I will make the change.

> It would simplify things a bit.  Less code is good!
>
>
>> >> +      /* Check extended state size.  */
>> >> +      if (size < I386_XSTATE_AVX_SIZE)
>> >> +     xcr0 = I386_XSTATE_SSE_MASK;
>> >> +      else
>> >> +     {
>> >> +       char contents[8];
>> >> +
>> >> +       if (! bfd_get_section_contents (abfd, xstate, contents,
>> >> +                                       (file_ptr) I386_LINUX_XSAVE_XCR0_OFFSET,
>> >> +                                       8))
>> >
>> > Is that cast really necessary?
>>
>> I just follow the tradition. Most of bfd_get_section_contents calls have
>> (file_ptr) cast. It may be used to avoid 32bit vs 64bit VMA warning.
>
> Please don't use casts when they're not absolutely necessary; they
> tend to hide bugs.

I will make the change.

>> >> diff --git a/gdb/i386-tdep.c b/gdb/i386-tdep.c
>> >> index 05afa56..8ced34a 100644
>> >> --- a/gdb/i386-tdep.c
>> >> +++ b/gdb/i386-tdep.c
>> >> @@ -2183,6 +2241,59 @@ i387_ext_type (struct gdbarch *gdbarch)
>> >>    return tdep->i387_ext_type;
>> >>  }
>> >>
>> >> +/* Construct vector type for pseudo XMM registers.  We can't use
>> >> +   tdesc_find_type since XMM isn't described in target description.  */
>> >
>> > I'm confused here.  If you have a non-AVX target, why do you need a 256-bit vector type?
>>
>> i386_ymm_type is only called from
>>
>>   else if (i386_ymm_regnum_p (gdbarch, regnum))
>>     return i386_ymm_type (gdbarch);
>>
>> It won't be called if you have a non-AVX target.
>
> Sorry; that confuses me even more.  Let me try to explain again what
> puzzles me.  The pseudo XMM registers are 128-bit, so why are you
> building a 256-bit type?  Is the problem simply that the comment is
> wrong and you're talking about pseudo YMM registers here?

Ooops. I meant "pseudo YMM registers". I will update comments.

>> >> @@ -5854,9 +6071,13 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
>> >>    set_tdesc_pseudo_register_type (gdbarch, i386_pseudo_register_type);
>> >>    set_tdesc_pseudo_register_name (gdbarch, i386_pseudo_register_name);
>> >>
>> >> -  /* The default ABI includes general-purpose registers,
>> >> -     floating-point registers, and the SSE registers.  */
>> >> -  set_gdbarch_num_regs (gdbarch, I386_SSE_NUM_REGS);
>> >> +  /* Override the normal target description method to make the AVX
>> >> +     upper halves anonymous.  */
>> >> +  set_gdbarch_register_name (gdbarch, i386_register_name);
>> >> +
>> >> +  /* The default ABI includes general-purpose registers, floating-point
>> >> +     registers, the SSE registers and the upper AVX registers.  */
>> >> +  set_gdbarch_num_regs (gdbarch, I386_AVX_NUM_REGS);
>> >
>> > Isn't it better to leave the AVX registers out of the default target,
>> > and only provide them if we're talking to a target (native or remote)
>> > that indicates it supports them?
>>
>> That is set to a value high enough to support AVX. The actual number
>> of registers will be set properly later. See:
>
> OK, then please adjust the comment to say something like:
>
>    /* Even though the default ABI only includes general-purpose registers,
>       floating-point registers and the SSE registers, we have to leave a
>       gap for the upper AVX registers.  */
>

I will make the change.

Thanks.

-- 
H.J.

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PATCH: 6/6 [2nd try]: Add AVX support (gdbserver changes)
  2010-03-28  7:55                 ` Pedro Alves
@ 2010-03-28 14:56                   ` H.J. Lu
  2010-03-28 16:17                     ` Pedro Alves
  2010-03-28 16:40                   ` Daniel Jacobowitz
  1 sibling, 1 reply; 115+ messages in thread
From: H.J. Lu @ 2010-03-28 14:56 UTC (permalink / raw)
  To: Pedro Alves; +Cc: gdb-patches

On Sun, Mar 28, 2010 at 12:55 AM, Pedro Alves <pedro@codesourcery.com> wrote:
> On Sunday 28 March 2010 02:11:31, H.J. Lu wrote:
>> > I guess you haven't tested this one :-)  You may want to add an AVX
>> > test to the testsuite, if it's not too much trouble.  You're checking
>> > for the "x86=xml" feature in the target, but only calling the target
>> > method for "x86:xstate=...".  I don't see how it could work.
>> >
>> > The problem we're solving by modifying qSupported is that older
>> > versions of GDB, which do not support XML registers at all, assume
>> > a specific layout for the g/G packet.  Newer versions, which do
>> > support XML, will use whatever the target supplies.  So, you only want
>> > the target to supply the registers via XML if GDB will understand
>> > them.  Is that accurate?
>>
>> Yes,
>>
>> > If that's the scope of the problem, then how about we handle
>> > this in a way we can reuse for other targets?  That doesn't have
>> > to change the implementation; just rename the feature to
>> > "xmlRegisters+".
>>
>> I will make the change.
>
> This (and the gdbarch_qsupported mechanism) worries me multi-arch
> design wise.  There's a bootstrapping problem here.  GDB sends qSupported
> to the target before knowing the target's target description.  The target
> sends the target description based on qSupported.
> As is, things only work correctly, when GDB already somehow knows the
> arch is some sort of x86 _before_ connecting to the target.  That's
> usually true if you give GDB a binary, but may not be true in some
> use cases.
>
> As a matter of example, if you have, say, a PPC --enable-targets=all
> GDB build, and you simply do:
>
>  $ gdb
>  (gdb) tar rem :9999
>
> to connect to an x86 Linux gdbserver, then, the x86 target will not
> be sending the registers target description, because GDB wouldn't
> be sending the "x86=xml" feature (the target_gdbarch would be
> set to something not-x86 early in the connection, at the point
> gdbarch_qsupported is called).  With the "xmlRegisters+" change,
> it would be slightly even worse, as GDB would be sending a generic
> "xmlRegisters+", meaning "Hello target, I understand xml register
> descriptions for your arch", but, at a point when it may be
> mistaken what is the target's arch, and the target would
> have no way of knowing that.
>
> It seems to me that GDB should be sending "x86=xml" or something
> similar to the target unconditionally of whatever target_gdbarch is
> before having fetched the target description.
>

I think current_target should be set to something sensible before
sending qSupported. It should match arch and OSABI of the executable.

-- 
H.J.

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PATCH: 6/6 [2nd try]: Add AVX support (gdbserver changes)
  2010-03-28 14:56                   ` H.J. Lu
@ 2010-03-28 16:17                     ` Pedro Alves
  2010-03-28 16:37                       ` H.J. Lu
  0 siblings, 1 reply; 115+ messages in thread
From: Pedro Alves @ 2010-03-28 16:17 UTC (permalink / raw)
  To: H.J. Lu; +Cc: gdb-patches

On Sunday 28 March 2010 15:56:17, H.J. Lu wrote:
> On Sun, Mar 28, 2010 at 12:55 AM, Pedro Alves <pedro@codesourcery.com> wrote:
> > On Sunday 28 March 2010 02:11:31, H.J. Lu wrote:
> >> > I guess you haven't tested this one :-)  You may want to add an AVX
> >> > test to the testsuite, if it's not too much trouble.  You're checking
> >> > for the "x86=xml" feature in the target, but only calling the target
> >> > method for "x86:xstate=...".  I don't see how it could work.
> >> >
> >> > The problem we're solving by modifying qSupported is that older
> >> > versions of GDB, which do not support XML registers at all, assume
> >> > a specific layout for the g/G packet.  Newer versions, which do
> >> > support XML, will use whatever the target supplies.  So, you only want
> >> > the target to supply the registers via XML if GDB will understand
> >> > them.  Is that accurate?
> >>
> >> Yes,
> >>
> >> > If that's the scope of the problem, then how about we handle
> >> > this in a way we can reuse for other targets?  That doesn't have
> >> > to change the implementation; just rename the feature to
> >> > "xmlRegisters+".
> >>
> >> I will make the change.
> >
> > This (and the gdbarch_qsupported mechanism) worries me multi-arch
> > design wise.  There's a bootstrapping problem here.  GDB sends qSupported
> > to the target before knowing the target's target description.  The target
> > sends the target description based on qSupported.
> > As is, things only work correctly, when GDB already somehow knows the
> > arch is some sort of x86 _before_ connecting to the target.  That's
> > usually true if you give GDB a binary, but may not be true in some
> > use cases.
> >
> > As a matter of example, if you have, say, a PPC --enable-targets=all
> > GDB build, and you simply do:
> >
> >  $ gdb
> >  (gdb) tar rem :9999
> >
> > to connect to an x86 Linux gdbserver, then, the x86 target will not
> > be sending the registers target description, because GDB wouldn't
> > be sending the "x86=xml" feature (the target_gdbarch would be
> > set to something not-x86 early in the connection, at the point
> > gdbarch_qsupported is called).  With the "xmlRegisters+" change,
> > it would be slightly even worse, as GDB would be sending a generic
> > "xmlRegisters+", meaning "Hello target, I understand xml register
> > descriptions for your arch", but, at a point when it may be
> > mistaken what is the target's arch, and the target would
> > have no way of knowing that.
> >
> > It seems to me that GDB should be sending "x86=xml" or something
> > similar to the target unconditionally of whatever target_gdbarch is
> > before having fetched the target description.
> >
> 
> I think current_target should be set to something sensible before
> sending qSupported. It should match arch and OSABI of the executable.

I can't agree with that.  That's against the goal of having the target
fully self describe to GDB.  If that were true, then why would we
support target descriptions that describe the OSABI?
As I said and illustrated above, you may not have a binary loaded in GDB
at all.  A design that assumes you have, can't be correct in all
supported cases.  GDB supports at least one x86 target that doesn't even
have a notion of executables, only shared libraries --- DICOS.  I wouldn't
want users of a non-x86 GDB build that supported that target to have
to do "set architecture i386" or similar before connecting to be
able to access the full register set as described by the target.

What are your worries with doing something as I suggested?

[To clear up confusions, this is about target_gdbarch, not
current_target.  The current_target is always target
remote / remote.c]

-- 
Pedro Alves

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PATCH: 6/6 [2nd try]: Add AVX support (gdbserver changes)
  2010-03-28 16:17                     ` Pedro Alves
@ 2010-03-28 16:37                       ` H.J. Lu
  0 siblings, 0 replies; 115+ messages in thread
From: H.J. Lu @ 2010-03-28 16:37 UTC (permalink / raw)
  To: Pedro Alves; +Cc: gdb-patches

On Sun, Mar 28, 2010 at 9:17 AM, Pedro Alves <pedro@codesourcery.com> wrote:
> On Sunday 28 March 2010 15:56:17, H.J. Lu wrote:
>> On Sun, Mar 28, 2010 at 12:55 AM, Pedro Alves <pedro@codesourcery.com> wrote:
>> > On Sunday 28 March 2010 02:11:31, H.J. Lu wrote:
>> >> > I guess you haven't tested this one :-)  You may want to add an AVX
>> >> > test to the testsuite, if it's not too much trouble.  You're checking
>> >> > for the "x86=xml" feature in the target, but only calling the target
>> >> > method for "x86:xstate=...".  I don't see how it could work.
>> >> >
>> >> > The problem we're solving by modifying qSupported is that older
>> >> > versions of GDB, which do not support XML registers at all, assume
>> >> > a specific layout for the g/G packet.  Newer versions, which do
>> >> > support XML, will use whatever the target supplies.  So, you only want
>> >> > the target to supply the registers via XML if GDB will understand
>> >> > them.  Is that accurate?
>> >>
>> >> Yes,
>> >>
>> >> > If that's the scope of the problem, then how about we handle
>> >> > this in a way we can reuse for other targets?  That doesn't have
>> >> > to change the implementation; just rename the feature to
>> >> > "xmlRegisters+".
>> >>
>> >> I will make the change.
>> >
>> > This (and the gdbarch_qsupported mechanism) worries me multi-arch
>> > design wise.  There's a bootstrapping problem here.  GDB sends qSupported
>> > to the target before knowing the target's target description.  The target
>> > sends the target description based on qSupported.
>> > As is, things only work correctly, when GDB already somehow knows the
>> > arch is some sort of x86 _before_ connecting to the target.  That's
>> > usually true if you give GDB a binary, but may not be true in some
>> > use cases.
>> >
>> > As a matter of example, if you have, say, a PPC --enable-targets=all
>> > GDB build, and you simply do:
>> >
>> >  $ gdb
>> >  (gdb) tar rem :9999
>> >
>> > to connect to an x86 Linux gdbserver, then, the x86 target will not
>> > be sending the registers target description, because GDB wouldn't
>> > be sending the "x86=xml" feature (the target_gdbarch would be
>> > set to something not-x86 early in the connection, at the point
>> > gdbarch_qsupported is called).  With the "xmlRegisters+" change,
>> > it would be slightly even worse, as GDB would be sending a generic
>> > "xmlRegisters+", meaning "Hello target, I understand xml register
>> > descriptions for your arch", but, at a point when it may be
>> > mistaken what is the target's arch, and the target would
>> > have no way of knowing that.
>> >
>> > It seems to me that GDB should be sending "x86=xml" or something
>> > similar to the target unconditionally of whatever target_gdbarch is
>> > before having fetched the target description.
>> >
>>
>> I think current_target should be set to something sensible before
>> sending qSupported. It should match arch and OSABI of the executable.
>
> I can't agree with that.  That's against the goal of having the target
> fully self describe to GDB.  If that were true, then why would we
> support target descriptions that describe the OSABI?
> As I said and illustrated above, you may not have a binary loaded in GDB
> at all.  A design that assumes you have, can't be correct in all
> supported cases.  GDB supports at least one x86 target that doesn't even
> have a notion of executables, only shared libraries --- DICOS.  I wouldn't
> want users of a non-x86 GDB build that supported that target to have
> to do "set architecture i386" or similar before connecting to be
> able to access the full register set as described by the target.
>
> What are your worries with doing something as I suggested?
>
> [To clear up confusions, this is about target_gdbarch, not
> current_target.  The current_target is always target
> remote / remote.c]
>

I guess it may be OK to always send "xmlRegisters+" to the gdb stub
and let each arch decide what to do.

One problem may be

1. We add XXX support to gdb 7.2.
2. We enable XML support for XXX in gdb 7.3.

What will happen when we run gdbserver from gdb 7.3 against
gdb 7.2? gdb will always send "xmlRegisters+" to the gdb stub,
which will send back XML files. But gdb 7.2 doesn't support them.


-- 
H.J.

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PATCH: 6/6 [2nd try]: Add AVX support (gdbserver changes)
  2010-03-28  1:11               ` H.J. Lu
  2010-03-28  7:55                 ` Pedro Alves
@ 2010-03-28 16:39                 ` Daniel Jacobowitz
  2010-03-28 19:31                   ` H.J. Lu
  1 sibling, 1 reply; 115+ messages in thread
From: Daniel Jacobowitz @ 2010-03-28 16:39 UTC (permalink / raw)
  To: H.J. Lu; +Cc: GDB

On Sat, Mar 27, 2010 at 06:11:31PM -0700, H.J. Lu wrote:
> I just follow the current format where SSE register set is marked with
> EXTENDED_REGS for i386 and FP_REGS for x86-64. I don't mind
> changing it to either of them for both i386 and x86-64. Just let me
> know which one I should use.

The only reason they're separate is that there was an FP_REGS already.
It doesn't make any difference to the implementation.  I suggest
EXTENDED_REGS unconditionally.

> Yes, I have tested them. The logic is in x86_linux_process_qsupported
> which will set XML target to AVX if AVX is supported.

Then you don't need to change USE_XML at all.

Your goal is not to turn off support for "XML".  Your goal is to not
report the AVX register description.  Before all your patches,
gdbserver would have reported a tiny target description that
contained the OSABI (e.g. "<osabi>GNU/Linux</osabi>").  That goes in
gdbserver_xmltarget using the "@" prefix that this function checks
for.

I suggest building such an older gdbserver, to see what it returns.

> > I think it'll work if you remove use_xml, and leave USE_XML alone.  If
> > GDB does not support XML, you can adjust gdbserver_xmltarget to report
> > just the architecture and OSABI the way it did before you added
> > register XML files.
> >
> 
> I don't know how gdbserver_xmltarget should be set if gdb doesn't support
> XML. My current approach is to turn off XML support at run-time even if
> USE_XML is 1 when gdb doesn't support XML.

Look at what USE_XML controls.  You do not need to turn off this block
of code.

-- 
Daniel Jacobowitz
CodeSourcery

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PATCH: 6/6 [2nd try]: Add AVX support (gdbserver changes)
  2010-03-28  7:55                 ` Pedro Alves
  2010-03-28 14:56                   ` H.J. Lu
@ 2010-03-28 16:40                   ` Daniel Jacobowitz
  2010-03-28 16:47                     ` Pedro Alves
  1 sibling, 1 reply; 115+ messages in thread
From: Daniel Jacobowitz @ 2010-03-28 16:40 UTC (permalink / raw)
  To: Pedro Alves; +Cc: gdb-patches, H.J. Lu

On Sun, Mar 28, 2010 at 07:55:44AM +0000, Pedro Alves wrote:
> This (and the gdbarch_qsupported mechanism) worries me multi-arch
> design wise.  There's a bootstrapping problem here.  GDB sends qSupported
> to the target before knowing the target's target description.  The target
> sends the target description based on qSupported.
> As is, things only work correctly, when GDB already somehow knows the
> arch is some sort of x86 _before_ connecting to the target.  That's
> usually true if you give GDB a binary, but may not be true in some
> use cases.

You're right.  I forgot about this; the design won't work.

Would "xmlRegisters=arm,x86" be a better solution?

If so, the way to implement that is to have a registration function.
i386-tdep.c:_initialize_i386_tdep can call a function in remote.c
to add "x86" to the list of xmlRegisters architectures.
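
A minimal sketch of the registration idea above, not the actual remote.c
code. The names register_xml_registers_arch and
build_xml_registers_feature, and the fixed-size table, are hypothetical
and only illustrate how each *-tdep.c initializer could contribute one
architecture name to the "xmlRegisters=" feature string:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical fixed-size table of architecture names.  */
#define MAX_XML_ARCHES 16

static const char *xml_arches[MAX_XML_ARCHES];
static int num_xml_arches;

/* Hypothetical registration hook.  An initializer such as
   _initialize_i386_tdep would call this once with "x86".
   Duplicate names are ignored.  */
void
register_xml_registers_arch (const char *name)
{
  int i;

  for (i = 0; i < num_xml_arches; i++)
    if (strcmp (xml_arches[i], name) == 0)
      return;

  if (num_xml_arches < MAX_XML_ARCHES)
    xml_arches[num_xml_arches++] = name;
}

/* Write "xmlRegisters=a,b,..." into BUF.  Return the string length,
   0 if nothing was registered, or -1 if BUF is too small.  */
int
build_xml_registers_feature (char *buf, size_t size)
{
  size_t used = 0;
  int i;

  if (size == 0)
    return -1;
  buf[0] = '\0';

  for (i = 0; i < num_xml_arches; i++)
    {
      int n = snprintf (buf + used, size - used, "%s%s",
			i == 0 ? "xmlRegisters=" : ",", xml_arches[i]);

      if (n < 0 || (size_t) n >= size - used)
	return -1;
      used += n;
    }

  return (int) used;
}
```

With this shape, remote.c needs no built-in knowledge of any
architecture; it only appends the built string to its qSupported
request.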

-- 
Daniel Jacobowitz
CodeSourcery


* Re: PATCH: 6/6 [2nd try]: Add AVX support (gdbserver changes)
  2010-03-28 16:40                   ` Daniel Jacobowitz
@ 2010-03-28 16:47                     ` Pedro Alves
  2010-03-28 20:53                       ` H.J. Lu
  0 siblings, 1 reply; 115+ messages in thread
From: Pedro Alves @ 2010-03-28 16:47 UTC (permalink / raw)
  To: Daniel Jacobowitz; +Cc: gdb-patches, H.J. Lu

On Sunday 28 March 2010 17:40:45, Daniel Jacobowitz wrote:
> You're right.  I forgot about this; the design won't work.
> 
> Would "xmlRegisters=arm,x86" be a better solution?

Yes, I think so.

> If so, the way to implement that is to have a registration function.
> i386-tdep.c:_initialize_i386_tdep can call a function in remote.c
> to add "x86" to the list of xmlRegisters architectures.

Sounds good to me.

-- 
Pedro Alves


* Re: PATCH: 6/6 [2nd try]: Add AVX support (gdbserver changes)
  2010-03-28 16:39                 ` Daniel Jacobowitz
@ 2010-03-28 19:31                   ` H.J. Lu
  0 siblings, 0 replies; 115+ messages in thread
From: H.J. Lu @ 2010-03-28 19:31 UTC (permalink / raw)
  To: GDB

On Sun, Mar 28, 2010 at 9:38 AM, Daniel Jacobowitz <dan@codesourcery.com> wrote:
> On Sat, Mar 27, 2010 at 06:11:31PM -0700, H.J. Lu wrote:
>> I just follow the current format where SSE register set is marked with
>> EXTENDED_REGS for i386 and FP_REGS for x86-64. I don't mind
>> changing it to either of them for both i386 and x86-64. Just let me
>> know which one I should use.
>
> The only reason they're separate is that there was an FP_REGS already.
> It doesn't make any difference to the implementation.  I suggest
> EXTENDED_REGS unconditionally.

I will make the change.

>> Yes, I have tested them. The logic is in x86_linux_process_qsupported
>> which will set XML target to AVX if AVX is supported.
>
> Then you don't need to change USE_XML at all.
>
> Your goal is not to turn off support for "XML".  Your goal is to not
> report the AVX register description.  Before all your patches,
> gdbserver would have reported a tiny target description that
> contained the OSABI (e.g. "<osabi>GNU/Linux</osabi>").  That goes in
> gdbserver_xmltarget using the "@" prefix that this function checks
> for.
>
> I suggest building such an older gdbserver, to see what it returns.
>
>> > I think it'll work if you remove use_xml, and leave USE_XML alone.  If
>> > GDB does not support XML, you can adjust gdbserver_xmltarget to report
>> > just the architecture and OSABI the way it did before you added
>> > register XML files.
>> >
>>
>> I don't know how gdbserver_xmltarget should be set if gdb doesn't support
>> XML. My current approach is to turn off XML support at run-time even if
>> USE_XML is 1 when gdb doesn't support XML.
>
> Look at what USE_XML controls.  You do not need to turn off this block
> of code.

I will make the change.

Thanks.

-- 
H.J.


* Re: PATCH: 6/6 [2nd try]: Add AVX support (gdbserver changes)
  2010-03-28 16:47                     ` Pedro Alves
@ 2010-03-28 20:53                       ` H.J. Lu
  2010-03-28 21:27                         ` Pedro Alves
  0 siblings, 1 reply; 115+ messages in thread
From: H.J. Lu @ 2010-03-28 20:53 UTC (permalink / raw)
  To: Pedro Alves; +Cc: Daniel Jacobowitz, gdb-patches

On Sun, Mar 28, 2010 at 9:46 AM, Pedro Alves <pedro@codesourcery.com> wrote:
> On Sunday 28 March 2010 17:40:45, Daniel Jacobowitz wrote:
>> You're right.  I forgot about this; the design won't work.
>>
>> Would "xmlRegisters=arm,x86" be a better solution?
>
> Yes, I think so.
>
>> If so, the way to implement that is to have a registration function.
>> i386-tdep.c:_initialize_i386_tdep can call a function in remote.c
>> to add "x86" to the list of xmlRegisters architectures.
>
> Sounds good to me.

A patch is posted at

http://sourceware.org/ml/gdb-patches/2010-03/msg00957.html


-- 
H.J.


* Re: PATCH: 6/6 [2nd try]: Add AVX support (gdbserver changes)
  2010-03-28 20:53                       ` H.J. Lu
@ 2010-03-28 21:27                         ` Pedro Alves
  0 siblings, 0 replies; 115+ messages in thread
From: Pedro Alves @ 2010-03-28 21:27 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Daniel Jacobowitz, gdb-patches

On Sunday 28 March 2010 21:53:07, H.J. Lu wrote:

> A patch is posted at
> 
> http://sourceware.org/ml/gdb-patches/2010-03/msg00957.html

Thank you.

-- 
Pedro Alves


* PATCH: 0/6 [3rd try]: Add AVX support
  2010-03-06 22:16 ` PATCH: 0/6 [2nd try]: " H.J. Lu
                     ` (2 preceding siblings ...)
  2010-03-27 16:16   ` Daniel Jacobowitz
@ 2010-03-29  0:16   ` H.J. Lu
  3 siblings, 0 replies; 115+ messages in thread
From: H.J. Lu @ 2010-03-29  0:16 UTC (permalink / raw)
  To: Cc

AVX registers are saved and restored via the XSAVE extended state. The
extended control register 0 (the XFEATURE_ENABLED_MASK register), XCR0,
determines which states (x87, SSE, AVX, ...) are supported in the XSAVE
extended state.  XCR0 can be read with the new "xgetbv" instruction.
The xstate_bv field at byte offset 512 in the XSAVE extended state
indicates which states the current process has in use. If a feature
bit is cleared, the corresponding registers should be read as 0. If we
update a register, we should set the corresponding feature bit in the
xstate_bv field.

We added PTRACE_GETREGSET and PTRACE_SETREGSET to the Linux kernel to
fetch and store AVX registers with ptrace. The Linux kernel also stores
XCR0 in the first 8 bytes of the software usable bytes, starting at
byte offset 464.
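
As a rough illustration of the layout described above, the sketch below
reads XCR0 (offset 464) and xstate_bv (offset 512) from a raw XSAVE
buffer such as the one PTRACE_GETREGSET fills in. The macro and
function names are made up for this example and are not the ones used
in the patches; the bit positions (x87 = bit 0, SSE = bit 1, AVX =
bit 2) follow the XCR0 definition.

```c
#include <stdint.h>

/* Offsets taken from the description above.  */
#define XSAVE_XCR0_OFFSET	464	/* Linux-specific software bytes.  */
#define XSAVE_XSTATE_BV_OFFSET	512	/* Start of the XSAVE header.  */

/* Feature bits shared by XCR0 and xstate_bv.  */
#define XSTATE_X87	(1ULL << 0)
#define XSTATE_SSE	(1ULL << 1)
#define XSTATE_AVX	(1ULL << 2)
#define XSTATE_AVX_MASK	(XSTATE_X87 | XSTATE_SSE | XSTATE_AVX)

/* Decode a little-endian 64-bit value without alignment assumptions.  */
static uint64_t
read_le64 (const unsigned char *p)
{
  uint64_t v = 0;
  int i;

  for (i = 7; i >= 0; i--)
    v = (v << 8) | p[i];
  return v;
}

/* XCR0 as saved by the Linux kernel in the XSAVE area.  */
uint64_t
xsave_xcr0 (const unsigned char *xsave)
{
  return read_le64 (xsave + XSAVE_XCR0_OFFSET);
}

/* Nonzero if x87, SSE and AVX state are all enabled in XCR0.  */
int
xsave_supports_avx (const unsigned char *xsave)
{
  return (xsave_xcr0 (xsave) & XSTATE_AVX_MASK) == XSTATE_AVX_MASK;
}

/* Nonzero if the AVX bit is set in xstate_bv.  When it is clear, the
   upper YMM halves should be read as 0.  */
int
xsave_avx_in_use (const unsigned char *xsave)
{
  uint64_t bv = read_le64 (xsave + XSAVE_XSTATE_BV_OFFSET);

  return (bv & XSTATE_AVX) != 0;
}
```

The two checks are deliberately separate: XCR0 says what the processor
and OS support, while xstate_bv says what this particular saved image
actually contains.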

There are 6 patches in total to add AVX support for Linux. The i387 and
XML patches are unchanged from the 2nd try:

http://sourceware.org/ml/gdb-patches/2010-03/msg00262.html
http://sourceware.org/ml/gdb-patches/2010-03/msg00266.html

They support:

1. The upper 128bit YMM registers are added for AVX support. The upper
128bit YMM registers are hidden from users. GDB combines the XMM register,
%xmmX, with the upper 128bit YMM register, %ymmXh, and presents the whole
256bit YMM register, %ymmX, as a pseudo register to users.
2. Backward compatible. If AVX isn't supported, SSE will be used.
3. Forward compatible. If new state beyond AVX is supported in
the XSAVE extended state, only AVX state will be used.
4. Remote gdb protocol extension. GDB will send "xmlRegisters=" in the
qSupported request packet to indicate that GDB supports the XML target
description.  The x86 gdb stub will send the XML target description if it
sees "xmlRegisters=" in the qSupported request packet.
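
For illustration only, here is one way a stub could test for this
feature. It assumes the usual qSupported layout
("qSupported:feature;feature;...") and the comma-separated value form
discussed earlier in this thread ("xmlRegisters=arm,x86"). The function
name qsupported_has_xml_arch is hypothetical and is not part of the
gdbserver patch:

```c
#include <string.h>

/* Return 1 if PACKET is a qSupported request whose "xmlRegisters="
   feature lists ARCH exactly, otherwise 0.  */
int
qsupported_has_xml_arch (const char *packet, const char *arch)
{
  static const char prefix[] = "qSupported:";
  static const char key[] = "xmlRegisters=";
  size_t arch_len = strlen (arch);
  const char *p;

  if (strncmp (packet, prefix, sizeof prefix - 1) != 0)
    return 0;
  p = packet + sizeof prefix - 1;

  /* Features are separated by ';'.  */
  while (*p != '\0')
    {
      size_t feat_len = strcspn (p, ";");

      if (feat_len >= sizeof key - 1
	  && strncmp (p, key, sizeof key - 1) == 0)
	{
	  const char *v = p + sizeof key - 1;
	  const char *end = p + feat_len;

	  /* Architecture names are separated by ','.  */
	  while (v < end)
	    {
	      size_t n = strcspn (v, ",;");

	      if (n == arch_len && strncmp (v, arch, n) == 0)
		return 1;
	      v += n;
	      if (v < end)
		v++;		/* Skip the comma.  */
	    }
	}

      p += feat_len;
      if (*p == ';')
	p++;
    }

  return 0;
}
```

A stub that gets 0 back would fall back to the target description it
reported before the register XML files existed.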

One advantage of this approach is that YMM registers are actually stored
as XMM registers and upper YMM registers in the XSAVE extended state.  It
is easy and natural to access them as %xmmX and %ymmXh internally.  We
just need to hide %ymmXh from users.
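
A small sketch of that split, assuming raw little-endian register
buffers in which %xmmX holds the low 128 bits of %ymmX. The helper
names are invented for this example; the patches do this work inside
the i386 pseudo-register read/write code:

```c
#include <string.h>

/* Build the 256-bit pseudo register %ymmX from its two raw halves:
   %xmmX supplies bytes 0..15 and %ymmXh supplies bytes 16..31.  */
void
ymm_from_halves (unsigned char ymm[32], const unsigned char xmm[16],
		 const unsigned char ymmh[16])
{
  memcpy (ymm, xmm, 16);
  memcpy (ymm + 16, ymmh, 16);
}

/* Split a 256-bit value back into the two raw registers, as a write
   to the pseudo register would have to do.  */
void
ymm_to_halves (const unsigned char ymm[32], unsigned char xmm[16],
	       unsigned char ymmh[16])
{
  memcpy (xmm, ymm, 16);
  memcpy (ymmh, ymm + 16, 16);
}
```

Because the two halves live at different places in the XSAVE area,
keeping them as separate raw registers means no byte shuffling is
needed when supplying or collecting the register cache.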

To support AVX on other OSes, the following changes are needed:

1. Kernel support to get/set the XSAVE extended state.
2. Handle 8/16 upper YMM registers.
3. Provide target to_read_description to return SSE or AVX target
description.
4. Update gdbarch_core_read_description to return SSE or AVX target
description based on contents of core dump.


H.J.


* PATCH: 2/6 [3rd try]: Add AVX support (Update document)
  2010-03-06 22:19   ` PATCH: 2/6 [2nd try]: " H.J. Lu
                       ` (2 preceding siblings ...)
  2010-03-12 16:46     ` H.J. Lu
@ 2010-03-29  0:18     ` H.J. Lu
  2010-03-30 16:41       ` H.J. Lu
  3 siblings, 1 reply; 115+ messages in thread
From: H.J. Lu @ 2010-03-29  0:18 UTC (permalink / raw)
  To: GDB

Hi,

This patch updates the documentation for AVX support.  OK to install?

Thanks.


H.J.
---
2010-03-28  H.J. Lu  <hongjiu.lu@intel.com>

	* gdb.texinfo (General Query Packets): Document xmlRegisters=
	(i386 Features): Add org.gnu.gdb.i386.avx.

diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
index 56dbe5d..168462a 100644
--- a/gdb/doc/gdb.texinfo
+++ b/gdb/doc/gdb.texinfo
@@ -30603,6 +30603,12 @@ extensions to the remote protocol.  @value{GDBN} does not use such
 extensions unless the stub also reports that it supports them by
 including @samp{multiprocess+} in its @samp{qSupported} reply.
 @xref{multiprocess extensions}, for details.
+
+@item xmlRegisters
+This feature indicates that @value{GDBN} supports the XML
+target description.  If the stub sees @samp{xmlRegisters=} with
+target-specific strings separated by commas, it can send @value{GDBN}
+the XML target description.
 @end table
 
 Stubs should ignore any unknown values for
@@ -33746,6 +33752,16 @@ describe registers:
 @samp{mxcsr}
 @end itemize
 
+The @samp{org.gnu.gdb.i386.avx} feature is optional.  It should
+describe the upper 128 bits of @sc{ymm} registers:
+
+@itemize @minus
+@item
+@samp{ymm0h} through @samp{ymm7h} for i386
+@item
+@samp{ymm0h} through @samp{ymm15h} for amd64
+@end itemize
+
 The @samp{org.gnu.gdb.i386.linux} feature is optional.  It should
 describe a single register, @samp{orig_eax}.
 


* PATCH: 4/6 [3rd try]: Add AVX support (amd64 changes)
  2010-03-12 17:01         ` H.J. Lu
  2010-03-13  1:38           ` H.J. Lu
@ 2010-03-29  1:07           ` H.J. Lu
  2010-04-02 14:32             ` H.J. Lu
  1 sibling, 1 reply; 115+ messages in thread
From: H.J. Lu @ 2010-03-29  1:07 UTC (permalink / raw)
  To: GDB

Here are the amd64 changes to support AVX with AVX testcases. I
also need to import cpuid.h from gcc 4.4 since AVX testcases need
ECX from cpuid.  OK to install?


H.J.
----
gdb/

2010-03-28  H.J. Lu  <hongjiu.lu@intel.com>

	* amd64-linux-nat.c: Include "regset.h", "elf/common.h",
	<sys/uio.h> and "i386-xstate.h".
	(PTRACE_GETREGSET): New.
	(PTRACE_SETREGSET): Likewise.
	(have_ptrace_getregset): Likewise.
	(amd64_linux_gregset64_reg_offset): Include 16 upper YMM
	registers.
	(amd64_linux_gregset32_reg_offset): Include 8 upper YMM
	registers.
	(amd64_linux_fetch_inferior_registers): Support PTRACE_GETREGSET.
	(amd64_linux_store_inferior_registers): Likewise.
	(amd64_linux_read_description): Check and enable AVX target
	descriptions.

	* amd64-linux-tdep.c: Include "regset.h", "i386-linux-tdep.h"
	and "features/i386/amd64-avx-linux.c".
	(amd64_linux_regset_sections): New.
	(amd64_linux_update_xstateregset): Likewise.
	(amd64_linux_core_read_description): Check and enable AVX
	target description.
	(amd64_linux_init_abi): Set xsave_xcr0_offset.  Call
	set_gdbarch_core_regset_sections.
	(_initialize_amd64_linux_tdep): Call
	initialize_tdesc_amd64_avx_linux.

	* amd64-linux-tdep.h (AMD64_LINUX_ORIG_RAX_REGNUM): Replace
	AMD64_MXCSR_REGNUM with AMD64_YMM15H_REGNUM.
	(tdesc_amd64_avx_linux): New.
	(amd64_linux_update_xstateregset): Likewise.

	* amd64-tdep.c: Include "features/i386/amd64-avx.c".
	(amd64_ymm_names): New.
	(amd64_ymmh_names): Likewise.
	(amd64_register_name): Likewise.
	(amd64_supply_xstateregset): Likewise.
	(amd64_collect_xstateregset): Likewise.
	(amd64_supply_xsave): Likewise.
	(amd64_collect_xsave): Likewise.
	(AMD64_NUM_REGS): Removed.
	(amd64_dwarf_reg_to_regnum): Return %ymmN register number for
	%xmmN if AVX is available.
	(amd64_pseudo_register_name): Support pseudo YMM registers.
	(amd64_regset_from_core_section): Support .reg-xstate section.
	(amd64_init_abi): Set ymmh_register_names, num_ymm_regs
	and ymm0h_regnum.  Call set_gdbarch_register_name.
	(_initialize_amd64_tdep): Call initialize_tdesc_amd64_avx.

	* amd64-tdep.h (amd64_regnum): Add AMD64_YMM0H_REGNUM and
	AMD64_YMM15H_REGNUM.
	(AMD64_NUM_REGS): New.
	(amd64_supply_xsave): Likewise.
	(amd64_collect_xsave): Likewise.
	(amd64_register_name): Removed.
	(amd64_register_type): Likewise.

gdb/testsuite/

2010-03-28  H.J. Lu  <hongjiu.lu@intel.com>

	* gdb.arch/i386-avx.c: New.
	* gdb.arch/i386-avx.exp: Likewise.

	* gdb.arch/i386-cpuid.h: Updated from gcc 4.4.

diff --git a/gdb/amd64-linux-nat.c b/gdb/amd64-linux-nat.c
index b9d5833..eb3957c 100644
--- a/gdb/amd64-linux-nat.c
+++ b/gdb/amd64-linux-nat.c
@@ -23,11 +23,14 @@
 #include "inferior.h"
 #include "gdbcore.h"
 #include "regcache.h"
+#include "regset.h"
 #include "linux-nat.h"
 #include "amd64-linux-tdep.h"
 
 #include "gdb_assert.h"
 #include "gdb_string.h"
+#include "elf/common.h"
+#include <sys/uio.h>
 #include <sys/ptrace.h>
 #include <sys/debugreg.h>
 #include <sys/syscall.h>
@@ -51,6 +54,18 @@
 #include "i386-linux-tdep.h"
 #include "amd64-nat.h"
 #include "i386-nat.h"
+#include "i386-xstate.h"
+
+#ifndef PTRACE_GETREGSET
+#define PTRACE_GETREGSET	0x4204
+#endif
+
+#ifndef PTRACE_SETREGSET
+#define PTRACE_SETREGSET	0x4205
+#endif
+
+/* Does the current host support PTRACE_GETREGSET?  */
+static int have_ptrace_getregset = -1;
 
 /* Mapping between the general-purpose registers in GNU/Linux x86-64
    `struct user' format and GDB's register cache layout.  */
@@ -73,6 +88,8 @@ static int amd64_linux_gregset64_reg_offset[] =
   -1, -1, -1, -1, -1, -1, -1, -1,
   -1, -1, -1, -1, -1, -1, -1, -1,
   -1, -1, -1, -1, -1, -1, -1, -1, -1,
+  -1, -1, -1, -1, -1, -1, -1, -1,
+  -1, -1, -1, -1, -1, -1, -1, -1,
   ORIG_RAX * 8
 };
 \f
@@ -99,6 +116,7 @@ static int amd64_linux_gregset32_reg_offset[] =
   -1, -1, -1, -1, -1, -1, -1, -1,
   -1, -1, -1, -1, -1, -1, -1, -1,
   -1, -1, -1, -1, -1, -1, -1, -1, -1,
+  -1, -1, -1, -1, -1, -1, -1, -1,
   ORIG_RAX * 8			/* "orig_eax" */
 };
 \f
@@ -183,10 +201,26 @@ amd64_linux_fetch_inferior_registers (struct target_ops *ops,
     {
       elf_fpregset_t fpregs;
 
-      if (ptrace (PTRACE_GETFPREGS, tid, 0, (long) &fpregs) < 0)
-	perror_with_name (_("Couldn't get floating point status"));
+      if (have_ptrace_getregset)
+	{
+	  char xstateregs[I386_XSTATE_MAX_SIZE];
+	  struct iovec iov;
 
-      amd64_supply_fxsave (regcache, -1, &fpregs);
+	  iov.iov_base = xstateregs;
+	  iov.iov_len = sizeof (xstateregs);
+	  if (ptrace (PTRACE_GETREGSET, tid,
+		      (unsigned int) NT_X86_XSTATE, (long) &iov) < 0)
+	    perror_with_name (_("Couldn't get extended state status"));
+
+	  amd64_supply_xsave (regcache, -1, xstateregs);
+	}
+      else
+	{
+	  if (ptrace (PTRACE_GETFPREGS, tid, 0, (long) &fpregs) < 0)
+	    perror_with_name (_("Couldn't get floating point status"));
+
+	  amd64_supply_fxsave (regcache, -1, &fpregs);
+	}
     }
 }
 
@@ -226,15 +260,33 @@ amd64_linux_store_inferior_registers (struct target_ops *ops,
     {
       elf_fpregset_t fpregs;
 
-      if (ptrace (PTRACE_GETFPREGS, tid, 0, (long) &fpregs) < 0)
-	perror_with_name (_("Couldn't get floating point status"));
+      if (have_ptrace_getregset)
+	{
+	  char xstateregs[I386_XSTATE_MAX_SIZE];
+	  struct iovec iov;
+
+	  iov.iov_base = xstateregs;
+	  iov.iov_len = sizeof (xstateregs);
+	  if (ptrace (PTRACE_GETREGSET, tid,
+		      (unsigned int) NT_X86_XSTATE, (long) &iov) < 0)
+	    perror_with_name (_("Couldn't get extended state status"));
+
+	  amd64_collect_xsave (regcache, regnum, xstateregs, 0);
 
-      amd64_collect_fxsave (regcache, regnum, &fpregs);
+	  if (ptrace (PTRACE_SETREGSET, tid,
+		      (unsigned int) NT_X86_XSTATE, (long) &iov) < 0)
+	    perror_with_name (_("Couldn't write extended state status"));
+	}
+      else
+	{
+	  if (ptrace (PTRACE_GETFPREGS, tid, 0, (long) &fpregs) < 0)
+	    perror_with_name (_("Couldn't get floating point status"));
 
-      if (ptrace (PTRACE_SETFPREGS, tid, 0, (long) &fpregs) < 0)
-	perror_with_name (_("Couldn't write floating point status"));
+	  amd64_collect_fxsave (regcache, regnum, &fpregs);
 
-      return;
+	  if (ptrace (PTRACE_SETFPREGS, tid, 0, (long) &fpregs) < 0)
+	    perror_with_name (_("Couldn't write floating point status"));
+	}
     }
 }
 \f
@@ -688,6 +740,8 @@ amd64_linux_read_description (struct target_ops *ops)
 {
   unsigned long cs;
   int tid;
+  int is_64bit;
+  static uint64_t xcr0;
 
   /* GNU/Linux LWP ID's are process ID's.  */
   tid = TIDGET (inferior_ptid);
@@ -701,10 +755,54 @@ amd64_linux_read_description (struct target_ops *ops)
   if (errno != 0)
     perror_with_name (_("Couldn't get CS register"));
 
-  if (cs == AMD64_LINUX_USER64_CS)
-    return tdesc_amd64_linux;
+  is_64bit = cs == AMD64_LINUX_USER64_CS;
+
+  if (have_ptrace_getregset == -1)
+    {
+      uint64_t xstateregs[(I386_XSTATE_SSE_SIZE / sizeof (uint64_t))];
+      struct iovec iov;
+      unsigned int xstate_size;
+
+      iov.iov_base = xstateregs;
+      iov.iov_len = sizeof (xstateregs);
+
+      /* Check if PTRACE_GETREGSET works.  */
+      if (ptrace (PTRACE_GETREGSET, tid,
+		  (unsigned int) NT_X86_XSTATE, (long) &iov) < 0)
+	{
+	  have_ptrace_getregset = 0;
+	  xstate_size = 0;
+	}
+      else
+	{
+	  have_ptrace_getregset = 1;
+
+	  /* Get XCR0 from XSAVE extended state.  */
+	  xcr0 = xstateregs[(I386_LINUX_XSAVE_XCR0_OFFSET
+			     / sizeof (uint64_t))];
+
+	  xstate_size = I386_XSTATE_SIZE (xcr0);
+	}
+
+      amd64_linux_update_xstateregset (xstate_size);
+    }
+
+  /* Check the native XCR0 only if PTRACE_GETREGSET is available.  */
+  if (have_ptrace_getregset
+      && (xcr0 & I386_XSTATE_AVX_MASK) == I386_XSTATE_AVX_MASK)
+    {
+      if (is_64bit)
+	return tdesc_amd64_avx_linux;
+      else
+	return tdesc_i386_avx_linux;
+    }
   else
-    return tdesc_i386_linux;
+    {
+      if (is_64bit)
+	return tdesc_amd64_linux;
+      else
+	return tdesc_i386_linux;
+    }
 }
 
 /* Provide a prototype to silence -Wmissing-prototypes.  */
diff --git a/gdb/amd64-linux-tdep.c b/gdb/amd64-linux-tdep.c
index 4ad6dc9..4cc4045 100644
--- a/gdb/amd64-linux-tdep.c
+++ b/gdb/amd64-linux-tdep.c
@@ -28,8 +28,11 @@
 #include "symtab.h"
 #include "gdbtypes.h"
 #include "reggroups.h"
+#include "regset.h"
 #include "amd64-linux-tdep.h"
+#include "i386-linux-tdep.h"
 #include "linux-tdep.h"
+#include "i386-xstate.h"
 
 #include "gdb_string.h"
 
@@ -38,6 +41,7 @@
 #include "xml-syscall.h"
 
 #include "features/i386/amd64-linux.c"
+#include "features/i386/amd64-avx-linux.c"
 
 /* The syscall's XML filename for i386.  */
 #define XML_SYSCALL_FILENAME_AMD64 "syscalls/amd64-linux.xml"
@@ -45,6 +49,15 @@
 #include "record.h"
 #include "linux-record.h"
 
+/* Supported register note sections.  */
+static struct core_regset_section amd64_linux_regset_sections[] =
+{
+  { ".reg", 144, "general-purpose" },
+  { ".reg2", 512, "floating-point" },
+  { ".reg-xstate", 0, "XSAVE extended state" },
+  { NULL, 0 }
+};
+
 /* Mapping between the general-purpose registers in `struct user'
    format and GDB's register cache layout.  */
 
@@ -1242,6 +1255,22 @@ amd64_linux_record_signal (struct gdbarch *gdbarch,
   return 0;
 }
 
+/* Update XSAVE extended state register note section.  */
+
+void
+amd64_linux_update_xstateregset (unsigned int xstate_size)
+{
+  struct core_regset_section *xstate = &amd64_linux_regset_sections[2];
+
+  /* Update the XSAVE extended state register note section for "gcore".
+     Disable it if its size is 0.  */
+  gdb_assert (strcmp (xstate->sect_name, ".reg-xstate") == 0);
+  if (xstate_size)
+    xstate->size = xstate_size;
+  else
+    xstate->sect_name = NULL;
+}
+
 /* Get Linux/x86 target description from core dump.  */
 
 static const struct target_desc *
@@ -1250,12 +1279,17 @@ amd64_linux_core_read_description (struct gdbarch *gdbarch,
 				  bfd *abfd)
 {
   asection *section = bfd_get_section_by_name (abfd, ".reg2");
+  uint64_t xcr0;
 
   if (section == NULL)
     return NULL;
 
   /* Linux/x86-64.  */
-  return tdesc_amd64_linux;
+  xcr0 = i386_linux_core_read_xcr0 (gdbarch, target, abfd);
+  if ((xcr0 & I386_XSTATE_AVX_MASK) == I386_XSTATE_AVX_MASK)
+    return tdesc_amd64_avx_linux;
+  else
+    return tdesc_amd64_linux;
 }
 
 static void
@@ -1297,6 +1331,8 @@ amd64_linux_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch)
   tdep->sc_reg_offset = amd64_linux_sc_reg_offset;
   tdep->sc_num_regs = ARRAY_SIZE (amd64_linux_sc_reg_offset);
 
+  tdep->xsave_xcr0_offset = I386_LINUX_XSAVE_XCR0_OFFSET;
+
   /* GNU/Linux uses SVR4-style shared libraries.  */
   set_solib_svr4_fetch_link_map_offsets
     (gdbarch, svr4_lp64_fetch_link_map_offsets);
@@ -1318,6 +1354,9 @@ amd64_linux_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch)
   /* GNU/Linux uses SVR4-style shared libraries.  */
   set_gdbarch_skip_trampoline_code (gdbarch, find_solib_trampoline_target);
 
+  /* Install supported register note sections.  */
+  set_gdbarch_core_regset_sections (gdbarch, amd64_linux_regset_sections);
+
   set_gdbarch_core_read_description (gdbarch,
 				     amd64_linux_core_read_description);
 
@@ -1517,4 +1556,5 @@ _initialize_amd64_linux_tdep (void)
 
   /* Initialize the Linux target description  */
   initialize_tdesc_amd64_linux ();
+  initialize_tdesc_amd64_avx_linux ();
 }
diff --git a/gdb/amd64-linux-tdep.h b/gdb/amd64-linux-tdep.h
index 33316fb..8862057 100644
--- a/gdb/amd64-linux-tdep.h
+++ b/gdb/amd64-linux-tdep.h
@@ -26,13 +26,17 @@
 /* Register number for the "orig_rax" register.  If this register
    contains a value >= 0 it is interpreted as the system call number
    that the kernel is supposed to restart.  */
-#define AMD64_LINUX_ORIG_RAX_REGNUM (AMD64_MXCSR_REGNUM + 1)
+#define AMD64_LINUX_ORIG_RAX_REGNUM (AMD64_YMM15H_REGNUM + 1)
 
 /* Total number of registers for GNU/Linux.  */
 #define AMD64_LINUX_NUM_REGS (AMD64_LINUX_ORIG_RAX_REGNUM + 1)
 
 /* Linux target description.  */
 extern struct target_desc *tdesc_amd64_linux;
+extern struct target_desc *tdesc_amd64_avx_linux;
+
+/* Update XSAVE extended state register note section.  */
+extern void amd64_linux_update_xstateregset (unsigned int xstate_size);
 
 /* Enum that defines the syscall identifiers for amd64 linux.
    Used for process record/replay, these will be translated into
diff --git a/gdb/amd64-tdep.c b/gdb/amd64-tdep.c
index e5cfa71..aa4acfb 100644
--- a/gdb/amd64-tdep.c
+++ b/gdb/amd64-tdep.c
@@ -43,6 +43,7 @@
 #include "i387-tdep.h"
 
 #include "features/i386/amd64.c"
+#include "features/i386/amd64-avx.c"
 
 /* Note that the AMD64 architecture was previously known as x86-64.
    The latter is (forever) engraved into the canonical system name as
@@ -71,8 +72,21 @@ static const char *amd64_register_names[] =
   "mxcsr",
 };
 
-/* Total number of registers.  */
-#define AMD64_NUM_REGS	ARRAY_SIZE (amd64_register_names)
+static const char *amd64_ymm_names[] = 
+{
+  "ymm0", "ymm1", "ymm2", "ymm3",
+  "ymm4", "ymm5", "ymm6", "ymm7",
+  "ymm8", "ymm9", "ymm10", "ymm11",
+  "ymm12", "ymm13", "ymm14", "ymm15"
+};
+
+static const char *amd64_ymmh_names[] = 
+{
+  "ymm0h", "ymm1h", "ymm2h", "ymm3h",
+  "ymm4h", "ymm5h", "ymm6h", "ymm7h",
+  "ymm8h", "ymm9h", "ymm10h", "ymm11h",
+  "ymm12h", "ymm13h", "ymm14h", "ymm15h"
+};
 
 /* The registers used to pass integer arguments during a function call.  */
 static int amd64_dummy_call_integer_regs[] =
@@ -163,6 +177,8 @@ static const int amd64_dwarf_regmap_len =
 static int
 amd64_dwarf_reg_to_regnum (struct gdbarch *gdbarch, int reg)
 {
+  struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+  int ymm0_regnum = tdep->ymm0_regnum;
   int regnum = -1;
 
   if (reg >= 0 && reg < amd64_dwarf_regmap_len)
@@ -170,6 +186,9 @@ amd64_dwarf_reg_to_regnum (struct gdbarch *gdbarch, int reg)
 
   if (regnum == -1)
     warning (_("Unmapped DWARF Register #%d encountered."), reg);
+  else if (ymm0_regnum >= 0
+	   && i386_xmm_regnum_p (gdbarch, regnum))
+    regnum += ymm0_regnum - I387_XMM0_REGNUM (tdep);
 
   return regnum;
 }
@@ -234,6 +253,19 @@ static const char *amd64_dword_names[] =
   "r8d", "r9d", "r10d", "r11d", "r12d", "r13d", "r14d", "r15d"
 };
 
+/* Return the name of register REGNUM, or the empty string if it is
+   an anonymous register. */
+
+static const char *
+amd64_register_name (struct gdbarch *gdbarch, int regnum)
+{
+  /* Hide the upper YMM registers.  */
+  if (i386_ymmh_regnum_p (gdbarch, regnum))
+    return "";
+
+  return tdesc_register_name (gdbarch, regnum);
+}
+
 /* Return the name of register REGNUM.  */
 
 static const char *
@@ -242,6 +274,8 @@ amd64_pseudo_register_name (struct gdbarch *gdbarch, int regnum)
   struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
   if (i386_byte_regnum_p (gdbarch, regnum))
     return amd64_byte_names[regnum - tdep->al_regnum];
+  else if (i386_ymm_regnum_p (gdbarch, regnum))
+    return amd64_ymm_names[regnum - tdep->ymm0_regnum];
   else if (i386_word_regnum_p (gdbarch, regnum))
     return amd64_word_names[regnum - tdep->ax_regnum];
   else if (i386_dword_regnum_p (gdbarch, regnum))
@@ -2148,6 +2182,28 @@ amd64_collect_fpregset (const struct regset *regset,
   amd64_collect_fxsave (regcache, regnum, fpregs);
 }
 
+/* Similar to amd64_supply_fpregset, but use XSAVE extended state.  */
+
+static void
+amd64_supply_xstateregset (const struct regset *regset,
+			   struct regcache *regcache, int regnum,
+			   const void *xstateregs, size_t len)
+{
+  const struct gdbarch_tdep *tdep = gdbarch_tdep (regset->arch);
+  amd64_supply_xsave (regcache, regnum, xstateregs);
+}
+
+/* Similar to amd64_collect_fpregset, but use XSAVE extended state.  */
+
+static void
+amd64_collect_xstateregset (const struct regset *regset,
+			    const struct regcache *regcache,
+			    int regnum, void *xstateregs, size_t len)
+{
+  const struct gdbarch_tdep *tdep = gdbarch_tdep (regset->arch);
+  amd64_collect_xsave (regcache, regnum, xstateregs, 1);
+}
+
 /* Return the appropriate register set for the core section identified
    by SECT_NAME and SECT_SIZE.  */
 
@@ -2166,6 +2222,16 @@ amd64_regset_from_core_section (struct gdbarch *gdbarch,
       return tdep->fpregset;
     }
 
+  if (strcmp (sect_name, ".reg-xstate") == 0)
+    {
+      if (tdep->xstateregset == NULL)
+	tdep->xstateregset = regset_alloc (gdbarch,
+					   amd64_supply_xstateregset,
+					   amd64_collect_xstateregset);
+
+      return tdep->xstateregset;
+    }
+
   return i386_regset_from_core_section (gdbarch, sect_name, sect_size);
 }
 \f
@@ -2228,6 +2294,13 @@ amd64_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch)
   tdep->num_core_regs = AMD64_NUM_GREGS + I387_NUM_REGS;
   tdep->register_names = amd64_register_names;
 
+  if (tdesc_find_feature (tdesc, "org.gnu.gdb.i386.avx") != NULL)
+    {
+      tdep->ymmh_register_names = amd64_ymmh_names;
+      tdep->num_ymm_regs = 16;
+      tdep->ymm0h_regnum = AMD64_YMM0H_REGNUM;
+    }
+
   tdep->num_byte_regs = 16;
   tdep->num_word_regs = 16;
   tdep->num_dword_regs = 16;
@@ -2241,6 +2314,8 @@ amd64_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch)
 
   set_tdesc_pseudo_register_name (gdbarch, amd64_pseudo_register_name);
 
+  set_gdbarch_register_name (gdbarch, amd64_register_name);
+
   /* AMD64 has an FPU and 16 SSE registers.  */
   tdep->st0_regnum = AMD64_ST0_REGNUM;
   tdep->num_xmm_regs = 16;
@@ -2321,6 +2396,7 @@ void
 _initialize_amd64_tdep (void)
 {
   initialize_tdesc_amd64 ();
+  initialize_tdesc_amd64_avx ();
 }
 \f
 
@@ -2356,6 +2432,30 @@ amd64_supply_fxsave (struct regcache *regcache, int regnum,
     }
 }
 
+/* Similar to amd64_supply_fxsave, but use XSAVE extended state.  */
+
+void
+amd64_supply_xsave (struct regcache *regcache, int regnum,
+		    const void *xsave)
+{
+  struct gdbarch *gdbarch = get_regcache_arch (regcache);
+  struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+
+  i387_supply_xsave (regcache, regnum, xsave);
+
+  if (xsave && gdbarch_ptr_bit (gdbarch) == 64)
+    {
+      const gdb_byte *regs = xsave;
+
+      if (regnum == -1 || regnum == I387_FISEG_REGNUM (tdep))
+	regcache_raw_supply (regcache, I387_FISEG_REGNUM (tdep),
+			     regs + 12);
+      if (regnum == -1 || regnum == I387_FOSEG_REGNUM (tdep))
+	regcache_raw_supply (regcache, I387_FOSEG_REGNUM (tdep),
+			     regs + 20);
+    }
+}
+
 /* Fill register REGNUM (if it is a floating-point or SSE register) in
    *FXSAVE with the value from REGCACHE.  If REGNUM is -1, do this for
    all registers.  This function doesn't touch any of the reserved
@@ -2379,3 +2479,26 @@ amd64_collect_fxsave (const struct regcache *regcache, int regnum,
 	regcache_raw_collect (regcache, I387_FOSEG_REGNUM (tdep), regs + 20);
     }
 }
+
+/* Similar to amd64_collect_fxsave, but use XSAVE extended state.  */
+
+void
+amd64_collect_xsave (const struct regcache *regcache, int regnum,
+		     void *xsave, int gcore)
+{
+  struct gdbarch *gdbarch = get_regcache_arch (regcache);
+  struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+  gdb_byte *regs = xsave;
+
+  i387_collect_xsave (regcache, regnum, xsave, gcore);
+
+  if (gdbarch_ptr_bit (gdbarch) == 64)
+    {
+      if (regnum == -1 || regnum == I387_FISEG_REGNUM (tdep))
+	regcache_raw_collect (regcache, I387_FISEG_REGNUM (tdep),
+			      regs + 12);
+      if (regnum == -1 || regnum == I387_FOSEG_REGNUM (tdep))
+	regcache_raw_collect (regcache, I387_FOSEG_REGNUM (tdep),
+			      regs + 20);
+    }
+}
diff --git a/gdb/amd64-tdep.h b/gdb/amd64-tdep.h
index 363479c..9f07dda 100644
--- a/gdb/amd64-tdep.h
+++ b/gdb/amd64-tdep.h
@@ -61,12 +61,16 @@ enum amd64_regnum
   AMD64_FSTAT_REGNUM = AMD64_ST0_REGNUM + 9,
   AMD64_XMM0_REGNUM = 40,	/* %xmm0 */
   AMD64_XMM1_REGNUM,		/* %xmm1 */
-  AMD64_MXCSR_REGNUM = AMD64_XMM0_REGNUM + 16
+  AMD64_MXCSR_REGNUM = AMD64_XMM0_REGNUM + 16,
+  AMD64_YMM0H_REGNUM,		/* %ymm0h */
+  AMD64_YMM15H_REGNUM = AMD64_YMM0H_REGNUM + 15
 };
 
 /* Number of general purpose registers.  */
 #define AMD64_NUM_GREGS		24
 
+#define AMD64_NUM_REGS		(AMD64_YMM15H_REGNUM + 1)
+
 extern struct displaced_step_closure *amd64_displaced_step_copy_insn
   (struct gdbarch *gdbarch, CORE_ADDR from, CORE_ADDR to,
    struct regcache *regs);
@@ -77,12 +81,6 @@ extern void amd64_displaced_step_fixup (struct gdbarch *gdbarch,
 
 extern void amd64_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch);
 
-/* Functions from amd64-tdep.c which may be needed on architectures
-   with extra registers.  */
-
-extern const char *amd64_register_name (struct gdbarch *gdbarch, int regnum);
-extern struct type *amd64_register_type (struct gdbarch *gdbarch, int regnum);
-
 /* Fill register REGNUM in REGCACHE with the appropriate
    floating-point or SSE register value from *FXSAVE.  If REGNUM is
    -1, do this for all registers.  This function masks off any of the
@@ -91,6 +89,10 @@ extern struct type *amd64_register_type (struct gdbarch *gdbarch, int regnum);
 extern void amd64_supply_fxsave (struct regcache *regcache, int regnum,
 				 const void *fxsave);
 
+/* Similar to amd64_supply_fxsave, but use XSAVE extended state.  */
+extern void amd64_supply_xsave (struct regcache *regcache, int regnum,
+				const void *xsave);
+
 /* Fill register REGNUM (if it is a floating-point or SSE register) in
    *FXSAVE with the value from REGCACHE.  If REGNUM is -1, do this for
    all registers.  This function doesn't touch any of the reserved
@@ -99,6 +101,10 @@ extern void amd64_supply_fxsave (struct regcache *regcache, int regnum,
 extern void amd64_collect_fxsave (const struct regcache *regcache, int regnum,
 				  void *fxsave);
 
+/* Similar to amd64_collect_fxsave, but use XSAVE extended state.  */
+extern void amd64_collect_xsave (const struct regcache *regcache,
+				 int regnum, void *xsave, int gcore);
+
 void amd64_classify (struct type *type, enum amd64_reg_class class[2]);
 
 \f
diff --git a/gdb/testsuite/gdb.arch/i386-avx.c b/gdb/testsuite/gdb.arch/i386-avx.c
new file mode 100644
index 0000000..73f92b6
--- /dev/null
+++ b/gdb/testsuite/gdb.arch/i386-avx.c
@@ -0,0 +1,128 @@
+/* Test program for AVX registers.
+
+   Copyright 2010 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include <stdio.h>
+#include "i386-cpuid.h"
+
+typedef struct {
+  float f[8];
+} v8sf_t;
+
+
+v8sf_t data[] __attribute__ ((aligned (32))) =
+  {
+    { {  0.0,  0.125,  0.25,  0.375,  0.50,  0.625,  0.75,  0.875 } },
+    { {  1.0,  1.125,  1.25,  1.375,  1.50,  1.625,  1.75,  1.875 } },
+    { {  2.0,  2.125,  2.25,  2.375,  2.50,  2.625,  2.75,  2.875 } },
+    { {  3.0,  3.125,  3.25,  3.375,  3.50,  3.625,  3.75,  3.875 } },
+    { {  4.0,  4.125,  4.25,  4.375,  4.50,  4.625,  4.75,  4.875 } },
+    { {  5.0,  5.125,  5.25,  5.375,  5.50,  5.625,  5.75,  5.875 } },
+    { {  6.0,  6.125,  6.25,  6.375,  6.50,  6.625,  6.75,  6.875 } },
+    { {  7.0,  7.125,  7.25,  7.375,  7.50,  7.625,  7.75,  7.875 } },
+#ifdef __x86_64__
+    { {  8.0,  8.125,  8.25,  8.375,  8.50,  8.625,  8.75,  8.875 } },
+    { {  9.0,  9.125,  9.25,  9.375,  9.50,  9.625,  9.75,  9.875 } },
+    { { 10.0, 10.125, 10.25, 10.375, 10.50, 10.625, 10.75, 10.875 } },
+    { { 11.0, 11.125, 11.25, 11.375, 11.50, 11.625, 11.75, 11.875 } },
+    { { 12.0, 12.125, 12.25, 12.375, 12.50, 12.625, 12.75, 12.875 } },
+    { { 13.0, 13.125, 13.25, 13.375, 13.50, 13.625, 13.75, 13.875 } },
+    { { 14.0, 14.125, 14.25, 14.375, 14.50, 14.625, 14.75, 14.875 } },
+    { { 15.0, 15.125, 15.25, 15.375, 15.50, 15.625, 15.75, 15.875 } },
+#endif
+  };
+
+
+int
+have_avx (void)
+{
+  unsigned int eax, ebx, ecx, edx;
+
+  if (!__get_cpuid (1, &eax, &ebx, &ecx, &edx))
+    return 0;
+
+  if ((ecx & (bit_AVX | bit_OSXSAVE)) == (bit_AVX | bit_OSXSAVE))
+    return 1;
+  else
+    return 0;
+}
+
+int
+main (int argc, char **argv)
+{
+  if (have_avx ())
+    {
+      asm ("vmovaps 0(%0), %%ymm0\n\t"
+           "vmovaps 32(%0), %%ymm1\n\t"
+           "vmovaps 64(%0), %%ymm2\n\t"
+           "vmovaps 96(%0), %%ymm3\n\t"
+           "vmovaps 128(%0), %%ymm4\n\t"
+           "vmovaps 160(%0), %%ymm5\n\t"
+           "vmovaps 192(%0), %%ymm6\n\t"
+           "vmovaps 224(%0), %%ymm7\n\t"
+           : /* no output operands */
+           : "r" (data) 
+           : "xmm0", "xmm1", "xmm2", "xmm3", "xmm4", "xmm5", "xmm6", "xmm7");
+#ifdef __x86_64__
+      asm ("vmovaps 256(%0), %%ymm8\n\t"
+           "vmovaps 288(%0), %%ymm9\n\t"
+           "vmovaps 320(%0), %%ymm10\n\t"
+           "vmovaps 352(%0), %%ymm11\n\t"
+           "vmovaps 384(%0), %%ymm12\n\t"
+           "vmovaps 416(%0), %%ymm13\n\t"
+           "vmovaps 448(%0), %%ymm14\n\t"
+           "vmovaps 480(%0), %%ymm15\n\t"
+           : /* no output operands */
+           : "r" (data) 
+           : "xmm8", "xmm9", "xmm10", "xmm11", "xmm12", "xmm13", "xmm14", "xmm15");
+#endif
+
+      asm ("nop"); /* first breakpoint here */
+
+      asm (
+           "vmovaps %%ymm0, 0(%0)\n\t"
+           "vmovaps %%ymm1, 32(%0)\n\t"
+           "vmovaps %%ymm2, 64(%0)\n\t"
+           "vmovaps %%ymm3, 96(%0)\n\t"
+           "vmovaps %%ymm4, 128(%0)\n\t"
+           "vmovaps %%ymm5, 160(%0)\n\t"
+           "vmovaps %%ymm6, 192(%0)\n\t"
+           "vmovaps %%ymm7, 224(%0)\n\t"
+           : /* no output operands */
+           : "r" (data) 
+           : "xmm0", "xmm1", "xmm2", "xmm3", "xmm4", "xmm5", "xmm6", "xmm7");
+#ifdef __x86_64__
+      asm (
+           "vmovaps %%ymm8, 256(%0)\n\t"
+           "vmovaps %%ymm9, 288(%0)\n\t"
+           "vmovaps %%ymm10, 320(%0)\n\t"
+           "vmovaps %%ymm11, 352(%0)\n\t"
+           "vmovaps %%ymm12, 384(%0)\n\t"
+           "vmovaps %%ymm13, 416(%0)\n\t"
+           "vmovaps %%ymm14, 448(%0)\n\t"
+           "vmovaps %%ymm15, 480(%0)\n\t"
+           : /* no output operands */
+           : "r" (data) 
+           : "xmm8", "xmm9", "xmm10", "xmm11", "xmm12", "xmm13", "xmm14", "xmm15");
+#endif
+
+      puts ("Bye!"); /* second breakpoint here */
+    }
+
+  return 0;
+}
diff --git a/gdb/testsuite/gdb.arch/i386-avx.exp b/gdb/testsuite/gdb.arch/i386-avx.exp
new file mode 100644
index 0000000..561ddef
--- /dev/null
+++ b/gdb/testsuite/gdb.arch/i386-avx.exp
@@ -0,0 +1,110 @@
+# Copyright 2010 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# Please email any bugs, comments, and/or additions to this file to:
+# bug-gdb@gnu.org
+
+# This file is part of the gdb testsuite.
+
+if $tracelevel {
+    strace $tracelevel
+}
+
+set prms_id 0
+set bug_id 0
+
+if { ![istarget i?86-*-*] && ![istarget x86_64-*-* ] } {
+    verbose "Skipping x86 AVX tests."
+    return
+}
+
+set testfile "i386-avx"
+set srcfile ${testfile}.c
+set binfile ${objdir}/${subdir}/${testfile}
+
+if [get_compiler_info ${binfile}] {
+    return -1
+}
+
+set additional_flags ""
+if [test_compiler_info gcc*] {
+    set additional_flags "additional_flags=-mavx"
+}
+
+if { [gdb_compile "${srcdir}/${subdir}/${srcfile}" "${binfile}" executable [list debug $additional_flags]] != "" } {
+    unsupported "compiler does not support AVX"
+    return
+}
+
+gdb_exit
+gdb_start
+gdb_reinitialize_dir $srcdir/$subdir
+gdb_load ${binfile}
+
+if ![runto_main] then {
+    gdb_suppress_tests
+}
+
+send_gdb "print have_avx ()\r"
+gdb_expect {
+    -re ".. = 1\r\n$gdb_prompt " {
+        pass "check whether processor supports AVX"
+    }
+    -re ".. = 0\r\n$gdb_prompt " {
+        verbose "processor does not support AVX; skipping AVX tests"
+        return
+    }
+    -re ".*$gdb_prompt $" {
+        fail "check whether processor supports AVX"
+    }
+    timeout {
+        fail "check whether processor supports AVX (timeout)"
+    }
+}
+
+gdb_test "break [gdb_get_line_number "first breakpoint here"]" \
+         "Breakpoint .* at .*i386-avx.c.*" \
+         "set first breakpoint in main"
+gdb_continue_to_breakpoint "continue to first breakpoint in main"
+
+if [istarget i?86-*-*] {
+    set nr_regs 8
+} else {
+    set nr_regs 16
+}
+
+for { set r 0 } { $r < $nr_regs } { incr r } {
+    gdb_test "print \$ymm$r.v8_float" \
+        ".. = \\{$r, $r.125, $r.25, $r.375, $r.5, $r.625, $r.75, $r.875\\}.*" \
+        "check float contents of %ymm$r"
+    gdb_test "print \$ymm$r.v32_int8" \
+        ".. = \\{(-?\[0-9\]+, ){31}-?\[0-9\]+\\}.*" \
+        "check int8 contents of %ymm$r"
+}
+
+for { set r 0 } { $r < $nr_regs } { incr r } {
+    gdb_test "set var \$ymm$r.v8_float\[0\] = $r + 10" "" "set %ymm$r"
+}
+
+gdb_test "break [gdb_get_line_number "second breakpoint here"]" \
+         "Breakpoint .* at .*i386-avx.c.*" \
+         "set second breakpoint in main"
+gdb_continue_to_breakpoint "continue to second breakpoint in main"
+
+for { set r 0 } { $r < $nr_regs } { incr r } {
+    gdb_test "print data\[$r\]" \
+        ".. = \\{f = \\{[expr $r + 10], $r.125, $r.25, $r.375, $r.5, $r.625, $r.75, $r.875\\}\\}.*" \
+        "check contents of data\[$r\]"
+}
diff --git a/gdb/testsuite/gdb.arch/i386-cpuid.h b/gdb/testsuite/gdb.arch/i386-cpuid.h
index 7ff0dba..5ebde5a 100644
--- a/gdb/testsuite/gdb.arch/i386-cpuid.h
+++ b/gdb/testsuite/gdb.arch/i386-cpuid.h
@@ -1,75 +1,200 @@
-/* Helper file for i386 platform.  Runtime check for MMX/SSE/SSE2 support.
+/* Helper file for i386 platform.  Runtime check for MMX/SSE/SSE2/AVX
+ * support. Copied from gcc 4.4.
+ *
+ * Copyright (C) 2007, 2008, 2009 Free Software Foundation, Inc.
+ *
+ * This file is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation; either version 3, or (at your option) any
+ * later version.
+ * 
+ * This file is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ * 
+ * Under Section 7 of GPL version 3, you are granted additional
+ * permissions described in the GCC Runtime Library Exception, version
+ * 3.1, as published by the Free Software Foundation.
+ * 
+ * You should have received a copy of the GNU General Public License and
+ * a copy of the GCC Runtime Library Exception along with this program;
+ * see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+ * <http://www.gnu.org/licenses/>.
+ */
 
-   Copyright 2004, 2007, 2008, 2009, 2010 Free Software Foundation, Inc.
+/* %ecx */
+#define bit_SSE3	(1 << 0)
+#define bit_PCLMUL	(1 << 1)
+#define bit_SSSE3	(1 << 9)
+#define bit_FMA		(1 << 12)
+#define bit_CMPXCHG16B	(1 << 13)
+#define bit_SSE4_1	(1 << 19)
+#define bit_SSE4_2	(1 << 20)
+#define bit_MOVBE	(1 << 22)
+#define bit_POPCNT	(1 << 23)
+#define bit_AES		(1 << 25)
+#define bit_XSAVE	(1 << 26)
+#define bit_OSXSAVE	(1 << 27)
+#define bit_AVX		(1 << 28)
 
-   This file is part of GDB.
+/* %edx */
+#define bit_CMPXCHG8B	(1 << 8)
+#define bit_CMOV	(1 << 15)
+#define bit_MMX		(1 << 23)
+#define bit_FXSAVE	(1 << 24)
+#define bit_SSE		(1 << 25)
+#define bit_SSE2	(1 << 26)
 
-   This program is free software; you can redistribute it and/or modify
-   it under the terms of the GNU General Public License as published by
-   the Free Software Foundation; either version 3 of the License, or
-   (at your option) any later version.
+/* Extended Features */
+/* %ecx */
+#define bit_LAHF_LM	(1 << 0)
+#define bit_ABM		(1 << 5)
+#define bit_SSE4a	(1 << 6)
+#define bit_XOP         (1 << 11)
+#define bit_LWP 	(1 << 15)
+#define bit_FMA4        (1 << 16)
 
-   This program is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-   GNU General Public License for more details.
+/* %edx */
+#define bit_LM		(1 << 29)
+#define bit_3DNOWP	(1 << 30)
+#define bit_3DNOW	(1 << 31)
 
-   You should have received a copy of the GNU General Public License
-   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
 
-/* Used by 20020523-2.c and i386-sse-6.c, and possibly others.  */
-/* Plagarized from 20020523-2.c.  */
-/* Plagarized from gcc.  */
+#if defined(__i386__) && defined(__PIC__)
+/* %ebx may be the PIC register.  */
+#if __GNUC__ >= 3
+#define __cpuid(level, a, b, c, d)			\
+  __asm__ ("xchg{l}\t{%%}ebx, %1\n\t"			\
+	   "cpuid\n\t"					\
+	   "xchg{l}\t{%%}ebx, %1\n\t"			\
+	   : "=a" (a), "=r" (b), "=c" (c), "=d" (d)	\
+	   : "0" (level))
 
-#define bit_CMOV (1 << 15)
-#define bit_MMX (1 << 23)
-#define bit_SSE (1 << 25)
-#define bit_SSE2 (1 << 26)
+#define __cpuid_count(level, count, a, b, c, d)		\
+  __asm__ ("xchg{l}\t{%%}ebx, %1\n\t"			\
+	   "cpuid\n\t"					\
+	   "xchg{l}\t{%%}ebx, %1\n\t"			\
+	   : "=a" (a), "=r" (b), "=c" (c), "=d" (d)	\
+	   : "0" (level), "2" (count))
+#else
+/* Host GCCs older than 3.0 weren't supporting Intel asm syntax
+   nor alternatives in i386 code.  */
+#define __cpuid(level, a, b, c, d)			\
+  __asm__ ("xchgl\t%%ebx, %1\n\t"			\
+	   "cpuid\n\t"					\
+	   "xchgl\t%%ebx, %1\n\t"			\
+	   : "=a" (a), "=r" (b), "=c" (c), "=d" (d)	\
+	   : "0" (level))
 
-#ifndef NOINLINE
-#define NOINLINE __attribute__ ((noinline))
+#define __cpuid_count(level, count, a, b, c, d)		\
+  __asm__ ("xchgl\t%%ebx, %1\n\t"			\
+	   "cpuid\n\t"					\
+	   "xchgl\t%%ebx, %1\n\t"			\
+	   : "=a" (a), "=r" (b), "=c" (c), "=d" (d)	\
+	   : "0" (level), "2" (count))
 #endif
+#else
+#define __cpuid(level, a, b, c, d)			\
+  __asm__ ("cpuid\n\t"					\
+	   : "=a" (a), "=b" (b), "=c" (c), "=d" (d)	\
+	   : "0" (level))
 
-unsigned int i386_cpuid (void) NOINLINE;
+#define __cpuid_count(level, count, a, b, c, d)		\
+  __asm__ ("cpuid\n\t"					\
+	   : "=a" (a), "=b" (b), "=c" (c), "=d" (d)	\
+	   : "0" (level), "2" (count))
+#endif
 
-unsigned int NOINLINE
-i386_cpuid (void)
+/* Return highest supported input value for cpuid instruction.  ext can
+   be either 0x0 or 0x80000000 to return highest supported value for
+   basic or extended cpuid information.  Function returns 0 if cpuid
+   is not supported or whatever cpuid returns in eax register.  If sig
+   pointer is non-null, then first four bytes of the signature
+   (as found in ebx register) are returned in location pointed by sig.  */
+
+static __inline unsigned int
+__get_cpuid_max (unsigned int __ext, unsigned int *__sig)
 {
-  int fl1, fl2;
+  unsigned int __eax, __ebx, __ecx, __edx;
 
 #ifndef __x86_64__
+#if __GNUC__ >= 3
   /* See if we can use cpuid.  On AMD64 we always can.  */
-  __asm__ ("pushfl; pushfl; popl %0; movl %0,%1; xorl %2,%0;"
-	   "pushl %0; popfl; pushfl; popl %0; popfl"
-	   : "=&r" (fl1), "=&r" (fl2)
+  __asm__ ("pushf{l|d}\n\t"
+	   "pushf{l|d}\n\t"
+	   "pop{l}\t%0\n\t"
+	   "mov{l}\t{%0, %1|%1, %0}\n\t"
+	   "xor{l}\t{%2, %0|%0, %2}\n\t"
+	   "push{l}\t%0\n\t"
+	   "popf{l|d}\n\t"
+	   "pushf{l|d}\n\t"
+	   "pop{l}\t%0\n\t"
+	   "popf{l|d}\n\t"
+	   : "=&r" (__eax), "=&r" (__ebx)
+	   : "i" (0x00200000));
+#else
+/* Host GCCs older than 3.0 weren't supporting Intel asm syntax
+   nor alternatives in i386 code.  */
+  __asm__ ("pushfl\n\t"
+	   "pushfl\n\t"
+	   "popl\t%0\n\t"
+	   "movl\t%0, %1\n\t"
+	   "xorl\t%2, %0\n\t"
+	   "pushl\t%0\n\t"
+	   "popfl\n\t"
+	   "pushfl\n\t"
+	   "popl\t%0\n\t"
+	   "popfl\n\t"
+	   : "=&r" (__eax), "=&r" (__ebx)
 	   : "i" (0x00200000));
-  if (((fl1 ^ fl2) & 0x00200000) == 0)
-    return (0);
 #endif
 
-  /* Host supports cpuid.  See if cpuid gives capabilities, try
-     CPUID(0).  Preserve %ebx and %ecx; cpuid insn clobbers these, we
-     don't need their CPUID values here, and %ebx may be the PIC
-     register.  */
-#ifdef __x86_64__
-  __asm__ ("pushq %%rcx; pushq %%rbx; cpuid; popq %%rbx; popq %%rcx"
-	   : "=a" (fl1) : "0" (0) : "rdx", "cc");
-#else
-  __asm__ ("pushl %%ecx; pushl %%ebx; cpuid; popl %%ebx; popl %%ecx"
-	   : "=a" (fl1) : "0" (0) : "edx", "cc");
+  if (!((__eax ^ __ebx) & 0x00200000))
+    return 0;
 #endif
-  if (fl1 == 0)
-    return (0);
-
-  /* Invoke CPUID(1), return %edx; caller can examine bits to
-     determine what's supported.  */
-#ifdef __x86_64__
-  __asm__ ("pushq %%rcx; pushq %%rbx; cpuid; popq %%rbx; popq %%rcx"
-	   : "=d" (fl2), "=a" (fl1) : "1" (1) : "cc");
-#else
-  __asm__ ("pushl %%ecx; pushl %%ebx; cpuid; popl %%ebx; popl %%ecx"
-	   : "=d" (fl2), "=a" (fl1) : "1" (1) : "cc");
+
+  /* Host supports cpuid.  Return highest supported cpuid input value.  */
+  __cpuid (__ext, __eax, __ebx, __ecx, __edx);
+
+  if (__sig)
+    *__sig = __ebx;
+
+  return __eax;
+}
+
+/* Return cpuid data for requested cpuid level, as found in returned
+   eax, ebx, ecx and edx registers.  The function checks if cpuid is
+   supported and returns 1 for valid cpuid information or 0 for
+   unsupported cpuid level.  All pointers are required to be non-null.  */
+
+static __inline int
+__get_cpuid (unsigned int __level,
+	     unsigned int *__eax, unsigned int *__ebx,
+	     unsigned int *__ecx, unsigned int *__edx)
+{
+  unsigned int __ext = __level & 0x80000000;
+
+  if (__get_cpuid_max (__ext, 0) < __level)
+    return 0;
+
+  __cpuid (__level, *__eax, *__ebx, *__ecx, *__edx);
+  return 1;
+}
+
+#ifndef NOINLINE
+#define NOINLINE __attribute__ ((noinline))
 #endif
 
-  return fl2;
+unsigned int i386_cpuid (void) NOINLINE;
+
+unsigned int NOINLINE
+i386_cpuid (void)
+{
+  unsigned int eax, ebx, ecx, edx;
+
+  if (!__get_cpuid (1, &eax, &ebx, &ecx, &edx))
+    return 0;
+
+  return edx;
 }
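
A note on have_avx in the test program above: it checks only the AVX and
OSXSAVE bits of CPUID.1:ECX.  Since XCR0 determines which states the OS
has actually enabled, a stricter check could also test the SSE and AVX
bits of XCR0.  The sketch below is illustrative only and is not part of
the patch.  The helper avx_usable and the macro names are hypothetical;
the caller is assumed to have obtained ECX from __get_cpuid and XCR0 from
xgetbv.

```c
#include <stdint.h>

/* CPUID.1:ECX feature bits, same positions as bit_OSXSAVE and bit_AVX
   in i386-cpuid.h.  */
#define BIT_OSXSAVE (1u << 27)
#define BIT_AVX     (1u << 28)

/* XCR0 state-component bits: x87 is bit 0, SSE is bit 1, AVX is bit 2.  */
#define XCR0_SSE (1ull << 1)
#define XCR0_AVX (1ull << 2)

/* Hypothetical helper: return 1 if the CPU reports AVX and OSXSAVE and
   the OS has enabled both SSE and AVX state in XCR0, else 0.  */
static int
avx_usable (uint32_t cpuid1_ecx, uint64_t xcr0)
{
  uint32_t need_ecx = BIT_AVX | BIT_OSXSAVE;
  uint64_t need_xcr0 = XCR0_SSE | XCR0_AVX;

  if ((cpuid1_ecx & need_ecx) != need_ecx)
    return 0;

  return (xcr0 & need_xcr0) == need_xcr0;
}
```

Keeping the decision in a pure function of the two register values makes
it easy to exercise without AVX hardware.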


* PATCH: 6/6 [3rd try]: Add AVX support (gdbserver changes)
  2010-03-12 17:25           ` H.J. Lu
  2010-03-27 16:07             ` Daniel Jacobowitz
@ 2010-03-29  1:09             ` H.J. Lu
  2010-03-29 14:08               ` Eli Zaretskii
  2010-03-30 16:48               ` H.J. Lu
  1 sibling, 2 replies; 115+ messages in thread
From: H.J. Lu @ 2010-03-29  1:09 UTC (permalink / raw)
  To: GDB

Hi,

Here are gdbserver changes to support AVX.  OK to install?

Thanks.


H.J.
---
2010-03-28  H.J. Lu  <hongjiu.lu@intel.com>

	* Makefile.in (clean): Updated.
	(i386-avx.o): New.
	(i386-avx.c): Likewise.
	(i386-avx-linux.o): Likewise.
	(i386-avx-linux.c): Likewise.
	(amd64-avx.o): Likewise.
	(amd64-avx.c): Likewise.
	(amd64-avx-linux.o): Likewise.
	(amd64-avx-linux.c): Likewise.

	* configure.srv (srv_i386_regobj): Add i386-avx.o.
	(srv_i386_linux_regobj): Add i386-avx-linux.o.
	(srv_amd64_regobj): Add amd64-avx.o.
	(srv_amd64_linux_regobj): Add amd64-avx-linux.o.
	(srv_i386_32bit_xmlfiles): Add i386/32bit-avx.xml.
	(srv_i386_64bit_xmlfiles): Add i386/64bit-avx.xml.
	(srv_i386_xmlfiles): Add i386/i386-avx.xml.
	(srv_amd64_xmlfiles): Add i386/amd64-avx.xml.
	(srv_i386_linux_xmlfiles): Add i386/i386-avx-linux.xml.
	(srv_amd64_linux_xmlfiles): Add i386/amd64-avx-linux.xml.

	* i387-fp.c: Include "i386-xstate.h".
	(i387_xsave): New.
	(i387_cache_to_xsave): Likewise.
	(i387_xsave_to_cache): Likewise.
	(x86_xcr0): Likewise.

	* i387-fp.h (i387_cache_to_xsave): New.
	(i387_xsave_to_cache): Likewise.
	(x86_xcr0): Likewise.

	* linux-arm-low.c (target_regsets): Initialize nt_type to 0.
	* linux-crisv32-low.c (target_regsets): Likewise.
	* linux-m68k-low.c (target_regsets): Likewise.
	* linux-mips-low.c (target_regsets): Likewise.
	* linux-ppc-low.c (target_regsets): Likewise.
	* linux-s390-low.c (target_regsets): Likewise.
	* linux-sh-low.c (target_regsets): Likewise.
	* linux-sparc-low.c (target_regsets): Likewise.
	* linux-xtensa-low.c (target_regsets): Likewise.

	* linux-low.c: Include <sys/uio.h>.
	(regsets_fetch_inferior_registers): Support nt_type.
	(regsets_store_inferior_registers): Likewise.
	(linux_process_qsupported): New.
	(linux_target_ops): Add linux_process_qsupported.

	* linux-low.h (regset_info): Add nt_type.
	(linux_target_ops): Add process_qsupported.

	* linux-x86-low.c: Include "i386-xstate.h", "elf/common.h",
	<sys/uio.h> and <unistd.h>.
	(init_registers_i386_avx_linux): New.
	(init_registers_amd64_avx_linux): Likewise.
	(xmltarget_amd64_linux_no_xml): Likewise.
	(xmltarget_i386_linux_no_xml): Likewise.
	(PTRACE_GETREGSET): Likewise.
	(PTRACE_SETREGSET): Likewise.
	(x86_fill_xstateregset): Likewise.
	(x86_store_xstateregset): Likewise.
	(x86_linux_process_qsupported): Likewise.
	(target_regsets): Add NT_X86_XSTATE entry and initialize nt_type.
	(x86_arch_setup): Set gdbserver_xmltarget.
	(the_low_target): Add x86_linux_process_qsupported.

	* server.c (handle_query): Call target_process_qsupported.

	* target.h (target_ops): Add process_qsupported.
	(target_process_qsupported): New.
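
For readers checking the new struct i387_xsave in the i387-fp.c hunk
below against the XSAVE area layout: XCR0 is expected at byte offset 464
and xstate_bv at byte offset 512, with the upper YMM halves starting at
offset 576.  The following is a sketch, not code from the patch; the
struct name xsave_layout is made up, fixed-width types stand in for the
patch's unsigned short/int/long long, and _Static_assert assumes a C11
compiler.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative mirror of the field order in struct i387_xsave.  */
struct xsave_layout
{
  uint16_t fctrl, fstat, ftag, fop;
  uint32_t fioff;
  uint16_t fiseg, pad1;
  uint32_t fooff;
  uint16_t foseg, pad2;
  uint32_t mxcsr, mxcsr_mask;
  uint8_t st_space[128];	/* Eight x87 registers, 16 bytes each.  */
  uint8_t xmm_space[256];	/* Up to sixteen XMM registers.  */
  uint8_t reserved1[48];
  uint64_t xcr0;		/* Stored by the Linux kernel.  */
  uint8_t reserved2[40];
  uint64_t xstate_bv;		/* Start of the XSAVE header.  */
  uint8_t reserved3[56];
  uint8_t ymmh_space[256];	/* Upper 128 bits of up to sixteen YMMs.  */
};

/* Compile-time checks of the offsets the layout is expected to have.  */
_Static_assert (offsetof (struct xsave_layout, xcr0) == 464,
		"xcr0 must sit at byte offset 464");
_Static_assert (offsetof (struct xsave_layout, xstate_bv) == 512,
		"xstate_bv must sit at byte offset 512");
_Static_assert (offsetof (struct xsave_layout, ymmh_space) == 576,
		"ymmh_space must sit at byte offset 576");
```

If any reserved array were sized wrongly, the file would fail to compile
instead of silently reading the wrong bytes.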

diff --git a/gdb/gdbserver/Makefile.in b/gdb/gdbserver/Makefile.in
index 7fecced..2ec9784 100644
--- a/gdb/gdbserver/Makefile.in
+++ b/gdb/gdbserver/Makefile.in
@@ -217,6 +217,8 @@ clean:
 	rm -f powerpc-isa205-vsx64l.c
 	rm -f s390-linux32.c s390-linux64.c s390x-linux64.c
 	rm -f xml-builtin.c stamp-xml
+	rm -f i386-avx.c i386-avx-linux.c
+	rm -f amd64-avx.c amd64-avx-linux.c
 
 maintainer-clean realclean distclean: clean
 	rm -f nm.h tm.h xm.h config.status config.h stamp-h config.log
@@ -351,6 +353,12 @@ i386.c : $(srcdir)/../regformats/i386/i386.dat $(regdat_sh)
 i386-linux.o : i386-linux.c $(regdef_h)
 i386-linux.c : $(srcdir)/../regformats/i386/i386-linux.dat $(regdat_sh)
 	$(SHELL) $(regdat_sh) $(srcdir)/../regformats/i386/i386-linux.dat i386-linux.c
+i386-avx.o : i386-avx.c $(regdef_h)
+i386-avx.c : $(srcdir)/../regformats/i386/i386-avx.dat $(regdat_sh)
+	$(SHELL) $(regdat_sh) $(srcdir)/../regformats/i386/i386-avx.dat i386-avx.c
+i386-avx-linux.o : i386-avx-linux.c $(regdef_h)
+i386-avx-linux.c : $(srcdir)/../regformats/i386/i386-avx-linux.dat $(regdat_sh)
+	$(SHELL) $(regdat_sh) $(srcdir)/../regformats/i386/i386-avx-linux.dat i386-avx-linux.c
 reg-ia64.o : reg-ia64.c $(regdef_h)
 reg-ia64.c : $(srcdir)/../regformats/reg-ia64.dat $(regdat_sh)
 	$(SHELL) $(regdat_sh) $(srcdir)/../regformats/reg-ia64.dat reg-ia64.c
@@ -438,6 +446,12 @@ amd64.c : $(srcdir)/../regformats/i386/amd64.dat $(regdat_sh)
 amd64-linux.o : amd64-linux.c $(regdef_h)
 amd64-linux.c : $(srcdir)/../regformats/i386/amd64-linux.dat $(regdat_sh)
 	$(SHELL) $(regdat_sh) $(srcdir)/../regformats/i386/amd64-linux.dat amd64-linux.c
+amd64-avx.o : amd64-avx.c $(regdef_h)
+amd64-avx.c : $(srcdir)/../regformats/i386/amd64-avx.dat $(regdat_sh)
+	$(SHELL) $(regdat_sh) $(srcdir)/../regformats/i386/amd64-avx.dat amd64-avx.c
+amd64-avx-linux.o : amd64-avx-linux.c $(regdef_h)
+amd64-avx-linux.c : $(srcdir)/../regformats/i386/amd64-avx-linux.dat $(regdat_sh)
+	$(SHELL) $(regdat_sh) $(srcdir)/../regformats/i386/amd64-avx-linux.dat amd64-avx-linux.c
 reg-xtensa.o : reg-xtensa.c $(regdef_h)
 reg-xtensa.c : $(srcdir)/../regformats/reg-xtensa.dat $(regdat_sh)
 	$(SHELL) $(regdat_sh) $(srcdir)/../regformats/reg-xtensa.dat reg-xtensa.c
diff --git a/gdb/gdbserver/configure.srv b/gdb/gdbserver/configure.srv
index f7c80bd..8bc9aeb 100644
--- a/gdb/gdbserver/configure.srv
+++ b/gdb/gdbserver/configure.srv
@@ -22,17 +22,17 @@
 # Default hostio_last_error implementation
 srv_hostio_err_objs="hostio-errno.o"
 
-srv_i386_regobj=i386.o
-srv_i386_linux_regobj=i386-linux.o
-srv_amd64_regobj=amd64.o
-srv_amd64_linux_regobj=amd64-linux.o
+srv_i386_regobj="i386.o i386-avx.o"
+srv_i386_linux_regobj="i386-linux.o i386-avx-linux.o"
+srv_amd64_regobj="amd64.o amd64-avx.o"
+srv_amd64_linux_regobj="amd64-linux.o amd64-avx-linux.o"
 
-srv_i386_32bit_xmlfiles="i386/32bit-core.xml i386/32bit-sse.xml"
-srv_i386_64bit_xmlfiles="i386/64bit-core.xml i386/64bit-sse.xml"
-srv_i386_xmlfiles="i386/i386.xml $srv_i386_32bit_xmlfiles"
-srv_amd64_xmlfiles="i386/amd64.xml $srv_i386_64bit_xmlfiles"
-srv_i386_linux_xmlfiles="i386/i386-linux.xml i386/32bit-linux.xml $srv_i386_32bit_xmlfiles"
-srv_amd64_linux_xmlfiles="i386/amd64-linux.xml i386/64bit-linux.xml $srv_i386_64bit_xmlfiles"
+srv_i386_32bit_xmlfiles="i386/32bit-core.xml i386/32bit-sse.xml i386/32bit-avx.xml"
+srv_i386_64bit_xmlfiles="i386/64bit-core.xml i386/64bit-sse.xml i386/64bit-avx.xml"
+srv_i386_xmlfiles="i386/i386.xml i386/i386-avx.xml $srv_i386_32bit_xmlfiles"
+srv_amd64_xmlfiles="i386/amd64.xml i386/amd64-avx.xml $srv_i386_64bit_xmlfiles"
+srv_i386_linux_xmlfiles="i386/i386-linux.xml i386/i386-avx-linux.xml i386/32bit-linux.xml $srv_i386_32bit_xmlfiles"
+srv_amd64_linux_xmlfiles="i386/amd64-linux.xml i386/amd64-avx-linux.xml i386/64bit-linux.xml $srv_i386_64bit_xmlfiles"
 
 # Input is taken from the "${target}" variable.
 
diff --git a/gdb/gdbserver/i387-fp.c b/gdb/gdbserver/i387-fp.c
index 7ef4ba3..5461022 100644
--- a/gdb/gdbserver/i387-fp.c
+++ b/gdb/gdbserver/i387-fp.c
@@ -19,6 +19,7 @@
 
 #include "server.h"
 #include "i387-fp.h"
+#include "i386-xstate.h"
 
 int num_xmm_registers = 8;
 
@@ -72,6 +73,46 @@ struct i387_fxsave {
   unsigned char xmm_space[256];
 };
 
+struct i387_xsave {
+  /* All these are only sixteen bits, plus padding, except for fop (which
+     is only eleven bits), and fooff / fioff (which are 32 bits each).  */
+  unsigned short fctrl;
+  unsigned short fstat;
+  unsigned short ftag;
+  unsigned short fop;
+  unsigned int fioff;
+  unsigned short fiseg;
+  unsigned short pad1;
+  unsigned int fooff;
+  unsigned short foseg;
+  unsigned short pad12;
+
+  unsigned int mxcsr;
+  unsigned int mxcsr_mask;
+
+  /* Space for eight 80-bit FP values in 128-bit spaces.  */
+  unsigned char st_space[128];
+
+  /* Space for eight 128-bit XMM values, or 16 on x86-64.  */
+  unsigned char xmm_space[256];
+
+  unsigned char reserved1[48];
+
+  /* The extended control register 0 (the XFEATURE_ENABLED_MASK
+     register).  */
+  unsigned long long xcr0;
+
+  unsigned char reserved2[40];
+
+  /* The XSTATE_BV bit vector.  */
+  unsigned long long xstate_bv;
+
+  unsigned char reserved3[56];
+
+  /* Space for eight upper 128-bit YMM values, or 16 on x86-64.  */
+  unsigned char ymmh_space[256];
+};
+
 void
 i387_cache_to_fsave (struct regcache *regcache, void *buf)
 {
@@ -199,6 +240,128 @@ i387_cache_to_fxsave (struct regcache *regcache, void *buf)
   fp->foseg = val;
 }
 
+void
+i387_cache_to_xsave (struct regcache *regcache, void *buf)
+{
+  struct i387_xsave *fp = (struct i387_xsave *) buf;
+  int i;
+  unsigned long val, val2;
+  unsigned int clear_bv;
+  unsigned long long xstate_bv = 0;
+  char raw[16];
+  char *p;
+
+  /* The supported bits in `xstate_bv' fit in 1 byte.  Find the state
+     components that XCR0 enables but whose bit in xstate_bv is zero.  */
+  clear_bv = (~fp->xstate_bv) & x86_xcr0;
+
+  /* Clear the x87 and vector registers in the buffer if their bit in
+     xstate_bv is zero.  */
+  if (clear_bv)
+    {
+      if ((clear_bv & I386_XSTATE_X87))
+	for (i = 0; i < 8; i++)
+	  memset (((char *) &fp->st_space[0]) + i * 16, 0, 10);
+
+      if ((clear_bv & I386_XSTATE_SSE))
+	for (i = 0; i < num_xmm_registers; i++) 
+	  memset (((char *) &fp->xmm_space[0]) + i * 16, 0, 16);
+
+      if ((clear_bv & I386_XSTATE_AVX))
+	for (i = 0; i < num_xmm_registers; i++) 
+	  memset (((char *) &fp->ymmh_space[0]) + i * 16, 0, 16);
+    }
+
+  /* Check if any x87 registers are changed.  */
+  if ((x86_xcr0 & I386_XSTATE_X87))
+    {
+      int st0_regnum = find_regno ("st0");
+
+      for (i = 0; i < 8; i++)
+	{
+	  collect_register (regcache, i + st0_regnum, raw);
+	  p = ((char *) &fp->st_space[0]) + i * 16;
+	  if (memcmp (raw, p, 10))
+	    {
+	      xstate_bv |= I386_XSTATE_X87;
+	      memcpy (p, raw, 10);
+	    }
+	}
+    }
+
+  /* Check if any SSE registers are changed.  */
+  if ((x86_xcr0 & I386_XSTATE_SSE))
+    {
+      int xmm0_regnum = find_regno ("xmm0");
+
+      for (i = 0; i < num_xmm_registers; i++) 
+	{
+	  collect_register (regcache, i + xmm0_regnum, raw);
+	  p = ((char *) &fp->xmm_space[0]) + i * 16;
+	  if (memcmp (raw, p, 16))
+	    {
+	      xstate_bv |= I386_XSTATE_SSE;
+	      memcpy (p, raw, 16);
+	    }
+	}
+    }
+
+  /* Check if any AVX registers are changed.  */
+  if ((x86_xcr0 & I386_XSTATE_AVX))
+    {
+      int ymm0h_regnum = find_regno ("ymm0h");
+
+      for (i = 0; i < num_xmm_registers; i++) 
+	{
+	  collect_register (regcache, i + ymm0h_regnum, raw);
+	  p = ((char *) &fp->ymmh_space[0]) + i * 16;
+	  if (memcmp (raw, p, 16))
+	    {
+	      xstate_bv |= I386_XSTATE_AVX;
+	      memcpy (p, raw, 16);
+	    }
+	}
+    }
+
+  /* Update the corresponding bits in xstate_bv if any x87/SSE/AVX
+     registers are changed.  */
+  fp->xstate_bv |= xstate_bv;
+
+  collect_register_by_name (regcache, "fioff", &fp->fioff);
+  collect_register_by_name (regcache, "fooff", &fp->fooff);
+  collect_register_by_name (regcache, "mxcsr", &fp->mxcsr);
+
+  /* This one's 11 bits... */
+  collect_register_by_name (regcache, "fop", &val2);
+  fp->fop = (val2 & 0x7FF) | (fp->fop & 0xF800);
+
+  /* Some registers are 16-bit.  */
+  collect_register_by_name (regcache, "fctrl", &val);
+  fp->fctrl = val;
+
+  collect_register_by_name (regcache, "fstat", &val);
+  fp->fstat = val;
+
+  /* Convert to the simplified tag form stored in fxsave data.  */
+  collect_register_by_name (regcache, "ftag", &val);
+  val &= 0xFFFF;
+  val2 = 0;
+  for (i = 7; i >= 0; i--)
+    {
+      int tag = (val >> (i * 2)) & 3;
+
+      if (tag != 3)
+	val2 |= (1 << i);
+    }
+  fp->ftag = val2;
+
+  collect_register_by_name (regcache, "fiseg", &val);
+  fp->fiseg = val;
+
+  collect_register_by_name (regcache, "foseg", &val);
+  fp->foseg = val;
+}
+
 static int
 i387_ftag (struct i387_fxsave *fp, int regno)
 {
@@ -296,3 +459,107 @@ i387_fxsave_to_cache (struct regcache *regcache, const void *buf)
   val = (fp->fop) & 0x7FF;
   supply_register_by_name (regcache, "fop", &val);
 }
+
+void
+i387_xsave_to_cache (struct regcache *regcache, const void *buf)
+{
+  struct i387_xsave *fp = (struct i387_xsave *) buf;
+  struct i387_fxsave *fxp = (struct i387_fxsave *) buf;
+  int i, top;
+  unsigned long val;
+  unsigned int clear_bv;
+  char *p;
+
+  /* The supported bits in `xstate_bv' fit in 1 byte.  Find the state
+     components that XCR0 enables but whose bit in xstate_bv is zero.  */
+  clear_bv = (~fp->xstate_bv) & x86_xcr0;
+
+  /* Supply the x87 registers, as zero if their xstate_bv bit is clear.  */
+  if ((x86_xcr0 & I386_XSTATE_X87))
+    {
+      int st0_regnum = find_regno ("st0");
+
+      if ((clear_bv & I386_XSTATE_X87))
+	p = NULL;
+      else
+	p = (char *) buf;
+
+      for (i = 0; i < 8; i++)
+	{
+	  if (p)
+	    p = ((char *) &fp->st_space[0]) + i * 16;
+	  supply_register (regcache, i + st0_regnum, p);
+	}
+    }
+
+  if ((x86_xcr0 & I386_XSTATE_SSE))
+    {
+      int xmm0_regnum = find_regno ("xmm0");
+
+      if ((clear_bv & I386_XSTATE_SSE))
+	p = NULL;
+      else
+	p = (char *) buf;
+
+      for (i = 0; i < num_xmm_registers; i++)
+	{
+	  if (p)
+	    p = ((char *) &fp->xmm_space[0]) + i * 16;
+	  supply_register (regcache, i + xmm0_regnum, p);
+	}
+    }
+
+  if ((x86_xcr0 & I386_XSTATE_AVX))
+    {
+      int ymm0h_regnum = find_regno ("ymm0h");
+
+      if ((clear_bv & I386_XSTATE_AVX))
+	p = NULL;
+      else
+	p = (char *) buf;
+
+      for (i = 0; i < num_xmm_registers; i++)
+	{
+	  if (p)
+	    p = ((char *) &fp->ymmh_space[0]) + i * 16;
+	  supply_register (regcache, i + ymm0h_regnum, p);
+	}
+    }
+
+  supply_register_by_name (regcache, "fioff", &fp->fioff);
+  supply_register_by_name (regcache, "fooff", &fp->fooff);
+  supply_register_by_name (regcache, "mxcsr", &fp->mxcsr);
+
+  /* Some registers are 16-bit.  */
+  val = fp->fctrl & 0xFFFF;
+  supply_register_by_name (regcache, "fctrl", &val);
+
+  val = fp->fstat & 0xFFFF;
+  supply_register_by_name (regcache, "fstat", &val);
+
+  /* Generate the form of ftag data that GDB expects.  */
+  top = (fp->fstat >> 11) & 0x7;
+  val = 0;
+  for (i = 7; i >= 0; i--)
+    {
+      int tag;
+      if (fp->ftag & (1 << i))
+	tag = i387_ftag (fxp, (i + 8 - top) % 8);
+      else
+	tag = 3;
+      val |= tag << (2 * i);
+    }
+  supply_register_by_name (regcache, "ftag", &val);
+
+  val = fp->fiseg & 0xFFFF;
+  supply_register_by_name (regcache, "fiseg", &val);
+
+  val = fp->foseg & 0xFFFF;
+  supply_register_by_name (regcache, "foseg", &val);
+
+  val = (fp->fop) & 0x7FF;
+  supply_register_by_name (regcache, "fop", &val);
+}
+
+/* Default to SSE.  */
+unsigned long long x86_xcr0 = I386_XSTATE_SSE_MASK;
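
The xstate_bv handling in i387_cache_to_xsave and i387_xsave_to_cache
above comes down to two bit operations: a component enabled in XCR0 but
clear in xstate_bv is read as zero, and a component that gets modified
has its bit set on write-back.  A minimal sketch of just that logic
follows.  It is not code from the patch; the function and macro names
are hypothetical, and the bit values assume the architectural XCR0
assignment (x87 = bit 0, SSE = bit 1, AVX = bit 2).

```c
#include <stdint.h>

#define XSTATE_X87 (1ull << 0)
#define XSTATE_SSE (1ull << 1)
#define XSTATE_AVX (1ull << 2)

/* Components that XCR0 enables but xstate_bv marks as not present.
   Their registers should be read as zero.  */
static uint64_t
components_to_clear (uint64_t xstate_bv, uint64_t xcr0)
{
  return ~xstate_bv & xcr0;
}

/* New xstate_bv after the components in MODIFIED were written.  */
static uint64_t
mark_modified (uint64_t xstate_bv, uint64_t modified)
{
  return xstate_bv | modified;
}
```

For example, with XCR0 = x87|SSE|AVX and xstate_bv = x87|SSE, only the
AVX component is reported by components_to_clear, so the upper YMM
halves read as zero while the XMM values are taken from the buffer.
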
diff --git a/gdb/gdbserver/i387-fp.h b/gdb/gdbserver/i387-fp.h
index d1e0681..ed1a322 100644
--- a/gdb/gdbserver/i387-fp.h
+++ b/gdb/gdbserver/i387-fp.h
@@ -26,6 +26,11 @@ void i387_fsave_to_cache (struct regcache *regcache, const void *buf);
 void i387_cache_to_fxsave (struct regcache *regcache, void *buf);
 void i387_fxsave_to_cache (struct regcache *regcache, const void *buf);
 
+void i387_cache_to_xsave (struct regcache *regcache, void *buf);
+void i387_xsave_to_cache (struct regcache *regcache, const void *buf);
+
+extern unsigned long long x86_xcr0;
+
 extern int num_xmm_registers;
 
 #endif /* I387_FP_H */
diff --git a/gdb/gdbserver/linux-arm-low.c b/gdb/gdbserver/linux-arm-low.c
index 54668f8..32bd7bb 100644
--- a/gdb/gdbserver/linux-arm-low.c
+++ b/gdb/gdbserver/linux-arm-low.c
@@ -354,16 +354,16 @@ arm_arch_setup (void)
 }
 
 struct regset_info target_regsets[] = {
-  { PTRACE_GETREGS, PTRACE_SETREGS, 18 * 4,
+  { PTRACE_GETREGS, PTRACE_SETREGS, 0, 18 * 4,
     GENERAL_REGS,
     arm_fill_gregset, arm_store_gregset },
-  { PTRACE_GETWMMXREGS, PTRACE_SETWMMXREGS, 16 * 8 + 6 * 4,
+  { PTRACE_GETWMMXREGS, PTRACE_SETWMMXREGS, 0, 16 * 8 + 6 * 4,
     EXTENDED_REGS,
     arm_fill_wmmxregset, arm_store_wmmxregset },
-  { PTRACE_GETVFPREGS, PTRACE_SETVFPREGS, 32 * 8 + 4,
+  { PTRACE_GETVFPREGS, PTRACE_SETVFPREGS, 0, 32 * 8 + 4,
     EXTENDED_REGS,
     arm_fill_vfpregset, arm_store_vfpregset },
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 struct linux_target_ops the_low_target = {
diff --git a/gdb/gdbserver/linux-crisv32-low.c b/gdb/gdbserver/linux-crisv32-low.c
index 6ba48b6..d426c32 100644
--- a/gdb/gdbserver/linux-crisv32-low.c
+++ b/gdb/gdbserver/linux-crisv32-low.c
@@ -365,9 +365,9 @@ cris_store_gregset (const void *buf)
 typedef unsigned long elf_gregset_t[cris_num_regs];
 
 struct regset_info target_regsets[] = {
-  { PTRACE_GETREGS, PTRACE_SETREGS, sizeof (elf_gregset_t),
+  { PTRACE_GETREGS, PTRACE_SETREGS, 0, sizeof (elf_gregset_t),
     GENERAL_REGS, cris_fill_gregset, cris_store_gregset },
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 struct linux_target_ops the_low_target = {
diff --git a/gdb/gdbserver/linux-low.c b/gdb/gdbserver/linux-low.c
index ad68179..f5d4c41 100644
--- a/gdb/gdbserver/linux-low.c
+++ b/gdb/gdbserver/linux-low.c
@@ -39,6 +39,7 @@
 #include <dirent.h>
 #include <sys/stat.h>
 #include <sys/vfs.h>
+#include <sys/uio.h>
 #ifndef ELFMAG0
 /* Don't include <linux/elf.h> here.  If it got included by gdb_proc_service.h
    then ELFMAG0 will have been defined.  If it didn't get included by
@@ -2957,14 +2958,15 @@ regsets_fetch_inferior_registers (struct regcache *regcache)
   struct regset_info *regset;
   int saw_general_regs = 0;
   int pid;
+  struct iovec iov;
 
   regset = target_regsets;
 
   pid = lwpid_of (get_thread_lwp (current_inferior));
   while (regset->size >= 0)
     {
-      void *buf;
-      int res;
+      void *buf, *data;
+      int nt_type, res;
 
       if (regset->size == 0 || disabled_regsets[regset - target_regsets])
 	{
@@ -2973,10 +2975,21 @@ regsets_fetch_inferior_registers (struct regcache *regcache)
 	}
 
       buf = xmalloc (regset->size);
+
+      nt_type = regset->nt_type;
+      if (nt_type)
+	{
+	  iov.iov_base = buf;
+	  iov.iov_len = regset->size;
+	  data = (void *) &iov;
+	}
+      else
+	data = buf;
+
 #ifndef __sparc__
-      res = ptrace (regset->get_request, pid, 0, buf);
+      res = ptrace (regset->get_request, pid, nt_type, data);
 #else
-      res = ptrace (regset->get_request, pid, buf, 0);
+      res = ptrace (regset->get_request, pid, data, nt_type);
 #endif
       if (res < 0)
 	{
@@ -3014,14 +3027,15 @@ regsets_store_inferior_registers (struct regcache *regcache)
   struct regset_info *regset;
   int saw_general_regs = 0;
   int pid;
+  struct iovec iov;
 
   regset = target_regsets;
 
   pid = lwpid_of (get_thread_lwp (current_inferior));
   while (regset->size >= 0)
     {
-      void *buf;
-      int res;
+      void *buf, *data;
+      int nt_type, res;
 
       if (regset->size == 0 || disabled_regsets[regset - target_regsets])
 	{
@@ -3034,10 +3048,21 @@ regsets_store_inferior_registers (struct regcache *regcache)
       /* First fill the buffer with the current register set contents,
 	 in case there are any items in the kernel's regset that are
 	 not in gdbserver's regcache.  */
+
+      nt_type = regset->nt_type;
+      if (nt_type)
+	{
+	  iov.iov_base = buf;
+	  iov.iov_len = regset->size;
+	  data = (void *) &iov;
+	}
+      else
+	data = buf;
+
 #ifndef __sparc__
-      res = ptrace (regset->get_request, pid, 0, buf);
+      res = ptrace (regset->get_request, pid, nt_type, data);
 #else
-      res = ptrace (regset->get_request, pid, buf, 0);
+      res = ptrace (regset->get_request, pid, data, nt_type);
 #endif
 
       if (res == 0)
@@ -3047,9 +3072,9 @@ regsets_store_inferior_registers (struct regcache *regcache)
 
 	  /* Only now do we write the register set.  */
 #ifndef __sparc__
-	  res = ptrace (regset->set_request, pid, 0, buf);
+	  res = ptrace (regset->set_request, pid, nt_type, data);
 #else
-	  res = ptrace (regset->set_request, pid, buf, 0);
+	  res = ptrace (regset->set_request, pid, data, nt_type);
 #endif
 	}
 
@@ -4113,6 +4138,13 @@ linux_core_of_thread (ptid_t ptid)
   return core;
 }
 
+static void
+linux_process_qsupported (const char *query)
+{
+  if (the_low_target.process_qsupported != NULL)
+    the_low_target.process_qsupported (query);
+}
+
 static struct target_ops linux_target_ops = {
   linux_create_inferior,
   linux_attach,
@@ -4156,7 +4188,8 @@ static struct target_ops linux_target_ops = {
 #else
   NULL,
 #endif
-  linux_core_of_thread
+  linux_core_of_thread,
+  linux_process_qsupported
 };
 
 static void
diff --git a/gdb/gdbserver/linux-low.h b/gdb/gdbserver/linux-low.h
index d7aa418..52623bf 100644
--- a/gdb/gdbserver/linux-low.h
+++ b/gdb/gdbserver/linux-low.h
@@ -35,6 +35,9 @@ enum regset_type {
 struct regset_info
 {
   int get_request, set_request;
+  /* If NT_TYPE isn't 0, it will be passed to ptrace as the 3rd
+     argument and the 4th argument should be "const struct iovec *".  */
+  int nt_type;
   int size;
   enum regset_type type;
   regset_fill_func fill_function;
@@ -111,6 +114,9 @@ struct linux_target_ops
 
   /* Hook to call prior to resuming a thread.  */
   void (*prepare_to_resume) (struct lwp_info *);
+
+  /* Hook to support target specific qSupported.  */
+  void (*process_qsupported) (const char *);
 };
 
 extern struct linux_target_ops the_low_target;
diff --git a/gdb/gdbserver/linux-m68k-low.c b/gdb/gdbserver/linux-m68k-low.c
index 14e3864..6c98bb1 100644
--- a/gdb/gdbserver/linux-m68k-low.c
+++ b/gdb/gdbserver/linux-m68k-low.c
@@ -112,14 +112,14 @@ m68k_store_fpregset (struct regcache *regcache, const void *buf)
 
 struct regset_info target_regsets[] = {
 #ifdef HAVE_PTRACE_GETREGS
-  { PTRACE_GETREGS, PTRACE_SETREGS, sizeof (elf_gregset_t),
+  { PTRACE_GETREGS, PTRACE_SETREGS, 0, sizeof (elf_gregset_t),
     GENERAL_REGS,
     m68k_fill_gregset, m68k_store_gregset },
-  { PTRACE_GETFPREGS, PTRACE_SETFPREGS, sizeof (elf_fpregset_t),
+  { PTRACE_GETFPREGS, PTRACE_SETFPREGS, 0, sizeof (elf_fpregset_t),
     FP_REGS,
     m68k_fill_fpregset, m68k_store_fpregset },
 #endif /* HAVE_PTRACE_GETREGS */
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 static const unsigned char m68k_breakpoint[] = { 0x4E, 0x4F };
diff --git a/gdb/gdbserver/linux-mips-low.c b/gdb/gdbserver/linux-mips-low.c
index 70f6700..1c04b2e 100644
--- a/gdb/gdbserver/linux-mips-low.c
+++ b/gdb/gdbserver/linux-mips-low.c
@@ -343,12 +343,12 @@ mips_store_fpregset (struct regcache *regcache, const void *buf)
 
 struct regset_info target_regsets[] = {
 #ifdef HAVE_PTRACE_GETREGS
-  { PTRACE_GETREGS, PTRACE_SETREGS, 38 * 8, GENERAL_REGS,
+  { PTRACE_GETREGS, PTRACE_SETREGS, 0, 38 * 8, GENERAL_REGS,
     mips_fill_gregset, mips_store_gregset },
-  { PTRACE_GETFPREGS, PTRACE_SETFPREGS, 33 * 8, FP_REGS,
+  { PTRACE_GETFPREGS, PTRACE_SETFPREGS, 0, 33 * 8, FP_REGS,
     mips_fill_fpregset, mips_store_fpregset },
 #endif /* HAVE_PTRACE_GETREGS */
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 struct linux_target_ops the_low_target = {
diff --git a/gdb/gdbserver/linux-ppc-low.c b/gdb/gdbserver/linux-ppc-low.c
index 10a1309..0dab604 100644
--- a/gdb/gdbserver/linux-ppc-low.c
+++ b/gdb/gdbserver/linux-ppc-low.c
@@ -593,14 +593,14 @@ struct regset_info target_regsets[] = {
      fetch them every time, but still fall back to PTRACE_PEEKUSER for the
      general registers.  Some kernels support these, but not the newer
      PPC_PTRACE_GETREGS.  */
-  { PTRACE_GETVSXREGS, PTRACE_SETVSXREGS, SIZEOF_VSXREGS, EXTENDED_REGS,
+  { PTRACE_GETVSXREGS, PTRACE_SETVSXREGS, 0, SIZEOF_VSXREGS, EXTENDED_REGS,
   ppc_fill_vsxregset, ppc_store_vsxregset },
-  { PTRACE_GETVRREGS, PTRACE_SETVRREGS, SIZEOF_VRREGS, EXTENDED_REGS,
+  { PTRACE_GETVRREGS, PTRACE_SETVRREGS, 0, SIZEOF_VRREGS, EXTENDED_REGS,
     ppc_fill_vrregset, ppc_store_vrregset },
-  { PTRACE_GETEVRREGS, PTRACE_SETEVRREGS, 32 * 4 + 8 + 4, EXTENDED_REGS,
+  { PTRACE_GETEVRREGS, PTRACE_SETEVRREGS, 0, 32 * 4 + 8 + 4, EXTENDED_REGS,
     ppc_fill_evrregset, ppc_store_evrregset },
-  { 0, 0, 0, GENERAL_REGS, ppc_fill_gregset, NULL },
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, 0, GENERAL_REGS, ppc_fill_gregset, NULL },
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 struct linux_target_ops the_low_target = {
diff --git a/gdb/gdbserver/linux-s390-low.c b/gdb/gdbserver/linux-s390-low.c
index 5460f57..eb865dc 100644
--- a/gdb/gdbserver/linux-s390-low.c
+++ b/gdb/gdbserver/linux-s390-low.c
@@ -181,8 +181,8 @@ static void s390_fill_gregset (struct regcache *regcache, void *buf)
 }
 
 struct regset_info target_regsets[] = {
-  { 0, 0, 0, GENERAL_REGS, s390_fill_gregset, NULL },
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, 0, GENERAL_REGS, s390_fill_gregset, NULL },
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 
diff --git a/gdb/gdbserver/linux-sh-low.c b/gdb/gdbserver/linux-sh-low.c
index 9d27e7f..87a0dd2 100644
--- a/gdb/gdbserver/linux-sh-low.c
+++ b/gdb/gdbserver/linux-sh-low.c
@@ -104,8 +104,8 @@ static void sh_fill_gregset (struct regcache *regcache, void *buf)
 }
 
 struct regset_info target_regsets[] = {
-  { 0, 0, 0, GENERAL_REGS, sh_fill_gregset, NULL },
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, 0, GENERAL_REGS, sh_fill_gregset, NULL },
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 struct linux_target_ops the_low_target = {
diff --git a/gdb/gdbserver/linux-sparc-low.c b/gdb/gdbserver/linux-sparc-low.c
index 0bb5f2f..e0bfe81 100644
--- a/gdb/gdbserver/linux-sparc-low.c
+++ b/gdb/gdbserver/linux-sparc-low.c
@@ -260,13 +260,13 @@ sparc_reinsert_addr (void)
 
 
 struct regset_info target_regsets[] = {
-  { PTRACE_GETREGS, PTRACE_SETREGS, sizeof (elf_gregset_t),
+  { PTRACE_GETREGS, PTRACE_SETREGS, 0, sizeof (elf_gregset_t),
     GENERAL_REGS,
     sparc_fill_gregset, sparc_store_gregset },
-  { PTRACE_GETFPREGS, PTRACE_SETFPREGS, sizeof (fpregset_t),
+  { PTRACE_GETFPREGS, PTRACE_SETFPREGS, 0, sizeof (fpregset_t),
     FP_REGS,
     sparc_fill_fpregset, sparc_store_fpregset },
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 struct linux_target_ops the_low_target = {
diff --git a/gdb/gdbserver/linux-x86-low.c b/gdb/gdbserver/linux-x86-low.c
index fe5d46e..fbdbecb 100644
--- a/gdb/gdbserver/linux-x86-low.c
+++ b/gdb/gdbserver/linux-x86-low.c
@@ -24,6 +24,8 @@
 #include "linux-low.h"
 #include "i387-fp.h"
 #include "i386-low.h"
+#include "i386-xstate.h"
+#include "elf/common.h"
 
 #include "gdb_proc_service.h"
 
@@ -31,10 +33,36 @@
 void init_registers_i386_linux (void);
 /* Defined in auto-generated file amd64-linux.c.  */
 void init_registers_amd64_linux (void);
+/* Defined in auto-generated file i386-avx-linux.c.  */
+void init_registers_i386_avx_linux (void);
+/* Defined in auto-generated file amd64-avx-linux.c.  */
+void init_registers_amd64_avx_linux (void);
+
+/* Backward compatibility for gdb without XML support.  */
+
+static const char *xmltarget_amd64_linux_no_xml = "@<target>\
+<architecture>i386:x86-64</architecture>\
+<osabi>GNU/Linux</osabi>\
+</target>";
+static const char *xmltarget_i386_linux_no_xml = "@<target>\
+<architecture>i386</architecture>\
+<osabi>GNU/Linux</osabi>\
+</target>";
 
 #include <sys/reg.h>
 #include <sys/procfs.h>
 #include <sys/ptrace.h>
+#include <sys/uio.h>
+#include <unistd.h>
+
+#ifndef PTRACE_GETREGSET
+#define PTRACE_GETREGSET	0x4204
+#endif
+
+#ifndef PTRACE_SETREGSET
+#define PTRACE_SETREGSET	0x4205
+#endif
+
 
 #ifndef PTRACE_GET_THREAD_AREA
 #define PTRACE_GET_THREAD_AREA 25
@@ -252,6 +280,18 @@ x86_store_fpxregset (struct regcache *regcache, const void *buf)
 
 #endif
 
+static void
+x86_fill_xstateregset (struct regcache *regcache, void *buf)
+{
+  i387_cache_to_xsave (regcache, buf);
+}
+
+static void
+x86_store_xstateregset (struct regcache *regcache, const void *buf)
+{
+  i387_xsave_to_cache (regcache, buf);
+}
+
 /* ??? The non-biarch i386 case stores all the i387 regs twice.
    Once in i387_.*fsave.* and once in i387_.*fxsave.*.
    This is, presumably, to handle the case where PTRACE_[GS]ETFPXREGS
@@ -264,21 +304,23 @@ x86_store_fpxregset (struct regcache *regcache, const void *buf)
 struct regset_info target_regsets[] =
 {
 #ifdef HAVE_PTRACE_GETREGS
-  { PTRACE_GETREGS, PTRACE_SETREGS, sizeof (elf_gregset_t),
+  { PTRACE_GETREGS, PTRACE_SETREGS, 0, sizeof (elf_gregset_t),
     GENERAL_REGS,
     x86_fill_gregset, x86_store_gregset },
+  { PTRACE_GETREGSET, PTRACE_SETREGSET, NT_X86_XSTATE, 0,
+    EXTENDED_REGS, x86_fill_xstateregset, x86_store_xstateregset },
 # ifndef __x86_64__
 #  ifdef HAVE_PTRACE_GETFPXREGS
-  { PTRACE_GETFPXREGS, PTRACE_SETFPXREGS, sizeof (elf_fpxregset_t),
+  { PTRACE_GETFPXREGS, PTRACE_SETFPXREGS, 0, sizeof (elf_fpxregset_t),
     EXTENDED_REGS,
     x86_fill_fpxregset, x86_store_fpxregset },
 #  endif
 # endif
-  { PTRACE_GETFPREGS, PTRACE_SETFPREGS, sizeof (elf_fpregset_t),
+  { PTRACE_GETFPREGS, PTRACE_SETFPREGS, 0, sizeof (elf_fpregset_t),
     FP_REGS,
     x86_fill_fpregset, x86_store_fpregset },
 #endif /* HAVE_PTRACE_GETREGS */
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 static CORE_ADDR
@@ -776,6 +818,66 @@ x86_siginfo_fixup (struct siginfo *native, void *inf, int direction)
   return 0;
 }
 \f
+/* Process qSupported query, "xmlRegisters=".  Update the buffer size for
+   PTRACE_GETREGSET.  */
+
+static void
+x86_linux_process_qsupported (const char *query)
+{
+  uint64_t xstateregs[I386_XSTATE_SSE_SIZE / sizeof (uint64_t)];
+  struct iovec iov;
+
+  /* Return if gdb doesn't support XML.  If gdb sends "xmlRegisters="
+     in qSupported query, it supports x86 XML target descriptions.  */
+  if (strncmp (query, "xmlRegisters=", 13) != 0)
+    return;
+
+  /* Update gdbserver_xmltarget with XML support.  */
+  if (num_xmm_registers == 8)
+    gdbserver_xmltarget = "i386-linux.xml";
+  else
+    gdbserver_xmltarget = "amd64-linux.xml";
+
+  /* Check if XSAVE extended state is supported.  */
+  iov.iov_base = xstateregs;
+  iov.iov_len = sizeof (xstateregs);
+
+  /* Check if PTRACE_GETREGSET works.  */
+  if (ptrace (PTRACE_GETREGSET, getpid (),
+	      (unsigned int) NT_X86_XSTATE, &iov) == 0)
+    {
+      struct regset_info *regset;
+      unsigned long long xcr0;
+
+      /* Get XCR0 from XSAVE extended state at byte 464.  */
+      xcr0 = xstateregs[464 / sizeof (long long)];
+
+      /* Use PTRACE_GETREGSET if it is available.  */
+      for (regset = target_regsets;
+	   regset->fill_function != NULL; regset++)
+	if (regset->get_request == PTRACE_GETREGSET)
+	  regset->size = I386_XSTATE_SIZE (xcr0);
+	else if (regset->type != GENERAL_REGS)
+	  regset->size = 0;
+
+      /* AVX is the highest feature we support.  */
+      if ((xcr0 & I386_XSTATE_AVX_MASK) == I386_XSTATE_AVX_MASK)
+	{
+	  x86_xcr0 = xcr0;
+
+#ifdef __x86_64__
+	  /* I386 has 8 xmm regs.  */
+	  if (num_xmm_registers == 8)
+	    init_registers_i386_avx_linux ();
+	  else
+	    init_registers_amd64_avx_linux ();
+#else
+	  init_registers_i386_avx_linux ();
+#endif
+	}
+    }
+}
+
 /* Initialize gdbserver for the architecture of the inferior.  */
 
 static void
@@ -798,6 +900,10 @@ x86_arch_setup (void)
     {
       init_registers_amd64_linux ();
 
+      /* Assume that gdb doesn't support XML.  We will update it in
+	 x86_linux_process_qsupported if gdb does support XML.  */
+      gdbserver_xmltarget = xmltarget_amd64_linux_no_xml;
+
       /* Amd64 doesn't have HAVE_LINUX_USRREGS.  */
       the_low_target.num_regs = -1;
       the_low_target.regmap = NULL;
@@ -815,6 +921,10 @@ x86_arch_setup (void)
 
   init_registers_i386_linux ();
 
+  /* Assume that gdb doesn't support XML.  We will update it in
+     x86_linux_process_qsupported if gdb does support XML.  */
+  gdbserver_xmltarget = xmltarget_i386_linux_no_xml;
+
   the_low_target.num_regs = I386_NUM_REGS;
   the_low_target.regmap = i386_regmap;
   the_low_target.cannot_fetch_register = i386_cannot_fetch_register;
@@ -854,5 +964,6 @@ struct linux_target_ops the_low_target =
   x86_siginfo_fixup,
   x86_linux_new_process,
   x86_linux_new_thread,
-  x86_linux_prepare_to_resume
+  x86_linux_prepare_to_resume,
+  x86_linux_process_qsupported
 };
diff --git a/gdb/gdbserver/linux-xtensa-low.c b/gdb/gdbserver/linux-xtensa-low.c
index c5ed351..8d0e73a 100644
--- a/gdb/gdbserver/linux-xtensa-low.c
+++ b/gdb/gdbserver/linux-xtensa-low.c
@@ -131,13 +131,13 @@ xtensa_store_xtregset (struct regcache *regcache, const void *buf)
 }
 
 struct regset_info target_regsets[] = {
-  { PTRACE_GETREGS, PTRACE_SETREGS, sizeof (elf_gregset_t),
+  { PTRACE_GETREGS, PTRACE_SETREGS, 0, sizeof (elf_gregset_t),
     GENERAL_REGS,
     xtensa_fill_gregset, xtensa_store_gregset },
-  { PTRACE_GETXTREGS, PTRACE_SETXTREGS, XTENSA_ELF_XTREG_SIZE,
+  { PTRACE_GETXTREGS, PTRACE_SETXTREGS, 0, XTENSA_ELF_XTREG_SIZE,
     EXTENDED_REGS,
     xtensa_fill_xtregset, xtensa_store_xtregset },
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 #if XCHAL_HAVE_BE
diff --git a/gdb/gdbserver/server.c b/gdb/gdbserver/server.c
index 232085a..e596b33 100644
--- a/gdb/gdbserver/server.c
+++ b/gdb/gdbserver/server.c
@@ -1292,6 +1292,8 @@ handle_query (char *own_buf, int packet_len, int *new_packet_len_p)
 		if (target_supports_multi_process ())
 		  multi_process = 1;
 	      }
+	    else
+	      target_process_qsupported (p);
 	  }
 
       sprintf (own_buf, "PacketSize=%x;QPassSignals+", PBUFSIZ - 1);
diff --git a/gdb/gdbserver/target.h b/gdb/gdbserver/target.h
index ac68652..6109b1c 100644
--- a/gdb/gdbserver/target.h
+++ b/gdb/gdbserver/target.h
@@ -286,6 +286,9 @@ struct target_ops
 
   /* Returns the core given a thread, or -1 if not known.  */
   int (*core_of_thread) (ptid_t);
+
+  /* Target specific qSupported support.  */
+  void (*process_qsupported) (const char *);
 };
 
 extern struct target_ops *the_target;
@@ -326,6 +329,10 @@ void set_target_ops (struct target_ops *);
   (the_target->supports_multi_process ? \
    (*the_target->supports_multi_process) () : 0)
 
+#define target_process_qsupported(query) \
+  if (the_target->process_qsupported) \
+    the_target->process_qsupported (query)
+
 /* Start non-stop mode, returns 0 on success, -1 on failure.   */
 
 int start_non_stop (int nonstop);
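
For readers following the regset change in linux-low.c: the sketch
below isolates how the 4th ptrace argument is chosen once a regset
carries an nt_type.  It is only an illustration of the logic in the
hunks above, not code from the patch; the helper name
regset_ptrace_data is hypothetical.

```c
#include <stddef.h>
#include <sys/uio.h>

/* Hypothetical helper (not part of the patch): choose the pointer that
   is handed to ptrace as its data argument.  When NT_TYPE is non-zero,
   the request is PTRACE_GETREGSET/PTRACE_SETREGSET style, so the kernel
   expects a struct iovec describing the buffer.  When NT_TYPE is zero,
   the raw buffer is passed, as before the patch.  */
static void *
regset_ptrace_data (int nt_type, void *buf, size_t size, struct iovec *iov)
{
  if (nt_type != 0)
    {
      iov->iov_base = buf;
      iov->iov_len = size;
      return iov;
    }

  return buf;
}
```

Existing regsets keep nt_type at 0 and so behave exactly as before;
only the NT_X86_XSTATE entry takes the iovec path.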

^ permalink raw reply	[flat|nested] 115+ messages in thread

* PATCH: 3/6 [3rd try]: Add AVX support (i386 changes)
  2010-03-12 16:49       ` H.J. Lu
  2010-03-13  1:38         ` H.J. Lu
@ 2010-03-29  1:11         ` H.J. Lu
  2010-04-02 14:31           ` H.J. Lu
  1 sibling, 1 reply; 115+ messages in thread
From: H.J. Lu @ 2010-03-29  1:11 UTC (permalink / raw)
  To: GDB

Hi,

Here are the i386 changes to support AVX. OK to install?

Thanks.


H.J.
----
2010-03-28  H.J. Lu  <hongjiu.lu@intel.com>

	* i386-linux-nat.c: Include "regset.h", "elf/common.h",
	<sys/uio.h> and "i386-xstate.h".
	(PTRACE_GETREGSET): New.
	(PTRACE_SETREGSET): Likewise.
	(fetch_xstateregs): Likewise.
	(store_xstateregs): Likewise.
	(GETXSTATEREGS_SUPPLIES): Likewise.
	(regmap): Include 8 upper YMM registers.
	(i386_linux_fetch_inferior_registers): Support XSAVE extended
	state.
	(i386_linux_store_inferior_registers): Likewise.
	(i386_linux_read_description): Check and enable AVX target
	descriptions.

	* i386-linux-tdep.c: Include "regset.h", "i387-tdep.h",
	"i386-xstate.h" and "features/i386/i386-avx-linux.c".
	(i386_linux_regset_sections): Add ".reg-xstate".
	(i386_linux_gregset_reg_offset): Include 8 upper YMM registers.
	(i386_linux_update_xstateregset): New.
	(i386_linux_core_read_xcr0): Likewise.
	(i386_linux_core_read_description): Check and enable AVX target
	description.
	(i386_linux_init_abi): Set xsave_xcr0_offset.
	(_initialize_i386_linux_tdep): Call
	initialize_tdesc_i386_avx_linux.

	* i386-linux-tdep.h (I386_LINUX_ORIG_EAX_REGNUM): Replace
	I386_SSE_NUM_REGS with I386_AVX_NUM_REGS.
	(i386_linux_core_read_xcr0): New.
	(tdesc_i386_avx_linux): Likewise.
	(i386_linux_update_xstateregset): Likewise.
	(I386_LINUX_XSAVE_XCR0_OFFSET): Likewise.

	* i386-tdep.c: Include "remote.h", "i386-xstate.h" and
	"features/i386/i386-avx.c".
	(i386_ymm_names): New.
	(i386_ymmh_names): Likewise.
	(i386_ymmh_regnum_p): Likewise.
	(i386_ymm_regnum_p): Likewise.
	(i386_xmm_regnum_p): Likewise.
	(i386_register_name): Likewise.
	(i386_ymm_type): Likewise.
	(i386_supply_xstateregset): Likewise.
	(i386_collect_xstateregset): Likewise.
	(i386_sse_regnum_p): Removed.
	(i386_pseudo_register_name): Support pseudo YMM registers.
	(i386_pseudo_register_type): Likewise.
	(i386_pseudo_register_read): Likewise.
	(i386_pseudo_register_write): Likewise.
	(i386_dbx_reg_to_regnum): Return %ymmN register number for
	%xmmN if AVX is available.
	(i386_regset_from_core_section): Support .reg-xstate section.
	(i386_register_reggroup_p): Support upper YMM and YMM registers.
	(i386_validate_tdesc_p): Support org.gnu.gdb.i386.avx feature.
	Set ymmh_register_names, num_ymm_regs, ymm0h_regnum and xcr0.
	(i386_gdbarch_init): Set xstateregset.  Set xsave_xcr0_offset.
	Call set_gdbarch_register_name.  Replace I386_SSE_NUM_REGS with
	I386_AVX_NUM_REGS.  Set ymmh_register_names, ymm0h_regnum and
	num_ymm_regs.  Add num_ymm_regs to set_gdbarch_num_pseudo_regs.
	Set ymm0_regnum.  Call register_remote_support_xml.
	(_initialize_i386_tdep): Call initialize_tdesc_i386_avx.

	* i386-tdep.h (gdbarch_tdep): Add xstateregset, ymm0_regnum,
	xcr0, xsave_xcr0_offset, ymm0h_regnum, ymmh_register_names and
	i386_ymm_type.
	(i386_regnum): Add I386_YMM0H_REGNUM and I386_YMM7H_REGNUM.
	(I386_AVX_NUM_REGS): New.
	(i386_xmm_regnum_p): Likewise.
	(i386_ymm_regnum_p): Likewise.
	(i386_ymmh_regnum_p): Likewise.

	* common/i386-xstate.h: New.
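
As a rough illustration of how XCR0 drives both the regset size and
the choice of target description, here is a small standalone sketch.
The macros are copied from common/i386-xstate.h in this patch; the
function pick_tdesc_name and the strings it returns are hypothetical
stand-ins for the tdesc_i386_linux / tdesc_i386_avx_linux selection,
not GDB code.

```c
#include <stdint.h>

/* Feature bits, masks and sizes as defined in common/i386-xstate.h.  */
#define I386_XSTATE_X87		(1ULL << 0)
#define I386_XSTATE_SSE		(1ULL << 1)
#define I386_XSTATE_AVX		(1ULL << 2)

#define I386_XSTATE_SSE_MASK	(I386_XSTATE_X87 | I386_XSTATE_SSE)
#define I386_XSTATE_AVX_MASK	(I386_XSTATE_SSE_MASK | I386_XSTATE_AVX)

#define I386_XSTATE_SSE_SIZE	576
#define I386_XSTATE_AVX_SIZE	832

#define I386_XSTATE_SIZE(XCR0)	\
  (((XCR0) & I386_XSTATE_AVX) != 0 \
   ? I386_XSTATE_AVX_SIZE : I386_XSTATE_SSE_SIZE)

/* Hypothetical stand-in for the description choice: AVX is used only
   when PTRACE_GETREGSET works and x87, SSE and AVX are all enabled in
   XCR0.  Bits above AVX are ignored, so a newer CPU still gets the AVX
   description.  */
static const char *
pick_tdesc_name (int have_getregset, uint64_t xcr0)
{
  if (have_getregset
      && (xcr0 & I386_XSTATE_AVX_MASK) == I386_XSTATE_AVX_MASK)
    return "i386-avx-linux";

  return "i386-linux";
}
```

With XCR0 = 0x3 the buffer is 576 bytes and the plain description is
used; with XCR0 = 0x7 (or 0x7 plus unknown higher bits) the buffer is
832 bytes and the AVX description is used.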

diff --git a/gdb/common/i386-xstate.h b/gdb/common/i386-xstate.h
new file mode 100644
index 0000000..5e16015
--- /dev/null
+++ b/gdb/common/i386-xstate.h
@@ -0,0 +1,41 @@
+/* Common code for i386 XSAVE extended state.
+
+   Copyright (C) 2010 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#ifndef I386_XSTATE_H
+#define I386_XSTATE_H 1
+
+/* The extended state feature bits.  */
+#define I386_XSTATE_X87		(1ULL << 0)
+#define I386_XSTATE_SSE		(1ULL << 1)
+#define I386_XSTATE_AVX		(1ULL << 2)
+
+/* Supported mask and size of the extended state.  */
+#define I386_XSTATE_SSE_MASK	(I386_XSTATE_X87 | I386_XSTATE_SSE)
+#define I386_XSTATE_AVX_MASK	(I386_XSTATE_SSE_MASK | I386_XSTATE_AVX)
+
+#define I386_XSTATE_SSE_SIZE	576
+#define I386_XSTATE_AVX_SIZE	832
+#define I386_XSTATE_MAX_SIZE	832
+
+/* Get I386 XSAVE extended state size.  */
+#define I386_XSTATE_SIZE(XCR0)	\
+  (((XCR0) & I386_XSTATE_AVX) != 0 \
+   ? I386_XSTATE_AVX_SIZE : I386_XSTATE_SSE_SIZE)
+
+#endif /* I386_XSTATE_H */
diff --git a/gdb/i386-linux-nat.c b/gdb/i386-linux-nat.c
index 31b9086..d1048eb 100644
--- a/gdb/i386-linux-nat.c
+++ b/gdb/i386-linux-nat.c
@@ -23,11 +23,14 @@
 #include "inferior.h"
 #include "gdbcore.h"
 #include "regcache.h"
+#include "regset.h"
 #include "target.h"
 #include "linux-nat.h"
 
 #include "gdb_assert.h"
 #include "gdb_string.h"
+#include "elf/common.h"
+#include <sys/uio.h>
 #include <sys/ptrace.h>
 #include <sys/user.h>
 #include <sys/procfs.h>
@@ -69,6 +72,19 @@
 
 /* Defines ps_err_e, struct ps_prochandle.  */
 #include "gdb_proc_service.h"
+
+#include "i386-xstate.h"
+
+#ifndef PTRACE_GETREGSET
+#define PTRACE_GETREGSET	0x4204
+#endif
+
+#ifndef PTRACE_SETREGSET
+#define PTRACE_SETREGSET	0x4205
+#endif
+
+/* Does the current host support PTRACE_GETREGSET?  */
+static int have_ptrace_getregset = -1;
 \f
 
 /* The register sets used in GNU/Linux ELF core-dumps are identical to
@@ -98,6 +114,8 @@ static int regmap[] =
   -1, -1, -1, -1,		/* xmm0, xmm1, xmm2, xmm3 */
   -1, -1, -1, -1,		/* xmm4, xmm5, xmm6, xmm6 */
   -1,				/* mxcsr */
+  -1, -1, -1, -1,		/* ymm0h, ymm1h, ymm2h, ymm3h */
+  -1, -1, -1, -1,		/* ymm4h, ymm5h, ymm6h, ymm7h */
   ORIG_EAX
 };
 
@@ -110,6 +128,9 @@ static int regmap[] =
 #define GETFPXREGS_SUPPLIES(regno) \
   (I386_ST0_REGNUM <= (regno) && (regno) < I386_SSE_NUM_REGS)
 
+#define GETXSTATEREGS_SUPPLIES(regno) \
+  (I386_ST0_REGNUM <= (regno) && (regno) < I386_AVX_NUM_REGS)
+
 /* Does the current host support the GETREGS request?  */
 int have_ptrace_getregs =
 #ifdef HAVE_PTRACE_GETREGS
@@ -355,6 +376,57 @@ static void store_fpregs (const struct regcache *regcache, int tid, int regno) {
 
 /* Transfering floating-point and SSE registers to and from GDB.  */
 
+/* Fetch all registers covered by the PTRACE_GETREGSET request from
+   process/thread TID and store their values in GDB's register array.
+   Return non-zero if successful, zero otherwise.  */
+
+static int
+fetch_xstateregs (struct regcache *regcache, int tid)
+{
+  char xstateregs[I386_XSTATE_MAX_SIZE];
+  struct iovec iov;
+
+  if (!have_ptrace_getregset)
+    return 0;
+
+  iov.iov_base = xstateregs;
+  iov.iov_len = sizeof (xstateregs);
+  if (ptrace (PTRACE_GETREGSET, tid, (unsigned int) NT_X86_XSTATE,
+	      &iov) < 0)
+    perror_with_name (_("Couldn't read extended state status"));
+
+  i387_supply_xsave (regcache, -1, xstateregs);
+  return 1;
+}
+
+/* Store all valid registers in GDB's register array covered by the
+   PTRACE_SETREGSET request into the process/thread specified by TID.
+   Return non-zero if successful, zero otherwise.  */
+
+static int
+store_xstateregs (const struct regcache *regcache, int tid, int regno)
+{
+  char xstateregs[I386_XSTATE_MAX_SIZE];
+  struct iovec iov;
+
+  if (!have_ptrace_getregset)
+    return 0;
+
+  iov.iov_base = xstateregs;
+  iov.iov_len = sizeof (xstateregs);
+  if (ptrace (PTRACE_GETREGSET, tid, (unsigned int) NT_X86_XSTATE,
+	      &iov) < 0)
+    perror_with_name (_("Couldn't read extended state status"));
+
+  i387_collect_xsave (regcache, regno, xstateregs, 0);
+
+  if (ptrace (PTRACE_SETREGSET, tid, (unsigned int) NT_X86_XSTATE,
+	      &iov) < 0)
+    perror_with_name (_("Couldn't write extended state status"));
+
+  return 1;
+}
+
 #ifdef HAVE_PTRACE_GETFPXREGS
 
 /* Fill GDB's register array with the floating-point and SSE register
@@ -489,6 +561,8 @@ i386_linux_fetch_inferior_registers (struct target_ops *ops,
 	  return;
 	}
 
+      if (fetch_xstateregs (regcache, tid))
+	return;
       if (fetch_fpxregs (regcache, tid))
 	return;
       fetch_fpregs (regcache, tid);
@@ -501,6 +575,12 @@ i386_linux_fetch_inferior_registers (struct target_ops *ops,
       return;
     }
 
+  if (GETXSTATEREGS_SUPPLIES (regno))
+    {
+      if (fetch_xstateregs (regcache, tid))
+	return;
+    }
+
   if (GETFPXREGS_SUPPLIES (regno))
     {
       if (fetch_fpxregs (regcache, tid))
@@ -553,6 +633,8 @@ i386_linux_store_inferior_registers (struct target_ops *ops,
   if (regno == -1)
     {
       store_regs (regcache, tid, regno);
+      if (store_xstateregs (regcache, tid, regno))
+	return;
       if (store_fpxregs (regcache, tid, regno))
 	return;
       store_fpregs (regcache, tid, regno);
@@ -565,6 +647,12 @@ i386_linux_store_inferior_registers (struct target_ops *ops,
       return;
     }
 
+  if (GETXSTATEREGS_SUPPLIES (regno))
+    {
+      if (store_xstateregs (regcache, tid, regno))
+	return;
+    }
+
   if (GETFPXREGS_SUPPLIES (regno))
     {
       if (store_fpxregs (regcache, tid, regno))
@@ -858,7 +946,50 @@ i386_linux_child_post_startup_inferior (ptid_t ptid)
 static const struct target_desc *
 i386_linux_read_description (struct target_ops *ops)
 {
-  return tdesc_i386_linux;
+  static uint64_t xcr0;
+
+  if (have_ptrace_getregset == -1)
+    {
+      int tid;
+      uint64_t xstateregs[(I386_XSTATE_SSE_SIZE / sizeof (uint64_t))];
+      struct iovec iov;
+      unsigned int xstate_size;
+
+      /* GNU/Linux LWP ID's are process ID's.  */
+      tid = TIDGET (inferior_ptid);
+      if (tid == 0)
+	tid = PIDGET (inferior_ptid); /* Not a threaded program.  */
+
+      iov.iov_base = xstateregs;
+      iov.iov_len = sizeof (xstateregs);
+
+      /* Check if PTRACE_GETREGSET works.  */
+      if (ptrace (PTRACE_GETREGSET, tid, (unsigned int) NT_X86_XSTATE,
+		  &iov) < 0)
+	{
+	  have_ptrace_getregset = 0;
+	  xstate_size = 0;
+	}
+      else
+	{
+	  have_ptrace_getregset = 1;
+
+	  /* Get XCR0 from XSAVE extended state.  */
+	  xcr0 = xstateregs[(I386_LINUX_XSAVE_XCR0_OFFSET
+			     / sizeof (long long))];
+
+	  xstate_size = I386_XSTATE_SIZE (xcr0);
+	}
+
+      i386_linux_update_xstateregset (xstate_size);
+    }
+
+  /* Check the native XCR0 only if PTRACE_GETREGSET is available.  */
+  if (have_ptrace_getregset
+      && (xcr0 & I386_XSTATE_AVX_MASK) == I386_XSTATE_AVX_MASK)
+    return tdesc_i386_avx_linux;
+  else
+    return tdesc_i386_linux;
 }
 
 void
diff --git a/gdb/i386-linux-tdep.c b/gdb/i386-linux-tdep.c
index b23c109..bda5d19 100644
--- a/gdb/i386-linux-tdep.c
+++ b/gdb/i386-linux-tdep.c
@@ -23,6 +23,7 @@
 #include "frame.h"
 #include "value.h"
 #include "regcache.h"
+#include "regset.h"
 #include "inferior.h"
 #include "osabi.h"
 #include "reggroups.h"
@@ -36,9 +37,11 @@
 #include "solib-svr4.h"
 #include "symtab.h"
 #include "arch-utils.h"
-#include "regset.h"
 #include "xml-syscall.h"
 
+#include "i387-tdep.h"
+#include "i386-xstate.h"
+
 /* The syscall's XML filename for i386.  */
 #define XML_SYSCALL_FILENAME_I386 "syscalls/i386-linux.xml"
 
@@ -47,6 +50,7 @@
 #include <stdint.h>
 
 #include "features/i386/i386-linux.c"
+#include "features/i386/i386-avx-linux.c"
 
 /* Supported register note sections.  */
 static struct core_regset_section i386_linux_regset_sections[] =
@@ -54,6 +58,7 @@ static struct core_regset_section i386_linux_regset_sections[] =
   { ".reg", 144, "general-purpose" },
   { ".reg2", 108, "floating-point" },
   { ".reg-xfp", 512, "extended floating-point" },
+  { ".reg-xstate", 0, "XSAVE extended state" },
   { NULL, 0 }
 };
 
@@ -533,6 +538,7 @@ static int i386_linux_gregset_reg_offset[] =
   -1, -1, -1, -1, -1, -1, -1, -1,
   -1, -1, -1, -1, -1, -1, -1, -1,
   -1,
+  -1, -1, -1, -1, -1, -1, -1, -1,
   11 * 4			/* "orig_eax" */
 };
 
@@ -560,6 +566,59 @@ static int i386_linux_sc_reg_offset[] =
   0 * 4				/* %gs */
 };
 
+/* Update XSAVE extended state register note section.  */
+
+void
+i386_linux_update_xstateregset (unsigned int xstate_size)
+{
+  struct core_regset_section *xstate = &i386_linux_regset_sections[3];
+
+  /* Update the XSAVE extended state register note section for "gcore".
+     Disable it if its size is 0.  */
+  gdb_assert (strcmp (xstate->sect_name, ".reg-xstate") == 0);
+  if (xstate_size)
+    xstate->size = xstate_size;
+  else
+    xstate->sect_name = NULL;
+}
+
+/* Get XSAVE extended state xcr0 from core dump.  */
+
+uint64_t
+i386_linux_core_read_xcr0 (struct gdbarch *gdbarch,
+			   struct target_ops *target, bfd *abfd)
+{
+  asection *xstate = bfd_get_section_by_name (abfd, ".reg-xstate");
+  uint64_t xcr0;
+
+  if (xstate)
+    {
+      size_t size = bfd_section_size (abfd, xstate);
+
+      /* Check extended state size.  */
+      if (size < I386_XSTATE_AVX_SIZE)
+	xcr0 = I386_XSTATE_SSE_MASK;
+      else
+	{
+	  char contents[8];
+
+	  if (! bfd_get_section_contents (abfd, xstate, contents,
+					  I386_LINUX_XSAVE_XCR0_OFFSET,
+					  8))
+	    {
+	      warning (_("Couldn't read `xcr0' bytes from `.reg-xstate' section in core file."));
+	      return 0;
+	    }
+
+	  xcr0 = bfd_get_64 (abfd, contents);
+	}
+    }
+  else
+    xcr0 = I386_XSTATE_SSE_MASK;
+
+  return xcr0;
+}
+
 /* Get Linux/x86 target description from core dump.  */
 
 static const struct target_desc *
@@ -568,12 +627,17 @@ i386_linux_core_read_description (struct gdbarch *gdbarch,
 				  bfd *abfd)
 {
   asection *section = bfd_get_section_by_name (abfd, ".reg2");
+  uint64_t xcr0;
 
   if (section == NULL)
     return NULL;
 
   /* Linux/i386.  */
-  return tdesc_i386_linux;
+  xcr0 = i386_linux_core_read_xcr0 (gdbarch, target, abfd);
+  if ((xcr0 & I386_XSTATE_AVX_MASK) == I386_XSTATE_AVX_MASK)
+    return tdesc_i386_avx_linux;
+  else
+    return tdesc_i386_linux;
 }
 
 static void
@@ -623,6 +687,8 @@ i386_linux_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch)
   tdep->sc_reg_offset = i386_linux_sc_reg_offset;
   tdep->sc_num_regs = ARRAY_SIZE (i386_linux_sc_reg_offset);
 
+  tdep->xsave_xcr0_offset = I386_LINUX_XSAVE_XCR0_OFFSET;
+
   set_gdbarch_process_record (gdbarch, i386_process_record);
   set_gdbarch_process_record_signal (gdbarch, i386_linux_record_signal);
 
@@ -840,4 +906,5 @@ _initialize_i386_linux_tdep (void)
 
   /* Initialize the Linux target description  */
   initialize_tdesc_i386_linux ();
+  initialize_tdesc_i386_avx_linux ();
 }
diff --git a/gdb/i386-linux-tdep.h b/gdb/i386-linux-tdep.h
index 11f7295..187769b 100644
--- a/gdb/i386-linux-tdep.h
+++ b/gdb/i386-linux-tdep.h
@@ -30,12 +30,41 @@
 /* Register number for the "orig_eax" pseudo-register.  If this
    pseudo-register contains a value >= 0 it is interpreted as the
    system call number that the kernel is supposed to restart.  */
-#define I386_LINUX_ORIG_EAX_REGNUM I386_SSE_NUM_REGS
+#define I386_LINUX_ORIG_EAX_REGNUM I386_AVX_NUM_REGS
 
 /* Total number of registers for GNU/Linux.  */
 #define I386_LINUX_NUM_REGS (I386_LINUX_ORIG_EAX_REGNUM + 1)
 
+/* Get XSAVE extended state xcr0 from core dump.  */
+extern uint64_t i386_linux_core_read_xcr0
+  (struct gdbarch *gdbarch, struct target_ops *target, bfd *abfd);
+
 /* Linux target description.  */
 extern struct target_desc *tdesc_i386_linux;
+extern struct target_desc *tdesc_i386_avx_linux;
+
+/* Update XSAVE extended state register note section.  */
+extern void i386_linux_update_xstateregset (unsigned int xstate_size);
+
+/* Format of XSAVE extended state is:
+ 	struct
+	{
+	  fxsave_bytes[0..463]
+	  sw_usable_bytes[464..511]
+	  xstate_hdr_bytes[512..575]
+	  avx_bytes[576..831]
+	  future_state etc
+	};
+
+  The same memory layout is used for the coredump NT_X86_XSTATE note
+  representing the XSAVE extended state registers.
+
+  The first 8 bytes of sw_usable_bytes, [464..471], hold the OS enabled
+  extended state mask, which is the same as the extended control register
+  0 (the XFEATURE_ENABLED_MASK register), XCR0.  We can use this mask
+  together with the mask saved in the xstate_hdr_bytes to determine what
+  states the processor/OS supports and what state, used or initialized,
+  the process/thread is in.  */
+#define I386_LINUX_XSAVE_XCR0_OFFSET 464
 
 #endif /* i386-linux-tdep.h */
diff --git a/gdb/i386-tdep.c b/gdb/i386-tdep.c
index 83275ac..bc924bf 100644
--- a/gdb/i386-tdep.c
+++ b/gdb/i386-tdep.c
@@ -44,17 +44,20 @@
 #include "value.h"
 #include "dis-asm.h"
 #include "disasm.h"
+#include "remote.h"
 
 #include "gdb_assert.h"
 #include "gdb_string.h"
 
 #include "i386-tdep.h"
 #include "i387-tdep.h"
+#include "i386-xstate.h"
 
 #include "record.h"
 #include <stdint.h>
 
 #include "features/i386/i386.c"
+#include "features/i386/i386-avx.c"
 
 /* Register names.  */
 
@@ -73,6 +76,18 @@ static const char *i386_register_names[] =
   "mxcsr"
 };
 
+static const char *i386_ymm_names[] =
+{
+  "ymm0",  "ymm1",   "ymm2",  "ymm3",
+  "ymm4",  "ymm5",   "ymm6",  "ymm7",
+};
+
+static const char *i386_ymmh_names[] =
+{
+  "ymm0h",  "ymm1h",   "ymm2h",  "ymm3h",
+  "ymm4h",  "ymm5h",   "ymm6h",  "ymm7h",
+};
+
 /* Register names for MMX pseudo-registers.  */
 
 static const char *i386_mmx_names[] =
@@ -149,18 +164,47 @@ i386_dword_regnum_p (struct gdbarch *gdbarch, int regnum)
   return regnum >= 0 && regnum < tdep->num_dword_regs;
 }
 
+int
+i386_ymmh_regnum_p (struct gdbarch *gdbarch, int regnum)
+{
+  struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+  int ymm0h_regnum = tdep->ymm0h_regnum;
+
+  if (ymm0h_regnum < 0)
+    return 0;
+
+  regnum -= ymm0h_regnum;
+  return regnum >= 0 && regnum < tdep->num_ymm_regs;
+}
+
+/* AVX register?  */
+
+int
+i386_ymm_regnum_p (struct gdbarch *gdbarch, int regnum)
+{
+  struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+  int ymm0_regnum = tdep->ymm0_regnum;
+
+  if (ymm0_regnum < 0)
+    return 0;
+
+  regnum -= ymm0_regnum;
+  return regnum >= 0 && regnum < tdep->num_ymm_regs;
+}
+
 /* SSE register?  */
 
-static int
-i386_sse_regnum_p (struct gdbarch *gdbarch, int regnum)
+int
+i386_xmm_regnum_p (struct gdbarch *gdbarch, int regnum)
 {
   struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+  int num_xmm_regs = I387_NUM_XMM_REGS (tdep);
 
-  if (I387_NUM_XMM_REGS (tdep) == 0)
+  if (num_xmm_regs == 0)
     return 0;
 
-  return (I387_XMM0_REGNUM (tdep) <= regnum
-	  && regnum < I387_MXCSR_REGNUM (tdep));
+  regnum -= I387_XMM0_REGNUM (tdep);
+  return regnum >= 0 && regnum < num_xmm_regs;
 }
 
 static int
@@ -200,6 +244,19 @@ i386_fpc_regnum_p (struct gdbarch *gdbarch, int regnum)
 	  && regnum < I387_XMM0_REGNUM (tdep));
 }
 
+/* Return the name of register REGNUM, or the empty string if it is
+   an anonymous register.  */
+
+static const char *
+i386_register_name (struct gdbarch *gdbarch, int regnum)
+{
+  /* Hide the upper YMM registers.  */
+  if (i386_ymmh_regnum_p (gdbarch, regnum))
+    return "";
+
+  return tdesc_register_name (gdbarch, regnum);
+}
+
 /* Return the name of register REGNUM.  */
 
 const char *
@@ -208,6 +265,8 @@ i386_pseudo_register_name (struct gdbarch *gdbarch, int regnum)
   struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
   if (i386_mmx_regnum_p (gdbarch, regnum))
     return i386_mmx_names[regnum - I387_MM0_REGNUM (tdep)];
+  else if (i386_ymm_regnum_p (gdbarch, regnum))
+    return i386_ymm_names[regnum - tdep->ymm0_regnum];
   else if (i386_byte_regnum_p (gdbarch, regnum))
     return i386_byte_names[regnum - tdep->al_regnum];
   else if (i386_word_regnum_p (gdbarch, regnum))
@@ -245,7 +304,13 @@ i386_dbx_reg_to_regnum (struct gdbarch *gdbarch, int reg)
   else if (reg >= 21 && reg <= 28)
     {
       /* SSE registers.  */
-      return reg - 21 + I387_XMM0_REGNUM (tdep);
+      int ymm0_regnum = tdep->ymm0_regnum;
+
+      if (ymm0_regnum >= 0
+	  && i386_xmm_regnum_p (gdbarch, reg))
+	return reg - 21 + ymm0_regnum;
+      else
+	return reg - 21 + I387_XMM0_REGNUM (tdep);
     }
   else if (reg >= 29 && reg <= 36)
     {
@@ -2183,6 +2248,59 @@ i387_ext_type (struct gdbarch *gdbarch)
   return tdep->i387_ext_type;
 }
 
+/* Construct vector type for pseudo YMM registers.  We can't use
+   tdesc_find_type since YMM isn't described in target description.  */
+
+static struct type *
+i386_ymm_type (struct gdbarch *gdbarch)
+{
+  struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+
+  if (!tdep->i386_ymm_type)
+    {
+      const struct builtin_type *bt = builtin_type (gdbarch);
+
+      /* The type we're building is this: */
+#if 0
+      union __gdb_builtin_type_vec256i
+      {
+        int128_t v2_int128[2];
+        int64_t v4_int64[4];
+        int32_t v8_int32[8];
+        int16_t v16_int16[16];
+        int8_t v32_int8[32];
+        double v4_double[4];
+        float v8_float[8];
+      };
+#endif
+
+      struct type *t;
+
+      t = arch_composite_type (gdbarch,
+			       "__gdb_builtin_type_vec256i", TYPE_CODE_UNION);
+      append_composite_type_field (t, "v8_float",
+				   init_vector_type (bt->builtin_float, 8));
+      append_composite_type_field (t, "v4_double",
+				   init_vector_type (bt->builtin_double, 4));
+      append_composite_type_field (t, "v32_int8",
+				   init_vector_type (bt->builtin_int8, 32));
+      append_composite_type_field (t, "v16_int16",
+				   init_vector_type (bt->builtin_int16, 16));
+      append_composite_type_field (t, "v8_int32",
+				   init_vector_type (bt->builtin_int32, 8));
+      append_composite_type_field (t, "v4_int64",
+				   init_vector_type (bt->builtin_int64, 4));
+      append_composite_type_field (t, "v2_int128",
+				   init_vector_type (bt->builtin_int128, 2));
+
+      TYPE_VECTOR (t) = 1;
+      TYPE_NAME (t) = "builtin_type_vec256i";
+      tdep->i386_ymm_type = t;
+    }
+
+  return tdep->i386_ymm_type;
+}
+
 /* Construct vector type for MMX registers.  */
 static struct type *
 i386_mmx_type (struct gdbarch *gdbarch)
@@ -2233,6 +2351,8 @@ i386_pseudo_register_type (struct gdbarch *gdbarch, int regnum)
 {
   if (i386_mmx_regnum_p (gdbarch, regnum))
     return i386_mmx_type (gdbarch);
+  else if (i386_ymm_regnum_p (gdbarch, regnum))
+    return i386_ymm_type (gdbarch);
   else
     {
       const struct builtin_type *bt = builtin_type (gdbarch);
@@ -2284,7 +2404,22 @@ i386_pseudo_register_read (struct gdbarch *gdbarch, struct regcache *regcache,
     {
       struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
 
-      if (i386_word_regnum_p (gdbarch, regnum))
+      if (i386_ymm_regnum_p (gdbarch, regnum))
+	{
+	  regnum -= tdep->ymm0_regnum;
+
+	  /* Extract (always little endian).  Read lower 128bits. */
+	  regcache_raw_read (regcache,
+			     I387_XMM0_REGNUM (tdep) + regnum,
+			     raw_buf);
+	  memcpy (buf, raw_buf, 16);
+	  /* Read upper 128bits.  */
+	  regcache_raw_read (regcache,
+			     tdep->ymm0h_regnum + regnum,
+			     raw_buf);
+	  memcpy (buf + 16, raw_buf, 16);
+	}
+      else if (i386_word_regnum_p (gdbarch, regnum))
 	{
 	  int gpnum = regnum - tdep->ax_regnum;
 
@@ -2333,7 +2468,20 @@ i386_pseudo_register_write (struct gdbarch *gdbarch, struct regcache *regcache,
     {
       struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
 
-      if (i386_word_regnum_p (gdbarch, regnum))
+      if (i386_ymm_regnum_p (gdbarch, regnum))
+	{
+	  regnum -= tdep->ymm0_regnum;
+
+	  /* ... Write lower 128bits.  */
+	  regcache_raw_write (regcache,
+			     I387_XMM0_REGNUM (tdep) + regnum,
+			     buf);
+	  /* ... Write upper 128bits.  */
+	  regcache_raw_write (regcache,
+			     tdep->ymm0h_regnum + regnum,
+			     buf + 16);
+	}
+      else if (i386_word_regnum_p (gdbarch, regnum))
 	{
 	  int gpnum = regnum - tdep->ax_regnum;
 
@@ -2580,6 +2728,28 @@ i386_collect_fpregset (const struct regset *regset,
   i387_collect_fsave (regcache, regnum, fpregs);
 }
 
+/* Similar to i386_supply_fpregset, but use XSAVE extended state.  */
+
+static void
+i386_supply_xstateregset (const struct regset *regset,
+			  struct regcache *regcache, int regnum,
+			  const void *xstateregs, size_t len)
+{
+  const struct gdbarch_tdep *tdep = gdbarch_tdep (regset->arch);
+  i387_supply_xsave (regcache, regnum, xstateregs);
+}
+
+/* Similar to i386_collect_fpregset, but use XSAVE extended state.  */
+
+static void
+i386_collect_xstateregset (const struct regset *regset,
+			   const struct regcache *regcache,
+			   int regnum, void *xstateregs, size_t len)
+{
+  const struct gdbarch_tdep *tdep = gdbarch_tdep (regset->arch);
+  i387_collect_xsave (regcache, regnum, xstateregs, 1);
+}
+
 /* Return the appropriate register set for the core section identified
    by SECT_NAME and SECT_SIZE.  */
 
@@ -2607,6 +2777,16 @@ i386_regset_from_core_section (struct gdbarch *gdbarch,
       return tdep->fpregset;
     }
 
+  if (strcmp (sect_name, ".reg-xstate") == 0)
+    {
+      if (tdep->xstateregset == NULL)
+	tdep->xstateregset = regset_alloc (gdbarch,
+					   i386_supply_xstateregset,
+					   i386_collect_xstateregset);
+
+      return tdep->xstateregset;
+    }
+
   return NULL;
 }
 \f
@@ -2800,46 +2980,60 @@ int
 i386_register_reggroup_p (struct gdbarch *gdbarch, int regnum,
 			  struct reggroup *group)
 {
-  int sse_regnum_p, fp_regnum_p, mmx_regnum_p, byte_regnum_p,
-      word_regnum_p, dword_regnum_p;
+  const struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+  int fp_regnum_p, mmx_regnum_p, xmm_regnum_p, mxcsr_regnum_p,
+      ymm_regnum_p, ymmh_regnum_p;
 
   /* Don't include pseudo registers, except for MMX, in any register
      groups.  */
-  byte_regnum_p = i386_byte_regnum_p (gdbarch, regnum);
-  if (byte_regnum_p)
+  if (i386_byte_regnum_p (gdbarch, regnum))
     return 0;
 
-  word_regnum_p = i386_word_regnum_p (gdbarch, regnum);
-  if (word_regnum_p)
+  if (i386_word_regnum_p (gdbarch, regnum))
     return 0;
 
-  dword_regnum_p = i386_dword_regnum_p (gdbarch, regnum);
-  if (dword_regnum_p)
+  if (i386_dword_regnum_p (gdbarch, regnum))
     return 0;
 
   mmx_regnum_p = i386_mmx_regnum_p (gdbarch, regnum);
   if (group == i386_mmx_reggroup)
     return mmx_regnum_p;
 
-  sse_regnum_p = (i386_sse_regnum_p (gdbarch, regnum)
-		  || i386_mxcsr_regnum_p (gdbarch, regnum));
+  xmm_regnum_p = i386_xmm_regnum_p (gdbarch, regnum);
+  mxcsr_regnum_p = i386_mxcsr_regnum_p (gdbarch, regnum);
   if (group == i386_sse_reggroup)
-    return sse_regnum_p;
+    return xmm_regnum_p || mxcsr_regnum_p;
+
+  ymm_regnum_p = i386_ymm_regnum_p (gdbarch, regnum);
   if (group == vector_reggroup)
-    return mmx_regnum_p || sse_regnum_p;
+    return (mmx_regnum_p
+	    || ymm_regnum_p
+	    || mxcsr_regnum_p
+	    || (xmm_regnum_p
+		&& ((tdep->xcr0 & I386_XSTATE_AVX_MASK)
+		    == I386_XSTATE_SSE_MASK)));
 
   fp_regnum_p = (i386_fp_regnum_p (gdbarch, regnum)
 		 || i386_fpc_regnum_p (gdbarch, regnum));
   if (group == float_reggroup)
     return fp_regnum_p;
 
+  /* For "info reg all", include neither the upper YMM registers nor
+     the XMM registers when AVX is supported.  */
+  ymmh_regnum_p = i386_ymmh_regnum_p (gdbarch, regnum);
+  if (group == all_reggroup
+      && ((xmm_regnum_p
+	   && (tdep->xcr0 & I386_XSTATE_AVX))
+	  || ymmh_regnum_p))
+    return 0;
+
   if (group == general_reggroup)
     return (!fp_regnum_p
 	    && !mmx_regnum_p
-	    && !sse_regnum_p
-	    && !byte_regnum_p
-	    && !word_regnum_p
-	    && !dword_regnum_p);
+	    && !mxcsr_regnum_p
+	    && !xmm_regnum_p
+	    && !ymm_regnum_p
+	    && !ymmh_regnum_p);
 
   return default_register_reggroup_p (gdbarch, regnum, group);
 }
@@ -5652,7 +5846,8 @@ i386_validate_tdesc_p (struct gdbarch_tdep *tdep,
 		       struct tdesc_arch_data *tdesc_data)
 {
   const struct target_desc *tdesc = tdep->tdesc;
-  const struct tdesc_feature *feature_core, *feature_vector;
+  const struct tdesc_feature *feature_core;
+  const struct tdesc_feature *feature_sse, *feature_avx;
   int i, num_regs, valid_p;
 
   if (! tdesc_has_registers (tdesc))
@@ -5662,13 +5857,37 @@ i386_validate_tdesc_p (struct gdbarch_tdep *tdep,
   feature_core = tdesc_find_feature (tdesc, "org.gnu.gdb.i386.core");
 
   /* Get SSE registers.  */
-  feature_vector = tdesc_find_feature (tdesc, "org.gnu.gdb.i386.sse");
+  feature_sse = tdesc_find_feature (tdesc, "org.gnu.gdb.i386.sse");
 
-  if (feature_core == NULL || feature_vector == NULL)
+  if (feature_core == NULL || feature_sse == NULL)
     return 0;
 
+  /* Try AVX registers.  */
+  feature_avx = tdesc_find_feature (tdesc, "org.gnu.gdb.i386.avx");
+
   valid_p = 1;
 
+  /* The XCR0 bits.  */
+  if (feature_avx)
+    {
+      tdep->xcr0 = I386_XSTATE_AVX_MASK;
+
+      /* It may have been set by OSABI initialization function.  */
+      if (tdep->num_ymm_regs == 0)
+	{
+	  tdep->ymmh_register_names = i386_ymmh_names;
+	  tdep->num_ymm_regs = 8;
+	  tdep->ymm0h_regnum = I386_YMM0H_REGNUM;
+	}
+
+      for (i = 0; i < tdep->num_ymm_regs; i++)
+	valid_p &= tdesc_numbered_register (feature_avx, tdesc_data,
+					    tdep->ymm0h_regnum + i,
+					    tdep->ymmh_register_names[i]);
+    }
+  else
+    tdep->xcr0 = I386_XSTATE_SSE_MASK;
+
   num_regs = tdep->num_core_regs;
   for (i = 0; i < num_regs; i++)
     valid_p &= tdesc_numbered_register (feature_core, tdesc_data, i,
@@ -5677,7 +5896,7 @@ i386_validate_tdesc_p (struct gdbarch_tdep *tdep,
   /* Need to include %mxcsr, so add one.  */
   num_regs += tdep->num_xmm_regs + 1;
   for (; i < num_regs; i++)
-    valid_p &= tdesc_numbered_register (feature_vector, tdesc_data, i,
+    valid_p &= tdesc_numbered_register (feature_sse, tdesc_data, i,
 					tdep->register_names[i]);
 
   return valid_p;
@@ -5692,6 +5911,7 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   struct tdesc_arch_data *tdesc_data;
   const struct target_desc *tdesc;
   int mm0_regnum;
+  int ymm0_regnum;
 
   /* If there is already a candidate, use it.  */
   arches = gdbarch_list_lookup_by_info (arches, &info);
@@ -5712,6 +5932,8 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   tdep->fpregset = NULL;
   tdep->sizeof_fpregset = I387_SIZEOF_FSAVE;
 
+  tdep->xstateregset = NULL;
+
   /* The default settings include the FPU registers, the MMX registers
      and the SSE registers.  This can be overridden for a specific ABI
      by adjusting the members `st0_regnum', `mm0_regnum' and
@@ -5741,6 +5963,8 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   tdep->sc_pc_offset = -1;
   tdep->sc_sp_offset = -1;
 
+  tdep->xsave_xcr0_offset = -1;
+
   tdep->record_regmap = i386_record_regmap;
 
   /* The format used for `long double' on almost all i386 targets is
@@ -5857,9 +6081,14 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   set_tdesc_pseudo_register_type (gdbarch, i386_pseudo_register_type);
   set_tdesc_pseudo_register_name (gdbarch, i386_pseudo_register_name);
 
-  /* The default ABI includes general-purpose registers, 
-     floating-point registers, and the SSE registers.  */
-  set_gdbarch_num_regs (gdbarch, I386_SSE_NUM_REGS);
+  /* Override the normal target description method to make the AVX
+     upper halves anonymous.  */
+  set_gdbarch_register_name (gdbarch, i386_register_name);
+
+  /* Even though the default ABI only includes general-purpose registers,
+     floating-point registers and the SSE registers, we have to leave a
+     gap for the upper AVX registers.  */
+  set_gdbarch_num_regs (gdbarch, I386_AVX_NUM_REGS);
 
   /* Get the x86 target description from INFO.  */
   tdesc = info.target_desc;
@@ -5870,10 +6099,15 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   tdep->num_core_regs = I386_NUM_GREGS + I387_NUM_REGS;
   tdep->register_names = i386_register_names;
 
+  /* No upper YMM registers.  */
+  tdep->ymmh_register_names = NULL;
+  tdep->ymm0h_regnum = -1;
+
   tdep->num_byte_regs = 8;
   tdep->num_word_regs = 8;
   tdep->num_dword_regs = 0;
   tdep->num_mmx_regs = 8;
+  tdep->num_ymm_regs = 0;
 
   tdesc_data = tdesc_data_alloc ();
 
@@ -5881,24 +6115,25 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   info.tdep_info = (void *) tdesc_data;
   gdbarch_init_osabi (info, gdbarch);
 
+  if (!i386_validate_tdesc_p (tdep, tdesc_data))
+    {
+      tdesc_data_cleanup (tdesc_data);
+      xfree (tdep);
+      gdbarch_free (gdbarch);
+      return NULL;
+    }
+
   /* Wire in pseudo registers.  Number of pseudo registers may be
      changed.  */
   set_gdbarch_num_pseudo_regs (gdbarch, (tdep->num_byte_regs
 					 + tdep->num_word_regs
 					 + tdep->num_dword_regs
-					 + tdep->num_mmx_regs));
+					 + tdep->num_mmx_regs
+					 + tdep->num_ymm_regs));
 
   /* Target description may be changed.  */
   tdesc = tdep->tdesc;
 
-  if (!i386_validate_tdesc_p (tdep, tdesc_data))
-    {
-      tdesc_data_cleanup (tdesc_data);
-      xfree (tdep);
-      gdbarch_free (gdbarch);
-      return NULL;
-    }
-
   tdesc_use_registers (gdbarch, tdesc, tdesc_data);
 
   /* Override gdbarch_register_reggroup_p set in tdesc_use_registers.  */
@@ -5908,16 +6143,26 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   tdep->al_regnum = gdbarch_num_regs (gdbarch);
   tdep->ax_regnum = tdep->al_regnum + tdep->num_byte_regs;
 
-  mm0_regnum = tdep->ax_regnum + tdep->num_word_regs;
+  ymm0_regnum = tdep->ax_regnum + tdep->num_word_regs;
   if (tdep->num_dword_regs)
     {
       /* Support dword pseudo-registesr if it hasn't been disabled,  */
-      tdep->eax_regnum = mm0_regnum;
-      mm0_regnum = tdep->eax_regnum + tdep->num_dword_regs;
+      tdep->eax_regnum = ymm0_regnum;
+      ymm0_regnum += tdep->num_dword_regs;
     }
   else
     tdep->eax_regnum = -1;
 
+  mm0_regnum = ymm0_regnum;
+  if (tdep->num_ymm_regs)
+    {
+      /* Support YMM pseudo-registers if they are available.  */
+      tdep->ymm0_regnum = ymm0_regnum;
+      mm0_regnum += tdep->num_ymm_regs;
+    }
+  else
+    tdep->ymm0_regnum = -1;
+
   if (tdep->num_mmx_regs != 0)
     {
       /* Support MMX pseudo-registesr if MMX hasn't been disabled,  */
@@ -5943,6 +6188,9 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   set_gdbarch_fast_tracepoint_valid_at (gdbarch,
 					i386_fast_tracepoint_valid_at);
 
+  /* Tell remote stub that we support XML target description.  */
+  register_remote_support_xml ("x86");
+
   return gdbarch;
 }
 
@@ -6000,4 +6248,5 @@ is \"default\"."),
 
   /* Initialize the standard target descriptions.  */
   initialize_tdesc_i386 ();
+  initialize_tdesc_i386_avx ();
 }
diff --git a/gdb/i386-tdep.h b/gdb/i386-tdep.h
index 72c634e..6520d67 100644
--- a/gdb/i386-tdep.h
+++ b/gdb/i386-tdep.h
@@ -109,6 +109,9 @@ struct gdbarch_tdep
   struct regset *fpregset;
   size_t sizeof_fpregset;
 
+  /* XSAVE extended state.  */
+  struct regset *xstateregset;
+
   /* Register number for %st(0).  The register numbers for the other
      registers follow from this one.  Set this to -1 to indicate the
      absence of an FPU.  */
@@ -121,6 +124,13 @@ struct gdbarch_tdep
      of MMX support.  */
   int mm0_regnum;
 
+  /* Number of pseudo YMM registers.  */
+  int num_ymm_regs;
+
+  /* Register number for %ymm0.  Set this to -1 to indicate the absence
+     of pseudo YMM register support.  */
+  int ymm0_regnum;
+
   /* Number of byte registers.  */
   int num_byte_regs;
 
@@ -146,9 +156,24 @@ struct gdbarch_tdep
   /* Number of SSE registers.  */
   int num_xmm_regs;
 
+  /* Bits of the extended control register 0 (the XFEATURE_ENABLED_MASK
+     register), excluding the x87 bit, which are supported by this GDB.
+   */
+  uint64_t xcr0;
+
+  /* Offset of XCR0 in XSAVE extended state.  */
+  int xsave_xcr0_offset;
+
   /* Register names.  */
   const char **register_names;
 
+  /* Register number for %ymm0h.  Set this to -1 to indicate the absence
+     of upper YMM register support.  */
+  int ymm0h_regnum;
+
+  /* Upper YMM register names.  Only used for tdesc_numbered_register.  */
+  const char **ymmh_register_names;
+
   /* Target description.  */
   const struct target_desc *tdesc;
 
@@ -182,6 +207,7 @@ struct gdbarch_tdep
 
   /* ISA-specific data types.  */
   struct type *i386_mmx_type;
+  struct type *i386_ymm_type;
   struct type *i387_ext_type;
 
   /* Process record/replay target.  */
@@ -228,7 +254,9 @@ enum i386_regnum
   I386_FS_REGNUM,		/* %fs */
   I386_GS_REGNUM,		/* %gs */
   I386_ST0_REGNUM,		/* %st(0) */
-  I386_MXCSR_REGNUM = 40	/* %mxcsr */ 
+  I386_MXCSR_REGNUM = 40,	/* %mxcsr */ 
+  I386_YMM0H_REGNUM,		/* %ymm0h */
+  I386_YMM7H_REGNUM = I386_YMM0H_REGNUM + 7
 };
 
 /* Register numbers of RECORD_REGMAP.  */
@@ -265,6 +293,7 @@ enum record_i386_regnum
 #define I386_NUM_XREGS  9
 
 #define I386_SSE_NUM_REGS	(I386_MXCSR_REGNUM + 1)
+#define I386_AVX_NUM_REGS	(I386_YMM7H_REGNUM + 1)
 
 /* Size of the largest register.  */
 #define I386_MAX_REGISTER_SIZE	16
@@ -276,6 +305,9 @@ extern struct type *i387_ext_type (struct gdbarch *gdbarch);
 extern int i386_byte_regnum_p (struct gdbarch *gdbarch, int regnum);
 extern int i386_word_regnum_p (struct gdbarch *gdbarch, int regnum);
 extern int i386_dword_regnum_p (struct gdbarch *gdbarch, int regnum);
+extern int i386_xmm_regnum_p (struct gdbarch *gdbarch, int regnum);
+extern int i386_ymm_regnum_p (struct gdbarch *gdbarch, int regnum);
+extern int i386_ymmh_regnum_p (struct gdbarch *gdbarch, int regnum);
 
 extern const char *i386_pseudo_register_name (struct gdbarch *gdbarch,
 					      int regnum);
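
As a rough illustration of the pseudo YMM read and write paths in the
patch above: a 256-bit $ymmN value is assembled from the raw XMM
register (lower 128 bits) and the raw upper-half register (upper 128
bits), and split the same way on write.  The sketch below is standalone
C, not GDB code; ymm_compose and ymm_split are hypothetical helper names
and the byte buffers stand in for regcache contents.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { XMM_BYTES = 16, YMM_BYTES = 32 };

/* Hypothetical helper: build a 256-bit YMM value from its two raw
   halves.  The lower 16 bytes come from XMM, the upper 16 bytes from
   YMMH, mirroring the order used by the pseudo register read.  */
static void
ymm_compose (uint8_t *ymm, const uint8_t *xmm, const uint8_t *ymmh)
{
  assert (ymm != NULL && xmm != NULL && ymmh != NULL);
  memcpy (ymm, xmm, XMM_BYTES);
  memcpy (ymm + XMM_BYTES, ymmh, XMM_BYTES);
}

/* Hypothetical helper: split a 256-bit YMM value back into the raw
   XMM and YMMH halves, mirroring the pseudo register write.  */
static void
ymm_split (const uint8_t *ymm, uint8_t *xmm, uint8_t *ymmh)
{
  assert (ymm != NULL && xmm != NULL && ymmh != NULL);
  memcpy (xmm, ymm, XMM_BYTES);
  memcpy (ymmh, ymm + XMM_BYTES, XMM_BYTES);
}
```

Since x86 is little endian, byte 0 of the buffer is the least
significant byte of the register, so no byte swapping is needed in
either direction.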

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PATCH: 6/6 [3rd try]: Add AVX support (gdbserver changes)
  2010-03-29  1:09             ` PATCH: 6/6 [3rd " H.J. Lu
@ 2010-03-29 14:08               ` Eli Zaretskii
  2010-03-29 14:42                 ` H.J. Lu
  2010-03-30 16:48               ` H.J. Lu
  1 sibling, 1 reply; 115+ messages in thread
From: Eli Zaretskii @ 2010-03-29 14:08 UTC (permalink / raw)
  To: H.J. Lu; +Cc: gdb-patches

> Date: Sun, 28 Mar 2010 18:09:35 -0700
> From: "H.J. Lu" <hongjiu.lu@intel.com>
> 
> Here are gdbserver changes to support AVX.  OK to install?

Thanks.  There are several files you add whose names will clash on DOS
filesystems.  Could you please provide suitable additions to
fnchange.lst for them?  TIA

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PATCH: 6/6 [3rd try]: Add AVX support (gdbserver changes)
  2010-03-29 14:08               ` Eli Zaretskii
@ 2010-03-29 14:42                 ` H.J. Lu
  2010-03-29 15:11                   ` Eli Zaretskii
  0 siblings, 1 reply; 115+ messages in thread
From: H.J. Lu @ 2010-03-29 14:42 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gdb-patches

On Mon, Mar 29, 2010 at 7:07 AM, Eli Zaretskii <eliz@gnu.org> wrote:
>> Date: Sun, 28 Mar 2010 18:09:35 -0700
>> From: "H.J. Lu" <hongjiu.lu@intel.com>
>>
>> Here are gdbserver changes to support AVX.  OK to install?
>
> Thanks.  There are several files you add whose names will clash on DOS
> filesystems.  Could you please provide suitable additions to
> fnchange.lst for them?  TIA
>

Those files are generated during gdb build. The original ones are covered
in

http://sourceware.org/ml/gdb-patches/2010-03/msg00262.html

Did I miss some files?

Thanks.

-- 
H.J.

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PATCH: 6/6 [3rd try]: Add AVX support (gdbserver changes)
  2010-03-29 14:42                 ` H.J. Lu
@ 2010-03-29 15:11                   ` Eli Zaretskii
  2010-03-29 15:42                     ` H.J. Lu
  0 siblings, 1 reply; 115+ messages in thread
From: Eli Zaretskii @ 2010-03-29 15:11 UTC (permalink / raw)
  To: H.J. Lu; +Cc: gdb-patches

> Date: Mon, 29 Mar 2010 07:42:19 -0700
> From: "H.J. Lu" <hjl.tools@gmail.com>
> Cc: gdb-patches@sourceware.org
> 
> > Thanks.  There are several files you add whose names will clash on DOS
> > filesystems.  Could you please provide suitable additions to
> > fnchange.lst for them?  TIA
> >
> 
> Those files are generated during gdb build. The original ones are covered
> in
> 
> http://sourceware.org/ml/gdb-patches/2010-03/msg00262.html
> 
> Did I miss some files?

I meant this part from your patch:

  2010-03-28  H.J. Lu  <hongjiu.lu@intel.com>

	  * Makefile.in (clean): Updated.
	  (i386-avx.o): New.
	  (i386-avx.c): Likewise.
	  (i386-avx-linux.o): Likewise.
	  (i386-avx-linux.c): Likewise.
	  (amd64-avx.o): Likewise.
	  (amd64-avx.c): Likewise.
	  (amd64-avx-linux.o): Likewise.
	  (amd64-avx-linux.c): Likewise.

I don't think these *.c files are covered by the message in the above
URL.  That message only handles files in the gdb/features/ directory,
but the files above are in gdb/, AFAIU.
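
To illustrate the kind of clash meant here, below is a minimal sketch
that assumes plain 8.3 truncation: the base name is cut to 8 characters,
the extension to 3, and case is ignored.  Real DOS filesystems have
further rules (invalid characters, multiple dots), so treat this as an
approximation; dos_83_name and dos_83_clash are hypothetical names.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: reduce NAME to an upper-case 8.3 form by plain
   truncation.  OUT must have room for 13 bytes (8 + '.' + 3 + NUL).  */
static void
dos_83_name (const char *name, char *out)
{
  const char *dot;
  size_t base_len, n = 0, i;

  assert (name != NULL && out != NULL);
  dot = strrchr (name, '.');
  base_len = dot ? (size_t) (dot - name) : strlen (name);

  /* Keep at most 8 characters of the base name.  */
  for (i = 0; i < base_len && i < 8; i++)
    out[n++] = (char) toupper ((unsigned char) name[i]);

  /* Keep at most 3 characters of the extension, if there is one.  */
  if (dot && dot[1] != '\0')
    {
      out[n++] = '.';
      for (i = 1; dot[i] != '\0' && i <= 3; i++)
        out[n++] = (char) toupper ((unsigned char) dot[i]);
    }
  out[n] = '\0';
}

/* Return nonzero if A and B map to the same 8.3 name.  */
static int
dos_83_clash (const char *a, const char *b)
{
  char a83[13], b83[13];

  dos_83_name (a, a83);
  dos_83_name (b, b83);
  return strcmp (a83, b83) == 0;
}
```

Under this assumption i386-avx.c and i386-avx-linux.c both reduce to
I386-AVX.C, and amd64-avx.c and amd64-avx-linux.c both reduce to
AMD64-AV.C.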

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PATCH: 6/6 [3rd try]: Add AVX support (gdbserver changes)
  2010-03-29 15:11                   ` Eli Zaretskii
@ 2010-03-29 15:42                     ` H.J. Lu
  2010-03-29 15:51                       ` Eli Zaretskii
  0 siblings, 1 reply; 115+ messages in thread
From: H.J. Lu @ 2010-03-29 15:42 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gdb-patches

On Mon, Mar 29, 2010 at 8:11 AM, Eli Zaretskii <eliz@gnu.org> wrote:
>> Date: Mon, 29 Mar 2010 07:42:19 -0700
>> From: "H.J. Lu" <hjl.tools@gmail.com>
>> Cc: gdb-patches@sourceware.org
>>
>> > Thanks.  There are several files you add whose names will clash on DOS
>> > filesystems.  Could you please provide suitable additions to
>> > fnchange.lst for them?  TIA
>> >
>>
>> Those files are generated during gdb build. The original ones are covered
>> in
>>
>> http://sourceware.org/ml/gdb-patches/2010-03/msg00262.html
>>
>> Did I miss some files?
>
> I meant this part from your patch:
>
>  2010-03-28  H.J. Lu  <hongjiu.lu@intel.com>
>
>          * Makefile.in (clean): Updated.
>          (i386-avx.o): New.
>          (i386-avx.c): Likewise.
>          (i386-avx-linux.o): Likewise.
>          (i386-avx-linux.c): Likewise.
>          (amd64-avx.o): Likewise.
>          (amd64-avx.c): Likewise.
>          (amd64-avx-linux.o): Likewise.
>          (amd64-avx-linux.c): Likewise.
>
> I don't think these *.c files are covered by the message in the above
> URL.  That message only handles files in the gdb/features/ directory,
> but the files above are in gdb/, AFAIU.
>
>

Those files aren't in gdb source tree and they are generated during
build in gdb/gdbserver build directory. How should I handle them?


-- 
H.J.

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PATCH: 6/6 [3rd try]: Add AVX support (gdbserver changes)
  2010-03-29 15:42                     ` H.J. Lu
@ 2010-03-29 15:51                       ` Eli Zaretskii
  0 siblings, 0 replies; 115+ messages in thread
From: Eli Zaretskii @ 2010-03-29 15:51 UTC (permalink / raw)
  To: H.J. Lu; +Cc: gdb-patches

> Date: Mon, 29 Mar 2010 08:42:34 -0700
> From: "H.J. Lu" <hjl.tools@gmail.com>
> Cc: gdb-patches@sourceware.org
> 
> >          * Makefile.in (clean): Updated.
> >          (i386-avx.o): New.
> >          (i386-avx.c): Likewise.
> >          (i386-avx-linux.o): Likewise.
> >          (i386-avx-linux.c): Likewise.
> >          (amd64-avx.o): Likewise.
> >          (amd64-avx.c): Likewise.
> >          (amd64-avx-linux.o): Likewise.
> >          (amd64-avx-linux.c): Likewise.
> >
> > I don't think these *.c files are covered by the message in the above
> > URL.  That message only handles files in the gdb/features/ directory,
> > but the files above are in gdb/, AFAIU.
> >
> >
> 
> Those files aren't in gdb source tree and they are generated during
> build in gdb/gdbserver build directory.

If they are generated, you don't need to do anything.  The DJGPP build
(the one which needs fnchange.lst) does not compile gdbserver, so
these files will never be produced, and no harm will ever be done.

Sorry for my misunderstanding.

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PATCH: 3/6 [2nd try]: Add AVX support (i386 changes)
  2010-03-28 14:25             ` H.J. Lu
@ 2010-03-29 20:32               ` Mark Kettenis
  2010-03-29 21:41                 ` H.J. Lu
  0 siblings, 1 reply; 115+ messages in thread
From: Mark Kettenis @ 2010-03-29 20:32 UTC (permalink / raw)
  To: hjl.tools; +Cc: gdb-patches

> Date: Sun, 28 Mar 2010 07:24:53 -0700
> From: "H.J. Lu" <hjl.tools@gmail.com>
> 
> > Anyway, how about setting the size of the .reg-xstate to
> > I386_XSTATE_SSE_SIZE unconditionally?  Tools will look at xcr0 value
> > encoded in there to determine what information in there is valid, so
> > dumping a little bit more than strictly necessary shouldn't be a
> > problem.
> 
> That will make the code more complex since the generic gcore
> implementation will have to adjust section size based on XCR0.
> But if it is what is required, I will make the change.

Sorry, I think you're missing my point here.  The idea is to make
gcore always write out a NT_XSTATE note that has the maximal size
(I386_XSTATE_MAX_SIZE, I now see I typed the wrong thing above).  Then
when you write out the section, you fill the bits that aren't used
with zeroes and make sure the value of xcr0 stored in there is set
correctly.
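
The padding scheme described above can be sketched roughly as follows.
This is only an illustration, not code from the patch series: the helper
name pad_xstate_note is hypothetical, and the two size constants are
assumed values that match the struct i387_xsave layout in the gdbserver
patch (a 576-byte legacy area plus header, followed by 256 bytes of
upper YMM halves).  The xcr0 offset of 464 is the one given in the
original cover letter.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed sizes for illustration; the real constants are defined in
   i386-xstate.h by the patch series.  */
#define XSTATE_SSE_SIZE 576   /* legacy FXSAVE area + XSAVE header */
#define XSTATE_MAX_SIZE 832   /* ... plus the upper halves of YMM */
#define XCR0_OFFSET     464   /* first software-usable bytes */

/* Hypothetical helper: build a note of the maximal size from a
   possibly smaller XSAVE image.  Bytes beyond SRC_SIZE are zero
   filled and XCR0 is recorded so that readers of the note can tell
   which parts are valid.  */
static void
pad_xstate_note (unsigned char *note, const unsigned char *src,
                 size_t src_size, uint64_t xcr0)
{
  assert (src_size <= XSTATE_MAX_SIZE);

  memset (note, 0, XSTATE_MAX_SIZE);
  memcpy (note, src, src_size);
  memcpy (note + XCR0_OFFSET, &xcr0, sizeof xcr0);
}
```

With such a helper the note section size never depends on XCR0; only
the stored xcr0 value does.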

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PATCH: 3/6 [2nd try]: Add AVX support (i386 changes)
  2010-03-29 20:32               ` Mark Kettenis
@ 2010-03-29 21:41                 ` H.J. Lu
  0 siblings, 0 replies; 115+ messages in thread
From: H.J. Lu @ 2010-03-29 21:41 UTC (permalink / raw)
  To: Mark Kettenis; +Cc: gdb-patches

On Mon, Mar 29, 2010 at 1:32 PM, Mark Kettenis <mark.kettenis@xs4all.nl> wrote:
>> Date: Sun, 28 Mar 2010 07:24:53 -0700
>> From: "H.J. Lu" <hjl.tools@gmail.com>
>>
>> > Anyway, how about setting the size of the .reg-xstate to
>> > I386_XSTATE_SSE_SIZE unconditionally?  Tools will look at xcr0 value
>> > encoded in there to determine what information in there is valid, so
>> > dumping a little bit more than strictly necessary shouldn't be a
>> > problem.
>>
>> That will make the code more complex since the generic gcore
>> implementation will have to adjust section size based on XCR0.
>> But if it is what is required, I will make the change.
>
> Sorry, I think you're missing my point here.  The idea is to make
> gcore always write out a NT_XSTATE note that has the maximal size
> (I386_XSTATE_MAX_SIZE, I now see I typed the wrong thing above).  Then
> when you write out the section, you fill the bits that aren't used
> with zeroes and make sure the value of xcr0 stored in there is set
> correctly.
>

i386-linux-tdep.c has

---
/* Update XSAVE extended state register note section.  */

void
i386_linux_update_xstateregset (unsigned int xstate_size)
{
  struct core_regset_section *xstate = &i386_linux_regset_sections[3];

  /* Update the XSAVE extended state register note section for "gcore".
     Disable it if its size is 0.  */
  gdb_assert (strcmp (xstate->sect_name, ".reg-xstate") == 0);
  if (xstate_size)
    xstate->size = xstate_size;
  else
    xstate->sect_name = NULL;
}
---

Even if I set xstate_size to I386_XSTATE_MAX_SIZE, I still need to
call it to set sect_name to NULL when the XSAVE extended state isn't
available. Please let me know if this is what you want.
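
To make the question concrete, here is a minimal sketch of that
variant.  It is not the code in i386-linux-tdep.c: struct regset_section
is a simplified stand-in for struct core_regset_section with only the
two fields used above, update_xstate_section is a hypothetical name,
and the value 832 for the maximal size is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed maximal XSAVE area size, for illustration only.  */
#define XSTATE_MAX_SIZE 832

/* Simplified stand-in for GDB's struct core_regset_section.  */
struct regset_section
{
  const char *sect_name;
  unsigned int size;
};

/* Hypothetical variant of i386_linux_update_xstateregset: the size is
   always the maximum when XSAVE is available, but the section still
   has to be disabled when it is not.  */
static void
update_xstate_section (struct regset_section *xstate, int have_xsave)
{
  if (have_xsave)
    xstate->size = XSTATE_MAX_SIZE;
  else
    xstate->sect_name = NULL;
}
```

So the call itself stays; only the size argument would go away.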

Thanks.

-- 
H.J.

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PATCH: 2/6 [3rd try]: Add AVX support (Update document)
  2010-03-29  0:18     ` PATCH: 2/6 [3rd " H.J. Lu
@ 2010-03-30 16:41       ` H.J. Lu
  2010-03-30 18:27         ` Eli Zaretskii
  0 siblings, 1 reply; 115+ messages in thread
From: H.J. Lu @ 2010-03-30 16:41 UTC (permalink / raw)
  To: GDB

On Sun, Mar 28, 2010 at 05:18:27PM -0700, H.J. Lu wrote:
> Hi,
> 
> This patch updates document for AVX support.  OK to install?
> 
> Thanks.

Here is the updated patch since xmlRegisters= has been checked in.
OK to install?

Thanks.


H.J.
---
2010-03-30  H.J. Lu  <hongjiu.lu@intel.com>

	* gdb.texinfo (i386 Features): Add org.gnu.gdb.i386.avx.

diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
index fab06a8..e60977b 100644
--- a/gdb/doc/gdb.texinfo
+++ b/gdb/doc/gdb.texinfo
@@ -33826,6 +33826,17 @@ describe registers:
 @samp{mxcsr}
 @end itemize
 
+The @samp{org.gnu.gdb.i386.avx} feature is optional.  It should
+describe the upper 128 bits of @sc{ymm} registers:
+
+@itemize @minus
+@item
+@samp{ymm0h} through @samp{ymm7h} for i386
+@item
+@samp{ymm0h} through @samp{ymm15h} for amd64
+@item 
+@end itemize
+
 The @samp{org.gnu.gdb.i386.linux} feature is optional.  It should
 describe a single register, @samp{orig_eax}.
 

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PATCH: 6/6 [3rd try]: Add AVX support (gdbserver changes)
  2010-03-29  1:09             ` PATCH: 6/6 [3rd " H.J. Lu
  2010-03-29 14:08               ` Eli Zaretskii
@ 2010-03-30 16:48               ` H.J. Lu
  2010-04-02 17:39                 ` Daniel Jacobowitz
                                   ` (2 more replies)
  1 sibling, 3 replies; 115+ messages in thread
From: H.J. Lu @ 2010-03-30 16:48 UTC (permalink / raw)
  To: GDB

On Sun, Mar 28, 2010 at 06:09:35PM -0700, H.J. Lu wrote:
> Hi,
> 
> Here are gdbserver changes to support AVX.  OK to install?
> 
> Thanks.
> 

Here is the updated gdbserver change.  I tested it with

# gdbserver --multi host:10000

connecting from

1. gdb without AVX support.
2. gdb with AVX support.
3. gdb without XML support.

to debug 32bit and 64bit binaries.  Everything works correctly.
OK to install?

Thanks.


H.J.
--
2010-03-30  H.J. Lu  <hongjiu.lu@intel.com>

	* Makefile.in (clean): Updated.
	(i386-avx.o): New.
	(i386-avx.c): Likewise.
	(i386-avx-linux.o): Likewise.
	(i386-avx-linux.c): Likewise.
	(amd64-avx.o): Likewise.
	(amd64-avx.c): Likewise.
	(amd64-avx-linux.o): Likewise.
	(amd64-avx-linux.c): Likewise.

	* configure.srv (srv_i386_regobj): Add i386-avx.o.
	(srv_i386_linux_regobj): Add i386-avx-linux.o.
	(srv_amd64_regobj): Add amd64-avx.o.
	(srv_amd64_linux_regobj): Add amd64-avx-linux.o.
	(srv_i386_32bit_xmlfiles): Add i386/32bit-avx.xml.
	(srv_i386_64bit_xmlfiles): Add i386/64bit-avx.xml.
	(srv_i386_xmlfiles): Add i386/i386-avx.xml.
	(srv_amd64_xmlfiles): Add i386/amd64-avx.xml.
	(srv_i386_linux_xmlfiles): Add i386/i386-avx-linux.xml.
	(srv_amd64_linux_xmlfiles): Add i386/amd64-avx-linux.xml.

	* i387-fp.c: Include "i386-xstate.h".
	(i387_xsave): New.
	(i387_cache_to_xsave): Likewise.
	(i387_xsave_to_cache): Likewise.
	(x86_xcr0): Likewise.

	* i387-fp.h (i387_cache_to_xsave): Likewise.
	(i387_xsave_to_cache): Likewise.
	(x86_xcr0): Likewise.

	* linux-arm-low.c (target_regsets): Initialize nt_type to 0.
	* linux-crisv32-low.c (target_regsets): Likewise.
	* linux-m68k-low.c (target_regsets): Likewise.
	* linux-mips-low.c (target_regsets): Likewise.
	* linux-ppc-low.c (target_regsets): Likewise.
	* linux-s390-low.c (target_regsets): Likewise.
	* linux-sh-low.c (target_regsets): Likewise.
	* linux-sparc-low.c (target_regsets): Likewise.
	* linux-xtensa-low.c (target_regsets): Likewise.

	* linux-low.c: Include <sys/uio.h>.
	(regsets_fetch_inferior_registers): Support nt_type.
	(regsets_store_inferior_registers): Likewise.
	(linux_process_qsupported): New.
	(linux_target_ops): Add linux_process_qsupported.

	* linux-low.h (regset_info): Add nt_type.
	(linux_target_ops): Add process_qsupported.

	* linux-x86-low.c: Include "i386-xstate.h", "elf/common.h"
	and <sys/uio.h>.
	(init_registers_i386_avx_linux): New.
	(init_registers_amd64_avx_linux): Likewise.
	(xmltarget_i386_linux_no_xml): Likewise.
	(xmltarget_amd64_linux_no_xml): Likewise.
	(PTRACE_GETREGSET): Likewise.
	(PTRACE_SETREGSET): Likewise.
	(x86_fill_xstateregset): Likewise.
	(x86_store_xstateregset): Likewise.
	(use_xml): Likewise.
	(x86_linux_update_xmltarget): Likewise.
	(x86_linux_process_qsupported): Likewise.
	(target_regsets): Add NT_X86_XSTATE entry and Initialize nt_type.
	(x86_arch_setup): Don't call init_registers_amd64_linux nor
	init_registers_i386_linux here.  Call
	x86_linux_update_xmltarget.
	(the_low_target): Add x86_linux_process_qsupported.

	* server.c (handle_query): Call target_process_qsupported.

	* target.h (target_ops): Add process_qsupported.
	(target_process_qsupported): New.

diff --git a/gdb/gdbserver/Makefile.in b/gdb/gdbserver/Makefile.in
index 7fecced..2ec9784 100644
--- a/gdb/gdbserver/Makefile.in
+++ b/gdb/gdbserver/Makefile.in
@@ -217,6 +217,8 @@ clean:
 	rm -f powerpc-isa205-vsx64l.c
 	rm -f s390-linux32.c s390-linux64.c s390x-linux64.c
 	rm -f xml-builtin.c stamp-xml
+	rm -f i386-avx.c i386-avx-linux.c
+	rm -f amd64-avx.c amd64-avx-linux.c
 
 maintainer-clean realclean distclean: clean
 	rm -f nm.h tm.h xm.h config.status config.h stamp-h config.log
@@ -351,6 +353,12 @@ i386.c : $(srcdir)/../regformats/i386/i386.dat $(regdat_sh)
 i386-linux.o : i386-linux.c $(regdef_h)
 i386-linux.c : $(srcdir)/../regformats/i386/i386-linux.dat $(regdat_sh)
 	$(SHELL) $(regdat_sh) $(srcdir)/../regformats/i386/i386-linux.dat i386-linux.c
+i386-avx.o : i386-avx.c $(regdef_h)
+i386-avx.c : $(srcdir)/../regformats/i386/i386-avx.dat $(regdat_sh)
+	$(SHELL) $(regdat_sh) $(srcdir)/../regformats/i386/i386-avx.dat i386-avx.c
+i386-avx-linux.o : i386-avx-linux.c $(regdef_h)
+i386-avx-linux.c : $(srcdir)/../regformats/i386/i386-avx-linux.dat $(regdat_sh)
+	$(SHELL) $(regdat_sh) $(srcdir)/../regformats/i386/i386-avx-linux.dat i386-avx-linux.c
 reg-ia64.o : reg-ia64.c $(regdef_h)
 reg-ia64.c : $(srcdir)/../regformats/reg-ia64.dat $(regdat_sh)
 	$(SHELL) $(regdat_sh) $(srcdir)/../regformats/reg-ia64.dat reg-ia64.c
@@ -438,6 +446,12 @@ amd64.c : $(srcdir)/../regformats/i386/amd64.dat $(regdat_sh)
 amd64-linux.o : amd64-linux.c $(regdef_h)
 amd64-linux.c : $(srcdir)/../regformats/i386/amd64-linux.dat $(regdat_sh)
 	$(SHELL) $(regdat_sh) $(srcdir)/../regformats/i386/amd64-linux.dat amd64-linux.c
+amd64-avx.o : amd64-avx.c $(regdef_h)
+amd64-avx.c : $(srcdir)/../regformats/i386/amd64-avx.dat $(regdat_sh)
+	$(SHELL) $(regdat_sh) $(srcdir)/../regformats/i386/amd64-avx.dat amd64-avx.c
+amd64-avx-linux.o : amd64-avx-linux.c $(regdef_h)
+amd64-avx-linux.c : $(srcdir)/../regformats/i386/amd64-avx-linux.dat $(regdat_sh)
+	$(SHELL) $(regdat_sh) $(srcdir)/../regformats/i386/amd64-avx-linux.dat amd64-avx-linux.c
 reg-xtensa.o : reg-xtensa.c $(regdef_h)
 reg-xtensa.c : $(srcdir)/../regformats/reg-xtensa.dat $(regdat_sh)
 	$(SHELL) $(regdat_sh) $(srcdir)/../regformats/reg-xtensa.dat reg-xtensa.c
diff --git a/gdb/gdbserver/configure.srv b/gdb/gdbserver/configure.srv
index f7c80bd..8bc9aeb 100644
--- a/gdb/gdbserver/configure.srv
+++ b/gdb/gdbserver/configure.srv
@@ -22,17 +22,17 @@
 # Default hostio_last_error implementation
 srv_hostio_err_objs="hostio-errno.o"
 
-srv_i386_regobj=i386.o
-srv_i386_linux_regobj=i386-linux.o
-srv_amd64_regobj=amd64.o
-srv_amd64_linux_regobj=amd64-linux.o
+srv_i386_regobj="i386.o i386-avx.o"
+srv_i386_linux_regobj="i386-linux.o i386-avx-linux.o"
+srv_amd64_regobj="amd64.o amd64-avx.o"
+srv_amd64_linux_regobj="amd64-linux.o amd64-avx-linux.o"
 
-srv_i386_32bit_xmlfiles="i386/32bit-core.xml i386/32bit-sse.xml"
-srv_i386_64bit_xmlfiles="i386/64bit-core.xml i386/64bit-sse.xml"
-srv_i386_xmlfiles="i386/i386.xml $srv_i386_32bit_xmlfiles"
-srv_amd64_xmlfiles="i386/amd64.xml $srv_i386_64bit_xmlfiles"
-srv_i386_linux_xmlfiles="i386/i386-linux.xml i386/32bit-linux.xml $srv_i386_32bit_xmlfiles"
-srv_amd64_linux_xmlfiles="i386/amd64-linux.xml i386/64bit-linux.xml $srv_i386_64bit_xmlfiles"
+srv_i386_32bit_xmlfiles="i386/32bit-core.xml i386/32bit-sse.xml i386/32bit-avx.xml"
+srv_i386_64bit_xmlfiles="i386/64bit-core.xml i386/64bit-sse.xml i386/64bit-avx.xml"
+srv_i386_xmlfiles="i386/i386.xml i386/i386-avx.xml $srv_i386_32bit_xmlfiles"
+srv_amd64_xmlfiles="i386/amd64.xml i386/amd64-avx.xml $srv_i386_64bit_xmlfiles"
+srv_i386_linux_xmlfiles="i386/i386-linux.xml i386/i386-avx-linux.xml i386/32bit-linux.xml $srv_i386_32bit_xmlfiles"
+srv_amd64_linux_xmlfiles="i386/amd64-linux.xml i386/amd64-avx-linux.xml i386/64bit-linux.xml $srv_i386_64bit_xmlfiles"
 
 # Input is taken from the "${target}" variable.
 
diff --git a/gdb/gdbserver/i387-fp.c b/gdb/gdbserver/i387-fp.c
index 7ef4ba3..5461022 100644
--- a/gdb/gdbserver/i387-fp.c
+++ b/gdb/gdbserver/i387-fp.c
@@ -19,6 +19,7 @@
 
 #include "server.h"
 #include "i387-fp.h"
+#include "i386-xstate.h"
 
 int num_xmm_registers = 8;
 
@@ -72,6 +73,46 @@ struct i387_fxsave {
   unsigned char xmm_space[256];
 };
 
+struct i387_xsave {
+  /* All these are only sixteen bits, plus padding, except for fop (which
+     is only eleven bits), and fooff / fioff (which are 32 bits each).  */
+  unsigned short fctrl;
+  unsigned short fstat;
+  unsigned short ftag;
+  unsigned short fop;
+  unsigned int fioff;
+  unsigned short fiseg;
+  unsigned short pad1;
+  unsigned int fooff;
+  unsigned short foseg;
+  unsigned short pad12;
+
+  unsigned int mxcsr;
+  unsigned int mxcsr_mask;
+
+  /* Space for eight 80-bit FP values in 128-bit spaces.  */
+  unsigned char st_space[128];
+
+  /* Space for eight 128-bit XMM values, or 16 on x86-64.  */
+  unsigned char xmm_space[256];
+
+  unsigned char reserved1[48];
+
+  /* The extended control register 0 (the XFEATURE_ENABLED_MASK
+     register).  */
+  unsigned long long xcr0;
+
+  unsigned char reserved2[40];
+
+  /* The XSTATE_BV bit vector.  */
+  unsigned long long xstate_bv;
+
+  unsigned char reserved3[56];
+
+  /* Space for eight upper 128-bit YMM values, or 16 on x86-64.  */
+  unsigned char ymmh_space[256];
+};
+
 void
 i387_cache_to_fsave (struct regcache *regcache, void *buf)
 {
@@ -199,6 +240,128 @@ i387_cache_to_fxsave (struct regcache *regcache, void *buf)
   fp->foseg = val;
 }
 
+void
+i387_cache_to_xsave (struct regcache *regcache, void *buf)
+{
+  struct i387_xsave *fp = (struct i387_xsave *) buf;
+  int i;
+  unsigned long val, val2;
+  unsigned int clear_bv;
+  unsigned long long xstate_bv = 0;
+  char raw[16];
+  char *p;
+
+  /* The supported bits in `xstate_bv' are 1 byte.  Clear part in
+     vector registers if its bit in xstate_bv is zero.  */
+  clear_bv = (~fp->xstate_bv) & x86_xcr0;
+
+  /* Clear part in x87 and vector registers if its bit in xstate_bv is
+     zero.  */
+  if (clear_bv)
+    {
+      if ((clear_bv & I386_XSTATE_X87))
+	for (i = 0; i < 8; i++)
+	  memset (((char *) &fp->st_space[0]) + i * 16, 0, 10);
+
+      if ((clear_bv & I386_XSTATE_SSE))
+	for (i = 0; i < num_xmm_registers; i++) 
+	  memset (((char *) &fp->xmm_space[0]) + i * 16, 0, 16);
+
+      if ((clear_bv & I386_XSTATE_AVX))
+	for (i = 0; i < num_xmm_registers; i++) 
+	  memset (((char *) &fp->ymmh_space[0]) + i * 16, 0, 16);
+    }
+
+  /* Check if any x87 registers are changed.  */
+  if ((x86_xcr0 & I386_XSTATE_X87))
+    {
+      int st0_regnum = find_regno ("st0");
+
+      for (i = 0; i < 8; i++)
+	{
+	  collect_register (regcache, i + st0_regnum, raw);
+	  p = ((char *) &fp->st_space[0]) + i * 16;
+	  if (memcmp (raw, p, 10))
+	    {
+	      xstate_bv |= I386_XSTATE_X87;
+	      memcpy (p, raw, 10);
+	    }
+	}
+    }
+
+  /* Check if any SSE registers are changed.  */
+  if ((x86_xcr0 & I386_XSTATE_SSE))
+    {
+      int xmm0_regnum = find_regno ("xmm0");
+
+      for (i = 0; i < num_xmm_registers; i++) 
+	{
+	  collect_register (regcache, i + xmm0_regnum, raw);
+	  p = ((char *) &fp->xmm_space[0]) + i * 16;
+	  if (memcmp (raw, p, 16))
+	    {
+	      xstate_bv |= I386_XSTATE_SSE;
+	      memcpy (p, raw, 16);
+	    }
+	}
+    }
+
+  /* Check if any AVX registers are changed.  */
+  if ((x86_xcr0 & I386_XSTATE_AVX))
+    {
+      int ymm0h_regnum = find_regno ("ymm0h");
+
+      for (i = 0; i < num_xmm_registers; i++) 
+	{
+	  collect_register (regcache, i + ymm0h_regnum, raw);
+	  p = ((char *) &fp->ymmh_space[0]) + i * 16;
+	  if (memcmp (raw, p, 16))
+	    {
+	      xstate_bv |= I386_XSTATE_AVX;
+	      memcpy (p, raw, 16);
+	    }
+	}
+    }
+
+  /* Update the corresponding bits in xstate_bv if any SSE/AVX
+     registers are changed.  */
+  fp->xstate_bv |= xstate_bv;
+
+  collect_register_by_name (regcache, "fioff", &fp->fioff);
+  collect_register_by_name (regcache, "fooff", &fp->fooff);
+  collect_register_by_name (regcache, "mxcsr", &fp->mxcsr);
+
+  /* This one's 11 bits... */
+  collect_register_by_name (regcache, "fop", &val2);
+  fp->fop = (val2 & 0x7FF) | (fp->fop & 0xF800);
+
+  /* Some registers are 16-bit.  */
+  collect_register_by_name (regcache, "fctrl", &val);
+  fp->fctrl = val;
+
+  collect_register_by_name (regcache, "fstat", &val);
+  fp->fstat = val;
+
+  /* Convert to the simplified tag form stored in fxsave data.  */
+  collect_register_by_name (regcache, "ftag", &val);
+  val &= 0xFFFF;
+  val2 = 0;
+  for (i = 7; i >= 0; i--)
+    {
+      int tag = (val >> (i * 2)) & 3;
+
+      if (tag != 3)
+	val2 |= (1 << i);
+    }
+  fp->ftag = val2;
+
+  collect_register_by_name (regcache, "fiseg", &val);
+  fp->fiseg = val;
+
+  collect_register_by_name (regcache, "foseg", &val);
+  fp->foseg = val;
+}
+
 static int
 i387_ftag (struct i387_fxsave *fp, int regno)
 {
@@ -296,3 +459,107 @@ i387_fxsave_to_cache (struct regcache *regcache, const void *buf)
   val = (fp->fop) & 0x7FF;
   supply_register_by_name (regcache, "fop", &val);
 }
+
+void
+i387_xsave_to_cache (struct regcache *regcache, const void *buf)
+{
+  struct i387_xsave *fp = (struct i387_xsave *) buf;
+  struct i387_fxsave *fxp = (struct i387_fxsave *) buf;
+  int i, top;
+  unsigned long val;
+  unsigned int clear_bv;
+  char *p;
+
+  /* The supported bits in `xstate_bv' are 1 byte.  Clear part in
+     vector registers if its bit in xstate_bv is zero.  */
+  clear_bv = (~fp->xstate_bv) & x86_xcr0;
+
+  /* Check if any x87 registers are changed.  */
+  if ((x86_xcr0 & I386_XSTATE_X87))
+    {
+      int st0_regnum = find_regno ("st0");
+
+      if ((clear_bv & I386_XSTATE_X87))
+	p = NULL;
+      else
+	p = (char *) buf;
+
+      for (i = 0; i < 8; i++)
+	{
+	  if (p)
+	    p = ((char *) &fp->st_space[0]) + i * 16;
+	  supply_register (regcache, i + st0_regnum, p);
+	}
+    }
+
+  if ((x86_xcr0 & I386_XSTATE_SSE))
+    {
+      int xmm0_regnum = find_regno ("xmm0");
+
+      if ((clear_bv & I386_XSTATE_SSE))
+	p = NULL;
+      else
+	p = (char *) buf;
+
+      for (i = 0; i < num_xmm_registers; i++)
+	{
+	  if (p)
+	    p = ((char *) &fp->xmm_space[0]) + i * 16;
+	  supply_register (regcache, i + xmm0_regnum, p);
+	}
+    }
+
+  if ((x86_xcr0 & I386_XSTATE_AVX))
+    {
+      int ymm0h_regnum = find_regno ("ymm0h");
+
+      if ((clear_bv & I386_XSTATE_AVX))
+	p = NULL;
+      else
+	p = (char *) buf;
+
+      for (i = 0; i < num_xmm_registers; i++)
+	{
+	  if (p)
+	    p = ((char *) &fp->ymmh_space[0]) + i * 16;
+	  supply_register (regcache, i + ymm0h_regnum, p);
+	}
+    }
+
+  supply_register_by_name (regcache, "fioff", &fp->fioff);
+  supply_register_by_name (regcache, "fooff", &fp->fooff);
+  supply_register_by_name (regcache, "mxcsr", &fp->mxcsr);
+
+  /* Some registers are 16-bit.  */
+  val = fp->fctrl & 0xFFFF;
+  supply_register_by_name (regcache, "fctrl", &val);
+
+  val = fp->fstat & 0xFFFF;
+  supply_register_by_name (regcache, "fstat", &val);
+
+  /* Generate the form of ftag data that GDB expects.  */
+  top = (fp->fstat >> 11) & 0x7;
+  val = 0;
+  for (i = 7; i >= 0; i--)
+    {
+      int tag;
+      if (fp->ftag & (1 << i))
+	tag = i387_ftag (fxp, (i + 8 - top) % 8);
+      else
+	tag = 3;
+      val |= tag << (2 * i);
+    }
+  supply_register_by_name (regcache, "ftag", &val);
+
+  val = fp->fiseg & 0xFFFF;
+  supply_register_by_name (regcache, "fiseg", &val);
+
+  val = fp->foseg & 0xFFFF;
+  supply_register_by_name (regcache, "foseg", &val);
+
+  val = (fp->fop) & 0x7FF;
+  supply_register_by_name (regcache, "fop", &val);
+}
+
+/* Default to SSE.  */
+unsigned long long x86_xcr0 = I386_XSTATE_SSE_MASK;
diff --git a/gdb/gdbserver/i387-fp.h b/gdb/gdbserver/i387-fp.h
index d1e0681..ed1a322 100644
--- a/gdb/gdbserver/i387-fp.h
+++ b/gdb/gdbserver/i387-fp.h
@@ -26,6 +26,11 @@ void i387_fsave_to_cache (struct regcache *regcache, const void *buf);
 void i387_cache_to_fxsave (struct regcache *regcache, void *buf);
 void i387_fxsave_to_cache (struct regcache *regcache, const void *buf);
 
+void i387_cache_to_xsave (struct regcache *regcache, void *buf);
+void i387_xsave_to_cache (struct regcache *regcache, const void *buf);
+
+extern unsigned long long x86_xcr0;
+
 extern int num_xmm_registers;
 
 #endif /* I387_FP_H */
diff --git a/gdb/gdbserver/linux-arm-low.c b/gdb/gdbserver/linux-arm-low.c
index 54668f8..32bd7bb 100644
--- a/gdb/gdbserver/linux-arm-low.c
+++ b/gdb/gdbserver/linux-arm-low.c
@@ -354,16 +354,16 @@ arm_arch_setup (void)
 }
 
 struct regset_info target_regsets[] = {
-  { PTRACE_GETREGS, PTRACE_SETREGS, 18 * 4,
+  { PTRACE_GETREGS, PTRACE_SETREGS, 0, 18 * 4,
     GENERAL_REGS,
     arm_fill_gregset, arm_store_gregset },
-  { PTRACE_GETWMMXREGS, PTRACE_SETWMMXREGS, 16 * 8 + 6 * 4,
+  { PTRACE_GETWMMXREGS, PTRACE_SETWMMXREGS, 0, 16 * 8 + 6 * 4,
     EXTENDED_REGS,
     arm_fill_wmmxregset, arm_store_wmmxregset },
-  { PTRACE_GETVFPREGS, PTRACE_SETVFPREGS, 32 * 8 + 4,
+  { PTRACE_GETVFPREGS, PTRACE_SETVFPREGS, 0, 32 * 8 + 4,
     EXTENDED_REGS,
     arm_fill_vfpregset, arm_store_vfpregset },
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 struct linux_target_ops the_low_target = {
diff --git a/gdb/gdbserver/linux-crisv32-low.c b/gdb/gdbserver/linux-crisv32-low.c
index 6ba48b6..d426c32 100644
--- a/gdb/gdbserver/linux-crisv32-low.c
+++ b/gdb/gdbserver/linux-crisv32-low.c
@@ -365,9 +365,9 @@ cris_store_gregset (const void *buf)
 typedef unsigned long elf_gregset_t[cris_num_regs];
 
 struct regset_info target_regsets[] = {
-  { PTRACE_GETREGS, PTRACE_SETREGS, sizeof (elf_gregset_t),
+  { PTRACE_GETREGS, PTRACE_SETREGS, 0, sizeof (elf_gregset_t),
     GENERAL_REGS, cris_fill_gregset, cris_store_gregset },
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 struct linux_target_ops the_low_target = {
diff --git a/gdb/gdbserver/linux-low.c b/gdb/gdbserver/linux-low.c
index ad68179..f5d4c41 100644
--- a/gdb/gdbserver/linux-low.c
+++ b/gdb/gdbserver/linux-low.c
@@ -39,6 +39,7 @@
 #include <dirent.h>
 #include <sys/stat.h>
 #include <sys/vfs.h>
+#include <sys/uio.h>
 #ifndef ELFMAG0
 /* Don't include <linux/elf.h> here.  If it got included by gdb_proc_service.h
    then ELFMAG0 will have been defined.  If it didn't get included by
@@ -2957,14 +2958,15 @@ regsets_fetch_inferior_registers (struct regcache *regcache)
   struct regset_info *regset;
   int saw_general_regs = 0;
   int pid;
+  struct iovec iov;
 
   regset = target_regsets;
 
   pid = lwpid_of (get_thread_lwp (current_inferior));
   while (regset->size >= 0)
     {
-      void *buf;
-      int res;
+      void *buf, *data;
+      int nt_type, res;
 
       if (regset->size == 0 || disabled_regsets[regset - target_regsets])
 	{
@@ -2973,10 +2975,21 @@ regsets_fetch_inferior_registers (struct regcache *regcache)
 	}
 
       buf = xmalloc (regset->size);
+
+      nt_type = regset->nt_type;
+      if (nt_type)
+	{
+	  iov.iov_base = buf;
+	  iov.iov_len = regset->size;
+	  data = (void *) &iov;
+	}
+      else
+	data = buf;
+
 #ifndef __sparc__
-      res = ptrace (regset->get_request, pid, 0, buf);
+      res = ptrace (regset->get_request, pid, nt_type, data);
 #else
-      res = ptrace (regset->get_request, pid, buf, 0);
+      res = ptrace (regset->get_request, pid, data, nt_type);
 #endif
       if (res < 0)
 	{
@@ -3014,14 +3027,15 @@ regsets_store_inferior_registers (struct regcache *regcache)
   struct regset_info *regset;
   int saw_general_regs = 0;
   int pid;
+  struct iovec iov;
 
   regset = target_regsets;
 
   pid = lwpid_of (get_thread_lwp (current_inferior));
   while (regset->size >= 0)
     {
-      void *buf;
-      int res;
+      void *buf, *data;
+      int nt_type, res;
 
       if (regset->size == 0 || disabled_regsets[regset - target_regsets])
 	{
@@ -3034,10 +3048,21 @@ regsets_store_inferior_registers (struct regcache *regcache)
       /* First fill the buffer with the current register set contents,
 	 in case there are any items in the kernel's regset that are
 	 not in gdbserver's regcache.  */
+
+      nt_type = regset->nt_type;
+      if (nt_type)
+	{
+	  iov.iov_base = buf;
+	  iov.iov_len = regset->size;
+	  data = (void *) &iov;
+	}
+      else
+	data = buf;
+
 #ifndef __sparc__
-      res = ptrace (regset->get_request, pid, 0, buf);
+      res = ptrace (regset->get_request, pid, nt_type, data);
 #else
-      res = ptrace (regset->get_request, pid, buf, 0);
+      res = ptrace (regset->get_request, pid, data, nt_type);
 #endif
 
       if (res == 0)
@@ -3047,9 +3072,9 @@ regsets_store_inferior_registers (struct regcache *regcache)
 
 	  /* Only now do we write the register set.  */
 #ifndef __sparc__
-	  res = ptrace (regset->set_request, pid, 0, buf);
+	  res = ptrace (regset->set_request, pid, nt_type, data);
 #else
-	  res = ptrace (regset->set_request, pid, buf, 0);
+	  res = ptrace (regset->set_request, pid, data, nt_type);
 #endif
 	}
 
@@ -4113,6 +4138,13 @@ linux_core_of_thread (ptid_t ptid)
   return core;
 }
 
+static void
+linux_process_qsupported (const char *query)
+{
+  if (the_low_target.process_qsupported != NULL)
+    the_low_target.process_qsupported (query);
+}
+
 static struct target_ops linux_target_ops = {
   linux_create_inferior,
   linux_attach,
@@ -4156,7 +4188,8 @@ static struct target_ops linux_target_ops = {
 #else
   NULL,
 #endif
-  linux_core_of_thread
+  linux_core_of_thread,
+  linux_process_qsupported
 };
 
 static void
diff --git a/gdb/gdbserver/linux-low.h b/gdb/gdbserver/linux-low.h
index d7aa418..52623bf 100644
--- a/gdb/gdbserver/linux-low.h
+++ b/gdb/gdbserver/linux-low.h
@@ -35,6 +35,9 @@ enum regset_type {
 struct regset_info
 {
   int get_request, set_request;
+  /* If NT_TYPE isn't 0, it will be passed to ptrace as the 3rd
+     argument and the 4th argument should be "const struct iovec *".  */
+  int nt_type;
   int size;
   enum regset_type type;
   regset_fill_func fill_function;
@@ -111,6 +114,9 @@ struct linux_target_ops
 
   /* Hook to call prior to resuming a thread.  */
   void (*prepare_to_resume) (struct lwp_info *);
+
+  /* Hook to support target specific qSupported.  */
+  void (*process_qsupported) (const char *);
 };
 
 extern struct linux_target_ops the_low_target;
diff --git a/gdb/gdbserver/linux-m68k-low.c b/gdb/gdbserver/linux-m68k-low.c
index 14e3864..6c98bb1 100644
--- a/gdb/gdbserver/linux-m68k-low.c
+++ b/gdb/gdbserver/linux-m68k-low.c
@@ -112,14 +112,14 @@ m68k_store_fpregset (struct regcache *regcache, const void *buf)
 
 struct regset_info target_regsets[] = {
 #ifdef HAVE_PTRACE_GETREGS
-  { PTRACE_GETREGS, PTRACE_SETREGS, sizeof (elf_gregset_t),
+  { PTRACE_GETREGS, PTRACE_SETREGS, 0, sizeof (elf_gregset_t),
     GENERAL_REGS,
     m68k_fill_gregset, m68k_store_gregset },
-  { PTRACE_GETFPREGS, PTRACE_SETFPREGS, sizeof (elf_fpregset_t),
+  { PTRACE_GETFPREGS, PTRACE_SETFPREGS, 0, sizeof (elf_fpregset_t),
     FP_REGS,
     m68k_fill_fpregset, m68k_store_fpregset },
 #endif /* HAVE_PTRACE_GETREGS */
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 static const unsigned char m68k_breakpoint[] = { 0x4E, 0x4F };
diff --git a/gdb/gdbserver/linux-mips-low.c b/gdb/gdbserver/linux-mips-low.c
index 70f6700..1c04b2e 100644
--- a/gdb/gdbserver/linux-mips-low.c
+++ b/gdb/gdbserver/linux-mips-low.c
@@ -343,12 +343,12 @@ mips_store_fpregset (struct regcache *regcache, const void *buf)
 
 struct regset_info target_regsets[] = {
 #ifdef HAVE_PTRACE_GETREGS
-  { PTRACE_GETREGS, PTRACE_SETREGS, 38 * 8, GENERAL_REGS,
+  { PTRACE_GETREGS, PTRACE_SETREGS, 0, 38 * 8, GENERAL_REGS,
     mips_fill_gregset, mips_store_gregset },
-  { PTRACE_GETFPREGS, PTRACE_SETFPREGS, 33 * 8, FP_REGS,
+  { PTRACE_GETFPREGS, PTRACE_SETFPREGS, 0, 33 * 8, FP_REGS,
     mips_fill_fpregset, mips_store_fpregset },
 #endif /* HAVE_PTRACE_GETREGS */
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 struct linux_target_ops the_low_target = {
diff --git a/gdb/gdbserver/linux-ppc-low.c b/gdb/gdbserver/linux-ppc-low.c
index 10a1309..0dab604 100644
--- a/gdb/gdbserver/linux-ppc-low.c
+++ b/gdb/gdbserver/linux-ppc-low.c
@@ -593,14 +593,14 @@ struct regset_info target_regsets[] = {
      fetch them every time, but still fall back to PTRACE_PEEKUSER for the
      general registers.  Some kernels support these, but not the newer
      PPC_PTRACE_GETREGS.  */
-  { PTRACE_GETVSXREGS, PTRACE_SETVSXREGS, SIZEOF_VSXREGS, EXTENDED_REGS,
+  { PTRACE_GETVSXREGS, PTRACE_SETVSXREGS, 0, SIZEOF_VSXREGS, EXTENDED_REGS,
   ppc_fill_vsxregset, ppc_store_vsxregset },
-  { PTRACE_GETVRREGS, PTRACE_SETVRREGS, SIZEOF_VRREGS, EXTENDED_REGS,
+  { PTRACE_GETVRREGS, PTRACE_SETVRREGS, 0, SIZEOF_VRREGS, EXTENDED_REGS,
     ppc_fill_vrregset, ppc_store_vrregset },
-  { PTRACE_GETEVRREGS, PTRACE_SETEVRREGS, 32 * 4 + 8 + 4, EXTENDED_REGS,
+  { PTRACE_GETEVRREGS, PTRACE_SETEVRREGS, 0, 32 * 4 + 8 + 4, EXTENDED_REGS,
     ppc_fill_evrregset, ppc_store_evrregset },
-  { 0, 0, 0, GENERAL_REGS, ppc_fill_gregset, NULL },
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, 0, GENERAL_REGS, ppc_fill_gregset, NULL },
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 struct linux_target_ops the_low_target = {
diff --git a/gdb/gdbserver/linux-s390-low.c b/gdb/gdbserver/linux-s390-low.c
index 5460f57..eb865dc 100644
--- a/gdb/gdbserver/linux-s390-low.c
+++ b/gdb/gdbserver/linux-s390-low.c
@@ -181,8 +181,8 @@ static void s390_fill_gregset (struct regcache *regcache, void *buf)
 }
 
 struct regset_info target_regsets[] = {
-  { 0, 0, 0, GENERAL_REGS, s390_fill_gregset, NULL },
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, 0, GENERAL_REGS, s390_fill_gregset, NULL },
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 
diff --git a/gdb/gdbserver/linux-sh-low.c b/gdb/gdbserver/linux-sh-low.c
index 9d27e7f..87a0dd2 100644
--- a/gdb/gdbserver/linux-sh-low.c
+++ b/gdb/gdbserver/linux-sh-low.c
@@ -104,8 +104,8 @@ static void sh_fill_gregset (struct regcache *regcache, void *buf)
 }
 
 struct regset_info target_regsets[] = {
-  { 0, 0, 0, GENERAL_REGS, sh_fill_gregset, NULL },
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, 0, GENERAL_REGS, sh_fill_gregset, NULL },
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 struct linux_target_ops the_low_target = {
diff --git a/gdb/gdbserver/linux-sparc-low.c b/gdb/gdbserver/linux-sparc-low.c
index 0bb5f2f..e0bfe81 100644
--- a/gdb/gdbserver/linux-sparc-low.c
+++ b/gdb/gdbserver/linux-sparc-low.c
@@ -260,13 +260,13 @@ sparc_reinsert_addr (void)
 
 
 struct regset_info target_regsets[] = {
-  { PTRACE_GETREGS, PTRACE_SETREGS, sizeof (elf_gregset_t),
+  { PTRACE_GETREGS, PTRACE_SETREGS, 0, sizeof (elf_gregset_t),
     GENERAL_REGS,
     sparc_fill_gregset, sparc_store_gregset },
-  { PTRACE_GETFPREGS, PTRACE_SETFPREGS, sizeof (fpregset_t),
+  { PTRACE_GETFPREGS, PTRACE_SETFPREGS, 0, sizeof (fpregset_t),
     FP_REGS,
     sparc_fill_fpregset, sparc_store_fpregset },
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 struct linux_target_ops the_low_target = {
diff --git a/gdb/gdbserver/linux-x86-low.c b/gdb/gdbserver/linux-x86-low.c
index fe5d46e..5d75365 100644
--- a/gdb/gdbserver/linux-x86-low.c
+++ b/gdb/gdbserver/linux-x86-low.c
@@ -24,6 +24,8 @@
 #include "linux-low.h"
 #include "i387-fp.h"
 #include "i386-low.h"
+#include "i386-xstate.h"
+#include "elf/common.h"
 
 #include "gdb_proc_service.h"
 
@@ -31,10 +33,35 @@
 void init_registers_i386_linux (void);
 /* Defined in auto-generated file amd64-linux.c.  */
 void init_registers_amd64_linux (void);
+/* Defined in auto-generated file i386-avx-linux.c.  */
+void init_registers_i386_avx_linux (void);
+/* Defined in auto-generated file amd64-avx-linux.c.  */
+void init_registers_amd64_avx_linux (void);
+
+/* Backward compatibility for gdb without XML support.  */
+
+static const char *xmltarget_i386_linux_no_xml = "@<target>\
+<architecture>i386</architecture>\
+<osabi>GNU/Linux</osabi>\
+</target>";
+static const char *xmltarget_amd64_linux_no_xml = "@<target>\
+<architecture>i386:x86-64</architecture>\
+<osabi>GNU/Linux</osabi>\
+</target>";
 
 #include <sys/reg.h>
 #include <sys/procfs.h>
 #include <sys/ptrace.h>
+#include <sys/uio.h>
+
+#ifndef PTRACE_GETREGSET
+#define PTRACE_GETREGSET	0x4204
+#endif
+
+#ifndef PTRACE_SETREGSET
+#define PTRACE_SETREGSET	0x4205
+#endif
+
 
 #ifndef PTRACE_GET_THREAD_AREA
 #define PTRACE_GET_THREAD_AREA 25
@@ -252,6 +279,18 @@ x86_store_fpxregset (struct regcache *regcache, const void *buf)
 
 #endif
 
+static void
+x86_fill_xstateregset (struct regcache *regcache, void *buf)
+{
+  i387_cache_to_xsave (regcache, buf);
+}
+
+static void
+x86_store_xstateregset (struct regcache *regcache, const void *buf)
+{
+  i387_xsave_to_cache (regcache, buf);
+}
+
 /* ??? The non-biarch i386 case stores all the i387 regs twice.
    Once in i387_.*fsave.* and once in i387_.*fxsave.*.
    This is, presumably, to handle the case where PTRACE_[GS]ETFPXREGS
@@ -264,21 +303,23 @@ x86_store_fpxregset (struct regcache *regcache, const void *buf)
 struct regset_info target_regsets[] =
 {
 #ifdef HAVE_PTRACE_GETREGS
-  { PTRACE_GETREGS, PTRACE_SETREGS, sizeof (elf_gregset_t),
+  { PTRACE_GETREGS, PTRACE_SETREGS, 0, sizeof (elf_gregset_t),
     GENERAL_REGS,
     x86_fill_gregset, x86_store_gregset },
+  { PTRACE_GETREGSET, PTRACE_SETREGSET, NT_X86_XSTATE, 0,
+    EXTENDED_REGS, x86_fill_xstateregset, x86_store_xstateregset },
 # ifndef __x86_64__
 #  ifdef HAVE_PTRACE_GETFPXREGS
-  { PTRACE_GETFPXREGS, PTRACE_SETFPXREGS, sizeof (elf_fpxregset_t),
+  { PTRACE_GETFPXREGS, PTRACE_SETFPXREGS, 0, sizeof (elf_fpxregset_t),
     EXTENDED_REGS,
     x86_fill_fpxregset, x86_store_fpxregset },
 #  endif
 # endif
-  { PTRACE_GETFPREGS, PTRACE_SETFPREGS, sizeof (elf_fpregset_t),
+  { PTRACE_GETFPREGS, PTRACE_SETFPREGS, 0, sizeof (elf_fpregset_t),
     FP_REGS,
     x86_fill_fpregset, x86_store_fpregset },
 #endif /* HAVE_PTRACE_GETREGS */
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 static CORE_ADDR
@@ -776,6 +817,121 @@ x86_siginfo_fixup (struct siginfo *native, void *inf, int direction)
   return 0;
 }
 \f
+static int use_xml;
+
+/* Update gdbserver_xmltarget.  */
+
+static void
+x86_linux_update_xmltarget (void)
+{
+  static unsigned long long xcr0;
+  static int have_ptrace_getregset = -1;
+
+  if (!current_inferior)
+    return;
+
+#ifdef __x86_64__
+  if (num_xmm_registers == 8)
+    init_registers_i386_linux ();
+  else
+    init_registers_amd64_linux ();
+#else
+  init_registers_i386_linux ();
+#endif
+
+  if (!use_xml)
+    {
+      /* Don't use XML.  */
+#ifdef __x86_64__
+      if (num_xmm_registers == 8)
+	gdbserver_xmltarget = xmltarget_i386_linux_no_xml;
+      else
+	gdbserver_xmltarget = xmltarget_amd64_linux_no_xml;
+#else
+      gdbserver_xmltarget = xmltarget_i386_linux_no_xml;
+#endif
+
+      x86_xcr0 = I386_XSTATE_SSE_MASK;
+
+      return;
+    }
+
+  /* Update gdbserver_xmltarget with XML support.  */
+#ifdef __x86_64__
+  if (num_xmm_registers == 8)
+    gdbserver_xmltarget = "i386-linux.xml";
+  else
+    gdbserver_xmltarget = "amd64-linux.xml";
+#else
+  gdbserver_xmltarget = "i386-linux.xml";
+#endif
+
+  /* Check if XSAVE extended state is supported.  */
+  if (have_ptrace_getregset == -1)
+    {
+      int pid = pid_of (get_thread_lwp (current_inferior));
+      unsigned long long xstateregs[I386_XSTATE_SSE_SIZE / sizeof (long long)];
+      struct iovec iov;
+      struct regset_info *regset;
+
+      iov.iov_base = xstateregs;
+      iov.iov_len = sizeof (xstateregs);
+
+      /* Check if PTRACE_GETREGSET works.  */
+      if (ptrace (PTRACE_GETREGSET, pid, (unsigned int) NT_X86_XSTATE,
+		  &iov) < 0)
+	{
+	  have_ptrace_getregset = 0;
+	  return;
+	}
+      else
+	have_ptrace_getregset = 1;
+
+      /* Get XCR0 from XSAVE extended state at byte 464.  */
+      xcr0 = xstateregs[464 / sizeof (long long)];
+
+      /* Use PTRACE_GETREGSET if it is available.  */
+      for (regset = target_regsets;
+	   regset->fill_function != NULL; regset++)
+	if (regset->get_request == PTRACE_GETREGSET)
+	  regset->size = I386_XSTATE_SIZE (xcr0);
+	else if (regset->type != GENERAL_REGS)
+	  regset->size = 0;
+    }
+
+  if (have_ptrace_getregset)
+    {
+      /* AVX is the highest feature we support.  */
+      if ((xcr0 & I386_XSTATE_AVX_MASK) == I386_XSTATE_AVX_MASK)
+	{
+	  x86_xcr0 = xcr0;
+
+#ifdef __x86_64__
+	  /* I386 has 8 xmm regs.  */
+	  if (num_xmm_registers == 8)
+	    init_registers_i386_avx_linux ();
+	  else
+	    init_registers_amd64_avx_linux ();
+#else
+	  init_registers_i386_avx_linux ();
+#endif
+	}
+    }
+}
+
+/* Process qSupported query, "xmlRegisters=".  Update the buffer size for
+   PTRACE_GETREGSET.  */
+
+static void
+x86_linux_process_qsupported (const char *query)
+{
+  /* Return if gdb doesn't support XML.  If gdb sends "xmlRegisters="
+     in qSupported query, it supports x86 XML target descriptions.  */
+  use_xml = query != NULL && strncmp (query, "xmlRegisters=", 13) == 0;
+
+  x86_linux_update_xmltarget ();
+}
+
 /* Initialize gdbserver for the architecture of the inferior.  */
 
 static void
@@ -796,8 +952,6 @@ x86_arch_setup (void)
     }
   else if (use_64bit)
     {
-      init_registers_amd64_linux ();
-
       /* Amd64 doesn't have HAVE_LINUX_USRREGS.  */
       the_low_target.num_regs = -1;
       the_low_target.regmap = NULL;
@@ -807,14 +961,13 @@ x86_arch_setup (void)
       /* Amd64 has 16 xmm regs.  */
       num_xmm_registers = 16;
 
+      x86_linux_update_xmltarget ();
       return;
     }
 #endif
 
   /* Ok we have a 32-bit inferior.  */
 
-  init_registers_i386_linux ();
-
   the_low_target.num_regs = I386_NUM_REGS;
   the_low_target.regmap = i386_regmap;
   the_low_target.cannot_fetch_register = i386_cannot_fetch_register;
@@ -822,6 +975,8 @@ x86_arch_setup (void)
 
   /* I386 has 8 xmm regs.  */
   num_xmm_registers = 8;
+
+  x86_linux_update_xmltarget ();
 }
 
 /* This is initialized assuming an amd64 target.
@@ -854,5 +1009,6 @@ struct linux_target_ops the_low_target =
   x86_siginfo_fixup,
   x86_linux_new_process,
   x86_linux_new_thread,
-  x86_linux_prepare_to_resume
+  x86_linux_prepare_to_resume,
+  x86_linux_process_qsupported
 };
diff --git a/gdb/gdbserver/linux-xtensa-low.c b/gdb/gdbserver/linux-xtensa-low.c
index c5ed351..8d0e73a 100644
--- a/gdb/gdbserver/linux-xtensa-low.c
+++ b/gdb/gdbserver/linux-xtensa-low.c
@@ -131,13 +131,13 @@ xtensa_store_xtregset (struct regcache *regcache, const void *buf)
 }
 
 struct regset_info target_regsets[] = {
-  { PTRACE_GETREGS, PTRACE_SETREGS, sizeof (elf_gregset_t),
+  { PTRACE_GETREGS, PTRACE_SETREGS, 0, sizeof (elf_gregset_t),
     GENERAL_REGS,
     xtensa_fill_gregset, xtensa_store_gregset },
-  { PTRACE_GETXTREGS, PTRACE_SETXTREGS, XTENSA_ELF_XTREG_SIZE,
+  { PTRACE_GETXTREGS, PTRACE_SETXTREGS, 0, XTENSA_ELF_XTREG_SIZE,
     EXTENDED_REGS,
     xtensa_fill_xtregset, xtensa_store_xtregset },
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 #if XCHAL_HAVE_BE
diff --git a/gdb/gdbserver/server.c b/gdb/gdbserver/server.c
index 232085a..1bdd6a6 100644
--- a/gdb/gdbserver/server.c
+++ b/gdb/gdbserver/server.c
@@ -1277,6 +1277,9 @@ handle_query (char *own_buf, int packet_len, int *new_packet_len_p)
     {
       char *p = &own_buf[10];
 
+      /* Start processing qSupported packet.  */
+      target_process_qsupported (NULL);
+
       /* Process each feature being provided by GDB.  The first
 	 feature will follow a ':', and latter features will follow
 	 ';'.  */
@@ -1292,6 +1295,8 @@ handle_query (char *own_buf, int packet_len, int *new_packet_len_p)
 		if (target_supports_multi_process ())
 		  multi_process = 1;
 	      }
+	    else
+	      target_process_qsupported (p);
 	  }
 
       sprintf (own_buf, "PacketSize=%x;QPassSignals+", PBUFSIZ - 1);
diff --git a/gdb/gdbserver/target.h b/gdb/gdbserver/target.h
index ac68652..6109b1c 100644
--- a/gdb/gdbserver/target.h
+++ b/gdb/gdbserver/target.h
@@ -286,6 +286,9 @@ struct target_ops
 
   /* Returns the core given a thread, or -1 if not known.  */
   int (*core_of_thread) (ptid_t);
+
+  /* Target specific qSupported support.  */
+  void (*process_qsupported) (const char *);
 };
 
 extern struct target_ops *the_target;
@@ -326,6 +329,13 @@ void set_target_ops (struct target_ops *);
   (the_target->supports_multi_process ? \
    (*the_target->supports_multi_process) () : 0)
 
+#define target_process_qsupported(query) \
+  do \
+    { \
+      if (the_target->process_qsupported) \
+	the_target->process_qsupported (query); \
+    } while (0)
+
 /* Start non-stop mode, returns 0 on success, -1 on failure.   */
 
 int start_non_stop (int nonstop);

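For readers following the server.c hunk above, the qSupported dispatch
can be sketched in isolation.  This is only a minimal stand-alone
sketch, not the gdbserver code: scan_qsupported and
feature_is_xml_registers are hypothetical names, the packet is assumed
to have the form "qSupported:feat1;feat2;...", and the sketch reports
the feature as present if it appears anywhere in the list, which is an
assumption of the sketch rather than a statement about the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: nonzero if FEATURE begins with "xmlRegisters=",
   mirroring the strncmp test in x86_linux_process_qsupported.  */
static int
feature_is_xml_registers (const char *feature)
{
  return feature != NULL && strncmp (feature, "xmlRegisters=", 13) == 0;
}

/* Hypothetical helper: walk the ';'-separated feature list that follows
   "qSupported:" and return nonzero if any feature starts with
   "xmlRegisters=".  */
static int
scan_qsupported (const char *packet)
{
  static const char prefix[] = "qSupported:";
  const char *p;

  assert (packet != NULL);
  if (strncmp (packet, prefix, sizeof (prefix) - 1) != 0)
    return 0;

  p = packet + sizeof (prefix) - 1;
  while (*p != '\0')
    {
      if (feature_is_xml_registers (p))
        return 1;

      /* Advance to the start of the next feature, if any.  */
      p = strchr (p, ';');
      if (p == NULL)
        break;
      p++;
    }
  return 0;
}
```

A caller would pass the raw packet text, for example
scan_qsupported ("qSupported:multiprocess+;xmlRegisters=i386"), and use
the result the way the patch uses use_xml.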

* Re: PATCH: 2/6 [3rd try]: Add AVX support (Update document)
  2010-03-30 16:41       ` H.J. Lu
@ 2010-03-30 18:27         ` Eli Zaretskii
  2010-03-30 18:37           ` H.J. Lu
  0 siblings, 1 reply; 115+ messages in thread
From: Eli Zaretskii @ 2010-03-30 18:27 UTC (permalink / raw)
  To: H.J. Lu; +Cc: gdb-patches

> Date: Tue, 30 Mar 2010 09:41:09 -0700
> From: "H.J. Lu" <hongjiu.lu@intel.com>
> 
> On Sun, Mar 28, 2010 at 05:18:27PM -0700, H.J. Lu wrote:
> > Hi,
> > 
> > This patch updates document for AVX support.  OK to install?
> > 
> > Thanks.
> 
> Here is the updated patch since xmlRegisters= has been checked in.
> OK to install?

Yes, thanks.


* Re: PATCH: 2/6 [3rd try]: Add AVX support (Update document)
  2010-03-30 18:27         ` Eli Zaretskii
@ 2010-03-30 18:37           ` H.J. Lu
  0 siblings, 0 replies; 115+ messages in thread
From: H.J. Lu @ 2010-03-30 18:37 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gdb-patches

On Tue, Mar 30, 2010 at 11:26 AM, Eli Zaretskii <eliz@gnu.org> wrote:
>> Date: Tue, 30 Mar 2010 09:41:09 -0700
>> From: "H.J. Lu" <hongjiu.lu@intel.com>
>>
>> On Sun, Mar 28, 2010 at 05:18:27PM -0700, H.J. Lu wrote:
>> > Hi,
>> >
>> > This patch updates document for AVX support.  OK to install?
>> >
>> > Thanks.
>>
>> Here is the updated patch since xmlRegisters= has been checked in.
>> OK to install?
>
> Yes, thanks.
>

Checked in.  Thanks.

-- 
H.J.


* Re: PATCH: 3/6 [3rd try]: Add AVX support (i386 changes)
  2010-03-29  1:11         ` PATCH: 3/6 [3rd " H.J. Lu
@ 2010-04-02 14:31           ` H.J. Lu
  2010-04-02 14:42             ` Mark Kettenis
  2010-04-07 16:55             ` H.J. Lu
  0 siblings, 2 replies; 115+ messages in thread
From: H.J. Lu @ 2010-04-02 14:31 UTC (permalink / raw)
  To: GDB

On Sun, Mar 28, 2010 at 06:11:24PM -0700, H.J. Lu wrote:
> Hi,
> 
> Here are i386 changes to support AVX. OK to install?
> 

Here are the updated i386 changes to support AVX.  OK to install?

Thanks.


H.J.
----
2010-04-02  H.J. Lu  <hongjiu.lu@intel.com>

	* i386-linux-nat.c: Include "regset.h", "elf/common.h",
	<sys/uio.h> and "i386-xstate.h".
	(PTRACE_GETREGSET): New.
	(PTRACE_SETREGSET): Likewise.
	(fetch_xstateregs): Likewise.
	(store_xstateregs): Likewise.
	(GETXSTATEREGS_SUPPLIES): Likewise.
	(regmap): Include 8 upper YMM registers.
	(i386_linux_fetch_inferior_registers): Support XSAVE extended
	state.
	(i386_linux_store_inferior_registers): Likewise.
	(i386_linux_read_description): Check and enable AVX target
	descriptions.

	* i386-linux-tdep.c: Include "regset.h", "i387-tdep.h",
	"i386-xstate.h" and "features/i386/i386-avx-linux.c".
	(i386_linux_regset_sections): Add ".reg-xstate".
	(i386_linux_gregset_reg_offset): Include 8 upper YMM registers.
	(i386_linux_update_xstateregset): New.
	(i386_linux_core_read_xcr0): Likewise.
	(i386_linux_core_read_description): Check and enable AVX target
	description.
	(i386_linux_init_abi): Set xsave_xcr0_offset.
	(_initialize_i386_linux_tdep): Call
	initialize_tdesc_i386_avx_linux.

	* i386-linux-tdep.h (I386_LINUX_ORIG_EAX_REGNUM): Replace
	I386_SSE_NUM_REGS with I386_AVX_NUM_REGS.
	(i386_linux_core_read_xcr0): New.
	(tdesc_i386_avx_linux): Likewise.
	(i386_linux_update_xstateregset): Likewise.
	(I386_LINUX_XSAVE_XCR0_OFFSET): Likewise.

	* i386-tdep.c: Include "i386-xstate.h" and
	"features/i386/i386-avx.c".
	(i386_ymm_names): New.
	(i386_ymmh_names): Likewise.
	(i386_ymmh_regnum_p): Likewise.
	(i386_ymm_regnum_p): Likewise.
	(i386_xmm_regnum_p): Likewise.
	(i386_register_name): Likewise.
	(i386_ymm_type): Likewise.
	(i386_supply_xstateregset): Likewise.
	(i386_collect_xstateregset): Likewise.
	(i386_sse_regnum_p): Removed.
	(i386_pseudo_register_name): Support pseudo YMM registers.
	(i386_pseudo_register_type): Likewise.
	(i386_pseudo_register_read): Likewise.
	(i386_pseudo_register_write): Likewise.
	(i386_dbx_reg_to_regnum): Return %ymmN register number for
	%xmmN if AVX is available.
	(i386_regset_from_core_section): Support .reg-xstate section.
	(i386_register_reggroup_p): Support upper YMM and YMM registers.
	(i386_process_record): Replace i386_sse_regnum_p with
	i386_xmm_regnum_p.
	(i386_validate_tdesc_p): Support org.gnu.gdb.i386.avx feature.
	Set ymmh_register_names, num_ymm_regs, ymm0h_regnum and xcr0.
	(i386_gdbarch_init): Set xstateregset.  Set xsave_xcr0_offset.
	Call set_gdbarch_register_name.  Replace I386_SSE_NUM_REGS with
	I386_AVX_NUM_REGS.  Set ymmh_register_names, ymm0h_regnum and
	num_ymm_regs.  Add num_ymm_regs to set_gdbarch_num_pseudo_regs.
	Set ymm0_regnum.
	(_initialize_i386_tdep): Call initialize_tdesc_i386_avx.

	* i386-tdep.h (gdbarch_tdep): Add xstateregset, ymm0_regnum,
	xcr0, xsave_xcr0_offset, ymm0h_regnum, ymmh_register_names and
	i386_ymm_type.
	(i386_regnum): Add I386_YMM0H_REGNUM and I386_YMM7H_REGNUM.
	(I386_AVX_NUM_REGS): New.
	(i386_xmm_regnum_p): Likewise.
	(i386_ymm_regnum_p): Likewise.
	(i386_ymmh_regnum_p): Likewise.

	* common/i386-xstate.h: New.

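As a reading aid for the pseudo YMM entries in the ChangeLog above:
each $ymmN is assembled from two raw registers, the low 128 bits from
$xmmN and the high 128 bits from the hidden ymmNh register.  The sketch
below models that split outside of GDB.  struct raw_regs, ymm_read and
ymm_write are hypothetical names standing in for the regcache and the
pseudo register read/write hooks; they are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { NUM_YMM = 8, HALF = 16 };

/* Hypothetical stand-in for the raw register cache: the low halves live
   in the XMM registers, the high halves in the hidden YMMnH registers.  */
struct raw_regs
{
  uint8_t xmm[NUM_YMM][HALF];
  uint8_t ymmh[NUM_YMM][HALF];
};

/* Assemble the 32-byte pseudo register YMM<n> into BUF.  */
static void
ymm_read (const struct raw_regs *regs, int n, uint8_t *buf)
{
  assert (n >= 0 && n < NUM_YMM);
  memcpy (buf, regs->xmm[n], HALF);          /* Bits 0..127.  */
  memcpy (buf + HALF, regs->ymmh[n], HALF);  /* Bits 128..255.  */
}

/* Split the 32-byte value in BUF back into the two raw registers.  */
static void
ymm_write (struct raw_regs *regs, int n, const uint8_t *buf)
{
  assert (n >= 0 && n < NUM_YMM);
  memcpy (regs->xmm[n], buf, HALF);
  memcpy (regs->ymmh[n], buf + HALF, HALF);
}
```

Because the low half is the XMM register itself, writing $ymmN in this
model also changes $xmmN, which matches the description of the XMM
pseudo register in the cover letter.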
diff --git a/gdb/common/i386-xstate.h b/gdb/common/i386-xstate.h
new file mode 100644
index 0000000..5e16015
--- /dev/null
+++ b/gdb/common/i386-xstate.h
@@ -0,0 +1,41 @@
+/* Common code for i386 XSAVE extended state.
+
+   Copyright (C) 2010 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#ifndef I386_XSTATE_H
+#define I386_XSTATE_H 1
+
+/* The extended state feature bits.  */
+#define I386_XSTATE_X87		(1ULL << 0)
+#define I386_XSTATE_SSE		(1ULL << 1)
+#define I386_XSTATE_AVX		(1ULL << 2)
+
+/* Supported mask and size of the extended state.  */
+#define I386_XSTATE_SSE_MASK	(I386_XSTATE_X87 | I386_XSTATE_SSE)
+#define I386_XSTATE_AVX_MASK	(I386_XSTATE_SSE_MASK | I386_XSTATE_AVX)
+
+#define I386_XSTATE_SSE_SIZE	576
+#define I386_XSTATE_AVX_SIZE	832
+#define I386_XSTATE_MAX_SIZE	832
+
+/* Get I386 XSAVE extended state size.  */
+#define I386_XSTATE_SIZE(XCR0)	\
+  (((XCR0) & I386_XSTATE_AVX) != 0 \
+   ? I386_XSTATE_AVX_SIZE : I386_XSTATE_SSE_SIZE)
+
+#endif /* I386_XSTATE_H */
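
To make the two tests in i386-xstate.h concrete, here is a small
stand-alone sketch.  The macro values are copied from the header above;
xstate_size and xcr0_selects_avx are hypothetical wrappers added only
for illustration.  Note that the size depends on the AVX bit alone,
while the target description is selected only when x87, SSE and AVX
are all enabled.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from i386-xstate.h in the patch above.  */
#define I386_XSTATE_X87 (1ULL << 0)
#define I386_XSTATE_SSE (1ULL << 1)
#define I386_XSTATE_AVX (1ULL << 2)
#define I386_XSTATE_SSE_MASK (I386_XSTATE_X87 | I386_XSTATE_SSE)
#define I386_XSTATE_AVX_MASK (I386_XSTATE_SSE_MASK | I386_XSTATE_AVX)
#define I386_XSTATE_SSE_SIZE 576
#define I386_XSTATE_AVX_SIZE 832
#define I386_XSTATE_MAX_SIZE 832
#define I386_XSTATE_SIZE(XCR0) \
  (((XCR0) & I386_XSTATE_AVX) != 0 \
   ? I386_XSTATE_AVX_SIZE : I386_XSTATE_SSE_SIZE)

/* Hypothetical wrapper: size in bytes of the XSAVE area GDB would use
   for the given XCR0 value.  */
static unsigned int
xstate_size (uint64_t xcr0)
{
  unsigned int size = I386_XSTATE_SIZE (xcr0);

  assert (size <= I386_XSTATE_MAX_SIZE);
  return size;
}

/* Hypothetical wrapper: nonzero if XCR0 enables x87, SSE and AVX
   together, the condition the patch checks before it selects an AVX
   target description.  */
static int
xcr0_selects_avx (uint64_t xcr0)
{
  return (xcr0 & I386_XSTATE_AVX_MASK) == I386_XSTATE_AVX_MASK;
}
```

An XCR0 with additional, newer feature bits set still passes the AVX
test and still yields the 832-byte size, which is the forward
compatibility described in the cover letter.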
diff --git a/gdb/i386-linux-nat.c b/gdb/i386-linux-nat.c
index 31b9086..d1048eb 100644
--- a/gdb/i386-linux-nat.c
+++ b/gdb/i386-linux-nat.c
@@ -23,11 +23,14 @@
 #include "inferior.h"
 #include "gdbcore.h"
 #include "regcache.h"
+#include "regset.h"
 #include "target.h"
 #include "linux-nat.h"
 
 #include "gdb_assert.h"
 #include "gdb_string.h"
+#include "elf/common.h"
+#include <sys/uio.h>
 #include <sys/ptrace.h>
 #include <sys/user.h>
 #include <sys/procfs.h>
@@ -69,6 +72,19 @@
 
 /* Defines ps_err_e, struct ps_prochandle.  */
 #include "gdb_proc_service.h"
+
+#include "i386-xstate.h"
+
+#ifndef PTRACE_GETREGSET
+#define PTRACE_GETREGSET	0x4204
+#endif
+
+#ifndef PTRACE_SETREGSET
+#define PTRACE_SETREGSET	0x4205
+#endif
+
+/* Does the current host support PTRACE_GETREGSET?  */
+static int have_ptrace_getregset = -1;
 \f
 
 /* The register sets used in GNU/Linux ELF core-dumps are identical to
@@ -98,6 +114,8 @@ static int regmap[] =
   -1, -1, -1, -1,		/* xmm0, xmm1, xmm2, xmm3 */
   -1, -1, -1, -1,		/* xmm4, xmm5, xmm6, xmm6 */
   -1,				/* mxcsr */
+  -1, -1, -1, -1,		/* ymm0h, ymm1h, ymm2h, ymm3h */
+  -1, -1, -1, -1,		/* ymm4h, ymm5h, ymm6h, ymm7h */
   ORIG_EAX
 };
 
@@ -110,6 +128,9 @@ static int regmap[] =
 #define GETFPXREGS_SUPPLIES(regno) \
   (I386_ST0_REGNUM <= (regno) && (regno) < I386_SSE_NUM_REGS)
 
+#define GETXSTATEREGS_SUPPLIES(regno) \
+  (I386_ST0_REGNUM <= (regno) && (regno) < I386_AVX_NUM_REGS)
+
 /* Does the current host support the GETREGS request?  */
 int have_ptrace_getregs =
 #ifdef HAVE_PTRACE_GETREGS
@@ -355,6 +376,57 @@ static void store_fpregs (const struct regcache *regcache, int tid, int regno) {
 
 /* Transfering floating-point and SSE registers to and from GDB.  */
 
+/* Fetch all registers covered by the PTRACE_GETREGSET request from
+   process/thread TID and store their values in GDB's register array.
+   Return non-zero if successful, zero otherwise.  */
+
+static int
+fetch_xstateregs (struct regcache *regcache, int tid)
+{
+  char xstateregs[I386_XSTATE_MAX_SIZE];
+  struct iovec iov;
+
+  if (!have_ptrace_getregset)
+    return 0;
+
+  iov.iov_base = xstateregs;
+  iov.iov_len = sizeof (xstateregs);
+  if (ptrace (PTRACE_GETREGSET, tid, (unsigned int) NT_X86_XSTATE,
+	      &iov) < 0)
+    perror_with_name (_("Couldn't read extended state status"));
+
+  i387_supply_xsave (regcache, -1, xstateregs);
+  return 1;
+}
+
+/* Store all valid registers in GDB's register array covered by the
+   PTRACE_SETREGSET request into the process/thread specified by TID.
+   Return non-zero if successful, zero otherwise.  */
+
+static int
+store_xstateregs (const struct regcache *regcache, int tid, int regno)
+{
+  char xstateregs[I386_XSTATE_MAX_SIZE];
+  struct iovec iov;
+
+  if (!have_ptrace_getregset)
+    return 0;
+
+  iov.iov_base = xstateregs;
+  iov.iov_len = sizeof (xstateregs);
+  if (ptrace (PTRACE_GETREGSET, tid, (unsigned int) NT_X86_XSTATE,
+	      &iov) < 0)
+    perror_with_name (_("Couldn't read extended state status"));
+
+  i387_collect_xsave (regcache, regno, xstateregs, 0);
+
+  if (ptrace (PTRACE_SETREGSET, tid, (unsigned int) NT_X86_XSTATE,
+	      &iov) < 0)
+    perror_with_name (_("Couldn't write extended state status"));
+
+  return 1;
+}
+
 #ifdef HAVE_PTRACE_GETFPXREGS
 
 /* Fill GDB's register array with the floating-point and SSE register
@@ -489,6 +561,8 @@ i386_linux_fetch_inferior_registers (struct target_ops *ops,
 	  return;
 	}
 
+      if (fetch_xstateregs (regcache, tid))
+	return;
       if (fetch_fpxregs (regcache, tid))
 	return;
       fetch_fpregs (regcache, tid);
@@ -501,6 +575,12 @@ i386_linux_fetch_inferior_registers (struct target_ops *ops,
       return;
     }
 
+  if (GETXSTATEREGS_SUPPLIES (regno))
+    {
+      if (fetch_xstateregs (regcache, tid))
+	return;
+    }
+
   if (GETFPXREGS_SUPPLIES (regno))
     {
       if (fetch_fpxregs (regcache, tid))
@@ -553,6 +633,8 @@ i386_linux_store_inferior_registers (struct target_ops *ops,
   if (regno == -1)
     {
       store_regs (regcache, tid, regno);
+      if (store_xstateregs (regcache, tid, regno))
+	return;
       if (store_fpxregs (regcache, tid, regno))
 	return;
       store_fpregs (regcache, tid, regno);
@@ -565,6 +647,12 @@ i386_linux_store_inferior_registers (struct target_ops *ops,
       return;
     }
 
+  if (GETXSTATEREGS_SUPPLIES (regno))
+    {
+      if (store_xstateregs (regcache, tid, regno))
+	return;
+    }
+
   if (GETFPXREGS_SUPPLIES (regno))
     {
       if (store_fpxregs (regcache, tid, regno))
@@ -858,7 +946,50 @@ i386_linux_child_post_startup_inferior (ptid_t ptid)
 static const struct target_desc *
 i386_linux_read_description (struct target_ops *ops)
 {
-  return tdesc_i386_linux;
+  static uint64_t xcr0;
+
+  if (have_ptrace_getregset == -1)
+    {
+      int tid;
+      uint64_t xstateregs[(I386_XSTATE_SSE_SIZE / sizeof (uint64_t))];
+      struct iovec iov;
+      unsigned int xstate_size;
+
+      /* GNU/Linux LWP ID's are process ID's.  */
+      tid = TIDGET (inferior_ptid);
+      if (tid == 0)
+	tid = PIDGET (inferior_ptid); /* Not a threaded program.  */
+
+      iov.iov_base = xstateregs;
+      iov.iov_len = sizeof (xstateregs);
+
+      /* Check if PTRACE_GETREGSET works.  */
+      if (ptrace (PTRACE_GETREGSET, tid, (unsigned int) NT_X86_XSTATE,
+		  &iov) < 0)
+	{
+	  have_ptrace_getregset = 0;
+	  xstate_size = 0;
+	}
+      else
+	{
+	  have_ptrace_getregset = 1;
+
+	  /* Get XCR0 from XSAVE extended state.  */
+	  xcr0 = xstateregs[(I386_LINUX_XSAVE_XCR0_OFFSET
+			     / sizeof (uint64_t))];
+
+	  xstate_size = I386_XSTATE_SIZE (xcr0);
+	}
+
+      i386_linux_update_xstateregset (xstate_size);
+    }
+
+  /* Check the native XCR0 only if PTRACE_GETREGSET is available.  */
+  if (have_ptrace_getregset
+      && (xcr0 & I386_XSTATE_AVX_MASK) == I386_XSTATE_AVX_MASK)
+    return tdesc_i386_avx_linux;
+  else
+    return tdesc_i386_linux;
 }
 
 void
diff --git a/gdb/i386-linux-tdep.c b/gdb/i386-linux-tdep.c
index b23c109..bda5d19 100644
--- a/gdb/i386-linux-tdep.c
+++ b/gdb/i386-linux-tdep.c
@@ -23,6 +23,7 @@
 #include "frame.h"
 #include "value.h"
 #include "regcache.h"
+#include "regset.h"
 #include "inferior.h"
 #include "osabi.h"
 #include "reggroups.h"
@@ -36,9 +37,11 @@
 #include "solib-svr4.h"
 #include "symtab.h"
 #include "arch-utils.h"
-#include "regset.h"
 #include "xml-syscall.h"
 
+#include "i387-tdep.h"
+#include "i386-xstate.h"
+
 /* The syscall's XML filename for i386.  */
 #define XML_SYSCALL_FILENAME_I386 "syscalls/i386-linux.xml"
 
@@ -47,6 +50,7 @@
 #include <stdint.h>
 
 #include "features/i386/i386-linux.c"
+#include "features/i386/i386-avx-linux.c"
 
 /* Supported register note sections.  */
 static struct core_regset_section i386_linux_regset_sections[] =
@@ -54,6 +58,7 @@ static struct core_regset_section i386_linux_regset_sections[] =
   { ".reg", 144, "general-purpose" },
   { ".reg2", 108, "floating-point" },
   { ".reg-xfp", 512, "extended floating-point" },
+  { ".reg-xstate", 0, "XSAVE extended state" },
   { NULL, 0 }
 };
 
@@ -533,6 +538,7 @@ static int i386_linux_gregset_reg_offset[] =
   -1, -1, -1, -1, -1, -1, -1, -1,
   -1, -1, -1, -1, -1, -1, -1, -1,
   -1,
+  -1, -1, -1, -1, -1, -1, -1, -1,
   11 * 4			/* "orig_eax" */
 };
 
@@ -560,6 +566,59 @@ static int i386_linux_sc_reg_offset[] =
   0 * 4				/* %gs */
 };
 
+/* Update XSAVE extended state register note section.  */
+
+void
+i386_linux_update_xstateregset (unsigned int xstate_size)
+{
+  struct core_regset_section *xstate = &i386_linux_regset_sections[3];
+
+  /* Update the XSAVE extended state register note section for "gcore".
+     Disable it if its size is 0.  */
+  gdb_assert (strcmp (xstate->sect_name, ".reg-xstate") == 0);
+  if (xstate_size)
+    xstate->size = xstate_size;
+  else
+    xstate->sect_name = NULL;
+}
+
+/* Get XSAVE extended state xcr0 from core dump.  */
+
+uint64_t
+i386_linux_core_read_xcr0 (struct gdbarch *gdbarch,
+			   struct target_ops *target, bfd *abfd)
+{
+  asection *xstate = bfd_get_section_by_name (abfd, ".reg-xstate");
+  uint64_t xcr0;
+
+  if (xstate)
+    {
+      size_t size = bfd_section_size (abfd, xstate);
+
+      /* Check extended state size.  */
+      if (size < I386_XSTATE_AVX_SIZE)
+	xcr0 = I386_XSTATE_SSE_MASK;
+      else
+	{
+	  char contents[8];
+
+	  if (! bfd_get_section_contents (abfd, xstate, contents,
+					  I386_LINUX_XSAVE_XCR0_OFFSET,
+					  8))
+	    {
+	      warning (_("Couldn't read `xcr0' bytes from `.reg-xstate' section in core file."));
+	      return 0;
+	    }
+
+	  xcr0 = bfd_get_64 (abfd, contents);
+	}
+    }
+  else
+    xcr0 = I386_XSTATE_SSE_MASK;
+
+  return xcr0;
+}
+
 /* Get Linux/x86 target description from core dump.  */
 
 static const struct target_desc *
@@ -568,12 +627,17 @@ i386_linux_core_read_description (struct gdbarch *gdbarch,
 				  bfd *abfd)
 {
   asection *section = bfd_get_section_by_name (abfd, ".reg2");
+  uint64_t xcr0;
 
   if (section == NULL)
     return NULL;
 
   /* Linux/i386.  */
-  return tdesc_i386_linux;
+  xcr0 = i386_linux_core_read_xcr0 (gdbarch, target, abfd);
+  if ((xcr0 & I386_XSTATE_AVX_MASK) == I386_XSTATE_AVX_MASK)
+    return tdesc_i386_avx_linux;
+  else
+    return tdesc_i386_linux;
 }
 
 static void
@@ -623,6 +687,8 @@ i386_linux_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch)
   tdep->sc_reg_offset = i386_linux_sc_reg_offset;
   tdep->sc_num_regs = ARRAY_SIZE (i386_linux_sc_reg_offset);
 
+  tdep->xsave_xcr0_offset = I386_LINUX_XSAVE_XCR0_OFFSET;
+
   set_gdbarch_process_record (gdbarch, i386_process_record);
   set_gdbarch_process_record_signal (gdbarch, i386_linux_record_signal);
 
@@ -840,4 +906,5 @@ _initialize_i386_linux_tdep (void)
 
   /* Initialize the Linux target description  */
   initialize_tdesc_i386_linux ();
+  initialize_tdesc_i386_avx_linux ();
 }
diff --git a/gdb/i386-linux-tdep.h b/gdb/i386-linux-tdep.h
index 11f7295..187769b 100644
--- a/gdb/i386-linux-tdep.h
+++ b/gdb/i386-linux-tdep.h
@@ -30,12 +30,41 @@
 /* Register number for the "orig_eax" pseudo-register.  If this
    pseudo-register contains a value >= 0 it is interpreted as the
    system call number that the kernel is supposed to restart.  */
-#define I386_LINUX_ORIG_EAX_REGNUM I386_SSE_NUM_REGS
+#define I386_LINUX_ORIG_EAX_REGNUM I386_AVX_NUM_REGS
 
 /* Total number of registers for GNU/Linux.  */
 #define I386_LINUX_NUM_REGS (I386_LINUX_ORIG_EAX_REGNUM + 1)
 
+/* Get XSAVE extended state xcr0 from core dump.  */
+extern uint64_t i386_linux_core_read_xcr0
+  (struct gdbarch *gdbarch, struct target_ops *target, bfd *abfd);
+
 /* Linux target description.  */
 extern struct target_desc *tdesc_i386_linux;
+extern struct target_desc *tdesc_i386_avx_linux;
+
+/* Update XSAVE extended state register note section.  */
+extern void i386_linux_update_xstateregset (unsigned int xstate_size);
+
+/* Format of XSAVE extended state is:
+ 	struct
+	{
+	  fxsave_bytes[0..463]
+	  sw_usable_bytes[464..511]
+	  xstate_hdr_bytes[512..575]
+	  avx_bytes[576..831]
+	  future_state etc
+	};
+
+  Same memory layout will be used for the coredump NT_X86_XSTATE
+  representing the XSAVE extended state registers.
+
+  The first 8 bytes of sw_usable_bytes, [464..471], hold the OS-enabled
+  extended state mask, which is the same as the extended control register
+  0 (the XFEATURE_ENABLED_MASK register), XCR0.  We can use this mask
+  together with the mask saved in the xstate_hdr_bytes to determine what
+  states the processor/OS supports and what state, used or initialized,
+  the process/thread is in.  */
+#define I386_LINUX_XSAVE_XCR0_OFFSET 464
 
 #endif /* i386-linux-tdep.h */
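
The layout comment above can be exercised with a small stand-alone
sketch that pulls the two masks out of a raw XSAVE buffer.  The offsets
464 (XCR0 copy) and 512 (xstate_bv, the start of xstate_hdr_bytes) come
from this thread; the helper names read_le64, xsave_xcr0,
xsave_xstate_bv and xsave_avx_in_use are hypothetical and not part of
the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define XSAVE_XCR0_OFFSET 464       /* I386_LINUX_XSAVE_XCR0_OFFSET.  */
#define XSAVE_XSTATE_BV_OFFSET 512  /* Start of xstate_hdr_bytes.  */
#define XSTATE_AVX (1ULL << 2)      /* I386_XSTATE_AVX.  */

/* Read a little-endian 64-bit value from BUF at OFFSET.  */
static uint64_t
read_le64 (const unsigned char *buf, size_t offset)
{
  uint64_t v = 0;
  int i;

  for (i = 7; i >= 0; i--)
    v = (v << 8) | buf[offset + i];
  return v;
}

/* The OS-enabled feature mask that Linux stores in the software usable
   bytes.  */
static uint64_t
xsave_xcr0 (const unsigned char *buf, size_t len)
{
  assert (len >= XSAVE_XCR0_OFFSET + 8);
  return read_le64 (buf, XSAVE_XCR0_OFFSET);
}

/* The mask of states the process/thread is currently using.  */
static uint64_t
xsave_xstate_bv (const unsigned char *buf, size_t len)
{
  assert (len >= XSAVE_XSTATE_BV_OFFSET + 8);
  return read_le64 (buf, XSAVE_XSTATE_BV_OFFSET);
}

/* Nonzero if the AVX bit is set in xstate_bv.  When it is clear, the
   upper YMM halves should be read as zero.  */
static int
xsave_avx_in_use (const unsigned char *buf, size_t len)
{
  return (xsave_xstate_bv (buf, len) & XSTATE_AVX) != 0;
}
```

The same helpers apply to a buffer filled by PTRACE_GETREGSET with
NT_X86_XSTATE or to the contents of a .reg-xstate core section, since
the thread states that both use this layout.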
diff --git a/gdb/i386-tdep.c b/gdb/i386-tdep.c
index 703d003..ce658cd 100644
--- a/gdb/i386-tdep.c
+++ b/gdb/i386-tdep.c
@@ -51,11 +51,13 @@
 
 #include "i386-tdep.h"
 #include "i387-tdep.h"
+#include "i386-xstate.h"
 
 #include "record.h"
 #include <stdint.h>
 
 #include "features/i386/i386.c"
+#include "features/i386/i386-avx.c"
 
 /* Register names.  */
 
@@ -74,6 +76,18 @@ static const char *i386_register_names[] =
   "mxcsr"
 };
 
+static const char *i386_ymm_names[] =
+{
+  "ymm0",  "ymm1",   "ymm2",  "ymm3",
+  "ymm4",  "ymm5",   "ymm6",  "ymm7",
+};
+
+static const char *i386_ymmh_names[] =
+{
+  "ymm0h",  "ymm1h",   "ymm2h",  "ymm3h",
+  "ymm4h",  "ymm5h",   "ymm6h",  "ymm7h",
+};
+
 /* Register names for MMX pseudo-registers.  */
 
 static const char *i386_mmx_names[] =
@@ -150,18 +164,47 @@ i386_dword_regnum_p (struct gdbarch *gdbarch, int regnum)
   return regnum >= 0 && regnum < tdep->num_dword_regs;
 }
 
+int
+i386_ymmh_regnum_p (struct gdbarch *gdbarch, int regnum)
+{
+  struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+  int ymm0h_regnum = tdep->ymm0h_regnum;
+
+  if (ymm0h_regnum < 0)
+    return 0;
+
+  regnum -= ymm0h_regnum;
+  return regnum >= 0 && regnum < tdep->num_ymm_regs;
+}
+
+/* AVX register?  */
+
+int
+i386_ymm_regnum_p (struct gdbarch *gdbarch, int regnum)
+{
+  struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+  int ymm0_regnum = tdep->ymm0_regnum;
+
+  if (ymm0_regnum < 0)
+    return 0;
+
+  regnum -= ymm0_regnum;
+  return regnum >= 0 && regnum < tdep->num_ymm_regs;
+}
+
 /* SSE register?  */
 
-static int
-i386_sse_regnum_p (struct gdbarch *gdbarch, int regnum)
+int
+i386_xmm_regnum_p (struct gdbarch *gdbarch, int regnum)
 {
   struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+  int num_xmm_regs = I387_NUM_XMM_REGS (tdep);
 
-  if (I387_NUM_XMM_REGS (tdep) == 0)
+  if (num_xmm_regs == 0)
     return 0;
 
-  return (I387_XMM0_REGNUM (tdep) <= regnum
-	  && regnum < I387_MXCSR_REGNUM (tdep));
+  regnum -= I387_XMM0_REGNUM (tdep);
+  return regnum >= 0 && regnum < num_xmm_regs;
 }
 
 static int
@@ -201,6 +244,19 @@ i386_fpc_regnum_p (struct gdbarch *gdbarch, int regnum)
 	  && regnum < I387_XMM0_REGNUM (tdep));
 }
 
+/* Return the name of register REGNUM, or the empty string if it is
+   an anonymous register.  */
+
+static const char *
+i386_register_name (struct gdbarch *gdbarch, int regnum)
+{
+  /* Hide the upper YMM registers.  */
+  if (i386_ymmh_regnum_p (gdbarch, regnum))
+    return "";
+
+  return tdesc_register_name (gdbarch, regnum);
+}
+
 /* Return the name of register REGNUM.  */
 
 const char *
@@ -209,6 +265,8 @@ i386_pseudo_register_name (struct gdbarch *gdbarch, int regnum)
   struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
   if (i386_mmx_regnum_p (gdbarch, regnum))
     return i386_mmx_names[regnum - I387_MM0_REGNUM (tdep)];
+  else if (i386_ymm_regnum_p (gdbarch, regnum))
+    return i386_ymm_names[regnum - tdep->ymm0_regnum];
   else if (i386_byte_regnum_p (gdbarch, regnum))
     return i386_byte_names[regnum - tdep->al_regnum];
   else if (i386_word_regnum_p (gdbarch, regnum))
@@ -246,7 +304,13 @@ i386_dbx_reg_to_regnum (struct gdbarch *gdbarch, int reg)
   else if (reg >= 21 && reg <= 28)
     {
       /* SSE registers.  */
-      return reg - 21 + I387_XMM0_REGNUM (tdep);
+      int ymm0_regnum = tdep->ymm0_regnum;
+
+      if (ymm0_regnum >= 0
+	  && i386_xmm_regnum_p (gdbarch, reg))
+	return reg - 21 + ymm0_regnum;
+      else
+	return reg - 21 + I387_XMM0_REGNUM (tdep);
     }
   else if (reg >= 29 && reg <= 36)
     {
@@ -2184,6 +2248,59 @@ i387_ext_type (struct gdbarch *gdbarch)
   return tdep->i387_ext_type;
 }
 
+/* Construct vector type for pseudo YMM registers.  We can't use
+   tdesc_find_type since YMM isn't described in target description.  */
+
+static struct type *
+i386_ymm_type (struct gdbarch *gdbarch)
+{
+  struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+
+  if (!tdep->i386_ymm_type)
+    {
+      const struct builtin_type *bt = builtin_type (gdbarch);
+
+      /* The type we're building is this: */
+#if 0
+      union __gdb_builtin_type_vec256i
+      {
+        int128_t v2_int128[2];
+        int64_t v4_int64[4];
+        int32_t v8_int32[8];
+        int16_t v16_int16[16];
+        int8_t v32_int8[32];
+        double v4_double[4];
+        float v8_float[8];
+      };
+#endif
+
+      struct type *t;
+
+      t = arch_composite_type (gdbarch,
+			       "__gdb_builtin_type_vec256i", TYPE_CODE_UNION);
+      append_composite_type_field (t, "v8_float",
+				   init_vector_type (bt->builtin_float, 8));
+      append_composite_type_field (t, "v4_double",
+				   init_vector_type (bt->builtin_double, 4));
+      append_composite_type_field (t, "v32_int8",
+				   init_vector_type (bt->builtin_int8, 32));
+      append_composite_type_field (t, "v16_int16",
+				   init_vector_type (bt->builtin_int16, 16));
+      append_composite_type_field (t, "v8_int32",
+				   init_vector_type (bt->builtin_int32, 8));
+      append_composite_type_field (t, "v4_int64",
+				   init_vector_type (bt->builtin_int64, 4));
+      append_composite_type_field (t, "v2_int128",
+				   init_vector_type (bt->builtin_int128, 2));
+
+      TYPE_VECTOR (t) = 1;
+      TYPE_NAME (t) = "builtin_type_vec256i";
+      tdep->i386_ymm_type = t;
+    }
+
+  return tdep->i386_ymm_type;
+}
+
 /* Construct vector type for MMX registers.  */
 static struct type *
 i386_mmx_type (struct gdbarch *gdbarch)
@@ -2234,6 +2351,8 @@ i386_pseudo_register_type (struct gdbarch *gdbarch, int regnum)
 {
   if (i386_mmx_regnum_p (gdbarch, regnum))
     return i386_mmx_type (gdbarch);
+  else if (i386_ymm_regnum_p (gdbarch, regnum))
+    return i386_ymm_type (gdbarch);
   else
     {
       const struct builtin_type *bt = builtin_type (gdbarch);
@@ -2285,7 +2404,22 @@ i386_pseudo_register_read (struct gdbarch *gdbarch, struct regcache *regcache,
     {
       struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
 
-      if (i386_word_regnum_p (gdbarch, regnum))
+      if (i386_ymm_regnum_p (gdbarch, regnum))
+	{
+	  regnum -= tdep->ymm0_regnum;
+
+	  /* Extract (always little endian).  Read lower 128bits. */
+	  regcache_raw_read (regcache,
+			     I387_XMM0_REGNUM (tdep) + regnum,
+			     raw_buf);
+	  memcpy (buf, raw_buf, 16);
+	  /* Read upper 128bits.  */
+	  regcache_raw_read (regcache,
+			     tdep->ymm0h_regnum + regnum,
+			     raw_buf);
+	  memcpy (buf + 16, raw_buf, 16);
+	}
+      else if (i386_word_regnum_p (gdbarch, regnum))
 	{
 	  int gpnum = regnum - tdep->ax_regnum;
 
@@ -2334,7 +2468,20 @@ i386_pseudo_register_write (struct gdbarch *gdbarch, struct regcache *regcache,
     {
       struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
 
-      if (i386_word_regnum_p (gdbarch, regnum))
+      if (i386_ymm_regnum_p (gdbarch, regnum))
+	{
+	  regnum -= tdep->ymm0_regnum;
+
+	  /* ... Write lower 128bits.  */
+	  regcache_raw_write (regcache,
+			     I387_XMM0_REGNUM (tdep) + regnum,
+			     buf);
+	  /* ... Write upper 128bits.  */
+	  regcache_raw_write (regcache,
+			     tdep->ymm0h_regnum + regnum,
+			     buf + 16);
+	}
+      else if (i386_word_regnum_p (gdbarch, regnum))
 	{
 	  int gpnum = regnum - tdep->ax_regnum;
 
@@ -2581,6 +2728,28 @@ i386_collect_fpregset (const struct regset *regset,
   i387_collect_fsave (regcache, regnum, fpregs);
 }
 
+/* Similar to i386_supply_fpregset, but use XSAVE extended state.  */
+
+static void
+i386_supply_xstateregset (const struct regset *regset,
+			  struct regcache *regcache, int regnum,
+			  const void *xstateregs, size_t len)
+{
+  const struct gdbarch_tdep *tdep = gdbarch_tdep (regset->arch);
+  i387_supply_xsave (regcache, regnum, xstateregs);
+}
+
+/* Similar to i386_collect_fpregset, but use XSAVE extended state.  */
+
+static void
+i386_collect_xstateregset (const struct regset *regset,
+			   const struct regcache *regcache,
+			   int regnum, void *xstateregs, size_t len)
+{
+  const struct gdbarch_tdep *tdep = gdbarch_tdep (regset->arch);
+  i387_collect_xsave (regcache, regnum, xstateregs, 1);
+}
+
 /* Return the appropriate register set for the core section identified
    by SECT_NAME and SECT_SIZE.  */
 
@@ -2608,6 +2777,16 @@ i386_regset_from_core_section (struct gdbarch *gdbarch,
       return tdep->fpregset;
     }
 
+  if (strcmp (sect_name, ".reg-xstate") == 0)
+    {
+      if (tdep->xstateregset == NULL)
+	tdep->xstateregset = regset_alloc (gdbarch,
+					   i386_supply_xstateregset,
+					   i386_collect_xstateregset);
+
+      return tdep->xstateregset;
+    }
+
   return NULL;
 }
 \f
@@ -2801,46 +2980,60 @@ int
 i386_register_reggroup_p (struct gdbarch *gdbarch, int regnum,
 			  struct reggroup *group)
 {
-  int sse_regnum_p, fp_regnum_p, mmx_regnum_p, byte_regnum_p,
-      word_regnum_p, dword_regnum_p;
+  const struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+  int fp_regnum_p, mmx_regnum_p, xmm_regnum_p, mxcsr_regnum_p,
+      ymm_regnum_p, ymmh_regnum_p;
 
   /* Don't include pseudo registers, except for MMX, in any register
      groups.  */
-  byte_regnum_p = i386_byte_regnum_p (gdbarch, regnum);
-  if (byte_regnum_p)
+  if (i386_byte_regnum_p (gdbarch, regnum))
     return 0;
 
-  word_regnum_p = i386_word_regnum_p (gdbarch, regnum);
-  if (word_regnum_p)
+  if (i386_word_regnum_p (gdbarch, regnum))
     return 0;
 
-  dword_regnum_p = i386_dword_regnum_p (gdbarch, regnum);
-  if (dword_regnum_p)
+  if (i386_dword_regnum_p (gdbarch, regnum))
     return 0;
 
   mmx_regnum_p = i386_mmx_regnum_p (gdbarch, regnum);
   if (group == i386_mmx_reggroup)
     return mmx_regnum_p;
 
-  sse_regnum_p = (i386_sse_regnum_p (gdbarch, regnum)
-		  || i386_mxcsr_regnum_p (gdbarch, regnum));
+  xmm_regnum_p = i386_xmm_regnum_p (gdbarch, regnum);
+  mxcsr_regnum_p = i386_mxcsr_regnum_p (gdbarch, regnum);
   if (group == i386_sse_reggroup)
-    return sse_regnum_p;
+    return xmm_regnum_p || mxcsr_regnum_p;
+
+  ymm_regnum_p = i386_ymm_regnum_p (gdbarch, regnum);
   if (group == vector_reggroup)
-    return mmx_regnum_p || sse_regnum_p;
+    return (mmx_regnum_p
+	    || ymm_regnum_p
+	    || mxcsr_regnum_p
+	    || (xmm_regnum_p
+		&& ((tdep->xcr0 & I386_XSTATE_AVX_MASK)
+		    == I386_XSTATE_SSE_MASK)));
 
   fp_regnum_p = (i386_fp_regnum_p (gdbarch, regnum)
 		 || i386_fpc_regnum_p (gdbarch, regnum));
   if (group == float_reggroup)
     return fp_regnum_p;
 
+  /* For "info reg all", don't include upper YMM registers nor XMM
+     registers when AVX is supported.  */
+  ymmh_regnum_p = i386_ymmh_regnum_p (gdbarch, regnum);
+  if (group == all_reggroup
+      && ((xmm_regnum_p
+	   && (tdep->xcr0 & I386_XSTATE_AVX))
+	  || ymmh_regnum_p))
+    return 0;
+
   if (group == general_reggroup)
     return (!fp_regnum_p
 	    && !mmx_regnum_p
-	    && !sse_regnum_p
-	    && !byte_regnum_p
-	    && !word_regnum_p
-	    && !dword_regnum_p);
+	    && !mxcsr_regnum_p
+	    && !xmm_regnum_p
+	    && !ymm_regnum_p
+	    && !ymmh_regnum_p);
 
   return default_register_reggroup_p (gdbarch, regnum, group);
 }
@@ -5665,7 +5858,7 @@ no_support_3dnow_data:
               record_arch_list_add_reg (ir.regcache, i);
 
             for (i = I387_XMM0_REGNUM (tdep);
-                 i386_sse_regnum_p (gdbarch, i); i++)
+                 i386_xmm_regnum_p (gdbarch, i); i++)
               record_arch_list_add_reg (ir.regcache, i);
 
             if (i386_mxcsr_regnum_p (gdbarch, I387_MXCSR_REGNUM(tdep)))
@@ -6065,7 +6258,7 @@ reswitch_prefix_add:
           if (i386_record_modrm (&ir))
 	    return -1;
           ir.reg |= rex_r;
-          if (!i386_sse_regnum_p (gdbarch, I387_XMM0_REGNUM (tdep) + ir.reg))
+          if (!i386_xmm_regnum_p (gdbarch, I387_XMM0_REGNUM (tdep) + ir.reg))
             goto no_support;
           record_arch_list_add_reg (ir.regcache,
                                     I387_XMM0_REGNUM (tdep) + ir.reg);
@@ -6097,7 +6290,7 @@ reswitch_prefix_add:
                   || opcode == 0x0f17 || opcode == 0x660f17)
                 goto no_support;
               ir.rm |= ir.rex_b;
-              if (!i386_sse_regnum_p (gdbarch, I387_XMM0_REGNUM (tdep) + ir.rm))
+              if (!i386_xmm_regnum_p (gdbarch, I387_XMM0_REGNUM (tdep) + ir.rm))
                 goto no_support;
               record_arch_list_add_reg (ir.regcache,
                                         I387_XMM0_REGNUM (tdep) + ir.rm);
@@ -6275,7 +6468,7 @@ reswitch_prefix_add:
           if (i386_record_modrm (&ir))
 	    return -1;
           ir.rm |= ir.rex_b;
-          if (!i386_sse_regnum_p (gdbarch, I387_XMM0_REGNUM (tdep) + ir.rm))
+          if (!i386_xmm_regnum_p (gdbarch, I387_XMM0_REGNUM (tdep) + ir.rm))
             goto no_support;
           record_arch_list_add_reg (ir.regcache,
                                     I387_XMM0_REGNUM (tdep) + ir.rm);
@@ -6329,7 +6522,7 @@ reswitch_prefix_add:
           if (ir.mod == 3)
             {
               ir.rm |= ir.rex_b;
-              if (!i386_sse_regnum_p (gdbarch, I387_XMM0_REGNUM (tdep) + ir.rm))
+              if (!i386_xmm_regnum_p (gdbarch, I387_XMM0_REGNUM (tdep) + ir.rm))
                 goto no_support;
               record_arch_list_add_reg (ir.regcache,
                                         I387_XMM0_REGNUM (tdep) + ir.rm);
@@ -6449,7 +6642,8 @@ i386_validate_tdesc_p (struct gdbarch_tdep *tdep,
 		       struct tdesc_arch_data *tdesc_data)
 {
   const struct target_desc *tdesc = tdep->tdesc;
-  const struct tdesc_feature *feature_core, *feature_vector;
+  const struct tdesc_feature *feature_core;
+  const struct tdesc_feature *feature_sse, *feature_avx;
   int i, num_regs, valid_p;
 
   if (! tdesc_has_registers (tdesc))
@@ -6459,13 +6653,37 @@ i386_validate_tdesc_p (struct gdbarch_tdep *tdep,
   feature_core = tdesc_find_feature (tdesc, "org.gnu.gdb.i386.core");
 
   /* Get SSE registers.  */
-  feature_vector = tdesc_find_feature (tdesc, "org.gnu.gdb.i386.sse");
+  feature_sse = tdesc_find_feature (tdesc, "org.gnu.gdb.i386.sse");
 
-  if (feature_core == NULL || feature_vector == NULL)
+  if (feature_core == NULL || feature_sse == NULL)
     return 0;
 
+  /* Try AVX registers.  */
+  feature_avx = tdesc_find_feature (tdesc, "org.gnu.gdb.i386.avx");
+
   valid_p = 1;
 
+  /* The XCR0 bits.  */
+  if (feature_avx)
+    {
+      tdep->xcr0 = I386_XSTATE_AVX_MASK;
+
+      /* It may have been set by OSABI initialization function.  */
+      if (tdep->num_ymm_regs == 0)
+	{
+	  tdep->ymmh_register_names = i386_ymmh_names;
+	  tdep->num_ymm_regs = 8;
+	  tdep->ymm0h_regnum = I386_YMM0H_REGNUM;
+	}
+
+      for (i = 0; i < tdep->num_ymm_regs; i++)
+	valid_p &= tdesc_numbered_register (feature_avx, tdesc_data,
+					    tdep->ymm0h_regnum + i,
+					    tdep->ymmh_register_names[i]);
+    }
+  else
+    tdep->xcr0 = I386_XSTATE_SSE_MASK;
+
   num_regs = tdep->num_core_regs;
   for (i = 0; i < num_regs; i++)
     valid_p &= tdesc_numbered_register (feature_core, tdesc_data, i,
@@ -6474,7 +6692,7 @@ i386_validate_tdesc_p (struct gdbarch_tdep *tdep,
   /* Need to include %mxcsr, so add one.  */
   num_regs += tdep->num_xmm_regs + 1;
   for (; i < num_regs; i++)
-    valid_p &= tdesc_numbered_register (feature_vector, tdesc_data, i,
+    valid_p &= tdesc_numbered_register (feature_sse, tdesc_data, i,
 					tdep->register_names[i]);
 
   return valid_p;
@@ -6489,6 +6707,7 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   struct tdesc_arch_data *tdesc_data;
   const struct target_desc *tdesc;
   int mm0_regnum;
+  int ymm0_regnum;
 
   /* If there is already a candidate, use it.  */
   arches = gdbarch_list_lookup_by_info (arches, &info);
@@ -6509,6 +6728,8 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   tdep->fpregset = NULL;
   tdep->sizeof_fpregset = I387_SIZEOF_FSAVE;
 
+  tdep->xstateregset = NULL;
+
   /* The default settings include the FPU registers, the MMX registers
      and the SSE registers.  This can be overridden for a specific ABI
      by adjusting the members `st0_regnum', `mm0_regnum' and
@@ -6538,6 +6759,8 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   tdep->sc_pc_offset = -1;
   tdep->sc_sp_offset = -1;
 
+  tdep->xsave_xcr0_offset = -1;
+
   tdep->record_regmap = i386_record_regmap;
 
   /* The format used for `long double' on almost all i386 targets is
@@ -6654,9 +6877,14 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   set_tdesc_pseudo_register_type (gdbarch, i386_pseudo_register_type);
   set_tdesc_pseudo_register_name (gdbarch, i386_pseudo_register_name);
 
-  /* The default ABI includes general-purpose registers, 
-     floating-point registers, and the SSE registers.  */
-  set_gdbarch_num_regs (gdbarch, I386_SSE_NUM_REGS);
+  /* Override the normal target description method to make the AVX
+     upper halves anonymous.  */
+  set_gdbarch_register_name (gdbarch, i386_register_name);
+
+  /* Even though the default ABI only includes general-purpose registers,
+     floating-point registers and the SSE registers, we have to leave a
+     gap for the upper AVX registers.  */
+  set_gdbarch_num_regs (gdbarch, I386_AVX_NUM_REGS);
 
   /* Get the x86 target description from INFO.  */
   tdesc = info.target_desc;
@@ -6667,10 +6895,15 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   tdep->num_core_regs = I386_NUM_GREGS + I387_NUM_REGS;
   tdep->register_names = i386_register_names;
 
+  /* No upper YMM registers.  */
+  tdep->ymmh_register_names = NULL;
+  tdep->ymm0h_regnum = -1;
+
   tdep->num_byte_regs = 8;
   tdep->num_word_regs = 8;
   tdep->num_dword_regs = 0;
   tdep->num_mmx_regs = 8;
+  tdep->num_ymm_regs = 0;
 
   tdesc_data = tdesc_data_alloc ();
 
@@ -6678,24 +6911,25 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   info.tdep_info = (void *) tdesc_data;
   gdbarch_init_osabi (info, gdbarch);
 
+  if (!i386_validate_tdesc_p (tdep, tdesc_data))
+    {
+      tdesc_data_cleanup (tdesc_data);
+      xfree (tdep);
+      gdbarch_free (gdbarch);
+      return NULL;
+    }
+
   /* Wire in pseudo registers.  Number of pseudo registers may be
      changed.  */
   set_gdbarch_num_pseudo_regs (gdbarch, (tdep->num_byte_regs
 					 + tdep->num_word_regs
 					 + tdep->num_dword_regs
-					 + tdep->num_mmx_regs));
+					 + tdep->num_mmx_regs
+					 + tdep->num_ymm_regs));
 
   /* Target description may be changed.  */
   tdesc = tdep->tdesc;
 
-  if (!i386_validate_tdesc_p (tdep, tdesc_data))
-    {
-      tdesc_data_cleanup (tdesc_data);
-      xfree (tdep);
-      gdbarch_free (gdbarch);
-      return NULL;
-    }
-
   tdesc_use_registers (gdbarch, tdesc, tdesc_data);
 
   /* Override gdbarch_register_reggroup_p set in tdesc_use_registers.  */
@@ -6705,16 +6939,26 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   tdep->al_regnum = gdbarch_num_regs (gdbarch);
   tdep->ax_regnum = tdep->al_regnum + tdep->num_byte_regs;
 
-  mm0_regnum = tdep->ax_regnum + tdep->num_word_regs;
+  ymm0_regnum = tdep->ax_regnum + tdep->num_word_regs;
   if (tdep->num_dword_regs)
     {
       /* Support dword pseudo-registesr if it hasn't been disabled,  */
-      tdep->eax_regnum = mm0_regnum;
-      mm0_regnum = tdep->eax_regnum + tdep->num_dword_regs;
+      tdep->eax_regnum = ymm0_regnum;
+      ymm0_regnum += tdep->num_dword_regs;
     }
   else
     tdep->eax_regnum = -1;
 
+  mm0_regnum = ymm0_regnum;
+  if (tdep->num_ymm_regs)
+    {
+      /* Support YMM pseudo-registers if they are available.  */
+      tdep->ymm0_regnum = ymm0_regnum;
+      mm0_regnum += tdep->num_ymm_regs;
+    }
+  else
+    tdep->ymm0_regnum = -1;
+
   if (tdep->num_mmx_regs != 0)
     {
       /* Support MMX pseudo-registesr if MMX hasn't been disabled,  */
@@ -6797,6 +7041,7 @@ is \"default\"."),
 
   /* Initialize the standard target descriptions.  */
   initialize_tdesc_i386 ();
+  initialize_tdesc_i386_avx ();
 
   /* Tell remote stub that we support XML target description.  */
   register_remote_support_xml ("i386");
diff --git a/gdb/i386-tdep.h b/gdb/i386-tdep.h
index 72c634e..6520d67 100644
--- a/gdb/i386-tdep.h
+++ b/gdb/i386-tdep.h
@@ -109,6 +109,9 @@ struct gdbarch_tdep
   struct regset *fpregset;
   size_t sizeof_fpregset;
 
+  /* XSAVE extended state.  */
+  struct regset *xstateregset;
+
   /* Register number for %st(0).  The register numbers for the other
      registers follow from this one.  Set this to -1 to indicate the
      absence of an FPU.  */
@@ -121,6 +124,13 @@ struct gdbarch_tdep
      of MMX support.  */
   int mm0_regnum;
 
+  /* Number of pseudo YMM registers.  */
+  int num_ymm_regs;
+
+  /* Register number for %ymm0.  Set this to -1 to indicate the absence
+     of pseudo YMM register support.  */
+  int ymm0_regnum;
+
   /* Number of byte registers.  */
   int num_byte_regs;
 
@@ -146,9 +156,24 @@ struct gdbarch_tdep
   /* Number of SSE registers.  */
   int num_xmm_regs;
 
+  /* Bits of the extended control register 0 (the XFEATURE_ENABLED_MASK
+     register), excluding the x87 bit, which are supported by this GDB.
+   */
+  uint64_t xcr0;
+
+  /* Offset of XCR0 in XSAVE extended state.  */
+  int xsave_xcr0_offset;
+
   /* Register names.  */
   const char **register_names;
 
+  /* Register number for %ymm0h.  Set this to -1 to indicate the absence
+     of upper YMM register support.  */
+  int ymm0h_regnum;
+
+  /* Upper YMM register names.  Only used for tdesc_numbered_register.  */
+  const char **ymmh_register_names;
+
   /* Target description.  */
   const struct target_desc *tdesc;
 
@@ -182,6 +207,7 @@ struct gdbarch_tdep
 
   /* ISA-specific data types.  */
   struct type *i386_mmx_type;
+  struct type *i386_ymm_type;
   struct type *i387_ext_type;
 
   /* Process record/replay target.  */
@@ -228,7 +254,9 @@ enum i386_regnum
   I386_FS_REGNUM,		/* %fs */
   I386_GS_REGNUM,		/* %gs */
   I386_ST0_REGNUM,		/* %st(0) */
-  I386_MXCSR_REGNUM = 40	/* %mxcsr */ 
+  I386_MXCSR_REGNUM = 40,	/* %mxcsr */ 
+  I386_YMM0H_REGNUM,		/* %ymm0h */
+  I386_YMM7H_REGNUM = I386_YMM0H_REGNUM + 7
 };
 
 /* Register numbers of RECORD_REGMAP.  */
@@ -265,6 +293,7 @@ enum record_i386_regnum
 #define I386_NUM_XREGS  9
 
 #define I386_SSE_NUM_REGS	(I386_MXCSR_REGNUM + 1)
+#define I386_AVX_NUM_REGS	(I386_YMM7H_REGNUM + 1)
 
 /* Size of the largest register.  */
 #define I386_MAX_REGISTER_SIZE	16
@@ -276,6 +305,9 @@ extern struct type *i387_ext_type (struct gdbarch *gdbarch);
 extern int i386_byte_regnum_p (struct gdbarch *gdbarch, int regnum);
 extern int i386_word_regnum_p (struct gdbarch *gdbarch, int regnum);
 extern int i386_dword_regnum_p (struct gdbarch *gdbarch, int regnum);
+extern int i386_xmm_regnum_p (struct gdbarch *gdbarch, int regnum);
+extern int i386_ymm_regnum_p (struct gdbarch *gdbarch, int regnum);
+extern int i386_ymmh_regnum_p (struct gdbarch *gdbarch, int regnum);
 
 extern const char *i386_pseudo_register_name (struct gdbarch *gdbarch,
 					      int regnum);
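
For readers following the i386_pseudo_register_read and
i386_pseudo_register_write hunks above: each pseudo %ymmN is assembled from
two raw 16-byte registers, %xmmN for the low half and the anonymous %ymmNh
for the upper half.  The sketch below shows only that byte layout.  The
helper names ymm_join and ymm_split are hypothetical and not part of the
patch, and plain buffers stand in for the regcache.

```c
#include <stdint.h>
#include <string.h>

enum { XMM_SIZE = 16, YMM_SIZE = 32 };

/* Hypothetical helper: build the 32-byte pseudo %ymmN value from the
   raw %xmmN bytes (low 128 bits) and the raw %ymmNh bytes (upper
   128 bits), in the same order the patch copies them into BUF.  */
static void
ymm_join (const uint8_t *xmm, const uint8_t *ymmh, uint8_t *ymm)
{
  memcpy (ymm, xmm, XMM_SIZE);
  memcpy (ymm + XMM_SIZE, ymmh, XMM_SIZE);
}

/* Hypothetical helper: the reverse direction, as done on a pseudo
   register write.  The low half goes to %xmmN and the upper half
   goes to %ymmNh.  */
static void
ymm_split (const uint8_t *ymm, uint8_t *xmm, uint8_t *ymmh)
{
  memcpy (xmm, ymm, XMM_SIZE);
  memcpy (ymmh, ymm + XMM_SIZE, XMM_SIZE);
}
```

Because the upper halves are separate raw registers, a write to $xmmN
leaves %ymmNh untouched, which is what lets $xmmN act as a view of the
lower 128 bits of $ymmN.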


* Re: PATCH: 4/6 [3rd try]: Add AVX support (amd64 changes)
  2010-03-29  1:07           ` PATCH: 4/6 [3rd " H.J. Lu
@ 2010-04-02 14:32             ` H.J. Lu
  2010-04-07 16:54               ` H.J. Lu
  0 siblings, 1 reply; 115+ messages in thread
From: H.J. Lu @ 2010-04-02 14:32 UTC (permalink / raw)
  To: GDB

On Sun, Mar 28, 2010 at 06:07:34PM -0700, H.J. Lu wrote:
> Here are the amd64 changes to support AVX with AVX testcases. I
> also need to import cpuid.h from gcc 4.4 since AVX testcases need
> ECX from cpuid.  OK to install?
> 
> 

Here are the updated amd64 changes for AVX.  OK to install?
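
As a rough illustration of the selection logic in
amd64_linux_read_description, here is a minimal sketch.  It assumes the
usual XCR0 bit layout (x87 = bit 0, SSE = bit 1, AVX = bit 2) and the
Linux convention of storing XCR0 at byte offset 464 of the XSAVE area.
The macro names, the enum and both helpers are made up for the example;
the real code uses the I386_XSTATE_* definitions and returns target
description pointers.

```c
#include <stdint.h>
#include <string.h>

/* Assumed bit values, modelled on the XCR0 layout.  */
#define XSTATE_X87	(1ULL << 0)
#define XSTATE_SSE	(1ULL << 1)
#define XSTATE_AVX	(1ULL << 2)
#define XSTATE_SSE_MASK	(XSTATE_X87 | XSTATE_SSE)
#define XSTATE_AVX_MASK	(XSTATE_SSE_MASK | XSTATE_AVX)

/* Assumed offset of the kernel-provided XCR0 copy in the XSAVE area.  */
#define XSAVE_XCR0_OFFSET 464

/* Hypothetical stand-ins for the four target descriptions.  */
enum tdesc_kind { TDESC_I386, TDESC_I386_AVX, TDESC_AMD64, TDESC_AMD64_AVX };

/* Read the 8-byte XCR0 copy from an XSAVE buffer in host byte order.  */
static uint64_t
read_xcr0 (const unsigned char *xsave)
{
  uint64_t xcr0;

  memcpy (&xcr0, xsave + XSAVE_XCR0_OFFSET, sizeof (xcr0));
  return xcr0;
}

/* Pick a description: AVX only if PTRACE_GETREGSET works and all of the
   x87, SSE and AVX bits are set in XCR0.  */
static enum tdesc_kind
select_tdesc (int have_getregset, uint64_t xcr0, int is_64bit)
{
  int avx = (have_getregset
	     && (xcr0 & XSTATE_AVX_MASK) == XSTATE_AVX_MASK);

  if (is_64bit)
    return avx ? TDESC_AMD64_AVX : TDESC_AMD64;
  else
    return avx ? TDESC_I386_AVX : TDESC_I386;
}
```

Comparing against the full mask rather than testing the AVX bit alone
means an XCR0 value with AVX set but x87 or SSE missing falls back to the
non-AVX description.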

Thanks.


H.J.
---
gdb/

2010-04-02  H.J. Lu  <hongjiu.lu@intel.com>

	* amd64-linux-nat.c: Include "regset.h", "elf/common.h",
	<sys/uio.h> and "i386-xstate.h".
	(PTRACE_GETREGSET): New.
	(PTRACE_SETREGSET): Likewise.
	(have_ptrace_getregset): Likewise.
	(amd64_linux_gregset64_reg_offset): Include 16 upper YMM
	registers.
	(amd64_linux_gregset32_reg_offset): Include 8 upper YMM
	registers.
	(amd64_linux_fetch_inferior_registers): Support PTRACE_GETREGSET.
	(amd64_linux_store_inferior_registers): Likewise.
	(amd64_linux_read_description): Check and enable AVX target
	descriptions.

	* amd64-linux-tdep.c: Include "regset.h", "i386-linux-tdep.h",
	"i386-xstate.h" and "features/i386/amd64-avx-linux.c".
	(amd64_linux_regset_sections): New.
	(amd64_linux_update_xstateregset): Likewise.
	(amd64_linux_core_read_description): Check and enable AVX
	target description.
	(amd64_linux_init_abi): Set xsave_xcr0_offset.  Call
	set_gdbarch_core_regset_sections.
	(_initialize_amd64_linux_tdep): Call
	initialize_tdesc_amd64_avx_linux.

	* amd64-linux-tdep.h (AMD64_LINUX_ORIG_RAX_REGNUM): Replace
	AMD64_MXCSR_REGNUM with AMD64_YMM15H_REGNUM.
	(tdesc_amd64_avx_linux): New.
	(amd64_linux_update_xstateregset): Likewise.

	* amd64-tdep.c: Include "features/i386/amd64-avx.c".
	(amd64_ymm_names): New.
	(amd64_ymmh_names): Likewise.
	(amd64_register_name): Likewise.
	(amd64_supply_xstateregset): Likewise.
	(amd64_collect_xstateregset): Likewise.
	(amd64_supply_xsave): Likewise.
	(amd64_collect_xsave): Likewise.
	(AMD64_NUM_REGS): Removed.
	(amd64_dwarf_reg_to_regnum): Return %ymmN register number for
	%xmmN if AVX is available.
	(amd64_pseudo_register_name): Support pseudo YMM registers.
	(amd64_regset_from_core_section): Support .reg-xstate section.
	(amd64_init_abi): Set ymmh_register_names, num_ymm_regs
	and ymm0h_regnum.  Call set_gdbarch_register_name.
	(_initialize_amd64_tdep): Call initialize_tdesc_amd64_avx.

	* amd64-tdep.h (amd64_regnum): Add AMD64_YMM0H_REGNUM and
	AMD64_YMM15H_REGNUM.
	(AMD64_NUM_REGS): New.
	(amd64_supply_xsave): Likewise.
	(amd64_collect_xsave): Likewise.
	(amd64_register_name): Removed.
	(amd64_register_type): Likewise.

gdb/testsuite/

2010-04-02  H.J. Lu  <hongjiu.lu@intel.com>

	* gdb.arch/i386-avx.c: New.
	* gdb.arch/i386-avx.exp: Likewise.

	* gdb.arch/i386-cpuid.h: Updated from gcc 4.4.
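
The amd64_dwarf_reg_to_regnum entry above amounts to a fixed offset: when
pseudo YMM registers exist, a DWARF register that would map to raw %xmmN
is redirected to pseudo %ymmN.  A minimal sketch of that arithmetic
follows.  The struct, the helper names and any register numbers a caller
passes in are illustrative assumptions, not GDB's actual layout.

```c
/* Hypothetical, simplified view of the fields consulted by the remap.  */
struct reg_layout
{
  int xmm0_regnum;	/* First raw XMM register.  */
  int num_xmm_regs;	/* Number of raw XMM registers.  */
  int ymm0_regnum;	/* First pseudo YMM register, or -1 if none.  */
};

/* Return nonzero if REGNUM is one of the raw XMM registers.  */
static int
is_xmm_regnum (const struct reg_layout *l, int regnum)
{
  return (regnum >= l->xmm0_regnum
	  && regnum < l->xmm0_regnum + l->num_xmm_regs);
}

/* Redirect raw %xmmN to pseudo %ymmN when YMM registers are available;
   leave every other register number unchanged.  */
static int
remap_xmm_to_ymm (const struct reg_layout *l, int regnum)
{
  if (l->ymm0_regnum >= 0 && is_xmm_regnum (l, regnum))
    regnum += l->ymm0_regnum - l->xmm0_regnum;

  return regnum;
}
```

With a layout of, say, xmm0_regnum = 40 and ymm0_regnum = 130, register 42
(%xmm2) becomes 132 (%ymm2), and the same call with ymm0_regnum = -1
returns 42 unchanged.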

diff --git a/gdb/amd64-linux-nat.c b/gdb/amd64-linux-nat.c
index b9d5833..c5393d3 100644
--- a/gdb/amd64-linux-nat.c
+++ b/gdb/amd64-linux-nat.c
@@ -23,11 +23,14 @@
 #include "inferior.h"
 #include "gdbcore.h"
 #include "regcache.h"
+#include "regset.h"
 #include "linux-nat.h"
 #include "amd64-linux-tdep.h"
 
 #include "gdb_assert.h"
 #include "gdb_string.h"
+#include "elf/common.h"
+#include <sys/uio.h>
 #include <sys/ptrace.h>
 #include <sys/debugreg.h>
 #include <sys/syscall.h>
@@ -51,6 +54,18 @@
 #include "i386-linux-tdep.h"
 #include "amd64-nat.h"
 #include "i386-nat.h"
+#include "i386-xstate.h"
+
+#ifndef PTRACE_GETREGSET
+#define PTRACE_GETREGSET	0x4204
+#endif
+
+#ifndef PTRACE_SETREGSET
+#define PTRACE_SETREGSET	0x4205
+#endif
+
+/* Does the current host support PTRACE_GETREGSET?  */
+static int have_ptrace_getregset = -1;
 
 /* Mapping between the general-purpose registers in GNU/Linux x86-64
    `struct user' format and GDB's register cache layout.  */
@@ -73,6 +88,8 @@ static int amd64_linux_gregset64_reg_offset[] =
   -1, -1, -1, -1, -1, -1, -1, -1,
   -1, -1, -1, -1, -1, -1, -1, -1,
   -1, -1, -1, -1, -1, -1, -1, -1, -1,
+  -1, -1, -1, -1, -1, -1, -1, -1,
+  -1, -1, -1, -1, -1, -1, -1, -1,
   ORIG_RAX * 8
 };
 \f
@@ -99,6 +116,7 @@ static int amd64_linux_gregset32_reg_offset[] =
   -1, -1, -1, -1, -1, -1, -1, -1,
   -1, -1, -1, -1, -1, -1, -1, -1,
   -1, -1, -1, -1, -1, -1, -1, -1, -1,
+  -1, -1, -1, -1, -1, -1, -1, -1,
   ORIG_RAX * 8			/* "orig_eax" */
 };
 \f
@@ -183,10 +201,26 @@ amd64_linux_fetch_inferior_registers (struct target_ops *ops,
     {
       elf_fpregset_t fpregs;
 
-      if (ptrace (PTRACE_GETFPREGS, tid, 0, (long) &fpregs) < 0)
-	perror_with_name (_("Couldn't get floating point status"));
+      if (have_ptrace_getregset)
+	{
+	  char xstateregs[I386_XSTATE_MAX_SIZE];
+	  struct iovec iov;
 
-      amd64_supply_fxsave (regcache, -1, &fpregs);
+	  iov.iov_base = xstateregs;
+	  iov.iov_len = sizeof (xstateregs);
+	  if (ptrace (PTRACE_GETREGSET, tid,
+		      (unsigned int) NT_X86_XSTATE, (long) &iov) < 0)
+	    perror_with_name (_("Couldn't get extended state status"));
+
+	  amd64_supply_xsave (regcache, -1, xstateregs);
+	}
+      else
+	{
+	  if (ptrace (PTRACE_GETFPREGS, tid, 0, (long) &fpregs) < 0)
+	    perror_with_name (_("Couldn't get floating point status"));
+
+	  amd64_supply_fxsave (regcache, -1, &fpregs);
+	}
     }
 }
 
@@ -226,15 +260,33 @@ amd64_linux_store_inferior_registers (struct target_ops *ops,
     {
       elf_fpregset_t fpregs;
 
-      if (ptrace (PTRACE_GETFPREGS, tid, 0, (long) &fpregs) < 0)
-	perror_with_name (_("Couldn't get floating point status"));
+      if (have_ptrace_getregset)
+	{
+	  char xstateregs[I386_XSTATE_MAX_SIZE];
+	  struct iovec iov;
+
+	  iov.iov_base = xstateregs;
+	  iov.iov_len = sizeof (xstateregs);
+	  if (ptrace (PTRACE_GETREGSET, tid,
+		      (unsigned int) NT_X86_XSTATE, (long) &iov) < 0)
+	    perror_with_name (_("Couldn't get extended state status"));
+
+	  amd64_collect_xsave (regcache, regnum, xstateregs, 0);
 
-      amd64_collect_fxsave (regcache, regnum, &fpregs);
+	  if (ptrace (PTRACE_SETREGSET, tid,
+		      (unsigned int) NT_X86_XSTATE, (long) &iov) < 0)
+	    perror_with_name (_("Couldn't write extended state status"));
+	}
+      else
+	{
+	  if (ptrace (PTRACE_GETFPREGS, tid, 0, (long) &fpregs) < 0)
+	    perror_with_name (_("Couldn't get floating point status"));
 
-      if (ptrace (PTRACE_SETFPREGS, tid, 0, (long) &fpregs) < 0)
-	perror_with_name (_("Couldn't write floating point status"));
+	  amd64_collect_fxsave (regcache, regnum, &fpregs);
 
-      return;
+	  if (ptrace (PTRACE_SETFPREGS, tid, 0, (long) &fpregs) < 0)
+	    perror_with_name (_("Couldn't write floating point status"));
+	}
     }
 }
 \f
@@ -688,6 +740,8 @@ amd64_linux_read_description (struct target_ops *ops)
 {
   unsigned long cs;
   int tid;
+  int is_64bit;
+  static uint64_t xcr0;
 
   /* GNU/Linux LWP ID's are process ID's.  */
   tid = TIDGET (inferior_ptid);
@@ -701,10 +755,55 @@ amd64_linux_read_description (struct target_ops *ops)
   if (errno != 0)
     perror_with_name (_("Couldn't get CS register"));
 
-  if (cs == AMD64_LINUX_USER64_CS)
-    return tdesc_amd64_linux;
+  is_64bit = cs == AMD64_LINUX_USER64_CS;
+
+  if (have_ptrace_getregset == -1)
+    {
+      uint64_t xstateregs[(I386_XSTATE_SSE_SIZE / sizeof (uint64_t))];
+      struct iovec iov;
+      unsigned int xstate_size;
+
+      iov.iov_base = xstateregs;
+      iov.iov_len = sizeof (xstateregs);
+
+      /* Check if PTRACE_GETREGSET works.  */
+      if (ptrace (PTRACE_GETREGSET, tid,
+		  (unsigned int) NT_X86_XSTATE, (long) &iov) < 0)
+	{
+	  have_ptrace_getregset = 0;
+	  xstate_size = 0;
+	}
+      else
+	{
+	  have_ptrace_getregset = 1;
+
+	  /* Get XCR0 from XSAVE extended state.  */
+	  xcr0 = xstateregs[(I386_LINUX_XSAVE_XCR0_OFFSET
+			     / sizeof (uint64_t))];
+
+	  xstate_size = I386_XSTATE_SIZE (xcr0);
+	}
+
+      i386_linux_update_xstateregset (xstate_size);
+      amd64_linux_update_xstateregset (xstate_size);
+    }
+
+  /* Check the native XCR0 only if PTRACE_GETREGSET is available.  */
+  if (have_ptrace_getregset
+      && (xcr0 & I386_XSTATE_AVX_MASK) == I386_XSTATE_AVX_MASK)
+    {
+      if (is_64bit)
+	return tdesc_amd64_avx_linux;
+      else
+	return tdesc_i386_avx_linux;
+    }
   else
-    return tdesc_i386_linux;
+    {
+      if (is_64bit)
+	return tdesc_amd64_linux;
+      else
+	return tdesc_i386_linux;
+    }
 }
 
 /* Provide a prototype to silence -Wmissing-prototypes.  */
diff --git a/gdb/amd64-linux-tdep.c b/gdb/amd64-linux-tdep.c
index 4ad6dc9..4cc4045 100644
--- a/gdb/amd64-linux-tdep.c
+++ b/gdb/amd64-linux-tdep.c
@@ -28,8 +28,11 @@
 #include "symtab.h"
 #include "gdbtypes.h"
 #include "reggroups.h"
+#include "regset.h"
 #include "amd64-linux-tdep.h"
+#include "i386-linux-tdep.h"
 #include "linux-tdep.h"
+#include "i386-xstate.h"
 
 #include "gdb_string.h"
 
@@ -38,6 +41,7 @@
 #include "xml-syscall.h"
 
 #include "features/i386/amd64-linux.c"
+#include "features/i386/amd64-avx-linux.c"
 
 /* The syscall's XML filename for i386.  */
 #define XML_SYSCALL_FILENAME_AMD64 "syscalls/amd64-linux.xml"
@@ -45,6 +49,15 @@
 #include "record.h"
 #include "linux-record.h"
 
+/* Supported register note sections.  */
+static struct core_regset_section amd64_linux_regset_sections[] =
+{
+  { ".reg", 144, "general-purpose" },
+  { ".reg2", 512, "floating-point" },
+  { ".reg-xstate", 0, "XSAVE extended state" },
+  { NULL, 0 }
+};
+
 /* Mapping between the general-purpose registers in `struct user'
    format and GDB's register cache layout.  */
 
@@ -1242,6 +1255,22 @@ amd64_linux_record_signal (struct gdbarch *gdbarch,
   return 0;
 }
 
+/* Update XSAVE extended state register note section.  */
+
+void
+amd64_linux_update_xstateregset (unsigned int xstate_size)
+{
+  struct core_regset_section *xstate = &amd64_linux_regset_sections[2];
+
+  /* Update the XSAVE extended state register note section for "gcore".
+     Disable it if its size is 0.  */
+  gdb_assert (strcmp (xstate->sect_name, ".reg-xstate") == 0);
+  if (xstate_size)
+    xstate->size = xstate_size;
+  else
+    xstate->sect_name = NULL;
+}
+
 /* Get Linux/x86 target description from core dump.  */
 
 static const struct target_desc *
@@ -1250,12 +1279,17 @@ amd64_linux_core_read_description (struct gdbarch *gdbarch,
 				  bfd *abfd)
 {
   asection *section = bfd_get_section_by_name (abfd, ".reg2");
+  uint64_t xcr0;
 
   if (section == NULL)
     return NULL;
 
   /* Linux/x86-64.  */
-  return tdesc_amd64_linux;
+  xcr0 = i386_linux_core_read_xcr0 (gdbarch, target, abfd);
+  if ((xcr0 & I386_XSTATE_AVX_MASK) == I386_XSTATE_AVX_MASK)
+    return tdesc_amd64_avx_linux;
+  else
+    return tdesc_amd64_linux;
 }
 
 static void
@@ -1297,6 +1331,8 @@ amd64_linux_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch)
   tdep->sc_reg_offset = amd64_linux_sc_reg_offset;
   tdep->sc_num_regs = ARRAY_SIZE (amd64_linux_sc_reg_offset);
 
+  tdep->xsave_xcr0_offset = I386_LINUX_XSAVE_XCR0_OFFSET;
+
   /* GNU/Linux uses SVR4-style shared libraries.  */
   set_solib_svr4_fetch_link_map_offsets
     (gdbarch, svr4_lp64_fetch_link_map_offsets);
@@ -1318,6 +1354,9 @@ amd64_linux_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch)
   /* GNU/Linux uses SVR4-style shared libraries.  */
   set_gdbarch_skip_trampoline_code (gdbarch, find_solib_trampoline_target);
 
+  /* Install supported register note sections.  */
+  set_gdbarch_core_regset_sections (gdbarch, amd64_linux_regset_sections);
+
   set_gdbarch_core_read_description (gdbarch,
 				     amd64_linux_core_read_description);
 
@@ -1517,4 +1556,5 @@ _initialize_amd64_linux_tdep (void)
 
   /* Initialize the Linux target description  */
   initialize_tdesc_amd64_linux ();
+  initialize_tdesc_amd64_avx_linux ();
 }
diff --git a/gdb/amd64-linux-tdep.h b/gdb/amd64-linux-tdep.h
index 33316fb..8862057 100644
--- a/gdb/amd64-linux-tdep.h
+++ b/gdb/amd64-linux-tdep.h
@@ -26,13 +26,17 @@
 /* Register number for the "orig_rax" register.  If this register
    contains a value >= 0 it is interpreted as the system call number
    that the kernel is supposed to restart.  */
-#define AMD64_LINUX_ORIG_RAX_REGNUM (AMD64_MXCSR_REGNUM + 1)
+#define AMD64_LINUX_ORIG_RAX_REGNUM (AMD64_YMM15H_REGNUM + 1)
 
 /* Total number of registers for GNU/Linux.  */
 #define AMD64_LINUX_NUM_REGS (AMD64_LINUX_ORIG_RAX_REGNUM + 1)
 
 /* Linux target description.  */
 extern struct target_desc *tdesc_amd64_linux;
+extern struct target_desc *tdesc_amd64_avx_linux;
+
+/* Update XSAVE extended state register note section.  */
+extern void amd64_linux_update_xstateregset (unsigned int xstate_size);
 
 /* Enum that defines the syscall identifiers for amd64 linux.
    Used for process record/replay, these will be translated into
diff --git a/gdb/amd64-tdep.c b/gdb/amd64-tdep.c
index acab4ac..1aa49b9 100644
--- a/gdb/amd64-tdep.c
+++ b/gdb/amd64-tdep.c
@@ -43,6 +43,7 @@
 #include "i387-tdep.h"
 
 #include "features/i386/amd64.c"
+#include "features/i386/amd64-avx.c"
 
 /* Note that the AMD64 architecture was previously known as x86-64.
    The latter is (forever) engraved into the canonical system name as
@@ -71,8 +72,21 @@ static const char *amd64_register_names[] =
   "mxcsr",
 };
 
-/* Total number of registers.  */
-#define AMD64_NUM_REGS	ARRAY_SIZE (amd64_register_names)
+static const char *amd64_ymm_names[] = 
+{
+  "ymm0", "ymm1", "ymm2", "ymm3",
+  "ymm4", "ymm5", "ymm6", "ymm7",
+  "ymm8", "ymm9", "ymm10", "ymm11",
+  "ymm12", "ymm13", "ymm14", "ymm15"
+};
+
+static const char *amd64_ymmh_names[] = 
+{
+  "ymm0h", "ymm1h", "ymm2h", "ymm3h",
+  "ymm4h", "ymm5h", "ymm6h", "ymm7h",
+  "ymm8h", "ymm9h", "ymm10h", "ymm11h",
+  "ymm12h", "ymm13h", "ymm14h", "ymm15h"
+};
 
 /* The registers used to pass integer arguments during a function call.  */
 static int amd64_dummy_call_integer_regs[] =
@@ -163,6 +177,8 @@ static const int amd64_dwarf_regmap_len =
 static int
 amd64_dwarf_reg_to_regnum (struct gdbarch *gdbarch, int reg)
 {
+  struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+  int ymm0_regnum = tdep->ymm0_regnum;
   int regnum = -1;
 
   if (reg >= 0 && reg < amd64_dwarf_regmap_len)
@@ -170,6 +186,9 @@ amd64_dwarf_reg_to_regnum (struct gdbarch *gdbarch, int reg)
 
   if (regnum == -1)
     warning (_("Unmapped DWARF Register #%d encountered."), reg);
+  else if (ymm0_regnum >= 0
+	   && i386_xmm_regnum_p (gdbarch, regnum))
+    regnum += ymm0_regnum - I387_XMM0_REGNUM (tdep);
 
   return regnum;
 }
@@ -238,6 +257,19 @@ static const char *amd64_dword_names[] =
   "r8d", "r9d", "r10d", "r11d", "r12d", "r13d", "r14d", "r15d"
 };
 
+/* Return the name of register REGNUM, or the empty string if it is
+   an anonymous register. */
+
+static const char *
+amd64_register_name (struct gdbarch *gdbarch, int regnum)
+{
+  /* Hide the upper YMM registers.  */
+  if (i386_ymmh_regnum_p (gdbarch, regnum))
+    return "";
+
+  return tdesc_register_name (gdbarch, regnum);
+}
+
 /* Return the name of register REGNUM.  */
 
 static const char *
@@ -246,6 +278,8 @@ amd64_pseudo_register_name (struct gdbarch *gdbarch, int regnum)
   struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
   if (i386_byte_regnum_p (gdbarch, regnum))
     return amd64_byte_names[regnum - tdep->al_regnum];
+  else if (i386_ymm_regnum_p (gdbarch, regnum))
+    return amd64_ymm_names[regnum - tdep->ymm0_regnum];
   else if (i386_word_regnum_p (gdbarch, regnum))
     return amd64_word_names[regnum - tdep->ax_regnum];
   else if (i386_dword_regnum_p (gdbarch, regnum))
@@ -2176,6 +2210,28 @@ amd64_collect_fpregset (const struct regset *regset,
   amd64_collect_fxsave (regcache, regnum, fpregs);
 }
 
+/* Similar to amd64_supply_fpregset, but use XSAVE extended state.  */
+
+static void
+amd64_supply_xstateregset (const struct regset *regset,
+			   struct regcache *regcache, int regnum,
+			   const void *xstateregs, size_t len)
+{
+  const struct gdbarch_tdep *tdep = gdbarch_tdep (regset->arch);
+  amd64_supply_xsave (regcache, regnum, xstateregs);
+}
+
+/* Similar to amd64_collect_fpregset, but use XSAVE extended state.  */
+
+static void
+amd64_collect_xstateregset (const struct regset *regset,
+			    const struct regcache *regcache,
+			    int regnum, void *xstateregs, size_t len)
+{
+  const struct gdbarch_tdep *tdep = gdbarch_tdep (regset->arch);
+  amd64_collect_xsave (regcache, regnum, xstateregs, 1);
+}
+
 /* Return the appropriate register set for the core section identified
    by SECT_NAME and SECT_SIZE.  */
 
@@ -2194,6 +2250,16 @@ amd64_regset_from_core_section (struct gdbarch *gdbarch,
       return tdep->fpregset;
     }
 
+  if (strcmp (sect_name, ".reg-xstate") == 0)
+    {
+      if (tdep->xstateregset == NULL)
+	tdep->xstateregset = regset_alloc (gdbarch,
+					   amd64_supply_xstateregset,
+					   amd64_collect_xstateregset);
+
+      return tdep->xstateregset;
+    }
+
   return i386_regset_from_core_section (gdbarch, sect_name, sect_size);
 }
 \f
@@ -2256,6 +2322,13 @@ amd64_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch)
   tdep->num_core_regs = AMD64_NUM_GREGS + I387_NUM_REGS;
   tdep->register_names = amd64_register_names;
 
+  if (tdesc_find_feature (tdesc, "org.gnu.gdb.i386.avx") != NULL)
+    {
+      tdep->ymmh_register_names = amd64_ymmh_names;
+      tdep->num_ymm_regs = 16;
+      tdep->ymm0h_regnum = AMD64_YMM0H_REGNUM;
+    }
+
   tdep->num_byte_regs = 20;
   tdep->num_word_regs = 16;
   tdep->num_dword_regs = 16;
@@ -2269,6 +2342,8 @@ amd64_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch)
 
   set_tdesc_pseudo_register_name (gdbarch, amd64_pseudo_register_name);
 
+  set_gdbarch_register_name (gdbarch, amd64_register_name);
+
   /* AMD64 has an FPU and 16 SSE registers.  */
   tdep->st0_regnum = AMD64_ST0_REGNUM;
   tdep->num_xmm_regs = 16;
@@ -2349,6 +2424,7 @@ void
 _initialize_amd64_tdep (void)
 {
   initialize_tdesc_amd64 ();
+  initialize_tdesc_amd64_avx ();
 }
 \f
 
@@ -2384,6 +2460,30 @@ amd64_supply_fxsave (struct regcache *regcache, int regnum,
     }
 }
 
+/* Similar to amd64_supply_fxsave, but use XSAVE extended state.  */
+
+void
+amd64_supply_xsave (struct regcache *regcache, int regnum,
+		    const void *xsave)
+{
+  struct gdbarch *gdbarch = get_regcache_arch (regcache);
+  struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+
+  i387_supply_xsave (regcache, regnum, xsave);
+
+  if (xsave && gdbarch_ptr_bit (gdbarch) == 64)
+    {
+      const gdb_byte *regs = xsave;
+
+      if (regnum == -1 || regnum == I387_FISEG_REGNUM (tdep))
+	regcache_raw_supply (regcache, I387_FISEG_REGNUM (tdep),
+			     regs + 12);
+      if (regnum == -1 || regnum == I387_FOSEG_REGNUM (tdep))
+	regcache_raw_supply (regcache, I387_FOSEG_REGNUM (tdep),
+			     regs + 20);
+    }
+}
+
 /* Fill register REGNUM (if it is a floating-point or SSE register) in
    *FXSAVE with the value from REGCACHE.  If REGNUM is -1, do this for
    all registers.  This function doesn't touch any of the reserved
@@ -2407,3 +2507,26 @@ amd64_collect_fxsave (const struct regcache *regcache, int regnum,
 	regcache_raw_collect (regcache, I387_FOSEG_REGNUM (tdep), regs + 20);
     }
 }
+
+/* Similar to amd64_collect_fxsave, but use XSAVE extended state.  */
+
+void
+amd64_collect_xsave (const struct regcache *regcache, int regnum,
+		     void *xsave, int gcore)
+{
+  struct gdbarch *gdbarch = get_regcache_arch (regcache);
+  struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+  gdb_byte *regs = xsave;
+
+  i387_collect_xsave (regcache, regnum, xsave, gcore);
+
+  if (gdbarch_ptr_bit (gdbarch) == 64)
+    {
+      if (regnum == -1 || regnum == I387_FISEG_REGNUM (tdep))
+	regcache_raw_collect (regcache, I387_FISEG_REGNUM (tdep),
+			      regs + 12);
+      if (regnum == -1 || regnum == I387_FOSEG_REGNUM (tdep))
+	regcache_raw_collect (regcache, I387_FOSEG_REGNUM (tdep),
+			      regs + 20);
+    }
+}
diff --git a/gdb/amd64-tdep.h b/gdb/amd64-tdep.h
index 363479c..9f07dda 100644
--- a/gdb/amd64-tdep.h
+++ b/gdb/amd64-tdep.h
@@ -61,12 +61,16 @@ enum amd64_regnum
   AMD64_FSTAT_REGNUM = AMD64_ST0_REGNUM + 9,
   AMD64_XMM0_REGNUM = 40,	/* %xmm0 */
   AMD64_XMM1_REGNUM,		/* %xmm1 */
-  AMD64_MXCSR_REGNUM = AMD64_XMM0_REGNUM + 16
+  AMD64_MXCSR_REGNUM = AMD64_XMM0_REGNUM + 16,
+  AMD64_YMM0H_REGNUM,		/* %ymm0h */
+  AMD64_YMM15H_REGNUM = AMD64_YMM0H_REGNUM + 15
 };
 
 /* Number of general purpose registers.  */
 #define AMD64_NUM_GREGS		24
 
+#define AMD64_NUM_REGS		(AMD64_YMM15H_REGNUM + 1)
+
 extern struct displaced_step_closure *amd64_displaced_step_copy_insn
   (struct gdbarch *gdbarch, CORE_ADDR from, CORE_ADDR to,
    struct regcache *regs);
@@ -77,12 +81,6 @@ extern void amd64_displaced_step_fixup (struct gdbarch *gdbarch,
 
 extern void amd64_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch);
 
-/* Functions from amd64-tdep.c which may be needed on architectures
-   with extra registers.  */
-
-extern const char *amd64_register_name (struct gdbarch *gdbarch, int regnum);
-extern struct type *amd64_register_type (struct gdbarch *gdbarch, int regnum);
-
 /* Fill register REGNUM in REGCACHE with the appropriate
    floating-point or SSE register value from *FXSAVE.  If REGNUM is
    -1, do this for all registers.  This function masks off any of the
@@ -91,6 +89,10 @@ extern struct type *amd64_register_type (struct gdbarch *gdbarch, int regnum);
 extern void amd64_supply_fxsave (struct regcache *regcache, int regnum,
 				 const void *fxsave);
 
+/* Similar to amd64_supply_fxsave, but use XSAVE extended state.  */
+extern void amd64_supply_xsave (struct regcache *regcache, int regnum,
+				const void *xsave);
+
 /* Fill register REGNUM (if it is a floating-point or SSE register) in
    *FXSAVE with the value from REGCACHE.  If REGNUM is -1, do this for
    all registers.  This function doesn't touch any of the reserved
@@ -99,6 +101,10 @@ extern void amd64_supply_fxsave (struct regcache *regcache, int regnum,
 extern void amd64_collect_fxsave (const struct regcache *regcache, int regnum,
 				  void *fxsave);
 
+/* Similar to amd64_collect_fxsave, but use XSAVE extended state.  */
+extern void amd64_collect_xsave (const struct regcache *regcache,
+				 int regnum, void *xsave, int gcore);
+
 void amd64_classify (struct type *type, enum amd64_reg_class class[2]);
 
 \f
diff --git a/gdb/testsuite/gdb.arch/i386-avx.c b/gdb/testsuite/gdb.arch/i386-avx.c
new file mode 100644
index 0000000..73f92b6
--- /dev/null
+++ b/gdb/testsuite/gdb.arch/i386-avx.c
@@ -0,0 +1,128 @@
+/* Test program for AVX registers.
+
+   Copyright 2010 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include <stdio.h>
+#include "i386-cpuid.h"
+
+typedef struct {
+  float f[8];
+} v8sf_t;
+
+
+v8sf_t data[] =
+  {
+    { {  0.0,  0.125,  0.25,  0.375,  0.50,  0.625,  0.75,  0.875 } },
+    { {  1.0,  1.125,  1.25,  1.375,  1.50,  1.625,  1.75,  1.875 } },
+    { {  2.0,  2.125,  2.25,  2.375,  2.50,  2.625,  2.75,  2.875 } },
+    { {  3.0,  3.125,  3.25,  3.375,  3.50,  3.625,  3.75,  3.875 } },
+    { {  4.0,  4.125,  4.25,  4.375,  4.50,  4.625,  4.75,  4.875 } },
+    { {  5.0,  5.125,  5.25,  5.375,  5.50,  5.625,  5.75,  5.875 } },
+    { {  6.0,  6.125,  6.25,  6.375,  6.50,  6.625,  6.75,  6.875 } },
+    { {  7.0,  7.125,  7.25,  7.375,  7.50,  7.625,  7.75,  7.875 } },
+#ifdef __x86_64__
+    { {  8.0,  8.125,  8.25,  8.375,  8.50,  8.625,  8.75,  8.875 } },
+    { {  9.0,  9.125,  9.25,  9.375,  9.50,  9.625,  9.75,  9.875 } },
+    { { 10.0, 10.125, 10.25, 10.375, 10.50, 10.625, 10.75, 10.875 } },
+    { { 11.0, 11.125, 11.25, 11.375, 11.50, 11.625, 11.75, 11.875 } },
+    { { 12.0, 12.125, 12.25, 12.375, 12.50, 12.625, 12.75, 12.875 } },
+    { { 13.0, 13.125, 13.25, 13.375, 13.50, 13.625, 13.75, 13.875 } },
+    { { 14.0, 14.125, 14.25, 14.375, 14.50, 14.625, 14.75, 14.875 } },
+    { { 15.0, 15.125, 15.25, 15.375, 15.50, 15.625, 15.75, 15.875 } },
+#endif
+  };
+
+
+int
+have_avx (void)
+{
+  unsigned int eax, ebx, ecx, edx;
+
+  if (!__get_cpuid (1, &eax, &ebx, &ecx, &edx))
+    return 0;
+
+  if ((ecx & (bit_AVX | bit_OSXSAVE)) == (bit_AVX | bit_OSXSAVE))
+    return 1;
+  else
+    return 0;
+}
+
+int
+main (int argc, char **argv)
+{
+  if (have_avx ())
+    {
+      asm ("vmovaps 0(%0), %%ymm0\n\t"
+           "vmovaps 32(%0), %%ymm1\n\t"
+           "vmovaps 64(%0), %%ymm2\n\t"
+           "vmovaps 96(%0), %%ymm3\n\t"
+           "vmovaps 128(%0), %%ymm4\n\t"
+           "vmovaps 160(%0), %%ymm5\n\t"
+           "vmovaps 192(%0), %%ymm6\n\t"
+           "vmovaps 224(%0), %%ymm7\n\t"
+           : /* no output operands */
+           : "r" (data) 
+           : "xmm0", "xmm1", "xmm2", "xmm3", "xmm4", "xmm5", "xmm6", "xmm7");
+#ifdef __x86_64__
+      asm ("vmovaps 256(%0), %%ymm8\n\t"
+           "vmovaps 288(%0), %%ymm9\n\t"
+           "vmovaps 320(%0), %%ymm10\n\t"
+           "vmovaps 352(%0), %%ymm11\n\t"
+           "vmovaps 384(%0), %%ymm12\n\t"
+           "vmovaps 416(%0), %%ymm13\n\t"
+           "vmovaps 448(%0), %%ymm14\n\t"
+           "vmovaps 480(%0), %%ymm15\n\t"
+           : /* no output operands */
+           : "r" (data) 
+           : "xmm8", "xmm9", "xmm10", "xmm11", "xmm12", "xmm13", "xmm14", "xmm15");
+#endif
+
+      asm ("nop"); /* first breakpoint here */
+
+      asm (
+           "vmovaps %%ymm0, 0(%0)\n\t"
+           "vmovaps %%ymm1, 32(%0)\n\t"
+           "vmovaps %%ymm2, 64(%0)\n\t"
+           "vmovaps %%ymm3, 96(%0)\n\t"
+           "vmovaps %%ymm4, 128(%0)\n\t"
+           "vmovaps %%ymm5, 160(%0)\n\t"
+           "vmovaps %%ymm6, 192(%0)\n\t"
+           "vmovaps %%ymm7, 224(%0)\n\t"
+           : /* no output operands */
+           : "r" (data) 
+           : "xmm0", "xmm1", "xmm2", "xmm3", "xmm4", "xmm5", "xmm6", "xmm7");
+#ifdef __x86_64__
+      asm (
+           "vmovaps %%ymm8, 256(%0)\n\t"
+           "vmovaps %%ymm9, 288(%0)\n\t"
+           "vmovaps %%ymm10, 320(%0)\n\t"
+           "vmovaps %%ymm11, 352(%0)\n\t"
+           "vmovaps %%ymm12, 384(%0)\n\t"
+           "vmovaps %%ymm13, 416(%0)\n\t"
+           "vmovaps %%ymm14, 448(%0)\n\t"
+           "vmovaps %%ymm15, 480(%0)\n\t"
+           : /* no output operands */
+           : "r" (data) 
+           : "xmm8", "xmm9", "xmm10", "xmm11", "xmm12", "xmm13", "xmm14", "xmm15");
+#endif
+
+      puts ("Bye!"); /* second breakpoint here */
+    }
+
+  return 0;
+}
diff --git a/gdb/testsuite/gdb.arch/i386-avx.exp b/gdb/testsuite/gdb.arch/i386-avx.exp
new file mode 100644
index 0000000..561ddef
--- /dev/null
+++ b/gdb/testsuite/gdb.arch/i386-avx.exp
@@ -0,0 +1,110 @@
+# Copyright 2010 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# Please email any bugs, comments, and/or additions to this file to:
+# bug-gdb@gnu.org
+
+# This file is part of the gdb testsuite.
+
+if $tracelevel {
+    strace $tracelevel
+}
+
+set prms_id 0
+set bug_id 0
+
+if { ![istarget i?86-*-*] && ![istarget x86_64-*-* ] } {
+    verbose "Skipping x86 AVX tests."
+    return
+}
+
+set testfile "i386-avx"
+set srcfile ${testfile}.c
+set binfile ${objdir}/${subdir}/${testfile}
+
+if [get_compiler_info ${binfile}] {
+    return -1
+}
+
+set additional_flags ""
+if [test_compiler_info gcc*] {
+    set additional_flags "additional_flags=-mavx"
+}
+
+if { [gdb_compile "${srcdir}/${subdir}/${srcfile}" "${binfile}" executable [list debug $additional_flags]] != "" } {
+    unsupported "compiler does not support AVX"
+    return
+}
+
+gdb_exit
+gdb_start
+gdb_reinitialize_dir $srcdir/$subdir
+gdb_load ${binfile}
+
+if ![runto_main] then {
+    gdb_suppress_tests
+}
+
+send_gdb "print have_avx ()\r"
+gdb_expect {
+    -re ".. = 1\r\n$gdb_prompt " {
+        pass "check whether processor supports AVX"
+    }
+    -re ".. = 0\r\n$gdb_prompt " {
+        verbose "processor does not support AVX; skipping AVX tests"
+        return
+    }
+    -re ".*$gdb_prompt $" {
+        fail "check whether processor supports AVX"
+    }
+    timeout {
+        fail "check whether processor supports AVX (timeout)"
+    }
+}
+
+gdb_test "break [gdb_get_line_number "first breakpoint here"]" \
+         "Breakpoint .* at .*i386-avx.c.*" \
+         "set first breakpoint in main"
+gdb_continue_to_breakpoint "continue to first breakpoint in main"
+
+if [istarget i?86-*-*] {
+    set nr_regs 8
+} else {
+    set nr_regs 16
+}
+
+for { set r 0 } { $r < $nr_regs } { incr r } {
+    gdb_test "print \$ymm$r.v8_float" \
+        ".. = \\{$r, $r.125, $r.25, $r.375, $r.5, $r.625, $r.75, $r.875\\}.*" \
+        "check float contents of %ymm$r"
+    gdb_test "print \$ymm$r.v32_int8" \
+        ".. = \\{(-?\[0-9\]+, ){31}-?\[0-9\]+\\}.*" \
+        "check int8 contents of %ymm$r"
+}
+
+for { set r 0 } { $r < $nr_regs } { incr r } {
+    gdb_test "set var \$ymm$r.v8_float\[0\] = $r + 10" "" "set %ymm$r"
+}
+
+gdb_test "break [gdb_get_line_number "second breakpoint here"]" \
+         "Breakpoint .* at .*i386-avx.c.*" \
+         "set second breakpoint in main"
+gdb_continue_to_breakpoint "continue to second breakpoint in main"
+
+for { set r 0 } { $r < $nr_regs } { incr r } {
+    gdb_test "print data\[$r\]" \
+        ".. = \\{f = \\{[expr $r + 10], $r.125, $r.25, $r.375, $r.5, $r.625, $r.75, $r.875\\}\\}.*" \
+        "check contents of data\[$r\]"
+}
diff --git a/gdb/testsuite/gdb.arch/i386-cpuid.h b/gdb/testsuite/gdb.arch/i386-cpuid.h
index 7ff0dba..5ebde5a 100644
--- a/gdb/testsuite/gdb.arch/i386-cpuid.h
+++ b/gdb/testsuite/gdb.arch/i386-cpuid.h
@@ -1,75 +1,200 @@
-/* Helper file for i386 platform.  Runtime check for MMX/SSE/SSE2 support.
+/* Helper file for i386 platform.  Runtime check for MMX/SSE/SSE2/AVX
+ * support. Copied from gcc 4.4.
+ *
+ * Copyright (C) 2007, 2008, 2009 Free Software Foundation, Inc.
+ *
+ * This file is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation; either version 3, or (at your option) any
+ * later version.
+ * 
+ * This file is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ * 
+ * Under Section 7 of GPL version 3, you are granted additional
+ * permissions described in the GCC Runtime Library Exception, version
+ * 3.1, as published by the Free Software Foundation.
+ * 
+ * You should have received a copy of the GNU General Public License and
+ * a copy of the GCC Runtime Library Exception along with this program;
+ * see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+ * <http://www.gnu.org/licenses/>.
+ */
 
-   Copyright 2004, 2007, 2008, 2009, 2010 Free Software Foundation, Inc.
+/* %ecx */
+#define bit_SSE3	(1 << 0)
+#define bit_PCLMUL	(1 << 1)
+#define bit_SSSE3	(1 << 9)
+#define bit_FMA		(1 << 12)
+#define bit_CMPXCHG16B	(1 << 13)
+#define bit_SSE4_1	(1 << 19)
+#define bit_SSE4_2	(1 << 20)
+#define bit_MOVBE	(1 << 22)
+#define bit_POPCNT	(1 << 23)
+#define bit_AES		(1 << 25)
+#define bit_XSAVE	(1 << 26)
+#define bit_OSXSAVE	(1 << 27)
+#define bit_AVX		(1 << 28)
 
-   This file is part of GDB.
+/* %edx */
+#define bit_CMPXCHG8B	(1 << 8)
+#define bit_CMOV	(1 << 15)
+#define bit_MMX		(1 << 23)
+#define bit_FXSAVE	(1 << 24)
+#define bit_SSE		(1 << 25)
+#define bit_SSE2	(1 << 26)
 
-   This program is free software; you can redistribute it and/or modify
-   it under the terms of the GNU General Public License as published by
-   the Free Software Foundation; either version 3 of the License, or
-   (at your option) any later version.
+/* Extended Features */
+/* %ecx */
+#define bit_LAHF_LM	(1 << 0)
+#define bit_ABM		(1 << 5)
+#define bit_SSE4a	(1 << 6)
+#define bit_XOP         (1 << 11)
+#define bit_LWP 	(1 << 15)
+#define bit_FMA4        (1 << 16)
 
-   This program is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-   GNU General Public License for more details.
+/* %edx */
+#define bit_LM		(1 << 29)
+#define bit_3DNOWP	(1 << 30)
+#define bit_3DNOW	(1 << 31)
 
-   You should have received a copy of the GNU General Public License
-   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
 
-/* Used by 20020523-2.c and i386-sse-6.c, and possibly others.  */
-/* Plagarized from 20020523-2.c.  */
-/* Plagarized from gcc.  */
+#if defined(__i386__) && defined(__PIC__)
+/* %ebx may be the PIC register.  */
+#if __GNUC__ >= 3
+#define __cpuid(level, a, b, c, d)			\
+  __asm__ ("xchg{l}\t{%%}ebx, %1\n\t"			\
+	   "cpuid\n\t"					\
+	   "xchg{l}\t{%%}ebx, %1\n\t"			\
+	   : "=a" (a), "=r" (b), "=c" (c), "=d" (d)	\
+	   : "0" (level))
 
-#define bit_CMOV (1 << 15)
-#define bit_MMX (1 << 23)
-#define bit_SSE (1 << 25)
-#define bit_SSE2 (1 << 26)
+#define __cpuid_count(level, count, a, b, c, d)		\
+  __asm__ ("xchg{l}\t{%%}ebx, %1\n\t"			\
+	   "cpuid\n\t"					\
+	   "xchg{l}\t{%%}ebx, %1\n\t"			\
+	   : "=a" (a), "=r" (b), "=c" (c), "=d" (d)	\
+	   : "0" (level), "2" (count))
+#else
+/* Host GCCs older than 3.0 weren't supporting Intel asm syntax
+   nor alternatives in i386 code.  */
+#define __cpuid(level, a, b, c, d)			\
+  __asm__ ("xchgl\t%%ebx, %1\n\t"			\
+	   "cpuid\n\t"					\
+	   "xchgl\t%%ebx, %1\n\t"			\
+	   : "=a" (a), "=r" (b), "=c" (c), "=d" (d)	\
+	   : "0" (level))
 
-#ifndef NOINLINE
-#define NOINLINE __attribute__ ((noinline))
+#define __cpuid_count(level, count, a, b, c, d)		\
+  __asm__ ("xchgl\t%%ebx, %1\n\t"			\
+	   "cpuid\n\t"					\
+	   "xchgl\t%%ebx, %1\n\t"			\
+	   : "=a" (a), "=r" (b), "=c" (c), "=d" (d)	\
+	   : "0" (level), "2" (count))
 #endif
+#else
+#define __cpuid(level, a, b, c, d)			\
+  __asm__ ("cpuid\n\t"					\
+	   : "=a" (a), "=b" (b), "=c" (c), "=d" (d)	\
+	   : "0" (level))
 
-unsigned int i386_cpuid (void) NOINLINE;
+#define __cpuid_count(level, count, a, b, c, d)		\
+  __asm__ ("cpuid\n\t"					\
+	   : "=a" (a), "=b" (b), "=c" (c), "=d" (d)	\
+	   : "0" (level), "2" (count))
+#endif
 
-unsigned int NOINLINE
-i386_cpuid (void)
+/* Return highest supported input value for cpuid instruction.  ext can
+   be either 0x0 or 0x80000000 to return highest supported value for
+   basic or extended cpuid information.  Function returns 0 if cpuid
+   is not supported or whatever cpuid returns in eax register.  If sig
+   pointer is non-null, then first four bytes of the signature
+   (as found in ebx register) are returned in location pointed by sig.  */
+
+static __inline unsigned int
+__get_cpuid_max (unsigned int __ext, unsigned int *__sig)
 {
-  int fl1, fl2;
+  unsigned int __eax, __ebx, __ecx, __edx;
 
 #ifndef __x86_64__
+#if __GNUC__ >= 3
   /* See if we can use cpuid.  On AMD64 we always can.  */
-  __asm__ ("pushfl; pushfl; popl %0; movl %0,%1; xorl %2,%0;"
-	   "pushl %0; popfl; pushfl; popl %0; popfl"
-	   : "=&r" (fl1), "=&r" (fl2)
+  __asm__ ("pushf{l|d}\n\t"
+	   "pushf{l|d}\n\t"
+	   "pop{l}\t%0\n\t"
+	   "mov{l}\t{%0, %1|%1, %0}\n\t"
+	   "xor{l}\t{%2, %0|%0, %2}\n\t"
+	   "push{l}\t%0\n\t"
+	   "popf{l|d}\n\t"
+	   "pushf{l|d}\n\t"
+	   "pop{l}\t%0\n\t"
+	   "popf{l|d}\n\t"
+	   : "=&r" (__eax), "=&r" (__ebx)
+	   : "i" (0x00200000));
+#else
+/* Host GCCs older than 3.0 weren't supporting Intel asm syntax
+   nor alternatives in i386 code.  */
+  __asm__ ("pushfl\n\t"
+	   "pushfl\n\t"
+	   "popl\t%0\n\t"
+	   "movl\t%0, %1\n\t"
+	   "xorl\t%2, %0\n\t"
+	   "pushl\t%0\n\t"
+	   "popfl\n\t"
+	   "pushfl\n\t"
+	   "popl\t%0\n\t"
+	   "popfl\n\t"
+	   : "=&r" (__eax), "=&r" (__ebx)
 	   : "i" (0x00200000));
-  if (((fl1 ^ fl2) & 0x00200000) == 0)
-    return (0);
 #endif
 
-  /* Host supports cpuid.  See if cpuid gives capabilities, try
-     CPUID(0).  Preserve %ebx and %ecx; cpuid insn clobbers these, we
-     don't need their CPUID values here, and %ebx may be the PIC
-     register.  */
-#ifdef __x86_64__
-  __asm__ ("pushq %%rcx; pushq %%rbx; cpuid; popq %%rbx; popq %%rcx"
-	   : "=a" (fl1) : "0" (0) : "rdx", "cc");
-#else
-  __asm__ ("pushl %%ecx; pushl %%ebx; cpuid; popl %%ebx; popl %%ecx"
-	   : "=a" (fl1) : "0" (0) : "edx", "cc");
+  if (!((__eax ^ __ebx) & 0x00200000))
+    return 0;
 #endif
-  if (fl1 == 0)
-    return (0);
-
-  /* Invoke CPUID(1), return %edx; caller can examine bits to
-     determine what's supported.  */
-#ifdef __x86_64__
-  __asm__ ("pushq %%rcx; pushq %%rbx; cpuid; popq %%rbx; popq %%rcx"
-	   : "=d" (fl2), "=a" (fl1) : "1" (1) : "cc");
-#else
-  __asm__ ("pushl %%ecx; pushl %%ebx; cpuid; popl %%ebx; popl %%ecx"
-	   : "=d" (fl2), "=a" (fl1) : "1" (1) : "cc");
+
+  /* Host supports cpuid.  Return highest supported cpuid input value.  */
+  __cpuid (__ext, __eax, __ebx, __ecx, __edx);
+
+  if (__sig)
+    *__sig = __ebx;
+
+  return __eax;
+}
+
+/* Return cpuid data for requested cpuid level, as found in returned
+   eax, ebx, ecx and edx registers.  The function checks if cpuid is
+   supported and returns 1 for valid cpuid information or 0 for
+   unsupported cpuid level.  All pointers are required to be non-null.  */
+
+static __inline int
+__get_cpuid (unsigned int __level,
+	     unsigned int *__eax, unsigned int *__ebx,
+	     unsigned int *__ecx, unsigned int *__edx)
+{
+  unsigned int __ext = __level & 0x80000000;
+
+  if (__get_cpuid_max (__ext, 0) < __level)
+    return 0;
+
+  __cpuid (__level, *__eax, *__ebx, *__ecx, *__edx);
+  return 1;
+}
+
+#ifndef NOINLINE
+#define NOINLINE __attribute__ ((noinline))
 #endif
 
-  return fl2;
+unsigned int i386_cpuid (void) NOINLINE;
+
+unsigned int NOINLINE
+i386_cpuid (void)
+{
+  unsigned int eax, ebx, ecx, edx;
+
+  if (!__get_cpuid (1, &eax, &ebx, &ecx, &edx))
+    return 0;
+
+  return edx;
 }
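
A note on the have_avx () check in the test program above: it tests only
the AVX and OSXSAVE bits of CPUID leaf 1. A stricter check would also
confirm that the OS has enabled the SSE and AVX states in XCR0. The
following is a minimal sketch of that decision logic, not part of the
patch. The names avx_usable, BIT_*, and XCR0_* are hypothetical, and the
XCR0 value is assumed to have been read elsewhere (for example with
xgetbv), so the function itself stays portable.

```c
#include <stdint.h>

/* CPUID leaf 1, %ecx feature bits (same positions as bit_OSXSAVE and
   bit_AVX in i386-cpuid.h).  */
#define BIT_OSXSAVE	(1u << 27)
#define BIT_AVX		(1u << 28)

/* XCR0 state bits (same positions as I386_XSTATE_SSE/AVX).  */
#define XCR0_SSE	(1ULL << 1)
#define XCR0_AVX	(1ULL << 2)

/* Hypothetical helper: return 1 if AVX is both advertised by the CPU
   and enabled by the OS, 0 otherwise.  XCR0 is only meaningful when
   OSXSAVE is set, so the CPUID bits are checked first.  */
static int
avx_usable (unsigned int cpuid1_ecx, uint64_t xcr0)
{
  unsigned int need = BIT_AVX | BIT_OSXSAVE;
  uint64_t states = XCR0_SSE | XCR0_AVX;

  if ((cpuid1_ecx & need) != need)
    return 0;

  return (xcr0 & states) == states;
}
```

With this shape, the CPUID-only test in have_avx () corresponds to the
first condition alone.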


* Re: PATCH: 3/6 [3rd try]: Add AVX support (i386 changes)
  2010-04-02 14:31           ` H.J. Lu
@ 2010-04-02 14:42             ` Mark Kettenis
  2010-04-02 15:28               ` H.J. Lu
  2010-04-07 16:55             ` H.J. Lu
  1 sibling, 1 reply; 115+ messages in thread
From: Mark Kettenis @ 2010-04-02 14:42 UTC (permalink / raw)
  To: hjl.tools; +Cc: gdb-patches

> Date: Fri, 2 Apr 2010 07:31:07 -0700
> From: "H.J. Lu" <hongjiu.lu@intel.com>
> 
> On Sun, Mar 28, 2010 at 06:11:24PM -0700, H.J. Lu wrote:
> > Hi,
> > 
> > Here are i386 changes to support AVX. OK to install?
> > 
> 
> Here are the updated i386 changes to support AVX. OK to install?

Sorry, but I'm still unhappy with the way you modify the
i386_linux_regset_sections[] array at run time.  I think the best
thing to do is to have gcore *always* create a NT_X86_XSTATE note of
the maximum size supported by GDB.  That way you can remove a lot of
code (including the duplication of code in i387_collect_xsave).
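
To make the suggestion concrete, here is a minimal sketch of the idea,
not code from the patch. It assumes the sizes from the quoted
common/i386-xstate.h (576 bytes for x87+SSE, 832 for AVX, 832 maximum).
The helper names xstate_size_for_xcr0 and fill_max_xstate_note are
hypothetical. The note always has the maximum size; only the bytes valid
for the current XCR0 are copied, and the rest is zero-filled.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Values taken from the quoted common/i386-xstate.h.  */
#define XSTATE_AVX	(1ULL << 2)
#define XSTATE_SSE_SIZE	576
#define XSTATE_AVX_SIZE	832
#define XSTATE_MAX_SIZE	832

/* Mirrors the I386_XSTATE_SIZE macro: number of valid bytes in the
   XSAVE area for a given XCR0.  */
static size_t
xstate_size_for_xcr0 (uint64_t xcr0)
{
  return (xcr0 & XSTATE_AVX) != 0 ? XSTATE_AVX_SIZE : XSTATE_SSE_SIZE;
}

/* Hypothetical helper: fill a note buffer of the fixed maximum size.
   Bytes beyond the valid XSAVE area are zeroed.  Returns the note
   size, which does not depend on XCR0.  */
static size_t
fill_max_xstate_note (unsigned char *note, size_t note_len,
		      const unsigned char *xsave, uint64_t xcr0)
{
  size_t valid = xstate_size_for_xcr0 (xcr0);

  assert (note_len >= XSTATE_MAX_SIZE);
  memset (note, 0, XSTATE_MAX_SIZE);
  memcpy (note, xsave, valid);
  return XSTATE_MAX_SIZE;
}
```

Because the returned size is constant, the regset section table would
not need to be updated at run time under this scheme.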

> H.J.
> ----
> 2010-04-02  H.J. Lu  <hongjiu.lu@intel.com>
> 
> 	* i386-linux-nat.c: Include "regset.h", "elf/common.h",
> 	<sys/uio.h> and "i386-xstate.h".
> 	(PTRACE_GETREGSET): New.
> 	(PTRACE_SETREGSET): Likewise.
> 	(fetch_xstateregs): Likewise.
> 	(store_xstateregs): Likewise.
> 	(GETXSTATEREGS_SUPPLIES): Likewise.
> 	(regmap): Include 8 upper YMM registers.
> 	(i386_linux_fetch_inferior_registers): Support XSAVE extended
> 	state.
> 	(i386_linux_store_inferior_registers): Likewise.
> 	(i386_linux_read_description): Check and enable AVX target
> 	descriptions.
> 
> 	* i386-linux-tdep.c: Include "regset.h", "i387-tdep.h",
> 	"i386-xstate.h" and "features/i386/i386-avx-linux.c".
> 	(i386_linux_regset_sections): Add ".reg-xstate".
> 	(i386_linux_gregset_reg_offset): Include 8 upper YMM registers.
> 	(i386_linux_update_xstateregset): New.
> 	(i386_linux_core_read_xcr0): Likewise.
> 	(i386_linux_core_read_description): Check and enable AVX target
> 	description.
> 	(i386_linux_init_abi): Set xsave_xcr0_offset.
> 	(_initialize_i386_linux_tdep): Call
> 	initialize_tdesc_i386_avx_linux.
> 
> 	* i386-linux-tdep.h (I386_LINUX_ORIG_EAX_REGNUM): Replace
> 	I386_SSE_NUM_REGS with I386_AVX_NUM_REGS.
> 	(i386_linux_core_read_xcr0): New.
> 	(tdesc_i386_avx_linux): Likewise.
> 	(i386_linux_update_xstateregset): Likewise.
> 	(I386_LINUX_XSAVE_XCR0_OFFSET): Likewise.
> 
> 	* i386-tdep.c: Include "i386-xstate.h" and
> 	"features/i386/i386-avx.c".
> 	(i386_ymm_names): New.
> 	(i386_ymmh_names): Likewise.
> 	(i386_ymmh_regnum_p): Likewise.
> 	(i386_ymm_regnum_p): Likewise.
> 	(i386_xmm_regnum_p): Likewise.
> 	(i386_register_name): Likewise.
> 	(i386_ymm_type): Likewise.
> 	(i386_supply_xstateregset): Likewise.
> 	(i386_collect_xstateregset): Likewise.
> 	(i386_sse_regnum_p): Removed.
> 	(i386_pseudo_register_name): Support pseudo YMM registers.
> 	(i386_pseudo_register_type): Likewise.
> 	(i386_pseudo_register_read): Likewise.
> 	(i386_pseudo_register_write): Likewise.
> 	(i386_dbx_reg_to_regnum): Return %ymmN register number for
> 	%xmmN if AVX is available.
> 	(i386_regset_from_core_section): Support .reg-xstate section.
> 	(i386_register_reggroup_p): Support upper YMM and YMM registers.
> 	(i386_process_record): Replace i386_sse_regnum_p with
> 	i386_xmm_regnum_p.
> 	(i386_validate_tdesc_p): Support org.gnu.gdb.i386.avx feature.
> 	Set ymmh_register_names, num_ymm_regs, ymm0h_regnum and xcr0.
> 	(i386_gdbarch_init): Set xstateregset.  Set xsave_xcr0_offset. 
> 	Call set_gdbarch_register_name.  Replace I386_SSE_NUM_REGS with
> 	I386_AVX_NUM_REGS.  Set ymmh_register_names, ymm0h_regnum and
> 	num_ymm_regs.  Add num_ymm_regs to set_gdbarch_num_pseudo_regs.
> 	Set ymm0_regnum.
> 	(_initialize_i386_tdep): Call initialize_tdesc_i386_avx.
> 
> 	* i386-tdep.h (gdbarch_tdep): Add xstateregset, ymm0_regnum,
> 	xcr0, xsave_xcr0_offset, ymm0h_regnum, ymmh_register_names and
> 	i386_ymm_type.
> 	(i386_regnum): Add I386_YMM0H_REGNUM, and I386_YMM7H_REGNUM.
> 	(I386_AVX_NUM_REGS): New.
> 	(i386_xmm_regnum_p): Likewise.
> 	(i386_ymm_regnum_p): Likewise.
> 	(i386_ymmh_regnum_p): Likewise.
> 
> 	* common/i386-xstate.h: New.
> 
> diff --git a/gdb/common/i386-xstate.h b/gdb/common/i386-xstate.h
> new file mode 100644
> index 0000000..5e16015
> --- /dev/null
> +++ b/gdb/common/i386-xstate.h
> @@ -0,0 +1,41 @@
> +/* Common code for i386 XSAVE extended state.
> +
> +   Copyright (C) 2010 Free Software Foundation, Inc.
> +
> +   This file is part of GDB.
> +
> +   This program is free software; you can redistribute it and/or modify
> +   it under the terms of the GNU General Public License as published by
> +   the Free Software Foundation; either version 3 of the License, or
> +   (at your option) any later version.
> +
> +   This program is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +   GNU General Public License for more details.
> +
> +   You should have received a copy of the GNU General Public License
> +   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
> +
> +#ifndef I386_XSTATE_H
> +#define I386_XSTATE_H 1
> +
> +/* The extended state feature bits.  */
> +#define I386_XSTATE_X87		(1ULL << 0)
> +#define I386_XSTATE_SSE		(1ULL << 1)
> +#define I386_XSTATE_AVX		(1ULL << 2)
> +
> +/* Supported mask and size of the extended state.  */
> +#define I386_XSTATE_SSE_MASK	(I386_XSTATE_X87 | I386_XSTATE_SSE)
> +#define I386_XSTATE_AVX_MASK	(I386_XSTATE_SSE_MASK | I386_XSTATE_AVX)
> +
> +#define I386_XSTATE_SSE_SIZE	576
> +#define I386_XSTATE_AVX_SIZE	832
> +#define I386_XSTATE_MAX_SIZE	832
> +
> +/* Get I386 XSAVE extended state size.  */
> +#define I386_XSTATE_SIZE(XCR0)	\
> +  (((XCR0) & I386_XSTATE_AVX) != 0 \
> +   ? I386_XSTATE_AVX_SIZE : I386_XSTATE_SSE_SIZE)
> +
> +#endif /* I386_XSTATE_H */
> diff --git a/gdb/i386-linux-nat.c b/gdb/i386-linux-nat.c
> index 31b9086..d1048eb 100644
> --- a/gdb/i386-linux-nat.c
> +++ b/gdb/i386-linux-nat.c
> @@ -23,11 +23,14 @@
>  #include "inferior.h"
>  #include "gdbcore.h"
>  #include "regcache.h"
> +#include "regset.h"
>  #include "target.h"
>  #include "linux-nat.h"
>  
>  #include "gdb_assert.h"
>  #include "gdb_string.h"
> +#include "elf/common.h"
> +#include <sys/uio.h>
>  #include <sys/ptrace.h>
>  #include <sys/user.h>
>  #include <sys/procfs.h>
> @@ -69,6 +72,19 @@
>  
>  /* Defines ps_err_e, struct ps_prochandle.  */
>  #include "gdb_proc_service.h"
> +
> +#include "i386-xstate.h"
> +
> +#ifndef PTRACE_GETREGSET
> +#define PTRACE_GETREGSET	0x4204
> +#endif
> +
> +#ifndef PTRACE_SETREGSET
> +#define PTRACE_SETREGSET	0x4205
> +#endif
> +
> +/* Does the current host support PTRACE_GETREGSET?  */
> +static int have_ptrace_getregset = -1;
>  \f
>  
>  /* The register sets used in GNU/Linux ELF core-dumps are identical to
> @@ -98,6 +114,8 @@ static int regmap[] =
>    -1, -1, -1, -1,		/* xmm0, xmm1, xmm2, xmm3 */
>    -1, -1, -1, -1,		/* xmm4, xmm5, xmm6, xmm6 */
>    -1,				/* mxcsr */
> +  -1, -1, -1, -1,		/* ymm0h, ymm1h, ymm2h, ymm3h */
> +  -1, -1, -1, -1,		/* ymm4h, ymm5h, ymm6h, ymm7h */
>    ORIG_EAX
>  };
>  
> @@ -110,6 +128,9 @@ static int regmap[] =
>  #define GETFPXREGS_SUPPLIES(regno) \
>    (I386_ST0_REGNUM <= (regno) && (regno) < I386_SSE_NUM_REGS)
>  
> +#define GETXSTATEREGS_SUPPLIES(regno) \
> +  (I386_ST0_REGNUM <= (regno) && (regno) < I386_AVX_NUM_REGS)
> +
>  /* Does the current host support the GETREGS request?  */
>  int have_ptrace_getregs =
>  #ifdef HAVE_PTRACE_GETREGS
> @@ -355,6 +376,57 @@ static void store_fpregs (const struct regcache *regcache, int tid, int regno) {
>  
>  /* Transfering floating-point and SSE registers to and from GDB.  */
>  
> +/* Fetch all registers covered by the PTRACE_GETREGSET request from
> +   process/thread TID and store their values in GDB's register array.
> +   Return non-zero if successful, zero otherwise.  */
> +
> +static int
> +fetch_xstateregs (struct regcache *regcache, int tid)
> +{
> +  char xstateregs[I386_XSTATE_MAX_SIZE];
> +  struct iovec iov;
> +
> +  if (!have_ptrace_getregset)
> +    return 0;
> +
> +  iov.iov_base = xstateregs;
> +  iov.iov_len = sizeof (xstateregs);
> +  if (ptrace (PTRACE_GETREGSET, tid, (unsigned int) NT_X86_XSTATE,
> +	      &iov) < 0)
> +    perror_with_name (_("Couldn't read extended state status"));
> +
> +  i387_supply_xsave (regcache, -1, xstateregs);
> +  return 1;
> +}
> +
> +/* Store all valid registers in GDB's register array covered by the
> +   PTRACE_SETREGSET request into the process/thread specified by TID.
> +   Return non-zero if successful, zero otherwise.  */
> +
> +static int
> +store_xstateregs (const struct regcache *regcache, int tid, int regno)
> +{
> +  char xstateregs[I386_XSTATE_MAX_SIZE];
> +  struct iovec iov;
> +
> +  if (!have_ptrace_getregset)
> +    return 0;
> +
> +  iov.iov_base = xstateregs;
> +  iov.iov_len = sizeof (xstateregs);
> +  if (ptrace (PTRACE_GETREGSET, tid, (unsigned int) NT_X86_XSTATE,
> +	      &iov) < 0)
> +    perror_with_name (_("Couldn't read extended state status"));
> +
> +  i387_collect_xsave (regcache, regno, xstateregs, 0);
> +
> +  if (ptrace (PTRACE_SETREGSET, tid, (unsigned int) NT_X86_XSTATE,
> +	      &iov) < 0)
> +    perror_with_name (_("Couldn't write extended state status"));
> +
> +  return 1;
> +}
> +
>  #ifdef HAVE_PTRACE_GETFPXREGS
>  
>  /* Fill GDB's register array with the floating-point and SSE register
> @@ -489,6 +561,8 @@ i386_linux_fetch_inferior_registers (struct target_ops *ops,
>  	  return;
>  	}
>  
> +      if (fetch_xstateregs (regcache, tid))
> +	return;
>        if (fetch_fpxregs (regcache, tid))
>  	return;
>        fetch_fpregs (regcache, tid);
> @@ -501,6 +575,12 @@ i386_linux_fetch_inferior_registers (struct target_ops *ops,
>        return;
>      }
>  
> +  if (GETXSTATEREGS_SUPPLIES (regno))
> +    {
> +      if (fetch_xstateregs (regcache, tid))
> +	return;
> +    }
> +
>    if (GETFPXREGS_SUPPLIES (regno))
>      {
>        if (fetch_fpxregs (regcache, tid))
> @@ -553,6 +633,8 @@ i386_linux_store_inferior_registers (struct target_ops *ops,
>    if (regno == -1)
>      {
>        store_regs (regcache, tid, regno);
> +      if (store_xstateregs (regcache, tid, regno))
> +	return;
>        if (store_fpxregs (regcache, tid, regno))
>  	return;
>        store_fpregs (regcache, tid, regno);
> @@ -565,6 +647,12 @@ i386_linux_store_inferior_registers (struct target_ops *ops,
>        return;
>      }
>  
> +  if (GETXSTATEREGS_SUPPLIES (regno))
> +    {
> +      if (store_xstateregs (regcache, tid, regno))
> +	return;
> +    }
> +
>    if (GETFPXREGS_SUPPLIES (regno))
>      {
>        if (store_fpxregs (regcache, tid, regno))
> @@ -858,7 +946,50 @@ i386_linux_child_post_startup_inferior (ptid_t ptid)
>  static const struct target_desc *
>  i386_linux_read_description (struct target_ops *ops)
>  {
> -  return tdesc_i386_linux;
> +  static uint64_t xcr0;
> +
> +  if (have_ptrace_getregset == -1)
> +    {
> +      int tid;
> +      uint64_t xstateregs[(I386_XSTATE_SSE_SIZE / sizeof (uint64_t))];
> +      struct iovec iov;
> +      unsigned int xstate_size;
> +
> +      /* GNU/Linux LWP ID's are process ID's.  */
> +      tid = TIDGET (inferior_ptid);
> +      if (tid == 0)
> +	tid = PIDGET (inferior_ptid); /* Not a threaded program.  */
> +
> +      iov.iov_base = xstateregs;
> +      iov.iov_len = sizeof (xstateregs);
> +
> +      /* Check if PTRACE_GETREGSET works.  */
> +      if (ptrace (PTRACE_GETREGSET, tid, (unsigned int) NT_X86_XSTATE,
> +		  &iov) < 0)
> +	{
> +	  have_ptrace_getregset = 0;
> +	  xstate_size = 0;
> +	}
> +      else
> +	{
> +	  have_ptrace_getregset = 1;
> +
> +	  /* Get XCR0 from XSAVE extended state.  */
> +	  xcr0 = xstateregs[(I386_LINUX_XSAVE_XCR0_OFFSET
> +			     / sizeof (uint64_t))];
> +
> +	  xstate_size = I386_XSTATE_SIZE (xcr0);
> +	}
> +
> +      i386_linux_update_xstateregset (xstate_size);
> +    }
> +
> +  /* Check the native XCR0 only if PTRACE_GETREGSET is available.  */
> +  if (have_ptrace_getregset
> +      && (xcr0 & I386_XSTATE_AVX_MASK) == I386_XSTATE_AVX_MASK)
> +    return tdesc_i386_avx_linux;
> +  else
> +    return tdesc_i386_linux;
>  }
>  
>  void
> diff --git a/gdb/i386-linux-tdep.c b/gdb/i386-linux-tdep.c
> index b23c109..bda5d19 100644
> --- a/gdb/i386-linux-tdep.c
> +++ b/gdb/i386-linux-tdep.c
> @@ -23,6 +23,7 @@
>  #include "frame.h"
>  #include "value.h"
>  #include "regcache.h"
> +#include "regset.h"
>  #include "inferior.h"
>  #include "osabi.h"
>  #include "reggroups.h"
> @@ -36,9 +37,11 @@
>  #include "solib-svr4.h"
>  #include "symtab.h"
>  #include "arch-utils.h"
> -#include "regset.h"
>  #include "xml-syscall.h"
>  
> +#include "i387-tdep.h"
> +#include "i386-xstate.h"
> +
>  /* The syscall's XML filename for i386.  */
>  #define XML_SYSCALL_FILENAME_I386 "syscalls/i386-linux.xml"
>  
> @@ -47,6 +50,7 @@
>  #include <stdint.h>
>  
>  #include "features/i386/i386-linux.c"
> +#include "features/i386/i386-avx-linux.c"
>  
>  /* Supported register note sections.  */
>  static struct core_regset_section i386_linux_regset_sections[] =
> @@ -54,6 +58,7 @@ static struct core_regset_section i386_linux_regset_sections[] =
>    { ".reg", 144, "general-purpose" },
>    { ".reg2", 108, "floating-point" },
>    { ".reg-xfp", 512, "extended floating-point" },
> +  { ".reg-xstate", 0, "XSAVE extended state" },
>    { NULL, 0 }
>  };
>  
> @@ -533,6 +538,7 @@ static int i386_linux_gregset_reg_offset[] =
>    -1, -1, -1, -1, -1, -1, -1, -1,
>    -1, -1, -1, -1, -1, -1, -1, -1,
>    -1,
> +  -1, -1, -1, -1, -1, -1, -1, -1,
>    11 * 4			/* "orig_eax" */
>  };
>  
> @@ -560,6 +566,59 @@ static int i386_linux_sc_reg_offset[] =
>    0 * 4				/* %gs */
>  };
>  
> +/* Update XSAVE extended state register note section.  */
> +
> +void
> +i386_linux_update_xstateregset (unsigned int xstate_size)
> +{
> +  struct core_regset_section *xstate = &i386_linux_regset_sections[3];
> +
> +  /* Update the XSAVE extended state register note section for "gcore".
> +     Disable it if its size is 0.  */
> +  gdb_assert (strcmp (xstate->sect_name, ".reg-xstate") == 0);
> +  if (xstate_size)
> +    xstate->size = xstate_size;
> +  else
> +    xstate->sect_name = NULL;
> +}
> +
> +/* Get XSAVE extended state xcr0 from core dump.  */
> +
> +uint64_t
> +i386_linux_core_read_xcr0 (struct gdbarch *gdbarch,
> +			   struct target_ops *target, bfd *abfd)
> +{
> +  asection *xstate = bfd_get_section_by_name (abfd, ".reg-xstate");
> +  uint64_t xcr0;
> +
> +  if (xstate)
> +    {
> +      size_t size = bfd_section_size (abfd, xstate);
> +
> +      /* Check extended state size.  */
> +      if (size < I386_XSTATE_AVX_SIZE)
> +	xcr0 = I386_XSTATE_SSE_MASK;
> +      else
> +	{
> +	  char contents[8];
> +
> +	  if (! bfd_get_section_contents (abfd, xstate, contents,
> +					  I386_LINUX_XSAVE_XCR0_OFFSET,
> +					  8))
> +	    {
> +	      warning (_("Couldn't read `xcr0' bytes from `.reg-xstate' section in core file."));
> +	      return 0;
> +	    }
> +
> +	  xcr0 = bfd_get_64 (abfd, contents);
> +	}
> +    }
> +  else
> +    xcr0 = I386_XSTATE_SSE_MASK;
> +
> +  return xcr0;
> +}
> +
>  /* Get Linux/x86 target description from core dump.  */
>  
>  static const struct target_desc *
> @@ -568,12 +627,17 @@ i386_linux_core_read_description (struct gdbarch *gdbarch,
>  				  bfd *abfd)
>  {
>    asection *section = bfd_get_section_by_name (abfd, ".reg2");
> +  uint64_t xcr0;
>  
>    if (section == NULL)
>      return NULL;
>  
>    /* Linux/i386.  */
> -  return tdesc_i386_linux;
> +  xcr0 = i386_linux_core_read_xcr0 (gdbarch, target, abfd);
> +  if ((xcr0 & I386_XSTATE_AVX_MASK) == I386_XSTATE_AVX_MASK)
> +    return tdesc_i386_avx_linux;
> +  else
> +    return tdesc_i386_linux;
>  }
>  
>  static void
> @@ -623,6 +687,8 @@ i386_linux_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch)
>    tdep->sc_reg_offset = i386_linux_sc_reg_offset;
>    tdep->sc_num_regs = ARRAY_SIZE (i386_linux_sc_reg_offset);
>  
> +  tdep->xsave_xcr0_offset = I386_LINUX_XSAVE_XCR0_OFFSET;
> +
>    set_gdbarch_process_record (gdbarch, i386_process_record);
>    set_gdbarch_process_record_signal (gdbarch, i386_linux_record_signal);
>  
> @@ -840,4 +906,5 @@ _initialize_i386_linux_tdep (void)
>  
>    /* Initialize the Linux target description  */
>    initialize_tdesc_i386_linux ();
> +  initialize_tdesc_i386_avx_linux ();
>  }
> diff --git a/gdb/i386-linux-tdep.h b/gdb/i386-linux-tdep.h
> index 11f7295..187769b 100644
> --- a/gdb/i386-linux-tdep.h
> +++ b/gdb/i386-linux-tdep.h
> @@ -30,12 +30,41 @@
>  /* Register number for the "orig_eax" pseudo-register.  If this
>     pseudo-register contains a value >= 0 it is interpreted as the
>     system call number that the kernel is supposed to restart.  */
> -#define I386_LINUX_ORIG_EAX_REGNUM I386_SSE_NUM_REGS
> +#define I386_LINUX_ORIG_EAX_REGNUM I386_AVX_NUM_REGS
>  
>  /* Total number of registers for GNU/Linux.  */
>  #define I386_LINUX_NUM_REGS (I386_LINUX_ORIG_EAX_REGNUM + 1)
>  
> +/* Get XSAVE extended state xcr0 from core dump.  */
> +extern uint64_t i386_linux_core_read_xcr0
> +  (struct gdbarch *gdbarch, struct target_ops *target, bfd *abfd);
> +
>  /* Linux target description.  */
>  extern struct target_desc *tdesc_i386_linux;
> +extern struct target_desc *tdesc_i386_avx_linux;
> +
> +/* Update XSAVE extended state register note section.  */
> +extern void i386_linux_update_xstateregset (unsigned int xstate_size);
> +
> +/* Format of XSAVE extended state is:
> + 	struct
> +	{
> +	  fxsave_bytes[0..463]
> +	  sw_usable_bytes[464..511]
> +	  xstate_hdr_bytes[512..575]
> +	  avx_bytes[576..831]
> +	  future_state etc
> +	};
> +
> +  The same memory layout is used for the coredump NT_X86_XSTATE note
> +  representing the XSAVE extended state registers.
> +
> +  The first 8 bytes of sw_usable_bytes, [464..471], hold the OS-enabled
> +  extended state mask, which is the same as the extended control register
> +  0 (the XFEATURE_ENABLED_MASK register), XCR0.  We can use this mask
> +  together with the mask saved in the xstate_hdr_bytes to determine what
> +  states the processor/OS supports and what state, used or initialized,
> +  the process/thread is in.  */
> +#define I386_LINUX_XSAVE_XCR0_OFFSET 464
>  
>  #endif /* i386-linux-tdep.h */
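
To make the layout comment above concrete, here is a minimal sketch of
pulling XCR0 out of a raw XSAVE buffer and applying the same AVX test
the patch uses.  The helper names (xsave_read_le64, xsave_has_avx) are
hypothetical and do not appear in the patch; the offset 464 and the
mask 0x7 (x87 | SSE | AVX) are taken from the hunks above, and the
buffer is assumed to be little endian, as on x86.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* I386_LINUX_XSAVE_XCR0_OFFSET in the patch.  */
#define XSAVE_XCR0_OFFSET  464
/* I386_XSTATE_AVX_MASK in the patch: x87 | SSE | AVX.  */
#define XSTATE_AVX_MASK    0x7ULL

/* Hypothetical helper: decode a little-endian 64-bit value stored at
   OFFSET in BUF, which holds LEN bytes.  */
uint64_t
xsave_read_le64 (const unsigned char *buf, size_t len, size_t offset)
{
  uint64_t value = 0;
  int i;

  assert (offset + 8 <= len);
  for (i = 7; i >= 0; i--)
    value = (value << 8) | buf[offset + i];
  return value;
}

/* Hypothetical helper: non-zero if the XCR0 copy saved by the kernel
   in the software-usable bytes has all of x87, SSE and AVX enabled.  */
int
xsave_has_avx (const unsigned char *buf, size_t len)
{
  uint64_t xcr0 = xsave_read_le64 (buf, len, XSAVE_XCR0_OFFSET);

  return (xcr0 & XSTATE_AVX_MASK) == XSTATE_AVX_MASK;
}
```

The all-bits comparison, rather than a test of the AVX bit alone,
mirrors the check in i386_linux_read_description and
i386_linux_core_read_description.
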
> diff --git a/gdb/i386-tdep.c b/gdb/i386-tdep.c
> index 703d003..ce658cd 100644
> --- a/gdb/i386-tdep.c
> +++ b/gdb/i386-tdep.c
> @@ -51,11 +51,13 @@
>  
>  #include "i386-tdep.h"
>  #include "i387-tdep.h"
> +#include "i386-xstate.h"
>  
>  #include "record.h"
>  #include <stdint.h>
>  
>  #include "features/i386/i386.c"
> +#include "features/i386/i386-avx.c"
>  
>  /* Register names.  */
>  
> @@ -74,6 +76,18 @@ static const char *i386_register_names[] =
>    "mxcsr"
>  };
>  
> +static const char *i386_ymm_names[] =
> +{
> +  "ymm0",  "ymm1",   "ymm2",  "ymm3",
> +  "ymm4",  "ymm5",   "ymm6",  "ymm7",
> +};
> +
> +static const char *i386_ymmh_names[] =
> +{
> +  "ymm0h",  "ymm1h",   "ymm2h",  "ymm3h",
> +  "ymm4h",  "ymm5h",   "ymm6h",  "ymm7h",
> +};
> +
>  /* Register names for MMX pseudo-registers.  */
>  
>  static const char *i386_mmx_names[] =
> @@ -150,18 +164,47 @@ i386_dword_regnum_p (struct gdbarch *gdbarch, int regnum)
>    return regnum >= 0 && regnum < tdep->num_dword_regs;
>  }
>  
> +int
> +i386_ymmh_regnum_p (struct gdbarch *gdbarch, int regnum)
> +{
> +  struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
> +  int ymm0h_regnum = tdep->ymm0h_regnum;
> +
> +  if (ymm0h_regnum < 0)
> +    return 0;
> +
> +  regnum -= ymm0h_regnum;
> +  return regnum >= 0 && regnum < tdep->num_ymm_regs;
> +}
> +
> +/* AVX register?  */
> +
> +int
> +i386_ymm_regnum_p (struct gdbarch *gdbarch, int regnum)
> +{
> +  struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
> +  int ymm0_regnum = tdep->ymm0_regnum;
> +
> +  if (ymm0_regnum < 0)
> +    return 0;
> +
> +  regnum -= ymm0_regnum;
> +  return regnum >= 0 && regnum < tdep->num_ymm_regs;
> +}
> +
>  /* SSE register?  */
>  
> -static int
> -i386_sse_regnum_p (struct gdbarch *gdbarch, int regnum)
> +int
> +i386_xmm_regnum_p (struct gdbarch *gdbarch, int regnum)
>  {
>    struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
> +  int num_xmm_regs = I387_NUM_XMM_REGS (tdep);
>  
> -  if (I387_NUM_XMM_REGS (tdep) == 0)
> +  if (num_xmm_regs == 0)
>      return 0;
>  
> -  return (I387_XMM0_REGNUM (tdep) <= regnum
> -	  && regnum < I387_MXCSR_REGNUM (tdep));
> +  regnum -= I387_XMM0_REGNUM (tdep);
> +  return regnum >= 0 && regnum < num_xmm_regs;
>  }
>  
>  static int
> @@ -201,6 +244,19 @@ i386_fpc_regnum_p (struct gdbarch *gdbarch, int regnum)
>  	  && regnum < I387_XMM0_REGNUM (tdep));
>  }
>  
> +/* Return the name of register REGNUM, or the empty string if it is
> +   an anonymous register.  */
> +
> +static const char *
> +i386_register_name (struct gdbarch *gdbarch, int regnum)
> +{
> +  /* Hide the upper YMM registers.  */
> +  if (i386_ymmh_regnum_p (gdbarch, regnum))
> +    return "";
> +
> +  return tdesc_register_name (gdbarch, regnum);
> +}
> +
>  /* Return the name of register REGNUM.  */
>  
>  const char *
> @@ -209,6 +265,8 @@ i386_pseudo_register_name (struct gdbarch *gdbarch, int regnum)
>    struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
>    if (i386_mmx_regnum_p (gdbarch, regnum))
>      return i386_mmx_names[regnum - I387_MM0_REGNUM (tdep)];
> +  else if (i386_ymm_regnum_p (gdbarch, regnum))
> +    return i386_ymm_names[regnum - tdep->ymm0_regnum];
>    else if (i386_byte_regnum_p (gdbarch, regnum))
>      return i386_byte_names[regnum - tdep->al_regnum];
>    else if (i386_word_regnum_p (gdbarch, regnum))
> @@ -246,7 +304,13 @@ i386_dbx_reg_to_regnum (struct gdbarch *gdbarch, int reg)
>    else if (reg >= 21 && reg <= 28)
>      {
>        /* SSE registers.  */
> -      return reg - 21 + I387_XMM0_REGNUM (tdep);
> +      int ymm0_regnum = tdep->ymm0_regnum;
> +
> +      if (ymm0_regnum >= 0
> +	  && i386_xmm_regnum_p (gdbarch, reg))
> +	return reg - 21 + ymm0_regnum;
> +      else
> +	return reg - 21 + I387_XMM0_REGNUM (tdep);
>      }
>    else if (reg >= 29 && reg <= 36)
>      {
> @@ -2184,6 +2248,59 @@ i387_ext_type (struct gdbarch *gdbarch)
>    return tdep->i387_ext_type;
>  }
>  
> +/* Construct vector type for pseudo YMM registers.  We can't use
> +   tdesc_find_type since YMM isn't described in target description.  */
> +
> +static struct type *
> +i386_ymm_type (struct gdbarch *gdbarch)
> +{
> +  struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
> +
> +  if (!tdep->i386_ymm_type)
> +    {
> +      const struct builtin_type *bt = builtin_type (gdbarch);
> +
> +      /* The type we're building is this: */
> +#if 0
> +      union __gdb_builtin_type_vec256i
> +      {
> +        int128_t v2_int128[2];
> +        int64_t v4_int64[4];
> +        int32_t v8_int32[8];
> +        int16_t v16_int16[16];
> +        int8_t v32_int8[32];
> +        double v4_double[4];
> +        float v8_float[8];
> +      };
> +#endif
> +
> +      struct type *t;
> +
> +      t = arch_composite_type (gdbarch,
> +			       "__gdb_builtin_type_vec256i", TYPE_CODE_UNION);
> +      append_composite_type_field (t, "v8_float",
> +				   init_vector_type (bt->builtin_float, 8));
> +      append_composite_type_field (t, "v4_double",
> +				   init_vector_type (bt->builtin_double, 4));
> +      append_composite_type_field (t, "v32_int8",
> +				   init_vector_type (bt->builtin_int8, 32));
> +      append_composite_type_field (t, "v16_int16",
> +				   init_vector_type (bt->builtin_int16, 16));
> +      append_composite_type_field (t, "v8_int32",
> +				   init_vector_type (bt->builtin_int32, 8));
> +      append_composite_type_field (t, "v4_int64",
> +				   init_vector_type (bt->builtin_int64, 4));
> +      append_composite_type_field (t, "v2_int128",
> +				   init_vector_type (bt->builtin_int128, 2));
> +
> +      TYPE_VECTOR (t) = 1;
> +      TYPE_NAME (t) = "builtin_type_vec256i";
> +      tdep->i386_ymm_type = t;
> +    }
> +
> +  return tdep->i386_ymm_type;
> +}
> +
>  /* Construct vector type for MMX registers.  */
>  static struct type *
>  i386_mmx_type (struct gdbarch *gdbarch)
> @@ -2234,6 +2351,8 @@ i386_pseudo_register_type (struct gdbarch *gdbarch, int regnum)
>  {
>    if (i386_mmx_regnum_p (gdbarch, regnum))
>      return i386_mmx_type (gdbarch);
> +  else if (i386_ymm_regnum_p (gdbarch, regnum))
> +    return i386_ymm_type (gdbarch);
>    else
>      {
>        const struct builtin_type *bt = builtin_type (gdbarch);
> @@ -2285,7 +2404,22 @@ i386_pseudo_register_read (struct gdbarch *gdbarch, struct regcache *regcache,
>      {
>        struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
>  
> -      if (i386_word_regnum_p (gdbarch, regnum))
> +      if (i386_ymm_regnum_p (gdbarch, regnum))
> +	{
> +	  regnum -= tdep->ymm0_regnum;
> +
> +	  /* Extract (always little endian).  Read lower 128 bits.  */
> +	  regcache_raw_read (regcache,
> +			     I387_XMM0_REGNUM (tdep) + regnum,
> +			     raw_buf);
> +	  memcpy (buf, raw_buf, 16);
> +	  /* Read upper 128 bits.  */
> +	  regcache_raw_read (regcache,
> +			     tdep->ymm0h_regnum + regnum,
> +			     raw_buf);
> +	  memcpy (buf + 16, raw_buf, 16);
> +	}
> +      else if (i386_word_regnum_p (gdbarch, regnum))
>  	{
>  	  int gpnum = regnum - tdep->ax_regnum;
>  
> @@ -2334,7 +2468,20 @@ i386_pseudo_register_write (struct gdbarch *gdbarch, struct regcache *regcache,
>      {
>        struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
>  
> -      if (i386_word_regnum_p (gdbarch, regnum))
> +      if (i386_ymm_regnum_p (gdbarch, regnum))
> +	{
> +	  regnum -= tdep->ymm0_regnum;
> +
> +	  /* ... Write lower 128 bits.  */
> +	  regcache_raw_write (regcache,
> +			     I387_XMM0_REGNUM (tdep) + regnum,
> +			     buf);
> +	  /* ... Write upper 128 bits.  */
> +	  regcache_raw_write (regcache,
> +			     tdep->ymm0h_regnum + regnum,
> +			     buf + 16);
> +	}
> +      else if (i386_word_regnum_p (gdbarch, regnum))
>  	{
>  	  int gpnum = regnum - tdep->ax_regnum;
>  
> @@ -2581,6 +2728,28 @@ i386_collect_fpregset (const struct regset *regset,
>    i387_collect_fsave (regcache, regnum, fpregs);
>  }
>  
> +/* Similar to i386_supply_fpregset, but use XSAVE extended state.  */
> +
> +static void
> +i386_supply_xstateregset (const struct regset *regset,
> +			  struct regcache *regcache, int regnum,
> +			  const void *xstateregs, size_t len)
> +{
> +  const struct gdbarch_tdep *tdep = gdbarch_tdep (regset->arch);
> +  i387_supply_xsave (regcache, regnum, xstateregs);
> +}
> +
> +/* Similar to i386_collect_fpregset, but use XSAVE extended state.  */
> +
> +static void
> +i386_collect_xstateregset (const struct regset *regset,
> +			   const struct regcache *regcache,
> +			   int regnum, void *xstateregs, size_t len)
> +{
> +  const struct gdbarch_tdep *tdep = gdbarch_tdep (regset->arch);
> +  i387_collect_xsave (regcache, regnum, xstateregs, 1);
> +}
> +
>  /* Return the appropriate register set for the core section identified
>     by SECT_NAME and SECT_SIZE.  */
>  
> @@ -2608,6 +2777,16 @@ i386_regset_from_core_section (struct gdbarch *gdbarch,
>        return tdep->fpregset;
>      }
>  
> +  if (strcmp (sect_name, ".reg-xstate") == 0)
> +    {
> +      if (tdep->xstateregset == NULL)
> +	tdep->xstateregset = regset_alloc (gdbarch,
> +					   i386_supply_xstateregset,
> +					   i386_collect_xstateregset);
> +
> +      return tdep->xstateregset;
> +    }
> +
>    return NULL;
>  }
>  \f
> @@ -2801,46 +2980,60 @@ int
>  i386_register_reggroup_p (struct gdbarch *gdbarch, int regnum,
>  			  struct reggroup *group)
>  {
> -  int sse_regnum_p, fp_regnum_p, mmx_regnum_p, byte_regnum_p,
> -      word_regnum_p, dword_regnum_p;
> +  const struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
> +  int fp_regnum_p, mmx_regnum_p, xmm_regnum_p, mxcsr_regnum_p,
> +      ymm_regnum_p, ymmh_regnum_p;
>  
>    /* Don't include pseudo registers, except for MMX, in any register
>       groups.  */
> -  byte_regnum_p = i386_byte_regnum_p (gdbarch, regnum);
> -  if (byte_regnum_p)
> +  if (i386_byte_regnum_p (gdbarch, regnum))
>      return 0;
>  
> -  word_regnum_p = i386_word_regnum_p (gdbarch, regnum);
> -  if (word_regnum_p)
> +  if (i386_word_regnum_p (gdbarch, regnum))
>      return 0;
>  
> -  dword_regnum_p = i386_dword_regnum_p (gdbarch, regnum);
> -  if (dword_regnum_p)
> +  if (i386_dword_regnum_p (gdbarch, regnum))
>      return 0;
>  
>    mmx_regnum_p = i386_mmx_regnum_p (gdbarch, regnum);
>    if (group == i386_mmx_reggroup)
>      return mmx_regnum_p;
>  
> -  sse_regnum_p = (i386_sse_regnum_p (gdbarch, regnum)
> -		  || i386_mxcsr_regnum_p (gdbarch, regnum));
> +  xmm_regnum_p = i386_xmm_regnum_p (gdbarch, regnum);
> +  mxcsr_regnum_p = i386_mxcsr_regnum_p (gdbarch, regnum);
>    if (group == i386_sse_reggroup)
> -    return sse_regnum_p;
> +    return xmm_regnum_p || mxcsr_regnum_p;
> +
> +  ymm_regnum_p = i386_ymm_regnum_p (gdbarch, regnum);
>    if (group == vector_reggroup)
> -    return mmx_regnum_p || sse_regnum_p;
> +    return (mmx_regnum_p
> +	    || ymm_regnum_p
> +	    || mxcsr_regnum_p
> +	    || (xmm_regnum_p
> +		&& ((tdep->xcr0 & I386_XSTATE_AVX_MASK)
> +		    == I386_XSTATE_SSE_MASK)));
>  
>    fp_regnum_p = (i386_fp_regnum_p (gdbarch, regnum)
>  		 || i386_fpc_regnum_p (gdbarch, regnum));
>    if (group == float_reggroup)
>      return fp_regnum_p;
>  
> +  /* For "info reg all", include neither the upper YMM registers nor the
> +     XMM registers when AVX is supported.  */
> +  ymmh_regnum_p = i386_ymmh_regnum_p (gdbarch, regnum);
> +  if (group == all_reggroup
> +      && ((xmm_regnum_p
> +	   && (tdep->xcr0 & I386_XSTATE_AVX))
> +	  || ymmh_regnum_p))
> +    return 0;
> +
>    if (group == general_reggroup)
>      return (!fp_regnum_p
>  	    && !mmx_regnum_p
> -	    && !sse_regnum_p
> -	    && !byte_regnum_p
> -	    && !word_regnum_p
> -	    && !dword_regnum_p);
> +	    && !mxcsr_regnum_p
> +	    && !xmm_regnum_p
> +	    && !ymm_regnum_p
> +	    && !ymmh_regnum_p);
>  
>    return default_register_reggroup_p (gdbarch, regnum, group);
>  }
> @@ -5665,7 +5858,7 @@ no_support_3dnow_data:
>                record_arch_list_add_reg (ir.regcache, i);
>  
>              for (i = I387_XMM0_REGNUM (tdep);
> -                 i386_sse_regnum_p (gdbarch, i); i++)
> +                 i386_xmm_regnum_p (gdbarch, i); i++)
>                record_arch_list_add_reg (ir.regcache, i);
>  
>              if (i386_mxcsr_regnum_p (gdbarch, I387_MXCSR_REGNUM(tdep)))
> @@ -6065,7 +6258,7 @@ reswitch_prefix_add:
>            if (i386_record_modrm (&ir))
>  	    return -1;
>            ir.reg |= rex_r;
> -          if (!i386_sse_regnum_p (gdbarch, I387_XMM0_REGNUM (tdep) + ir.reg))
> +          if (!i386_xmm_regnum_p (gdbarch, I387_XMM0_REGNUM (tdep) + ir.reg))
>              goto no_support;
>            record_arch_list_add_reg (ir.regcache,
>                                      I387_XMM0_REGNUM (tdep) + ir.reg);
> @@ -6097,7 +6290,7 @@ reswitch_prefix_add:
>                    || opcode == 0x0f17 || opcode == 0x660f17)
>                  goto no_support;
>                ir.rm |= ir.rex_b;
> -              if (!i386_sse_regnum_p (gdbarch, I387_XMM0_REGNUM (tdep) + ir.rm))
> +              if (!i386_xmm_regnum_p (gdbarch, I387_XMM0_REGNUM (tdep) + ir.rm))
>                  goto no_support;
>                record_arch_list_add_reg (ir.regcache,
>                                          I387_XMM0_REGNUM (tdep) + ir.rm);
> @@ -6275,7 +6468,7 @@ reswitch_prefix_add:
>            if (i386_record_modrm (&ir))
>  	    return -1;
>            ir.rm |= ir.rex_b;
> -          if (!i386_sse_regnum_p (gdbarch, I387_XMM0_REGNUM (tdep) + ir.rm))
> +          if (!i386_xmm_regnum_p (gdbarch, I387_XMM0_REGNUM (tdep) + ir.rm))
>              goto no_support;
>            record_arch_list_add_reg (ir.regcache,
>                                      I387_XMM0_REGNUM (tdep) + ir.rm);
> @@ -6329,7 +6522,7 @@ reswitch_prefix_add:
>            if (ir.mod == 3)
>              {
>                ir.rm |= ir.rex_b;
> -              if (!i386_sse_regnum_p (gdbarch, I387_XMM0_REGNUM (tdep) + ir.rm))
> +              if (!i386_xmm_regnum_p (gdbarch, I387_XMM0_REGNUM (tdep) + ir.rm))
>                  goto no_support;
>                record_arch_list_add_reg (ir.regcache,
>                                          I387_XMM0_REGNUM (tdep) + ir.rm);
> @@ -6449,7 +6642,8 @@ i386_validate_tdesc_p (struct gdbarch_tdep *tdep,
>  		       struct tdesc_arch_data *tdesc_data)
>  {
>    const struct target_desc *tdesc = tdep->tdesc;
> -  const struct tdesc_feature *feature_core, *feature_vector;
> +  const struct tdesc_feature *feature_core;
> +  const struct tdesc_feature *feature_sse, *feature_avx;
>    int i, num_regs, valid_p;
>  
>    if (! tdesc_has_registers (tdesc))
> @@ -6459,13 +6653,37 @@ i386_validate_tdesc_p (struct gdbarch_tdep *tdep,
>    feature_core = tdesc_find_feature (tdesc, "org.gnu.gdb.i386.core");
>  
>    /* Get SSE registers.  */
> -  feature_vector = tdesc_find_feature (tdesc, "org.gnu.gdb.i386.sse");
> +  feature_sse = tdesc_find_feature (tdesc, "org.gnu.gdb.i386.sse");
>  
> -  if (feature_core == NULL || feature_vector == NULL)
> +  if (feature_core == NULL || feature_sse == NULL)
>      return 0;
>  
> +  /* Try AVX registers.  */
> +  feature_avx = tdesc_find_feature (tdesc, "org.gnu.gdb.i386.avx");
> +
>    valid_p = 1;
>  
> +  /* The XCR0 bits.  */
> +  if (feature_avx)
> +    {
> +      tdep->xcr0 = I386_XSTATE_AVX_MASK;
> +
> +      /* It may have been set by OSABI initialization function.  */
> +      if (tdep->num_ymm_regs == 0)
> +	{
> +	  tdep->ymmh_register_names = i386_ymmh_names;
> +	  tdep->num_ymm_regs = 8;
> +	  tdep->ymm0h_regnum = I386_YMM0H_REGNUM;
> +	}
> +
> +      for (i = 0; i < tdep->num_ymm_regs; i++)
> +	valid_p &= tdesc_numbered_register (feature_avx, tdesc_data,
> +					    tdep->ymm0h_regnum + i,
> +					    tdep->ymmh_register_names[i]);
> +    }
> +  else
> +    tdep->xcr0 = I386_XSTATE_SSE_MASK;
> +
>    num_regs = tdep->num_core_regs;
>    for (i = 0; i < num_regs; i++)
>      valid_p &= tdesc_numbered_register (feature_core, tdesc_data, i,
> @@ -6474,7 +6692,7 @@ i386_validate_tdesc_p (struct gdbarch_tdep *tdep,
>    /* Need to include %mxcsr, so add one.  */
>    num_regs += tdep->num_xmm_regs + 1;
>    for (; i < num_regs; i++)
> -    valid_p &= tdesc_numbered_register (feature_vector, tdesc_data, i,
> +    valid_p &= tdesc_numbered_register (feature_sse, tdesc_data, i,
>  					tdep->register_names[i]);
>  
>    return valid_p;
> @@ -6489,6 +6707,7 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
>    struct tdesc_arch_data *tdesc_data;
>    const struct target_desc *tdesc;
>    int mm0_regnum;
> +  int ymm0_regnum;
>  
>    /* If there is already a candidate, use it.  */
>    arches = gdbarch_list_lookup_by_info (arches, &info);
> @@ -6509,6 +6728,8 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
>    tdep->fpregset = NULL;
>    tdep->sizeof_fpregset = I387_SIZEOF_FSAVE;
>  
> +  tdep->xstateregset = NULL;
> +
>    /* The default settings include the FPU registers, the MMX registers
>       and the SSE registers.  This can be overridden for a specific ABI
>       by adjusting the members `st0_regnum', `mm0_regnum' and
> @@ -6538,6 +6759,8 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
>    tdep->sc_pc_offset = -1;
>    tdep->sc_sp_offset = -1;
>  
> +  tdep->xsave_xcr0_offset = -1;
> +
>    tdep->record_regmap = i386_record_regmap;
>  
>    /* The format used for `long double' on almost all i386 targets is
> @@ -6654,9 +6877,14 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
>    set_tdesc_pseudo_register_type (gdbarch, i386_pseudo_register_type);
>    set_tdesc_pseudo_register_name (gdbarch, i386_pseudo_register_name);
>  
> -  /* The default ABI includes general-purpose registers, 
> -     floating-point registers, and the SSE registers.  */
> -  set_gdbarch_num_regs (gdbarch, I386_SSE_NUM_REGS);
> +  /* Override the normal target description method to make the AVX
> +     upper halves anonymous.  */
> +  set_gdbarch_register_name (gdbarch, i386_register_name);
> +
> +  /* Even though the default ABI only includes general-purpose registers,
> +     floating-point registers and the SSE registers, we have to leave a
> +     gap for the upper AVX registers.  */
> +  set_gdbarch_num_regs (gdbarch, I386_AVX_NUM_REGS);
>  
>    /* Get the x86 target description from INFO.  */
>    tdesc = info.target_desc;
> @@ -6667,10 +6895,15 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
>    tdep->num_core_regs = I386_NUM_GREGS + I387_NUM_REGS;
>    tdep->register_names = i386_register_names;
>  
> +  /* No upper YMM registers.  */
> +  tdep->ymmh_register_names = NULL;
> +  tdep->ymm0h_regnum = -1;
> +
>    tdep->num_byte_regs = 8;
>    tdep->num_word_regs = 8;
>    tdep->num_dword_regs = 0;
>    tdep->num_mmx_regs = 8;
> +  tdep->num_ymm_regs = 0;
>  
>    tdesc_data = tdesc_data_alloc ();
>  
> @@ -6678,24 +6911,25 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
>    info.tdep_info = (void *) tdesc_data;
>    gdbarch_init_osabi (info, gdbarch);
>  
> +  if (!i386_validate_tdesc_p (tdep, tdesc_data))
> +    {
> +      tdesc_data_cleanup (tdesc_data);
> +      xfree (tdep);
> +      gdbarch_free (gdbarch);
> +      return NULL;
> +    }
> +
>    /* Wire in pseudo registers.  Number of pseudo registers may be
>       changed.  */
>    set_gdbarch_num_pseudo_regs (gdbarch, (tdep->num_byte_regs
>  					 + tdep->num_word_regs
>  					 + tdep->num_dword_regs
> -					 + tdep->num_mmx_regs));
> +					 + tdep->num_mmx_regs
> +					 + tdep->num_ymm_regs));
>  
>    /* Target description may be changed.  */
>    tdesc = tdep->tdesc;
>  
> -  if (!i386_validate_tdesc_p (tdep, tdesc_data))
> -    {
> -      tdesc_data_cleanup (tdesc_data);
> -      xfree (tdep);
> -      gdbarch_free (gdbarch);
> -      return NULL;
> -    }
> -
>    tdesc_use_registers (gdbarch, tdesc, tdesc_data);
>  
>    /* Override gdbarch_register_reggroup_p set in tdesc_use_registers.  */
> @@ -6705,16 +6939,26 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
>    tdep->al_regnum = gdbarch_num_regs (gdbarch);
>    tdep->ax_regnum = tdep->al_regnum + tdep->num_byte_regs;
>  
> -  mm0_regnum = tdep->ax_regnum + tdep->num_word_regs;
> +  ymm0_regnum = tdep->ax_regnum + tdep->num_word_regs;
>    if (tdep->num_dword_regs)
>      {
>        /* Support dword pseudo-registesr if it hasn't been disabled,  */
> -      tdep->eax_regnum = mm0_regnum;
> -      mm0_regnum = tdep->eax_regnum + tdep->num_dword_regs;
> +      tdep->eax_regnum = ymm0_regnum;
> +      ymm0_regnum += tdep->num_dword_regs;
>      }
>    else
>      tdep->eax_regnum = -1;
>  
> +  mm0_regnum = ymm0_regnum;
> +  if (tdep->num_ymm_regs)
> +    {
> +      /* Support YMM pseudo-registers if they are available.  */
> +      tdep->ymm0_regnum = ymm0_regnum;
> +      mm0_regnum += tdep->num_ymm_regs;
> +    }
> +  else
> +    tdep->ymm0_regnum = -1;
> +
>    if (tdep->num_mmx_regs != 0)
>      {
>        /* Support MMX pseudo-registesr if MMX hasn't been disabled,  */
> @@ -6797,6 +7041,7 @@ is \"default\"."),
>  
>    /* Initialize the standard target descriptions.  */
>    initialize_tdesc_i386 ();
> +  initialize_tdesc_i386_avx ();
>  
>    /* Tell remote stub that we support XML target description.  */
>    register_remote_support_xml ("i386");
> diff --git a/gdb/i386-tdep.h b/gdb/i386-tdep.h
> index 72c634e..6520d67 100644
> --- a/gdb/i386-tdep.h
> +++ b/gdb/i386-tdep.h
> @@ -109,6 +109,9 @@ struct gdbarch_tdep
>    struct regset *fpregset;
>    size_t sizeof_fpregset;
>  
> +  /* XSAVE extended state.  */
> +  struct regset *xstateregset;
> +
>    /* Register number for %st(0).  The register numbers for the other
>       registers follow from this one.  Set this to -1 to indicate the
>       absence of an FPU.  */
> @@ -121,6 +124,13 @@ struct gdbarch_tdep
>       of MMX support.  */
>    int mm0_regnum;
>  
> +  /* Number of pseudo YMM registers.  */
> +  int num_ymm_regs;
> +
> +  /* Register number for %ymm0.  Set this to -1 to indicate the absence
> +     of pseudo YMM register support.  */
> +  int ymm0_regnum;
> +
>    /* Number of byte registers.  */
>    int num_byte_regs;
>  
> @@ -146,9 +156,24 @@ struct gdbarch_tdep
>    /* Number of SSE registers.  */
>    int num_xmm_regs;
>  
> +  /* Bits of the extended control register 0 (the XFEATURE_ENABLED_MASK
> +     register), excluding the x87 bit, which are supported by this GDB.
> +   */
> +  uint64_t xcr0;
> +
> +  /* Offset of XCR0 in XSAVE extended state.  */
> +  int xsave_xcr0_offset;
> +
>    /* Register names.  */
>    const char **register_names;
>  
> +  /* Register number for %ymm0h.  Set this to -1 to indicate the absence
> +     of upper YMM register support.  */
> +  int ymm0h_regnum;
> +
> +  /* Upper YMM register names.  Only used for tdesc_numbered_register.  */
> +  const char **ymmh_register_names;
> +
>    /* Target description.  */
>    const struct target_desc *tdesc;
>  
> @@ -182,6 +207,7 @@ struct gdbarch_tdep
>  
>    /* ISA-specific data types.  */
>    struct type *i386_mmx_type;
> +  struct type *i386_ymm_type;
>    struct type *i387_ext_type;
>  
>    /* Process record/replay target.  */
> @@ -228,7 +254,9 @@ enum i386_regnum
>    I386_FS_REGNUM,		/* %fs */
>    I386_GS_REGNUM,		/* %gs */
>    I386_ST0_REGNUM,		/* %st(0) */
> -  I386_MXCSR_REGNUM = 40	/* %mxcsr */ 
> +  I386_MXCSR_REGNUM = 40,	/* %mxcsr */ 
> +  I386_YMM0H_REGNUM,		/* %ymm0h */
> +  I386_YMM7H_REGNUM = I386_YMM0H_REGNUM + 7
>  };
>  
>  /* Register numbers of RECORD_REGMAP.  */
> @@ -265,6 +293,7 @@ enum record_i386_regnum
>  #define I386_NUM_XREGS  9
>  
>  #define I386_SSE_NUM_REGS	(I386_MXCSR_REGNUM + 1)
> +#define I386_AVX_NUM_REGS	(I386_YMM7H_REGNUM + 1)
>  
>  /* Size of the largest register.  */
>  #define I386_MAX_REGISTER_SIZE	16
> @@ -276,6 +305,9 @@ extern struct type *i387_ext_type (struct gdbarch *gdbarch);
>  extern int i386_byte_regnum_p (struct gdbarch *gdbarch, int regnum);
>  extern int i386_word_regnum_p (struct gdbarch *gdbarch, int regnum);
>  extern int i386_dword_regnum_p (struct gdbarch *gdbarch, int regnum);
> +extern int i386_xmm_regnum_p (struct gdbarch *gdbarch, int regnum);
> +extern int i386_ymm_regnum_p (struct gdbarch *gdbarch, int regnum);
> +extern int i386_ymmh_regnum_p (struct gdbarch *gdbarch, int regnum);
>  
>  extern const char *i386_pseudo_register_name (struct gdbarch *gdbarch,
>  					      int regnum);
> 


* Re: PATCH: 3/6 [3rd try]: Add AVX support (i386 changes)
  2010-04-02 14:42             ` Mark Kettenis
@ 2010-04-02 15:28               ` H.J. Lu
  2010-04-07 10:13                 ` Mark Kettenis
  0 siblings, 1 reply; 115+ messages in thread
From: H.J. Lu @ 2010-04-02 15:28 UTC (permalink / raw)
  To: Mark Kettenis; +Cc: gdb-patches

On Fri, Apr 2, 2010 at 7:41 AM, Mark Kettenis <mark.kettenis@xs4all.nl> wrote:
>> Date: Fri, 2 Apr 2010 07:31:07 -0700
>> From: "H.J. Lu" <hongjiu.lu@intel.com>
>>
>> On Sun, Mar 28, 2010 at 06:11:24PM -0700, H.J. Lu wrote:
>> > Hi,
>> >
>> > Here are i386 changes to support AVX. OK to install?
>> >
>>
>> Here is the updated i386 changes to support AVX. OK to install?
>
> Sorry, but I'm still unhappy with the way you modify the
> i386_linux_regset_sections[] array at run time.  I think the best
> thing to do is to have gcore *always* create a NT_X86_XSTATE note of

Generating an NT_X86_XSTATE note without kernel/processor
NT_X86_XSTATE support may require changes to the
existing FXSAVE code path.  I will investigate it.

BTW, I have a follow-up patch to implement 32-bit core
registers without SSE registers to properly support older
processors, like Pentium and Pentium Pro.  Should
"gcore" generate an NT_PRXFPREG note?

> the maximum size supported by GDB.  That way you can remove a lot of
> code (including the duplication of code in i387_collect_xsave).
>

XSAVE differs from FXSAVE in some subtle ways, although the
XSAVE memory layout is an extension of the FXSAVE memory layout.
XSAVE tracks used/initialized states for the SSE and AVX registers.
Most of the code in i387_collect_xsave deals with used/initialized states.

Please identify the duplication of code in i387_collect_xsave. I will take
a look.

Thanks.

-- 
H.J.


* Re: PATCH: 6/6 [3rd try]: Add AVX support (gdbserver changes)
  2010-03-30 16:48               ` H.J. Lu
@ 2010-04-02 17:39                 ` Daniel Jacobowitz
  2010-04-07  4:37                   ` H.J. Lu
  2010-04-03 21:57                 ` Jan Kratochvil
  2010-04-07 16:59                 ` H.J. Lu
  2 siblings, 1 reply; 115+ messages in thread
From: Daniel Jacobowitz @ 2010-04-02 17:39 UTC (permalink / raw)
  To: H.J. Lu; +Cc: GDB

On Tue, Mar 30, 2010 at 09:48:33AM -0700, H.J. Lu wrote:
> OK to install?

Pretty much OK.

> +#ifdef __x86_64__
> +  if (num_xmm_registers == 8)
> +    init_registers_i386_linux ();
> +  else
> +    init_registers_amd64_linux ();
> +#else
> +  init_registers_i386_linux ();
> +#endif

...

> +  /* Update gdbserver_xmltarget with XML support.  */
> +#ifdef __x86_64__
> +  if (num_xmm_registers == 8)
> +    gdbserver_xmltarget = "i386-linux.xml";
> +  else
> +    gdbserver_xmltarget = "amd64-linux.xml";
> +#else
> +  gdbserver_xmltarget = "i386-linux.xml";
> +#endif

Isn't the second block redundant with the first block?

> +/* Process qSupported query, "xmlRegisters=".  Update the buffer size for
> +   PTRACE_GETREGSET.  */
> +
> +static void
> +x86_linux_process_qsupported (const char *query)
> +{
> +  /* Return if gdb doesn't support XML.  If gdb sends "xmlRegisters="
> +     in qSupported query, it supports x86 XML target descriptions.  */
> +  use_xml = query != NULL && strncmp (query, "xmlRegisters=", 13) == 0;
> +
> +  x86_linux_update_xmltarget ();
> +}

Presumably, the protocol-wise correct thing to do would be to
search for "xmlRegisters=" that had an element "x86".

Otherwise OK.

-- 
Daniel Jacobowitz
CodeSourcery


* Re: PATCH: 6/6 [3rd try]: Add AVX support (gdbserver changes)
  2010-03-30 16:48               ` H.J. Lu
  2010-04-02 17:39                 ` Daniel Jacobowitz
@ 2010-04-03 21:57                 ` Jan Kratochvil
  2010-04-07  4:12                   ` H.J. Lu
  2010-04-07 16:59                 ` H.J. Lu
  2 siblings, 1 reply; 115+ messages in thread
From: Jan Kratochvil @ 2010-04-03 21:57 UTC (permalink / raw)
  To: H.J. Lu; +Cc: GDB

On Tue, 30 Mar 2010 18:48:33 +0200, H.J. Lu wrote:
> --- a/gdb/gdbserver/linux-ppc-low.c
> +++ b/gdb/gdbserver/linux-ppc-low.c
> @@ -593,14 +593,14 @@ struct regset_info target_regsets[] = {
>       fetch them every time, but still fall back to PTRACE_PEEKUSER for the
>       general registers.  Some kernels support these, but not the newer
>       PPC_PTRACE_GETREGS.  */
> -  { PTRACE_GETVSXREGS, PTRACE_SETVSXREGS, SIZEOF_VSXREGS, EXTENDED_REGS,
> +  { PTRACE_GETVSXREGS, PTRACE_SETVSXREGS, 0, SIZEOF_VSXREGS, EXTENDED_REGS,
>    ppc_fill_vsxregset, ppc_store_vsxregset },
>    { PTRACE_GETVRREGS, PTRACE_SETVRREGS, SIZEOF_VRREGS, EXTENDED_REGS,
                                          ^ missing "0, "

linux-ppc-low.c:599: error: incompatible types when initializing type ‘enum regset_type’ using type ‘void (*)(struct regcache *, void *)’
linux-ppc-low.c:599: warning: initialization from incompatible pointer type

>      ppc_fill_vrregset, ppc_store_vrregset },
> -  { PTRACE_GETEVRREGS, PTRACE_SETEVRREGS, 32 * 4 + 8 + 4, EXTENDED_REGS,
> +  { PTRACE_GETEVRREGS, PTRACE_SETEVRREGS, 0, 32 * 4 + 8 + 4, EXTENDED_REGS,
>      ppc_fill_evrregset, ppc_store_evrregset },
> -  { 0, 0, 0, GENERAL_REGS, ppc_fill_gregset, NULL },
> -  { 0, 0, -1, -1, NULL, NULL }
> +  { 0, 0, 0, 0, GENERAL_REGS, ppc_fill_gregset, NULL },
> +  { 0, 0, 0, -1, -1, NULL, NULL }
>  };
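
The compile error above is what a positional initializer produces when a new field is added in the middle of a struct and one entry is not updated: every later value shifts one field to the left, so the fill function ends up in the `enum regset_type` slot. A minimal sketch of the effect follows. The struct and its field names are hypothetical stand-ins (only the field order mirrors the patch), and the numeric values are arbitrary placeholders.

```c
#include <stddef.h>

/* Hypothetical stand-in for gdbserver's struct regset_info.  The field
   names are assumptions; only the positional order mirrors the patch.  */
enum regset_type { GENERAL_REGS, FP_REGS, EXTENDED_REGS };

struct regset_info_sketch
{
  int get_request;
  int set_request;
  int nt_type;            /* The new third field added by the patch.  */
  int size;
  enum regset_type type;
};

/* Positional form: every entry must gain the extra "0," or the values
   that follow shift one field to the left.  */
static const struct regset_info_sketch positional =
  { 100, 101, 0, 64, EXTENDED_REGS };

/* Designated form (C99): unaffected by fields inserted later, because
   unnamed fields are zero-initialized.  */
static const struct regset_info_sketch designated =
  { .get_request = 100, .set_request = 101, .size = 64,
    .type = EXTENDED_REGS };
```

Designated initializers need C99, so they may not be an option for code that must build as C90; in that case each table entry has to be updated by hand, as in the corrected hunk.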


Thanks,
Jan


* Re: PATCH: 6/6 [3rd try]: Add AVX support (gdbserver changes)
  2010-04-03 21:57                 ` Jan Kratochvil
@ 2010-04-07  4:12                   ` H.J. Lu
  0 siblings, 0 replies; 115+ messages in thread
From: H.J. Lu @ 2010-04-07  4:12 UTC (permalink / raw)
  To: Jan Kratochvil; +Cc: GDB

2010/4/3 Jan Kratochvil <jan.kratochvil@redhat.com>:
> On Tue, 30 Mar 2010 18:48:33 +0200, H.J. Lu wrote:
>> --- a/gdb/gdbserver/linux-ppc-low.c
>> +++ b/gdb/gdbserver/linux-ppc-low.c
>> @@ -593,14 +593,14 @@ struct regset_info target_regsets[] = {
>>       fetch them every time, but still fall back to PTRACE_PEEKUSER for the
>>       general registers.  Some kernels support these, but not the newer
>>       PPC_PTRACE_GETREGS.  */
>> -  { PTRACE_GETVSXREGS, PTRACE_SETVSXREGS, SIZEOF_VSXREGS, EXTENDED_REGS,
>> +  { PTRACE_GETVSXREGS, PTRACE_SETVSXREGS, 0, SIZEOF_VSXREGS, EXTENDED_REGS,
>>    ppc_fill_vsxregset, ppc_store_vsxregset },
>>    { PTRACE_GETVRREGS, PTRACE_SETVRREGS, SIZEOF_VRREGS, EXTENDED_REGS,
>                                          ^ missing "0, "
>
> linux-ppc-low.c:599: error: incompatible types when initializing type ‘enum regset_type’ using type ‘void (*)(struct regcache *, void *)’
> linux-ppc-low.c:599: warning: initialization from incompatible pointer type
>

I will fix it.

Thanks.


-- 
H.J.


* Re: PATCH: 6/6 [3rd try]: Add AVX support (gdbserver changes)
  2010-04-02 17:39                 ` Daniel Jacobowitz
@ 2010-04-07  4:37                   ` H.J. Lu
  0 siblings, 0 replies; 115+ messages in thread
From: H.J. Lu @ 2010-04-07  4:37 UTC (permalink / raw)
  To: H.J. Lu, GDB

On Fri, Apr 2, 2010 at 10:39 AM, Daniel Jacobowitz <dan@codesourcery.com> wrote:
> On Tue, Mar 30, 2010 at 09:48:33AM -0700, H.J. Lu wrote:
>> OK to install?
>
> Pretty much OK.
>
>> +#ifdef __x86_64__
>> +  if (num_xmm_registers == 8)
>> +    init_registers_i386_linux ();
>> +  else
>> +    init_registers_amd64_linux ();
>> +#else
>> +  init_registers_i386_linux ();
>> +#endif
>
> ...
>
>> +  /* Update gdbserver_xmltarget with XML support.  */
>> +#ifdef __x86_64__
>> +  if (num_xmm_registers == 8)
>> +    gdbserver_xmltarget = "i386-linux.xml";
>> +  else
>> +    gdbserver_xmltarget = "amd64-linux.xml";
>> +#else
>> +  gdbserver_xmltarget = "i386-linux.xml";
>> +#endif
>
> Isn't the second block redundant with the first block?

You are right. I will remove it.

>> +/* Process qSupported query, "xmlRegisters=".  Update the buffer size for
>> +   PTRACE_GETREGSET.  */
>> +
>> +static void
>> +x86_linux_process_qsupported (const char *query)
>> +{
>> +  /* Return if gdb doesn't support XML.  If gdb sends "xmlRegisters="
>> +     in qSupported query, it supports x86 XML target descriptions.  */
>> +  use_xml = query != NULL && strncmp (query, "xmlRegisters=", 13) == 0;
>> +
>> +  x86_linux_update_xmltarget ();
>> +}
>
> Presumably, the protocol-wise correct thing to do would be to
> search for "xmlRegisters=" that had an element "x86".

I will update it to check for "i386", which is what x86 gdb sends.
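
A minimal sketch of such a check, assuming the function receives a single qSupported feature of the form "xmlRegisters=a,b,c" with a comma-separated value list. The helper name `xml_registers_has_arch` is hypothetical and is not part of the patch.

```c
#include <string.h>

/* Hypothetical helper: return 1 if QUERY is an "xmlRegisters=" feature
   whose comma-separated value list contains ARCH as a whole element,
   and 0 otherwise.  */
static int
xml_registers_has_arch (const char *query, const char *arch)
{
  static const char prefix[] = "xmlRegisters=";
  size_t arch_len = strlen (arch);
  const char *p;

  if (query == NULL || strncmp (query, prefix, sizeof (prefix) - 1) != 0)
    return 0;

  p = query + sizeof (prefix) - 1;
  while (*p != '\0')
    {
      /* Length of the current element, up to the next comma or the end.  */
      size_t len = strcspn (p, ",");

      if (len == arch_len && strncmp (p, arch, len) == 0)
        return 1;

      p += len;
      if (*p == ',')
        p++;
    }

  return 0;
}
```

Comparing whole elements rather than using strstr avoids matching a longer name that merely contains "i386" as a substring.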

>
> Otherwise OK.

Thanks.

-- 
H.J.


* Re: PATCH: 3/6 [3rd try]: Add AVX support (i386 changes)
  2010-04-02 15:28               ` H.J. Lu
@ 2010-04-07 10:13                 ` Mark Kettenis
  2010-04-07 14:56                   ` H.J. Lu
  0 siblings, 1 reply; 115+ messages in thread
From: Mark Kettenis @ 2010-04-07 10:13 UTC (permalink / raw)
  To: hjl.tools; +Cc: gdb-patches


> Date: Fri, 2 Apr 2010 08:27:55 -0700
> From: "H.J. Lu" <hjl.tools@gmail.com>
> 
> On Fri, Apr 2, 2010 at 7:41 AM, Mark Kettenis <mark.kettenis@xs4all.nl> wrote:
> >> Date: Fri, 2 Apr 2010 07:31:07 -0700
> >> From: "H.J. Lu" <hongjiu.lu@intel.com>
> >>
> >> On Sun, Mar 28, 2010 at 06:11:24PM -0700, H.J. Lu wrote:
> >> > Hi,
> >> >
> >> > Here are i386 changes to support AVX. OK to install?
> >> >
> >>
> >> Here is the updated i386 changes to support AVX. OK to install?
> >
> > Sorry, but I'm still unhappy with the way you modify the
> > i386_linux_regset_sections[] array at run time.  I think the best
> > thing to do is to have gcore *always* create a NT_X86_XSTATE note of
> 
> Generate NT_X86_XSTATE note without kernel/processor
> NT_X86_XSTATE note support may require changes to
> existing FXSAVE code path.  I will investigate it.
> 
> BTW, I have a follow up patch to implement 32bit core
> registers without SSE registers to properly support older
> processors, like Pentium and Pentium Pro.  Should
> "gcore" generate NT_PRXFPREG note?

Probably.  It'll surely make the code simpler.

> 
> > the maximum size supported by GDB.  That way you can remove a lot of
> > code (including the duplication of code in i387_collect_xsave).
> >
> 
> XSAVE is different from FXSAVE in some subtle ways, although
> XSAVE memory layout is an extension to FXSAVE memory layout.
> XSAVE has used or initialized states for SSE and AVX registers.
> Most of the codes in i387_collect_xsave deal with used/initialized states.
> 
> Please identify the duplication of code in i387_collect_xsave. I will take
> a look.

There is an if (gcore) { } else { } in there that has quite a bit of
duplicated code.  I may be missing something, but the fact that
i387_collect_xsave() does different things depending on whether it is
generating a core file seems undesirable and wrong to me.


* Re: PATCH: 3/6 [3rd try]: Add AVX support (i386 changes)
  2010-04-07 10:13                 ` Mark Kettenis
@ 2010-04-07 14:56                   ` H.J. Lu
  2010-04-07 15:04                     ` H.J. Lu
  0 siblings, 1 reply; 115+ messages in thread
From: H.J. Lu @ 2010-04-07 14:56 UTC (permalink / raw)
  To: Mark Kettenis; +Cc: gdb-patches

On Wed, Apr 7, 2010 at 3:13 AM, Mark Kettenis <mark.kettenis@xs4all.nl> wrote:
>>
>> XSAVE is different from FXSAVE in some subtle ways, although
>> XSAVE memory layout is an extension to FXSAVE memory layout.
>> XSAVE has used or initialized states for SSE and AVX registers.
>> Most of the codes in i387_collect_xsave deal with used/initialized states.
>>
>> Please identify the duplication of code in i387_collect_xsave. I will take
>> a look.
>
> There is in if (gcore) { } else { } there, that has quite a bit of
> duplicated code.  I may be missing something, but the fact that
> i387_collect_xsave() does different things whether it is generating a
> core file or not seems to be undesirable and wrong to me.
>

I will take a look.

Thanks.

-- 
H.J.


* Re: PATCH: 3/6 [3rd try]: Add AVX support (i386 changes)
  2010-04-07 14:56                   ` H.J. Lu
@ 2010-04-07 15:04                     ` H.J. Lu
  2010-04-07 15:19                       ` Mark Kettenis
  0 siblings, 1 reply; 115+ messages in thread
From: H.J. Lu @ 2010-04-07 15:04 UTC (permalink / raw)
  To: Mark Kettenis; +Cc: gdb-patches

On Wed, Apr 7, 2010 at 7:55 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Wed, Apr 7, 2010 at 3:13 AM, Mark Kettenis <mark.kettenis@xs4all.nl> wrote:
>>>
>>> XSAVE is different from FXSAVE in some subtle ways, although
>>> XSAVE memory layout is an extension to FXSAVE memory layout.
>>> XSAVE has used or initialized states for SSE and AVX registers.
>>> Most of the codes in i387_collect_xsave deal with used/initialized states.
>>>
>>> Please identify the duplication of code in i387_collect_xsave. I will take
>>> a look.
>>
>> There is in if (gcore) { } else { } there, that has quite a bit of
>> duplicated code.  I may be missing something, but the fact that
>> i387_collect_xsave() does different things whether it is generating a
>> core file or not seems to be undesirable and wrong to me.
>>
>
> I will take a look.
>

That is the xstate_bv optimization. xstate_bv controls how vector registers
are restored via xsave. If we will eventually call xsave, we set up xstate_bv
so that xsave does the minimum work needed to properly restore vector registers.
For gcore, we don't need to optimize xstate_bv. Of course, we can
always optimize xstate_bv even for gcore. I can do that if it is desired.
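
A rough sketch of the idea, not the code in i387_collect_xsave: a feature bit in xstate_bv only needs to be set when the corresponding registers hold non-zero data, because a cleared bit means those registers are read as 0. The helper name `minimal_xstate_bv` and its parameters are hypothetical, and the sketch ignores MXCSR and leaves the x87 bit as it was.

```c
#include <stddef.h>
#include <stdint.h>

/* XCR0 / xstate_bv feature bits (architectural bit positions).  */
#define XSTATE_X87 (1ULL << 0)
#define XSTATE_SSE (1ULL << 1)
#define XSTATE_AVX (1ULL << 2)

/* Return 1 if all LEN bytes at BUF are zero.  */
static int
all_zero (const unsigned char *buf, size_t len)
{
  size_t i;

  for (i = 0; i < len; i++)
    if (buf[i] != 0)
      return 0;
  return 1;
}

/* Hypothetical helper: compute an xstate_bv that marks the SSE and AVX
   components as in use only when they are enabled in XCR0 and hold
   non-zero data.  The x87 bit is carried over from OLD_BV unchanged.  */
static uint64_t
minimal_xstate_bv (uint64_t old_bv, uint64_t xcr0,
                   const unsigned char *xmm, size_t xmm_len,
                   const unsigned char *ymmh, size_t ymmh_len)
{
  uint64_t bv = old_bv & XSTATE_X87;

  if ((xcr0 & XSTATE_SSE) != 0 && !all_zero (xmm, xmm_len))
    bv |= XSTATE_SSE;
  if ((xcr0 & XSTATE_AVX) != 0 && !all_zero (ymmh, ymmh_len))
    bv |= XSTATE_AVX;

  return bv;
}
```

Setting every supported bit unconditionally is also correct, just not minimal, which is why the distinction matters less for a core file than for state that is written back to a running process.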

Thanks.

-- 
H.J.


* Re: PATCH: 3/6 [3rd try]: Add AVX support (i386 changes)
  2010-04-07 15:04                     ` H.J. Lu
@ 2010-04-07 15:19                       ` Mark Kettenis
  0 siblings, 0 replies; 115+ messages in thread
From: Mark Kettenis @ 2010-04-07 15:19 UTC (permalink / raw)
  To: hjl.tools; +Cc: gdb-patches


> Date: Wed, 7 Apr 2010 08:04:17 -0700
> From: "H.J. Lu" <hjl.tools@gmail.com>
> 
> On Wed, Apr 7, 2010 at 7:55 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> > On Wed, Apr 7, 2010 at 3:13 AM, Mark Kettenis <mark.kettenis@xs4all.nl> wrote:
> >>>
> >>> XSAVE is different from FXSAVE in some subtle ways, although
> >>> XSAVE memory layout is an extension to FXSAVE memory layout.
> >>> XSAVE has used or initialized states for SSE and AVX registers.
> >>> Most of the codes in i387_collect_xsave deal with used/initialized states.
> >>>
> >>> Please identify the duplication of code in i387_collect_xsave. I will take
> >>> a look.
> >>
> >> There is in if (gcore) { } else { } there, that has quite a bit of
> >> duplicated code.  I may be missing something, but the fact that
> >> i387_collect_xsave() does different things whether it is generating a
> >> core file or not seems to be undesirable and wrong to me.
> >>
> >
> > I will take a look.
> >
> 
> That is xstate_bv optimization. xstate_bv controls how vector registers
> are restored via xsave. If we will eventually call xsave, we set up xstate_bv
> such that xsave will do minimum work to properly restore vector registers.
> If it is for gcore, we don't need to optimize xstate_bv. Of course, we can
> always optimize xstate_bv even for gcore. I can do that if it is desired.

Please do; for generating core files, you should just call
i387_collect_xsave(..., -1, ...).


* Re: PATCH: 4/6 [3rd try]: Add AVX support (amd64 changes)
  2010-04-02 14:32             ` H.J. Lu
@ 2010-04-07 16:54               ` H.J. Lu
  0 siblings, 0 replies; 115+ messages in thread
From: H.J. Lu @ 2010-04-07 16:54 UTC (permalink / raw)
  To: GDB

On Fri, Apr 02, 2010 at 07:32:08AM -0700, H.J. Lu wrote:
> On Sun, Mar 28, 2010 at 06:07:34PM -0700, H.J. Lu wrote:
> > Here are the amd64 changes to support AVX with AVX testcases. I
> > also need to import cpuid.h from gcc 4.4 since AVX testcases need
> > ECX from cpuid.  OK to install?
> > 
> > 
> 
Here are the updated amd64 changes for AVX.  OK to install?

Thanks.


H.J.
---
gdb/

2010-04-07  H.J. Lu  <hongjiu.lu@intel.com>

	* amd64-linux-nat.c: Include "regset.h", "elf/common.h",
	<sys/uio.h> and "i386-xstate.h".
	(PTRACE_GETREGSET): New.
	(PTRACE_SETREGSET): Likewise.
	(have_ptrace_getregset): Likewise.
	(amd64_linux_gregset64_reg_offset): Include 16 upper YMM
	registers.
	(amd64_linux_gregset32_reg_offset): Include 8 upper YMM
	registers.
	(amd64_linux_fetch_inferior_registers): Support PTRACE_GETREGSET.
	(amd64_linux_store_inferior_registers): Likewise.
	(amd64_linux_read_description): Check and enable AVX target
	descriptions.

	* amd64-linux-tdep.c: Include "regset.h", "i386-linux-tdep.h"
	and "features/i386/amd64-avx-linux.c".
	(amd64_linux_regset_sections): New.
	(amd64_linux_core_read_description): Check and enable AVX
	target description.
	(amd64_linux_init_abi): Set xsave_xcr0_offset.  Call
	set_gdbarch_core_regset_sections.
	(_initialize_amd64_linux_tdep): Call
	initialize_tdesc_amd64_avx_linux.

	* amd64-linux-tdep.h (AMD64_LINUX_ORIG_RAX_REGNUM): Replace
	AMD64_MXCSR_REGNUM with AMD64_YMM15H_REGNUM.
	(tdesc_amd64_avx_linux): New.
	(amd64_linux_update_xstateregset): Likewise.

	* amd64-tdep.c: Include "features/i386/amd64-avx.c".
	(amd64_ymm_names): New.
	(amd64_ymmh_names): Likewise.
	(amd64_register_name): Likewise.
	(amd64_supply_xstateregset): Likewise.
	(amd64_collect_xstateregset): Likewise.
	(amd64_supply_xsave): Likewise.
	(amd64_collect_xsave): Likewise.
	(AMD64_NUM_REGS): Removed.
	(amd64_dwarf_reg_to_regnum): Return %ymmN register number for
	%xmmN if AVX is available.
	(amd64_pseudo_register_name): Support pseudo YMM registers.
	(amd64_regset_from_core_section): Support .reg-xstate section.
	(amd64_init_abi): Set ymmh_register_names, num_ymm_regs
	and ymm0h_regnum.  Call set_gdbarch_register_name.
	(amd64_init_abi): Call initialize_tdesc_amd64_avx.

	* amd64-tdep.h (amd64_regnum): Add AMD64_YMM0H_REGNUM and
	AMD64_YMM15H_REGNUM.
	(AMD64_NUM_REGS): New.
	(amd64_supply_xsave): Likewise.
	(amd64_collect_xsave): Likewise.
	(amd64_register_name): Removed.
	(amd64_register_type): Likewise.

gdb/testsuite/

2010-04-02  H.J. Lu  <hongjiu.lu@intel.com>

	* gdb.arch/i386-avx.c: New.
	* gdb.arch/i386-avx.exp: Likewise.

	* gdb.arch/i386-cpuid.h: Updated from gcc 4.4.

diff --git a/gdb/amd64-linux-nat.c b/gdb/amd64-linux-nat.c
index b9d5833..9812610 100644
--- a/gdb/amd64-linux-nat.c
+++ b/gdb/amd64-linux-nat.c
@@ -23,11 +23,14 @@
 #include "inferior.h"
 #include "gdbcore.h"
 #include "regcache.h"
+#include "regset.h"
 #include "linux-nat.h"
 #include "amd64-linux-tdep.h"
 
 #include "gdb_assert.h"
 #include "gdb_string.h"
+#include "elf/common.h"
+#include <sys/uio.h>
 #include <sys/ptrace.h>
 #include <sys/debugreg.h>
 #include <sys/syscall.h>
@@ -51,6 +54,18 @@
 #include "i386-linux-tdep.h"
 #include "amd64-nat.h"
 #include "i386-nat.h"
+#include "i386-xstate.h"
+
+#ifndef PTRACE_GETREGSET
+#define PTRACE_GETREGSET	0x4204
+#endif
+
+#ifndef PTRACE_SETREGSET
+#define PTRACE_SETREGSET	0x4205
+#endif
+
+/* Does the current host support PTRACE_GETREGSET?  */
+static int have_ptrace_getregset = -1;
 
 /* Mapping between the general-purpose registers in GNU/Linux x86-64
    `struct user' format and GDB's register cache layout.  */
@@ -73,6 +88,8 @@ static int amd64_linux_gregset64_reg_offset[] =
   -1, -1, -1, -1, -1, -1, -1, -1,
   -1, -1, -1, -1, -1, -1, -1, -1,
   -1, -1, -1, -1, -1, -1, -1, -1, -1,
+  -1, -1, -1, -1, -1, -1, -1, -1,
+  -1, -1, -1, -1, -1, -1, -1, -1,
   ORIG_RAX * 8
 };
 \f
@@ -99,6 +116,7 @@ static int amd64_linux_gregset32_reg_offset[] =
   -1, -1, -1, -1, -1, -1, -1, -1,
   -1, -1, -1, -1, -1, -1, -1, -1,
   -1, -1, -1, -1, -1, -1, -1, -1, -1,
+  -1, -1, -1, -1, -1, -1, -1, -1,
   ORIG_RAX * 8			/* "orig_eax" */
 };
 \f
@@ -183,10 +201,26 @@ amd64_linux_fetch_inferior_registers (struct target_ops *ops,
     {
       elf_fpregset_t fpregs;
 
-      if (ptrace (PTRACE_GETFPREGS, tid, 0, (long) &fpregs) < 0)
-	perror_with_name (_("Couldn't get floating point status"));
+      if (have_ptrace_getregset)
+	{
+	  char xstateregs[I386_XSTATE_MAX_SIZE];
+	  struct iovec iov;
+
+	  iov.iov_base = xstateregs;
+	  iov.iov_len = sizeof (xstateregs);
+	  if (ptrace (PTRACE_GETREGSET, tid,
+		      (unsigned int) NT_X86_XSTATE, (long) &iov) < 0)
+	    perror_with_name (_("Couldn't get extended state status"));
 
-      amd64_supply_fxsave (regcache, -1, &fpregs);
+	  amd64_supply_xsave (regcache, -1, xstateregs);
+	}
+      else
+	{
+	  if (ptrace (PTRACE_GETFPREGS, tid, 0, (long) &fpregs) < 0)
+	    perror_with_name (_("Couldn't get floating point status"));
+
+	  amd64_supply_fxsave (regcache, -1, &fpregs);
+	}
     }
 }
 
@@ -226,15 +260,33 @@ amd64_linux_store_inferior_registers (struct target_ops *ops,
     {
       elf_fpregset_t fpregs;
 
-      if (ptrace (PTRACE_GETFPREGS, tid, 0, (long) &fpregs) < 0)
-	perror_with_name (_("Couldn't get floating point status"));
+      if (have_ptrace_getregset)
+	{
+	  char xstateregs[I386_XSTATE_MAX_SIZE];
+	  struct iovec iov;
+
+	  iov.iov_base = xstateregs;
+	  iov.iov_len = sizeof (xstateregs);
+	  if (ptrace (PTRACE_GETREGSET, tid,
+		      (unsigned int) NT_X86_XSTATE, (long) &iov) < 0)
+	    perror_with_name (_("Couldn't get extended state status"));
 
-      amd64_collect_fxsave (regcache, regnum, &fpregs);
+	  amd64_collect_xsave (regcache, regnum, xstateregs, 0);
+
+	  if (ptrace (PTRACE_SETREGSET, tid,
+		      (unsigned int) NT_X86_XSTATE, (long) &iov) < 0)
+	    perror_with_name (_("Couldn't write extended state status"));
+	}
+      else
+	{
+	  if (ptrace (PTRACE_GETFPREGS, tid, 0, (long) &fpregs) < 0)
+	    perror_with_name (_("Couldn't get floating point status"));
 
-      if (ptrace (PTRACE_SETFPREGS, tid, 0, (long) &fpregs) < 0)
-	perror_with_name (_("Couldn't write floating point status"));
+	  amd64_collect_fxsave (regcache, regnum, &fpregs);
 
-      return;
+	  if (ptrace (PTRACE_SETFPREGS, tid, 0, (long) &fpregs) < 0)
+	    perror_with_name (_("Couldn't write floating point status"));
+	}
     }
 }
 \f
@@ -688,6 +740,8 @@ amd64_linux_read_description (struct target_ops *ops)
 {
   unsigned long cs;
   int tid;
+  int is_64bit;
+  static uint64_t xcr0;
 
   /* GNU/Linux LWP ID's are process ID's.  */
   tid = TIDGET (inferior_ptid);
@@ -701,10 +755,46 @@ amd64_linux_read_description (struct target_ops *ops)
   if (errno != 0)
     perror_with_name (_("Couldn't get CS register"));
 
-  if (cs == AMD64_LINUX_USER64_CS)
-    return tdesc_amd64_linux;
+  is_64bit = cs == AMD64_LINUX_USER64_CS;
+
+  if (have_ptrace_getregset == -1)
+    {
+      uint64_t xstateregs[(I386_XSTATE_SSE_SIZE / sizeof (uint64_t))];
+      struct iovec iov;
+
+      iov.iov_base = xstateregs;
+      iov.iov_len = sizeof (xstateregs);
+
+      /* Check if PTRACE_GETREGSET works.  */
+      if (ptrace (PTRACE_GETREGSET, tid,
+		  (unsigned int) NT_X86_XSTATE, (long) &iov) < 0)
+	have_ptrace_getregset = 0;
+      else
+	{
+	  have_ptrace_getregset = 1;
+
+	  /* Get XCR0 from XSAVE extended state.  */
+	  xcr0 = xstateregs[(I386_LINUX_XSAVE_XCR0_OFFSET
+			     / sizeof (uint64_t))];
+	}
+    }
+
+  /* Check the native XCR0 only if PTRACE_GETREGSET is available.  */
+  if (have_ptrace_getregset
+      && (xcr0 & I386_XSTATE_AVX_MASK) == I386_XSTATE_AVX_MASK)
+    {
+      if (is_64bit)
+	return tdesc_amd64_avx_linux;
+      else
+	return tdesc_i386_avx_linux;
+    }
   else
-    return tdesc_i386_linux;
+    {
+      if (is_64bit)
+	return tdesc_amd64_linux;
+      else
+	return tdesc_i386_linux;
+    }
 }
 
 /* Provide a prototype to silence -Wmissing-prototypes.  */
diff --git a/gdb/amd64-linux-tdep.c b/gdb/amd64-linux-tdep.c
index 4ad6dc9..1205e31 100644
--- a/gdb/amd64-linux-tdep.c
+++ b/gdb/amd64-linux-tdep.c
@@ -28,8 +28,11 @@
 #include "symtab.h"
 #include "gdbtypes.h"
 #include "reggroups.h"
+#include "regset.h"
 #include "amd64-linux-tdep.h"
+#include "i386-linux-tdep.h"
 #include "linux-tdep.h"
+#include "i386-xstate.h"
 
 #include "gdb_string.h"
 
@@ -38,6 +41,7 @@
 #include "xml-syscall.h"
 
 #include "features/i386/amd64-linux.c"
+#include "features/i386/amd64-avx-linux.c"
 
 /* The syscall's XML filename for i386.  */
 #define XML_SYSCALL_FILENAME_AMD64 "syscalls/amd64-linux.xml"
@@ -45,6 +49,15 @@
 #include "record.h"
 #include "linux-record.h"
 
+/* Supported register note sections.  */
+static struct core_regset_section amd64_linux_regset_sections[] =
+{
+  { ".reg", 144, "general-purpose" },
+  { ".reg2", 512, "floating-point" },
+  { ".reg-xstate", I386_XSTATE_MAX_SIZE, "XSAVE extended state" },
+  { NULL, 0 }
+};
+
 /* Mapping between the general-purpose registers in `struct user'
    format and GDB's register cache layout.  */
 
@@ -1250,12 +1263,17 @@ amd64_linux_core_read_description (struct gdbarch *gdbarch,
 				  bfd *abfd)
 {
   asection *section = bfd_get_section_by_name (abfd, ".reg2");
+  uint64_t xcr0;
 
   if (section == NULL)
     return NULL;
 
   /* Linux/x86-64.  */
-  return tdesc_amd64_linux;
+  xcr0 = i386_linux_core_read_xcr0 (gdbarch, target, abfd);
+  if ((xcr0 & I386_XSTATE_AVX_MASK) == I386_XSTATE_AVX_MASK)
+    return tdesc_amd64_avx_linux;
+  else
+    return tdesc_amd64_linux;
 }
 
 static void
@@ -1297,6 +1315,8 @@ amd64_linux_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch)
   tdep->sc_reg_offset = amd64_linux_sc_reg_offset;
   tdep->sc_num_regs = ARRAY_SIZE (amd64_linux_sc_reg_offset);
 
+  tdep->xsave_xcr0_offset = I386_LINUX_XSAVE_XCR0_OFFSET;
+
   /* GNU/Linux uses SVR4-style shared libraries.  */
   set_solib_svr4_fetch_link_map_offsets
     (gdbarch, svr4_lp64_fetch_link_map_offsets);
@@ -1318,6 +1338,9 @@ amd64_linux_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch)
   /* GNU/Linux uses SVR4-style shared libraries.  */
   set_gdbarch_skip_trampoline_code (gdbarch, find_solib_trampoline_target);
 
+  /* Install supported register note sections.  */
+  set_gdbarch_core_regset_sections (gdbarch, amd64_linux_regset_sections);
+
   set_gdbarch_core_read_description (gdbarch,
 				     amd64_linux_core_read_description);
 
@@ -1517,4 +1540,5 @@ _initialize_amd64_linux_tdep (void)
 
   /* Initialize the Linux target description  */
   initialize_tdesc_amd64_linux ();
+  initialize_tdesc_amd64_avx_linux ();
 }
diff --git a/gdb/amd64-linux-tdep.h b/gdb/amd64-linux-tdep.h
index 33316fb..b99872b 100644
--- a/gdb/amd64-linux-tdep.h
+++ b/gdb/amd64-linux-tdep.h
@@ -26,13 +26,14 @@
 /* Register number for the "orig_rax" register.  If this register
    contains a value >= 0 it is interpreted as the system call number
    that the kernel is supposed to restart.  */
-#define AMD64_LINUX_ORIG_RAX_REGNUM (AMD64_MXCSR_REGNUM + 1)
+#define AMD64_LINUX_ORIG_RAX_REGNUM (AMD64_YMM15H_REGNUM + 1)
 
 /* Total number of registers for GNU/Linux.  */
 #define AMD64_LINUX_NUM_REGS (AMD64_LINUX_ORIG_RAX_REGNUM + 1)
 
 /* Linux target description.  */
 extern struct target_desc *tdesc_amd64_linux;
+extern struct target_desc *tdesc_amd64_avx_linux;
 
 /* Enum that defines the syscall identifiers for amd64 linux.
    Used for process record/replay, these will be translated into
diff --git a/gdb/amd64-tdep.c b/gdb/amd64-tdep.c
index acab4ac..1aa49b9 100644
--- a/gdb/amd64-tdep.c
+++ b/gdb/amd64-tdep.c
@@ -43,6 +43,7 @@
 #include "i387-tdep.h"
 
 #include "features/i386/amd64.c"
+#include "features/i386/amd64-avx.c"
 
 /* Note that the AMD64 architecture was previously known as x86-64.
    The latter is (forever) engraved into the canonical system name as
@@ -71,8 +72,21 @@ static const char *amd64_register_names[] =
   "mxcsr",
 };
 
-/* Total number of registers.  */
-#define AMD64_NUM_REGS	ARRAY_SIZE (amd64_register_names)
+static const char *amd64_ymm_names[] = 
+{
+  "ymm0", "ymm1", "ymm2", "ymm3",
+  "ymm4", "ymm5", "ymm6", "ymm7",
+  "ymm8", "ymm9", "ymm10", "ymm11",
+  "ymm12", "ymm13", "ymm14", "ymm15"
+};
+
+static const char *amd64_ymmh_names[] = 
+{
+  "ymm0h", "ymm1h", "ymm2h", "ymm3h",
+  "ymm4h", "ymm5h", "ymm6h", "ymm7h",
+  "ymm8h", "ymm9h", "ymm10h", "ymm11h",
+  "ymm12h", "ymm13h", "ymm14h", "ymm15h"
+};
 
 /* The registers used to pass integer arguments during a function call.  */
 static int amd64_dummy_call_integer_regs[] =
@@ -163,6 +177,8 @@ static const int amd64_dwarf_regmap_len =
 static int
 amd64_dwarf_reg_to_regnum (struct gdbarch *gdbarch, int reg)
 {
+  struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+  int ymm0_regnum = tdep->ymm0_regnum;
   int regnum = -1;
 
   if (reg >= 0 && reg < amd64_dwarf_regmap_len)
@@ -170,6 +186,9 @@ amd64_dwarf_reg_to_regnum (struct gdbarch *gdbarch, int reg)
 
   if (regnum == -1)
     warning (_("Unmapped DWARF Register #%d encountered."), reg);
+  else if (ymm0_regnum >= 0
+	   && i386_xmm_regnum_p (gdbarch, regnum))
+    regnum += ymm0_regnum - I387_XMM0_REGNUM (tdep);
 
   return regnum;
 }
@@ -238,6 +257,19 @@ static const char *amd64_dword_names[] =
   "r8d", "r9d", "r10d", "r11d", "r12d", "r13d", "r14d", "r15d"
 };
 
+/* Return the name of register REGNUM, or the empty string if it is
+   an anonymous register.  */
+
+static const char *
+amd64_register_name (struct gdbarch *gdbarch, int regnum)
+{
+  /* Hide the upper YMM registers.  */
+  if (i386_ymmh_regnum_p (gdbarch, regnum))
+    return "";
+
+  return tdesc_register_name (gdbarch, regnum);
+}
+
 /* Return the name of register REGNUM.  */
 
 static const char *
@@ -246,6 +278,8 @@ amd64_pseudo_register_name (struct gdbarch *gdbarch, int regnum)
   struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
   if (i386_byte_regnum_p (gdbarch, regnum))
     return amd64_byte_names[regnum - tdep->al_regnum];
+  else if (i386_ymm_regnum_p (gdbarch, regnum))
+    return amd64_ymm_names[regnum - tdep->ymm0_regnum];
   else if (i386_word_regnum_p (gdbarch, regnum))
     return amd64_word_names[regnum - tdep->ax_regnum];
   else if (i386_dword_regnum_p (gdbarch, regnum))
@@ -2176,6 +2210,28 @@ amd64_collect_fpregset (const struct regset *regset,
   amd64_collect_fxsave (regcache, regnum, fpregs);
 }
 
+/* Similar to amd64_supply_fpregset, but use XSAVE extended state.  */
+
+static void
+amd64_supply_xstateregset (const struct regset *regset,
+			   struct regcache *regcache, int regnum,
+			   const void *xstateregs, size_t len)
+{
+  const struct gdbarch_tdep *tdep = gdbarch_tdep (regset->arch);
+  amd64_supply_xsave (regcache, regnum, xstateregs);
+}
+
+/* Similar to amd64_collect_fpregset, but use XSAVE extended state.  */
+
+static void
+amd64_collect_xstateregset (const struct regset *regset,
+			    const struct regcache *regcache,
+			    int regnum, void *xstateregs, size_t len)
+{
+  const struct gdbarch_tdep *tdep = gdbarch_tdep (regset->arch);
+  amd64_collect_xsave (regcache, regnum, xstateregs, 1);
+}
+
 /* Return the appropriate register set for the core section identified
    by SECT_NAME and SECT_SIZE.  */
 
@@ -2194,6 +2250,16 @@ amd64_regset_from_core_section (struct gdbarch *gdbarch,
       return tdep->fpregset;
     }
 
+  if (strcmp (sect_name, ".reg-xstate") == 0)
+    {
+      if (tdep->xstateregset == NULL)
+	tdep->xstateregset = regset_alloc (gdbarch,
+					   amd64_supply_xstateregset,
+					   amd64_collect_xstateregset);
+
+      return tdep->xstateregset;
+    }
+
   return i386_regset_from_core_section (gdbarch, sect_name, sect_size);
 }
 \f
@@ -2256,6 +2322,13 @@ amd64_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch)
   tdep->num_core_regs = AMD64_NUM_GREGS + I387_NUM_REGS;
   tdep->register_names = amd64_register_names;
 
+  if (tdesc_find_feature (tdesc, "org.gnu.gdb.i386.avx") != NULL)
+    {
+      tdep->ymmh_register_names = amd64_ymmh_names;
+      tdep->num_ymm_regs = 16;
+      tdep->ymm0h_regnum = AMD64_YMM0H_REGNUM;
+    }
+
   tdep->num_byte_regs = 20;
   tdep->num_word_regs = 16;
   tdep->num_dword_regs = 16;
@@ -2269,6 +2342,8 @@ amd64_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch)
 
   set_tdesc_pseudo_register_name (gdbarch, amd64_pseudo_register_name);
 
+  set_gdbarch_register_name (gdbarch, amd64_register_name);
+
   /* AMD64 has an FPU and 16 SSE registers.  */
   tdep->st0_regnum = AMD64_ST0_REGNUM;
   tdep->num_xmm_regs = 16;
@@ -2349,6 +2424,7 @@ void
 _initialize_amd64_tdep (void)
 {
   initialize_tdesc_amd64 ();
+  initialize_tdesc_amd64_avx ();
 }
 \f
 
@@ -2384,6 +2460,30 @@ amd64_supply_fxsave (struct regcache *regcache, int regnum,
     }
 }
 
+/* Similar to amd64_supply_fxsave, but use XSAVE extended state.  */
+
+void
+amd64_supply_xsave (struct regcache *regcache, int regnum,
+		    const void *xsave)
+{
+  struct gdbarch *gdbarch = get_regcache_arch (regcache);
+  struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+
+  i387_supply_xsave (regcache, regnum, xsave);
+
+  if (xsave && gdbarch_ptr_bit (gdbarch) == 64)
+    {
+      const gdb_byte *regs = xsave;
+
+      if (regnum == -1 || regnum == I387_FISEG_REGNUM (tdep))
+	regcache_raw_supply (regcache, I387_FISEG_REGNUM (tdep),
+			     regs + 12);
+      if (regnum == -1 || regnum == I387_FOSEG_REGNUM (tdep))
+	regcache_raw_supply (regcache, I387_FOSEG_REGNUM (tdep),
+			     regs + 20);
+    }
+}
+
 /* Fill register REGNUM (if it is a floating-point or SSE register) in
    *FXSAVE with the value from REGCACHE.  If REGNUM is -1, do this for
    all registers.  This function doesn't touch any of the reserved
@@ -2407,3 +2507,26 @@ amd64_collect_fxsave (const struct regcache *regcache, int regnum,
 	regcache_raw_collect (regcache, I387_FOSEG_REGNUM (tdep), regs + 20);
     }
 }
+
+/* Similar to amd64_collect_fxsave, but use XSAVE extended state.  */
+
+void
+amd64_collect_xsave (const struct regcache *regcache, int regnum,
+		     void *xsave, int gcore)
+{
+  struct gdbarch *gdbarch = get_regcache_arch (regcache);
+  struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+  gdb_byte *regs = xsave;
+
+  i387_collect_xsave (regcache, regnum, xsave, gcore);
+
+  if (gdbarch_ptr_bit (gdbarch) == 64)
+    {
+      if (regnum == -1 || regnum == I387_FISEG_REGNUM (tdep))
+	regcache_raw_collect (regcache, I387_FISEG_REGNUM (tdep),
+			      regs + 12);
+      if (regnum == -1 || regnum == I387_FOSEG_REGNUM (tdep))
+	regcache_raw_collect (regcache, I387_FOSEG_REGNUM (tdep),
+			      regs + 20);
+    }
+}
diff --git a/gdb/amd64-tdep.h b/gdb/amd64-tdep.h
index 363479c..9f07dda 100644
--- a/gdb/amd64-tdep.h
+++ b/gdb/amd64-tdep.h
@@ -61,12 +61,16 @@ enum amd64_regnum
   AMD64_FSTAT_REGNUM = AMD64_ST0_REGNUM + 9,
   AMD64_XMM0_REGNUM = 40,	/* %xmm0 */
   AMD64_XMM1_REGNUM,		/* %xmm1 */
-  AMD64_MXCSR_REGNUM = AMD64_XMM0_REGNUM + 16
+  AMD64_MXCSR_REGNUM = AMD64_XMM0_REGNUM + 16,
+  AMD64_YMM0H_REGNUM,		/* %ymm0h */
+  AMD64_YMM15H_REGNUM = AMD64_YMM0H_REGNUM + 15
 };
 
 /* Number of general purpose registers.  */
 #define AMD64_NUM_GREGS		24
 
+#define AMD64_NUM_REGS		(AMD64_YMM15H_REGNUM + 1)
+
 extern struct displaced_step_closure *amd64_displaced_step_copy_insn
   (struct gdbarch *gdbarch, CORE_ADDR from, CORE_ADDR to,
    struct regcache *regs);
@@ -77,12 +81,6 @@ extern void amd64_displaced_step_fixup (struct gdbarch *gdbarch,
 
 extern void amd64_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch);
 
-/* Functions from amd64-tdep.c which may be needed on architectures
-   with extra registers.  */
-
-extern const char *amd64_register_name (struct gdbarch *gdbarch, int regnum);
-extern struct type *amd64_register_type (struct gdbarch *gdbarch, int regnum);
-
 /* Fill register REGNUM in REGCACHE with the appropriate
    floating-point or SSE register value from *FXSAVE.  If REGNUM is
    -1, do this for all registers.  This function masks off any of the
@@ -91,6 +89,10 @@ extern struct type *amd64_register_type (struct gdbarch *gdbarch, int regnum);
 extern void amd64_supply_fxsave (struct regcache *regcache, int regnum,
 				 const void *fxsave);
 
+/* Similar to amd64_supply_fxsave, but use XSAVE extended state.  */
+extern void amd64_supply_xsave (struct regcache *regcache, int regnum,
+				const void *xsave);
+
 /* Fill register REGNUM (if it is a floating-point or SSE register) in
    *FXSAVE with the value from REGCACHE.  If REGNUM is -1, do this for
    all registers.  This function doesn't touch any of the reserved
@@ -99,6 +101,10 @@ extern void amd64_supply_fxsave (struct regcache *regcache, int regnum,
 extern void amd64_collect_fxsave (const struct regcache *regcache, int regnum,
 				  void *fxsave);
 
+/* Similar to amd64_collect_fxsave, but use XSAVE extended state.  */
+extern void amd64_collect_xsave (const struct regcache *regcache,
+				 int regnum, void *xsave, int gcore);
+
 void amd64_classify (struct type *type, enum amd64_reg_class class[2]);
 
 \f
diff --git a/gdb/testsuite/gdb.arch/i386-avx.c b/gdb/testsuite/gdb.arch/i386-avx.c
new file mode 100644
index 0000000..73f92b6
--- /dev/null
+++ b/gdb/testsuite/gdb.arch/i386-avx.c
@@ -0,0 +1,128 @@
+/* Test program for AVX registers.
+
+   Copyright 2010 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include <stdio.h>
+#include "i386-cpuid.h"
+
+typedef struct {
+  float f[8];
+} v8sf_t;
+
+
+v8sf_t data[] =
+  {
+    { {  0.0,  0.125,  0.25,  0.375,  0.50,  0.625,  0.75,  0.875 } },
+    { {  1.0,  1.125,  1.25,  1.375,  1.50,  1.625,  1.75,  1.875 } },
+    { {  2.0,  2.125,  2.25,  2.375,  2.50,  2.625,  2.75,  2.875 } },
+    { {  3.0,  3.125,  3.25,  3.375,  3.50,  3.625,  3.75,  3.875 } },
+    { {  4.0,  4.125,  4.25,  4.375,  4.50,  4.625,  4.75,  4.875 } },
+    { {  5.0,  5.125,  5.25,  5.375,  5.50,  5.625,  5.75,  5.875 } },
+    { {  6.0,  6.125,  6.25,  6.375,  6.50,  6.625,  6.75,  6.875 } },
+    { {  7.0,  7.125,  7.25,  7.375,  7.50,  7.625,  7.75,  7.875 } },
+#ifdef __x86_64__
+    { {  8.0,  8.125,  8.25,  8.375,  8.50,  8.625,  8.75,  8.875 } },
+    { {  9.0,  9.125,  9.25,  9.375,  9.50,  9.625,  9.75,  9.875 } },
+    { { 10.0, 10.125, 10.25, 10.375, 10.50, 10.625, 10.75, 10.875 } },
+    { { 11.0, 11.125, 11.25, 11.375, 11.50, 11.625, 11.75, 11.875 } },
+    { { 12.0, 12.125, 12.25, 12.375, 12.50, 12.625, 12.75, 12.875 } },
+    { { 13.0, 13.125, 13.25, 13.375, 13.50, 13.625, 13.75, 13.875 } },
+    { { 14.0, 14.125, 14.25, 14.375, 14.50, 14.625, 14.75, 14.875 } },
+    { { 15.0, 15.125, 15.25, 15.375, 15.50, 15.625, 15.75, 15.875 } },
+#endif
+  };
+
+
+int
+have_avx (void)
+{
+  unsigned int eax, ebx, ecx, edx;
+
+  if (!__get_cpuid (1, &eax, &ebx, &ecx, &edx))
+    return 0;
+
+  if ((ecx & (bit_AVX | bit_OSXSAVE)) == (bit_AVX | bit_OSXSAVE))
+    return 1;
+  else
+    return 0;
+}
+
+int
+main (int argc, char **argv)
+{
+  if (have_avx ())
+    {
+      asm ("vmovaps 0(%0), %%ymm0\n\t"
+           "vmovaps 32(%0), %%ymm1\n\t"
+           "vmovaps 64(%0), %%ymm2\n\t"
+           "vmovaps 96(%0), %%ymm3\n\t"
+           "vmovaps 128(%0), %%ymm4\n\t"
+           "vmovaps 160(%0), %%ymm5\n\t"
+           "vmovaps 192(%0), %%ymm6\n\t"
+           "vmovaps 224(%0), %%ymm7\n\t"
+           : /* no output operands */
+           : "r" (data) 
+           : "xmm0", "xmm1", "xmm2", "xmm3", "xmm4", "xmm5", "xmm6", "xmm7");
+#ifdef __x86_64__
+      asm ("vmovaps 256(%0), %%ymm8\n\t"
+           "vmovaps 288(%0), %%ymm9\n\t"
+           "vmovaps 320(%0), %%ymm10\n\t"
+           "vmovaps 352(%0), %%ymm11\n\t"
+           "vmovaps 384(%0), %%ymm12\n\t"
+           "vmovaps 416(%0), %%ymm13\n\t"
+           "vmovaps 448(%0), %%ymm14\n\t"
+           "vmovaps 480(%0), %%ymm15\n\t"
+           : /* no output operands */
+           : "r" (data) 
+           : "xmm8", "xmm9", "xmm10", "xmm11", "xmm12", "xmm13", "xmm14", "xmm15");
+#endif
+
+      asm ("nop"); /* first breakpoint here */
+
+      asm (
+           "vmovaps %%ymm0, 0(%0)\n\t"
+           "vmovaps %%ymm1, 32(%0)\n\t"
+           "vmovaps %%ymm2, 64(%0)\n\t"
+           "vmovaps %%ymm3, 96(%0)\n\t"
+           "vmovaps %%ymm4, 128(%0)\n\t"
+           "vmovaps %%ymm5, 160(%0)\n\t"
+           "vmovaps %%ymm6, 192(%0)\n\t"
+           "vmovaps %%ymm7, 224(%0)\n\t"
+           : /* no output operands */
+           : "r" (data) 
+           : "xmm0", "xmm1", "xmm2", "xmm3", "xmm4", "xmm5", "xmm6", "xmm7");
+#ifdef __x86_64__
+      asm (
+           "vmovaps %%ymm8, 256(%0)\n\t"
+           "vmovaps %%ymm9, 288(%0)\n\t"
+           "vmovaps %%ymm10, 320(%0)\n\t"
+           "vmovaps %%ymm11, 352(%0)\n\t"
+           "vmovaps %%ymm12, 384(%0)\n\t"
+           "vmovaps %%ymm13, 416(%0)\n\t"
+           "vmovaps %%ymm14, 448(%0)\n\t"
+           "vmovaps %%ymm15, 480(%0)\n\t"
+           : /* no output operands */
+           : "r" (data) 
+           : "xmm8", "xmm9", "xmm10", "xmm11", "xmm12", "xmm13", "xmm14", "xmm15");
+#endif
+
+      puts ("Bye!"); /* second breakpoint here */
+    }
+
+  return 0;
+}
diff --git a/gdb/testsuite/gdb.arch/i386-avx.exp b/gdb/testsuite/gdb.arch/i386-avx.exp
new file mode 100644
index 0000000..561ddef
--- /dev/null
+++ b/gdb/testsuite/gdb.arch/i386-avx.exp
@@ -0,0 +1,110 @@
+# Copyright 2010 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# Please email any bugs, comments, and/or additions to this file to:
+# bug-gdb@gnu.org
+
+# This file is part of the gdb testsuite.
+
+if $tracelevel {
+    strace $tracelevel
+}
+
+set prms_id 0
+set bug_id 0
+
+if { ![istarget i?86-*-*] && ![istarget x86_64-*-* ] } {
+    verbose "Skipping x86 AVX tests."
+    return
+}
+
+set testfile "i386-avx"
+set srcfile ${testfile}.c
+set binfile ${objdir}/${subdir}/${testfile}
+
+if [get_compiler_info ${binfile}] {
+    return -1
+}
+
+set additional_flags ""
+if [test_compiler_info gcc*] {
+    set additional_flags "additional_flags=-mavx"
+}
+
+if { [gdb_compile "${srcdir}/${subdir}/${srcfile}" "${binfile}" executable [list debug $additional_flags]] != "" } {
+    unsupported "compiler does not support AVX"
+    return
+}
+
+gdb_exit
+gdb_start
+gdb_reinitialize_dir $srcdir/$subdir
+gdb_load ${binfile}
+
+if ![runto_main] then {
+    gdb_suppress_tests
+}
+
+send_gdb "print have_avx ()\r"
+gdb_expect {
+    -re ".. = 1\r\n$gdb_prompt " {
+        pass "check whether processor supports AVX"
+    }
+    -re ".. = 0\r\n$gdb_prompt " {
+        verbose "processor does not support AVX; skipping AVX tests"
+        return
+    }
+    -re ".*$gdb_prompt $" {
+        fail "check whether processor supports AVX"
+    }
+    timeout {
+        fail "check whether processor supports AVX (timeout)"
+    }
+}
+
+gdb_test "break [gdb_get_line_number "first breakpoint here"]" \
+         "Breakpoint .* at .*i386-avx.c.*" \
+         "set first breakpoint in main"
+gdb_continue_to_breakpoint "continue to first breakpoint in main"
+
+if [istarget i?86-*-*] {
+    set nr_regs 8
+} else {
+    set nr_regs 16
+}
+
+for { set r 0 } { $r < $nr_regs } { incr r } {
+    gdb_test "print \$ymm$r.v8_float" \
+        ".. = \\{$r, $r.125, $r.25, $r.375, $r.5, $r.625, $r.75, $r.875\\}.*" \
+        "check float contents of %ymm$r"
+    gdb_test "print \$ymm$r.v32_int8" \
+        ".. = \\{(-?\[0-9\]+, ){31}-?\[0-9\]+\\}.*" \
+        "check int8 contents of %ymm$r"
+}
+
+for { set r 0 } { $r < $nr_regs } { incr r } {
+    gdb_test "set var \$ymm$r.v8_float\[0\] = $r + 10" "" "set %ymm$r"
+}
+
+gdb_test "break [gdb_get_line_number "second breakpoint here"]" \
+         "Breakpoint .* at .*i386-avx.c.*" \
+         "set second breakpoint in main"
+gdb_continue_to_breakpoint "continue to second breakpoint in main"
+
+for { set r 0 } { $r < $nr_regs } { incr r } {
+    gdb_test "print data\[$r\]" \
+        ".. = \\{f = \\{[expr $r + 10], $r.125, $r.25, $r.375, $r.5, $r.625, $r.75, $r.875\\}\\}.*" \
+        "check contents of data\[$r\]"
+}
diff --git a/gdb/testsuite/gdb.arch/i386-cpuid.h b/gdb/testsuite/gdb.arch/i386-cpuid.h
index 7ff0dba..5ebde5a 100644
--- a/gdb/testsuite/gdb.arch/i386-cpuid.h
+++ b/gdb/testsuite/gdb.arch/i386-cpuid.h
@@ -1,75 +1,200 @@
-/* Helper file for i386 platform.  Runtime check for MMX/SSE/SSE2 support.
+/* Helper file for i386 platform.  Runtime check for MMX/SSE/SSE2/AVX
+ * support. Copied from gcc 4.4.
+ *
+ * Copyright (C) 2007, 2008, 2009 Free Software Foundation, Inc.
+ *
+ * This file is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation; either version 3, or (at your option) any
+ * later version.
+ * 
+ * This file is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ * 
+ * Under Section 7 of GPL version 3, you are granted additional
+ * permissions described in the GCC Runtime Library Exception, version
+ * 3.1, as published by the Free Software Foundation.
+ * 
+ * You should have received a copy of the GNU General Public License and
+ * a copy of the GCC Runtime Library Exception along with this program;
+ * see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+ * <http://www.gnu.org/licenses/>.
+ */
 
-   Copyright 2004, 2007, 2008, 2009, 2010 Free Software Foundation, Inc.
+/* %ecx */
+#define bit_SSE3	(1 << 0)
+#define bit_PCLMUL	(1 << 1)
+#define bit_SSSE3	(1 << 9)
+#define bit_FMA		(1 << 12)
+#define bit_CMPXCHG16B	(1 << 13)
+#define bit_SSE4_1	(1 << 19)
+#define bit_SSE4_2	(1 << 20)
+#define bit_MOVBE	(1 << 22)
+#define bit_POPCNT	(1 << 23)
+#define bit_AES		(1 << 25)
+#define bit_XSAVE	(1 << 26)
+#define bit_OSXSAVE	(1 << 27)
+#define bit_AVX		(1 << 28)
 
-   This file is part of GDB.
+/* %edx */
+#define bit_CMPXCHG8B	(1 << 8)
+#define bit_CMOV	(1 << 15)
+#define bit_MMX		(1 << 23)
+#define bit_FXSAVE	(1 << 24)
+#define bit_SSE		(1 << 25)
+#define bit_SSE2	(1 << 26)
 
-   This program is free software; you can redistribute it and/or modify
-   it under the terms of the GNU General Public License as published by
-   the Free Software Foundation; either version 3 of the License, or
-   (at your option) any later version.
+/* Extended Features */
+/* %ecx */
+#define bit_LAHF_LM	(1 << 0)
+#define bit_ABM		(1 << 5)
+#define bit_SSE4a	(1 << 6)
+#define bit_XOP         (1 << 11)
+#define bit_LWP 	(1 << 15)
+#define bit_FMA4        (1 << 16)
 
-   This program is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-   GNU General Public License for more details.
+/* %edx */
+#define bit_LM		(1 << 29)
+#define bit_3DNOWP	(1 << 30)
+#define bit_3DNOW	(1 << 31)
 
-   You should have received a copy of the GNU General Public License
-   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
 
-/* Used by 20020523-2.c and i386-sse-6.c, and possibly others.  */
-/* Plagarized from 20020523-2.c.  */
-/* Plagarized from gcc.  */
+#if defined(__i386__) && defined(__PIC__)
+/* %ebx may be the PIC register.  */
+#if __GNUC__ >= 3
+#define __cpuid(level, a, b, c, d)			\
+  __asm__ ("xchg{l}\t{%%}ebx, %1\n\t"			\
+	   "cpuid\n\t"					\
+	   "xchg{l}\t{%%}ebx, %1\n\t"			\
+	   : "=a" (a), "=r" (b), "=c" (c), "=d" (d)	\
+	   : "0" (level))
 
-#define bit_CMOV (1 << 15)
-#define bit_MMX (1 << 23)
-#define bit_SSE (1 << 25)
-#define bit_SSE2 (1 << 26)
+#define __cpuid_count(level, count, a, b, c, d)		\
+  __asm__ ("xchg{l}\t{%%}ebx, %1\n\t"			\
+	   "cpuid\n\t"					\
+	   "xchg{l}\t{%%}ebx, %1\n\t"			\
+	   : "=a" (a), "=r" (b), "=c" (c), "=d" (d)	\
+	   : "0" (level), "2" (count))
+#else
+/* Host GCCs older than 3.0 weren't supporting Intel asm syntax
+   nor alternatives in i386 code.  */
+#define __cpuid(level, a, b, c, d)			\
+  __asm__ ("xchgl\t%%ebx, %1\n\t"			\
+	   "cpuid\n\t"					\
+	   "xchgl\t%%ebx, %1\n\t"			\
+	   : "=a" (a), "=r" (b), "=c" (c), "=d" (d)	\
+	   : "0" (level))
 
-#ifndef NOINLINE
-#define NOINLINE __attribute__ ((noinline))
+#define __cpuid_count(level, count, a, b, c, d)		\
+  __asm__ ("xchgl\t%%ebx, %1\n\t"			\
+	   "cpuid\n\t"					\
+	   "xchgl\t%%ebx, %1\n\t"			\
+	   : "=a" (a), "=r" (b), "=c" (c), "=d" (d)	\
+	   : "0" (level), "2" (count))
 #endif
+#else
+#define __cpuid(level, a, b, c, d)			\
+  __asm__ ("cpuid\n\t"					\
+	   : "=a" (a), "=b" (b), "=c" (c), "=d" (d)	\
+	   : "0" (level))
 
-unsigned int i386_cpuid (void) NOINLINE;
+#define __cpuid_count(level, count, a, b, c, d)		\
+  __asm__ ("cpuid\n\t"					\
+	   : "=a" (a), "=b" (b), "=c" (c), "=d" (d)	\
+	   : "0" (level), "2" (count))
+#endif
 
-unsigned int NOINLINE
-i386_cpuid (void)
+/* Return highest supported input value for cpuid instruction.  ext can
+   be either 0x0 or 0x80000000 to return highest supported value for
+   basic or extended cpuid information.  Function returns 0 if cpuid
+   is not supported or whatever cpuid returns in eax register.  If sig
+   pointer is non-null, then first four bytes of the signature
+   (as found in ebx register) are returned in location pointed by sig.  */
+
+static __inline unsigned int
+__get_cpuid_max (unsigned int __ext, unsigned int *__sig)
 {
-  int fl1, fl2;
+  unsigned int __eax, __ebx, __ecx, __edx;
 
 #ifndef __x86_64__
+#if __GNUC__ >= 3
   /* See if we can use cpuid.  On AMD64 we always can.  */
-  __asm__ ("pushfl; pushfl; popl %0; movl %0,%1; xorl %2,%0;"
-	   "pushl %0; popfl; pushfl; popl %0; popfl"
-	   : "=&r" (fl1), "=&r" (fl2)
+  __asm__ ("pushf{l|d}\n\t"
+	   "pushf{l|d}\n\t"
+	   "pop{l}\t%0\n\t"
+	   "mov{l}\t{%0, %1|%1, %0}\n\t"
+	   "xor{l}\t{%2, %0|%0, %2}\n\t"
+	   "push{l}\t%0\n\t"
+	   "popf{l|d}\n\t"
+	   "pushf{l|d}\n\t"
+	   "pop{l}\t%0\n\t"
+	   "popf{l|d}\n\t"
+	   : "=&r" (__eax), "=&r" (__ebx)
+	   : "i" (0x00200000));
+#else
+/* Host GCCs older than 3.0 weren't supporting Intel asm syntax
+   nor alternatives in i386 code.  */
+  __asm__ ("pushfl\n\t"
+	   "pushfl\n\t"
+	   "popl\t%0\n\t"
+	   "movl\t%0, %1\n\t"
+	   "xorl\t%2, %0\n\t"
+	   "pushl\t%0\n\t"
+	   "popfl\n\t"
+	   "pushfl\n\t"
+	   "popl\t%0\n\t"
+	   "popfl\n\t"
+	   : "=&r" (__eax), "=&r" (__ebx)
 	   : "i" (0x00200000));
-  if (((fl1 ^ fl2) & 0x00200000) == 0)
-    return (0);
 #endif
 
-  /* Host supports cpuid.  See if cpuid gives capabilities, try
-     CPUID(0).  Preserve %ebx and %ecx; cpuid insn clobbers these, we
-     don't need their CPUID values here, and %ebx may be the PIC
-     register.  */
-#ifdef __x86_64__
-  __asm__ ("pushq %%rcx; pushq %%rbx; cpuid; popq %%rbx; popq %%rcx"
-	   : "=a" (fl1) : "0" (0) : "rdx", "cc");
-#else
-  __asm__ ("pushl %%ecx; pushl %%ebx; cpuid; popl %%ebx; popl %%ecx"
-	   : "=a" (fl1) : "0" (0) : "edx", "cc");
+  if (!((__eax ^ __ebx) & 0x00200000))
+    return 0;
 #endif
-  if (fl1 == 0)
-    return (0);
-
-  /* Invoke CPUID(1), return %edx; caller can examine bits to
-     determine what's supported.  */
-#ifdef __x86_64__
-  __asm__ ("pushq %%rcx; pushq %%rbx; cpuid; popq %%rbx; popq %%rcx"
-	   : "=d" (fl2), "=a" (fl1) : "1" (1) : "cc");
-#else
-  __asm__ ("pushl %%ecx; pushl %%ebx; cpuid; popl %%ebx; popl %%ecx"
-	   : "=d" (fl2), "=a" (fl1) : "1" (1) : "cc");
+
+  /* Host supports cpuid.  Return highest supported cpuid input value.  */
+  __cpuid (__ext, __eax, __ebx, __ecx, __edx);
+
+  if (__sig)
+    *__sig = __ebx;
+
+  return __eax;
+}
+
+/* Return cpuid data for requested cpuid level, as found in returned
+   eax, ebx, ecx and edx registers.  The function checks if cpuid is
+   supported and returns 1 for valid cpuid information or 0 for
+   unsupported cpuid level.  All pointers are required to be non-null.  */
+
+static __inline int
+__get_cpuid (unsigned int __level,
+	     unsigned int *__eax, unsigned int *__ebx,
+	     unsigned int *__ecx, unsigned int *__edx)
+{
+  unsigned int __ext = __level & 0x80000000;
+
+  if (__get_cpuid_max (__ext, 0) < __level)
+    return 0;
+
+  __cpuid (__level, *__eax, *__ebx, *__ecx, *__edx);
+  return 1;
+}
+
+#ifndef NOINLINE
+#define NOINLINE __attribute__ ((noinline))
 #endif
 
-  return fl2;
+unsigned int i386_cpuid (void) NOINLINE;
+
+unsigned int NOINLINE
+i386_cpuid (void)
+{
+  unsigned int eax, ebx, ecx, edx;
+
+  if (!__get_cpuid (1, &eax, &ebx, &ecx, &edx))
+    return 0;
+
+  return edx;
 }
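
A note on the have_avx () check in i386-avx.c: it tests only the AVX and
OSXSAVE bits of CPUID.1:ECX.  As the cover letter says, XCR0 determines
which states are enabled in the XSAVE extended state, so a stricter check
would also confirm that the SSE and AVX bits are set in XCR0.  The
following is a minimal sketch of that combined test and is not part of the
patch.  avx_usable and the macro names are hypothetical; the caller is
assumed to read XCR0 with xgetbv only when OSXSAVE is set, and to pass 0
otherwise.

```c
#include <stdint.h>

/* CPUID.1:ECX feature bits, same positions as bit_OSXSAVE and bit_AVX
   in i386-cpuid.h.  */
#define CPUID_ECX_OSXSAVE (1u << 27)
#define CPUID_ECX_AVX     (1u << 28)

/* XCR0 state bits, same positions as I386_XSTATE_SSE and
   I386_XSTATE_AVX in i386-xstate.h.  */
#define XCR0_SSE (1ULL << 1)
#define XCR0_AVX (1ULL << 2)

/* Hypothetical helper: return 1 if the CPU advertises AVX and the OS
   has enabled both the SSE and AVX states in XCR0, 0 otherwise.
   CPUID1_ECX is the %ecx output of cpuid leaf 1.  XCR0 is the value
   read with xgetbv; pass 0 if OSXSAVE is clear, since xgetbv must not
   be executed in that case.  */
static int
avx_usable (uint32_t cpuid1_ecx, uint64_t xcr0)
{
  const uint32_t need_cpuid = CPUID_ECX_AVX | CPUID_ECX_OSXSAVE;
  const uint64_t need_xcr0 = XCR0_SSE | XCR0_AVX;

  if ((cpuid1_ecx & need_cpuid) != need_cpuid)
    return 0;

  return (xcr0 & need_xcr0) == need_xcr0;
}
```

Keeping the helper free of inline asm makes it easy to exercise with fixed
inputs; the cpuid and xgetbv reads stay in the caller.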


* Re: PATCH: 3/6 [3rd try]: Add AVX support (i386 changes)
  2010-04-02 14:31           ` H.J. Lu
  2010-04-02 14:42             ` Mark Kettenis
@ 2010-04-07 16:55             ` H.J. Lu
  2010-04-07 18:34               ` Mark Kettenis
  1 sibling, 1 reply; 115+ messages in thread
From: H.J. Lu @ 2010-04-07 16:55 UTC (permalink / raw)
  To: GDB

On Fri, Apr 02, 2010 at 07:31:07AM -0700, H.J. Lu wrote:
> On Sun, Mar 28, 2010 at 06:11:24PM -0700, H.J. Lu wrote:
> > Hi,
> > 
> > Here are i386 changes to support AVX. OK to install?
> > 
> 
Here are the updated i386 changes to support AVX. I removed
i386_linux_update_xstateregset.  OK to install?

Thanks.


H.J.
----
2010-04-07  H.J. Lu  <hongjiu.lu@intel.com>

	* i386-linux-nat.c: Include "regset.h", "elf/common.h",
	<sys/uio.h> and "i386-xstate.h".
	(PTRACE_GETREGSET): New.
	(PTRACE_SETREGSET): Likewise.
	(fetch_xstateregs): Likewise.
	(store_xstateregs): Likewise.
	(GETXSTATEREGS_SUPPLIES): Likewise.
	(regmap): Include 8 upper YMM registers.
	(i386_linux_fetch_inferior_registers): Support XSAVE extended
	state.
	(i386_linux_store_inferior_registers): Likewise.
	(i386_linux_read_description): Check and enable AVX target
	descriptions.

	* i386-linux-tdep.c: Include "regset.h", "i387-tdep.h",
	"i386-xstate.h" and "features/i386/i386-avx-linux.c".
	(i386_linux_regset_sections): Add ".reg-xstate".
	(i386_linux_gregset_reg_offset): Include 8 upper YMM registers.
	(i386_linux_core_read_xcr0): New.
	(i386_linux_core_read_description): Check and enable AVX target
	description.
	(i386_linux_init_abi): Set xsave_xcr0_offset.
	(_initialize_i386_linux_tdep): Call
	initialize_tdesc_i386_avx_linux.

	* i386-linux-tdep.h (I386_LINUX_ORIG_EAX_REGNUM): Replace
	I386_SSE_NUM_REGS with I386_AVX_NUM_REGS.
	(i386_linux_core_read_xcr0): New.
	(tdesc_i386_avx_linux): Likewise.
	(I386_LINUX_XSAVE_XCR0_OFFSET): Likewise.

	* i386-tdep.c: Include "i386-xstate.h" and
	"features/i386/i386-avx.c".
	(i386_ymm_names): New.
	(i386_ymmh_names): Likewise.
	(i386_ymmh_regnum_p): Likewise.
	(i386_ymm_regnum_p): Likewise.
	(i386_xmm_regnum_p): Likewise.
	(i386_register_name): Likewise.
	(i386_ymm_type): Likewise.
	(i386_supply_xstateregset): Likewise.
	(i386_collect_xstateregset): Likewise.
	(i386_sse_regnum_p): Removed.
	(i386_pseudo_register_name): Support pseudo YMM registers.
	(i386_pseudo_register_type): Likewise.
	(i386_pseudo_register_read): Likewise.
	(i386_pseudo_register_write): Likewise.
	(i386_dbx_reg_to_regnum): Return %ymmN register number for
	%xmmN if AVX is available.
	(i386_regset_from_core_section): Support .reg-xstate section.
	(i386_register_reggroup_p): Support upper YMM and YMM registers.
	(i386_process_record): Replace i386_sse_regnum_p with
	i386_xmm_regnum_p.
	(i386_validate_tdesc_p): Support org.gnu.gdb.i386.avx feature.
	Set ymmh_register_names, num_ymm_regs, ymm0h_regnum and xcr0.
	(i386_gdbarch_init): Set xstateregset.  Set xsave_xcr0_offset. 
	Call set_gdbarch_register_name.  Replace I386_SSE_NUM_REGS with
	I386_AVX_NUM_REGS.  Set ymmh_register_names, ymm0h_regnum and
	num_ymm_regs.  Add num_ymm_regs to set_gdbarch_num_pseudo_regs.
	Set ymm0_regnum.
	(_initialize_i386_tdep): Call initialize_tdesc_i386_avx.

	* i386-tdep.h (gdbarch_tdep): Add xstateregset, ymm0_regnum,
	xcr0, xsave_xcr0_offset, ymm0h_regnum, ymmh_register_names and
	i386_ymm_type.
	(i386_regnum): Add I386_YMM0H_REGNUM and I386_YMM7H_REGNUM.
	(I386_AVX_NUM_REGS): New.
	(i386_xmm_regnum_p): Likewise.
	(i386_ymm_regnum_p): Likewise.
	(i386_ymmh_regnum_p): Likewise.

	* common/i386-xstate.h: New.
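
For readers following the layout, here is a minimal sketch of how the
constants in common/i386-xstate.h relate to an XSAVE buffer.  It is an
illustration under stated assumptions, not code from the patch.  The
offsets 464 (the copy of XCR0 that Linux stores in the software-usable
bytes) and 512 (xstate_bv) come from the cover letter.  The helper names
(read_le64, xstate_size, xsave_xcr0, xsave_xstate_bv,
xsave_avx_state_valid) and the shortened macro names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Feature bit and sizes, mirroring I386_XSTATE_AVX,
   I386_XSTATE_SSE_SIZE and I386_XSTATE_AVX_SIZE.  */
#define XSTATE_AVX      (1ULL << 2)
#define XSTATE_SSE_SIZE 576
#define XSTATE_AVX_SIZE 832

/* Byte offsets within the XSAVE area, as given in the cover letter.  */
#define XSAVE_XCR0_OFFSET      464
#define XSAVE_XSTATE_BV_OFFSET 512

/* Read a little-endian 64-bit value without assuming host byte order
   or alignment.  */
static uint64_t
read_le64 (const unsigned char *p)
{
  uint64_t v = 0;
  int i;

  for (i = 7; i >= 0; i--)
    v = (v << 8) | p[i];
  return v;
}

/* Same selection as the I386_XSTATE_SIZE macro: the AVX size if the
   AVX bit is set in XCR0, the SSE size otherwise.  */
static size_t
xstate_size (uint64_t xcr0)
{
  return (xcr0 & XSTATE_AVX) != 0 ? XSTATE_AVX_SIZE : XSTATE_SSE_SIZE;
}

/* XCR0 as stored by the Linux kernel in the software-usable bytes.  */
static uint64_t
xsave_xcr0 (const unsigned char *xsave)
{
  return read_le64 (xsave + XSAVE_XCR0_OFFSET);
}

/* The xstate_bv field: which states the buffer actually holds.  */
static uint64_t
xsave_xstate_bv (const unsigned char *xsave)
{
  return read_le64 (xsave + XSAVE_XSTATE_BV_OFFSET);
}

/* Return 1 if the upper YMM halves in XSAVE hold live data.  When the
   AVX bit in xstate_bv is clear, those registers read as 0.  */
static int
xsave_avx_state_valid (const unsigned char *xsave)
{
  return (xsave_xstate_bv (xsave) & XSTATE_AVX) != 0;
}
```

The distinction between the two fields matters: XCR0 says which states the
buffer can hold, while xstate_bv says which of them the process has
actually touched.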

diff --git a/gdb/common/i386-xstate.h b/gdb/common/i386-xstate.h
new file mode 100644
index 0000000..5e16015
--- /dev/null
+++ b/gdb/common/i386-xstate.h
@@ -0,0 +1,41 @@
+/* Common code for i386 XSAVE extended state.
+
+   Copyright (C) 2010 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#ifndef I386_XSTATE_H
+#define I386_XSTATE_H 1
+
+/* The extended state feature bits.  */
+#define I386_XSTATE_X87		(1ULL << 0)
+#define I386_XSTATE_SSE		(1ULL << 1)
+#define I386_XSTATE_AVX		(1ULL << 2)
+
+/* Supported mask and size of the extended state.  */
+#define I386_XSTATE_SSE_MASK	(I386_XSTATE_X87 | I386_XSTATE_SSE)
+#define I386_XSTATE_AVX_MASK	(I386_XSTATE_SSE_MASK | I386_XSTATE_AVX)
+
+#define I386_XSTATE_SSE_SIZE	576
+#define I386_XSTATE_AVX_SIZE	832
+#define I386_XSTATE_MAX_SIZE	832
+
+/* Get I386 XSAVE extended state size.  */
+#define I386_XSTATE_SIZE(XCR0)	\
+  (((XCR0) & I386_XSTATE_AVX) != 0 \
+   ? I386_XSTATE_AVX_SIZE : I386_XSTATE_SSE_SIZE)
+
+#endif /* I386_XSTATE_H */
diff --git a/gdb/i386-linux-nat.c b/gdb/i386-linux-nat.c
index 31b9086..a251907 100644
--- a/gdb/i386-linux-nat.c
+++ b/gdb/i386-linux-nat.c
@@ -23,11 +23,14 @@
 #include "inferior.h"
 #include "gdbcore.h"
 #include "regcache.h"
+#include "regset.h"
 #include "target.h"
 #include "linux-nat.h"
 
 #include "gdb_assert.h"
 #include "gdb_string.h"
+#include "elf/common.h"
+#include <sys/uio.h>
 #include <sys/ptrace.h>
 #include <sys/user.h>
 #include <sys/procfs.h>
@@ -69,6 +72,19 @@
 
 /* Defines ps_err_e, struct ps_prochandle.  */
 #include "gdb_proc_service.h"
+
+#include "i386-xstate.h"
+
+#ifndef PTRACE_GETREGSET
+#define PTRACE_GETREGSET	0x4204
+#endif
+
+#ifndef PTRACE_SETREGSET
+#define PTRACE_SETREGSET	0x4205
+#endif
+
+/* Does the current host support PTRACE_GETREGSET?  */
+static int have_ptrace_getregset = -1;
 \f
 
 /* The register sets used in GNU/Linux ELF core-dumps are identical to
@@ -98,6 +114,8 @@ static int regmap[] =
   -1, -1, -1, -1,		/* xmm0, xmm1, xmm2, xmm3 */
   -1, -1, -1, -1,		/* xmm4, xmm5, xmm6, xmm6 */
   -1,				/* mxcsr */
+  -1, -1, -1, -1,		/* ymm0h, ymm1h, ymm2h, ymm3h */
+  -1, -1, -1, -1,		/* ymm4h, ymm5h, ymm6h, ymm7h */
   ORIG_EAX
 };
 
@@ -110,6 +128,9 @@ static int regmap[] =
 #define GETFPXREGS_SUPPLIES(regno) \
   (I386_ST0_REGNUM <= (regno) && (regno) < I386_SSE_NUM_REGS)
 
+#define GETXSTATEREGS_SUPPLIES(regno) \
+  (I386_ST0_REGNUM <= (regno) && (regno) < I386_AVX_NUM_REGS)
+
 /* Does the current host support the GETREGS request?  */
 int have_ptrace_getregs =
 #ifdef HAVE_PTRACE_GETREGS
@@ -355,6 +376,57 @@ static void store_fpregs (const struct regcache *regcache, int tid, int regno) {
 
 /* Transfering floating-point and SSE registers to and from GDB.  */
 
+/* Fetch all registers covered by the PTRACE_GETREGSET request from
+   process/thread TID and store their values in GDB's register array.
+   Return non-zero if successful, zero otherwise.  */
+
+static int
+fetch_xstateregs (struct regcache *regcache, int tid)
+{
+  char xstateregs[I386_XSTATE_MAX_SIZE];
+  struct iovec iov;
+
+  if (!have_ptrace_getregset)
+    return 0;
+
+  iov.iov_base = xstateregs;
+  iov.iov_len = sizeof (xstateregs);
+  if (ptrace (PTRACE_GETREGSET, tid, (unsigned int) NT_X86_XSTATE,
+	      &iov) < 0)
+    perror_with_name (_("Couldn't read extended state status"));
+
+  i387_supply_xsave (regcache, -1, xstateregs);
+  return 1;
+}
+
+/* Store all valid registers in GDB's register array covered by the
+   PTRACE_SETREGSET request into the process/thread specified by TID.
+   Return non-zero if successful, zero otherwise.  */
+
+static int
+store_xstateregs (const struct regcache *regcache, int tid, int regno)
+{
+  char xstateregs[I386_XSTATE_MAX_SIZE];
+  struct iovec iov;
+
+  if (!have_ptrace_getregset)
+    return 0;
+  
+  iov.iov_base = xstateregs;
+  iov.iov_len = sizeof (xstateregs);
+  if (ptrace (PTRACE_GETREGSET, tid, (unsigned int) NT_X86_XSTATE,
+	      &iov) < 0)
+    perror_with_name (_("Couldn't read extended state status"));
+
+  i387_collect_xsave (regcache, regno, xstateregs, 0);
+
+  if (ptrace (PTRACE_SETREGSET, tid, (unsigned int) NT_X86_XSTATE,
+	      &iov) < 0)
+    perror_with_name (_("Couldn't write extended state status"));
+
+  return 1;
+}
+
 #ifdef HAVE_PTRACE_GETFPXREGS
 
 /* Fill GDB's register array with the floating-point and SSE register
@@ -489,6 +561,8 @@ i386_linux_fetch_inferior_registers (struct target_ops *ops,
 	  return;
 	}
 
+      if (fetch_xstateregs (regcache, tid))
+	return;
       if (fetch_fpxregs (regcache, tid))
 	return;
       fetch_fpregs (regcache, tid);
@@ -501,6 +575,12 @@ i386_linux_fetch_inferior_registers (struct target_ops *ops,
       return;
     }
 
+  if (GETXSTATEREGS_SUPPLIES (regno))
+    {
+      if (fetch_xstateregs (regcache, tid))
+	return;
+    }
+
   if (GETFPXREGS_SUPPLIES (regno))
     {
       if (fetch_fpxregs (regcache, tid))
@@ -553,6 +633,8 @@ i386_linux_store_inferior_registers (struct target_ops *ops,
   if (regno == -1)
     {
       store_regs (regcache, tid, regno);
+      if (store_xstateregs (regcache, tid, regno))
+	return;
       if (store_fpxregs (regcache, tid, regno))
 	return;
       store_fpregs (regcache, tid, regno);
@@ -565,6 +647,12 @@ i386_linux_store_inferior_registers (struct target_ops *ops,
       return;
     }
 
+  if (GETXSTATEREGS_SUPPLIES (regno))
+    {
+      if (store_xstateregs (regcache, tid, regno))
+	return;
+    }
+
   if (GETFPXREGS_SUPPLIES (regno))
     {
       if (store_fpxregs (regcache, tid, regno))
@@ -858,7 +946,42 @@ i386_linux_child_post_startup_inferior (ptid_t ptid)
 static const struct target_desc *
 i386_linux_read_description (struct target_ops *ops)
 {
-  return tdesc_i386_linux;
+  static uint64_t xcr0;
+
+  if (have_ptrace_getregset == -1)
+    {
+      int tid;
+      uint64_t xstateregs[(I386_XSTATE_SSE_SIZE / sizeof (uint64_t))];
+      struct iovec iov;
+
+      /* GNU/Linux LWP ID's are process ID's.  */
+      tid = TIDGET (inferior_ptid);
+      if (tid == 0)
+	tid = PIDGET (inferior_ptid); /* Not a threaded program.  */
+
+      iov.iov_base = xstateregs;
+      iov.iov_len = sizeof (xstateregs);
+
+      /* Check if PTRACE_GETREGSET works.  */
+      if (ptrace (PTRACE_GETREGSET, tid, (unsigned int) NT_X86_XSTATE,
+		  &iov) < 0)
+	have_ptrace_getregset = 0;
+      else
+	{
+	  have_ptrace_getregset = 1;
+
+	  /* Get XCR0 from XSAVE extended state.  */
+	  xcr0 = xstateregs[(I386_LINUX_XSAVE_XCR0_OFFSET
+			     / sizeof (long long))];
+	}
+    }
+
+  /* Check the native XCR0 only if PTRACE_GETREGSET is available.  */
+  if (have_ptrace_getregset
+      && (xcr0 & I386_XSTATE_AVX_MASK) == I386_XSTATE_AVX_MASK)
+    return tdesc_i386_avx_linux;
+  else
+    return tdesc_i386_linux;
 }
 
 void
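
The have_ptrace_getregset variable in this file is a three-state cache: -1 means "not probed yet", 0 means "unsupported", 1 means "supported".  A minimal standalone sketch of that pattern follows (not part of the patch).  The names regset_available and fake_probe are hypothetical, and fake_probe merely stands in for the real PTRACE_GETREGSET call.

```c
/* -1: not probed yet, 0: unsupported, 1: supported.  Same convention
   as have_ptrace_getregset in the patch.  */
static int have_regset = -1;

/* Counts probe invocations so that the caching is observable.  */
static int probe_calls;

/* Hypothetical stand-in for the PTRACE_GETREGSET request; a real
   probe would call ptrace and report whether it succeeded.  */
static int
fake_probe (void)
{
  probe_calls++;
  return 1;
}

/* Run PROBE at most once and cache its answer.  */
static int
regset_available (int (*probe) (void))
{
  if (have_regset == -1)
    have_regset = probe () ? 1 : 0;
  return have_regset;
}
```

With this shape the fetch and store paths can test the cached value cheaply and fall back to the FXSAVE and FSAVE requests when it is 0.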
diff --git a/gdb/i386-linux-tdep.c b/gdb/i386-linux-tdep.c
index b23c109..34a1924 100644
--- a/gdb/i386-linux-tdep.c
+++ b/gdb/i386-linux-tdep.c
@@ -23,6 +23,7 @@
 #include "frame.h"
 #include "value.h"
 #include "regcache.h"
+#include "regset.h"
 #include "inferior.h"
 #include "osabi.h"
 #include "reggroups.h"
@@ -36,9 +37,11 @@
 #include "solib-svr4.h"
 #include "symtab.h"
 #include "arch-utils.h"
-#include "regset.h"
 #include "xml-syscall.h"
 
+#include "i387-tdep.h"
+#include "i386-xstate.h"
+
 /* The syscall's XML filename for i386.  */
 #define XML_SYSCALL_FILENAME_I386 "syscalls/i386-linux.xml"
 
@@ -47,6 +50,7 @@
 #include <stdint.h>
 
 #include "features/i386/i386-linux.c"
+#include "features/i386/i386-avx-linux.c"
 
 /* Supported register note sections.  */
 static struct core_regset_section i386_linux_regset_sections[] =
@@ -54,6 +58,7 @@ static struct core_regset_section i386_linux_regset_sections[] =
   { ".reg", 144, "general-purpose" },
   { ".reg2", 108, "floating-point" },
   { ".reg-xfp", 512, "extended floating-point" },
+  { ".reg-xstate", I386_XSTATE_MAX_SIZE, "XSAVE extended state" },
   { NULL, 0 }
 };
 
@@ -533,6 +538,7 @@ static int i386_linux_gregset_reg_offset[] =
   -1, -1, -1, -1, -1, -1, -1, -1,
   -1, -1, -1, -1, -1, -1, -1, -1,
   -1,
+  -1, -1, -1, -1, -1, -1, -1, -1,
   11 * 4			/* "orig_eax" */
 };
 
@@ -560,6 +566,43 @@ static int i386_linux_sc_reg_offset[] =
   0 * 4				/* %gs */
 };
 
+/* Get XSAVE extended state xcr0 from core dump.  */
+
+uint64_t
+i386_linux_core_read_xcr0 (struct gdbarch *gdbarch,
+			   struct target_ops *target, bfd *abfd)
+{
+  asection *xstate = bfd_get_section_by_name (abfd, ".reg-xstate");
+  uint64_t xcr0;
+
+  if (xstate)
+    {
+      size_t size = bfd_section_size (abfd, xstate);
+
+      /* Check extended state size.  */
+      if (size < I386_XSTATE_AVX_SIZE)
+	xcr0 = I386_XSTATE_SSE_MASK;
+      else
+	{
+	  char contents[8];
+
+	  if (! bfd_get_section_contents (abfd, xstate, contents,
+					  I386_LINUX_XSAVE_XCR0_OFFSET,
+					  8))
+	    {
+	      warning (_("Couldn't read `xcr0' bytes from `.reg-xstate' section in core file."));
+	      return 0;
+	    }
+
+	  xcr0 = bfd_get_64 (abfd, contents);
+	}
+    }
+  else
+    xcr0 = I386_XSTATE_SSE_MASK;
+
+  return xcr0;
+}
+
 /* Get Linux/x86 target description from core dump.  */
 
 static const struct target_desc *
@@ -568,12 +611,17 @@ i386_linux_core_read_description (struct gdbarch *gdbarch,
 				  bfd *abfd)
 {
   asection *section = bfd_get_section_by_name (abfd, ".reg2");
+  uint64_t xcr0;
 
   if (section == NULL)
     return NULL;
 
   /* Linux/i386.  */
-  return tdesc_i386_linux;
+  xcr0 = i386_linux_core_read_xcr0 (gdbarch, target, abfd);
+  if ((xcr0 & I386_XSTATE_AVX_MASK) == I386_XSTATE_AVX_MASK)
+    return tdesc_i386_avx_linux;
+  else
+    return tdesc_i386_linux;
 }
 
 static void
@@ -623,6 +671,8 @@ i386_linux_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch)
   tdep->sc_reg_offset = i386_linux_sc_reg_offset;
   tdep->sc_num_regs = ARRAY_SIZE (i386_linux_sc_reg_offset);
 
+  tdep->xsave_xcr0_offset = I386_LINUX_XSAVE_XCR0_OFFSET;
+
   set_gdbarch_process_record (gdbarch, i386_process_record);
   set_gdbarch_process_record_signal (gdbarch, i386_linux_record_signal);
 
@@ -840,4 +890,5 @@ _initialize_i386_linux_tdep (void)
 
   /* Initialize the Linux target description  */
   initialize_tdesc_i386_linux ();
+  initialize_tdesc_i386_avx_linux ();
 }
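
As a rough illustration of i386_linux_core_read_xcr0 above, the sketch below (not part of the patch) makes the same decision on a plain byte buffer instead of a BFD section.  The names core_xcr0 and read_le64 are hypothetical, and the sketch assumes the note contents are little endian, as they are for x86 core files.  It leaves out the warning path the patch takes when the section cannot be read.

```c
#include <stddef.h>
#include <stdint.h>

#define XSTATE_SSE_MASK    0x3ULL  /* x87 | SSE */
#define XSTATE_AVX_SIZE    832     /* I386_XSTATE_AVX_SIZE */
#define XSAVE_XCR0_OFFSET  464     /* I386_LINUX_XSAVE_XCR0_OFFSET */

/* Read a little-endian 64-bit value from P.  */
static uint64_t
read_le64 (const unsigned char *p)
{
  uint64_t v = 0;
  int i;

  for (i = 7; i >= 0; i--)
    v = (v << 8) | p[i];
  return v;
}

/* NOTE is the raw contents of the .reg-xstate section, or NULL if the
   core file has none; SIZE is its length in bytes.  A missing section
   or one too small to hold AVX state is treated as SSE only.  */
static uint64_t
core_xcr0 (const unsigned char *note, size_t size)
{
  if (note == NULL || size < XSTATE_AVX_SIZE)
    return XSTATE_SSE_MASK;
  return read_le64 (note + XSAVE_XCR0_OFFSET);
}
```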
diff --git a/gdb/i386-linux-tdep.h b/gdb/i386-linux-tdep.h
index 11f7295..eaeb63c 100644
--- a/gdb/i386-linux-tdep.h
+++ b/gdb/i386-linux-tdep.h
@@ -30,12 +30,38 @@
 /* Register number for the "orig_eax" pseudo-register.  If this
    pseudo-register contains a value >= 0 it is interpreted as the
    system call number that the kernel is supposed to restart.  */
-#define I386_LINUX_ORIG_EAX_REGNUM I386_SSE_NUM_REGS
+#define I386_LINUX_ORIG_EAX_REGNUM I386_AVX_NUM_REGS
 
 /* Total number of registers for GNU/Linux.  */
 #define I386_LINUX_NUM_REGS (I386_LINUX_ORIG_EAX_REGNUM + 1)
 
+/* Get XSAVE extended state xcr0 from core dump.  */
+extern uint64_t i386_linux_core_read_xcr0
+  (struct gdbarch *gdbarch, struct target_ops *target, bfd *abfd);
+
 /* Linux target description.  */
 extern struct target_desc *tdesc_i386_linux;
+extern struct target_desc *tdesc_i386_avx_linux;
+
+/* Format of the XSAVE extended state is:
+	struct
+	{
+	  fxsave_bytes[0..463]
+	  sw_usable_bytes[464..511]
+	  xstate_hdr_bytes[512..575]
+	  avx_bytes[576..831]
+	  future_state etc
+	};
+
+  The same memory layout is used for the NT_X86_XSTATE core dump note
+  that represents the XSAVE extended state registers.
+
+  The first 8 bytes of sw_usable_bytes, [464..471], hold the OS-enabled
+  extended state mask, which is the same as the extended control register
+  0 (the XFEATURE_ENABLED_MASK register), XCR0.  We can use this mask
+  together with the mask saved in xstate_hdr_bytes to determine which
+  states the processor/OS supports and which state, used or initialized,
+  the process/thread is in.  */
+#define I386_LINUX_XSAVE_XCR0_OFFSET 464
 
 #endif /* i386-linux-tdep.h */
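
To make the layout comment in this header concrete, here is a small standalone sketch (not part of the patch) that maps a byte offset in the XSAVE area to the region it falls in.  The function name xsave_region is hypothetical; the boundaries 464, 512, 576 and 832 are the ones listed in the comment.

```c
#include <stddef.h>

/* Return the name of the XSAVE region that contains byte OFFSET,
   using the field names from the layout comment.  */
static const char *
xsave_region (size_t offset)
{
  if (offset < 464)
    return "fxsave_bytes";
  if (offset < 512)
    return "sw_usable_bytes";    /* XCR0 copy lives at 464..471.  */
  if (offset < 576)
    return "xstate_hdr_bytes";   /* xstate_bv starts at 512.  */
  if (offset < 832)
    return "avx_bytes";          /* Upper halves of the YMM registers.  */
  return "future_state";
}
```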
diff --git a/gdb/i386-tdep.c b/gdb/i386-tdep.c
index 703d003..ce658cd 100644
--- a/gdb/i386-tdep.c
+++ b/gdb/i386-tdep.c
@@ -51,11 +51,13 @@
 
 #include "i386-tdep.h"
 #include "i387-tdep.h"
+#include "i386-xstate.h"
 
 #include "record.h"
 #include <stdint.h>
 
 #include "features/i386/i386.c"
+#include "features/i386/i386-avx.c"
 
 /* Register names.  */
 
@@ -74,6 +76,18 @@ static const char *i386_register_names[] =
   "mxcsr"
 };
 
+static const char *i386_ymm_names[] =
+{
+  "ymm0",  "ymm1",   "ymm2",  "ymm3",
+  "ymm4",  "ymm5",   "ymm6",  "ymm7",
+};
+
+static const char *i386_ymmh_names[] =
+{
+  "ymm0h",  "ymm1h",   "ymm2h",  "ymm3h",
+  "ymm4h",  "ymm5h",   "ymm6h",  "ymm7h",
+};
+
 /* Register names for MMX pseudo-registers.  */
 
 static const char *i386_mmx_names[] =
@@ -150,18 +164,47 @@ i386_dword_regnum_p (struct gdbarch *gdbarch, int regnum)
   return regnum >= 0 && regnum < tdep->num_dword_regs;
 }
 
+int
+i386_ymmh_regnum_p (struct gdbarch *gdbarch, int regnum)
+{
+  struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+  int ymm0h_regnum = tdep->ymm0h_regnum;
+
+  if (ymm0h_regnum < 0)
+    return 0;
+
+  regnum -= ymm0h_regnum;
+  return regnum >= 0 && regnum < tdep->num_ymm_regs;
+}
+
+/* AVX register?  */
+
+int
+i386_ymm_regnum_p (struct gdbarch *gdbarch, int regnum)
+{
+  struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+  int ymm0_regnum = tdep->ymm0_regnum;
+
+  if (ymm0_regnum < 0)
+    return 0;
+
+  regnum -= ymm0_regnum;
+  return regnum >= 0 && regnum < tdep->num_ymm_regs;
+}
+
 /* SSE register?  */
 
-static int
-i386_sse_regnum_p (struct gdbarch *gdbarch, int regnum)
+int
+i386_xmm_regnum_p (struct gdbarch *gdbarch, int regnum)
 {
   struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+  int num_xmm_regs = I387_NUM_XMM_REGS (tdep);
 
-  if (I387_NUM_XMM_REGS (tdep) == 0)
+  if (num_xmm_regs == 0)
     return 0;
 
-  return (I387_XMM0_REGNUM (tdep) <= regnum
-	  && regnum < I387_MXCSR_REGNUM (tdep));
+  regnum -= I387_XMM0_REGNUM (tdep);
+  return regnum >= 0 && regnum < num_xmm_regs;
 }
 
 static int
@@ -201,6 +244,19 @@ i386_fpc_regnum_p (struct gdbarch *gdbarch, int regnum)
 	  && regnum < I387_XMM0_REGNUM (tdep));
 }
 
+/* Return the name of register REGNUM, or the empty string if it is
+   an anonymous register.  */
+
+static const char *
+i386_register_name (struct gdbarch *gdbarch, int regnum)
+{
+  /* Hide the upper YMM registers.  */
+  if (i386_ymmh_regnum_p (gdbarch, regnum))
+    return "";
+
+  return tdesc_register_name (gdbarch, regnum);
+}
+
 /* Return the name of register REGNUM.  */
 
 const char *
@@ -209,6 +265,8 @@ i386_pseudo_register_name (struct gdbarch *gdbarch, int regnum)
   struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
   if (i386_mmx_regnum_p (gdbarch, regnum))
     return i386_mmx_names[regnum - I387_MM0_REGNUM (tdep)];
+  else if (i386_ymm_regnum_p (gdbarch, regnum))
+    return i386_ymm_names[regnum - tdep->ymm0_regnum];
   else if (i386_byte_regnum_p (gdbarch, regnum))
     return i386_byte_names[regnum - tdep->al_regnum];
   else if (i386_word_regnum_p (gdbarch, regnum))
@@ -246,7 +304,13 @@ i386_dbx_reg_to_regnum (struct gdbarch *gdbarch, int reg)
   else if (reg >= 21 && reg <= 28)
     {
       /* SSE registers.  */
-      return reg - 21 + I387_XMM0_REGNUM (tdep);
+      int ymm0_regnum = tdep->ymm0_regnum;
+
+      if (ymm0_regnum >= 0
+	  && i386_xmm_regnum_p (gdbarch, reg))
+	return reg - 21 + ymm0_regnum;
+      else
+	return reg - 21 + I387_XMM0_REGNUM (tdep);
     }
   else if (reg >= 29 && reg <= 36)
     {
@@ -2184,6 +2248,59 @@ i387_ext_type (struct gdbarch *gdbarch)
   return tdep->i387_ext_type;
 }
 
+/* Construct vector type for pseudo YMM registers.  We can't use
+   tdesc_find_type since YMM isn't described in target description.  */
+
+static struct type *
+i386_ymm_type (struct gdbarch *gdbarch)
+{
+  struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+
+  if (!tdep->i386_ymm_type)
+    {
+      const struct builtin_type *bt = builtin_type (gdbarch);
+
+      /* The type we're building is this: */
+#if 0
+      union __gdb_builtin_type_vec256i
+      {
+        int128_t v2_int128[2];
+        int64_t v4_int64[4];
+        int32_t v8_int32[8];
+        int16_t v16_int16[16];
+        int8_t v32_int8[32];
+        double v4_double[4];
+        float v8_float[8];
+      };
+#endif
+
+      struct type *t;
+
+      t = arch_composite_type (gdbarch,
+			       "__gdb_builtin_type_vec256i", TYPE_CODE_UNION);
+      append_composite_type_field (t, "v8_float",
+				   init_vector_type (bt->builtin_float, 8));
+      append_composite_type_field (t, "v4_double",
+				   init_vector_type (bt->builtin_double, 4));
+      append_composite_type_field (t, "v32_int8",
+				   init_vector_type (bt->builtin_int8, 32));
+      append_composite_type_field (t, "v16_int16",
+				   init_vector_type (bt->builtin_int16, 16));
+      append_composite_type_field (t, "v8_int32",
+				   init_vector_type (bt->builtin_int32, 8));
+      append_composite_type_field (t, "v4_int64",
+				   init_vector_type (bt->builtin_int64, 4));
+      append_composite_type_field (t, "v2_int128",
+				   init_vector_type (bt->builtin_int128, 2));
+
+      TYPE_VECTOR (t) = 1;
+      TYPE_NAME (t) = "builtin_type_vec256i";
+      tdep->i386_ymm_type = t;
+    }
+
+  return tdep->i386_ymm_type;
+}
+
 /* Construct vector type for MMX registers.  */
 static struct type *
 i386_mmx_type (struct gdbarch *gdbarch)
@@ -2234,6 +2351,8 @@ i386_pseudo_register_type (struct gdbarch *gdbarch, int regnum)
 {
   if (i386_mmx_regnum_p (gdbarch, regnum))
     return i386_mmx_type (gdbarch);
+  else if (i386_ymm_regnum_p (gdbarch, regnum))
+    return i386_ymm_type (gdbarch);
   else
     {
       const struct builtin_type *bt = builtin_type (gdbarch);
@@ -2285,7 +2404,22 @@ i386_pseudo_register_read (struct gdbarch *gdbarch, struct regcache *regcache,
     {
       struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
 
-      if (i386_word_regnum_p (gdbarch, regnum))
+      if (i386_ymm_regnum_p (gdbarch, regnum))
+	{
+	  regnum -= tdep->ymm0_regnum;
+
+	  /* Extract (always little endian).  Read lower 128bits. */
+	  regcache_raw_read (regcache,
+			     I387_XMM0_REGNUM (tdep) + regnum,
+			     raw_buf);
+	  memcpy (buf, raw_buf, 16);
+	  /* Read upper 128bits.  */
+	  regcache_raw_read (regcache,
+			     tdep->ymm0h_regnum + regnum,
+			     raw_buf);
+	  memcpy (buf + 16, raw_buf, 16);
+	}
+      else if (i386_word_regnum_p (gdbarch, regnum))
 	{
 	  int gpnum = regnum - tdep->ax_regnum;
 
@@ -2334,7 +2468,20 @@ i386_pseudo_register_write (struct gdbarch *gdbarch, struct regcache *regcache,
     {
       struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
 
-      if (i386_word_regnum_p (gdbarch, regnum))
+      if (i386_ymm_regnum_p (gdbarch, regnum))
+	{
+	  regnum -= tdep->ymm0_regnum;
+
+	  /* ... Write lower 128bits.  */
+	  regcache_raw_write (regcache,
+			     I387_XMM0_REGNUM (tdep) + regnum,
+			     buf);
+	  /* ... Write upper 128bits.  */
+	  regcache_raw_write (regcache,
+			     tdep->ymm0h_regnum + regnum,
+			     buf + 16);
+	}
+      else if (i386_word_regnum_p (gdbarch, regnum))
 	{
 	  int gpnum = regnum - tdep->ax_regnum;
 
@@ -2581,6 +2728,28 @@ i386_collect_fpregset (const struct regset *regset,
   i387_collect_fsave (regcache, regnum, fpregs);
 }
 
+/* Similar to i386_supply_fpregset, but use XSAVE extended state.  */
+
+static void
+i386_supply_xstateregset (const struct regset *regset,
+			  struct regcache *regcache, int regnum,
+			  const void *xstateregs, size_t len)
+{
+  const struct gdbarch_tdep *tdep = gdbarch_tdep (regset->arch);
+  i387_supply_xsave (regcache, regnum, xstateregs);
+}
+
+/* Similar to i386_collect_fpregset, but use XSAVE extended state.  */
+
+static void
+i386_collect_xstateregset (const struct regset *regset,
+			   const struct regcache *regcache,
+			   int regnum, void *xstateregs, size_t len)
+{
+  const struct gdbarch_tdep *tdep = gdbarch_tdep (regset->arch);
+  i387_collect_xsave (regcache, regnum, xstateregs, 1);
+}
+
 /* Return the appropriate register set for the core section identified
    by SECT_NAME and SECT_SIZE.  */
 
@@ -2608,6 +2777,16 @@ i386_regset_from_core_section (struct gdbarch *gdbarch,
       return tdep->fpregset;
     }
 
+  if (strcmp (sect_name, ".reg-xstate") == 0)
+    {
+      if (tdep->xstateregset == NULL)
+	tdep->xstateregset = regset_alloc (gdbarch,
+					   i386_supply_xstateregset,
+					   i386_collect_xstateregset);
+
+      return tdep->xstateregset;
+    }
+
   return NULL;
 }
 \f
@@ -2801,46 +2980,60 @@ int
 i386_register_reggroup_p (struct gdbarch *gdbarch, int regnum,
 			  struct reggroup *group)
 {
-  int sse_regnum_p, fp_regnum_p, mmx_regnum_p, byte_regnum_p,
-      word_regnum_p, dword_regnum_p;
+  const struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+  int fp_regnum_p, mmx_regnum_p, xmm_regnum_p, mxcsr_regnum_p,
+      ymm_regnum_p, ymmh_regnum_p;
 
   /* Don't include pseudo registers, except for MMX, in any register
      groups.  */
-  byte_regnum_p = i386_byte_regnum_p (gdbarch, regnum);
-  if (byte_regnum_p)
+  if (i386_byte_regnum_p (gdbarch, regnum))
     return 0;
 
-  word_regnum_p = i386_word_regnum_p (gdbarch, regnum);
-  if (word_regnum_p)
+  if (i386_word_regnum_p (gdbarch, regnum))
     return 0;
 
-  dword_regnum_p = i386_dword_regnum_p (gdbarch, regnum);
-  if (dword_regnum_p)
+  if (i386_dword_regnum_p (gdbarch, regnum))
     return 0;
 
   mmx_regnum_p = i386_mmx_regnum_p (gdbarch, regnum);
   if (group == i386_mmx_reggroup)
     return mmx_regnum_p;
 
-  sse_regnum_p = (i386_sse_regnum_p (gdbarch, regnum)
-		  || i386_mxcsr_regnum_p (gdbarch, regnum));
+  xmm_regnum_p = i386_xmm_regnum_p (gdbarch, regnum);
+  mxcsr_regnum_p = i386_mxcsr_regnum_p (gdbarch, regnum);
   if (group == i386_sse_reggroup)
-    return sse_regnum_p;
+    return xmm_regnum_p || mxcsr_regnum_p;
+
+  ymm_regnum_p = i386_ymm_regnum_p (gdbarch, regnum);
   if (group == vector_reggroup)
-    return mmx_regnum_p || sse_regnum_p;
+    return (mmx_regnum_p
+	    || ymm_regnum_p
+	    || mxcsr_regnum_p
+	    || (xmm_regnum_p
+		&& ((tdep->xcr0 & I386_XSTATE_AVX_MASK)
+		    == I386_XSTATE_SSE_MASK)));
 
   fp_regnum_p = (i386_fp_regnum_p (gdbarch, regnum)
 		 || i386_fpc_regnum_p (gdbarch, regnum));
   if (group == float_reggroup)
     return fp_regnum_p;
 
+  /* For "info reg all", include neither the upper YMM registers nor
+     the XMM registers when AVX is supported.  */
+  ymmh_regnum_p = i386_ymmh_regnum_p (gdbarch, regnum);
+  if (group == all_reggroup
+      && ((xmm_regnum_p
+	   && (tdep->xcr0 & I386_XSTATE_AVX))
+	  || ymmh_regnum_p))
+    return 0;
+
   if (group == general_reggroup)
     return (!fp_regnum_p
 	    && !mmx_regnum_p
-	    && !sse_regnum_p
-	    && !byte_regnum_p
-	    && !word_regnum_p
-	    && !dword_regnum_p);
+	    && !mxcsr_regnum_p
+	    && !xmm_regnum_p
+	    && !ymm_regnum_p
+	    && !ymmh_regnum_p);
 
   return default_register_reggroup_p (gdbarch, regnum, group);
 }
@@ -5665,7 +5858,7 @@ no_support_3dnow_data:
               record_arch_list_add_reg (ir.regcache, i);
 
             for (i = I387_XMM0_REGNUM (tdep);
-                 i386_sse_regnum_p (gdbarch, i); i++)
+                 i386_xmm_regnum_p (gdbarch, i); i++)
               record_arch_list_add_reg (ir.regcache, i);
 
             if (i386_mxcsr_regnum_p (gdbarch, I387_MXCSR_REGNUM(tdep)))
@@ -6065,7 +6258,7 @@ reswitch_prefix_add:
           if (i386_record_modrm (&ir))
 	    return -1;
           ir.reg |= rex_r;
-          if (!i386_sse_regnum_p (gdbarch, I387_XMM0_REGNUM (tdep) + ir.reg))
+          if (!i386_xmm_regnum_p (gdbarch, I387_XMM0_REGNUM (tdep) + ir.reg))
             goto no_support;
           record_arch_list_add_reg (ir.regcache,
                                     I387_XMM0_REGNUM (tdep) + ir.reg);
@@ -6097,7 +6290,7 @@ reswitch_prefix_add:
                   || opcode == 0x0f17 || opcode == 0x660f17)
                 goto no_support;
               ir.rm |= ir.rex_b;
-              if (!i386_sse_regnum_p (gdbarch, I387_XMM0_REGNUM (tdep) + ir.rm))
+              if (!i386_xmm_regnum_p (gdbarch, I387_XMM0_REGNUM (tdep) + ir.rm))
                 goto no_support;
               record_arch_list_add_reg (ir.regcache,
                                         I387_XMM0_REGNUM (tdep) + ir.rm);
@@ -6275,7 +6468,7 @@ reswitch_prefix_add:
           if (i386_record_modrm (&ir))
 	    return -1;
           ir.rm |= ir.rex_b;
-          if (!i386_sse_regnum_p (gdbarch, I387_XMM0_REGNUM (tdep) + ir.rm))
+          if (!i386_xmm_regnum_p (gdbarch, I387_XMM0_REGNUM (tdep) + ir.rm))
             goto no_support;
           record_arch_list_add_reg (ir.regcache,
                                     I387_XMM0_REGNUM (tdep) + ir.rm);
@@ -6329,7 +6522,7 @@ reswitch_prefix_add:
           if (ir.mod == 3)
             {
               ir.rm |= ir.rex_b;
-              if (!i386_sse_regnum_p (gdbarch, I387_XMM0_REGNUM (tdep) + ir.rm))
+              if (!i386_xmm_regnum_p (gdbarch, I387_XMM0_REGNUM (tdep) + ir.rm))
                 goto no_support;
               record_arch_list_add_reg (ir.regcache,
                                         I387_XMM0_REGNUM (tdep) + ir.rm);
@@ -6449,7 +6642,8 @@ i386_validate_tdesc_p (struct gdbarch_tdep *tdep,
 		       struct tdesc_arch_data *tdesc_data)
 {
   const struct target_desc *tdesc = tdep->tdesc;
-  const struct tdesc_feature *feature_core, *feature_vector;
+  const struct tdesc_feature *feature_core;
+  const struct tdesc_feature *feature_sse, *feature_avx;
   int i, num_regs, valid_p;
 
   if (! tdesc_has_registers (tdesc))
@@ -6459,13 +6653,37 @@ i386_validate_tdesc_p (struct gdbarch_tdep *tdep,
   feature_core = tdesc_find_feature (tdesc, "org.gnu.gdb.i386.core");
 
   /* Get SSE registers.  */
-  feature_vector = tdesc_find_feature (tdesc, "org.gnu.gdb.i386.sse");
+  feature_sse = tdesc_find_feature (tdesc, "org.gnu.gdb.i386.sse");
 
-  if (feature_core == NULL || feature_vector == NULL)
+  if (feature_core == NULL || feature_sse == NULL)
     return 0;
 
+  /* Try AVX registers.  */
+  feature_avx = tdesc_find_feature (tdesc, "org.gnu.gdb.i386.avx");
+
   valid_p = 1;
 
+  /* The XCR0 bits.  */
+  if (feature_avx)
+    {
+      tdep->xcr0 = I386_XSTATE_AVX_MASK;
+
+      /* It may have been set by OSABI initialization function.  */
+      if (tdep->num_ymm_regs == 0)
+	{
+	  tdep->ymmh_register_names = i386_ymmh_names;
+	  tdep->num_ymm_regs = 8;
+	  tdep->ymm0h_regnum = I386_YMM0H_REGNUM;
+	}
+
+      for (i = 0; i < tdep->num_ymm_regs; i++)
+	valid_p &= tdesc_numbered_register (feature_avx, tdesc_data,
+					    tdep->ymm0h_regnum + i,
+					    tdep->ymmh_register_names[i]);
+    }
+  else
+    tdep->xcr0 = I386_XSTATE_SSE_MASK;
+
   num_regs = tdep->num_core_regs;
   for (i = 0; i < num_regs; i++)
     valid_p &= tdesc_numbered_register (feature_core, tdesc_data, i,
@@ -6474,7 +6692,7 @@ i386_validate_tdesc_p (struct gdbarch_tdep *tdep,
   /* Need to include %mxcsr, so add one.  */
   num_regs += tdep->num_xmm_regs + 1;
   for (; i < num_regs; i++)
-    valid_p &= tdesc_numbered_register (feature_vector, tdesc_data, i,
+    valid_p &= tdesc_numbered_register (feature_sse, tdesc_data, i,
 					tdep->register_names[i]);
 
   return valid_p;
@@ -6489,6 +6707,7 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   struct tdesc_arch_data *tdesc_data;
   const struct target_desc *tdesc;
   int mm0_regnum;
+  int ymm0_regnum;
 
   /* If there is already a candidate, use it.  */
   arches = gdbarch_list_lookup_by_info (arches, &info);
@@ -6509,6 +6728,8 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   tdep->fpregset = NULL;
   tdep->sizeof_fpregset = I387_SIZEOF_FSAVE;
 
+  tdep->xstateregset = NULL;
+
   /* The default settings include the FPU registers, the MMX registers
      and the SSE registers.  This can be overridden for a specific ABI
      by adjusting the members `st0_regnum', `mm0_regnum' and
@@ -6538,6 +6759,8 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   tdep->sc_pc_offset = -1;
   tdep->sc_sp_offset = -1;
 
+  tdep->xsave_xcr0_offset = -1;
+
   tdep->record_regmap = i386_record_regmap;
 
   /* The format used for `long double' on almost all i386 targets is
@@ -6654,9 +6877,14 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   set_tdesc_pseudo_register_type (gdbarch, i386_pseudo_register_type);
   set_tdesc_pseudo_register_name (gdbarch, i386_pseudo_register_name);
 
-  /* The default ABI includes general-purpose registers, 
-     floating-point registers, and the SSE registers.  */
-  set_gdbarch_num_regs (gdbarch, I386_SSE_NUM_REGS);
+  /* Override the normal target description method to make the AVX
+     upper halves anonymous.  */
+  set_gdbarch_register_name (gdbarch, i386_register_name);
+
+  /* Even though the default ABI only includes general-purpose registers,
+     floating-point registers and the SSE registers, we have to leave a
+     gap for the upper AVX registers.  */
+  set_gdbarch_num_regs (gdbarch, I386_AVX_NUM_REGS);
 
   /* Get the x86 target description from INFO.  */
   tdesc = info.target_desc;
@@ -6667,10 +6895,15 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   tdep->num_core_regs = I386_NUM_GREGS + I387_NUM_REGS;
   tdep->register_names = i386_register_names;
 
+  /* No upper YMM registers.  */
+  tdep->ymmh_register_names = NULL;
+  tdep->ymm0h_regnum = -1;
+
   tdep->num_byte_regs = 8;
   tdep->num_word_regs = 8;
   tdep->num_dword_regs = 0;
   tdep->num_mmx_regs = 8;
+  tdep->num_ymm_regs = 0;
 
   tdesc_data = tdesc_data_alloc ();
 
@@ -6678,24 +6911,25 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   info.tdep_info = (void *) tdesc_data;
   gdbarch_init_osabi (info, gdbarch);
 
+  if (!i386_validate_tdesc_p (tdep, tdesc_data))
+    {
+      tdesc_data_cleanup (tdesc_data);
+      xfree (tdep);
+      gdbarch_free (gdbarch);
+      return NULL;
+    }
+
   /* Wire in pseudo registers.  Number of pseudo registers may be
      changed.  */
   set_gdbarch_num_pseudo_regs (gdbarch, (tdep->num_byte_regs
 					 + tdep->num_word_regs
 					 + tdep->num_dword_regs
-					 + tdep->num_mmx_regs));
+					 + tdep->num_mmx_regs
+					 + tdep->num_ymm_regs));
 
   /* Target description may be changed.  */
   tdesc = tdep->tdesc;
 
-  if (!i386_validate_tdesc_p (tdep, tdesc_data))
-    {
-      tdesc_data_cleanup (tdesc_data);
-      xfree (tdep);
-      gdbarch_free (gdbarch);
-      return NULL;
-    }
-
   tdesc_use_registers (gdbarch, tdesc, tdesc_data);
 
   /* Override gdbarch_register_reggroup_p set in tdesc_use_registers.  */
@@ -6705,16 +6939,26 @@ i386_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   tdep->al_regnum = gdbarch_num_regs (gdbarch);
   tdep->ax_regnum = tdep->al_regnum + tdep->num_byte_regs;
 
-  mm0_regnum = tdep->ax_regnum + tdep->num_word_regs;
+  ymm0_regnum = tdep->ax_regnum + tdep->num_word_regs;
   if (tdep->num_dword_regs)
     {
       /* Support dword pseudo-registesr if it hasn't been disabled,  */
-      tdep->eax_regnum = mm0_regnum;
-      mm0_regnum = tdep->eax_regnum + tdep->num_dword_regs;
+      tdep->eax_regnum = ymm0_regnum;
+      ymm0_regnum += tdep->num_dword_regs;
     }
   else
     tdep->eax_regnum = -1;
 
+  mm0_regnum = ymm0_regnum;
+  if (tdep->num_ymm_regs)
+    {
+      /* Support YMM pseudo-registers if they are available.  */
+      tdep->ymm0_regnum = ymm0_regnum;
+      mm0_regnum += tdep->num_ymm_regs;
+    }
+  else
+    tdep->ymm0_regnum = -1;
+
   if (tdep->num_mmx_regs != 0)
     {
       /* Support MMX pseudo-registesr if MMX hasn't been disabled,  */
@@ -6797,6 +7041,7 @@ is \"default\"."),
 
   /* Initialize the standard target descriptions.  */
   initialize_tdesc_i386 ();
+  initialize_tdesc_i386_avx ();
 
   /* Tell remote stub that we support XML target description.  */
   register_remote_support_xml ("i386");
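
The pseudo-register read and write hunks in i386-tdep.c build each 256-bit $ymmN value from two 128-bit raw registers: the low half comes from $xmmN and the high half from the hidden ymmNh register.  A minimal standalone sketch of that composition follows (not part of the patch).  The function names ymm_from_halves and ymm_to_halves are hypothetical, and plain byte buffers stand in for the regcache.

```c
#include <string.h>

#define XMM_SIZE  16   /* Bytes in an XMM or ymmNh raw register.  */
#define YMM_SIZE  32   /* Bytes in a YMM pseudo register.  */

/* Build the YMM value from its two raw halves.  The data is little
   endian, so the XMM half occupies the low 16 bytes.  */
static void
ymm_from_halves (unsigned char ymm[YMM_SIZE],
                 const unsigned char xmm[XMM_SIZE],
                 const unsigned char ymmh[XMM_SIZE])
{
  memcpy (ymm, xmm, XMM_SIZE);
  memcpy (ymm + XMM_SIZE, ymmh, XMM_SIZE);
}

/* Split a YMM value back into the two raw halves, as a write to the
   pseudo register has to do.  */
static void
ymm_to_halves (const unsigned char ymm[YMM_SIZE],
               unsigned char xmm[XMM_SIZE],
               unsigned char ymmh[XMM_SIZE])
{
  memcpy (xmm, ymm, XMM_SIZE);
  memcpy (ymmh, ymm + XMM_SIZE, XMM_SIZE);
}
```

Keeping the upper halves as separate raw registers lets a target without AVX omit them while $xmmN keeps its existing number and meaning.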
diff --git a/gdb/i386-tdep.h b/gdb/i386-tdep.h
index 72c634e..6520d67 100644
--- a/gdb/i386-tdep.h
+++ b/gdb/i386-tdep.h
@@ -109,6 +109,9 @@ struct gdbarch_tdep
   struct regset *fpregset;
   size_t sizeof_fpregset;
 
+  /* XSAVE extended state.  */
+  struct regset *xstateregset;
+
   /* Register number for %st(0).  The register numbers for the other
      registers follow from this one.  Set this to -1 to indicate the
      absence of an FPU.  */
@@ -121,6 +124,13 @@ struct gdbarch_tdep
      of MMX support.  */
   int mm0_regnum;
 
+  /* Number of pseudo YMM registers.  */
+  int num_ymm_regs;
+
+  /* Register number for %ymm0.  Set this to -1 to indicate the absence
+     of pseudo YMM register support.  */
+  int ymm0_regnum;
+
   /* Number of byte registers.  */
   int num_byte_regs;
 
@@ -146,9 +156,24 @@ struct gdbarch_tdep
   /* Number of SSE registers.  */
   int num_xmm_regs;
 
+  /* Bits of the extended control register 0 (the XFEATURE_ENABLED_MASK
+     register), excluding the x87 bit, which are supported by this GDB.
+   */
+  uint64_t xcr0;
+
+  /* Offset of XCR0 in XSAVE extended state.  */
+  int xsave_xcr0_offset;
+
   /* Register names.  */
   const char **register_names;
 
+  /* Register number for %ymm0h.  Set this to -1 to indicate the absence
+     of upper YMM register support.  */
+  int ymm0h_regnum;
+
+  /* Upper YMM register names.  Only used for tdesc_numbered_register.  */
+  const char **ymmh_register_names;
+
   /* Target description.  */
   const struct target_desc *tdesc;
 
@@ -182,6 +207,7 @@ struct gdbarch_tdep
 
   /* ISA-specific data types.  */
   struct type *i386_mmx_type;
+  struct type *i386_ymm_type;
   struct type *i387_ext_type;
 
   /* Process record/replay target.  */
@@ -228,7 +254,9 @@ enum i386_regnum
   I386_FS_REGNUM,		/* %fs */
   I386_GS_REGNUM,		/* %gs */
   I386_ST0_REGNUM,		/* %st(0) */
-  I386_MXCSR_REGNUM = 40	/* %mxcsr */ 
+  I386_MXCSR_REGNUM = 40,	/* %mxcsr */ 
+  I386_YMM0H_REGNUM,		/* %ymm0h */
+  I386_YMM7H_REGNUM = I386_YMM0H_REGNUM + 7
 };
 
 /* Register numbers of RECORD_REGMAP.  */
@@ -265,6 +293,7 @@ enum record_i386_regnum
 #define I386_NUM_XREGS  9
 
 #define I386_SSE_NUM_REGS	(I386_MXCSR_REGNUM + 1)
+#define I386_AVX_NUM_REGS	(I386_YMM7H_REGNUM + 1)
 
 /* Size of the largest register.  */
 #define I386_MAX_REGISTER_SIZE	16
@@ -276,6 +305,9 @@ extern struct type *i387_ext_type (struct gdbarch *gdbarch);
 extern int i386_byte_regnum_p (struct gdbarch *gdbarch, int regnum);
 extern int i386_word_regnum_p (struct gdbarch *gdbarch, int regnum);
 extern int i386_dword_regnum_p (struct gdbarch *gdbarch, int regnum);
+extern int i386_xmm_regnum_p (struct gdbarch *gdbarch, int regnum);
+extern int i386_ymm_regnum_p (struct gdbarch *gdbarch, int regnum);
+extern int i386_ymmh_regnum_p (struct gdbarch *gdbarch, int regnum);
 
 extern const char *i386_pseudo_register_name (struct gdbarch *gdbarch,
 					      int regnum);
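
For illustration only, here is a rough sketch of how the new raw %ymmNh
registers relate to the pseudo %ymmN registers and to the register
numbering above.  It assumes, as the field comments suggest, that a
pseudo %ymmN is the raw %xmmN (low 128 bits) followed by the raw %ymmNh
(upper 128 bits).  compose_ymm and the SKETCH_ names are hypothetical
and do not appear in the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of the patch: build the 32-byte value
   of a pseudo %ymmN from the raw %xmmN (low half) and the raw %ymmNh
   (upper half).  x86 is little-endian, so the low half comes first.  */
static void
compose_ymm (const uint8_t xmm[16], const uint8_t ymmh[16], uint8_t ymm[32])
{
  memcpy (ymm, xmm, 16);
  memcpy (ymm + 16, ymmh, 16);
}

/* Raw register numbers implied by the enum change for 32-bit targets:
   the eight %ymmNh registers directly follow %mxcsr.  */
enum
{
  SKETCH_MXCSR_REGNUM = 40,
  SKETCH_YMM0H_REGNUM,
  SKETCH_YMM7H_REGNUM = SKETCH_YMM0H_REGNUM + 7
};

#define SKETCH_AVX_NUM_REGS (SKETCH_YMM7H_REGNUM + 1)
```

With this numbering, SKETCH_AVX_NUM_REGS is the count of raw registers
once the upper halves are included, which is what I386_AVX_NUM_REGS is
meant to express.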


* Re: PATCH: 5/6 [3rd try]: Add AVX support (i387 changes)
  2010-03-12 17:24         ` H.J. Lu
@ 2010-04-07 16:57           ` H.J. Lu
  0 siblings, 0 replies; 115+ messages in thread
From: H.J. Lu @ 2010-04-07 16:57 UTC (permalink / raw)
  To: GDB

On Fri, Mar 12, 2010 at 09:24:45AM -0800, H.J. Lu wrote:
> On Sat, Mar 06, 2010 at 02:22:12PM -0800, H.J. Lu wrote:
> > Hi,
> > 
> > Here are i387 changes to support AVX.  OK to install?
> >  
> > Thanks.
> > 
> 

Here is the updated patch.  I reduced the size of i387_collect_xsave by
removing the gcore optimization.  OK to install?
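
To make the collect path easier to review, here is a minimal sketch of
the per-register step it repeats: compare the regcache value with the
slot in the XSAVE buffer, copy it only if it differs, and report which
feature bit must then be set in `xstate_bv'.  collect_one and the
SKETCH_ macros are hypothetical stand-ins; the bit values are the
architectural x87/SSE/AVX bits.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-ins for I386_XSTATE_X87, I386_XSTATE_SSE and I386_XSTATE_AVX.  */
#define SKETCH_XSTATE_X87 (1u << 0)
#define SKETCH_XSTATE_SSE (1u << 1)
#define SKETCH_XSTATE_AVX (1u << 2)

/* Copy the LEN-byte register value RAW into SLOT if it differs from
   what is already there.  Return FEATURE when the slot was changed and
   0 when it was left alone.  */
static unsigned int
collect_one (uint8_t *slot, const uint8_t *raw, size_t len,
	     unsigned int feature)
{
  if (memcmp (raw, slot, len) == 0)
    return 0;

  memcpy (slot, raw, len);
  return feature;
}
```

A caller would OR the results together, for example
`bv |= collect_one (slot, raw, 16, SKETCH_XSTATE_AVX)', and then OR `bv'
into the `xstate_bv' byte of the buffer.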

Thanks.


H.J.
---
2010-04-07  H.J. Lu  <hongjiu.lu@intel.com>

	* i387-tdep.c: Include "i386-xstate.h".
	(XSAVE_XSTATE_BV_ADDR): New.
	(xsave_avxh_offset): Likewise.
	(XSAVE_AVXH_ADDR): Likewise.
	(i387_supply_xsave): Likewise.
	(i387_collect_xsave): Likewise.

	* i387-tdep.h (I387_NUM_YMM_REGS): New.
	(I387_YMM0H_REGNUM): Likewise.
	(I387_YMMENDH_REGNUM): Likewise.
	(i387_supply_xsave): Likewise.
	(i387_collect_xsave): Likewise.

diff --git a/gdb/i387-tdep.c b/gdb/i387-tdep.c
index 3fb5b56..4c32e09 100644
--- a/gdb/i387-tdep.c
+++ b/gdb/i387-tdep.c
@@ -34,6 +34,7 @@
 
 #include "i386-tdep.h"
 #include "i387-tdep.h"
+#include "i386-xstate.h"
 
 /* Print the floating point number specified by RAW.  */
 
@@ -677,6 +678,475 @@ i387_collect_fxsave (const struct regcache *regcache, int regnum, void *fxsave)
 			  FXSAVE_MXCSR_ADDR (regs));
 }
 
+/* `xstate_bv' is at byte offset 512.  */
+#define XSAVE_XSTATE_BV_ADDR(xsave) (xsave + 512)
+
+/* At xsave_avxh_offset[REGNUM] you'll find the offset, within the data
+   structure used by the "xsave" instruction, of the upper 128 bits of
+   the AVX register that GDB register REGNUM refers to.  */
+
+static int xsave_avxh_offset[] =
+{
+  576 + 0 * 16,		/* Upper 128bit of %ymm0 through ...  */
+  576 + 1 * 16,
+  576 + 2 * 16,
+  576 + 3 * 16,
+  576 + 4 * 16,
+  576 + 5 * 16,
+  576 + 6 * 16,
+  576 + 7 * 16,
+  576 + 8 * 16,
+  576 + 9 * 16,
+  576 + 10 * 16,
+  576 + 11 * 16,
+  576 + 12 * 16,
+  576 + 13 * 16,
+  576 + 14 * 16,
+  576 + 15 * 16		/* Upper 128bit of ... %ymm15 (128 bits each).  */
+};
+
+#define XSAVE_AVXH_ADDR(tdep, xsave, regnum) \
+  (xsave + xsave_avxh_offset[regnum - I387_YMM0H_REGNUM (tdep)])
+
+/* Similar to i387_supply_fxsave, but use XSAVE extended state.  */
+
+void
+i387_supply_xsave (struct regcache *regcache, int regnum,
+		   const void *xsave)
+{
+  struct gdbarch_tdep *tdep = gdbarch_tdep (get_regcache_arch (regcache));
+  const gdb_byte *regs = xsave;
+  int i;
+  unsigned int clear_bv;
+  const gdb_byte *p;
+  enum
+    {
+      none = 0x0,
+      x87 = 0x1,
+      sse = 0x2,
+      avxh = 0x4,
+      all = x87 | sse | avxh
+    } regclass;
+
+  gdb_assert (tdep->st0_regnum >= I386_ST0_REGNUM);
+  gdb_assert (tdep->num_xmm_regs > 0);
+
+  if (regnum == -1)
+    regclass = all;
+  else if (regnum >= I387_YMM0H_REGNUM (tdep)
+	   && regnum < I387_YMMENDH_REGNUM (tdep))
+    regclass = avxh;
+  else if (regnum >= I387_XMM0_REGNUM(tdep)
+	   && regnum < I387_MXCSR_REGNUM (tdep))
+    regclass = sse;
+  else if (regnum >= I387_ST0_REGNUM (tdep)
+	   && regnum < I387_FCTRL_REGNUM (tdep))
+    regclass = x87;
+  else
+    regclass = none;
+
+  if (regs != NULL && regclass != none)
+    {
+      /* Get `xstate_bv'.  */
+      const gdb_byte *xstate_bv_p = XSAVE_XSTATE_BV_ADDR (regs);
+
+      /* The supported bits in `xstate_bv' fit in 1 byte.  Clear part in
+	 vector registers if its bit in xstate_bv is zero.  */
+      clear_bv = (~(*xstate_bv_p)) & tdep->xcr0;
+    }
+  else
+    clear_bv = I386_XSTATE_AVX_MASK;
+
+  switch (regclass)
+    {
+    case none:
+      break;
+
+    case avxh:
+      if ((clear_bv & I386_XSTATE_AVX))
+	p = NULL;
+      else
+	p = XSAVE_AVXH_ADDR (tdep, regs, regnum);
+      regcache_raw_supply (regcache, regnum, p);
+      return;
+
+    case sse:
+      if ((clear_bv & I386_XSTATE_SSE))
+	p = NULL;
+      else
+	p = FXSAVE_ADDR (tdep, regs, regnum);
+      regcache_raw_supply (regcache, regnum, p);
+      return;
+
+    case x87:
+      if ((clear_bv & I386_XSTATE_X87))
+	p = NULL;
+      else
+	p = FXSAVE_ADDR (tdep, regs, regnum);
+      regcache_raw_supply (regcache, regnum, p);
+      return;
+
+    case all:
+      /* Handle the upper YMM registers.  */
+      if ((tdep->xcr0 & I386_XSTATE_AVX))
+	{
+	  if ((clear_bv & I386_XSTATE_AVX))
+	    p = NULL;
+	  else
+	    p = regs;
+
+	  for (i = I387_YMM0H_REGNUM (tdep);
+	       i < I387_YMMENDH_REGNUM (tdep); i++)
+	    {
+	      if (p != NULL)
+		p = XSAVE_AVXH_ADDR (tdep, regs, i);
+	      regcache_raw_supply (regcache, i, p);
+	    }
+	}
+
+      /* Handle the XMM registers.  */
+      if ((tdep->xcr0 & I386_XSTATE_SSE))
+	{
+	  if ((clear_bv & I386_XSTATE_SSE))
+	    p = NULL;
+	  else
+	    p = regs;
+
+	  for (i = I387_XMM0_REGNUM (tdep);
+	       i < I387_MXCSR_REGNUM (tdep); i++)
+	    {
+	      if (p != NULL)
+		p = FXSAVE_ADDR (tdep, regs, i);
+	      regcache_raw_supply (regcache, i, p);
+	    }
+	}
+
+      /* Handle the x87 registers.  */
+      if ((tdep->xcr0 & I386_XSTATE_X87))
+	{
+	  if ((clear_bv & I386_XSTATE_X87))
+	    p = NULL;
+	  else
+	    p = regs;
+
+	  for (i = I387_ST0_REGNUM (tdep);
+	       i < I387_FCTRL_REGNUM (tdep); i++)
+	    {
+	      if (p != NULL)
+		p = FXSAVE_ADDR (tdep, regs, i);
+	      regcache_raw_supply (regcache, i, p);
+	    }
+	}
+      break;
+    }
+
+  /* Only handle x87 control registers.  */
+  for (i = I387_FCTRL_REGNUM (tdep); i < I387_XMM0_REGNUM (tdep); i++)
+    if (regnum == -1 || regnum == i)
+      {
+	if (regs == NULL)
+	  {
+	    regcache_raw_supply (regcache, i, NULL);
+	    continue;
+	  }
+
+	/* Most of the FPU control registers occupy only 16 bits in
+	   the xsave extended state.  Give those a special treatment.  */
+	if (i != I387_FIOFF_REGNUM (tdep)
+	    && i != I387_FOOFF_REGNUM (tdep))
+	  {
+	    gdb_byte val[4];
+
+	    memcpy (val, FXSAVE_ADDR (tdep, regs, i), 2);
+	    val[2] = val[3] = 0;
+	    if (i == I387_FOP_REGNUM (tdep))
+	      val[1] &= ((1 << 3) - 1);
+	    else if (i == I387_FTAG_REGNUM (tdep))
+	      {
+		/* The fxsave area contains a simplified version of
+		   the tag word.  We have to look at the actual 80-bit
+		   FP data to recreate the traditional i387 tag word.  */
+
+		unsigned long ftag = 0;
+		int fpreg;
+		int top;
+
+		top = ((FXSAVE_ADDR (tdep, regs,
+				     I387_FSTAT_REGNUM (tdep)))[1] >> 3);
+		top &= 0x7;
+
+		for (fpreg = 7; fpreg >= 0; fpreg--)
+		  {
+		    int tag;
+
+		    if (val[0] & (1 << fpreg))
+		      {
+			int regnum = (fpreg + 8 - top) % 8 
+				       + I387_ST0_REGNUM (tdep);
+			tag = i387_tag (FXSAVE_ADDR (tdep, regs, regnum));
+		      }
+		    else
+		      tag = 3;		/* Empty */
+
+		    ftag |= tag << (2 * fpreg);
+		  }
+		val[0] = ftag & 0xff;
+		val[1] = (ftag >> 8) & 0xff;
+	      }
+	    regcache_raw_supply (regcache, i, val);
+	  }
+	else 
+	  regcache_raw_supply (regcache, i, FXSAVE_ADDR (tdep, regs, i));
+      }
+
+  if (regnum == I387_MXCSR_REGNUM (tdep) || regnum == -1)
+    {
+      p = regs == NULL ? NULL : FXSAVE_MXCSR_ADDR (regs);
+      regcache_raw_supply (regcache, I387_MXCSR_REGNUM (tdep), p);
+    }
+}
+
+/* Similar to i387_collect_fxsave, but use XSAVE extended state.  */
+
+void
+i387_collect_xsave (const struct regcache *regcache, int regnum,
+		    void *xsave, int gcore)
+{
+  struct gdbarch_tdep *tdep = gdbarch_tdep (get_regcache_arch (regcache));
+  gdb_byte *regs = xsave;
+  int i;
+  enum
+    {
+      none = 0x0,
+      check = 0x1,
+      x87 = 0x2 | check,
+      sse = 0x4 | check,
+      avxh = 0x8 | check,
+      all = x87 | sse | avxh
+    } regclass;
+
+  gdb_assert (tdep->st0_regnum >= I386_ST0_REGNUM);
+  gdb_assert (tdep->num_xmm_regs > 0);
+
+  if (regnum == -1)
+    regclass = all;
+  else if (regnum >= I387_YMM0H_REGNUM (tdep)
+	   && regnum < I387_YMMENDH_REGNUM (tdep))
+    regclass = avxh;
+  else if (regnum >= I387_XMM0_REGNUM(tdep)
+	   && regnum < I387_MXCSR_REGNUM (tdep))
+    regclass = sse;
+  else if (regnum >= I387_ST0_REGNUM (tdep)
+	   && regnum < I387_FCTRL_REGNUM (tdep))
+    regclass = x87;
+  else
+    regclass = none;
+
+  if (gcore)
+    {
+      /* Clear XSAVE extended state.  */
+      memset (regs, 0, I386_XSTATE_SIZE (tdep->xcr0));
+
+      /* Update XCR0 and `xstate_bv' with XCR0 for gcore.  */
+      if (tdep->xsave_xcr0_offset != -1)
+	memcpy (regs + tdep->xsave_xcr0_offset, &tdep->xcr0, 8);
+      memcpy (XSAVE_XSTATE_BV_ADDR (regs), &tdep->xcr0, 8);
+    }
+
+  if ((regclass & check))
+    {
+      gdb_byte raw[I386_MAX_REGISTER_SIZE];
+      gdb_byte *xstate_bv_p = XSAVE_XSTATE_BV_ADDR (regs);
+      unsigned int xstate_bv = 0;
+      /* The supported bits in `xstate_bv' fit in 1 byte.  */
+      unsigned int clear_bv = (~(*xstate_bv_p)) & tdep->xcr0;
+      gdb_byte *p;
+
+      /* Clear register set if its bit in xstate_bv is zero.  */
+      if (clear_bv)
+	{
+	  if ((clear_bv & I386_XSTATE_AVX))
+	    for (i = I387_YMM0H_REGNUM (tdep);
+		 i < I387_YMMENDH_REGNUM (tdep); i++)
+	      memset (XSAVE_AVXH_ADDR (tdep, regs, i), 0, 16);
+
+	  if ((clear_bv & I386_XSTATE_SSE))
+	    for (i = I387_XMM0_REGNUM (tdep);
+		 i < I387_MXCSR_REGNUM (tdep); i++)
+	      memset (FXSAVE_ADDR (tdep, regs, i), 0, 16);
+
+	  if ((clear_bv & I386_XSTATE_X87))
+	    for (i = I387_ST0_REGNUM (tdep);
+		 i < I387_FCTRL_REGNUM (tdep); i++)
+	      memset (FXSAVE_ADDR (tdep, regs, i), 0, 10);
+	}
+
+      if (regclass == all)
+	{
+	  /* Check if any upper YMM registers are changed.  */
+	  if ((tdep->xcr0 & I386_XSTATE_AVX))
+	    for (i = I387_YMM0H_REGNUM (tdep);
+		 i < I387_YMMENDH_REGNUM (tdep); i++)
+	      {
+		regcache_raw_collect (regcache, i, raw);
+		p = XSAVE_AVXH_ADDR (tdep, regs, i);
+		if (memcmp (raw, p, 16))
+		  {
+		    xstate_bv |= I386_XSTATE_AVX;
+		    memcpy (p, raw, 16);
+		  }
+	      }
+
+	  /* Check if any SSE registers are changed.  */
+	  if ((tdep->xcr0 & I386_XSTATE_SSE))
+	    for (i = I387_XMM0_REGNUM (tdep);
+		 i < I387_MXCSR_REGNUM (tdep); i++)
+	      {
+		regcache_raw_collect (regcache, i, raw);
+		p = FXSAVE_ADDR (tdep, regs, i);
+		if (memcmp (raw, p, 16))
+		  {
+		    xstate_bv |= I386_XSTATE_SSE;
+		    memcpy (p, raw, 16);
+		  }
+	      }
+
+	  /* Check if any X87 registers are changed.  */
+	  if ((tdep->xcr0 & I386_XSTATE_X87))
+	    for (i = I387_ST0_REGNUM (tdep);
+		 i < I387_FCTRL_REGNUM (tdep); i++)
+	      {
+		regcache_raw_collect (regcache, i, raw);
+		p = FXSAVE_ADDR (tdep, regs, i);
+		if (memcmp (raw, p, 10))
+		  {
+		    xstate_bv |= I386_XSTATE_X87;
+		    memcpy (p, raw, 10);
+		  }
+	      }
+	}
+      else
+	{
+	  /* Check if REGNUM is changed.  */
+	  regcache_raw_collect (regcache, regnum, raw);
+
+	  switch (regclass)
+	    {
+	    default:
+		  abort ();
+
+		case avxh:
+		  /* This is an upper YMM register.  */
+		  p = XSAVE_AVXH_ADDR (tdep, regs, regnum);
+		  if (memcmp (raw, p, 16))
+		    {
+		      xstate_bv |= I386_XSTATE_AVX;
+		      memcpy (p, raw, 16);
+		    }
+		  break;
+
+		case sse:
+		  /* This is an SSE register.  */
+		  p = FXSAVE_ADDR (tdep, regs, regnum);
+		  if (memcmp (raw, p, 16))
+		    {
+		      xstate_bv |= I386_XSTATE_SSE;
+		      memcpy (p, raw, 16);
+		    }
+		  break;
+
+		case x87:
+		  /* This is an x87 register.  */
+		  p = FXSAVE_ADDR (tdep, regs, regnum);
+		  if (memcmp (raw, p, 10))
+		    {
+		      xstate_bv |= I386_XSTATE_X87;
+		      memcpy (p, raw, 10);
+		    }
+		  break;
+		}
+	    }
+
+	  /* Update the corresponding bits in `xstate_bv' if any x87, SSE or
+	     AVX registers are changed.  */
+	  if (xstate_bv)
+	    {
+	      /* The supported bits in `xstate_bv' fit in 1 byte.  */
+	      *xstate_bv_p |= (gdb_byte) xstate_bv;
+
+	      switch (regclass)
+		{
+		default:
+		  abort ();
+
+		case all:
+		  break;
+
+		case x87:
+		case sse:
+		case avxh:
+		  /* Register REGNUM has been updated.  Return.  */
+		  return;
+		}
+	    }
+	  else
+	    {
+	      /* Return if REGNUM isn't changed.  */
+	      if (regclass != all)
+		return;
+	    }
+    }
+
+  /* Only handle x87 control registers.  */
+  for (i = I387_FCTRL_REGNUM (tdep); i < I387_XMM0_REGNUM (tdep); i++)
+    if (regnum == -1 || regnum == i)
+      {
+	/* Most of the FPU control registers occupy only 16 bits in
+	   the xsave extended state.  Give those a special treatment.  */
+	if (i != I387_FIOFF_REGNUM (tdep)
+	    && i != I387_FOOFF_REGNUM (tdep))
+	  {
+	    gdb_byte buf[4];
+
+	    regcache_raw_collect (regcache, i, buf);
+
+	    if (i == I387_FOP_REGNUM (tdep))
+	      {
+		/* The opcode occupies only 11 bits.  Make sure we
+                   don't touch the other bits.  */
+		buf[1] &= ((1 << 3) - 1);
+		buf[1] |= ((FXSAVE_ADDR (tdep, regs, i))[1] & ~((1 << 3) - 1));
+	      }
+	    else if (i == I387_FTAG_REGNUM (tdep))
+	      {
+		/* Converting back is much easier.  */
+
+		unsigned short ftag;
+		int fpreg;
+
+		ftag = (buf[1] << 8) | buf[0];
+		buf[0] = 0;
+		buf[1] = 0;
+
+		for (fpreg = 7; fpreg >= 0; fpreg--)
+		  {
+		    int tag = (ftag >> (fpreg * 2)) & 3;
+
+		    if (tag != 3)
+		      buf[0] |= (1 << fpreg);
+		  }
+	      }
+	    memcpy (FXSAVE_ADDR (tdep, regs, i), buf, 2);
+	  }
+	else
+	  regcache_raw_collect (regcache, i, FXSAVE_ADDR (tdep, regs, i));
+      }
+
+  if (regnum == I387_MXCSR_REGNUM (tdep) || regnum == -1)
+    regcache_raw_collect (regcache, I387_MXCSR_REGNUM (tdep),
+			  FXSAVE_MXCSR_ADDR (regs));
+}
+
 /* Recreate the FTW (tag word) valid bits from the 80-bit FP data in
    *RAW.  */
 
diff --git a/gdb/i387-tdep.h b/gdb/i387-tdep.h
index 645eb91..976fa11 100644
--- a/gdb/i387-tdep.h
+++ b/gdb/i387-tdep.h
@@ -33,6 +33,8 @@ struct ui_file;
 #define I387_ST0_REGNUM(tdep) ((tdep)->st0_regnum)
 #define I387_NUM_XMM_REGS(tdep) ((tdep)->num_xmm_regs)
 #define I387_MM0_REGNUM(tdep) ((tdep)->mm0_regnum)
+#define I387_NUM_YMM_REGS(tdep) ((tdep)->num_ymm_regs)
+#define I387_YMM0H_REGNUM(tdep) ((tdep)->ymm0h_regnum)
 
 #define I387_FCTRL_REGNUM(tdep) (I387_ST0_REGNUM (tdep) + 8)
 #define I387_FSTAT_REGNUM(tdep) (I387_FCTRL_REGNUM (tdep) + 1)
@@ -45,6 +47,8 @@ struct ui_file;
 #define I387_XMM0_REGNUM(tdep) (I387_ST0_REGNUM (tdep) + 16)
 #define I387_MXCSR_REGNUM(tdep) \
   (I387_XMM0_REGNUM (tdep) + I387_NUM_XMM_REGS (tdep))
+#define I387_YMMENDH_REGNUM(tdep) \
+  (I387_YMM0H_REGNUM (tdep) + I387_NUM_YMM_REGS (tdep))
 
 /* Print out the i387 floating point state.  */
 
@@ -99,6 +103,11 @@ extern void i387_collect_fsave (const struct regcache *regcache, int regnum,
 extern void i387_supply_fxsave (struct regcache *regcache, int regnum,
 				const void *fxsave);
 
+/* Similar to i387_supply_fxsave, but use XSAVE extended state.  */
+
+extern void i387_supply_xsave (struct regcache *regcache, int regnum,
+			       const void *xsave);
+
 /* Fill register REGNUM (if it is a floating-point or SSE register) in
    *FXSAVE with the value from REGCACHE.  If REGNUM is -1, do this for
    all registers.  This function doesn't touch any of the reserved
@@ -107,6 +116,11 @@ extern void i387_supply_fxsave (struct regcache *regcache, int regnum,
 extern void i387_collect_fxsave (const struct regcache *regcache, int regnum,
 				 void *fxsave);
 
+/* Similar to i387_collect_fxsave, but use XSAVE extended state.  */
+
+extern void i387_collect_xsave (const struct regcache *regcache,
+				int regnum, void *xsave, int gcore);
+
 /* Prepare the FPU stack in REGCACHE for a function return.  */
 
 extern void i387_return_value (struct gdbarch *gdbarch,
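
As a reading aid for the supply path, here is a small sketch of how the
location of an upper YMM half is derived and when it is treated as zero
instead.  It uses the offsets from the patch (`xstate_bv' at byte 512,
upper halves starting at byte 576, 16 bytes each).  ymmh_slot and the
SKETCH_ macros are hypothetical names; returning NULL stands in for
passing NULL to regcache_raw_supply.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_XSTATE_BV_OFFSET 512
#define SKETCH_AVXH_OFFSET      576
#define SKETCH_XSTATE_AVX       (1u << 2)

/* Return the address of the upper 128 bits of %ymmN inside the XSAVE
   buffer XSAVE, or NULL when the AVX bit is supported in XCR0 but clear
   in xstate_bv, in which case the register should read as zero.  */
static const uint8_t *
ymmh_slot (const uint8_t *xsave, uint64_t xcr0, int n)
{
  unsigned int clear_bv;

  assert (n >= 0 && n < 16);

  /* Only the low byte of xstate_bv is examined, as in the patch.  */
  clear_bv = (~(unsigned int) xsave[SKETCH_XSTATE_BV_OFFSET]
	      & (unsigned int) xcr0);

  if (clear_bv & SKETCH_XSTATE_AVX)
    return NULL;

  return xsave + SKETCH_AVXH_OFFSET + n * 16;
}
```

The same pattern applies to the x87 and SSE areas, except that their
slots live in the legacy FXSAVE region at the start of the buffer.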


* Re: PATCH: 6/6 [3rd try]: Add AVX support (gdbserver changes)
  2010-03-30 16:48               ` H.J. Lu
  2010-04-02 17:39                 ` Daniel Jacobowitz
  2010-04-03 21:57                 ` Jan Kratochvil
@ 2010-04-07 16:59                 ` H.J. Lu
  2 siblings, 0 replies; 115+ messages in thread
From: H.J. Lu @ 2010-04-07 16:59 UTC (permalink / raw)
  To: GDB

On Tue, Mar 30, 2010 at 09:48:33AM -0700, H.J. Lu wrote:
> On Sun, Mar 28, 2010 at 06:09:35PM -0700, H.J. Lu wrote:
> > Hi,
> > 
> > Here are gdbserver changes to support AVX.  OK to install?
> > 
> > Thanks.
> > 
> 
> Here is the updated gdbserver change.  I tested it with
> 
> # gdbserver --multi host:10000
> 
> connecting from
> 
> 1. gdb without AVX support.
> 2. gdb with AVX support.
> 3. gdb without XML support.
> 
> to debug 32bit and 64bit binaries.  Everything works correctly.
> OK to install?
> 
> Thanks.
> 
> 

Here is the updated patch.  I will check it in together with the
i386/amd64 changes.
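
For anyone checking the gdbserver buffer layout against the offsets
mentioned earlier in the thread (XCR0 at byte 464, `xstate_bv' at byte
512, upper YMM halves at byte 576), here is a small compile-time check.
struct sketch_xsave mirrors the field order of struct i387_xsave in
this patch; the name is hypothetical, and the checks assume 2-byte
short, 4-byte int and 8-byte long long.

```c
#include <assert.h>
#include <stddef.h>

/* Same field order and sizes as struct i387_xsave in the patch.  */
struct sketch_xsave
{
  unsigned short fctrl, fstat, ftag, fop;
  unsigned int fioff;
  unsigned short fiseg, pad1;
  unsigned int fooff;
  unsigned short foseg, pad12;
  unsigned int mxcsr, mxcsr_mask;
  unsigned char st_space[128];	/* Eight x87 registers, 16 bytes each.  */
  unsigned char xmm_space[256];	/* Up to sixteen XMM registers.  */
  unsigned char reserved1[48];
  unsigned long long xcr0;	/* Stored by the Linux kernel.  */
  unsigned char reserved2[40];
  unsigned long long xstate_bv;
  unsigned char reserved3[56];
  unsigned char ymmh_space[256];	/* Upper halves of %ymm0..%ymm15.  */
};

/* C11 compile-time checks of the offsets quoted in this thread.  */
_Static_assert (offsetof (struct sketch_xsave, xcr0) == 464, "xcr0");
_Static_assert (offsetof (struct sketch_xsave, xstate_bv) == 512,
		"xstate_bv");
_Static_assert (offsetof (struct sketch_xsave, ymmh_space) == 576,
		"ymmh_space");
```

If any of the reserved arrays were sized differently, the build would
fail here rather than silently reading the wrong bytes at run time.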

Thanks.

H.J.
---
2010-04-06  H.J. Lu  <hongjiu.lu@intel.com>

	* Makefile.in (clean): Updated.
	(i386-avx.o): New.
	(i386-avx.c): Likewise.
	(i386-avx-linux.o): Likewise.
	(i386-avx-linux.c): Likewise.
	(amd64-avx.o): Likewise.
	(amd64-avx.c): Likewise.
	(amd64-avx-linux.o): Likewise.
	(amd64-avx-linux.c): Likewise.

	* configure.srv (srv_i386_regobj): Add i386-avx.o.
	(srv_i386_linux_regobj): Add i386-avx-linux.o.
	(srv_amd64_regobj): Add amd64-avx.o.
	(srv_amd64_linux_regobj): Add amd64-avx-linux.o.
	(srv_i386_32bit_xmlfiles): Add i386/32bit-avx.xml.
	(srv_i386_64bit_xmlfiles): Add i386/64bit-avx.xml.
	(srv_i386_xmlfiles): Add i386/i386-avx.xml.
	(srv_amd64_xmlfiles): Add i386/amd64-avx.xml.
	(srv_i386_linux_xmlfiles): Add i386/i386-avx-linux.xml.
	(srv_amd64_linux_xmlfiles): Add i386/amd64-avx-linux.xml.

	* i387-fp.c: Include "i386-xstate.h".
	(i387_xsave): New.
	(i387_cache_to_xsave): Likewise.
	(i387_xsave_to_cache): Likewise.
	(x86_xcr0): Likewise.

	* i387-fp.h (i387_cache_to_xsave): Likewise.
	(i387_xsave_to_cache): Likewise.
	(x86_xcr0): Likewise.

	* linux-arm-low.c (target_regsets): Initialize nt_type to 0.
	* linux-crisv32-low.c (target_regsets): Likewise.
	* linux-m68k-low.c (target_regsets): Likewise.
	* linux-mips-low.c (target_regsets): Likewise.
	* linux-ppc-low.c (target_regsets): Likewise.
	* linux-s390-low.c (target_regsets): Likewise.
	* linux-sh-low.c (target_regsets): Likewise.
	* linux-sparc-low.c (target_regsets): Likewise.
	* linux-xtensa-low.c (target_regsets): Likewise.

	* linux-low.c: Include <sys/uio.h>.
	(regsets_fetch_inferior_registers): Support nt_type.
	(regsets_store_inferior_registers): Likewise.
	(linux_process_qsupported): New.
	(linux_target_ops): Add linux_process_qsupported.

	* linux-low.h (regset_info): Add nt_type.
	(linux_target_ops): Add process_qsupported.

	* linux-x86-low.c: Include "i386-xstate.h", "elf/common.h"
	and <sys/uio.h>.
	(init_registers_i386_avx_linux): New.
	(init_registers_amd64_avx_linux): Likewise.
	(xmltarget_i386_linux_no_xml): Likewise.
	(xmltarget_amd64_linux_no_xml): Likewise.
	(PTRACE_GETREGSET): Likewise.
	(PTRACE_SETREGSET): Likewise.
	(x86_fill_xstateregset): Likewise.
	(x86_store_xstateregset): Likewise.
	(use_xml): Likewise.
	(x86_linux_update_xmltarget): Likewise.
	(x86_linux_process_qsupported): Likewise.
	(target_regsets): Add NT_X86_XSTATE entry and initialize nt_type.
	(x86_arch_setup): Don't call init_registers_amd64_linux nor
	init_registers_i386_linux here.  Call
	x86_linux_update_xmltarget.
	(the_low_target): Add x86_linux_process_qsupported.

	* server.c (handle_query): Call target_process_qsupported.

	* target.h (target_ops): Add process_qsupported.
	(target_process_qsupported): New.

diff --git a/gdb/gdbserver/Makefile.in b/gdb/gdbserver/Makefile.in
index 7fecced..2ec9784 100644
--- a/gdb/gdbserver/Makefile.in
+++ b/gdb/gdbserver/Makefile.in
@@ -217,6 +217,8 @@ clean:
 	rm -f powerpc-isa205-vsx64l.c
 	rm -f s390-linux32.c s390-linux64.c s390x-linux64.c
 	rm -f xml-builtin.c stamp-xml
+	rm -f i386-avx.c i386-avx-linux.c
+	rm -f amd64-avx.c amd64-avx-linux.c
 
 maintainer-clean realclean distclean: clean
 	rm -f nm.h tm.h xm.h config.status config.h stamp-h config.log
@@ -351,6 +353,12 @@ i386.c : $(srcdir)/../regformats/i386/i386.dat $(regdat_sh)
 i386-linux.o : i386-linux.c $(regdef_h)
 i386-linux.c : $(srcdir)/../regformats/i386/i386-linux.dat $(regdat_sh)
 	$(SHELL) $(regdat_sh) $(srcdir)/../regformats/i386/i386-linux.dat i386-linux.c
+i386-avx.o : i386-avx.c $(regdef_h)
+i386-avx.c : $(srcdir)/../regformats/i386/i386-avx.dat $(regdat_sh)
+	$(SHELL) $(regdat_sh) $(srcdir)/../regformats/i386/i386-avx.dat i386-avx.c
+i386-avx-linux.o : i386-avx-linux.c $(regdef_h)
+i386-avx-linux.c : $(srcdir)/../regformats/i386/i386-avx-linux.dat $(regdat_sh)
+	$(SHELL) $(regdat_sh) $(srcdir)/../regformats/i386/i386-avx-linux.dat i386-avx-linux.c
 reg-ia64.o : reg-ia64.c $(regdef_h)
 reg-ia64.c : $(srcdir)/../regformats/reg-ia64.dat $(regdat_sh)
 	$(SHELL) $(regdat_sh) $(srcdir)/../regformats/reg-ia64.dat reg-ia64.c
@@ -438,6 +446,12 @@ amd64.c : $(srcdir)/../regformats/i386/amd64.dat $(regdat_sh)
 amd64-linux.o : amd64-linux.c $(regdef_h)
 amd64-linux.c : $(srcdir)/../regformats/i386/amd64-linux.dat $(regdat_sh)
 	$(SHELL) $(regdat_sh) $(srcdir)/../regformats/i386/amd64-linux.dat amd64-linux.c
+amd64-avx.o : amd64-avx.c $(regdef_h)
+amd64-avx.c : $(srcdir)/../regformats/i386/amd64-avx.dat $(regdat_sh)
+	$(SHELL) $(regdat_sh) $(srcdir)/../regformats/i386/amd64-avx.dat amd64-avx.c
+amd64-avx-linux.o : amd64-avx-linux.c $(regdef_h)
+amd64-avx-linux.c : $(srcdir)/../regformats/i386/amd64-avx-linux.dat $(regdat_sh)
+	$(SHELL) $(regdat_sh) $(srcdir)/../regformats/i386/amd64-avx-linux.dat amd64-avx-linux.c
 reg-xtensa.o : reg-xtensa.c $(regdef_h)
 reg-xtensa.c : $(srcdir)/../regformats/reg-xtensa.dat $(regdat_sh)
 	$(SHELL) $(regdat_sh) $(srcdir)/../regformats/reg-xtensa.dat reg-xtensa.c
diff --git a/gdb/gdbserver/configure.srv b/gdb/gdbserver/configure.srv
index f7c80bd..8bc9aeb 100644
--- a/gdb/gdbserver/configure.srv
+++ b/gdb/gdbserver/configure.srv
@@ -22,17 +22,17 @@
 # Default hostio_last_error implementation
 srv_hostio_err_objs="hostio-errno.o"
 
-srv_i386_regobj=i386.o
-srv_i386_linux_regobj=i386-linux.o
-srv_amd64_regobj=amd64.o
-srv_amd64_linux_regobj=amd64-linux.o
+srv_i386_regobj="i386.o i386-avx.o"
+srv_i386_linux_regobj="i386-linux.o i386-avx-linux.o"
+srv_amd64_regobj="amd64.o amd64-avx.o"
+srv_amd64_linux_regobj="amd64-linux.o amd64-avx-linux.o"
 
-srv_i386_32bit_xmlfiles="i386/32bit-core.xml i386/32bit-sse.xml"
-srv_i386_64bit_xmlfiles="i386/64bit-core.xml i386/64bit-sse.xml"
-srv_i386_xmlfiles="i386/i386.xml $srv_i386_32bit_xmlfiles"
-srv_amd64_xmlfiles="i386/amd64.xml $srv_i386_64bit_xmlfiles"
-srv_i386_linux_xmlfiles="i386/i386-linux.xml i386/32bit-linux.xml $srv_i386_32bit_xmlfiles"
-srv_amd64_linux_xmlfiles="i386/amd64-linux.xml i386/64bit-linux.xml $srv_i386_64bit_xmlfiles"
+srv_i386_32bit_xmlfiles="i386/32bit-core.xml i386/32bit-sse.xml i386/32bit-avx.xml"
+srv_i386_64bit_xmlfiles="i386/64bit-core.xml i386/64bit-sse.xml i386/64bit-avx.xml"
+srv_i386_xmlfiles="i386/i386.xml i386/i386-avx.xml $srv_i386_32bit_xmlfiles"
+srv_amd64_xmlfiles="i386/amd64.xml i386/amd64-avx.xml $srv_i386_64bit_xmlfiles"
+srv_i386_linux_xmlfiles="i386/i386-linux.xml i386/i386-avx-linux.xml i386/32bit-linux.xml $srv_i386_32bit_xmlfiles"
+srv_amd64_linux_xmlfiles="i386/amd64-linux.xml i386/amd64-avx-linux.xml i386/64bit-linux.xml $srv_i386_64bit_xmlfiles"
 
 # Input is taken from the "${target}" variable.
 
diff --git a/gdb/gdbserver/i387-fp.c b/gdb/gdbserver/i387-fp.c
index 7ef4ba3..5461022 100644
--- a/gdb/gdbserver/i387-fp.c
+++ b/gdb/gdbserver/i387-fp.c
@@ -19,6 +19,7 @@
 
 #include "server.h"
 #include "i387-fp.h"
+#include "i386-xstate.h"
 
 int num_xmm_registers = 8;
 
@@ -72,6 +73,46 @@ struct i387_fxsave {
   unsigned char xmm_space[256];
 };
 
+struct i387_xsave {
+  /* All these are only sixteen bits, plus padding, except for fop (which
+     is only eleven bits), and fooff / fioff (which are 32 bits each).  */
+  unsigned short fctrl;
+  unsigned short fstat;
+  unsigned short ftag;
+  unsigned short fop;
+  unsigned int fioff;
+  unsigned short fiseg;
+  unsigned short pad1;
+  unsigned int fooff;
+  unsigned short foseg;
+  unsigned short pad12;
+
+  unsigned int mxcsr;
+  unsigned int mxcsr_mask;
+
+  /* Space for eight 80-bit FP values in 128-bit spaces.  */
+  unsigned char st_space[128];
+
+  /* Space for eight 128-bit XMM values, or 16 on x86-64.  */
+  unsigned char xmm_space[256];
+
+  unsigned char reserved1[48];
+
+  /* The extended control register 0 (the XFEATURE_ENABLED_MASK
+     register).  */
+  unsigned long long xcr0;
+
+  unsigned char reserved2[40];
+
+  /* The XSTATE_BV bit vector.  */
+  unsigned long long xstate_bv;
+
+  unsigned char reserved3[56];
+
+  /* Space for eight upper 128-bit YMM values, or 16 on x86-64.  */
+  unsigned char ymmh_space[256];
+};
+
 void
 i387_cache_to_fsave (struct regcache *regcache, void *buf)
 {
@@ -199,6 +240,128 @@ i387_cache_to_fxsave (struct regcache *regcache, void *buf)
   fp->foseg = val;
 }
 
+void
+i387_cache_to_xsave (struct regcache *regcache, void *buf)
+{
+  struct i387_xsave *fp = (struct i387_xsave *) buf;
+  int i;
+  unsigned long val, val2;
+  unsigned int clear_bv;
+  unsigned long long xstate_bv = 0;
+  char raw[16];
+  char *p;
+
+  /* The supported bits in `xstate_bv' fit in 1 byte.  Clear part in
+     vector registers if its bit in xstate_bv is zero.  */
+  clear_bv = (~fp->xstate_bv) & x86_xcr0;
+
+  /* Clear part in x87 and vector registers if its bit in xstate_bv is
+     zero.  */
+  if (clear_bv)
+    {
+      if ((clear_bv & I386_XSTATE_X87))
+	for (i = 0; i < 8; i++)
+	  memset (((char *) &fp->st_space[0]) + i * 16, 0, 10);
+
+      if ((clear_bv & I386_XSTATE_SSE))
+	for (i = 0; i < num_xmm_registers; i++) 
+	  memset (((char *) &fp->xmm_space[0]) + i * 16, 0, 16);
+
+      if ((clear_bv & I386_XSTATE_AVX))
+	for (i = 0; i < num_xmm_registers; i++) 
+	  memset (((char *) &fp->ymmh_space[0]) + i * 16, 0, 16);
+    }
+
+  /* Check if any x87 registers are changed.  */
+  if ((x86_xcr0 & I386_XSTATE_X87))
+    {
+      int st0_regnum = find_regno ("st0");
+
+      for (i = 0; i < 8; i++)
+	{
+	  collect_register (regcache, i + st0_regnum, raw);
+	  p = ((char *) &fp->st_space[0]) + i * 16;
+	  if (memcmp (raw, p, 10))
+	    {
+	      xstate_bv |= I386_XSTATE_X87;
+	      memcpy (p, raw, 10);
+	    }
+	}
+    }
+
+  /* Check if any SSE registers are changed.  */
+  if ((x86_xcr0 & I386_XSTATE_SSE))
+    {
+      int xmm0_regnum = find_regno ("xmm0");
+
+      for (i = 0; i < num_xmm_registers; i++) 
+	{
+	  collect_register (regcache, i + xmm0_regnum, raw);
+	  p = ((char *) &fp->xmm_space[0]) + i * 16;
+	  if (memcmp (raw, p, 16))
+	    {
+	      xstate_bv |= I386_XSTATE_SSE;
+	      memcpy (p, raw, 16);
+	    }
+	}
+    }
+
+  /* Check if any AVX registers are changed.  */
+  if ((x86_xcr0 & I386_XSTATE_AVX))
+    {
+      int ymm0h_regnum = find_regno ("ymm0h");
+
+      for (i = 0; i < num_xmm_registers; i++) 
+	{
+	  collect_register (regcache, i + ymm0h_regnum, raw);
+	  p = ((char *) &fp->ymmh_space[0]) + i * 16;
+	  if (memcmp (raw, p, 16))
+	    {
+	      xstate_bv |= I386_XSTATE_AVX;
+	      memcpy (p, raw, 16);
+	    }
+	}
+    }
+
+  /* Update the corresponding bits in xstate_bv if any x87, SSE or AVX
+     registers are changed.  */
+  fp->xstate_bv |= xstate_bv;
+
+  collect_register_by_name (regcache, "fioff", &fp->fioff);
+  collect_register_by_name (regcache, "fooff", &fp->fooff);
+  collect_register_by_name (regcache, "mxcsr", &fp->mxcsr);
+
+  /* This one's 11 bits... */
+  collect_register_by_name (regcache, "fop", &val2);
+  fp->fop = (val2 & 0x7FF) | (fp->fop & 0xF800);
+
+  /* Some registers are 16-bit.  */
+  collect_register_by_name (regcache, "fctrl", &val);
+  fp->fctrl = val;
+
+  collect_register_by_name (regcache, "fstat", &val);
+  fp->fstat = val;
+
+  /* Convert to the simplified tag form stored in fxsave data.  */
+  collect_register_by_name (regcache, "ftag", &val);
+  val &= 0xFFFF;
+  val2 = 0;
+  for (i = 7; i >= 0; i--)
+    {
+      int tag = (val >> (i * 2)) & 3;
+
+      if (tag != 3)
+	val2 |= (1 << i);
+    }
+  fp->ftag = val2;
+
+  collect_register_by_name (regcache, "fiseg", &val);
+  fp->fiseg = val;
+
+  collect_register_by_name (regcache, "foseg", &val);
+  fp->foseg = val;
+}
+
 static int
 i387_ftag (struct i387_fxsave *fp, int regno)
 {
@@ -296,3 +459,107 @@ i387_fxsave_to_cache (struct regcache *regcache, const void *buf)
   val = (fp->fop) & 0x7FF;
   supply_register_by_name (regcache, "fop", &val);
 }
+
+void
+i387_xsave_to_cache (struct regcache *regcache, const void *buf)
+{
+  struct i387_xsave *fp = (struct i387_xsave *) buf;
+  struct i387_fxsave *fxp = (struct i387_fxsave *) buf;
+  int i, top;
+  unsigned long val;
+  unsigned int clear_bv;
+  char *p;
+
+  /* The supported bits in `xstate_bv' fit in 1 byte.  Clear part in
+     vector registers if its bit in xstate_bv is zero.  */
+  clear_bv = (~fp->xstate_bv) & x86_xcr0;
+
+  /* Supply the x87 registers.  */
+  if ((x86_xcr0 & I386_XSTATE_X87))
+    {
+      int st0_regnum = find_regno ("st0");
+
+      if ((clear_bv & I386_XSTATE_X87))
+	p = NULL;
+      else
+	p = (char *) buf;
+
+      for (i = 0; i < 8; i++)
+	{
+	  if (p)
+	    p = ((char *) &fp->st_space[0]) + i * 16;
+	  supply_register (regcache, i + st0_regnum, p);
+	}
+    }
+
+  if ((x86_xcr0 & I386_XSTATE_SSE))
+    {
+      int xmm0_regnum = find_regno ("xmm0");
+
+      if ((clear_bv & I386_XSTATE_SSE))
+	p = NULL;
+      else
+	p = (char *) buf;
+
+      for (i = 0; i < num_xmm_registers; i++)
+	{
+	  if (p)
+	    p = ((char *) &fp->xmm_space[0]) + i * 16;
+	  supply_register (regcache, i + xmm0_regnum, p);
+	}
+    }
+
+  if ((x86_xcr0 & I386_XSTATE_AVX))
+    {
+      int ymm0h_regnum = find_regno ("ymm0h");
+
+      if ((clear_bv & I386_XSTATE_AVX))
+	p = NULL;
+      else
+	p = (char *) buf;
+
+      for (i = 0; i < num_xmm_registers; i++)
+	{
+	  if (p)
+	    p = ((char *) &fp->ymmh_space[0]) + i * 16;
+	  supply_register (regcache, i + ymm0h_regnum, p);
+	}
+    }
+
+  supply_register_by_name (regcache, "fioff", &fp->fioff);
+  supply_register_by_name (regcache, "fooff", &fp->fooff);
+  supply_register_by_name (regcache, "mxcsr", &fp->mxcsr);
+
+  /* Some registers are 16-bit.  */
+  val = fp->fctrl & 0xFFFF;
+  supply_register_by_name (regcache, "fctrl", &val);
+
+  val = fp->fstat & 0xFFFF;
+  supply_register_by_name (regcache, "fstat", &val);
+
+  /* Generate the form of ftag data that GDB expects.  */
+  top = (fp->fstat >> 11) & 0x7;
+  val = 0;
+  for (i = 7; i >= 0; i--)
+    {
+      int tag;
+      if (fp->ftag & (1 << i))
+	tag = i387_ftag (fxp, (i + 8 - top) % 8);
+      else
+	tag = 3;
+      val |= tag << (2 * i);
+    }
+  supply_register_by_name (regcache, "ftag", &val);
+
+  val = fp->fiseg & 0xFFFF;
+  supply_register_by_name (regcache, "fiseg", &val);
+
+  val = fp->foseg & 0xFFFF;
+  supply_register_by_name (regcache, "foseg", &val);
+
+  val = (fp->fop) & 0x7FF;
+  supply_register_by_name (regcache, "fop", &val);
+}
+
+/* Default to SSE.  */
+unsigned long long x86_xcr0 = I386_XSTATE_SSE_MASK;
diff --git a/gdb/gdbserver/i387-fp.h b/gdb/gdbserver/i387-fp.h
index d1e0681..ed1a322 100644
--- a/gdb/gdbserver/i387-fp.h
+++ b/gdb/gdbserver/i387-fp.h
@@ -26,6 +26,11 @@ void i387_fsave_to_cache (struct regcache *regcache, const void *buf);
 void i387_cache_to_fxsave (struct regcache *regcache, void *buf);
 void i387_fxsave_to_cache (struct regcache *regcache, const void *buf);
 
+void i387_cache_to_xsave (struct regcache *regcache, void *buf);
+void i387_xsave_to_cache (struct regcache *regcache, const void *buf);
+
+extern unsigned long long x86_xcr0;
+
 extern int num_xmm_registers;
 
 #endif /* I387_FP_H */
diff --git a/gdb/gdbserver/linux-arm-low.c b/gdb/gdbserver/linux-arm-low.c
index 54668f8..32bd7bb 100644
--- a/gdb/gdbserver/linux-arm-low.c
+++ b/gdb/gdbserver/linux-arm-low.c
@@ -354,16 +354,16 @@ arm_arch_setup (void)
 }
 
 struct regset_info target_regsets[] = {
-  { PTRACE_GETREGS, PTRACE_SETREGS, 18 * 4,
+  { PTRACE_GETREGS, PTRACE_SETREGS, 0, 18 * 4,
     GENERAL_REGS,
     arm_fill_gregset, arm_store_gregset },
-  { PTRACE_GETWMMXREGS, PTRACE_SETWMMXREGS, 16 * 8 + 6 * 4,
+  { PTRACE_GETWMMXREGS, PTRACE_SETWMMXREGS, 0, 16 * 8 + 6 * 4,
     EXTENDED_REGS,
     arm_fill_wmmxregset, arm_store_wmmxregset },
-  { PTRACE_GETVFPREGS, PTRACE_SETVFPREGS, 32 * 8 + 4,
+  { PTRACE_GETVFPREGS, PTRACE_SETVFPREGS, 0, 32 * 8 + 4,
     EXTENDED_REGS,
     arm_fill_vfpregset, arm_store_vfpregset },
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 struct linux_target_ops the_low_target = {
diff --git a/gdb/gdbserver/linux-crisv32-low.c b/gdb/gdbserver/linux-crisv32-low.c
index 6ba48b6..d426c32 100644
--- a/gdb/gdbserver/linux-crisv32-low.c
+++ b/gdb/gdbserver/linux-crisv32-low.c
@@ -365,9 +365,9 @@ cris_store_gregset (const void *buf)
 typedef unsigned long elf_gregset_t[cris_num_regs];
 
 struct regset_info target_regsets[] = {
-  { PTRACE_GETREGS, PTRACE_SETREGS, sizeof (elf_gregset_t),
+  { PTRACE_GETREGS, PTRACE_SETREGS, 0, sizeof (elf_gregset_t),
     GENERAL_REGS, cris_fill_gregset, cris_store_gregset },
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 struct linux_target_ops the_low_target = {
diff --git a/gdb/gdbserver/linux-low.c b/gdb/gdbserver/linux-low.c
index 38af9d0..f159244 100644
--- a/gdb/gdbserver/linux-low.c
+++ b/gdb/gdbserver/linux-low.c
@@ -39,6 +39,7 @@
 #include <dirent.h>
 #include <sys/stat.h>
 #include <sys/vfs.h>
+#include <sys/uio.h>
 #ifndef ELFMAG0
 /* Don't include <linux/elf.h> here.  If it got included by gdb_proc_service.h
    then ELFMAG0 will have been defined.  If it didn't get included by
@@ -2977,14 +2978,15 @@ regsets_fetch_inferior_registers (struct regcache *regcache)
   struct regset_info *regset;
   int saw_general_regs = 0;
   int pid;
+  struct iovec iov;
 
   regset = target_regsets;
 
   pid = lwpid_of (get_thread_lwp (current_inferior));
   while (regset->size >= 0)
     {
-      void *buf;
-      int res;
+      void *buf, *data;
+      int nt_type, res;
 
       if (regset->size == 0 || disabled_regsets[regset - target_regsets])
 	{
@@ -2993,10 +2995,21 @@ regsets_fetch_inferior_registers (struct regcache *regcache)
 	}
 
       buf = xmalloc (regset->size);
+
+      nt_type = regset->nt_type;
+      if (nt_type)
+	{
+	  iov.iov_base = buf;
+	  iov.iov_len = regset->size;
+	  data = (void *) &iov;
+	}
+      else
+	data = buf;
+
 #ifndef __sparc__
-      res = ptrace (regset->get_request, pid, 0, buf);
+      res = ptrace (regset->get_request, pid, nt_type, data);
 #else
-      res = ptrace (regset->get_request, pid, buf, 0);
+      res = ptrace (regset->get_request, pid, data, nt_type);
 #endif
       if (res < 0)
 	{
@@ -3034,14 +3047,15 @@ regsets_store_inferior_registers (struct regcache *regcache)
   struct regset_info *regset;
   int saw_general_regs = 0;
   int pid;
+  struct iovec iov;
 
   regset = target_regsets;
 
   pid = lwpid_of (get_thread_lwp (current_inferior));
   while (regset->size >= 0)
     {
-      void *buf;
-      int res;
+      void *buf, *data;
+      int nt_type, res;
 
       if (regset->size == 0 || disabled_regsets[regset - target_regsets])
 	{
@@ -3054,10 +3068,21 @@ regsets_store_inferior_registers (struct regcache *regcache)
       /* First fill the buffer with the current register set contents,
 	 in case there are any items in the kernel's regset that are
 	 not in gdbserver's regcache.  */
+
+      nt_type = regset->nt_type;
+      if (nt_type)
+	{
+	  iov.iov_base = buf;
+	  iov.iov_len = regset->size;
+	  data = (void *) &iov;
+	}
+      else
+	data = buf;
+
 #ifndef __sparc__
-      res = ptrace (regset->get_request, pid, 0, buf);
+      res = ptrace (regset->get_request, pid, nt_type, data);
 #else
-      res = ptrace (regset->get_request, pid, buf, 0);
+      res = ptrace (regset->get_request, pid, data, nt_type);
 #endif
 
       if (res == 0)
@@ -3067,9 +3092,9 @@ regsets_store_inferior_registers (struct regcache *regcache)
 
 	  /* Only now do we write the register set.  */
 #ifndef __sparc__
-	  res = ptrace (regset->set_request, pid, 0, buf);
+	  res = ptrace (regset->set_request, pid, nt_type, data);
 #else
-	  res = ptrace (regset->set_request, pid, buf, 0);
+	  res = ptrace (regset->set_request, pid, data, nt_type);
 #endif
 	}
 
@@ -4133,6 +4158,13 @@ linux_core_of_thread (ptid_t ptid)
   return core;
 }
 
+static void
+linux_process_qsupported (const char *query)
+{
+  if (the_low_target.process_qsupported != NULL)
+    the_low_target.process_qsupported (query);
+}
+
 static struct target_ops linux_target_ops = {
   linux_create_inferior,
   linux_attach,
@@ -4176,7 +4208,8 @@ static struct target_ops linux_target_ops = {
 #else
   NULL,
 #endif
-  linux_core_of_thread
+  linux_core_of_thread,
+  linux_process_qsupported
 };
 
 static void
diff --git a/gdb/gdbserver/linux-low.h b/gdb/gdbserver/linux-low.h
index d7aa418..52623bf 100644
--- a/gdb/gdbserver/linux-low.h
+++ b/gdb/gdbserver/linux-low.h
@@ -35,6 +35,9 @@ enum regset_type {
 struct regset_info
 {
   int get_request, set_request;
+  /* If NT_TYPE isn't 0, it will be passed to ptrace as the 3rd
+     argument and the 4th argument should be "const struct iovec *".  */
+  int nt_type;
   int size;
   enum regset_type type;
   regset_fill_func fill_function;
@@ -111,6 +114,9 @@ struct linux_target_ops
 
   /* Hook to call prior to resuming a thread.  */
   void (*prepare_to_resume) (struct lwp_info *);
+
+  /* Hook to support target specific qSupported.  */
+  void (*process_qsupported) (const char *);
 };
 
 extern struct linux_target_ops the_low_target;
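
The nt_type convention described in the regset_info comment above can be
sketched outside gdbserver as follows. This is only an illustration under
stated assumptions: regset_ptrace_data is a hypothetical helper name that
does not appear in the patch, and the sketch only prepares the argument; it
does not call ptrace.

```c
#include <stddef.h>
#include <sys/uio.h>

/* Hypothetical helper, not part of the patch: choose the ptrace data
   argument for one regset.  When NT_TYPE is nonzero the request is of
   the PTRACE_GETREGSET/PTRACE_SETREGSET kind, so the kernel expects a
   pointer to a struct iovec describing the buffer.  Otherwise the
   buffer itself is passed, as for PTRACE_GETREGS.  */
static void *
regset_ptrace_data (int nt_type, void *buf, size_t size, struct iovec *iov)
{
  if (nt_type != 0)
    {
      iov->iov_base = buf;
      iov->iov_len = size;
      return iov;
    }

  return buf;
}
```

A caller would then pass nt_type as the third ptrace argument and the
returned pointer as the fourth (swapped on sparc, as in the hunks above).
Keeping the iovec in the caller's frame means it stays valid for both the
read and the write-back in regsets_store_inferior_registers.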
diff --git a/gdb/gdbserver/linux-m68k-low.c b/gdb/gdbserver/linux-m68k-low.c
index 14e3864..6c98bb1 100644
--- a/gdb/gdbserver/linux-m68k-low.c
+++ b/gdb/gdbserver/linux-m68k-low.c
@@ -112,14 +112,14 @@ m68k_store_fpregset (struct regcache *regcache, const void *buf)
 
 struct regset_info target_regsets[] = {
 #ifdef HAVE_PTRACE_GETREGS
-  { PTRACE_GETREGS, PTRACE_SETREGS, sizeof (elf_gregset_t),
+  { PTRACE_GETREGS, PTRACE_SETREGS, 0, sizeof (elf_gregset_t),
     GENERAL_REGS,
     m68k_fill_gregset, m68k_store_gregset },
-  { PTRACE_GETFPREGS, PTRACE_SETFPREGS, sizeof (elf_fpregset_t),
+  { PTRACE_GETFPREGS, PTRACE_SETFPREGS, 0, sizeof (elf_fpregset_t),
     FP_REGS,
     m68k_fill_fpregset, m68k_store_fpregset },
 #endif /* HAVE_PTRACE_GETREGS */
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 static const unsigned char m68k_breakpoint[] = { 0x4E, 0x4F };
diff --git a/gdb/gdbserver/linux-mips-low.c b/gdb/gdbserver/linux-mips-low.c
index 70f6700..1c04b2e 100644
--- a/gdb/gdbserver/linux-mips-low.c
+++ b/gdb/gdbserver/linux-mips-low.c
@@ -343,12 +343,12 @@ mips_store_fpregset (struct regcache *regcache, const void *buf)
 
 struct regset_info target_regsets[] = {
 #ifdef HAVE_PTRACE_GETREGS
-  { PTRACE_GETREGS, PTRACE_SETREGS, 38 * 8, GENERAL_REGS,
+  { PTRACE_GETREGS, PTRACE_SETREGS, 0, 38 * 8, GENERAL_REGS,
     mips_fill_gregset, mips_store_gregset },
-  { PTRACE_GETFPREGS, PTRACE_SETFPREGS, 33 * 8, FP_REGS,
+  { PTRACE_GETFPREGS, PTRACE_SETFPREGS, 0, 33 * 8, FP_REGS,
     mips_fill_fpregset, mips_store_fpregset },
 #endif /* HAVE_PTRACE_GETREGS */
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 struct linux_target_ops the_low_target = {
diff --git a/gdb/gdbserver/linux-ppc-low.c b/gdb/gdbserver/linux-ppc-low.c
index 10a1309..000b20f 100644
--- a/gdb/gdbserver/linux-ppc-low.c
+++ b/gdb/gdbserver/linux-ppc-low.c
@@ -593,14 +593,14 @@ struct regset_info target_regsets[] = {
      fetch them every time, but still fall back to PTRACE_PEEKUSER for the
      general registers.  Some kernels support these, but not the newer
      PPC_PTRACE_GETREGS.  */
-  { PTRACE_GETVSXREGS, PTRACE_SETVSXREGS, SIZEOF_VSXREGS, EXTENDED_REGS,
+  { PTRACE_GETVSXREGS, PTRACE_SETVSXREGS, 0, SIZEOF_VSXREGS, EXTENDED_REGS,
   ppc_fill_vsxregset, ppc_store_vsxregset },
-  { PTRACE_GETVRREGS, PTRACE_SETVRREGS, SIZEOF_VRREGS, EXTENDED_REGS,
+  { PTRACE_GETVRREGS, PTRACE_SETVRREGS, 0, SIZEOF_VRREGS, EXTENDED_REGS,
     ppc_fill_vrregset, ppc_store_vrregset },
-  { PTRACE_GETEVRREGS, PTRACE_SETEVRREGS, 32 * 4 + 8 + 4, EXTENDED_REGS,
+  { PTRACE_GETEVRREGS, PTRACE_SETEVRREGS, 0, 32 * 4 + 8 + 4, EXTENDED_REGS,
     ppc_fill_evrregset, ppc_store_evrregset },
-  { 0, 0, 0, GENERAL_REGS, ppc_fill_gregset, NULL },
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, 0, GENERAL_REGS, ppc_fill_gregset, NULL },
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 struct linux_target_ops the_low_target = {
diff --git a/gdb/gdbserver/linux-s390-low.c b/gdb/gdbserver/linux-s390-low.c
index 5460f57..eb865dc 100644
--- a/gdb/gdbserver/linux-s390-low.c
+++ b/gdb/gdbserver/linux-s390-low.c
@@ -181,8 +181,8 @@ static void s390_fill_gregset (struct regcache *regcache, void *buf)
 }
 
 struct regset_info target_regsets[] = {
-  { 0, 0, 0, GENERAL_REGS, s390_fill_gregset, NULL },
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, 0, GENERAL_REGS, s390_fill_gregset, NULL },
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 
diff --git a/gdb/gdbserver/linux-sh-low.c b/gdb/gdbserver/linux-sh-low.c
index 9d27e7f..87a0dd2 100644
--- a/gdb/gdbserver/linux-sh-low.c
+++ b/gdb/gdbserver/linux-sh-low.c
@@ -104,8 +104,8 @@ static void sh_fill_gregset (struct regcache *regcache, void *buf)
 }
 
 struct regset_info target_regsets[] = {
-  { 0, 0, 0, GENERAL_REGS, sh_fill_gregset, NULL },
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, 0, GENERAL_REGS, sh_fill_gregset, NULL },
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 struct linux_target_ops the_low_target = {
diff --git a/gdb/gdbserver/linux-sparc-low.c b/gdb/gdbserver/linux-sparc-low.c
index 0bb5f2f..e0bfe81 100644
--- a/gdb/gdbserver/linux-sparc-low.c
+++ b/gdb/gdbserver/linux-sparc-low.c
@@ -260,13 +260,13 @@ sparc_reinsert_addr (void)
 
 
 struct regset_info target_regsets[] = {
-  { PTRACE_GETREGS, PTRACE_SETREGS, sizeof (elf_gregset_t),
+  { PTRACE_GETREGS, PTRACE_SETREGS, 0, sizeof (elf_gregset_t),
     GENERAL_REGS,
     sparc_fill_gregset, sparc_store_gregset },
-  { PTRACE_GETFPREGS, PTRACE_SETFPREGS, sizeof (fpregset_t),
+  { PTRACE_GETFPREGS, PTRACE_SETFPREGS, 0, sizeof (fpregset_t),
     FP_REGS,
     sparc_fill_fpregset, sparc_store_fpregset },
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 struct linux_target_ops the_low_target = {
diff --git a/gdb/gdbserver/linux-x86-low.c b/gdb/gdbserver/linux-x86-low.c
index 37fe60f..3853b25 100644
--- a/gdb/gdbserver/linux-x86-low.c
+++ b/gdb/gdbserver/linux-x86-low.c
@@ -24,6 +24,8 @@
 #include "linux-low.h"
 #include "i387-fp.h"
 #include "i386-low.h"
+#include "i386-xstate.h"
+#include "elf/common.h"
 
 #include "gdb_proc_service.h"
 
@@ -31,10 +33,35 @@
 void init_registers_i386_linux (void);
 /* Defined in auto-generated file amd64-linux.c.  */
 void init_registers_amd64_linux (void);
+/* Defined in auto-generated file i386-avx-linux.c.  */
+void init_registers_i386_avx_linux (void);
+/* Defined in auto-generated file amd64-avx-linux.c.  */
+void init_registers_amd64_avx_linux (void);
+
+/* Backward compatibility for gdb without XML support.  */
+
+static const char *xmltarget_i386_linux_no_xml = "@<target>\
+<architecture>i386</architecture>\
+<osabi>GNU/Linux</osabi>\
+</target>";
+static const char *xmltarget_amd64_linux_no_xml = "@<target>\
+<architecture>i386:x86-64</architecture>\
+<osabi>GNU/Linux</osabi>\
+</target>";
 
 #include <sys/reg.h>
 #include <sys/procfs.h>
 #include <sys/ptrace.h>
+#include <sys/uio.h>
+
+#ifndef PTRACE_GETREGSET
+#define PTRACE_GETREGSET	0x4204
+#endif
+
+#ifndef PTRACE_SETREGSET
+#define PTRACE_SETREGSET	0x4205
+#endif
+
 
 #ifndef PTRACE_GET_THREAD_AREA
 #define PTRACE_GET_THREAD_AREA 25
@@ -252,6 +279,18 @@ x86_store_fpxregset (struct regcache *regcache, const void *buf)
 
 #endif
 
+static void
+x86_fill_xstateregset (struct regcache *regcache, void *buf)
+{
+  i387_cache_to_xsave (regcache, buf);
+}
+
+static void
+x86_store_xstateregset (struct regcache *regcache, const void *buf)
+{
+  i387_xsave_to_cache (regcache, buf);
+}
+
 /* ??? The non-biarch i386 case stores all the i387 regs twice.
    Once in i387_.*fsave.* and once in i387_.*fxsave.*.
    This is, presumably, to handle the case where PTRACE_[GS]ETFPXREGS
@@ -264,21 +303,23 @@ x86_store_fpxregset (struct regcache *regcache, const void *buf)
 struct regset_info target_regsets[] =
 {
 #ifdef HAVE_PTRACE_GETREGS
-  { PTRACE_GETREGS, PTRACE_SETREGS, sizeof (elf_gregset_t),
+  { PTRACE_GETREGS, PTRACE_SETREGS, 0, sizeof (elf_gregset_t),
     GENERAL_REGS,
     x86_fill_gregset, x86_store_gregset },
+  { PTRACE_GETREGSET, PTRACE_SETREGSET, NT_X86_XSTATE, 0,
+    EXTENDED_REGS, x86_fill_xstateregset, x86_store_xstateregset },
 # ifndef __x86_64__
 #  ifdef HAVE_PTRACE_GETFPXREGS
-  { PTRACE_GETFPXREGS, PTRACE_SETFPXREGS, sizeof (elf_fpxregset_t),
+  { PTRACE_GETFPXREGS, PTRACE_SETFPXREGS, 0, sizeof (elf_fpxregset_t),
     EXTENDED_REGS,
     x86_fill_fpxregset, x86_store_fpxregset },
 #  endif
 # endif
-  { PTRACE_GETFPREGS, PTRACE_SETFPREGS, sizeof (elf_fpregset_t),
+  { PTRACE_GETFPREGS, PTRACE_SETFPREGS, 0, sizeof (elf_fpregset_t),
     FP_REGS,
     x86_fill_fpregset, x86_store_fpregset },
 #endif /* HAVE_PTRACE_GETREGS */
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 static CORE_ADDR
@@ -780,6 +821,128 @@ x86_siginfo_fixup (struct siginfo *native, void *inf, int direction)
   return 0;
 }
 \f
+static int use_xml;
+
+/* Update gdbserver_xmltarget.  */
+
+static void
+x86_linux_update_xmltarget (void)
+{
+  static unsigned long long xcr0;
+  static int have_ptrace_getregset = -1;
+
+  if (!current_inferior)
+    return;
+
+#ifdef __x86_64__
+  if (num_xmm_registers == 8)
+    init_registers_i386_linux ();
+  else
+    init_registers_amd64_linux ();
+#else
+  init_registers_i386_linux ();
+#endif
+
+  if (!use_xml)
+    {
+      /* Don't use XML.  */
+#ifdef __x86_64__
+      if (num_xmm_registers == 8)
+	gdbserver_xmltarget = xmltarget_i386_linux_no_xml;
+      else
+	gdbserver_xmltarget = xmltarget_amd64_linux_no_xml;
+#else
+      gdbserver_xmltarget = xmltarget_i386_linux_no_xml;
+#endif
+
+      x86_xcr0 = I386_XSTATE_SSE_MASK;
+
+      return;
+    }
+
+  /* Check if XSAVE extended state is supported.  */
+  if (have_ptrace_getregset == -1)
+    {
+      int pid = pid_of (get_thread_lwp (current_inferior));
+      unsigned long long xstateregs[I386_XSTATE_SSE_SIZE / sizeof (long long)];
+      struct iovec iov;
+      struct regset_info *regset;
+
+      iov.iov_base = xstateregs;
+      iov.iov_len = sizeof (xstateregs);
+
+      /* Check if PTRACE_GETREGSET works.  */
+      if (ptrace (PTRACE_GETREGSET, pid, (unsigned int) NT_X86_XSTATE,
+		  &iov) < 0)
+	{
+	  have_ptrace_getregset = 0;
+	  return;
+	}
+      else
+	have_ptrace_getregset = 1;
+
+      /* Get XCR0 from XSAVE extended state at byte 464.  */
+      xcr0 = xstateregs[464 / sizeof (long long)];
+
+      /* Use PTRACE_GETREGSET if it is available.  */
+      for (regset = target_regsets;
+	   regset->fill_function != NULL; regset++)
+	if (regset->get_request == PTRACE_GETREGSET)
+	  regset->size = I386_XSTATE_SIZE (xcr0);
+	else if (regset->type != GENERAL_REGS)
+	  regset->size = 0;
+    }
+
+  if (have_ptrace_getregset)
+    {
+      /* AVX is the highest feature we support.  */
+      if ((xcr0 & I386_XSTATE_AVX_MASK) == I386_XSTATE_AVX_MASK)
+	{
+	  x86_xcr0 = xcr0;
+
+#ifdef __x86_64__
+	  /* I386 has 8 xmm regs.  */
+	  if (num_xmm_registers == 8)
+	    init_registers_i386_avx_linux ();
+	  else
+	    init_registers_amd64_avx_linux ();
+#else
+	  init_registers_i386_avx_linux ();
+#endif
+	}
+    }
+}
+
+/* Process qSupported query, "xmlRegisters=".  Update the buffer size for
+   PTRACE_GETREGSET.  */
+
+static void
+x86_linux_process_qsupported (const char *query)
+{
+  /* Return if gdb doesn't support XML.  If gdb sends "xmlRegisters="
+     with "i386" in qSupported query, it supports x86 XML target
+     descriptions.  */
+  use_xml = 0;
+  if (query != NULL && strncmp (query, "xmlRegisters=", 13) == 0)
+    {
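+      const char *p = query + 13;
+      /* Walk the list by hand; strtok here would clobber the state
+	 of the caller's own strtok loop over the packet.  */
+      while (*p != '\0' && !use_xml)
+	{
+	  size_t len = strcspn (p, ",");
+
+	  if (len == 4 && strncmp (p, "i386", 4) == 0)
+	    use_xml = 1;
+	  p += len;
+	  if (*p == ',')
+	    p++;
+	}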
+    }
+
+  x86_linux_update_xmltarget ();
+}
+
 /* Initialize gdbserver for the architecture of the inferior.  */
 
 static void
@@ -800,8 +963,6 @@ x86_arch_setup (void)
     }
   else if (use_64bit)
     {
-      init_registers_amd64_linux ();
-
       /* Amd64 doesn't have HAVE_LINUX_USRREGS.  */
       the_low_target.num_regs = -1;
       the_low_target.regmap = NULL;
@@ -811,14 +972,13 @@ x86_arch_setup (void)
       /* Amd64 has 16 xmm regs.  */
       num_xmm_registers = 16;
 
+      x86_linux_update_xmltarget ();
       return;
     }
 #endif
 
   /* Ok we have a 32-bit inferior.  */
 
-  init_registers_i386_linux ();
-
   the_low_target.num_regs = I386_NUM_REGS;
   the_low_target.regmap = i386_regmap;
   the_low_target.cannot_fetch_register = i386_cannot_fetch_register;
@@ -826,6 +986,8 @@ x86_arch_setup (void)
 
   /* I386 has 8 xmm regs.  */
   num_xmm_registers = 8;
+
+  x86_linux_update_xmltarget ();
 }
 
 /* This is initialized assuming an amd64 target.
@@ -858,5 +1020,6 @@ struct linux_target_ops the_low_target =
   x86_siginfo_fixup,
   x86_linux_new_process,
   x86_linux_new_thread,
-  x86_linux_prepare_to_resume
+  x86_linux_prepare_to_resume,
+  x86_linux_process_qsupported
 };
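
The XCR0 probe in x86_linux_update_xmltarget can be illustrated in
isolation. The sketch below makes several assumptions: the XSTATE_* values
mirror what i386-xstate.h is believed to define (that header is not shown in
this message), and xsave_read_xcr0 and xcr0_has_avx are hypothetical names
used only here. The 464-byte offset is the one stated in the patch.

```c
#include <stdint.h>
#include <string.h>

/* Assumed feature bits, mirroring i386-xstate.h (not shown here).  */
#define XSTATE_X87 (1ULL << 0)
#define XSTATE_SSE (1ULL << 1)
#define XSTATE_AVX (1ULL << 2)
#define XSTATE_SSE_MASK (XSTATE_X87 | XSTATE_SSE)
#define XSTATE_AVX_MASK (XSTATE_SSE_MASK | XSTATE_AVX)

/* Linux stores XCR0 in the first 8 software-usable bytes of the
   XSAVE area, which start at byte offset 464.  */
#define XSAVE_XCR0_OFFSET 464

/* Read XCR0 from a raw XSAVE buffer.  memcpy avoids alignment and
   aliasing problems when the buffer is a plain byte array.  */
static uint64_t
xsave_read_xcr0 (const unsigned char *xsave)
{
  uint64_t xcr0;

  memcpy (&xcr0, xsave + XSAVE_XCR0_OFFSET, sizeof xcr0);
  return xcr0;
}

/* Nonzero if x87, SSE and AVX state are all enabled.  Higher bits
   are ignored, so unknown newer features do not change the answer.  */
static int
xcr0_has_avx (uint64_t xcr0)
{
  return (xcr0 & XSTATE_AVX_MASK) == XSTATE_AVX_MASK;
}
```

Testing against the full mask rather than the AVX bit alone matches the
check in the patch: AVX state is only usable when the x87 and SSE states it
builds on are enabled as well.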
diff --git a/gdb/gdbserver/linux-xtensa-low.c b/gdb/gdbserver/linux-xtensa-low.c
index c5ed351..8d0e73a 100644
--- a/gdb/gdbserver/linux-xtensa-low.c
+++ b/gdb/gdbserver/linux-xtensa-low.c
@@ -131,13 +131,13 @@ xtensa_store_xtregset (struct regcache *regcache, const void *buf)
 }
 
 struct regset_info target_regsets[] = {
-  { PTRACE_GETREGS, PTRACE_SETREGS, sizeof (elf_gregset_t),
+  { PTRACE_GETREGS, PTRACE_SETREGS, 0, sizeof (elf_gregset_t),
     GENERAL_REGS,
     xtensa_fill_gregset, xtensa_store_gregset },
-  { PTRACE_GETXTREGS, PTRACE_SETXTREGS, XTENSA_ELF_XTREG_SIZE,
+  { PTRACE_GETXTREGS, PTRACE_SETXTREGS, 0, XTENSA_ELF_XTREG_SIZE,
     EXTENDED_REGS,
     xtensa_fill_xtregset, xtensa_store_xtregset },
-  { 0, 0, -1, -1, NULL, NULL }
+  { 0, 0, 0, -1, -1, NULL, NULL }
 };
 
 #if XCHAL_HAVE_BE
diff --git a/gdb/gdbserver/server.c b/gdb/gdbserver/server.c
index c6fc005..568640e 100644
--- a/gdb/gdbserver/server.c
+++ b/gdb/gdbserver/server.c
@@ -1289,6 +1289,9 @@ handle_query (char *own_buf, int packet_len, int *new_packet_len_p)
     {
       char *p = &own_buf[10];
 
+      /* Start processing qSupported packet.  */
+      target_process_qsupported (NULL);
+
       /* Process each feature being provided by GDB.  The first
 	 feature will follow a ':', and latter features will follow
 	 ';'.  */
@@ -1304,6 +1307,8 @@ handle_query (char *own_buf, int packet_len, int *new_packet_len_p)
 		if (target_supports_multi_process ())
 		  multi_process = 1;
 	      }
+	    else
+	      target_process_qsupported (p);
 	  }
 
       sprintf (own_buf, "PacketSize=%x;QPassSignals+", PBUFSIZ - 1);
diff --git a/gdb/gdbserver/target.h b/gdb/gdbserver/target.h
index ac68652..6109b1c 100644
--- a/gdb/gdbserver/target.h
+++ b/gdb/gdbserver/target.h
@@ -286,6 +286,9 @@ struct target_ops
 
   /* Returns the core given a thread, or -1 if not known.  */
   int (*core_of_thread) (ptid_t);
+
+  /* Target specific qSupported support.  */
+  void (*process_qsupported) (const char *);
 };
 
 extern struct target_ops *the_target;
@@ -326,6 +329,10 @@ void set_target_ops (struct target_ops *);
   (the_target->supports_multi_process ? \
    (*the_target->supports_multi_process) () : 0)
 
+#define target_process_qsupported(query) \
+  do { if (the_target->process_qsupported) \
+	 the_target->process_qsupported (query); } while (0)
+
 /* Start non-stop mode, returns 0 on success, -1 on failure.   */
 
 int start_non_stop (int nonstop);
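
For reference, the "xmlRegisters=" check that the x86 hook applies to each
qSupported feature can be sketched as a standalone function. The name
xml_registers_lists_arch is hypothetical and not part of the patch. The
sketch scans the comma-separated list with strcspn instead of strtok, on the
assumption that the caller may itself be in the middle of a strtok loop over
the packet.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of the patch: return 1 if FEATURE is
   an "xmlRegisters=" item whose comma-separated value list contains
   ARCH exactly, 0 otherwise.  FEATURE may be NULL.  */
static int
xml_registers_lists_arch (const char *feature, const char *arch)
{
  static const char prefix[] = "xmlRegisters=";
  size_t prefix_len = sizeof prefix - 1;
  size_t arch_len = strlen (arch);
  const char *p;

  if (feature == NULL || strncmp (feature, prefix, prefix_len) != 0)
    return 0;

  for (p = feature + prefix_len; *p != '\0'; )
    {
      /* Length of the current list element, up to ',' or the end.  */
      size_t len = strcspn (p, ",");

      if (len == arch_len && strncmp (p, arch, len) == 0)
        return 1;

      p += len;
      if (*p == ',')
        p++;
    }

  return 0;
}
```

Comparing both the length and the bytes keeps a value such as "i386x" from
matching "i386". Because the function never writes to its input, no copy of
the query has to be allocated or freed.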


* Re: PATCH: 3/6 [3rd try]: Add AVX support (i386 changes)
  2010-04-07 16:55             ` H.J. Lu
@ 2010-04-07 18:34               ` Mark Kettenis
  2010-04-07 18:50                 ` H.J. Lu
  0 siblings, 1 reply; 115+ messages in thread
From: Mark Kettenis @ 2010-04-07 18:34 UTC (permalink / raw)
  To: hjl.tools; +Cc: gdb-patches

> Date: Wed, 7 Apr 2010 09:54:58 -0700
> From: "H.J. Lu" <hongjiu.lu@intel.com>
> 
> On Fri, Apr 02, 2010 at 07:31:07AM -0700, H.J. Lu wrote:
> > On Sun, Mar 28, 2010 at 06:11:24PM -0700, H.J. Lu wrote:
> > > Hi,
> > > 
> > > Here are i386 changes to support AVX. OK to install?
> > > 
> > 
> Here are the updated i386 changes to support AVX. I removed
> i386_linux_update_xstateregset.  OK to install?

Still some nits, but this is taking long enough already.  Let's get 3,
4 and 5 in and I'll fix the remaining issues in the tree.


* Re: PATCH: 3/6 [3rd try]: Add AVX support (i386 changes)
  2010-04-07 18:34               ` Mark Kettenis
@ 2010-04-07 18:50                 ` H.J. Lu
  0 siblings, 0 replies; 115+ messages in thread
From: H.J. Lu @ 2010-04-07 18:50 UTC (permalink / raw)
  To: Mark Kettenis; +Cc: gdb-patches

On Wed, Apr 7, 2010 at 11:34 AM, Mark Kettenis <mark.kettenis@xs4all.nl> wrote:
>> Date: Wed, 7 Apr 2010 09:54:58 -0700
>> From: "H.J. Lu" <hongjiu.lu@intel.com>
>>
>> On Fri, Apr 02, 2010 at 07:31:07AM -0700, H.J. Lu wrote:
>> > On Sun, Mar 28, 2010 at 06:11:24PM -0700, H.J. Lu wrote:
>> > > Hi,
>> > >
>> > > Here are i386 changes to support AVX. OK to install?
>> > >
>> >
>> Here are the updated i386 changes to support AVX. I removed
>> i386_linux_update_xstateregset.  OK to install?
>
> Still some nits, but this is taking long enough already.  Let's get 3,
> 4 and 5 in and I'll fix the remaining issues in the tree.
>

I checked in all AVX patches.

Thanks.

-- 
H.J.


end of thread, other threads:[~2010-04-07 18:50 UTC | newest]

Thread overview: 115+ messages
2010-03-04 18:02 PATCH: 1/6: Add AVX support H.J. Lu
2010-03-04 18:05 ` PATCH: 2/6: Add AVX support (Update document) H.J. Lu
2010-03-04 18:06   ` PATCH: 3/6: Add AVX support (i386 changes) H.J. Lu
2010-03-06 22:21     ` PATCH: 3/6 [2nd try]: " H.J. Lu
2010-03-07 21:32       ` H.J. Lu
2010-03-11 22:37         ` Mark Kettenis
2010-03-12  0:00           ` H.J. Lu
2010-03-27 14:55             ` Mark Kettenis
2010-03-27 15:30               ` Daniel Jacobowitz
2010-03-27 16:05                 ` Mark Kettenis
2010-03-27 15:33               ` H.J. Lu
2010-03-27 16:09                 ` Mark Kettenis
2010-03-28  1:39                   ` H.J. Lu
2010-03-12 16:49       ` H.J. Lu
2010-03-13  1:38         ` H.J. Lu
2010-03-29  1:11         ` PATCH: 3/6 [3rd " H.J. Lu
2010-04-02 14:31           ` H.J. Lu
2010-04-02 14:42             ` Mark Kettenis
2010-04-02 15:28               ` H.J. Lu
2010-04-07 10:13                 ` Mark Kettenis
2010-04-07 14:56                   ` H.J. Lu
2010-04-07 15:04                     ` H.J. Lu
2010-04-07 15:19                       ` Mark Kettenis
2010-04-07 16:55             ` H.J. Lu
2010-04-07 18:34               ` Mark Kettenis
2010-04-07 18:50                 ` H.J. Lu
2010-03-27 15:48       ` PATCH: 3/6 [2nd " Mark Kettenis
2010-03-28  1:37         ` H.J. Lu
2010-03-28 11:55           ` Mark Kettenis
2010-03-28 14:25             ` H.J. Lu
2010-03-29 20:32               ` Mark Kettenis
2010-03-29 21:41                 ` H.J. Lu
2010-03-04 18:08   ` PATCH: 4/6: Add AVX support (amd64 changes) H.J. Lu
2010-03-04 18:09     ` PATCH: 5/6: Add AVX support (i387 changes) H.J. Lu
2010-03-04 18:10       ` PATCH: 6/6: Add AVX support (gdbserver changes) H.J. Lu
2010-03-06 22:23         ` PATCH: 6/6 [2nd try]: " H.J. Lu
2010-03-12 17:25           ` H.J. Lu
2010-03-27 16:07             ` Daniel Jacobowitz
2010-03-28  1:11               ` H.J. Lu
2010-03-28  7:55                 ` Pedro Alves
2010-03-28 14:56                   ` H.J. Lu
2010-03-28 16:17                     ` Pedro Alves
2010-03-28 16:37                       ` H.J. Lu
2010-03-28 16:40                   ` Daniel Jacobowitz
2010-03-28 16:47                     ` Pedro Alves
2010-03-28 20:53                       ` H.J. Lu
2010-03-28 21:27                         ` Pedro Alves
2010-03-28 16:39                 ` Daniel Jacobowitz
2010-03-28 19:31                   ` H.J. Lu
2010-03-29  1:09             ` PATCH: 6/6 [3rd " H.J. Lu
2010-03-29 14:08               ` Eli Zaretskii
2010-03-29 14:42                 ` H.J. Lu
2010-03-29 15:11                   ` Eli Zaretskii
2010-03-29 15:42                     ` H.J. Lu
2010-03-29 15:51                       ` Eli Zaretskii
2010-03-30 16:48               ` H.J. Lu
2010-04-02 17:39                 ` Daniel Jacobowitz
2010-04-07  4:37                   ` H.J. Lu
2010-04-03 21:57                 ` Jan Kratochvil
2010-04-07  4:12                   ` H.J. Lu
2010-04-07 16:59                 ` H.J. Lu
2010-03-05  3:20       ` PATCH: 5/6: Add AVX support (i387 changes) Hui Zhu
2010-03-05  3:54         ` H.J. Lu
2010-03-06 22:22       ` PATCH: 5/6 [2nd try]: " H.J. Lu
2010-03-12 17:24         ` H.J. Lu
2010-04-07 16:57           ` PATCH: 5/6 [3rd " H.J. Lu
2010-03-27 15:08         ` PATCH: 5/6 [2nd " Mark Kettenis
2010-03-27 15:15           ` H.J. Lu
2010-03-06 22:21     ` PATCH: 4/6 [2nd try]: Add AVX support (amd64 changes) H.J. Lu
2010-03-07 21:33       ` H.J. Lu
2010-03-12 17:01         ` H.J. Lu
2010-03-13  1:38           ` H.J. Lu
2010-03-29  1:07           ` PATCH: 4/6 [3rd " H.J. Lu
2010-04-02 14:32             ` H.J. Lu
2010-04-07 16:54               ` H.J. Lu
2010-03-05 10:33   ` PATCH: 2/6: Add AVX support (Update document) Eli Zaretskii
2010-03-05 14:08     ` H.J. Lu
2010-03-06 22:19   ` PATCH: 2/6 [2nd try]: " H.J. Lu
2010-03-12 11:11     ` Eli Zaretskii
2010-03-12 14:17       ` H.J. Lu
2010-03-12 15:28         ` Eli Zaretskii
2010-03-12 15:27     ` Eli Zaretskii
2010-03-12 16:46     ` H.J. Lu
2010-03-12 18:15       ` Eli Zaretskii
2010-03-29  0:18     ` PATCH: 2/6 [3rd " H.J. Lu
2010-03-30 16:41       ` H.J. Lu
2010-03-30 18:27         ` Eli Zaretskii
2010-03-30 18:37           ` H.J. Lu
2010-03-04 19:09 ` PATCH: 1/6: Add AVX support Daniel Jacobowitz
2010-03-04 19:29   ` H.J. Lu
2010-03-04 19:47     ` Daniel Jacobowitz
2010-03-04 21:27       ` H.J. Lu
2010-03-04 21:34         ` Nathan Froyd
2010-03-04 21:41           ` H.J. Lu
2010-03-04 21:59             ` Nathan Froyd
2010-03-04 21:47         ` Daniel Jacobowitz
2010-03-05  2:06           ` H.J. Lu
2010-03-05  7:29             ` Mark Kettenis
2010-03-06 22:16 ` PATCH: 0/6 [2nd try]: " H.J. Lu
2010-03-06 22:18   ` PATCH: 1/6 [2nd try]: Add AVX support (AVX XML files) H.J. Lu
2010-03-07 14:16   ` PATCH: 0/6 [2nd try]: Add AVX support Mark Kettenis
2010-03-07 14:37     ` H.J. Lu
2010-03-07 16:31       ` H.J. Lu
2010-03-07 16:40         ` H.J. Lu
2010-03-07 17:04           ` H.J. Lu
2010-03-07 17:39             ` H.J. Lu
2010-03-07 20:00               ` Mark Kettenis
2010-03-07 19:10           ` Nathan Froyd
2010-03-07 19:49             ` Mark Kettenis
2010-03-07 21:07               ` Nathan Froyd
2010-03-07 21:17                 ` H.J. Lu
2010-03-07 20:29           ` Mark Kettenis
2010-03-07 21:04             ` H.J. Lu
2010-03-27 16:16   ` Daniel Jacobowitz
2010-03-29  0:16   ` PATCH: 0/6 [3nd " H.J. Lu
