public inbox for gdb-patches@sourceware.org
* XSAVE handling does not decode registers on recent AMD Ryzen processors
@ 2022-03-15 17:53 John Baldwin
From: John Baldwin @ 2022-03-15 17:53 UTC (permalink / raw)
  To: GDB Patches

I recently upgraded my desktop to a system using an AMD Ryzen 9 5900X
(much nicer toy than my previous desktop) and one of the things I ran
into is that gdb is not able to handle the NT_X86_XSAVE core dump note:

Reading symbols from thread...
Reading symbols from /usr/home/john/work/johnsvn/test/thread/amd64/thread.debug...
[New LWP 126075]
[New LWP 296878]
[New LWP 296879]
[New LWP 296880]
[New LWP 296881]

warning: Section `.reg-xstate/126075' in core file too small.

The XSAVE instruction on x86 saves various banks of registers (SSE, AVX,
etc.) in a chunk of memory.  The layout of this chunk is _not_ architectural.
Instead, it is described by a set of cpuid leaves that give the starting
offset and size of each bank of registers, and a parser should use that
information rather than assume fixed offsets.
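To illustrate deriving the layout from the leaves, here is a minimal sketch in
C.  It assumes the per-component values have already been collected from
CPUID leaf 0xD (for sub-leaf i >= 2, EAX is the component size and EBX is its
offset in the standard, non-compacted format).  The struct and function names
are hypothetical and are not part of GDB.

```c
#include <stddef.h>
#include <stdint.h>

/* One CPUID.(EAX=0xD, ECX=i) result for state component i >= 2.
   Hypothetical type, used only for this sketch.  */
struct xsave_component
{
  uint32_t size;    /* CPUID EAX: size of the component in bytes.  */
  uint32_t offset;  /* CPUID EBX: offset from the start of the block.  */
};

/* Return the size of the XSAVE block implied by XCR0 and the
   per-component leaves, i.e. the end of the highest enabled component.
   COMPONENTS is indexed by XCR0 bit number.  Entries 0 and 1 (x87 and
   SSE) are ignored because those banks live in the 512-byte legacy
   region, which is followed by the 64-byte XSAVE header.  */
static uint32_t
xsave_size_from_leaves (uint64_t xcr0,
                        const struct xsave_component *components,
                        size_t count)
{
  uint32_t end = 512 + 64;

  for (size_t i = 2; i < count && i < 64; i++)
    if ((xcr0 >> i) & 1)
      {
        uint32_t component_end = components[i].offset + components[i].size;

        if (component_end > end)
          end = component_end;
      }
  return end;
}
```

With AVX at offset 576 (256 bytes) and PKRU at offset 2688 (8 bytes), an XCR0
of 0x207 yields 2696, which matches the fixed layout GDB assumes today.  A CPU
that reports a different PKRU offset yields a different total.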

Right now GDB assumes a fixed layout that matches the layout used on Intel
processors to date, e.g. in gdbsupport/x86-xstate.h:

/* Supported mask and size of the extended state.  */
#define X86_XSTATE_X87_MASK	X86_XSTATE_X87
#define X86_XSTATE_SSE_MASK	(X86_XSTATE_X87 | X86_XSTATE_SSE)
#define X86_XSTATE_AVX_MASK	(X86_XSTATE_SSE_MASK | X86_XSTATE_AVX)
#define X86_XSTATE_MPX_MASK	(X86_XSTATE_SSE_MASK | X86_XSTATE_MPX)
#define X86_XSTATE_AVX_MPX_MASK	(X86_XSTATE_AVX_MASK | X86_XSTATE_MPX)
#define X86_XSTATE_AVX_AVX512_MASK	(X86_XSTATE_AVX_MASK | X86_XSTATE_AVX512)
#define X86_XSTATE_AVX_MPX_AVX512_PKU_MASK 	(X86_XSTATE_AVX_MPX_MASK\
					| X86_XSTATE_AVX512 | X86_XSTATE_PKRU)

#define X86_XSTATE_ALL_MASK		(X86_XSTATE_AVX_MPX_AVX512_PKU_MASK)


#define X86_XSTATE_SSE_SIZE	576
#define X86_XSTATE_AVX_SIZE	832
#define X86_XSTATE_BNDREGS_SIZE	1024
#define X86_XSTATE_BNDCFG_SIZE	1088
#define X86_XSTATE_AVX512_SIZE	2688
#define X86_XSTATE_PKRU_SIZE	2696
#define X86_XSTATE_MAX_SIZE	2696


/* In case one of the MPX XCR0 bits is set we consider we have MPX.  */
#define HAS_MPX(XCR0) (((XCR0) & X86_XSTATE_MPX) != 0)
#define HAS_AVX(XCR0) (((XCR0) & X86_XSTATE_AVX) != 0)
#define HAS_AVX512(XCR0) (((XCR0) & X86_XSTATE_AVX512) != 0)
#define HAS_PKRU(XCR0) (((XCR0) & X86_XSTATE_PKRU) != 0)

/* Get I386 XSAVE extended state size.  */
#define X86_XSTATE_SIZE(XCR0) \
     (HAS_PKRU (XCR0) ? X86_XSTATE_PKRU_SIZE : \
      (HAS_AVX512 (XCR0) ? X86_XSTATE_AVX512_SIZE : \
       (HAS_MPX (XCR0) ? X86_XSTATE_BNDCFG_SIZE : \
        (HAS_AVX (XCR0) ? X86_XSTATE_AVX_SIZE : X86_XSTATE_SSE_SIZE))))

and also the arrays in i387-tdep.c, e.g.:

static int xsave_mpx_offset[] = {
   960 + 0 * 16,			/* bnd0r...bnd3r registers.  */
   960 + 1 * 16,
   960 + 2 * 16,
   960 + 3 * 16,
   1024 + 0 * 8,			/* bndcfg ... bndstatus.  */
   1024 + 1 * 8,
};

(For example, this hard-codes the bnd registers at offset 960 and the
bndcfg/bndstatus registers at offset 1024.)

On my Ryzen CPU I have an XCR0 mask of 0x207, which is the bits x87, SSE, AVX,
and PKRU.  Thus, X86_XSTATE_SIZE thinks the block should be 2696 bytes
(X86_XSTATE_PKRU_SIZE).  However, the total size of the XSAVE block reported
by cpuid for this CPU is 2440 bytes.  I haven't yet examined the leaves, but my
guess is that AMD didn't reserve the 256 bytes for MPX in its XSAVE layout
(since MPX is now dead) and places AVX512 directly after AVX.
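The mismatch can be reproduced with a small self-contained sketch of the
X86_XSTATE_SIZE logic.  The bit positions below are not quoted in this message;
they are assumed from the architectural XCR0 bit assignments, and the function
name is hypothetical.

```c
#include <stdint.h>

/* XCR0 state-component bits.  Assumed from the architectural bit
   positions; redefined here so the sketch is self-contained.  */
#define X86_XSTATE_X87      (1ULL << 0)
#define X86_XSTATE_SSE      (1ULL << 1)
#define X86_XSTATE_AVX      (1ULL << 2)
#define X86_XSTATE_BNDREGS  (1ULL << 3)
#define X86_XSTATE_BNDCFG   (1ULL << 4)
#define X86_XSTATE_MPX      (X86_XSTATE_BNDREGS | X86_XSTATE_BNDCFG)
#define X86_XSTATE_K        (1ULL << 5)
#define X86_XSTATE_ZMM_H    (1ULL << 6)
#define X86_XSTATE_ZMM      (1ULL << 7)
#define X86_XSTATE_AVX512   (X86_XSTATE_K | X86_XSTATE_ZMM_H | X86_XSTATE_ZMM)
#define X86_XSTATE_PKRU     (1ULL << 9)

/* Same selection order as the X86_XSTATE_SIZE macro quoted above,
   written as a function: the highest feature present decides the
   size, regardless of which lower features are absent.  */
static unsigned int
fixed_layout_size (uint64_t xcr0)
{
  if (xcr0 & X86_XSTATE_PKRU)
    return 2696;  /* X86_XSTATE_PKRU_SIZE */
  if (xcr0 & X86_XSTATE_AVX512)
    return 2688;  /* X86_XSTATE_AVX512_SIZE */
  if (xcr0 & X86_XSTATE_MPX)
    return 1088;  /* X86_XSTATE_BNDCFG_SIZE */
  if (xcr0 & X86_XSTATE_AVX)
    return 832;   /* X86_XSTATE_AVX_SIZE */
  return 576;     /* X86_XSTATE_SSE_SIZE */
}
```

For XCR0 = 0x207 the PKRU bit is set, so fixed_layout_size returns 2696, which
is 256 bytes more than the 2440 bytes cpuid reports on this CPU.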

My question is how do we want to deal with this?  I can see a couple of options:

1) We could allow for different "known layouts" in GDB and use the (xcr0,
    state size) values to select a particular layout.

2) We could teach kernels to export a new core dump note that is the list of
    cpuid leaves describing the offset and size of individual register banks.

My preference for the long term is to do 2) (with a new NT_X86_XSAVE_LAYOUT or
some such that would be a variable length array of what cpuid returns, maybe
with the first entry being the total size and XCR0).  However, we probably
need to permit 1) as a fallback in the short term.
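As a rough sketch of option 1), a table of known layouts could be keyed on the
(xcr0, state size) pair.  Everything here is hypothetical: the type, the
function, and in particular the PKRU offset of the 2440-byte entry, which is
only assumed to be the last 8 bytes of the block and has not been checked
against the cpuid leaves.

```c
#include <stddef.h>
#include <stdint.h>

#define XCR0_AVX   (1ULL << 2)
#define XCR0_PKRU  (1ULL << 9)

/* Hypothetical description of one known XSAVE layout.  */
struct known_xsave_layout
{
  uint64_t required_xcr0;  /* Bits that must be set in XCR0.  */
  uint32_t size;           /* Total size of the XSAVE block.  */
  uint32_t avx_offset;     /* Start of the AVX bank.  */
  uint32_t pkru_offset;    /* Start of the PKRU bank, 0 if absent.  */
};

static const struct known_xsave_layout known_layouts[] = {
  /* The fixed layout GDB assumes today.  */
  { XCR0_AVX | XCR0_PKRU, 2696, 576, 2688 },
  /* Size reported on the Ryzen 9 5900X.  The PKRU offset is an
     assumption (size - 8), not a verified value.  */
  { XCR0_AVX | XCR0_PKRU, 2440, 576, 2432 },
  /* AVX only.  */
  { XCR0_AVX, 832, 576, 0 },
};

/* Return the known layout matching XCR0 and SIZE, or NULL if the
   combination is not recognized.  */
static const struct known_xsave_layout *
select_known_layout (uint64_t xcr0, uint32_t size)
{
  size_t count = sizeof known_layouts / sizeof known_layouts[0];

  for (size_t i = 0; i < count; i++)
    {
      const struct known_xsave_layout *layout = &known_layouts[i];

      if (layout->size == size
          && (xcr0 & layout->required_xcr0) == layout->required_xcr0)
        return layout;
    }
  return NULL;
}
```

Returning NULL for an unknown combination lets the caller fall back to
reporting only the legacy x87/SSE state instead of decoding registers from the
wrong offsets.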

I don't mind hacking on this, but would like to make sure I'm not wandering
off in the wrong direction before I start.  Probably we would need to extend
the x86 gdbarch_tdep classes to include the offsets of the various XSAVE
register banks and change the tables in i387-tdep.c to store relative
offsets instead of absolute offsets (and split the MPX table into separate
bndregs and bndcfg tables).  Eventually the new core dump note / register set
could be used to set the offsets in the tdep, but for now we could use the
value of xcr0 and the total save size to select between "known" formats, possibly
by passing these into the relevant "read_description" functions.  The existing
uses of X86_XSTATE_SIZE would then need to be replaced by reading the size
from a field in the tdep structure.
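A minimal sketch of the relative-offset idea, using the MPX table as the
example.  The struct stands in for fields that would live in the x86
gdbarch_tdep; its name, its fields, and the helper functions are hypothetical.

```c
#include <stdint.h>

/* Hypothetical per-architecture XSAVE layout fields.  */
struct xsave_offsets
{
  uint32_t size;            /* Replaces uses of X86_XSTATE_SIZE.  */
  uint32_t bndregs_offset;  /* Start of the bnd0r...bnd3r bank.  */
  uint32_t bndcfg_offset;   /* Start of the bndcfg/bndstatus bank.  */
};

/* The MPX table split in two, each relative to the start of its bank.  */
static const uint32_t xsave_bndregs_rel[] = {
  0 * 16,			/* bnd0r...bnd3r registers.  */
  1 * 16,
  2 * 16,
  3 * 16,
};

static const uint32_t xsave_bndcfg_rel[] = {
  0 * 8,			/* bndcfg ... bndstatus.  */
  1 * 8,
};

/* Absolute offset of bound register REGNUM (0-3) in the XSAVE block.  */
static uint32_t
xsave_bndreg_offset (const struct xsave_offsets *offsets, unsigned int regnum)
{
  return offsets->bndregs_offset + xsave_bndregs_rel[regnum];
}

/* Absolute offset of bndcfg (REGNUM 0) or bndstatus (REGNUM 1).  */
static uint32_t
xsave_bndcfg_reg_offset (const struct xsave_offsets *offsets,
                         unsigned int regnum)
{
  return offsets->bndcfg_offset + xsave_bndcfg_rel[regnum];
}
```

Filling the struct with the current fixed values (size 2696, bndregs at 960,
bndcfg at 1024) reproduces the absolute offsets in today's xsave_mpx_offset
table, so existing behavior would be unchanged for that layout.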

(Note, I have been testing this on FreeBSD, not on Linux.  I'm not sure if
Linux tries to rewrite the note to use the Intel layout on AMD processors, or
if it is subject to the same issue.  I do think that describing the layout via
a new note is a better idea than trying to pick a "standard" format, though.
Eventually some CPU vendor might add an extension not supported by other
vendors, with the result that there can't be "one true format".)

-- 
John Baldwin
