From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from lndn.lancelotsix.com (lndn.lancelotsix.com [51.195.220.111]) by sourceware.org (Postfix) with ESMTPS id 0AF16385781F; Mon, 16 Oct 2023 09:17:39 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 0AF16385781F Authentication-Results: sourceware.org; dmarc=pass (p=reject dis=none) header.from=lancelotsix.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=lancelotsix.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 0AF16385781F Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=51.195.220.111 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1697447861; cv=none; b=k4k2wQ4tCn4krEui7tpOxGz+LjBDirXqwU8+YuO6SFu8i93HtFRmoKWvo364n+vv7enNn1L46c00bYcoFQqB0Dj6bJ8feM89N6kHM09VFcrIFR3aSe+Sra7URsChJOW3H3k8gtJ39w48Fw8urPE/i4WZL7W6QgrwsqXWGkDkQVM= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1697447861; c=relaxed/simple; bh=K94N1D13GMhsgmRfNXIRN9er/i71jZR6RRM65Oy1eOY=; h=DKIM-Signature:Date:From:To:Subject:Message-ID:MIME-Version; b=vQP94XB+7H0rys80lWYYk2xSUYTe7z3Tg5pBlXU1THrLfTiV7EE/wV4qN1qIfbr6dqp47syqKoVuwNA2AqMTSXar+KJmf8Krbht0dY/Nqh7GHv5YVUx93Mh2xMEuA++ddp1WznHPajxPwZTD0ALjn0bCDsvK4Jfb6h8JRENbQK0= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from octopus (cust120-dsl54.idnet.net [212.69.54.120]) by lndn.lancelotsix.com (Postfix) with ESMTPSA id A87F384CB8; Mon, 16 Oct 2023 09:17:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=lancelotsix.com; s=2021; t=1697447858; bh=K94N1D13GMhsgmRfNXIRN9er/i71jZR6RRM65Oy1eOY=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=tVj7hGGykv73FiAaREtyMUsn9IzXklkFOfElqAGxoDyelTnb4U/Wci6ha++CZZLlK kpT7/Rfg9RI02Cf8ABUZKA00Ourb58VyYKrBeynJqxn47BohLSCvnTpP4V2EXJz+Vt /bc/FZWJkUMWLnhm4DfctaAlP/7K5vyV5jJSMEnoB/qckfs1b+z3usAq8qt7l86tbn tDM6+x7Lninn1RYavX7aackAA3LdB7j8ErJNObuTOqGdYqUrWhX2F6GXY2W2Zgkztn +nrBIOANwH4fhbLsE5SWnRkp4jwhXkevbMyM1pUbbS26LDerLDI+b4gLG24XIpGzSD C44PADkE5FySw== Date: Mon, 16 Oct 2023 10:17:32 +0100 From: Lancelot SIX To: John Baldwin Cc: gdb-patches@sourceware.org, Willgerodt@sourceware.org, Felix , George@sourceware.org, Jini Susan , Simon Marchi Subject: Re: [RFC 02/13] i387-tdep: Add function to read XSAVE layout from NT_X86_CPUID Message-ID: <20231016091732.ybyjym67r2l7e3ol@octopus> References: <20231009183617.24862-1-jhb@FreeBSD.org> <20231009183617.24862-3-jhb@FreeBSD.org> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <20231009183617.24862-3-jhb@FreeBSD.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.6.2 (lndn.lancelotsix.com [0.0.0.0]); Mon, 16 Oct 2023 09:17:38 +0000 (UTC) X-Spam-Status: No, score=-12.0 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,GIT_PATCH_0,KAM_SHORT,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: Hi, I am not familiar with XSAVE details, but I have pure c++ style comments below. On Mon, Oct 09, 2023 at 11:36:04AM -0700, John Baldwin wrote: > This can be used by x86 arches to determine the XSAVE layout instead > of guessing based on the XCR0 mask and XSAVE register note size. > --- > gdb/i387-tdep.c | 132 ++++++++++++++++++++++++++++++++++++++++++++++++ > gdb/i387-tdep.h | 8 +++ > 2 files changed, 140 insertions(+) > > diff --git a/gdb/i387-tdep.c b/gdb/i387-tdep.c > index 47667da21c7..1eac2b6bd2a 100644 > --- a/gdb/i387-tdep.c > +++ b/gdb/i387-tdep.c > @@ -26,6 +26,8 @@ > #include "target-float.h" > #include "value.h" > > +#include > + > #include "i386-tdep.h" > #include "i387-tdep.h" > #include "gdbsupport/x86-xstate.h" > @@ -987,6 +989,136 @@ i387_guess_xsave_layout (uint64_t xcr0, size_t xsave_size, > return true; > } > > +/* Parse a reg-x86-cpuid pseudo section building a hash table mapping > + cpuid leaves to their results. */ > + > +struct cpuid_key > +{ > + cpuid_key (uint32_t _leaf, uint32_t _subleaf) > + : leaf(_leaf), subleaf(_subleaf) > + {} > + > + uint32_t leaf; > + uint32_t subleaf; > + > + constexpr bool operator== (const cpuid_key &other) const > + { return (leaf == other.leaf && subleaf == other.subleaf); } > +}; > + > +namespace std > +{ > +template<> > +struct hash > +{ > + size_t operator() (const cpuid_key &key) const > + { > + return key.leaf ^ (key.subleaf << 1); > + } > +}; > +} I think there was a discussion not long ago regarding opening std, and it seems that the prefered approach is to use: template<> struct std::hash { ... }; See https://sourceware.org/pipermail/gdb-patches/2023-September/202336.html for the discussion. > + > +struct cpuid_values > +{ > + cpuid_values (uint32_t _eax, uint32_t _ebx, uint32_t _ecx, uint32_t _edx) > + : eax(_eax), ebx(_ebx), ecx(_ecx), edx(_edx) > + {} > + > + uint32_t eax; > + uint32_t ebx; > + uint32_t ecx; > + uint32_t edx; > +}; > + > +typedef std::unordered_map cpuid_map; > + > +static cpuid_map > +i387_parse_cpuid_from_core (bfd *bfd) > +{ > + asection *section = bfd_get_section_by_name (bfd, ".reg-x86-cpuid"); > + if (section == nullptr) > + return {}; > + > + size_t size = bfd_section_size (section); > + if (size == 0 || (size % (6 * 4)) != 0) > + return {}; > + > + char contents[size]; If I remember correctly, VLAs are not a C++ feature (but are supported as a GCC extension https://gcc.gnu.org/onlinedocs/gcc/Variable-Length.html). I am unsure if GDB has a policy regarding the use of extensions, so maybe this is fine. Otherwise, you could use a std::vector instead (it comes with a dynamic allocation, but I am not too concerned at this is hardly on a performance critical path) std::vector contents (size); > + if (!bfd_get_section_contents (bfd, section, contents, 0, size)) > + { > + warning (_("Couldn't read `.reg-x86-cpuid' section in core file.")); > + return {}; > + } > + > + cpuid_map map; > + size_t index = 0; > + while (index < size) > + { > + uint32_t leaf = bfd_get_32 (bfd, contents + index); > + uint32_t count = bfd_get_32 (bfd, contents + index + 4); > + uint32_t eax = bfd_get_32 (bfd, contents + index + 8); > + uint32_t ebx = bfd_get_32 (bfd, contents + index + 12); > + uint32_t ecx = bfd_get_32 (bfd, contents + index + 16); > + uint32_t edx = bfd_get_32 (bfd, contents + index + 20); > + > + if (map.count (cpuid_key (leaf, count)) != 0) > + { > + warning (_("Duplicate cpuid leaf %#x,%#x"), leaf, count); > + return {}; > + } > + map.emplace (cpuid_key (leaf, count), > + cpuid_values (eax, ebx, ecx, edx)); As Simon pointed out, there are two lookups here, where you can get away with just one. However, this is C++17 only which is not [yet] available in GDB. Instead, you can use the value returned by emplace to know if an insertation has been done or not: auto emplace_result = map.emplace (cpuid_key (leaf, count), cpuid_values (eax, ebx, ecx, edx)); if (!emplace_result.second) { warning (_("Duplicate cpuid leaf %#x,%#x"), leaf, count); return {}; } > + > + index += 6 * 4; > + } > + > + return map; > +} > + > +/* Fetch the offset of a specific XSAVE extended region. */ > + > +static int I think it is worth returning uint32_t here as int is (in theory) target dependent. > +xsave_feature_offset (cpuid_map &map, uint64_t xcr0, int feature) I think that the MAP parameter could be `const` here. > +{ > + if ((xcr0 & (1ULL << feature)) == 0) > + return 0; > + > + return map.at (cpuid_key (0xd, feature)).ebx; > +} > + > +/* See i387-tdep.h. */ > + > +bool > +i387_read_xsave_layout_from_core (bfd *bfd, uint64_t xcr0, size_t xsave_size, > + x86_xsave_layout &layout) > +{ > + cpuid_map map = i387_parse_cpuid_from_core (bfd); > + if (map.empty ()) > + return false; > + > + try > + { > + layout.sizeof_xsave = xsave_size; > + layout.avx_offset = xsave_feature_offset (map, xcr0, > + X86_XSTATE_AVX_ID); > + layout.bndregs_offset = xsave_feature_offset (map, xcr0, > + X86_XSTATE_BNDREGS_ID); > + layout.bndcfg_offset = xsave_feature_offset (map, xcr0, > + X86_XSTATE_BNDCFG_ID); > + layout.k_offset = xsave_feature_offset (map, xcr0, > + X86_XSTATE_K_ID); > + layout.zmm_h_offset = xsave_feature_offset (map, xcr0, > + X86_XSTATE_ZMM_H_ID); > + layout.zmm_offset = xsave_feature_offset (map, xcr0, X86_XSTATE_ZMM_ID); > + layout.pkru_offset = xsave_feature_offset (map, xcr0, X86_XSTATE_PKRU_ID); > + } > + catch (const std::out_of_range &) > + { > + return false; > + } > + > + return true; > +} > + > /* Extract from XSAVE a bitset of the features that are available on the > target, but which have not yet been enabled. */ > > diff --git a/gdb/i387-tdep.h b/gdb/i387-tdep.h > index e149e30e52e..b16b9a60b67 100644 > --- a/gdb/i387-tdep.h > +++ b/gdb/i387-tdep.h > @@ -147,6 +147,14 @@ extern void i387_supply_fxsave (struct regcache *regcache, int regnum, > extern bool i387_guess_xsave_layout (uint64_t xcr0, size_t xsave_size, > x86_xsave_layout &layout); > > +/* Determine the XSAVE layout from the `reg-x86-cpuid` section in a > + core dump. Returns true on sucess, or false if a layout can not be s/sucess/success/ > + read. */ > + > +extern bool i387_read_xsave_layout_from_core (bfd *bfd, uint64_t xcr0, > + size_t xsave_size, > + x86_xsave_layout &layout); > + > /* Similar to i387_supply_fxsave, but use XSAVE extended state. */ > > extern void i387_supply_xsave (struct regcache *regcache, int regnum, > -- > 2.41.0 > Best, Lancelot.