From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 63933 invoked by alias); 24 Sep 2018 17:19:52 -0000 Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: gdb-patches-owner@sourceware.org Received: (qmail 63920 invoked by uid 89); 24 Sep 2018 17:19:51 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-10.4 required=5.0 tests=AWL,BAYES_00,GIT_PATCH_2,GIT_PATCH_3,KAM_LAZY_DOMAIN_SECURITY autolearn=ham version=3.3.2 spammy=Christian, hid, Directors, Managing X-HELO: mga14.intel.com Received: from mga14.intel.com (HELO mga14.intel.com) (192.55.52.115) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Mon, 24 Sep 2018 17:19:50 +0000 Received: from orsmga002.jf.intel.com ([10.7.209.21]) by fmsmga103.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 24 Sep 2018 10:19:48 -0700 Received: from irsmsx152.ger.corp.intel.com ([163.33.192.66]) by orsmga002.jf.intel.com with ESMTP; 24 Sep 2018 10:19:47 -0700 Received: from irsmsx104.ger.corp.intel.com ([169.254.5.213]) by IRSMSX152.ger.corp.intel.com ([169.254.6.200]) with mapi id 14.03.0319.002; Mon, 24 Sep 2018 18:19:46 +0100 From: "Metzger, Markus T" To: Jan Beulich , GDB CC: "Wiederhake, Tim" , Simon Marchi Subject: RE: [PATCH] x86-64: fix ZMM register state tracking Date: Mon, 24 Sep 2018 17:19:00 -0000 Message-ID: References: <5B8FD8B302000078001E5940@prv1-mh.provo.novell.com> In-Reply-To: <5B8FD8B302000078001E5940@prv1-mh.provo.novell.com> Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-IsSubscribed: yes X-SW-Source: 2018-09/txt/msg00812.txt.bz2 Hello Jan, > The three AVX512 state components are entirely independent - one being in= its "init > state" has no implication whatsoever on either of the other two. Fully se= parate > X86_XSTATE_ZMM_H and X86_XSTATE_ZMM handling, to prevent upper halves of > the upper 16 ZMM registers to display as if they were zero (when they are= n't) after > e.g. VZEROALL/VZEROUPPER. >=20 > gdb/ > 2018-09-05 Jan Beulich >=20 > * i387-tdep.c (i387_supply_xsave): Split handling of > X86_XSTATE_ZMM_H and X86_XSTATE_ZMM. > (i387_collect_xsave): Likewise. >=20 > --- a/gdb/i387-tdep.c > +++ b/gdb/i387-tdep.c > @@ -923,7 +923,8 @@ i387_supply_xsave (struct regcache *regc > enum bfd_endian byte_order =3D gdbarch_byte_order (gdbarch); > struct gdbarch_tdep *tdep =3D gdbarch_tdep (gdbarch); > const gdb_byte *regs =3D (const gdb_byte *) xsave; > - int i; > + int i, zmm_endlo_regnum =3D I387_ZMM0H_REGNUM (tdep) > + + std::min (tdep->num_zmm_regs, 16); It would be nice to comment on this magic 16 and the min operation. It's how XSAVE organizes things but it isn't entirely intuitive. > ULONGEST clear_bv; > static const gdb_byte zero[I386_MAX_REGISTER_SIZE] =3D { 0 }; > enum > @@ -1002,7 +1003,8 @@ i387_supply_xsave (struct regcache *regc > return; >=20 > case avx512_zmm_h: > - if ((clear_bv & (X86_XSTATE_ZMM_H | X86_XSTATE_ZMM))) > + if ((clear_bv & (regnum < zmm_endlo_regnum ? X86_XSTATE_ZMM_H > + : X86_XSTATE_ZMM))) A comment that XSAVE stores the lower 16 registers in a different place than the higher 16 registers and also guards them by different XCR0 bits would be nice. We hid the different places behind those XSAVE_AVX512_ZMM_H_ADDR macros but there's nothing similar for the guard bits. Maybe add macros for the guard check, as well? > regcache->raw_supply (regnum, zero); > else > regcache->raw_supply (regnum, > @@ -1080,21 +1082,17 @@ i387_supply_xsave (struct regcache *regc > } > } >=20 > - /* Handle the upper ZMM registers. */ > - if ((tdep->xcr0 & (X86_XSTATE_ZMM_H | X86_XSTATE_ZMM))) > + /* Handle the upper halves of the low 8/16 ZMM registers. */ > + if ((tdep->xcr0 & X86_XSTATE_ZMM_H)) > { > - if ((clear_bv & (X86_XSTATE_ZMM_H | X86_XSTATE_ZMM))) > + if ((clear_bv & X86_XSTATE_ZMM_H)) > { > - for (i =3D I387_ZMM0H_REGNUM (tdep); > - i < I387_ZMMENDH_REGNUM (tdep); > - i++) > + for (i =3D I387_ZMM0H_REGNUM (tdep); i < zmm_endlo_regnum; i++) > regcache->raw_supply (i, zero); > } > else > { > - for (i =3D I387_ZMM0H_REGNUM (tdep); > - i < I387_ZMMENDH_REGNUM (tdep); > - i++) > + for (i =3D I387_ZMM0H_REGNUM (tdep); i < zmm_endlo_regnum; i++) > regcache->raw_supply (i, > XSAVE_AVX512_ZMM_H_ADDR (tdep, regs, i)); > } > @@ -1119,11 +1117,13 @@ i387_supply_xsave (struct regcache *regc > } > } >=20 > - /* Handle the YMM_AVX512 registers. */ > + /* Handle the upper 16 ZMM/YMM/XMM registers (if any). */ > if ((tdep->xcr0 & X86_XSTATE_ZMM)) > { > if ((clear_bv & X86_XSTATE_ZMM)) > { > + for (i =3D zmm_endlo_regnum; i < I387_ZMMENDH_REGNUM (tdep); i++) > + regcache->raw_supply (i, zero); > for (i =3D I387_YMM16H_REGNUM (tdep); > i < I387_YMMH_AVX512_END_REGNUM (tdep); > i++) > @@ -1135,6 +1135,9 @@ i387_supply_xsave (struct regcache *regc > } > else > { > + for (i =3D zmm_endlo_regnum; i < I387_ZMMENDH_REGNUM (tdep); i++) > + regcache->raw_supply (i, > + XSAVE_AVX512_ZMM_H_ADDR (tdep, regs, i)); This covers the upper halves of zmm16 to zmm31. Looking at the function it= looks like the lower halves are covered in separate cases avx512_ymmh_avx512 and avx512_xmmh_avx512. Maybe reflect this in the comment? It currently sugge= sts that it handles the entire upper 16 registers. > for (i =3D I387_YMM16H_REGNUM (tdep); > i < I387_YMMH_AVX512_END_REGNUM (tdep); > i++) > @@ -1340,7 +1343,8 @@ i387_collect_xsave (const struct regcach > gdb_byte *p, *regs =3D (gdb_byte *) xsave; > gdb_byte raw[I386_MAX_REGISTER_SIZE]; > ULONGEST initial_xstate_bv, clear_bv, xstate_bv =3D 0; > - unsigned int i; > + unsigned int i, zmm_endlo_regnum =3D I387_ZMM0H_REGNUM (tdep) > + + std::min (tdep->num_zmm_regs, 16); > enum > { > x87_ctrl_or_mxcsr =3D 0x1, > @@ -1441,9 +1445,8 @@ i387_collect_xsave (const struct regcach > i < I387_MPXEND_REGNUM (tdep); i++) > memset (XSAVE_MPX_ADDR (tdep, regs, i), 0, 8); >=20 > - if ((clear_bv & (X86_XSTATE_ZMM_H | X86_XSTATE_ZMM))) > - for (i =3D I387_ZMM0H_REGNUM (tdep); > - i < I387_ZMMENDH_REGNUM (tdep); i++) > + if ((clear_bv & X86_XSTATE_ZMM_H)) > + for (i =3D I387_ZMM0H_REGNUM (tdep); i < zmm_endlo_regnum; i++) > memset (XSAVE_AVX512_ZMM_H_ADDR (tdep, regs, i), 0, 32); >=20 > if ((clear_bv & X86_XSTATE_K)) > @@ -1453,6 +1456,8 @@ i387_collect_xsave (const struct regcach >=20 > if ((clear_bv & X86_XSTATE_ZMM)) > { > + for (i =3D zmm_endlo_regnum; i < I387_ZMMENDH_REGNUM (tdep); i++) > + memset (XSAVE_AVX512_ZMM_H_ADDR (tdep, regs, i), 0, 32); > for (i =3D I387_YMM16H_REGNUM (tdep); > i < I387_YMMH_AVX512_END_REGNUM (tdep); i++) > memset (XSAVE_YMM_AVX512_ADDR (tdep, regs, i), 0, 16); Looks OK to me. Regards, Markus. Intel Deutschland GmbH Registered Address: Am Campeon 10-12, 85579 Neubiberg, Germany Tel: +49 89 99 8853-0, www.intel.de Managing Directors: Christin Eisenschmid, Christian Lamprechter Chairperson of the Supervisory Board: Nicole Lau Registered Office: Munich Commercial Register: Amtsgericht Muenchen HRB 186928