From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-oi1-x232.google.com (mail-oi1-x232.google.com [IPv6:2607:f8b0:4864:20::232]) by sourceware.org (Postfix) with ESMTPS id 86A363857732 for ; Wed, 26 Jul 2023 20:01:52 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 86A363857732 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linaro.org Received: by mail-oi1-x232.google.com with SMTP id 5614622812f47-3a04e5baffcso189774b6e.3 for ; Wed, 26 Jul 2023 13:01:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1690401712; x=1691006512; h=mime-version:message-id:date:in-reply-to:subject:cc:to:from :user-agent:references:from:to:cc:subject:date:message-id:reply-to; bh=hHxkqHXPaHkxO0wGZjVh0qzGhA4FvNGv7bybEFWVjpg=; b=ftPdRlvQFhS3Goz8ZgWE1dSBfRgBy1ZTCYyWu44tT/Gqfy9AyPyBCxZmWCtor7qut6 Blk6N6aHKHa9568tmrYLaJQhSqKta0s8CyzjbPknD3tBSBjO+JvI7jfd4L8GhXmSzZRa PT9rFKTjvwWNZ2gpYhxuTSm3I0YTYckCie9ce+kS+++2Y2SHSPULT0pLhN/Zv5YGCnT7 w0HTVXs0yK9pOzSNXZQjeYEppMlg0XnWkIv+TTijcNcYGiKJVZlltFrk0TRazbWl7bZP UD6bMrV3hJFkFMkK1q74sY+71WFWD+4YIL/9w9ozgXUO/uoHR3KLKazZbCwIdfuHPJUw XfYQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1690401712; x=1691006512; h=mime-version:message-id:date:in-reply-to:subject:cc:to:from :user-agent:references:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=hHxkqHXPaHkxO0wGZjVh0qzGhA4FvNGv7bybEFWVjpg=; b=VRxjqJwDqVxRzQ5KktgWkRlNjrw9hKlmRlU2+0BTU76zdNEhvtxHcyTZ1A4KVftUTt x2837rWgvcns22+hdREqy8S1hvx8Ccv1pm2dbvkINX5X5NqMvXg6uq9/dQp7J1AAmwQm zSGTQwHomhRCpSmiy6jR1grFUYAuVpz+IYZLX8yXjQkse+nlvcfzqjzU7kIX6aKAp31V dGoq35jC0DIJXAIj91uwCyUM49FMZtKa4998ofjtJc/Dj6CrI1noibF6mCJXsBBrXPOf P8zJT6KIyx3SR8r7b64PwlFx+Cbp3AOwpH759ueBJcXkYABGmZsIslKy5Js3aNxmA6ah 8/fg== X-Gm-Message-State: ABy/qLYh8/VRhqJb6U1GpfaO29ziyToYrg+5ODsmqjDKxsWfKCugTbr7 5dGGS0fmFOS5ECs6i9HTIZWiiQ== X-Google-Smtp-Source: APBJJlHFWEXoqnRX2FSPSxgR4pcRLcN0BZCc6y/pvUW4uXclNco/EE5/NMMXp1WatU17IRFlD+Ca3g== X-Received: by 2002:a05:6808:1293:b0:3a3:f289:8205 with SMTP id a19-20020a056808129300b003a3f2898205mr757438oiw.35.1690401711636; Wed, 26 Jul 2023 13:01:51 -0700 (PDT) Received: from localhost ([2804:14d:7e39:8470:66bb:15b5:6278:c85d]) by smtp.gmail.com with ESMTPSA id bg31-20020a056808179f00b003a5a82ef032sm4715504oib.32.2023.07.26.13.01.50 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 26 Jul 2023 13:01:50 -0700 (PDT) References: <20230630134616.1238105-1-luis.machado@arm.com> <20230630134616.1238105-6-luis.machado@arm.com> User-agent: mu4e 1.10.5; emacs 28.2 From: Thiago Jung Bauermann To: Luis Machado Cc: gdb-patches@sourceware.org Subject: Re: [PATCH v3 05/16] [gdb/aarch64] sme: Enable SME registers and pseudo-registers In-reply-to: <20230630134616.1238105-6-luis.machado@arm.com> Date: Wed, 26 Jul 2023 17:01:47 -0300 Message-ID: <87pm4e8ms4.fsf@linaro.org> MIME-Version: 1.0 Content-Type: text/plain X-Spam-Status: No, score=-11.7 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,GIT_PATCH_0,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: Hello, Just minor nits... Luis Machado via Gdb-patches writes: > The SME (Scalable Matrix Extension) [1] exposes a new matrix register ZA with > variable sizes. It also exposes a new mode called streaming mode. > > Similarly to SVE, the ZA register size is dictated by a vector length, but the > SME vector length is called streaming vetor length. The total size for s/vetor/vector/ > diff --git a/gdb/aarch64-linux-nat.c b/gdb/aarch64-linux-nat.c > index a775631e9ec..cb97b205c86 100644 > --- a/gdb/aarch64-linux-nat.c > +++ b/gdb/aarch64-linux-nat.c > @@ -55,6 +55,7 @@ > #include "arch/aarch64-mte-linux.h" > > #include "nat/aarch64-mte-linux-ptrace.h" > +#include "arch/aarch64-scalable-linux.h" > > #include > > @@ -313,8 +314,11 @@ store_fpregs_to_thread (const struct regcache *regcache) > } > } > > -/* Fill GDB's register array with the sve register values > - from the current thread. */ > +/* Fill GDB's REGCACHE with the valid SVE register values from the current > + thread. The function uses the regcache's ptid to identify the thread, so it's not from the current thread that it gets the register values. > + > + This function handles reading data from SVE or SSVE states, depending > + on which state is active at the moment. */ > > static void > fetch_sveregs_from_thread (struct regcache *regcache) > @@ -323,8 +327,11 @@ fetch_sveregs_from_thread (struct regcache *regcache) > aarch64_sve_regs_copy_to_reg_buf (regcache->ptid ().lwp (), regcache); > } > > -/* Store to the current thread the valid sve register > - values in the GDB's register array. */ > +/* Store the valid SVE register values from GDB's REGCACHE to the current > + thread. Same comment about the current thread here. > + > + This function handles writing data to SVE or SSVE states, depending > + on which state is active at the moment. */ > > static void > store_sveregs_to_thread (struct regcache *regcache) > @@ -334,6 +341,41 @@ store_sveregs_to_thread (struct regcache *regcache) > aarch64_sve_regs_copy_from_reg_buf (regcache->ptid ().lwp (), regcache); > } > > +/* Fill GDB's REGCACHE with the ZA register set contents from the > + current thread. If there is no active ZA registe state, make the > + ZA register contents zero. */ Same comment about the current thread here. Also, s/registe/register/ > + > +static void > +fetch_za_from_thread (struct regcache *regcache) > +{ > + aarch64_gdbarch_tdep *tdep > + = gdbarch_tdep (regcache->arch ()); > + > + /* Read ZA state from the thread to the register cache. */ > + aarch64_za_regs_copy_to_reg_buf (regcache->ptid ().lwp (), > + regcache, > + tdep->sme_za_regnum, > + tdep->sme_svg_regnum, > + tdep->sme_svcr_regnum); > +} > + > +/* Store the NT_ARM_ZA register set contents from GDB's REGCACHE to the current > + thread. */ Same comment about the current thread here. > + > +static void > +store_za_to_thread (struct regcache *regcache) > +{ > + aarch64_gdbarch_tdep *tdep > + = gdbarch_tdep (regcache->arch ()); > + > + /* Write ZA state from the register cache to the thread. */ > + aarch64_za_regs_copy_from_reg_buf (regcache->ptid ().lwp (), > + regcache, > + tdep->sme_za_regnum, > + tdep->sme_svg_regnum, > + tdep->sme_svcr_regnum); > +} > + > @@ -890,21 +959,27 @@ aarch64_linux_nat_target::thread_architecture (ptid_t ptid) > if (gdbarch_bfd_arch_info (inf->gdbarch)->bits_per_word == 32) > return inf->gdbarch; > > - /* Only return it if the current vector length matches the one in the tdep. */ > + /* Only return the inferior's gdbarch if both vq and svq match the ones in > + the tdep. */ > aarch64_gdbarch_tdep *tdep > = gdbarch_tdep (inf->gdbarch); > uint64_t vq = aarch64_sve_get_vq (ptid.lwp ()); > - if (vq == tdep->vq) > + uint64_t svq = aarch64_za_get_svq (ptid.lwp ()); > + if (vq == tdep->vq && svq == tdep->sme_svq) > return inf->gdbarch; > > - /* We reach here if the vector length for the thread is different from its > + /* We reach here if any vector length for the thread is different from its > value at process start. Lookup gdbarch via info (potentially creating a > - new one) by using a target description that corresponds to the new vq value > - and the current architecture features. */ > + new one) by using a target description that corresponds to the new vq/svq > + value and the current architecture features. */ > > const struct target_desc *tdesc = gdbarch_target_desc (inf->gdbarch); > aarch64_features features = aarch64_features_from_target_desc (tdesc); > features.vq = vq; > + /* We cast to uint8_t here because the features struct can get large, and it > + is important to store the information in as little storage as > + possible. */ Is it because features is used as a key in tdesc_aarch64_map? Mentioning in the comment why it needs to be small would be useful. > + features.svq = (uint8_t) svq; > > struct gdbarch_info info; > info.bfd_arch_info = bfd_lookup_arch (bfd_arch_aarch64, bfd_mach_aarch64); > @@ -2802,6 +3124,122 @@ aarch64_pseudo_read_value_1 (struct gdbarch *gdbarch, > return result_value; > } > > +/* Helper function for reading/writing ZA pseudo-registers. Given REGNUM, > + a ZA pseudo-register number, return, in OFFSETS, the information on positioning > + of the bytes that must be read from/written to. */ > + > +static void > +aarch64_za_offsets_from_regnum (struct gdbarch *gdbarch, int regnum, > + struct za_offsets &offsets) > +{ > + aarch64_gdbarch_tdep *tdep = gdbarch_tdep (gdbarch); > + > + gdb_assert (tdep->has_sme ()); > + gdb_assert (tdep->sme_svq > 0); > + gdb_assert (tdep->sme_pseudo_base <= regnum); > + gdb_assert (regnum < tdep->sme_pseudo_base + tdep->sme_pseudo_count); > + > + struct za_pseudo_encoding encoding; > + > + /* Decode the ZA pseudo-register number. */ > + aarch64_za_decode_pseudos (gdbarch, regnum, encoding); > + > + /* Fetch the streaming vector length. */ > + size_t svl = sve_vl_from_vq (tdep->sme_svq); > + > + if (is_sme_tile_slice_pseudo_register (gdbarch, regnum)) > + { > + if (encoding.horizontal) > + { > + /* Horizontal tile slices are contiguous ranges of svl bytes. */ > + > + /* The starting offset depends on the tile index (to locate the tile > + in the ZA buffer), the slice index (to locate the slice within the > + tile) and the qualifier. */ > + offsets.starting_offset > + = encoding.tile_index * svl + encoding.slice_index > + * (svl >> encoding.qualifier_index); > + /* Horizontal tile slice data is contiguous and thus doesn't have > + a stride. */ > + offsets.stride_size = 0; > + /* Horizontal tile slice data is contiguous and thus only has 1 > + chunk. */ > + offsets.chunks = 1; > + /* The chunk size is always svl bytes. */ > + offsets.chunk_size = svl; > + } > + else > + { > + /* Vertical tile slices are non-contiguous ranges of > + (1 << qualifier_index) bytes. */ > + > + /* The starting offset depends on the tile number (to locate the > + tile in the ZA buffer), the slice index (to locate the element > + within the tile slice) and the qualifier. */ > + offsets.starting_offset > + = encoding.tile_index * svl + encoding.slice_index > + * (1 << encoding.qualifier_index); > + /* The offset between vertical tile slices depends on the qualifier > + and svl. */ > + offsets.stride_size = svl << (encoding.qualifier_index); The parentheses above are unnecessary. > + /* The number of chunks depends on svl and the qualifier size. */ > + offsets.chunks = svl >> encoding.qualifier_index; > + /* The chunk size depends on the qualifier. */ > + offsets.chunk_size = 1 << encoding.qualifier_index; > + } > + } > + else > + { > + /* ZA tile pseudo-register. */ > + > + /* Starting offset depends on the tile index and qualifier. */ > + offsets.starting_offset = (encoding.tile_index) * svl; The parentheses above are unnecessary. > + /* The offset between tile slices depends on the qualifier and svl. */ > + offsets.stride_size = svl << (encoding.qualifier_index); The parentheses above are unnecessary. > + /* The number of chunks depends on the qualifier and svl. */ > + offsets.chunks = svl >> encoding.qualifier_index; > + /* The chunk size is always svl bytes. */ > + offsets.chunk_size = svl; > + } > +} > +/* See nat/aarch64-scalable-linux-ptrace.h. */ > + > +bool > +write_sve_header (int tid, const struct user_sve_header &header) > +{ > + struct iovec iovec; > + > + iovec.iov_len = sizeof (header); > + iovec.iov_base = (void *) &header; Unnecessary cast. > + > + if (ptrace (PTRACE_SETREGSET, tid, NT_ARM_SVE, &iovec) < 0) > + { > + /* SVE is not supported. */ > + return false; > + } > + return true; > +} > + > +/* See nat/aarch64-scalable-linux-ptrace.h. */ > + > +bool > +read_ssve_header (int tid, struct user_sve_header &header) > +{ > + struct iovec iovec; > + > + iovec.iov_len = sizeof (header); > + iovec.iov_base = &header; > + > + if (ptrace (PTRACE_GETREGSET, tid, NT_ARM_SSVE, &iovec) < 0) > + { > + /* SSVE is not supported. */ > + return false; > + } > + return true; > +} > + > +/* See nat/aarch64-scalable-linux-ptrace.h. */ > + > +bool > +write_ssve_header (int tid, const struct user_sve_header &header) > +{ > + struct iovec iovec; > + > + iovec.iov_len = sizeof (header); > + iovec.iov_base = (void *) &header; Unnecessary cast. > + > + if (ptrace (PTRACE_SETREGSET, tid, NT_ARM_SSVE, &iovec) < 0) > + { > + /* SSVE is not supported. */ > + return false; > + } > + return true; > +} > + > +/* See nat/aarch64-scalable-linux-ptrace.h. */ > + > +bool > +read_za_header (int tid, struct user_za_header &header) > +{ > + struct iovec iovec; > + > + iovec.iov_len = sizeof (header); > + iovec.iov_base = &header; > + > + if (ptrace (PTRACE_GETREGSET, tid, NT_ARM_ZA, &iovec) < 0) > + { > + /* ZA is not supported. */ > + return false; > + } > + return true; > +} > + > +/* See nat/aarch64-scalable-linux-ptrace.h. */ > + > +bool > +write_za_header (int tid, const struct user_za_header &header) > +{ > + struct iovec iovec; > + > + iovec.iov_len = sizeof (header); > + iovec.iov_base = (void *) &header; Unnecessary cast. > + > + if (ptrace (PTRACE_SETREGSET, tid, NT_ARM_ZA, &iovec) < 0) > + { > + /* ZA is not supported. */ > + return false; > + } > + return true; > +} > +/* See nat/aarch64-scalable-linux-ptrace.h. */ > + > +bool > +aarch64_store_za_regset (int tid, const gdb::byte_vector &za_state) > +{ > + struct iovec iovec; > + iovec.iov_len = za_state.size (); > + iovec.iov_base = (void *) za_state.data (); Unnecessary cast. > + > + if (ptrace (PTRACE_SETREGSET, tid, NT_ARM_ZA, &iovec) < 0) > + perror_with_name (_("Failed to write to the NT_ARM_ZA register set.")); > + > + return true; > +} > + > +/* See nat/aarch64-scalable-linux-ptrace.h. */ > + > +void > +aarch64_initialize_za_regset (int tid) > +{ > + /* First fetch the NT_ARM_ZA header so we can fetch the streaming vector > + length. */ > + struct user_za_header header; > + if (!read_za_header (tid, header)) > + error (_("Failed to read NT_ARM_ZA header.")); > + > + /* The vector should be default-initialized to zero, and we should account > + for the payload as well. */ > + std::vector za_new_state (ZA_PT_SIZE (sve_vq_from_vl (header.vl))); > + > + /* Adjust the header size since we are adding the initialized ZA > + payload. */ > + header.size = ZA_PT_SIZE (sve_vq_from_vl (header.vl)); > + > + /* Overlay the modified header onto the new ZA state. */ > + const gdb_byte *base = (gdb_byte *) &header; > + memcpy (za_new_state.data (), base, sizeof (user_za_header)); > + > + /* Set the ptrace request up and update the NT_ARM_ZA register set. */ > + struct iovec iovec; > + iovec.iov_len = za_new_state.size (); > + iovec.iov_base = (void *) za_new_state.data (); Unnecessary cast. > + > + if (ptrace (PTRACE_SETREGSET, tid, NT_ARM_ZA, &iovec) < 0) > + perror_with_name (_("Failed to initialize the NT_ARM_ZA register set.")); > + > + /* The NT_ARM_ZA register set should now contain a zero-initialized ZA > + payload. */ > +} > + -- Thiago