From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from simark.ca (simark.ca [158.69.221.121]) by sourceware.org (Postfix) with ESMTPS id ACE3B381EC81 for ; Wed, 7 Dec 2022 19:25:40 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org ACE3B381EC81 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=simark.ca Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=simark.ca Received: from [172.16.0.64] (192-222-180-24.qc.cable.ebox.net [192.222.180.24]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by simark.ca (Postfix) with ESMTPSA id 284A71E0D3; Wed, 7 Dec 2022 14:25:40 -0500 (EST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=simark.ca; s=mail; t=1670441140; bh=nwSd3NLxgtVdN4to8SWyGUWiE7Qw757pjanDlDQasw4=; h=Date:Subject:To:Cc:References:From:In-Reply-To:From; b=chbAUds4gr1c+3zXOoGPO96i/g4XNGEkJ/5y1dIf6Uemj4B9LIO4hudtU0bcbowMN 7DhwIM7dgHjoiFZ6SvBTczRTDj+mrqR4Lub5jHa6/4SXa20XJFWwYD9EpziFE+x8lk whmgrn51avsQW7XYHgLrrgHfmS3O/oQn2eedI5cc= Message-ID: <559069a3-f3ce-2059-bf4a-44add43979f7@simark.ca> Date: Wed, 7 Dec 2022 14:25:39 -0500 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.4.1 Subject: Re: [PATCH v2 6/6] gdb/aarch64: Detect vector length changes when debugging remotely Content-Language: fr To: Luis Machado , Thiago Jung Bauermann Cc: gdb-patches@sourceware.org References: <20221126020452.1686509-1-thiago.bauermann@linaro.org> <20221126020452.1686509-7-thiago.bauermann@linaro.org> <87edtdiijv.fsf@linaro.org> <0605407e-a9bf-06c1-c513-e6202181a51f@arm.com> From: Simon Marchi In-Reply-To: <0605407e-a9bf-06c1-c513-e6202181a51f@arm.com> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-3.7 required=5.0 tests=BAYES_00,BODY_8BITS,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,NICE_REPLY_A,SPF_HELO_PASS,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On 12/7/22 12:05, Luis Machado via Gdb-patches wrote: > On 12/5/22 22:37, Thiago Jung Bauermann wrote: >> >> Luis Machado writes: >> >>> On 11/26/22 02:04, Thiago Jung Bauermann wrote: >>>> @@ -8068,6 +8078,21 @@ remote_target::process_stop_reply (struct stop_reply *stop_reply, >>>>          /* Expedited registers.  */ >>>>          if (!stop_reply->regcache.empty ()) >>>>        { >>>> +      /* If GDB already knows about this thread, we can give the >>>> +         architecture-specific code a chance to update the gdbarch based on >>>> +         the expedited registers.  */ >>>> +      if (find_thread_ptid (this, ptid) != nullptr) >>>> +        { >>>> +          stop_reply->arch = gdbarch_update_architecture (stop_reply->arch, >>>> +                                  stop_reply->regcache); >>>> + >>>> +          /* Save stop_reply->arch so that it can be returned by the >>>> +         thread_architecture method.  */ >>>> +          remote_thread_info *remote_thr = get_remote_thread_info (this, >>>> +                                       ptid); >>>> +          remote_thr->expedited_arch = stop_reply->arch; >>>> +        } >>>> + >>>>          struct regcache *regcache >>>>            = get_thread_arch_regcache (this, ptid, stop_reply->arch); >>>>    @@ -14382,6 +14407,23 @@ remote_target::thread_info_to_thread_handle (struct >>>> thread_info *tp) >>>>      return priv->thread_handle; >>>>    } >>>>    +struct gdbarch * >>>> +remote_target::thread_architecture (ptid_t ptid) >>>> +{ >>>> +  thread_info *thr = find_thread_ptid (this, ptid); >>>> +  remote_thread_info *remote_thr = nullptr; >>>> + >>>> +  if (thr != nullptr) >>>> +    remote_thr = get_remote_thread_info (thr); >>>> + >>>> +  if (remote_thr == nullptr || remote_thr->expedited_arch == nullptr) >>>> +    /* The default thread_architecture implementation is the one from >>>> +       process_stratum_target.  */ >>>> +    return process_stratum_target::thread_architecture(ptid); >>>> + >>>> +  return remote_thr->expedited_arch; >>>> +} >>>> + >>>>    bool >>>>    remote_target::can_async_p () >>>>    { >>> >>> Just recalled this. One deficiency of the current SVE implementation >>> is that it doesn't have a proper way to hide the Z register when they >>> don't have any meaningful state (SVE not active, so we have fpsimd >>> state). >> >> I see the SVE_HEADER_FLAG_SVE being checked/set in >> aarch64_linux_supply_sve_regset and aarch64_linux_collect_sve_regset >> (though in the latter it is set unconditionally), and also HAS_SVE_STATE >> being used in aarch64_sve_regs_copy_to_reg_buf and >> aarch64_sve_regs_copy_from_reg_buf. >> >> But I wasn't able to find similar logic regarding hiding the Z >> registers. Do you have any pointers? >> > > There isn't logic to do that at present. This is something I'm pondering for SME, and would possibly apply to > SVE as well. But it may be more complicated than useful. > >>> Given the VL will always be reported as > 0, even if there is no >>> active SVE state, gdb will always attempt to show Z registers. And >>> those are quite verbose. >> >> The VG pseudo-register is defined with an int type, so we could return a >> negative value to indicate that the Z registers are unused (and the >> module of the value would still be the vector granule). Would that >> be ok? > > I think a value of 0 would be fine for this purpose but... > >> >>> In this sense, the implementations of the the thread_architecture >>> methods for both native aarch64 and remote differ. While native >>> aarch64's implementation is capable of detecting the lack of active >>> SVE state, the remote implementation can't. If gdbserver decided to >>> drop the Z registers mid-execution (before the SVE state is not >>> active), there wouldn't be vg for gdb to make a decision. >> >> Looking at aarch64_linux_nat_target::thread_architecture my impression >> is that its result depends only on aarch64_sve_get_vq, so I don't see >> how it's different from the remote implementation. > > ... Right, but for native debugging we actually call ptrace to extract the vector length. That is completely independent from > the XML target description or the available register sets. > > In the case of remote debugging, the expedited registers are returned based on an existing register description. > > Take, for example, the following: > > - SVE state is enabled and we have the vg register. > - Program updates the SVE vector length and, as a result, we no longer have an active SVE state (reverted to FPSIMD state) > - GDB drops the sve feature from its target description. > - Program executes a sve instruction and SVE state is made active. > - gdbserver tries to tell gdb what the value of vg is, but gdb doesn't know anything about vg at this point, because it has dropped the sve feature previously. > > I think this is the downside of using expedited registers to communicate the architecture changes. For this case, returning a full XML would make it easy for gdb to > figure things out. > > But, going back to hiding sve/sme registers when they're not in use, I'm not yet sure that is something we should pursue. > > I'd say if you manage to get things working with the expedited registers (for the problematic cases Simon and I pointed out), that's fine. Reading this, I thought of another potential problem with using expedited registers to communicate the architectures change. When using the non-stop version of the remote protocol, we will have a stop reply for each thread that stops, so that's fine, we can infer each thread's tdesc. But with the all-stop variant of the protocol, you only get a stop reply for one thread, for a given stop. For instance, imagine you have two threads, thread 1 hits a breakpoint. GDB gets a stop reply for thread 1, and thread 2 is just considered stop because we're using the all-stop remote protocol. If we need to read thread 2's registers, we would need to know its tdesc. Its vg may have changed since last time, or it might be the first time we learned about it. In that case, I don't think we have a way out of adding some mean for gdbserver to tell gdb which tdesc applies to that thread. If it's the case, that we need to extend the RSP to allow fetching per-thread tdesc, here's the idea I had in mind for a while, to avoid adding too much overhead. Stop replies and the qXfer:threads:read reply could include a per-thread tdesc identifier. That tdesc identifier would be something short, either an incrementing number, or some kind of hash of the tdesc. It would be something opaque chosen by the stub. A new packet would be introduced to allow GDB to request the XML given that ID. On the GDB side, GDB would request the XML when encountering a tdesc ID it doesn't know about, and then cache the result. The additional overhead would be: - fetching each XML tdesc once, that seems inevitable (we just don't want to fetch the same XML tdesc over and over) - a few characters more in stop replies and qXfer:threads:read replies I believe this could be implemented in a backwards compatible way without even adding a new qSupported feature. XML is extensible by nature. And stop replies are too. The T stop reply doc says: Otherwise, GDB should ignore this ‘n:r’ pair and go on to the next; this allows us to extend the protocol in the future. So in theory we could add a `tdesc-id:1234` field without breaking old GDBs. However, those old GDBs would break when trying to read registers of threads with unexpected SVE lengths. Simon