From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from simark.ca (simark.ca [158.69.221.121]) by sourceware.org (Postfix) with ESMTPS id A22DA3858D28 for ; Fri, 22 Dec 2023 16:25:18 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org A22DA3858D28 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=simark.ca Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=simark.ca ARC-Filter: OpenARC Filter v1.0.0 sourceware.org A22DA3858D28 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=158.69.221.121 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1703262320; cv=none; b=Vp8n6077U4tJcHpQHNQ0DInkQ3n3CekeBCV8mvkXLtNRtogSDPwcMQMb3LE4ObwrbUQnuuw2ij3ALKpRW/CJHQAa2XwBeLxiHGZqPlO8QSUvO4jrkOlVeZq3pO9u0iU+rbSyXwLp1XsaJmQIJVQ1QpaysFMD6h3SbNnjqXymhgs= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1703262320; c=relaxed/simple; bh=kBJfnrllg7EWQy3N6XSuUN/ZOa+u/XIThGaPo8BLQMw=; h=DKIM-Signature:Message-ID:Date:MIME-Version:Subject:To:From; b=LC7rrlrF7RLMeA+hF9/4pxxfa2E3mPx5wFTh/KMut51+Drwjvcd6dHqpTp+TYjHRNwSclWKPKd/mbET8Di90wWSzfvtMP3UgKj+fA1d6NOjUkoeR606CKcP8hlZ69n2Vwt566EWguskOzxCUA4Wg71vrWvQ+qimlyRa1uXog5EA= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=simark.ca; s=mail; t=1703262318; bh=kBJfnrllg7EWQy3N6XSuUN/ZOa+u/XIThGaPo8BLQMw=; h=Date:Subject:To:References:From:In-Reply-To:From; b=pylf4pgDWMGqzC3SMNNgA+IHtRvjDEUUbldBoijINkrrqD8hEOQNL2lnP70iTeMoW JCgF31FYxe3b6+VwBY+QtaT4UjWbiHTFntYC68M3JMs6SR6zeuyLbbPs3JvQH/zON2 NzE0ogE3RRXzwA5i+Y8hGtVEULoAjT8aXzf43Fsg= Received: from [10.0.0.11] (modemcable238.237-201-24.mc.videotron.ca [24.201.237.238]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature ECDSA (prime256v1) server-digest SHA256) (No client certificate requested) by simark.ca (Postfix) with ESMTPSA id 337FB1E0AC; Fri, 22 Dec 2023 11:25:18 -0500 (EST) Message-ID: Date: Fri, 22 Dec 2023 11:25:17 -0500 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [PATCH 26/26] gdbserver: allow gradually populating and selectively storing a regcache Content-Language: en-US To: Tankut Baris Aktemur , gdb-patches@sourceware.org References: <65c2083374c880b4802a8453dea62b23e2c0dda6.1677582745.git.tankut.baris.aktemur@intel.com> From: Simon Marchi In-Reply-To: <65c2083374c880b4802a8453dea62b23e2c0dda6.1677582745.git.tankut.baris.aktemur@intel.com> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-10.8 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,GIT_PATCH_0,SPF_HELO_PASS,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On 2023-02-28 06:28, Tankut Baris Aktemur via Gdb-patches wrote: > Currently, the regcache can be used by fetching all the registers from > the target. For some targets, this could be a costly operation > because there is a large number of threads with a large register file > each. In this patch, we revise the regcache implementation to allow > populating the contents gradually and storing the registers only when > they have updated values. When reading the subject (gradually populating) and the commit message (especially the paragraph above), I expected that the patch would make fetching registers from the target (and filling up the regcache) more lazy. This is the picture I had in mind: the regcache would start with all REG_UNKNOWN statuses. When a caller asks for a given register value, we would fetch that register value from the target, and its status would become REG_VALID. However, I see that regcache::fetch still asks the target to fetch all registers from the start. Is my understanding of what you're trying to achieve wrong? > > To this aim, we introduce a new register status: REG_DIRTY. This > status denotes that a register value has been cached and also updated. > When invalidating the cache, only the dirty registers are stored to > the target. In a typical debug session, it is more likely that only a > small subset of the register file has changed. Therefore, selectively > storing the registers on targets with many threads and registers can > save substantial costs, with respect to storing the whole set. Just curious, can you share some real world experience about those cost savings? I'm guessing we're talking about GPU registers here? > The collect_register function now performs a lazy fetch. If the > requested register value is not cached yet, it is requested from the > target. > > The supply_register function updates the status of the supplied > register as follows: if the register was not available in the cache, > its status becomes REG_VALID, denoting that the value is now cached. > If the register is supplied again, it becomes REG_DIRTY. I don't understand the logic here. If a register is in the REG_UNKNOWN state and I supply a value through regcache->raw_supply, I think it should go to the REG_DIRTY status. We don't know what the value of the register on the target is, it is only safe to assume that it's not the same value as what was supplied. > > The function that supply the whole register set (supply_regblock and > registers_from_string) are also updated to compare the present and new > values of each register, so that we can track the register statuses > (i.e. dirty or not) properly. > > Regression-tested on an X86_64 Linux target using the native-gdbserver > and native-extended-gdbserver board files. > --- > gdbserver/regcache.cc | 96 ++++++++++++++++++++++++++++++------ > gdbserver/regcache.h | 6 +++ > gdbsupport/common-regcache.h | 3 ++ > 3 files changed, 91 insertions(+), 14 deletions(-) > > diff --git a/gdbserver/regcache.cc b/gdbserver/regcache.cc > index cfb68774402..cf020985c31 100644 > --- a/gdbserver/regcache.cc > +++ b/gdbserver/regcache.cc > @@ -63,6 +63,18 @@ regcache::fetch () > gdb_assert (this->thread != nullptr); > switch_to_thread (this->thread); > > + /* If there are individually-fetched dirty registers, first > + store them, then fetch all. We prefer this to doing > + individual fetch for each registers, if needed, because it is > + more likely that very few registers are individually-fetched > + at this moment and that fetching all in one go is more > + efficient than fetching each reg one by one. */ > + for (int i = 0; i < tdesc->reg_defs.size (); ++i) > + { > + if (register_status[i] == REG_DIRTY) > + store_inferior_registers (this, i); > + } > + > /* Invalidate all registers, to prevent stale left-overs. */ > discard (); > fetch_inferior_registers (this, -1); > @@ -100,12 +112,17 @@ regcache_invalidate_thread (struct thread_info *thread) > void > regcache::invalidate () > { > - if (registers_fetched) > + scoped_restore_current_thread restore_thread; > + gdb_assert (this->thread != nullptr); > + switch_to_thread (this->thread); > + > + /* Store dirty registers individually. We prefer this to a > + store-all, because it is more likely that a small number of > + registers have changed. */ > + for (int i = 0; i < tdesc->reg_defs.size (); ++i) > { > - scoped_restore_current_thread restore_thread; > - gdb_assert (this->thread != nullptr); > - switch_to_thread (this->thread); > - store_inferior_registers (this, -1); > + if (register_status[i] == REG_DIRTY) > + store_inferior_registers (this, i); > } I think there's a design problem here: from what I understand, it's possible to get in a situation where all registers are REG_UNKNOWN, except one that is REG_DIRTY. When you "invalidate" the regcache, we'll call store_inferior_registers for the register that is REG_DIRTY. However, linux_process_target::fetch_registers, for instance, will write all registers in a given regset when asked to store one register contained in that regset. So it will end up writing garbage data for all the registers in that regset that were REG_UNKNOWN. > > discard (); > @@ -231,7 +248,8 @@ regcache::registers_to_string (char *buf) > unsigned char *regs = registers; > for (int i = 0; i < tdesc->reg_defs.size (); ++i) > { > - if (register_status[i] == REG_VALID) > + if (register_status[i] == REG_VALID > + || register_status[i] == REG_DIRTY) > { > bin2hex (regs, buf, register_size (tdesc, i)); > buf += register_size (tdesc, i) * 2; > @@ -258,9 +276,12 @@ regcache::registers_from_string (const char *buf) > if (len > tdesc->registers_size * 2) > len = tdesc->registers_size * 2; > } > - hex2bin (buf, registers, len / 2); > - /* All register data have been re-written. Update the statuses. */ > - memset (register_status, REG_VALID, tdesc->reg_defs.size ()); > + > + unsigned char *new_regs = > + (unsigned char *) alloca (tdesc->registers_size); I would prefer to use a gdb::byte_vector here, instead of alloca. Simon