From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <SRS0=8Ijp=56=simark.ca=simark@sourceware.org>
Received: from simark.ca (simark.ca [158.69.221.121])
	by sourceware.org (Postfix) with ESMTPS id 163EE3858C52
	for <gdb-patches@sourceware.org>; Thu,  2 Feb 2023 20:51:32 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 163EE3858C52
Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=simark.ca
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=simark.ca
Received: from [10.0.0.11] (unknown [217.28.27.60])
	(using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits)
	 key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256)
	(No client certificate requested)
	by simark.ca (Postfix) with ESMTPSA id B6DAF1E110;
	Thu,  2 Feb 2023 15:51:30 -0500 (EST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=simark.ca; s=mail;
	t=1675371091; bh=8FGeu2myX7qJxiaSNu7hc8nBaQJ43YYivp6A8jhI5bo=;
	h=Date:Subject:To:Cc:References:From:In-Reply-To:From;
	b=O8QeybHJrCWSYeYMQbEidrwXQLXbjUQ50kBGX4sHf919Sb1ZVIdF1H12XTM6lV0cA
	 rxgSKnn5qeVo/H1yeglm4YCXCJZEeLDA99voCYy1kl1vEI3GVjeYm4yA908lIMXRwM
	 razsAbJZzBo0EG6LblGqunJYtdoPigVAPnfq85hc=
Message-ID: <45bd55f5-d33b-58f7-0ac6-13d31c4d03f5@simark.ca>
Date: Thu, 2 Feb 2023 15:51:29 -0500
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101
 Thunderbird/102.7.1
Subject: Re: [PATCH v3 5/8] gdbserver: Transmit target description ID in
 thread list and stop reply
To: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
Cc: Andrew Burgess <aburgess@redhat.com>,
 Thiago Jung Bauermann via Gdb-patches <gdb-patches@sourceware.org>,
 Simon Marchi <simon.marchi@efficios.com>
References: <20230130044518.3322695-1-thiago.bauermann@linaro.org>
 <20230130044518.3322695-6-thiago.bauermann@linaro.org>
 <87bkmdtp4n.fsf@redhat.com> <502f0bd0-9b18-9a31-0094-7a9bd4778bd2@simark.ca>
 <87r0v7alq1.fsf@linaro.org>
Content-Language: en-US
From: Simon Marchi <simark@simark.ca>
In-Reply-To: <87r0v7alq1.fsf@linaro.org>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-Spam-Status: No, score=-4.8 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,NICE_REPLY_A,SPF_HELO_PASS,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
List-Id: <gdb-patches.sourceware.org>


On 2/2/23 14:52, Thiago Jung Bauermann wrote:
> 
> Simon Marchi <simark@simark.ca> writes:
> 
>>> Having read some of the later patches, I have some additional thoughts
>>> here:
>>>
>>> I think we should make it explicit here that IDs are connection wide,
>>> not per-process.  We should also make it clear that GDB might[1] cache
>>> target descriptions per remote connection, and so a remote target should
>>> not reuse a target description ID except where the target description is
>>> identical.
> 
> Ah, good point. I will make that clarification.
> 
>>> [1] I say "GDB might" here because if we say "GDB will" then this would
>>> imply each target description will only be asked for once.  And I
>>> figure, why be overly restrictive.
>>
>> Thanks for pointing this out, I had the same thought while reading the
>> patch.
>>
>> In my original idea, I imagined that target description IDs could be
>> some hashes computed from the XML content (a bit like git hashes or ELF
>> build IDs), such that a given target description would always have the
>> same ID.  This would give clients the possibility to cache target
>> descriptions locally, a bit like the index cache.  It did sound nice,
>> but perhaps it's not really important.
> 
> Ah, sorry I misunderstood this part of your suggestion. I thought that
> the caching was supposed to be limited to the duration of the connection,
> and thus the m_tdescs map in struct remote state would be enough to
> provide that functionality. Do you mean that the cache should be on
> disk, so that it survives GDB quitting? I can look into that if you want
> and implement it, if not complicated.

To be clear, I'm not asking you to implement an on-disk cache, I'm just
trying to think about what the limitations of the proposed solution are.
Because mistakes or shortcomings introduced in the remote protocol (like
any API / ABI meant to be stable) are difficult to fix afterwards.  If
there is a chance we want to do an on-disk cache (or share a cache
between multiple concurrent connections), we should think about that
now.

On one hand, perhaps we consider that target descriptions are small
enough that there's no point in having a persistent cache like that.
The time to fetch each target description once per connection is
probably insignificant.

On the other hand, perhaps doing the hash approach is easy enough that
we might as well do it, and that just leaves more doors open.  In my
mind, as a start, we could just pass the XML of the tdesc through a
sha256 function, possibly doing something more sophisticated later (this
means the hash of a given tdesc could change over time, but still two
identical hashes mean identical tdescs).  However, I don't think there
are sha256 functions built in on the OSes we want to support, so we'd
have to depend on some external library.  And that complicates things
quite a bit...  We already have a CRC32 function in gdbserver, but I
don't know if that's good enough to be confident there won't be
collisions.

> In this case, I think the tdesc ID should be of the format
> <prefix>:<hex number>, so that <prefix> can be “sha1”, “sha256” etc to
> indicate that <hex number> is a hash with the given algo, or even
> “number” to indicate that it's a simple integer like it is today.
> 
> Perhaps we can do the prefix thing even if not implementing the cache,
> to leave the possibility of adding it in the future?

I think I would choose between hash and number, but not do both, I don't
see the need to have both.

Simon