From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by sourceware.org (Postfix) with ESMTPS id 1750A3858D20 for ; Fri, 3 Feb 2023 16:29:52 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 1750A3858D20 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=redhat.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1675441791; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=i98SYegXIoKRifL80rwPu2SJeREcEiE2bt6e4aZNccs=; b=AQcsgWiCs1snWZAMBqKuEDCy0ShP1pQ8zzcscWGIQbqj1hZ5S7VrGLDeV7/rs22DIT7HlP 4x1ptvnQLTOht0b98GA6gq9VtvDZgF/JzVZa2XmCBCtvjVt/MnNTjr554OPAUCygbVlwat 9T67AASKNx+L86Dzwz8xkIy8ytTNtjI= Received: from mail-qt1-f198.google.com (mail-qt1-f198.google.com [209.85.160.198]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-590-Ru8Wm73NNoKA9Y6ttrQAgQ-1; Fri, 03 Feb 2023 11:29:50 -0500 X-MC-Unique: Ru8Wm73NNoKA9Y6ttrQAgQ-1 Received: by mail-qt1-f198.google.com with SMTP id v8-20020a05622a144800b003ba0dc5d798so682760qtx.22 for ; Fri, 03 Feb 2023 08:29:50 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:message-id:date:references :in-reply-to:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=bRBGNxWJXEO5eRBKzFi7AfaD3vYWoeizp7bDAbRbWoY=; b=YtHpwCW6oLZODpp0nd/SFvjyBufzJlebJ/HLnYdLgJCaVoBWL8rZkQvFfpIyauDUV5 dxjrDpdYwRkfLifhCOCG9TYzmXj7lCSqY3G0wBbfLynJYhSO4Vj3yzUOg+lL4WCiJ0V7 MBJ/EmSjGxPB+76jQOeGsBt/Tm/X9pDeLhKuwoofmNLimz68QiAdWLfGEJAjXRIxQ5E5 i84DHX3waraKkWfJGnufTmtH7j86lM9EiFDiiuadGohJpf+yLnOn5qxvl/XePa3QYtP/ NLH2Bx5YAwe1Jyecbn984FTzNIhtg5zgU6ORXr+yq2+JKSpviWESPWdKXS/WZljO3id8 LbLw== X-Gm-Message-State: AO0yUKXvliZUqdiuGtuY03TDKM9zlF4eXH7zVmlzOAT4VMB/lvoB3I8Y /J1Gr28QgXFB/Gan3vJQOLBzJLkYxEwGmu8I4oAc4Oh97DqdxHvpnQ0mm1clBAHl0nf4Mw/bFzr bjDVxDhwQrZ5YwpRnLQD7+w== X-Received: by 2002:a05:622a:44e:b0:3b6:2e12:4d25 with SMTP id o14-20020a05622a044e00b003b62e124d25mr8258093qtx.31.1675441790086; Fri, 03 Feb 2023 08:29:50 -0800 (PST) X-Google-Smtp-Source: AK7set/ljoAfvF+LfM6Xq/gnJ8VXqExrWm45LOQ2MC8pCiObkcOLYfW/L2BnS5uJFUnHobo949GE5w== X-Received: by 2002:a05:622a:44e:b0:3b6:2e12:4d25 with SMTP id o14-20020a05622a044e00b003b62e124d25mr8258065qtx.31.1675441789746; Fri, 03 Feb 2023 08:29:49 -0800 (PST) Received: from localhost (95.72.115.87.dyn.plus.net. [87.115.72.95]) by smtp.gmail.com with ESMTPSA id q1-20020ac84101000000b003b86d9a3700sm1791258qtl.78.2023.02.03.08.29.48 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 03 Feb 2023 08:29:49 -0800 (PST) From: Andrew Burgess To: Thiago Jung Bauermann , Simon Marchi Cc: Thiago Jung Bauermann via Gdb-patches , Simon Marchi Subject: Re: [PATCH v3 5/8] gdbserver: Transmit target description ID in thread list and stop reply In-Reply-To: <875ycja2n6.fsf@linaro.org> References: <20230130044518.3322695-1-thiago.bauermann@linaro.org> <20230130044518.3322695-6-thiago.bauermann@linaro.org> <87bkmdtp4n.fsf@redhat.com> <502f0bd0-9b18-9a31-0094-7a9bd4778bd2@simark.ca> <87r0v7alq1.fsf@linaro.org> <45bd55f5-d33b-58f7-0ac6-13d31c4d03f5@simark.ca> <875ycja2n6.fsf@linaro.org> Date: Fri, 03 Feb 2023 16:29:47 +0000 Message-ID: <87v8kir9tw.fsf@redhat.com> MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-5.8 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: Thiago Jung Bauermann writes: > Simon Marchi writes: > >> On 2/2/23 14:52, Thiago Jung Bauermann wrote: >>>=20 >>> Simon Marchi writes: >>>=20 >>>>> Having read some of the later patches, I have some additional thought= s >>>>> here: >>>>> >>>>> I think we should make it explicit here that IDs are connection wide, >>>>> not per-process. We should also make it clear that GDB might[1] cach= e >>>>> target descriptions per remote connection, and so a remote target sho= uld >>>>> not reuse a target description ID except where the target description= is >>>>> identical. >>>=20 >>> Ah, good point. I will make that clarification. >>>=20 >>>>> [1] I say "GDB might" here because if we say "GDB will" then this wou= ld >>>>> imply each target description will only be asked for once. And I >>>>> figure, why be overly restrictive. >>>> >>>> Thanks for pointing this out, I had the same thought while reading the >>>> patch. >>>> >>>> In my original idea, I imagined that target description IDs could be >>>> some hashes computed from the XML content (a bit like git hashes or EL= F >>>> build IDs), such that a given target description would always have the >>>> same ID. This would give clients the possibility to cache target >>>> descriptions locally, a bit like the index cache. It did sound nice, >>>> but perhaps it's not really important. >>>=20 >>> Ah, sorry I misunderstood this part of your suggestion. I thought that >>> the caching was supposed to be limited to the duration of the connectio= n, >>> and thus the m_tdescs map in struct remote state would be enough to >>> provide that functionality. Do you mean that the cache should be on >>> disk, so that it survives GDB quitting? I can look into that if you wan= t >>> and implement it, if not complicated. >> >> To be clear, I'm not asking you to implement an on-disk cache, I'm just >> trying to think about what the limitations of the proposed solution are. >> Because mistakes or shortcomings introduced in the remote protocol (like >> any API / ABI meant to be stable) are difficult to fix afterwards. If >> there is a chance we want to do an on-disk cache (or share a cache >> between multiple concurrent connections), we should think about that >> now. > > Indeed, that's a good idea. > >> On one hand, perhaps we consider that target descriptions are small >> enough that there's no point in having a persistent cache like that. >> The time to fetch each target description once per connection is >> probably insignificant. > > I don't have a good intuition about that. My uneducated guess is that > it's not worth it, but as you say I think it's a good idea to leave the > door open if we can. > >> On the other hand, perhaps doing the hash approach is easy enough that >> we might as well do it, and that just leaves more doors open. In my >> mind, as a start, we could just pass the XML of the tdesc through a >> sha256 function, possibly doing something more sophisticated later (this >> means the hash of a given tdesc could change over time, but still two >> identical hashes mean identical tdescs). However, I don't think there >> are sha256 functions built in on the OSes we want to support, so we'd >> have to depend on some external library. And that complicates things >> quite a bit... We already have a CRC32 function in gdbserver, but I >> don't know if that's good enough to be confident there won't be >> collisions. > > I tried to research this a bit, but I'm not confident enough in my > knowledge of the subject to reach a conclusion. I only know enough about > hashes to be dangerous. I did find a hash testsuite called SMhasher > which seems to be well regarded, and its website=C2=B9 has > reports for CRC32=C2=B2, which shows that it fails many collision and > distribution bias tests. So it doesn't look like it is very reliable for > using as a hash. > > We do support libxxhash as an optional dependency, so we could use that > if gdbserver is linked with it. Its SMhasher report=C2=B3 looks good, IUU= C > (but not the 32 bits version though=E2=81=B4). > > We could also vendor coreutils's lib/sha256.[ch] files. They're 660 > lines in total. > >>> In this case, I think the tdesc ID should be of the format >>> :, so that can be =E2=80=9Csha1=E2=80=9D, = =E2=80=9Csha256=E2=80=9D etc to >>> indicate that is a hash with the given algo, or even >>> =E2=80=9Cnumber=E2=80=9D to indicate that it's a simple integer like it= is today. >>>=20 >>> Perhaps we can do the prefix thing even if not implementing the cache, >>> to leave the possibility of adding it in the future? >> >> I think I would choose between hash and number, but not do both, I don't >> see the need to have both. > > I suggested a prefix to designate an integer ID as a way to implement > only this possibility for now, but which would allow implementing hash > IDs later, as additional prefixes. Or another possibility is that if we > later decide to support hashes, we could add a separate protocol feature > to signal that. So, I'd also be in favour of supporting the simple "just a number" scheme. Not everything that talks to GDB is our GDBServer, and not every target might have the right hashing library. I'd hate to force folk to have to implement some hashing code, just to use this feature of the remote protocol. I have a proposal for how to negotiate different ID types without adding a prefix to every ID sent from gdbserver... We don't need to use a prefix in the actual ID. I believe you're already adding a qSupported feature for per-thread-tdesc. The qSupported mechanism already supports value passing. So, we could pass a value back and forth in the qSupported negotiation where GDB and GDBServer can agree on a numbering scheme. If GDB sends out: per-thread-tdesc=3Dvalue Where `value` is a comma separated list of letters, each letter representing a scheme that GDB understands: N: Each ID is a number, H: Each ID is a hash of the target description contents. Then GDBServer replies with: per-thread-tdesc=3Dvalue Where `value` is a single letter drawn from the set of letters that GDB sent out, this is the scheme that GDBServer will be using. Thus, GDBServer might reply with _one_ of the following: per-thread-tdesc=3DN per-thread-tdesc=3DH If GDB only offers a scheme that GDBServer doesn't understand, e.g. GDB sends: per-thread-tdesc=3DX Then GDBServer just doesn't send this feature back, effectively claiming it doesn't support 'per-thread-tdesc'. The great thing about this is that the hard work here is all in writing the documentation. To begin with you _only_ need to support the 'N' scheme. GDB always sends out just 'per-thread-tdesc=3DN', GDBServer just checks that the value is 'N', and replies 'per-thread-tdesc=3DN'. GDB checks that the return value is also 'N', and then we carry on as before. BUT, we now have a planned route to add more complex schemes in the future. Thanks, Andrew