From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from simark.ca (simark.ca [158.69.221.121]) by sourceware.org (Postfix) with ESMTPS id 131003858D32 for ; Fri, 12 Jan 2024 19:12:46 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 131003858D32 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=simark.ca Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=simark.ca ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 131003858D32 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=158.69.221.121 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1705086768; cv=none; b=x8dsqr758NxQeyscMUA7x/t8PX6gsG522+f3YSsuA7Q7SLuPdRCXOjMfrzEsvkj33kBFl+kAPkYrSA8qJPH8zBRMGM58inEdhUu6r8w46ce6xYPw+UMnAP2s2UytpaTEWPCvw1hlJ17UZOX0xrkqw8BnZkX8wsobfhxSAxxtsKQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1705086768; c=relaxed/simple; bh=4gvY5v8u4ilZGKE6Kn2C21JJmRRbG7GjSnFE9T7ttQA=; h=DKIM-Signature:Message-ID:Date:MIME-Version:Subject:To:From; b=XjSNje3HX1fQP4QAHqh0idehcboWtxiENCuXHRbdro+C7lEDk8aObYIPN63PupElZJNRPJGqL6XjHI7g6a2fXZN0Wy0fEfcRAJN2NZaggJHtmhTc8MBxNxWMzEA9dipZCiIVY269qSm38NNSXL7abRPrpMLWYuTp0QroW3zyHcI= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=simark.ca; s=mail; t=1705086765; bh=4gvY5v8u4ilZGKE6Kn2C21JJmRRbG7GjSnFE9T7ttQA=; h=Date:Subject:To:Cc:References:From:In-Reply-To:From; b=QcHeSmOaij5uhKg4X2ihNtbqgpftA6HxqvEfWOSWmxWwyaqTXQHwcJVdWezr9DGrW Uqcn7NCb7sqD/TbcVr0WnI6wDCGIcu1MT6SjdeD3OH6IUVU/YzMYwNpz7DsFVQrjI+ bekD1STszMBoy3WjOnitvlIu4u4Q6f2lhTBbLhLE= Received: from [172.16.0.192] (192-222-143-198.qc.cable.ebox.net [192.222.143.198]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature ECDSA (prime256v1) server-digest SHA256) (No client certificate requested) by simark.ca (Postfix) with ESMTPSA id 661821E098; Fri, 12 Jan 2024 14:12:45 -0500 (EST) Message-ID: Date: Fri, 12 Jan 2024 14:12:44 -0500 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [PATCH 00/19] Add hash table to gdbsupport To: Tom Tromey Cc: gdb-patches@sourceware.org References: <20230407-t-robin-hood-hash-v1-0-900d93ef1510@tromey.com> <87jzpoft0q.fsf@tromey.com> <87v87zpwuw.fsf@tromey.com> <87r0imo1ie.fsf@tromey.com> Content-Language: fr From: Simon Marchi In-Reply-To: <87r0imo1ie.fsf@tromey.com> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-4.8 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,SPF_HELO_PASS,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On 1/12/24 13:22, Tom Tromey wrote: > Simon> I spent a bit of time reading the interface of your hash table, and that > Simon> was a point I found unfortunate. The type V must have a default > Simon> constructor and an "empty" state (I guess that's what you mean by > Simon> sentinel), even if it doesn't make sense for the rest of the program. > > It's maybe worth noting that this is not a step backward from htab_t, > which only stores pointers. That's true, in the sense that you can still decide to store pointers in the gdb::traited_hash_table. But if you want to change it so that it stores objects directly (which I guess is an advantage because it's one less indirection), then you have to introduce this new "invalid" or "empty" state for objects for that class. It therefore removes the guarantee you previously had for objects of that class to always be valid, and it becomes possible for objects used elsewhere to be (by mistake or not) in that new "empty" state. It's one more thing to reason about, and some more potential pitfalls. > Simon> How do other implementations (of open addessing hash tables) typically > Simon> deal with this? From what I recally, other implementations I have used > Simon> in the past didn't have this requirement. > > I don't know in general. I looked at one and it uses a separate vector > of flag bytes to indicate which values are valid. I looked at this one: https://github.com/martinus/unordered_dense/tree/main >From what I understand, the underlying storage is a vector, and they somehow make it so all vector elements are always occupied, there are no empty slots. > Simon> And that makes me think that a question I did not see answered is: I > Simon> don't want to undermine your work, but what is the rationale for > Simon> implementing it ourselves? > > In my view, integrating an external C++ package like this is pain both > legally (at least, you have to find one with the appropriate license) > and technically. Even just trying to integrate the GCC hash table was a > pain. The typical problem, IMO, is that it's unusual to write a > standalone class like this, instead one introduces dependencies on a > variety of other things. I did have some experiences (with folly notably) where trying to use one data structure pulled in a bunch of files and it quickly became unmanageable, but that was not the norm. But in my experience, most of the time it's just one header file to copy over, with no dependency. The one I linked above seems to be just that (I gave it a quick try in GDB). And the license being MIT, I don't think that's a problem. My understanding is that it's permitted in this direction, and we use other MIT-licensed stuff. Simon