From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 44378 invoked by alias); 25 Jun 2019 02:58:20 -0000 Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-owner@gcc.gnu.org Received: (qmail 44366 invoked by uid 89); 25 Jun 2019 02:58:20 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-6.3 required=5.0 tests=AWL,BAYES_00,FREEMAIL_FROM,RCVD_IN_DNSWL_NONE,SPF_PASS autolearn=ham version=3.3.1 spammy=H*f:sk:hC72QcX, H*i:CAFiYyc3, H*i:ZKq_JCtAQVqNb8y, H*i:sk:hC72QcX X-HELO: mail-qt1-f193.google.com Received: from mail-qt1-f193.google.com (HELO mail-qt1-f193.google.com) (209.85.160.193) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Tue, 25 Jun 2019 02:58:18 +0000 Received: by mail-qt1-f193.google.com with SMTP id m29so16888284qtu.1 for ; Mon, 24 Jun 2019 19:58:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=subject:to:cc:references:from:message-id:date:user-agent :mime-version:in-reply-to:content-language:content-transfer-encoding; bh=5x5iiyLGpasIlmPdB2ODwSndZdCReIs5mWzo9rHwJws=; b=CA135LbekGcgz8MH40bAFetDiYy1TyG9Hk5t0yt+xLI9fIY9KU4F6EeIv9IczmjiPr Pf30aVeZ9SEHQqHCtQ4xfj4WtE1vXkipeBfDvKAVmu75WorPrJ7UKxc65FFIT6jFQBik tpSdcOe3Xx5vCA9cLF5pRG1taPmwFA+wBPJ39SkUdV4au78JzAHx6YbL1dgL8yB9zHO8 V/zXVgBsubMZIN7Ki40zM2jn1oFEbXx3eyyXKG+Bmfe6xEbVgBHJ6CVdPcwmxv6Rcvbv IJLzPf5Iu5HnnnNy5PsAtjbreW0OFpu1++VeiixOMPa2m5m2wLbUBc3TjzY4ASDr4Vcy zVuA== Return-Path: Received: from [192.168.0.41] (75-166-109-122.hlrn.qwest.net. [75.166.109.122]) by smtp.gmail.com with ESMTPSA id s23sm9141389qtj.56.2019.06.24.19.58.15 (version=TLS1_3 cipher=AEAD-AES128-GCM-SHA256 bits=128/128); Mon, 24 Jun 2019 19:58:15 -0700 (PDT) Subject: Re: [PATCH] let hash-based containers work with non-trivial types (PR 90923) To: Richard Biener Cc: Jonathan Wakely , gcc mailing list References: <973790d7-7564-2f5a-100f-47bd930413c7@gmail.com> From: Martin Sebor Message-ID: <34f3eb35-c8d1-85a3-7c2f-fc86241548cd@gmail.com> Date: Tue, 25 Jun 2019 02:58:00 -0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.6.1 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit X-IsSubscribed: yes X-SW-Source: 2019-06/txt/msg00292.txt.bz2 On 6/24/19 11:42 AM, Richard Biener wrote: > On Mon, Jun 24, 2019 at 4:35 PM Martin Sebor wrote: >> >> On 6/24/19 6:11 AM, Richard Biener wrote: >>> On Fri, Jun 21, 2019 at 7:17 PM Martin Sebor wrote: >>>> >>>> On 6/21/19 6:06 AM, Richard Biener wrote: >>>>> On Wed, Jun 19, 2019 at 5:15 AM Martin Sebor wrote: >>>>>> >>>>>> Bug 90923 shows that even though GCC hash-table based containers >>>>>> like hash_map can be instantiated on types with user-defined ctors >>>>>> and dtors they invoke the dtors of such types without invoking >>>>>> the corresponding ctors. >>>>>> >>>>>> It was thanks to this bug that I spent a day debugging "interesting" >>>>>> miscompilations during GCC bootstrap (in fairness, it was that and >>>>>> bug 90904 about auto_vec copy assignment/construction also being >>>>>> hosed even for POD types). >>>>>> >>>>>> The attached patch corrects the hash_map and hash_set templates >>>>>> to invoke the ctors of the elements they insert and makes them >>>>>> (hopefully) safe to use with non-trivial user-defined types. >>>>> >>>>> Hum. I am worried about the difference of assignment vs. construction >>>>> in ::put() >>>>> >>>>> + bool ins = hash_entry::is_empty (*e); >>>>> + if (ins) >>>>> + { >>>>> + e->m_key = k; >>>>> + new ((void *) &e->m_value) Value (v); >>>>> + } >>>>> + else >>>>> + e->m_value = v; >>>>> >>>>> why not invoke the dtor on the old value and then the ctor again? >>>> >>>> It wouldn't work for self-assignment: >>>> >>>> Value &y = m.get_or_insert (key); >>>> m.put (key, y); >>>> >>>> The usual rule of thumb for writes into containers is to use >>>> construction when creating a new element and assignment when >>>> replacing the value of an existing element. >>>> >>>> Which reminds me that the hash containers, despite being copy- >>>> constructible (at least for POD types, they aren't for user- >>>> defined types), also aren't safe for assignment even for PODs. >>>> I opened bug 90959 for this. Until the assignment is fixed >>>> I made it inaccessibe in the patch (I have fixed the copy >>>> ctor to DTRT for non-PODs). >>>> >>>>> How is an empty hash_entry constructed? >>>> >>>> In hash_table::find_slot_with_hash simply by finding an empty >>>> slot and returning a pointer to it. The memory for the slot >>>> is marked "empty" by calling the Traits::mark_empty() function. >>>> >>>> The full type of hash_map is actually >>>> >>>> hash_map>>> Value, >>>> simple_hashmap_traits, >>>> Value> >>>> >>>> and simple_hashmap_traits delegates it to default_hash_traits >>>> whose mark_empty() just clears the void*, leaving the Value >>>> part uninitialized. That makes sense because we don't want >>>> to call ctors for empty entries. I think the questions one >>>> might ask if one were to extend the design are: a) what class >>>> should invoke the ctor/assignment and b) should it do it >>>> directly or via the traits? >>>> >>>>> ::remove() doesn't >>>>> seem to invoke the dtor either, instead it relies on the >>>>> traits::remove function? >>>> >>>> Yes. There is no Traits::construct or assign or copy. We >>>> could add them but I'm not sure I see to what end (there could >>>> be use cases, I just don't know enough about these classes to >>>> think of any). >>>> >>>> Attached is an updated patch with the additional minor fixes >>>> mentioned above. >>>> >>>> Martin >>>> >>>> PS I think we would get much better results by adopting >>>> the properly designed and tested standard library containers >>>> than by spending time trying to improve the design of these >>>> legacy classes. For simple uses that don't need to integrate >>>> with the GC machinery the standard containers should be fine >>>> (plus, it'd provide us with greater motivation to improve >>>> them and the code GCC emits for their uses). Unfortunately, >>>> to be able to use the hash-based containers we would need to >>>> upgrade to C++ 11. Isn't it time yet? >>> >>> I don't think so. I'm also not sure if C++11 on its own is desirable >>> or if it should be C++14 or later at that point. SLES 12 has GCC 4.8 >>> as host compiler (but also GCC 7+ optionally), SLES 15 has GCC 7. >>> SLES 11 already struggles currently (GCC 4.3) but I'd no longer >>> consider that important enough. >>> >>> Note any such change to libstdc++ containers should be complete >>> and come with both code-size and compile-time and memory-usage >>> measurements (both of GCC and other apps of course). >> >> Can I go ahead and commit the patch? > > I think we need to document the requirements on Value classes better. > > @@ -177,7 +185,10 @@ public: > INSERT); > bool ins = Traits::is_empty (*e); > if (ins) > - e->m_key = k; > + { > + e->m_key = k; > + new ((void *)&e->m_value) Value (); > + } > > this now requires a default constructor and I always forget about > differences between the different form of initializations -- for a POD, > does this zero the entry? Yes, the /value initialization/ above (via Value ()) zeroes out PODs, and it doesn't need a ctor. The placement new syntax should not require any changes to existing code. Does this updated comment for hash_map make it clearer? /* KeyId must be a trivial (POD) type. Value may be non-trivial (non-POD). Inserted elements are value-initialized either to zero for POD types or by invoking their default ctor. Removed elements are destroyed by invoking their dtor. On hash_map destruction all elements are removed. */ Martin > Otherwise looks OK to me - I was hoping Jonathan would chime in here. > > Richard. > >> Martin