From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 63535 invoked by alias); 25 Jun 2019 20:30:45 -0000 Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-owner@gcc.gnu.org Received: (qmail 63464 invoked by uid 89); 25 Jun 2019 20:30:40 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-10.8 required=5.0 tests=AWL,BAYES_00,FREEMAIL_FROM,GIT_PATCH_2,GIT_PATCH_3,KAM_ASCII_DIVIDERS,RCVD_IN_DNSWL_NONE,SPF_PASS autolearn=ham version=3.3.1 spammy=aval, H*M:ba9e, 1346 X-HELO: mail-qt1-f194.google.com Received: from mail-qt1-f194.google.com (HELO mail-qt1-f194.google.com) (209.85.160.194) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Tue, 25 Jun 2019 20:30:37 +0000 Received: by mail-qt1-f194.google.com with SMTP id p15so19971407qtl.3 for ; Tue, 25 Jun 2019 13:30:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=subject:to:cc:references:from:message-id:date:user-agent :mime-version:in-reply-to:content-language; bh=qJwAUbSt0+/LBuPRJ0ezOKBIu7MBcjkRUd2daPc3fEQ=; b=dV223B82ijs0mNzxkoBgQPdD5UScFsWEJfG6dMc/LLZmjogROsAsJ7MWOCAIU0Nl72 LrmplMkV0aMTT9aYe445V1r7JJNZ74QBvHLmqMjfO33D7Ul+EryvOCQyiPIKKeVym2UD Wvl+lo3+RKELdcsdzk4YGbUw1ekBAIiWzK/kQUxj37nLjvmc9Vv9ioxe1IKThWgeEg0V iUZYaBP7agxFQMW+6WBDKeFi+zPLqXefZM3IcGEGN+atAB9kLRuECvgoKdZPdybcYniS azjXEdszNxrz8enky+5+xedzjQu8INPrmI0PSwe/u+NRLrHJOplanz1dD+CjYs4tNt+m niww== Return-Path: Received: from [192.168.0.41] (75-166-109-122.hlrn.qwest.net. [75.166.109.122]) by smtp.gmail.com with ESMTPSA id e1sm8625605qtb.52.2019.06.25.13.30.33 (version=TLS1_3 cipher=AEAD-AES128-GCM-SHA256 bits=128/128); Tue, 25 Jun 2019 13:30:34 -0700 (PDT) Subject: Re: [PATCH] let hash-based containers work with non-trivial types (PR 90923) To: Jonathan Wakely , Richard Biener Cc: gcc mailing list References: <973790d7-7564-2f5a-100f-47bd930413c7@gmail.com> <20190625095328.GA30349@redhat.com> From: Martin Sebor Message-ID: <33e05a72-ba9e-ff20-bff5-6e8ccfdbc5b2@gmail.com> Date: Tue, 25 Jun 2019 20:30:00 -0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.6.1 MIME-Version: 1.0 In-Reply-To: <20190625095328.GA30349@redhat.com> Content-Type: multipart/mixed; boundary="------------1616A26DBA49C162EF014014" X-IsSubscribed: yes X-SW-Source: 2019-06/txt/msg00311.txt.bz2 This is a multi-part message in MIME format. --------------1616A26DBA49C162EF014014 Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit Content-length: 6927 On 6/25/19 3:53 AM, Jonathan Wakely wrote: > On 24/06/19 19:42 +0200, Richard Biener wrote: >> On Mon, Jun 24, 2019 at 4:35 PM Martin Sebor wrote: >>> >>> On 6/24/19 6:11 AM, Richard Biener wrote: >>> > On Fri, Jun 21, 2019 at 7:17 PM Martin Sebor wrote: >>> >> >>> >> On 6/21/19 6:06 AM, Richard Biener wrote: >>> >>> On Wed, Jun 19, 2019 at 5:15 AM Martin Sebor >>> wrote: >>> >>>> >>> >>>> Bug 90923 shows that even though GCC hash-table based containers >>> >>>> like hash_map can be instantiated on types with user-defined ctors >>> >>>> and dtors they invoke the dtors of such types without invoking >>> >>>> the corresponding ctors. >>> >>>> >>> >>>> It was thanks to this bug that I spent a day debugging >>> "interesting" >>> >>>> miscompilations during GCC bootstrap (in fairness, it was that and >>> >>>> bug 90904 about auto_vec copy assignment/construction also being >>> >>>> hosed even for POD types). >>> >>>> >>> >>>> The attached patch corrects the hash_map and hash_set templates >>> >>>> to invoke the ctors of the elements they insert and makes them >>> >>>> (hopefully) safe to use with non-trivial user-defined types. >>> >>> >>> >>> Hum.  I am worried about the difference of assignment vs. >>> construction >>> >>> in ::put() >>> >>> >>> >>> +      bool ins = hash_entry::is_empty (*e); >>> >>> +      if (ins) >>> >>> +       { >>> >>> +         e->m_key = k; >>> >>> +         new ((void *) &e->m_value) Value (v); >>> >>> +       } >>> >>> +      else >>> >>> +       e->m_value = v; >>> >>> >>> >>> why not invoke the dtor on the old value and then the ctor again? >>> >> >>> >> It wouldn't work for self-assignment: >>> >> >>> >>     Value &y = m.get_or_insert (key); >>> >>     m.put (key, y); >>> >> >>> >> The usual rule of thumb for writes into containers is to use >>> >> construction when creating a new element and assignment when >>> >> replacing the value of an existing element. >>> >> >>> >> Which reminds me that the hash containers, despite being copy- >>> >> constructible (at least for POD types, they aren't for user- >>> >> defined types), also aren't safe for assignment even for PODs. >>> >> I opened bug 90959 for this.  Until the assignment is fixed >>> >> I made it inaccessibe in the patch (I have fixed the copy >>> >> ctor to DTRT for non-PODs). >>> >> >>> >>> How is an empty hash_entry constructed? >>> >> >>> >> In hash_table::find_slot_with_hash simply by finding an empty >>> >> slot and returning a pointer to it.  The memory for the slot >>> >> is marked "empty" by calling the Traits::mark_empty() function. >>> >> >>> >> The full type of hash_map is actually >>> >> >>> >>     hash_map>> >>              Value, >>> >>              simple_hashmap_traits, >>> >>                                    Value> >>> >> >>> >> and simple_hashmap_traits delegates it to default_hash_traits >>> >> whose mark_empty() just clears the void*, leaving the Value >>> >> part uninitialized.  That makes sense because we don't want >>> >> to call ctors for empty entries.  I think the questions one >>> >> might ask if one were to extend the design are: a) what class >>> >> should invoke the ctor/assignment and b) should it do it >>> >> directly or via the traits? >>> >> >>> >>> ::remove() doesn't >>> >>> seem to invoke the dtor either, instead it relies on the >>> >>> traits::remove function? >>> >> >>> >> Yes.  There is no Traits::construct or assign or copy.  We >>> >> could add them but I'm not sure I see to what end (there could >>> >> be use cases, I just don't know enough about these classes to >>> >> think of any). >>> >> >>> >> Attached is an updated patch with the additional minor fixes >>> >> mentioned above. >>> >> >>> >> Martin >>> >> >>> >> PS I think we would get much better results by adopting >>> >> the properly designed and tested standard library containers >>> >> than by spending time trying to improve the design of these >>> >> legacy classes.  For simple uses that don't need to integrate >>> >> with the GC machinery the standard containers should be fine >>> >> (plus, it'd provide us with greater motivation to improve >>> >> them and the code GCC emits for their uses).  Unfortunately, >>> >> to be able to use the hash-based containers we would need to >>> >> upgrade to C++ 11.  Isn't it time yet? >>> > >>> > I don't think so.  I'm also not sure if C++11 on its own is desirable >>> > or if it should be C++14 or later at that point.  SLES 12 has GCC 4.8 >>> > as host compiler (but also GCC 7+ optionally), SLES 15 has GCC 7. >>> > SLES 11 already struggles currently (GCC 4.3) but I'd no longer >>> > consider that important enough. >>> > >>> > Note any such change to libstdc++ containers should be complete >>> > and come with both code-size and compile-time and memory-usage >>> > measurements (both of GCC and other apps of course). >>> >>> Can I go ahead and commit the patch? >> >> I think we need to document the requirements on Value classes better. >> >> @@ -177,7 +185,10 @@ public: >>                                                   INSERT); >>       bool ins = Traits::is_empty (*e); >>       if (ins) >> -       e->m_key = k; >> +       { >> +         e->m_key = k; >> +         new ((void *)&e->m_value) Value (); >> +       } >> >> this now requires a default constructor and I always forget about >> differences between the different form of initializations -- for a POD, >> does this zero the entry? >> >> Otherwise looks OK to me - I was hoping Jonathan would chime in here. > > The patch looks good to me. I 100% agree with Martin that put() should > not destroy an existing element and recreate a new one. Assignment is > the right way to update the value. > > And Value() is the correct initialization. > > The only change I'd suggest is something to enforce the "KeyId must be > a trivial (POD) type" requirement: > > #if __GNUC__ >= 6 && __cplusplus >= 201103L > static_assert(__is_pod(KeyId), "KeyId must be a trivial (POD) type"); > #endif > > This could actually be added for 4.7 and up (__is_pod is available > earlier, but __cplusplus isn't set correctly before that) but GCC 6 is > when we started to default to C++14. If the check only happens for > people bootstrapping with new versions of GCC it's still going to > catch misuses with non-POD types. I've updated the comments explaining the constraints in more detail and added the static assert to hash_table where it covers both has_map and hash_set. FWIW, I don't imagine anyone instantiating these containers on a non-trivial key types in GCC but having the assert there doesn't hurt anything. Martin --------------1616A26DBA49C162EF014014 Content-Type: text/x-patch; name="gcc-90923.diff" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="gcc-90923.diff" Content-length: 11332 Index: gcc/hash-map-tests.c =================================================================== --- gcc/hash-map-tests.c (revision 272657) +++ gcc/hash-map-tests.c (working copy) @@ -103,6 +103,139 @@ test_map_of_strings_to_int () ASSERT_EQ (1, string_map.elements ()); } +typedef struct hash_map_test_val_t +{ + static int ndefault; + static int ncopy; + static int nassign; + static int ndtor; + + hash_map_test_val_t () + : ptr (&ptr) + { + ++ndefault; + } + + hash_map_test_val_t (const hash_map_test_val_t &) + : ptr (&ptr) + { + ++ncopy; + } + + hash_map_test_val_t& operator= (const hash_map_test_val_t &) + { + ++nassign; + return *this; + } + + ~hash_map_test_val_t () + { + gcc_assert (ptr == &ptr); + ++ndtor; + } + + void *ptr; +} val_t; + +int val_t::ndefault; +int val_t::ncopy; +int val_t::nassign; +int val_t::ndtor; + +static void +test_map_of_type_with_ctor_and_dtor () +{ + typedef hash_map Map; + + { + /* Test default ctor. */ + Map m; + (void)&m; + } + + ASSERT_TRUE (val_t::ndefault == 0); + ASSERT_TRUE (val_t::ncopy == 0); + ASSERT_TRUE (val_t::nassign == 0); + ASSERT_TRUE (val_t::ndtor == 0); + + { + /* Test single insertion. */ + Map m; + void *p = &p; + m.get_or_insert (p); + } + + ASSERT_TRUE (val_t::ndefault + val_t::ncopy == val_t::ndtor); + + { + /* Test copy ctor. */ + Map m1; + void *p = &p; + val_t &rv1 = m1.get_or_insert (p); + + int ncopy = val_t::ncopy; + int nassign = val_t::nassign; + + Map m2 (m1); + val_t *pv2 = m2.get (p); + + ASSERT_TRUE (ncopy + 1 == val_t::ncopy); + ASSERT_TRUE (nassign == val_t::nassign); + + ASSERT_TRUE (&rv1 != pv2); + ASSERT_TRUE (pv2->ptr == &pv2->ptr); + } + + ASSERT_TRUE (val_t::ndefault + val_t::ncopy == val_t::ndtor); + +#if 0 /* Avoid testing until bug 90959 is fixed. */ + { + /* Test copy assignment into an empty map. */ + Map m1; + void *p = &p; + val_t &rv1 = m1.get_or_insert (p); + + int ncopy = val_t::ncopy; + int nassign = val_t::nassign; + + Map m2; + m2 = m1; + val_t *pv2 = m2.get (p); + + ASSERT_TRUE (ncopy == val_t::ncopy); + ASSERT_TRUE (nassign + 1 == val_t::nassign); + + ASSERT_TRUE (&rv1 != pv2); + ASSERT_TRUE (pv2->ptr == &pv2->ptr); + } + + ASSERT_TRUE (val_t::ndefault + val_t::ncopy == val_t::ndtor); + +#endif + + { + Map m; + void *p = &p, *q = &q; + val_t &v1 = m.get_or_insert (p); + val_t &v2 = m.get_or_insert (q); + + ASSERT_TRUE (v1.ptr == &v1.ptr && &v2.ptr == v2.ptr); + } + + ASSERT_TRUE (val_t::ndefault + val_t::ncopy == val_t::ndtor); + + { + Map m; + void *p = &p, *q = &q; + m.get_or_insert (p); + m.remove (p); + m.get_or_insert (q); + m.remove (q); + + ASSERT_TRUE (val_t::ndefault + val_t::ncopy == val_t::ndtor); + } +} + /* Run all of the selftests within this file. */ void @@ -109,6 +242,7 @@ void hash_map_tests_c_tests () { test_map_of_strings_to_int (); + test_map_of_type_with_ctor_and_dtor (); } } // namespace selftest Index: gcc/hash-map.h =================================================================== --- gcc/hash-map.h (revision 272657) +++ gcc/hash-map.h (working copy) @@ -21,8 +21,14 @@ along with GCC; see the file COPYING3. If not see #ifndef hash_map_h #define hash_map_h -template +/* KeyId must be a trivial (POD) type. Value may be non-trivial + (non-POD). Inserted elements are value-initialized either to + zero for POD types or by invoking their default ctor. Removed + elements are destroyed by invoking their dtor. On hash_map + destruction all elements are removed. Objects of hash_map type + are copy-constructible but not assignable. */ + +template class GTY((user)) hash_map { typedef typename Traits::key_type Key; @@ -151,12 +157,16 @@ class GTY((user)) hash_map { hash_entry *e = m_table.find_slot_with_hash (k, Traits::hash (k), INSERT); - bool existed = !hash_entry::is_empty (*e); - if (!existed) - e->m_key = k; + bool ins = hash_entry::is_empty (*e); + if (ins) + { + e->m_key = k; + new ((void *) &e->m_value) Value (v); + } + else + e->m_value = v; - e->m_value = v; - return existed; + return !ins; } /* if the passed in key is in the map return its value otherwise NULL. */ @@ -168,8 +178,8 @@ class GTY((user)) hash_map } /* Return a reference to the value for the passed in key, creating the entry - if it doesn't already exist. If existed is not NULL then it is set to false - if the key was not previously in the map, and true otherwise. */ + if it doesn't already exist. If existed is not NULL then it is set to + false if the key was not previously in the map, and true otherwise. */ Value &get_or_insert (const Key &k, bool *existed = NULL) { @@ -177,7 +187,10 @@ class GTY((user)) hash_map INSERT); bool ins = Traits::is_empty (*e); if (ins) - e->m_key = k; + { + e->m_key = k; + new ((void *)&e->m_value) Value (); + } if (existed != NULL) *existed = !ins; Index: gcc/hash-set-tests.c =================================================================== --- gcc/hash-set-tests.c (revision 272657) +++ gcc/hash-set-tests.c (working copy) @@ -134,6 +134,159 @@ test_set_of_strings () ASSERT_EQ (2, t.elements ()); } +typedef struct hash_set_test_value_t +{ + static int ndefault; + static int ncopy; + static int nassign; + static int ndtor; + + hash_set_test_value_t (int v = 1): pval (&val), val (v) + { + ++ndefault; + } + + hash_set_test_value_t (const hash_set_test_value_t &rhs) + : pval (&val), val (rhs.val) + { + ++ncopy; + } + + hash_set_test_value_t& operator= (const hash_set_test_value_t &rhs) + { + ++nassign; + val = rhs.val; + return *this; + } + + ~hash_set_test_value_t () + { + /* Verify that the value hasn't been corrupted. */ + gcc_assert (*pval > 0); + gcc_assert (pval == &val); + *pval = -3; + ++ndtor; + } + + int *pval; + int val; +} val_t; + +int val_t::ndefault; +int val_t::ncopy; +int val_t::nassign; +int val_t::ndtor; + +struct value_hash_traits: int_hash +{ + typedef int_hash base_type; + typedef val_t value_type; + typedef value_type compare_type; + + static hashval_t hash (const value_type &v) + { + return base_type::hash (v.val); + } + + static bool equal (const value_type &a, const compare_type &b) + { + return base_type::equal (a.val, b.val); + } + + static void mark_deleted (value_type &v) + { + base_type::mark_deleted (v.val); + } + + static void mark_empty (value_type &v) + { + base_type::mark_empty (v.val); + } + + static bool is_deleted (const value_type &v) + { + return base_type::is_deleted (v.val); + } + + static bool is_empty (const value_type &v) + { + return base_type::is_empty (v.val); + } + + static void remove (value_type &v) + { + v.~value_type (); + } +}; + +static void +test_set_of_type_with_ctor_and_dtor () +{ + typedef hash_set Set; + + { + Set s; + (void)&s; + } + + ASSERT_TRUE (val_t::ndefault == 0); + ASSERT_TRUE (val_t::ncopy == 0); + ASSERT_TRUE (val_t::nassign == 0); + ASSERT_TRUE (val_t::ndtor == 0); + + { + Set s; + ASSERT_EQ (false, s.add (val_t ())); + ASSERT_EQ (true, 1 == s.elements ()); + } + + ASSERT_TRUE (val_t::ndefault + val_t::ncopy == val_t::ndtor); + + { + Set s; + ASSERT_EQ (false, s.add (val_t ())); + ASSERT_EQ (true, s.add (val_t ())); + ASSERT_EQ (true, 1 == s.elements ()); + } + + ASSERT_TRUE (val_t::ndefault + val_t::ncopy == val_t::ndtor); + + { + Set s; + val_t v1 (1), v2 (2), v3 (3); + int ndefault = val_t::ndefault; + int nassign = val_t::nassign; + + ASSERT_EQ (false, s.add (v1)); + ASSERT_EQ (true, s.contains (v1)); + ASSERT_EQ (true, 1 == s.elements ()); + + ASSERT_EQ (false, s.add (v2)); + ASSERT_EQ (true, s.contains (v2)); + ASSERT_EQ (true, 2 == s.elements ()); + + ASSERT_EQ (false, s.add (v3)); + ASSERT_EQ (true, s.contains (v3)); + ASSERT_EQ (true, 3 == s.elements ()); + + ASSERT_EQ (true, s.add (v2)); + ASSERT_EQ (true, s.contains (v2)); + ASSERT_EQ (true, 3 == s.elements ()); + + s.remove (v2); + ASSERT_EQ (true, 2 == s.elements ()); + s.remove (v3); + ASSERT_EQ (true, 1 == s.elements ()); + + /* Verify that no default ctors or assignment operators have + been called. */ + ASSERT_EQ (true, ndefault == val_t::ndefault); + ASSERT_EQ (true, nassign == val_t::nassign); + } + + ASSERT_TRUE (val_t::ndefault + val_t::ncopy == val_t::ndtor); +} + /* Run all of the selftests within this file. */ void @@ -140,6 +293,7 @@ void hash_set_tests_c_tests () { test_set_of_strings (); + test_set_of_type_with_ctor_and_dtor (); } } // namespace selftest Index: gcc/hash-set.h =================================================================== --- gcc/hash-set.h (revision 272657) +++ gcc/hash-set.h (working copy) @@ -21,6 +21,13 @@ along with GCC; see the file COPYING3. If not see #ifndef hash_set_h #define hash_set_h +/* KeyId must be a trivial (POD) type. Traits::value_type may be + non-trivial (non-POD). Inserted elements are value-initialized + either to zero for POD types or by invoking their default ctor. + Removed elements are destroyed by invoking their dtor. On + hash_set destruction all elements are removed. Objects of + hash_set type are copy-constructible but not assignable. */ + template > class hash_set @@ -48,7 +55,7 @@ class hash_set Key *e = m_table.find_slot_with_hash (k, Traits::hash (k), INSERT); bool existed = !Traits::is_empty (*e); if (!existed) - *e = k; + new (e) Key (k); return existed; } Index: gcc/hash-table.h =================================================================== --- gcc/hash-table.h (revision 272657) +++ gcc/hash-table.h (working copy) @@ -373,6 +373,11 @@ class hash_table typedef typename Descriptor::value_type value_type; typedef typename Descriptor::compare_type compare_type; +#if __GNUC__ >= 6 && __cplusplus >= 201103L + static_assert (__is_pod (compare_type), + "compare_type must be a trivial (POD) type"); +#endif + public: explicit hash_table (size_t, bool ggc = false, bool sanitize_eq_and_hash = true, @@ -505,6 +510,9 @@ class hash_table } private: + /* FIXME: Make the class assignable. See pr90959. */ + void operator= (hash_table&); + template friend void gt_ggc_mx (hash_table *); template friend void gt_pch_nx (hash_table *); template friend void @@ -657,7 +665,7 @@ hash_table::hash_tabl if (is_deleted (entry)) mark_deleted (nentries[i]); else if (!is_empty (entry)) - nentries[i] = entry; + new ((void*) (nentries + i)) value_type (entry); } m_entries = nentries; } --------------1616A26DBA49C162EF014014--