public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
From: Martin Sebor <msebor@gmail.com>
To: Jonathan Wakely <jwakely@redhat.com>,
	Richard Biener <richard.guenther@gmail.com>
Cc: gcc mailing list <gcc@gcc.gnu.org>,
	gcc-patches <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH] let hash-based containers work with non-trivial types (PR 90923)
Date: Mon, 01 Jul 2019 14:55:00 -0000	[thread overview]
Message-ID: <2ef16731-4b0b-5277-179f-32b4620a612e@gmail.com> (raw)
In-Reply-To: <33e05a72-ba9e-ff20-bff5-6e8ccfdbc5b2@gmail.com>

[Adding gcc-patches]

Richard, do you have any further comments or is the revised patch
good to commit?

Martin

On 6/25/19 2:30 PM, Martin Sebor wrote:
> On 6/25/19 3:53 AM, Jonathan Wakely wrote:
>> On 24/06/19 19:42 +0200, Richard Biener wrote:
>>> On Mon, Jun 24, 2019 at 4:35 PM Martin Sebor <msebor@gmail.com> wrote:
>>>>
>>>> On 6/24/19 6:11 AM, Richard Biener wrote:
>>>> > On Fri, Jun 21, 2019 at 7:17 PM Martin Sebor <msebor@gmail.com> 
>>>> wrote:
>>>> >>
>>>> >> On 6/21/19 6:06 AM, Richard Biener wrote:
>>>> >>> On Wed, Jun 19, 2019 at 5:15 AM Martin Sebor <msebor@gmail.com> 
>>>> wrote:
>>>> >>>>
>>>> >>>> Bug 90923 shows that even though GCC hash-table based containers
>>>> >>>> like hash_map can be instantiated on types with user-defined ctors
>>>> >>>> and dtors they invoke the dtors of such types without invoking
>>>> >>>> the corresponding ctors.
>>>> >>>>
>>>> >>>> It was thanks to this bug that I spent a day debugging 
>>>> "interesting"
>>>> >>>> miscompilations during GCC bootstrap (in fairness, it was that and
>>>> >>>> bug 90904 about auto_vec copy assignment/construction also being
>>>> >>>> hosed even for POD types).
>>>> >>>>
>>>> >>>> The attached patch corrects the hash_map and hash_set templates
>>>> >>>> to invoke the ctors of the elements they insert and makes them
>>>> >>>> (hopefully) safe to use with non-trivial user-defined types.
>>>> >>>
>>>> >>> Hum.  I am worried about the difference of assignment vs. 
>>>> construction
>>>> >>> in ::put()
>>>> >>>
>>>> >>> +      bool ins = hash_entry::is_empty (*e);
>>>> >>> +      if (ins)
>>>> >>> +       {
>>>> >>> +         e->m_key = k;
>>>> >>> +         new ((void *) &e->m_value) Value (v);
>>>> >>> +       }
>>>> >>> +      else
>>>> >>> +       e->m_value = v;
>>>> >>>
>>>> >>> why not invoke the dtor on the old value and then the ctor again?
>>>> >>
>>>> >> It wouldn't work for self-assignment:
>>>> >>
>>>> >>     Value &y = m.get_or_insert (key);
>>>> >>     m.put (key, y);
>>>> >>
>>>> >> The usual rule of thumb for writes into containers is to use
>>>> >> construction when creating a new element and assignment when
>>>> >> replacing the value of an existing element.
>>>> >>
>>>> >> Which reminds me that the hash containers, despite being copy-
>>>> >> constructible (at least for POD types, they aren't for user-
>>>> >> defined types), also aren't safe for assignment even for PODs.
>>>> >> I opened bug 90959 for this.  Until the assignment is fixed
>>>> >> I made it inaccessibe in the patch (I have fixed the copy
>>>> >> ctor to DTRT for non-PODs).
>>>> >>
>>>> >>> How is an empty hash_entry constructed?
>>>> >>
>>>> >> In hash_table::find_slot_with_hash simply by finding an empty
>>>> >> slot and returning a pointer to it.  The memory for the slot
>>>> >> is marked "empty" by calling the Traits::mark_empty() function.
>>>> >>
>>>> >> The full type of hash_map<void*, Value> is actually
>>>> >>
>>>> >>     hash_map<void*,
>>>> >>              Value,
>>>> >>              simple_hashmap_traits<default_hash_traits<void*>,
>>>> >>                                    Value>
>>>> >>
>>>> >> and simple_hashmap_traits delegates it to default_hash_traits
>>>> >> whose mark_empty() just clears the void*, leaving the Value
>>>> >> part uninitialized.  That makes sense because we don't want
>>>> >> to call ctors for empty entries.  I think the questions one
>>>> >> might ask if one were to extend the design are: a) what class
>>>> >> should invoke the ctor/assignment and b) should it do it
>>>> >> directly or via the traits?
>>>> >>
>>>> >>> ::remove() doesn't
>>>> >>> seem to invoke the dtor either, instead it relies on the
>>>> >>> traits::remove function?
>>>> >>
>>>> >> Yes.  There is no Traits::construct or assign or copy.  We
>>>> >> could add them but I'm not sure I see to what end (there could
>>>> >> be use cases, I just don't know enough about these classes to
>>>> >> think of any).
>>>> >>
>>>> >> Attached is an updated patch with the additional minor fixes
>>>> >> mentioned above.
>>>> >>
>>>> >> Martin
>>>> >>
>>>> >> PS I think we would get much better results by adopting
>>>> >> the properly designed and tested standard library containers
>>>> >> than by spending time trying to improve the design of these
>>>> >> legacy classes.  For simple uses that don't need to integrate
>>>> >> with the GC machinery the standard containers should be fine
>>>> >> (plus, it'd provide us with greater motivation to improve
>>>> >> them and the code GCC emits for their uses).  Unfortunately,
>>>> >> to be able to use the hash-based containers we would need to
>>>> >> upgrade to C++ 11.  Isn't it time yet?
>>>> >
>>>> > I don't think so.  I'm also not sure if C++11 on its own is desirable
>>>> > or if it should be C++14 or later at that point.  SLES 12 has GCC 4.8
>>>> > as host compiler (but also GCC 7+ optionally), SLES 15 has GCC 7.
>>>> > SLES 11 already struggles currently (GCC 4.3) but I'd no longer
>>>> > consider that important enough.
>>>> >
>>>> > Note any such change to libstdc++ containers should be complete
>>>> > and come with both code-size and compile-time and memory-usage
>>>> > measurements (both of GCC and other apps of course).
>>>>
>>>> Can I go ahead and commit the patch?
>>>
>>> I think we need to document the requirements on Value classes better.
>>>
>>> @@ -177,7 +185,10 @@ public:
>>>                                                   INSERT);
>>>       bool ins = Traits::is_empty (*e);
>>>       if (ins)
>>> -       e->m_key = k;
>>> +       {
>>> +         e->m_key = k;
>>> +         new ((void *)&e->m_value) Value ();
>>> +       }
>>>
>>> this now requires a default constructor and I always forget about
>>> differences between the different form of initializations -- for a POD,
>>> does this zero the entry?
>>>
>>> Otherwise looks OK to me - I was hoping Jonathan would chime in here.
>>
>> The patch looks good to me. I 100% agree with Martin that put() should
>> not destroy an existing element and recreate a new one. Assignment is
>> the right way to update the value.
>>
>> And Value() is the correct initialization.
>>
>> The only change I'd suggest is something to enforce the "KeyId must be
>> a trivial (POD) type" requirement:
>>
>> #if __GNUC__ >= 6 && __cplusplus >= 201103L
>> static_assert(__is_pod(KeyId), "KeyId must be a trivial (POD) type");
>> #endif
>>
>> This could actually be added for 4.7 and up (__is_pod is available
>> earlier, but __cplusplus isn't set correctly before that) but GCC 6 is
>> when we started to default to C++14. If the check only happens for
>> people bootstrapping with new versions of GCC it's still going to
>> catch misuses with non-POD types.
> 
> I've updated the comments explaining the constraints in more detail
> and added the static assert to hash_table where it covers both
> has_map and hash_set.  FWIW, I don't imagine anyone instantiating
> these containers on a non-trivial key types in GCC but having
> the assert there doesn't hurt anything.
> 
> Martin
> 

  reply	other threads:[~2019-07-01 14:55 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-06-19  3:15 Martin Sebor
2019-06-21 12:06 ` Richard Biener
2019-06-21 17:17   ` Martin Sebor
2019-06-24 12:11     ` Richard Biener
2019-06-24 14:35       ` Martin Sebor
2019-06-24 17:42         ` Richard Biener
2019-06-25  2:58           ` Martin Sebor
2019-06-25  9:53           ` Jonathan Wakely
2019-06-25 20:30             ` Martin Sebor
2019-07-01 14:55               ` Martin Sebor [this message]
2019-07-01 16:34                 ` Richard Biener
2019-07-01 18:34                   ` Martin Sebor

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2ef16731-4b0b-5277-179f-32b4620a612e@gmail.com \
    --to=msebor@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=gcc@gcc.gnu.org \
    --cc=jwakely@redhat.com \
    --cc=richard.guenther@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).