public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
From: Thomas Schwinge <thomas@codesourcery.com>
To: Jonathan Wakely <jwakely.gcc@gmail.com>, <gcc@gcc.gnu.org>
Subject: Re: 'hash_map<tree, hash_map<tree, tree>>'
Date: Sat, 7 Aug 2021 10:08:03 +0200	[thread overview]
Message-ID: <87lf5dr82k.fsf@euler.schwinge.homeip.net> (raw)
In-Reply-To: <CAH6eHdT5yZ99jCmXAVxgwywoyamdgFrp7fjWrEM7DNj0tPzjRg@mail.gmail.com>

Hi!

On 2021-08-06T19:37:58+0100, Jonathan Wakely <jwakely.gcc@gmail.com> wrote:
> On Fri, 6 Aug 2021, 17:58 Thomas Schwinge, <thomas@codesourcery.com> wrote:
>> So I'm trying to do some C++...  ;-)
>>
>> Given:
>>
>>     /* A map from SSA names or var decls to record fields.  */
>>     typedef hash_map<tree, tree> field_map_t;
>>
>>     /* For each propagation record type, this is a map from SSA names or
>> var decls
>>        to propagate, to the field in the record type that should be used
>> for
>>        transmission and reception.  */
>>     typedef hash_map<tree, field_map_t> record_field_map_t;
>>
>> Thus, that's a 'hash_map<tree, hash_map<tree, tree>>'.  (I may do that,
>> right?)  Looking through GCC implementation files, very most of all uses
>> of 'hash_map' boil down to pointer key ('tree', for example) and
>> pointer/integer value.
>>
>> Then:
>>
>>     record_field_map_t field_map ([...]); // see below
>>     for ([...])
>>       {
>>         tree record_type = [...];
>>         [...]
>>         bool existed;
>>         field_map_t &fields
>>           = field_map.get_or_insert (record_type, &existed);
>>         gcc_checking_assert (!existed);
>>         [...]
>>         for ([...])
>>           fields.put ([...], [...]);
>>         [...]
>>       }
>>     [stuff that looks up elements from 'field_map']
>>     field_map.empty ();
>>
>> This generally works.
>>
>> If I instantiate 'record_field_map_t field_map (40);', Valgrind is happy.
>> If however I instantiate 'record_field_map_t field_map (13);' (where '13'
>> would be the default for 'hash_map'), Valgrind complains:
>>
>>     2,080 bytes in 10 blocks are definitely lost in loss record 828 of 876
>>        at 0x483DD99: calloc (vg_replace_malloc.c:762)
>>        by 0x175F010: xcalloc (xmalloc.c:162)
>>        by 0xAF4A2C: hash_table<hash_map<tree_node*, tree_node*, simple_hashmap_traits<default_hash_traits<tree_node*>, tree_node*> >::hash_entry, false, xcallocator>::hash_table(unsigned long, bool, bool, bool, mem_alloc_origin) (hash-table.h:275)
>>        by 0x15E0120: hash_map<tree_node*, tree_node*, simple_hashmap_traits<default_hash_traits<tree_node*>, tree_node*> >::hash_map(unsigned long, bool, bool, bool) (hash-map.h:143)
>>        by 0x15DEE87: hash_map<tree_node*, hash_map<tree_node*, tree_node*, simple_hashmap_traits<default_hash_traits<tree_node*>, tree_node*> >, simple_hashmap_traits<default_hash_traits<tree_node*>, hash_map<tree_node*, tree_node*, simple_hashmap_traits<default_hash_traits<tree_node*>, tree_node*> > > >::get_or_insert(tree_node* const&, bool*) (hash-map.h:205)
>>        by 0x15DD52C: execute_omp_oacc_neuter_broadcast() (omp-oacc-neuter-broadcast.cc:1371)
>>        [...]
>>
>> (That's with '#pragma GCC optimize "O0"' at the top of the 'gcc/*.cc'
>> file.)
>>
>> My suspicion was that it is due to the 'field_map' getting resized as it
>> incrementally grows (and '40' being big enough for that to never happen),
>> and somehow the non-POD (?) value objects not being properly handled
>> during that.  Working my way a bit through 'gcc/hash-map.*' and
>> 'gcc/hash-table.*' (but not claiming that I understand all that, off
>> hand), it seems as if my theory is right: I'm able to plug this memory
>> leak as follows:
>>
>>     --- gcc/hash-table.h
>>     +++ gcc/hash-table.h
>>     @@ -820,6 +820,8 @@ hash_table<Descriptor, Lazy, Allocator>::expand ()
>>              {
>>                value_type *q = find_empty_slot_for_expand (Descriptor::hash (x));
>>           new ((void*) q) value_type (std::move (x));
>>     +     //BAD Descriptor::remove (x); // (doesn't make sense and) a ton of "Invalid read [...] inside a block of size [...] free'd"
>>     +     x.~value_type (); //GOOD This seems to work!  -- but does it make sense?
>>              }
>>
>>            p++;
>>
>> However, that doesn't exactly look like a correct fix, does it?  I'd
>> expect such a manual destructor call in combination with placement new
>> (that is being used here, obviously) -- but this is after 'std::move'?
>> However, this also survives a smoke-test-like run of parts of the GCC
>> testsuite, bootstrap and complete run now ongoing.

That testing came back without any issues.

> Does GCC's hash_map assume you only use it to store POD (plain old data)
> types

Don't you disappoint me, C++!

> which don't need to be destroyed, because they don't have any
> dynamically allocated memory or other resources?
>
> A hash_map is not a POD, because it does have dynamically allocated memory.

ACK, that's what I tried to say above in my "layman's terms".  ;-)

> If my guess is right, then hash_map should really use a static_assert to
> enforce that requirement, instead of letting you use it in a way that will
> leak.

Eh, yes, at the very least!

Or, of course, make it work?  I mean GCC surely isn't the first software
project to desire implementing a 'hash_map' storing non-POD objects?
Don't you disappoint me, C++!

Alternative to that manual destructor call (per my patch/hack above) --
is maybe something wrong in the 'value_type' constructor implementation
or any other bits related to the 'std::move'?  (Is that where the non-POD
source data ought to be destructed; via "move" instead of "copy"
semantics?)

"Learning C++ by actual need."  ;-D


Grüße
 Thomas
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955

  reply	other threads:[~2021-08-07  8:08 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-06 16:57 Thomas Schwinge
2021-08-06 18:37 ` Jonathan Wakely
2021-08-07  8:08   ` Thomas Schwinge [this message]
2021-08-07  8:54     ` Jonathan Wakely
2021-08-16 12:43       ` Thomas Schwinge
2021-08-09 10:02 ` Richard Biener
2021-08-16 12:44   ` Thomas Schwinge
2021-08-16 13:33     ` Richard Biener
2021-08-12 23:15 ` Martin Sebor
2021-08-16 12:44   ` Thomas Schwinge
2021-08-16 20:10     ` Martin Sebor
2021-08-17  6:40       ` Expensive selftests (was: 'hash_map<tree, hash_map<tree, tree>>') Thomas Schwinge
2021-08-17  8:57         ` Richard Biener
2021-08-18 11:34           ` Add more self-tests for 'hash_map' with Value type with non-trivial constructor/destructor (was: Expensive selftests) Thomas Schwinge
2021-08-18 13:35           ` Expensive selftests (was: 'hash_map<tree, hash_map<tree, tree>>') David Edelsohn
2021-08-18 15:34             ` Make 'gcc/hash-map-tests.c:test_map_of_type_with_ctor_and_dtor_expand' work on 32-bit architectures [PR101959] Thomas Schwinge
2021-08-18 16:12               ` Richard Biener
2021-08-17 15:01         ` Expensive selftests Martin Sebor
2021-08-30 10:46       ` Fix 'hash_table::expand' to destruct stale Value objects (was: 'hash_map<tree, hash_map<tree, tree>>') Thomas Schwinge
2021-09-02  1:31         ` Fix 'hash_table::expand' to destruct stale Value objects Martin Sebor
2021-09-10  8:00           ` [PING] " Thomas Schwinge
2021-09-17 11:17             ` [PING^2] " Thomas Schwinge
2021-09-17 12:08               ` Richard Biener
2021-09-17 12:39                 ` Jonathan Wakely
2021-09-17 13:03                   ` Richard Biener
2021-09-17 15:52                     ` Thomas Schwinge
2021-09-17 19:08                       ` Jonathan Wakely
2021-09-20  9:11                       ` Richard Biener

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87lf5dr82k.fsf@euler.schwinge.homeip.net \
    --to=thomas@codesourcery.com \
    --cc=gcc@gcc.gnu.org \
    --cc=jwakely.gcc@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).