Hi! On 2021-09-01T19:31:19-0600, Martin Sebor via Gcc-patches wrote: > On 8/30/21 4:46 AM, Thomas Schwinge wrote: >> Ping -- we still need to plug the memory leak; see patch attached, and/or >> long discussion here: > > Thanks for answering my questions. I have no concerns with going > forward with the patch as is. Thanks, Martin. Ping for formal approval (and review for using proper C++ terminology in the 'gcc/hash-table.h:hash_table::expand' source code comment that I'm adding). Patch again attached, for easy reference. > Just a suggestion/request: unless > this patch fixes all the outstanding problems you know of or suspect > in this area (leaks/missing dtor calls) and unless you plan to work > on those in the near future, please open a bug for them with a brain > dump of what you learned. That should save us time when the day > comes to tackle those. ACK. I'm not aware of any additional known problems. (In our email discussion, we did have some "vague ideas" for opportunities of clarification/clean-up, but these aren't worth filing PRs for; needs someone to gain understanding, taking a look.) Grüße Thomas >> On 2021-08-16T14:10:00-0600, Martin Sebor wrote: >>> On 8/16/21 6:44 AM, Thomas Schwinge wrote: >>>> On 2021-08-12T17:15:44-0600, Martin Sebor via Gcc wrote: >>>>> On 8/6/21 10:57 AM, Thomas Schwinge wrote: >>>>>> So I'm trying to do some C++... ;-) >>>>>> >>>>>> Given: >>>>>> >>>>>> /* A map from SSA names or var decls to record fields. */ >>>>>> typedef hash_map field_map_t; >>>>>> >>>>>> /* For each propagation record type, this is a map from SSA names or var decls >>>>>> to propagate, to the field in the record type that should be used for >>>>>> transmission and reception. */ >>>>>> typedef hash_map record_field_map_t; >>>>>> >>>>>> Thus, that's a 'hash_map>'. (I may do that, >>>>>> right?) Looking through GCC implementation files, very most of all uses >>>>>> of 'hash_map' boil down to pointer key ('tree', for example) and >>>>>> pointer/integer value. >>>>> >>>>> Right. Because most GCC containers rely exclusively on GCC's own >>>>> uses for testing, if your use case is novel in some way, chances >>>>> are it might not work as intended in all circumstances. >>>>> >>>>> I've wrestled with hash_map a number of times. A use case that's >>>>> close to yours (i.e., a non-trivial value type) is in cp/parser.c: >>>>> see class_to_loc_map_t. >>>> >>>> Indeed, at the time you sent this email, I already had started looking >>>> into that one! (The Fortran test cases that I originally analyzed, which >>>> triggered other cases of non-POD/non-trivial destructor, all didn't >>>> result in a memory leak, because the non-trivial constructor doesn't >>>> actually allocate any resources dynamically -- that's indeed different in >>>> this case here.) ..., and indeed: >>>> >>>>> (I don't remember if I tested it for leaks >>>>> though. It's used to implement -Wmismatched-tags so compiling >>>>> a few tests under Valgrind should show if it does leak.) >>>> >>>> ... it does leak memory at present. :-| (See attached commit log for >>>> details for one example.) >> >> (Attached "Fix 'hash_table::expand' to destruct stale Value objects" >> again.) >> >>>> To that effect, to document the current behavior, I propose to >>>> "Add more self-tests for 'hash_map' with Value type with non-trivial >>>> constructor/destructor" >> >> (We've done that in commit e4f16e9f357a38ec702fb69a0ffab9d292a6af9b >> "Add more self-tests for 'hash_map' with Value type with non-trivial >> constructor/destructor", quickly followed by bug fix >> commit bb04a03c6f9bacc890118b9e12b657503093c2f8 >> "Make 'gcc/hash-map-tests.c:test_map_of_type_with_ctor_and_dtor_expand' >> work on 32-bit architectures [PR101959]". >> >>>> (Also cherry-pick into release branches, eventually?) >> >>>>>> Then: >>>>>> >>>>>> record_field_map_t field_map ([...]); // see below >>>>>> for ([...]) >>>>>> { >>>>>> tree record_type = [...]; >>>>>> [...] >>>>>> bool existed; >>>>>> field_map_t &fields >>>>>> = field_map.get_or_insert (record_type, &existed); >>>>>> gcc_checking_assert (!existed); >>>>>> [...] >>>>>> for ([...]) >>>>>> fields.put ([...], [...]); >>>>>> [...] >>>>>> } >>>>>> [stuff that looks up elements from 'field_map'] >>>>>> field_map.empty (); >>>>>> >>>>>> This generally works. >>>>>> >>>>>> If I instantiate 'record_field_map_t field_map (40);', Valgrind is happy. >>>>>> If however I instantiate 'record_field_map_t field_map (13);' (where '13' >>>>>> would be the default for 'hash_map'), Valgrind complains: >>>>>> >>>>>> 2,080 bytes in 10 blocks are definitely lost in loss record 828 of 876 >>>>>> at 0x483DD99: calloc (vg_replace_malloc.c:762) >>>>>> by 0x175F010: xcalloc (xmalloc.c:162) >>>>>> by 0xAF4A2C: hash_table, tree_node*> >::hash_entry, false, xcallocator>::hash_table(unsigned long, bool, bool, bool, mem_alloc_origin) (hash-table.h:275) >>>>>> by 0x15E0120: hash_map, tree_node*> >::hash_map(unsigned long, bool, bool, bool) (hash-map.h:143) >>>>>> by 0x15DEE87: hash_map, tree_node*> >, simple_hashmap_traits, hash_map, tree_node*> > > >::get_or_insert(tree_node* const&, bool*) (hash-map.h:205) >>>>>> by 0x15DD52C: execute_omp_oacc_neuter_broadcast() (omp-oacc-neuter-broadcast.cc:1371) >>>>>> [...] >>>>>> >>>>>> (That's with '#pragma GCC optimize "O0"' at the top of the 'gcc/*.cc' >>>>>> file.) >>>>>> >>>>>> My suspicion was that it is due to the 'field_map' getting resized as it >>>>>> incrementally grows (and '40' being big enough for that to never happen), >>>>>> and somehow the non-POD (?) value objects not being properly handled >>>>>> during that. Working my way a bit through 'gcc/hash-map.*' and >>>>>> 'gcc/hash-table.*' (but not claiming that I understand all that, off >>>>>> hand), it seems as if my theory is right: I'm able to plug this memory >>>>>> leak as follows: >>>>>> >>>>>> --- gcc/hash-table.h >>>>>> +++ gcc/hash-table.h >>>>>> @@ -820,6 +820,8 @@ hash_table::expand () >>>>>> { >>>>>> value_type *q = find_empty_slot_for_expand (Descriptor::hash (x)); >>>>>> new ((void*) q) value_type (std::move (x)); >>>>>> + //BAD Descriptor::remove (x); // (doesn't make sense and) a ton of "Invalid read [...] inside a block of size [...] free'd" >>>>>> + x.~value_type (); //GOOD This seems to work! -- but does it make sense? >>>>>> } >>>>>> >>>>>> p++; >>>>>> >>>>>> However, that doesn't exactly look like a correct fix, does it? I'd >>>>>> expect such a manual destructor call in combination with placement new >>>>>> (that is being used here, obviously) -- but this is after 'std::move'? >>>>>> However, this also survives a smoke-test-like run of parts of the GCC >>>>>> testsuite, bootstrap and complete run now ongoing. >>>>> >>>>> If explicitly calling the dtor on the moved object is the right thing >>>>> to do I'd expect to see such invocations elsewhere in hash_table but >>>>> I don't. It does seem like removed elements ought to be destroyed, >>>>> but it also seems like the destruction should go through some traits >>>>> class (e.g., call Descriptor::remove and mark_deleted or do some >>>>> similar dance), and be called from other member functions that move >>>>> elements. >>>> >>>> So, we shall assume that any "real C++" use case (that is, C++ class) is >>>> using the appropriate C++ method, that is, 'typed_delete_remove', which >>>> does 'delete', which does destroy the C++ object, if applicable, else >>>> 'typed_noop_remove'. >>>> >>>> (Shall we look into the few places that use 'typed_free_remove' via >>>> 'free_ptr_hash', and potentially replace them with 'typed_delete_remove'? >>>> Or is there a reason for the two schemes to co-exist, other than for >>>> legacy reasons? Anyway, that's for another day.) >>> >>> I find all these these traits pretty much impenetrable. >> >> (Indeed. ... which triggered this reflex with me, to try simplifying >> this by getting rid of 'typed_free_remove' etc...) >> >>> I guess >>> intuitively, I'd expect Descriptor::remove() to destroy an element, >>> but I have no idea if that would be right given the design. >> >> So 'typed_delete_remove' does destruct via 'delete'. 'typed_free_remove' >> doesn't -- but is only used via 'free_ptr_hash', where this isn't >> relevant? 'typed_noop_remove' I haven't considered yet regarding that >> issue. Anyway, that's all for another day. >> >>>> What is different in the 'hash_table::expand' case is that all the Value >>>> objects do get 'std::move'd into a new blob of memory via placement new >>>> (so a subsequent 'delete' via 'typed_delete_remove' is not appropriate), >>>> but then the stale Value objects never get destructed. And indeed an >>>> explicit destructor call (which, as I understand is a no-op for primitive >>>> types; "pseudo destructor") is the right thing to do; see >>>> >>>> and others, for example. (Therefore, I don't think this needs to be >>>> routed through a "traits" function, but can rather be done in-line here, >>>> after each placement new, before deallocation of the original blob of >>>> memory. Also, I argue it's the right thing to do also for 'm_ggc', >>>> because even if in that case we're not going to leak memory (GC will >>>> reclaim), but we still may leak other resources dynamically allocated in >>>> a non-trivial constructor.) >>> >>> Yes, of course, all elements need to be eventually be destroyed. >>> My only hesitation was whether it should be done via one of these >>> traits classes (like it's done in C++ containers via allocators) >>> rather than directly >> >> Given that there is (apparently) no issue to do a placement new in >> 'hash_table::expand', I'd asumme a -- corresponding -- explicit >> destructor call would be likewise appropriate? (But I'll of course route >> this through a (new) "traits" function if someone explains why this is >> the right thing to do.) >> >>> and whether there might be other places >>> where the destruction night need to happen. >> >> (Possibly, yes, per discussion above -- but that's a separate issue?) >> >>>> The attached "Fix 'hash_table::expand' to destruct stale Value objects" >>>> does prevent my original problem, does address the current 'class2loc' >>>> memory leak in 'cp/parser.c' (see commit log for one example), and >>>> adjusts the new >>>> 'gcc/hash-map-tests.c:test_map_of_type_with_ctor_and_dtor_expand' test >>>> case such that number of constructor calls matches the number of >>>> destructor calls. After some careful review regarding C++ details >>>> (please!), OK to push to master branch? (Also cherry-pick into release >>>> branches, eventually?) Is the source code comment that I'm adding >>>> sufficiently explanatory and correct in C++ terms? >> >> Ping. >> >>> Shouldn't the hash_table dtor (and perhaps also empty()) also >>> destroy the elements? (Otherwise, what destroys the elements >>> newly constructed here, or anywhere else for that matter, such >>> as in the hash_table ctor?) >> >> Per my understanding, per discussion above, they (eventually) do get >> destroyed via 'delete' in 'typed_delete_remove', for example, via >> 'Descriptor::remove' calls for all/relevant entries in >> 'hash_table::~hash_table', 'hash_table::empty_slow', >> 'hash_table::clear_slot', 'hash_table::remove_elt_with_hash'. >> >> (This means that if there has been an intermediate 'expand', this may >> (eventually) destroy objects at a different memory location from where >> they originally have been created -- but that's fine.) >> >> The 'expand' case itself is different: any (live) entries are *moved* >> into a new storage memory object via placement new. (And then the >> hollow entries in the old storage memory object linger.) >> >>> Also, shouldn't the destroyed slot be marked either deleted or >>> empty?) >> >> Per my understanding, irrelevant in 'expand': working through 'oentries', >> the *move* is only done 'if (!is_empty (x) && !is_deleted (x))' (so >> combined with the item above, there is no memory leak for any entries >> that have already been 'remove'd -- they have already been destructed), >> and the whole (old) storage memory object will be deallocated right after >> the 'oentries' loop. >> >> >>> (Sorry, I realize I have more questions than answers.) >> >> No worries, happens to me most of the times! Thanks for looking into >> this. >> >> >> Grüße >> Thomas ----------------- Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955