From: Jason Merrill <jason@redhat.com>
To: Patrick Palka <ppalka@redhat.com>
Cc: gcc-patches@gcc.gnu.org, nathan@acm.org
Subject: Re: [PATCH] c++/modules: local class merging [PR99426]
Date: Fri, 12 Apr 2024 14:07:51 -0400 [thread overview]
Message-ID: <1a24b2c4-5d38-41b2-a290-5dea96ffa789@redhat.com> (raw)
In-Reply-To: <fa685680-9e69-c957-edf4-69c3a011e12a@idea>
On 4/12/24 13:48, Patrick Palka wrote:
> On Fri, 12 Apr 2024, Jason Merrill wrote:
>
>> On 4/12/24 10:35, Patrick Palka wrote:
>>> On Wed, 10 Apr 2024, Jason Merrill wrote:
>>>
>>>> On 4/10/24 14:48, Patrick Palka wrote:
>>>>> On Tue, 9 Apr 2024, Jason Merrill wrote:
>>>>>
>>>>>> On 3/5/24 10:31, Patrick Palka wrote:
>>>>>>> On Tue, 27 Feb 2024, Patrick Palka wrote:
>>>>>>>
>>>>>>> Subject: [PATCH] c++/modules: local type merging [PR99426]
>>>>>>>
>>>>>>> One known missing piece in the modules implementation is merging of
>>>>>>> a
>>>>>>> streamed-in local type (class or enum) with the corresponding in-TU
>>>>>>> version of the local type. This missing piece turns out to cause a
>>>>>>> hard-to-reduce use-after-free GC issue due to the entity_ary not
>>>>>>> being
>>>>>>> marked as a GC root (deliberately), and manifests as a serialization
>>>>>>> error on stream-in as in PR99426 (see comment #6 for a reduction).
>>>>>>> It's
>>>>>>> also reproducible on trunk when running the xtreme-header tests
>>>>>>> without
>>>>>>> -fno-module-lazy.
>>>>>>>
>>>>>>> This patch makes us merge such local types according to their
>>>>>>> position
>>>>>>> within the containing function's definition, analogous to how we
>>>>>>> merge
>>>>>>> FIELD_DECLs of a class according to their index in the TYPE_FIELDS
>>>>>>> list.
>>>>>>>
>>>>>>> PR c++/99426
>>>>>>>
>>>>>>> gcc/cp/ChangeLog:
>>>>>>>
>>>>>>> * module.cc (merge_kind::MK_local_type): New enumerator.
>>>>>>> (merge_kind_name): Update.
>>>>>>> (trees_out::chained_decls): Move BLOCK-specific handling
>>>>>>> of DECL_LOCAL_DECL_P decls to ...
>>>>>>> (trees_out::core_vals) <case BLOCK>: ... here. Stream
>>>>>>> BLOCK_VARS manually.
>>>>>>> (trees_in::core_vals) <case BLOCK>: Stream BLOCK_VARS
>>>>>>> manually. Handle deduplicated local types..
>>>>>>> (trees_out::key_local_type): Define.
>>>>>>> (trees_in::key_local_type): Define.
>>>>>>> (trees_out::get_merge_kind) <case FUNCTION_DECL>: Return
>>>>>>> MK_local_type for a local type.
>>>>>>> (trees_out::key_mergeable) <case FUNCTION_DECL>: Use
>>>>>>> key_local_type.
>>>>>>> (trees_in::key_mergeable) <case FUNCTION_DECL>: Likewise.
>>>>>>> (trees_in::is_matching_decl): Be flexible with type mismatches
>>>>>>> for local entities.
>>>>>>>
>>>>>>> diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
>>>>>>> index 80b63a70a62..d9e34e9a4b9 100644
>>>>>>> --- a/gcc/cp/module.cc
>>>>>>> +++ b/gcc/cp/module.cc
>>>>>>> @@ -6714,7 +6720,37 @@ trees_in::core_vals (tree t)
>>>>>>> case BLOCK:
>>>>>>> t->block.locus = state->read_location (*this);
>>>>>>> t->block.end_locus = state->read_location (*this);
>>>>>>> - t->block.vars = chained_decls ();
>>>>>>> +
>>>>>>> + for (tree *chain = &t->block.vars;;)
>>>>>>> + if (tree decl = tree_node ())
>>>>>>> + {
>>>>>>> + /* For a deduplicated local type or enumerator, chain the
>>>>>>> + duplicate decl instead of the canonical in-TU decl.
>>>>>>> Seeing
>>>>>>> + a duplicate here means the containing function whose
>>>>>>> body
>>>>>>> + we're streaming in is a duplicate too, so we'll end up
>>>>>>> + discarding this BLOCK (and the rest of the duplicate
>>>>>>> function
>>>>>>> + body) anyway. */
>>>>>>> + if (is_duplicate (decl))
>>>>>>> + decl = maybe_duplicate (decl);
>>>>>>> + else if (DECL_IMPLICIT_TYPEDEF_P (decl)
>>>>>>> + && TYPE_TEMPLATE_INFO (TREE_TYPE (decl)))
>>>>>>> + {
>>>>>>> + tree tmpl = TYPE_TI_TEMPLATE (TREE_TYPE (decl));
>>>>>>> + if (DECL_TEMPLATE_RESULT (tmpl) == decl &&
>>>>>>> is_duplicate
>>>>>>> (tmpl))
>>>>>>> + decl = DECL_TEMPLATE_RESULT (maybe_duplicate
>>>>>>> (tmpl));
>>>>>>> + }
>>>>>>
>>>>>> This seems like a lot of generally-applicable code for finding the
>>>>>> duplicate,
>>>>>> which other calls to maybe_duplicate/odr_duplicate don't use. If the
>>>>>> template
>>>>>> is a duplicate, why isn't its result? If there's a good reason for
>>>>>> that,
>>>>>> should this template handling go into maybe_duplicate?
>>>>>
>>>>> Ah yeah, that makes sense.
>>>>>
>>>>> Some context: IIUC modules treats the TEMPLATE_DECL instead of the
>>>>> DECL_TEMPLATE_RESULT as the canonical decl, which in turn means we'll
>>>>> register_duplicate only the TEMPLATE_DECL. But BLOCK_VARS never
>>>>> contains
>>>>> a TEMPLATE_DECL, always the DECL_TEMPLATE_RESULT (i.e. a TYPE_DECL),
>>>>> hence the extra handling.
>>>>>
>>>>> Given that it's relatively more difficult to get at the TEMPLATE_DECL
>>>>> from the DECL_TEMPLATE_RESULT rather than vice versa, maybe we should
>>>>> just register both as duplicates from register_duplicate? That way
>>>>> callers can just simply pass the DECL_TEMPLATE_RESULT to maybe_duplicate
>>>>> and it'll do the right thing.
>>>>
>>>> Sounds good.
>>>>
>>>>>>> @@ -10337,6 +10373,83 @@ trees_in::fn_parms_fini (int tag, tree fn,
>>>>>>> tree
>>>>>>> existing, bool is_defn)
>>>>>>> }
>>>>>>> }
>>>>>>> +/* Encode into KEY the position of the local type (class or
>>>>>>> enum)
>>>>>>> + declaration DECL within FN. The position is encoded as the
>>>>>>> + index of the innermost BLOCK (numbered in BFS order) along with
>>>>>>> + the index within its BLOCK_VARS list. */
>>>>>>
>>>>>> Since we already set DECL_DISCRIMINATOR for mangling, could we use
>>>>>> it+name
>>>>>> for
>>>>>> the key as well?
>>>>>
>>>>> We could (and IIUc that'd be more robust to ODR violations), but
>>>>> wouldn't it mean we'd have to do a linear walk over all BLOCK_VARs of
>>>>> all BLOCKS in order to find the one with the matching
>>>>> name+discriminator? That'd be slower than the current approach which
>>>>> lets us skip to the correct BLOCK and walk only its BLOCK_VARS.
>>>>
>>>> Ah, good point. How about block number + name instead of the index?
>>>
>>> It seems DECL_DISCRIMINATOR is only set at instantiation time and so for
>>> local types from a function template pattern the field is empty, which
>>> means we can't use it as the key in general :/
>>
>> I meant just block number and name, without DECL_DISCRIMINATOR. Just using
>> the name instead of an index in BLOCK_VARS.
>
> Ah, I think that'd be enough for named local types, but what about
> anonymous local types? IIUC without DECL_DISCRIMINATOR we wouldn't be
> able to reliably distinguisth between multiple anonymous local types
> defined in the same block, since their identifiers aren't stable given
> that they're based off of a global counter (and so sensitive to #include
> order) :(
Good point. But I'd still think to merge based on name if we have one;
as you said above, to be more robust to ODR violations.
If the imported fn has a local class that the included header didn't,
would we get the same problem?
Jason
next prev parent reply other threads:[~2024-04-12 18:07 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-02-27 2:37 Patrick Palka
2024-02-27 18:10 ` Patrick Palka
2024-03-05 15:31 ` Patrick Palka
2024-03-26 14:24 ` Patrick Palka
2024-04-09 20:27 ` Patrick Palka
2024-04-09 21:57 ` Jason Merrill
2024-04-10 18:48 ` Patrick Palka
2024-04-10 22:55 ` Jason Merrill
2024-04-12 14:35 ` Patrick Palka
2024-04-12 17:18 ` Jason Merrill
2024-04-12 17:48 ` Patrick Palka
2024-04-12 18:07 ` Jason Merrill [this message]
2024-04-12 18:39 ` Patrick Palka
2024-04-12 19:08 ` Jason Merrill
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1a24b2c4-5d38-41b2-a290-5dea96ffa789@redhat.com \
--to=jason@redhat.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=nathan@acm.org \
--cc=ppalka@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).