Hi, Martin Jambor writes: > Hi, > > On Sat, Mar 18 2023, Arsen Arsenović wrote: >> Martin Jambor writes: > > [...] > >>>> >>>> For the test case in the PR, in ipa.cc:remove_unreachable_nodes, GCC >>>> seems to try to remove an unreachable function that was already inlined >>>> into a different unreachable function. >>> >>> No, it fails to remove it. It is still there but should not have been, >>> that is the problem. >> >> Ah - I see. >> >>>> >>>> This decision later trips up the gcc_assert in: >>>> >>>> /* Inline clones might be kept around so their materializing allows further >>>> cloning. If the function the clone is inlined into is removed, we need >>>> to turn it into normal cone. */ >>>> FOR_EACH_FUNCTION (node) >>>> { >>>> if (node->inlined_to >>>> && !node->callers) >>>> { >>>> gcc_assert (node->clones); >>>> node->inlined_to = NULL; >>>> update_inlined_to_pointer (node, node); >>>> } >>>> node->aux = NULL; >>>> } >>>> >>>> .. because it is expecting that an inlined function without callers >>>> (which is necessarily true here as this function is unreachable and so >>>> was ->remove ()'d earlier) has clones. >>> >>> The assert makes sure that if we encounter an inlined-to node without >>> any caller, that it merely holds as the holder of the function body for >>> its other specialized (think IPA-CP) or inline clones. If node->clones >>> is false, there are no such clones and it was a bug to mark the node as >>> required during the removal of unreachable nodes. >> >> I see. That makes sense. So, this assert is placed here by convenience >> rather than being this invariant being absolutely required for the >> purpose of the loop? (it is related, so this placement makes sense, I >> just want to confirm whether it's a "mandatory" invariant) > > If the assert fails, the algorithm does not work as intended. OTOH, It > could be a gcc_checking_assert only since user compiled code would still > work, just would be unnecessarily bigger. Yes, that is what I was trying to ask ;-) Apologies for failing to articulate so. >> >>> It is correct. An inlined function without a caller is even a logical >>> oxymoron and can only exist if it has the purpose described above (and >>> even then probably only in a fairly special circumstances). >> >> Right. I wasn't quite sure whether setting inlined_to would remove that >> caller, but if I understood right, it should not. >> >> What is interesting, though, is that there is an attempt to remove this >> node during ipa_inline: > > IPA-inline calls remove_unreachable_nodes to get rid of call graph nodes > which are known not to be necessary after inlining (inlining can lead to > redirection of some call graph edges to __builtin_unreachable) and > unreachable removal... well.. removes nodes. > >> >> (gdb) bt >> #0 cgraph_edge::remove_callee ( >> this=> "__ct_base "/18> -> )>) >> at ../../gcc/gcc/cgraph.h:3299 >> #1 0x0000000000d03c37 in cgraph_node::remove_callees (this=> * const 0x7ffff6dedaa0 "__ct_base "/18>) at >> ../../gcc/gcc/cgraph.cc:1743 >> #2 0x0000000000d04387 in cgraph_node::remove (this=> 0x7ffff6dedaa0 "__ct_base "/18>) at ../../gcc/gcc/cgraph.cc:1884 >> #3 0x00000000010da74f in symbol_table::remove_unreachable_nodes >> (this=0x7ffff6ddb000, file=0x7ffff7a814c0 <_IO_2_1_stderr_>) at >> ../../gcc/gcc/ipa.cc:518 >> #4 0x0000000002b51e53 in ipa_inline () at >> ../../gcc/gcc/ipa-inline.cc:2761 >> #5 0x0000000002b52cf7 in (anonymous >> namespace)::pass_ipa_inline::execute (this=0x3c8d5b0) at >> ../../gcc/gcc/ipa-inline.cc:3153 >> (etc) >> >> ... I presume that my assumption that cgraph_node::remove () should >> remove nodes from the cgraph_node::next chain is wrong? > > Ummm.... the function does that through the call to > symtab_node::unregister. But how is that related? Oh - my bad. I seem to have broken on the wrong condition here. "value"/28 is *not* removed. That makes more sense, it'd be confusing if remove() still lead to FOR_EACH_FUNCTION touching a node. >> >>>> >>>> And as a side question, do all clone nodes have a ->clones pointing to >>>> the same list of all clones, or are they in a tree-ish arrangement, >>>> where clones of clones end up forming a tree, with the clone_of pointer >>>> being a pointer to the parent? >>> >>> The latter, they form a tree. >> >> I see, that makes sense. Thanks. >> >>>> (in this instance, the node that trips >>>> the assert has a nullptr clone_of and clones value, which would AIUI >>>> imply that it is the original) >>> >>> Yes. >>> >>>> This train of thinking doesn't end up involving any devirtualization >>>> code, which the PR was originally reproduced with, but my current theory >>>> is that devirtualizing here just exposed an edge case that is decently >>>> well hidden, rather than it playing a crucial role. >>> >>> The inlined function is - I believe erroneously - marked as reachable by >>> walk_polymorphic_call_targets() within the unreachable analysis - so >>> devirtualizing is somewhat crucial. >>> >>> I believe the real question - to which I don't have an answer yet - is >>> why does possible_polymorphic_call_targets return a method that is not >>> virtual? >>> >>> (gdb) p n->decl->decl_common.virtual_flag >>> $19 = 0 >>> >>> Or even >>> >>> (gdb) p referenced_from_vtable_p(n) >>> $24 = false >>> >>> Time to dig into ipa-devirt.cc, I guess.... >>> >>> Martin >> >> I saw your response on the PR itself, thanks for handling that in my >> absence. With the information you've shared I believe I understand that >> the fix you provided: an inlined call can no longer be a polymorphic >> call target, and so reaching an inlined function inside >> walk_polymorphic_call_targets should not be possible? > > The thing is that ICF discovers that a virtual function, which was > originally a possible target, is semantically identically to a > non-virtual function and merges them, making the virtual function an > alias of the non-virtual one. Devirtualization can then suddenly spit > out non-virtual potential targets which is kind of unexpected. > > It seems to me that the most correct fix is to add to > walk_polymorphic_call_targets a check that the obtained possible target > is still referenced_from_vtable_p() - because the alias that was > originally a virtual function is referenced from a vtable that at this > point is also known to be gone. But the check looks like it is possibly > expensive, so I wanted to discuss this with Honza first (hopefully next > week). I see. That does make sense. I'm not sure I understand Honzas suggestion, though. can_refer_decl_in_current_unit_p is true for everything involved here, which, AIUI, makes sense since everything is in one partition (to make sure, I set -flto-partition=one for a test run too). I made can_refer_decl_in_current_unit_p non-static and checked the value for both the target_node and the alias_target in the block he mentioned, and the result was 1 in all cases. Thinking about this, I was curious if preventing the alias from being followed if it would become nonvirtual would help, and it stopped the ICE in the PR, but as with the first attempted fix of mine, I'm not certain of its correctness: diff --git a/gcc/ipa-devirt.cc b/gcc/ipa-devirt.cc index 14cf132c767..36b9b87fd10 100644 --- a/gcc/ipa-devirt.cc +++ b/gcc/ipa-devirt.cc @@ -2420,7 +2420,9 @@ maybe_record_node (vec &nodes, alias_target = target_node->ultimate_alias_target (&avail); if (target_node != alias_target && avail >= AVAIL_AVAILABLE - && target_node->get_availability ()) + && target_node->get_availability () + && (!DECL_VIRTUAL_P (target_node->decl) + || DECL_VIRTUAL_P (alias_target->decl))) target_node = alias_target; } (also untested with the actual testsuite, it's quite late..) Thanks again, have a lovely night. >> >> I had already figured that an error could've likely been in >> reach-ability analysis, but my time ran low, and I had not confirmed >> anything, or as little as formalized a theory, so I just wrote the >> original email instead of following this trail of thought fully. >> >> Thank you for your guidance! Have a lovely night :) > > It is good thing that you asked, I also learned something new (that > virtual and non-virtual functions can be ICFed together). > > Martin -- Arsen Arsenović