* [PATCH] c++: Retry the aliasing of base/complete cdtor optimization at import_export_decl time [PR113208] @ 2024-04-17 7:42 Jakub Jelinek 2024-04-17 9:04 ` Jan Hubicka 2024-04-22 15:42 ` [PATCH] c++, v2: " Jakub Jelinek 0 siblings, 2 replies; 25+ messages in thread From: Jakub Jelinek @ 2024-04-17 7:42 UTC (permalink / raw) To: Jason Merrill, Jan Hubicka, Richard Biener, Patrick Palka; +Cc: gcc-patches Hi! When expand_or_defer_fn is called at_eof time, it calls import_export_decl and then maybe_clone_body, which uses DECL_ONE_ONLY and comdat name in a couple of places to try to optimize cdtors which are known to have the same body by making the complete cdtor an alias to base cdtor (and in that case also uses *[CD]5* as comdat group name instead of the normal comdat group names specific to each mangled name). Now, this optimization depends on DECL_ONE_ONLY and DECL_INTERFACE_KNOWN, maybe_clone_body and can_alias_cdtor use: if (DECL_ONE_ONLY (fn)) cgraph_node::get_create (clone)->set_comdat_group (cxx_comdat_group (clone)); ... bool can_alias = can_alias_cdtor (fn); ... /* Tell cgraph if both ctors or both dtors are known to have the same body. */ if (can_alias && fns[0] && idx == 1 && cgraph_node::get_create (fns[0])->create_same_body_alias (clone, fns[0])) { alias = true; if (DECL_ONE_ONLY (fns[0])) { /* For comdat base and complete cdtors put them into the same, *[CD]5* comdat group instead of *[CD][12]*. */ comdat_group = cdtor_comdat_group (fns[1], fns[0]); cgraph_node::get_create (fns[0])->set_comdat_group (comdat_group); if (symtab_node::get (clone)->same_comdat_group) symtab_node::get (clone)->remove_from_same_comdat_group (); symtab_node::get (clone)->add_to_same_comdat_group (symtab_node::get (fns[0])); } } and /* Don't use aliases for weak/linkonce definitions unless we can put both symbols in the same COMDAT group. */ return (DECL_INTERFACE_KNOWN (fn) && (SUPPORTS_ONE_ONLY || !DECL_WEAK (fn)) && (!DECL_ONE_ONLY (fn) || (HAVE_COMDAT_GROUP && DECL_WEAK (fn)))); The following testcase regressed with Marek's r14-5979 change, when pr113208_0.C is compiled where the ctor is marked constexpr, we no longer perform this optimization, where _ZN6vectorI12QualityValueEC2ERKS1_ was emitted in the _ZN6vectorI12QualityValueEC5ERKS1_ comdat group and _ZN6vectorI12QualityValueEC1ERKS1_ was made an alias to it, instead we emit _ZN6vectorI12QualityValueEC2ERKS1_ in _ZN6vectorI12QualityValueEC2ERKS1_ comdat group and the same content _ZN6vectorI12QualityValueEC1ERKS1_ as separate symbol in _ZN6vectorI12QualityValueEC1ERKS1_ comdat group. Now, the linker seems to somehow cope with that, eventhough it probably keeps both copies of the ctor, but seems LTO can't cope with that and Honza doesn't know what it should do in that case (linker decides that the prevailing symbol is _ZN6vectorI12QualityValueEC2ERKS1_ (from the _ZN6vectorI12QualityValueEC2ERKS1_ comdat group) and _ZN6vectorI12QualityValueEC1ERKS1_ alias (from the other TU, from _ZN6vectorI12QualityValueEC5ERKS1_ comdat group)). Note, the case where some constructor is marked constexpr in one TU and not in another one happens pretty often in libstdc++ when one mixes -std= flags used to compile different compilation units. The reason the optimization doesn't trigger when the constructor is constexpr is that expand_or_defer_fn is called in that case much earlier than when it is not constexpr; in the former case it is called when we try to constant evaluate that constructor. But DECL_INTERFACE_KNOWN is false in that case and comdat_linkage hasn't been called either (I think it is desirable, because comdat group is stored in the cgraph node and am not sure it is a good idea to create cgraph nodes for something that might not be needed later on at all), so maybe_clone_body clones the bodies, but doesn't make them as aliases. The following patch is an attempt to redo this optimization when import_export_decl is called at_eof time on the base/complete cdtor (or deleting dtor). It will not do anything if maybe_clone_body hasn't been called uyet (the TREE_ASM_WRITTEN check on the DECL_MAYBE_IN_CHARGE_CDTOR_P), or when one or both of the base/complete cdtors have been lowered already, or when maybe_clone_body called maybe_thunk_body and it was successful. Otherwise retries the can_alias_cdtor check and makes the complete cdtor alias to the base cdtor with adjustments to the comdat group. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2024-04-17 Jakub Jelinek <jakub@redhat.com> PR lto/113208 * cp-tree.h (maybe_optimize_cdtor): Declare. * decl2.cc (import_export_decl): Call it for cloned cdtors. * optimize.cc (maybe_optimize_cdtor): New function. * g++.dg/lto/pr113208_0.C: New test. * g++.dg/lto/pr113208_1.C: New file. * g++.dg/lto/pr113208.h: New file. --- gcc/cp/cp-tree.h.jj 2024-04-16 17:18:37.286279533 +0200 +++ gcc/cp/cp-tree.h 2024-04-16 17:45:01.685635709 +0200 @@ -7447,6 +7447,7 @@ extern bool handle_module_option (unsign /* In optimize.cc */ extern tree clone_attrs (tree); extern bool maybe_clone_body (tree); +extern void maybe_optimize_cdtor (tree); /* In parser.cc */ extern tree cp_convert_range_for (tree, tree, tree, cp_decomp *, bool, --- gcc/cp/decl2.cc.jj 2024-04-16 17:18:37.287279519 +0200 +++ gcc/cp/decl2.cc 2024-04-16 17:45:01.686635695 +0200 @@ -3568,6 +3568,9 @@ import_export_decl (tree decl) } DECL_INTERFACE_KNOWN (decl) = 1; + + if (DECL_CLONED_FUNCTION_P (decl)) + maybe_optimize_cdtor (decl); } /* Return an expression that performs the destruction of DECL, which --- gcc/cp/optimize.cc.jj 2024-04-16 17:18:37.374278327 +0200 +++ gcc/cp/optimize.cc 2024-04-16 21:35:53.597188774 +0200 @@ -723,3 +723,58 @@ maybe_clone_body (tree fn) /* We don't need to process the original function any further. */ return 1; } + +/* If maybe_clone_body is called while the cdtor is still tentative, + DECL_ONE_ONLY will be false and so will be can_alias_cdtor (fn). + In that case we wouldn't try to optimize using an alias and instead + would emit separate base and complete cdtor. The following function + attempts to still optimize that case when we import_export_decl + is called first time on one of the clones. */ + +void +maybe_optimize_cdtor (tree orig_decl) +{ + tree fns[3]; + tree fn = DECL_CLONED_FUNCTION (orig_decl); + gcc_checking_assert (DECL_MAYBE_IN_CHARGE_CDTOR_P (fn)); + if (DECL_INTERFACE_KNOWN (fn) + || !TREE_ASM_WRITTEN (fn) + || !DECL_ONE_ONLY (orig_decl) + || symtab->global_info_ready) + return; + + populate_clone_array (fn, fns); + + if (!fns[0] || !fns[1]) + return; + for (int i = 2 - !fns[2]; i >= 0; --i) + if (fns[i] != orig_decl && DECL_INTERFACE_KNOWN (fns[i])) + return; + DECL_INTERFACE_KNOWN (fn) = 1; + comdat_linkage (fn); + if (!can_alias_cdtor (fn)) + return; + /* For comdat base and complete cdtors put them into the same, + *[CD]5* comdat group instead of *[CD][12]*. */ + auto n0 = cgraph_node::get_create (fns[0]); + auto n1 = cgraph_node::get_create (fns[1]); + auto n2 = fns[2] ? cgraph_node::get_create (fns[1]) : NULL; + if (n0->lowered || n1->lowered || (n2 && n2->lowered)) + return; + import_export_decl (fns[0]); + n1->definition = false; + if (!n0->create_same_body_alias (fns[1], fns[0])) + return; + + tree comdat_group = cdtor_comdat_group (fns[1], fns[0]); + n1 = cgraph_node::get (fns[1]); + n0->set_comdat_group (comdat_group); + if (n1->same_comdat_group) + n1->remove_from_same_comdat_group (); + n1->add_to_same_comdat_group (n0); + if (fns[2]) + n2->add_to_same_comdat_group (n0); + import_export_decl (fns[1]); + /* Remove the body now that it is an alias. */ + DECL_SAVED_TREE (fns[1]) = NULL_TREE; +} --- gcc/testsuite/g++.dg/lto/pr113208_0.C.jj 2024-04-16 17:45:01.687635682 +0200 +++ gcc/testsuite/g++.dg/lto/pr113208_0.C 2024-04-16 17:45:01.687635682 +0200 @@ -0,0 +1,13 @@ +// { dg-lto-do link } +// { dg-lto-options { {-O1 -std=c++20 -flto}} } +// { dg-extra-ld-options "-r -nostdlib -flinker-output=nolto-rel" } +// { dg-require-linker-plugin "" } + +#define CONSTEXPR constexpr +#include "pr113208.h" + +struct QualityValue; +struct k : vector<QualityValue> {}; + +void m(k); +void n(k i) { m(i); } --- gcc/testsuite/g++.dg/lto/pr113208_1.C.jj 2024-04-16 17:45:01.687635682 +0200 +++ gcc/testsuite/g++.dg/lto/pr113208_1.C 2024-04-16 17:45:01.687635682 +0200 @@ -0,0 +1,6 @@ +#define CONSTEXPR +#include "pr113208.h" + +struct QualityValue; +vector<QualityValue> values1; +vector<QualityValue> values{values1}; --- gcc/testsuite/g++.dg/lto/pr113208.h.jj 2024-04-16 17:45:01.687635682 +0200 +++ gcc/testsuite/g++.dg/lto/pr113208.h 2024-04-16 17:45:01.687635682 +0200 @@ -0,0 +1,10 @@ +template <typename _Tp> struct _Vector_base { + int g() const; + _Vector_base(int, int); +}; +template <typename _Tp> +struct vector : _Vector_base<_Tp> { + CONSTEXPR vector(const vector &__x) + : _Vector_base<_Tp>(1, __x.g()) {} + vector() : _Vector_base<_Tp>(1, 2) {} +}; Jakub ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] c++: Retry the aliasing of base/complete cdtor optimization at import_export_decl time [PR113208] 2024-04-17 7:42 [PATCH] c++: Retry the aliasing of base/complete cdtor optimization at import_export_decl time [PR113208] Jakub Jelinek @ 2024-04-17 9:04 ` Jan Hubicka 2024-04-17 12:32 ` Jakub Jelinek 2024-04-22 15:42 ` [PATCH] c++, v2: " Jakub Jelinek 1 sibling, 1 reply; 25+ messages in thread From: Jan Hubicka @ 2024-04-17 9:04 UTC (permalink / raw) To: Jakub Jelinek; +Cc: Jason Merrill, Richard Biener, Patrick Palka, gcc-patches > Hi! Hello, > The reason the optimization doesn't trigger when the constructor is > constexpr is that expand_or_defer_fn is called in that case much earlier > than when it is not constexpr; in the former case it is called when we > try to constant evaluate that constructor. But DECL_INTERFACE_KNOWN > is false in that case and comdat_linkage hasn't been called either > (I think it is desirable, because comdat group is stored in the cgraph > node and am not sure it is a good idea to create cgraph nodes for > something that might not be needed later on at all), so maybe_clone_body > clones the bodies, but doesn't make them as aliases. Thanks for working this out! Creating cgraph node with no body is harmless as it will be removed later. > > The following patch is an attempt to redo this optimization when > import_export_decl is called at_eof time on the base/complete cdtor > (or deleting dtor). It will not do anything if maybe_clone_body > hasn't been called uyet (the TREE_ASM_WRITTEN check on the > DECL_MAYBE_IN_CHARGE_CDTOR_P), or when one or both of the base/complete > cdtors have been lowered already, or when maybe_clone_body called > maybe_thunk_body and it was successful. Otherwise retries the > can_alias_cdtor check and makes the complete cdtor alias to the > base cdtor with adjustments to the comdat group. > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? > > 2024-04-17 Jakub Jelinek <jakub@redhat.com> > > PR lto/113208 > * cp-tree.h (maybe_optimize_cdtor): Declare. > * decl2.cc (import_export_decl): Call it for cloned cdtors. > * optimize.cc (maybe_optimize_cdtor): New function. > > * g++.dg/lto/pr113208_0.C: New test. > * g++.dg/lto/pr113208_1.C: New file. > * g++.dg/lto/pr113208.h: New file. > > --- gcc/cp/cp-tree.h.jj 2024-04-16 17:18:37.286279533 +0200 > +++ gcc/cp/cp-tree.h 2024-04-16 17:45:01.685635709 +0200 > @@ -7447,6 +7447,7 @@ extern bool handle_module_option (unsign > /* In optimize.cc */ > extern tree clone_attrs (tree); > extern bool maybe_clone_body (tree); > +extern void maybe_optimize_cdtor (tree); > > /* In parser.cc */ > extern tree cp_convert_range_for (tree, tree, tree, cp_decomp *, bool, > --- gcc/cp/decl2.cc.jj 2024-04-16 17:18:37.287279519 +0200 > +++ gcc/cp/decl2.cc 2024-04-16 17:45:01.686635695 +0200 > @@ -3568,6 +3568,9 @@ import_export_decl (tree decl) > } > > DECL_INTERFACE_KNOWN (decl) = 1; > + > + if (DECL_CLONED_FUNCTION_P (decl)) > + maybe_optimize_cdtor (decl); > } > > /* Return an expression that performs the destruction of DECL, which > --- gcc/cp/optimize.cc.jj 2024-04-16 17:18:37.374278327 +0200 > +++ gcc/cp/optimize.cc 2024-04-16 21:35:53.597188774 +0200 > @@ -723,3 +723,58 @@ maybe_clone_body (tree fn) > /* We don't need to process the original function any further. */ > return 1; > } > + > +/* If maybe_clone_body is called while the cdtor is still tentative, > + DECL_ONE_ONLY will be false and so will be can_alias_cdtor (fn). > + In that case we wouldn't try to optimize using an alias and instead > + would emit separate base and complete cdtor. The following function > + attempts to still optimize that case when we import_export_decl > + is called first time on one of the clones. */ > + > +void > +maybe_optimize_cdtor (tree orig_decl) > +{ > + tree fns[3]; > + tree fn = DECL_CLONED_FUNCTION (orig_decl); > + gcc_checking_assert (DECL_MAYBE_IN_CHARGE_CDTOR_P (fn)); > + if (DECL_INTERFACE_KNOWN (fn) > + || !TREE_ASM_WRITTEN (fn) > + || !DECL_ONE_ONLY (orig_decl) > + || symtab->global_info_ready) > + return; > + > + populate_clone_array (fn, fns); > + > + if (!fns[0] || !fns[1]) > + return; > + for (int i = 2 - !fns[2]; i >= 0; --i) > + if (fns[i] != orig_decl && DECL_INTERFACE_KNOWN (fns[i])) > + return; > + DECL_INTERFACE_KNOWN (fn) = 1; > + comdat_linkage (fn); > + if (!can_alias_cdtor (fn)) > + return; > + /* For comdat base and complete cdtors put them into the same, > + *[CD]5* comdat group instead of *[CD][12]*. */ > + auto n0 = cgraph_node::get_create (fns[0]); > + auto n1 = cgraph_node::get_create (fns[1]); > + auto n2 = fns[2] ? cgraph_node::get_create (fns[1]) : NULL; > + if (n0->lowered || n1->lowered || (n2 && n2->lowered)) > + return; > + import_export_decl (fns[0]); > + n1->definition = false; > + if (!n0->create_same_body_alias (fns[1], fns[0])) > + return; > + > + tree comdat_group = cdtor_comdat_group (fns[1], fns[0]); > + n1 = cgraph_node::get (fns[1]); > + n0->set_comdat_group (comdat_group); > + if (n1->same_comdat_group) > + n1->remove_from_same_comdat_group (); > + n1->add_to_same_comdat_group (n0); > + if (fns[2]) > + n2->add_to_same_comdat_group (n0); > + import_export_decl (fns[1]); So this is handling the case where symbol was already inserted into one comdat group and later we need to move it into the C5 group? As long as we move everythingf rom old comdat group, this should be fine. > + /* Remove the body now that it is an alias. */ > + DECL_SAVED_TREE (fns[1]) = NULL_TREE; Maybe using release_function_body since it also knows how to remove DECL_STRUCT_FUNCTION that exists at this stage? I was thinking how to solve the problem on cgrpah side. We definitely have long lasting bug where aliases are handled incorrectly for which I made WIP patch (but probably would like to hold it after release branch is created). When foo has alias bar and foo is praviled then the alias target is prevailed too. This is what causes the ICE about cross comdat section alias. However fixing this is not enough as I do not think we can handle incremental linking correctly (as discussed briefly on IRC technically we should keep both sections but that would require two symbols of same name in single .o file). With the aliasing fixed we turn the other symbol into static but keep alias, so we end up with one comdat group having the non-aliased constructor and second comdat group (C5) exporting only the alias, which is not quite reight either. Honza ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] c++: Retry the aliasing of base/complete cdtor optimization at import_export_decl time [PR113208] 2024-04-17 9:04 ` Jan Hubicka @ 2024-04-17 12:32 ` Jakub Jelinek 2024-04-17 13:26 ` Jan Hubicka 0 siblings, 1 reply; 25+ messages in thread From: Jakub Jelinek @ 2024-04-17 12:32 UTC (permalink / raw) To: Jan Hubicka; +Cc: Jason Merrill, Richard Biener, Patrick Palka, gcc-patches On Wed, Apr 17, 2024 at 11:04:26AM +0200, Jan Hubicka wrote: > > The reason the optimization doesn't trigger when the constructor is > > constexpr is that expand_or_defer_fn is called in that case much earlier > > than when it is not constexpr; in the former case it is called when we > > try to constant evaluate that constructor. But DECL_INTERFACE_KNOWN > > is false in that case and comdat_linkage hasn't been called either > > (I think it is desirable, because comdat group is stored in the cgraph > > node and am not sure it is a good idea to create cgraph nodes for > > something that might not be needed later on at all), so maybe_clone_body > > clones the bodies, but doesn't make them as aliases. > > Thanks for working this out! Creating cgraph node with no body is > harmless as it will be removed later. That is actually for functions with bodies, maybe_clone_body creates the bodies for those, but still when it happens early, the cdtors have tentative_decl_linkage linkage, which in many cases means DECL_EXTERNAL, DECL_NOT_REALLY_EXTERN (C++ FE special thing), DECL_DEFER_OUTPUT etc. > > + tree comdat_group = cdtor_comdat_group (fns[1], fns[0]); > > + n1 = cgraph_node::get (fns[1]); > > + n0->set_comdat_group (comdat_group); > > + if (n1->same_comdat_group) > > + n1->remove_from_same_comdat_group (); > > + n1->add_to_same_comdat_group (n0); > > + if (fns[2]) > > + n2->add_to_same_comdat_group (n0); > > + import_export_decl (fns[1]); > > So this is handling the case where symbol was already inserted into one > comdat group and later we need to move it into the C5 group? As long as > we move everythingf rom old comdat group, this should be fine. The above is pretty much an adjusted copy of what maybe_clone_body does, except it doesn't call cgraph_node::get{,_create} all the time and uses import_export_decl rather than expand_or_defer_fn{,_1}. > > + /* Remove the body now that it is an alias. */ > > + DECL_SAVED_TREE (fns[1]) = NULL_TREE; > Maybe using release_function_body since it also knows how to remove > DECL_STRUCT_FUNCTION that exists at this stage? Guess I could try that, clearing of DECL_SAVED_TREE was what was done in maybe_clone_body too. > I was thinking how to solve the problem on cgrpah side. We definitely > have long lasting bug where aliases are handled incorrectly for which I > made WIP patch (but probably would like to hold it after release branch is > created). When foo has alias bar and foo is praviled then the alias > target is prevailed too. This is what causes the ICE about cross comdat > section alias. However fixing this is not enough as I do not think we > can handle incremental linking correctly (as discussed briefly on IRC > technically we should keep both sections but that would require two > symbols of same name in single .o file). > > With the aliasing fixed we turn the other symbol into static but keep > alias, so we end up with one comdat group having the non-aliased > constructor and second comdat group (C5) exporting only the alias, which > is not quite reight either. I've tried to see what actually happens during linking without LTO, so compiled pr113208_0.C with -O1 -fkeep-inline-functions -std=c++20 with vanilla trunk (so it has those 2 separate comdats, one for C2 and one for C1), though I've changed the void m(k); line to __attribute__((noipa)) void m(k) {} in the testcase, then compiled pr113208_1.C with -O2 -fkeep-inline-functions -std=c++20 -fno-omit-frame-pointer so that one can clearly differentiate from where the implementation was picked and finally added template <typename _Tp> struct _Vector_base { int g() const; _Vector_base(int, int); }; struct QualityValue; template <> _Vector_base<QualityValue>::_Vector_base(int, int) {} template <> int _Vector_base<QualityValue>::g() const { return 0; } int main () {} If I link this, I see _ZN6vectorI12QualityValueEC2ERKS1_ and _ZN6vectorI12QualityValueEC1ERKS1_ as separate functions with the omitted frame pointer bodies, so clearly the pr113208_0.C versions prevailed in both cases. It is unclear why that isn't the case for LTO. Jakub ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] c++: Retry the aliasing of base/complete cdtor optimization at import_export_decl time [PR113208] 2024-04-17 12:32 ` Jakub Jelinek @ 2024-04-17 13:26 ` Jan Hubicka 2024-04-17 14:07 ` Jakub Jelinek 0 siblings, 1 reply; 25+ messages in thread From: Jan Hubicka @ 2024-04-17 13:26 UTC (permalink / raw) To: Jakub Jelinek; +Cc: Jason Merrill, Richard Biener, Patrick Palka, gcc-patches > > I've tried to see what actually happens during linking without LTO, so compiled > pr113208_0.C with -O1 -fkeep-inline-functions -std=c++20 with vanilla trunk > (so it has those 2 separate comdats, one for C2 and one for C1), though I've > changed the > void m(k); > line to > __attribute__((noipa)) void m(k) {} > in the testcase, then compiled > pr113208_1.C with -O2 -fkeep-inline-functions -std=c++20 -fno-omit-frame-pointer > so that one can clearly differentiate from where the implementation was > picked and finally added > template <typename _Tp> struct _Vector_base { > int g() const; > _Vector_base(int, int); > }; > > struct QualityValue; > template <> > _Vector_base<QualityValue>::_Vector_base(int, int) {} > template <> > int _Vector_base<QualityValue>::g() const { return 0; } > int main () {} > If I link this, I see _ZN6vectorI12QualityValueEC2ERKS1_ and > _ZN6vectorI12QualityValueEC1ERKS1_ as separate functions with the > omitted frame pointer bodies, so clearly the pr113208_0.C versions prevailed > in both cases. It is unclear why that isn't the case for LTO. I think it is because of -fkeep-inline-functions which makes the first object file to define both symbols, while with LTO we optimize out one of them. So to reproduce same behaviour with non-LTO we would probably need use -O1 and arrange the contructor to be unilinable instead of using -fkeep-inline-functions. Honza ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] c++: Retry the aliasing of base/complete cdtor optimization at import_export_decl time [PR113208] 2024-04-17 13:26 ` Jan Hubicka @ 2024-04-17 14:07 ` Jakub Jelinek 2024-04-17 14:34 ` Jan Hubicka 0 siblings, 1 reply; 25+ messages in thread From: Jakub Jelinek @ 2024-04-17 14:07 UTC (permalink / raw) To: Jan Hubicka; +Cc: Jason Merrill, Richard Biener, Patrick Palka, gcc-patches On Wed, Apr 17, 2024 at 03:26:53PM +0200, Jan Hubicka wrote: > > > > I've tried to see what actually happens during linking without LTO, so compiled > > pr113208_0.C with -O1 -fkeep-inline-functions -std=c++20 with vanilla trunk > > (so it has those 2 separate comdats, one for C2 and one for C1), though I've > > changed the > > void m(k); > > line to > > __attribute__((noipa)) void m(k) {} > > in the testcase, then compiled > > pr113208_1.C with -O2 -fkeep-inline-functions -std=c++20 -fno-omit-frame-pointer > > so that one can clearly differentiate from where the implementation was > > picked and finally added > > template <typename _Tp> struct _Vector_base { > > int g() const; > > _Vector_base(int, int); > > }; > > > > struct QualityValue; > > template <> > > _Vector_base<QualityValue>::_Vector_base(int, int) {} > > template <> > > int _Vector_base<QualityValue>::g() const { return 0; } > > int main () {} > > If I link this, I see _ZN6vectorI12QualityValueEC2ERKS1_ and > > _ZN6vectorI12QualityValueEC1ERKS1_ as separate functions with the > > omitted frame pointer bodies, so clearly the pr113208_0.C versions prevailed > > in both cases. It is unclear why that isn't the case for LTO. > > I think it is because of -fkeep-inline-functions which makes the first > object file to define both symbols, while with LTO we optimize out one > of them. > > So to reproduce same behaviour with non-LTO we would probably need use > -O1 and arrange the contructor to be unilinable instead of using > -fkeep-inline-functions. Ah, you're right. If I compile (the one line modified) pr113208_0.C with -O -fno-early-inlining -fdisable-ipa-inline -std=c++20 it does have just _ZN6vectorI12QualityValueEC2ERKS1_ in _ZN6vectorI12QualityValueEC2ERKS1_ comdat and no _ZN6vectorI12QualityValueEC1ERKS1_ and pr113208_1.C with -O -fno-early-inlining -fdisable-ipa-inline -std=c++20 -fno-omit-frame-pointer and link that together with the above mentioned third *.C file, I see 000000000040112a <_ZN6vectorI12QualityValueEC2ERKS1_>: 40112a: 53 push %rbx 40112b: 48 89 fb mov %rdi,%rbx 40112e: 48 89 f7 mov %rsi,%rdi 401131: e8 9c 00 00 00 call 4011d2 <_ZNK12_Vector_baseI12QualityValueE1gEv> 401136: 89 c2 mov %eax,%edx 401138: be 01 00 00 00 mov $0x1,%esi 40113d: 48 89 df mov %rbx,%rdi 401140: e8 7b 00 00 00 call 4011c0 <_ZN12_Vector_baseI12QualityValueEC1Eii> 401145: 5b pop %rbx 401146: c3 ret i.e. the C2 prevailing from pr113208_0.s where it is the only symbol, and 0000000000401196 <_ZN6vectorI12QualityValueEC1ERKS1_>: 401196: 55 push %rbp 401197: 48 89 e5 mov %rsp,%rbp 40119a: 53 push %rbx 40119b: 48 83 ec 08 sub $0x8,%rsp 40119f: 48 89 fb mov %rdi,%rbx 4011a2: 48 89 f7 mov %rsi,%rdi 4011a5: e8 28 00 00 00 call 4011d2 <_ZNK12_Vector_baseI12QualityValueE1gEv> 4011aa: 89 c2 mov %eax,%edx 4011ac: be 01 00 00 00 mov $0x1,%esi 4011b1: 48 89 df mov %rbx,%rdi 4011b4: e8 07 00 00 00 call 4011c0 <_ZN12_Vector_baseI12QualityValueEC1Eii> 4011b9: 48 8b 5d f8 mov -0x8(%rbp),%rbx 4011bd: c9 leave 4011be: c3 ret which is the C1 alias originally aliased to C2 in C5 comdat. So, that would match linker behavior where it sees C1 -> C2 alias prevails, but a different version of C2 prevails, so let's either make C1 a non-alias or alias to a non-exported symbol or something like that. Though, I admit I have no idea what we do with comdat's during LTO, perhaps doing what I said above could break stuff if linker after seeing the LTO resulting objects decides on prevailing symbols differently. Jakub ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] c++: Retry the aliasing of base/complete cdtor optimization at import_export_decl time [PR113208] 2024-04-17 14:07 ` Jakub Jelinek @ 2024-04-17 14:34 ` Jan Hubicka 2024-04-17 14:39 ` Jakub Jelinek 0 siblings, 1 reply; 25+ messages in thread From: Jan Hubicka @ 2024-04-17 14:34 UTC (permalink / raw) To: Jakub Jelinek; +Cc: Jason Merrill, Richard Biener, Patrick Palka, gcc-patches > > Ah, you're right. > If I compile (the one line modified) pr113208_0.C with > -O -fno-early-inlining -fdisable-ipa-inline -std=c++20 > it does have just _ZN6vectorI12QualityValueEC2ERKS1_ in _ZN6vectorI12QualityValueEC2ERKS1_ > comdat and no _ZN6vectorI12QualityValueEC1ERKS1_ > and pr113208_1.C with -O -fno-early-inlining -fdisable-ipa-inline -std=c++20 -fno-omit-frame-pointer > and link that together with the above mentioned third *.C file, I see > 000000000040112a <_ZN6vectorI12QualityValueEC2ERKS1_>: > 40112a: 53 push %rbx > 40112b: 48 89 fb mov %rdi,%rbx > 40112e: 48 89 f7 mov %rsi,%rdi > 401131: e8 9c 00 00 00 call 4011d2 <_ZNK12_Vector_baseI12QualityValueE1gEv> > 401136: 89 c2 mov %eax,%edx > 401138: be 01 00 00 00 mov $0x1,%esi > 40113d: 48 89 df mov %rbx,%rdi > 401140: e8 7b 00 00 00 call 4011c0 <_ZN12_Vector_baseI12QualityValueEC1Eii> > 401145: 5b pop %rbx > 401146: c3 ret > i.e. the C2 prevailing from pr113208_0.s where it is the only symbol, and > 0000000000401196 <_ZN6vectorI12QualityValueEC1ERKS1_>: > 401196: 55 push %rbp > 401197: 48 89 e5 mov %rsp,%rbp > 40119a: 53 push %rbx > 40119b: 48 83 ec 08 sub $0x8,%rsp > 40119f: 48 89 fb mov %rdi,%rbx > 4011a2: 48 89 f7 mov %rsi,%rdi > 4011a5: e8 28 00 00 00 call 4011d2 <_ZNK12_Vector_baseI12QualityValueE1gEv> > 4011aa: 89 c2 mov %eax,%edx > 4011ac: be 01 00 00 00 mov $0x1,%esi > 4011b1: 48 89 df mov %rbx,%rdi > 4011b4: e8 07 00 00 00 call 4011c0 <_ZN12_Vector_baseI12QualityValueEC1Eii> > 4011b9: 48 8b 5d f8 mov -0x8(%rbp),%rbx > 4011bd: c9 leave > 4011be: c3 ret > which is the C1 alias originally aliased to C2 in C5 comdat. > So, that would match linker behavior where it sees C1 -> C2 alias prevails, > but a different version of C2 prevails, so let's either make C1 a non-alias > or alias to a non-exported symbol or something like that. Yep, I think the linker simply handles the C2 symbol as weak symbol and after it decides to keep both comdat section (C2 and C5) the C5's weak symbol is prevailed by C2. I wrote patch doing the same with LTO - we need to handle alias references specially and do not update them according to the resolution info and then look for prevailed symbols which have aliases and turn them to static symbols. I think for most scenarios this is OK, but I am not sure about incremental linking (both LTO and non-LTO). What seems wrong is that we produce C5 comdat that is not exporting what it should and thus breaking the invariant that in valid code all comdats of same name are semantically equivalent. Perhaps it makes no difference since this scenario is pretty special and we know that the functions are semantically equivalent and their addresses are never compared for equality (at least I failed to produce some useful testcase). Honza ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] c++: Retry the aliasing of base/complete cdtor optimization at import_export_decl time [PR113208] 2024-04-17 14:34 ` Jan Hubicka @ 2024-04-17 14:39 ` Jakub Jelinek 0 siblings, 0 replies; 25+ messages in thread From: Jakub Jelinek @ 2024-04-17 14:39 UTC (permalink / raw) To: Jan Hubicka; +Cc: Jason Merrill, Richard Biener, Patrick Palka, gcc-patches On Wed, Apr 17, 2024 at 04:34:07PM +0200, Jan Hubicka wrote: > I think for most scenarios this is OK, but I am not sure about > incremental linking (both LTO and non-LTO). What seems wrong is that we > produce C5 comdat that is not exporting what it should and thus breaking > the invariant that in valid code all comdats of same name are > semantically equivalent. Yeah, exactly. That is what I'm worried about too. > Perhaps it makes no difference since this > scenario is pretty special and we know that the functions are > semantically equivalent and their addresses are never compared for > equality (at least I failed to produce some useful testcase). Yes, I think one can't take address of a constructor/destructor and compare that for equality; I guess the destructor address can be stored in vtables, but code manually reading stuff from vtables and assuming pointer equality is almost certainly not valid. Jakub ^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH] c++, v2: Retry the aliasing of base/complete cdtor optimization at import_export_decl time [PR113208] 2024-04-17 7:42 [PATCH] c++: Retry the aliasing of base/complete cdtor optimization at import_export_decl time [PR113208] Jakub Jelinek 2024-04-17 9:04 ` Jan Hubicka @ 2024-04-22 15:42 ` Jakub Jelinek 2024-04-23 3:14 ` Jason Merrill 1 sibling, 1 reply; 25+ messages in thread From: Jakub Jelinek @ 2024-04-22 15:42 UTC (permalink / raw) To: Jason Merrill; +Cc: gcc-patches, Jan Hubicka, Richard Biener, Patrick Palka On Wed, Apr 17, 2024 at 09:42:47AM +0200, Jakub Jelinek wrote: > When expand_or_defer_fn is called at_eof time, it calls import_export_decl > and then maybe_clone_body, which uses DECL_ONE_ONLY and comdat name in a > couple of places to try to optimize cdtors which are known to have the > same body by making the complete cdtor an alias to base cdtor (and in > that case also uses *[CD]5* as comdat group name instead of the normal > comdat group names specific to each mangled name). > Now, this optimization depends on DECL_ONE_ONLY and DECL_INTERFACE_KNOWN, > maybe_clone_body and can_alias_cdtor use: > if (DECL_ONE_ONLY (fn)) > cgraph_node::get_create (clone)->set_comdat_group (cxx_comdat_group (clone)); > ... > bool can_alias = can_alias_cdtor (fn); > ... > /* Tell cgraph if both ctors or both dtors are known to have > the same body. */ > if (can_alias > && fns[0] > && idx == 1 > && cgraph_node::get_create (fns[0])->create_same_body_alias > (clone, fns[0])) > { > alias = true; > if (DECL_ONE_ONLY (fns[0])) > { > /* For comdat base and complete cdtors put them > into the same, *[CD]5* comdat group instead of > *[CD][12]*. */ > comdat_group = cdtor_comdat_group (fns[1], fns[0]); > cgraph_node::get_create (fns[0])->set_comdat_group (comdat_group); > if (symtab_node::get (clone)->same_comdat_group) > symtab_node::get (clone)->remove_from_same_comdat_group (); > symtab_node::get (clone)->add_to_same_comdat_group > (symtab_node::get (fns[0])); > } > } > and > /* Don't use aliases for weak/linkonce definitions unless we can put both > symbols in the same COMDAT group. */ > return (DECL_INTERFACE_KNOWN (fn) > && (SUPPORTS_ONE_ONLY || !DECL_WEAK (fn)) > && (!DECL_ONE_ONLY (fn) > || (HAVE_COMDAT_GROUP && DECL_WEAK (fn)))); > The following testcase regressed with Marek's r14-5979 change, > when pr113208_0.C is compiled where the ctor is marked constexpr, > we no longer perform this optimization, where > _ZN6vectorI12QualityValueEC2ERKS1_ was emitted in the > _ZN6vectorI12QualityValueEC5ERKS1_ comdat group and > _ZN6vectorI12QualityValueEC1ERKS1_ was made an alias to it, > instead we emit _ZN6vectorI12QualityValueEC2ERKS1_ in > _ZN6vectorI12QualityValueEC2ERKS1_ comdat group and the same > content _ZN6vectorI12QualityValueEC1ERKS1_ as separate symbol in > _ZN6vectorI12QualityValueEC1ERKS1_ comdat group. > Now, the linker seems to somehow cope with that, eventhough it > probably keeps both copies of the ctor, but seems LTO can't cope > with that and Honza doesn't know what it should do in that case > (linker decides that the prevailing symbol is > _ZN6vectorI12QualityValueEC2ERKS1_ (from the > _ZN6vectorI12QualityValueEC2ERKS1_ comdat group) and > _ZN6vectorI12QualityValueEC1ERKS1_ alias (from the other TU, > from _ZN6vectorI12QualityValueEC5ERKS1_ comdat group)). > > Note, the case where some constructor is marked constexpr in one > TU and not in another one happens pretty often in libstdc++ when > one mixes -std= flags used to compile different compilation units. > > The reason the optimization doesn't trigger when the constructor is > constexpr is that expand_or_defer_fn is called in that case much earlier > than when it is not constexpr; in the former case it is called when we > try to constant evaluate that constructor. But DECL_INTERFACE_KNOWN > is false in that case and comdat_linkage hasn't been called either > (I think it is desirable, because comdat group is stored in the cgraph > node and am not sure it is a good idea to create cgraph nodes for > something that might not be needed later on at all), so maybe_clone_body > clones the bodies, but doesn't make them as aliases. > > The following patch is an attempt to redo this optimization when > import_export_decl is called at_eof time on the base/complete cdtor > (or deleting dtor). It will not do anything if maybe_clone_body > hasn't been called uyet (the TREE_ASM_WRITTEN check on the > DECL_MAYBE_IN_CHARGE_CDTOR_P), or when one or both of the base/complete > cdtors have been lowered already, or when maybe_clone_body called > maybe_thunk_body and it was successful. Otherwise retries the > can_alias_cdtor check and makes the complete cdtor alias to the > base cdtor with adjustments to the comdat group. Here is an updated version of the patch which changes - DECL_SAVED_TREE (fns[1]) = NULL_TREE; + release_function_body (fns[1]); as suggested by Honza. Bootstrapped/regtested on x86_64-linux and i686-linux again, ok for trunk? 2024-04-22 Jakub Jelinek <jakub@redhat.com> PR lto/113208 * cp-tree.h (maybe_optimize_cdtor): Declare. * decl2.cc (import_export_decl): Call it for cloned cdtors. * optimize.cc (maybe_optimize_cdtor): New function. * g++.dg/lto/pr113208_0.C: New test. * g++.dg/lto/pr113208_1.C: New file. * g++.dg/lto/pr113208.h: New file. --- gcc/cp/cp-tree.h.jj 2024-04-16 17:18:37.286279533 +0200 +++ gcc/cp/cp-tree.h 2024-04-16 17:45:01.685635709 +0200 @@ -7447,6 +7447,7 @@ extern bool handle_module_option (unsign /* In optimize.cc */ extern tree clone_attrs (tree); extern bool maybe_clone_body (tree); +extern void maybe_optimize_cdtor (tree); /* In parser.cc */ extern tree cp_convert_range_for (tree, tree, tree, cp_decomp *, bool, --- gcc/cp/decl2.cc.jj 2024-04-16 17:18:37.287279519 +0200 +++ gcc/cp/decl2.cc 2024-04-16 17:45:01.686635695 +0200 @@ -3568,6 +3568,9 @@ import_export_decl (tree decl) } DECL_INTERFACE_KNOWN (decl) = 1; + + if (DECL_CLONED_FUNCTION_P (decl)) + maybe_optimize_cdtor (decl); } /* Return an expression that performs the destruction of DECL, which --- gcc/cp/optimize.cc.jj 2024-04-16 17:18:37.374278327 +0200 +++ gcc/cp/optimize.cc 2024-04-22 14:52:59.310630914 +0200 @@ -723,3 +723,58 @@ maybe_clone_body (tree fn) /* We don't need to process the original function any further. */ return 1; } + +/* If maybe_clone_body is called while the cdtor is still tentative, + DECL_ONE_ONLY will be false and so will be can_alias_cdtor (fn). + In that case we wouldn't try to optimize using an alias and instead + would emit separate base and complete cdtor. The following function + attempts to still optimize that case when we import_export_decl + is called first time on one of the clones. */ + +void +maybe_optimize_cdtor (tree orig_decl) +{ + tree fns[3]; + tree fn = DECL_CLONED_FUNCTION (orig_decl); + gcc_checking_assert (DECL_MAYBE_IN_CHARGE_CDTOR_P (fn)); + if (DECL_INTERFACE_KNOWN (fn) + || !TREE_ASM_WRITTEN (fn) + || !DECL_ONE_ONLY (orig_decl) + || symtab->global_info_ready) + return; + + populate_clone_array (fn, fns); + + if (!fns[0] || !fns[1]) + return; + for (int i = 2 - !fns[2]; i >= 0; --i) + if (fns[i] != orig_decl && DECL_INTERFACE_KNOWN (fns[i])) + return; + DECL_INTERFACE_KNOWN (fn) = 1; + comdat_linkage (fn); + if (!can_alias_cdtor (fn)) + return; + /* For comdat base and complete cdtors put them into the same, + *[CD]5* comdat group instead of *[CD][12]*. */ + auto n0 = cgraph_node::get_create (fns[0]); + auto n1 = cgraph_node::get_create (fns[1]); + auto n2 = fns[2] ? cgraph_node::get_create (fns[1]) : NULL; + if (n0->lowered || n1->lowered || (n2 && n2->lowered)) + return; + import_export_decl (fns[0]); + n1->definition = false; + if (!n0->create_same_body_alias (fns[1], fns[0])) + return; + + tree comdat_group = cdtor_comdat_group (fns[1], fns[0]); + n1 = cgraph_node::get (fns[1]); + n0->set_comdat_group (comdat_group); + if (n1->same_comdat_group) + n1->remove_from_same_comdat_group (); + n1->add_to_same_comdat_group (n0); + if (fns[2]) + n2->add_to_same_comdat_group (n0); + import_export_decl (fns[1]); + /* Remove the body now that it is an alias. */ + release_function_body (fns[1]); +} --- gcc/testsuite/g++.dg/lto/pr113208_0.C.jj 2024-04-16 17:45:01.687635682 +0200 +++ gcc/testsuite/g++.dg/lto/pr113208_0.C 2024-04-16 17:45:01.687635682 +0200 @@ -0,0 +1,13 @@ +// { dg-lto-do link } +// { dg-lto-options { {-O1 -std=c++20 -flto}} } +// { dg-extra-ld-options "-r -nostdlib -flinker-output=nolto-rel" } +// { dg-require-linker-plugin "" } + +#define CONSTEXPR constexpr +#include "pr113208.h" + +struct QualityValue; +struct k : vector<QualityValue> {}; + +void m(k); +void n(k i) { m(i); } --- gcc/testsuite/g++.dg/lto/pr113208_1.C.jj 2024-04-16 17:45:01.687635682 +0200 +++ gcc/testsuite/g++.dg/lto/pr113208_1.C 2024-04-16 17:45:01.687635682 +0200 @@ -0,0 +1,6 @@ +#define CONSTEXPR +#include "pr113208.h" + +struct QualityValue; +vector<QualityValue> values1; +vector<QualityValue> values{values1}; --- gcc/testsuite/g++.dg/lto/pr113208.h.jj 2024-04-16 17:45:01.687635682 +0200 +++ gcc/testsuite/g++.dg/lto/pr113208.h 2024-04-16 17:45:01.687635682 +0200 @@ -0,0 +1,10 @@ +template <typename _Tp> struct _Vector_base { + int g() const; + _Vector_base(int, int); +}; +template <typename _Tp> +struct vector : _Vector_base<_Tp> { + CONSTEXPR vector(const vector &__x) + : _Vector_base<_Tp>(1, __x.g()) {} + vector() : _Vector_base<_Tp>(1, 2) {} +}; Jakub ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] c++, v2: Retry the aliasing of base/complete cdtor optimization at import_export_decl time [PR113208] 2024-04-22 15:42 ` [PATCH] c++, v2: " Jakub Jelinek @ 2024-04-23 3:14 ` Jason Merrill 2024-04-23 16:04 ` Jakub Jelinek 0 siblings, 1 reply; 25+ messages in thread From: Jason Merrill @ 2024-04-23 3:14 UTC (permalink / raw) To: Jakub Jelinek; +Cc: gcc-patches, Jan Hubicka, Richard Biener, Patrick Palka On 4/22/24 08:42, Jakub Jelinek wrote: > On Wed, Apr 17, 2024 at 09:42:47AM +0200, Jakub Jelinek wrote: >> When expand_or_defer_fn is called at_eof time, it calls import_export_decl >> and then maybe_clone_body, which uses DECL_ONE_ONLY and comdat name in a >> couple of places to try to optimize cdtors which are known to have the >> same body by making the complete cdtor an alias to base cdtor (and in >> that case also uses *[CD]5* as comdat group name instead of the normal >> comdat group names specific to each mangled name). >> Now, this optimization depends on DECL_ONE_ONLY and DECL_INTERFACE_KNOWN, >> maybe_clone_body and can_alias_cdtor use: >> if (DECL_ONE_ONLY (fn)) >> cgraph_node::get_create (clone)->set_comdat_group (cxx_comdat_group (clone)); >> ... >> bool can_alias = can_alias_cdtor (fn); >> ... >> /* Tell cgraph if both ctors or both dtors are known to have >> the same body. */ >> if (can_alias >> && fns[0] >> && idx == 1 >> && cgraph_node::get_create (fns[0])->create_same_body_alias >> (clone, fns[0])) >> { >> alias = true; >> if (DECL_ONE_ONLY (fns[0])) >> { >> /* For comdat base and complete cdtors put them >> into the same, *[CD]5* comdat group instead of >> *[CD][12]*. */ >> comdat_group = cdtor_comdat_group (fns[1], fns[0]); >> cgraph_node::get_create (fns[0])->set_comdat_group (comdat_group); >> if (symtab_node::get (clone)->same_comdat_group) >> symtab_node::get (clone)->remove_from_same_comdat_group (); >> symtab_node::get (clone)->add_to_same_comdat_group >> (symtab_node::get (fns[0])); >> } >> } >> and >> /* Don't use aliases for weak/linkonce definitions unless we can put both >> symbols in the same COMDAT group. */ >> return (DECL_INTERFACE_KNOWN (fn) >> && (SUPPORTS_ONE_ONLY || !DECL_WEAK (fn)) >> && (!DECL_ONE_ONLY (fn) >> || (HAVE_COMDAT_GROUP && DECL_WEAK (fn)))); >> The following testcase regressed with Marek's r14-5979 change, >> when pr113208_0.C is compiled where the ctor is marked constexpr, >> we no longer perform this optimization, where >> _ZN6vectorI12QualityValueEC2ERKS1_ was emitted in the >> _ZN6vectorI12QualityValueEC5ERKS1_ comdat group and >> _ZN6vectorI12QualityValueEC1ERKS1_ was made an alias to it, >> instead we emit _ZN6vectorI12QualityValueEC2ERKS1_ in >> _ZN6vectorI12QualityValueEC2ERKS1_ comdat group and the same >> content _ZN6vectorI12QualityValueEC1ERKS1_ as separate symbol in >> _ZN6vectorI12QualityValueEC1ERKS1_ comdat group. This seems like an ABI bug that could use a non-LTO testcase. >> Now, the linker seems to somehow cope with that, eventhough it >> probably keeps both copies of the ctor, but seems LTO can't cope >> with that and Honza doesn't know what it should do in that case >> (linker decides that the prevailing symbol is >> _ZN6vectorI12QualityValueEC2ERKS1_ (from the >> _ZN6vectorI12QualityValueEC2ERKS1_ comdat group) and >> _ZN6vectorI12QualityValueEC1ERKS1_ alias (from the other TU, >> from _ZN6vectorI12QualityValueEC5ERKS1_ comdat group)). >> >> Note, the case where some constructor is marked constexpr in one >> TU and not in another one happens pretty often in libstdc++ when >> one mixes -std= flags used to compile different compilation units. >> >> The reason the optimization doesn't trigger when the constructor is >> constexpr is that expand_or_defer_fn is called in that case much earlier >> than when it is not constexpr; in the former case it is called when we >> try to constant evaluate that constructor. But DECL_INTERFACE_KNOWN >> is false in that case and comdat_linkage hasn't been called either >> (I think it is desirable, because comdat group is stored in the cgraph >> node and am not sure it is a good idea to create cgraph nodes for >> something that might not be needed later on at all), so maybe_clone_body >> clones the bodies, but doesn't make them as aliases. Hmm, cloning the bodies and then discarding them later seems like more extra work than creating the cgraph nodes. Jason ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] c++, v2: Retry the aliasing of base/complete cdtor optimization at import_export_decl time [PR113208] 2024-04-23 3:14 ` Jason Merrill @ 2024-04-23 16:04 ` Jakub Jelinek 2024-04-24 9:16 ` Jonathan Wakely 0 siblings, 1 reply; 25+ messages in thread From: Jakub Jelinek @ 2024-04-23 16:04 UTC (permalink / raw) To: Jason Merrill Cc: gcc-patches, Jan Hubicka, Richard Biener, Patrick Palka, Jonathan Wakely On Mon, Apr 22, 2024 at 11:14:35PM -0400, Jason Merrill wrote: > > > The following testcase regressed with Marek's r14-5979 change, > > > when pr113208_0.C is compiled where the ctor is marked constexpr, > > > we no longer perform this optimization, where > > > _ZN6vectorI12QualityValueEC2ERKS1_ was emitted in the > > > _ZN6vectorI12QualityValueEC5ERKS1_ comdat group and > > > _ZN6vectorI12QualityValueEC1ERKS1_ was made an alias to it, > > > instead we emit _ZN6vectorI12QualityValueEC2ERKS1_ in > > > _ZN6vectorI12QualityValueEC2ERKS1_ comdat group and the same > > > content _ZN6vectorI12QualityValueEC1ERKS1_ as separate symbol in > > > _ZN6vectorI12QualityValueEC1ERKS1_ comdat group. > > This seems like an ABI bug that could use a non-LTO testcase. Well, except for the issues it causes to LTO I think it is compatible, worst case we get the body of the ctor duplicated in the executable and the linker picks some of the weak symbols as the symbol definitions. Anyway, I've added a non-LTO testcase for that in the patch below. > Hmm, cloning the bodies and then discarding them later seems like more extra > work than creating the cgraph nodes. So, I've tried to handle that in tentative_decl_linkage, like that function already handles functions declared inline except for implicit template instantiations. If we expect that import_export_decl will do comdat_linkage for the ctor later on do it right away. That fixes the testcases too, but seems to regress +FAIL: libstdc++-abi/abi_check on both x86_64-linux and i686-linux, in each case 8 symbols disappeared from libstdc++.so.6: _ZNSt12__shared_ptrINSt10filesystem7__cxx1128recursive_directory_iterator10_Dir_stackELN9__gnu_cxx12_Lock_policyE2EEC1Ev _ZNSt12__shared_ptrINSt10filesystem4_DirELN9__gnu_cxx12_Lock_policyE2EEC1Ev _ZNSt12__shared_ptrINSt10filesystem28recursive_directory_iterator10_Dir_stackELN9__gnu_cxx12_Lock_policyE2EEC1Ev _ZNSt12__shared_ptrINSt10filesystem4_DirELN9__gnu_cxx12_Lock_policyE2EEC2Ev _ZNSt12__shared_ptrINSt10filesystem7__cxx114_DirELN9__gnu_cxx12_Lock_policyE2EEC1Ev _ZNSt12__shared_ptrINSt10filesystem7__cxx1128recursive_directory_iterator10_Dir_stackELN9__gnu_cxx12_Lock_policyE2EEC2Ev _ZNSt12__shared_ptrINSt10filesystem28recursive_directory_iterator10_Dir_stackELN9__gnu_cxx12_Lock_policyE2EEC2Ev _ZNSt12__shared_ptrINSt10filesystem7__cxx114_DirELN9__gnu_cxx12_Lock_policyE2EEC2Ev Will need to study why that happened, it might be that it was ok because I think the filesystem stuff is unlike the rest compiled with no exported templates, but would need at least some hacks in libstdc++ to preserve previously exported symbols. Still, feels like a risky change this late if it wouldn't break ABI of other libraries. 2024-04-23 Jakub Jelinek <jakub@redhat.com> PR lto/113208 * decl2.cc (tentative_decl_linkage): Use comdat_linkage also for implicit instantiations of maybe in charge ctors/dtors if -fimplicit-templates or -fimplicit-inline-templates and -fweak and target supports aliases. * g++.dg/abi/comdat2.C: New test. * g++.dg/lto/pr113208_0.C: New test. * g++.dg/lto/pr113208_1.C: New file. * g++.dg/lto/pr113208.h: New file. --- gcc/cp/decl2.cc.jj 2024-04-22 15:16:55.328548807 +0200 +++ gcc/cp/decl2.cc 2024-04-23 09:52:18.993250442 +0200 @@ -3314,7 +3314,16 @@ tentative_decl_linkage (tree decl) to mark the functions at this point. */ if (DECL_DECLARED_INLINE_P (decl) && (!DECL_IMPLICIT_INSTANTIATION (decl) - || DECL_DEFAULTED_FN (decl))) + || DECL_DEFAULTED_FN (decl) + /* For implicit instantiations of cdtors, + if import_export_decl would use comdat linkage, + make sure to use it right away, so that maybe_clone_body + can use aliases. See PR113208. */ + || (DECL_MAYBE_IN_CHARGE_CDTOR_P (decl) + && (flag_implicit_templates + || flag_implicit_inline_templates) + && flag_weak + && TARGET_SUPPORTS_ALIASES))) { /* This function must have external linkage, as otherwise DECL_INTERFACE_KNOWN would have been --- gcc/testsuite/g++.dg/abi/comdat2.C.jj 2024-04-23 10:04:28.485964610 +0200 +++ gcc/testsuite/g++.dg/abi/comdat2.C 2024-04-23 10:05:24.757171194 +0200 @@ -0,0 +1,26 @@ +// PR lto/113208 +// { dg-do compile { target { c++11 && { *-*-*gnu* } } } } +// { dg-additional-options "-O2 -fkeep-inline-functions" } +// { dg-final { scan-assembler "_ZN1BI1CEC5ERKS1_,comdat" } } +// { dg-final { scan-assembler-not "_ZN1BI1CEC1ERKS1_,comdat" } } +// { dg-final { scan-assembler-not "_ZN1BI1CEC2ERKS1_,comdat" } } + +template <typename T> +struct A { + int foo () const; + A (int, int); +}; +template <typename T> +struct B : A<T> { + constexpr B (const B &x) : A<T> (1, x.foo ()) {} + B () : A<T> (1, 2) {} +}; +struct C; +struct D : B<C> {}; +void bar (D); + +void +baz (D x) +{ + bar (x); +} --- gcc/testsuite/g++.dg/lto/pr113208.h.jj 2024-04-23 09:44:07.523179110 +0200 +++ gcc/testsuite/g++.dg/lto/pr113208.h 2024-04-23 09:44:07.523179110 +0200 @@ -0,0 +1,10 @@ +template <typename _Tp> struct _Vector_base { + int g() const; + _Vector_base(int, int); +}; +template <typename _Tp> +struct vector : _Vector_base<_Tp> { + CONSTEXPR vector(const vector &__x) + : _Vector_base<_Tp>(1, __x.g()) {} + vector() : _Vector_base<_Tp>(1, 2) {} +}; --- gcc/testsuite/g++.dg/lto/pr113208_1.C.jj 2024-04-23 09:44:07.523179110 +0200 +++ gcc/testsuite/g++.dg/lto/pr113208_1.C 2024-04-23 09:44:07.523179110 +0200 @@ -0,0 +1,6 @@ +#define CONSTEXPR +#include "pr113208.h" + +struct QualityValue; +vector<QualityValue> values1; +vector<QualityValue> values{values1}; --- gcc/testsuite/g++.dg/lto/pr113208_0.C.jj 2024-04-23 09:44:07.523179110 +0200 +++ gcc/testsuite/g++.dg/lto/pr113208_0.C 2024-04-23 09:44:07.523179110 +0200 @@ -0,0 +1,13 @@ +// { dg-lto-do link } +// { dg-lto-options { {-O1 -std=c++20 -flto}} } +// { dg-extra-ld-options "-r -nostdlib -flinker-output=nolto-rel" } +// { dg-require-linker-plugin "" } + +#define CONSTEXPR constexpr +#include "pr113208.h" + +struct QualityValue; +struct k : vector<QualityValue> {}; + +void m(k); +void n(k i) { m(i); } Jakub ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] c++, v2: Retry the aliasing of base/complete cdtor optimization at import_export_decl time [PR113208] 2024-04-23 16:04 ` Jakub Jelinek @ 2024-04-24 9:16 ` Jonathan Wakely 2024-04-24 16:16 ` [PATCH] c++, v3: " Jakub Jelinek 0 siblings, 1 reply; 25+ messages in thread From: Jonathan Wakely @ 2024-04-24 9:16 UTC (permalink / raw) To: Jakub Jelinek Cc: Jason Merrill, gcc-patches, Jan Hubicka, Richard Biener, Patrick Palka On Tue, 23 Apr 2024 at 17:05, Jakub Jelinek <jakub@redhat.com> wrote: > > On Mon, Apr 22, 2024 at 11:14:35PM -0400, Jason Merrill wrote: > > > > The following testcase regressed with Marek's r14-5979 change, > > > > when pr113208_0.C is compiled where the ctor is marked constexpr, > > > > we no longer perform this optimization, where > > > > _ZN6vectorI12QualityValueEC2ERKS1_ was emitted in the > > > > _ZN6vectorI12QualityValueEC5ERKS1_ comdat group and > > > > _ZN6vectorI12QualityValueEC1ERKS1_ was made an alias to it, > > > > instead we emit _ZN6vectorI12QualityValueEC2ERKS1_ in > > > > _ZN6vectorI12QualityValueEC2ERKS1_ comdat group and the same > > > > content _ZN6vectorI12QualityValueEC1ERKS1_ as separate symbol in > > > > _ZN6vectorI12QualityValueEC1ERKS1_ comdat group. > > > > This seems like an ABI bug that could use a non-LTO testcase. > > Well, except for the issues it causes to LTO I think it is compatible, > worst case we get the body of the ctor duplicated in the executable > and the linker picks some of the weak symbols as the symbol definitions. > Anyway, I've added a non-LTO testcase for that in the patch below. > > > Hmm, cloning the bodies and then discarding them later seems like more extra > > work than creating the cgraph nodes. > > So, I've tried to handle that in tentative_decl_linkage, like that function > already handles functions declared inline except for implicit template > instantiations. If we expect that import_export_decl will do comdat_linkage > for the ctor later on do it right away. > > That fixes the testcases too, but seems to regress > +FAIL: libstdc++-abi/abi_check > on both x86_64-linux and i686-linux, in each case 8 symbols disappeared from > libstdc++.so.6: > _ZNSt12__shared_ptrINSt10filesystem7__cxx1128recursive_directory_iterator10_Dir_stackELN9__gnu_cxx12_Lock_policyE2EEC1Ev > _ZNSt12__shared_ptrINSt10filesystem4_DirELN9__gnu_cxx12_Lock_policyE2EEC1Ev > _ZNSt12__shared_ptrINSt10filesystem28recursive_directory_iterator10_Dir_stackELN9__gnu_cxx12_Lock_policyE2EEC1Ev > _ZNSt12__shared_ptrINSt10filesystem4_DirELN9__gnu_cxx12_Lock_policyE2EEC2Ev > _ZNSt12__shared_ptrINSt10filesystem7__cxx114_DirELN9__gnu_cxx12_Lock_policyE2EEC1Ev > _ZNSt12__shared_ptrINSt10filesystem7__cxx1128recursive_directory_iterator10_Dir_stackELN9__gnu_cxx12_Lock_policyE2EEC2Ev > _ZNSt12__shared_ptrINSt10filesystem28recursive_directory_iterator10_Dir_stackELN9__gnu_cxx12_Lock_policyE2EEC2Ev > _ZNSt12__shared_ptrINSt10filesystem7__cxx114_DirELN9__gnu_cxx12_Lock_policyE2EEC2Ev > > Will need to study why that happened, it might be that it was ok because > I think the filesystem stuff is unlike the rest compiled with no exported > templates, but would need at least some hacks in libstdc++ to preserve > previously exported symbols. There are explicit instantiation definitions that should instantiate those types: src/c++17/fs_dir.cc:template class std::__shared_ptr<fs::_Dir>; src/c++17/fs_dir.cc:template class std::__shared_ptr<fs::recursive_directory_iterator::_Dir_stack>; src/c++17/fs_path.cc:template class std::__shared_ptr<const fs::filesystem_error::_Impl>; So the missing symbols should be present in cow-fs_dir.o and cow-fs_path.o > Still, feels like a risky change this late if it wouldn't break ABI of other > libraries. > > 2024-04-23 Jakub Jelinek <jakub@redhat.com> > > PR lto/113208 > * decl2.cc (tentative_decl_linkage): Use comdat_linkage also > for implicit instantiations of maybe in charge ctors/dtors > if -fimplicit-templates or -fimplicit-inline-templates and > -fweak and target supports aliases. > > * g++.dg/abi/comdat2.C: New test. > * g++.dg/lto/pr113208_0.C: New test. > * g++.dg/lto/pr113208_1.C: New file. > * g++.dg/lto/pr113208.h: New file. > > --- gcc/cp/decl2.cc.jj 2024-04-22 15:16:55.328548807 +0200 > +++ gcc/cp/decl2.cc 2024-04-23 09:52:18.993250442 +0200 > @@ -3314,7 +3314,16 @@ tentative_decl_linkage (tree decl) > to mark the functions at this point. */ > if (DECL_DECLARED_INLINE_P (decl) > && (!DECL_IMPLICIT_INSTANTIATION (decl) > - || DECL_DEFAULTED_FN (decl))) > + || DECL_DEFAULTED_FN (decl) > + /* For implicit instantiations of cdtors, > + if import_export_decl would use comdat linkage, > + make sure to use it right away, so that maybe_clone_body > + can use aliases. See PR113208. */ > + || (DECL_MAYBE_IN_CHARGE_CDTOR_P (decl) > + && (flag_implicit_templates > + || flag_implicit_inline_templates) > + && flag_weak > + && TARGET_SUPPORTS_ALIASES))) > { > /* This function must have external linkage, as > otherwise DECL_INTERFACE_KNOWN would have been > --- gcc/testsuite/g++.dg/abi/comdat2.C.jj 2024-04-23 10:04:28.485964610 +0200 > +++ gcc/testsuite/g++.dg/abi/comdat2.C 2024-04-23 10:05:24.757171194 +0200 > @@ -0,0 +1,26 @@ > +// PR lto/113208 > +// { dg-do compile { target { c++11 && { *-*-*gnu* } } } } > +// { dg-additional-options "-O2 -fkeep-inline-functions" } > +// { dg-final { scan-assembler "_ZN1BI1CEC5ERKS1_,comdat" } } > +// { dg-final { scan-assembler-not "_ZN1BI1CEC1ERKS1_,comdat" } } > +// { dg-final { scan-assembler-not "_ZN1BI1CEC2ERKS1_,comdat" } } > + > +template <typename T> > +struct A { > + int foo () const; > + A (int, int); > +}; > +template <typename T> > +struct B : A<T> { > + constexpr B (const B &x) : A<T> (1, x.foo ()) {} > + B () : A<T> (1, 2) {} > +}; > +struct C; > +struct D : B<C> {}; > +void bar (D); > + > +void > +baz (D x) > +{ > + bar (x); > +} > --- gcc/testsuite/g++.dg/lto/pr113208.h.jj 2024-04-23 09:44:07.523179110 +0200 > +++ gcc/testsuite/g++.dg/lto/pr113208.h 2024-04-23 09:44:07.523179110 +0200 > @@ -0,0 +1,10 @@ > +template <typename _Tp> struct _Vector_base { > + int g() const; > + _Vector_base(int, int); > +}; > +template <typename _Tp> > +struct vector : _Vector_base<_Tp> { > + CONSTEXPR vector(const vector &__x) > + : _Vector_base<_Tp>(1, __x.g()) {} > + vector() : _Vector_base<_Tp>(1, 2) {} > +}; > --- gcc/testsuite/g++.dg/lto/pr113208_1.C.jj 2024-04-23 09:44:07.523179110 +0200 > +++ gcc/testsuite/g++.dg/lto/pr113208_1.C 2024-04-23 09:44:07.523179110 +0200 > @@ -0,0 +1,6 @@ > +#define CONSTEXPR > +#include "pr113208.h" > + > +struct QualityValue; > +vector<QualityValue> values1; > +vector<QualityValue> values{values1}; > --- gcc/testsuite/g++.dg/lto/pr113208_0.C.jj 2024-04-23 09:44:07.523179110 +0200 > +++ gcc/testsuite/g++.dg/lto/pr113208_0.C 2024-04-23 09:44:07.523179110 +0200 > @@ -0,0 +1,13 @@ > +// { dg-lto-do link } > +// { dg-lto-options { {-O1 -std=c++20 -flto}} } > +// { dg-extra-ld-options "-r -nostdlib -flinker-output=nolto-rel" } > +// { dg-require-linker-plugin "" } > + > +#define CONSTEXPR constexpr > +#include "pr113208.h" > + > +struct QualityValue; > +struct k : vector<QualityValue> {}; > + > +void m(k); > +void n(k i) { m(i); } > > > Jakub > ^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH] c++, v3: Retry the aliasing of base/complete cdtor optimization at import_export_decl time [PR113208] 2024-04-24 9:16 ` Jonathan Wakely @ 2024-04-24 16:16 ` Jakub Jelinek 2024-04-24 22:39 ` Jason Merrill 0 siblings, 1 reply; 25+ messages in thread From: Jakub Jelinek @ 2024-04-24 16:16 UTC (permalink / raw) To: Jonathan Wakely, Jason Merrill, Jan Hubicka Cc: gcc-patches, Richard Biener, Patrick Palka On Wed, Apr 24, 2024 at 10:16:05AM +0100, Jonathan Wakely wrote: > > That fixes the testcases too, but seems to regress > > +FAIL: libstdc++-abi/abi_check > There are explicit instantiation definitions that should instantiate > those types: > > src/c++17/fs_dir.cc:template class std::__shared_ptr<fs::_Dir>; > src/c++17/fs_dir.cc:template class > std::__shared_ptr<fs::recursive_directory_iterator::_Dir_stack>; > src/c++17/fs_path.cc:template class std::__shared_ptr<const > fs::filesystem_error::_Impl>; > > So the missing symbols should be present in cow-fs_dir.o and cow-fs_path.o So this boils down to (cvise reduced): namespace __gnu_cxx { enum _Lock_policy { _S_single, _S_mutex, _S_atomic } const __default_lock_policy = _S_atomic; } namespace std { using __gnu_cxx::__default_lock_policy; using __gnu_cxx::_Lock_policy; template <typename, _Lock_policy = __default_lock_policy> struct __shared_ptr { constexpr __shared_ptr() {} }; namespace filesystem { struct _Dir; struct directory_iterator { __shared_ptr<_Dir> _M_dir; }; void end() { directory_iterator(); } } extern template class __shared_ptr<filesystem::_Dir>; } namespace fs = std::filesystem; template class std::__shared_ptr<fs::_Dir>; at -O2, previously it used to emit _ZNSt12__shared_ptrINSt10filesystem4_DirELN9__gnu_cxx12_Lock_policyE2EEC2Ev _ZNSt12__shared_ptrINSt10filesystem4_DirELN9__gnu_cxx12_Lock_policyE1EEC2Ev but now no longer does with the yesterday posted PR113208 patch. The following updated patch fixes that by calling note_vague_linkage_fn for the cdtor clones from maybe_clone_body if the flags suggest that the maybe-in-charge cdtor had tentative_decl_linkage -> note_vague_linkage_fn called too. And then I've noticed that in some cases the updated comdat group set by maybe_clone_body (*C5* or *D5*) was then overwritten again by maybe_make_one_only. So the patch tweaks cxx_comdat_group, so that when some comdat group has been chosen already it doesn't try to use some different one. Bootstrapped/regtested on x86_64-linux and i686-linux, this one doesn't regress anything unlike the earlier patch. 2024-04-24 Jakub Jelinek <jakub@redhat.com> PR lto/113208 * decl2.cc (tentative_decl_linkage): Use comdat_linkage also for implicit instantiations of maybe in charge ctors/dtors if -fimplicit-templates or -fimplicit-inline-templates and -fweak and target supports aliases. * optimize.cc (maybe_clone_body): Call note_vague_linkage_fn on clone if fn has DECL_INTERFACE_KNOWN, DECL_NOT_REALLY_EXTERN and DECL_DEFER_OUTPUT flags set. * decl.cc (cxx_comdat_group): For DECL_CLONED_FUNCTION_P functions if SUPPORTS_ONE_ONLY return DECL_COMDAT_GROUP if already set. * g++.dg/abi/comdat2.C: New test. * g++.dg/abi/comdat3.C: New test. * g++.dg/lto/pr113208_0.C: New test. * g++.dg/lto/pr113208_1.C: New file. * g++.dg/lto/pr113208.h: New file. --- gcc/cp/decl2.cc.jj 2024-04-23 14:49:41.933186265 +0200 +++ gcc/cp/decl2.cc 2024-04-24 15:17:09.043625729 +0200 @@ -3314,7 +3314,16 @@ tentative_decl_linkage (tree decl) to mark the functions at this point. */ if (DECL_DECLARED_INLINE_P (decl) && (!DECL_IMPLICIT_INSTANTIATION (decl) - || DECL_DEFAULTED_FN (decl))) + || DECL_DEFAULTED_FN (decl) + /* For implicit instantiations of cdtors, + if import_export_decl would use comdat linkage, + make sure to use it right away, so that maybe_clone_body + can use aliases. See PR113208. */ + || (DECL_MAYBE_IN_CHARGE_CDTOR_P (decl) + && (flag_implicit_templates + || flag_implicit_inline_templates) + && flag_weak + && TARGET_SUPPORTS_ALIASES))) { /* This function must have external linkage, as otherwise DECL_INTERFACE_KNOWN would have been --- gcc/cp/optimize.cc.jj 2024-04-24 13:44:26.456634100 +0200 +++ gcc/cp/optimize.cc 2024-04-24 14:46:13.050371557 +0200 @@ -717,6 +717,10 @@ maybe_clone_body (tree fn) else expand_or_defer_fn (clone); first = false; + if (DECL_INTERFACE_KNOWN (fn) + && DECL_NOT_REALLY_EXTERN (fn) + && DECL_DEFER_OUTPUT (fn)) + note_vague_linkage_fn (clone); } pop_from_top_level (); --- gcc/cp/decl.cc.jj 2024-04-23 08:31:05.515161033 +0200 +++ gcc/cp/decl.cc 2024-04-24 15:15:34.401951491 +0200 @@ -19254,6 +19254,14 @@ cxx_comdat_group (tree decl) else break; } + /* If a ctor/dtor has already set the comdat group by + maybe_clone_body, don't override it. */ + if (SUPPORTS_ONE_ONLY + && TREE_CODE (decl) == FUNCTION_DECL + && DECL_CLONED_FUNCTION_P (decl) + && SUPPORTS_ONE_ONLY) + if (tree comdat = DECL_COMDAT_GROUP (decl)) + return comdat; } return decl; --- gcc/testsuite/g++.dg/abi/comdat2.C.jj 2024-04-24 13:44:11.054849035 +0200 +++ gcc/testsuite/g++.dg/abi/comdat2.C 2024-04-24 13:44:11.054849035 +0200 @@ -0,0 +1,26 @@ +// PR lto/113208 +// { dg-do compile { target { c++11 && { *-*-*gnu* } } } } +// { dg-additional-options "-O2 -fkeep-inline-functions" } +// { dg-final { scan-assembler "_ZN1BI1CEC5ERKS1_,comdat" } } +// { dg-final { scan-assembler-not "_ZN1BI1CEC1ERKS1_,comdat" } } +// { dg-final { scan-assembler-not "_ZN1BI1CEC2ERKS1_,comdat" } } + +template <typename T> +struct A { + int foo () const; + A (int, int); +}; +template <typename T> +struct B : A<T> { + constexpr B (const B &x) : A<T> (1, x.foo ()) {} + B () : A<T> (1, 2) {} +}; +struct C; +struct D : B<C> {}; +void bar (D); + +void +baz (D x) +{ + bar (x); +} --- gcc/testsuite/g++.dg/abi/comdat3.C.jj 2024-04-24 15:23:58.829885362 +0200 +++ gcc/testsuite/g++.dg/abi/comdat3.C 2024-04-24 15:24:46.088223353 +0200 @@ -0,0 +1,22 @@ +// PR lto/113208 +// { dg-do compile { target { c++11 && { *-*-*gnu* } } } } +// { dg-additional-options "-O2" } +// { dg-final { scan-assembler "_ZN1M1SINS_1P1TELN1N1LE2EEC5Ev,comdat" } } +// { dg-final { scan-assembler-not "_ZN1M1SINS_1P1TELN1N1LE2EEC1Ev,comdat" } } +// { dg-final { scan-assembler-not "_ZN1M1SINS_1P1TELN1N1LE2EEC2Ev,comdat" } } + +namespace N { enum L { L1, L2, L3 } const O = L3; } +namespace M { + using N::O; + using N::L; + template <typename, L = O> + struct S { constexpr S () {} }; + namespace P { + struct T; + struct U { S<T> u; }; + void foo () { U (); } + } + extern template class S<P::T>; +} +namespace p = M::P; +template class M::S<p::T>; --- gcc/testsuite/g++.dg/lto/pr113208_0.C.jj 2024-04-24 13:43:17.887591040 +0200 +++ gcc/testsuite/g++.dg/lto/pr113208_0.C 2024-04-24 13:44:11.055849021 +0200 @@ -0,0 +1,13 @@ +// { dg-lto-do link } +// { dg-lto-options { {-O1 -std=c++20 -flto}} } +// { dg-extra-ld-options "-r -nostdlib -flinker-output=nolto-rel" } +// { dg-require-linker-plugin "" } + +#define CONSTEXPR constexpr +#include "pr113208.h" + +struct QualityValue; +struct k : vector<QualityValue> {}; + +void m(k); +void n(k i) { m(i); } --- gcc/testsuite/g++.dg/lto/pr113208_1.C.jj 2024-04-24 13:43:17.887591040 +0200 +++ gcc/testsuite/g++.dg/lto/pr113208_1.C 2024-04-24 13:44:11.055849021 +0200 @@ -0,0 +1,6 @@ +#define CONSTEXPR +#include "pr113208.h" + +struct QualityValue; +vector<QualityValue> values1; +vector<QualityValue> values{values1}; --- gcc/testsuite/g++.dg/lto/pr113208.h.jj 2024-04-24 13:43:17.887591040 +0200 +++ gcc/testsuite/g++.dg/lto/pr113208.h 2024-04-24 13:44:11.054849035 +0200 @@ -0,0 +1,10 @@ +template <typename _Tp> struct _Vector_base { + int g() const; + _Vector_base(int, int); +}; +template <typename _Tp> +struct vector : _Vector_base<_Tp> { + CONSTEXPR vector(const vector &__x) + : _Vector_base<_Tp>(1, __x.g()) {} + vector() : _Vector_base<_Tp>(1, 2) {} +}; Jakub ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] c++, v3: Retry the aliasing of base/complete cdtor optimization at import_export_decl time [PR113208] 2024-04-24 16:16 ` [PATCH] c++, v3: " Jakub Jelinek @ 2024-04-24 22:39 ` Jason Merrill 2024-04-24 22:47 ` Jakub Jelinek 0 siblings, 1 reply; 25+ messages in thread From: Jason Merrill @ 2024-04-24 22:39 UTC (permalink / raw) To: Jakub Jelinek, Jonathan Wakely, Jan Hubicka Cc: gcc-patches, Richard Biener, Patrick Palka On 4/24/24 09:16, Jakub Jelinek wrote: > On Wed, Apr 24, 2024 at 10:16:05AM +0100, Jonathan Wakely wrote: >>> That fixes the testcases too, but seems to regress >>> +FAIL: libstdc++-abi/abi_check > >> There are explicit instantiation definitions that should instantiate >> those types: >> >> src/c++17/fs_dir.cc:template class std::__shared_ptr<fs::_Dir>; >> src/c++17/fs_dir.cc:template class >> std::__shared_ptr<fs::recursive_directory_iterator::_Dir_stack>; >> src/c++17/fs_path.cc:template class std::__shared_ptr<const >> fs::filesystem_error::_Impl>; >> >> So the missing symbols should be present in cow-fs_dir.o and cow-fs_path.o > > So this boils down to (cvise reduced): > namespace __gnu_cxx { enum _Lock_policy { _S_single, _S_mutex, _S_atomic } const __default_lock_policy = _S_atomic; } > namespace std { > using __gnu_cxx::__default_lock_policy; > using __gnu_cxx::_Lock_policy; > template <typename, _Lock_policy = __default_lock_policy> struct __shared_ptr { constexpr __shared_ptr() {} }; > namespace filesystem { > struct _Dir; > struct directory_iterator { __shared_ptr<_Dir> _M_dir; }; > void end() { directory_iterator(); } } > extern template class __shared_ptr<filesystem::_Dir>; > } > namespace fs = std::filesystem; > template class std::__shared_ptr<fs::_Dir>; > at -O2, previously it used to emit > _ZNSt12__shared_ptrINSt10filesystem4_DirELN9__gnu_cxx12_Lock_policyE2EEC2Ev > _ZNSt12__shared_ptrINSt10filesystem4_DirELN9__gnu_cxx12_Lock_policyE1EEC2Ev > but now no longer does with the yesterday posted PR113208 patch. > > The following updated patch fixes that by calling note_vague_linkage_fn for > the cdtor clones from maybe_clone_body if the flags suggest that the > maybe-in-charge cdtor had tentative_decl_linkage -> note_vague_linkage_fn > called too. And then I've noticed that in some cases the updated comdat > group set by maybe_clone_body (*C5* or *D5*) was then overwritten again by > maybe_make_one_only. So the patch tweaks cxx_comdat_group, so that when > some comdat group has been chosen already it doesn't try to use some > different one. > > Bootstrapped/regtested on x86_64-linux and i686-linux, this one doesn't > regress anything unlike the earlier patch. > > 2024-04-24 Jakub Jelinek <jakub@redhat.com> > > PR lto/113208 > * decl2.cc (tentative_decl_linkage): Use comdat_linkage also > for implicit instantiations of maybe in charge ctors/dtors > if -fimplicit-templates or -fimplicit-inline-templates and > -fweak and target supports aliases. > * optimize.cc (maybe_clone_body): Call note_vague_linkage_fn > on clone if fn has DECL_INTERFACE_KNOWN, DECL_NOT_REALLY_EXTERN > and DECL_DEFER_OUTPUT flags set. > * decl.cc (cxx_comdat_group): For DECL_CLONED_FUNCTION_P > functions if SUPPORTS_ONE_ONLY return DECL_COMDAT_GROUP if already > set. > > * g++.dg/abi/comdat2.C: New test. > * g++.dg/abi/comdat3.C: New test. > * g++.dg/lto/pr113208_0.C: New test. > * g++.dg/lto/pr113208_1.C: New file. > * g++.dg/lto/pr113208.h: New file. > > --- gcc/cp/decl2.cc.jj 2024-04-23 14:49:41.933186265 +0200 > +++ gcc/cp/decl2.cc 2024-04-24 15:17:09.043625729 +0200 > @@ -3314,7 +3314,16 @@ tentative_decl_linkage (tree decl) > to mark the functions at this point. */ > if (DECL_DECLARED_INLINE_P (decl) > && (!DECL_IMPLICIT_INSTANTIATION (decl) > - || DECL_DEFAULTED_FN (decl))) > + || DECL_DEFAULTED_FN (decl) > + /* For implicit instantiations of cdtors, > + if import_export_decl would use comdat linkage, > + make sure to use it right away, so that maybe_clone_body > + can use aliases. See PR113208. */ > + || (DECL_MAYBE_IN_CHARGE_CDTOR_P (decl) > + && (flag_implicit_templates > + || flag_implicit_inline_templates) > + && flag_weak > + && TARGET_SUPPORTS_ALIASES))) It seems wrong to set DECL_INTERFACE_KNOWN for cdtors that might get an explicit instantiation later, and likewise for comdat_linkage when !flag_weak; instead of adding this condition to the if, how about adding an else like > else if (DECL_MAYBE_IN_CHARGE_CDTOR_P (decl)) > /* For implicit instantiations of cdtors, > if import_export_decl would use comdat linkage, > make sure to use it right away, so that maybe_clone_body > can use aliases. See PR113208. */ > maybe_make_one_only (decl); ? Jason ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] c++, v3: Retry the aliasing of base/complete cdtor optimization at import_export_decl time [PR113208] 2024-04-24 22:39 ` Jason Merrill @ 2024-04-24 22:47 ` Jakub Jelinek 2024-04-25 0:43 ` Jason Merrill 0 siblings, 1 reply; 25+ messages in thread From: Jakub Jelinek @ 2024-04-24 22:47 UTC (permalink / raw) To: Jason Merrill Cc: Jonathan Wakely, Jan Hubicka, gcc-patches, Richard Biener, Patrick Palka On Wed, Apr 24, 2024 at 06:39:33PM -0400, Jason Merrill wrote: > > --- gcc/cp/decl2.cc.jj 2024-04-23 14:49:41.933186265 +0200 > > +++ gcc/cp/decl2.cc 2024-04-24 15:17:09.043625729 +0200 > > @@ -3314,7 +3314,16 @@ tentative_decl_linkage (tree decl) > > to mark the functions at this point. */ > > if (DECL_DECLARED_INLINE_P (decl) > > && (!DECL_IMPLICIT_INSTANTIATION (decl) > > - || DECL_DEFAULTED_FN (decl))) > > + || DECL_DEFAULTED_FN (decl) > > + /* For implicit instantiations of cdtors, > > + if import_export_decl would use comdat linkage, > > + make sure to use it right away, so that maybe_clone_body > > + can use aliases. See PR113208. */ > > + || (DECL_MAYBE_IN_CHARGE_CDTOR_P (decl) > > + && (flag_implicit_templates > > + || flag_implicit_inline_templates) > > + && flag_weak > > + && TARGET_SUPPORTS_ALIASES))) > > It seems wrong to set DECL_INTERFACE_KNOWN for cdtors that might get an > explicit instantiation later, and likewise for comdat_linkage when > !flag_weak; instead of adding this condition to the if, how about adding an > else like > > > else if (DECL_MAYBE_IN_CHARGE_CDTOR_P (decl)) > > /* For implicit instantiations of cdtors, > > if import_export_decl would use comdat linkage, > > make sure to use it right away, so that maybe_clone_body > > can use aliases. See PR113208. */ > > maybe_make_one_only (decl); > > ? Then can_alias_cdtor would return false, because it ends with: /* Don't use aliases for weak/linkonce definitions unless we can put both symbols in the same COMDAT group. */ return (DECL_INTERFACE_KNOWN (fn) && (SUPPORTS_ONE_ONLY || !DECL_WEAK (fn)) && (!DECL_ONE_ONLY (fn) || (HAVE_COMDAT_GROUP && DECL_WEAK (fn)))); Should we change that DECL_INTERFACE_KNOWN (fn) in there to (DECL_INTERFACE_KNOWN (fn) || something) then and what that something should be? HAVE_COMDAT_GROUP && DECL_ONE_ONLY (fn)? Jakub ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] c++, v3: Retry the aliasing of base/complete cdtor optimization at import_export_decl time [PR113208] 2024-04-24 22:47 ` Jakub Jelinek @ 2024-04-25 0:43 ` Jason Merrill 2024-04-25 12:02 ` [PATCH] c++, v4: " Jakub Jelinek 0 siblings, 1 reply; 25+ messages in thread From: Jason Merrill @ 2024-04-25 0:43 UTC (permalink / raw) To: Jakub Jelinek Cc: Jonathan Wakely, Jan Hubicka, gcc-patches, Richard Biener, Patrick Palka On 4/24/24 15:47, Jakub Jelinek wrote: > On Wed, Apr 24, 2024 at 06:39:33PM -0400, Jason Merrill wrote: >>> --- gcc/cp/decl2.cc.jj 2024-04-23 14:49:41.933186265 +0200 >>> +++ gcc/cp/decl2.cc 2024-04-24 15:17:09.043625729 +0200 >>> @@ -3314,7 +3314,16 @@ tentative_decl_linkage (tree decl) >>> to mark the functions at this point. */ >>> if (DECL_DECLARED_INLINE_P (decl) >>> && (!DECL_IMPLICIT_INSTANTIATION (decl) >>> - || DECL_DEFAULTED_FN (decl))) >>> + || DECL_DEFAULTED_FN (decl) >>> + /* For implicit instantiations of cdtors, >>> + if import_export_decl would use comdat linkage, >>> + make sure to use it right away, so that maybe_clone_body >>> + can use aliases. See PR113208. */ >>> + || (DECL_MAYBE_IN_CHARGE_CDTOR_P (decl) >>> + && (flag_implicit_templates >>> + || flag_implicit_inline_templates) >>> + && flag_weak >>> + && TARGET_SUPPORTS_ALIASES))) >> >> It seems wrong to set DECL_INTERFACE_KNOWN for cdtors that might get an >> explicit instantiation later, and likewise for comdat_linkage when >> !flag_weak; instead of adding this condition to the if, how about adding an >> else like >> >>> else if (DECL_MAYBE_IN_CHARGE_CDTOR_P (decl)) >>> /* For implicit instantiations of cdtors, >>> if import_export_decl would use comdat linkage, >>> make sure to use it right away, so that maybe_clone_body >>> can use aliases. See PR113208. */ >>> maybe_make_one_only (decl); >> >> ? > > Then can_alias_cdtor would return false, because it ends with: > /* Don't use aliases for weak/linkonce definitions unless we can put both > symbols in the same COMDAT group. */ > return (DECL_INTERFACE_KNOWN (fn) > && (SUPPORTS_ONE_ONLY || !DECL_WEAK (fn)) > && (!DECL_ONE_ONLY (fn) > || (HAVE_COMDAT_GROUP && DECL_WEAK (fn)))); > Should we change that DECL_INTERFACE_KNOWN (fn) in there to > (DECL_INTERFACE_KNOWN (fn) || something) then and what that > something should be? HAVE_COMDAT_GROUP && DECL_ONE_ONLY (fn)? Yes, I think reorganize to ((DECL_INTERFACE_KNOWN (fn) && !DECL_WEAK (fn) && !DECL_ONE_ONLY (fn)) || (HAVE_COMDAT_GROUP && DECL_ONE_ONLY (fn)) Jason ^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH] c++, v4: Retry the aliasing of base/complete cdtor optimization at import_export_decl time [PR113208] 2024-04-25 0:43 ` Jason Merrill @ 2024-04-25 12:02 ` Jakub Jelinek 2024-04-25 14:22 ` Jakub Jelinek 0 siblings, 1 reply; 25+ messages in thread From: Jakub Jelinek @ 2024-04-25 12:02 UTC (permalink / raw) To: Jason Merrill Cc: Jonathan Wakely, Jan Hubicka, gcc-patches, Richard Biener, Patrick Palka On Wed, Apr 24, 2024 at 08:43:46PM -0400, Jason Merrill wrote: > > Then can_alias_cdtor would return false, because it ends with: > > /* Don't use aliases for weak/linkonce definitions unless we can put both > > symbols in the same COMDAT group. */ > > return (DECL_INTERFACE_KNOWN (fn) > > && (SUPPORTS_ONE_ONLY || !DECL_WEAK (fn)) > > && (!DECL_ONE_ONLY (fn) > > || (HAVE_COMDAT_GROUP && DECL_WEAK (fn)))); > > Should we change that DECL_INTERFACE_KNOWN (fn) in there to > > (DECL_INTERFACE_KNOWN (fn) || something) then and what that > > something should be? HAVE_COMDAT_GROUP && DECL_ONE_ONLY (fn)? > > Yes, I think reorganize to > > ((DECL_INTERFACE_KNOWN (fn) && !DECL_WEAK (fn) && !DECL_ONE_ONLY (fn)) > || (HAVE_COMDAT_GROUP && DECL_ONE_ONLY (fn)) I've tried the following patch, but unfortunately that lead to large number of regressions: +FAIL: g++.dg/coroutines/torture/co-yield-04-complex-local-state.C (test for excess errors) +FAIL: g++.dg/coroutines/torture/func-params-08.C (test for excess errors) +FAIL: g++.dg/coroutines/torture/func-params-09-awaitable-parms.C (test for excess errors) +FAIL: g++.dg/cpp0x/constexpr-initlist.C -std=c++11 (test for excess errors) +FAIL: g++.dg/cpp0x/constexpr-initlist.C -std=c++14 (test for excess errors) +FAIL: g++.dg/cpp0x/constexpr-initlist.C -std=c++17 (test for excess errors) +FAIL: g++.dg/cpp0x/constexpr-initlist.C -std=c++20 (test for excess errors) +FAIL: g++.dg/cpp0x/constexpr-initlist.C -std=c++23 (test for excess errors) +FAIL: g++.dg/cpp0x/constexpr-initlist.C -std=c++26 (test for excess errors) +FAIL: g++.dg/cpp0x/initlist25.C -std=c++11 (test for excess errors) +FAIL: g++.dg/cpp0x/initlist25.C -std=c++14 (test for excess errors) +FAIL: g++.dg/cpp0x/initlist25.C -std=c++17 (test for excess errors) +FAIL: g++.dg/cpp0x/initlist25.C -std=c++20 (test for excess errors) +FAIL: g++.dg/cpp0x/initlist25.C -std=c++23 (test for excess errors) +FAIL: g++.dg/cpp0x/initlist25.C -std=c++26 (test for excess errors) +FAIL: g++.dg/cpp1y/pr95226.C -std=c++20 (test for excess errors) +FAIL: g++.dg/cpp1y/pr95226.C -std=c++23 (test for excess errors) +FAIL: g++.dg/cpp1y/pr95226.C -std=c++26 (test for excess errors) +FAIL: g++.dg/cpp1z/decomp12.C -std=c++23 (test for excess errors) +FAIL: g++.dg/cpp1z/decomp12.C -std=c++26 (test for excess errors) +FAIL: g++.dg/cpp1z/eval-order2.C -std=c++20 (test for excess errors) +FAIL: g++.dg/cpp1z/eval-order2.C -std=c++23 (test for excess errors) +FAIL: g++.dg/cpp1z/eval-order2.C -std=c++26 (test for excess errors) +FAIL: g++.dg/cpp2a/srcloc17.C -std=c++20 (test for excess errors) +FAIL: g++.dg/cpp2a/srcloc17.C -std=c++23 (test for excess errors) +FAIL: g++.dg/cpp2a/srcloc17.C -std=c++26 (test for excess errors) +FAIL: g++.old-deja/g++.jason/template31.C -std=c++20 (test for excess errors) +FAIL: g++.old-deja/g++.jason/template31.C -std=c++23 (test for excess errors) +FAIL: g++.old-deja/g++.jason/template31.C -std=c++26 (test for excess errors) +FAIL: 20_util/unique_ptr/creation/for_overwrite.cc -std=gnu++26 (test for excess errors) +FAIL: 23_containers/span/cons_1_assert_neg.cc -std=gnu++20 (test for excess errors) +FAIL: 23_containers/span/cons_1_assert_neg.cc -std=gnu++26 (test for excess errors) +FAIL: 23_containers/span/cons_2_assert_neg.cc -std=gnu++20 (test for excess errors) +FAIL: 23_containers/span/cons_2_assert_neg.cc -std=gnu++26 (test for excess errors) +FAIL: std/ranges/repeat/1.cc -std=gnu++23 (test for excess errors) +FAIL: std/ranges/repeat/1.cc -std=gnu++26 (test for excess errors) Errors are like: func-params-08.C:(.text._ZNSt12_Vector_baseIiSaIiEEC2Ev[_ZNSt12_Vector_baseIiSaIiEEC5Ev]+0x14): undefined reference to `_ZNSt12_Vector_baseIiSaIiEE12_Vector_implC1EvQ26is_default_constructible_vIN9__gnu_cxx14__alloc_traitsIT0_NS5_10value_typeEE6rebindIT_E5otherEE' Though, libstdc++.so.6 abilist is the same. Trying to debug it now. 2024-04-24 Jakub Jelinek <jakub@redhat.com> Jason Merrill <jason@redhat.com> PR lto/113208 * decl2.cc (tentative_decl_linkage): Call maybe_make_one_only for implicit instantiations of maybe in charge ctors/dtors declared inline. * optimize.cc (can_alias_cdtor): Adjust condition, for HAVE_COMDAT_GROUP && DECL_ONE_ONLY && DECL_WEAK return true even if not DECL_INTERFACE_KNOWN. * decl.cc (cxx_comdat_group): For DECL_CLONED_FUNCTION_P functions if SUPPORTS_ONE_ONLY return DECL_COMDAT_GROUP if already set. * g++.dg/abi/comdat2.C: New test. * g++.dg/abi/comdat3.C: New test. * g++.dg/abi/comdat4.C: New test. * g++.dg/abi/comdat5.C: New test. * g++.dg/lto/pr113208_0.C: New test. * g++.dg/lto/pr113208_1.C: New file. * g++.dg/lto/pr113208.h: New file. --- gcc/cp/decl2.cc.jj 2024-04-24 18:28:22.299513620 +0200 +++ gcc/cp/decl2.cc 2024-04-25 10:04:18.049476567 +0200 @@ -3312,16 +3312,23 @@ tentative_decl_linkage (tree decl) linkage of all functions, and as that causes writes to the data mapped in from the PCH file, it's advantageous to mark the functions at this point. */ - if (DECL_DECLARED_INLINE_P (decl) - && (!DECL_IMPLICIT_INSTANTIATION (decl) - || DECL_DEFAULTED_FN (decl))) + if (DECL_DECLARED_INLINE_P (decl)) { - /* This function must have external linkage, as - otherwise DECL_INTERFACE_KNOWN would have been - set. */ - gcc_assert (TREE_PUBLIC (decl)); - comdat_linkage (decl); - DECL_INTERFACE_KNOWN (decl) = 1; + if (!DECL_IMPLICIT_INSTANTIATION (decl) + || DECL_DEFAULTED_FN (decl)) + { + /* This function must have external linkage, as + otherwise DECL_INTERFACE_KNOWN would have been + set. */ + gcc_assert (TREE_PUBLIC (decl)); + comdat_linkage (decl); + DECL_INTERFACE_KNOWN (decl) = 1; + } + else if (DECL_MAYBE_IN_CHARGE_CDTOR_P (decl)) + /* For implicit instantiations of cdtors try to make + it comdat, so that maybe_clone_body can use aliases. + See PR113208. */ + maybe_make_one_only (decl); } } else if (VAR_P (decl)) --- gcc/cp/optimize.cc.jj 2024-04-25 09:47:24.936736833 +0200 +++ gcc/cp/optimize.cc 2024-04-25 10:19:25.338332564 +0200 @@ -220,10 +220,8 @@ can_alias_cdtor (tree fn) gcc_assert (DECL_MAYBE_IN_CHARGE_CDTOR_P (fn)); /* Don't use aliases for weak/linkonce definitions unless we can put both symbols in the same COMDAT group. */ - return (DECL_INTERFACE_KNOWN (fn) - && (SUPPORTS_ONE_ONLY || !DECL_WEAK (fn)) - && (!DECL_ONE_ONLY (fn) - || (HAVE_COMDAT_GROUP && DECL_WEAK (fn)))); + return (DECL_WEAK (fn) ? (HAVE_COMDAT_GROUP && DECL_ONE_ONLY (fn)) + : (DECL_INTERFACE_KNOWN (fn) && !DECL_ONE_ONLY (fn))); } /* FN is a [cd]tor, fns is a pointer to an array of length 3. Fill fns --- gcc/cp/decl.cc.jj 2024-04-24 18:28:22.231514532 +0200 +++ gcc/cp/decl.cc 2024-04-25 09:47:06.855991224 +0200 @@ -19254,6 +19254,14 @@ cxx_comdat_group (tree decl) else break; } + /* If a ctor/dtor has already set the comdat group by + maybe_clone_body, don't override it. */ + if (SUPPORTS_ONE_ONLY + && TREE_CODE (decl) == FUNCTION_DECL + && DECL_CLONED_FUNCTION_P (decl) + && SUPPORTS_ONE_ONLY) + if (tree comdat = DECL_COMDAT_GROUP (decl)) + return comdat; } return decl; --- gcc/testsuite/g++.dg/abi/comdat2.C.jj 2024-04-25 09:47:06.855991224 +0200 +++ gcc/testsuite/g++.dg/abi/comdat2.C 2024-04-25 09:47:06.855991224 +0200 @@ -0,0 +1,26 @@ +// PR lto/113208 +// { dg-do compile { target { c++11 && { *-*-*gnu* } } } } +// { dg-additional-options "-O2 -fkeep-inline-functions" } +// { dg-final { scan-assembler "_ZN1BI1CEC5ERKS1_,comdat" } } +// { dg-final { scan-assembler-not "_ZN1BI1CEC1ERKS1_,comdat" } } +// { dg-final { scan-assembler-not "_ZN1BI1CEC2ERKS1_,comdat" } } + +template <typename T> +struct A { + int foo () const; + A (int, int); +}; +template <typename T> +struct B : A<T> { + constexpr B (const B &x) : A<T> (1, x.foo ()) {} + B () : A<T> (1, 2) {} +}; +struct C; +struct D : B<C> {}; +void bar (D); + +void +baz (D x) +{ + bar (x); +} --- gcc/testsuite/g++.dg/abi/comdat3.C.jj 2024-04-25 09:47:06.855991224 +0200 +++ gcc/testsuite/g++.dg/abi/comdat3.C 2024-04-25 09:47:06.855991224 +0200 @@ -0,0 +1,22 @@ +// PR lto/113208 +// { dg-do compile { target { c++11 && { *-*-*gnu* } } } } +// { dg-additional-options "-O2" } +// { dg-final { scan-assembler "_ZN1M1SINS_1P1TELN1N1LE2EEC5Ev,comdat" } } +// { dg-final { scan-assembler-not "_ZN1M1SINS_1P1TELN1N1LE2EEC1Ev,comdat" } } +// { dg-final { scan-assembler-not "_ZN1M1SINS_1P1TELN1N1LE2EEC2Ev,comdat" } } + +namespace N { enum L { L1, L2, L3 } const O = L3; } +namespace M { + using N::O; + using N::L; + template <typename, L = O> + struct S { constexpr S () {} }; + namespace P { + struct T; + struct U { S<T> u; }; + void foo () { U (); } + } + extern template class S<P::T>; +} +namespace p = M::P; +template class M::S<p::T>; --- gcc/testsuite/g++.dg/abi/comdat4.C.jj 2024-04-25 10:27:54.141172257 +0200 +++ gcc/testsuite/g++.dg/abi/comdat4.C 2024-04-25 10:28:22.493773334 +0200 @@ -0,0 +1,28 @@ +// PR lto/113208 +// { dg-do compile { target { c++11 && { *-*-*gnu* } } } } +// { dg-additional-options "-O2" } +// { dg-final { scan-assembler "_ZN1BI1CEC5ERKS1_,comdat" } } +// { dg-final { scan-assembler-not "_ZN1BI1CEC1ERKS1_,comdat" } } +// { dg-final { scan-assembler-not "_ZN1BI1CEC2ERKS1_,comdat" } } + +template <typename T> +struct A { + int foo () const; + A (int, int); +}; +template <typename T> +struct B : A<T> { + constexpr B (const B &x) : A<T> (1, x.foo ()) {} + B () : A<T> (1, 2) {} +}; +struct C; +struct D : B<C> {}; +void bar (D); + +void +baz (D x) +{ + bar (x); +} + +template struct B<C>; --- gcc/testsuite/g++.dg/abi/comdat5.C.jj 2024-04-25 10:28:35.842585513 +0200 +++ gcc/testsuite/g++.dg/abi/comdat5.C 2024-04-25 10:28:52.413352362 +0200 @@ -0,0 +1,28 @@ +// PR lto/113208 +// { dg-do compile { target { c++11 && { *-*-*gnu* } } } } +// { dg-additional-options "-O2" } +// { dg-final { scan-assembler-not "_ZN1BI1CEC5ERKS1_,comdat" } } +// { dg-final { scan-assembler-not "_ZN1BI1CEC1ERKS1_,comdat" } } +// { dg-final { scan-assembler-not "_ZN1BI1CEC2ERKS1_,comdat" } } + +template <typename T> +struct A { + int foo () const; + A (int, int); +}; +template <typename T> +struct B : A<T> { + constexpr B (const B &x) : A<T> (1, x.foo ()) {} + B () : A<T> (1, 2) {} +}; +struct C; +struct D : B<C> {}; +void bar (D); + +void +baz (D x) +{ + bar (x); +} + +extern template struct B<C>; --- gcc/testsuite/g++.dg/lto/pr113208_0.C.jj 2024-04-25 09:47:06.856991210 +0200 +++ gcc/testsuite/g++.dg/lto/pr113208_0.C 2024-04-25 09:47:06.855991224 +0200 @@ -0,0 +1,13 @@ +// { dg-lto-do link } +// { dg-lto-options { {-O1 -std=c++20 -flto}} } +// { dg-extra-ld-options "-r -nostdlib -flinker-output=nolto-rel" } +// { dg-require-linker-plugin "" } + +#define CONSTEXPR constexpr +#include "pr113208.h" + +struct QualityValue; +struct k : vector<QualityValue> {}; + +void m(k); +void n(k i) { m(i); } --- gcc/testsuite/g++.dg/lto/pr113208_1.C.jj 2024-04-25 09:47:06.856991210 +0200 +++ gcc/testsuite/g++.dg/lto/pr113208_1.C 2024-04-25 09:47:06.856991210 +0200 @@ -0,0 +1,6 @@ +#define CONSTEXPR +#include "pr113208.h" + +struct QualityValue; +vector<QualityValue> values1; +vector<QualityValue> values{values1}; --- gcc/testsuite/g++.dg/lto/pr113208.h.jj 2024-04-25 09:47:06.856991210 +0200 +++ gcc/testsuite/g++.dg/lto/pr113208.h 2024-04-25 09:47:06.856991210 +0200 @@ -0,0 +1,10 @@ +template <typename _Tp> struct _Vector_base { + int g() const; + _Vector_base(int, int); +}; +template <typename _Tp> +struct vector : _Vector_base<_Tp> { + CONSTEXPR vector(const vector &__x) + : _Vector_base<_Tp>(1, __x.g()) {} + vector() : _Vector_base<_Tp>(1, 2) {} +}; Jakub ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] c++, v4: Retry the aliasing of base/complete cdtor optimization at import_export_decl time [PR113208] 2024-04-25 12:02 ` [PATCH] c++, v4: " Jakub Jelinek @ 2024-04-25 14:22 ` Jakub Jelinek 2024-04-25 15:30 ` Jason Merrill 0 siblings, 1 reply; 25+ messages in thread From: Jakub Jelinek @ 2024-04-25 14:22 UTC (permalink / raw) To: Jason Merrill, Jonathan Wakely, Jan Hubicka, gcc-patches, Richard Biener, Patrick Palka On Thu, Apr 25, 2024 at 02:02:32PM +0200, Jakub Jelinek wrote: > I've tried the following patch, but unfortunately that lead to large > number of regressions: > +FAIL: g++.dg/cpp0x/initlist25.C -std=c++17 (test for excess errors) So the reduced testcase for this is template <typename T, typename U> struct A { T a1; U a2; template <typename V, typename W, bool = true> constexpr A(V &&x, W &&y) : a1(x), a2(y) {} }; template <typename> struct B; namespace std { template <class> struct initializer_list { int *_M_array; decltype (sizeof 0) _M_len; }; } template <typename T, typename U> struct C { void foo (std::initializer_list<A<const T, U>>); }; template <class> struct D; template <typename T, typename = D<T>, typename = B<T>> struct E { E (const char *); ~E (); }; int main () { C<E<char>, E<char>> m; m.foo ({{"t", "t"}, {"y", "y"}}); } Without the patch I've just posted or even with the earlier version of the patch the _ZN1AIK1EIc1DIcE1BIcEES5_EC[12]IRA2_KcSB_Lb1EEEOT_OT0_ ctors were emitted, but with this patch they are unresolved externals. The reason is that the code actually uses (calls) the _ZN1AIK1EIc1DIcE1BIcEES5_EC1IRA2_KcSB_Lb1EEEOT_OT0_ __ct_comp constructor, that one has TREE_USED, while the _ZN1AIK1EIc1DIcE1BIcEES5_EC2IRA2_KcSB_Lb1EEEOT_OT0_ __ct_base constructor is not TREE_USED. But the c_parse_final_cleanups loop over FOR_EACH_VEC_SAFE_ELT (deferred_fns, i, decl) will ignore the TREE_USED __ct_comp because it is an alias and so has !DECL_SAVED_TREE: 5273 if (!DECL_SAVED_TREE (decl)) 5274 continue; With the following incremental patch the tests in make check-g++ (haven't tried the coroutine one) which failed with the earlier patch now pass. --- gcc/cp/decl2.cc.jj 2024-04-25 10:52:21.057535959 +0200 +++ gcc/cp/decl2.cc 2024-04-25 16:19:17.385547357 +0200 @@ -5271,7 +5271,19 @@ c_parse_final_cleanups (void) generate_tls_wrapper (decl); if (!DECL_SAVED_TREE (decl)) - continue; + { + cgraph_node *node; + tree tgt; + /* Even when maybe_clone_body created same body alias + has no DECL_SAVED_TREE, if its alias target does, + don't skip it. */ + if (!DECL_CLONED_FUNCTION (decl) + || !(node = cgraph_node::get (decl)) + || !node->cpp_implicit_alias + || !(tgt = node->get_alias_target_tree ()) + || !DECL_SAVED_TREE (tgt)) + continue; + } cgraph_node *node = cgraph_node::get_create (decl); @@ -5299,7 +5311,7 @@ c_parse_final_cleanups (void) node = node->get_alias_target (); node->call_for_symbol_thunks_and_aliases (clear_decl_external, - NULL, true); + NULL, true); /* If we mark !DECL_EXTERNAL one of the symbols in some comdat group, we need to mark all symbols in the same comdat group that way. */ @@ -5309,7 +5321,7 @@ c_parse_final_cleanups (void) next != node; next = dyn_cast<cgraph_node *> (next->same_comdat_group)) next->call_for_symbol_thunks_and_aliases (clear_decl_external, - NULL, true); + NULL, true); } /* If we're going to need to write this function out, and Jakub ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] c++, v4: Retry the aliasing of base/complete cdtor optimization at import_export_decl time [PR113208] 2024-04-25 14:22 ` Jakub Jelinek @ 2024-04-25 15:30 ` Jason Merrill 2024-04-25 18:42 ` [PATCH] c++, v5: " Jakub Jelinek 2024-05-09 18:20 ` [PATCH] c++: Optimize in maybe_clone_body aliases even when not at_eof [PR113208] Jakub Jelinek 0 siblings, 2 replies; 25+ messages in thread From: Jason Merrill @ 2024-04-25 15:30 UTC (permalink / raw) To: Jakub Jelinek, Jonathan Wakely, Jan Hubicka, gcc-patches, Richard Biener, Patrick Palka On 4/25/24 07:22, Jakub Jelinek wrote: > On Thu, Apr 25, 2024 at 02:02:32PM +0200, Jakub Jelinek wrote: >> I've tried the following patch, but unfortunately that lead to large >> number of regressions: >> +FAIL: g++.dg/cpp0x/initlist25.C -std=c++17 (test for excess errors) > > So the reduced testcase for this is > template <typename T, typename U> struct A { > T a1; > U a2; > template <typename V, typename W, bool = true> > constexpr A(V &&x, W &&y) : a1(x), a2(y) {} > }; > template <typename> struct B; > namespace std { > template <class> struct initializer_list { > int *_M_array; > decltype (sizeof 0) _M_len; > }; > } > template <typename T, typename U> struct C { > void foo (std::initializer_list<A<const T, U>>); > }; > template <class> struct D; > template <typename T, typename = D<T>, typename = B<T>> > struct E { E (const char *); ~E (); }; > int > main () > { > C<E<char>, E<char>> m; > m.foo ({{"t", "t"}, {"y", "y"}}); > } > Without the patch I've just posted or even with the earlier version > of the patch the > _ZN1AIK1EIc1DIcE1BIcEES5_EC[12]IRA2_KcSB_Lb1EEEOT_OT0_ > ctors were emitted, but with this patch they are unresolved externals. > > The reason is that the code actually uses (calls) the > _ZN1AIK1EIc1DIcE1BIcEES5_EC1IRA2_KcSB_Lb1EEEOT_OT0_ > __ct_comp constructor, that one has TREE_USED, while the > _ZN1AIK1EIc1DIcE1BIcEES5_EC2IRA2_KcSB_Lb1EEEOT_OT0_ > __ct_base constructor is not TREE_USED. > > But the c_parse_final_cleanups loop over > FOR_EACH_VEC_SAFE_ELT (deferred_fns, i, decl) > will ignore the TREE_USED __ct_comp because it is an alias > and so has !DECL_SAVED_TREE: > 5273 if (!DECL_SAVED_TREE (decl)) > 5274 continue; Hmm, maybe maybe_clone_body shouldn't clear DECL_SAVED_TREE for aliases, but rather set it to some stub like void_node? Though with all these changes, it's probably better to go with your first patch for GCC 14 and delay this approach to 15. Your v1 patch is OK for 14. Jason ^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH] c++, v5: Retry the aliasing of base/complete cdtor optimization at import_export_decl time [PR113208] 2024-04-25 15:30 ` Jason Merrill @ 2024-04-25 18:42 ` Jakub Jelinek 2024-05-09 18:20 ` [PATCH] c++: Optimize in maybe_clone_body aliases even when not at_eof [PR113208] Jakub Jelinek 1 sibling, 0 replies; 25+ messages in thread From: Jakub Jelinek @ 2024-04-25 18:42 UTC (permalink / raw) To: Jason Merrill Cc: Jonathan Wakely, Jan Hubicka, gcc-patches, Richard Biener, Patrick Palka On Thu, Apr 25, 2024 at 11:30:48AM -0400, Jason Merrill wrote: > Hmm, maybe maybe_clone_body shouldn't clear DECL_SAVED_TREE for aliases, but > rather set it to some stub like void_node? I'll try that in stage1. > Though with all these changes, it's probably better to go with your first > patch for GCC 14 and delay this approach to 15. Your v1 patch is OK for 14. Just to record, following patch passed bootstrap/regtest on x86_64-linux and i686-linux. But I've committed the v1 version instead with the addition of comdat2.C and comdat5.C testcases from this patch now and in stage1 will post an incremental diff. Thanks. 2024-04-25 Jakub Jelinek <jakub@redhat.com> Jason Merrill <jason@redhat.com> PR lto/113208 * decl2.cc (tentative_decl_linkage): Call maybe_make_one_only for implicit instantiations of maybe in charge ctors/dtors declared inline. (c_parse_final_cleanups): Don't skip used same body aliases which have non-NULL DECL_SAVED_TREE on the alias target. Formatting fixes. * optimize.cc (can_alias_cdtor): Adjust condition, for HAVE_COMDAT_GROUP && DECL_ONE_ONLY && DECL_WEAK return true even if not DECL_INTERFACE_KNOWN. * decl.cc (cxx_comdat_group): For DECL_CLONED_FUNCTION_P functions if SUPPORTS_ONE_ONLY return DECL_COMDAT_GROUP if already set. * g++.dg/abi/comdat2.C: New test. * g++.dg/abi/comdat3.C: New test. * g++.dg/abi/comdat4.C: New test. * g++.dg/abi/comdat5.C: New test. * g++.dg/lto/pr113208_0.C: New test. * g++.dg/lto/pr113208_1.C: New file. * g++.dg/lto/pr113208.h: New file. --- gcc/cp/decl2.cc.jj 2024-04-24 18:28:22.299513620 +0200 +++ gcc/cp/decl2.cc 2024-04-25 16:19:17.385547357 +0200 @@ -3312,16 +3312,23 @@ tentative_decl_linkage (tree decl) linkage of all functions, and as that causes writes to the data mapped in from the PCH file, it's advantageous to mark the functions at this point. */ - if (DECL_DECLARED_INLINE_P (decl) - && (!DECL_IMPLICIT_INSTANTIATION (decl) - || DECL_DEFAULTED_FN (decl))) + if (DECL_DECLARED_INLINE_P (decl)) { - /* This function must have external linkage, as - otherwise DECL_INTERFACE_KNOWN would have been - set. */ - gcc_assert (TREE_PUBLIC (decl)); - comdat_linkage (decl); - DECL_INTERFACE_KNOWN (decl) = 1; + if (!DECL_IMPLICIT_INSTANTIATION (decl) + || DECL_DEFAULTED_FN (decl)) + { + /* This function must have external linkage, as + otherwise DECL_INTERFACE_KNOWN would have been + set. */ + gcc_assert (TREE_PUBLIC (decl)); + comdat_linkage (decl); + DECL_INTERFACE_KNOWN (decl) = 1; + } + else if (DECL_MAYBE_IN_CHARGE_CDTOR_P (decl)) + /* For implicit instantiations of cdtors try to make + it comdat, so that maybe_clone_body can use aliases. + See PR113208. */ + maybe_make_one_only (decl); } } else if (VAR_P (decl)) @@ -5264,7 +5271,19 @@ c_parse_final_cleanups (void) generate_tls_wrapper (decl); if (!DECL_SAVED_TREE (decl)) - continue; + { + cgraph_node *node; + tree tgt; + /* Even when maybe_clone_body created same body alias + has no DECL_SAVED_TREE, if its alias target does, + don't skip it. */ + if (!DECL_CLONED_FUNCTION (decl) + || !(node = cgraph_node::get (decl)) + || !node->cpp_implicit_alias + || !(tgt = node->get_alias_target_tree ()) + || !DECL_SAVED_TREE (tgt)) + continue; + } cgraph_node *node = cgraph_node::get_create (decl); @@ -5292,7 +5311,7 @@ c_parse_final_cleanups (void) node = node->get_alias_target (); node->call_for_symbol_thunks_and_aliases (clear_decl_external, - NULL, true); + NULL, true); /* If we mark !DECL_EXTERNAL one of the symbols in some comdat group, we need to mark all symbols in the same comdat group that way. */ @@ -5302,7 +5321,7 @@ c_parse_final_cleanups (void) next != node; next = dyn_cast<cgraph_node *> (next->same_comdat_group)) next->call_for_symbol_thunks_and_aliases (clear_decl_external, - NULL, true); + NULL, true); } /* If we're going to need to write this function out, and --- gcc/cp/optimize.cc.jj 2024-04-25 09:47:24.936736833 +0200 +++ gcc/cp/optimize.cc 2024-04-25 10:19:25.338332564 +0200 @@ -220,10 +220,8 @@ can_alias_cdtor (tree fn) gcc_assert (DECL_MAYBE_IN_CHARGE_CDTOR_P (fn)); /* Don't use aliases for weak/linkonce definitions unless we can put both symbols in the same COMDAT group. */ - return (DECL_INTERFACE_KNOWN (fn) - && (SUPPORTS_ONE_ONLY || !DECL_WEAK (fn)) - && (!DECL_ONE_ONLY (fn) - || (HAVE_COMDAT_GROUP && DECL_WEAK (fn)))); + return (DECL_WEAK (fn) ? (HAVE_COMDAT_GROUP && DECL_ONE_ONLY (fn)) + : (DECL_INTERFACE_KNOWN (fn) && !DECL_ONE_ONLY (fn))); } /* FN is a [cd]tor, fns is a pointer to an array of length 3. Fill fns --- gcc/cp/decl.cc.jj 2024-04-24 18:28:22.231514532 +0200 +++ gcc/cp/decl.cc 2024-04-25 09:47:06.855991224 +0200 @@ -19254,6 +19254,14 @@ cxx_comdat_group (tree decl) else break; } + /* If a ctor/dtor has already set the comdat group by + maybe_clone_body, don't override it. */ + if (SUPPORTS_ONE_ONLY + && TREE_CODE (decl) == FUNCTION_DECL + && DECL_CLONED_FUNCTION_P (decl) + && SUPPORTS_ONE_ONLY) + if (tree comdat = DECL_COMDAT_GROUP (decl)) + return comdat; } return decl; --- gcc/testsuite/g++.dg/abi/comdat2.C.jj 2024-04-25 09:47:06.855991224 +0200 +++ gcc/testsuite/g++.dg/abi/comdat2.C 2024-04-25 09:47:06.855991224 +0200 @@ -0,0 +1,26 @@ +// PR lto/113208 +// { dg-do compile { target { c++11 && { *-*-*gnu* } } } } +// { dg-additional-options "-O2 -fkeep-inline-functions" } +// { dg-final { scan-assembler "_ZN1BI1CEC5ERKS1_,comdat" } } +// { dg-final { scan-assembler-not "_ZN1BI1CEC1ERKS1_,comdat" } } +// { dg-final { scan-assembler-not "_ZN1BI1CEC2ERKS1_,comdat" } } + +template <typename T> +struct A { + int foo () const; + A (int, int); +}; +template <typename T> +struct B : A<T> { + constexpr B (const B &x) : A<T> (1, x.foo ()) {} + B () : A<T> (1, 2) {} +}; +struct C; +struct D : B<C> {}; +void bar (D); + +void +baz (D x) +{ + bar (x); +} --- gcc/testsuite/g++.dg/abi/comdat3.C.jj 2024-04-25 09:47:06.855991224 +0200 +++ gcc/testsuite/g++.dg/abi/comdat3.C 2024-04-25 09:47:06.855991224 +0200 @@ -0,0 +1,22 @@ +// PR lto/113208 +// { dg-do compile { target { c++11 && { *-*-*gnu* } } } } +// { dg-additional-options "-O2" } +// { dg-final { scan-assembler "_ZN1M1SINS_1P1TELN1N1LE2EEC5Ev,comdat" } } +// { dg-final { scan-assembler-not "_ZN1M1SINS_1P1TELN1N1LE2EEC1Ev,comdat" } } +// { dg-final { scan-assembler-not "_ZN1M1SINS_1P1TELN1N1LE2EEC2Ev,comdat" } } + +namespace N { enum L { L1, L2, L3 } const O = L3; } +namespace M { + using N::O; + using N::L; + template <typename, L = O> + struct S { constexpr S () {} }; + namespace P { + struct T; + struct U { S<T> u; }; + void foo () { U (); } + } + extern template class S<P::T>; +} +namespace p = M::P; +template class M::S<p::T>; --- gcc/testsuite/g++.dg/abi/comdat4.C.jj 2024-04-25 10:27:54.141172257 +0200 +++ gcc/testsuite/g++.dg/abi/comdat4.C 2024-04-25 10:28:22.493773334 +0200 @@ -0,0 +1,28 @@ +// PR lto/113208 +// { dg-do compile { target { c++11 && { *-*-*gnu* } } } } +// { dg-additional-options "-O2" } +// { dg-final { scan-assembler "_ZN1BI1CEC5ERKS1_,comdat" } } +// { dg-final { scan-assembler-not "_ZN1BI1CEC1ERKS1_,comdat" } } +// { dg-final { scan-assembler-not "_ZN1BI1CEC2ERKS1_,comdat" } } + +template <typename T> +struct A { + int foo () const; + A (int, int); +}; +template <typename T> +struct B : A<T> { + constexpr B (const B &x) : A<T> (1, x.foo ()) {} + B () : A<T> (1, 2) {} +}; +struct C; +struct D : B<C> {}; +void bar (D); + +void +baz (D x) +{ + bar (x); +} + +template struct B<C>; --- gcc/testsuite/g++.dg/abi/comdat5.C.jj 2024-04-25 10:28:35.842585513 +0200 +++ gcc/testsuite/g++.dg/abi/comdat5.C 2024-04-25 10:28:52.413352362 +0200 @@ -0,0 +1,28 @@ +// PR lto/113208 +// { dg-do compile { target { c++11 && { *-*-*gnu* } } } } +// { dg-additional-options "-O2" } +// { dg-final { scan-assembler-not "_ZN1BI1CEC5ERKS1_,comdat" } } +// { dg-final { scan-assembler-not "_ZN1BI1CEC1ERKS1_,comdat" } } +// { dg-final { scan-assembler-not "_ZN1BI1CEC2ERKS1_,comdat" } } + +template <typename T> +struct A { + int foo () const; + A (int, int); +}; +template <typename T> +struct B : A<T> { + constexpr B (const B &x) : A<T> (1, x.foo ()) {} + B () : A<T> (1, 2) {} +}; +struct C; +struct D : B<C> {}; +void bar (D); + +void +baz (D x) +{ + bar (x); +} + +extern template struct B<C>; --- gcc/testsuite/g++.dg/lto/pr113208_0.C.jj 2024-04-25 09:47:06.856991210 +0200 +++ gcc/testsuite/g++.dg/lto/pr113208_0.C 2024-04-25 09:47:06.855991224 +0200 @@ -0,0 +1,13 @@ +// { dg-lto-do link } +// { dg-lto-options { {-O1 -std=c++20 -flto}} } +// { dg-extra-ld-options "-r -nostdlib -flinker-output=nolto-rel" } +// { dg-require-linker-plugin "" } + +#define CONSTEXPR constexpr +#include "pr113208.h" + +struct QualityValue; +struct k : vector<QualityValue> {}; + +void m(k); +void n(k i) { m(i); } --- gcc/testsuite/g++.dg/lto/pr113208_1.C.jj 2024-04-25 09:47:06.856991210 +0200 +++ gcc/testsuite/g++.dg/lto/pr113208_1.C 2024-04-25 09:47:06.856991210 +0200 @@ -0,0 +1,6 @@ +#define CONSTEXPR +#include "pr113208.h" + +struct QualityValue; +vector<QualityValue> values1; +vector<QualityValue> values{values1}; --- gcc/testsuite/g++.dg/lto/pr113208.h.jj 2024-04-25 09:47:06.856991210 +0200 +++ gcc/testsuite/g++.dg/lto/pr113208.h 2024-04-25 09:47:06.856991210 +0200 @@ -0,0 +1,10 @@ +template <typename _Tp> struct _Vector_base { + int g() const; + _Vector_base(int, int); +}; +template <typename _Tp> +struct vector : _Vector_base<_Tp> { + CONSTEXPR vector(const vector &__x) + : _Vector_base<_Tp>(1, __x.g()) {} + vector() : _Vector_base<_Tp>(1, 2) {} +}; Jakub ^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH] c++: Optimize in maybe_clone_body aliases even when not at_eof [PR113208] 2024-04-25 15:30 ` Jason Merrill 2024-04-25 18:42 ` [PATCH] c++, v5: " Jakub Jelinek @ 2024-05-09 18:20 ` Jakub Jelinek 2024-05-09 18:58 ` Marek Polacek 2024-05-10 19:59 ` Jason Merrill 1 sibling, 2 replies; 25+ messages in thread From: Jakub Jelinek @ 2024-05-09 18:20 UTC (permalink / raw) To: Jason Merrill Cc: Jonathan Wakely, Jan Hubicka, gcc-patches, Richard Biener, Patrick Palka On Thu, Apr 25, 2024 at 11:30:48AM -0400, Jason Merrill wrote: > Hmm, maybe maybe_clone_body shouldn't clear DECL_SAVED_TREE for aliases, but > rather set it to some stub like void_node? > > Though with all these changes, it's probably better to go with your first > patch for GCC 14 and delay this approach to 15. Your v1 patch is OK for 14. Ok, here is an updated patch, which sets DECL_SAVED_TREE to void_node for the aliases together with reversion of the earlier committed patch. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2024-05-09 Jakub Jelinek <jakub@redhat.com> Jason Merrill <jason@redhat.com> PR lto/113208 * cp-tree.h (maybe_optimize_cdtor): Remove. * decl2.cc (tentative_decl_linkage): Call maybe_make_one_only for implicit instantiations of maybe in charge ctors/dtors declared inline. (import_export_decl): Don't call maybe_optimize_cdtor. (c_parse_final_cleanups): Formatting fixes. * optimize.cc (can_alias_cdtor): Adjust condition, for HAVE_COMDAT_GROUP && DECL_ONE_ONLY && DECL_WEAK return true even if not DECL_INTERFACE_KNOWN. (maybe_clone_body): Don't clear DECL_SAVED_TREE, instead set it to void_node. (maybe_clone_body): Remove. * decl.cc (cxx_comdat_group): For DECL_CLONED_FUNCTION_P functions if SUPPORTS_ONE_ONLY return DECL_COMDAT_GROUP if already set. * g++.dg/abi/comdat3.C: New test. * g++.dg/abi/comdat4.C: New test. --- gcc/cp/cp-tree.h.jj 2024-05-09 10:30:54.775505524 +0200 +++ gcc/cp/cp-tree.h 2024-05-09 17:07:01.246206288 +0200 @@ -7451,7 +7451,6 @@ extern bool handle_module_option (unsign /* In optimize.cc */ extern tree clone_attrs (tree); extern bool maybe_clone_body (tree); -extern void maybe_optimize_cdtor (tree); /* In parser.cc */ extern tree cp_convert_range_for (tree, tree, tree, cp_decomp *, bool, --- gcc/cp/decl2.cc.jj 2024-05-02 09:31:17.753298180 +0200 +++ gcc/cp/decl2.cc 2024-05-09 17:11:11.676836268 +0200 @@ -3325,16 +3325,23 @@ tentative_decl_linkage (tree decl) linkage of all functions, and as that causes writes to the data mapped in from the PCH file, it's advantageous to mark the functions at this point. */ - if (DECL_DECLARED_INLINE_P (decl) - && (!DECL_IMPLICIT_INSTANTIATION (decl) - || DECL_DEFAULTED_FN (decl))) + if (DECL_DECLARED_INLINE_P (decl)) { - /* This function must have external linkage, as - otherwise DECL_INTERFACE_KNOWN would have been - set. */ - gcc_assert (TREE_PUBLIC (decl)); - comdat_linkage (decl); - DECL_INTERFACE_KNOWN (decl) = 1; + if (!DECL_IMPLICIT_INSTANTIATION (decl) + || DECL_DEFAULTED_FN (decl)) + { + /* This function must have external linkage, as + otherwise DECL_INTERFACE_KNOWN would have been + set. */ + gcc_assert (TREE_PUBLIC (decl)); + comdat_linkage (decl); + DECL_INTERFACE_KNOWN (decl) = 1; + } + else if (DECL_MAYBE_IN_CHARGE_CDTOR_P (decl)) + /* For implicit instantiations of cdtors try to make + it comdat, so that maybe_clone_body can use aliases. + See PR113208. */ + maybe_make_one_only (decl); } } else if (VAR_P (decl)) @@ -3604,9 +3611,6 @@ import_export_decl (tree decl) } DECL_INTERFACE_KNOWN (decl) = 1; - - if (DECL_CLONED_FUNCTION_P (decl)) - maybe_optimize_cdtor (decl); } /* Return an expression that performs the destruction of DECL, which @@ -5331,7 +5335,7 @@ c_parse_final_cleanups (void) node = node->get_alias_target (); node->call_for_symbol_thunks_and_aliases (clear_decl_external, - NULL, true); + NULL, true); /* If we mark !DECL_EXTERNAL one of the symbols in some comdat group, we need to mark all symbols in the same comdat group that way. */ @@ -5341,7 +5345,7 @@ c_parse_final_cleanups (void) next != node; next = dyn_cast<cgraph_node *> (next->same_comdat_group)) next->call_for_symbol_thunks_and_aliases (clear_decl_external, - NULL, true); + NULL, true); } /* If we're going to need to write this function out, and --- gcc/cp/optimize.cc.jj 2024-04-25 20:33:30.771858912 +0200 +++ gcc/cp/optimize.cc 2024-05-09 17:10:23.920478922 +0200 @@ -220,10 +220,8 @@ can_alias_cdtor (tree fn) gcc_assert (DECL_MAYBE_IN_CHARGE_CDTOR_P (fn)); /* Don't use aliases for weak/linkonce definitions unless we can put both symbols in the same COMDAT group. */ - return (DECL_INTERFACE_KNOWN (fn) - && (SUPPORTS_ONE_ONLY || !DECL_WEAK (fn)) - && (!DECL_ONE_ONLY (fn) - || (HAVE_COMDAT_GROUP && DECL_WEAK (fn)))); + return (DECL_WEAK (fn) ? (HAVE_COMDAT_GROUP && DECL_ONE_ONLY (fn)) + : (DECL_INTERFACE_KNOWN (fn) && !DECL_ONE_ONLY (fn))); } /* FN is a [cd]tor, fns is a pointer to an array of length 3. Fill fns @@ -712,7 +710,7 @@ maybe_clone_body (tree fn) if (expand_or_defer_fn_1 (clone)) emit_associated_thunks (clone); /* We didn't generate a body, so remove the empty one. */ - DECL_SAVED_TREE (clone) = NULL_TREE; + DECL_SAVED_TREE (clone) = void_node; } else expand_or_defer_fn (clone); @@ -723,58 +721,3 @@ maybe_clone_body (tree fn) /* We don't need to process the original function any further. */ return 1; } - -/* If maybe_clone_body is called while the cdtor is still tentative, - DECL_ONE_ONLY will be false and so will be can_alias_cdtor (fn). - In that case we wouldn't try to optimize using an alias and instead - would emit separate base and complete cdtor. The following function - attempts to still optimize that case when we import_export_decl - is called first time on one of the clones. */ - -void -maybe_optimize_cdtor (tree orig_decl) -{ - tree fns[3]; - tree fn = DECL_CLONED_FUNCTION (orig_decl); - gcc_checking_assert (DECL_MAYBE_IN_CHARGE_CDTOR_P (fn)); - if (DECL_INTERFACE_KNOWN (fn) - || !TREE_ASM_WRITTEN (fn) - || !DECL_ONE_ONLY (orig_decl) - || symtab->global_info_ready) - return; - - populate_clone_array (fn, fns); - - if (!fns[0] || !fns[1]) - return; - for (int i = 2 - !fns[2]; i >= 0; --i) - if (fns[i] != orig_decl && DECL_INTERFACE_KNOWN (fns[i])) - return; - DECL_INTERFACE_KNOWN (fn) = 1; - comdat_linkage (fn); - if (!can_alias_cdtor (fn)) - return; - /* For comdat base and complete cdtors put them into the same, - *[CD]5* comdat group instead of *[CD][12]*. */ - auto n0 = cgraph_node::get_create (fns[0]); - auto n1 = cgraph_node::get_create (fns[1]); - auto n2 = fns[2] ? cgraph_node::get_create (fns[1]) : NULL; - if (n0->lowered || n1->lowered || (n2 && n2->lowered)) - return; - import_export_decl (fns[0]); - n1->definition = false; - if (!n0->create_same_body_alias (fns[1], fns[0])) - return; - - tree comdat_group = cdtor_comdat_group (fns[1], fns[0]); - n1 = cgraph_node::get (fns[1]); - n0->set_comdat_group (comdat_group); - if (n1->same_comdat_group) - n1->remove_from_same_comdat_group (); - n1->add_to_same_comdat_group (n0); - if (fns[2]) - n2->add_to_same_comdat_group (n0); - import_export_decl (fns[1]); - /* Remove the body now that it is an alias. */ - release_function_body (fns[1]); -} --- gcc/cp/decl.cc.jj 2024-05-09 10:30:54.804505130 +0200 +++ gcc/cp/decl.cc 2024-05-09 17:07:08.400110018 +0200 @@ -19280,6 +19280,14 @@ cxx_comdat_group (tree decl) else break; } + /* If a ctor/dtor has already set the comdat group by + maybe_clone_body, don't override it. */ + if (SUPPORTS_ONE_ONLY + && TREE_CODE (decl) == FUNCTION_DECL + && DECL_CLONED_FUNCTION_P (decl) + && SUPPORTS_ONE_ONLY) + if (tree comdat = DECL_COMDAT_GROUP (decl)) + return comdat; } return decl; --- gcc/testsuite/g++.dg/abi/comdat3.C.jj 2024-05-09 17:07:27.345855068 +0200 +++ gcc/testsuite/g++.dg/abi/comdat3.C 2024-05-09 17:07:08.400110018 +0200 @@ -0,0 +1,22 @@ +// PR lto/113208 +// { dg-do compile { target { c++11 && { *-*-*gnu* } } } } +// { dg-additional-options "-O2" } +// { dg-final { scan-assembler "_ZN1M1SINS_1P1TELN1N1LE2EEC5Ev,comdat" } } +// { dg-final { scan-assembler-not "_ZN1M1SINS_1P1TELN1N1LE2EEC1Ev,comdat" } } +// { dg-final { scan-assembler-not "_ZN1M1SINS_1P1TELN1N1LE2EEC2Ev,comdat" } } + +namespace N { enum L { L1, L2, L3 } const O = L3; } +namespace M { + using N::O; + using N::L; + template <typename, L = O> + struct S { constexpr S () {} }; + namespace P { + struct T; + struct U { S<T> u; }; + void foo () { U (); } + } + extern template class S<P::T>; +} +namespace p = M::P; +template class M::S<p::T>; --- gcc/testsuite/g++.dg/abi/comdat4.C.jj 2024-05-09 17:07:31.961792950 +0200 +++ gcc/testsuite/g++.dg/abi/comdat4.C 2024-05-09 17:07:08.400110018 +0200 @@ -0,0 +1,28 @@ +// PR lto/113208 +// { dg-do compile { target { c++11 && { *-*-*gnu* } } } } +// { dg-additional-options "-O2" } +// { dg-final { scan-assembler "_ZN1BI1CEC5ERKS1_,comdat" } } +// { dg-final { scan-assembler-not "_ZN1BI1CEC1ERKS1_,comdat" } } +// { dg-final { scan-assembler-not "_ZN1BI1CEC2ERKS1_,comdat" } } + +template <typename T> +struct A { + int foo () const; + A (int, int); +}; +template <typename T> +struct B : A<T> { + constexpr B (const B &x) : A<T> (1, x.foo ()) {} + B () : A<T> (1, 2) {} +}; +struct C; +struct D : B<C> {}; +void bar (D); + +void +baz (D x) +{ + bar (x); +} + +template struct B<C>; Jakub ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] c++: Optimize in maybe_clone_body aliases even when not at_eof [PR113208] 2024-05-09 18:20 ` [PATCH] c++: Optimize in maybe_clone_body aliases even when not at_eof [PR113208] Jakub Jelinek @ 2024-05-09 18:58 ` Marek Polacek 2024-05-09 19:05 ` Jakub Jelinek 2024-05-10 19:59 ` Jason Merrill 1 sibling, 1 reply; 25+ messages in thread From: Marek Polacek @ 2024-05-09 18:58 UTC (permalink / raw) To: Jakub Jelinek Cc: Jason Merrill, Jonathan Wakely, Jan Hubicka, gcc-patches, Richard Biener, Patrick Palka On Thu, May 09, 2024 at 08:20:00PM +0200, Jakub Jelinek wrote: > --- gcc/cp/decl.cc.jj 2024-05-09 10:30:54.804505130 +0200 > +++ gcc/cp/decl.cc 2024-05-09 17:07:08.400110018 +0200 > @@ -19280,6 +19280,14 @@ cxx_comdat_group (tree decl) > else > break; > } > + /* If a ctor/dtor has already set the comdat group by > + maybe_clone_body, don't override it. */ > + if (SUPPORTS_ONE_ONLY > + && TREE_CODE (decl) == FUNCTION_DECL > + && DECL_CLONED_FUNCTION_P (decl) > + && SUPPORTS_ONE_ONLY) > + if (tree comdat = DECL_COMDAT_GROUP (decl)) > + return comdat; This checks SUPPORTS_ONE_ONLY twice. Marek ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] c++: Optimize in maybe_clone_body aliases even when not at_eof [PR113208] 2024-05-09 18:58 ` Marek Polacek @ 2024-05-09 19:05 ` Jakub Jelinek 0 siblings, 0 replies; 25+ messages in thread From: Jakub Jelinek @ 2024-05-09 19:05 UTC (permalink / raw) To: Marek Polacek Cc: Jason Merrill, Jonathan Wakely, Jan Hubicka, gcc-patches, Richard Biener, Patrick Palka On Thu, May 09, 2024 at 02:58:52PM -0400, Marek Polacek wrote: > On Thu, May 09, 2024 at 08:20:00PM +0200, Jakub Jelinek wrote: > > --- gcc/cp/decl.cc.jj 2024-05-09 10:30:54.804505130 +0200 > > +++ gcc/cp/decl.cc 2024-05-09 17:07:08.400110018 +0200 > > @@ -19280,6 +19280,14 @@ cxx_comdat_group (tree decl) > > else > > break; > > } > > + /* If a ctor/dtor has already set the comdat group by > > + maybe_clone_body, don't override it. */ > > + if (SUPPORTS_ONE_ONLY > > + && TREE_CODE (decl) == FUNCTION_DECL > > + && DECL_CLONED_FUNCTION_P (decl) > > + && SUPPORTS_ONE_ONLY) > > + if (tree comdat = DECL_COMDAT_GROUP (decl)) > > + return comdat; > > This checks SUPPORTS_ONE_ONLY twice. Oops, you're right, fixed in my copy. Jakub ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] c++: Optimize in maybe_clone_body aliases even when not at_eof [PR113208] 2024-05-09 18:20 ` [PATCH] c++: Optimize in maybe_clone_body aliases even when not at_eof [PR113208] Jakub Jelinek 2024-05-09 18:58 ` Marek Polacek @ 2024-05-10 19:59 ` Jason Merrill 2024-05-13 10:19 ` Jakub Jelinek 1 sibling, 1 reply; 25+ messages in thread From: Jason Merrill @ 2024-05-10 19:59 UTC (permalink / raw) To: Jakub Jelinek Cc: Jonathan Wakely, Jan Hubicka, gcc-patches, Richard Biener, Patrick Palka On 5/9/24 14:20, Jakub Jelinek wrote: > On Thu, Apr 25, 2024 at 11:30:48AM -0400, Jason Merrill wrote: >> Hmm, maybe maybe_clone_body shouldn't clear DECL_SAVED_TREE for aliases, but >> rather set it to some stub like void_node? >> >> Though with all these changes, it's probably better to go with your first >> patch for GCC 14 and delay this approach to 15. Your v1 patch is OK for 14. > > Ok, here is an updated patch, which sets DECL_SAVED_TREE to void_node for > the aliases together with reversion of the earlier committed patch. > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? > > 2024-05-09 Jakub Jelinek <jakub@redhat.com> > Jason Merrill <jason@redhat.com> > > PR lto/113208 > * cp-tree.h (maybe_optimize_cdtor): Remove. > * decl2.cc (tentative_decl_linkage): Call maybe_make_one_only > for implicit instantiations of maybe in charge ctors/dtors > declared inline. > (import_export_decl): Don't call maybe_optimize_cdtor. > (c_parse_final_cleanups): Formatting fixes. > * optimize.cc (can_alias_cdtor): Adjust condition, for > HAVE_COMDAT_GROUP && DECL_ONE_ONLY && DECL_WEAK return true even > if not DECL_INTERFACE_KNOWN. > --- gcc/cp/optimize.cc.jj 2024-04-25 20:33:30.771858912 +0200 > +++ gcc/cp/optimize.cc 2024-05-09 17:10:23.920478922 +0200 > @@ -220,10 +220,8 @@ can_alias_cdtor (tree fn) > gcc_assert (DECL_MAYBE_IN_CHARGE_CDTOR_P (fn)); > /* Don't use aliases for weak/linkonce definitions unless we can put both > symbols in the same COMDAT group. */ > - return (DECL_INTERFACE_KNOWN (fn) > - && (SUPPORTS_ONE_ONLY || !DECL_WEAK (fn)) > - && (!DECL_ONE_ONLY (fn) > - || (HAVE_COMDAT_GROUP && DECL_WEAK (fn)))); > + return (DECL_WEAK (fn) ? (HAVE_COMDAT_GROUP && DECL_ONE_ONLY (fn)) > + : (DECL_INTERFACE_KNOWN (fn) && !DECL_ONE_ONLY (fn))); Hmm, would (DECL_ONE_ONLY (fn) ? HAVE_COMDAT_GROUP : (DECL_INTERFACE_KNOWN (fn) && !DECL_WEAK (fn))) make sense instead? I don't think DECL_WEAK is necessary for COMDAT. Jason ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] c++: Optimize in maybe_clone_body aliases even when not at_eof [PR113208] 2024-05-10 19:59 ` Jason Merrill @ 2024-05-13 10:19 ` Jakub Jelinek 2024-05-14 22:20 ` Jason Merrill 0 siblings, 1 reply; 25+ messages in thread From: Jakub Jelinek @ 2024-05-13 10:19 UTC (permalink / raw) To: Jason Merrill Cc: Jonathan Wakely, Jan Hubicka, gcc-patches, Richard Biener, Patrick Palka On Fri, May 10, 2024 at 03:59:25PM -0400, Jason Merrill wrote: > > 2024-05-09 Jakub Jelinek <jakub@redhat.com> > > Jason Merrill <jason@redhat.com> > > > > PR lto/113208 > > * cp-tree.h (maybe_optimize_cdtor): Remove. > > * decl2.cc (tentative_decl_linkage): Call maybe_make_one_only > > for implicit instantiations of maybe in charge ctors/dtors > > declared inline. > > (import_export_decl): Don't call maybe_optimize_cdtor. > > (c_parse_final_cleanups): Formatting fixes. > > * optimize.cc (can_alias_cdtor): Adjust condition, for > > HAVE_COMDAT_GROUP && DECL_ONE_ONLY && DECL_WEAK return true even > > if not DECL_INTERFACE_KNOWN. > > > --- gcc/cp/optimize.cc.jj 2024-04-25 20:33:30.771858912 +0200 > > +++ gcc/cp/optimize.cc 2024-05-09 17:10:23.920478922 +0200 > > @@ -220,10 +220,8 @@ can_alias_cdtor (tree fn) > > gcc_assert (DECL_MAYBE_IN_CHARGE_CDTOR_P (fn)); > > /* Don't use aliases for weak/linkonce definitions unless we can put both > > symbols in the same COMDAT group. */ > > - return (DECL_INTERFACE_KNOWN (fn) > > - && (SUPPORTS_ONE_ONLY || !DECL_WEAK (fn)) > > - && (!DECL_ONE_ONLY (fn) > > - || (HAVE_COMDAT_GROUP && DECL_WEAK (fn)))); > > + return (DECL_WEAK (fn) ? (HAVE_COMDAT_GROUP && DECL_ONE_ONLY (fn)) > > + : (DECL_INTERFACE_KNOWN (fn) && !DECL_ONE_ONLY (fn))); > > Hmm, would > > (DECL_ONE_ONLY (fn) ? HAVE_COMDAT_GROUP > : (DECL_INTERFACE_KNOWN (fn) && !DECL_WEAK (fn))) > > make sense instead? I don't think DECL_WEAK is necessary for COMDAT. I think it isn't indeed necessary for COMDAT, although e.g. comdat_linkage will not call make_decl_one_only if !flag_weak. But I think it is absolutely required for the alias cdtor optimization in question, because otherwise it would be an ABI change. Consider older version of GCC or some other compiler emitting _ZN6vectorI12QualityValueEC1ERKS1_ and _ZN6vectorI12QualityValueEC2ERKS1_ symbols not as aliases, each in their own comdat groups, so .text._ZN6vectorI12QualityValueEC1ERKS1_ in _ZN6vectorI12QualityValueEC1ERKS1_ comdat group and .text._ZN6vectorI12QualityValueEC2ERKS1_ in _ZN6vectorI12QualityValueEC2ERKS1_ comdat group. And then comes GCC with the above patch without the DECL_WEAK check in there, and decides to use alias, so _ZN6vectorI12QualityValueEC1ERKS1_ is an alias to _ZN6vectorI12QualityValueEC2ERKS1_ and both live in .text._ZN6vectorI12QualityValueEC2ERKS1_ section in _ZN6vectorI12QualityValueEC5ERKS1_ comdat group. If you mix TUs with this, the linker can keep one of the section sets from the _ZN6vectorI12QualityValueEC1ERKS1_ and _ZN6vectorI12QualityValueEC2ERKS1_ and _ZN6vectorI12QualityValueEC5ERKS1_ comdat groups. If there is no .weak for the symbols, this will fail to link, one can emit it either the old way or the new way but never both, it is part of an ABI. While with .weak, mixing it is possible, worst case one gets some unused code in the linked binary or shared library. Of course the desirable case is that there is no mixing and there is no unused code, but if it happens, no big deal. Without .weak it is a big deal. Jakub ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] c++: Optimize in maybe_clone_body aliases even when not at_eof [PR113208] 2024-05-13 10:19 ` Jakub Jelinek @ 2024-05-14 22:20 ` Jason Merrill 0 siblings, 0 replies; 25+ messages in thread From: Jason Merrill @ 2024-05-14 22:20 UTC (permalink / raw) To: Jakub Jelinek Cc: Jonathan Wakely, Jan Hubicka, gcc-patches, Richard Biener, Patrick Palka On 5/13/24 06:19, Jakub Jelinek wrote: > On Fri, May 10, 2024 at 03:59:25PM -0400, Jason Merrill wrote: >>> 2024-05-09 Jakub Jelinek <jakub@redhat.com> >>> Jason Merrill <jason@redhat.com> >>> >>> PR lto/113208 >>> * cp-tree.h (maybe_optimize_cdtor): Remove. >>> * decl2.cc (tentative_decl_linkage): Call maybe_make_one_only >>> for implicit instantiations of maybe in charge ctors/dtors >>> declared inline. >>> (import_export_decl): Don't call maybe_optimize_cdtor. >>> (c_parse_final_cleanups): Formatting fixes. >>> * optimize.cc (can_alias_cdtor): Adjust condition, for >>> HAVE_COMDAT_GROUP && DECL_ONE_ONLY && DECL_WEAK return true even >>> if not DECL_INTERFACE_KNOWN. >> >>> --- gcc/cp/optimize.cc.jj 2024-04-25 20:33:30.771858912 +0200 >>> +++ gcc/cp/optimize.cc 2024-05-09 17:10:23.920478922 +0200 >>> @@ -220,10 +220,8 @@ can_alias_cdtor (tree fn) >>> gcc_assert (DECL_MAYBE_IN_CHARGE_CDTOR_P (fn)); >>> /* Don't use aliases for weak/linkonce definitions unless we can put both >>> symbols in the same COMDAT group. */ >>> - return (DECL_INTERFACE_KNOWN (fn) >>> - && (SUPPORTS_ONE_ONLY || !DECL_WEAK (fn)) >>> - && (!DECL_ONE_ONLY (fn) >>> - || (HAVE_COMDAT_GROUP && DECL_WEAK (fn)))); >>> + return (DECL_WEAK (fn) ? (HAVE_COMDAT_GROUP && DECL_ONE_ONLY (fn)) >>> + : (DECL_INTERFACE_KNOWN (fn) && !DECL_ONE_ONLY (fn))); >> >> Hmm, would >> >> (DECL_ONE_ONLY (fn) ? HAVE_COMDAT_GROUP >> : (DECL_INTERFACE_KNOWN (fn) && !DECL_WEAK (fn))) >> >> make sense instead? I don't think DECL_WEAK is necessary for COMDAT. > > I think it isn't indeed necessary for COMDAT, although e.g. comdat_linkage > will not call make_decl_one_only if !flag_weak. > > But I think it is absolutely required for the alias cdtor optimization > in question, because otherwise it would be an ABI change. > Consider older version of GCC or some other compiler emitting > _ZN6vectorI12QualityValueEC1ERKS1_ > and > _ZN6vectorI12QualityValueEC2ERKS1_ > symbols not as aliases, each in their own comdat groups, so > .text._ZN6vectorI12QualityValueEC1ERKS1_ in _ZN6vectorI12QualityValueEC1ERKS1_ > comdat group and > .text._ZN6vectorI12QualityValueEC2ERKS1_ in _ZN6vectorI12QualityValueEC2ERKS1_ > comdat group. And then comes GCC with the above patch without the DECL_WEAK > check in there, and decides to use alias, so > _ZN6vectorI12QualityValueEC1ERKS1_ is an alias to > _ZN6vectorI12QualityValueEC2ERKS1_ and both live in > .text._ZN6vectorI12QualityValueEC2ERKS1_ section in > _ZN6vectorI12QualityValueEC5ERKS1_ comdat group. If you mix TUs with this, > the linker can keep one of the section sets from the _ZN6vectorI12QualityValueEC1ERKS1_ > and _ZN6vectorI12QualityValueEC2ERKS1_ and _ZN6vectorI12QualityValueEC5ERKS1_ > comdat groups. If there is no .weak for the symbols, this will fail to > link, one can emit it either the old way or the new way but never both, it > is part of an ABI. > While with .weak, mixing it is possible, worst case one gets some unused > code in the linked binary or shared library. Of course the desirable case > is that there is no mixing and there is no unused code, but if it happens, > no big deal. Without .weak it is a big deal. Makes sense, the patch is OK. Jason ^ permalink raw reply [flat|nested] 25+ messages in thread
end of thread, other threads:[~2024-05-14 22:20 UTC | newest] Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2024-04-17 7:42 [PATCH] c++: Retry the aliasing of base/complete cdtor optimization at import_export_decl time [PR113208] Jakub Jelinek 2024-04-17 9:04 ` Jan Hubicka 2024-04-17 12:32 ` Jakub Jelinek 2024-04-17 13:26 ` Jan Hubicka 2024-04-17 14:07 ` Jakub Jelinek 2024-04-17 14:34 ` Jan Hubicka 2024-04-17 14:39 ` Jakub Jelinek 2024-04-22 15:42 ` [PATCH] c++, v2: " Jakub Jelinek 2024-04-23 3:14 ` Jason Merrill 2024-04-23 16:04 ` Jakub Jelinek 2024-04-24 9:16 ` Jonathan Wakely 2024-04-24 16:16 ` [PATCH] c++, v3: " Jakub Jelinek 2024-04-24 22:39 ` Jason Merrill 2024-04-24 22:47 ` Jakub Jelinek 2024-04-25 0:43 ` Jason Merrill 2024-04-25 12:02 ` [PATCH] c++, v4: " Jakub Jelinek 2024-04-25 14:22 ` Jakub Jelinek 2024-04-25 15:30 ` Jason Merrill 2024-04-25 18:42 ` [PATCH] c++, v5: " Jakub Jelinek 2024-05-09 18:20 ` [PATCH] c++: Optimize in maybe_clone_body aliases even when not at_eof [PR113208] Jakub Jelinek 2024-05-09 18:58 ` Marek Polacek 2024-05-09 19:05 ` Jakub Jelinek 2024-05-10 19:59 ` Jason Merrill 2024-05-13 10:19 ` Jakub Jelinek 2024-05-14 22:20 ` Jason Merrill
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).