public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] c++: Retry the aliasing of base/complete cdtor optimization at import_export_decl time [PR113208]
@ 2024-04-17  7:42 Jakub Jelinek
  2024-04-17  9:04 ` Jan Hubicka
  2024-04-22 15:42 ` [PATCH] c++, v2: " Jakub Jelinek
  0 siblings, 2 replies; 25+ messages in thread
From: Jakub Jelinek @ 2024-04-17  7:42 UTC (permalink / raw)
  To: Jason Merrill, Jan Hubicka, Richard Biener, Patrick Palka; +Cc: gcc-patches

Hi!

When expand_or_defer_fn is called at_eof time, it calls import_export_decl
and then maybe_clone_body, which uses DECL_ONE_ONLY and comdat name in a
couple of places to try to optimize cdtors which are known to have the
same body by making the complete cdtor an alias to base cdtor (and in
that case also uses *[CD]5* as comdat group name instead of the normal
comdat group names specific to each mangled name).
Now, this optimization depends on DECL_ONE_ONLY and DECL_INTERFACE_KNOWN,
maybe_clone_body and can_alias_cdtor use:
      if (DECL_ONE_ONLY (fn))
        cgraph_node::get_create (clone)->set_comdat_group (cxx_comdat_group (clone));
...
  bool can_alias = can_alias_cdtor (fn);
...
      /* Tell cgraph if both ctors or both dtors are known to have
         the same body.  */
      if (can_alias
          && fns[0]
          && idx == 1
          && cgraph_node::get_create (fns[0])->create_same_body_alias
               (clone, fns[0]))
        {
          alias = true;
          if (DECL_ONE_ONLY (fns[0]))
            {
              /* For comdat base and complete cdtors put them
                 into the same, *[CD]5* comdat group instead of
                 *[CD][12]*.  */
              comdat_group = cdtor_comdat_group (fns[1], fns[0]);
              cgraph_node::get_create (fns[0])->set_comdat_group (comdat_group);
              if (symtab_node::get (clone)->same_comdat_group)
                symtab_node::get (clone)->remove_from_same_comdat_group ();
              symtab_node::get (clone)->add_to_same_comdat_group
                (symtab_node::get (fns[0]));
            }
        }
and
  /* Don't use aliases for weak/linkonce definitions unless we can put both
     symbols in the same COMDAT group.  */
  return (DECL_INTERFACE_KNOWN (fn)
          && (SUPPORTS_ONE_ONLY || !DECL_WEAK (fn))
          && (!DECL_ONE_ONLY (fn)
              || (HAVE_COMDAT_GROUP && DECL_WEAK (fn))));
The following testcase regressed with Marek's r14-5979 change,
when pr113208_0.C is compiled where the ctor is marked constexpr,
we no longer perform this optimization, where
_ZN6vectorI12QualityValueEC2ERKS1_ was emitted in the
_ZN6vectorI12QualityValueEC5ERKS1_ comdat group and
_ZN6vectorI12QualityValueEC1ERKS1_ was made an alias to it,
instead we emit _ZN6vectorI12QualityValueEC2ERKS1_ in
_ZN6vectorI12QualityValueEC2ERKS1_ comdat group and the same
content _ZN6vectorI12QualityValueEC1ERKS1_ as separate symbol in
_ZN6vectorI12QualityValueEC1ERKS1_ comdat group.
Now, the linker seems to somehow cope with that, eventhough it
probably keeps both copies of the ctor, but seems LTO can't cope
with that and Honza doesn't know what it should do in that case
(linker decides that the prevailing symbol is
_ZN6vectorI12QualityValueEC2ERKS1_ (from the
_ZN6vectorI12QualityValueEC2ERKS1_ comdat group) and
_ZN6vectorI12QualityValueEC1ERKS1_ alias (from the other TU,
from _ZN6vectorI12QualityValueEC5ERKS1_ comdat group)).

Note, the case where some constructor is marked constexpr in one
TU and not in another one happens pretty often in libstdc++ when
one mixes -std= flags used to compile different compilation units.

The reason the optimization doesn't trigger when the constructor is
constexpr is that expand_or_defer_fn is called in that case much earlier
than when it is not constexpr; in the former case it is called when we
try to constant evaluate that constructor.  But DECL_INTERFACE_KNOWN
is false in that case and comdat_linkage hasn't been called either
(I think it is desirable, because comdat group is stored in the cgraph
node and am not sure it is a good idea to create cgraph nodes for
something that might not be needed later on at all), so maybe_clone_body
clones the bodies, but doesn't make them as aliases.

The following patch is an attempt to redo this optimization when
import_export_decl is called at_eof time on the base/complete cdtor
(or deleting dtor).  It will not do anything if maybe_clone_body
hasn't been called uyet (the TREE_ASM_WRITTEN check on the
DECL_MAYBE_IN_CHARGE_CDTOR_P), or when one or both of the base/complete
cdtors have been lowered already, or when maybe_clone_body called
maybe_thunk_body and it was successful.  Otherwise retries the
can_alias_cdtor check and makes the complete cdtor alias to the
base cdtor with adjustments to the comdat group.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-04-17  Jakub Jelinek  <jakub@redhat.com>

	PR lto/113208
	* cp-tree.h (maybe_optimize_cdtor): Declare.
	* decl2.cc (import_export_decl): Call it for cloned cdtors.
	* optimize.cc (maybe_optimize_cdtor): New function.

	* g++.dg/lto/pr113208_0.C: New test.
	* g++.dg/lto/pr113208_1.C: New file.
	* g++.dg/lto/pr113208.h: New file.

--- gcc/cp/cp-tree.h.jj	2024-04-16 17:18:37.286279533 +0200
+++ gcc/cp/cp-tree.h	2024-04-16 17:45:01.685635709 +0200
@@ -7447,6 +7447,7 @@ extern bool handle_module_option (unsign
 /* In optimize.cc */
 extern tree clone_attrs				(tree);
 extern bool maybe_clone_body			(tree);
+extern void maybe_optimize_cdtor		(tree);
 
 /* In parser.cc */
 extern tree cp_convert_range_for (tree, tree, tree, cp_decomp *, bool,
--- gcc/cp/decl2.cc.jj	2024-04-16 17:18:37.287279519 +0200
+++ gcc/cp/decl2.cc	2024-04-16 17:45:01.686635695 +0200
@@ -3568,6 +3568,9 @@ import_export_decl (tree decl)
     }
 
   DECL_INTERFACE_KNOWN (decl) = 1;
+
+  if (DECL_CLONED_FUNCTION_P (decl))
+    maybe_optimize_cdtor (decl);
 }
 
 /* Return an expression that performs the destruction of DECL, which
--- gcc/cp/optimize.cc.jj	2024-04-16 17:18:37.374278327 +0200
+++ gcc/cp/optimize.cc	2024-04-16 21:35:53.597188774 +0200
@@ -723,3 +723,58 @@ maybe_clone_body (tree fn)
   /* We don't need to process the original function any further.  */
   return 1;
 }
+
+/* If maybe_clone_body is called while the cdtor is still tentative,
+   DECL_ONE_ONLY will be false and so will be can_alias_cdtor (fn).
+   In that case we wouldn't try to optimize using an alias and instead
+   would emit separate base and complete cdtor.  The following function
+   attempts to still optimize that case when we import_export_decl
+   is called first time on one of the clones.  */
+
+void
+maybe_optimize_cdtor (tree orig_decl)
+{
+  tree fns[3];
+  tree fn = DECL_CLONED_FUNCTION (orig_decl);
+  gcc_checking_assert (DECL_MAYBE_IN_CHARGE_CDTOR_P (fn));
+  if (DECL_INTERFACE_KNOWN (fn)
+      || !TREE_ASM_WRITTEN (fn)
+      || !DECL_ONE_ONLY (orig_decl)
+      || symtab->global_info_ready)
+    return;
+
+  populate_clone_array (fn, fns);
+
+  if (!fns[0] || !fns[1])
+    return;
+  for (int i = 2 - !fns[2]; i >= 0; --i)
+    if (fns[i] != orig_decl && DECL_INTERFACE_KNOWN (fns[i]))
+      return;
+  DECL_INTERFACE_KNOWN (fn) = 1;
+  comdat_linkage (fn);
+  if (!can_alias_cdtor (fn))
+    return;
+  /* For comdat base and complete cdtors put them into the same,
+     *[CD]5* comdat group instead of *[CD][12]*.  */
+  auto n0 = cgraph_node::get_create (fns[0]);
+  auto n1 = cgraph_node::get_create (fns[1]);
+  auto n2 = fns[2] ? cgraph_node::get_create (fns[1]) : NULL;
+  if (n0->lowered || n1->lowered || (n2 && n2->lowered))
+    return;
+  import_export_decl (fns[0]);
+  n1->definition = false;
+  if (!n0->create_same_body_alias (fns[1], fns[0]))
+    return;
+
+  tree comdat_group = cdtor_comdat_group (fns[1], fns[0]);
+  n1 = cgraph_node::get (fns[1]);
+  n0->set_comdat_group (comdat_group);
+  if (n1->same_comdat_group)
+    n1->remove_from_same_comdat_group ();
+  n1->add_to_same_comdat_group (n0);
+  if (fns[2])
+    n2->add_to_same_comdat_group (n0);
+  import_export_decl (fns[1]);
+  /* Remove the body now that it is an alias.  */
+  DECL_SAVED_TREE (fns[1]) = NULL_TREE;
+}
--- gcc/testsuite/g++.dg/lto/pr113208_0.C.jj	2024-04-16 17:45:01.687635682 +0200
+++ gcc/testsuite/g++.dg/lto/pr113208_0.C	2024-04-16 17:45:01.687635682 +0200
@@ -0,0 +1,13 @@
+// { dg-lto-do link }
+// { dg-lto-options { {-O1 -std=c++20 -flto}} }
+// { dg-extra-ld-options "-r -nostdlib -flinker-output=nolto-rel" }
+// { dg-require-linker-plugin "" }
+
+#define CONSTEXPR constexpr
+#include "pr113208.h"
+
+struct QualityValue;
+struct k : vector<QualityValue> {};
+
+void m(k);
+void n(k i) { m(i); }
--- gcc/testsuite/g++.dg/lto/pr113208_1.C.jj	2024-04-16 17:45:01.687635682 +0200
+++ gcc/testsuite/g++.dg/lto/pr113208_1.C	2024-04-16 17:45:01.687635682 +0200
@@ -0,0 +1,6 @@
+#define CONSTEXPR 
+#include "pr113208.h"
+
+struct QualityValue;
+vector<QualityValue> values1;
+vector<QualityValue> values{values1};
--- gcc/testsuite/g++.dg/lto/pr113208.h.jj	2024-04-16 17:45:01.687635682 +0200
+++ gcc/testsuite/g++.dg/lto/pr113208.h	2024-04-16 17:45:01.687635682 +0200
@@ -0,0 +1,10 @@
+template <typename _Tp> struct _Vector_base {
+  int g() const;
+  _Vector_base(int, int);
+};
+template <typename _Tp>
+struct vector : _Vector_base<_Tp> {
+  CONSTEXPR vector(const vector &__x)
+      : _Vector_base<_Tp>(1, __x.g()) {}
+ vector() : _Vector_base<_Tp>(1, 2) {}
+};

	Jakub


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] c++: Retry the aliasing of base/complete cdtor optimization at import_export_decl time [PR113208]
  2024-04-17  7:42 [PATCH] c++: Retry the aliasing of base/complete cdtor optimization at import_export_decl time [PR113208] Jakub Jelinek
@ 2024-04-17  9:04 ` Jan Hubicka
  2024-04-17 12:32   ` Jakub Jelinek
  2024-04-22 15:42 ` [PATCH] c++, v2: " Jakub Jelinek
  1 sibling, 1 reply; 25+ messages in thread
From: Jan Hubicka @ 2024-04-17  9:04 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Jason Merrill, Richard Biener, Patrick Palka, gcc-patches

> Hi!
Hello,
> The reason the optimization doesn't trigger when the constructor is
> constexpr is that expand_or_defer_fn is called in that case much earlier
> than when it is not constexpr; in the former case it is called when we
> try to constant evaluate that constructor.  But DECL_INTERFACE_KNOWN
> is false in that case and comdat_linkage hasn't been called either
> (I think it is desirable, because comdat group is stored in the cgraph
> node and am not sure it is a good idea to create cgraph nodes for
> something that might not be needed later on at all), so maybe_clone_body
> clones the bodies, but doesn't make them as aliases.

Thanks for working this out! Creating cgraph node with no body is
harmless as it will be removed later.  
> 
> The following patch is an attempt to redo this optimization when
> import_export_decl is called at_eof time on the base/complete cdtor
> (or deleting dtor).  It will not do anything if maybe_clone_body
> hasn't been called uyet (the TREE_ASM_WRITTEN check on the
> DECL_MAYBE_IN_CHARGE_CDTOR_P), or when one or both of the base/complete
> cdtors have been lowered already, or when maybe_clone_body called
> maybe_thunk_body and it was successful.  Otherwise retries the
> can_alias_cdtor check and makes the complete cdtor alias to the
> base cdtor with adjustments to the comdat group.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> 2024-04-17  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR lto/113208
> 	* cp-tree.h (maybe_optimize_cdtor): Declare.
> 	* decl2.cc (import_export_decl): Call it for cloned cdtors.
> 	* optimize.cc (maybe_optimize_cdtor): New function.
> 
> 	* g++.dg/lto/pr113208_0.C: New test.
> 	* g++.dg/lto/pr113208_1.C: New file.
> 	* g++.dg/lto/pr113208.h: New file.
> 
> --- gcc/cp/cp-tree.h.jj	2024-04-16 17:18:37.286279533 +0200
> +++ gcc/cp/cp-tree.h	2024-04-16 17:45:01.685635709 +0200
> @@ -7447,6 +7447,7 @@ extern bool handle_module_option (unsign
>  /* In optimize.cc */
>  extern tree clone_attrs				(tree);
>  extern bool maybe_clone_body			(tree);
> +extern void maybe_optimize_cdtor		(tree);
>  
>  /* In parser.cc */
>  extern tree cp_convert_range_for (tree, tree, tree, cp_decomp *, bool,
> --- gcc/cp/decl2.cc.jj	2024-04-16 17:18:37.287279519 +0200
> +++ gcc/cp/decl2.cc	2024-04-16 17:45:01.686635695 +0200
> @@ -3568,6 +3568,9 @@ import_export_decl (tree decl)
>      }
>  
>    DECL_INTERFACE_KNOWN (decl) = 1;
> +
> +  if (DECL_CLONED_FUNCTION_P (decl))
> +    maybe_optimize_cdtor (decl);
>  }
>  
>  /* Return an expression that performs the destruction of DECL, which
> --- gcc/cp/optimize.cc.jj	2024-04-16 17:18:37.374278327 +0200
> +++ gcc/cp/optimize.cc	2024-04-16 21:35:53.597188774 +0200
> @@ -723,3 +723,58 @@ maybe_clone_body (tree fn)
>    /* We don't need to process the original function any further.  */
>    return 1;
>  }
> +
> +/* If maybe_clone_body is called while the cdtor is still tentative,
> +   DECL_ONE_ONLY will be false and so will be can_alias_cdtor (fn).
> +   In that case we wouldn't try to optimize using an alias and instead
> +   would emit separate base and complete cdtor.  The following function
> +   attempts to still optimize that case when we import_export_decl
> +   is called first time on one of the clones.  */
> +
> +void
> +maybe_optimize_cdtor (tree orig_decl)
> +{
> +  tree fns[3];
> +  tree fn = DECL_CLONED_FUNCTION (orig_decl);
> +  gcc_checking_assert (DECL_MAYBE_IN_CHARGE_CDTOR_P (fn));
> +  if (DECL_INTERFACE_KNOWN (fn)
> +      || !TREE_ASM_WRITTEN (fn)
> +      || !DECL_ONE_ONLY (orig_decl)
> +      || symtab->global_info_ready)
> +    return;
> +
> +  populate_clone_array (fn, fns);
> +
> +  if (!fns[0] || !fns[1])
> +    return;
> +  for (int i = 2 - !fns[2]; i >= 0; --i)
> +    if (fns[i] != orig_decl && DECL_INTERFACE_KNOWN (fns[i]))
> +      return;
> +  DECL_INTERFACE_KNOWN (fn) = 1;
> +  comdat_linkage (fn);
> +  if (!can_alias_cdtor (fn))
> +    return;
> +  /* For comdat base and complete cdtors put them into the same,
> +     *[CD]5* comdat group instead of *[CD][12]*.  */
> +  auto n0 = cgraph_node::get_create (fns[0]);
> +  auto n1 = cgraph_node::get_create (fns[1]);
> +  auto n2 = fns[2] ? cgraph_node::get_create (fns[1]) : NULL;
> +  if (n0->lowered || n1->lowered || (n2 && n2->lowered))
> +    return;
> +  import_export_decl (fns[0]);
> +  n1->definition = false;
> +  if (!n0->create_same_body_alias (fns[1], fns[0]))
> +    return;
> +
> +  tree comdat_group = cdtor_comdat_group (fns[1], fns[0]);
> +  n1 = cgraph_node::get (fns[1]);
> +  n0->set_comdat_group (comdat_group);
> +  if (n1->same_comdat_group)
> +    n1->remove_from_same_comdat_group ();
> +  n1->add_to_same_comdat_group (n0);
> +  if (fns[2])
> +    n2->add_to_same_comdat_group (n0);
> +  import_export_decl (fns[1]);

So this is handling the case where symbol was already inserted into one
comdat group and later we need to move it into the C5 group?  As long as
we move everythingf rom old comdat group, this should be fine.
> +  /* Remove the body now that it is an alias.  */
> +  DECL_SAVED_TREE (fns[1]) = NULL_TREE;
Maybe using release_function_body since it also knows how to remove
DECL_STRUCT_FUNCTION that exists at this stage?

I was thinking how to solve the problem on cgrpah side.  We definitely
have long lasting bug where aliases are handled incorrectly for which I
made WIP patch (but probably would like to hold it after release branch is
created).  When foo has alias bar and foo is praviled then the alias
target is prevailed too.  This is what causes the ICE about cross comdat
section alias.  However fixing this is not enough as I do not think we
can handle incremental linking correctly (as discussed briefly on IRC
technically we should keep both sections but that would require two
symbols of same name in single .o file).

With the aliasing fixed we turn the other symbol into static but keep
alias, so we end up with one comdat group having the non-aliased
constructor and second comdat group (C5) exporting only the alias, which
is not quite reight either.

Honza

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] c++: Retry the aliasing of base/complete cdtor optimization at import_export_decl time [PR113208]
  2024-04-17  9:04 ` Jan Hubicka
@ 2024-04-17 12:32   ` Jakub Jelinek
  2024-04-17 13:26     ` Jan Hubicka
  0 siblings, 1 reply; 25+ messages in thread
From: Jakub Jelinek @ 2024-04-17 12:32 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Jason Merrill, Richard Biener, Patrick Palka, gcc-patches

On Wed, Apr 17, 2024 at 11:04:26AM +0200, Jan Hubicka wrote:
> > The reason the optimization doesn't trigger when the constructor is
> > constexpr is that expand_or_defer_fn is called in that case much earlier
> > than when it is not constexpr; in the former case it is called when we
> > try to constant evaluate that constructor.  But DECL_INTERFACE_KNOWN
> > is false in that case and comdat_linkage hasn't been called either
> > (I think it is desirable, because comdat group is stored in the cgraph
> > node and am not sure it is a good idea to create cgraph nodes for
> > something that might not be needed later on at all), so maybe_clone_body
> > clones the bodies, but doesn't make them as aliases.
> 
> Thanks for working this out! Creating cgraph node with no body is
> harmless as it will be removed later.  

That is actually for functions with bodies, maybe_clone_body creates the
bodies for those, but still when it happens early, the cdtors have
tentative_decl_linkage linkage, which in many cases means DECL_EXTERNAL,
DECL_NOT_REALLY_EXTERN (C++ FE special thing), DECL_DEFER_OUTPUT etc.

> > +  tree comdat_group = cdtor_comdat_group (fns[1], fns[0]);
> > +  n1 = cgraph_node::get (fns[1]);
> > +  n0->set_comdat_group (comdat_group);
> > +  if (n1->same_comdat_group)
> > +    n1->remove_from_same_comdat_group ();
> > +  n1->add_to_same_comdat_group (n0);
> > +  if (fns[2])
> > +    n2->add_to_same_comdat_group (n0);
> > +  import_export_decl (fns[1]);
> 
> So this is handling the case where symbol was already inserted into one
> comdat group and later we need to move it into the C5 group?  As long as
> we move everythingf rom old comdat group, this should be fine.

The above is pretty much an adjusted copy of what maybe_clone_body does,
except it doesn't call cgraph_node::get{,_create} all the time and uses
import_export_decl rather than expand_or_defer_fn{,_1}.

> > +  /* Remove the body now that it is an alias.  */
> > +  DECL_SAVED_TREE (fns[1]) = NULL_TREE;
> Maybe using release_function_body since it also knows how to remove
> DECL_STRUCT_FUNCTION that exists at this stage?

Guess I could try that, clearing of DECL_SAVED_TREE was what was done
in maybe_clone_body too.

> I was thinking how to solve the problem on cgrpah side.  We definitely
> have long lasting bug where aliases are handled incorrectly for which I
> made WIP patch (but probably would like to hold it after release branch is
> created).  When foo has alias bar and foo is praviled then the alias
> target is prevailed too.  This is what causes the ICE about cross comdat
> section alias.  However fixing this is not enough as I do not think we
> can handle incremental linking correctly (as discussed briefly on IRC
> technically we should keep both sections but that would require two
> symbols of same name in single .o file).
> 
> With the aliasing fixed we turn the other symbol into static but keep
> alias, so we end up with one comdat group having the non-aliased
> constructor and second comdat group (C5) exporting only the alias, which
> is not quite reight either.

I've tried to see what actually happens during linking without LTO, so compiled
pr113208_0.C with -O1 -fkeep-inline-functions -std=c++20 with vanilla trunk
(so it has those 2 separate comdats, one for C2 and one for C1), though I've
changed the
void m(k);
line to
__attribute__((noipa)) void m(k) {}
in the testcase, then compiled
pr113208_1.C with -O2 -fkeep-inline-functions -std=c++20 -fno-omit-frame-pointer
so that one can clearly differentiate from where the implementation was
picked and finally added
template <typename _Tp> struct _Vector_base {
  int g() const;
  _Vector_base(int, int);
};

struct QualityValue;
template <>
_Vector_base<QualityValue>::_Vector_base(int, int) {}
template <>
int _Vector_base<QualityValue>::g() const { return 0; }
int main () {}
If I link this, I see _ZN6vectorI12QualityValueEC2ERKS1_ and
_ZN6vectorI12QualityValueEC1ERKS1_ as separate functions with the
omitted frame pointer bodies, so clearly the pr113208_0.C versions prevailed
in both cases.  It is unclear why that isn't the case for LTO.

	Jakub


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] c++: Retry the aliasing of base/complete cdtor optimization at import_export_decl time [PR113208]
  2024-04-17 12:32   ` Jakub Jelinek
@ 2024-04-17 13:26     ` Jan Hubicka
  2024-04-17 14:07       ` Jakub Jelinek
  0 siblings, 1 reply; 25+ messages in thread
From: Jan Hubicka @ 2024-04-17 13:26 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Jason Merrill, Richard Biener, Patrick Palka, gcc-patches

> 
> I've tried to see what actually happens during linking without LTO, so compiled
> pr113208_0.C with -O1 -fkeep-inline-functions -std=c++20 with vanilla trunk
> (so it has those 2 separate comdats, one for C2 and one for C1), though I've
> changed the
> void m(k);
> line to
> __attribute__((noipa)) void m(k) {}
> in the testcase, then compiled
> pr113208_1.C with -O2 -fkeep-inline-functions -std=c++20 -fno-omit-frame-pointer
> so that one can clearly differentiate from where the implementation was
> picked and finally added
> template <typename _Tp> struct _Vector_base {
>   int g() const;
>   _Vector_base(int, int);
> };
> 
> struct QualityValue;
> template <>
> _Vector_base<QualityValue>::_Vector_base(int, int) {}
> template <>
> int _Vector_base<QualityValue>::g() const { return 0; }
> int main () {}
> If I link this, I see _ZN6vectorI12QualityValueEC2ERKS1_ and
> _ZN6vectorI12QualityValueEC1ERKS1_ as separate functions with the
> omitted frame pointer bodies, so clearly the pr113208_0.C versions prevailed
> in both cases.  It is unclear why that isn't the case for LTO.

I think it is because of -fkeep-inline-functions which makes the first
object file to define both symbols, while with LTO we optimize out one
of them.  

So to reproduce same behaviour with non-LTO we would probably need use
-O1 and arrange the contructor to be unilinable instead of using
-fkeep-inline-functions.

Honza

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] c++: Retry the aliasing of base/complete cdtor optimization at import_export_decl time [PR113208]
  2024-04-17 13:26     ` Jan Hubicka
@ 2024-04-17 14:07       ` Jakub Jelinek
  2024-04-17 14:34         ` Jan Hubicka
  0 siblings, 1 reply; 25+ messages in thread
From: Jakub Jelinek @ 2024-04-17 14:07 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Jason Merrill, Richard Biener, Patrick Palka, gcc-patches

On Wed, Apr 17, 2024 at 03:26:53PM +0200, Jan Hubicka wrote:
> > 
> > I've tried to see what actually happens during linking without LTO, so compiled
> > pr113208_0.C with -O1 -fkeep-inline-functions -std=c++20 with vanilla trunk
> > (so it has those 2 separate comdats, one for C2 and one for C1), though I've
> > changed the
> > void m(k);
> > line to
> > __attribute__((noipa)) void m(k) {}
> > in the testcase, then compiled
> > pr113208_1.C with -O2 -fkeep-inline-functions -std=c++20 -fno-omit-frame-pointer
> > so that one can clearly differentiate from where the implementation was
> > picked and finally added
> > template <typename _Tp> struct _Vector_base {
> >   int g() const;
> >   _Vector_base(int, int);
> > };
> > 
> > struct QualityValue;
> > template <>
> > _Vector_base<QualityValue>::_Vector_base(int, int) {}
> > template <>
> > int _Vector_base<QualityValue>::g() const { return 0; }
> > int main () {}
> > If I link this, I see _ZN6vectorI12QualityValueEC2ERKS1_ and
> > _ZN6vectorI12QualityValueEC1ERKS1_ as separate functions with the
> > omitted frame pointer bodies, so clearly the pr113208_0.C versions prevailed
> > in both cases.  It is unclear why that isn't the case for LTO.
> 
> I think it is because of -fkeep-inline-functions which makes the first
> object file to define both symbols, while with LTO we optimize out one
> of them.  
> 
> So to reproduce same behaviour with non-LTO we would probably need use
> -O1 and arrange the contructor to be unilinable instead of using
> -fkeep-inline-functions.

Ah, you're right.
If I compile (the one line modified) pr113208_0.C with
-O -fno-early-inlining -fdisable-ipa-inline -std=c++20
it does have just _ZN6vectorI12QualityValueEC2ERKS1_ in _ZN6vectorI12QualityValueEC2ERKS1_
comdat and no _ZN6vectorI12QualityValueEC1ERKS1_
and pr113208_1.C with -O -fno-early-inlining -fdisable-ipa-inline -std=c++20 -fno-omit-frame-pointer
and link that together with the above mentioned third *.C file, I see
000000000040112a <_ZN6vectorI12QualityValueEC2ERKS1_>:
  40112a:       53                      push   %rbx
  40112b:       48 89 fb                mov    %rdi,%rbx
  40112e:       48 89 f7                mov    %rsi,%rdi
  401131:       e8 9c 00 00 00          call   4011d2 <_ZNK12_Vector_baseI12QualityValueE1gEv>
  401136:       89 c2                   mov    %eax,%edx
  401138:       be 01 00 00 00          mov    $0x1,%esi
  40113d:       48 89 df                mov    %rbx,%rdi
  401140:       e8 7b 00 00 00          call   4011c0 <_ZN12_Vector_baseI12QualityValueEC1Eii>
  401145:       5b                      pop    %rbx
  401146:       c3                      ret    
i.e. the C2 prevailing from pr113208_0.s where it is the only symbol, and
0000000000401196 <_ZN6vectorI12QualityValueEC1ERKS1_>:
  401196:       55                      push   %rbp
  401197:       48 89 e5                mov    %rsp,%rbp
  40119a:       53                      push   %rbx
  40119b:       48 83 ec 08             sub    $0x8,%rsp
  40119f:       48 89 fb                mov    %rdi,%rbx
  4011a2:       48 89 f7                mov    %rsi,%rdi
  4011a5:       e8 28 00 00 00          call   4011d2 <_ZNK12_Vector_baseI12QualityValueE1gEv>
  4011aa:       89 c2                   mov    %eax,%edx
  4011ac:       be 01 00 00 00          mov    $0x1,%esi
  4011b1:       48 89 df                mov    %rbx,%rdi
  4011b4:       e8 07 00 00 00          call   4011c0 <_ZN12_Vector_baseI12QualityValueEC1Eii>
  4011b9:       48 8b 5d f8             mov    -0x8(%rbp),%rbx
  4011bd:       c9                      leave  
  4011be:       c3                      ret    
which is the C1 alias originally aliased to C2 in C5 comdat.
So, that would match linker behavior where it sees C1 -> C2 alias prevails,
but a different version of C2 prevails, so let's either make C1 a non-alias
or alias to a non-exported symbol or something like that.
Though, I admit I have no idea what we do with comdat's during LTO, perhaps
doing what I said above could break stuff if linker after seeing the LTO
resulting objects decides on prevailing symbols differently.

	Jakub


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] c++: Retry the aliasing of base/complete cdtor optimization at import_export_decl time [PR113208]
  2024-04-17 14:07       ` Jakub Jelinek
@ 2024-04-17 14:34         ` Jan Hubicka
  2024-04-17 14:39           ` Jakub Jelinek
  0 siblings, 1 reply; 25+ messages in thread
From: Jan Hubicka @ 2024-04-17 14:34 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Jason Merrill, Richard Biener, Patrick Palka, gcc-patches

> 
> Ah, you're right.
> If I compile (the one line modified) pr113208_0.C with
> -O -fno-early-inlining -fdisable-ipa-inline -std=c++20
> it does have just _ZN6vectorI12QualityValueEC2ERKS1_ in _ZN6vectorI12QualityValueEC2ERKS1_
> comdat and no _ZN6vectorI12QualityValueEC1ERKS1_
> and pr113208_1.C with -O -fno-early-inlining -fdisable-ipa-inline -std=c++20 -fno-omit-frame-pointer
> and link that together with the above mentioned third *.C file, I see
> 000000000040112a <_ZN6vectorI12QualityValueEC2ERKS1_>:
>   40112a:       53                      push   %rbx
>   40112b:       48 89 fb                mov    %rdi,%rbx
>   40112e:       48 89 f7                mov    %rsi,%rdi
>   401131:       e8 9c 00 00 00          call   4011d2 <_ZNK12_Vector_baseI12QualityValueE1gEv>
>   401136:       89 c2                   mov    %eax,%edx
>   401138:       be 01 00 00 00          mov    $0x1,%esi
>   40113d:       48 89 df                mov    %rbx,%rdi
>   401140:       e8 7b 00 00 00          call   4011c0 <_ZN12_Vector_baseI12QualityValueEC1Eii>
>   401145:       5b                      pop    %rbx
>   401146:       c3                      ret    
> i.e. the C2 prevailing from pr113208_0.s where it is the only symbol, and
> 0000000000401196 <_ZN6vectorI12QualityValueEC1ERKS1_>:
>   401196:       55                      push   %rbp
>   401197:       48 89 e5                mov    %rsp,%rbp
>   40119a:       53                      push   %rbx
>   40119b:       48 83 ec 08             sub    $0x8,%rsp
>   40119f:       48 89 fb                mov    %rdi,%rbx
>   4011a2:       48 89 f7                mov    %rsi,%rdi
>   4011a5:       e8 28 00 00 00          call   4011d2 <_ZNK12_Vector_baseI12QualityValueE1gEv>
>   4011aa:       89 c2                   mov    %eax,%edx
>   4011ac:       be 01 00 00 00          mov    $0x1,%esi
>   4011b1:       48 89 df                mov    %rbx,%rdi
>   4011b4:       e8 07 00 00 00          call   4011c0 <_ZN12_Vector_baseI12QualityValueEC1Eii>
>   4011b9:       48 8b 5d f8             mov    -0x8(%rbp),%rbx
>   4011bd:       c9                      leave  
>   4011be:       c3                      ret    
> which is the C1 alias originally aliased to C2 in C5 comdat.
> So, that would match linker behavior where it sees C1 -> C2 alias prevails,
> but a different version of C2 prevails, so let's either make C1 a non-alias
> or alias to a non-exported symbol or something like that.

Yep, I think the linker simply handles the C2 symbol as weak symbol and
after it decides to keep both comdat section (C2 and C5) the C5's weak
symbol is prevailed by C2.

I wrote patch doing the same with LTO - we need to handle alias
references specially and do not update them according to the resolution
info and then look for prevailed symbols which have aliases and turn
them to static symbols.  

I think for most scenarios this is OK, but I am not sure about
incremental linking (both LTO and non-LTO). What seems wrong is that we
produce C5 comdat that is not exporting what it should and thus breaking
the invariant that in valid code all comdats of same name are
semantically equivalent.  Perhaps it makes no difference since this
scenario is pretty special and we know that the functions are
semantically equivalent and their addresses are never compared for
equality (at least I failed to produce some useful testcase).

Honza

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] c++: Retry the aliasing of base/complete cdtor optimization at import_export_decl time [PR113208]
  2024-04-17 14:34         ` Jan Hubicka
@ 2024-04-17 14:39           ` Jakub Jelinek
  0 siblings, 0 replies; 25+ messages in thread
From: Jakub Jelinek @ 2024-04-17 14:39 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Jason Merrill, Richard Biener, Patrick Palka, gcc-patches

On Wed, Apr 17, 2024 at 04:34:07PM +0200, Jan Hubicka wrote:
> I think for most scenarios this is OK, but I am not sure about
> incremental linking (both LTO and non-LTO). What seems wrong is that we
> produce C5 comdat that is not exporting what it should and thus breaking
> the invariant that in valid code all comdats of same name are
> semantically equivalent.

Yeah, exactly.  That is what I'm worried about too.

>  Perhaps it makes no difference since this
> scenario is pretty special and we know that the functions are
> semantically equivalent and their addresses are never compared for
> equality (at least I failed to produce some useful testcase).

Yes, I think one can't take address of a constructor/destructor and compare
that for equality; I guess the destructor address can be stored in vtables,
but code manually reading stuff from vtables and assuming pointer equality
is almost certainly not valid.

	Jakub


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH] c++, v2: Retry the aliasing of base/complete cdtor optimization at import_export_decl time [PR113208]
  2024-04-17  7:42 [PATCH] c++: Retry the aliasing of base/complete cdtor optimization at import_export_decl time [PR113208] Jakub Jelinek
  2024-04-17  9:04 ` Jan Hubicka
@ 2024-04-22 15:42 ` Jakub Jelinek
  2024-04-23  3:14   ` Jason Merrill
  1 sibling, 1 reply; 25+ messages in thread
From: Jakub Jelinek @ 2024-04-22 15:42 UTC (permalink / raw)
  To: Jason Merrill; +Cc: gcc-patches, Jan Hubicka, Richard Biener, Patrick Palka

On Wed, Apr 17, 2024 at 09:42:47AM +0200, Jakub Jelinek wrote:
> When expand_or_defer_fn is called at_eof time, it calls import_export_decl
> and then maybe_clone_body, which uses DECL_ONE_ONLY and comdat name in a
> couple of places to try to optimize cdtors which are known to have the
> same body by making the complete cdtor an alias to base cdtor (and in
> that case also uses *[CD]5* as comdat group name instead of the normal
> comdat group names specific to each mangled name).
> Now, this optimization depends on DECL_ONE_ONLY and DECL_INTERFACE_KNOWN,
> maybe_clone_body and can_alias_cdtor use:
>       if (DECL_ONE_ONLY (fn))
>         cgraph_node::get_create (clone)->set_comdat_group (cxx_comdat_group (clone));
> ...
>   bool can_alias = can_alias_cdtor (fn);
> ...
>       /* Tell cgraph if both ctors or both dtors are known to have
>          the same body.  */
>       if (can_alias
>           && fns[0]
>           && idx == 1
>           && cgraph_node::get_create (fns[0])->create_same_body_alias
>                (clone, fns[0]))
>         {
>           alias = true;
>           if (DECL_ONE_ONLY (fns[0]))
>             {
>               /* For comdat base and complete cdtors put them
>                  into the same, *[CD]5* comdat group instead of
>                  *[CD][12]*.  */
>               comdat_group = cdtor_comdat_group (fns[1], fns[0]);
>               cgraph_node::get_create (fns[0])->set_comdat_group (comdat_group);
>               if (symtab_node::get (clone)->same_comdat_group)
>                 symtab_node::get (clone)->remove_from_same_comdat_group ();
>               symtab_node::get (clone)->add_to_same_comdat_group
>                 (symtab_node::get (fns[0]));
>             }
>         }
> and
>   /* Don't use aliases for weak/linkonce definitions unless we can put both
>      symbols in the same COMDAT group.  */
>   return (DECL_INTERFACE_KNOWN (fn)
>           && (SUPPORTS_ONE_ONLY || !DECL_WEAK (fn))
>           && (!DECL_ONE_ONLY (fn)
>               || (HAVE_COMDAT_GROUP && DECL_WEAK (fn))));
> The following testcase regressed with Marek's r14-5979 change,
> when pr113208_0.C is compiled where the ctor is marked constexpr,
> we no longer perform this optimization, where
> _ZN6vectorI12QualityValueEC2ERKS1_ was emitted in the
> _ZN6vectorI12QualityValueEC5ERKS1_ comdat group and
> _ZN6vectorI12QualityValueEC1ERKS1_ was made an alias to it,
> instead we emit _ZN6vectorI12QualityValueEC2ERKS1_ in
> _ZN6vectorI12QualityValueEC2ERKS1_ comdat group and the same
> content _ZN6vectorI12QualityValueEC1ERKS1_ as separate symbol in
> _ZN6vectorI12QualityValueEC1ERKS1_ comdat group.
> Now, the linker seems to somehow cope with that, eventhough it
> probably keeps both copies of the ctor, but seems LTO can't cope
> with that and Honza doesn't know what it should do in that case
> (linker decides that the prevailing symbol is
> _ZN6vectorI12QualityValueEC2ERKS1_ (from the
> _ZN6vectorI12QualityValueEC2ERKS1_ comdat group) and
> _ZN6vectorI12QualityValueEC1ERKS1_ alias (from the other TU,
> from _ZN6vectorI12QualityValueEC5ERKS1_ comdat group)).
> 
> Note, the case where some constructor is marked constexpr in one
> TU and not in another one happens pretty often in libstdc++ when
> one mixes -std= flags used to compile different compilation units.
> 
> The reason the optimization doesn't trigger when the constructor is
> constexpr is that expand_or_defer_fn is called in that case much earlier
> than when it is not constexpr; in the former case it is called when we
> try to constant evaluate that constructor.  But DECL_INTERFACE_KNOWN
> is false in that case and comdat_linkage hasn't been called either
> (I think it is desirable, because comdat group is stored in the cgraph
> node and am not sure it is a good idea to create cgraph nodes for
> something that might not be needed later on at all), so maybe_clone_body
> clones the bodies, but doesn't make them as aliases.
> 
> The following patch is an attempt to redo this optimization when
> import_export_decl is called at_eof time on the base/complete cdtor
> (or deleting dtor).  It will not do anything if maybe_clone_body
> hasn't been called uyet (the TREE_ASM_WRITTEN check on the
> DECL_MAYBE_IN_CHARGE_CDTOR_P), or when one or both of the base/complete
> cdtors have been lowered already, or when maybe_clone_body called
> maybe_thunk_body and it was successful.  Otherwise retries the
> can_alias_cdtor check and makes the complete cdtor alias to the
> base cdtor with adjustments to the comdat group.

Here is an updated version of the patch which changes
-  DECL_SAVED_TREE (fns[1]) = NULL_TREE;
+  release_function_body (fns[1]);
as suggested by Honza.

Bootstrapped/regtested on x86_64-linux and i686-linux again, ok for trunk?

2024-04-22  Jakub Jelinek  <jakub@redhat.com>

	PR lto/113208
	* cp-tree.h (maybe_optimize_cdtor): Declare.
	* decl2.cc (import_export_decl): Call it for cloned cdtors.
	* optimize.cc (maybe_optimize_cdtor): New function.

	* g++.dg/lto/pr113208_0.C: New test.
	* g++.dg/lto/pr113208_1.C: New file.
	* g++.dg/lto/pr113208.h: New file.

--- gcc/cp/cp-tree.h.jj	2024-04-16 17:18:37.286279533 +0200
+++ gcc/cp/cp-tree.h	2024-04-16 17:45:01.685635709 +0200
@@ -7447,6 +7447,7 @@ extern bool handle_module_option (unsign
 /* In optimize.cc */
 extern tree clone_attrs				(tree);
 extern bool maybe_clone_body			(tree);
+extern void maybe_optimize_cdtor		(tree);
 
 /* In parser.cc */
 extern tree cp_convert_range_for (tree, tree, tree, cp_decomp *, bool,
--- gcc/cp/decl2.cc.jj	2024-04-16 17:18:37.287279519 +0200
+++ gcc/cp/decl2.cc	2024-04-16 17:45:01.686635695 +0200
@@ -3568,6 +3568,9 @@ import_export_decl (tree decl)
     }
 
   DECL_INTERFACE_KNOWN (decl) = 1;
+
+  if (DECL_CLONED_FUNCTION_P (decl))
+    maybe_optimize_cdtor (decl);
 }
 
 /* Return an expression that performs the destruction of DECL, which
--- gcc/cp/optimize.cc.jj	2024-04-16 17:18:37.374278327 +0200
+++ gcc/cp/optimize.cc	2024-04-22 14:52:59.310630914 +0200
@@ -723,3 +723,58 @@ maybe_clone_body (tree fn)
   /* We don't need to process the original function any further.  */
   return 1;
 }
+
+/* If maybe_clone_body is called while the cdtor is still tentative,
+   DECL_ONE_ONLY will be false and so will be can_alias_cdtor (fn).
+   In that case we wouldn't try to optimize using an alias and instead
+   would emit separate base and complete cdtor.  The following function
+   attempts to still optimize that case when we import_export_decl
+   is called first time on one of the clones.  */
+
+void
+maybe_optimize_cdtor (tree orig_decl)
+{
+  tree fns[3];
+  tree fn = DECL_CLONED_FUNCTION (orig_decl);
+  gcc_checking_assert (DECL_MAYBE_IN_CHARGE_CDTOR_P (fn));
+  if (DECL_INTERFACE_KNOWN (fn)
+      || !TREE_ASM_WRITTEN (fn)
+      || !DECL_ONE_ONLY (orig_decl)
+      || symtab->global_info_ready)
+    return;
+
+  populate_clone_array (fn, fns);
+
+  if (!fns[0] || !fns[1])
+    return;
+  for (int i = 2 - !fns[2]; i >= 0; --i)
+    if (fns[i] != orig_decl && DECL_INTERFACE_KNOWN (fns[i]))
+      return;
+  DECL_INTERFACE_KNOWN (fn) = 1;
+  comdat_linkage (fn);
+  if (!can_alias_cdtor (fn))
+    return;
+  /* For comdat base and complete cdtors put them into the same,
+     *[CD]5* comdat group instead of *[CD][12]*.  */
+  auto n0 = cgraph_node::get_create (fns[0]);
+  auto n1 = cgraph_node::get_create (fns[1]);
+  auto n2 = fns[2] ? cgraph_node::get_create (fns[1]) : NULL;
+  if (n0->lowered || n1->lowered || (n2 && n2->lowered))
+    return;
+  import_export_decl (fns[0]);
+  n1->definition = false;
+  if (!n0->create_same_body_alias (fns[1], fns[0]))
+    return;
+
+  tree comdat_group = cdtor_comdat_group (fns[1], fns[0]);
+  n1 = cgraph_node::get (fns[1]);
+  n0->set_comdat_group (comdat_group);
+  if (n1->same_comdat_group)
+    n1->remove_from_same_comdat_group ();
+  n1->add_to_same_comdat_group (n0);
+  if (fns[2])
+    n2->add_to_same_comdat_group (n0);
+  import_export_decl (fns[1]);
+  /* Remove the body now that it is an alias.  */
+  release_function_body (fns[1]);
+}
--- gcc/testsuite/g++.dg/lto/pr113208_0.C.jj	2024-04-16 17:45:01.687635682 +0200
+++ gcc/testsuite/g++.dg/lto/pr113208_0.C	2024-04-16 17:45:01.687635682 +0200
@@ -0,0 +1,13 @@
+// { dg-lto-do link }
+// { dg-lto-options { {-O1 -std=c++20 -flto}} }
+// { dg-extra-ld-options "-r -nostdlib -flinker-output=nolto-rel" }
+// { dg-require-linker-plugin "" }
+
+#define CONSTEXPR constexpr
+#include "pr113208.h"
+
+struct QualityValue;
+struct k : vector<QualityValue> {};
+
+void m(k);
+void n(k i) { m(i); }
--- gcc/testsuite/g++.dg/lto/pr113208_1.C.jj	2024-04-16 17:45:01.687635682 +0200
+++ gcc/testsuite/g++.dg/lto/pr113208_1.C	2024-04-16 17:45:01.687635682 +0200
@@ -0,0 +1,6 @@
+#define CONSTEXPR 
+#include "pr113208.h"
+
+struct QualityValue;
+vector<QualityValue> values1;
+vector<QualityValue> values{values1};
--- gcc/testsuite/g++.dg/lto/pr113208.h.jj	2024-04-16 17:45:01.687635682 +0200
+++ gcc/testsuite/g++.dg/lto/pr113208.h	2024-04-16 17:45:01.687635682 +0200
@@ -0,0 +1,10 @@
+template <typename _Tp> struct _Vector_base {
+  int g() const;
+  _Vector_base(int, int);
+};
+template <typename _Tp>
+struct vector : _Vector_base<_Tp> {
+  CONSTEXPR vector(const vector &__x)
+      : _Vector_base<_Tp>(1, __x.g()) {}
+ vector() : _Vector_base<_Tp>(1, 2) {}
+};


	Jakub


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] c++, v2: Retry the aliasing of base/complete cdtor optimization at import_export_decl time [PR113208]
  2024-04-22 15:42 ` [PATCH] c++, v2: " Jakub Jelinek
@ 2024-04-23  3:14   ` Jason Merrill
  2024-04-23 16:04     ` Jakub Jelinek
  0 siblings, 1 reply; 25+ messages in thread
From: Jason Merrill @ 2024-04-23  3:14 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc-patches, Jan Hubicka, Richard Biener, Patrick Palka

On 4/22/24 08:42, Jakub Jelinek wrote:
> On Wed, Apr 17, 2024 at 09:42:47AM +0200, Jakub Jelinek wrote:
>> When expand_or_defer_fn is called at_eof time, it calls import_export_decl
>> and then maybe_clone_body, which uses DECL_ONE_ONLY and comdat name in a
>> couple of places to try to optimize cdtors which are known to have the
>> same body by making the complete cdtor an alias to base cdtor (and in
>> that case also uses *[CD]5* as comdat group name instead of the normal
>> comdat group names specific to each mangled name).
>> Now, this optimization depends on DECL_ONE_ONLY and DECL_INTERFACE_KNOWN,
>> maybe_clone_body and can_alias_cdtor use:
>>        if (DECL_ONE_ONLY (fn))
>>          cgraph_node::get_create (clone)->set_comdat_group (cxx_comdat_group (clone));
>> ...
>>    bool can_alias = can_alias_cdtor (fn);
>> ...
>>        /* Tell cgraph if both ctors or both dtors are known to have
>>           the same body.  */
>>        if (can_alias
>>            && fns[0]
>>            && idx == 1
>>            && cgraph_node::get_create (fns[0])->create_same_body_alias
>>                 (clone, fns[0]))
>>          {
>>            alias = true;
>>            if (DECL_ONE_ONLY (fns[0]))
>>              {
>>                /* For comdat base and complete cdtors put them
>>                   into the same, *[CD]5* comdat group instead of
>>                   *[CD][12]*.  */
>>                comdat_group = cdtor_comdat_group (fns[1], fns[0]);
>>                cgraph_node::get_create (fns[0])->set_comdat_group (comdat_group);
>>                if (symtab_node::get (clone)->same_comdat_group)
>>                  symtab_node::get (clone)->remove_from_same_comdat_group ();
>>                symtab_node::get (clone)->add_to_same_comdat_group
>>                  (symtab_node::get (fns[0]));
>>              }
>>          }
>> and
>>    /* Don't use aliases for weak/linkonce definitions unless we can put both
>>       symbols in the same COMDAT group.  */
>>    return (DECL_INTERFACE_KNOWN (fn)
>>            && (SUPPORTS_ONE_ONLY || !DECL_WEAK (fn))
>>            && (!DECL_ONE_ONLY (fn)
>>                || (HAVE_COMDAT_GROUP && DECL_WEAK (fn))));
>> The following testcase regressed with Marek's r14-5979 change,
>> when pr113208_0.C is compiled where the ctor is marked constexpr,
>> we no longer perform this optimization, where
>> _ZN6vectorI12QualityValueEC2ERKS1_ was emitted in the
>> _ZN6vectorI12QualityValueEC5ERKS1_ comdat group and
>> _ZN6vectorI12QualityValueEC1ERKS1_ was made an alias to it,
>> instead we emit _ZN6vectorI12QualityValueEC2ERKS1_ in
>> _ZN6vectorI12QualityValueEC2ERKS1_ comdat group and the same
>> content _ZN6vectorI12QualityValueEC1ERKS1_ as separate symbol in
>> _ZN6vectorI12QualityValueEC1ERKS1_ comdat group.

This seems like an ABI bug that could use a non-LTO testcase.

>> Now, the linker seems to somehow cope with that, eventhough it
>> probably keeps both copies of the ctor, but seems LTO can't cope
>> with that and Honza doesn't know what it should do in that case
>> (linker decides that the prevailing symbol is
>> _ZN6vectorI12QualityValueEC2ERKS1_ (from the
>> _ZN6vectorI12QualityValueEC2ERKS1_ comdat group) and
>> _ZN6vectorI12QualityValueEC1ERKS1_ alias (from the other TU,
>> from _ZN6vectorI12QualityValueEC5ERKS1_ comdat group)).
>>
>> Note, the case where some constructor is marked constexpr in one
>> TU and not in another one happens pretty often in libstdc++ when
>> one mixes -std= flags used to compile different compilation units.
>>
>> The reason the optimization doesn't trigger when the constructor is
>> constexpr is that expand_or_defer_fn is called in that case much earlier
>> than when it is not constexpr; in the former case it is called when we
>> try to constant evaluate that constructor.  But DECL_INTERFACE_KNOWN
>> is false in that case and comdat_linkage hasn't been called either
>> (I think it is desirable, because comdat group is stored in the cgraph
>> node and am not sure it is a good idea to create cgraph nodes for
>> something that might not be needed later on at all), so maybe_clone_body
>> clones the bodies, but doesn't make them as aliases.

Hmm, cloning the bodies and then discarding them later seems like more 
extra work than creating the cgraph nodes.

Jason


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] c++, v2: Retry the aliasing of base/complete cdtor optimization at import_export_decl time [PR113208]
  2024-04-23  3:14   ` Jason Merrill
@ 2024-04-23 16:04     ` Jakub Jelinek
  2024-04-24  9:16       ` Jonathan Wakely
  0 siblings, 1 reply; 25+ messages in thread
From: Jakub Jelinek @ 2024-04-23 16:04 UTC (permalink / raw)
  To: Jason Merrill
  Cc: gcc-patches, Jan Hubicka, Richard Biener, Patrick Palka, Jonathan Wakely

On Mon, Apr 22, 2024 at 11:14:35PM -0400, Jason Merrill wrote:
> > > The following testcase regressed with Marek's r14-5979 change,
> > > when pr113208_0.C is compiled where the ctor is marked constexpr,
> > > we no longer perform this optimization, where
> > > _ZN6vectorI12QualityValueEC2ERKS1_ was emitted in the
> > > _ZN6vectorI12QualityValueEC5ERKS1_ comdat group and
> > > _ZN6vectorI12QualityValueEC1ERKS1_ was made an alias to it,
> > > instead we emit _ZN6vectorI12QualityValueEC2ERKS1_ in
> > > _ZN6vectorI12QualityValueEC2ERKS1_ comdat group and the same
> > > content _ZN6vectorI12QualityValueEC1ERKS1_ as separate symbol in
> > > _ZN6vectorI12QualityValueEC1ERKS1_ comdat group.
> 
> This seems like an ABI bug that could use a non-LTO testcase.

Well, except for the issues it causes to LTO I think it is compatible,
worst case we get the body of the ctor duplicated in the executable
and the linker picks some of the weak symbols as the symbol definitions.
Anyway, I've added a non-LTO testcase for that in the patch below.

> Hmm, cloning the bodies and then discarding them later seems like more extra
> work than creating the cgraph nodes.

So, I've tried to handle that in tentative_decl_linkage, like that function
already handles functions declared inline except for implicit template
instantiations.  If we expect that import_export_decl will do comdat_linkage
for the ctor later on do it right away.

That fixes the testcases too, but seems to regress
+FAIL: libstdc++-abi/abi_check
on both x86_64-linux and i686-linux, in each case 8 symbols disappeared from
libstdc++.so.6:
_ZNSt12__shared_ptrINSt10filesystem7__cxx1128recursive_directory_iterator10_Dir_stackELN9__gnu_cxx12_Lock_policyE2EEC1Ev
_ZNSt12__shared_ptrINSt10filesystem4_DirELN9__gnu_cxx12_Lock_policyE2EEC1Ev
_ZNSt12__shared_ptrINSt10filesystem28recursive_directory_iterator10_Dir_stackELN9__gnu_cxx12_Lock_policyE2EEC1Ev
_ZNSt12__shared_ptrINSt10filesystem4_DirELN9__gnu_cxx12_Lock_policyE2EEC2Ev
_ZNSt12__shared_ptrINSt10filesystem7__cxx114_DirELN9__gnu_cxx12_Lock_policyE2EEC1Ev
_ZNSt12__shared_ptrINSt10filesystem7__cxx1128recursive_directory_iterator10_Dir_stackELN9__gnu_cxx12_Lock_policyE2EEC2Ev
_ZNSt12__shared_ptrINSt10filesystem28recursive_directory_iterator10_Dir_stackELN9__gnu_cxx12_Lock_policyE2EEC2Ev
_ZNSt12__shared_ptrINSt10filesystem7__cxx114_DirELN9__gnu_cxx12_Lock_policyE2EEC2Ev

Will need to study why that happened, it might be that it was ok because
I think the filesystem stuff is unlike the rest compiled with no exported
templates, but would need at least some hacks in libstdc++ to preserve
previously exported symbols.
Still, feels like a risky change this late if it wouldn't break ABI of other
libraries.

2024-04-23  Jakub Jelinek  <jakub@redhat.com>

	PR lto/113208
	* decl2.cc (tentative_decl_linkage): Use comdat_linkage also
	for implicit instantiations of maybe in charge ctors/dtors
	if -fimplicit-templates or -fimplicit-inline-templates and
	-fweak and target supports aliases.

	* g++.dg/abi/comdat2.C: New test.
	* g++.dg/lto/pr113208_0.C: New test.
	* g++.dg/lto/pr113208_1.C: New file.
	* g++.dg/lto/pr113208.h: New file.

--- gcc/cp/decl2.cc.jj	2024-04-22 15:16:55.328548807 +0200
+++ gcc/cp/decl2.cc	2024-04-23 09:52:18.993250442 +0200
@@ -3314,7 +3314,16 @@ tentative_decl_linkage (tree decl)
 	     to mark the functions at this point.  */
 	  if (DECL_DECLARED_INLINE_P (decl)
 	      && (!DECL_IMPLICIT_INSTANTIATION (decl)
-		  || DECL_DEFAULTED_FN (decl)))
+		  || DECL_DEFAULTED_FN (decl)
+		  /* For implicit instantiations of cdtors,
+		     if import_export_decl would use comdat linkage,
+		     make sure to use it right away, so that maybe_clone_body
+		     can use aliases.  See PR113208.  */
+		  || (DECL_MAYBE_IN_CHARGE_CDTOR_P (decl)
+		      && (flag_implicit_templates
+			  || flag_implicit_inline_templates)
+		      && flag_weak
+		      && TARGET_SUPPORTS_ALIASES)))
 	    {
 	      /* This function must have external linkage, as
 		 otherwise DECL_INTERFACE_KNOWN would have been
--- gcc/testsuite/g++.dg/abi/comdat2.C.jj	2024-04-23 10:04:28.485964610 +0200
+++ gcc/testsuite/g++.dg/abi/comdat2.C	2024-04-23 10:05:24.757171194 +0200
@@ -0,0 +1,26 @@
+// PR lto/113208
+// { dg-do compile { target { c++11 && { *-*-*gnu* } } } }
+// { dg-additional-options "-O2 -fkeep-inline-functions" }
+// { dg-final { scan-assembler "_ZN1BI1CEC5ERKS1_,comdat" } }
+// { dg-final { scan-assembler-not "_ZN1BI1CEC1ERKS1_,comdat" } }
+// { dg-final { scan-assembler-not "_ZN1BI1CEC2ERKS1_,comdat" } }
+
+template <typename T>
+struct A {
+  int foo () const;
+  A (int, int);
+};
+template <typename T>
+struct B : A<T> {
+  constexpr B (const B &x) : A<T> (1, x.foo ()) {}
+  B () : A<T> (1, 2) {}
+};
+struct C;
+struct D : B<C> {};
+void bar (D);
+
+void
+baz (D x)
+{
+  bar (x);
+}
--- gcc/testsuite/g++.dg/lto/pr113208.h.jj	2024-04-23 09:44:07.523179110 +0200
+++ gcc/testsuite/g++.dg/lto/pr113208.h	2024-04-23 09:44:07.523179110 +0200
@@ -0,0 +1,10 @@
+template <typename _Tp> struct _Vector_base {
+  int g() const;
+  _Vector_base(int, int);
+};
+template <typename _Tp>
+struct vector : _Vector_base<_Tp> {
+  CONSTEXPR vector(const vector &__x)
+      : _Vector_base<_Tp>(1, __x.g()) {}
+ vector() : _Vector_base<_Tp>(1, 2) {}
+};
--- gcc/testsuite/g++.dg/lto/pr113208_1.C.jj	2024-04-23 09:44:07.523179110 +0200
+++ gcc/testsuite/g++.dg/lto/pr113208_1.C	2024-04-23 09:44:07.523179110 +0200
@@ -0,0 +1,6 @@
+#define CONSTEXPR 
+#include "pr113208.h"
+
+struct QualityValue;
+vector<QualityValue> values1;
+vector<QualityValue> values{values1};
--- gcc/testsuite/g++.dg/lto/pr113208_0.C.jj	2024-04-23 09:44:07.523179110 +0200
+++ gcc/testsuite/g++.dg/lto/pr113208_0.C	2024-04-23 09:44:07.523179110 +0200
@@ -0,0 +1,13 @@
+// { dg-lto-do link }
+// { dg-lto-options { {-O1 -std=c++20 -flto}} }
+// { dg-extra-ld-options "-r -nostdlib -flinker-output=nolto-rel" }
+// { dg-require-linker-plugin "" }
+
+#define CONSTEXPR constexpr
+#include "pr113208.h"
+
+struct QualityValue;
+struct k : vector<QualityValue> {};
+
+void m(k);
+void n(k i) { m(i); }


	Jakub


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] c++, v2: Retry the aliasing of base/complete cdtor optimization at import_export_decl time [PR113208]
  2024-04-23 16:04     ` Jakub Jelinek
@ 2024-04-24  9:16       ` Jonathan Wakely
  2024-04-24 16:16         ` [PATCH] c++, v3: " Jakub Jelinek
  0 siblings, 1 reply; 25+ messages in thread
From: Jonathan Wakely @ 2024-04-24  9:16 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Jason Merrill, gcc-patches, Jan Hubicka, Richard Biener, Patrick Palka

On Tue, 23 Apr 2024 at 17:05, Jakub Jelinek <jakub@redhat.com> wrote:
>
> On Mon, Apr 22, 2024 at 11:14:35PM -0400, Jason Merrill wrote:
> > > > The following testcase regressed with Marek's r14-5979 change,
> > > > when pr113208_0.C is compiled where the ctor is marked constexpr,
> > > > we no longer perform this optimization, where
> > > > _ZN6vectorI12QualityValueEC2ERKS1_ was emitted in the
> > > > _ZN6vectorI12QualityValueEC5ERKS1_ comdat group and
> > > > _ZN6vectorI12QualityValueEC1ERKS1_ was made an alias to it,
> > > > instead we emit _ZN6vectorI12QualityValueEC2ERKS1_ in
> > > > _ZN6vectorI12QualityValueEC2ERKS1_ comdat group and the same
> > > > content _ZN6vectorI12QualityValueEC1ERKS1_ as separate symbol in
> > > > _ZN6vectorI12QualityValueEC1ERKS1_ comdat group.
> >
> > This seems like an ABI bug that could use a non-LTO testcase.
>
> Well, except for the issues it causes to LTO I think it is compatible,
> worst case we get the body of the ctor duplicated in the executable
> and the linker picks some of the weak symbols as the symbol definitions.
> Anyway, I've added a non-LTO testcase for that in the patch below.
>
> > Hmm, cloning the bodies and then discarding them later seems like more extra
> > work than creating the cgraph nodes.
>
> So, I've tried to handle that in tentative_decl_linkage, like that function
> already handles functions declared inline except for implicit template
> instantiations.  If we expect that import_export_decl will do comdat_linkage
> for the ctor later on do it right away.
>
> That fixes the testcases too, but seems to regress
> +FAIL: libstdc++-abi/abi_check
> on both x86_64-linux and i686-linux, in each case 8 symbols disappeared from
> libstdc++.so.6:
> _ZNSt12__shared_ptrINSt10filesystem7__cxx1128recursive_directory_iterator10_Dir_stackELN9__gnu_cxx12_Lock_policyE2EEC1Ev
> _ZNSt12__shared_ptrINSt10filesystem4_DirELN9__gnu_cxx12_Lock_policyE2EEC1Ev
> _ZNSt12__shared_ptrINSt10filesystem28recursive_directory_iterator10_Dir_stackELN9__gnu_cxx12_Lock_policyE2EEC1Ev
> _ZNSt12__shared_ptrINSt10filesystem4_DirELN9__gnu_cxx12_Lock_policyE2EEC2Ev
> _ZNSt12__shared_ptrINSt10filesystem7__cxx114_DirELN9__gnu_cxx12_Lock_policyE2EEC1Ev
> _ZNSt12__shared_ptrINSt10filesystem7__cxx1128recursive_directory_iterator10_Dir_stackELN9__gnu_cxx12_Lock_policyE2EEC2Ev
> _ZNSt12__shared_ptrINSt10filesystem28recursive_directory_iterator10_Dir_stackELN9__gnu_cxx12_Lock_policyE2EEC2Ev
> _ZNSt12__shared_ptrINSt10filesystem7__cxx114_DirELN9__gnu_cxx12_Lock_policyE2EEC2Ev
>
> Will need to study why that happened, it might be that it was ok because
> I think the filesystem stuff is unlike the rest compiled with no exported
> templates, but would need at least some hacks in libstdc++ to preserve
> previously exported symbols.

There are explicit instantiation definitions that should instantiate
those types:

src/c++17/fs_dir.cc:template class std::__shared_ptr<fs::_Dir>;
src/c++17/fs_dir.cc:template class
std::__shared_ptr<fs::recursive_directory_iterator::_Dir_stack>;
src/c++17/fs_path.cc:template class std::__shared_ptr<const
fs::filesystem_error::_Impl>;

So the missing symbols should be present in cow-fs_dir.o and cow-fs_path.o



> Still, feels like a risky change this late if it wouldn't break ABI of other
> libraries.
>
> 2024-04-23  Jakub Jelinek  <jakub@redhat.com>
>
>         PR lto/113208
>         * decl2.cc (tentative_decl_linkage): Use comdat_linkage also
>         for implicit instantiations of maybe in charge ctors/dtors
>         if -fimplicit-templates or -fimplicit-inline-templates and
>         -fweak and target supports aliases.
>
>         * g++.dg/abi/comdat2.C: New test.
>         * g++.dg/lto/pr113208_0.C: New test.
>         * g++.dg/lto/pr113208_1.C: New file.
>         * g++.dg/lto/pr113208.h: New file.
>
> --- gcc/cp/decl2.cc.jj  2024-04-22 15:16:55.328548807 +0200
> +++ gcc/cp/decl2.cc     2024-04-23 09:52:18.993250442 +0200
> @@ -3314,7 +3314,16 @@ tentative_decl_linkage (tree decl)
>              to mark the functions at this point.  */
>           if (DECL_DECLARED_INLINE_P (decl)
>               && (!DECL_IMPLICIT_INSTANTIATION (decl)
> -                 || DECL_DEFAULTED_FN (decl)))
> +                 || DECL_DEFAULTED_FN (decl)
> +                 /* For implicit instantiations of cdtors,
> +                    if import_export_decl would use comdat linkage,
> +                    make sure to use it right away, so that maybe_clone_body
> +                    can use aliases.  See PR113208.  */
> +                 || (DECL_MAYBE_IN_CHARGE_CDTOR_P (decl)
> +                     && (flag_implicit_templates
> +                         || flag_implicit_inline_templates)
> +                     && flag_weak
> +                     && TARGET_SUPPORTS_ALIASES)))
>             {
>               /* This function must have external linkage, as
>                  otherwise DECL_INTERFACE_KNOWN would have been
> --- gcc/testsuite/g++.dg/abi/comdat2.C.jj       2024-04-23 10:04:28.485964610 +0200
> +++ gcc/testsuite/g++.dg/abi/comdat2.C  2024-04-23 10:05:24.757171194 +0200
> @@ -0,0 +1,26 @@
> +// PR lto/113208
> +// { dg-do compile { target { c++11 && { *-*-*gnu* } } } }
> +// { dg-additional-options "-O2 -fkeep-inline-functions" }
> +// { dg-final { scan-assembler "_ZN1BI1CEC5ERKS1_,comdat" } }
> +// { dg-final { scan-assembler-not "_ZN1BI1CEC1ERKS1_,comdat" } }
> +// { dg-final { scan-assembler-not "_ZN1BI1CEC2ERKS1_,comdat" } }
> +
> +template <typename T>
> +struct A {
> +  int foo () const;
> +  A (int, int);
> +};
> +template <typename T>
> +struct B : A<T> {
> +  constexpr B (const B &x) : A<T> (1, x.foo ()) {}
> +  B () : A<T> (1, 2) {}
> +};
> +struct C;
> +struct D : B<C> {};
> +void bar (D);
> +
> +void
> +baz (D x)
> +{
> +  bar (x);
> +}
> --- gcc/testsuite/g++.dg/lto/pr113208.h.jj      2024-04-23 09:44:07.523179110 +0200
> +++ gcc/testsuite/g++.dg/lto/pr113208.h 2024-04-23 09:44:07.523179110 +0200
> @@ -0,0 +1,10 @@
> +template <typename _Tp> struct _Vector_base {
> +  int g() const;
> +  _Vector_base(int, int);
> +};
> +template <typename _Tp>
> +struct vector : _Vector_base<_Tp> {
> +  CONSTEXPR vector(const vector &__x)
> +      : _Vector_base<_Tp>(1, __x.g()) {}
> + vector() : _Vector_base<_Tp>(1, 2) {}
> +};
> --- gcc/testsuite/g++.dg/lto/pr113208_1.C.jj    2024-04-23 09:44:07.523179110 +0200
> +++ gcc/testsuite/g++.dg/lto/pr113208_1.C       2024-04-23 09:44:07.523179110 +0200
> @@ -0,0 +1,6 @@
> +#define CONSTEXPR
> +#include "pr113208.h"
> +
> +struct QualityValue;
> +vector<QualityValue> values1;
> +vector<QualityValue> values{values1};
> --- gcc/testsuite/g++.dg/lto/pr113208_0.C.jj    2024-04-23 09:44:07.523179110 +0200
> +++ gcc/testsuite/g++.dg/lto/pr113208_0.C       2024-04-23 09:44:07.523179110 +0200
> @@ -0,0 +1,13 @@
> +// { dg-lto-do link }
> +// { dg-lto-options { {-O1 -std=c++20 -flto}} }
> +// { dg-extra-ld-options "-r -nostdlib -flinker-output=nolto-rel" }
> +// { dg-require-linker-plugin "" }
> +
> +#define CONSTEXPR constexpr
> +#include "pr113208.h"
> +
> +struct QualityValue;
> +struct k : vector<QualityValue> {};
> +
> +void m(k);
> +void n(k i) { m(i); }
>
>
>         Jakub
>


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH] c++, v3: Retry the aliasing of base/complete cdtor optimization at import_export_decl time [PR113208]
  2024-04-24  9:16       ` Jonathan Wakely
@ 2024-04-24 16:16         ` Jakub Jelinek
  2024-04-24 22:39           ` Jason Merrill
  0 siblings, 1 reply; 25+ messages in thread
From: Jakub Jelinek @ 2024-04-24 16:16 UTC (permalink / raw)
  To: Jonathan Wakely, Jason Merrill, Jan Hubicka
  Cc: gcc-patches, Richard Biener, Patrick Palka

On Wed, Apr 24, 2024 at 10:16:05AM +0100, Jonathan Wakely wrote:
> > That fixes the testcases too, but seems to regress
> > +FAIL: libstdc++-abi/abi_check

> There are explicit instantiation definitions that should instantiate
> those types:
> 
> src/c++17/fs_dir.cc:template class std::__shared_ptr<fs::_Dir>;
> src/c++17/fs_dir.cc:template class
> std::__shared_ptr<fs::recursive_directory_iterator::_Dir_stack>;
> src/c++17/fs_path.cc:template class std::__shared_ptr<const
> fs::filesystem_error::_Impl>;
> 
> So the missing symbols should be present in cow-fs_dir.o and cow-fs_path.o

So this boils down to (cvise reduced):
namespace __gnu_cxx { enum _Lock_policy { _S_single, _S_mutex, _S_atomic } const __default_lock_policy = _S_atomic; }
namespace std {
using __gnu_cxx::__default_lock_policy;
using __gnu_cxx::_Lock_policy;
template <typename, _Lock_policy = __default_lock_policy> struct __shared_ptr { constexpr __shared_ptr() {} };
namespace filesystem {
struct _Dir;
struct directory_iterator { __shared_ptr<_Dir> _M_dir; };
void end() { directory_iterator(); } }
extern template class __shared_ptr<filesystem::_Dir>;
}
namespace fs = std::filesystem;
template class std::__shared_ptr<fs::_Dir>;
at -O2, previously it used to emit
_ZNSt12__shared_ptrINSt10filesystem4_DirELN9__gnu_cxx12_Lock_policyE2EEC2Ev
_ZNSt12__shared_ptrINSt10filesystem4_DirELN9__gnu_cxx12_Lock_policyE1EEC2Ev
but now no longer does with the yesterday posted PR113208 patch.

The following updated patch fixes that by calling note_vague_linkage_fn for
the cdtor clones from maybe_clone_body if the flags suggest that the
maybe-in-charge cdtor had tentative_decl_linkage -> note_vague_linkage_fn
called too.  And then I've noticed that in some cases the updated comdat
group set by maybe_clone_body (*C5* or *D5*) was then overwritten again by
maybe_make_one_only.  So the patch tweaks cxx_comdat_group, so that when
some comdat group has been chosen already it doesn't try to use some
different one.

Bootstrapped/regtested on x86_64-linux and i686-linux, this one doesn't
regress anything unlike the earlier patch.

2024-04-24  Jakub Jelinek  <jakub@redhat.com>

	PR lto/113208
	* decl2.cc (tentative_decl_linkage): Use comdat_linkage also
	for implicit instantiations of maybe in charge ctors/dtors
	if -fimplicit-templates or -fimplicit-inline-templates and
	-fweak and target supports aliases.
	* optimize.cc (maybe_clone_body): Call note_vague_linkage_fn
	on clone if fn has DECL_INTERFACE_KNOWN, DECL_NOT_REALLY_EXTERN
	and DECL_DEFER_OUTPUT flags set.
	* decl.cc (cxx_comdat_group): For DECL_CLONED_FUNCTION_P
	functions if SUPPORTS_ONE_ONLY return DECL_COMDAT_GROUP if already
	set.

	* g++.dg/abi/comdat2.C: New test.
	* g++.dg/abi/comdat3.C: New test.
	* g++.dg/lto/pr113208_0.C: New test.
	* g++.dg/lto/pr113208_1.C: New file.
	* g++.dg/lto/pr113208.h: New file.

--- gcc/cp/decl2.cc.jj	2024-04-23 14:49:41.933186265 +0200
+++ gcc/cp/decl2.cc	2024-04-24 15:17:09.043625729 +0200
@@ -3314,7 +3314,16 @@ tentative_decl_linkage (tree decl)
 	     to mark the functions at this point.  */
 	  if (DECL_DECLARED_INLINE_P (decl)
 	      && (!DECL_IMPLICIT_INSTANTIATION (decl)
-		  || DECL_DEFAULTED_FN (decl)))
+		  || DECL_DEFAULTED_FN (decl)
+		  /* For implicit instantiations of cdtors,
+		     if import_export_decl would use comdat linkage,
+		     make sure to use it right away, so that maybe_clone_body
+		     can use aliases.  See PR113208.  */
+		  || (DECL_MAYBE_IN_CHARGE_CDTOR_P (decl)
+		      && (flag_implicit_templates
+			  || flag_implicit_inline_templates)
+		      && flag_weak
+		      && TARGET_SUPPORTS_ALIASES)))
 	    {
 	      /* This function must have external linkage, as
 		 otherwise DECL_INTERFACE_KNOWN would have been
--- gcc/cp/optimize.cc.jj	2024-04-24 13:44:26.456634100 +0200
+++ gcc/cp/optimize.cc	2024-04-24 14:46:13.050371557 +0200
@@ -717,6 +717,10 @@ maybe_clone_body (tree fn)
       else
 	expand_or_defer_fn (clone);
       first = false;
+      if (DECL_INTERFACE_KNOWN (fn)
+	  && DECL_NOT_REALLY_EXTERN (fn)
+	  && DECL_DEFER_OUTPUT (fn))
+	note_vague_linkage_fn (clone);
     }
   pop_from_top_level ();
 
--- gcc/cp/decl.cc.jj	2024-04-23 08:31:05.515161033 +0200
+++ gcc/cp/decl.cc	2024-04-24 15:15:34.401951491 +0200
@@ -19254,6 +19254,14 @@ cxx_comdat_group (tree decl)
 	  else
 	    break;
 	}
+      /* If a ctor/dtor has already set the comdat group by
+	 maybe_clone_body, don't override it.  */
+      if (SUPPORTS_ONE_ONLY
+	  && TREE_CODE (decl) == FUNCTION_DECL
+	  && DECL_CLONED_FUNCTION_P (decl)
+	  && SUPPORTS_ONE_ONLY)
+	if (tree comdat = DECL_COMDAT_GROUP (decl))
+	  return comdat;
     }
 
   return decl;
--- gcc/testsuite/g++.dg/abi/comdat2.C.jj	2024-04-24 13:44:11.054849035 +0200
+++ gcc/testsuite/g++.dg/abi/comdat2.C	2024-04-24 13:44:11.054849035 +0200
@@ -0,0 +1,26 @@
+// PR lto/113208
+// { dg-do compile { target { c++11 && { *-*-*gnu* } } } }
+// { dg-additional-options "-O2 -fkeep-inline-functions" }
+// { dg-final { scan-assembler "_ZN1BI1CEC5ERKS1_,comdat" } }
+// { dg-final { scan-assembler-not "_ZN1BI1CEC1ERKS1_,comdat" } }
+// { dg-final { scan-assembler-not "_ZN1BI1CEC2ERKS1_,comdat" } }
+
+template <typename T>
+struct A {
+  int foo () const;
+  A (int, int);
+};
+template <typename T>
+struct B : A<T> {
+  constexpr B (const B &x) : A<T> (1, x.foo ()) {}
+  B () : A<T> (1, 2) {}
+};
+struct C;
+struct D : B<C> {};
+void bar (D);
+
+void
+baz (D x)
+{
+  bar (x);
+}
--- gcc/testsuite/g++.dg/abi/comdat3.C.jj	2024-04-24 15:23:58.829885362 +0200
+++ gcc/testsuite/g++.dg/abi/comdat3.C	2024-04-24 15:24:46.088223353 +0200
@@ -0,0 +1,22 @@
+// PR lto/113208
+// { dg-do compile { target { c++11 && { *-*-*gnu* } } } }
+// { dg-additional-options "-O2" }
+// { dg-final { scan-assembler "_ZN1M1SINS_1P1TELN1N1LE2EEC5Ev,comdat" } }
+// { dg-final { scan-assembler-not "_ZN1M1SINS_1P1TELN1N1LE2EEC1Ev,comdat" } }
+// { dg-final { scan-assembler-not "_ZN1M1SINS_1P1TELN1N1LE2EEC2Ev,comdat" } }
+
+namespace N { enum L { L1, L2, L3 } const O = L3; }
+namespace M {
+  using N::O;
+  using N::L;
+  template <typename, L = O>
+  struct S { constexpr S () {} };
+  namespace P {
+    struct T;
+    struct U { S<T> u; };
+    void foo () { U (); }
+  }
+  extern template class S<P::T>;
+}
+namespace p = M::P;
+template class M::S<p::T>;
--- gcc/testsuite/g++.dg/lto/pr113208_0.C.jj	2024-04-24 13:43:17.887591040 +0200
+++ gcc/testsuite/g++.dg/lto/pr113208_0.C	2024-04-24 13:44:11.055849021 +0200
@@ -0,0 +1,13 @@
+// { dg-lto-do link }
+// { dg-lto-options { {-O1 -std=c++20 -flto}} }
+// { dg-extra-ld-options "-r -nostdlib -flinker-output=nolto-rel" }
+// { dg-require-linker-plugin "" }
+
+#define CONSTEXPR constexpr
+#include "pr113208.h"
+
+struct QualityValue;
+struct k : vector<QualityValue> {};
+
+void m(k);
+void n(k i) { m(i); }
--- gcc/testsuite/g++.dg/lto/pr113208_1.C.jj	2024-04-24 13:43:17.887591040 +0200
+++ gcc/testsuite/g++.dg/lto/pr113208_1.C	2024-04-24 13:44:11.055849021 +0200
@@ -0,0 +1,6 @@
+#define CONSTEXPR 
+#include "pr113208.h"
+
+struct QualityValue;
+vector<QualityValue> values1;
+vector<QualityValue> values{values1};
--- gcc/testsuite/g++.dg/lto/pr113208.h.jj	2024-04-24 13:43:17.887591040 +0200
+++ gcc/testsuite/g++.dg/lto/pr113208.h	2024-04-24 13:44:11.054849035 +0200
@@ -0,0 +1,10 @@
+template <typename _Tp> struct _Vector_base {
+  int g() const;
+  _Vector_base(int, int);
+};
+template <typename _Tp>
+struct vector : _Vector_base<_Tp> {
+  CONSTEXPR vector(const vector &__x)
+      : _Vector_base<_Tp>(1, __x.g()) {}
+ vector() : _Vector_base<_Tp>(1, 2) {}
+};


	Jakub


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] c++, v3: Retry the aliasing of base/complete cdtor optimization at import_export_decl time [PR113208]
  2024-04-24 16:16         ` [PATCH] c++, v3: " Jakub Jelinek
@ 2024-04-24 22:39           ` Jason Merrill
  2024-04-24 22:47             ` Jakub Jelinek
  0 siblings, 1 reply; 25+ messages in thread
From: Jason Merrill @ 2024-04-24 22:39 UTC (permalink / raw)
  To: Jakub Jelinek, Jonathan Wakely, Jan Hubicka
  Cc: gcc-patches, Richard Biener, Patrick Palka

On 4/24/24 09:16, Jakub Jelinek wrote:
> On Wed, Apr 24, 2024 at 10:16:05AM +0100, Jonathan Wakely wrote:
>>> That fixes the testcases too, but seems to regress
>>> +FAIL: libstdc++-abi/abi_check
> 
>> There are explicit instantiation definitions that should instantiate
>> those types:
>>
>> src/c++17/fs_dir.cc:template class std::__shared_ptr<fs::_Dir>;
>> src/c++17/fs_dir.cc:template class
>> std::__shared_ptr<fs::recursive_directory_iterator::_Dir_stack>;
>> src/c++17/fs_path.cc:template class std::__shared_ptr<const
>> fs::filesystem_error::_Impl>;
>>
>> So the missing symbols should be present in cow-fs_dir.o and cow-fs_path.o
> 
> So this boils down to (cvise reduced):
> namespace __gnu_cxx { enum _Lock_policy { _S_single, _S_mutex, _S_atomic } const __default_lock_policy = _S_atomic; }
> namespace std {
> using __gnu_cxx::__default_lock_policy;
> using __gnu_cxx::_Lock_policy;
> template <typename, _Lock_policy = __default_lock_policy> struct __shared_ptr { constexpr __shared_ptr() {} };
> namespace filesystem {
> struct _Dir;
> struct directory_iterator { __shared_ptr<_Dir> _M_dir; };
> void end() { directory_iterator(); } }
> extern template class __shared_ptr<filesystem::_Dir>;
> }
> namespace fs = std::filesystem;
> template class std::__shared_ptr<fs::_Dir>;
> at -O2, previously it used to emit
> _ZNSt12__shared_ptrINSt10filesystem4_DirELN9__gnu_cxx12_Lock_policyE2EEC2Ev
> _ZNSt12__shared_ptrINSt10filesystem4_DirELN9__gnu_cxx12_Lock_policyE1EEC2Ev
> but now no longer does with the yesterday posted PR113208 patch.
> 
> The following updated patch fixes that by calling note_vague_linkage_fn for
> the cdtor clones from maybe_clone_body if the flags suggest that the
> maybe-in-charge cdtor had tentative_decl_linkage -> note_vague_linkage_fn
> called too.  And then I've noticed that in some cases the updated comdat
> group set by maybe_clone_body (*C5* or *D5*) was then overwritten again by
> maybe_make_one_only.  So the patch tweaks cxx_comdat_group, so that when
> some comdat group has been chosen already it doesn't try to use some
> different one.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, this one doesn't
> regress anything unlike the earlier patch.
> 
> 2024-04-24  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR lto/113208
> 	* decl2.cc (tentative_decl_linkage): Use comdat_linkage also
> 	for implicit instantiations of maybe in charge ctors/dtors
> 	if -fimplicit-templates or -fimplicit-inline-templates and
> 	-fweak and target supports aliases.
> 	* optimize.cc (maybe_clone_body): Call note_vague_linkage_fn
> 	on clone if fn has DECL_INTERFACE_KNOWN, DECL_NOT_REALLY_EXTERN
> 	and DECL_DEFER_OUTPUT flags set.
> 	* decl.cc (cxx_comdat_group): For DECL_CLONED_FUNCTION_P
> 	functions if SUPPORTS_ONE_ONLY return DECL_COMDAT_GROUP if already
> 	set.
> 
> 	* g++.dg/abi/comdat2.C: New test.
> 	* g++.dg/abi/comdat3.C: New test.
> 	* g++.dg/lto/pr113208_0.C: New test.
> 	* g++.dg/lto/pr113208_1.C: New file.
> 	* g++.dg/lto/pr113208.h: New file.
> 
> --- gcc/cp/decl2.cc.jj	2024-04-23 14:49:41.933186265 +0200
> +++ gcc/cp/decl2.cc	2024-04-24 15:17:09.043625729 +0200
> @@ -3314,7 +3314,16 @@ tentative_decl_linkage (tree decl)
>   	     to mark the functions at this point.  */
>   	  if (DECL_DECLARED_INLINE_P (decl)
>   	      && (!DECL_IMPLICIT_INSTANTIATION (decl)
> -		  || DECL_DEFAULTED_FN (decl)))
> +		  || DECL_DEFAULTED_FN (decl)
> +		  /* For implicit instantiations of cdtors,
> +		     if import_export_decl would use comdat linkage,
> +		     make sure to use it right away, so that maybe_clone_body
> +		     can use aliases.  See PR113208.  */
> +		  || (DECL_MAYBE_IN_CHARGE_CDTOR_P (decl)
> +		      && (flag_implicit_templates
> +			  || flag_implicit_inline_templates)
> +		      && flag_weak
> +		      && TARGET_SUPPORTS_ALIASES)))

It seems wrong to set DECL_INTERFACE_KNOWN for cdtors that might get an 
explicit instantiation later, and likewise for comdat_linkage when 
!flag_weak; instead of adding this condition to the if, how about adding 
an else like

>           else if (DECL_MAYBE_IN_CHARGE_CDTOR_P (decl))
>             /* For implicit instantiations of cdtors,                                                                    
>                if import_export_decl would use comdat linkage,                                                           
>                make sure to use it right away, so that maybe_clone_body                                                  
>                can use aliases.  See PR113208.  */
>             maybe_make_one_only (decl);

?

Jason


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] c++, v3: Retry the aliasing of base/complete cdtor optimization at import_export_decl time [PR113208]
  2024-04-24 22:39           ` Jason Merrill
@ 2024-04-24 22:47             ` Jakub Jelinek
  2024-04-25  0:43               ` Jason Merrill
  0 siblings, 1 reply; 25+ messages in thread
From: Jakub Jelinek @ 2024-04-24 22:47 UTC (permalink / raw)
  To: Jason Merrill
  Cc: Jonathan Wakely, Jan Hubicka, gcc-patches, Richard Biener, Patrick Palka

On Wed, Apr 24, 2024 at 06:39:33PM -0400, Jason Merrill wrote:
> > --- gcc/cp/decl2.cc.jj	2024-04-23 14:49:41.933186265 +0200
> > +++ gcc/cp/decl2.cc	2024-04-24 15:17:09.043625729 +0200
> > @@ -3314,7 +3314,16 @@ tentative_decl_linkage (tree decl)
> >   	     to mark the functions at this point.  */
> >   	  if (DECL_DECLARED_INLINE_P (decl)
> >   	      && (!DECL_IMPLICIT_INSTANTIATION (decl)
> > -		  || DECL_DEFAULTED_FN (decl)))
> > +		  || DECL_DEFAULTED_FN (decl)
> > +		  /* For implicit instantiations of cdtors,
> > +		     if import_export_decl would use comdat linkage,
> > +		     make sure to use it right away, so that maybe_clone_body
> > +		     can use aliases.  See PR113208.  */
> > +		  || (DECL_MAYBE_IN_CHARGE_CDTOR_P (decl)
> > +		      && (flag_implicit_templates
> > +			  || flag_implicit_inline_templates)
> > +		      && flag_weak
> > +		      && TARGET_SUPPORTS_ALIASES)))
> 
> It seems wrong to set DECL_INTERFACE_KNOWN for cdtors that might get an
> explicit instantiation later, and likewise for comdat_linkage when
> !flag_weak; instead of adding this condition to the if, how about adding an
> else like
> 
> >           else if (DECL_MAYBE_IN_CHARGE_CDTOR_P (decl))
> >             /* For implicit instantiations of cdtors,
> > if import_export_decl would use comdat linkage,
> > make sure to use it right away, so that maybe_clone_body
> > can use aliases.  See PR113208.  */
> >             maybe_make_one_only (decl);
> 
> ?

Then can_alias_cdtor would return false, because it ends with:
  /* Don't use aliases for weak/linkonce definitions unless we can put both
     symbols in the same COMDAT group.  */
  return (DECL_INTERFACE_KNOWN (fn)
          && (SUPPORTS_ONE_ONLY || !DECL_WEAK (fn))
          && (!DECL_ONE_ONLY (fn)
              || (HAVE_COMDAT_GROUP && DECL_WEAK (fn))));
Should we change that DECL_INTERFACE_KNOWN (fn) in there to
(DECL_INTERFACE_KNOWN (fn) || something) then and what that
something should be?  HAVE_COMDAT_GROUP && DECL_ONE_ONLY (fn)?

	Jakub


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] c++, v3: Retry the aliasing of base/complete cdtor optimization at import_export_decl time [PR113208]
  2024-04-24 22:47             ` Jakub Jelinek
@ 2024-04-25  0:43               ` Jason Merrill
  2024-04-25 12:02                 ` [PATCH] c++, v4: " Jakub Jelinek
  0 siblings, 1 reply; 25+ messages in thread
From: Jason Merrill @ 2024-04-25  0:43 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Jonathan Wakely, Jan Hubicka, gcc-patches, Richard Biener, Patrick Palka

On 4/24/24 15:47, Jakub Jelinek wrote:
> On Wed, Apr 24, 2024 at 06:39:33PM -0400, Jason Merrill wrote:
>>> --- gcc/cp/decl2.cc.jj	2024-04-23 14:49:41.933186265 +0200
>>> +++ gcc/cp/decl2.cc	2024-04-24 15:17:09.043625729 +0200
>>> @@ -3314,7 +3314,16 @@ tentative_decl_linkage (tree decl)
>>>    	     to mark the functions at this point.  */
>>>    	  if (DECL_DECLARED_INLINE_P (decl)
>>>    	      && (!DECL_IMPLICIT_INSTANTIATION (decl)
>>> -		  || DECL_DEFAULTED_FN (decl)))
>>> +		  || DECL_DEFAULTED_FN (decl)
>>> +		  /* For implicit instantiations of cdtors,
>>> +		     if import_export_decl would use comdat linkage,
>>> +		     make sure to use it right away, so that maybe_clone_body
>>> +		     can use aliases.  See PR113208.  */
>>> +		  || (DECL_MAYBE_IN_CHARGE_CDTOR_P (decl)
>>> +		      && (flag_implicit_templates
>>> +			  || flag_implicit_inline_templates)
>>> +		      && flag_weak
>>> +		      && TARGET_SUPPORTS_ALIASES)))
>>
>> It seems wrong to set DECL_INTERFACE_KNOWN for cdtors that might get an
>> explicit instantiation later, and likewise for comdat_linkage when
>> !flag_weak; instead of adding this condition to the if, how about adding an
>> else like
>>
>>>            else if (DECL_MAYBE_IN_CHARGE_CDTOR_P (decl))
>>>              /* For implicit instantiations of cdtors,
>>> if import_export_decl would use comdat linkage,
>>> make sure to use it right away, so that maybe_clone_body
>>> can use aliases.  See PR113208.  */
>>>              maybe_make_one_only (decl);
>>
>> ?
> 
> Then can_alias_cdtor would return false, because it ends with:
>    /* Don't use aliases for weak/linkonce definitions unless we can put both
>       symbols in the same COMDAT group.  */
>    return (DECL_INTERFACE_KNOWN (fn)
>            && (SUPPORTS_ONE_ONLY || !DECL_WEAK (fn))
>            && (!DECL_ONE_ONLY (fn)
>                || (HAVE_COMDAT_GROUP && DECL_WEAK (fn))));
> Should we change that DECL_INTERFACE_KNOWN (fn) in there to
> (DECL_INTERFACE_KNOWN (fn) || something) then and what that
> something should be?  HAVE_COMDAT_GROUP && DECL_ONE_ONLY (fn)?

Yes, I think reorganize to

((DECL_INTERFACE_KNOWN (fn) && !DECL_WEAK (fn) && !DECL_ONE_ONLY (fn))
  || (HAVE_COMDAT_GROUP && DECL_ONE_ONLY (fn))

Jason


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH] c++, v4: Retry the aliasing of base/complete cdtor optimization at import_export_decl time [PR113208]
  2024-04-25  0:43               ` Jason Merrill
@ 2024-04-25 12:02                 ` Jakub Jelinek
  2024-04-25 14:22                   ` Jakub Jelinek
  0 siblings, 1 reply; 25+ messages in thread
From: Jakub Jelinek @ 2024-04-25 12:02 UTC (permalink / raw)
  To: Jason Merrill
  Cc: Jonathan Wakely, Jan Hubicka, gcc-patches, Richard Biener, Patrick Palka

On Wed, Apr 24, 2024 at 08:43:46PM -0400, Jason Merrill wrote:
> > Then can_alias_cdtor would return false, because it ends with:
> >    /* Don't use aliases for weak/linkonce definitions unless we can put both
> >       symbols in the same COMDAT group.  */
> >    return (DECL_INTERFACE_KNOWN (fn)
> >            && (SUPPORTS_ONE_ONLY || !DECL_WEAK (fn))
> >            && (!DECL_ONE_ONLY (fn)
> >                || (HAVE_COMDAT_GROUP && DECL_WEAK (fn))));
> > Should we change that DECL_INTERFACE_KNOWN (fn) in there to
> > (DECL_INTERFACE_KNOWN (fn) || something) then and what that
> > something should be?  HAVE_COMDAT_GROUP && DECL_ONE_ONLY (fn)?
> 
> Yes, I think reorganize to
> 
> ((DECL_INTERFACE_KNOWN (fn) && !DECL_WEAK (fn) && !DECL_ONE_ONLY (fn))
>  || (HAVE_COMDAT_GROUP && DECL_ONE_ONLY (fn))

I've tried the following patch, but unfortunately that lead to large
number of regressions:
+FAIL: g++.dg/coroutines/torture/co-yield-04-complex-local-state.C (test for excess errors)
+FAIL: g++.dg/coroutines/torture/func-params-08.C (test for excess errors)
+FAIL: g++.dg/coroutines/torture/func-params-09-awaitable-parms.C (test for excess errors)
+FAIL: g++.dg/cpp0x/constexpr-initlist.C  -std=c++11 (test for excess errors)
+FAIL: g++.dg/cpp0x/constexpr-initlist.C  -std=c++14 (test for excess errors)
+FAIL: g++.dg/cpp0x/constexpr-initlist.C  -std=c++17 (test for excess errors)
+FAIL: g++.dg/cpp0x/constexpr-initlist.C  -std=c++20 (test for excess errors)
+FAIL: g++.dg/cpp0x/constexpr-initlist.C  -std=c++23 (test for excess errors)
+FAIL: g++.dg/cpp0x/constexpr-initlist.C  -std=c++26 (test for excess errors)
+FAIL: g++.dg/cpp0x/initlist25.C  -std=c++11 (test for excess errors)
+FAIL: g++.dg/cpp0x/initlist25.C  -std=c++14 (test for excess errors)
+FAIL: g++.dg/cpp0x/initlist25.C  -std=c++17 (test for excess errors)
+FAIL: g++.dg/cpp0x/initlist25.C  -std=c++20 (test for excess errors)
+FAIL: g++.dg/cpp0x/initlist25.C  -std=c++23 (test for excess errors)
+FAIL: g++.dg/cpp0x/initlist25.C  -std=c++26 (test for excess errors)
+FAIL: g++.dg/cpp1y/pr95226.C  -std=c++20 (test for excess errors)
+FAIL: g++.dg/cpp1y/pr95226.C  -std=c++23 (test for excess errors)
+FAIL: g++.dg/cpp1y/pr95226.C  -std=c++26 (test for excess errors)
+FAIL: g++.dg/cpp1z/decomp12.C  -std=c++23 (test for excess errors)
+FAIL: g++.dg/cpp1z/decomp12.C  -std=c++26 (test for excess errors)
+FAIL: g++.dg/cpp1z/eval-order2.C  -std=c++20 (test for excess errors)
+FAIL: g++.dg/cpp1z/eval-order2.C  -std=c++23 (test for excess errors)
+FAIL: g++.dg/cpp1z/eval-order2.C  -std=c++26 (test for excess errors)
+FAIL: g++.dg/cpp2a/srcloc17.C  -std=c++20 (test for excess errors)
+FAIL: g++.dg/cpp2a/srcloc17.C  -std=c++23 (test for excess errors)
+FAIL: g++.dg/cpp2a/srcloc17.C  -std=c++26 (test for excess errors)
+FAIL: g++.old-deja/g++.jason/template31.C  -std=c++20 (test for excess errors)
+FAIL: g++.old-deja/g++.jason/template31.C  -std=c++23 (test for excess errors)
+FAIL: g++.old-deja/g++.jason/template31.C  -std=c++26 (test for excess errors)
+FAIL: 20_util/unique_ptr/creation/for_overwrite.cc  -std=gnu++26 (test for excess errors)
+FAIL: 23_containers/span/cons_1_assert_neg.cc  -std=gnu++20 (test for excess errors)
+FAIL: 23_containers/span/cons_1_assert_neg.cc  -std=gnu++26 (test for excess errors)
+FAIL: 23_containers/span/cons_2_assert_neg.cc  -std=gnu++20 (test for excess errors)
+FAIL: 23_containers/span/cons_2_assert_neg.cc  -std=gnu++26 (test for excess errors)
+FAIL: std/ranges/repeat/1.cc  -std=gnu++23 (test for excess errors)
+FAIL: std/ranges/repeat/1.cc  -std=gnu++26 (test for excess errors)

Errors are like:
func-params-08.C:(.text._ZNSt12_Vector_baseIiSaIiEEC2Ev[_ZNSt12_Vector_baseIiSaIiEEC5Ev]+0x14): undefined reference to `_ZNSt12_Vector_baseIiSaIiEE12_Vector_implC1EvQ26is_default_constructible_vIN9__gnu_cxx14__alloc_traitsIT0_NS5_10value_typeEE6rebindIT_E5otherEE'
Though, libstdc++.so.6 abilist is the same.
Trying to debug it now.

2024-04-24  Jakub Jelinek  <jakub@redhat.com>
	    Jason Merrill  <jason@redhat.com>

	PR lto/113208
	* decl2.cc (tentative_decl_linkage): Call maybe_make_one_only
	for implicit instantiations of maybe in charge ctors/dtors
	declared inline.
	* optimize.cc (can_alias_cdtor): Adjust condition, for
	HAVE_COMDAT_GROUP && DECL_ONE_ONLY && DECL_WEAK return true even
	if not DECL_INTERFACE_KNOWN.
	* decl.cc (cxx_comdat_group): For DECL_CLONED_FUNCTION_P
	functions if SUPPORTS_ONE_ONLY return DECL_COMDAT_GROUP if already
	set.

	* g++.dg/abi/comdat2.C: New test.
	* g++.dg/abi/comdat3.C: New test.
	* g++.dg/abi/comdat4.C: New test.
	* g++.dg/abi/comdat5.C: New test.
	* g++.dg/lto/pr113208_0.C: New test.
	* g++.dg/lto/pr113208_1.C: New file.
	* g++.dg/lto/pr113208.h: New file.

--- gcc/cp/decl2.cc.jj	2024-04-24 18:28:22.299513620 +0200
+++ gcc/cp/decl2.cc	2024-04-25 10:04:18.049476567 +0200
@@ -3312,16 +3312,23 @@ tentative_decl_linkage (tree decl)
 	     linkage of all functions, and as that causes writes to
 	     the data mapped in from the PCH file, it's advantageous
 	     to mark the functions at this point.  */
-	  if (DECL_DECLARED_INLINE_P (decl)
-	      && (!DECL_IMPLICIT_INSTANTIATION (decl)
-		  || DECL_DEFAULTED_FN (decl)))
+	  if (DECL_DECLARED_INLINE_P (decl))
 	    {
-	      /* This function must have external linkage, as
-		 otherwise DECL_INTERFACE_KNOWN would have been
-		 set.  */
-	      gcc_assert (TREE_PUBLIC (decl));
-	      comdat_linkage (decl);
-	      DECL_INTERFACE_KNOWN (decl) = 1;
+	      if (!DECL_IMPLICIT_INSTANTIATION (decl)
+		  || DECL_DEFAULTED_FN (decl))
+		{
+		  /* This function must have external linkage, as
+		     otherwise DECL_INTERFACE_KNOWN would have been
+		     set.  */
+		  gcc_assert (TREE_PUBLIC (decl));
+		  comdat_linkage (decl);
+		  DECL_INTERFACE_KNOWN (decl) = 1;
+		}
+	      else if (DECL_MAYBE_IN_CHARGE_CDTOR_P (decl))
+		/* For implicit instantiations of cdtors try to make
+		   it comdat, so that maybe_clone_body can use aliases.
+		   See PR113208.  */
+		maybe_make_one_only (decl);
 	    }
 	}
       else if (VAR_P (decl))
--- gcc/cp/optimize.cc.jj	2024-04-25 09:47:24.936736833 +0200
+++ gcc/cp/optimize.cc	2024-04-25 10:19:25.338332564 +0200
@@ -220,10 +220,8 @@ can_alias_cdtor (tree fn)
   gcc_assert (DECL_MAYBE_IN_CHARGE_CDTOR_P (fn));
   /* Don't use aliases for weak/linkonce definitions unless we can put both
      symbols in the same COMDAT group.  */
-  return (DECL_INTERFACE_KNOWN (fn)
-	  && (SUPPORTS_ONE_ONLY || !DECL_WEAK (fn))
-	  && (!DECL_ONE_ONLY (fn)
-	      || (HAVE_COMDAT_GROUP && DECL_WEAK (fn))));
+  return (DECL_WEAK (fn) ? (HAVE_COMDAT_GROUP && DECL_ONE_ONLY (fn))
+			 : (DECL_INTERFACE_KNOWN (fn) && !DECL_ONE_ONLY (fn)));
 }
 
 /* FN is a [cd]tor, fns is a pointer to an array of length 3.  Fill fns
--- gcc/cp/decl.cc.jj	2024-04-24 18:28:22.231514532 +0200
+++ gcc/cp/decl.cc	2024-04-25 09:47:06.855991224 +0200
@@ -19254,6 +19254,14 @@ cxx_comdat_group (tree decl)
 	  else
 	    break;
 	}
+      /* If a ctor/dtor has already set the comdat group by
+	 maybe_clone_body, don't override it.  */
+      if (SUPPORTS_ONE_ONLY
+	  && TREE_CODE (decl) == FUNCTION_DECL
+	  && DECL_CLONED_FUNCTION_P (decl)
+	  && SUPPORTS_ONE_ONLY)
+	if (tree comdat = DECL_COMDAT_GROUP (decl))
+	  return comdat;
     }
 
   return decl;
--- gcc/testsuite/g++.dg/abi/comdat2.C.jj	2024-04-25 09:47:06.855991224 +0200
+++ gcc/testsuite/g++.dg/abi/comdat2.C	2024-04-25 09:47:06.855991224 +0200
@@ -0,0 +1,26 @@
+// PR lto/113208
+// { dg-do compile { target { c++11 && { *-*-*gnu* } } } }
+// { dg-additional-options "-O2 -fkeep-inline-functions" }
+// { dg-final { scan-assembler "_ZN1BI1CEC5ERKS1_,comdat" } }
+// { dg-final { scan-assembler-not "_ZN1BI1CEC1ERKS1_,comdat" } }
+// { dg-final { scan-assembler-not "_ZN1BI1CEC2ERKS1_,comdat" } }
+
+template <typename T>
+struct A {
+  int foo () const;
+  A (int, int);
+};
+template <typename T>
+struct B : A<T> {
+  constexpr B (const B &x) : A<T> (1, x.foo ()) {}
+  B () : A<T> (1, 2) {}
+};
+struct C;
+struct D : B<C> {};
+void bar (D);
+
+void
+baz (D x)
+{
+  bar (x);
+}
--- gcc/testsuite/g++.dg/abi/comdat3.C.jj	2024-04-25 09:47:06.855991224 +0200
+++ gcc/testsuite/g++.dg/abi/comdat3.C	2024-04-25 09:47:06.855991224 +0200
@@ -0,0 +1,22 @@
+// PR lto/113208
+// { dg-do compile { target { c++11 && { *-*-*gnu* } } } }
+// { dg-additional-options "-O2" }
+// { dg-final { scan-assembler "_ZN1M1SINS_1P1TELN1N1LE2EEC5Ev,comdat" } }
+// { dg-final { scan-assembler-not "_ZN1M1SINS_1P1TELN1N1LE2EEC1Ev,comdat" } }
+// { dg-final { scan-assembler-not "_ZN1M1SINS_1P1TELN1N1LE2EEC2Ev,comdat" } }
+
+namespace N { enum L { L1, L2, L3 } const O = L3; }
+namespace M {
+  using N::O;
+  using N::L;
+  template <typename, L = O>
+  struct S { constexpr S () {} };
+  namespace P {
+    struct T;
+    struct U { S<T> u; };
+    void foo () { U (); }
+  }
+  extern template class S<P::T>;
+}
+namespace p = M::P;
+template class M::S<p::T>;
--- gcc/testsuite/g++.dg/abi/comdat4.C.jj	2024-04-25 10:27:54.141172257 +0200
+++ gcc/testsuite/g++.dg/abi/comdat4.C	2024-04-25 10:28:22.493773334 +0200
@@ -0,0 +1,28 @@
+// PR lto/113208
+// { dg-do compile { target { c++11 && { *-*-*gnu* } } } }
+// { dg-additional-options "-O2" }
+// { dg-final { scan-assembler "_ZN1BI1CEC5ERKS1_,comdat" } }
+// { dg-final { scan-assembler-not "_ZN1BI1CEC1ERKS1_,comdat" } }
+// { dg-final { scan-assembler-not "_ZN1BI1CEC2ERKS1_,comdat" } }
+
+template <typename T>
+struct A {
+  int foo () const;
+  A (int, int);
+};
+template <typename T>
+struct B : A<T> {
+  constexpr B (const B &x) : A<T> (1, x.foo ()) {}
+  B () : A<T> (1, 2) {}
+};
+struct C;
+struct D : B<C> {};
+void bar (D);
+
+void
+baz (D x)
+{
+  bar (x);
+}
+
+template struct B<C>;
--- gcc/testsuite/g++.dg/abi/comdat5.C.jj	2024-04-25 10:28:35.842585513 +0200
+++ gcc/testsuite/g++.dg/abi/comdat5.C	2024-04-25 10:28:52.413352362 +0200
@@ -0,0 +1,28 @@
+// PR lto/113208
+// { dg-do compile { target { c++11 && { *-*-*gnu* } } } }
+// { dg-additional-options "-O2" }
+// { dg-final { scan-assembler-not "_ZN1BI1CEC5ERKS1_,comdat" } }
+// { dg-final { scan-assembler-not "_ZN1BI1CEC1ERKS1_,comdat" } }
+// { dg-final { scan-assembler-not "_ZN1BI1CEC2ERKS1_,comdat" } }
+
+template <typename T>
+struct A {
+  int foo () const;
+  A (int, int);
+};
+template <typename T>
+struct B : A<T> {
+  constexpr B (const B &x) : A<T> (1, x.foo ()) {}
+  B () : A<T> (1, 2) {}
+};
+struct C;
+struct D : B<C> {};
+void bar (D);
+
+void
+baz (D x)
+{
+  bar (x);
+}
+
+extern template struct B<C>;
--- gcc/testsuite/g++.dg/lto/pr113208_0.C.jj	2024-04-25 09:47:06.856991210 +0200
+++ gcc/testsuite/g++.dg/lto/pr113208_0.C	2024-04-25 09:47:06.855991224 +0200
@@ -0,0 +1,13 @@
+// { dg-lto-do link }
+// { dg-lto-options { {-O1 -std=c++20 -flto}} }
+// { dg-extra-ld-options "-r -nostdlib -flinker-output=nolto-rel" }
+// { dg-require-linker-plugin "" }
+
+#define CONSTEXPR constexpr
+#include "pr113208.h"
+
+struct QualityValue;
+struct k : vector<QualityValue> {};
+
+void m(k);
+void n(k i) { m(i); }
--- gcc/testsuite/g++.dg/lto/pr113208_1.C.jj	2024-04-25 09:47:06.856991210 +0200
+++ gcc/testsuite/g++.dg/lto/pr113208_1.C	2024-04-25 09:47:06.856991210 +0200
@@ -0,0 +1,6 @@
+#define CONSTEXPR 
+#include "pr113208.h"
+
+struct QualityValue;
+vector<QualityValue> values1;
+vector<QualityValue> values{values1};
--- gcc/testsuite/g++.dg/lto/pr113208.h.jj	2024-04-25 09:47:06.856991210 +0200
+++ gcc/testsuite/g++.dg/lto/pr113208.h	2024-04-25 09:47:06.856991210 +0200
@@ -0,0 +1,10 @@
+template <typename _Tp> struct _Vector_base {
+  int g() const;
+  _Vector_base(int, int);
+};
+template <typename _Tp>
+struct vector : _Vector_base<_Tp> {
+  CONSTEXPR vector(const vector &__x)
+      : _Vector_base<_Tp>(1, __x.g()) {}
+ vector() : _Vector_base<_Tp>(1, 2) {}
+};


	Jakub


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] c++, v4: Retry the aliasing of base/complete cdtor optimization at import_export_decl time [PR113208]
  2024-04-25 12:02                 ` [PATCH] c++, v4: " Jakub Jelinek
@ 2024-04-25 14:22                   ` Jakub Jelinek
  2024-04-25 15:30                     ` Jason Merrill
  0 siblings, 1 reply; 25+ messages in thread
From: Jakub Jelinek @ 2024-04-25 14:22 UTC (permalink / raw)
  To: Jason Merrill, Jonathan Wakely, Jan Hubicka, gcc-patches,
	Richard Biener, Patrick Palka

On Thu, Apr 25, 2024 at 02:02:32PM +0200, Jakub Jelinek wrote:
> I've tried the following patch, but unfortunately that lead to large
> number of regressions:
> +FAIL: g++.dg/cpp0x/initlist25.C  -std=c++17 (test for excess errors)

So the reduced testcase for this is
template <typename T, typename U> struct A {
  T a1;
  U a2;
  template <typename V, typename W, bool = true>
  constexpr A(V &&x, W &&y) : a1(x), a2(y) {}
};
template <typename> struct B;
namespace std {
template <class> struct initializer_list {
  int *_M_array;
  decltype (sizeof 0) _M_len;
};
}
template <typename T, typename U> struct C {
  void foo (std::initializer_list<A<const T, U>>);
};
template <class> struct D;
template <typename T, typename = D<T>, typename = B<T>>
struct E { E (const char *); ~E (); };
int
main ()
{
  C<E<char>, E<char>> m;
  m.foo ({{"t", "t"}, {"y", "y"}});
}
Without the patch I've just posted or even with the earlier version
of the patch the
_ZN1AIK1EIc1DIcE1BIcEES5_EC[12]IRA2_KcSB_Lb1EEEOT_OT0_
ctors were emitted, but with this patch they are unresolved externals.

The reason is that the code actually uses (calls) the
_ZN1AIK1EIc1DIcE1BIcEES5_EC1IRA2_KcSB_Lb1EEEOT_OT0_
__ct_comp constructor, that one has TREE_USED, while the
_ZN1AIK1EIc1DIcE1BIcEES5_EC2IRA2_KcSB_Lb1EEEOT_OT0_
__ct_base constructor is not TREE_USED.

But the c_parse_final_cleanups loop over
FOR_EACH_VEC_SAFE_ELT (deferred_fns, i, decl)
will ignore the TREE_USED __ct_comp because it is an alias
and so has !DECL_SAVED_TREE:
5273		  if (!DECL_SAVED_TREE (decl))
5274		    continue;

With the following incremental patch the tests in make check-g++
(haven't tried the coroutine one) which failed with the earlier patch
now pass.

--- gcc/cp/decl2.cc.jj	2024-04-25 10:52:21.057535959 +0200
+++ gcc/cp/decl2.cc	2024-04-25 16:19:17.385547357 +0200
@@ -5271,7 +5271,19 @@ c_parse_final_cleanups (void)
 	    generate_tls_wrapper (decl);
 
 	  if (!DECL_SAVED_TREE (decl))
-	    continue;
+	    {
+	      cgraph_node *node;
+	      tree tgt;
+	      /* Even when maybe_clone_body created same body alias
+		 has no DECL_SAVED_TREE, if its alias target does,
+		 don't skip it.  */
+	      if (!DECL_CLONED_FUNCTION (decl)
+		  || !(node = cgraph_node::get (decl))
+		  || !node->cpp_implicit_alias
+		  || !(tgt = node->get_alias_target_tree ())
+		  || !DECL_SAVED_TREE (tgt))
+		continue;
+	    }
 
 	  cgraph_node *node = cgraph_node::get_create (decl);
 
@@ -5299,7 +5311,7 @@ c_parse_final_cleanups (void)
 		node = node->get_alias_target ();
 
 	      node->call_for_symbol_thunks_and_aliases (clear_decl_external,
-						      NULL, true);
+							NULL, true);
 	      /* If we mark !DECL_EXTERNAL one of the symbols in some comdat
 		 group, we need to mark all symbols in the same comdat group
 		 that way.  */
@@ -5309,7 +5321,7 @@ c_parse_final_cleanups (void)
 		     next != node;
 		     next = dyn_cast<cgraph_node *> (next->same_comdat_group))
 		  next->call_for_symbol_thunks_and_aliases (clear_decl_external,
-							  NULL, true);
+							    NULL, true);
 	    }
 
 	  /* If we're going to need to write this function out, and


	Jakub


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] c++, v4: Retry the aliasing of base/complete cdtor optimization at import_export_decl time [PR113208]
  2024-04-25 14:22                   ` Jakub Jelinek
@ 2024-04-25 15:30                     ` Jason Merrill
  2024-04-25 18:42                       ` [PATCH] c++, v5: " Jakub Jelinek
  2024-05-09 18:20                       ` [PATCH] c++: Optimize in maybe_clone_body aliases even when not at_eof [PR113208] Jakub Jelinek
  0 siblings, 2 replies; 25+ messages in thread
From: Jason Merrill @ 2024-04-25 15:30 UTC (permalink / raw)
  To: Jakub Jelinek, Jonathan Wakely, Jan Hubicka, gcc-patches,
	Richard Biener, Patrick Palka

On 4/25/24 07:22, Jakub Jelinek wrote:
> On Thu, Apr 25, 2024 at 02:02:32PM +0200, Jakub Jelinek wrote:
>> I've tried the following patch, but unfortunately that lead to large
>> number of regressions:
>> +FAIL: g++.dg/cpp0x/initlist25.C  -std=c++17 (test for excess errors)
> 
> So the reduced testcase for this is
> template <typename T, typename U> struct A {
>    T a1;
>    U a2;
>    template <typename V, typename W, bool = true>
>    constexpr A(V &&x, W &&y) : a1(x), a2(y) {}
> };
> template <typename> struct B;
> namespace std {
> template <class> struct initializer_list {
>    int *_M_array;
>    decltype (sizeof 0) _M_len;
> };
> }
> template <typename T, typename U> struct C {
>    void foo (std::initializer_list<A<const T, U>>);
> };
> template <class> struct D;
> template <typename T, typename = D<T>, typename = B<T>>
> struct E { E (const char *); ~E (); };
> int
> main ()
> {
>    C<E<char>, E<char>> m;
>    m.foo ({{"t", "t"}, {"y", "y"}});
> }
> Without the patch I've just posted or even with the earlier version
> of the patch the
> _ZN1AIK1EIc1DIcE1BIcEES5_EC[12]IRA2_KcSB_Lb1EEEOT_OT0_
> ctors were emitted, but with this patch they are unresolved externals.
> 
> The reason is that the code actually uses (calls) the
> _ZN1AIK1EIc1DIcE1BIcEES5_EC1IRA2_KcSB_Lb1EEEOT_OT0_
> __ct_comp constructor, that one has TREE_USED, while the
> _ZN1AIK1EIc1DIcE1BIcEES5_EC2IRA2_KcSB_Lb1EEEOT_OT0_
> __ct_base constructor is not TREE_USED.
> 
> But the c_parse_final_cleanups loop over
> FOR_EACH_VEC_SAFE_ELT (deferred_fns, i, decl)
> will ignore the TREE_USED __ct_comp because it is an alias
> and so has !DECL_SAVED_TREE:
> 5273		  if (!DECL_SAVED_TREE (decl))
> 5274		    continue;

Hmm, maybe maybe_clone_body shouldn't clear DECL_SAVED_TREE for aliases, 
but rather set it to some stub like void_node?

Though with all these changes, it's probably better to go with your 
first patch for GCC 14 and delay this approach to 15.  Your v1 patch is 
OK for 14.

Jason


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH] c++, v5: Retry the aliasing of base/complete cdtor optimization at import_export_decl time [PR113208]
  2024-04-25 15:30                     ` Jason Merrill
@ 2024-04-25 18:42                       ` Jakub Jelinek
  2024-05-09 18:20                       ` [PATCH] c++: Optimize in maybe_clone_body aliases even when not at_eof [PR113208] Jakub Jelinek
  1 sibling, 0 replies; 25+ messages in thread
From: Jakub Jelinek @ 2024-04-25 18:42 UTC (permalink / raw)
  To: Jason Merrill
  Cc: Jonathan Wakely, Jan Hubicka, gcc-patches, Richard Biener, Patrick Palka

On Thu, Apr 25, 2024 at 11:30:48AM -0400, Jason Merrill wrote:
> Hmm, maybe maybe_clone_body shouldn't clear DECL_SAVED_TREE for aliases, but
> rather set it to some stub like void_node?

I'll try that in stage1.

> Though with all these changes, it's probably better to go with your first
> patch for GCC 14 and delay this approach to 15.  Your v1 patch is OK for 14.

Just to record, following patch passed bootstrap/regtest on x86_64-linux and
i686-linux.  But I've committed the v1 version instead with the addition of
comdat2.C and comdat5.C testcases from this patch now and in stage1 will
post an incremental diff.

Thanks.

2024-04-25  Jakub Jelinek  <jakub@redhat.com>
	    Jason Merrill  <jason@redhat.com>

	PR lto/113208
	* decl2.cc (tentative_decl_linkage): Call maybe_make_one_only
	for implicit instantiations of maybe in charge ctors/dtors
	declared inline.
	(c_parse_final_cleanups): Don't skip used same body aliases which
	have non-NULL DECL_SAVED_TREE on the alias target.  Formatting fixes.
	* optimize.cc (can_alias_cdtor): Adjust condition, for
	HAVE_COMDAT_GROUP && DECL_ONE_ONLY && DECL_WEAK return true even
	if not DECL_INTERFACE_KNOWN.
	* decl.cc (cxx_comdat_group): For DECL_CLONED_FUNCTION_P
	functions if SUPPORTS_ONE_ONLY return DECL_COMDAT_GROUP if already
	set.

	* g++.dg/abi/comdat2.C: New test.
	* g++.dg/abi/comdat3.C: New test.
	* g++.dg/abi/comdat4.C: New test.
	* g++.dg/abi/comdat5.C: New test.
	* g++.dg/lto/pr113208_0.C: New test.
	* g++.dg/lto/pr113208_1.C: New file.
	* g++.dg/lto/pr113208.h: New file.

--- gcc/cp/decl2.cc.jj	2024-04-24 18:28:22.299513620 +0200
+++ gcc/cp/decl2.cc	2024-04-25 16:19:17.385547357 +0200
@@ -3312,16 +3312,23 @@ tentative_decl_linkage (tree decl)
 	     linkage of all functions, and as that causes writes to
 	     the data mapped in from the PCH file, it's advantageous
 	     to mark the functions at this point.  */
-	  if (DECL_DECLARED_INLINE_P (decl)
-	      && (!DECL_IMPLICIT_INSTANTIATION (decl)
-		  || DECL_DEFAULTED_FN (decl)))
+	  if (DECL_DECLARED_INLINE_P (decl))
 	    {
-	      /* This function must have external linkage, as
-		 otherwise DECL_INTERFACE_KNOWN would have been
-		 set.  */
-	      gcc_assert (TREE_PUBLIC (decl));
-	      comdat_linkage (decl);
-	      DECL_INTERFACE_KNOWN (decl) = 1;
+	      if (!DECL_IMPLICIT_INSTANTIATION (decl)
+		  || DECL_DEFAULTED_FN (decl))
+		{
+		  /* This function must have external linkage, as
+		     otherwise DECL_INTERFACE_KNOWN would have been
+		     set.  */
+		  gcc_assert (TREE_PUBLIC (decl));
+		  comdat_linkage (decl);
+		  DECL_INTERFACE_KNOWN (decl) = 1;
+		}
+	      else if (DECL_MAYBE_IN_CHARGE_CDTOR_P (decl))
+		/* For implicit instantiations of cdtors try to make
+		   it comdat, so that maybe_clone_body can use aliases.
+		   See PR113208.  */
+		maybe_make_one_only (decl);
 	    }
 	}
       else if (VAR_P (decl))
@@ -5264,7 +5271,19 @@ c_parse_final_cleanups (void)
 	    generate_tls_wrapper (decl);
 
 	  if (!DECL_SAVED_TREE (decl))
-	    continue;
+	    {
+	      cgraph_node *node;
+	      tree tgt;
+	      /* Even when maybe_clone_body created same body alias
+		 has no DECL_SAVED_TREE, if its alias target does,
+		 don't skip it.  */
+	      if (!DECL_CLONED_FUNCTION (decl)
+		  || !(node = cgraph_node::get (decl))
+		  || !node->cpp_implicit_alias
+		  || !(tgt = node->get_alias_target_tree ())
+		  || !DECL_SAVED_TREE (tgt))
+		continue;
+	    }
 
 	  cgraph_node *node = cgraph_node::get_create (decl);
 
@@ -5292,7 +5311,7 @@ c_parse_final_cleanups (void)
 		node = node->get_alias_target ();
 
 	      node->call_for_symbol_thunks_and_aliases (clear_decl_external,
-						      NULL, true);
+							NULL, true);
 	      /* If we mark !DECL_EXTERNAL one of the symbols in some comdat
 		 group, we need to mark all symbols in the same comdat group
 		 that way.  */
@@ -5302,7 +5321,7 @@ c_parse_final_cleanups (void)
 		     next != node;
 		     next = dyn_cast<cgraph_node *> (next->same_comdat_group))
 		  next->call_for_symbol_thunks_and_aliases (clear_decl_external,
-							  NULL, true);
+							    NULL, true);
 	    }
 
 	  /* If we're going to need to write this function out, and
--- gcc/cp/optimize.cc.jj	2024-04-25 09:47:24.936736833 +0200
+++ gcc/cp/optimize.cc	2024-04-25 10:19:25.338332564 +0200
@@ -220,10 +220,8 @@ can_alias_cdtor (tree fn)
   gcc_assert (DECL_MAYBE_IN_CHARGE_CDTOR_P (fn));
   /* Don't use aliases for weak/linkonce definitions unless we can put both
      symbols in the same COMDAT group.  */
-  return (DECL_INTERFACE_KNOWN (fn)
-	  && (SUPPORTS_ONE_ONLY || !DECL_WEAK (fn))
-	  && (!DECL_ONE_ONLY (fn)
-	      || (HAVE_COMDAT_GROUP && DECL_WEAK (fn))));
+  return (DECL_WEAK (fn) ? (HAVE_COMDAT_GROUP && DECL_ONE_ONLY (fn))
+			 : (DECL_INTERFACE_KNOWN (fn) && !DECL_ONE_ONLY (fn)));
 }
 
 /* FN is a [cd]tor, fns is a pointer to an array of length 3.  Fill fns
--- gcc/cp/decl.cc.jj	2024-04-24 18:28:22.231514532 +0200
+++ gcc/cp/decl.cc	2024-04-25 09:47:06.855991224 +0200
@@ -19254,6 +19254,14 @@ cxx_comdat_group (tree decl)
 	  else
 	    break;
 	}
+      /* If a ctor/dtor has already set the comdat group by
+	 maybe_clone_body, don't override it.  */
+      if (SUPPORTS_ONE_ONLY
+	  && TREE_CODE (decl) == FUNCTION_DECL
+	  && DECL_CLONED_FUNCTION_P (decl)
+	  && SUPPORTS_ONE_ONLY)
+	if (tree comdat = DECL_COMDAT_GROUP (decl))
+	  return comdat;
     }
 
   return decl;
--- gcc/testsuite/g++.dg/abi/comdat2.C.jj	2024-04-25 09:47:06.855991224 +0200
+++ gcc/testsuite/g++.dg/abi/comdat2.C	2024-04-25 09:47:06.855991224 +0200
@@ -0,0 +1,26 @@
+// PR lto/113208
+// { dg-do compile { target { c++11 && { *-*-*gnu* } } } }
+// { dg-additional-options "-O2 -fkeep-inline-functions" }
+// { dg-final { scan-assembler "_ZN1BI1CEC5ERKS1_,comdat" } }
+// { dg-final { scan-assembler-not "_ZN1BI1CEC1ERKS1_,comdat" } }
+// { dg-final { scan-assembler-not "_ZN1BI1CEC2ERKS1_,comdat" } }
+
+template <typename T>
+struct A {
+  int foo () const;
+  A (int, int);
+};
+template <typename T>
+struct B : A<T> {
+  constexpr B (const B &x) : A<T> (1, x.foo ()) {}
+  B () : A<T> (1, 2) {}
+};
+struct C;
+struct D : B<C> {};
+void bar (D);
+
+void
+baz (D x)
+{
+  bar (x);
+}
--- gcc/testsuite/g++.dg/abi/comdat3.C.jj	2024-04-25 09:47:06.855991224 +0200
+++ gcc/testsuite/g++.dg/abi/comdat3.C	2024-04-25 09:47:06.855991224 +0200
@@ -0,0 +1,22 @@
+// PR lto/113208
+// { dg-do compile { target { c++11 && { *-*-*gnu* } } } }
+// { dg-additional-options "-O2" }
+// { dg-final { scan-assembler "_ZN1M1SINS_1P1TELN1N1LE2EEC5Ev,comdat" } }
+// { dg-final { scan-assembler-not "_ZN1M1SINS_1P1TELN1N1LE2EEC1Ev,comdat" } }
+// { dg-final { scan-assembler-not "_ZN1M1SINS_1P1TELN1N1LE2EEC2Ev,comdat" } }
+
+namespace N { enum L { L1, L2, L3 } const O = L3; }
+namespace M {
+  using N::O;
+  using N::L;
+  template <typename, L = O>
+  struct S { constexpr S () {} };
+  namespace P {
+    struct T;
+    struct U { S<T> u; };
+    void foo () { U (); }
+  }
+  extern template class S<P::T>;
+}
+namespace p = M::P;
+template class M::S<p::T>;
--- gcc/testsuite/g++.dg/abi/comdat4.C.jj	2024-04-25 10:27:54.141172257 +0200
+++ gcc/testsuite/g++.dg/abi/comdat4.C	2024-04-25 10:28:22.493773334 +0200
@@ -0,0 +1,28 @@
+// PR lto/113208
+// { dg-do compile { target { c++11 && { *-*-*gnu* } } } }
+// { dg-additional-options "-O2" }
+// { dg-final { scan-assembler "_ZN1BI1CEC5ERKS1_,comdat" } }
+// { dg-final { scan-assembler-not "_ZN1BI1CEC1ERKS1_,comdat" } }
+// { dg-final { scan-assembler-not "_ZN1BI1CEC2ERKS1_,comdat" } }
+
+template <typename T>
+struct A {
+  int foo () const;
+  A (int, int);
+};
+template <typename T>
+struct B : A<T> {
+  constexpr B (const B &x) : A<T> (1, x.foo ()) {}
+  B () : A<T> (1, 2) {}
+};
+struct C;
+struct D : B<C> {};
+void bar (D);
+
+void
+baz (D x)
+{
+  bar (x);
+}
+
+template struct B<C>;
--- gcc/testsuite/g++.dg/abi/comdat5.C.jj	2024-04-25 10:28:35.842585513 +0200
+++ gcc/testsuite/g++.dg/abi/comdat5.C	2024-04-25 10:28:52.413352362 +0200
@@ -0,0 +1,28 @@
+// PR lto/113208
+// { dg-do compile { target { c++11 && { *-*-*gnu* } } } }
+// { dg-additional-options "-O2" }
+// { dg-final { scan-assembler-not "_ZN1BI1CEC5ERKS1_,comdat" } }
+// { dg-final { scan-assembler-not "_ZN1BI1CEC1ERKS1_,comdat" } }
+// { dg-final { scan-assembler-not "_ZN1BI1CEC2ERKS1_,comdat" } }
+
+template <typename T>
+struct A {
+  int foo () const;
+  A (int, int);
+};
+template <typename T>
+struct B : A<T> {
+  constexpr B (const B &x) : A<T> (1, x.foo ()) {}
+  B () : A<T> (1, 2) {}
+};
+struct C;
+struct D : B<C> {};
+void bar (D);
+
+void
+baz (D x)
+{
+  bar (x);
+}
+
+extern template struct B<C>;
--- gcc/testsuite/g++.dg/lto/pr113208_0.C.jj	2024-04-25 09:47:06.856991210 +0200
+++ gcc/testsuite/g++.dg/lto/pr113208_0.C	2024-04-25 09:47:06.855991224 +0200
@@ -0,0 +1,13 @@
+// { dg-lto-do link }
+// { dg-lto-options { {-O1 -std=c++20 -flto}} }
+// { dg-extra-ld-options "-r -nostdlib -flinker-output=nolto-rel" }
+// { dg-require-linker-plugin "" }
+
+#define CONSTEXPR constexpr
+#include "pr113208.h"
+
+struct QualityValue;
+struct k : vector<QualityValue> {};
+
+void m(k);
+void n(k i) { m(i); }
--- gcc/testsuite/g++.dg/lto/pr113208_1.C.jj	2024-04-25 09:47:06.856991210 +0200
+++ gcc/testsuite/g++.dg/lto/pr113208_1.C	2024-04-25 09:47:06.856991210 +0200
@@ -0,0 +1,6 @@
+#define CONSTEXPR 
+#include "pr113208.h"
+
+struct QualityValue;
+vector<QualityValue> values1;
+vector<QualityValue> values{values1};
--- gcc/testsuite/g++.dg/lto/pr113208.h.jj	2024-04-25 09:47:06.856991210 +0200
+++ gcc/testsuite/g++.dg/lto/pr113208.h	2024-04-25 09:47:06.856991210 +0200
@@ -0,0 +1,10 @@
+template <typename _Tp> struct _Vector_base {
+  int g() const;
+  _Vector_base(int, int);
+};
+template <typename _Tp>
+struct vector : _Vector_base<_Tp> {
+  CONSTEXPR vector(const vector &__x)
+      : _Vector_base<_Tp>(1, __x.g()) {}
+ vector() : _Vector_base<_Tp>(1, 2) {}
+};


	Jakub


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH] c++: Optimize in maybe_clone_body aliases even when not at_eof [PR113208]
  2024-04-25 15:30                     ` Jason Merrill
  2024-04-25 18:42                       ` [PATCH] c++, v5: " Jakub Jelinek
@ 2024-05-09 18:20                       ` Jakub Jelinek
  2024-05-09 18:58                         ` Marek Polacek
  2024-05-10 19:59                         ` Jason Merrill
  1 sibling, 2 replies; 25+ messages in thread
From: Jakub Jelinek @ 2024-05-09 18:20 UTC (permalink / raw)
  To: Jason Merrill
  Cc: Jonathan Wakely, Jan Hubicka, gcc-patches, Richard Biener, Patrick Palka

On Thu, Apr 25, 2024 at 11:30:48AM -0400, Jason Merrill wrote:
> Hmm, maybe maybe_clone_body shouldn't clear DECL_SAVED_TREE for aliases, but
> rather set it to some stub like void_node?
> 
> Though with all these changes, it's probably better to go with your first
> patch for GCC 14 and delay this approach to 15.  Your v1 patch is OK for 14.

Ok, here is an updated patch, which sets DECL_SAVED_TREE to void_node for
the aliases together with reversion of the earlier committed patch.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-05-09  Jakub Jelinek  <jakub@redhat.com>
	    Jason Merrill  <jason@redhat.com>

	PR lto/113208
	* cp-tree.h (maybe_optimize_cdtor): Remove.
	* decl2.cc (tentative_decl_linkage): Call maybe_make_one_only
	for implicit instantiations of maybe in charge ctors/dtors
	declared inline.
	(import_export_decl): Don't call maybe_optimize_cdtor.
	(c_parse_final_cleanups): Formatting fixes.
	* optimize.cc (can_alias_cdtor): Adjust condition, for
	HAVE_COMDAT_GROUP && DECL_ONE_ONLY && DECL_WEAK return true even
	if not DECL_INTERFACE_KNOWN.
	(maybe_clone_body): Don't clear DECL_SAVED_TREE, instead set it
	to void_node.
	(maybe_clone_body): Remove.
	* decl.cc (cxx_comdat_group): For DECL_CLONED_FUNCTION_P
	functions if SUPPORTS_ONE_ONLY return DECL_COMDAT_GROUP if already
	set.

	* g++.dg/abi/comdat3.C: New test.
	* g++.dg/abi/comdat4.C: New test.

--- gcc/cp/cp-tree.h.jj	2024-05-09 10:30:54.775505524 +0200
+++ gcc/cp/cp-tree.h	2024-05-09 17:07:01.246206288 +0200
@@ -7451,7 +7451,6 @@ extern bool handle_module_option (unsign
 /* In optimize.cc */
 extern tree clone_attrs				(tree);
 extern bool maybe_clone_body			(tree);
-extern void maybe_optimize_cdtor		(tree);
 
 /* In parser.cc */
 extern tree cp_convert_range_for (tree, tree, tree, cp_decomp *, bool,
--- gcc/cp/decl2.cc.jj	2024-05-02 09:31:17.753298180 +0200
+++ gcc/cp/decl2.cc	2024-05-09 17:11:11.676836268 +0200
@@ -3325,16 +3325,23 @@ tentative_decl_linkage (tree decl)
 	     linkage of all functions, and as that causes writes to
 	     the data mapped in from the PCH file, it's advantageous
 	     to mark the functions at this point.  */
-	  if (DECL_DECLARED_INLINE_P (decl)
-	      && (!DECL_IMPLICIT_INSTANTIATION (decl)
-		  || DECL_DEFAULTED_FN (decl)))
+	  if (DECL_DECLARED_INLINE_P (decl))
 	    {
-	      /* This function must have external linkage, as
-		 otherwise DECL_INTERFACE_KNOWN would have been
-		 set.  */
-	      gcc_assert (TREE_PUBLIC (decl));
-	      comdat_linkage (decl);
-	      DECL_INTERFACE_KNOWN (decl) = 1;
+	      if (!DECL_IMPLICIT_INSTANTIATION (decl)
+		  || DECL_DEFAULTED_FN (decl))
+		{
+		  /* This function must have external linkage, as
+		     otherwise DECL_INTERFACE_KNOWN would have been
+		     set.  */
+		  gcc_assert (TREE_PUBLIC (decl));
+		  comdat_linkage (decl);
+		  DECL_INTERFACE_KNOWN (decl) = 1;
+		}
+	      else if (DECL_MAYBE_IN_CHARGE_CDTOR_P (decl))
+		/* For implicit instantiations of cdtors try to make
+		   it comdat, so that maybe_clone_body can use aliases.
+		   See PR113208.  */
+		maybe_make_one_only (decl);
 	    }
 	}
       else if (VAR_P (decl))
@@ -3604,9 +3611,6 @@ import_export_decl (tree decl)
     }
 
   DECL_INTERFACE_KNOWN (decl) = 1;
-
-  if (DECL_CLONED_FUNCTION_P (decl))
-    maybe_optimize_cdtor (decl);
 }
 
 /* Return an expression that performs the destruction of DECL, which
@@ -5331,7 +5335,7 @@ c_parse_final_cleanups (void)
 		node = node->get_alias_target ();
 
 	      node->call_for_symbol_thunks_and_aliases (clear_decl_external,
-						      NULL, true);
+							NULL, true);
 	      /* If we mark !DECL_EXTERNAL one of the symbols in some comdat
 		 group, we need to mark all symbols in the same comdat group
 		 that way.  */
@@ -5341,7 +5345,7 @@ c_parse_final_cleanups (void)
 		     next != node;
 		     next = dyn_cast<cgraph_node *> (next->same_comdat_group))
 		  next->call_for_symbol_thunks_and_aliases (clear_decl_external,
-							  NULL, true);
+							    NULL, true);
 	    }
 
 	  /* If we're going to need to write this function out, and
--- gcc/cp/optimize.cc.jj	2024-04-25 20:33:30.771858912 +0200
+++ gcc/cp/optimize.cc	2024-05-09 17:10:23.920478922 +0200
@@ -220,10 +220,8 @@ can_alias_cdtor (tree fn)
   gcc_assert (DECL_MAYBE_IN_CHARGE_CDTOR_P (fn));
   /* Don't use aliases for weak/linkonce definitions unless we can put both
      symbols in the same COMDAT group.  */
-  return (DECL_INTERFACE_KNOWN (fn)
-	  && (SUPPORTS_ONE_ONLY || !DECL_WEAK (fn))
-	  && (!DECL_ONE_ONLY (fn)
-	      || (HAVE_COMDAT_GROUP && DECL_WEAK (fn))));
+  return (DECL_WEAK (fn) ? (HAVE_COMDAT_GROUP && DECL_ONE_ONLY (fn))
+			 : (DECL_INTERFACE_KNOWN (fn) && !DECL_ONE_ONLY (fn)));
 }
 
 /* FN is a [cd]tor, fns is a pointer to an array of length 3.  Fill fns
@@ -712,7 +710,7 @@ maybe_clone_body (tree fn)
 	  if (expand_or_defer_fn_1 (clone))
 	    emit_associated_thunks (clone);
 	  /* We didn't generate a body, so remove the empty one.  */
-	  DECL_SAVED_TREE (clone) = NULL_TREE;
+	  DECL_SAVED_TREE (clone) = void_node;
 	}
       else
 	expand_or_defer_fn (clone);
@@ -723,58 +721,3 @@ maybe_clone_body (tree fn)
   /* We don't need to process the original function any further.  */
   return 1;
 }
-
-/* If maybe_clone_body is called while the cdtor is still tentative,
-   DECL_ONE_ONLY will be false and so will be can_alias_cdtor (fn).
-   In that case we wouldn't try to optimize using an alias and instead
-   would emit separate base and complete cdtor.  The following function
-   attempts to still optimize that case when we import_export_decl
-   is called first time on one of the clones.  */
-
-void
-maybe_optimize_cdtor (tree orig_decl)
-{
-  tree fns[3];
-  tree fn = DECL_CLONED_FUNCTION (orig_decl);
-  gcc_checking_assert (DECL_MAYBE_IN_CHARGE_CDTOR_P (fn));
-  if (DECL_INTERFACE_KNOWN (fn)
-      || !TREE_ASM_WRITTEN (fn)
-      || !DECL_ONE_ONLY (orig_decl)
-      || symtab->global_info_ready)
-    return;
-
-  populate_clone_array (fn, fns);
-
-  if (!fns[0] || !fns[1])
-    return;
-  for (int i = 2 - !fns[2]; i >= 0; --i)
-    if (fns[i] != orig_decl && DECL_INTERFACE_KNOWN (fns[i]))
-      return;
-  DECL_INTERFACE_KNOWN (fn) = 1;
-  comdat_linkage (fn);
-  if (!can_alias_cdtor (fn))
-    return;
-  /* For comdat base and complete cdtors put them into the same,
-     *[CD]5* comdat group instead of *[CD][12]*.  */
-  auto n0 = cgraph_node::get_create (fns[0]);
-  auto n1 = cgraph_node::get_create (fns[1]);
-  auto n2 = fns[2] ? cgraph_node::get_create (fns[1]) : NULL;
-  if (n0->lowered || n1->lowered || (n2 && n2->lowered))
-    return;
-  import_export_decl (fns[0]);
-  n1->definition = false;
-  if (!n0->create_same_body_alias (fns[1], fns[0]))
-    return;
-
-  tree comdat_group = cdtor_comdat_group (fns[1], fns[0]);
-  n1 = cgraph_node::get (fns[1]);
-  n0->set_comdat_group (comdat_group);
-  if (n1->same_comdat_group)
-    n1->remove_from_same_comdat_group ();
-  n1->add_to_same_comdat_group (n0);
-  if (fns[2])
-    n2->add_to_same_comdat_group (n0);
-  import_export_decl (fns[1]);
-  /* Remove the body now that it is an alias.  */
-  release_function_body (fns[1]);
-}
--- gcc/cp/decl.cc.jj	2024-05-09 10:30:54.804505130 +0200
+++ gcc/cp/decl.cc	2024-05-09 17:07:08.400110018 +0200
@@ -19280,6 +19280,14 @@ cxx_comdat_group (tree decl)
 	  else
 	    break;
 	}
+      /* If a ctor/dtor has already set the comdat group by
+	 maybe_clone_body, don't override it.  */
+      if (SUPPORTS_ONE_ONLY
+	  && TREE_CODE (decl) == FUNCTION_DECL
+	  && DECL_CLONED_FUNCTION_P (decl)
+	  && SUPPORTS_ONE_ONLY)
+	if (tree comdat = DECL_COMDAT_GROUP (decl))
+	  return comdat;
     }
 
   return decl;
--- gcc/testsuite/g++.dg/abi/comdat3.C.jj	2024-05-09 17:07:27.345855068 +0200
+++ gcc/testsuite/g++.dg/abi/comdat3.C	2024-05-09 17:07:08.400110018 +0200
@@ -0,0 +1,22 @@
+// PR lto/113208
+// { dg-do compile { target { c++11 && { *-*-*gnu* } } } }
+// { dg-additional-options "-O2" }
+// { dg-final { scan-assembler "_ZN1M1SINS_1P1TELN1N1LE2EEC5Ev,comdat" } }
+// { dg-final { scan-assembler-not "_ZN1M1SINS_1P1TELN1N1LE2EEC1Ev,comdat" } }
+// { dg-final { scan-assembler-not "_ZN1M1SINS_1P1TELN1N1LE2EEC2Ev,comdat" } }
+
+namespace N { enum L { L1, L2, L3 } const O = L3; }
+namespace M {
+  using N::O;
+  using N::L;
+  template <typename, L = O>
+  struct S { constexpr S () {} };
+  namespace P {
+    struct T;
+    struct U { S<T> u; };
+    void foo () { U (); }
+  }
+  extern template class S<P::T>;
+}
+namespace p = M::P;
+template class M::S<p::T>;
--- gcc/testsuite/g++.dg/abi/comdat4.C.jj	2024-05-09 17:07:31.961792950 +0200
+++ gcc/testsuite/g++.dg/abi/comdat4.C	2024-05-09 17:07:08.400110018 +0200
@@ -0,0 +1,28 @@
+// PR lto/113208
+// { dg-do compile { target { c++11 && { *-*-*gnu* } } } }
+// { dg-additional-options "-O2" }
+// { dg-final { scan-assembler "_ZN1BI1CEC5ERKS1_,comdat" } }
+// { dg-final { scan-assembler-not "_ZN1BI1CEC1ERKS1_,comdat" } }
+// { dg-final { scan-assembler-not "_ZN1BI1CEC2ERKS1_,comdat" } }
+
+template <typename T>
+struct A {
+  int foo () const;
+  A (int, int);
+};
+template <typename T>
+struct B : A<T> {
+  constexpr B (const B &x) : A<T> (1, x.foo ()) {}
+  B () : A<T> (1, 2) {}
+};
+struct C;
+struct D : B<C> {};
+void bar (D);
+
+void
+baz (D x)
+{
+  bar (x);
+}
+
+template struct B<C>;


	Jakub


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] c++: Optimize in maybe_clone_body aliases even when not at_eof [PR113208]
  2024-05-09 18:20                       ` [PATCH] c++: Optimize in maybe_clone_body aliases even when not at_eof [PR113208] Jakub Jelinek
@ 2024-05-09 18:58                         ` Marek Polacek
  2024-05-09 19:05                           ` Jakub Jelinek
  2024-05-10 19:59                         ` Jason Merrill
  1 sibling, 1 reply; 25+ messages in thread
From: Marek Polacek @ 2024-05-09 18:58 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Jason Merrill, Jonathan Wakely, Jan Hubicka, gcc-patches,
	Richard Biener, Patrick Palka

On Thu, May 09, 2024 at 08:20:00PM +0200, Jakub Jelinek wrote:
> --- gcc/cp/decl.cc.jj	2024-05-09 10:30:54.804505130 +0200
> +++ gcc/cp/decl.cc	2024-05-09 17:07:08.400110018 +0200
> @@ -19280,6 +19280,14 @@ cxx_comdat_group (tree decl)
>  	  else
>  	    break;
>  	}
> +      /* If a ctor/dtor has already set the comdat group by
> +	 maybe_clone_body, don't override it.  */
> +      if (SUPPORTS_ONE_ONLY
> +	  && TREE_CODE (decl) == FUNCTION_DECL
> +	  && DECL_CLONED_FUNCTION_P (decl)
> +	  && SUPPORTS_ONE_ONLY)
> +	if (tree comdat = DECL_COMDAT_GROUP (decl))
> +	  return comdat;

This checks SUPPORTS_ONE_ONLY twice.

Marek


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] c++: Optimize in maybe_clone_body aliases even when not at_eof [PR113208]
  2024-05-09 18:58                         ` Marek Polacek
@ 2024-05-09 19:05                           ` Jakub Jelinek
  0 siblings, 0 replies; 25+ messages in thread
From: Jakub Jelinek @ 2024-05-09 19:05 UTC (permalink / raw)
  To: Marek Polacek
  Cc: Jason Merrill, Jonathan Wakely, Jan Hubicka, gcc-patches,
	Richard Biener, Patrick Palka

On Thu, May 09, 2024 at 02:58:52PM -0400, Marek Polacek wrote:
> On Thu, May 09, 2024 at 08:20:00PM +0200, Jakub Jelinek wrote:
> > --- gcc/cp/decl.cc.jj	2024-05-09 10:30:54.804505130 +0200
> > +++ gcc/cp/decl.cc	2024-05-09 17:07:08.400110018 +0200
> > @@ -19280,6 +19280,14 @@ cxx_comdat_group (tree decl)
> >  	  else
> >  	    break;
> >  	}
> > +      /* If a ctor/dtor has already set the comdat group by
> > +	 maybe_clone_body, don't override it.  */
> > +      if (SUPPORTS_ONE_ONLY
> > +	  && TREE_CODE (decl) == FUNCTION_DECL
> > +	  && DECL_CLONED_FUNCTION_P (decl)
> > +	  && SUPPORTS_ONE_ONLY)
> > +	if (tree comdat = DECL_COMDAT_GROUP (decl))
> > +	  return comdat;
> 
> This checks SUPPORTS_ONE_ONLY twice.

Oops, you're right, fixed in my copy.

	Jakub


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] c++: Optimize in maybe_clone_body aliases even when not at_eof [PR113208]
  2024-05-09 18:20                       ` [PATCH] c++: Optimize in maybe_clone_body aliases even when not at_eof [PR113208] Jakub Jelinek
  2024-05-09 18:58                         ` Marek Polacek
@ 2024-05-10 19:59                         ` Jason Merrill
  2024-05-13 10:19                           ` Jakub Jelinek
  1 sibling, 1 reply; 25+ messages in thread
From: Jason Merrill @ 2024-05-10 19:59 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Jonathan Wakely, Jan Hubicka, gcc-patches, Richard Biener, Patrick Palka

On 5/9/24 14:20, Jakub Jelinek wrote:
> On Thu, Apr 25, 2024 at 11:30:48AM -0400, Jason Merrill wrote:
>> Hmm, maybe maybe_clone_body shouldn't clear DECL_SAVED_TREE for aliases, but
>> rather set it to some stub like void_node?
>>
>> Though with all these changes, it's probably better to go with your first
>> patch for GCC 14 and delay this approach to 15.  Your v1 patch is OK for 14.
> 
> Ok, here is an updated patch, which sets DECL_SAVED_TREE to void_node for
> the aliases together with reversion of the earlier committed patch.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> 2024-05-09  Jakub Jelinek  <jakub@redhat.com>
> 	    Jason Merrill  <jason@redhat.com>
> 
> 	PR lto/113208
> 	* cp-tree.h (maybe_optimize_cdtor): Remove.
> 	* decl2.cc (tentative_decl_linkage): Call maybe_make_one_only
> 	for implicit instantiations of maybe in charge ctors/dtors
> 	declared inline.
> 	(import_export_decl): Don't call maybe_optimize_cdtor.
> 	(c_parse_final_cleanups): Formatting fixes.
> 	* optimize.cc (can_alias_cdtor): Adjust condition, for
> 	HAVE_COMDAT_GROUP && DECL_ONE_ONLY && DECL_WEAK return true even
> 	if not DECL_INTERFACE_KNOWN.

> --- gcc/cp/optimize.cc.jj	2024-04-25 20:33:30.771858912 +0200
> +++ gcc/cp/optimize.cc	2024-05-09 17:10:23.920478922 +0200
> @@ -220,10 +220,8 @@ can_alias_cdtor (tree fn)
>     gcc_assert (DECL_MAYBE_IN_CHARGE_CDTOR_P (fn));
>     /* Don't use aliases for weak/linkonce definitions unless we can put both
>        symbols in the same COMDAT group.  */
> -  return (DECL_INTERFACE_KNOWN (fn)
> -	  && (SUPPORTS_ONE_ONLY || !DECL_WEAK (fn))
> -	  && (!DECL_ONE_ONLY (fn)
> -	      || (HAVE_COMDAT_GROUP && DECL_WEAK (fn))));
> +  return (DECL_WEAK (fn) ? (HAVE_COMDAT_GROUP && DECL_ONE_ONLY (fn))
> +			 : (DECL_INTERFACE_KNOWN (fn) && !DECL_ONE_ONLY (fn)));

Hmm, would

(DECL_ONE_ONLY (fn) ? HAVE_COMDAT_GROUP
  : (DECL_INTERFACE_KNOWN (fn) && !DECL_WEAK (fn)))

make sense instead?  I don't think DECL_WEAK is necessary for COMDAT.

Jason


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] c++: Optimize in maybe_clone_body aliases even when not at_eof [PR113208]
  2024-05-10 19:59                         ` Jason Merrill
@ 2024-05-13 10:19                           ` Jakub Jelinek
  2024-05-14 22:20                             ` Jason Merrill
  0 siblings, 1 reply; 25+ messages in thread
From: Jakub Jelinek @ 2024-05-13 10:19 UTC (permalink / raw)
  To: Jason Merrill
  Cc: Jonathan Wakely, Jan Hubicka, gcc-patches, Richard Biener, Patrick Palka

On Fri, May 10, 2024 at 03:59:25PM -0400, Jason Merrill wrote:
> > 2024-05-09  Jakub Jelinek  <jakub@redhat.com>
> > 	    Jason Merrill  <jason@redhat.com>
> > 
> > 	PR lto/113208
> > 	* cp-tree.h (maybe_optimize_cdtor): Remove.
> > 	* decl2.cc (tentative_decl_linkage): Call maybe_make_one_only
> > 	for implicit instantiations of maybe in charge ctors/dtors
> > 	declared inline.
> > 	(import_export_decl): Don't call maybe_optimize_cdtor.
> > 	(c_parse_final_cleanups): Formatting fixes.
> > 	* optimize.cc (can_alias_cdtor): Adjust condition, for
> > 	HAVE_COMDAT_GROUP && DECL_ONE_ONLY && DECL_WEAK return true even
> > 	if not DECL_INTERFACE_KNOWN.
> 
> > --- gcc/cp/optimize.cc.jj	2024-04-25 20:33:30.771858912 +0200
> > +++ gcc/cp/optimize.cc	2024-05-09 17:10:23.920478922 +0200
> > @@ -220,10 +220,8 @@ can_alias_cdtor (tree fn)
> >     gcc_assert (DECL_MAYBE_IN_CHARGE_CDTOR_P (fn));
> >     /* Don't use aliases for weak/linkonce definitions unless we can put both
> >        symbols in the same COMDAT group.  */
> > -  return (DECL_INTERFACE_KNOWN (fn)
> > -	  && (SUPPORTS_ONE_ONLY || !DECL_WEAK (fn))
> > -	  && (!DECL_ONE_ONLY (fn)
> > -	      || (HAVE_COMDAT_GROUP && DECL_WEAK (fn))));
> > +  return (DECL_WEAK (fn) ? (HAVE_COMDAT_GROUP && DECL_ONE_ONLY (fn))
> > +			 : (DECL_INTERFACE_KNOWN (fn) && !DECL_ONE_ONLY (fn)));
> 
> Hmm, would
> 
> (DECL_ONE_ONLY (fn) ? HAVE_COMDAT_GROUP
>  : (DECL_INTERFACE_KNOWN (fn) && !DECL_WEAK (fn)))
> 
> make sense instead?  I don't think DECL_WEAK is necessary for COMDAT.

I think it isn't indeed necessary for COMDAT, although e.g. comdat_linkage
will not call make_decl_one_only if !flag_weak.

But I think it is absolutely required for the alias cdtor optimization
in question, because otherwise it would be an ABI change.
Consider older version of GCC or some other compiler emitting
_ZN6vectorI12QualityValueEC1ERKS1_
and
_ZN6vectorI12QualityValueEC2ERKS1_
symbols not as aliases, each in their own comdat groups, so
.text._ZN6vectorI12QualityValueEC1ERKS1_ in _ZN6vectorI12QualityValueEC1ERKS1_
comdat group and
.text._ZN6vectorI12QualityValueEC2ERKS1_ in _ZN6vectorI12QualityValueEC2ERKS1_
comdat group.  And then comes GCC with the above patch without the DECL_WEAK
check in there, and decides to use alias, so
_ZN6vectorI12QualityValueEC1ERKS1_ is an alias to
_ZN6vectorI12QualityValueEC2ERKS1_ and both live in
.text._ZN6vectorI12QualityValueEC2ERKS1_ section in
_ZN6vectorI12QualityValueEC5ERKS1_ comdat group.  If you mix TUs with this,
the linker can keep one of the section sets from the _ZN6vectorI12QualityValueEC1ERKS1_
and _ZN6vectorI12QualityValueEC2ERKS1_ and _ZN6vectorI12QualityValueEC5ERKS1_
comdat groups.  If there is no .weak for the symbols, this will fail to
link, one can emit it either the old way or the new way but never both, it
is part of an ABI.
While with .weak, mixing it is possible, worst case one gets some unused
code in the linked binary or shared library.  Of course the desirable case
is that there is no mixing and there is no unused code, but if it happens,
no big deal.  Without .weak it is a big deal.

	Jakub


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] c++: Optimize in maybe_clone_body aliases even when not at_eof [PR113208]
  2024-05-13 10:19                           ` Jakub Jelinek
@ 2024-05-14 22:20                             ` Jason Merrill
  0 siblings, 0 replies; 25+ messages in thread
From: Jason Merrill @ 2024-05-14 22:20 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Jonathan Wakely, Jan Hubicka, gcc-patches, Richard Biener, Patrick Palka

On 5/13/24 06:19, Jakub Jelinek wrote:
> On Fri, May 10, 2024 at 03:59:25PM -0400, Jason Merrill wrote:
>>> 2024-05-09  Jakub Jelinek  <jakub@redhat.com>
>>> 	    Jason Merrill  <jason@redhat.com>
>>>
>>> 	PR lto/113208
>>> 	* cp-tree.h (maybe_optimize_cdtor): Remove.
>>> 	* decl2.cc (tentative_decl_linkage): Call maybe_make_one_only
>>> 	for implicit instantiations of maybe in charge ctors/dtors
>>> 	declared inline.
>>> 	(import_export_decl): Don't call maybe_optimize_cdtor.
>>> 	(c_parse_final_cleanups): Formatting fixes.
>>> 	* optimize.cc (can_alias_cdtor): Adjust condition, for
>>> 	HAVE_COMDAT_GROUP && DECL_ONE_ONLY && DECL_WEAK return true even
>>> 	if not DECL_INTERFACE_KNOWN.
>>
>>> --- gcc/cp/optimize.cc.jj	2024-04-25 20:33:30.771858912 +0200
>>> +++ gcc/cp/optimize.cc	2024-05-09 17:10:23.920478922 +0200
>>> @@ -220,10 +220,8 @@ can_alias_cdtor (tree fn)
>>>      gcc_assert (DECL_MAYBE_IN_CHARGE_CDTOR_P (fn));
>>>      /* Don't use aliases for weak/linkonce definitions unless we can put both
>>>         symbols in the same COMDAT group.  */
>>> -  return (DECL_INTERFACE_KNOWN (fn)
>>> -	  && (SUPPORTS_ONE_ONLY || !DECL_WEAK (fn))
>>> -	  && (!DECL_ONE_ONLY (fn)
>>> -	      || (HAVE_COMDAT_GROUP && DECL_WEAK (fn))));
>>> +  return (DECL_WEAK (fn) ? (HAVE_COMDAT_GROUP && DECL_ONE_ONLY (fn))
>>> +			 : (DECL_INTERFACE_KNOWN (fn) && !DECL_ONE_ONLY (fn)));
>>
>> Hmm, would
>>
>> (DECL_ONE_ONLY (fn) ? HAVE_COMDAT_GROUP
>>   : (DECL_INTERFACE_KNOWN (fn) && !DECL_WEAK (fn)))
>>
>> make sense instead?  I don't think DECL_WEAK is necessary for COMDAT.
> 
> I think it isn't indeed necessary for COMDAT, although e.g. comdat_linkage
> will not call make_decl_one_only if !flag_weak.
> 
> But I think it is absolutely required for the alias cdtor optimization
> in question, because otherwise it would be an ABI change.
> Consider older version of GCC or some other compiler emitting
> _ZN6vectorI12QualityValueEC1ERKS1_
> and
> _ZN6vectorI12QualityValueEC2ERKS1_
> symbols not as aliases, each in their own comdat groups, so
> .text._ZN6vectorI12QualityValueEC1ERKS1_ in _ZN6vectorI12QualityValueEC1ERKS1_
> comdat group and
> .text._ZN6vectorI12QualityValueEC2ERKS1_ in _ZN6vectorI12QualityValueEC2ERKS1_
> comdat group.  And then comes GCC with the above patch without the DECL_WEAK
> check in there, and decides to use alias, so
> _ZN6vectorI12QualityValueEC1ERKS1_ is an alias to
> _ZN6vectorI12QualityValueEC2ERKS1_ and both live in
> .text._ZN6vectorI12QualityValueEC2ERKS1_ section in
> _ZN6vectorI12QualityValueEC5ERKS1_ comdat group.  If you mix TUs with this,
> the linker can keep one of the section sets from the _ZN6vectorI12QualityValueEC1ERKS1_
> and _ZN6vectorI12QualityValueEC2ERKS1_ and _ZN6vectorI12QualityValueEC5ERKS1_
> comdat groups.  If there is no .weak for the symbols, this will fail to
> link, one can emit it either the old way or the new way but never both, it
> is part of an ABI.
> While with .weak, mixing it is possible, worst case one gets some unused
> code in the linked binary or shared library.  Of course the desirable case
> is that there is no mixing and there is no unused code, but if it happens,
> no big deal.  Without .weak it is a big deal.

Makes sense, the patch is OK.

Jason


^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2024-05-14 22:20 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-04-17  7:42 [PATCH] c++: Retry the aliasing of base/complete cdtor optimization at import_export_decl time [PR113208] Jakub Jelinek
2024-04-17  9:04 ` Jan Hubicka
2024-04-17 12:32   ` Jakub Jelinek
2024-04-17 13:26     ` Jan Hubicka
2024-04-17 14:07       ` Jakub Jelinek
2024-04-17 14:34         ` Jan Hubicka
2024-04-17 14:39           ` Jakub Jelinek
2024-04-22 15:42 ` [PATCH] c++, v2: " Jakub Jelinek
2024-04-23  3:14   ` Jason Merrill
2024-04-23 16:04     ` Jakub Jelinek
2024-04-24  9:16       ` Jonathan Wakely
2024-04-24 16:16         ` [PATCH] c++, v3: " Jakub Jelinek
2024-04-24 22:39           ` Jason Merrill
2024-04-24 22:47             ` Jakub Jelinek
2024-04-25  0:43               ` Jason Merrill
2024-04-25 12:02                 ` [PATCH] c++, v4: " Jakub Jelinek
2024-04-25 14:22                   ` Jakub Jelinek
2024-04-25 15:30                     ` Jason Merrill
2024-04-25 18:42                       ` [PATCH] c++, v5: " Jakub Jelinek
2024-05-09 18:20                       ` [PATCH] c++: Optimize in maybe_clone_body aliases even when not at_eof [PR113208] Jakub Jelinek
2024-05-09 18:58                         ` Marek Polacek
2024-05-09 19:05                           ` Jakub Jelinek
2024-05-10 19:59                         ` Jason Merrill
2024-05-13 10:19                           ` Jakub Jelinek
2024-05-14 22:20                             ` Jason Merrill

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).