TYPE_BINFO and canonical types at LTO

public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed

* TYPE_BINFO and canonical types at LTO
@ 2014-02-14  0:48 Jan Hubicka
  2014-02-14  8:06 ` Jason Merrill
  2014-02-14  8:42 ` Richard Biener
  0 siblings, 2 replies; 18+ messages in thread
From: Jan Hubicka @ 2014-02-14  0:48 UTC (permalink / raw)
  To: gcc, rguenther, jason

Hi,
I have noticed that record_component_aliases is called during LTO time and it
examines contents of BINFO:
0x5cd7a5 record_component_aliases(tree_node*)
        ../../gcc/alias.c:1005
0x5cd4a9 get_alias_set(tree_node*)
        ../../gcc/alias.c:895
0x5cc67a component_uses_parent_alias_set_from(tree_node const*)
        ../../gcc/alias.c:548
0x5ccc42 reference_alias_ptr_type_1
        ../../gcc/alias.c:660
0x5ccf93 get_alias_set(tree_node*)
        ../../gcc/alias.c:740
0xb823d8 indirect_refs_may_alias_p
        ../../gcc/tree-ssa-alias.c:1125
0xb82d8d refs_may_alias_p_1(ao_ref*, ao_ref*, bool)
        ../../gcc/tree-ssa-alias.c:1279
0xb848df stmt_may_clobber_ref_p_1(gimple_statement_base*, ao_ref*)
        ../../gcc/tree-ssa-alias.c:2013
0xb85d27 walk_non_aliased_vuses(ao_ref*, tree_node*, void* (*)(ao_ref*, tree_node*, unsigned int, void*), void* (*)(ao_ref*, tree_node*, void*), void*)
        ../../gcc/tree-ssa-alias.c:2411
0xc509f3 vn_reference_lookup(tree_node*, tree_node*, vn_lookup_kind, vn_reference_s**)
        ../../gcc/tree-ssa-sccvn.c:2063
0xc52ea4 visit_reference_op_store
        ../../gcc/tree-ssa-sccvn.c:2970
0xc55404 extract_and_process_scc_for_name
        ../../gcc/tree-ssa-sccvn.c:3825

This smells bad, since it is given a canonical type that is after the
structural equivalency merging that ignores BINFOs, so it may be completely
different class with completely different bases than the original.  Bases are
structuraly merged, too and may be exchanged for normal fields because
DECL_ARTIFICIAL (that separate bases and fields) does not seem to be part of
the canonical type definition in LTO.

I wonder if that code is needed after all:
    case QUAL_UNION_TYPE:
      /* Recursively record aliases for the base classes, if there are any.  */
      if (TYPE_BINFO (type))
        { 
          int i;
          tree binfo, base_binfo;

          for (binfo = TYPE_BINFO (type), i = 0;
               BINFO_BASE_ITERATE (binfo, i, base_binfo); i++)
            record_alias_subset (superset,
                                 get_alias_set (BINFO_TYPE (base_binfo)));
        }
      for (field = TYPE_FIELDS (type); field != 0; field = DECL_CHAIN (field))
        if (TREE_CODE (field) == FIELD_DECL && !DECL_NONADDRESSABLE_P (field))
          record_alias_subset (superset, get_alias_set (TREE_TYPE (field)));
      break;
all bases are also fields of within the type, so the second loop should notice
all the types seen by first loop if I am correct?
So perhaps the loop can be dropped at first place. 

Honza

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: TYPE_BINFO and canonical types at LTO
  2014-02-14  0:48 TYPE_BINFO and canonical types at LTO Jan Hubicka
@ 2014-02-14  8:06 ` Jason Merrill
  2014-02-14  8:42 ` Richard Biener
  1 sibling, 0 replies; 18+ messages in thread
From: Jason Merrill @ 2014-02-14  8:06 UTC (permalink / raw)
  To: Jan Hubicka, gcc, rguenther

On 02/13/2014 04:48 PM, Jan Hubicka wrote:
> all bases are also fields of within the type, so the second loop should notice
> all the types seen by first loop if I am correct?

Yes, except that empty bases don't get fields because they have no data. 
  But since they have no data they aren't interesting to aliasing 
either, so you should be OK just looking at field types.

Jason

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: TYPE_BINFO and canonical types at LTO
  2014-02-14  0:48 TYPE_BINFO and canonical types at LTO Jan Hubicka
  2014-02-14  8:06 ` Jason Merrill
@ 2014-02-14  8:42 ` Richard Biener
  2014-02-14 16:52   ` Jan Hubicka
  1 sibling, 1 reply; 18+ messages in thread
From: Richard Biener @ 2014-02-14  8:42 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: gcc, jason

On Fri, 14 Feb 2014, Jan Hubicka wrote:

> Hi,
> I have noticed that record_component_aliases is called during LTO time and it
> examines contents of BINFO:
> 0x5cd7a5 record_component_aliases(tree_node*)
>         ../../gcc/alias.c:1005
> 0x5cd4a9 get_alias_set(tree_node*)
>         ../../gcc/alias.c:895
> 0x5cc67a component_uses_parent_alias_set_from(tree_node const*)
>         ../../gcc/alias.c:548
> 0x5ccc42 reference_alias_ptr_type_1
>         ../../gcc/alias.c:660
> 0x5ccf93 get_alias_set(tree_node*)
>         ../../gcc/alias.c:740
> 0xb823d8 indirect_refs_may_alias_p
>         ../../gcc/tree-ssa-alias.c:1125
> 0xb82d8d refs_may_alias_p_1(ao_ref*, ao_ref*, bool)
>         ../../gcc/tree-ssa-alias.c:1279
> 0xb848df stmt_may_clobber_ref_p_1(gimple_statement_base*, ao_ref*)
>         ../../gcc/tree-ssa-alias.c:2013
> 0xb85d27 walk_non_aliased_vuses(ao_ref*, tree_node*, void* (*)(ao_ref*, tree_node*, unsigned int, void*), void* (*)(ao_ref*, tree_node*, void*), void*)
>         ../../gcc/tree-ssa-alias.c:2411
> 0xc509f3 vn_reference_lookup(tree_node*, tree_node*, vn_lookup_kind, vn_reference_s**)
>         ../../gcc/tree-ssa-sccvn.c:2063
> 0xc52ea4 visit_reference_op_store
>         ../../gcc/tree-ssa-sccvn.c:2970
> 0xc55404 extract_and_process_scc_for_name
>         ../../gcc/tree-ssa-sccvn.c:3825
> 
> This smells bad, since it is given a canonical type that is after the
> structural equivalency merging that ignores BINFOs, so it may be completely
> different class with completely different bases than the original.  Bases are
> structuraly merged, too and may be exchanged for normal fields because
> DECL_ARTIFICIAL (that separate bases and fields) does not seem to be part of
> the canonical type definition in LTO.

Can you elaborate on that DECL_ARTIFICIAL thing?  That is, what is broken
by considering all fields during that merging?

Note that the BINFO walk below only adds extra aliasing - it should
be harmless correctness-wise, no?

> I wonder if that code is needed after all:
>     case QUAL_UNION_TYPE:
>       /* Recursively record aliases for the base classes, if there are any.  */
>       if (TYPE_BINFO (type))
>         { 
>           int i;
>           tree binfo, base_binfo;
> 
>           for (binfo = TYPE_BINFO (type), i = 0;
>                BINFO_BASE_ITERATE (binfo, i, base_binfo); i++)
>             record_alias_subset (superset,
>                                  get_alias_set (BINFO_TYPE (base_binfo)));
>         }
>       for (field = TYPE_FIELDS (type); field != 0; field = DECL_CHAIN (field))
>         if (TREE_CODE (field) == FIELD_DECL && !DECL_NONADDRESSABLE_P (field))
>           record_alias_subset (superset, get_alias_set (TREE_TYPE (field)));
>       break;
> all bases are also fields of within the type, so the second loop should notice
> all the types seen by first loop if I am correct?
> So perhaps the loop can be dropped at first place. 

Yeah, I remember seeing that code and thinking the same multiple times.
Though I also vaguely remember that removing that loop regressed
something.  How is virtual inheritance represented in the fields?

But I'd be happy if this BINFO user would go away ;)

(similar in general for the get_alias_set langhook - with LTO we
only preserve extra alias-set zero answers from that)

Richard.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: TYPE_BINFO and canonical types at LTO
  2014-02-14  8:42 ` Richard Biener
@ 2014-02-14 16:52   ` Jan Hubicka
  2014-02-14 23:22     ` Jan Hubicka
  2014-02-15 10:03     ` Richard Biener
  0 siblings, 2 replies; 18+ messages in thread
From: Jan Hubicka @ 2014-02-14 16:52 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jan Hubicka, gcc, jason

> > This smells bad, since it is given a canonical type that is after the
> > structural equivalency merging that ignores BINFOs, so it may be completely
> > different class with completely different bases than the original.  Bases are
> > structuraly merged, too and may be exchanged for normal fields because
> > DECL_ARTIFICIAL (that separate bases and fields) does not seem to be part of
> > the canonical type definition in LTO.
> 
> Can you elaborate on that DECL_ARTIFICIAL thing?  That is, what is broken
> by considering all fields during that merging?

To make the code work with LTO, one can not merge 
struct B {struct A a}
struct B: A {}

these IMO differ only by DECL_ARTIFICIAL flag on the fields.
> 
> Note that the BINFO walk below only adds extra aliasing - it should
> be harmless correctness-wise, no?

If it is needed for the second case, then it we will produce wrong code with LTO
when we merge first and second case toghetr.  If it is not needed, then we are safe
but we don't really need the loop then.

Having the testcase where the BINFO walk is needed for corectness, I can turn
it into wrong code in LTO by interposing the class by structure.

I am rebuilding firefox with sanity checking that the second loop never adds anything
useful. Lets see.
> 
> > I wonder if that code is needed after all:
> >     case QUAL_UNION_TYPE:
> >       /* Recursively record aliases for the base classes, if there are any.  */
> >       if (TYPE_BINFO (type))
> >         { 
> >           int i;
> >           tree binfo, base_binfo;
> > 
> >           for (binfo = TYPE_BINFO (type), i = 0;
> >                BINFO_BASE_ITERATE (binfo, i, base_binfo); i++)
> >             record_alias_subset (superset,
> >                                  get_alias_set (BINFO_TYPE (base_binfo)));
> >         }
> >       for (field = TYPE_FIELDS (type); field != 0; field = DECL_CHAIN (field))
> >         if (TREE_CODE (field) == FIELD_DECL && !DECL_NONADDRESSABLE_P (field))
> >           record_alias_subset (superset, get_alias_set (TREE_TYPE (field)));
> >       break;
> > all bases are also fields of within the type, so the second loop should notice
> > all the types seen by first loop if I am correct?
> > So perhaps the loop can be dropped at first place. 
> 
> Yeah, I remember seeing that code and thinking the same multiple times.
> Though I also vaguely remember that removing that loop regressed
> something.  How is virtual inheritance represented in the fields?

struct A {int a;};
struct B: virtual A {};
struct C: virtual A {};
struct D: B, C {};
struct D *d;
struct B *b;
int t(void)
{
  d->a=1;
  return b->a;
}

I srepresented as:

 <record_type 0x7ffff6c58000 D addressable needs-constructing type_5 BLK
    size <integer_cst 0x7ffff6ae98c0 type <integer_type 0x7ffff6ae7150 bitsizetype> constant 192>
    unit size <integer_cst 0x7ffff6ae9880 type <integer_type 0x7ffff6ae70a8 sizetype> constant 24>
    align 64 symtab 0 alias set 6 canonical type 0x7ffff6c58000
    fields <field_decl 0x7ffff6c4ae40 D.2249
        type <record_type 0x7ffff6c452a0 B addressable needs-constructing type_5 BLK
            size <integer_cst 0x7ffff6ae9140 constant 128>
            unit size <integer_cst 0x7ffff6ae9160 constant 16>
            align 64 symtab 0 alias set 3 canonical type 0x7ffff6c452a0 fields <field_decl 0x7ffff6c4a2f8 _vptr.B> context <translation_unit_decl 0x7ffff6af1170 D.1>
            full-name "struct B"
            needs-constructor X() X(constX&) this=(X&) n_parents=1 use_template=0 interface-unknown
            pointer_to_this <pointer_type 0x7ffff6c58b28> chain <type_decl 0x7ffff6c3ea10 B>>
        ignored decl_6 BLK file t.C line 4 col 8
        size <integer_cst 0x7ffff6ae90c0 constant 64>
        unit size <integer_cst 0x7ffff6ae90e0 constant 8>
        align 64 offset_align 128
        offset <integer_cst 0x7ffff6ae9100 constant 0>
        bit offset <integer_cst 0x7ffff6ae9180 constant 0> context <record_type 0x7ffff6c58000 D>
        chain <field_decl 0x7ffff6c4aed8 D.2250 type <record_type 0x7ffff6c45b28 C>
            ignored decl_6 BLK file t.C line 4 col 8 size <integer_cst 0x7ffff6ae90c0 64> unit size <integer_cst 0x7ffff6ae90e0 8>
            align 64 offset_align 128 offset <integer_cst 0x7ffff6ae9100 0> bit offset <integer_cst 0x7ffff6ae90c0 64> context <record_type 0x7ffff6c58000 D> chain <type_decl 0x7ffff6c590b8 D>>> context <translation_unit_decl 0x7ffff6af1170 D.1>
    full-name "struct D"
    needs-constructor X() X(constX&) this=(X&) n_parents=2 use_template=0 interface-unknown
    pointer_to_this <pointer_type 0x7ffff6c58a80> chain <type_decl 0x7ffff6c59000 D>>


You my end up with A being a field, too, but that only bypas the recursion.

> 
> But I'd be happy if this BINFO user would go away ;)

Me too, indeed.
> 
> (similar in general for the get_alias_set langhook - with LTO we
> only preserve extra alias-set zero answers from that)

I see, I was surprised we construct alias sets at LTO time, I believed we
always stream them in&out.

Honza
> 
> Richard.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: TYPE_BINFO and canonical types at LTO
  2014-02-14 16:52   ` Jan Hubicka
@ 2014-02-14 23:22     ` Jan Hubicka
  2014-02-15 10:10       ` Richard Biener
  2014-02-15 10:03     ` Richard Biener
  1 sibling, 1 reply; 18+ messages in thread
From: Jan Hubicka @ 2014-02-14 23:22 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Richard Biener, gcc, jason

> > > This smells bad, since it is given a canonical type that is after the
> > > structural equivalency merging that ignores BINFOs, so it may be completely
> > > different class with completely different bases than the original.  Bases are
> > > structuraly merged, too and may be exchanged for normal fields because
> > > DECL_ARTIFICIAL (that separate bases and fields) does not seem to be part of
> > > the canonical type definition in LTO.
> > 
> > Can you elaborate on that DECL_ARTIFICIAL thing?  That is, what is broken
> > by considering all fields during that merging?
> 
> To make the code work with LTO, one can not merge 
> struct B {struct A a}
> struct B: A {}
> 
> these IMO differ only by DECL_ARTIFICIAL flag on the fields.
> > 
> > Note that the BINFO walk below only adds extra aliasing - it should
> > be harmless correctness-wise, no?
> 
> If it is needed for the second case, then it we will produce wrong code with LTO
> when we merge first and second case toghetr.  If it is not needed, then we are safe
> but we don't really need the loop then.
> 
> Having the testcase where the BINFO walk is needed for corectness, I can turn
> it into wrong code in LTO by interposing the class by structure.
> 
> I am rebuilding firefox with sanity checking that the second loop never adds anything
> useful. Lets see.

The code really seems to be adding only cases of zero sized classes.  I use the
following hack in my tree. I do not know how to discover zero sized class, so I
test of unit size 1, but I think if there was other cases, we would notice
anyway.

The reason why I am looking into is that because I am trying to evaulate how
expensive BINFOs are. I am trying to keep only those that matters for debug info and devirt,
and avoid streaming others.

Honza

Index: alias.c
===================================================================
--- alias.c	(revision 207777)
+++ alias.c	(working copy)
@@ -995,20 +996,40 @@ record_component_aliases (tree type)
     case RECORD_TYPE:
     case UNION_TYPE:
     case QUAL_UNION_TYPE:
-      /* Recursively record aliases for the base classes, if there are any.  */
-      if (TYPE_BINFO (type))
-	{
-	  int i;
-	  tree binfo, base_binfo;
-
-	  for (binfo = TYPE_BINFO (type), i = 0;
-	       BINFO_BASE_ITERATE (binfo, i, base_binfo); i++)
-	    record_alias_subset (superset,
-				 get_alias_set (BINFO_TYPE (base_binfo)));
-	}
-      for (field = TYPE_FIELDS (type); field != 0; field = DECL_CHAIN (field))
-	if (TREE_CODE (field) == FIELD_DECL && !DECL_NONADDRESSABLE_P (field))
-	  record_alias_subset (superset, get_alias_set (TREE_TYPE (field)));
+      {
+#ifdef ENABLE_CHECKING
+        bitmap_obstack my_obstack;
+        bitmap_obstack_initialize (&my_obstack); 
+	bitmap added = BITMAP_ALLOC (&my_obstack);
+#endif
+	alias_set_type t;
+	for (field = TYPE_FIELDS (type); field != 0; field = DECL_CHAIN (field))
+	  if (TREE_CODE (field) == FIELD_DECL && !DECL_NONADDRESSABLE_P (field))
+	    {
+	      t = get_alias_set (TREE_TYPE (field));
+#ifdef ENABLE_CHECKING
+	      bitmap_set_bit (added, t);
+#endif
+	      record_alias_subset (superset, t);
+	    }
+#ifdef ENABLE_CHECKING
+	/* Recursively record aliases for the base classes, if there are any.  */
+	if (!in_lto_p && TYPE_BINFO (type))
+	  {
+	    int i;
+	    tree binfo, base_binfo;
+
+	    for (binfo = TYPE_BINFO (type), i = 0;
+		 BINFO_BASE_ITERATE (binfo, i, base_binfo); i++)
+		if (!bitmap_bit_p (added, get_alias_set (BINFO_TYPE (base_binfo))))
+		{
+		  gcc_assert (integer_onep (TYPE_SIZE_UNIT (BINFO_TYPE (base_binfo))));
+		}
+	  }
+	BITMAP_FREE (added);
+  bitmap_obstack_release (&my_obstack);
+#endif
+      }
       break;
 
     case COMPLEX_TYPE:

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: TYPE_BINFO and canonical types at LTO
  2014-02-14 16:52   ` Jan Hubicka
  2014-02-14 23:22     ` Jan Hubicka
@ 2014-02-15 10:03     ` Richard Biener
  2014-02-16 23:55       ` Jan Hubicka
  1 sibling, 1 reply; 18+ messages in thread
From: Richard Biener @ 2014-02-15 10:03 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: gcc, jason

On Fri, 14 Feb 2014, Jan Hubicka wrote:

> > > This smells bad, since it is given a canonical type that is after the
> > > structural equivalency merging that ignores BINFOs, so it may be completely
> > > different class with completely different bases than the original.  Bases are
> > > structuraly merged, too and may be exchanged for normal fields because
> > > DECL_ARTIFICIAL (that separate bases and fields) does not seem to be part of
> > > the canonical type definition in LTO.
> > 
> > Can you elaborate on that DECL_ARTIFICIAL thing?  That is, what is broken
> > by considering all fields during that merging?
> 
> To make the code work with LTO, one can not merge 
> struct B {struct A a}
> struct B: A {}
> 
> these IMO differ only by DECL_ARTIFICIAL flag on the fields.

"The code" == that BINFO walk?  Is that because we walk a completely
unrelated BINFO chain?  I'd say we should have merged its types
so that difference shouldn't matter.

Hopefully ;)

> > 
> > Note that the BINFO walk below only adds extra aliasing - it should
> > be harmless correctness-wise, no?
> 
> If it is needed for the second case, then it we will produce wrong code 
> with LTO when we merge first and second case toghetr.  If it is not 
> needed, then we are safe but we don't really need the loop then.
> 
> Having the testcase where the BINFO walk is needed for corectness, I can turn
> it into wrong code in LTO by interposing the class by structure.
> 
> I am rebuilding firefox with sanity checking that the second loop never adds anything
> useful. Lets see.
> > 
> > > I wonder if that code is needed after all:
> > >     case QUAL_UNION_TYPE:
> > >       /* Recursively record aliases for the base classes, if there are any.  */
> > >       if (TYPE_BINFO (type))
> > >         { 
> > >           int i;
> > >           tree binfo, base_binfo;
> > > 
> > >           for (binfo = TYPE_BINFO (type), i = 0;
> > >                BINFO_BASE_ITERATE (binfo, i, base_binfo); i++)
> > >             record_alias_subset (superset,
> > >                                  get_alias_set (BINFO_TYPE (base_binfo)));
> > >         }
> > >       for (field = TYPE_FIELDS (type); field != 0; field = DECL_CHAIN (field))
> > >         if (TREE_CODE (field) == FIELD_DECL && !DECL_NONADDRESSABLE_P (field))
> > >           record_alias_subset (superset, get_alias_set (TREE_TYPE (field)));
> > >       break;
> > > all bases are also fields of within the type, so the second loop should notice
> > > all the types seen by first loop if I am correct?
> > > So perhaps the loop can be dropped at first place. 
> > 
> > Yeah, I remember seeing that code and thinking the same multiple times.
> > Though I also vaguely remember that removing that loop regressed
> > something.  How is virtual inheritance represented in the fields?
> 
> struct A {int a;};
> struct B: virtual A {};
> struct C: virtual A {};
> struct D: B, C {};
> struct D *d;
> struct B *b;
> int t(void)
> {
>   d->a=1;
>   return b->a;
> }
> 
> I srepresented as:
> 
>  <record_type 0x7ffff6c58000 D addressable needs-constructing type_5 BLK
>     size <integer_cst 0x7ffff6ae98c0 type <integer_type 0x7ffff6ae7150 bitsizetype> constant 192>
>     unit size <integer_cst 0x7ffff6ae9880 type <integer_type 0x7ffff6ae70a8 sizetype> constant 24>
>     align 64 symtab 0 alias set 6 canonical type 0x7ffff6c58000
>     fields <field_decl 0x7ffff6c4ae40 D.2249
>         type <record_type 0x7ffff6c452a0 B addressable needs-constructing type_5 BLK
>             size <integer_cst 0x7ffff6ae9140 constant 128>
>             unit size <integer_cst 0x7ffff6ae9160 constant 16>
>             align 64 symtab 0 alias set 3 canonical type 0x7ffff6c452a0 fields <field_decl 0x7ffff6c4a2f8 _vptr.B> context <translation_unit_decl 0x7ffff6af1170 D.1>
>             full-name "struct B"
>             needs-constructor X() X(constX&) this=(X&) n_parents=1 use_template=0 interface-unknown
>             pointer_to_this <pointer_type 0x7ffff6c58b28> chain <type_decl 0x7ffff6c3ea10 B>>
>         ignored decl_6 BLK file t.C line 4 col 8
>         size <integer_cst 0x7ffff6ae90c0 constant 64>
>         unit size <integer_cst 0x7ffff6ae90e0 constant 8>
>         align 64 offset_align 128
>         offset <integer_cst 0x7ffff6ae9100 constant 0>
>         bit offset <integer_cst 0x7ffff6ae9180 constant 0> context <record_type 0x7ffff6c58000 D>
>         chain <field_decl 0x7ffff6c4aed8 D.2250 type <record_type 0x7ffff6c45b28 C>
>             ignored decl_6 BLK file t.C line 4 col 8 size <integer_cst 0x7ffff6ae90c0 64> unit size <integer_cst 0x7ffff6ae90e0 8>
>             align 64 offset_align 128 offset <integer_cst 0x7ffff6ae9100 0> bit offset <integer_cst 0x7ffff6ae90c0 64> context <record_type 0x7ffff6c58000 D> chain <type_decl 0x7ffff6c590b8 D>>> context <translation_unit_decl 0x7ffff6af1170 D.1>
>     full-name "struct D"
>     needs-constructor X() X(constX&) this=(X&) n_parents=2 use_template=0 interface-unknown
>     pointer_to_this <pointer_type 0x7ffff6c58a80> chain <type_decl 0x7ffff6c59000 D>>
> 
> 
> You my end up with A being a field, too, but that only bypas the recursion.
> 
> > 
> > But I'd be happy if this BINFO user would go away ;)
> 
> Me too, indeed.
> > 
> > (similar in general for the get_alias_set langhook - with LTO we
> > only preserve extra alias-set zero answers from that)
> 
> I see, I was surprised we construct alias sets at LTO time, I believed we
> always stream them in&out.

No, we stream out a flag basically, get_alias_set () == 0.  We don't have
a way to merge the alias set forests.

Richard.

> Honza

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: TYPE_BINFO and canonical types at LTO
  2014-02-14 23:22     ` Jan Hubicka
@ 2014-02-15 10:10       ` Richard Biener
  2014-02-20 20:49         ` Jason Merrill
  0 siblings, 1 reply; 18+ messages in thread
From: Richard Biener @ 2014-02-15 10:10 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: gcc, jason

On Sat, 15 Feb 2014, Jan Hubicka wrote:

> > > > This smells bad, since it is given a canonical type that is after the
> > > > structural equivalency merging that ignores BINFOs, so it may be completely
> > > > different class with completely different bases than the original.  Bases are
> > > > structuraly merged, too and may be exchanged for normal fields because
> > > > DECL_ARTIFICIAL (that separate bases and fields) does not seem to be part of
> > > > the canonical type definition in LTO.
> > > 
> > > Can you elaborate on that DECL_ARTIFICIAL thing?  That is, what is broken
> > > by considering all fields during that merging?
> > 
> > To make the code work with LTO, one can not merge 
> > struct B {struct A a}
> > struct B: A {}
> > 
> > these IMO differ only by DECL_ARTIFICIAL flag on the fields.
> > > 
> > > Note that the BINFO walk below only adds extra aliasing - it should
> > > be harmless correctness-wise, no?
> > 
> > If it is needed for the second case, then it we will produce wrong code with LTO
> > when we merge first and second case toghetr.  If it is not needed, then we are safe
> > but we don't really need the loop then.
> > 
> > Having the testcase where the BINFO walk is needed for corectness, I can turn
> > it into wrong code in LTO by interposing the class by structure.
> > 
> > I am rebuilding firefox with sanity checking that the second loop never adds anything
> > useful. Lets see.
> 
> The code really seems to be adding only cases of zero sized classes.

zero-size as in with no fields?  Then we can't do anything with such
classes and thus we don't need to record component aliases?

I'd say we try dropping this loop when stage1 opens again.

Richard.

>  I use the following hack in my tree. I do not know how to discover zero 
> sized class, so I test of unit size 1, but I think if there was other 
> cases, we would notice anyway.
> 
> The reason why I am looking into is that because I am trying to evaulate 
> how expensive BINFOs are. I am trying to keep only those that matters 
> for debug info and devirt, and avoid streaming others.
> 
> Honza
> 
> Index: alias.c
> ===================================================================
> --- alias.c	(revision 207777)
> +++ alias.c	(working copy)
> @@ -995,20 +996,40 @@ record_component_aliases (tree type)
>      case RECORD_TYPE:
>      case UNION_TYPE:
>      case QUAL_UNION_TYPE:
> -      /* Recursively record aliases for the base classes, if there are any.  */
> -      if (TYPE_BINFO (type))
> -	{
> -	  int i;
> -	  tree binfo, base_binfo;
> -
> -	  for (binfo = TYPE_BINFO (type), i = 0;
> -	       BINFO_BASE_ITERATE (binfo, i, base_binfo); i++)
> -	    record_alias_subset (superset,
> -				 get_alias_set (BINFO_TYPE (base_binfo)));
> -	}
> -      for (field = TYPE_FIELDS (type); field != 0; field = DECL_CHAIN (field))
> -	if (TREE_CODE (field) == FIELD_DECL && !DECL_NONADDRESSABLE_P (field))
> -	  record_alias_subset (superset, get_alias_set (TREE_TYPE (field)));
> +      {
> +#ifdef ENABLE_CHECKING
> +        bitmap_obstack my_obstack;
> +        bitmap_obstack_initialize (&my_obstack); 
> +	bitmap added = BITMAP_ALLOC (&my_obstack);
> +#endif
> +	alias_set_type t;
> +	for (field = TYPE_FIELDS (type); field != 0; field = DECL_CHAIN (field))
> +	  if (TREE_CODE (field) == FIELD_DECL && !DECL_NONADDRESSABLE_P (field))
> +	    {
> +	      t = get_alias_set (TREE_TYPE (field));
> +#ifdef ENABLE_CHECKING
> +	      bitmap_set_bit (added, t);
> +#endif
> +	      record_alias_subset (superset, t);
> +	    }
> +#ifdef ENABLE_CHECKING
> +	/* Recursively record aliases for the base classes, if there are any.  */
> +	if (!in_lto_p && TYPE_BINFO (type))
> +	  {
> +	    int i;
> +	    tree binfo, base_binfo;
> +
> +	    for (binfo = TYPE_BINFO (type), i = 0;
> +		 BINFO_BASE_ITERATE (binfo, i, base_binfo); i++)
> +		if (!bitmap_bit_p (added, get_alias_set (BINFO_TYPE (base_binfo))))
> +		{
> +		  gcc_assert (integer_onep (TYPE_SIZE_UNIT (BINFO_TYPE (base_binfo))));
> +		}
> +	  }
> +	BITMAP_FREE (added);
> +  bitmap_obstack_release (&my_obstack);
> +#endif
> +      }
>        break;
>  
>      case COMPLEX_TYPE:

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: TYPE_BINFO and canonical types at LTO
  2014-02-15 10:03     ` Richard Biener
@ 2014-02-16 23:55       ` Jan Hubicka
  2014-02-17 11:36         ` Richard Biener
  0 siblings, 1 reply; 18+ messages in thread
From: Jan Hubicka @ 2014-02-16 23:55 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jan Hubicka, gcc, jason

> On Fri, 14 Feb 2014, Jan Hubicka wrote:
> 
> > > > This smells bad, since it is given a canonical type that is after the
> > > > structural equivalency merging that ignores BINFOs, so it may be completely
> > > > different class with completely different bases than the original.  Bases are
> > > > structuraly merged, too and may be exchanged for normal fields because
> > > > DECL_ARTIFICIAL (that separate bases and fields) does not seem to be part of
> > > > the canonical type definition in LTO.
> > > 
> > > Can you elaborate on that DECL_ARTIFICIAL thing?  That is, what is broken
> > > by considering all fields during that merging?
> > 
> > To make the code work with LTO, one can not merge 
> > struct B {struct A a}
> > struct B: A {}
> > 
> > these IMO differ only by DECL_ARTIFICIAL flag on the fields.
> 
> "The code" == that BINFO walk?  Is that because we walk a completely

Yes.

> unrelated BINFO chain?  I'd say we should have merged its types
> so that difference shouldn't matter.
> 
> Hopefully ;)

I am trying to make point that will matter.  Here is completed testcase above:

struct A {int a;};
struct C:A {};
struct B {struct A a;};
struct C *p2;
struct B *p1;
int
t()
{
  p1->a.a = 2;
  return p2->a;
}

With patch I get:

Index: lto/lto.c
===================================================================
--- lto/lto.c   (revision 207777)
+++ lto/lto.c   (working copy)
@@ -49,6 +49,8 @@ along with GCC; see the file COPYING3.
 #include "data-streamer.h"
 #include "context.h"
 #include "pass_manager.h"
+#include "print-tree.h"
 
 
 /* Number of parallel tasks to run, -1 if we want to use GNU Make jobserver.  */
@@ -619,6 +621,15 @@ gimple_canonical_type_eq (const void *p1
 {
   const_tree t1 = (const_tree) p1;
   const_tree t2 = (const_tree) p2;
+  if (gimple_canonical_types_compatible_p (CONST_CAST_TREE (t1),
+                                             CONST_CAST_TREE (t2))
+      && TREE_CODE (CONST_CAST_TREE (t1)) == RECORD_TYPE)
+     {
+       debug_tree (CONST_CAST_TREE (t1));
+       fprintf (stderr, "bases:%i\n", BINFO_BASE_BINFOS (TYPE_BINFO (t1))->length());
+       debug_tree (CONST_CAST_TREE (t2));
+       fprintf (stderr, "bases:%i\n", BINFO_BASE_BINFOS (TYPE_BINFO (t2))->length());
+     }
   return gimple_canonical_types_compatible_p (CONST_CAST_TREE (t1),
                                              CONST_CAST_TREE (t2));
 }

 <record_type 0x7ffff6c52888 B SI
    size <integer_cst 0x7ffff6ae83a0 type <integer_type 0x7ffff6ae5150 bitsizetype> constant 32>
    unit size <integer_cst 0x7ffff6ae83c0 type <integer_type 0x7ffff6ae50a8 sizetype> constant 4>
    align 32 symtab 0 alias set -1 canonical type 0x7ffff6c52888
    fields <field_decl 0x7ffff6adec78 a
        type <record_type 0x7ffff6c52738 A SI size <integer_cst 0x7ffff6ae83a0 32> unit size <integer_cst 0x7ffff6ae83c0 4>
            align 32 symtab 0 alias set -1 canonical type 0x7ffff6c52738 fields <field_decl 0x7ffff6adebe0 a> context <translation_unit_decl 0x7ffff6af2e60 D.2821>
            chain <type_decl 0x7ffff6af2f18 A>>
        nonlocal SI file t.C line 3 col 20 size <integer_cst 0x7ffff6ae83a0 32> unit size <integer_cst 0x7ffff6ae83c0 4>
        align 32 offset_align 128
        offset <integer_cst 0x7ffff6ae8060 constant 0>
        bit offset <integer_cst 0x7ffff6ae80e0 constant 0> context <record_type 0x7ffff6c52888 B>
        chain <type_decl 0x7ffff6c55170 B type <record_type 0x7ffff6c52930 B>
            nonlocal VOID file t.C line 3 col 10
            align 1 context <record_type 0x7ffff6c52888 B> result <record_type 0x7ffff6c52888 B>>> context <translation_unit_decl 0x7ffff6af2e60 D.2821>
    pointer_to_this <pointer_type 0x7ffff6c529d8> chain <type_decl 0x7ffff6c550b8 B>>
bases:0
 <record_type 0x7ffff6c52b28 C SI
    size <integer_cst 0x7ffff6ae83a0 type <integer_type 0x7ffff6ae5150 bitsizetype> constant 32>
    unit size <integer_cst 0x7ffff6ae83c0 type <integer_type 0x7ffff6ae50a8 sizetype> constant 4>
    align 32 symtab 0 alias set -1 structural equality
    fields <field_decl 0x7ffff6adeda8 D.2831
        type <record_type 0x7ffff6c52738 A SI size <integer_cst 0x7ffff6ae83a0 32> unit size <integer_cst 0x7ffff6ae83c0 4>
            align 32 symtab 0 alias set -1 canonical type 0x7ffff6c52738 fields <field_decl 0x7ffff6adebe0 a> context <translation_unit_decl 0x7ffff6af2e60 D.2821>
            chain <type_decl 0x7ffff6af2f18 A>>
        ignored SI file t.C line 2 col 8 size <integer_cst 0x7ffff6ae83a0 32> unit size <integer_cst 0x7ffff6ae83c0 4>
        align 32 offset_align 128
        offset <integer_cst 0x7ffff6ae8060 constant 0>
        bit offset <integer_cst 0x7ffff6ae80e0 constant 0> context <record_type 0x7ffff6c52a80 C>
        chain <type_decl 0x7ffff6c552e0 C type <record_type 0x7ffff6c52b28 C>
            nonlocal VOID file t.C line 2 col 12
            align 1 context <record_type 0x7ffff6c52a80 C> result <record_type 0x7ffff6c52a80 C>>> context <translation_unit_decl 0x7ffff6af2e60 D.2821>
    chain <type_decl 0x7ffff6c55228 C>>
bases:1

So we prevail structure B with structure C.  One has bases to walk other doesn't.
If that BINFO walk in alias.c (on canonical types) did something useful, we have a wrong code bug.

Yes, zero sized classes are those having no fields (but other stuff, type decls, bases etc.)

Honza

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: TYPE_BINFO and canonical types at LTO
  2014-02-16 23:55       ` Jan Hubicka
@ 2014-02-17 11:36         ` Richard Biener
  2014-02-17 20:55           ` Jan Hubicka
  0 siblings, 1 reply; 18+ messages in thread
From: Richard Biener @ 2014-02-17 11:36 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: gcc, jason

On Mon, 17 Feb 2014, Jan Hubicka wrote:

> > On Fri, 14 Feb 2014, Jan Hubicka wrote:
> > 
> > > > > This smells bad, since it is given a canonical type that is after the
> > > > > structural equivalency merging that ignores BINFOs, so it may be completely
> > > > > different class with completely different bases than the original.  Bases are
> > > > > structuraly merged, too and may be exchanged for normal fields because
> > > > > DECL_ARTIFICIAL (that separate bases and fields) does not seem to be part of
> > > > > the canonical type definition in LTO.
> > > > 
> > > > Can you elaborate on that DECL_ARTIFICIAL thing?  That is, what is broken
> > > > by considering all fields during that merging?
> > > 
> > > To make the code work with LTO, one can not merge 
> > > struct B {struct A a}
> > > struct B: A {}
> > > 
> > > these IMO differ only by DECL_ARTIFICIAL flag on the fields.
> > 
> > "The code" == that BINFO walk?  Is that because we walk a completely
> 
> Yes.
> 
> > unrelated BINFO chain?  I'd say we should have merged its types
> > so that difference shouldn't matter.
> > 
> > Hopefully ;)
> 
> I am trying to make point that will matter.  Here is completed testcase above:
> 
> struct A {int a;};
> struct C:A {};
> struct B {struct A a;};
> struct C *p2;
> struct B *p1;
> int
> t()
> {
>   p1->a.a = 2;
>   return p2->a;
> }
> 
> With patch I get:
> 
> Index: lto/lto.c
> ===================================================================
> --- lto/lto.c   (revision 207777)
> +++ lto/lto.c   (working copy)
> @@ -49,6 +49,8 @@ along with GCC; see the file COPYING3.
>  #include "data-streamer.h"
>  #include "context.h"
>  #include "pass_manager.h"
> +#include "print-tree.h"
>  
>  
>  /* Number of parallel tasks to run, -1 if we want to use GNU Make jobserver.  */
> @@ -619,6 +621,15 @@ gimple_canonical_type_eq (const void *p1
>  {
>    const_tree t1 = (const_tree) p1;
>    const_tree t2 = (const_tree) p2;
> +  if (gimple_canonical_types_compatible_p (CONST_CAST_TREE (t1),
> +                                             CONST_CAST_TREE (t2))
> +      && TREE_CODE (CONST_CAST_TREE (t1)) == RECORD_TYPE)
> +     {
> +       debug_tree (CONST_CAST_TREE (t1));
> +       fprintf (stderr, "bases:%i\n", BINFO_BASE_BINFOS (TYPE_BINFO (t1))->length());
> +       debug_tree (CONST_CAST_TREE (t2));
> +       fprintf (stderr, "bases:%i\n", BINFO_BASE_BINFOS (TYPE_BINFO (t2))->length());
> +     }
>    return gimple_canonical_types_compatible_p (CONST_CAST_TREE (t1),
>                                               CONST_CAST_TREE (t2));
>  }
> 
>  <record_type 0x7ffff6c52888 B SI
>     size <integer_cst 0x7ffff6ae83a0 type <integer_type 0x7ffff6ae5150 bitsizetype> constant 32>
>     unit size <integer_cst 0x7ffff6ae83c0 type <integer_type 0x7ffff6ae50a8 sizetype> constant 4>
>     align 32 symtab 0 alias set -1 canonical type 0x7ffff6c52888
>     fields <field_decl 0x7ffff6adec78 a
>         type <record_type 0x7ffff6c52738 A SI size <integer_cst 0x7ffff6ae83a0 32> unit size <integer_cst 0x7ffff6ae83c0 4>
>             align 32 symtab 0 alias set -1 canonical type 0x7ffff6c52738 fields <field_decl 0x7ffff6adebe0 a> context <translation_unit_decl 0x7ffff6af2e60 D.2821>
>             chain <type_decl 0x7ffff6af2f18 A>>
>         nonlocal SI file t.C line 3 col 20 size <integer_cst 0x7ffff6ae83a0 32> unit size <integer_cst 0x7ffff6ae83c0 4>
>         align 32 offset_align 128
>         offset <integer_cst 0x7ffff6ae8060 constant 0>
>         bit offset <integer_cst 0x7ffff6ae80e0 constant 0> context <record_type 0x7ffff6c52888 B>
>         chain <type_decl 0x7ffff6c55170 B type <record_type 0x7ffff6c52930 B>
>             nonlocal VOID file t.C line 3 col 10
>             align 1 context <record_type 0x7ffff6c52888 B> result <record_type 0x7ffff6c52888 B>>> context <translation_unit_decl 0x7ffff6af2e60 D.2821>
>     pointer_to_this <pointer_type 0x7ffff6c529d8> chain <type_decl 0x7ffff6c550b8 B>>
> bases:0
>  <record_type 0x7ffff6c52b28 C SI
>     size <integer_cst 0x7ffff6ae83a0 type <integer_type 0x7ffff6ae5150 bitsizetype> constant 32>
>     unit size <integer_cst 0x7ffff6ae83c0 type <integer_type 0x7ffff6ae50a8 sizetype> constant 4>
>     align 32 symtab 0 alias set -1 structural equality
>     fields <field_decl 0x7ffff6adeda8 D.2831
>         type <record_type 0x7ffff6c52738 A SI size <integer_cst 0x7ffff6ae83a0 32> unit size <integer_cst 0x7ffff6ae83c0 4>
>             align 32 symtab 0 alias set -1 canonical type 0x7ffff6c52738 fields <field_decl 0x7ffff6adebe0 a> context <translation_unit_decl 0x7ffff6af2e60 D.2821>
>             chain <type_decl 0x7ffff6af2f18 A>>
>         ignored SI file t.C line 2 col 8 size <integer_cst 0x7ffff6ae83a0 32> unit size <integer_cst 0x7ffff6ae83c0 4>
>         align 32 offset_align 128
>         offset <integer_cst 0x7ffff6ae8060 constant 0>
>         bit offset <integer_cst 0x7ffff6ae80e0 constant 0> context <record_type 0x7ffff6c52a80 C>
>         chain <type_decl 0x7ffff6c552e0 C type <record_type 0x7ffff6c52b28 C>
>             nonlocal VOID file t.C line 2 col 12
>             align 1 context <record_type 0x7ffff6c52a80 C> result <record_type 0x7ffff6c52a80 C>>> context <translation_unit_decl 0x7ffff6af2e60 D.2821>
>     chain <type_decl 0x7ffff6c55228 C>>
> bases:1
> 
> So we prevail structure B with structure C.  One has bases to walk other 
> doesn't. If that BINFO walk in alias.c (on canonical types) did 
> something useful, we have a wrong code bug.

Yeah, ok.  But we treat those types (B and C) TBAA equivalent because
structurally they are the same ;)  Luckily C has a "proper" field
for its base (proper means that offset and size are correct as well
as the type).  It indeed has DECL_ARTIFICIAL set and yes, we treat
those as "real" fields when doing the structural comparison.

More interesting is of course when we can re-use tail-padding in
one but not the other (works as expected - not merged).

struct A { A (); short x; bool a;};
struct C:A { bool b; };
struct B {struct A a; bool b;};
struct C *p2;
struct B *p1;
int
t()
{
  p1->a.a = 2;
  return p2->a;
}

> Yes, zero sized classes are those having no fields (but other stuff, 
> type decls, bases etc.)

Yeah, but TBAA obviously doesn't care about type decls and bases.

Richard.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: TYPE_BINFO and canonical types at LTO
  2014-02-17 11:36         ` Richard Biener
@ 2014-02-17 20:55           ` Jan Hubicka
  2014-02-18  8:51             ` Richard Biener
  2014-02-20 20:57             ` Jason Merrill
  0 siblings, 2 replies; 18+ messages in thread
From: Jan Hubicka @ 2014-02-17 20:55 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jan Hubicka, gcc, jason

> 
> Yeah, ok.  But we treat those types (B and C) TBAA equivalent because
> structurally they are the same ;)  Luckily C has a "proper" field
> for its base (proper means that offset and size are correct as well
> as the type).  It indeed has DECL_ARTIFICIAL set and yes, we treat
> those as "real" fields when doing the structural comparison.

Yep, the difference is that depending if C or D win, we will end up walking the
BINFO or not.  So we should not depend on the BINFo walk for correctness.
> 
> More interesting is of course when we can re-use tail-padding in
> one but not the other (works as expected - not merged).

Yep.
> 
> struct A { A (); short x; bool a;};
> struct C:A { bool b; };
> struct B {struct A a; bool b;};
> struct C *p2;
> struct B *p1;
> int
> t()
> {
>   p1->a.a = 2;
>   return p2->a;
> }
> 
> > Yes, zero sized classes are those having no fields (but other stuff, 
> > type decls, bases etc.)
> 
> Yeah, but TBAA obviously doesn't care about type decls and bases.

So I guess the conclussion is that the BINFO walk in alias.c is pointless?

Concerning the merging details and LTO aliasing, I think for 4.10 we should
make C++ to compute mangled names of types (i.e. call DECL_ASSEMBLER_NAME on
the associated type_decl + explicitly mark that type is driven by ODR) and then
we can do merging driven by ODR rule.

Non-ODR types born from other frontends will then need to be made to alias all
the ODR variants that can be done by storing them into the current canonical type hash.
(I wonder if we want to support cross language aliasing for non-POD?)

I also think we want explicit representation of types known to be local to compilation
unit - anonymous namespaces in C/C++, types defined within function bodies in C and
god knows what in Ada/Fortran/Java.

Honza
> 
> Richard.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: TYPE_BINFO and canonical types at LTO
  2014-02-17 20:55           ` Jan Hubicka
@ 2014-02-18  8:51             ` Richard Biener
  2014-02-18 10:42               ` Richard Biener
  2014-02-18 19:02               ` Jan Hubicka
  2014-02-20 20:57             ` Jason Merrill
  1 sibling, 2 replies; 18+ messages in thread
From: Richard Biener @ 2014-02-18  8:51 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: gcc, jason

On Mon, 17 Feb 2014, Jan Hubicka wrote:

> > 
> > Yeah, ok.  But we treat those types (B and C) TBAA equivalent because
> > structurally they are the same ;)  Luckily C has a "proper" field
> > for its base (proper means that offset and size are correct as well
> > as the type).  It indeed has DECL_ARTIFICIAL set and yes, we treat
> > those as "real" fields when doing the structural comparison.
> 
> Yep, the difference is that depending if C or D win, we will end up walking the
> BINFO or not.  So we should not depend on the BINFo walk for correctness.
> > 
> > More interesting is of course when we can re-use tail-padding in
> > one but not the other (works as expected - not merged).
> 
> Yep.
> > 
> > struct A { A (); short x; bool a;};
> > struct C:A { bool b; };
> > struct B {struct A a; bool b;};
> > struct C *p2;
> > struct B *p1;
> > int
> > t()
> > {
> >   p1->a.a = 2;
> >   return p2->a;
> > }
> > 
> > > Yes, zero sized classes are those having no fields (but other stuff, 
> > > type decls, bases etc.)
> > 
> > Yeah, but TBAA obviously doesn't care about type decls and bases.
> 
> So I guess the conclussion is that the BINFO walk in alias.c is pointless?

Yes.  But as I said - I remember being there and proposing to remove
it.  Some N > 5 years ago or so and it was either rejected or it didn't
work out ;)

> Concerning the merging details and LTO aliasing, I think for 4.10 we 
> should make C++ to compute mangled names of types (i.e. call 
> DECL_ASSEMBLER_NAME on the associated type_decl + explicitly mark that 
> type is driven by ODR) and then we can do merging driven by ODR rule.
> 
> Non-ODR types born from other frontends will then need to be made to 
> alias all the ODR variants that can be done by storing them into the 
> current canonical type hash.
> (I wonder if we want to support cross language aliasing for non-POD?)

Surely for accessing components of non-POD types, no?  Like

class Foo {
Foo();
int *get_data ();
int *data;
} glob_foo;

extern "C" int *get_foo_data() { return glob_foo.get_data(); }

?  But you are talking about the "tree" merging part using ODR info
to also merge types which differ in completeness of contained
pointer types, right?  (exactly equal cases should be already merged)

The canonical type computation happens separately (only for prevailing
types, of course), and there we already "merge" types which differ
in completeness.  Canonical type merging is conservative the other
way aroud - if we merge _all_ types to a single canonical type then
TBAA is still correct (we get a single alias set).

> I also think we want explicit representation of types known to be local 
> to compilation unit - anonymous namespaces in C/C++, types defined 
> within function bodies in C and god knows what in Ada/Fortran/Java.

But here you get into the idea of improving TBAA, thus having
_more_ distinct canonical types?

Just to make sure to not mix those two ;)

And whatever "frontend knowledge" we want to excercise - please
make sure we get a reliable way for the middle-end to see
that "frontend knowledge" (no langhooks!).  Thus, make it
"middle-end knowledge".

Oh - and the easiest way to improve things is to get less types into
the merging process in the first place!

Richard.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: TYPE_BINFO and canonical types at LTO
  2014-02-18  8:51             ` Richard Biener
@ 2014-02-18 10:42               ` Richard Biener
  2014-02-18 19:02               ` Jan Hubicka
  1 sibling, 0 replies; 18+ messages in thread
From: Richard Biener @ 2014-02-18 10:42 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: gcc, jason

On Tue, 18 Feb 2014, Richard Biener wrote:

> On Mon, 17 Feb 2014, Jan Hubicka wrote:
> 
> > > 
> > > Yeah, ok.  But we treat those types (B and C) TBAA equivalent because
> > > structurally they are the same ;)  Luckily C has a "proper" field
> > > for its base (proper means that offset and size are correct as well
> > > as the type).  It indeed has DECL_ARTIFICIAL set and yes, we treat
> > > those as "real" fields when doing the structural comparison.
> > 
> > Yep, the difference is that depending if C or D win, we will end up walking the
> > BINFO or not.  So we should not depend on the BINFo walk for correctness.
> > > 
> > > More interesting is of course when we can re-use tail-padding in
> > > one but not the other (works as expected - not merged).
> > 
> > Yep.
> > > 
> > > struct A { A (); short x; bool a;};
> > > struct C:A { bool b; };
> > > struct B {struct A a; bool b;};
> > > struct C *p2;
> > > struct B *p1;
> > > int
> > > t()
> > > {
> > >   p1->a.a = 2;
> > >   return p2->a;
> > > }
> > > 
> > > > Yes, zero sized classes are those having no fields (but other stuff, 
> > > > type decls, bases etc.)
> > > 
> > > Yeah, but TBAA obviously doesn't care about type decls and bases.
> > 
> > So I guess the conclussion is that the BINFO walk in alias.c is pointless?
> 
> Yes.  But as I said - I remember being there and proposing to remove
> it.  Some N > 5 years ago or so and it was either rejected or it didn't
> work out ;)

Btw, a bootstrap & regtest worked fine with removing that loop
(not that this proves anything).

Richard.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: TYPE_BINFO and canonical types at LTO
  2014-02-18  8:51             ` Richard Biener
  2014-02-18 10:42               ` Richard Biener
@ 2014-02-18 19:02               ` Jan Hubicka
  2014-02-19 12:16                 ` Richard Biener
  1 sibling, 1 reply; 18+ messages in thread
From: Jan Hubicka @ 2014-02-18 19:02 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jan Hubicka, gcc, jason

> > Non-ODR types born from other frontends will then need to be made to 
> > alias all the ODR variants that can be done by storing them into the 
> > current canonical type hash.
> > (I wonder if we want to support cross language aliasing for non-POD?)
> 
> Surely for accessing components of non-POD types, no?  Like
> 
> class Foo {
> Foo();
> int *get_data ();
> int *data;
> } glob_foo;
> 
> extern "C" int *get_foo_data() { return glob_foo.get_data(); }

OK, if we want to support this, then we want to merge.
What about types with vtbl pointer? :)
> 
> ?  But you are talking about the "tree" merging part using ODR info
> to also merge types which differ in completeness of contained
> pointer types, right?  (exactly equal cases should be already merged)

Actually I was speaking of canonical types here. I want to preserve more of TBAA
via honoring ODR and local types.
I want to change lto to not merge canonical types for pairs of types of same layout
(i.e. equivalent in the current canonical type definition) but with different
mangled names. I also want it to never merge when types are local.
For inter-language TBAA we will need to ensure aliasing in between non-ODR type
of same layout and all unmerged variants of ODR type.  Can it be done by attaching
chains of ODR types into the canonical type hash and when non-ODR type appears, just
make it alias with all of them?

It would make sense to ODR merge in tree merging, too, but I am not sure if
this fits the current design, since you would need to merge SCC components of
different shape then that seems hard, right?

It may be easier to ODR merge after streaming (during DECL fixup) just to make
WPA streaming cheaper and to reduce debug info size.  If you use
-fdump-ipa-devirt, it will dump you ODR types that did not get merged (only
ones with vtable pointers in them ATM) and there are quite long chains for
firefox. Surely then hundreds of duplicated ODR types will end up in the ltrans
partition streams and they eventually hit debug output machinery.
Eric sent me presentation about doing this in LLVM.
http://llvm.org/devmtg/2013-11/slides/Christopher-DebugInfo.pdf
> 
> The canonical type computation happens separately (only for prevailing
> types, of course), and there we already "merge" types which differ
> in completeness.  Canonical type merging is conservative the other
> way aroud - if we merge _all_ types to a single canonical type then
> TBAA is still correct (we get a single alias set).

Yes, I think I understand that. One equivalence is kind of minimal so we merge
only if we are sure there is no informationloss, other is maximal so we are
sure that types that needs to be equivalent by whatever underlying langauge
TBAA rules are actually equivalent.
> 
> > I also think we want explicit representation of types known to be local 
> > to compilation unit - anonymous namespaces in C/C++, types defined 
> > within function bodies in C and god knows what in Ada/Fortran/Java.
> 
> But here you get into the idea of improving TBAA, thus having
> _more_ distinct canonical types?

Yes.
> 
> Just to make sure to not mix those two ;)
> 
> And whatever "frontend knowledge" we want to excercise - please
> make sure we get a reliable way for the middle-end to see
> that "frontend knowledge" (no langhooks!).  Thus, make it
> "middle-end knowledge".

Sure that is what I am proposing - just have DECL_ASSEMBLER_NAME on TYPE_DECL
and ODR flag. Middle-end when comparing types will test ODR flag and if flag
is set, then it will compare via DECL_ASEBMLER_NAME (TYPE_DECL (type)).
No langhooks needed here + if other language has similar inter-unit equivalency
it can use the same mechanizm. Just turn the equivalency description into
string identifiers.
> 
> Oh - and the easiest way to improve things is to get less types into
> the merging process in the first place!

Yep, my experiments with not streaming BINFO are directed in it.  I will collect
some numbers and send.

Honza
> 
> Richard.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: TYPE_BINFO and canonical types at LTO
  2014-02-18 19:02               ` Jan Hubicka
@ 2014-02-19 12:16                 ` Richard Biener
  2014-02-19 19:19                   ` Jan Hubicka
  0 siblings, 1 reply; 18+ messages in thread
From: Richard Biener @ 2014-02-19 12:16 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: gcc, jason

On Tue, 18 Feb 2014, Jan Hubicka wrote:

> > > Non-ODR types born from other frontends will then need to be made to 
> > > alias all the ODR variants that can be done by storing them into the 
> > > current canonical type hash.
> > > (I wonder if we want to support cross language aliasing for non-POD?)
> > 
> > Surely for accessing components of non-POD types, no?  Like
> > 
> > class Foo {
> > Foo();
> > int *get_data ();
> > int *data;
> > } glob_foo;
> > 
> > extern "C" int *get_foo_data() { return glob_foo.get_data(); }
> 
> OK, if we want to support this, then we want to merge.
> What about types with vtbl pointer? :)

I can easily create a C struct variant covering that.  Basically
in _practice_ I can inter-operate with any language from C if I
know its ABI.  Do we really want to make this undefined?  See
the (even standard) Fortran - C interoperability spec.  I'm sure
something exists for Ada interoperating with C (or even C++).

> > ?  But you are talking about the "tree" merging part using ODR info
> > to also merge types which differ in completeness of contained
> > pointer types, right?  (exactly equal cases should be already merged)
> 
> Actually I was speaking of canonical types here. I want to preserve more 
> of TBAA via honoring ODR and local types.

So, are you positive there will be a net gain in optimization when
doing that?  Please factor in the surprises you'll get when code
gets "miscompiled" because of "slight" ODR violations or interoperability
that no longer works.

> I want to change lto to not 
> merge canonical types for pairs of types of same layout (i.e. equivalent 
> in the current canonical type definition) but with different mangled 
> names.

Names are nothing ;)  In C I very often see different _names_ used
in headers vs. implementation (when the implementation uses a different
internal header).  You have struct Foo; in public headers vs.
struct Foo_impl; in the implementation.

> I also want it to never merge when types are local. For 
> inter-language TBAA we will need to ensure aliasing in between non-ODR 
> type of same layout and all unmerged variants of ODR type.
>  Can it be 
> done by attaching chains of ODR types into the canonical type hash and 
> when non-ODR type appears, just make it alias with all of them?

No, how would that work?

> It would make sense to ODR merge in tree merging, too, but I am not sure if
> this fits the current design, since you would need to merge SCC components of
> different shape then that seems hard, right?

Right.  You'd lose the nice incremental SCC merging (where we haven't even
yet implemented the nicest way - avoid re-materializing the SCC until
we know it prevails).

> It may be easier to ODR merge after streaming (during DECL fixup) just to make
> WPA streaming cheaper and to reduce debug info size.  If you use
> -fdump-ipa-devirt, it will dump you ODR types that did not get merged (only
> ones with vtable pointers in them ATM) and there are quite long chains for
> firefox. Surely then hundreds of duplicated ODR types will end up in the ltrans
> partition streams and they eventually hit debug output machinery.
> Eric sent me presentation about doing this in LLVM.
> http://llvm.org/devmtg/2013-11/slides/Christopher-DebugInfo.pdf

Debuginfo is sth completely separate and should be done separately
(early debug), avoiding to stream the types in the first place.

> > 
> > The canonical type computation happens separately (only for prevailing
> > types, of course), and there we already "merge" types which differ
> > in completeness.  Canonical type merging is conservative the other
> > way aroud - if we merge _all_ types to a single canonical type then
> > TBAA is still correct (we get a single alias set).
> 
> Yes, I think I understand that. One equivalence is kind of minimal so we merge
> only if we are sure there is no informationloss, other is maximal so we are
> sure that types that needs to be equivalent by whatever underlying langauge
> TBAA rules are actually equivalent.

The former is just not correct - it would mean that not merging at all
would be valid, which it is not (you'd create wrong-code all over the 
place).

We still don't merge enough (because of latent bugs that I didn't manage
to fix in time) - thus we do not merge all structurally equivalent types
right now.

> > > I also think we want explicit representation of types known to be local 
> > > to compilation unit - anonymous namespaces in C/C++, types defined 
> > > within function bodies in C and god knows what in Ada/Fortran/Java.
> > 
> > But here you get into the idea of improving TBAA, thus having
> > _more_ distinct canonical types?
> 
> Yes.
> > 
> > Just to make sure to not mix those two ;)
> > 
> > And whatever "frontend knowledge" we want to excercise - please
> > make sure we get a reliable way for the middle-end to see
> > that "frontend knowledge" (no langhooks!).  Thus, make it
> > "middle-end knowledge".
> 
> Sure that is what I am proposing - just have DECL_ASSEMBLER_NAME on TYPE_DECL
> and ODR flag. Middle-end when comparing types will test ODR flag and if flag
> is set, then it will compare via DECL_ASEBMLER_NAME (TYPE_DECL (type)).
> No langhooks needed here + if other language has similar inter-unit equivalency
> it can use the same mechanizm. Just turn the equivalency description into
> string identifiers.

Ok.  You have to be aware of the effects on inter-language 
interoperability though (you'll break it).  Thus I'd make this
guarded by -fextra-strict-aliasing and only auto-enable that when
all TUs are produced by the same frontend (easy enough to check I guess).

Richard.

> > Oh - and the easiest way to improve things is to get less types into
> > the merging process in the first place!
> 
> Yep, my experiments with not streaming BINFO are directed in it.  I will collect
> some numbers and send.
> 
> Honza
> > 
> > Richard.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: TYPE_BINFO and canonical types at LTO
  2014-02-19 12:16                 ` Richard Biener
@ 2014-02-19 19:19                   ` Jan Hubicka
  2014-02-20 11:28                     ` Richard Biener
  0 siblings, 1 reply; 18+ messages in thread
From: Jan Hubicka @ 2014-02-19 19:19 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jan Hubicka, gcc, jason

> On Tue, 18 Feb 2014, Jan Hubicka wrote:
> 
> > > > Non-ODR types born from other frontends will then need to be made to 
> > > > alias all the ODR variants that can be done by storing them into the 
> > > > current canonical type hash.
> > > > (I wonder if we want to support cross language aliasing for non-POD?)
> > > 
> > > Surely for accessing components of non-POD types, no?  Like
> > > 
> > > class Foo {
> > > Foo();
> > > int *get_data ();
> > > int *data;
> > > } glob_foo;
> > > 
> > > extern "C" int *get_foo_data() { return glob_foo.get_data(); }
> > 
> > OK, if we want to support this, then we want to merge.
> > What about types with vtbl pointer? :)
> 
> I can easily create a C struct variant covering that.  Basically
> in _practice_ I can inter-operate with any language from C if I
> know its ABI.  Do we really want to make this undefined?  See
> the (even standard) Fortran - C interoperability spec.  I'm sure
> something exists for Ada interoperating with C (or even C++).

Well, if you know the ABI, you can interoperate C
struct {int a,b;};
with
struct {struct {int a;} a; struct {int b;}};

So we don't interoperate everything.  We may eventually want to define what
kind of inter-operation we support.  non-POD class layout is not always a
natural extension of C struct layout as one would expect
http://mentorembedded.github.io/cxx-abi/abi.html#layout
so it may make sense to declare it uninteroperable if it helps something in real
world (which I am not quite sure about)

I am not really shooting for this. I just want to get something that would
improve TBAA in pre-dominantly C++ programs (with some C code in them) such as
firefox, chromium, openoffice or GCC. Because I see us giving up on TBAA very
often when playing with cases that should be devirtualized by GVN/aggregate
propagation in ipa-prop but aren't. Also given that C++ standard actually
have notion of inter-module type equivalence it may be good idea to honnor it.
> 
> > > ?  But you are talking about the "tree" merging part using ODR info
> > > to also merge types which differ in completeness of contained
> > > pointer types, right?  (exactly equal cases should be already merged)
> > 
> > Actually I was speaking of canonical types here. I want to preserve more 
> > of TBAA via honoring ODR and local types.
> 
> So, are you positive there will be a net gain in optimization when
> doing that?  Please factor in the surprises you'll get when code
> gets "miscompiled" because of "slight" ODR violations or interoperability
> that no longer works.
> 
> > I want to change lto to not 
> > merge canonical types for pairs of types of same layout (i.e. equivalent 
> > in the current canonical type definition) but with different mangled 
> > names.
> 
> Names are nothing ;)  In C I very often see different _names_ used
> in headers vs. implementation (when the implementation uses a different
> internal header).  You have struct Foo; in public headers vs.
> struct Foo_impl; in the implementation.

Yes, I am aware of that - I had to go through all those issues with --combine
code in mid 2000s. C++ is different.
What I want is to make TBAA in between C++ types stronger and TBAA between
C++ and other languages to do pretty much what we donot, if possible.
> 
> > I also want it to never merge when types are local. For 
> > inter-language TBAA we will need to ensure aliasing in between non-ODR 
> > type of same layout and all unmerged variants of ODR type.
> >  Can it be 
> > done by attaching chains of ODR types into the canonical type hash and 
> > when non-ODR type appears, just make it alias with all of them?
> 
> No, how would that work?

This is what I am asking you about.
At the end of streaming process, we will have canonical type hash with one leader
and list of ODR variants that was not merged based on ODR rules.
If leader happens to be non-ODR, then it must be made to alias all of them.
I think either we can rewrite all ODR cases to have the non-ODR as canonical type
or make something more fine grained so we can force loads/stores through non-ODR
types to alias with all of them, but loads/stores through ODR types to alias
as within C++ compilation unit.
> 
> > It would make sense to ODR merge in tree merging, too, but I am not sure if
> > this fits the current design, since you would need to merge SCC components of
> > different shape then that seems hard, right?
> 
> Right.  You'd lose the nice incremental SCC merging (where we haven't even
> yet implemented the nicest way - avoid re-materializing the SCC until
> we know it prevails).

Yes, SCC merging w/o re-materialization is a priority, indeed.
> 
> > It may be easier to ODR merge after streaming (during DECL fixup) just to make
> > WPA streaming cheaper and to reduce debug info size.  If you use
> > -fdump-ipa-devirt, it will dump you ODR types that did not get merged (only
> > ones with vtable pointers in them ATM) and there are quite long chains for
> > firefox. Surely then hundreds of duplicated ODR types will end up in the ltrans
> > partition streams and they eventually hit debug output machinery.
> > Eric sent me presentation about doing this in LLVM.
> > http://llvm.org/devmtg/2013-11/slides/Christopher-DebugInfo.pdf
> 
> Debuginfo is sth completely separate and should be done separately
> (early debug), avoiding to stream the types in the first place.

If we generate dwarf dies, we still need to hook them to our IL so we know what they describe.
We will also need to merge them. LLVM basically does early debug and it does the merging
and it needs 13GB extra memory use when you compile firefox with -g compared to no -g.
So it seems to serve as a warning that this path is not 100% problem-free either.

But yes, it is a separate issue.
> 
> > > 
> > > The canonical type computation happens separately (only for prevailing
> > > types, of course), and there we already "merge" types which differ
> > > in completeness.  Canonical type merging is conservative the other
> > > way aroud - if we merge _all_ types to a single canonical type then
> > > TBAA is still correct (we get a single alias set).
> > 
> > Yes, I think I understand that. One equivalence is kind of minimal so we merge
> > only if we are sure there is no informationloss, other is maximal so we are
> > sure that types that needs to be equivalent by whatever underlying langauge
> > TBAA rules are actually equivalent.
> 
> The former is just not correct - it would mean that not merging at all
> would be valid, which it is not (you'd create wrong-code all over the 
> place).

Ah yes, because of memory layout differences.
> 
> Ok.  You have to be aware of the effects on inter-language 
> interoperability though (you'll break it).  Thus I'd make this
> guarded by -fextra-strict-aliasing and only auto-enable that when
> all TUs are produced by the same frontend (easy enough to check I guess).

What I am trying to propose (in first step) should be safe inter-language.
But yes, we can start doing it optional and gain some experience. I am sure
there will be surprises ;)

Honza
> 
> Richard.
> 
> > > Oh - and the easiest way to improve things is to get less types into
> > > the merging process in the first place!
> > 
> > Yep, my experiments with not streaming BINFO are directed in it.  I will collect
> > some numbers and send.
> > 
> > Honza
> > > 
> > > Richard.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: TYPE_BINFO and canonical types at LTO
  2014-02-19 19:19                   ` Jan Hubicka
@ 2014-02-20 11:28                     ` Richard Biener
  0 siblings, 0 replies; 18+ messages in thread
From: Richard Biener @ 2014-02-20 11:28 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: gcc, jason

On Wed, 19 Feb 2014, Jan Hubicka wrote:

> > On Tue, 18 Feb 2014, Jan Hubicka wrote:
> > 
> > > > > Non-ODR types born from other frontends will then need to be made to 
> > > > > alias all the ODR variants that can be done by storing them into the 
> > > > > current canonical type hash.
> > > > > (I wonder if we want to support cross language aliasing for non-POD?)
> > > > 
> > > > Surely for accessing components of non-POD types, no?  Like
> > > > 
> > > > class Foo {
> > > > Foo();
> > > > int *get_data ();
> > > > int *data;
> > > > } glob_foo;
> > > > 
> > > > extern "C" int *get_foo_data() { return glob_foo.get_data(); }
> > > 
> > > OK, if we want to support this, then we want to merge.
> > > What about types with vtbl pointer? :)
> > 
> > I can easily create a C struct variant covering that.  Basically
> > in _practice_ I can inter-operate with any language from C if I
> > know its ABI.  Do we really want to make this undefined?  See
> > the (even standard) Fortran - C interoperability spec.  I'm sure
> > something exists for Ada interoperating with C (or even C++).
> 
> Well, if you know the ABI, you can interoperate C
> struct {int a,b;};
> with
> struct {struct {int a;} a; struct {int b;}};
> 
> So we don't interoperate everything.  We may eventually want to define what
> kind of inter-operation we support.  non-POD class layout is not always a
> natural extension of C struct layout as one would expect
> http://mentorembedded.github.io/cxx-abi/abi.html#layout
> so it may make sense to declare it uninteroperable if it helps something in real
> world (which I am not quite sure about)

I'm considering interoperation as defined by language standards
(definitely existing for Fortran <-> C) only.  I'm quite sure such
language specified interoperation exists for Ada as well and it
does exist for Java NI / C++ (not that we care here I guess).

I'm not at all sure to what extent interoperation between C and C++
is defined (apart from knowing 'extern "C"' to avoid mangling), so
I'm curious there.

So what we need to make sure is that the TBAA we apply does not break
when LTO-ing "valid" cross-language interoperation.  The current 
implementation is somewhat ad-hoc and tries to be very conservative
(but at the same time has known issues).

> I am not really shooting for this. I just want to get something that would
> improve TBAA in pre-dominantly C++ programs (with some C code in them) such as
> firefox, chromium, openoffice or GCC. Because I see us giving up on TBAA very
> often when playing with cases that should be devirtualized by GVN/aggregate
> propagation in ipa-prop but aren't. Also given that C++ standard actually
> have notion of inter-module type equivalence it may be good idea to honnor it.

Sure.  You can also track "language" per function and update it to
"mix" when inlining different language functions (but then you have to
implement knowledge into IPA propagators as well), and then fall back
to conservative TBAA for functions with cross-language interoperation.

I'm just not sure how to represent those "two" views of TBAA with
current infrastructure ...

> > 
> > > > ?  But you are talking about the "tree" merging part using ODR info
> > > > to also merge types which differ in completeness of contained
> > > > pointer types, right?  (exactly equal cases should be already merged)
> > > 
> > > Actually I was speaking of canonical types here. I want to preserve more 
> > > of TBAA via honoring ODR and local types.
> > 
> > So, are you positive there will be a net gain in optimization when
> > doing that?  Please factor in the surprises you'll get when code
> > gets "miscompiled" because of "slight" ODR violations or interoperability
> > that no longer works.
> > 
> > > I want to change lto to not 
> > > merge canonical types for pairs of types of same layout (i.e. equivalent 
> > > in the current canonical type definition) but with different mangled 
> > > names.
> > 
> > Names are nothing ;)  In C I very often see different _names_ used
> > in headers vs. implementation (when the implementation uses a different
> > internal header).  You have struct Foo; in public headers vs.
> > struct Foo_impl; in the implementation.
> 
> Yes, I am aware of that - I had to go through all those issues with --combine
> code in mid 2000s. C++ is different.
> What I want is to make TBAA in between C++ types stronger and TBAA between
> C++ and other languages to do pretty much what we donot, if possible.
> > 
> > > I also want it to never merge when types are local. For 
> > > inter-language TBAA we will need to ensure aliasing in between non-ODR 
> > > type of same layout and all unmerged variants of ODR type.
> > >  Can it be 
> > > done by attaching chains of ODR types into the canonical type hash and 
> > > when non-ODR type appears, just make it alias with all of them?
> > 
> > No, how would that work?
> 
> This is what I am asking you about. At the end of streaming process, we 
> will have canonical type hash with one leader and list of ODR variants 
> that was not merged based on ODR rules. If leader happens to be non-ODR, 
> then it must be made to alias all of them. I think either we can rewrite 
> all ODR cases to have the non-ODR as canonical type or make something 
> more fine grained so we can force loads/stores through non-ODR types to 
> alias with all of them, but loads/stores through ODR types to alias as 
> within C++ compilation unit.

Ok, so currently we don't have that "list of variants" but we could
keep it.  So we'd compute the conservative TYPE_CANONICAL but keep
a list of "equivalent" types during streaming.

After streaming we iterate over all canonical type leaders, looking
in the list of structurally equal types and if all of them need
to follow ODR rules split this canonical group into multiple distinct
groups based on ODR rules?

We can (and should) do that at LTRANS stage only?  That avoids
the need to keep the list at WPA time and the time to dissect the
groups?

> > > It would make sense to ODR merge in tree merging, too, but I am not sure if
> > > this fits the current design, since you would need to merge SCC components of
> > > different shape then that seems hard, right?
> > 
> > Right.  You'd lose the nice incremental SCC merging (where we haven't even
> > yet implemented the nicest way - avoid re-materializing the SCC until
> > we know it prevails).
> 
> Yes, SCC merging w/o re-materialization is a priority, indeed.

Yeah.  It should be quite easy to do for SCCs without hash collisions
(just keep around the on-disk SCC rep of the prevailing SCC and
do a memcmp with the SCC you are about to read).  For SCCs with
hash collisions the compare isn't so trivial so maybe we'd just keep
the re-materializing code-path for those.  We can easily stream
a flag whether the SCC does have multiple entries with the same hash
or not.

I will probably play with this during next stage1 if nobody beats me
on it (you can add statistics to lto-streamer-out.c:1254 to see
how many SCCs or SCC entries you'll cover with that approach).

Richard.

> > > It may be easier to ODR merge after streaming (during DECL fixup) just to make
> > > WPA streaming cheaper and to reduce debug info size.  If you use
> > > -fdump-ipa-devirt, it will dump you ODR types that did not get merged (only
> > > ones with vtable pointers in them ATM) and there are quite long chains for
> > > firefox. Surely then hundreds of duplicated ODR types will end up in the ltrans
> > > partition streams and they eventually hit debug output machinery.
> > > Eric sent me presentation about doing this in LLVM.
> > > http://llvm.org/devmtg/2013-11/slides/Christopher-DebugInfo.pdf
> > 
> > Debuginfo is sth completely separate and should be done separately
> > (early debug), avoiding to stream the types in the first place.
> 
> If we generate dwarf dies, we still need to hook them to our IL so we know what they describe.
> We will also need to merge them. LLVM basically does early debug and it does the merging
> and it needs 13GB extra memory use when you compile firefox with -g compared to no -g.
> So it seems to serve as a warning that this path is not 100% problem-free either.
> 
> But yes, it is a separate issue.
> > 
> > > > 
> > > > The canonical type computation happens separately (only for prevailing
> > > > types, of course), and there we already "merge" types which differ
> > > > in completeness.  Canonical type merging is conservative the other
> > > > way aroud - if we merge _all_ types to a single canonical type then
> > > > TBAA is still correct (we get a single alias set).
> > > 
> > > Yes, I think I understand that. One equivalence is kind of minimal so we merge
> > > only if we are sure there is no informationloss, other is maximal so we are
> > > sure that types that needs to be equivalent by whatever underlying langauge
> > > TBAA rules are actually equivalent.
> > 
> > The former is just not correct - it would mean that not merging at all
> > would be valid, which it is not (you'd create wrong-code all over the 
> > place).
> 
> Ah yes, because of memory layout differences.
> > 
> > Ok.  You have to be aware of the effects on inter-language 
> > interoperability though (you'll break it).  Thus I'd make this
> > guarded by -fextra-strict-aliasing and only auto-enable that when
> > all TUs are produced by the same frontend (easy enough to check I guess).
> 
> What I am trying to propose (in first step) should be safe inter-language.
> But yes, we can start doing it optional and gain some experience. I am sure
> there will be surprises ;)
> 
> Honza
> > 
> > Richard.
> > 
> > > > Oh - and the easiest way to improve things is to get less types into
> > > > the merging process in the first place!
> > > 
> > > Yep, my experiments with not streaming BINFO are directed in it.  I will collect
> > > some numbers and send.
> > > 
> > > Honza
> > > > 
> > > > Richard.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: TYPE_BINFO and canonical types at LTO
  2014-02-15 10:10       ` Richard Biener
@ 2014-02-20 20:49         ` Jason Merrill
  0 siblings, 0 replies; 18+ messages in thread
From: Jason Merrill @ 2014-02-20 20:49 UTC (permalink / raw)
  To: Richard Biener, Jan Hubicka; +Cc: gcc

On 02/15/2014 05:09 AM, Richard Biener wrote:
> On Sat, 15 Feb 2014, Jan Hubicka wrote:
>> The code really seems to be adding only cases of zero sized classes.
>
> zero-size as in with no fields?  Then we can't do anything with such
> classes and thus we don't need to record component aliases?

Right.  The TYPE_SIZE of an empty class is still 1, but a base of such a 
type doesn't get a FIELD_DECL.

Jason

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: TYPE_BINFO and canonical types at LTO
  2014-02-17 20:55           ` Jan Hubicka
  2014-02-18  8:51             ` Richard Biener
@ 2014-02-20 20:57             ` Jason Merrill
  1 sibling, 0 replies; 18+ messages in thread
From: Jason Merrill @ 2014-02-20 20:57 UTC (permalink / raw)
  To: Jan Hubicka, Richard Biener; +Cc: gcc

On 02/17/2014 03:55 PM, Jan Hubicka wrote:
> I also think we want explicit representation of types known to be local to compilation
> unit - anonymous namespaces in C/C++, types defined within function bodies in C and
> god knows what in Ada/Fortran/Java.

In the C++ front end we set TREE_PUBLIC appropriately on the TYPE_DECL 
of a named type.

Jason

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2014-02-20 20:57 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-02-14  0:48 TYPE_BINFO and canonical types at LTO Jan Hubicka
2014-02-14  8:06 ` Jason Merrill
2014-02-14  8:42 ` Richard Biener
2014-02-14 16:52   ` Jan Hubicka
2014-02-14 23:22     ` Jan Hubicka
2014-02-15 10:10       ` Richard Biener
2014-02-20 20:49         ` Jason Merrill
2014-02-15 10:03     ` Richard Biener
2014-02-16 23:55       ` Jan Hubicka
2014-02-17 11:36         ` Richard Biener
2014-02-17 20:55           ` Jan Hubicka
2014-02-18  8:51             ` Richard Biener
2014-02-18 10:42               ` Richard Biener
2014-02-18 19:02               ` Jan Hubicka
2014-02-19 12:16                 ` Richard Biener
2014-02-19 19:19                   ` Jan Hubicka
2014-02-20 11:28                     ` Richard Biener
2014-02-20 20:57             ` Jason Merrill

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).