public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* Re: g++ and aliasing bools
@ 2002-01-25  8:55 Robert Dewar
  2002-01-25  9:21 ` Daniel Berlin
  0 siblings, 1 reply; 97+ messages in thread
From: Robert Dewar @ 2002-01-25  8:55 UTC (permalink / raw)
  To: dan, dewar; +Cc: gcc, kenner

<<However, the important point is that the language semantics don't allow
us to get it *right*, not just deciding. We may get it *wrong*, and think
we have it *right*.
>>

Then you have misread the language specs, or GCC is wrong, no other
possibilities exist!

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-25  8:55 g++ and aliasing bools Robert Dewar
@ 2002-01-25  9:21 ` Daniel Berlin
  2002-01-25 10:00   ` Daniel Berlin
  0 siblings, 1 reply; 97+ messages in thread
From: Daniel Berlin @ 2002-01-25  9:21 UTC (permalink / raw)
  To: Robert Dewar; +Cc: gcc, kenner

On Fri, 25 Jan 2002, Robert Dewar wrote:

> <<However, the important point is that the language semantics don't allow
> us to get it *right*, not just deciding. We may get it *wrong*, and think
> we have it *right*.
> >>
> 
> Then you have misread the language specs, or GCC is wrong, no other
> possibilities exist!
Incorrect on all three points.
First, The devil is in the implementation specific portions that are part 
of the 
C++ ABI spec, and tell you what the layout of vtables, etc, is. This is 
what one needs, in addition to the language spec, to be able to tell 
aliasing properly.
At least, this is what is thought, anyway, it still might not be enough.
The language spec by itself, however, is *not* enough, and just using the 
rules in it will ignore the whole problem of shared vtables, and given you 
wrong answers *because* implementations do share pieces of classes.
Second, GCC is right  because it just says that in C++, all aggregate type 
can alias anything, avoiding the probblem altogether
Third, the above is a third dpossibility, which not only exists, but is 
the current situation.


> 

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-25  9:21 ` Daniel Berlin
@ 2002-01-25 10:00   ` Daniel Berlin
  2002-01-25 10:54     ` Paolo Carlini
  0 siblings, 1 reply; 97+ messages in thread
From: Daniel Berlin @ 2002-01-25 10:00 UTC (permalink / raw)
  To: Robert Dewar; +Cc: gcc, kenner

On Fri, 25 Jan 2002, Daniel Berlin wrote:

> On Fri, 25 Jan 2002, Robert Dewar wrote:
> 
> > <<However, the important point is that the language semantics don't allow
> > us to get it *right*, not just deciding. We may get it *wrong*, and think
> > we have it *right*.
> > >>
> > 
> > Then you have misread the language specs, or GCC is wrong, no other
> > possibilities exist!
> Incorrect on all three points.
> First, The devil is in the implementation specific portions that are part 
> of the 
> C++ ABI spec, and tell you what the layout of vtables, etc, is. This is 
> what one needs, in addition to the language spec, to be able to tell 
> aliasing properly.
> At least, this is what is thought, anyway, it still might not be enough.
> The language spec by itself, however, is *not* enough, and just using the 
> rules in it will ignore the whole problem of shared vtables, and given you 
> wrong answers *because* implementations do share pieces of classes.
> Second, GCC is right  because it just says that in C++, all aggregate type 
> can alias anything, avoiding the probblem altogether
> Third, the above is a third dpossibility, which not only exists, but is 
> the current situation.

The second is what we are trying to change, if it hasn't clicked by now. 
We want to say that everything besides simple c-like structs can alias 
anything, and simple c-like structs have the same aliasing properties as 
c structs.

Simple c-like structs means classes and structs in C++ which have no base 
classes, and no virtual functions.
If people agree that these are equivalent to c-structs (from an aliasing 
perspective, obviously there are some formal syntactical differences), then we can.
If they don't, then apparently someone else needs to take the time to 
prove it formally, at which point we can add 6 lines of code (or 
whatever, it depends on whether someone can point out any small 
difference in aliasing properties that need to be accounted for) to g++ to 
do it.

There is no "problem" as aliasing for C++ works today, because it doesn't 
do anything for things that aren't in reality, C.

Therein lies the other half of why i don't think the above needs to be 
proven formally, if they are agreed to be equivalent

Already, for non-aggregate types, the routine for C++ TBAA simply calls 
the c routine.

The whole code is literally:

cxx_get_alias_set ()
{
	if (AGGREGATE_TYPE)
		return 0;
	return c_get_alias_set();
}

The above suggestion is simply saying "AGGREGATE_TYPE is way too 
conservative, we only need to return 0 for those things that have C++ 
specific features that may affect aliasing, like virtual functions, or 
inheritance", so let's change it to:

cxx_get_alias_set ()
{
	if (AGGREGATE_TYPE && HAS_BASECLASSES && HAS_VIRTUALS)
		return 0;
	return c_get_alias_set();
}

If C has the right alias semantics for all non-aggregate types, is there 
something special about the subset of aggregate types without 
baseclasses or virtuals in C++ that is different from C from an aliasing 
perspective.
No one has said yes yet, and unless someone can think of some reason why 
they *would* be different, i don't see why it should be formally proven. 

The basic question, therefore, is, "Do structs/classes in C++ that have no 
C++ specific features to them, have the same aliasing rules as C"
I.E.
Is the aliasing for
"
	struct dan
	{
		int a;
	};
"
The same for C and C++.
If so, then we should be able to change the test in cxx_get_alias_set so 
that for the case of structs that have no C++ specific features, we treat 
them as c structs for aliasing.
 --Dan

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-25 10:00   ` Daniel Berlin
@ 2002-01-25 10:54     ` Paolo Carlini
  2002-01-25 11:37       ` Daniel Berlin
  2002-01-25 11:45       ` David Edelsohn
  0 siblings, 2 replies; 97+ messages in thread
From: Paolo Carlini @ 2002-01-25 10:54 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: gcc

Daniel Berlin wrote:

[...]

> The second is what we are trying to change, if it hasn't clicked by now.
> We want to say that everything besides simple c-like structs can alias
> anything, and simple c-like structs have the same aliasing properties as
> c structs.
>
> Simple c-like structs means classes and structs in C++ which have no base
> classes, and no virtual functions.
> If people agree that these are equivalent to c-structs (from an aliasing
> perspective, obviously there are some formal syntactical differences), then we can.
> If they don't, then apparently someone else needs to take the time to
> prove it formally, at which point we can add 6 lines of code (or
> whatever, it depends on whether someone can point out any small
> difference in aliasing properties that need to be accounted for) to g++ to
> do it.
>
> There is no "problem" as aliasing for C++ works today, because it doesn't
> do anything for things that aren't in reality, C.
>
> Therein lies the other half of why i don't think the above needs to be
> proven formally, if they are agreed to be equivalent
>
> Already, for non-aggregate types, the routine for C++ TBAA simply calls
> the c routine.
>
> The whole code is literally:
>
> cxx_get_alias_set ()
> {
>         if (AGGREGATE_TYPE)
>                 return 0;
>         return c_get_alias_set();
> }
>
> The above suggestion is simply saying "AGGREGATE_TYPE is way too
> conservative, we only need to return 0 for those things that have C++
> specific features that may affect aliasing, like virtual functions, or
> inheritance", so let's change it to:
>
> cxx_get_alias_set ()
> {
>         if (AGGREGATE_TYPE && HAS_BASECLASSES && HAS_VIRTUALS)
>                 return 0;
>         return c_get_alias_set();
> }

[...]

From my, utterly naive point of view, this looks like an incredibly appealing
proposal!!!!

Consider my beloved testcase (distilled from Haney Speed), which is currently
optimized *very* bad by g++:

/////////////////

class RealMatrix {
public:

  float &index(int i, int j)
    {
      return d[i - 1 + n[0] * (j - 1)];
    }
  float index(int i, int j) const
    {
      return d[i - 1 + n[0] * (j - 1)];
    }

  int dim(int i) const { return n[i - 1]; }

private:

  float *d;
  int n[4];
};

void rmatMul(RealMatrix &t, const RealMatrix &a, const RealMatrix &b)
{
  const int M = a.dim(1), N = b.dim(2), K = b.dim(1);

  for (int j = 1; j <= N; j++)
    {
      for (int k = 1; k <= K; k++)
        {
          float temp = b.index(k, j);
          if (temp != 0.0)
            {
              for (int i = 1; i <= M; i++)
                t.index(i, j) += temp * a.index(i, k);
            }
        }
    }
}

///////////////////////

Honestly, I cannot imagine why the bulk of the C aliasing analysis machinery could not
be used for it!!!!

Cheers,
Paolo.

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-25 10:54     ` Paolo Carlini
@ 2002-01-25 11:37       ` Daniel Berlin
  2002-01-25 11:45       ` David Edelsohn
  1 sibling, 0 replies; 97+ messages in thread
From: Daniel Berlin @ 2002-01-25 11:37 UTC (permalink / raw)
  To: Paolo Carlini; +Cc: gcc

> 
> Honestly, I cannot imagine why the bulk of the C aliasing analysis machinery could not
> be used for it!!!!
Nor can I, but unless someone has the inclination to prove it formally, 
it'll apparently not be changed.

--Dan

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-25 10:54     ` Paolo Carlini
  2002-01-25 11:37       ` Daniel Berlin
@ 2002-01-25 11:45       ` David Edelsohn
  2002-01-25 11:53         ` Joe Buck
  2002-01-25 12:05         ` Mark Mitchell
  1 sibling, 2 replies; 97+ messages in thread
From: David Edelsohn @ 2002-01-25 11:45 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: Paolo Carlini, gcc

>>>>> Paolo Carlini writes:

>> Honestly, I cannot imagine why the bulk of the C aliasing analysis
>> machinery could not be used for it!!!!

Mark,

	You originally contributed c_get_alias_set().  Would you please
provide us with a reference to the proof showing that it is safe for C?

Thanks, David

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-25 11:45       ` David Edelsohn
@ 2002-01-25 11:53         ` Joe Buck
  2002-01-25 12:09           ` Mark Mitchell
  2002-01-25 12:10           ` Paolo Carlini
  2002-01-25 12:05         ` Mark Mitchell
  1 sibling, 2 replies; 97+ messages in thread
From: Joe Buck @ 2002-01-25 11:53 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Mark Mitchell, Paolo Carlini, gcc

David E. writes:

> 	You originally contributed c_get_alias_set().  Would you please
> provide us with a reference to the proof showing that it is safe for C?

Dan sabotaged his own cause by bringing up a lot of extraneous arguments
that just sidetracked people, I think.

Let's proceed in another way.  I think that, for the purpose of Dan's
proof, he should be allowed to accept as a postulate that c_get_alias_set
is correct when applied to C.  If this postulate is not acceptable,
then it seems that if someone owes us a proof, it is Mark, not Daniel.
However, c_get_alias_set has been there long enough that we have
reasonable confidence in it.

I think it's also fair to assume as a postulate that the current
conservative cxx_get_alias_set is safe.  Again, long experience.

Given that, let's look at Dan's proposal: currently we assume that all
aggregates alias with each other in C++, but for C, we compute aliasing
specifically based on structs and their members.  The way we do this is
by putting all aggregates in alias set 0 in C++.

Dan's proposed modification, in pseudocode, is

cxx_get_alias_set ()
{
	if (AGGREGATE_TYPE && HAS_BASECLASSES && HAS_VIRTUALS)
		return 0;
	return c_get_alias_set();
}

Let's assume that, while c_get_alias_set is correct, Dan is wrong, for
purposes of the proof, and that his change introduces an error that was
not there before.  Then there must be an aggregate and another object,
neither of which has baseclasses or virtual functions, and neither of
which would alias in C, but that alias in C++.  Let's assume that this is
so.  This means that I can construct a legal C++ program that creates
the extra aliasing; this program cannot be a legal C program as well.

There is only one extra C++ operation that will allow "type punning" that
doesn't exist in C: casting between base classes and derived classes.  But
any derived class pointer or reference, once constructed, will go into
alias set 0.  Even if the derived class pointer is created later, from
the base pointer, the compiler will not assume that it can use its
knowledge of the base object to learn anything about the extra members.

While this is not a formal proof, it's enough to convince me that Daniel's
change is safe.



^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-25 11:45       ` David Edelsohn
  2002-01-25 11:53         ` Joe Buck
@ 2002-01-25 12:05         ` Mark Mitchell
  2002-01-25 22:14           ` Daniel Berlin
  1 sibling, 1 reply; 97+ messages in thread
From: Mark Mitchell @ 2002-01-25 12:05 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Paolo Carlini, gcc



--On Friday, January 25, 2002 01:38:53 PM -0500 David Edelsohn 
<dje@watson.ibm.com> wrote:

>>>>>> Paolo Carlini writes:
>
>>> Honestly, I cannot imagine why the bulk of the C aliasing analysis
>>> machinery could not be used for it!!!!
>
> Mark,
>
> 	You originally contributed c_get_alias_set().  Would you please
> provide us with a reference to the proof showing that it is safe for C?

Your question is at best facetious.  You clearly intend to point
out that I did not provide a proof originally.  I did not, but that
doesn't make it right.  Part of the reason I am asking for a proof
now is that the original changes resulted in problems which a took
a while to untangle.  That was a mistake; one I have learned from.

In addition, the rules which I implemented were much simpler than
the sorts of things that can appear with overlaid class types in
C++.  I have already spent days debugging C++ alias set problems
(everything from vtables to multiple inheritance) so I am once-bitten
twice-shy in that regard.  There have been almost no alias set bugs
resulting in the generation of incorrect C code in ages -- much
better evidence than "this patch doesn't cause any regressions"
that the code is correct.

Furthermore, I did not ask for a proof that some piece of C code
was correct (as you do above), I asked for a proof that an
algorithm was correct.

Finally, you ignore the key point: either the proof is easy, or the
problem is hard.

None the less, I'm happy to provide a sketch.  I will do the version
without restrict (that was added later) and without the assignment of
alias sets to structs (Kenner added that later), and without
type-punning for unions (this is optional under ANSI/ISO C).

This is from memory; there might be minor mistakes.  Also note that
the code has changed considerably from my original version, which
makes it harder to see the structure.

1. The C aliasing rules say that if you reference memory using one
   type, you may not reference it using another types, unless:

   - The types are signed/unsigned variants of each other.

   - The types very only in their cv-qualification.

   - One of them is (possibly cv-qualified) "char".

2. Alias sets have the following semantics:

   - Two things in the same alias set may alias one another.

   - Things in two distinct alias sets may alias if one is
     a "subset" of another, under transitive closure.

   - All alias sets are a subset of a special alias set
     called "alias set zero".  (An immediate consequence is
     that something in alias set zero can alias everything.)

Let T be the set of all C types.  Let TA be a relation on TxT such
that (t1, t2) \in TA if and only if t and u may alias.

Similarly, let S be the set of all alias sets.  Let SA be relation on
SxS that (s1, s2) \in SA if and only if s1 and s2 may alias.  (Note
that in the original incarnation, there were no subsets other than
the fact that everything was a subset of alias set zero, so this
relation is well-defined statically.)

What we wish to prove is that C's lang_get_alias_set assigns
alias sets to type safely.  In particular, let f be
c_get_alias_set, and then:

Then, we wish to show that, for all t, u \in T:

   (t, u) \in TA \implies f(t), f(u)) \in SA

(We do not need if and only if for correctness.)

The proof is by induction.  All aggregate types are mapped
to alias set zero which aliases everything; therefore, we
need only consider non-aggregate types.  The code says:

  if (TREE_CODE (t) == INTEGER_TYPE && TREE_UNSIGNED (t))
    {
      tree t1 = signed_type (t);

      return get_alias_set (t1);
    }

Therefore, signed and unsigned variants of types get the same alias
set.

The code says:

  t = TYPE_MAIN_VARIANT (t);
  if (TYPE_P (t) && TYPE_ALIAS_SET_KNOWN_P (t))
    return TYPE_ALIAS_SET (t);

Therefore, if a cv-qualified type and its unqualified variant will
get the same alias set.  By transitivity, so will all cv-qualified
variants of the type.

The code says that:

  /* If this is a char *, the ANSI C standard says it can alias
     anything.  Note that all references need do this.  */
  if (TREE_CODE_CLASS (TREE_CODE (t)) == 'r'
      && TREE_CODE (TREE_TYPE (t)) == INTEGER_TYPE
      && TYPE_PRECISION (TREE_TYPE (t)) == TYPE_PRECISION (char_type_node))
    return 0;

Therefore, "char" is mapped to alias set zero, completing the proof.

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-25 11:53         ` Joe Buck
@ 2002-01-25 12:09           ` Mark Mitchell
  2002-01-25 12:28             ` Paolo Carlini
                               ` (2 more replies)
  2002-01-25 12:10           ` Paolo Carlini
  1 sibling, 3 replies; 97+ messages in thread
From: Mark Mitchell @ 2002-01-25 12:09 UTC (permalink / raw)
  To: Joe Buck, David Edelsohn; +Cc: Paolo Carlini, gcc

> Let's proceed in another way.  I think that, for the purpose of Dan's
> proof, he should be allowed to accept as a postulate that c_get_alias_set
> is correct when applied to C.  If this postulate is not acceptable,
> then it seems that if someone owes us a proof, it is Mark, not Daniel.
> However, c_get_alias_set has been there long enough that we have
> reasonable confidence in it.

Agreed.

> I think it's also fair to assume as a postulate that the current
> conservative cxx_get_alias_set is safe.  Again, long experience.

Agreed.

> cxx_get_alias_set ()
> {
> 	if (AGGREGATE_TYPE && HAS_BASECLASSES && HAS_VIRTUALS)
> 		return 0;
> 	return c_get_alias_set();
> }
>

> While this is not a formal proof, it's enough to convince me that Daniel's
> change is safe.

Your proof has at least one bug.  A type that has no baseclasses or
virtuals can contain (as a data member) a type that does; such a type
is at least as complex as the contained type.  (Similarly, an array
of classes with virtual bases, etc.)  You need to recurse through the
type structure.

It may be that with that change, your sketch is correct -- but the
fact that you missed this point just makes me more nervous.  It
shows that smart people with long experience with C++ can get tripped
up here.

One of the problems with this kind of change is that it can result
in a bug in a one million line program that is seemingly irreducible,
and isn't even very easy to isolate.  We need to be very careful.

That said, if this change is safe, we should definitely do it.  It's
just that "this change causes no regressions" isn't close to being
good enough for this kind of change.

I certainly support C++ optimization improvements!

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-25 11:53         ` Joe Buck
  2002-01-25 12:09           ` Mark Mitchell
@ 2002-01-25 12:10           ` Paolo Carlini
  2002-01-25 13:16             ` Joe Buck
  2002-01-25 15:23             ` Daniel Berlin
  1 sibling, 2 replies; 97+ messages in thread
From: Paolo Carlini @ 2002-01-25 12:10 UTC (permalink / raw)
  To: Joe Buck; +Cc: gcc

Joe Buck wrote:

> cxx_get_alias_set ()
> {
>         if (AGGREGATE_TYPE && HAS_BASECLASSES && HAS_VIRTUALS)
>                 return 0;
>         return c_get_alias_set();
> }

Should'nt this be in pseudocode, even more safely:

{
        if (AGGREGATE_TYPE && (HAS_BASECLASSES || HAS_VIRTUALS))
                return 0;
        return c_get_alias_set();
}

???

Otherwise, I'm all with your way of approaching the issue!!!

Cheers,
Paolo.


^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-25 12:09           ` Mark Mitchell
@ 2002-01-25 12:28             ` Paolo Carlini
  2002-01-25 13:49               ` Mark Mitchell
  2002-01-25 13:07             ` Joe Buck
  2002-01-25 15:13             ` Daniel Berlin
  2 siblings, 1 reply; 97+ messages in thread
From: Paolo Carlini @ 2002-01-25 12:28 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: gcc

Mark Mitchell wrote:

> Your proof has at least one bug.  A type that has no baseclasses or
> virtuals can contain (as a data member) a type that does; such a type
> is at least as complex as the contained type.  (Similarly, an array
> of classes with virtual bases, etc.)  You need to recurse through the
> type structure.

Mark, is this really hard to implement??

My feeling is that this first improvement step should not be terribly difficult
to implement and also not terribly difficult to prove correct!

Cheers,
Paolo.


^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-25 12:09           ` Mark Mitchell
  2002-01-25 12:28             ` Paolo Carlini
@ 2002-01-25 13:07             ` Joe Buck
  2002-01-25 15:43               ` Daniel Berlin
  2002-01-25 15:13             ` Daniel Berlin
  2 siblings, 1 reply; 97+ messages in thread
From: Joe Buck @ 2002-01-25 13:07 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: Joe Buck, David Edelsohn, Paolo Carlini, gcc

I wrote:

[ attempt at a proof that Daniel's right ]

Mark writes:
> Your proof has at least one bug.  A type that has no baseclasses or
> virtuals can contain (as a data member) a type that does; such a type
> is at least as complex as the contained type.  (Similarly, an array
> of classes with virtual bases, etc.)  You need to recurse through the
> type structure.

Good catch.  OK, Daniel, Mark has demonstrated that he was correct in
asking you for a proof: your proposal was wrong.

> It may be that with that change, your sketch is correct -- but the
> fact that you missed this point just makes me more nervous.  It
> shows that smart people with long experience with C++ can get tripped
> up here.

My many years of net experience demonstrate that the fastest way to make
progress is often to strongly assert something and wait for someone to
blow it up.  At the risk of looking bad twice: if no object at any level
in the aggregate has any virtual functions or baseclasses, it is safe
to apply the C aliasing rules, because for the purpose of aliasing, it
is just C (presence of nonvirtual member functions and protection
change nothing).  So, if we have an aggregate, we must (recursively)
check all members.



^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-25 12:10           ` Paolo Carlini
@ 2002-01-25 13:16             ` Joe Buck
  2002-01-25 15:23             ` Daniel Berlin
  1 sibling, 0 replies; 97+ messages in thread
From: Joe Buck @ 2002-01-25 13:16 UTC (permalink / raw)
  To: Paolo Carlini; +Cc: Joe Buck, gcc

Paolo wrote:

> Should'nt this be in pseudocode, even more safely:
> 
> {
>         if (AGGREGATE_TYPE && (HAS_BASECLASSES || HAS_VIRTUALS))
>                 return 0;
>         return c_get_alias_set();
> }

The new version would look like

/* return true iff, for purposes of layout, we have a C object.
 */

bool
cxx_simple_enough_type(type t)
{
	if (!AGGREGATE_TYPE(t))
		return true;
	if (HAS_BASECLASSES(t) || HAS_VIRTUALS(t))
		return false;
	for(each member m of t) {
		if (cxx_simple_enough_type(typeof(m)))
			return false;
	}
	return true;
}

alias_set*
cxx_get_alias_set(type t)
{
	if (!cxx_simple_enough_type(t))
		return 0;
	return c_get_alias_set();
}

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-25 12:28             ` Paolo Carlini
@ 2002-01-25 13:49               ` Mark Mitchell
  2002-01-25 14:19                 ` Joe Buck
  0 siblings, 1 reply; 97+ messages in thread
From: Mark Mitchell @ 2002-01-25 13:49 UTC (permalink / raw)
  To: Paolo Carlini; +Cc: gcc



--On Friday, January 25, 2002 08:52:39 PM +0100 Paolo Carlini 
<pcarlini@unitus.it> wrote:

> Mark Mitchell wrote:
>
>> Your proof has at least one bug.  A type that has no baseclasses or
>> virtuals can contain (as a data member) a type that does; such a type
>> is at least as complex as the contained type.  (Similarly, an array
>> of classes with virtual bases, etc.)  You need to recurse through the
>> type structure.
>
> Mark, is this really hard to implement??

No!  You just have to think of it. :-)

> My feeling is that this first improvement step should not be terribly
> difficult to implement and also not terribly difficult to prove correct!

Agreed.  I think Joe's argument is a huge step forward; if we can just
work that through we'll probably get somewhere.

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-25 13:49               ` Mark Mitchell
@ 2002-01-25 14:19                 ` Joe Buck
  2002-01-25 14:21                   ` Mark Mitchell
  0 siblings, 1 reply; 97+ messages in thread
From: Joe Buck @ 2002-01-25 14:19 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: Paolo Carlini, gcc

Mark wrote:

> Agreed.  I think Joe's argument is a huge step forward; if we can just
> work that through we'll probably get somewhere.

OK, I'll try to do it once more in a clean way.

Postulates: the current c alias set mechanism is correct, and the current
c++ alias mechanism, while overly conservative, is correct

The proposed replacement is the following: an aggregate is simple_enough
if it has no baseclasses and no virtual functions and all of its members
are either non-aggregates or simple_enough.

if cxx_get_alias_set is applied to an aggregate that is not simple_enough,
it returns 0.  Otherwise it uses the C rules (c_get_alias_set).

Now, let's assume that there is a bug with this, meaning that two accesses
will falsely be reported not to alias with this new rule (sorry about the
double negative).  So, assume that we have program P in which this is
true.

As I said before, there is only one extra C++ operation that will allow
"type punning" that doesn't exist in C: casting between base classes and
derived classes.  But any derived class pointer or reference, once
constructed, will go into alias set 0.  Even if the derived class pointer
is created later, from the base pointer, the compiler will not assume that
it can use its knowledge of the base object to learn anything about the
extra members.

Further, we can replace all simple_enough aggregates with C structs:
turn member functions into functions, drop protection, insert constructor
and destructor calls as ordinary function calls.

Conclusion: if program P exists, we can convert the relevant parts of
the point where we miscompile it into a C program, and GCC will also
miscompile the C program.  (imagine passing it through, say, EDG's
or KAI's C++ to C compiler).






^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-25 14:19                 ` Joe Buck
@ 2002-01-25 14:21                   ` Mark Mitchell
  2002-01-25 15:41                     ` Neil Booth
  0 siblings, 1 reply; 97+ messages in thread
From: Mark Mitchell @ 2002-01-25 14:21 UTC (permalink / raw)
  To: Joe Buck; +Cc: Paolo Carlini, gcc

> Conclusion: if program P exists, we can convert the relevant parts of
> the point where we miscompile it into a C program, and GCC will also
> miscompile the C program.  (imagine passing it through, say, EDG's
> or KAI's C++ to C compiler).

OK, I like this argument.  *I* find it convincing, at least. :-)

Implicitly, you're assuming that a simple_enough C++ struct (minus
member functions and gunk) is also a C struct.  There's one pedantic
case where this is false: zero-sized structs are not part of ANSI/ISO
C89, but are part of C++.  But, they are supported in GNU C, so
your reduction still works.

Thanks for taking the time to work out the reduction explicitly.

Therefore, I'll pre-approve the patch, in concept -- but only for 3.2.
(I can't see how this is a "bug fix" by the criteria we've been using.)

So, after the branch, let's have a patch to do this.

Thanks,

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-25 12:09           ` Mark Mitchell
  2002-01-25 12:28             ` Paolo Carlini
  2002-01-25 13:07             ` Joe Buck
@ 2002-01-25 15:13             ` Daniel Berlin
  2 siblings, 0 replies; 97+ messages in thread
From: Daniel Berlin @ 2002-01-25 15:13 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: Joe Buck, David Edelsohn, Paolo Carlini, gcc

On Fri, 25 Jan 2002, Mark Mitchell wrote:

> > Let's proceed in another way.  I think that, for the purpose of Dan's
> > proof, he should be allowed to accept as a postulate that c_get_alias_set
> > is correct when applied to C.  If this postulate is not acceptable,
> > then it seems that if someone owes us a proof, it is Mark, not Daniel.
> > However, c_get_alias_set has been there long enough that we have
> > reasonable confidence in it.
> 
> Agreed.
> 
> > I think it's also fair to assume as a postulate that the current
> > conservative cxx_get_alias_set is safe.  Again, long experience.
> 
> Agreed.
> 
> > cxx_get_alias_set ()
> > {
> > 	if (AGGREGATE_TYPE && HAS_BASECLASSES && HAS_VIRTUALS)
> > 		return 0;
> > 	return c_get_alias_set();
> > }
> >
> 
> > While this is not a formal proof, it's enough to convince me that Daniel's
> > change is safe.
> 
> Your proof has at least one bug.  A type that has no baseclasses or
> virtuals can contain (as a data member) a type that does; such a type
> is at least as complex as the contained type.

However, that type would have alias set 0 already, because it would have 
failed the above test.  So any accesses to something inside of it would be 
assumed to alias everything else anyway.


--Dan

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-25 12:10           ` Paolo Carlini
  2002-01-25 13:16             ` Joe Buck
@ 2002-01-25 15:23             ` Daniel Berlin
  1 sibling, 0 replies; 97+ messages in thread
From: Daniel Berlin @ 2002-01-25 15:23 UTC (permalink / raw)
  To: Paolo Carlini; +Cc: Joe Buck, gcc

On Fri, 25 Jan 2002, Paolo Carlini wrote:

> Joe Buck wrote:
> 
> > cxx_get_alias_set ()
> > {
> >         if (AGGREGATE_TYPE && HAS_BASECLASSES && HAS_VIRTUALS)
> >                 return 0;
> >         return c_get_alias_set();
> > }
> 
> Should'nt this be in pseudocode, even more safely:
> 
> {
>         if (AGGREGATE_TYPE && (HAS_BASECLASSES || HAS_VIRTUALS))
>                 return 0;
>         return c_get_alias_set();
> }
> 
> ???
Yes, it should.

> 
> Otherwise, I'm all with your way of approaching the issue!!!
> 
> Cheers,
> Paolo.
> 
> 

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-25 14:21                   ` Mark Mitchell
@ 2002-01-25 15:41                     ` Neil Booth
  2002-01-25 16:04                       ` Joe Buck
  0 siblings, 1 reply; 97+ messages in thread
From: Neil Booth @ 2002-01-25 15:41 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: Joe Buck, Paolo Carlini, gcc

Mark Mitchell wrote:-

> OK, I like this argument.  *I* find it convincing, at least. :-)

[...]

> So, after the branch, let's have a patch to do this.

Great; I think that Joe's "proof" should go in as a comment; both to
explain the ideas and subtleties behind the code, but also so we can
point at it and extract a similar argument from anyone who wants to
tighten the logic in future.

Neil.

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-25 13:07             ` Joe Buck
@ 2002-01-25 15:43               ` Daniel Berlin
  2002-01-25 16:03                 ` Joe Buck
  0 siblings, 1 reply; 97+ messages in thread
From: Daniel Berlin @ 2002-01-25 15:43 UTC (permalink / raw)
  To: Joe Buck; +Cc: Mark Mitchell, David Edelsohn, Paolo Carlini, gcc

On Fri, 25 Jan 2002, Joe Buck wrote:

> I wrote:
> 
> [ attempt at a proof that Daniel's right ]
> 
> Mark writes:
> > Your proof has at least one bug.  A type that has no baseclasses or
> > virtuals can contain (as a data member) a type that does; such a type
> > is at least as complex as the contained type.  (Similarly, an array
> > of classes with virtual bases, etc.)  You need to recurse through the
> > type structure.
> 
> Good catch.  OK, Daniel, Mark has demonstrated that he was correct in
> asking you for a proof: your proposal was wrong.
Not quite, Mark has caught a case that is already handled properly.
It is impossible for the type involved as a member to have been given an 
alias set other than 0, by the same test.  So we would never improperly 
generate code because of it.
If you want to me to show both it, and the current type, would be placed 
in set 0:
The C machinery in question records component aliases of structs.
In recording component aliases of structs, it recursively walks the 
members, and gets alias sets for their types, recording them as subsets of 
the struct's alias set:
From record_component_aliases:
"
   for (field = TYPE_FIELDS (type); field != 0; field = TREE_CHAIN 
(field))
        if (TREE_CODE (field) == FIELD_DECL && ! DECL_NONADDRESSABLE_P 
(field))
          record_alias_subset (superset, get_alias_set (TREE_TYPE 
(field)));
"
Once we hit the tree type of the field in question, we'll record it as 
having alias set 0 (it will fail the test, or the same process above 
will repeat, if it has a member that is itself too complex), and then 
record our type as having alias set 0 as a subset.
That will cause record_alias_subset to mark the alias set as a child of 
zero (there's a flag to do this), which is commented as:

"
  /* Nonzero if would have a child of zero: this effectively makes 
this
     alias set the same as alias set zero.  */
"
Thus, no accesses to the type in question will be treated as not aliasing 
anything else.

--Dan

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-25 15:43               ` Daniel Berlin
@ 2002-01-25 16:03                 ` Joe Buck
  0 siblings, 0 replies; 97+ messages in thread
From: Joe Buck @ 2002-01-25 16:03 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: Joe Buck, Mark Mitchell, David Edelsohn, Paolo Carlini, gcc

me:
> > Good catch.  OK, Daniel, Mark has demonstrated that he was correct in
> > asking you for a proof: your proposal was wrong.

Daniel:
> Not quite, Mark has caught a case that is already handled properly.
> It is impossible for the type involved as a member to have been given an 
> alias set other than 0, by the same test.  So we would never improperly 
> generate code because of it.

My analysis was based on the pseudocode, where this this was not obvious.

But if this is the case, there is no bug, so it seems that Mark's pre-approval
would apply to your patch, and it can go in (for 3.2).



^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-25 15:41                     ` Neil Booth
@ 2002-01-25 16:04                       ` Joe Buck
  2002-01-25 17:37                         ` Paolo Carlini
                                           ` (2 more replies)
  0 siblings, 3 replies; 97+ messages in thread
From: Joe Buck @ 2002-01-25 16:04 UTC (permalink / raw)
  To: Neil Booth; +Cc: Mark Mitchell, Joe Buck, Paolo Carlini, gcc

> Mark Mitchell wrote:-
> 
> > OK, I like this argument.  *I* find it convincing, at least. :-)
> 
> [...]
> 
> > So, after the branch, let's have a patch to do this.
> 
> Great; I think that Joe's "proof" should go in as a comment; both to
> explain the ideas and subtleties behind the code, but also so we can
> point at it and extract a similar argument from anyone who wants to
> tighten the logic in future.

The argument can be extended to handle non-virtual single or multiple
inheritance: from the C point of view,

struct D : public A, public B {
	int m;
};

can be thought of as

struct D {
	struct A __base1;
	struct B __base2;
	int m;
};

and analyzed in the same way, meaning that D's alias set contains A, B,
and m.  But implementation-wise, that would mean that c_get_alias_set
would have to be told to treat base classes as members.





^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-25 16:04                       ` Joe Buck
@ 2002-01-25 17:37                         ` Paolo Carlini
  2002-01-25 18:10                         ` Daniel Berlin
  2002-01-27  5:11                         ` Mark Mitchell
  2 siblings, 0 replies; 97+ messages in thread
From: Paolo Carlini @ 2002-01-25 17:37 UTC (permalink / raw)
  To: Joe Buck; +Cc: gcc

Joe Buck wrote:

> The argument can be extended to handle non-virtual single or multiple
> inheritance: from the C point of view,
>
> struct D : public A, public B {
>         int m;
> };
>
> can be thought of as
>
> struct D {
>         struct A __base1;
>         struct B __base2;
>         int m;
> };
>
> and analyzed in the same way, meaning that D's alias set contains A, B,
> and m.  But implementation-wise, that would mean that c_get_alias_set
> would have to be told to treat base classes as members.

Great! Joe, *please* work on formalizing this: it would catch so *many* cases
present in real code!!

Cheers,
Paolo.


^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-25 16:04                       ` Joe Buck
  2002-01-25 17:37                         ` Paolo Carlini
@ 2002-01-25 18:10                         ` Daniel Berlin
  2002-01-27  5:11                         ` Mark Mitchell
  2 siblings, 0 replies; 97+ messages in thread
From: Daniel Berlin @ 2002-01-25 18:10 UTC (permalink / raw)
  To: Joe Buck; +Cc: Neil Booth, Mark Mitchell, Paolo Carlini, gcc


On Friday, January 25, 2002, at 06:23  PM, Joe Buck wrote:

>> Mark Mitchell wrote:-
>>
>>> OK, I like this argument.  *I* find it convincing, at least. :-)
>>
>> [...]
>>
>>> So, after the branch, let's have a patch to do this.
>>
>> Great; I think that Joe's "proof" should go in as a comment; both to
>> explain the ideas and subtleties behind the code, but also so we can
>> point at it and extract a similar argument from anyone who wants to
>> tighten the logic in future.
>
> The argument can be extended to handle non-virtual single or multiple
> inheritance: from the C point of view,
>
> struct D : public A, public B {
> 	int m;
> };
>
> can be thought of as
>
> struct D {
> 	struct A __base1;
> 	struct B __base2;
> 	int m;
> };
>
> and analyzed in the same way, meaning that D's alias set contains A, B,
> and m.  But implementation-wise, that would mean that c_get_alias_set
> would have to be told to treat base classes as members.

But I wonder if it's worth it without going whole hog at that point (and 
handling all cases properly). It stands to reason that the  more 
inheritance you have, the likelier you are to have some base class or 
member not meeting the standard, causing us to give up, and mark the 
whole thing as alias set 0.
I wonder how quick the gains diminish (IE how common it is to have 
inheritance without the other features we don't support).
--Dan
>
>
>
>

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-25 12:05         ` Mark Mitchell
@ 2002-01-25 22:14           ` Daniel Berlin
  2002-01-26  3:46             ` Mark Mitchell
  0 siblings, 1 reply; 97+ messages in thread
From: Daniel Berlin @ 2002-01-25 22:14 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: David Edelsohn, Paolo Carlini, gcc

> 
> None the less, I'm happy to provide a sketch.  I will do the version
> without restrict (that was added later) and without the assignment of
> alias sets to structs (Kenner added that later), and without
> type-punning for unions (this is optional under ANSI/ISO C).
> 
> This is from memory; there might be minor mistakes.  Also note that
> the code has changed considerably from my original version, which
> makes it harder to see the structure.
> 
> 1. The C aliasing rules say that if you reference memory using one
>    type, you may not reference it using another types, unless:
> 
>    - The types are signed/unsigned variants of each other.
> 
>    - The types very only in their cv-qualification.
> 
>    - One of them is (possibly cv-qualified) "char"
The void * type, as well, because it says "A type compatible with the 
effective type of the object." and "   [#1] A pointer to void may be 
converted to or from a pointer
        to  any  incomplete  or  object  type.   A  pointer  to  any
        incomplete or object type may be converted to a  pointer  to
        void  and  back again; the result shall compare equal to the
        original pointer." and "  [#2] Conversion of an operand value  to  
a  compatible  type  causes no change to the value or the 
representation." and "[#2] For two pointer types to be compatible, both  
shall  be identically   qualified   and  both  shall  be  pointers  to 
compatible types."
Which seems to imply that the void * type can legally alias anything, 
because it's compatible with everything (though it only appears to have 
all the properties of compatible types, i can't find a specific phrase to 
support that it *is* a compatible type of everything )

Pro64 (I haven't looked at other compilers) seems to agree with me too:

*   C.1: (ANSI Rules)
*
*     An object shall have its stored value accessed only by an lvalue 
that
*     has one of the following types:
*     *) the declared type of the object,
*     *) a qualified version of the declared type of the object,
*     *) a type that is signed or unsigned type corresponding to the 
declared type
*        of the object,
*     *) a type that is signed or unsigned type corresponding to a
*        qualified version of the declared type of the object,
*     *) an aggregrate or union type that includes one of the 
aforementioned types
*        among its members (including, recursively, a member of a 
subaggregate
*        or contained union),
*     *) a character type, or
*     *) a void type.
*
^^^^^^^^^^^^^^^^^^^^^
*     Use the Ragnarok interpretation here.  Objects are aliased if
*     their base types (MTYPES), after stripping off the qualifiers and
*     signed-ness, are equal.  See ANSI C 3.3 and 3.2.2.3.
*
*   C.2: (C Qualifier Rule)
*
*     C.2.1: (restricted pointer)
*       If both memory operations are restricted pointer dereference,
*       they are not aliased if their based pointer are different.
*


> 
> 2. Alias sets have the following semantics:
> 
>    - Two things in the same alias set may alias one another.
> 
>    - Things in two distinct alias sets may alias if one is
>      a "subset" of another, under transitive closure.
> 
>    - All alias sets are a subset of a special alias set
>      called "alias set zero".  (An immediate consequence is
>      that something in alias set zero can alias everything.)
> 
> Let T be the set of all C types.  Let TA be a relation on TxT such
> that (t1, t2) \in TA if and only if t and u may alias.
> 
> Similarly, let S be the set of all alias sets.  Let SA be relation on
> SxS that (s1, s2) \in SA if and only if s1 and s2 may alias.  (Note
> that in the original incarnation, there were no subsets other than
> the fact that everything was a subset of alias set zero, so this
> relation is well-defined statically.)
> 
> What we wish to prove is that C's lang_get_alias_set assigns
> alias sets to type safely.  In particular, let f be
> c_get_alias_set, and then:
> 
> Then, we wish to show that, for all t, u \in T:
> 
>    (t, u) \in TA \implies f(t), f(u)) \in SA
> 
> (We do not need if and only if for correctness.)
> 
> The proof is by induction.  All aggregate types are mapped
> to alias set zero which aliases everything; therefore, we
> need only consider non-aggregate types.  The code says:
> 
>   if (TREE_CODE (t) == INTEGER_TYPE && TREE_UNSIGNED (t))
>     {
>       tree t1 = signed_type (t);
> 
>       return get_alias_set (t1);
>     }
> 
> Therefore, signed and unsigned variants of types get the same alias
> set.
> 
> The code says:
> 
>   t = TYPE_MAIN_VARIANT (t);
>   if (TYPE_P (t) && TYPE_ALIAS_SET_KNOWN_P (t))
>     return TYPE_ALIAS_SET (t);
> 
> Therefore, if a cv-qualified type and its unqualified variant will
> get the same alias set.  By transitivity, so will all cv-qualified
> variants of the type.
> 
> The code says that:
> 
>   /* If this is a char *, the ANSI C standard says it can alias
>      anything.  Note that all references need do this.  */
>   if (TREE_CODE_CLASS (TREE_CODE (t)) == 'r'
>       && TREE_CODE (TREE_TYPE (t)) == INTEGER_TYPE
>       && TYPE_PRECISION (TREE_TYPE (t)) == TYPE_PRECISION (char_type_node))
>     return 0;
> 
> Therefore, "char" is mapped to alias set zero, completing the proof.

What about void * types, we never seem to handle that specifically?



> 
> --
> Mark Mitchell                   mark@codesourcery.com
> CodeSourcery, LLC               http://www.codesourcery.com
> 

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-25 22:14           ` Daniel Berlin
@ 2002-01-26  3:46             ` Mark Mitchell
  0 siblings, 0 replies; 97+ messages in thread
From: Mark Mitchell @ 2002-01-26  3:46 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: David Edelsohn, Paolo Carlini, gcc



--On Friday, January 25, 2002 09:19:43 PM -0500 Daniel Berlin 
<dan@dberlin.org> wrote:

>>
>> None the less, I'm happy to provide a sketch.  I will do the version
>> without restrict (that was added later) and without the assignment of
>> alias sets to structs (Kenner added that later), and without
>> type-punning for unions (this is optional under ANSI/ISO C).
>>
>> This is from memory; there might be minor mistakes.  Also note that
>> the code has changed considerably from my original version, which
>> makes it harder to see the structure.
>>
>> 1. The C aliasing rules say that if you reference memory using one
>>    type, you may not reference it using another types, unless:
>>
>>    - The types are signed/unsigned variants of each other.
>>
>>    - The types very only in their cv-qualification.
>>
>>    - One of them is (possibly cv-qualified) "char"

The alias set routines in question refer to the type of the
storage accessed, not the type of a pointer to that storage.

There is no storage of type "void", so there need be no special
handling for this case.

The type "void *" is not compatible with, say "int *", because
"void" is not compatible with "int".

> Which seems to imply that the void * type can legally alias anything,

This doesn't make sense; "void *" cannot alias "double".  A
"void *" can point to a "double", but a "void *" and a "double" cannot
occupy the same storage.

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-25 16:04                       ` Joe Buck
  2002-01-25 17:37                         ` Paolo Carlini
  2002-01-25 18:10                         ` Daniel Berlin
@ 2002-01-27  5:11                         ` Mark Mitchell
  2002-01-27  5:34                           ` Daniel Berlin
  2 siblings, 1 reply; 97+ messages in thread
From: Mark Mitchell @ 2002-01-27  5:11 UTC (permalink / raw)
  To: Joe Buck, Neil Booth; +Cc: Paolo Carlini, gcc

> The argument can be extended to handle non-virtual single or multiple
> inheritance: from the C point of view,

This isn't obvious yet.

You have to at least discuss zero-sized base classes and whether or not
GNU C handles them in the same way -- including cases like this:

  struct A {};
  struct B : public A {};
  struct C : public A {};
  struct D : public B, C {};

Here, D has size two to avoid having two A's at the same address.
If C did not derive from A, D would have size one.  In GNU C, does:

  struct A {};
  struct B { struct A __base1; };
  struct C { struct A __base1; };
  struct D {
     struct B __base1;
     struct C __base2;
  };

have size two?

Does the variant where C is empty have size 1?

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-27  5:11                         ` Mark Mitchell
@ 2002-01-27  5:34                           ` Daniel Berlin
  2002-01-28 10:39                             ` Joe Buck
  0 siblings, 1 reply; 97+ messages in thread
From: Daniel Berlin @ 2002-01-27  5:34 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: Joe Buck, Neil Booth, Paolo Carlini, gcc

On Sat, 26 Jan 2002, Mark Mitchell wrote:

> > The argument can be extended to handle non-virtual single or multiple
> > inheritance: from the C point of view,
> 
> This isn't obvious yet.
> 
> You have to at least discuss zero-sized base classes and whether or not
> GNU C handles them in the same way -- including cases like this:
> 
>   struct A {};
>   struct B : public A {};
>   struct C : public A {};
>   struct D : public B, C {};
> 
> Here, D has size two to avoid having two A's at the same address.
> If C did not derive from A, D would have size one.  In GNU C, does:
> 
>   struct A {};
>   struct B { struct A __base1; };
>   struct C { struct A __base1; };
>   struct D {
>      struct B __base1;
>      struct C __base2;
>   };
> 
> have size two?
> 
No, it has size 0.
> Does the variant where C is empty have size 1?
Nope.

However, this will just cause us to be more conservative than necessary. 
We'll think they have the same address, and thus, alias, rather than say 
they *don't* alias.

Right?



--Dan

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-27  5:34                           ` Daniel Berlin
@ 2002-01-28 10:39                             ` Joe Buck
  2002-01-28 10:51                               ` Joe Buck
  2002-01-28 15:59                               ` Mark Mitchell
  0 siblings, 2 replies; 97+ messages in thread
From: Joe Buck @ 2002-01-28 10:39 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: Mark Mitchell, Joe Buck, Neil Booth, Paolo Carlini, gcc

 
On Sat, 26 Jan 2002, Mark Mitchell wrote:
> > > The argument can be extended to handle non-virtual single or multiple
> > > inheritance: from the C point of view,

Daniel Berlin writes:
> > This isn't obvious yet.
> > 
> > You have to at least discuss zero-sized base classes and whether or not
> > GNU C handles them in the same way -- including cases like this:
> > 
> >   struct A {};
> >   struct B : public A {};
> >   struct C : public A {};
> >   struct D : public B, C {};
> > 
> > Here, D has size two to avoid having two A's at the same address.
> > If C did not derive from A, D would have size one.  In GNU C, does:
> > 
> >   struct A {};
> >   struct B { struct A __base1; };
> >   struct C { struct A __base1; };
> >   struct D {
> >      struct B __base1;
> >      struct C __base2;
> >   };
> > 
> > have size two?
> > 
> No, it has size 0.
> > Does the variant where C is empty have size 1?
> Nope.

But the C aliasing rules completely ignore the question of size, so you're
bringing up a red herring.  The rules have *nothing* to do with whether we
think that two objects have the same address!  It doesn't even come up.
For that reason, you don't have to waste any time thinking about class
layout.

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-28 10:39                             ` Joe Buck
@ 2002-01-28 10:51                               ` Joe Buck
  2002-01-28 15:59                               ` Mark Mitchell
  1 sibling, 0 replies; 97+ messages in thread
From: Joe Buck @ 2002-01-28 10:51 UTC (permalink / raw)
  To: Joe Buck
  Cc: Daniel Berlin, Mark Mitchell, Joe Buck, Neil Booth, Paolo Carlini, gcc

I wrote:

> But the C aliasing rules completely ignore the question of size, so you're
> bringing up a red herring.  The rules have *nothing* to do with whether we
> think that two objects have the same address!  It doesn't even come up.
> For that reason, you don't have to waste any time thinking about class
> layout.

This is true for the alias-sets computation: it depends only on types.
However, if we have more information about two accesses that allow
us to prove that they never overlap in memory, we can use that information
as well, and here the memory layout may make a difference.



^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-28 10:39                             ` Joe Buck
  2002-01-28 10:51                               ` Joe Buck
@ 2002-01-28 15:59                               ` Mark Mitchell
  2002-01-28 17:11                                 ` Daniel Berlin
  2002-01-28 17:18                                 ` Joe Buck
  1 sibling, 2 replies; 97+ messages in thread
From: Mark Mitchell @ 2002-01-28 15:59 UTC (permalink / raw)
  To: Joe Buck, Daniel Berlin; +Cc: Neil Booth, Paolo Carlini, gcc


[ zero-sized based class example elided ]

>
> But the C aliasing rules completely ignore the question of size, so you're
> bringing up a red herring.  The rules have *nothing* to do with whether we
> think that two objects have the same address!  It doesn't even come up.
> For that reason, you don't have to waste any time thinking about class
> layout.
>

Joe's argument is a reduction from C++ to C; the claim is that (basically)
the C++ alias rules are the same as C and that certain C++ classes are
laid out like certain C structs and that there are no additional
"dangerous" operations and therefore correctness for C implies correctness
for C++.

The argument doesn't work in this case -- the proposed isomorphism from
C++ classes to C structs is not correct.

Joe's argument either needs to handle zero-sized classes in some other
way (arguing that they're not dangerous for some other reason), or it
needs to not apply to class hierarchives that involve zero-sized bases,
treating such things as not simple_enough.

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-28 15:59                               ` Mark Mitchell
@ 2002-01-28 17:11                                 ` Daniel Berlin
  2002-01-28 17:28                                   ` Joe Buck
  2002-01-28 17:18                                 ` Joe Buck
  1 sibling, 1 reply; 97+ messages in thread
From: Daniel Berlin @ 2002-01-28 17:11 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: Joe Buck, Neil Booth, Paolo Carlini, gcc

On Mon, 28 Jan 2002, Mark Mitchell wrote:

> 
> [ zero-sized based class example elided ]
> 
> >
> > But the C aliasing rules completely ignore the question of size, so you're
> > bringing up a red herring.  The rules have *nothing* to do with whether we
> > think that two objects have the same address!  It doesn't even come up.
> > For that reason, you don't have to waste any time thinking about class
> > layout.
> >
> 
> Joe's argument is a reduction from C++ to C; the claim is that (basically)
> the C++ alias rules are the same as C and that certain C++ classes are
> laid out like certain C structs and that there are no additional
> "dangerous" operations and therefore correctness for C implies correctness
> for C++.

Um, that's Joe talking, i think we are getting attribution messed up. 
Unless you really meant to talk about Joe as "Joe's argument", rather than 
"His argument".


> 
> The argument doesn't work in this case -- the proposed isomorphism from
> C++ classes to C structs is not correct.

> 
> Joe's argument either needs to handle zero-sized classes in some other
> way (arguing that they're not dangerous for some other reason), or it
> needs to not apply to class hierarchives that involve zero-sized bases,
> treating such things as not simple_enough.

*I* already argued that they aren't dangerous for another reason.  In C, 
since they would have the same address, they would alias.  In reality, 
in C++, they *don't* alias because they *don't* have the same address.  
However, it's perfectly fine to say things alias that don't.
Thus, it's not dangerous.

--Dan

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-28 15:59                               ` Mark Mitchell
  2002-01-28 17:11                                 ` Daniel Berlin
@ 2002-01-28 17:18                                 ` Joe Buck
  2002-01-28 18:05                                   ` Mark Mitchell
  1 sibling, 1 reply; 97+ messages in thread
From: Joe Buck @ 2002-01-28 17:18 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: Joe Buck, Daniel Berlin, Neil Booth, Paolo Carlini, gcc

I wrote:
> > But the C aliasing rules completely ignore the question of size, so you're
> > bringing up a red herring.  The rules have *nothing* to do with whether we
> > think that two objects have the same address!  It doesn't even come up.
> > For that reason, you don't have to waste any time thinking about class
> > layout.

Mark writes:
> Joe's argument is a reduction from C++ to C; the claim is that (basically)
> the C++ alias rules are the same as C and that certain C++ classes are
> laid out like certain C structs and that there are no additional
> "dangerous" operations and therefore correctness for C implies correctness
> for C++.
> 
> The argument doesn't work in this case -- the proposed isomorphism from
> C++ classes to C structs is not correct.

But for the purpose of isomorphism one can ignore features that don't
matter.  Consider a translation from C++ to C that does not create
zero-sized structs.  A C++ implementation is free to waste space.  Such a
translation is legal and will produce a standard C program.  C's aliasing
analysis will work on such a program, correct?

Now, it seems you are telling me that you fear that the aliasing rules
could change if we optimize the implementation to not waste space,
effectively making some fields that originally had size > 0 now have
size 0.  But this cannot be the case, because both the C and the C++
aliasing rules are based solely on the types, not on the layouts.

> Joe's argument either needs to handle zero-sized classes in some other
> way (arguing that they're not dangerous for some other reason), or it
> needs to not apply to class hierarchives that involve zero-sized bases,
> treating such things as not simple_enough.

The basic argument is this: I am allowed to implement C++ in a way that
zero-sized classes don't exist (they always come out as at least one
byte).  For such implementations I can clearly use the C rules, as there
is an equivalent C program.  If I then replace the inefficient
implementation with an efficient one, the language specification does not
care, for the purpose of aliasing.  The type-based aliasing rules are
independent of implementation, so if they come out correctly for the
inefficient implementation, we get the same answer for the efficient one.

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-28 17:11                                 ` Daniel Berlin
@ 2002-01-28 17:28                                   ` Joe Buck
  2002-01-28 18:14                                     ` Daniel Berlin
  0 siblings, 1 reply; 97+ messages in thread
From: Joe Buck @ 2002-01-28 17:28 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: Mark Mitchell, Joe Buck, Neil Booth, Paolo Carlini, gcc

Daniel writes:
> *I* already argued that they aren't dangerous for another reason.  In C, 
> since they would have the same address, they would alias.  In reality, 
> in C++, they *don't* alias because they *don't* have the same address.  
> However, it's perfectly fine to say things alias that don't.
> Thus, it's not dangerous.

Excuse me, but the whole point of strict aliasing is that certain things
are treated by the compiler based solely on their types, even if they have
the same address (for such cases it is considered a user bug).  For example,
an access to an int can never alias an access to a float, period.  If a
user writes code that violates this rule, it is not valid ISO C or C++,
and they need to write -fno-strict-aliasing if they want GCC to avoid
"miscompiling".

Thus for any discussion of cxx_get_alias_set, addresses DON'T MATTER.
All that matters are types and nothing else.

Now, if the compiler can separately determine that two accesses can never
have overlapping addresses, it can assume that these accesses do not
alias.  But that needs to be handled elsewhere in the compiler, and not
by xxx_get_alias_set.

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-28 19:33                                       ` Mark Mitchell
@ 2002-01-28 17:40                                         ` Daniel Berlin
  2002-01-28 21:55                                           ` Daniel Berlin
  2002-01-28 22:02                                         ` Alexandre Oliva
  1 sibling, 1 reply; 97+ messages in thread
From: Daniel Berlin @ 2002-01-28 17:40 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: Joe Buck, Neil Booth, Paolo Carlini, gcc

On Mon, 28 Jan 2002, Mark Mitchell wrote:

> 
> 
> --On Monday, January 28, 2002 05:11:02 PM -0800 Joe Buck 
> <jbuck@synopsys.COM> wrote:
> 
> >
> >> > The basic argument is this: I am allowed to implement C++ in a way that
> >> > zero-sized classes don't exist (they always come out as at least one
> >> > byte).  For such implementations I can clearly use the C rules, as
> >> > there is an equivalent C program.  If I then replace the inefficient
> >>
> >> [Implicitly, we're still in the simple_enough case here, i.e., no
> >> virtuals, etc.  Perhaps the argument applies more generally, but we
> >> don't know yet.]
> >
> >> OK, you've convinced me to extend simple_enough to the cases with
> >> zero-sized thingies, so long as the other conditions still hold.
> >
> > We can handle inheritance as well if the zero-sized object is OK, since
> > the C translation is to make the base class look like a member of the
> > derived class.  This will allow us to do better on most STL iterators;
> > many are derived classes but they have no virtual functions.
> 
> I agree; single, non-virtual inheritance should be OK.  In fact, even
> multiple inheritance should be OK, as long as there is no virtual
> inheritance anywhere in the hierarchy.
> 
> Once we introduce vtables, more complicated things can happen.  (We
> have to make sure that the vptr fields are in the same alias set.
> We may already do this; I remember fixing some problem like this
> at some point.)  If we have virtual bases, things are very complex;
> layouts are semi-dynamic.  It may be that it "just works", but I
> will continue to play devil's advocate.  You can continu to make
> persusasive arguments for safety. :-)
> 
> I remember another problem with base classes: they don't appear in
> the list of FIELDS on the class.  (The FIELD_DECLS are made and then
> thrown away since they are no longer needed.)

>  Therefore, the stock
> back-end record_component_aliases stuff won't work.  (This is not
> an algorithmic issue, like the ones we've been discussing before;
> this is a nitty-gritty here-we-are-in-GCC sort of thing.)

However, nowadays, the tree field you need isn't only part of the C++ 
frontend (IE it's common to tree.h).

Something like:

 /* Recursively record aliases for the base classes, if there are any */
    if (TYPE_BINFO_BASETYPES (t) != NULL)
      {
        for (i = 0; i < TREE_VEC_LENGTH (TYPE_BINFO_BASETYPES (t)); i++)
        {
          binfo = TREE_VEC_ELT (TYPE_BINFO_BASETYPES (t), i);
          record_alias_subset (<whatever>, get_alias_set (BINFO_TYPE (binfo)));
        }
      }
Ought to do it.

--Dan

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-28 17:18                                 ` Joe Buck
@ 2002-01-28 18:05                                   ` Mark Mitchell
  2002-01-28 18:50                                     ` Joe Buck
  0 siblings, 1 reply; 97+ messages in thread
From: Mark Mitchell @ 2002-01-28 18:05 UTC (permalink / raw)
  To: Joe Buck; +Cc: Daniel Berlin, Neil Booth, Paolo Carlini, gcc

> The basic argument is this: I am allowed to implement C++ in a way that
> zero-sized classes don't exist (they always come out as at least one
> byte).  For such implementations I can clearly use the C rules, as there
> is an equivalent C program.  If I then replace the inefficient

[Implicitly, we're still in the simple_enough case here, i.e., no virtuals,
etc.  Perhaps the argument applies more generally, but we don't know yet.]

> implementation with an efficient one, the language specification does not
> care, for the purpose of aliasing.  The type-based aliasing rules are
> independent of implementation, so if they come out correctly for the
> inefficient implementation, we get the same answer for the efficient one.

OK, I'll buy that.

I think we need to make explicit Kenner's point; namely that our alias
sets do not talk about addresses.  In particular, Kenner's semantics
for alias sets allow the case that two MEMs do not alias (because they
are in different alias sets), but that the addresses of those things are
the same.  Therefore, the compiler is *not* allowed to turn the
following pseudo-code:

  if ((void*) x != (void*)y)
    abort();

into:

  abort();

even if it knows that "*x" and "*y" do not alias.  (That makes sense
because:

  int i;
  int *ip = &i;
  short *sp = (short*) &i;

  if ((void*) ip != (void*) sp)
    abort();

is well-defined -- assuming typical implementation definitions
regarding pointer widths, conversions, etc. -- and does not call
abort.)

We need this fact to make your proof safe -- but it is indeed the
right semantics and it is how we use alias sets at present.  We
should just put this into the comments in alias.c somewhere.

OK, you've convinced me to extend simple_enough to the cases with
zero-sized thingies, so long as the other conditions still hold.

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-28 17:28                                   ` Joe Buck
@ 2002-01-28 18:14                                     ` Daniel Berlin
  0 siblings, 0 replies; 97+ messages in thread
From: Daniel Berlin @ 2002-01-28 18:14 UTC (permalink / raw)
  To: Joe Buck; +Cc: Mark Mitchell, Neil Booth, Paolo Carlini, gcc

On Mon, 28 Jan 2002, Joe Buck wrote:

> Daniel writes:
> > *I* already argued that they aren't dangerous for another reason.  In C, 
> > since they would have the same address, they would alias.  In reality, 
> > in C++, they *don't* alias because they *don't* have the same address.  
> > However, it's perfectly fine to say things alias that don't.
> > Thus, it's not dangerous.
> 
> Excuse me, but the whole point of strict aliasing is that certain things
> are treated by the compiler based solely on their types, even if they have
> the same address (for such cases it is considered a user bug).  For example,
> an access to an int can never alias an access to a float, period.  If a
> user writes code that violates this rule, it is not valid ISO C or C++,
> and they need to write -fno-strict-aliasing if they want GCC to avoid
> "miscompiling".
> 
Right, but this is why i said they weren't dangerous for "another" 
reason.

--Dan

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-28 18:05                                   ` Mark Mitchell
@ 2002-01-28 18:50                                     ` Joe Buck
  2002-01-28 19:33                                       ` Mark Mitchell
  0 siblings, 1 reply; 97+ messages in thread
From: Joe Buck @ 2002-01-28 18:50 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: Joe Buck, Daniel Berlin, Neil Booth, Paolo Carlini, gcc


> > The basic argument is this: I am allowed to implement C++ in a way that
> > zero-sized classes don't exist (they always come out as at least one
> > byte).  For such implementations I can clearly use the C rules, as there
> > is an equivalent C program.  If I then replace the inefficient
> 
> [Implicitly, we're still in the simple_enough case here, i.e., no virtuals,
> etc.  Perhaps the argument applies more generally, but we don't know yet.]

> OK, you've convinced me to extend simple_enough to the cases with
> zero-sized thingies, so long as the other conditions still hold.

We can handle inheritance as well if the zero-sized object is OK, since
the C translation is to make the base class look like a member of the
derived class.  This will allow us to do better on most STL iterators;
many are derived classes but they have no virtual functions.

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-28 18:50                                     ` Joe Buck
@ 2002-01-28 19:33                                       ` Mark Mitchell
  2002-01-28 17:40                                         ` Daniel Berlin
  2002-01-28 22:02                                         ` Alexandre Oliva
  0 siblings, 2 replies; 97+ messages in thread
From: Mark Mitchell @ 2002-01-28 19:33 UTC (permalink / raw)
  To: Joe Buck; +Cc: Daniel Berlin, Neil Booth, Paolo Carlini, gcc



--On Monday, January 28, 2002 05:11:02 PM -0800 Joe Buck 
<jbuck@synopsys.COM> wrote:

>
>> > The basic argument is this: I am allowed to implement C++ in a way that
>> > zero-sized classes don't exist (they always come out as at least one
>> > byte).  For such implementations I can clearly use the C rules, as
>> > there is an equivalent C program.  If I then replace the inefficient
>>
>> [Implicitly, we're still in the simple_enough case here, i.e., no
>> virtuals, etc.  Perhaps the argument applies more generally, but we
>> don't know yet.]
>
>> OK, you've convinced me to extend simple_enough to the cases with
>> zero-sized thingies, so long as the other conditions still hold.
>
> We can handle inheritance as well if the zero-sized object is OK, since
> the C translation is to make the base class look like a member of the
> derived class.  This will allow us to do better on most STL iterators;
> many are derived classes but they have no virtual functions.

I agree; single, non-virtual inheritance should be OK.  In fact, even
multiple inheritance should be OK, as long as there is no virtual
inheritance anywhere in the hierarchy.

Once we introduce vtables, more complicated things can happen.  (We
have to make sure that the vptr fields are in the same alias set.
We may already do this; I remember fixing some problem like this
at some point.)  If we have virtual bases, things are very complex;
layouts are semi-dynamic.  It may be that it "just works", but I
will continue to play devil's advocate.  You can continu to make
persusasive arguments for safety. :-)

I remember another problem with base classes: they don't appear in
the list of FIELDS on the class.  (The FIELD_DECLS are made and then
thrown away since they are no longer needed.)  Therefore, the stock
back-end record_component_aliases stuff won't work.  (This is not
an algorithmic issue, like the ones we've been discussing before;
this is a nitty-gritty here-we-are-in-GCC sort of thing.)

-- 
Mark Mitchell                mark@codesourcery.com
CodeSourcery, LLC            http://www.codesourcery.com

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-28 17:40                                         ` Daniel Berlin
@ 2002-01-28 21:55                                           ` Daniel Berlin
  0 siblings, 0 replies; 97+ messages in thread
From: Daniel Berlin @ 2002-01-28 21:55 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: Joe Buck, Neil Booth, Paolo Carlini, gcc

On Mon, 28 Jan 2002, Mark Mitchell wrote:

> 
> 
> --On Monday, January 28, 2002 05:11:02 PM -0800 Joe Buck 
> <jbuck@synopsys.COM> wrote:
> 
> >
> >> > The basic argument is this: I am allowed to implement C++ in a way that
> >> > zero-sized classes don't exist (they always come out as at least one
> >> > byte).  For such implementations I can clearly use the C rules, as
> >> > there is an equivalent C program.  If I then replace the inefficient
> >>
> >> [Implicitly, we're still in the simple_enough case here, i.e., no
> >> virtuals, etc.  Perhaps the argument applies more generally, but we
> >> don't know yet.]
> >
> >> OK, you've convinced me to extend simple_enough to the cases with
> >> zero-sized thingies, so long as the other conditions still hold.
> >
> > We can handle inheritance as well if the zero-sized object is OK, since
> > the C translation is to make the base class look like a member of the
> > derived class.  This will allow us to do better on most STL iterators;
> > many are derived classes but they have no virtual functions.
> 
> I agree; single, non-virtual inheritance should be OK.  In fact, even
> multiple inheritance should be OK, as long as there is no virtual
> inheritance anywhere in the hierarchy.
> 
> Once we introduce vtables, more complicated things can happen.  (We
> have to make sure that the vptr fields are in the same alias set.
> We may already do this; I remember fixing some problem like this
> at some point.)  If we have virtual bases, things are very complex;
> layouts are semi-dynamic.  It may be that it "just works", but I
> will continue to play devil's advocate.  You can continu to make
> persusasive arguments for safety. :-)
> 
> I remember another problem with base classes: they don't appear in
> the list of FIELDS on the class.  (The FIELD_DECLS are made and then
> thrown away since they are no longer needed.)

>  Therefore, the stock
> back-end record_component_aliases stuff won't work.  (This is not
> an algorithmic issue, like the ones we've been discussing before;
> this is a nitty-gritty here-we-are-in-GCC sort of thing.)

However, nowadays, the tree field you need isn't only part of the C++ 
frontend (IE it's common to tree.h).

Something like:

 /* Recursively record aliases for the base classes, if there are any */
    if (TYPE_BINFO_BASETYPES (t) != NULL)
      {
        for (i = 0; i < TREE_VEC_LENGTH (TYPE_BINFO_BASETYPES (t)); i++)
        {
          binfo = TREE_VEC_ELT (TYPE_BINFO_BASETYPES (t), i);
          record_alias_subset (<whatever>, get_alias_set (BINFO_TYPE (binfo)));
        }
      }
Ought to do it.

--Dan

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-28 19:33                                       ` Mark Mitchell
  2002-01-28 17:40                                         ` Daniel Berlin
@ 2002-01-28 22:02                                         ` Alexandre Oliva
  2002-01-28 22:12                                           ` Mark Mitchell
  1 sibling, 1 reply; 97+ messages in thread
From: Alexandre Oliva @ 2002-01-28 22:02 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: Joe Buck, Daniel Berlin, Neil Booth, Paolo Carlini, gcc

On Jan 28, 2002, Mark Mitchell <mark@codesourcery.com> wrote:

> Once we introduce vtables, more complicated things can happen.  (We
> have to make sure that the vptr fields are in the same alias set.
> We may already do this; I remember fixing some problem like this
> at some point.)  If we have virtual bases, things are very complex;
> layouts are semi-dynamic.  It may be that it "just works", but I
> will continue to play devil's advocate.  You can continu to make
> persusasive arguments for safety. :-)

It can still be mapped to C struct, given appropriate types for the
vptr fields, right?  So Joe's point still holds.  We may have to be
careful about how we assign aliases to these hidden fields, but it's
not like they are not representable in C :-)

-- 
Alexandre Oliva   Enjoy Guarana', see http://www.ic.unicamp.br/~oliva/
Red Hat GCC Developer                  aoliva@{cygnus.com, redhat.com}
CS PhD student at IC-Unicamp        oliva@{lsd.ic.unicamp.br, gnu.org}
Free Software Evangelist    *Please* write to mailing lists, not to me

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-28 22:02                                         ` Alexandre Oliva
@ 2002-01-28 22:12                                           ` Mark Mitchell
  0 siblings, 0 replies; 97+ messages in thread
From: Mark Mitchell @ 2002-01-28 22:12 UTC (permalink / raw)
  To: Alexandre Oliva; +Cc: Joe Buck, Daniel Berlin, Neil Booth, Paolo Carlini, gcc

>
> It can still be mapped to C struct, given appropriate types for the
> vptr fields, right?  So Joe's point still holds.  We may have to be
> careful about how we assign aliases to these hidden fields, but it's
> not like they are not representable in C :-)

Yes, probably.

I'm nervous about assigning alias sets to class with virtual bases
because the layout changes when that class is derived from.  So, when
you say "*p" where "p" points to some class with virtual bases, you
have to make sure to be conservative; we have to figure out what all
*p might involve.  I guess that if we just recursively include
everything in all the virtual bases we are probably OK.  Sometimes,
that stuff won't be there, but that's OK; that will only yield false
positives, which are safe.

So, I'm gradually coming to the conclusion that some smarter version
of record_component_aliases should work; it just needs to be very
careful to go through everything that that appears in all the bases
and members.

-- 
Mark Mitchell                mark@codesourcery.com
CodeSourcery, LLC            http://www.codesourcery.com

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-27 17:04               ` Dan Nicolaescu
@ 2002-01-27 17:59                 ` Paolo Carlini
  0 siblings, 0 replies; 97+ messages in thread
From: Paolo Carlini @ 2002-01-27 17:59 UTC (permalink / raw)
  To: Dan Nicolaescu; +Cc: gcc

Dan Nicolaescu wrote:

> We can relax the checks ss soon as we convinve ourselves that more
> cases are acceptable.
> Is a patch like this acceptable at this point, or should I wait until
> 3.1 branches?

Hi Dan,

it's nice that, after some many words, you are willing to actually implement
what Daniel, Joe and the others have proposed and discussed!

However, I think that you should wait a little more:

    http://gcc.gnu.org/ml/gcc/2002-01/msg01685.html

Please, post some actual testsuite and benchmarking results as soon as you have
them!

Cheers,
Paolo.


^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-25 20:22             ` Joe Buck
  2002-01-25 23:59               ` Daniel Berlin
@ 2002-01-27 17:04               ` Dan Nicolaescu
  2002-01-27 17:59                 ` Paolo Carlini
  1 sibling, 1 reply; 97+ messages in thread
From: Dan Nicolaescu @ 2002-01-27 17:04 UTC (permalink / raw)
  To: Joe Buck; +Cc: Mark Mitchell, Daniel Berlin, gcc

Joe Buck <jbuck@synopsys.com> writes:

  > > I bootstrapped this version on sparc-sun-solaris2.8:
  > > 
  > > {
  > >   if (AGGREGATE_TYPE_P (t) &&
  > >       !((TREE_CODE (t) == RECORD_TYPE)
  > >         && !CLASSTYPE_NON_POD_P(t)
  > >         && !CLASSTYPE_N_BASECLASSES(t)))
  > >       return 0;
  > >   }
  > 
  > CLASSNAME_NON_POD_P returns false if there are any methods or if there
  > is a constructor or destructor, right?  If so, then I think we're going
  > to get little gain: much of the payoff from better aliasing in C++ will
  > come if we don't get false aliasing complaints for things like STL
  > iterators.  At least in my code, I don't have many NON_PODs; even if
  > it's a plain struct otherwise, I use a constructor to initialize it.


The above code is intended to be very conservative, to make g++
compile C code at least as good as gcc does 
(see  http://gcc.gnu.org/ml/gcc/2002-01/msg01478.html)

We can relax the checks ss soon as we convinve ourselves that more
cases are acceptable. 
Is a patch like this acceptable at this point, or should I wait until
3.1 branches? 

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-25 20:22             ` Joe Buck
@ 2002-01-25 23:59               ` Daniel Berlin
  2002-01-27 17:04               ` Dan Nicolaescu
  1 sibling, 0 replies; 97+ messages in thread
From: Daniel Berlin @ 2002-01-25 23:59 UTC (permalink / raw)
  To: Joe Buck; +Cc: Dan Nicolaescu, Mark Mitchell, gcc

On Fri, 25 Jan 2002, Joe Buck wrote:

> 
> > I bootstrapped this version on sparc-sun-solaris2.8:
> > 
> > {
> >   if (AGGREGATE_TYPE_P (t) &&
> >       !((TREE_CODE (t) == RECORD_TYPE)
> >         && !CLASSTYPE_NON_POD_P(t)
> >         && !CLASSTYPE_N_BASECLASSES(t)))
> >       return 0;
> >   }
> 
> CLASSNAME_NON_POD_P returns false if there are any methods or if there
> is a constructor or destructor, right?  If so, then I think we're going
> to get little gain: much of the payoff from better aliasing in C++ will
> come if we don't get false aliasing complaints for things like STL
> iterators.  At least in my code, I don't have many NON_PODs; even if
> it's a plain struct otherwise, I use a constructor to initialize it.
Constructors and destructors should be fine, well, sorry, *non-virtual* 
destructors should be fine.

Unless adding non-virtual destructors and constructors affect object 
layout in memory somehow.


> 
> On the other hand, some of the STL iterators are derived classes.  Sigh.
> 

One thing at a time, of course.


^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-25 15:48           ` Dan Nicolaescu
@ 2002-01-25 20:22             ` Joe Buck
  2002-01-25 23:59               ` Daniel Berlin
  2002-01-27 17:04               ` Dan Nicolaescu
  0 siblings, 2 replies; 97+ messages in thread
From: Joe Buck @ 2002-01-25 20:22 UTC (permalink / raw)
  To: Dan Nicolaescu; +Cc: Mark Mitchell, Daniel Berlin, gcc


> I bootstrapped this version on sparc-sun-solaris2.8:
> 
> {
>   if (AGGREGATE_TYPE_P (t) &&
>       !((TREE_CODE (t) == RECORD_TYPE)
>         && !CLASSTYPE_NON_POD_P(t)
>         && !CLASSTYPE_N_BASECLASSES(t)))
>       return 0;
>   }

CLASSNAME_NON_POD_P returns false if there are any methods or if there
is a constructor or destructor, right?  If so, then I think we're going
to get little gain: much of the payoff from better aliasing in C++ will
come if we don't get false aliasing complaints for things like STL
iterators.  At least in my code, I don't have many NON_PODs; even if
it's a plain struct otherwise, I use a constructor to initialize it.

On the other hand, some of the STL iterators are derived classes.  Sigh.

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-24 15:36         ` Mark Mitchell
  2002-01-25  2:25           ` Daniel Berlin
@ 2002-01-25 15:48           ` Dan Nicolaescu
  2002-01-25 20:22             ` Joe Buck
  1 sibling, 1 reply; 97+ messages in thread
From: Dan Nicolaescu @ 2002-01-25 15:48 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: Daniel Berlin, gcc

Mark Mitchell <mark@codesourcery.com> writes:

  > > An incremental improvement would be to allow at least C-type structs.
  > >
  > > {
  > >   if (AGGREGATE_TYPE_P (t) &&
  > >      IS_A_CLASS_DERIVED_FROM_ANOTHER_CLASS_P (t)) /* [1] */
  > >     return 0;
  > >
  > >   return c_common_get_alias_set (t);
  > > }
  > >
  > > should be safe because:
  > > a) the predicate [1] is true for any aggregate that is not a C-type struct
  > > b) c_common_get_alias_set deals with C-type structs correctly
  > > c) C-type structs cannot alias derived classes because the later are
  > >    put in alias set 0 because of [1]
  > >
  > 
  > See this is where things get subtle.  You have to (at least) worry
  > about whether or not t is zero-sized or has zero-sized bases or members.
  > If it does, there may be other zero-sized things at the same address.
  > If that's so, then if you put the zero-sized things in different alias
  > sets, you're saying they never alias.  Now, obviously, we never read or
  > write zero-sized things -- but different alias sets also implies that
  > &x != &y which is false.


I bootstrapped this version on sparc-sun-solaris2.8:

{
  if (AGGREGATE_TYPE_P (t) &&
      !((TREE_CODE (t) == RECORD_TYPE)
        && !CLASSTYPE_NON_POD_P(t)
        && !CLASSTYPE_N_BASECLASSES(t)))
      return 0;
  }
 
 return c_common_get_alias_set (t);
}

This is quite conservative, (the CLASSTYPE_N_BASECLASSES test might not
be needed). 
Is it acceptable? 

The test should filter out all the non

AGGREGATE_TYPE_P(TYPE) is defined as: 
  (TREE_CODE (TYPE) == ARRAY_TYPE || TREE_CODE (TYPE) == RECORD_TYPE \
   || TREE_CODE (TYPE) == UNION_TYPE || TREE_CODE (TYPE) == QUAL_UNION_TYPE \
   || TREE_CODE (TYPE) == SET_TYPE)

I am not sure which of these apply to C++, it seems that only
RECORD_TYPE and UNION_TYPE should be taken into consideration. Is that
true? 


The test results are:


                === g++ Summary ===

# of expected passes            6805
# of unexpected failures        18
# of expected failures          95
# of untested testcases         19
# of unsupported tests          1

Unfortunately I don't have the results before the change. 
I'll bootstrap again without the change and rerun the tests, this
will take a few hours.

        --dan

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-25  7:33 ` Daniel Berlin
@ 2002-01-25 15:43   ` Daniel Berlin
  0 siblings, 0 replies; 97+ messages in thread
From: Daniel Berlin @ 2002-01-25 15:43 UTC (permalink / raw)
  To: Richard Kenner; +Cc: gcc


On Friday, January 25, 2002, at 09:31  AM, Richard Kenner wrote:

>     Errr, not really.
>     They just happen to be our representation of aliases for types.
>
> In some sense, but they avoid undecidability problems because you just
> need to look at language semantics, not run-time behavior of a
> particular program.
>
They don't avoid undecidability, or else we wouldn't have alias set 0.
If we could determine what everything aliases, we wouldn't need a set 
that aliases everything.
We can't.
Undecidability is not just based on run-time behavior of the program.  
It's also undecidable in the static case.
may-alias is undecidable statically, as is must-alias (Unless i'm 
misremembering).
IIRC, these  were proven on either C or modula-3, so the language 
semantics don't help us at all.

> The issue with alias sets is whether you can prove from the language
> specification that two objects *cannot* alias each other.  If and only
> if you can, they are in different alias sets.

Of course.
I know how alias sets in gcc/g++ work, i know how non-type-based alias 
analysis works.
I know how non-type-based alias analysis with type based disambiguation 
work.
That doesn't make it any easier.
--Dan

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
@ 2002-01-25 14:49 mike stump
  0 siblings, 0 replies; 97+ messages in thread
From: mike stump @ 2002-01-25 14:49 UTC (permalink / raw)
  To: dje, jbuck, mark; +Cc: gcc, pcarlini

> Date: Fri, 25 Jan 2002 11:44:21 -0800
> From: Mark Mitchell <mark@codesourcery.com>
> To: Joe Buck <jbuck@synopsys.COM>, David Edelsohn <dje@watson.ibm.com>
> cc: Paolo Carlini <pcarlini@unitus.it>, "gcc@gcc.gnu.org" <gcc@gcc.gnu.org>

> Your proof has at least one bug.  A type that has no baseclasses or
> virtuals can contain (as a data member) a type that does; such a
> type is at least as complex as the contained type.  (Similarly, an
> array of classes with virtual bases, etc.)  You need to recurse
> through the type structure.

Ah, but I believe the CLASSTYPE_NON_POD_P test may conservative
enough in this case, in particular, it does:

      if (! pod_type_p (type))
        /* DR 148 now allows pointers to members (which are POD themselves),
           to be allowed in POD structs.  */
	   CLASSTYPE_NON_POD_P (t) = 1;

in class.c, thus cutting off the recursive case of members.

The cases of virtuals and bases are handled by:

  CLASSTYPE_NON_POD_P (t)
    |= (CLASSTYPE_NON_AGGREGATE (t) || TYPE_HAS_DESTRUCTOR (t) 
    || TYPE_HAS_ASSIGN_REF (t));

and for CLASSTYPE_NON_AGGREGATE in the case of bases, by:

  /* An aggregate cannot have baseclasses.  */
  CLASSTYPE_NON_AGGREGATE (t) |= (n_baseclasses != 0);

and in the case of virtuals, by:

  CLASSTYPE_NON_AGGREGATE (t) |= (TYPE_HAS_CONSTRUCTOR (t)
				   || TYPE_POLYMORPHIC_P (t));

We can see the limiting of virtuals by:

/* Nonzero if this class has a virtual function table pointer.  */
#define TYPE_CONTAINS_VPTR_P(NODE)		\
  (TYPE_POLYMORPHIC_P (NODE)					\
   || TYPE_USES_VIRTUAL_BASECLASSES (NODE))

So, if we believe it, we know that either TYPE_POLYMORPHIC_P is true,
or TYPE_USES_VIRTUAL_BASECLASSES is true, if the type has a vtable
pointer.  TYPE_USES_VIRTUAL_BASECLASSES should only be true, if the
type has base classes.  We can see this as this is only set in:

       if (via_virtual || TYPE_USES_VIRTUAL_BASECLASSES (basetype))
           {
	         TYPE_USES_VIRTUAL_BASECLASSES (ref) = 1;

in decl.c, and we can only get there if we process a basetype.

> It may be that with that change, your sketch is correct -- but the
> fact that you missed this point just makes me more nervous.

I found it trivial to spot; it hit me the same second his words/code
first hit my eyes, though, yes, there may be other trivial things
about it that we have missed.

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-25 13:57     ` Gabriel Dos Reis
@ 2002-01-25 14:47       ` Tim Hollebeek
  0 siblings, 0 replies; 97+ messages in thread
From: Tim Hollebeek @ 2002-01-25 14:47 UTC (permalink / raw)
  To: Gabriel Dos Reis; +Cc: Daniel Berlin, Richard Kenner, pcarlini, gcc

> 
> | Mark's claim that if the underlying algorithm is easy, reasoning about it 
> | should be, is also not quite right.
> 
> A claim I completely agree with.
> 
> | Take Fermat's theorem, for instance.
> 
> Oh, come on.
> 
> Proof by analogy is fraud.  Secondly, Fermat's theorem is *not* an
> algorithm -- it is an *existential* (or non-existential) theorem
> without any algorithm specification --- there is no remote relation
> about the Fermat's theorem and the concrete issue we have at hand.

Replace Fermat's theorem with Collatz's sequence (aka Hasse's
_algorithm_).  It's even simpler to state than Fermat, but harder to
prove [1].

The only way to get around this fact is to claim that either (1)
reasoning about recursion is generally hard, or (2) proving an
algorithm terminates is generally hard, which gives me a sense of deja
vu ... =)

So, it is definitely true that reasoning about simple algorithms may
be very difficult.  Q.E.D.

[1] based on the fact it is still unproved, while Fermat is not.

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-25  8:11 ` Daniel Berlin
@ 2002-01-25 14:09   ` Gabriel Dos Reis
  0 siblings, 0 replies; 97+ messages in thread
From: Gabriel Dos Reis @ 2002-01-25 14:09 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: Robert Dewar, kenner, gcc

Daniel Berlin <dan@dberlin.org> writes:

| On Fri, 25 Jan 2002, Robert Dewar wrote:
| 
| > I agree with Richard, there are no undecidability problems here, we are
| > talking about static proofs that two objects cannot be aliased.
| 
| You would be incorrect.
| Type based aliasing says "we have some sets of types that we know
| alias each other,  

A more accurate statement is "we have sets of types that we know *may*
alias each other..."

-- Gaby

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-25  7:17   ` Daniel Berlin
@ 2002-01-25 13:57     ` Gabriel Dos Reis
  2002-01-25 14:47       ` Tim Hollebeek
  0 siblings, 1 reply; 97+ messages in thread
From: Gabriel Dos Reis @ 2002-01-25 13:57 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: Gabriel Dos Reis, Richard Kenner, pcarlini, gcc

Daniel Berlin <dan@dberlin.org> writes:

| On 25 Jan 2002, Gabriel Dos Reis wrote:
| 
| > kenner@vlsi1.ultra.nyu.edu (Richard Kenner) writes:
| > 
| > [...]
| > 
| > | I am very much against the idea of defining a change to be "correct" if
| > | it doesn't cause any regression test failures.  You have to be able to make
| > | an argument that a change is correct independently and the regression tests
| > | serve as a debugger of (among other things) that proof.
| > 
| > I completely agree with Kenner and Mark.  Given, current ABIs
| > supported by g++, aliasing detection is a very subtle issue and we
| > should resist from the temptation of not proving that our algorithms
| > are correct;
| Aliasing is very hard to reason about formally, because no matter what you 
| do, you start running into the undecidability issue.
| In fact, in papers on static type determination for C++ (Do a search on 
| researchindex.org), i've yet to see a *single* formal proof of any kind 
| offered that they are correct.

That is where llies the line between writing a paper to talk about
something and do the actual implementations.  For some reasons, papers
have to abstract over some (important) details.  Something you cannot
aford for in *actual* implementations.

[...]

| Mark's claim that if the underlying algorithm is easy, reasoning about it 
| should be, is also not quite right.

A claim I completely agree with.

| Take Fermat's theorem, for instance.

Oh, come on.

Proof by analogy is fraud.  Secondly, Fermat's theorem is *not* an
algorithm -- it is an *existential* (or non-existential) theorem
without any algorithm specification --- there is no remote relation
about the Fermat's theorem and the concrete issue we have at hand.

-- Gaby

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-25 12:23 Robert Dewar
@ 2002-01-25 13:29 ` Joe Buck
  0 siblings, 0 replies; 97+ messages in thread
From: Joe Buck @ 2002-01-25 13:29 UTC (permalink / raw)
  To: Robert Dewar; +Cc: dje, jbuck, gcc, mark, pcarlini


> >>While this is not a formal proof, it's enough to convince me that Daniel's
> change is safe.
> 
> It is "convincing" that is important, a formal proof is a means to an end
> not an end in itself :-)

Fortunately Mark was not convinced.


^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
@ 2002-01-25 12:23 Robert Dewar
  2002-01-25 13:29 ` Joe Buck
  0 siblings, 1 reply; 97+ messages in thread
From: Robert Dewar @ 2002-01-25 12:23 UTC (permalink / raw)
  To: dje, jbuck; +Cc: gcc, mark, pcarlini

>>While this is not a formal proof, it's enough to convince me that Daniel's
change is safe.

It is "convincing" that is important, a formal proof is a means to an end
not an end in itself :-)

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
@ 2002-01-25 12:06 mike stump
  0 siblings, 0 replies; 97+ messages in thread
From: mike stump @ 2002-01-25 12:06 UTC (permalink / raw)
  To: dan, dewar; +Cc: gcc, kenner

> Date: Fri, 25 Jan 2002 11:34:48 -0500 (EST)
> From: Daniel Berlin <dan@dberlin.org>
> To: Robert Dewar <dewar@gnat.com>
> cc: gcc@gcc.gnu.org, <kenner@vlsi1.ultra.nyu.edu>

> > And we need to prove that they are indeed
> > conservative estimates. We need to be able to prove that elements of
> > separate sets are indeed never aliased. 
> Herein lies the point you've ignored.  Using only the C++ language 
> standard, we can't do that, and it's quite possible we may place 
> things into separate sets that do in fact alias, because of the 
> implementation specific details.

To get concrete here, and because I didn't like how Dan worded this,
an example of such an issue is the reuse of stack slots for values
which have different alias sets.  Reusing stack slots is beyond the
standard (at least for C and C++), but reuse is mandated by users
expectation (to some degree).

However, this implementation choice does force us to think about
complex alias issues:

  { int i; i = 1; ... }
  { double d; d = 24; ... }

If one reuses the space for i for d, then one has to be careful to not
allow the write of i to be moved past any playing with d.  They
conflict (alias), even though through the rules of ANSI C, you will
not find a statement that they conflict.

I think this is the canonical example of what he is talking about,
Daniel, can you confirm this?

Also, think about inlining two routines into a single function:

  void foo1() { int i; i =1; ... }
  void foo2() { double d; d = 24; ... }

  void bar() { foo1(); foo2(); }

If we reuse the stack space for one, for the other one...

In the end, we have to ensure that the implementation is aliasing
correct.  If it is, then the only additional rules we need, are those
based upon mandates by the language.  Or put another, given an
aliasing correct backend, we need only consider rules from the
language spec.

There can be existing code in the backend, that is aliasing correct
for the existing ways in which we use it, but for which it is not
aliasing correct for more advanced uses.  This means, if we change how
think about aliasing in the frontend, we can introduce aliasing bugs.
An example of this would be reusing stack slots, and having a frontend
always use set 0.  The backend would be alias correct for how the
frontend used it.  And it would seem natural to put in more smarts
about aliasing into the frontend, but this would cause the stack reuse
to then be aliasing incorrect.

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-25  8:32 Robert Dewar
  2002-01-25  8:53 ` Daniel Berlin
@ 2002-01-25  9:39 ` Joe Buck
  1 sibling, 0 replies; 97+ messages in thread
From: Joe Buck @ 2002-01-25  9:39 UTC (permalink / raw)
  To: Robert Dewar; +Cc: dan, dewar, gcc, kenner

Robert writes:

> You are still hung up on the idea that the problem we are trying to
> solve is to determine whether two items are aliased. That's NOT the
> problem here.  THe problem is to determine sets of items that are
> provably not aliased. These are quite different problems, and you are
> getting confused between them.

Sorry I didn't see this message before I wrote mine; Robert expresses it
exactly correctly.  And all other problems gcc may encounter that involve
undecidability get treated in the same way (handle cases you can't analyze
conservatively).



^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
@ 2002-01-25  9:13 Robert Dewar
  0 siblings, 0 replies; 97+ messages in thread
From: Robert Dewar @ 2002-01-25  9:13 UTC (permalink / raw)
  To: dan, dewar; +Cc: gcc, kenner

<<Which, if i wasn't busy with law school and the bugzilla stuff, is likely
what i'd do instead of writing up proofs, since these algorithms have
already been proven, and I think most of them can handle structure field
aliasing as a basic case anway, without any regards as to the actual type
of the structure field. Though obviously not as well as you can do once
you've got type based disambiguation to tell you for certain either way.
>>

Sounds like a perfectly reasonable approach to me!

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-25  7:05 Richard Kenner
@ 2002-01-25  8:59 ` Paolo Carlini
  0 siblings, 0 replies; 97+ messages in thread
From: Paolo Carlini @ 2002-01-25  8:59 UTC (permalink / raw)
  To: Richard Kenner; +Cc: gcc

Richard Kenner wrote:

>Indeed, I strongly suspect that if it were possible to do better alias
>set handling in C++, it would produce a far larger performance
>improvement than any of the Intermediate Language Transformations
>being discussed and would also likely be a lot less work.
>
This is *exactly* the kind of authoritative statement which is able to 
give voice to my intuitive feelings.

Thanks,
Paolo.


^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-25  8:35 Robert Dewar
@ 2002-01-25  8:54 ` Daniel Berlin
  0 siblings, 0 replies; 97+ messages in thread
From: Daniel Berlin @ 2002-01-25  8:54 UTC (permalink / raw)
  To: Robert Dewar; +Cc: gcc, kenner

On Fri, 25 Jan 2002, Robert Dewar wrote:

> <<I've been told i need to come up with a formal proof that given certain
> relationships between C++ types, that may-alias is always determinable
> statically correctly, or that we always correctly determine that we can't
> determine it (IE never claim wrong that things may not alias).
> I've said before, and i'll say again, that i'm not going to do that.
> >>
> 
> THat's fine, but the fact that you are not going to do it does not mean
> that it is not desirable or not necessary!
But it's *not* necessary when we've restricted ourselves to a subset of 
C++ structs and classes that correspond to C structs and classes!


^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-25  8:32 Robert Dewar
@ 2002-01-25  8:53 ` Daniel Berlin
  2002-01-25  9:39 ` Joe Buck
  1 sibling, 0 replies; 97+ messages in thread
From: Daniel Berlin @ 2002-01-25  8:53 UTC (permalink / raw)
  To: Robert Dewar; +Cc: gcc, kenner

On Fri, 25 Jan 2002, Robert Dewar wrote:

> <<> The problem before us is to narrow down the may-alias relationship as far
> > as possible statically. There is no issue of undecidability here.
> Of course there is, which is why you conservatively assume that everything
> aliases everything else until you prove otherwise.
> I've been told i need to come up with a formal proof that given certain
> relationships between C++ types, that may-alias is always determinable
> statically correctly, or that we always correctly determine that we can't
> determine it (IE never claim wrong that things may not alias).
> I've said before, and i'll say again, that i'm not going to do that.
> >>
> 
> Nope, this is plain confused. 
No, once again, it's not.
> It is undecidable in general whether two
> objects will be aliased at run time (that's so trivial to prove that it
> is a silly and useless observation).
> 
> But it is not a problem, it merely says that the alias sets we create are
> simply conservative estimates. 
Yes.
> And we need to prove that they are indeed
> conservative estimates. We need to be able to prove that elements of
> separate sets are indeed never aliased. 
Herein lies the point you've ignored.  Using only the C++ language 
standard, we can't do that, and it's quite possible we may place 
things into separate sets that do in fact alias, because of the 
implementation specific details.
> The fact that we can't prove that
> elements of the *same* set *are* aliased is obvioulsy and trivially true,
> but quite irrelevant.
Yes.

> 
> <<But claiming that undecidability doesn't enter anyway is simply wrong.
> I'm claiming that undecidability enters the picture in another way as
> well. The C++ language specification does not give you enough information
> in our case to determine that we can determine it or not. We can't always
> say whether or not we can say whether two pieces of two types alias.
> >>
> 
> You are still hung up on the idea that the problem we are trying to solve
> is to determine whether two items are aliased. That's NOT the problem here.
> THe problem is to determine sets of items that are provably not aliased. These
> are quite different problems, and you are getting confused between them.
No, i'm not confused at all, or hung up at all.
You have simply missed the point that the language semantics of C++ alone 
don't give you enough to get the sets of items that don't alias 
*correctly*, because they don't tell you the implementation specific details.
The question is whether our ABI gives us enough, and is written in such a 
way that does enable us to determine the sets of non-aliasing items 
*correctly*.

> 

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-25  8:28 Robert Dewar
@ 2002-01-25  8:49 ` Daniel Berlin
  0 siblings, 0 replies; 97+ messages in thread
From: Daniel Berlin @ 2002-01-25  8:49 UTC (permalink / raw)
  To: Robert Dewar; +Cc: gcc, kenner

On Fri, 25 Jan 2002, Robert Dewar wrote:

> <<The language semantics don't necessarily give you enough to be able to
> place all types in the right sets. If it doesn't, you are back to
> undecidability.
> >>
> 
> Of course it is undecidable in the dynamic sense, but there we are not
> interested in "the right sets", we are interested in sets that we can
> prove are disjoint. It is quite "right" to put everything in the same
> set, just not very efficient. We are NOT looking for an optimal solution,
> here, that's obvious to anyone that that's unattainable.
Of course.

> 
> What we are looking for is improved, verifiable, principles for splitting
> the sets more finely.
> 
> Sometimes I really think they should not teach anyone about undecidability.
> It always ends up with people looking at a perfectly simple problem like
> this one (simple conceptually, not simple to get good solutions to), and
> worrying about undecidability when it is a complete red herring.
Not here, however.
> 
> If you propose that two items are not aliased, and it is undecidable whether
> they are aliased, that's fine, it just means they go in the same alias set,
> no big deal!

However, the important point is that the language semantics don't allow 
us to get it *right*, not just deciding. We may get it *wrong*, and think 
we have it *right*.
--Dan

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
@ 2002-01-25  8:35 Robert Dewar
  2002-01-25  8:54 ` Daniel Berlin
  0 siblings, 1 reply; 97+ messages in thread
From: Robert Dewar @ 2002-01-25  8:35 UTC (permalink / raw)
  To: dan, dewar; +Cc: gcc, kenner

<<I've been told i need to come up with a formal proof that given certain
relationships between C++ types, that may-alias is always determinable
statically correctly, or that we always correctly determine that we can't
determine it (IE never claim wrong that things may not alias).
I've said before, and i'll say again, that i'm not going to do that.
>>

THat's fine, but the fact that you are not going to do it does not mean
that it is not desirable or not necessary!

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
@ 2002-01-25  8:33 Richard Kenner
  0 siblings, 0 replies; 97+ messages in thread
From: Richard Kenner @ 2002-01-25  8:33 UTC (permalink / raw)
  To: dan; +Cc: gcc

    They don't avoid undecidability, or else we wouldn't have alias set 0.
    If we could determine what everything aliases, we wouldn't need a set 
    that aliases everything.

I didn't say they *avoid* undecidability, just avoid the undecidability
*problem* by separating static language semantics from all other aspects
of alising.

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
@ 2002-01-25  8:32 Robert Dewar
  2002-01-25  8:53 ` Daniel Berlin
  2002-01-25  9:39 ` Joe Buck
  0 siblings, 2 replies; 97+ messages in thread
From: Robert Dewar @ 2002-01-25  8:32 UTC (permalink / raw)
  To: dan, dewar; +Cc: gcc, kenner

<<> The problem before us is to narrow down the may-alias relationship as far
> as possible statically. There is no issue of undecidability here.
Of course there is, which is why you conservatively assume that everything
aliases everything else until you prove otherwise.
I've been told i need to come up with a formal proof that given certain
relationships between C++ types, that may-alias is always determinable
statically correctly, or that we always correctly determine that we can't
determine it (IE never claim wrong that things may not alias).
I've said before, and i'll say again, that i'm not going to do that.
>>

Nope, this is plain confused. It is undecidable in general whether two
objects will be aliased at run time (that's so trivial to prove that it
is a silly and useless observation).

But it is not a problem, it merely says that the alias sets we create are
simply conservative estimates. And we need to prove that they are indeed
conservative estimates. We need to be able to prove that elements of
separate sets are indeed never aliased. The fact that we can't prove that
elements of the *same* set *are* aliased is obvioulsy and trivially true,
but quite irrelevant.

<<But claiming that undecidability doesn't enter anyway is simply wrong.
I'm claiming that undecidability enters the picture in another way as
well. The C++ language specification does not give you enough information
in our case to determine that we can determine it or not. We can't always
say whether or not we can say whether two pieces of two types alias.
>>

You are still hung up on the idea that the problem we are trying to solve
is to determine whether two items are aliased. That's NOT the problem here.
THe problem is to determine sets of items that are provably not aliased. These
are quite different problems, and you are getting confused between them.

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
@ 2002-01-25  8:28 Robert Dewar
  2002-01-25  8:49 ` Daniel Berlin
  0 siblings, 1 reply; 97+ messages in thread
From: Robert Dewar @ 2002-01-25  8:28 UTC (permalink / raw)
  To: dan, dewar; +Cc: gcc, kenner

<<The language semantics don't necessarily give you enough to be able to
place all types in the right sets. If it doesn't, you are back to
undecidability.
>>

Of course it is undecidable in the dynamic sense, but there we are not
interested in "the right sets", we are interested in sets that we can
prove are disjoint. It is quite "right" to put everything in the same
set, just not very efficient. We are NOT looking for an optimal solution,
here, that's obvious to anyone that that's unattainable.

What we are looking for is improved, verifiable, principles for splitting
the sets more finely.

Sometimes I really think they should not teach anyone about undecidability.
It always ends up with people looking at a perfectly simple problem like
this one (simple conceptually, not simple to get good solutions to), and
worrying about undecidability when it is a complete red herring.

If you propose that two items are not aliased, and it is undecidable whether
they are aliased, that's fine, it just means they go in the same alias set,
no big deal!

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-25  8:18 ` Daniel Berlin
@ 2002-01-25  8:20   ` Daniel Berlin
  0 siblings, 0 replies; 97+ messages in thread
From: Daniel Berlin @ 2002-01-25  8:20 UTC (permalink / raw)
  To: Robert Dewar; +Cc: kenner, gcc

On Fri, 25 Jan 2002, Daniel Berlin wrote:

> On Fri, 25 Jan 2002, Robert Dewar wrote:
> 
> > <<Undecidability is not just based on run-time behavior of the program.
> > It's also undecidable in the static case.
> > may-alias is undecidable statically, as is must-alias (Unless i'm
> > misremembering).
> > >>
> > 
> > This is confused.
> No, it's not, it's perfectly true.
> > 
> > May-alias is a predicate with many possible solutions. One not very useful
> > solution is that anything may alias anything else. We are not interested
> > in whether something actually IS aliased at run time. We are interested
> > just in the subset of cases that we can prove do NOT alias. 
> Well, duh.
> > 
> > The problem before us is to narrow down the may-alias relationship as far
> > as possible statically. There is no issue of undecidability here.
> Of course there is, which is why you conservatively assume that everything 
> aliases everything else until you prove otherwise.
> I've been told i need to come up with a formal proof that given certain 
> relationships between C++ types, that may-alias is always determinable 
> statically correctly, or that we always correctly determine that we can't 
> determine it (IE never claim wrong that things may not alias).
> I've said before, and i'll say again, that i'm not going to do that.
> 
> But claiming that undecidability doesn't enter anyway is simply wrong. 
> I'm claiming that undecidability enters the picture in another way as 
> well. The C++ language specification does not give you enough information 
> in our case to determine that we can determine it or not. We can't always 
> say whether or not we can say whether two pieces of two types alias.
> Our C++ ABI may or may not allow us to do it, since it gives us some, 
> possibly all, of the information  the C++ standard leaves to 
> implementations.
> I'm not going to spend time trying to formally prove it one way or the 
> other.
> 
> Which brings us back to the reason i suggested we improve it for a 
> restricted subset of C++ classes that we already handle properly for C.
Pardon, Actually, the other Dan suggested it, and I said I thought it was 
a good idea, which is what started the whole discussion.

In hindsight, i should never have replied to mark, and just left C++ TBAA 
in the mostly non-existent state it is in now, for another 5 years.

To give you some idea about how annoying it is to prove it one way or 
another, it would likely take less time, and more productive, to implement 
one of the newer interprocedural, incremental, pointer analysis algorithms (be 
they flow-sensitive or flow-insensitive), and use the TBAA as a 
disambiguation, rather than *as* our  points-to sets.

That way, you'd both improve our aliasing for C, as well as give us some 
information on aliasing for C++, without dealing with type issues at all.

Which, if i wasn't busy with law school and the bugzilla stuff, is likely 
what i'd do instead of writing up proofs, since these algorithms have 
already been proven, and I think most of them can handle structure field 
aliasing as a basic case anway, without any regards as to the actual type 
of the structure field. Though obviously not as well as you can do once 
you've got type based disambiguation to tell you for certain either way.


--Dan

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-25  7:51 Robert Dewar
@ 2002-01-25  8:18 ` Daniel Berlin
  2002-01-25  8:20   ` Daniel Berlin
  0 siblings, 1 reply; 97+ messages in thread
From: Daniel Berlin @ 2002-01-25  8:18 UTC (permalink / raw)
  To: Robert Dewar; +Cc: kenner, gcc

On Fri, 25 Jan 2002, Robert Dewar wrote:

> <<Undecidability is not just based on run-time behavior of the program.
> It's also undecidable in the static case.
> may-alias is undecidable statically, as is must-alias (Unless i'm
> misremembering).
> >>
> 
> This is confused.
No, it's not, it's perfectly true.
> 
> May-alias is a predicate with many possible solutions. One not very useful
> solution is that anything may alias anything else. We are not interested
> in whether something actually IS aliased at run time. We are interested
> just in the subset of cases that we can prove do NOT alias. 
Well, duh.
> 
> The problem before us is to narrow down the may-alias relationship as far
> as possible statically. There is no issue of undecidability here.
Of course there is, which is why you conservatively assume that everything 
aliases everything else until you prove otherwise.
I've been told i need to come up with a formal proof that given certain 
relationships between C++ types, that may-alias is always determinable 
statically correctly, or that we always correctly determine that we can't 
determine it (IE never claim wrong that things may not alias).
I've said before, and i'll say again, that i'm not going to do that.

But claiming that undecidability doesn't enter anyway is simply wrong. 
I'm claiming that undecidability enters the picture in another way as 
well. The C++ language specification does not give you enough information 
in our case to determine that we can determine it or not. We can't always 
say whether or not we can say whether two pieces of two types alias.
Our C++ ABI may or may not allow us to do it, since it gives us some, 
possibly all, of the information  the C++ standard leaves to 
implementations.
I'm not going to spend time trying to formally prove it one way or the 
other.

Which brings us back to the reason i suggested we improve it for a 
restricted subset of C++ classes that we already handle properly for C.

--Dan

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-25  7:38 Robert Dewar
@ 2002-01-25  8:11 ` Daniel Berlin
  2002-01-25 14:09   ` Gabriel Dos Reis
  0 siblings, 1 reply; 97+ messages in thread
From: Daniel Berlin @ 2002-01-25  8:11 UTC (permalink / raw)
  To: Robert Dewar; +Cc: kenner, gcc

On Fri, 25 Jan 2002, Robert Dewar wrote:

> I agree with Richard, there are no undecidability problems here, we are
> talking about static proofs that two objects cannot be aliased.

You would be incorrect.
Type based aliasing says "we have some sets of types that we know alias each other, 
if you have one of these types, and we can tell you whether you may 
alias some other set of types".
The language semantics don't necessarily give you enough to be able to 
place all types in the right sets. If it doesn't, you are back to 
undecidability.
In the case of C++, they don't.
Given no C++ ABI restrictions, it's undecidable, because we don't know 
what pieces of vtables, etc, are shared. You cannot determine, based soley 
on what you find in the C++ standard, whether two types alias, for all 
cases.  Thus, you cannot determine properly, using only the C++ standard, 
the alias set a type necessarily belongs to.  Worse, using only the C++ 
standard, you might end up being just plain wrong, and saying two things 
alias that don't, or the other way around.  It's completely implementation 
dependent.
Nothing in the C++ language will let me prove whether or not two objects share a vtable or not.
The question is whether our C++ abi provides enough to enable us to build 
all the sets properly.
The whole issue is whether i'm going to write up a formal proof of whether 
we our C++ ABI lets us do it for some restricted subset, using C 
semantics.
I'm not going to write that up.
It's just too difficult to go about formally proving.

> Either we can prove that, or we assume that they are aliased. The issue of whether
> they are *really* aliased at run time is interesting, but irrelevant.

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
@ 2002-01-25  7:51 Robert Dewar
  2002-01-25  8:18 ` Daniel Berlin
  0 siblings, 1 reply; 97+ messages in thread
From: Robert Dewar @ 2002-01-25  7:51 UTC (permalink / raw)
  To: dan, kenner; +Cc: gcc

<<Undecidability is not just based on run-time behavior of the program.
It's also undecidable in the static case.
may-alias is undecidable statically, as is must-alias (Unless i'm
misremembering).
>>

This is confused.

May-alias is a predicate with many possible solutions. One not very useful
solution is that anything may alias anything else. We are not interested
in whether something actually IS aliased at run time. We are interested
just in the subset of cases that we can prove do NOT alias. 

The problem before us is to narrow down the may-alias relationship as far
as possible statically. There is no issue of undecidability here. If we
propose that may-alias (a,b) is false, then either we can prove it or
we cannot. If we cannot, then we cannot proceed on the basis that
may-alias (a,b) is false.

I mean by prove here: demonstrate in a manner that generates sufficient
confidence.

A very formal mathematical proof might or might not suffice (if a proof is
too complex, it does not generate confidence, since, like a complex program
it may have a hard to find bug). We can hardly talk about machine verified
proofs in this context.

A demonstration MIGHT come from a test suite if the test suite was
sufficiently comprehensive in some appropriate sense (after all, testing
is the main method for proof of reliability of safety critical software,
such as is used in nuclear power plants and avionics, but it is pretty
formal comprehensive testing, requiring e.g. full coverage testing, including
full MCDC testing).

I think what is being said here is that people do not feel that the existing
test suite, which was not designed at all to test reliability of aliasing
analysis, is anywhere near meeting the criterion of generating sufficient
confidence. 

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
@ 2002-01-25  7:38 Robert Dewar
  2002-01-25  8:11 ` Daniel Berlin
  0 siblings, 1 reply; 97+ messages in thread
From: Robert Dewar @ 2002-01-25  7:38 UTC (permalink / raw)
  To: dan, kenner; +Cc: gcc

I agree with Richard, there are no undecidability problems here, we are
talking about static proofs that two objects cannot be aliased. Either
we can prove that, or we assume that they are aliased. The issue of whether
they are *really* aliased at run time is interesting, but irrelevant.

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-25  7:30 Richard Kenner
@ 2002-01-25  7:33 ` Daniel Berlin
  2002-01-25 15:43   ` Daniel Berlin
  0 siblings, 1 reply; 97+ messages in thread
From: Daniel Berlin @ 2002-01-25  7:33 UTC (permalink / raw)
  To: Richard Kenner; +Cc: gcc


On Friday, January 25, 2002, at 09:31  AM, Richard Kenner wrote:

>     Errr, not really.
>     They just happen to be our representation of aliases for types.
>
> In some sense, but they avoid undecidability problems because you just
> need to look at language semantics, not run-time behavior of a
> particular program.
>
They don't avoid undecidability, or else we wouldn't have alias set 0.
If we could determine what everything aliases, we wouldn't need a set 
that aliases everything.
We can't.
Undecidability is not just based on run-time behavior of the program.  
It's also undecidable in the static case.
may-alias is undecidable statically, as is must-alias (Unless i'm 
misremembering).
IIRC, these  were proven on either C or modula-3, so the language 
semantics don't help us at all.

> The issue with alias sets is whether you can prove from the language
> specification that two objects *cannot* alias each other.  If and only
> if you can, they are in different alias sets.

Of course.
I know how alias sets in gcc/g++ work, i know how non-type-based alias 
analysis works.
I know how non-type-based alias analysis with type based disambiguation 
work.
That doesn't make it any easier.
--Dan

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
@ 2002-01-25  7:30 Richard Kenner
  0 siblings, 0 replies; 97+ messages in thread
From: Richard Kenner @ 2002-01-25  7:30 UTC (permalink / raw)
  To: dan; +Cc: gcc

    Errr, not really.
    They just happen to be our representation of aliases for types.

In some sense, but they avoid undecidability problems because you just
need to look at language semantics, not run-time behavior.

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
@ 2002-01-25  7:30 Richard Kenner
  2002-01-25  7:33 ` Daniel Berlin
  0 siblings, 1 reply; 97+ messages in thread
From: Richard Kenner @ 2002-01-25  7:30 UTC (permalink / raw)
  To: dan; +Cc: gcc

    Errr, not really.
    They just happen to be our representation of aliases for types.

In some sense, but they avoid undecidability problems because you just
need to look at language semantics, not run-time behavior of a
particular program.

The issue with alias sets is whether you can prove from the language
specification that two objects *cannot* alias each other.  If and only
if you can, they are in different alias sets.

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-25  7:23 Richard Kenner
@ 2002-01-25  7:24 ` Daniel Berlin
  0 siblings, 0 replies; 97+ messages in thread
From: Daniel Berlin @ 2002-01-25  7:24 UTC (permalink / raw)
  To: Richard Kenner; +Cc: gcc


On Friday, January 25, 2002, at 08:41  AM, Richard Kenner wrote:

>     Aliasing is very hard to reason about formally, because no matter 
> what you
>     do, you start running into the undecidability issue.
>
> Sure, but we're talking about alias *sets*, which is a different issue.

Errr, not really.
They just happen to be our representation of aliases for types.

--Dan

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
@ 2002-01-25  7:23 Richard Kenner
  2002-01-25  7:24 ` Daniel Berlin
  0 siblings, 1 reply; 97+ messages in thread
From: Richard Kenner @ 2002-01-25  7:23 UTC (permalink / raw)
  To: dan; +Cc: gcc

    Aliasing is very hard to reason about formally, because no matter what you 
    do, you start running into the undecidability issue.

Sure, but we're talking about alias *sets*, which is a different issue.

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-25  2:16 ` Gabriel Dos Reis
  2002-01-25  3:04   ` Paolo Carlini
@ 2002-01-25  7:17   ` Daniel Berlin
  2002-01-25 13:57     ` Gabriel Dos Reis
  1 sibling, 1 reply; 97+ messages in thread
From: Daniel Berlin @ 2002-01-25  7:17 UTC (permalink / raw)
  To: Gabriel Dos Reis; +Cc: Richard Kenner, pcarlini, gcc

On 25 Jan 2002, Gabriel Dos Reis wrote:

> kenner@vlsi1.ultra.nyu.edu (Richard Kenner) writes:
> 
> [...]
> 
> | I am very much against the idea of defining a change to be "correct" if
> | it doesn't cause any regression test failures.  You have to be able to make
> | an argument that a change is correct independently and the regression tests
> | serve as a debugger of (among other things) that proof.
> 
> I completely agree with Kenner and Mark.  Given, current ABIs
> supported by g++, aliasing detection is a very subtle issue and we
> should resist from the temptation of not proving that our algorithms
> are correct;
Aliasing is very hard to reason about formally, because no matter what you 
do, you start running into the undecidability issue.
In fact, in papers on static type determination for C++ (Do a search on 
researchindex.org), i've yet to see a *single* formal proof of any kind 
offered that they are correct.  They make statements about what language 
features they support, but never *why* or even prove that they support 
them.
It is simply assumed they are conservative enough.
If you think i'm going to write a formal proof that our aliasing for C 
structs, and thus, for the restricted case we are trying to make gcc 
handle for C++, you would be incorrect. I have neither the time, nor the 
inclination. It has nothing to do with whether i value correctness for 
speed, and everything to do with the fact that i'm not going to get myself 
into the kind of time I see something like that taking.
It's, IMHO, not something that as mark claims, is reasonably enough 
defined to do such a thing. If it was, others would have done it before.  
Mark's claim that if the underlying algorithm is easy, reasoning about it 
should be, is also not quite right.
Take Fermat's theorem, for instance.
So IMHO, that ends the discussion of trying to improve g++ TBAA 
for me.
I'm happy to let you guys require that someone formally reason about this 
stuff (I don't think it'll ever happen, but hey, i'm young and 
idealistic. Oh, wait...). It's just not gonna be me.

>  simply because correctness should come first, speed later.Of course.
> That doesn't mean I'm against any effort to improve alias analysis in
> g++, I'm simply against a change which doesn't consider correctness as
> serious issue.
What the heck is that supposed to mean?
Nobody has said correctness is not a serious issue.
I just don't think for simple cases, that it's a particularly *hard* 
issue.
Which is why we are talking about handling cases where what we have are 
effectively C  structs in C++ (IE nothing that inherits from anything 
else), which we handle properly for C already.
*Anything* is better than nothing, which is what we have now.
Formally proving things about simple C++ class aliasing, however, is not 
something i'm going to attempt.
So it'll have to wait for someone else.

Back to bugzilla work,
Dan

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
@ 2002-01-25  7:05 Richard Kenner
  2002-01-25  8:59 ` Paolo Carlini
  0 siblings, 1 reply; 97+ messages in thread
From: Richard Kenner @ 2002-01-25  7:05 UTC (permalink / raw)
  To: gdr; +Cc: gcc

    Optimizations have high priorities (cf. regular patches and
    discussions about such and such Intermdiate Language Tranformations).
    It may be that an optimization like alias analysis doesn't have the
    highest priority but that doesn't mean notbody isn't consider it --

Indeed, I strongly suspect that if it were possible to do better alias
set handling in C++, it would produce a far larger performance
improvement than any of the Intermediate Language Transformations
being discussed and would also likely be a lot less work.

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-25  4:35       ` Paolo Carlini
@ 2002-01-25  6:34         ` Daniel Berlin
  0 siblings, 0 replies; 97+ messages in thread
From: Daniel Berlin @ 2002-01-25  6:34 UTC (permalink / raw)
  To: Paolo Carlini; +Cc: Gabriel Dos Reis, gcc

On Fri, 25 Jan 2002, Paolo Carlini wrote:

> Gabriel Dos Reis wrote:
> 
> >hey, isn't Mark who
> >implemented the type-based analysis in g++? ;-)
> >
> I know that, Gaby.
> 
> And I already complimented him for his beautiful article in Dr Dobb's 
> where he described the whole issue! (it never occurred to me before that 
> C-type languages could be optimized better than fortran!)
>  
> ... however, I learned a few months ago from some of Kenner's postings 
> (and my personal experimentation) that right now the C++ alias analysis 
> *actually* implemented in g++ is not so strong after all... :-(
In fact, it's turned off for structs, classes, and, well, any aggregate 
type.
This is the problem. 
--Dan

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-25  4:17     ` Gabriel Dos Reis
@ 2002-01-25  4:35       ` Paolo Carlini
  2002-01-25  6:34         ` Daniel Berlin
  0 siblings, 1 reply; 97+ messages in thread
From: Paolo Carlini @ 2002-01-25  4:35 UTC (permalink / raw)
  To: Gabriel Dos Reis; +Cc: gcc

Gabriel Dos Reis wrote:

>hey, isn't Mark who
>implemented the type-based analysis in g++? ;-)
>
I know that, Gaby.

And I already complimented him for his beautiful article in Dr Dobb's 
where he described the whole issue! (it never occurred to me before that 
C-type languages could be optimized better than fortran!)
 
... however, I learned a few months ago from some of Kenner's postings 
(and my personal experimentation) that right now the C++ alias analysis 
*actually* implemented in g++ is not so strong after all... :-(

Cheers,
Paolo.
 

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-25  3:04   ` Paolo Carlini
@ 2002-01-25  4:17     ` Gabriel Dos Reis
  2002-01-25  4:35       ` Paolo Carlini
  0 siblings, 1 reply; 97+ messages in thread
From: Gabriel Dos Reis @ 2002-01-25  4:17 UTC (permalink / raw)
  To: Paolo Carlini; +Cc: Gabriel Dos Reis, gcc

Paolo Carlini <pcarlini@unitus.it> writes:

[...]

| My sympathy with Dan's message was due to the fact that in the last 
| months I got the impression that C++-alias analysis may be really 
| crucial in order to effectively optimize some codes whereas it looks 
| like, for some reason nobody among the knowledgeable people is actively 
| working on it... 

I think Mark and Kenner would like to see GCC performance improve.
I think that is the case for every GCC developper.  However, limited
ressources mean things have to be assigned priority.  Optimizations
have high priorities (cf. regular patches and discussions about such
and such Intermdiate Language Tranformations).  It may be that
an optimization like alias analysis doesn't have the highest priority
but that doesn't mean notbody isn't consider it -- hey, isn't Mark who
implemented the type-based analysis in g++? ;-)

-- Gaby

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-25  2:16 ` Gabriel Dos Reis
@ 2002-01-25  3:04   ` Paolo Carlini
  2002-01-25  4:17     ` Gabriel Dos Reis
  2002-01-25  7:17   ` Daniel Berlin
  1 sibling, 1 reply; 97+ messages in thread
From: Paolo Carlini @ 2002-01-25  3:04 UTC (permalink / raw)
  To: Gabriel Dos Reis; +Cc: gcc

Gabriel Dos Reis wrote:

>kenner@vlsi1.ultra.nyu.edu (Richard Kenner) writes:
>
>| I am very much against the idea of defining a change to be "correct" if
>| it doesn't cause any regression test failures.  You have to be able to make
>| an argument that a change is correct independently and the regression tests
>| serve as a debugger of (among other things) that proof.
>
>I completely agree with Kenner and Mark.  Given, current ABIs
>supported by g++, aliasing detection is a very subtle issue and we
>should resist from the temptation of not proving that our algorithms
>are correct; simply because correctness should come first, speed later.
>That doesn't mean I'm against any effort to improve alias analysis in
>g++, I'm simply against a change which doesn't consider correctness as
>serious issue.
>
In this connection I would like to clearly state that indeed I agree 
with all of you. By the way, I'm doing proofs myself daily in my 
research work and I *understand* the power of abstract reasoning!

My sympathy with Dan's message was due to the fact that in the last 
months I got the impression that C++-alias analysis may be really 
crucial in order to effectively optimize some codes whereas it looks 
like, for some reason nobody among the knowledgeable people is actively 
working on it... I took Dan's words as a spur for the C++ developers to 
start working again on that without being scared by the first technical 
difficulties!!

Cheers,
Paolo.


^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-24 15:36         ` Mark Mitchell
@ 2002-01-25  2:25           ` Daniel Berlin
  2002-01-25 15:48           ` Dan Nicolaescu
  1 sibling, 0 replies; 97+ messages in thread
From: Daniel Berlin @ 2002-01-25  2:25 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: Dan Nicolaescu, gcc

On Thu, 24 Jan 2002, Mark Mitchell wrote:

> 
> > An incremental improvement would be to allow at least C-type structs.
> >
> > {
> >   if (AGGREGATE_TYPE_P (t) &&
> >      IS_A_CLASS_DERIVED_FROM_ANOTHER_CLASS_P (t)) /* [1] */
> >     return 0;
> >
> >   return c_common_get_alias_set (t);
> > }
> >
> > should be safe because:
> > a) the predicate [1] is true for any aggregate that is not a C-type struct
> > b) c_common_get_alias_set deals with C-type structs correctly
> > c) C-type structs cannot alias derived classes because the later are
> >    put in alias set 0 because of [1]
> >
> 
> See this is where things get subtle.  You have to (at least) worry
> about whether or not t is zero-sized or has zero-sized bases or members.
Not any more than we worry about it in C.
And we are already giving up on the case where we are derived from another 
class, we are just trying to handle simple classes/structs here, so we 
need not worry about bases.  A c 
struct can have zero sized members, too.

> If it does, there may be other zero-sized things at the same address.
Same in C.
> If that's so, then if you put the zero-sized things in different alias
> sets, you're saying they never alias.  Now, obviously, we never read or
> write zero-sized things -- but different alias sets also implies that
> &x != &y which is false.
The only way we could end upw ith a problem in C++ for the restricted 
case above is if aliasing is broken for zero sized members in C, too.
You can end up with empty structs, and empty members just the same 
in C.
In the following example, foo.empty and foo.empty2 get placed in the same 
alias set, properly (i had to use the memset with 1 rather than 0 to 
actually get it to not no-op it at -O2)
#include <stddef.h>
struct dan
{
        struct empty
        {
        } empty;
        struct empty
        {
        } empty2;
        int a;
};
int main(void)
{
        struct dan foo;
        printf("%d\n", offsetof (struct dan, empty));
        printf("%d\n", offsetof (struct dan, empty2));
        printf("%d\n", offsetof (struct dan, a));
        printf("%d\n", sizeof (struct dan));
        printf("%d\n", (unsigned int) &foo.empty - (unsigned int) &foo);
        printf("%d\n", (unsigned int) &foo.empty2 - (unsigned int) &foo);
        printf("%d\n", (unsigned int) &foo.a - (unsigned int) &foo);
        foo.a = 5;
        memset(&foo.empty, 5, 1);
        memset(&foo.empty2, 5, 1);
}

Note that foo.empty2 and foo.empty are at the same address, too.

Also Note it lists them both as foo+0, which is correct.
...
(insn 62 59 63 (set (reg:QI 127)
        (const_int 5 [0x5])) -1 (nil)
    (expr_list:REG_EQUAL (const_int 5 [0x5])
        (nil)))

(insn 63 62 66 (set (mem/s:QI (reg/f:SI 111 virtual-stack-vars) [3 foo+0 
S1 A128])
        (reg:QI 127)) -1 (nil)
    (nil))

(insn 66 63 67 (set (reg:QI 128)
        (const_int 5 [0x5])) -1 (nil)
    (expr_list:REG_EQUAL (const_int 5 [0x5])
        (nil)))

(insn 67 66 68 (set (mem/s:QI (reg/f:SI 111 virtual-stack-vars) [3 foo+0 
S1 A128])
        (reg:QI 128)) -1 (nil)
    (nil))
...


If you throw an int b in between the two structs, like so:
#include <stddef.h>
struct dan
{
        struct empty
        {
        } empty;
        int b;
        struct empty
        {
        } empty2;
        int a;
};
int main(void)
{
        struct dan foo;
        printf("%d\n", offsetof (struct dan, empty));
        printf("%d\n", offsetof (struct dan, empty2));
        printf("%d\n", offsetof (struct dan, a));
        printf("%d\n", offsetof (struct dan, b));
        printf("%d\n", sizeof (struct dan));
        printf("%d\n", (unsigned int) &foo.empty - (unsigned int) &foo);
        printf("%d\n", (unsigned int) &foo.empty2 - (unsigned int) &foo);
        printf("%d\n", (unsigned int) &foo.a - (unsigned int) &foo);
        printf("%d\n", (unsigned int) &foo.b - (unsigned int) &foo);
        foo.a = 5;
        foo.b = 5;
        memset(&foo.empty, 5, 1);
        memset(&foo.empty2, 5, 1);
}

We get:
(insn 78 75 79 (set (reg:SI 132)
        (const_int 5 [0x5])) -1 (nil)
    (expr_list:REG_EQUAL (const_int 5 [0x5])
        (nil)))

(insn 79 78 82 (set (mem/s:SI (plus:SI (reg/f:SI 111 virtual-stack-vars)
                (const_int 12 [0xc])) [5 foo.a+0 S4 A32])
        (reg:SI 132)) -1 (nil)
    (nil))

(insn 82 79 83 (set (reg:SI 133)
        (const_int 5 [0x5])) -1 (nil)
    (expr_list:REG_EQUAL (const_int 5 [0x5])
        (nil)))

(insn 83 82 86 (set (mem/s:SI (plus:SI (reg/f:SI 111 virtual-stack-vars)
                (const_int 4 [0x4])) [5 foo.b+0 S4 A32])
        (reg:SI 133)) -1 (nil)
    (nil))

(insn 86 83 87 (set (reg:QI 134)
        (const_int 5 [0x5])) -1 (nil)
    (expr_list:REG_EQUAL (const_int 5 [0x5])
        (nil)))

(insn 87 86 90 (set (mem/s:QI (reg/f:SI 111 virtual-stack-vars) [3 foo+0 
S1 A128])
        (reg:QI 134)) -1 (nil)
    (nil))

(insn 90 87 91 (set (reg:QI 135)
        (const_int 5 [0x5])) -1 (nil)
    (expr_list:REG_EQUAL (const_int 5 [0x5])
        (nil)))

(insn 91 90 93 (set (mem:QI (plus:SI (reg/f:SI 111 virtual-stack-vars)
                (const_int 8 [0x8])) [0 S1 A8])
        (reg:QI 135)) -1 (nil)
    (nil))

Which I *think* is right (though why we lost track of foo.empty2 before we 
even wrote the first rtl dump is curious)
Now, we haven't gotten into tricky C++ stuff, but that's okay, some 
improvement is better than nothing.
Unless you can show that we have a problem with empty members in C, i 
don't see how, for the restricted case of classes that aren't inheriting 
from other classes, we could get screwed.
Remove the inheritance, and we have a c struct with virtual 
functions, which translates directly into a c struct with function 
pointers.


> 
> --
> Mark Mitchell                   mark@codesourcery.com
> CodeSourcery, LLC               http://www.codesourcery.com
> 

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-24 15:30 Richard Kenner
@ 2002-01-25  2:16 ` Gabriel Dos Reis
  2002-01-25  3:04   ` Paolo Carlini
  2002-01-25  7:17   ` Daniel Berlin
  0 siblings, 2 replies; 97+ messages in thread
From: Gabriel Dos Reis @ 2002-01-25  2:16 UTC (permalink / raw)
  To: Richard Kenner; +Cc: pcarlini, gcc

kenner@vlsi1.ultra.nyu.edu (Richard Kenner) writes:

[...]

| I am very much against the idea of defining a change to be "correct" if
| it doesn't cause any regression test failures.  You have to be able to make
| an argument that a change is correct independently and the regression tests
| serve as a debugger of (among other things) that proof.

I completely agree with Kenner and Mark.  Given, current ABIs
supported by g++, aliasing detection is a very subtle issue and we
should resist from the temptation of not proving that our algorithms
are correct; simply because correctness should come first, speed later.
That doesn't mean I'm against any effort to improve alias analysis in
g++, I'm simply against a change which doesn't consider correctness as
serious issue.

-- Gaby

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
@ 2002-01-24 16:09 Richard Kenner
  0 siblings, 0 replies; 97+ messages in thread
From: Richard Kenner @ 2002-01-24 16:09 UTC (permalink / raw)
  To: mark; +Cc: gcc

    Now, obviously, we never read or write zero-sized things -- but
    different alias sets also implies that &x != &y which is false.

I don't think so.

Consider the Ada record:

	type r is record
	    f1: integer;
	    f2: string (1..0);
	    f3: float;
	end record;

F2 and F3 will have different alias sets, but every instance of F2 and F3
will be at the same address.

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-24 15:18       ` Dan Nicolaescu
@ 2002-01-24 15:36         ` Mark Mitchell
  2002-01-25  2:25           ` Daniel Berlin
  2002-01-25 15:48           ` Dan Nicolaescu
  0 siblings, 2 replies; 97+ messages in thread
From: Mark Mitchell @ 2002-01-24 15:36 UTC (permalink / raw)
  To: Dan Nicolaescu; +Cc: Daniel Berlin, gcc


> An incremental improvement would be to allow at least C-type structs.
>
> {
>   if (AGGREGATE_TYPE_P (t) &&
>      IS_A_CLASS_DERIVED_FROM_ANOTHER_CLASS_P (t)) /* [1] */
>     return 0;
>
>   return c_common_get_alias_set (t);
> }
>
> should be safe because:
> a) the predicate [1] is true for any aggregate that is not a C-type struct
> b) c_common_get_alias_set deals with C-type structs correctly
> c) C-type structs cannot alias derived classes because the later are
>    put in alias set 0 because of [1]
>

See this is where things get subtle.  You have to (at least) worry
about whether or not t is zero-sized or has zero-sized bases or members.
If it does, there may be other zero-sized things at the same address.
If that's so, then if you put the zero-sized things in different alias
sets, you're saying they never alias.  Now, obviously, we never read or
write zero-sized things -- but different alias sets also implies that
&x != &y which is false.

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
@ 2002-01-24 15:30 Richard Kenner
  2002-01-25  2:16 ` Gabriel Dos Reis
  0 siblings, 1 reply; 97+ messages in thread
From: Richard Kenner @ 2002-01-24 15:30 UTC (permalink / raw)
  To: pcarlini; +Cc: gcc

    I would really like to see g++ doing better in C++-aliasing sensitive
    codes (i.e., Haney Speed) and I believe that any effort in this area
    should be supported and encouraged as much as possible.

I don't think anybody disagrees, but that doesn't mean that a proof of
correctness isn't needed.

As Mark points out, there are only a relatively small number of things in
GCC where we can prove things.  We ought to take advantage of those.

I am very much against the idea of defining a change to be "correct" if
it doesn't cause any regression test failures.  You have to be able to make
an argument that a change is correct independently and the regression tests
serve as a debugger of (among other things) that proof.

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-24 14:15     ` Mark Mitchell
  2002-01-24 14:16       ` Daniel Berlin
@ 2002-01-24 15:18       ` Dan Nicolaescu
  2002-01-24 15:36         ` Mark Mitchell
  1 sibling, 1 reply; 97+ messages in thread
From: Dan Nicolaescu @ 2002-01-24 15:18 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: Daniel Berlin, gcc

Mark Mitchell <mark@codesourcery.com> writes:

  > > Can the  AGGREGATE_TYPE_P check be relaxed for classes that don't have
  > > a base class?
  > 
  > Maybe, but it needs careful thought.  "It makes sense" isn't good enough;
  > we need a proof that this is safe.  Such a proof needs to take into
  > account that other classes may be derived (possibly indirectly or
  > virtually) from the simple class.

Currently cxx_get_alias_set looks like: 

{
  if (AGGREGATE_TYPE_P (t))
    return 0;
  
  return c_common_get_alias_set (t);
}


An incremental improvement would be to allow at least C-type structs. 

{
  if (AGGREGATE_TYPE_P (t) &&
     IS_A_CLASS_DERIVED_FROM_ANOTHER_CLASS_P (t)) /* [1] */
    return 0;
  
  return c_common_get_alias_set (t);
}

should be safe because: 
a) the predicate [1] is true for any aggregate that is not a C-type struct
b) c_common_get_alias_set deals with C-type structs correctly
c) C-type structs cannot alias derived classes because the later are
   put in alias set 0 because of [1]

Is this acceptable? 

I don't know how the
IS_A_CLASS_DERIVED_FROM_ANOTHER_CLASS_P predicate should be written. 

I tried using:
 CLASSTYPE_N_BASECLASSES (t)

but that does not seem to be enough (I don't know much about the C++
frontend...)

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-24 14:35           ` Daniel Berlin
  2002-01-24 15:06             ` Mark Mitchell
@ 2002-01-24 15:08             ` Paolo Carlini
  1 sibling, 0 replies; 97+ messages in thread
From: Paolo Carlini @ 2002-01-24 15:08 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: gcc

Daniel Berlin wrote:

> I'm confused as to why you think it's such a tricky thing to do C++
> aliasing that it requires any kind of formal sketch of a proof.
> I've read about 6 papers now on aliasing and static type determination in
> C++, and all just treat them as C structs that contain other C structs,
> with the special case that virtual bases are unions.
> Isn't this the correct way to do it?
> Other than virtual bases, don't the addresses of each subobject, and
> thus, their members, have to be different?
> The draft standard (I don't feel like opening the formal standard, i'm
> not near a graphical workstation of any kind right now) says
> " A base class subobject may be of zero size  (_class_);
>    however,  two subobjects that have the same class type and that belong
>    to the same most derived object must not  be  allocated  at  the  same
>    address (_expr.eq_).  ]"
>
> Doesn't this mean that excluding the virtual base class case, we could
> simply treat them as structs with other structs embedded in them (Which is
> already an improvement over what we have now)
> Or are all of the papers i've read on the subject just not being
> compliant, and getting lucky?
>
> It's also interesting that it appears pro64 simply removed that check, and
> calls record_component_aliases.
>
> --Dan

As a passionate supporter of g++, trying to do my best (in the time allowed by
my work) to help the project, I'd like to express my profound sympathy to Dan's
point of view.

I would really like to see g++ doing better in C++-aliasing sensitive codes
(i.e., Haney Speed) and I believe that any effort in this area should be
supported and encouraged as much as possible.

Dan, could you please point me (privately if you prefer) to some of those
papers?

Thanks,
Paolo.


^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-24 14:35           ` Daniel Berlin
@ 2002-01-24 15:06             ` Mark Mitchell
  2002-01-24 15:08             ` Paolo Carlini
  1 sibling, 0 replies; 97+ messages in thread
From: Mark Mitchell @ 2002-01-24 15:06 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: Dan Nicolaescu, gcc



> I'm confused as to why you think it's such a tricky thing to do C++
> aliasing that it requires any kind of formal sketch of a proof.

I'm confused as to why you think we shouldn't prove things if we can. :-)

This is a change that can cause very hard to debug problems, both for
users and for us.  We should try to be sure we know what we're doing
before we do it.

Things are not very simple in C++ with the new IA64 ABI class layout.
Earlier C++ compilers *literally* translated C++ into C using nested
structs, which would have made things pretty simple.  But here we
can have multiple types overlaying the same location.

It's not that the idea won't work -- it might.  But it needs to be
proven first.

If it's not a hard proof, what's the big deal?  And if it is a hard
proof, that's just an indicating that it's worth doing before we
make the change.

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-24 14:27         ` Mark Mitchell
@ 2002-01-24 14:35           ` Daniel Berlin
  2002-01-24 15:06             ` Mark Mitchell
  2002-01-24 15:08             ` Paolo Carlini
  0 siblings, 2 replies; 97+ messages in thread
From: Daniel Berlin @ 2002-01-24 14:35 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: Dan Nicolaescu, gcc

On Thu, 24 Jan 2002, Mark Mitchell wrote:

> > What do you consider "proof"?
> 
> The sort of thing that appears in a computer science textbook when
> proving that an algorithm is sound.  Or at least what is often called
> a "sketch of a proof".
> 
> > What more than not causing any regressions are you looking for?
> 
> A proof that this is not an algorithmic mistake.  This
> is an aspect of GCC that we can reason about formally; the semantics
> of alias sets are reasonably well defined, and the semantics of the
> C++ abstract machine are reasonably well defined in this regard, so
> it should be reasonably possible to convince ourselves that this
> change is safe in the abstract if it is.
What change?
I simply suggested it might be possible to do it.

I'm confused as to why you think it's such a tricky thing to do C++ 
aliasing that it requires any kind of formal sketch of a proof.
I've read about 6 papers now on aliasing and static type determination in 
C++, and all just treat them as C structs that contain other C structs, 
with the special case that virtual bases are unions.
Isn't this the correct way to do it?
Other than virtual bases, don't the addresses of each subobject, and 
thus, their members, have to be different?
The draft standard (I don't feel like opening the formal standard, i'm 
not near a graphical workstation of any kind right now) says
" A base class subobject may be of zero size  (_class_);
   however,  two subobjects that have the same class type and that belong
   to the same most derived object must not  be  allocated  at  the  same
   address (_expr.eq_).  ]"

Doesn't this mean that excluding the virtual base class case, we could 
simply treat them as structs with other structs embedded in them (Which is 
already an improvement over what we have now)
Or are all of the papers i've read on the subject just not being 
compliant, and getting lucky?

It's also interesting that it appears pro64 simply removed that check, and 
calls record_component_aliases.

--Dan


^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-24 14:16       ` Daniel Berlin
@ 2002-01-24 14:27         ` Mark Mitchell
  2002-01-24 14:35           ` Daniel Berlin
  0 siblings, 1 reply; 97+ messages in thread
From: Mark Mitchell @ 2002-01-24 14:27 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: Dan Nicolaescu, gcc

> What do you consider "proof"?

The sort of thing that appears in a computer science textbook when
proving that an algorithm is sound.  Or at least what is often called
a "sketch of a proof".

> What more than not causing any regressions are you looking for?

A proof that this is not an algorithmic mistake.  This
is an aspect of GCC that we can reason about formally; the semantics
of alias sets are reasonably well defined, and the semantics of the
C++ abstract machine are reasonably well defined in this regard, so
it should be reasonably possible to convince ourselves that this
change is safe in the abstract if it is.

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-24 14:15     ` Mark Mitchell
@ 2002-01-24 14:16       ` Daniel Berlin
  2002-01-24 14:27         ` Mark Mitchell
  2002-01-24 15:18       ` Dan Nicolaescu
  1 sibling, 1 reply; 97+ messages in thread
From: Daniel Berlin @ 2002-01-24 14:16 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: Dan Nicolaescu, gcc

On Thu, 24 Jan 2002, Mark Mitchell wrote:

> 
> > Can the  AGGREGATE_TYPE_P check be relaxed for classes that don't have
> > a base class?
> 
> Maybe, but it needs careful thought.  "It makes sense" isn't good enough;
> we need a proof that this is safe.  Such a proof needs to take into
> account that other classes may be derived (possibly indirectly or
> virtually) from the simple class.
What do you consider "proof"?
What more than not causing any regressions are you looking for?
I don't think code to fix this problem should be held to any higher 
standard than code to fix other problems.
Not that it should be introduced into 3.1 at this late a stage, but to say 
that it requires "proof" is vague, and implies you think something more 
than what other patches go through is neccessary.
--Dan

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-23 18:48   ` Dan Nicolaescu
  2002-01-23 19:16     ` Daniel Berlin
@ 2002-01-24 14:15     ` Mark Mitchell
  2002-01-24 14:16       ` Daniel Berlin
  2002-01-24 15:18       ` Dan Nicolaescu
  1 sibling, 2 replies; 97+ messages in thread
From: Mark Mitchell @ 2002-01-24 14:15 UTC (permalink / raw)
  To: Dan Nicolaescu, Daniel Berlin; +Cc: gcc


> Can the  AGGREGATE_TYPE_P check be relaxed for classes that don't have
> a base class?

Maybe, but it needs careful thought.  "It makes sense" isn't good enough;
we need a proof that this is safe.  Such a proof needs to take into
account that other classes may be derived (possibly indirectly or
virtually) from the simple class.

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-23 18:48   ` Dan Nicolaescu
@ 2002-01-23 19:16     ` Daniel Berlin
  2002-01-24 14:15     ` Mark Mitchell
  1 sibling, 0 replies; 97+ messages in thread
From: Daniel Berlin @ 2002-01-23 19:16 UTC (permalink / raw)
  To: Dan Nicolaescu; +Cc: gcc

On Wed, 23 Jan 2002, Dan Nicolaescu wrote:

> Daniel Berlin <dan@dberlin.org> writes:
> 
>   > On Wed, 23 Jan 2002, Dan Nicolaescu 
>   > wrote:
>   > 
>   > > 
>   > > It looks like g++ has some problems with aliasing bools, and gcc
>   > > doesn't.
>   > 
>   > It's a side effect of C++ not making alias sets for classes, i think.
>   > Try commenting out the AGGREGATE_TYPE_P check in cxx_get_alias_set and see 
>   > if it goes away.
> 
> Yep, it went away. 
> 
> There's a comment in that function:
>  /* It's not yet safe to use alias sets for classes in C++ because
>      the TYPE_FIELDs list for a class doesn't mention base classes.  */
> 
> Can the  AGGREGATE_TYPE_P check be relaxed for classes that don't have
> a base class? 
Dunno, but it makes sense to me to do so.
> 
> 

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-23 18:27 ` Daniel Berlin
@ 2002-01-23 18:48   ` Dan Nicolaescu
  2002-01-23 19:16     ` Daniel Berlin
  2002-01-24 14:15     ` Mark Mitchell
  0 siblings, 2 replies; 97+ messages in thread
From: Dan Nicolaescu @ 2002-01-23 18:48 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: gcc

Daniel Berlin <dan@dberlin.org> writes:

  > On Wed, 23 Jan 2002, Dan Nicolaescu 
  > wrote:
  > 
  > > 
  > > It looks like g++ has some problems with aliasing bools, and gcc
  > > doesn't.
  > 
  > It's a side effect of C++ not making alias sets for classes, i think.
  > Try commenting out the AGGREGATE_TYPE_P check in cxx_get_alias_set and see 
  > if it goes away.

Yep, it went away. 

There's a comment in that function:
 /* It's not yet safe to use alias sets for classes in C++ because
     the TYPE_FIELDs list for a class doesn't mention base classes.  */

Can the  AGGREGATE_TYPE_P check be relaxed for classes that don't have
a base class? 


^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: g++ and aliasing bools
  2002-01-23 17:56 Dan Nicolaescu
@ 2002-01-23 18:27 ` Daniel Berlin
  2002-01-23 18:48   ` Dan Nicolaescu
  0 siblings, 1 reply; 97+ messages in thread
From: Daniel Berlin @ 2002-01-23 18:27 UTC (permalink / raw)
  To: Dan Nicolaescu; +Cc: gcc

On Wed, 23 Jan 2002, Dan Nicolaescu 
wrote:

> 
> It looks like g++ has some problems with aliasing bools, and gcc
> doesn't.

It's a side effect of C++ not making alias sets for classes, i think.
Try commenting out the AGGREGATE_TYPE_P check in cxx_get_alias_set and see 
if it goes away.
> 
> 
> The following program: 
> 
> #ifdef __cplusplus
> 
> struct bool_struct         
> {
>   bool f0;
>   bool f1;
>   bool f2; 
>   bool f3;
> };
> #else
> struct bool_struct         
> {
>   _Bool f0;
>   _Bool f1;
>   _Bool f2; 
>   _Bool f3;
> };
> 
> #endif
> 
> 
> struct bool_struct *str; 
> 
> 
> void 
> g (void)
> {
>   str->f0 = 1;
>   str->f1 = 0;
>   str->f2 = 1;
>   str->f3 = 0; 
> }
> 
> compiles to: 
> 
> _Z1gv:
> .LLFB1:
>         !#PROLOGUE# 0
>         !#PROLOGUE# 1
>         sethi   %hi(str), %o2
>         ld      [%o2+%lo(str)], %o1
>         mov     1, %o3
>         stb     %o3, [%o1]
>         ld      [%o2+%lo(str)], %o0   <- this load is not needed
>         stb     %g0, [%o0+1]
>         ld      [%o2+%lo(str)], %o1   <- this load is not needed
>         stb     %o3, [%o1+2]
>         ld      [%o2+%lo(str)], %o0   <- this load is not needed
>         retl
>         stb     %g0, [%o0+3]
> 
> when compiled with g++ -O2  on sparc-sun-solaris2.8 using a GCC from
> CVS as of this morning. 
> 
> but the code is much better when compiling with gcc -O2 
> 
> g:
>         !#PROLOGUE# 0
>         !#PROLOGUE# 1
>         sethi   %hi(str), %o0
>         ld      [%o0+%lo(str)], %o1
>         mov     1, %o2
>         stb     %o2, [%o1+2]
>         stb     %g0, [%o1+3]
>         stb     %o2, [%o1]
>         retl
>         stb     %g0, [%o1+1]
> 
> 
> so it looks like g++ has some issues with aliasing bools. 
> 
> Any idea what is wrong? 
> 
>         
> 

^ permalink raw reply	[flat|nested] 97+ messages in thread

* g++ and aliasing bools
@ 2002-01-23 17:56 Dan Nicolaescu
  2002-01-23 18:27 ` Daniel Berlin
  0 siblings, 1 reply; 97+ messages in thread
From: Dan Nicolaescu @ 2002-01-23 17:56 UTC (permalink / raw)
  To: gcc


It looks like g++ has some problems with aliasing bools, and gcc
doesn't.


The following program: 

#ifdef __cplusplus

struct bool_struct         
{
  bool f0;
  bool f1;
  bool f2; 
  bool f3;
};
#else
struct bool_struct         
{
  _Bool f0;
  _Bool f1;
  _Bool f2; 
  _Bool f3;
};

#endif


struct bool_struct *str; 


void 
g (void)
{
  str->f0 = 1;
  str->f1 = 0;
  str->f2 = 1;
  str->f3 = 0; 
}

compiles to: 

_Z1gv:
.LLFB1:
        !#PROLOGUE# 0
        !#PROLOGUE# 1
        sethi   %hi(str), %o2
        ld      [%o2+%lo(str)], %o1
        mov     1, %o3
        stb     %o3, [%o1]
        ld      [%o2+%lo(str)], %o0   <- this load is not needed
        stb     %g0, [%o0+1]
        ld      [%o2+%lo(str)], %o1   <- this load is not needed
        stb     %o3, [%o1+2]
        ld      [%o2+%lo(str)], %o0   <- this load is not needed
        retl
        stb     %g0, [%o0+3]

when compiled with g++ -O2  on sparc-sun-solaris2.8 using a GCC from
CVS as of this morning. 

but the code is much better when compiling with gcc -O2 

g:
        !#PROLOGUE# 0
        !#PROLOGUE# 1
        sethi   %hi(str), %o0
        ld      [%o0+%lo(str)], %o1
        mov     1, %o2
        stb     %o2, [%o1+2]
        stb     %g0, [%o1+3]
        stb     %o2, [%o1]
        retl
        stb     %g0, [%o1+1]


so it looks like g++ has some issues with aliasing bools. 

Any idea what is wrong? 

        

^ permalink raw reply	[flat|nested] 97+ messages in thread

end of thread, other threads:[~2002-01-29  2:14 UTC | newest]

Thread overview: 97+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2002-01-25  8:55 g++ and aliasing bools Robert Dewar
2002-01-25  9:21 ` Daniel Berlin
2002-01-25 10:00   ` Daniel Berlin
2002-01-25 10:54     ` Paolo Carlini
2002-01-25 11:37       ` Daniel Berlin
2002-01-25 11:45       ` David Edelsohn
2002-01-25 11:53         ` Joe Buck
2002-01-25 12:09           ` Mark Mitchell
2002-01-25 12:28             ` Paolo Carlini
2002-01-25 13:49               ` Mark Mitchell
2002-01-25 14:19                 ` Joe Buck
2002-01-25 14:21                   ` Mark Mitchell
2002-01-25 15:41                     ` Neil Booth
2002-01-25 16:04                       ` Joe Buck
2002-01-25 17:37                         ` Paolo Carlini
2002-01-25 18:10                         ` Daniel Berlin
2002-01-27  5:11                         ` Mark Mitchell
2002-01-27  5:34                           ` Daniel Berlin
2002-01-28 10:39                             ` Joe Buck
2002-01-28 10:51                               ` Joe Buck
2002-01-28 15:59                               ` Mark Mitchell
2002-01-28 17:11                                 ` Daniel Berlin
2002-01-28 17:28                                   ` Joe Buck
2002-01-28 18:14                                     ` Daniel Berlin
2002-01-28 17:18                                 ` Joe Buck
2002-01-28 18:05                                   ` Mark Mitchell
2002-01-28 18:50                                     ` Joe Buck
2002-01-28 19:33                                       ` Mark Mitchell
2002-01-28 17:40                                         ` Daniel Berlin
2002-01-28 21:55                                           ` Daniel Berlin
2002-01-28 22:02                                         ` Alexandre Oliva
2002-01-28 22:12                                           ` Mark Mitchell
2002-01-25 13:07             ` Joe Buck
2002-01-25 15:43               ` Daniel Berlin
2002-01-25 16:03                 ` Joe Buck
2002-01-25 15:13             ` Daniel Berlin
2002-01-25 12:10           ` Paolo Carlini
2002-01-25 13:16             ` Joe Buck
2002-01-25 15:23             ` Daniel Berlin
2002-01-25 12:05         ` Mark Mitchell
2002-01-25 22:14           ` Daniel Berlin
2002-01-26  3:46             ` Mark Mitchell
  -- strict thread matches above, loose matches on Subject: below --
2002-01-25 14:49 mike stump
2002-01-25 12:23 Robert Dewar
2002-01-25 13:29 ` Joe Buck
2002-01-25 12:06 mike stump
2002-01-25  9:13 Robert Dewar
2002-01-25  8:35 Robert Dewar
2002-01-25  8:54 ` Daniel Berlin
2002-01-25  8:33 Richard Kenner
2002-01-25  8:32 Robert Dewar
2002-01-25  8:53 ` Daniel Berlin
2002-01-25  9:39 ` Joe Buck
2002-01-25  8:28 Robert Dewar
2002-01-25  8:49 ` Daniel Berlin
2002-01-25  7:51 Robert Dewar
2002-01-25  8:18 ` Daniel Berlin
2002-01-25  8:20   ` Daniel Berlin
2002-01-25  7:38 Robert Dewar
2002-01-25  8:11 ` Daniel Berlin
2002-01-25 14:09   ` Gabriel Dos Reis
2002-01-25  7:30 Richard Kenner
2002-01-25  7:30 Richard Kenner
2002-01-25  7:33 ` Daniel Berlin
2002-01-25 15:43   ` Daniel Berlin
2002-01-25  7:23 Richard Kenner
2002-01-25  7:24 ` Daniel Berlin
2002-01-25  7:05 Richard Kenner
2002-01-25  8:59 ` Paolo Carlini
2002-01-24 16:09 Richard Kenner
2002-01-24 15:30 Richard Kenner
2002-01-25  2:16 ` Gabriel Dos Reis
2002-01-25  3:04   ` Paolo Carlini
2002-01-25  4:17     ` Gabriel Dos Reis
2002-01-25  4:35       ` Paolo Carlini
2002-01-25  6:34         ` Daniel Berlin
2002-01-25  7:17   ` Daniel Berlin
2002-01-25 13:57     ` Gabriel Dos Reis
2002-01-25 14:47       ` Tim Hollebeek
2002-01-23 17:56 Dan Nicolaescu
2002-01-23 18:27 ` Daniel Berlin
2002-01-23 18:48   ` Dan Nicolaescu
2002-01-23 19:16     ` Daniel Berlin
2002-01-24 14:15     ` Mark Mitchell
2002-01-24 14:16       ` Daniel Berlin
2002-01-24 14:27         ` Mark Mitchell
2002-01-24 14:35           ` Daniel Berlin
2002-01-24 15:06             ` Mark Mitchell
2002-01-24 15:08             ` Paolo Carlini
2002-01-24 15:18       ` Dan Nicolaescu
2002-01-24 15:36         ` Mark Mitchell
2002-01-25  2:25           ` Daniel Berlin
2002-01-25 15:48           ` Dan Nicolaescu
2002-01-25 20:22             ` Joe Buck
2002-01-25 23:59               ` Daniel Berlin
2002-01-27 17:04               ` Dan Nicolaescu
2002-01-27 17:59                 ` Paolo Carlini

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).