[tree-ssa] alias analysis

public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed

* [tree-ssa] alias analysis
@ 2003-02-11 17:29 law
  2003-02-11 18:44 ` Diego Novillo
  0 siblings, 1 reply; 23+ messages in thread
From: law @ 2003-02-11 17:29 UTC (permalink / raw)
  To: dnovillo; +Cc: gcc

I've finally wrapped my head around how your scheme speeds up alias
analysis and it's not going to be trivial to merge your ideas with
mine for separating loads and stores, though it may be possible.

Your scheme depends on having every object which has the same alias set
being represented by a single tag.  Then you just need to check things
against that representative tag.

In my scheme for separating loads and stores, we can have different
tags for stores to addressable objects with the same underlying tag.

ie

  int x = 5;
  int y = 10;

  foo (&x, &y);

We'd have distinct tags for x & y, even though they have the same alias
set.

This is important when we check aliased loads -- we have to look through
*all* the tags rather than quitting when we find a alias.  The number of
tags is proportional to the number of stores to unique addressable variables
and the number of pointer stores using unique alias sets.

FWIW, this scheme completes my components of cc1 test in 698 seconds.  Note
I haven't really tested this scheme thoroughly, so it may have bugs
which could affect the timing.

Another approach is to go ahead and assume that X & Y may alias each 
other so that they're globbed into a single alias set.  This scheme will
lead to less accurate information, but does play reasonably well with
your approach.  This scheme completes my components of cc1 test in 701
seconds.  This scheme has been beat on pretty good.

Compare that to the current state of the branch which clocks in at 708
seconds and produces less accurate information than either of the approaches
mentioned above.

I haven't compared the two schemes to see in general how they perform
in terms of disambiguating memory references, but they both manage to
disambiguate all the memory references in 20001226-1.c.

Jeff

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [tree-ssa] alias analysis
  2003-02-11 17:29 [tree-ssa] alias analysis law
@ 2003-02-11 18:44 ` Diego Novillo
  2003-02-11 18:51   ` law
  0 siblings, 1 reply; 23+ messages in thread
From: Diego Novillo @ 2003-02-11 18:44 UTC (permalink / raw)
  To: law; +Cc: gcc

On Tue, 11 Feb 2003, Jeff Law wrote:

> Your scheme depends on having every object which has the same alias set
> being represented by a single tag.  Then you just need to check things
> against that representative tag.
> 
Yes.

> In my scheme for separating loads and stores, we can have different
> tags for stores to addressable objects with the same underlying tag.
> 
> ie
> 
>   int x = 5;
>   int y = 10;
> 
>   foo (&x, &y);
> 
> 
> We'd have distinct tags for x & y, even though they have the same alias
> set.
> 
Now you lost me.  I thought you were separating load from store
aliases?  Aren't x and y store aliases here?  Or is it load
aliases?  I think I need a step-by-step description again :)


> This is important when we check aliased loads -- we have to look through
> *all* the tags rather than quitting when we find a alias.  The number of
> tags is proportional to the number of stores to unique addressable variables
> and the number of pointer stores using unique alias sets.
> 
Maybe we could add an attribute to each alias set to determine if
it's a load or a store alias?



Diego.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [tree-ssa] alias analysis
  2003-02-11 18:44 ` Diego Novillo
@ 2003-02-11 18:51   ` law
  2003-02-11 20:35     ` Diego Novillo
  0 siblings, 1 reply; 23+ messages in thread
From: law @ 2003-02-11 18:51 UTC (permalink / raw)
  To: Diego Novillo; +Cc: gcc

In message <20030211183344.GA5611@tornado.toronto.redhat.com>, Diego Novillo wr
ites:
 >On Tue, 11 Feb 2003, Jeff Law wrote:
 >
 >> Your scheme depends on having every object which has the same alias set
 >> being represented by a single tag.  Then you just need to check things
 >> against that representative tag.
 >> 
 >Yes.
 >
 >> In my scheme for separating loads and stores, we can have different
 >> tags for stores to addressable objects with the same underlying tag.
 >> 
 >> ie
 >> 
 >>   int x = 5;
 >>   int y = 10;
 >> 
 >>   foo (&x, &y);
 >> 
 >> 
 >> We'd have distinct tags for x & y, even though they have the same alias
 >> set.
 >> 
 >Now you lost me.  I thought you were separating load from store
 >aliases?  Aren't x and y store aliases here?  Or is it load
 >aliases?  I think I need a step-by-step description again :)
They are stores, but they are stores to distinct VAR_DECLs.  There's no
way they can actually alias each other as the underlying objects will
always be distinct.  Ie, there is no way that x = foo will every modify
y.

So while they have the same alias set number, they do not alias and
we can keep their tags distinct.  The only time we have to join them
into a superset is if we find something like *p = blah.

Jeff

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [tree-ssa] alias analysis
  2003-02-11 18:51   ` law
@ 2003-02-11 20:35     ` Diego Novillo
  2003-02-11 20:35       ` Diego Novillo
                         ` (3 more replies)
  0 siblings, 4 replies; 23+ messages in thread
From: Diego Novillo @ 2003-02-11 20:35 UTC (permalink / raw)
  To: law; +Cc: gcc

On Tue, 11 Feb 2003, Jeff Law wrote:

>  >>   int x = 5;
>  >>   int y = 10;
>  >> 
>  >>   foo (&x, &y);
>  >> 
>  >> 
>  >> We'd have distinct tags for x & y, even though they have the same alias
>  >> set.
>  >> 
>  >Now you lost me.  I thought you were separating load from store
>  >aliases?  Aren't x and y store aliases here?  Or is it load
>  >aliases?  I think I need a step-by-step description again :)
> They are stores, but they are stores to distinct VAR_DECLs.  There's no
> way they can actually alias each other as the underlying objects will
> always be distinct.  Ie, there is no way that x = foo will every modify
> y.
> 
Oh, I see.  Do you want to be able to rewrite X and Y in this
case?  The problem is that since X and Y are aliased, regardless
of what their alias is, SSA will refuse to rewrite them.  What we
will re-write are the VDEFs to their alias tag.

In this case we would want to split alias sets whose members
cannot alias each other.  But this could quickly take us back to
quadratic behaviour.  Say we have this initial alias set for X
and Y:

*p: { x y }

What you can do is traverse this alias set and make X and Y alias
only those members of the set that can actually point to them.
That is, any other INDIRECT_REFs in the set and themselves.

In the case above, X and Y only alias themselves and *P.  I know
it's a bit absurd to think that an addressable aliases itself,
but it's a tautology and happens to fit in the existing framework :)

At this point we can introduce the asymmetry of loads and stores.
For each variable V, we distinguish between V's store-aliases and
load-aliases:

(1) We create a VDEF for every store-alias of V everytime V is
    assigned a new value.

(2) We create a VUSE for every load-alias of V everytime V is
    used.

In this case we will have:

x: store-alias: *p, load-alias: x
y: store-alias: *p, load-alias: y
*p: store-alias: x, y, load-alias: x, y

Normal Form			SSA Form

{		  	  	   {
  p = &x or &y		   	 	p = &x or &y
  ...		    			...

			    		# (*p)_2 = VDEF<(*p)_1>
  x = 5;	   	 		x_1 = 5;

		    			# (*p)_3 = VDEF<(*p)_2>
  y = 3;	   	 		y_2 = 3;

		    			# x_2 = VDEF<x_1>
		   		 	# y_3 = VDEF<y_2>
  *p = *p + 8;			    	(*p)_5 = (*p)_3 + 8;
  ... = x + y;			    	... = x_2 + y_3;
}			     	   }

Is this something along the lines you had in mind.  Apologies if
it makes little sense.  I'm on my way out and I'll probably have
little connectivity until next week.  I'll think about it some
more in the meantime.

Diego.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [tree-ssa] alias analysis
  2003-02-11 20:35     ` Diego Novillo
@ 2003-02-11 20:35       ` Diego Novillo
  2003-02-11 22:39       ` [tree-ssa]: Always compute sets (Was Re: [tree-ssa] alias analysis) Daniel Berlin
                         ` (2 subsequent siblings)
  3 siblings, 0 replies; 23+ messages in thread
From: Diego Novillo @ 2003-02-11 20:35 UTC (permalink / raw)
  To: law; +Cc: gcc

On Tue, 11 Feb 2003, Diego Novillo wrote:

> On Tue, 11 Feb 2003, Jeff Law wrote:
> 
> >  >>   int x = 5;
> >  >>   int y = 10;
> >  >> 
> >  >>   foo (&x, &y);
> >  >> 
> >  >> 
> >  >> We'd have distinct tags for x & y, even though they have the same alias
> >  >> set.
> >  >> 
> >  >Now you lost me.  I thought you were separating load from store
> >  >aliases?  Aren't x and y store aliases here?  Or is it load
> >  >aliases?  I think I need a step-by-step description again :)
> > They are stores, but they are stores to distinct VAR_DECLs.  There's no
> > way they can actually alias each other as the underlying objects will
> > always be distinct.  Ie, there is no way that x = foo will every modify
> > y.
> > 
> Oh, I see.  Do you want to be able to rewrite X and Y in this
> case?  The problem is that since X and Y are aliased, regardless
> of what their alias is, SSA will refuse to rewrite them.  What we
> will re-write are the VDEFs to their alias tag.
> 
> In this case we would want to split alias sets whose members
> cannot alias each other.  But this could quickly take us back to
> quadratic behaviour.  Say we have this initial alias set for X
> and Y:
> 
> *p: { x y }
> 
> What you can do is traverse this alias set and make X and Y alias
> only those members of the set that can actually point to them.
> That is, any other INDIRECT_REFs in the set and themselves.
> 
> In the case above, X and Y only alias themselves and *P.  I know
> it's a bit absurd to think that an addressable aliases itself,
> but it's a tautology and happens to fit in the existing framework :)
> 
> At this point we can introduce the asymmetry of loads and stores.
> For each variable V, we distinguish between V's store-aliases and
> load-aliases:
> 
> (1) We create a VDEF for every store-alias of V everytime V is
>     assigned a new value.
> 
> (2) We create a VUSE for every load-alias of V everytime V is
>     used.
> 
> In this case we will have:
> 
> x: store-alias: *p, load-alias: x
> y: store-alias: *p, load-alias: y
> *p: store-alias: x, y, load-alias: x, y
> 
> Normal Form			SSA Form
> 
> {		  	  	   {
>   p = &x or &y		   	 	p = &x or &y
>   ...		    			...
> 	
> 			    		# (*p)_2 = VDEF<(*p)_1>
>   x = 5;	   	 		x_1 = 5;
> 
> 		    			# (*p)_3 = VDEF<(*p)_2>
>   y = 3;	   	 		y_2 = 3;
> 
> 		    			# x_2 = VDEF<x_1>
> 		   		 	# y_3 = VDEF<y_2>
>   *p = *p + 8;			    	(*p)_5 = (*p)_3 + 8;
>   ... = x + y;			    	... = x_2 + y_3;
> }			     	   }
> 
> 
> Is this something along the lines you had in mind.  Apologies if
> it makes little sense.  I'm on my way out and I'll probably have
> little connectivity until next week.  I'll think about it some
> more in the meantime.
> 
One thing I forgot.  This has the potential of being a memory and
performance pig.  These alias sets tend to have many variables
clustered together.


Diego.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [tree-ssa]: Always compute sets (Was Re: [tree-ssa] alias analysis)
  2003-02-11 20:35     ` Diego Novillo
  2003-02-11 20:35       ` Diego Novillo
@ 2003-02-11 22:39       ` Daniel Berlin
  2003-02-12  4:37       ` [tree-ssa] alias analysis law
  2003-02-12  6:30       ` law
  3 siblings, 0 replies; 23+ messages in thread
From: Daniel Berlin @ 2003-02-11 22:39 UTC (permalink / raw)
  To: Diego Novillo; +Cc: law, gcc

>
>
> Is this something along the lines you had in mind.  Apologies if
> it makes little sense.  I'm on my way out and I'll probably have
> little connectivity until next week.  I'll think about it some
> more in the meantime.
>

FYI: I'm going to remove the specific casing for PTA (since it's broken
right now, the find_may_aliases_for routine it calls doesn't generate
aliases right), and let it just be a disambiguator.

IE it'll just always call compute_alias_sets in tree-dfa.c

It seems to do okay on tests i have in terms of alias sets not globbing
together unrelated things that PTA says are non-aliasing.

And the statics-only interprocedural mode does get rid of all the global
var and parameter passing related aliasing sets when approriate, it seems.

IE in

static int bob (int **b1, int *c1, int *d1)
{
	*b1 = c1;
}

int main(void)
{
	int *a;
	int b2;
	int c2;
	bob (&a, &b2, &c2);
	printf ("%d\n", *a);
}

With -ftree-points-to=andersen -fip, we come up with:

Alias information for main: 2 sets

Alias set #0:
  Tag: *.GLOBAL_VAR, may aliases: *.GLOBAL_VAR, may alias global memory
  Aliases: { *.GLOBAL_VAR }

Alias set #1:
  Tag: *a, may aliases: *a
  Aliases: { c2 *a }

which is perfect information.


And without -fip, for:
int main(void)
{
        int *a;
        int *b;
        int c;
        c = 5;
        b = &c;
        a = b;
        *a = *b;
}

we come up with:

Without Andersen:

Alias information for main: 1 sets

Alias set #0:
  Tag: *a, may aliases: *a
  Aliases: { c *a *b }

With Andersen:

Alias information for main: 2 sets

Alias set #0:
  Tag: *a, may aliases: *a
  Aliases: { c *a }

Alias set #1:
  Tag: *b, may aliases: *b
  Aliases: { *b }

(we never ask whether b can alias c, presumably because b is never
dereferenced)


Patch i'll commit attach.


--Dan
2003-02-12  Daniel Berlin  <dberlin@dberlin.org>

	* tree-dfa.c (find_may_aliases_for): Remove
	(compute_may_aliases): Always all compute_alias_sets.

Index: tree-dfa.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/Attic/tree-dfa.c,v
retrieving revision 1.1.4.76
diff -u -3 -p -r1.1.4.76 tree-dfa.c
--- tree-dfa.c	10 Feb 2003 12:27:03 -0000	1.1.4.76
+++ tree-dfa.c	11 Feb 2003 22:37:41 -0000
@@ -100,7 +100,6 @@ static void register_alias_set		PARAMS (
 static void find_alias_for		PARAMS ((tree, tree));
 static bool may_alias_p			PARAMS ((tree, tree, HOST_WIDE_INT,
 						 tree, tree, HOST_WIDE_INT));
-static void find_may_aliases_for	PARAMS ((int));
 static bool may_access_global_mem_p 	PARAMS ((tree, tree));
 static void set_def			PARAMS ((tree *, tree));
 static void add_use			PARAMS ((tree *, tree));
@@ -1449,7 +1448,6 @@ clobber_vars_r (tp, walk_subtrees, data)
 void
 compute_may_aliases ()
 {
-  unsigned long i;
   static htab_t vars_found;
   static htab_t indirect_refs_found;
   static htab_t addressable_vars_found;
@@ -1507,13 +1505,7 @@ compute_may_aliases ()
       timevar_pop (TV_TREE_PTA);
     }

-  if (flag_tree_points_to == PTA_NONE)
-    compute_alias_sets ();
-  else
-    {
-      for (i = 0; i < num_indirect_refs; i++)
-	find_may_aliases_for (i);
-    }
+  compute_alias_sets ();

   num_indirect_refs = 0;
   indirect_refs = 0;
@@ -1870,58 +1862,6 @@ may_alias_p (v1, v1_base, v1_alias_set,
   return true;
 }

-
-/* Find variables that may be aliased by the variable (V1) at
-   index INDIRECT_REF_INDEX in the INDIRECT_REFS varray.  */
-
-static void
-find_may_aliases_for (indirect_ref_index)
-     int indirect_ref_index;
-{
-  unsigned long i;
-  tree v1 = VARRAY_TREE (indirect_refs, indirect_ref_index);
-  tree v1_base = VARRAY_TREE (indirect_refs_base, indirect_ref_index);
-  HOST_WIDE_INT v1_alias_set
-    = VARRAY_INT (indirect_refs_alias_set, indirect_ref_index);
-
-#if defined ENABLE_CHECKING
-  if (TREE_CODE (v1) != INDIRECT_REF)
-    abort ();
-#endif
-
-  /* Note that our aliasing properties are symmetric, so we can
-     start this loop at INDIRECT_REF_INDEX + 1 to cut down on the
-     runtime for this routine.  */
-  for (i = indirect_ref_index + 1; i < num_indirect_refs; i++)
-    {
-      tree v2 = VARRAY_TREE (indirect_refs, i);
-      tree v2_base = VARRAY_TREE (indirect_refs_base, i);
-      HOST_WIDE_INT v2_alias_set = VARRAY_INT (indirect_refs_alias_set, i);
-
-      if (may_alias_p (v1, v1_base, v1_alias_set, v2, v2_base, v2_alias_set))
-	{
-	  add_may_alias (v2, v2_base, v1, v1_base);
-	  add_may_alias (v1, v1_base, v2, v2_base);
-	}
-    }
-
-  /* Now check if V1 may alias any of the addressable variables.  */
-  for (i = 0; i < num_addressable_vars; i++)
-    {
-      tree v2 = VARRAY_TREE (addressable_vars, i);
-      tree v2_base = VARRAY_TREE (addressable_vars_base, i);
-      HOST_WIDE_INT v2_alias_set = VARRAY_INT (addressable_vars_alias_set, i);
-
-      if (v1 == v2)
-	continue;
-
-      if (may_alias_p (v1, v1_base, v1_alias_set, v2, v2_base, v2_alias_set))
-	{
-	  add_may_alias (v2, v2_base, v1, v1_base);
-	  add_may_alias (v1, v1_base, v2, v2_base);
-	}
-    }
-}


 /* Add ALIAS to the set of variables that may alias VAR.  VAR_SYM and



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [tree-ssa] alias analysis
  2003-02-11 20:35     ` Diego Novillo
  2003-02-11 20:35       ` Diego Novillo
  2003-02-11 22:39       ` [tree-ssa]: Always compute sets (Was Re: [tree-ssa] alias analysis) Daniel Berlin
@ 2003-02-12  4:37       ` law
  2003-02-12  7:33         ` Daniel Berlin
  2003-02-13 15:32         ` Diego Novillo
  2003-02-12  6:30       ` law
  3 siblings, 2 replies; 23+ messages in thread
From: law @ 2003-02-12  4:37 UTC (permalink / raw)
  To: Diego Novillo; +Cc: gcc

In message <20030211203337.GA6202@tornado.toronto.redhat.com>, Diego Novillo wr
ites:
 >Oh, I see.  Do you want to be able to rewrite X and Y in this
 >case?  The problem is that since X and Y are aliased, regardless
 >of what their alias is, SSA will refuse to rewrite them.  What we
 >will re-write are the VDEFs to their alias tag.
Well, in that example we certainly wouldn't want to rewrite them --
they're addressable *and* they would be aliased by the virtual store
at the call site.  We would want to continue to rewrite their VDEFs 
the same way we do now.

All we're effectively doing is pruning the set of objects that are
considered aliasing each other by noting that an alias created by
a load-load is totally uninteresting.  I don't see how these changes
necessarily have to interact with what variables we rewrite.

From the SSA builder's standpoint nothing has to change in this case
as far as I can tell.

 >In this case we would want to split alias sets whose members
 >cannot alias each other.  But this could quickly take us back to
 >quadratic behaviour.  Say we have this initial alias set for X
 >and Y:
No more so than the current algorithm.  We can make the current algorithm
blow up by having a different type for every variable and indirection
for each of those types.  Suddenly you're in the "oh god this sucks"
situation again.

If we make the same simplifying assumption you do for the current code,
namely that the number of types is significantly smaller than the number
of variables & indirections and we make the assumption that most variables
are not addressable, then my code will be roughly equivalent in time
complexity.

Of course that also shows precisely when and how my code would blow up --
if you have code where every variable is a different type with lots of
indirections, then it loses.  Similarly it would lose if we had an 
ungodly number of addressable variables and no pointers which aliased
them (if we have a pointer which aliases the addressables, then the
pointer becomes the tag for the set which includes all the addressables
it may alias).

The reason my code tends to produce faster overall alias analysis is 
because we avoid computing any may aliasing information for load-load
situations.

Also note that my code can be twiddled to degenerate more gracefully by
having addressable VAR_DECLs with the same alias set use the same tag.
That was slightly slower than keeping them separate.

Yes, splitting the sets would work too, but that seems to be avoiding 
the real issue -- creating too many useless alias relationships to begin
with.

Jeff

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [tree-ssa] alias analysis
  2003-02-12  4:37       ` [tree-ssa] alias analysis law
@ 2003-02-12  7:33         ` Daniel Berlin
  2003-02-13 15:55           ` Diego Novillo
  2003-02-13 15:32         ` Diego Novillo
  1 sibling, 1 reply; 23+ messages in thread
From: Daniel Berlin @ 2003-02-12  7:33 UTC (permalink / raw)
  To: law; +Cc: Diego Novillo, gcc


>
> Yes, splitting the sets would work too, but that seems to be avoiding
> the real issue -- creating too many useless alias relationships to 
> begin
> with.
It's funny you mention creating too many useless alias relationships vs 
splitting sets.
Splitting sets doesn't help make it faster.
Turning on PTA makes it split sets a lot more right now (since we can 
disambiguate so much better).
On 20001226-1.c, it takes a little over a second and a half (I can 
actually make it take well under a second, but it's not a priority for 
me since i think 1.68 seconds is an okay number here) to compute the 
points-to results
tree PTA              :   1.68 ( 1%) usr   0.11 ( 8%) sys   1.87 ( 1%) 
wall

However, because of the extra set splitting that happens, phi insertion 
time goes through the roof:
  tree PHI insertion    :  50.27 (18%) usr   0.05 ( 3%) sys  52.36 (18%) 
wall

This is with *always* using alias sets, and just using PTA to 
disambiguate.

(In case someone tries this, tree alias analysis time goes through the 
roof, but this is just a red herring. I've already fixed it in my tree)

So be careful with extra set splitting, it'll currently drive your phi 
insertion time up (whether it should or not is another matter)

On a random note, do either of you plan on making calls to may_alias_p 
from outside of tree-dfa.c anymore?

I had moved the freeing of PTA memory to SSA deletion time when Diego 
was doing alias tags, and using may_alias_p in tree-ssa.c, but if it's 
not going to be called during SSA building anymore, i'll move the 
freeing of PTA memory back to where it used to be.
>
> Jeff
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [tree-ssa] alias analysis
  2003-02-12  7:33         ` Daniel Berlin
@ 2003-02-13 15:55           ` Diego Novillo
  2003-02-13 16:00             ` Daniel Berlin
  2003-02-13 16:00             ` Daniel Berlin
  0 siblings, 2 replies; 23+ messages in thread
From: Diego Novillo @ 2003-02-13 15:55 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: Jeff Law, gcc

On Wed, 2003-02-12 at 02:17, Daniel Berlin wrote:

> On a random note, do either of you plan on making calls to may_alias_p 
> from outside of tree-dfa.c anymore?
> 
Yes, of course.  I keep wondering why you removed the call to PTA from
tree-dfa.c.  Ultimately I want to be able to use PTA.  The only reason
why we have a type-based analyzer was because we don't yet have a fully
working PTA one.

> I had moved the freeing of PTA memory to SSA deletion time when Diego 
> was doing alias tags, and using may_alias_p in tree-ssa.c, but if it's 
> not going to be called during SSA building anymore, i'll move the 
> freeing of PTA memory back to where it used to be.
>
Please put it back.  I thought I had neatly separated what we do when
-ftree-points-to is given and when it's not.  I don't know why you
removed it.

If it's because tree-dfa.c doesn't use it properly, let's fix that.  I
want to be able to do both.  Type-based should be *very* fast and rather
stupid.  PTA can be a bit slower (not necessarily, only if it wants to
be), but more precise.


Diego.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [tree-ssa] alias analysis
  2003-02-13 15:55           ` Diego Novillo
@ 2003-02-13 16:00             ` Daniel Berlin
  2003-02-13 16:14               ` Diego Novillo
  2003-02-13 16:00             ` Daniel Berlin
  1 sibling, 1 reply; 23+ messages in thread
From: Daniel Berlin @ 2003-02-13 16:00 UTC (permalink / raw)
  To: Diego Novillo; +Cc: Jeff Law, gcc


On Thursday, February 13, 2003, at 10:27  AM, Diego Novillo wrote:

> On Wed, 2003-02-12 at 02:17, Daniel Berlin wrote:
>
>> On a random note, do either of you plan on making calls to may_alias_p
>> from outside of tree-dfa.c anymore?
>>
> Yes, of course.

Errr, you removed them all and made may_alias_p static, and told me you 
wanted to make it private. That's why I asked.
This is a completely opposite answer than what you said a few days ago.

>  I keep wondering why you removed the call to PTA from
> tree-dfa.c.
I didn't, look again.

>  Ultimately I want to be able to use PTA.  The only reason
> why we have a type-based analyzer was because we don't yet have a fully
> working PTA one.
We shouldn't be having an analyzer that *only* uses PTA or *only* uses 
TBAA.
That's silly.
They can work in concert, as they do now.
PTA is still used to disambiguate the aliases.

>> I had moved the freeing of PTA memory to SSA deletion time when Diego
>> was doing alias tags, and using may_alias_p in tree-ssa.c, but if it's
>> not going to be called during SSA building anymore, i'll move the
>> freeing of PTA memory back to where it used to be.
>>
> Please put it back.
Put what back?

I'm not sure you quite understand what I asked above.

Before (as in before  we had alias leaders, months ago) we had

create pta info
do may-aliasing
delete pta info
create ssa
optimize
delete ssa

When we had alias leaders (months ago) we had to do
create pta info
do may-aliasing
create ssa
optimize
delete ssa
delete pta info

because you had calls to may_alias_p (which used pta info) from the 
"create ssa" steps.

Now we still have
create pta info
do-may-aliasing
create ssa
optimize
delete ssa
delete pta info

but you've removed all the calls to may_alias_p and made it static.

If this is how it will stay, we can move back to doing
create pta info
do may-aliasing
delete pta info
create ssa
optimize
delete ssa

again.

That's all i've asked.

>   I thought I had neatly separated what we do when
> -ftree-points-to is given and when it's not.  I don't know why you
> removed it.

Because it's broken that way (Jeff's changes broke it, AFAICT), and 
because it's too memory intensive if you create explicit may-alias sets 
because of global var aliasing.  It's not usable that way.  I've got 
figures, but they aren't pretty.
It's not a fixable problem.  We'll have to create some kind of set to 
reduce the memory requirements.

> If it's because tree-dfa.c doesn't use it properly, let's fix that.
>  I
> want to be able to do both.

We still use PTA to disambiguate the sets created now if it's available.
We just don't create explicit may aliases with no sets anymore.
The quality of the information actually hasn't changed with what we do 
now, or else i wouldn't have done it.
>   Type-based should be *very* fast and rather
> stupid.  PTA can be a bit slower (not necessarily, only if it wants to
> be), but more precise.

It still is 99.99% as precise as it was before.

You seem to think we don't use PTA anymore.
This is not the case at all.
I've only removed broken code that won't do something we can use 
anyway, and can't be made to do something we can use.

>
> Diego.
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [tree-ssa] alias analysis
  2003-02-13 16:00             ` Daniel Berlin
@ 2003-02-13 16:14               ` Diego Novillo
  0 siblings, 0 replies; 23+ messages in thread
From: Diego Novillo @ 2003-02-13 16:14 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: Jeff Law, gcc

On Thu, 2003-02-13 at 10:58, Daniel Berlin wrote:
> 
> On Thursday, February 13, 2003, at 10:27  AM, Diego Novillo wrote:
> 
> > On Wed, 2003-02-12 at 02:17, Daniel Berlin wrote:
> >
> >> On a random note, do either of you plan on making calls to may_alias_p
> >> from outside of tree-dfa.c anymore?
> >>
> > Yes, of course.
> 
> Errr, you removed them all and made may_alias_p static, and told me you 
> wanted to make it private. That's why I asked.
> This is a completely opposite answer than what you said a few days ago.
> 
Eh?  Oh, right.  Sorry, I just realized that I was talking nonsense.  I
didn't read your message right.  Please disregard *everything* I said :)



Diego.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [tree-ssa] alias analysis
  2003-02-13 15:55           ` Diego Novillo
  2003-02-13 16:00             ` Daniel Berlin
@ 2003-02-13 16:00             ` Daniel Berlin
  2003-02-13 16:23               ` Diego Novillo
  1 sibling, 1 reply; 23+ messages in thread
From: Daniel Berlin @ 2003-02-13 16:00 UTC (permalink / raw)
  To: Diego Novillo; +Cc: Jeff Law, gcc


On Thursday, February 13, 2003, at 10:27  AM, Diego Novillo wrote:

> On Wed, 2003-02-12 at 02:17, Daniel Berlin wrote:
>
>> On a random note, do either of you plan on making calls to may_alias_p
>> from outside of tree-dfa.c anymore?
>>
> Yes, of course.

Errr, you removed them all and made may_alias_p static, and told me you 
wanted to make it private. That's why I asked.
This is a completely opposite answer than what you said a few days ago.

>  I keep wondering why you removed the call to PTA from
> tree-dfa.c.
I didn't, look again.

>  Ultimately I want to be able to use PTA.  The only reason
> why we have a type-based analyzer was because we don't yet have a fully
> working PTA one.
We shouldn't be having an analyzer that *only* uses PTA or *only* uses 
TBAA.
That's silly.
They can work in concert, as they do now.
PTA is still used to disambiguate the aliases.

>> I had moved the freeing of PTA memory to SSA deletion time when Diego
>> was doing alias tags, and using may_alias_p in tree-ssa.c, but if it's
>> not going to be called during SSA building anymore, i'll move the
>> freeing of PTA memory back to where it used to be.
>>
> Please put it back.
Put what back?

I'm not sure you quite understand what I asked above.

Before (as in before  we had alias leaders, months ago) we had

create pta info
do may-aliasing
delete pta info
create ssa
optimize
delete ssa

When we had alias leaders (months ago) we had to do
create pta info
do may-aliasing
create ssa
optimize
delete ssa
delete pta info

because you had calls to may_alias_p (which used pta info) from the 
"create ssa" steps.

Now we still have
create pta info
do-may-aliasing
create ssa
optimize
delete ssa
delete pta info

but you've removed all the calls to may_alias_p and made it static.

If this is how it will stay, we can move back to doing
create pta info
do may-aliasing
delete pta info
create ssa
optimize
delete ssa

again.

That's all i've asked.

>   I thought I had neatly separated what we do when
> -ftree-points-to is given and when it's not.  I don't know why you
> removed it.

Because it's broken that way (Jeff's changes broke it, AFAICT), and 
because it's too memory intensive if you create explicit may-alias sets 
because of global var aliasing.  It's not usable that way.  I've got 
figures, but they aren't pretty.
It's not a fixable problem.  We'll have to create some kind of set to 
reduce the memory requirements.

> If it's because tree-dfa.c doesn't use it properly, let's fix that.
>  I
> want to be able to do both.

We still use PTA to disambiguate the sets created now if it's available.
We just don't create explicit may aliases with no sets anymore.
The quality of the information actually hasn't changed with what we do 
now, or else i wouldn't have done it.
>   Type-based should be *very* fast and rather
> stupid.  PTA can be a bit slower (not necessarily, only if it wants to
> be), but more precise.

It still is 99.99% as precise as it was before.

You seem to think we don't use PTA anymore.
This is not the case at all.
I've only removed broken code that won't do something we can use 
anyway, and can't be made to do something we can use.

>
> Diego.
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [tree-ssa] alias analysis
  2003-02-13 16:00             ` Daniel Berlin
@ 2003-02-13 16:23               ` Diego Novillo
  2003-02-13 17:35                 ` Daniel Berlin
  0 siblings, 1 reply; 23+ messages in thread
From: Diego Novillo @ 2003-02-13 16:23 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: Jeff Law, gcc

On Thu, 2003-02-13 at 10:59, Daniel Berlin wrote:

> We shouldn't be having an analyzer that *only* uses PTA or *only* uses 
> TBAA.
> That's silly.
> They can work in concert, as they do now.
> PTA is still used to disambiguate the aliases.
> 
OK, so what I didn't understand is the model we use for PTA then.  We
don't actively call may_alias_p() from the optimizers.  What we do is
collect alias information before hand so that the optimizers can call
'may_aliases (var)' and get a list of all the variables that may alias
with 'var'.

Perhaps that's not a good model?  When I suggested making may_alias_p
static I had that model in mind.  We pre-compute all the alias
relationships before building SSA and then the optimizers can traverse
'may_aliases (var)'.  If this is not a good model for doing PTA, then
let's change it.


> >   I thought I had neatly separated what we do when
> > -ftree-points-to is given and when it's not.  I don't know why you
> > removed it.
> 
> Because it's broken that way (Jeff's changes broke it, AFAICT), and 
> because it's too memory intensive if you create explicit may-alias sets 
> because of global var aliasing.  It's not usable that way.  I've got 
> figures, but they aren't pretty.
> It's not a fixable problem.  We'll have to create some kind of set to 
> reduce the memory requirements.
> 
OK, this one is different.  I'm starting to believe that we should not
model call-clobbering with a forced alias to *.GLOBAL_VAR.  It seemed
like a neat trick, but it creates problems:

(1) Since *.GLOBAL_VAR aliases every type, functions that have
call-clobbered variables also have a single alias set where everything
aliases everything.  Too pessimistic.

(2) As you point out, *.GLOBAL_VAR blows up together with PTA.

I've been thinking that instead of having this silly aliasing model, we
should actually insert a VDEF for every variable that may be clobbered
at a call site.

> You seem to think we don't use PTA anymore.
> This is not the case at all.
>
Right, sorry about this one.  This is just me reading every other line
of a message.


Thoughts?   Diego.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [tree-ssa] alias analysis
  2003-02-13 16:23               ` Diego Novillo
@ 2003-02-13 17:35                 ` Daniel Berlin
  0 siblings, 0 replies; 23+ messages in thread
From: Daniel Berlin @ 2003-02-13 17:35 UTC (permalink / raw)
  To: Diego Novillo; +Cc: Jeff Law, gcc

On Thursday, February 13, 2003, at 11:21  AM, Diego Novillo wrote:

> On Thu, 2003-02-13 at 10:59, Daniel Berlin wrote:
>
>> We shouldn't be having an analyzer that *only* uses PTA or *only* uses
>> TBAA.
>> That's silly.
>> They can work in concert, as they do now.
>> PTA is still used to disambiguate the aliases.
>>
> OK, so what I didn't understand is the model we use for PTA then.  We
> don't actively call may_alias_p() from the optimizers.  What we do is
> collect alias information before hand so that the optimizers can call
> 'may_aliases (var)' and get a list of all the variables that may alias
> with 'var'.

I've not yet figured out whether this model is going to be usable in 
terms of speed/space.
Tell ya what.
Let me commit the SSAPRE stuff, and while i'm waiting for the insertion 
framework (which it needs for me to continue debugging bootstraps), 
i'll work on something that uses it so i can tell if it's usable this 
way.

>
> Perhaps that's not a good model?  When I suggested making may_alias_p
> static I had that model in mind.  We pre-compute all the alias
> relationships before building SSA and then the optimizers can traverse
> 'may_aliases (var)'.  If this is not a good model for doing PTA, then
> let's change it.

The thing is that i'm not sure.   PTA info is currently cheap to both 
query (because it caches it properly) and update (because it's flow 
insensitive, and thus, we can just throw new and changed statements at 
it regardless of where they actually appear. Of course, doing it this 
way makes it more conservative, so after some large number of updates 
we could recompute it fully if it makes sense or something).

That said, right now (well, in my tree with the hash table lookups and 
other overhead removed), in computing alias sets, we ask PTA to 
disambiguate aliases (in 20001226-1.c) 260 million times.
That's right.  260,160,770 times.
( 18.03     35.62    35.62 260160770     0.00     0.00  
ptr_may_alias_var)
It takes 35 seconds.

I can't make ptr_may_alias_var any faster, it already takes less than a 
millisecond.
Quick math shows it takes 1.3691533787422801e-07 seconds per call, 
which is good. :).

This means TBAA, addressable determination, etc, is failing to 
determine things don't alias 260 million times (since PTA is a last 
resort call).

 From a different perspective, this also means if our optimizers are 
going to make less than 260 million PTA queries, it's a win to have 
*them* query the PTA info when they want to, even though this means an 
uncleaner interface. (IE they would walk may_aliases, which wouldn't be 
PTA disambiguated, and if they see some variable alias they really 
don't want, they call PTA to try to disambiguate it).

Without some optimizers that use the info, it's impossible for me to 
really even guess which makes sense, since I have no figures on "how 
often an optimizer runs into an alias it really doesn't want to exist 
(IE is preventing an optimization) that PTA can disambiguate".

 From an interface perspective, a PTA query isn't very difficult or 
unclean or anything like that.  Personally, I don't think code that 
looks like

for (i = 0; i < VARRAY_ACTIVE_SIZE (may_aliases (var)); i++)
{
	tree alias = VARRAY_TREE (may_aliases (var), i);
	if (I think this alias prevents me from doing what i want)
        {
		if (!ptr_may_alias_var (ptr, var))
			<but it really doesn't>
		else
			<it really does>
	  }
}

is going to be horrifically ugly compared to:
for (i = 0; i < VARRAY_ACTIVE_SIZE (may_aliases (var)); i++)
{
	tree alias = VARRAY_TREE (may_aliases (var), i);
	if (this alias prevents me from doing what i want)
        {
			<it really does>
	  }
}

if it doesn't occur too often.

But if it's everywhere that an optimizer needs to do this, ...

Trying to draw on guidance from other compilers, i'll say i have yet to 
see a single one where the PTA info isn't integrated with the other 
alias set info (IE they effectively have just a single may_aliases 
array per variable).

But those that use PTA info all do some combination of
1. Have alias info backed by a file, so that they aren't keeping it all 
in memory at once if they don't have to (granularity varies).
2. Use some kind of indexing and bitmaps so that their queries are bit 
testing and their memory usage is smaller.

IE In one case they assign each variable an index, and use a sparse 
triangular bitmatrix to tell what numbers alias.

If they want to determine the entire list of aliases (not often) for a 
variable, they just walk the bitmatrix row for that index, and use a 
reverse mapping to map the numbers back into variables.

>
>>>   I thought I had neatly separated what we do when
>>> -ftree-points-to is given and when it's not.  I don't know why you
>>> removed it.
>>
>> Because it's broken that way (Jeff's changes broke it, AFAICT), and
>> because it's too memory intensive if you create explicit may-alias 
>> sets
>> because of global var aliasing.  It's not usable that way.  I've got
>> figures, but they aren't pretty.
>> It's not a fixable problem.  We'll have to create some kind of set to
>> reduce the memory requirements.
>>
> OK, this one is different.  I'm starting to believe that we should not
> model call-clobbering with a forced alias to *.GLOBAL_VAR.  It seemed
> like a neat trick, but it creates problems:
>
> (1) Since *.GLOBAL_VAR aliases every type, functions that have
> call-clobbered variables also have a single alias set where everything
> aliases everything.  Too pessimistic.
>
> (2) As you point out, *.GLOBAL_VAR blows up together with PTA.

Blows up is an understatement.
It's more like a nuclear holocaust.
>
> I've been thinking that instead of having this silly aliasing model, we
> should actually insert a VDEF for every variable that may be clobbered
> at a call site.

>
>> You seem to think we don't use PTA anymore.
>> This is not the case at all.
>>
> Right, sorry about this one.  This is just me reading every other line
> of a message.
>
>
> Thoughts?   Diego.
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [tree-ssa] alias analysis
  2003-02-12  4:37       ` [tree-ssa] alias analysis law
  2003-02-12  7:33         ` Daniel Berlin
@ 2003-02-13 15:32         ` Diego Novillo
  2003-02-15  0:20           ` law
  2003-02-20  1:43           ` law
  1 sibling, 2 replies; 23+ messages in thread
From: Diego Novillo @ 2003-02-13 15:32 UTC (permalink / raw)
  To: Jeff Law; +Cc: gcc

On Tue, 2003-02-11 at 23:41, law@redhat.com wrote:
> In message <20030211203337.GA6202@tornado.toronto.redhat.com>, Diego Novillo wr
> ites:
>  >In this case we would want to split alias sets whose members
>  >cannot alias each other.  But this could quickly take us back to
>  >quadratic behaviour.  Say we have this initial alias set for X
>  >and Y:
> No more so than the current algorithm.  We can make the current algorithm
> blow up by having a different type for every variable and indirection
> for each of those types.  Suddenly you're in the "oh god this sucks"
> situation again.
> 
To get into that situation you would need a program with tens of
thousands of *different* types and tens of thousands of different
variables of those types.  If you got that kind of program, I bet you
you will lose no matter what you do.


> The reason my code tends to produce faster overall alias analysis is 
> because we avoid computing any may aliasing information for load-load
> situations.
> 
Faster than what we have now?  At this point I will need to see the
patch :)  Right now we are so fast mainly because we bag most
addressables and dereferences together.

Your patch would refine this by separating load-load aliases.  How can
that be faster?  You're adding more checks to the aliasing code :)
But I agree with you that it can be comparable in complexity.  If so,
I'm all for it.

What I would like to avoid is getting into the situation where we start
to implement various additional heuristics to the type-based analyzer
instead of relying on the PTA code.  I think I'd rather have a good PTA
implementation.  The type-based analyzer was something to get by in the
meantime, really.

> Yes, splitting the sets would work too, but that seems to be avoiding 
> the real issue -- creating too many useless alias relationships to begin
> with.
> 
Yeah, good point.


Diego.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [tree-ssa] alias analysis
  2003-02-13 15:32         ` Diego Novillo
@ 2003-02-15  0:20           ` law
  2003-02-15  2:01             ` Daniel Berlin
  2003-02-19  1:04             ` Diego Novillo
  2003-02-20  1:43           ` law
  1 sibling, 2 replies; 23+ messages in thread
From: law @ 2003-02-15  0:20 UTC (permalink / raw)
  To: Diego Novillo; +Cc: gcc

In message <1045149809.10501.18.camel@frodo>, Diego Novillo writes:
 >On Tue, 2003-02-11 at 23:41, law@redhat.com wrote:
 >> In message <20030211203337.GA6202@tornado.toronto.redhat.com>, Diego Novill
 >o wr
 >> ites:
 >>  >In this case we would want to split alias sets whose members
 >>  >cannot alias each other.  But this could quickly take us back to
 >>  >quadratic behaviour.  Say we have this initial alias set for X
 >>  >and Y:
 >> No more so than the current algorithm.  We can make the current algorithm
 >> blow up by having a different type for every variable and indirection
 >> for each of those types.  Suddenly you're in the "oh god this sucks"
 >> situation again.
 >> 
 >To get into that situation you would need a program with tens of
 >thousands of *different* types and tens of thousands of different
 >variables of those types.  If you got that kind of program, I bet you
 >you will lose no matter what you do.
Precisely.


 >> The reason my code tends to produce faster overall alias analysis is 
 >> because we avoid computing any may aliasing information for load-load
 >> situations.
 >> 
 >Faster than what we have now?  At this point I will need to see the
 >patch :)  Right now we are so fast mainly because we bag most
 >addressables and dereferences together.
Yup.  My scheme is faster than what we have now.  

FWIW, after my various improvements to CCP, alias analysis has become the
clear CPU hog again as far as the tree optimizers are concerned
(with gimplification running a close second).  And I know how to make
mine even faster and probably use less memory as well :-)


 >What I would like to avoid is getting into the situation where we start
 >to implement various additional heuristics to the type-based analyzer
 >instead of relying on the PTA code.  I think I'd rather have a good PTA
 >implementation.  The type-based analyzer was something to get by in the
 >meantime, really.
Conceptually you should think of PTA as a way to prune the aliases 
found by type analysis.  In all likelihood they're going to have to
work together.


jeff

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [tree-ssa] alias analysis
  2003-02-15  0:20           ` law
@ 2003-02-15  2:01             ` Daniel Berlin
  2003-02-19  1:04             ` Diego Novillo
  1 sibling, 0 replies; 23+ messages in thread
From: Daniel Berlin @ 2003-02-15  2:01 UTC (permalink / raw)
  To: law; +Cc: Diego Novillo, gcc


On Friday, February 14, 2003, at 07:12  PM, law@redhat.com wrote:

> In message <1045149809.10501.18.camel@frodo>, Diego Novillo writes:
>> On Tue, 2003-02-11 at 23:41, law@redhat.com wrote:
>>> In message <20030211203337.GA6202@tornado.toronto.redhat.com>, Diego 
>>> Novill
>> o wr
>>> ites:
>>>> In this case we would want to split alias sets whose members
>>>> cannot alias each other.  But this could quickly take us back to
>>>> quadratic behaviour.  Say we have this initial alias set for X
>>>> and Y:
>>> No more so than the current algorithm.  We can make the current 
>>> algorithm
>>> blow up by having a different type for every variable and indirection
>>> for each of those types.  Suddenly you're in the "oh god this sucks"
>>> situation again.
>>>
>> To get into that situation you would need a program with tens of
>> thousands of *different* types and tens of thousands of different
>> variables of those types.  If you got that kind of program, I bet you
>> you will lose no matter what you do.
> Precisely.
>
>
>>> The reason my code tends to produce faster overall alias analysis is
>>> because we avoid computing any may aliasing information for load-load
>>> situations.
>>>
>> Faster than what we have now?  At this point I will need to see the
>> patch :)  Right now we are so fast mainly because we bag most
>> addressables and dereferences together.
> Yup.  My scheme is faster than what we have now.
>
> FWIW, after my various improvements to CCP, alias analysis has become 
> the
> clear CPU hog again as far as the tree optimizers are concerned
> (with gimplification running a close second).  And I know how to make
> mine even faster and probably use less memory as well :-)
>
>
>> What I would like to avoid is getting into the situation where we 
>> start
>> to implement various additional heuristics to the type-based analyzer
>> instead of relying on the PTA code.  I think I'd rather have a good 
>> PTA
>> implementation.  The type-based analyzer was something to get by in 
>> the
>> meantime, really.
> Conceptually you should think of PTA as a way to prune the aliases
> found by type analysis.  In all likelihood they're going to have to
> work together.
>
Yes, but it's likely going to need to be the other way around for speed 
reasons.
PTA, without TBAA, generates orders of magnitudes less aliases that 
need to be pruned, in just as much time as TBAA takes.
Thus, you'd want to use PTA and disambiguate the results using TBAA, as 
it would be faster.
Otherwise, you end up asking PTA about aliases that TBAA can't figure 
out, 206 million times, like we do now, rather than using the results, 
and asking TBAA 100k times about the remaining aliases (or whatever).
Keep that in mind as you are spending time optimizing the type-based 
analyzer.

>
> jeff
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [tree-ssa] alias analysis
  2003-02-15  0:20           ` law
  2003-02-15  2:01             ` Daniel Berlin
@ 2003-02-19  1:04             ` Diego Novillo
  2003-02-19  2:54               ` law
  1 sibling, 1 reply; 23+ messages in thread
From: Diego Novillo @ 2003-02-19  1:04 UTC (permalink / raw)
  To: Jeff Law; +Cc: gcc

On Fri, 2003-02-14 at 19:12, law@redhat.com wrote:

>  >Faster than what we have now?  At this point I will need to see the
>  >patch :)  Right now we are so fast mainly because we bag most
>  >addressables and dereferences together.
> Yup.  My scheme is faster than what we have now.  
> 
Faster to compute aliasing, or faster when combined with the SSA
builder?  If you are trying to improve times that are in the fractions
of a second for a multi-second compile, I think you're wasting your
time.  We have much bigger performance problems in the branch, don't you
think?

> FWIW, after my various improvements to CCP, alias analysis has become the
> clear CPU hog again as far as the tree optimizers are concerned
> (with gimplification running a close second).  And I know how to make
> mine even faster and probably use less memory as well :-)
> 
Now you lost me again.  Why are you so interested in making alias
analysis a CPU hog?  We were trying to do the opposite!

>  >What I would like to avoid is getting into the situation where we start
>  >to implement various additional heuristics to the type-based analyzer
>  >instead of relying on the PTA code.  I think I'd rather have a good PTA
>  >implementation.  The type-based analyzer was something to get by in the
>  >meantime, really.
> Conceptually you should think of PTA as a way to prune the aliases 
> found by type analysis.  In all likelihood they're going to have to
> work together.
> 
I always thought that TBA was a way of pruning PTA.  You first do a
quick TBA query, if that fails, you do PTA.


Diego.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [tree-ssa] alias analysis
  2003-02-19  1:04             ` Diego Novillo
@ 2003-02-19  2:54               ` law
  0 siblings, 0 replies; 23+ messages in thread
From: law @ 2003-02-19  2:54 UTC (permalink / raw)
  To: Diego Novillo; +Cc: gcc

In message <1045613408.1867.19.camel@frodo>, Diego Novillo writes:
 >On Fri, 2003-02-14 at 19:12, law@redhat.com wrote:
 >
 >>  >Faster than what we have now?  At this point I will need to see the
 >>  >patch :)  Right now we are so fast mainly because we bag most
 >>  >addressables and dereferences together.
 >> Yup.  My scheme is faster than what we have now.  
 >> 
 >Faster to compute aliasing, or faster when combined with the SSA
 >builder?  If you are trying to improve times that are in the fractions
 >of a second for a multi-second compile, I think you're wasting your
 >time.  We have much bigger performance problems in the branch, don't you
 >think?
Faster overall.  ie, any extra time spent in the SSA builder is recovered
by significantly faster alias analysis.

While there are other issues that can and should be addressed, the aliasing
code (once my remaining CPP fixes are installed) is the slowest part of the
tree-ssa path as a whole and is taking a significant amount of CPU time.

To put it in perspective, over 2% of our time building cc1 is spent in the
tree alias analysis code.  That's a hell of a lot better than we used to
be, but 2% frankly is way too much.

IMHO if any part of the SSA path is taking more than 1% of the total 
compilation time, then it needs to be looked at very carefully.  While
it's great to have the ability to do more complex analysis and optimizations
using tree-ssa, we have to be very aware of introducing more slow code
into the compiler.  It's very important to use good algorithms and 
implementations, otherwise we're going to be introducing a lot of crap
that nobody will ever use.

 >> FWIW, after my various improvements to CCP, alias analysis has become the
 >> clear CPU hog again as far as the tree optimizers are concerned
 >> (with gimplification running a close second).  And I know how to make
 >> mine even faster and probably use less memory as well :-)
 >> 
 >Now you lost me again.  Why are you so interested in making alias
 >analysis a CPU hog?  We were trying to do the opposite!
Before my changes to CCP it was the slowest part of the SSA path, taking
on the order of 20 seconds for my components of cc1 test (out of a 
total of around 700 seconds).  Now CCP is down to around 4 seconds for
that same test.

Alias analysis takes 13 seconds for that test and gimplification takes
around 15 seconds.  Neither are affected by my changes.  But both clearly
need to be worked on further as each accounts for about 2% of our total
compilation time and are the biggest cpu hogs in the SSA path.

By no means am I looking to slow things down -- what's happening is
merely a matter of taking the worst offenders, making them fast,
then continuing to iterate.  With my CCP changes, CCP gets under
the radar again.  With CCP off the radar screen, it's time to look
at another hog -- which happens to be our aliasing code.

 >>  >What I would like to avoid is getting into the situation where we start
 >>  >to implement various additional heuristics to the type-based analyzer
 >>  >instead of relying on the PTA code.  I think I'd rather have a good PTA
 >>  >implementation.  The type-based analyzer was something to get by in the
 >>  >meantime, really.
 >> Conceptually you should think of PTA as a way to prune the aliases 
 >> found by type analysis.  In all likelihood they're going to have to
 >> work together.
 >> 
 >I always thought that TBA was a way of pruning PTA.  You first do a
 >quick TBA query, if that fails, you do PTA.
Either way, our TBA is too bloody slow, even with your nice changes
from a couple weeks ago.  The amount of useless work we do in TBA needs
to be reduced.

jeff

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [tree-ssa] alias analysis
  2003-02-13 15:32         ` Diego Novillo
  2003-02-15  0:20           ` law
@ 2003-02-20  1:43           ` law
  2003-02-20  1:53             ` Diego Novillo
  1 sibling, 1 reply; 23+ messages in thread
From: law @ 2003-02-20  1:43 UTC (permalink / raw)
  To: Diego Novillo; +Cc: gcc

In message <1045149809.10501.18.camel@frodo>, Diego Novillo writes:
 >
 >> The reason my code tends to produce faster overall alias analysis is 
 >> because we avoid computing any may aliasing information for load-load
 >> situations.
 >> 
 >Faster than what we have now?  At this point I will need to see the
 >patch :)  Right now we are so fast mainly because we bag most
 >addressables and dereferences together.
 >
 >Your patch would refine this by separating load-load aliases.  How can
 >that be faster?  You're adding more checks to the aliasing code :)
 >But I agree with you that it can be comparable in complexity.  If so,
 >I'm all for it.
The first (and most important) thing to realize is that most functions
have more loads than stores.  And in fact, many functions have just loads
and no stores.

To show the effect, let's (just for the sake of argument) look at a
naive alias analysis.

foreach indirect I
  {
    foreach indirect J
      check for conflicts between I & J

    foreach addressable K
      check for conflicts between I & K     
  }

Which performs I * J + I * K checks.  Which is just I * (J + K)
We know that I & J are related from which we get I * (I/2 + K) checks

This algorithm is clearly dominated by I -- the number of indirect
references.  I includes indirects created by both stores and loads.

If you split loads & stores you get something like this:

foreach store S
  {
    foreach store T
      check for conflicts between S & T

    foreach load L
      check for conflicts between S & L
  }

Which is S * (S/2 + L).  Which looks the same until you realize that
typically S will be much smaller than I.  Which is good -- when we
can drastically decrease dominating variable, we get much better 
behavior.

The same concepts apply to the scheme currently on the branch.  We
register alias sets for all the stores, then check the stores & loads
for conflicts with the registered alias sets.  We have to register
far far fewer alias sets because we only register those used by store
instructions.

A secondary effect is that our may alias sets get smaller, meaning
that our cost to manage those sets can go down significantly. ie, we
get far fewer calls to add_may_alias.

The proof is in the pudding so they say.  Separating loads from stores
cuts alias time by nearly 70%.  That's nothing to sneeze at.

 >What I would like to avoid is getting into the situation where we start
 >to implement various additional heuristics to the type-based analyzer
 >instead of relying on the PTA code.  I think I'd rather have a good PTA
 >implementation.  The type-based analyzer was something to get by in the
 >meantime, really.
No heuristics needed.  And remember, you're still going to need type
based alias analysis.

jeff

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [tree-ssa] alias analysis
  2003-02-20  1:43           ` law
@ 2003-02-20  1:53             ` Diego Novillo
  2003-02-20 18:18               ` law
  0 siblings, 1 reply; 23+ messages in thread
From: Diego Novillo @ 2003-02-20  1:53 UTC (permalink / raw)
  To: law; +Cc: gcc

On Wed, 19 Feb 2003, Jeff Law wrote:

> The proof is in the pudding so they say.  Separating loads from stores
> cuts alias time by nearly 70%.  That's nothing to sneeze at.
> 
Sold.  Let's do it then.  Of course, by having better aliasing
info we are going to make the compiler want to insert lots more
PHI nodes, but you knew that already :)


Diego.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [tree-ssa] alias analysis
  2003-02-20  1:53             ` Diego Novillo
@ 2003-02-20 18:18               ` law
  0 siblings, 0 replies; 23+ messages in thread
From: law @ 2003-02-20 18:18 UTC (permalink / raw)
  To: Diego Novillo; +Cc: gcc

In message <20030220002936.GA29249@tornado.toronto.redhat.com>, Diego Novillo w
rites:
 >On Wed, 19 Feb 2003, Jeff Law wrote:
 >
 >> The proof is in the pudding so they say.  Separating loads from stores
 >> cuts alias time by nearly 70%.  That's nothing to sneeze at.
 >> 
 >Sold.  Let's do it then.
I'm testing what I hope is the final version right now.


 >Of course, by having better aliasing
 >info we are going to make the compiler want to insert lots more
 >PHI nodes, but you knew that already :)
Yup.  Typically this isn't a problem.  But there are of course
exceptions such as 20001226-1.c.  I'll probably end up pushing a
revisit of the global life code on my todo stack.

Jeff

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [tree-ssa] alias analysis
  2003-02-11 20:35     ` Diego Novillo
                         ` (2 preceding siblings ...)
  2003-02-12  4:37       ` [tree-ssa] alias analysis law
@ 2003-02-12  6:30       ` law
  3 siblings, 0 replies; 23+ messages in thread
From: law @ 2003-02-12  6:30 UTC (permalink / raw)
  To: Diego Novillo; +Cc: gcc

In message <20030211203337.GA6202@tornado.toronto.redhat.com>, Diego Novillo wr
ites:
 >On Tue, 11 Feb 2003, Jeff Law wrote:
 >
 >>  >>   int x = 5;
 >>  >>   int y = 10;
 >>  >> 
 >>  >>   foo (&x, &y);
 >>  >> 
 >>  >> 
 >>  >> We'd have distinct tags for x & y, even though they have the same alias
 >>  >> set.
 >>  >> 
 >>  >Now you lost me.  I thought you were separating load from store
 >>  >aliases?  Aren't x and y store aliases here?  Or is it load
 >>  >aliases?  I think I need a step-by-step description again :)
 >> They are stores, but they are stores to distinct VAR_DECLs.  There's no
 >> way they can actually alias each other as the underlying objects will
 >> always be distinct.  Ie, there is no way that x = foo will every modify y.

So, just to provide a little more color on the two ways I think we can
improve on the initial generation of may alias code.

As I stated before, both are based on the notion that separating loads
from stores can both speed up the alias analysis phase and provide
more accurate information.    I've already shown data which supports
the belief that we can speed up the compiler.

In the first scheme we keep separate tags for addressable VAR_DECLs
even if they have the same alias set number.  This requires us to
walk all the store tags when checking to see if a load is aliased by
any of the stores.  Note that if we encounter a store via an 
INDIRECT_REF we glob all the stores via addressable VAR_DECLs with
the same alias set number to use the tag for the INDIRECT_REF.

The second scheme is simpler in that we always glob addressable VAR_DECLs
with the same alias set into a single tag.  This can lead to slightly
worse alias information than the first scheme.  Interestingly enough it
also had slightly worse overall performance than the first scheme
(but was still better than the code on the branch).

First I instrumented find_alias_for.  For the current branch sources
it records how often an alias was found or not found when walking the
indirect_refs and how often an alias was found or not found when
walking the addresssables.

Calls for indirect_refs which found an alias          : 79896
Calls for indirect_refs which did not find an alias   :     8
Totals calls for indirect_refs:                       : 79904

Calls for addressable_vars which found an alias       : 16382
Calls for addressable_vars which did not find an alias:   469
Total calls for addressable_vars:                     : 16851

Total calls:                                          : 96755

As you can see the vast majority of calls resulted in finding an
alias.

Then I instrumented the compiler using the two schemes noted
above.  Because the arrays are tracking totally different concepts
than the current branch the numbers below aren't a direct comparison,
but they do give you something to think about:

Calls for stores which found a store alias            : 31665  (31659)
Calls for stores which did not find a store alias     :     0
Total calls for stores                                : 31665

Calls for loads which found a store alias             : 69780  (69932)
Calls for loads which did not find a store alias      :  4801
Total calls for loads                                 : 74581

Total calls                                           : 106246

The number in parenthesis is how many aliases were found -- remember
we can't stop when we find the first alias.    As you can see it
was pretty rare to find a second alias.

For the scheme which separates loads from stores, but always globs
addressable VAR_DECLs we get:

Calls for stores which found an alias                 : 31659
Calls for stores which did not find an alias:         :     0
Total calls for stores                                : 31659

Calls for loads which found an alias                  : 69808
Calls for loads which did not find an alias           :  4773
Total calls for loads                                 : 74581

Total calls                                           : 106240

Note that since this scheme globs, we can stop when we find the
first alias.

So what does all this mean?

First, my scheme surprisingly does 10.5% more calls into find_alias_for.
I say surprising because the compilers with my scheme are both around
1% faster than the current branch.  I suspect, but have not confirmed
that the increase in calls can be accounted for by the fact that
my schemes generate a call for both a load and a store through the same
INDIRECT_REF, but the current sources would generate a single call.

The other difference is the number of calls which resulted in finding
no may aliases.  On the current branch only 0.5% of the calls resulted
in finding no aliases.  Contrast this to my code in which 4.4% of the
calls resulted in not finding an alias (this is a good thing).
Presumably this results in fewer vdef/vuses that need to be created,
rewritten, etc etc.  But I haven't verified this.
[ In a worst case scenario the 4000 disambiguated references could
  come from the 10k extra calls.  I really doubt that's the case, but
  it's definitely possible. ]

Anyway, we could use some flag bits to reduce the redundant calls, which
might give us results more comparable to the current branch sources.

Anyway, I'm going to table this until Diego gets back.  While I don't
think there's an affect on the SSA builder and I think that doing something
in this space is wise, it's clear that there's stuff to hash through.

Jeff

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2003-02-20 18:11 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2003-02-11 17:29 [tree-ssa] alias analysis law
2003-02-11 18:44 ` Diego Novillo
2003-02-11 18:51   ` law
2003-02-11 20:35     ` Diego Novillo
2003-02-11 20:35       ` Diego Novillo
2003-02-11 22:39       ` [tree-ssa]: Always compute sets (Was Re: [tree-ssa] alias analysis) Daniel Berlin
2003-02-12  4:37       ` [tree-ssa] alias analysis law
2003-02-12  7:33         ` Daniel Berlin
2003-02-13 15:55           ` Diego Novillo
2003-02-13 16:00             ` Daniel Berlin
2003-02-13 16:14               ` Diego Novillo
2003-02-13 16:00             ` Daniel Berlin
2003-02-13 16:23               ` Diego Novillo
2003-02-13 17:35                 ` Daniel Berlin
2003-02-13 15:32         ` Diego Novillo
2003-02-15  0:20           ` law
2003-02-15  2:01             ` Daniel Berlin
2003-02-19  1:04             ` Diego Novillo
2003-02-19  2:54               ` law
2003-02-20  1:43           ` law
2003-02-20  1:53             ` Diego Novillo
2003-02-20 18:18               ` law
2003-02-12  6:30       ` law

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).