public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH][RFC] Restrict, take 42
@ 2014-09-03 13:00 Richard Biener
  2014-09-05 12:44 ` Richard Biener
  0 siblings, 1 reply; 4+ messages in thread
From: Richard Biener @ 2014-09-03 13:00 UTC (permalink / raw)
  To: gcc-patches; +Cc: Michael Matz


Ok, so with recent activity in that mgrid bug (PR55334) I tried
to remember what solution we thought of after determining that
ADD_RESTRICT is a no-go.

The following very prototypish patch implements the idea of
computing known non-dependences and maintaining them over
the compilation (mainly inlining / cloning for PR55334).

So the patch piggy-backs on PTA (bad - PTA doesn't compute
must-aliases, so it will work only for a very limited set
of testcases now - but at least it might work for
non-register restrict stuff).

The representation of "non-dependences" is the most interesting
bit of course.  We partition memory references into different
cliques in which all references are independent.  And we
split that clique again into sets of references based on the
same pointer.

That allows us to disambiguate equal clique but distinct pointer
memory references.

For restrict you'd put all(!) memory references in the BLOCK
where the restrict pointer is live into the same clique
and assign a distinct ptr based on which restrict pointer this
is based on.  You make all references not based on any
restrict pointer have ptr == 0 (not implemented in the prototype
patch - they don't get a clique assigned).

The patch simplifies this by taking function scope as the
only BLOCK to consider, thus inside a function the clique
will be unique (before inlining).

I can see issues arising with assigning numbers in the frontend
based on real BLOCKs and
  
  {
    int * restrict q;
    {
      int * restrict p;
      *p = *q;
    }
    {
      int * restrict r;
      *r = *q;
    }

that is, non-dependences based on pointers coming from different
nested scopes.  The FE in this case would need to "duplicate"
the ptr value for 'q' to not make *p and *r falsely a
non-dependence.  But I think we're not planning to assign
this in the C frontend family (maybe the Fortran FE though).

To preserve correctness cliques from inlined functions need to
be remapped to an unused clique number so struct function
gets a max_clique number.  (remapping not implemented in
the prototype)

Any correctness / missed-optimization holes to punch?  That is,
given a set of non-dependent reference pairs, can you assign
that clique, ptr values in a correct and optimal way?
(it somehow assumes transitivity, if the set is *a, *b; *b, *c;
then *a, *c are also non-dependent)

It all depends on a conservative but agressive must-be-based-on
analysis of course.  You'd meet that with the conservative
may-be-based-on analysis from PTA and compute the proper
non-dependences from that.

Comments welcome.

Patch tested on

static inline void foo (int * __restrict__ p, int * __restrict__ q)
{
  int i;
  for (i = 0; i < 1024; ++i)
    p[i] = q[i];
}

void bar (int *p, int *q)
{
  foo (p, q);
}

where it still vectorizes the loop in bar without alias check.

Richard.

Index: trunk/gcc/function.h
===================================================================
*** trunk.orig/gcc/function.h	2014-08-29 14:18:27.221364973 +0200
--- trunk/gcc/function.h	2014-09-03 13:15:11.433883630 +0200
*************** struct GTY(()) function {
*** 592,597 ****
--- 592,600 ----
       a string describing the reason for failure.  */
    const char * GTY((skip)) cannot_be_copied_reason;
  
+   /* Last assigned restrict UID.  */
+   unsigned short last_restrict_uid;
+ 
    /* Collected bit flags.  */
  
    /* Number of units of general registers that need saving in stdarg
Index: trunk/gcc/tree-inline.c
===================================================================
*** trunk.orig/gcc/tree-inline.c	2014-08-26 13:40:18.811368135 +0200
--- trunk/gcc/tree-inline.c	2014-09-03 14:15:14.792635543 +0200
*************** remap_gimple_op_r (tree *tp, int *walk_s
*** 929,934 ****
--- 929,937 ----
  	  TREE_THIS_VOLATILE (*tp) = TREE_THIS_VOLATILE (old);
  	  TREE_SIDE_EFFECTS (*tp) = TREE_SIDE_EFFECTS (old);
  	  TREE_NO_WARNING (*tp) = TREE_NO_WARNING (old);
+ 	  /* ???  Properly remap the clique to a new one.  */
+ 	  if (old->base.u.version != 0)
+ 	    (*tp)->base.u.version = old->base.u.version;
  	  /* We cannot propagate the TREE_THIS_NOTRAP flag if we have
  	     remapped a parameter as the property might be valid only
  	     for the parameter itself.  */
*************** copy_tree_body_r (tree *tp, int *walk_su
*** 1180,1185 ****
--- 1183,1191 ----
  	  TREE_THIS_VOLATILE (*tp) = TREE_THIS_VOLATILE (old);
  	  TREE_SIDE_EFFECTS (*tp) = TREE_SIDE_EFFECTS (old);
  	  TREE_NO_WARNING (*tp) = TREE_NO_WARNING (old);
+ 	  /* ???  Properly remap the clique to a new one.  */
+ 	  if (old->base.u.version != 0)
+ 	    (*tp)->base.u.version = old->base.u.version;
  	  /* We cannot propagate the TREE_THIS_NOTRAP flag if we have
  	     remapped a parameter as the property might be valid only
  	     for the parameter itself.  */
Index: trunk/gcc/tree-pretty-print.c
===================================================================
*** trunk.orig/gcc/tree-pretty-print.c	2014-08-04 11:50:13.421690694 +0200
--- trunk/gcc/tree-pretty-print.c	2014-09-03 14:04:41.816679122 +0200
*************** dump_generic_node (pretty_printer *buffe
*** 1071,1077 ****
  	    /* Same value types ignoring qualifiers.  */
  	    && (TYPE_MAIN_VARIANT (TREE_TYPE (node))
  		== TYPE_MAIN_VARIANT
! 		    (TREE_TYPE (TREE_TYPE (TREE_OPERAND (node, 1))))))
  	  {
  	    if (TREE_CODE (TREE_OPERAND (node, 0)) != ADDR_EXPR)
  	      {
--- 1071,1078 ----
  	    /* Same value types ignoring qualifiers.  */
  	    && (TYPE_MAIN_VARIANT (TREE_TYPE (node))
  		== TYPE_MAIN_VARIANT
! 		    (TREE_TYPE (TREE_TYPE (TREE_OPERAND (node, 1)))))
! 	    && node->base.u.version == 0)
  	  {
  	    if (TREE_CODE (TREE_OPERAND (node, 0)) != ADDR_EXPR)
  	      {
*************** dump_generic_node (pretty_printer *buffe
*** 1102,1107 ****
--- 1103,1115 ----
  		dump_generic_node (buffer, TREE_OPERAND (node, 1),
  				   spc, flags, false);
  	      }
+ 	    if (node->base.u.version != 0)
+ 	      {
+ 		pp_string (buffer, " clq ");
+ 		pp_unsigned_wide_integer (buffer, node->base.u.version >> 16);
+ 		pp_string (buffer, " ptr ");
+ 		pp_unsigned_wide_integer (buffer, node->base.u.version & 0xffff);
+ 	      }
  	    pp_right_bracket (buffer);
  	  }
  	break;
*************** dump_generic_node (pretty_printer *buffe
*** 1440,1446 ****
  		  /* Same value types ignoring qualifiers.  */
  		  && (TYPE_MAIN_VARIANT (TREE_TYPE (op0))
  		      == TYPE_MAIN_VARIANT
! 		          (TREE_TYPE (TREE_TYPE (TREE_OPERAND (op0, 1))))))))
  	{
  	  op0 = TREE_OPERAND (op0, 0);
  	  str = "->";
--- 1448,1455 ----
  		  /* Same value types ignoring qualifiers.  */
  		  && (TYPE_MAIN_VARIANT (TREE_TYPE (op0))
  		      == TYPE_MAIN_VARIANT
! 		          (TREE_TYPE (TREE_TYPE (TREE_OPERAND (op0, 1)))))
! 		  && op0->base.u.version == 0)))
  	{
  	  op0 = TREE_OPERAND (op0, 0);
  	  str = "->";
Index: trunk/gcc/tree-ssa-alias.c
===================================================================
*** trunk.orig/gcc/tree-ssa-alias.c	2014-08-26 13:40:16.710368279 +0200
--- trunk/gcc/tree-ssa-alias.c	2014-09-03 14:25:17.324594059 +0200
*************** refs_may_alias_p_1 (ao_ref *ref1, ao_ref
*** 1371,1376 ****
--- 1371,1406 ----
      return decl_refs_may_alias_p (ref1->ref, base1, offset1, max_size1,
  				  ref2->ref, base2, offset2, max_size2);
  
+   /* Handle restrict based accesses.
+      ???  ao_ref_base strips inner MEM_REF [&decl], recover from that
+      here.  */
+   tree rbase1 = base1;
+   tree rbase2 = base2;
+   if (var1_p)
+     {
+       rbase1 = ref1->ref;
+       if (rbase1)
+ 	while (handled_component_p (rbase1))
+ 	  rbase1 = TREE_OPERAND (rbase1, 0);
+     }
+   if (var2_p)
+     {
+       rbase2 = ref2->ref;
+       if (rbase2)
+ 	while (handled_component_p (rbase2))
+ 	  rbase2 = TREE_OPERAND (rbase2, 0);
+     }
+   if (rbase1 && rbase2
+       && TREE_CODE (base1) == MEM_REF
+       && TREE_CODE (base2) == MEM_REF
+       && base1->base.u.version != 0
+       && base2->base.u.version != 0
+       /* If the accesses are in the same restrict clique... */
+       && (base1->base.u.version >> 16) == (base2->base.u.version >> 16)
+       /* But based on different pointers they do not alias.  */
+       && (base1->base.u.version & 0xffff) != (base2->base.u.version & 0xffff))
+     return false;
+ 
    ind1_p = (TREE_CODE (base1) == MEM_REF
  	    || TREE_CODE (base1) == TARGET_MEM_REF);
    ind2_p = (TREE_CODE (base2) == MEM_REF
Index: trunk/gcc/tree-ssa-structalias.c
===================================================================
*** trunk.orig/gcc/tree-ssa-structalias.c	2014-09-02 10:13:58.150580778 +0200
--- trunk/gcc/tree-ssa-structalias.c	2014-09-03 14:23:10.540602788 +0200
***************
*** 52,57 ****
--- 52,59 ----
  #include "splay-tree.h"
  #include "params.h"
  #include "alias.h"
+ #include "tree-phinodes.h"
+ #include "ssa-iterators.h"
  
  /* The idea behind this analyzer is to generate set constraints from the
     program, then solve the resulting constraints in order to generate the
*************** struct variable_info
*** 272,283 ****
--- 274,292 ----
    /* True if this field has only restrict qualified pointers.  */
    unsigned int only_restrict_pointers : 1;
  
+   /* True if this represents a heap var created for a restrict qualified
+      pointer.  */
+   unsigned int is_restrict_var : 1;
+ 
    /* True if this represents a global variable.  */
    unsigned int is_global_var : 1;
  
    /* True if this represents a IPA function info.  */
    unsigned int is_fn_info : 1;
  
+   /* ???  Store somewhere better.  */
+   unsigned short ruid;
+ 
    /* The ID of the variable for the next field in this structure
       or zero for the last field in this structure.  */
    unsigned next;
*************** new_var_info (tree t, const char *name)
*** 369,374 ****
--- 378,384 ----
    ret->is_heap_var = false;
    ret->may_have_pointers = true;
    ret->only_restrict_pointers = false;
+   ret->is_restrict_var = false;
    ret->is_global_var = (t == NULL_TREE);
    ret->is_fn_info = false;
    if (t && DECL_P (t))
*************** static varinfo_t
*** 3773,3778 ****
--- 3783,3789 ----
  make_constraint_from_restrict (varinfo_t lhs, const char *name)
  {
    varinfo_t vi = make_heapvar (name);
+   vi->is_restrict_var = 1;
    vi->is_global_var = 1;
    vi->may_have_pointers = 1;
    make_constraint_from (lhs, vi->id);
*************** compute_may_aliases (void)
*** 6948,6953 ****
--- 6959,7046 ----
    if (dump_file)
      dump_alias_info (dump_file);
  
+     {
+       unsigned short clique = 0;
+       unsigned short last_ruid = 0;
+       for (unsigned i = 0; i < num_ssa_names; ++i)
+ 	{
+ 	  tree ptr = ssa_name (i);
+ 	  if (!ptr || !POINTER_TYPE_P (TREE_TYPE (ptr)))
+ 	    continue;
+ 
+ 	  /* Avoid all this when ptr is not dereferenced?  */
+ 	  tree p = ptr;
+ 	  if (SSA_NAME_IS_DEFAULT_DEF (ptr)
+ 	      && (TREE_CODE (SSA_NAME_VAR (ptr)) == PARM_DECL
+ 		  || TREE_CODE (SSA_NAME_VAR (ptr)) == RESULT_DECL))
+ 	    p = SSA_NAME_VAR (ptr);
+ 
+ 	  varinfo_t vi = get_varinfo (find (lookup_vi_for_tree (p)->id));
+ 	  bitmap_iterator bi;
+ 	  unsigned j;
+ 	  varinfo_t restrict_var = NULL;
+ 	  EXECUTE_IF_SET_IN_BITMAP (vi->solution, 0, j, bi)
+ 	    {
+ 	      varinfo_t oi = get_varinfo (j);
+ 	      if (oi->is_restrict_var)
+ 		{
+ 		  if (restrict_var)
+ 		    {
+ 		      restrict_var = NULL;
+ 		      break;
+ 		    }
+ 		  restrict_var = oi;
+ 		}
+ 	      /* NULL is the only other valid points-to entry.  */
+ 	      else if (oi->id != nothing_id)
+ 		{
+ 		  restrict_var = NULL;
+ 		  break;
+ 		}
+ 	    }
+ 	  /* Ok, found that ptr must(!) point to a single(!) restrict
+ 	     variable.  */
+ 	  /* ???  PTA isn't really a proper propagation engine to compute
+ 	     this property.
+ 	     ???  We can handle merging of two restricts
+ 	     by unifying them.  */
+ 	  if (restrict_var)
+ 	    {
+ 	      if (clique == 0)
+ 		clique = ++cfun->last_restrict_uid;
+ 	      if (restrict_var->ruid == 0)
+ 		restrict_var->ruid = ++last_ruid;
+ 
+ 	      /* Now look at possible dereferences of ptr.  */
+ 	      imm_use_iterator ui;
+ 	      gimple use_stmt;
+ 	      FOR_EACH_IMM_USE_STMT (use_stmt, ui, ptr)
+ 		{
+ 		  /* ???  Calls and asms.  */
+ 		  if (!gimple_assign_single_p (use_stmt))
+ 		    continue;
+ 		  tree *ref = gimple_assign_lhs_ptr (use_stmt);
+ 		  while (handled_component_p (*ref))
+ 		    ref = &TREE_OPERAND (*ref, 0);
+ 		  if (TREE_CODE (*ref) == MEM_REF
+ 		      && TREE_OPERAND (*ref, 0) == ptr)
+ 		    (*ref)->base.u.version = clique << 16 | restrict_var->ruid;
+ 		  ref = gimple_assign_rhs1_ptr (use_stmt);
+ 		  while (handled_component_p (*ref))
+ 		    ref = &TREE_OPERAND (*ref, 0);
+ 		  if (TREE_CODE (*ref) == MEM_REF
+ 		      && TREE_OPERAND (*ref, 0) == ptr)
+ 		    (*ref)->base.u.version = clique << 16 | restrict_var->ruid;
+ 		}
+ 
+ 	      /* ???  Accesses not based on a restrict pointer should
+ 	         all get into the clique but share a single ptr ID.
+ 		 That way they get disabiguated against restrict
+ 		 accesses but not against each other.  */
+ 	    }
+ 	}
+     }
+ 
    /* Deallocate memory used by aliasing data structures and the internal
       points-to solution.  */
    delete_points_to_sets ();
Index: trunk/gcc/tree-data-ref.c
===================================================================
*** trunk.orig/gcc/tree-data-ref.c	2014-08-14 14:38:53.735508565 +0200
--- trunk/gcc/tree-data-ref.c	2014-09-03 14:55:51.995467744 +0200
*************** dr_analyze_indices (struct data_referenc
*** 979,987 ****
--- 979,989 ----
  	     guaranteed.
  	     As a band-aid, mark the access so we can special-case
  	     it in dr_may_alias_p.  */
+ 	  tree old = ref;
  	  ref = fold_build2_loc (EXPR_LOCATION (ref),
  				 MEM_REF, TREE_TYPE (ref),
  				 base, memoff);
+ 	  ref->base.u.version = old->base.u.version;
  	  access_fns.safe_push (access_fn);
  	}
      }
*************** dr_may_alias_p (const struct data_refere
*** 1388,1393 ****
--- 1390,1403 ----
  	return false;
      }
  
+   if (TREE_CODE (addr_a) == MEM_REF
+       && TREE_CODE (addr_b) == MEM_REF
+       && addr_a->base.u.version != 0
+       && addr_b->base.u.version != 0
+       && (addr_a->base.u.version >> 16) == (addr_b->base.u.version >> 16)
+       && (addr_a->base.u.version & 0xffff) != (addr_b->base.u.version & 0xffff))
+     return false;
+ 
    /* If we had an evolution in a pointer-based MEM_REF BASE_OBJECT we
       do not know the size of the base-object.  So we cannot do any
       offset/overlap based analysis but have to rely on points-to

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH][RFC] Restrict, take 42
  2014-09-03 13:00 [PATCH][RFC] Restrict, take 42 Richard Biener
@ 2014-09-05 12:44 ` Richard Biener
  2014-09-08 10:04   ` Richard Biener
  0 siblings, 1 reply; 4+ messages in thread
From: Richard Biener @ 2014-09-05 12:44 UTC (permalink / raw)
  To: gcc-patches; +Cc: Michael Matz

On Wed, 3 Sep 2014, Richard Biener wrote:

> 
> Ok, so with recent activity in that mgrid bug (PR55334) I tried
> to remember what solution we thought of after determining that
> ADD_RESTRICT is a no-go.
> 
> The following very prototypish patch implements the idea of
> computing known non-dependences and maintaining them over
> the compilation (mainly inlining / cloning for PR55334).
> 
> So the patch piggy-backs on PTA (bad - PTA doesn't compute
> must-aliases, so it will work only for a very limited set
> of testcases now - but at least it might work for
> non-register restrict stuff).
> 
> The representation of "non-dependences" is the most interesting
> bit of course.  We partition memory references into different
> cliques in which all references are independent.  And we
> split that clique again into sets of references based on the
> same pointer.
> 
> That allows us to disambiguate equal clique but distinct pointer
> memory references.
> 
> For restrict you'd put all(!) memory references in the BLOCK
> where the restrict pointer is live into the same clique
> and assign a distinct ptr based on which restrict pointer this
> is based on.  You make all references not based on any
> restrict pointer have ptr == 0 (not implemented in the prototype
> patch - they don't get a clique assigned).
> 
> The patch simplifies this by taking function scope as the
> only BLOCK to consider, thus inside a function the clique
> will be unique (before inlining).
> 
> I can see issues arising with assigning numbers in the frontend
> based on real BLOCKs and
>   
>   {
>     int * restrict q;
>     {
>       int * restrict p;
>       *p = *q;
>     }
>     {
>       int * restrict r;
>       *r = *q;
>     }
> 
> that is, non-dependences based on pointers coming from different
> nested scopes.  The FE in this case would need to "duplicate"
> the ptr value for 'q' to not make *p and *r falsely a
> non-dependence.  But I think we're not planning to assign
> this in the C frontend family (maybe the Fortran FE though).
> 
> To preserve correctness cliques from inlined functions need to
> be remapped to an unused clique number so struct function
> gets a max_clique number.  (remapping not implemented in
> the prototype)
> 
> Any correctness / missed-optimization holes to punch?  That is,
> given a set of non-dependent reference pairs, can you assign
> that clique, ptr values in a correct and optimal way?
> (it somehow assumes transitivity, if the set is *a, *b; *b, *c;
> then *a, *c are also non-dependent)
> 
> It all depends on a conservative but agressive must-be-based-on
> analysis of course.  You'd meet that with the conservative
> may-be-based-on analysis from PTA and compute the proper
> non-dependences from that.
> 
> Comments welcome.
> 
> Patch tested on
> 
> static inline void foo (int * __restrict__ p, int * __restrict__ q)
> {
>   int i;
>   for (i = 0; i < 1024; ++i)
>     p[i] = q[i];
> }
> 
> void bar (int *p, int *q)
> {
>   foo (p, q);
> }
> 
> where it still vectorizes the loop in bar without alias check.

Updated patch with a tiny fix to manage fixing PR55334 (IPA-CP
issue with mgrid) and renaming all-over-the-place.

Still lacks the clique remapping code in the inliner which is
required for correctness and still lacks code to deal with
disambiguating restrct vs. non-restrict accesses.

Improves mgrid performance by 10%, still fixes the above tiny
testcase but otherwise untested.

Richard.

2014-09-05  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/55334

Index: trunk/gcc/function.h
===================================================================
*** trunk.orig/gcc/function.h	2014-09-05 14:05:43.799777783 +0200
--- trunk/gcc/function.h	2014-09-05 14:10:30.961758013 +0200
*************** struct GTY(()) function {
*** 593,598 ****
--- 593,601 ----
       a string describing the reason for failure.  */
    const char * GTY((skip)) cannot_be_copied_reason;
  
+   /* Last assigned dependence info clique.  */
+   unsigned short last_clique;
+ 
    /* Collected bit flags.  */
  
    /* Number of units of general registers that need saving in stdarg
Index: trunk/gcc/tree-inline.c
===================================================================
*** trunk.orig/gcc/tree-inline.c	2014-09-05 14:05:43.799777783 +0200
--- trunk/gcc/tree-inline.c	2014-09-05 14:10:10.303759435 +0200
*************** remap_gimple_op_r (tree *tp, int *walk_s
*** 929,934 ****
--- 929,938 ----
  	  TREE_THIS_VOLATILE (*tp) = TREE_THIS_VOLATILE (old);
  	  TREE_SIDE_EFFECTS (*tp) = TREE_SIDE_EFFECTS (old);
  	  TREE_NO_WARNING (*tp) = TREE_NO_WARNING (old);
+ 	  /* ???  Properly remap the clique to a new one.  */
+ 	  if (MR_DEPENDENCE_CLIQUE (old) != 0)
+ 	    MR_DEPENDENCE_CLIQUE (*tp) = MR_DEPENDENCE_CLIQUE (old);
+ 	  MR_DEPENDENCE_BASE (*tp) = MR_DEPENDENCE_BASE (old);
  	  /* We cannot propagate the TREE_THIS_NOTRAP flag if we have
  	     remapped a parameter as the property might be valid only
  	     for the parameter itself.  */
*************** copy_tree_body_r (tree *tp, int *walk_su
*** 1180,1185 ****
--- 1184,1193 ----
  	  TREE_THIS_VOLATILE (*tp) = TREE_THIS_VOLATILE (old);
  	  TREE_SIDE_EFFECTS (*tp) = TREE_SIDE_EFFECTS (old);
  	  TREE_NO_WARNING (*tp) = TREE_NO_WARNING (old);
+ 	  /* ???  Properly remap the clique to a new one.  */
+ 	  if (MR_DEPENDENCE_CLIQUE (old) != 0)
+ 	    MR_DEPENDENCE_CLIQUE (*tp) = MR_DEPENDENCE_CLIQUE (old);
+ 	  MR_DEPENDENCE_BASE (*tp) = MR_DEPENDENCE_BASE (old);
  	  /* We cannot propagate the TREE_THIS_NOTRAP flag if we have
  	     remapped a parameter as the property might be valid only
  	     for the parameter itself.  */
Index: trunk/gcc/tree-pretty-print.c
===================================================================
*** trunk.orig/gcc/tree-pretty-print.c	2014-09-05 14:05:43.799777783 +0200
--- trunk/gcc/tree-pretty-print.c	2014-09-05 14:13:42.869744800 +0200
*************** dump_generic_node (pretty_printer *buffe
*** 1081,1087 ****
  	    /* Same value types ignoring qualifiers.  */
  	    && (TYPE_MAIN_VARIANT (TREE_TYPE (node))
  		== TYPE_MAIN_VARIANT
! 		    (TREE_TYPE (TREE_TYPE (TREE_OPERAND (node, 1))))))
  	  {
  	    if (TREE_CODE (TREE_OPERAND (node, 0)) != ADDR_EXPR)
  	      {
--- 1081,1088 ----
  	    /* Same value types ignoring qualifiers.  */
  	    && (TYPE_MAIN_VARIANT (TREE_TYPE (node))
  		== TYPE_MAIN_VARIANT
! 		    (TREE_TYPE (TREE_TYPE (TREE_OPERAND (node, 1)))))
! 	    && MR_DEPENDENCE_CLIQUE (node) == 0)
  	  {
  	    if (TREE_CODE (TREE_OPERAND (node, 0)) != ADDR_EXPR)
  	      {
*************** dump_generic_node (pretty_printer *buffe
*** 1112,1117 ****
--- 1113,1125 ----
  		dump_generic_node (buffer, TREE_OPERAND (node, 1),
  				   spc, flags, false);
  	      }
+ 	    if (MR_DEPENDENCE_CLIQUE (node) != 0)
+ 	      {
+ 		pp_string (buffer, " clique ");
+ 		pp_unsigned_wide_integer (buffer, MR_DEPENDENCE_CLIQUE (node));
+ 		pp_string (buffer, " base ");
+ 		pp_unsigned_wide_integer (buffer, MR_DEPENDENCE_BASE (node));
+ 	      }
  	    pp_right_bracket (buffer);
  	  }
  	break;
*************** dump_generic_node (pretty_printer *buffe
*** 1450,1456 ****
  		  /* Same value types ignoring qualifiers.  */
  		  && (TYPE_MAIN_VARIANT (TREE_TYPE (op0))
  		      == TYPE_MAIN_VARIANT
! 		          (TREE_TYPE (TREE_TYPE (TREE_OPERAND (op0, 1))))))))
  	{
  	  op0 = TREE_OPERAND (op0, 0);
  	  str = "->";
--- 1458,1465 ----
  		  /* Same value types ignoring qualifiers.  */
  		  && (TYPE_MAIN_VARIANT (TREE_TYPE (op0))
  		      == TYPE_MAIN_VARIANT
! 		          (TREE_TYPE (TREE_TYPE (TREE_OPERAND (op0, 1)))))
! 		  && MR_DEPENDENCE_CLIQUE (op0) == 0)))
  	{
  	  op0 = TREE_OPERAND (op0, 0);
  	  str = "->";
Index: trunk/gcc/tree-ssa-alias.c
===================================================================
*** trunk.orig/gcc/tree-ssa-alias.c	2014-09-05 14:05:43.799777783 +0200
--- trunk/gcc/tree-ssa-alias.c	2014-09-05 14:10:10.304759435 +0200
*************** refs_may_alias_p_1 (ao_ref *ref1, ao_ref
*** 1371,1376 ****
--- 1371,1404 ----
      return decl_refs_may_alias_p (ref1->ref, base1, offset1, max_size1,
  				  ref2->ref, base2, offset2, max_size2);
  
+   /* Handle restrict based accesses.
+      ???  ao_ref_base strips inner MEM_REF [&decl], recover from that
+      here.  */
+   tree rbase1 = base1;
+   tree rbase2 = base2;
+   if (var1_p)
+     {
+       rbase1 = ref1->ref;
+       if (rbase1)
+ 	while (handled_component_p (rbase1))
+ 	  rbase1 = TREE_OPERAND (rbase1, 0);
+     }
+   if (var2_p)
+     {
+       rbase2 = ref2->ref;
+       if (rbase2)
+ 	while (handled_component_p (rbase2))
+ 	  rbase2 = TREE_OPERAND (rbase2, 0);
+     }
+   if (rbase1 && rbase2
+       && (TREE_CODE (base1) == MEM_REF || TREE_CODE (base1) == TARGET_MEM_REF)
+       && (TREE_CODE (base2) == MEM_REF || TREE_CODE (base2) == TARGET_MEM_REF)
+       /* If the accesses are in the same restrict clique... */
+       && MR_DEPENDENCE_CLIQUE (base1) == MR_DEPENDENCE_CLIQUE (base2)
+       /* But based on different pointers they do not alias.  */
+       && MR_DEPENDENCE_BASE (base1) != MR_DEPENDENCE_BASE (base2))
+     return false;
+ 
    ind1_p = (TREE_CODE (base1) == MEM_REF
  	    || TREE_CODE (base1) == TARGET_MEM_REF);
    ind2_p = (TREE_CODE (base2) == MEM_REF
Index: trunk/gcc/tree-ssa-structalias.c
===================================================================
*** trunk.orig/gcc/tree-ssa-structalias.c	2014-09-05 14:05:43.799777783 +0200
--- trunk/gcc/tree-ssa-structalias.c	2014-09-05 14:25:22.078696660 +0200
***************
*** 52,57 ****
--- 52,60 ----
  #include "splay-tree.h"
  #include "params.h"
  #include "alias.h"
+ #include "tree-phinodes.h"
+ #include "ssa-iterators.h"
+ #include "tree-pretty-print.h"
  
  /* The idea behind this analyzer is to generate set constraints from the
     program, then solve the resulting constraints in order to generate the
*************** struct variable_info
*** 272,283 ****
--- 275,293 ----
    /* True if this field has only restrict qualified pointers.  */
    unsigned int only_restrict_pointers : 1;
  
+   /* True if this represents a heap var created for a restrict qualified
+      pointer.  */
+   unsigned int is_restrict_var : 1;
+ 
    /* True if this represents a global variable.  */
    unsigned int is_global_var : 1;
  
    /* True if this represents a IPA function info.  */
    unsigned int is_fn_info : 1;
  
+   /* ???  Store somewhere better.  */
+   unsigned short ruid;
+ 
    /* The ID of the variable for the next field in this structure
       or zero for the last field in this structure.  */
    unsigned next;
*************** new_var_info (tree t, const char *name)
*** 369,374 ****
--- 379,385 ----
    ret->is_heap_var = false;
    ret->may_have_pointers = true;
    ret->only_restrict_pointers = false;
+   ret->is_restrict_var = false;
    ret->is_global_var = (t == NULL_TREE);
    ret->is_fn_info = false;
    if (t && DECL_P (t))
*************** static varinfo_t
*** 3773,3778 ****
--- 3784,3790 ----
  make_constraint_from_restrict (varinfo_t lhs, const char *name)
  {
    varinfo_t vi = make_heapvar (name);
+   vi->is_restrict_var = 1;
    vi->is_global_var = 1;
    vi->may_have_pointers = 1;
    make_constraint_from (lhs, vi->id);
*************** intra_create_variable_infos (struct func
*** 5845,5850 ****
--- 5857,5863 ----
  	  tree heapvar = build_fake_var_decl (TREE_TYPE (TREE_TYPE (t)));
  	  DECL_EXTERNAL (heapvar) = 1;
  	  vi = create_variable_info_for_1 (heapvar, "PARM_NOALIAS");
+ 	  vi->is_restrict_var = 1;
  	  insert_vi_for_tree (heapvar, vi);
  	  lhsc.var = p->id;
  	  lhsc.type = SCALAR;
*************** compute_may_aliases (void)
*** 6948,6953 ****
--- 6961,7061 ----
    if (dump_file)
      dump_alias_info (dump_file);
  
+     {
+       unsigned short clique = 0;
+       unsigned short last_ruid = 0;
+       for (unsigned i = 0; i < num_ssa_names; ++i)
+ 	{
+ 	  tree ptr = ssa_name (i);
+ 	  if (!ptr || !POINTER_TYPE_P (TREE_TYPE (ptr)))
+ 	    continue;
+ 
+ 	  /* Avoid all this when ptr is not dereferenced?  */
+ 	  tree p = ptr;
+ 	  if (SSA_NAME_IS_DEFAULT_DEF (ptr)
+ 	      && (TREE_CODE (SSA_NAME_VAR (ptr)) == PARM_DECL
+ 		  || TREE_CODE (SSA_NAME_VAR (ptr)) == RESULT_DECL))
+ 	    p = SSA_NAME_VAR (ptr);
+ 
+ 	  varinfo_t vi = get_varinfo (find (lookup_vi_for_tree (p)->id));
+ 	  bitmap_iterator bi;
+ 	  unsigned j;
+ 	  varinfo_t restrict_var = NULL;
+ 	  EXECUTE_IF_SET_IN_BITMAP (vi->solution, 0, j, bi)
+ 	    {
+ 	      varinfo_t oi = get_varinfo (j);
+ 	      if (oi->is_restrict_var)
+ 		{
+ 		  if (restrict_var)
+ 		    {
+ 		      if (dump_file && (dump_flags & TDF_DETAILS))
+ 			{
+ 			  fprintf (dump_file, "found restrict pointed-to "
+ 				   "for ");
+ 			  print_generic_expr (dump_file, ptr, 0);
+ 			  fprintf (dump_file, " but not exclusively\n");
+ 			}
+ 		      restrict_var = NULL;
+ 		      break;
+ 		    }
+ 		  restrict_var = oi;
+ 		}
+ 	      /* NULL is the only other valid points-to entry.  */
+ 	      else if (oi->id != nothing_id)
+ 		{
+ 		  restrict_var = NULL;
+ 		  break;
+ 		}
+ 	    }
+ 	  /* Ok, found that ptr must(!) point to a single(!) restrict
+ 	     variable.  */
+ 	  /* ???  PTA isn't really a proper propagation engine to compute
+ 	     this property.
+ 	     ???  We can handle merging of two restricts
+ 	     by unifying them.  */
+ 	  if (restrict_var)
+ 	    {
+ 	      if (clique == 0)
+ 		clique = ++cfun->last_clique;
+ 	      if (restrict_var->ruid == 0)
+ 		restrict_var->ruid = ++last_ruid;
+ 
+ 	      /* Now look at possible dereferences of ptr.  */
+ 	      imm_use_iterator ui;
+ 	      gimple use_stmt;
+ 	      FOR_EACH_IMM_USE_STMT (use_stmt, ui, ptr)
+ 		{
+ 		  /* ???  Calls and asms.  */
+ 		  if (!gimple_assign_single_p (use_stmt))
+ 		    continue;
+ 		  tree *ref = gimple_assign_lhs_ptr (use_stmt);
+ 		  while (handled_component_p (*ref))
+ 		    ref = &TREE_OPERAND (*ref, 0);
+ 		  if (TREE_CODE (*ref) == MEM_REF
+ 		      && TREE_OPERAND (*ref, 0) == ptr)
+ 		    {
+ 		      MR_DEPENDENCE_CLIQUE (*ref) = clique;
+ 		      MR_DEPENDENCE_BASE (*ref) = restrict_var->ruid;
+ 		    }
+ 		  ref = gimple_assign_rhs1_ptr (use_stmt);
+ 		  while (handled_component_p (*ref))
+ 		    ref = &TREE_OPERAND (*ref, 0);
+ 		  if (TREE_CODE (*ref) == MEM_REF
+ 		      && TREE_OPERAND (*ref, 0) == ptr)
+ 		    {
+ 		      MR_DEPENDENCE_CLIQUE (*ref) = clique;
+ 		      MR_DEPENDENCE_BASE (*ref) = restrict_var->ruid;
+ 		    }
+ 		}
+ 
+ 	      /* ???  Accesses not based on a restrict pointer should
+ 	         all get into the clique but share a single ptr ID.
+ 		 That way they get disabiguated against restrict
+ 		 accesses but not against each other.  */
+ 	    }
+ 	}
+     }
+ 
    /* Deallocate memory used by aliasing data structures and the internal
       points-to solution.  */
    delete_points_to_sets ();
Index: trunk/gcc/tree-data-ref.c
===================================================================
*** trunk.orig/gcc/tree-data-ref.c	2014-09-05 14:05:43.799777783 +0200
--- trunk/gcc/tree-data-ref.c	2014-09-05 14:10:10.306759435 +0200
*************** dr_analyze_indices (struct data_referenc
*** 980,988 ****
--- 980,991 ----
  	     guaranteed.
  	     As a band-aid, mark the access so we can special-case
  	     it in dr_may_alias_p.  */
+ 	  tree old = ref;
  	  ref = fold_build2_loc (EXPR_LOCATION (ref),
  				 MEM_REF, TREE_TYPE (ref),
  				 base, memoff);
+ 	  MR_DEPENDENCE_CLIQUE (ref) = MR_DEPENDENCE_CLIQUE (old);
+ 	  MR_DEPENDENCE_BASE (ref) = MR_DEPENDENCE_BASE (old);
  	  access_fns.safe_push (access_fn);
  	}
      }
*************** dr_may_alias_p (const struct data_refere
*** 1389,1394 ****
--- 1392,1403 ----
  	return false;
      }
  
+   if ((TREE_CODE (addr_a) == MEM_REF || TREE_CODE (addr_a) == TARGET_MEM_REF)
+       && (TREE_CODE (addr_b) == MEM_REF || TREE_CODE (addr_b) == TARGET_MEM_REF)
+       && MR_DEPENDENCE_CLIQUE (addr_a) == MR_DEPENDENCE_CLIQUE (addr_b)
+       && MR_DEPENDENCE_BASE (addr_a) != MR_DEPENDENCE_BASE (addr_b))
+     return false;
+ 
    /* If we had an evolution in a pointer-based MEM_REF BASE_OBJECT we
       do not know the size of the base-object.  So we cannot do any
       offset/overlap based analysis but have to rely on points-to
Index: trunk/gcc/tree-core.h
===================================================================
*** trunk.orig/gcc/tree-core.h	2014-09-05 14:05:43.799777783 +0200
--- trunk/gcc/tree-core.h	2014-09-05 14:10:10.306759435 +0200
*************** struct GTY(()) tree_base {
*** 802,807 ****
--- 802,817 ----
  
      /* Internal function code.  */
      enum internal_fn ifn;
+ 
+     /* The following two fields are used for MEM_REF and TARGET_MEM_REF
+        expression trees and specify known data non-dependences.  For
+        two memory references in a function they are known to not
+        alias if dependence_info.clique are equal and dependence_info.base
+        are distinct.  */
+     struct {
+       unsigned short clique;
+       unsigned short base;
+     } dependence_info;
    } GTY((skip(""))) u;
  };
  
Index: trunk/gcc/tree.h
===================================================================
*** trunk.orig/gcc/tree.h	2014-09-05 14:05:43.799777783 +0200
--- trunk/gcc/tree.h	2014-09-05 14:10:10.307759435 +0200
*************** extern void protected_set_expr_location
*** 1081,1086 ****
--- 1081,1091 ----
  #define TMR_STEP(NODE) (TREE_OPERAND (TARGET_MEM_REF_CHECK (NODE), 3))
  #define TMR_INDEX2(NODE) (TREE_OPERAND (TARGET_MEM_REF_CHECK (NODE), 4))
  
+ #define MR_DEPENDENCE_CLIQUE(NODE) \
+   (TREE_CHECK2 (NODE, MEM_REF, TARGET_MEM_REF)->base.u.dependence_info.clique)
+ #define MR_DEPENDENCE_BASE(NODE) \
+   (TREE_CHECK2 (NODE, MEM_REF, TARGET_MEM_REF)->base.u.dependence_info.base)
+ 
  /* The operands of a BIND_EXPR.  */
  #define BIND_EXPR_VARS(NODE) (TREE_OPERAND (BIND_EXPR_CHECK (NODE), 0))
  #define BIND_EXPR_BODY(NODE) (TREE_OPERAND (BIND_EXPR_CHECK (NODE), 1))

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH][RFC] Restrict, take 42
  2014-09-05 12:44 ` Richard Biener
@ 2014-09-08 10:04   ` Richard Biener
  2014-09-08 14:46     ` Richard Biener
  0 siblings, 1 reply; 4+ messages in thread
From: Richard Biener @ 2014-09-08 10:04 UTC (permalink / raw)
  To: gcc-patches; +Cc: Michael Matz

On Fri, 5 Sep 2014, Richard Biener wrote:

> On Wed, 3 Sep 2014, Richard Biener wrote:
> 
> > 
> > Ok, so with recent activity in that mgrid bug (PR55334) I tried
> > to remember what solution we thought of after determining that
> > ADD_RESTRICT is a no-go.
> > 
> > The following very prototypish patch implements the idea of
> > computing known non-dependences and maintaining them over
> > the compilation (mainly inlining / cloning for PR55334).
> > 
> > So the patch piggy-backs on PTA (bad - PTA doesn't compute
> > must-aliases, so it will work only for a very limited set
> > of testcases now - but at least it might work for
> > non-register restrict stuff).
> > 
> > The representation of "non-dependences" is the most interesting
> > bit of course.  We partition memory references into different
> > cliques in which all references are independent.  And we
> > split that clique again into sets of references based on the
> > same pointer.
> > 
> > That allows us to disambiguate equal clique but distinct pointer
> > memory references.
> > 
> > For restrict you'd put all(!) memory references in the BLOCK
> > where the restrict pointer is live into the same clique
> > and assign a distinct ptr based on which restrict pointer this
> > is based on.  You make all references not based on any
> > restrict pointer have ptr == 0 (not implemented in the prototype
> > patch - they don't get a clique assigned).
> > 
> > The patch simplifies this by taking function scope as the
> > only BLOCK to consider, thus inside a function the clique
> > will be unique (before inlining).
> > 
> > I can see issues arising with assigning numbers in the frontend
> > based on real BLOCKs and
> >   
> >   {
> >     int * restrict q;
> >     {
> >       int * restrict p;
> >       *p = *q;
> >     }
> >     {
> >       int * restrict r;
> >       *r = *q;
> >     }
> > 
> > that is, non-dependences based on pointers coming from different
> > nested scopes.  The FE in this case would need to "duplicate"
> > the ptr value for 'q' to not make *p and *r falsely a
> > non-dependence.  But I think we're not planning to assign
> > this in the C frontend family (maybe the Fortran FE though).
> > 
> > To preserve correctness cliques from inlined functions need to
> > be remapped to an unused clique number so struct function
> > gets a max_clique number.  (remapping not implemented in
> > the prototype)
> > 
> > Any correctness / missed-optimization holes to punch?  That is,
> > given a set of non-dependent reference pairs, can you assign
> > that clique, ptr values in a correct and optimal way?
> > (it somehow assumes transitivity, if the set is *a, *b; *b, *c;
> > then *a, *c are also non-dependent)
> > 
> > It all depends on a conservative but agressive must-be-based-on
> > analysis of course.  You'd meet that with the conservative
> > may-be-based-on analysis from PTA and compute the proper
> > non-dependences from that.
> > 
> > Comments welcome.
> > 
> > Patch tested on
> > 
> > static inline void foo (int * __restrict__ p, int * __restrict__ q)
> > {
> >   int i;
> >   for (i = 0; i < 1024; ++i)
> >     p[i] = q[i];
> > }
> > 
> > void bar (int *p, int *q)
> > {
> >   foo (p, q);
> > }
> > 
> > where it still vectorizes the loop in bar without alias check.
> 
> Updated patch with a tiny fix to manage fixing PR55334 (IPA-CP
> issue with mgrid) and renaming all-over-the-place.
> 
> Still lacks the clique remapping code in the inliner which is
> required for correctness and still lacks code to deal with
> disambiguating restrct vs. non-restrict accesses.
> 
> Improves mgrid performance by 10%, still fixes the above tiny
> testcase but otherwise untested.

So one non-dependence the scheme cannot encode is if you have
non-dependence pairs (x, y) (x, z) (w, z) but not (x, w) or (y, w).
From the restrict perspective this case happens for

  {
    restrict *q;
    {
      restrict *p;
      ...
    }
  }

and *q and *p disambiguated against other non-restrict references.
That is, what clique / ptr you assign to non-restrict references
in the scope of the individual restricts.  If you see everything
at analysis time we can handle it just fine but if you consider
the inner scope being inlined and thus having pre-existing
clique/ptr values then "other non-restrict references" is
different.  Thus the question is what to do when assigning
clique, ptr values to references if there are already such
values assigned.  I tend to think preserving the original ones
is better over dumping them into the "other" category of
an outer scope restrict clique.  It would end up being
a bad choice for very heavy C++ abstraction though where
inner non-loop restrict stuff gets inlined into loops.

A more heavy-weight approach is to go to a full non-dependence
list, assigning an UID to each memory reference and record
non-dependence pairs on-the-side.

Richard.



> Richard.
> 
> 2014-09-05  Richard Biener  <rguenther@suse.de>
> 
> 	PR tree-optimization/55334
> 
> Index: trunk/gcc/function.h
> ===================================================================
> *** trunk.orig/gcc/function.h	2014-09-05 14:05:43.799777783 +0200
> --- trunk/gcc/function.h	2014-09-05 14:10:30.961758013 +0200
> *************** struct GTY(()) function {
> *** 593,598 ****
> --- 593,601 ----
>        a string describing the reason for failure.  */
>     const char * GTY((skip)) cannot_be_copied_reason;
>   
> +   /* Last assigned dependence info clique.  */
> +   unsigned short last_clique;
> + 
>     /* Collected bit flags.  */
>   
>     /* Number of units of general registers that need saving in stdarg
> Index: trunk/gcc/tree-inline.c
> ===================================================================
> *** trunk.orig/gcc/tree-inline.c	2014-09-05 14:05:43.799777783 +0200
> --- trunk/gcc/tree-inline.c	2014-09-05 14:10:10.303759435 +0200
> *************** remap_gimple_op_r (tree *tp, int *walk_s
> *** 929,934 ****
> --- 929,938 ----
>   	  TREE_THIS_VOLATILE (*tp) = TREE_THIS_VOLATILE (old);
>   	  TREE_SIDE_EFFECTS (*tp) = TREE_SIDE_EFFECTS (old);
>   	  TREE_NO_WARNING (*tp) = TREE_NO_WARNING (old);
> + 	  /* ???  Properly remap the clique to a new one.  */
> + 	  if (MR_DEPENDENCE_CLIQUE (old) != 0)
> + 	    MR_DEPENDENCE_CLIQUE (*tp) = MR_DEPENDENCE_CLIQUE (old);
> + 	  MR_DEPENDENCE_BASE (*tp) = MR_DEPENDENCE_BASE (old);
>   	  /* We cannot propagate the TREE_THIS_NOTRAP flag if we have
>   	     remapped a parameter as the property might be valid only
>   	     for the parameter itself.  */
> *************** copy_tree_body_r (tree *tp, int *walk_su
> *** 1180,1185 ****
> --- 1184,1193 ----
>   	  TREE_THIS_VOLATILE (*tp) = TREE_THIS_VOLATILE (old);
>   	  TREE_SIDE_EFFECTS (*tp) = TREE_SIDE_EFFECTS (old);
>   	  TREE_NO_WARNING (*tp) = TREE_NO_WARNING (old);
> + 	  /* ???  Properly remap the clique to a new one.  */
> + 	  if (MR_DEPENDENCE_CLIQUE (old) != 0)
> + 	    MR_DEPENDENCE_CLIQUE (*tp) = MR_DEPENDENCE_CLIQUE (old);
> + 	  MR_DEPENDENCE_BASE (*tp) = MR_DEPENDENCE_BASE (old);
>   	  /* We cannot propagate the TREE_THIS_NOTRAP flag if we have
>   	     remapped a parameter as the property might be valid only
>   	     for the parameter itself.  */
> Index: trunk/gcc/tree-pretty-print.c
> ===================================================================
> *** trunk.orig/gcc/tree-pretty-print.c	2014-09-05 14:05:43.799777783 +0200
> --- trunk/gcc/tree-pretty-print.c	2014-09-05 14:13:42.869744800 +0200
> *************** dump_generic_node (pretty_printer *buffe
> *** 1081,1087 ****
>   	    /* Same value types ignoring qualifiers.  */
>   	    && (TYPE_MAIN_VARIANT (TREE_TYPE (node))
>   		== TYPE_MAIN_VARIANT
> ! 		    (TREE_TYPE (TREE_TYPE (TREE_OPERAND (node, 1))))))
>   	  {
>   	    if (TREE_CODE (TREE_OPERAND (node, 0)) != ADDR_EXPR)
>   	      {
> --- 1081,1088 ----
>   	    /* Same value types ignoring qualifiers.  */
>   	    && (TYPE_MAIN_VARIANT (TREE_TYPE (node))
>   		== TYPE_MAIN_VARIANT
> ! 		    (TREE_TYPE (TREE_TYPE (TREE_OPERAND (node, 1)))))
> ! 	    && MR_DEPENDENCE_CLIQUE (node) == 0)
>   	  {
>   	    if (TREE_CODE (TREE_OPERAND (node, 0)) != ADDR_EXPR)
>   	      {
> *************** dump_generic_node (pretty_printer *buffe
> *** 1112,1117 ****
> --- 1113,1125 ----
>   		dump_generic_node (buffer, TREE_OPERAND (node, 1),
>   				   spc, flags, false);
>   	      }
> + 	    if (MR_DEPENDENCE_CLIQUE (node) != 0)
> + 	      {
> + 		pp_string (buffer, " clique ");
> + 		pp_unsigned_wide_integer (buffer, MR_DEPENDENCE_CLIQUE (node));
> + 		pp_string (buffer, " base ");
> + 		pp_unsigned_wide_integer (buffer, MR_DEPENDENCE_BASE (node));
> + 	      }
>   	    pp_right_bracket (buffer);
>   	  }
>   	break;
> *************** dump_generic_node (pretty_printer *buffe
> *** 1450,1456 ****
>   		  /* Same value types ignoring qualifiers.  */
>   		  && (TYPE_MAIN_VARIANT (TREE_TYPE (op0))
>   		      == TYPE_MAIN_VARIANT
> ! 		          (TREE_TYPE (TREE_TYPE (TREE_OPERAND (op0, 1))))))))
>   	{
>   	  op0 = TREE_OPERAND (op0, 0);
>   	  str = "->";
> --- 1458,1465 ----
>   		  /* Same value types ignoring qualifiers.  */
>   		  && (TYPE_MAIN_VARIANT (TREE_TYPE (op0))
>   		      == TYPE_MAIN_VARIANT
> ! 		          (TREE_TYPE (TREE_TYPE (TREE_OPERAND (op0, 1)))))
> ! 		  && MR_DEPENDENCE_CLIQUE (op0) == 0)))
>   	{
>   	  op0 = TREE_OPERAND (op0, 0);
>   	  str = "->";
> Index: trunk/gcc/tree-ssa-alias.c
> ===================================================================
> *** trunk.orig/gcc/tree-ssa-alias.c	2014-09-05 14:05:43.799777783 +0200
> --- trunk/gcc/tree-ssa-alias.c	2014-09-05 14:10:10.304759435 +0200
> *************** refs_may_alias_p_1 (ao_ref *ref1, ao_ref
> *** 1371,1376 ****
> --- 1371,1404 ----
>       return decl_refs_may_alias_p (ref1->ref, base1, offset1, max_size1,
>   				  ref2->ref, base2, offset2, max_size2);
>   
> +   /* Handle restrict based accesses.
> +      ???  ao_ref_base strips inner MEM_REF [&decl], recover from that
> +      here.  */
> +   tree rbase1 = base1;
> +   tree rbase2 = base2;
> +   if (var1_p)
> +     {
> +       rbase1 = ref1->ref;
> +       if (rbase1)
> + 	while (handled_component_p (rbase1))
> + 	  rbase1 = TREE_OPERAND (rbase1, 0);
> +     }
> +   if (var2_p)
> +     {
> +       rbase2 = ref2->ref;
> +       if (rbase2)
> + 	while (handled_component_p (rbase2))
> + 	  rbase2 = TREE_OPERAND (rbase2, 0);
> +     }
> +   if (rbase1 && rbase2
> +       && (TREE_CODE (base1) == MEM_REF || TREE_CODE (base1) == TARGET_MEM_REF)
> +       && (TREE_CODE (base2) == MEM_REF || TREE_CODE (base2) == TARGET_MEM_REF)
> +       /* If the accesses are in the same restrict clique... */
> +       && MR_DEPENDENCE_CLIQUE (base1) == MR_DEPENDENCE_CLIQUE (base2)
> +       /* But based on different pointers they do not alias.  */
> +       && MR_DEPENDENCE_BASE (base1) != MR_DEPENDENCE_BASE (base2))
> +     return false;
> + 
>     ind1_p = (TREE_CODE (base1) == MEM_REF
>   	    || TREE_CODE (base1) == TARGET_MEM_REF);
>     ind2_p = (TREE_CODE (base2) == MEM_REF
> Index: trunk/gcc/tree-ssa-structalias.c
> ===================================================================
> *** trunk.orig/gcc/tree-ssa-structalias.c	2014-09-05 14:05:43.799777783 +0200
> --- trunk/gcc/tree-ssa-structalias.c	2014-09-05 14:25:22.078696660 +0200
> ***************
> *** 52,57 ****
> --- 52,60 ----
>   #include "splay-tree.h"
>   #include "params.h"
>   #include "alias.h"
> + #include "tree-phinodes.h"
> + #include "ssa-iterators.h"
> + #include "tree-pretty-print.h"
>   
>   /* The idea behind this analyzer is to generate set constraints from the
>      program, then solve the resulting constraints in order to generate the
> *************** struct variable_info
> *** 272,283 ****
> --- 275,293 ----
>     /* True if this field has only restrict qualified pointers.  */
>     unsigned int only_restrict_pointers : 1;
>   
> +   /* True if this represents a heap var created for a restrict qualified
> +      pointer.  */
> +   unsigned int is_restrict_var : 1;
> + 
>     /* True if this represents a global variable.  */
>     unsigned int is_global_var : 1;
>   
>     /* True if this represents a IPA function info.  */
>     unsigned int is_fn_info : 1;
>   
> +   /* ???  Store somewhere better.  */
> +   unsigned short ruid;
> + 
>     /* The ID of the variable for the next field in this structure
>        or zero for the last field in this structure.  */
>     unsigned next;
> *************** new_var_info (tree t, const char *name)
> *** 369,374 ****
> --- 379,385 ----
>     ret->is_heap_var = false;
>     ret->may_have_pointers = true;
>     ret->only_restrict_pointers = false;
> +   ret->is_restrict_var = false;
>     ret->is_global_var = (t == NULL_TREE);
>     ret->is_fn_info = false;
>     if (t && DECL_P (t))
> *************** static varinfo_t
> *** 3773,3778 ****
> --- 3784,3790 ----
>   make_constraint_from_restrict (varinfo_t lhs, const char *name)
>   {
>     varinfo_t vi = make_heapvar (name);
> +   vi->is_restrict_var = 1;
>     vi->is_global_var = 1;
>     vi->may_have_pointers = 1;
>     make_constraint_from (lhs, vi->id);
> *************** intra_create_variable_infos (struct func
> *** 5845,5850 ****
> --- 5857,5863 ----
>   	  tree heapvar = build_fake_var_decl (TREE_TYPE (TREE_TYPE (t)));
>   	  DECL_EXTERNAL (heapvar) = 1;
>   	  vi = create_variable_info_for_1 (heapvar, "PARM_NOALIAS");
> + 	  vi->is_restrict_var = 1;
>   	  insert_vi_for_tree (heapvar, vi);
>   	  lhsc.var = p->id;
>   	  lhsc.type = SCALAR;
> *************** compute_may_aliases (void)
> *** 6948,6953 ****
> --- 6961,7061 ----
>     if (dump_file)
>       dump_alias_info (dump_file);
>   
> +     {
> +       unsigned short clique = 0;
> +       unsigned short last_ruid = 0;
> +       for (unsigned i = 0; i < num_ssa_names; ++i)
> + 	{
> + 	  tree ptr = ssa_name (i);
> + 	  if (!ptr || !POINTER_TYPE_P (TREE_TYPE (ptr)))
> + 	    continue;
> + 
> + 	  /* Avoid all this when ptr is not dereferenced?  */
> + 	  tree p = ptr;
> + 	  if (SSA_NAME_IS_DEFAULT_DEF (ptr)
> + 	      && (TREE_CODE (SSA_NAME_VAR (ptr)) == PARM_DECL
> + 		  || TREE_CODE (SSA_NAME_VAR (ptr)) == RESULT_DECL))
> + 	    p = SSA_NAME_VAR (ptr);
> + 
> + 	  varinfo_t vi = get_varinfo (find (lookup_vi_for_tree (p)->id));
> + 	  bitmap_iterator bi;
> + 	  unsigned j;
> + 	  varinfo_t restrict_var = NULL;
> + 	  EXECUTE_IF_SET_IN_BITMAP (vi->solution, 0, j, bi)
> + 	    {
> + 	      varinfo_t oi = get_varinfo (j);
> + 	      if (oi->is_restrict_var)
> + 		{
> + 		  if (restrict_var)
> + 		    {
> + 		      if (dump_file && (dump_flags & TDF_DETAILS))
> + 			{
> + 			  fprintf (dump_file, "found restrict pointed-to "
> + 				   "for ");
> + 			  print_generic_expr (dump_file, ptr, 0);
> + 			  fprintf (dump_file, " but not exclusively\n");
> + 			}
> + 		      restrict_var = NULL;
> + 		      break;
> + 		    }
> + 		  restrict_var = oi;
> + 		}
> + 	      /* NULL is the only other valid points-to entry.  */
> + 	      else if (oi->id != nothing_id)
> + 		{
> + 		  restrict_var = NULL;
> + 		  break;
> + 		}
> + 	    }
> + 	  /* Ok, found that ptr must(!) point to a single(!) restrict
> + 	     variable.  */
> + 	  /* ???  PTA isn't really a proper propagation engine to compute
> + 	     this property.
> + 	     ???  We can handle merging of two restricts
> + 	     by unifying them.  */
> + 	  if (restrict_var)
> + 	    {
> + 	      if (clique == 0)
> + 		clique = ++cfun->last_clique;
> + 	      if (restrict_var->ruid == 0)
> + 		restrict_var->ruid = ++last_ruid;
> + 
> + 	      /* Now look at possible dereferences of ptr.  */
> + 	      imm_use_iterator ui;
> + 	      gimple use_stmt;
> + 	      FOR_EACH_IMM_USE_STMT (use_stmt, ui, ptr)
> + 		{
> + 		  /* ???  Calls and asms.  */
> + 		  if (!gimple_assign_single_p (use_stmt))
> + 		    continue;
> + 		  tree *ref = gimple_assign_lhs_ptr (use_stmt);
> + 		  while (handled_component_p (*ref))
> + 		    ref = &TREE_OPERAND (*ref, 0);
> + 		  if (TREE_CODE (*ref) == MEM_REF
> + 		      && TREE_OPERAND (*ref, 0) == ptr)
> + 		    {
> + 		      MR_DEPENDENCE_CLIQUE (*ref) = clique;
> + 		      MR_DEPENDENCE_BASE (*ref) = restrict_var->ruid;
> + 		    }
> + 		  ref = gimple_assign_rhs1_ptr (use_stmt);
> + 		  while (handled_component_p (*ref))
> + 		    ref = &TREE_OPERAND (*ref, 0);
> + 		  if (TREE_CODE (*ref) == MEM_REF
> + 		      && TREE_OPERAND (*ref, 0) == ptr)
> + 		    {
> + 		      MR_DEPENDENCE_CLIQUE (*ref) = clique;
> + 		      MR_DEPENDENCE_BASE (*ref) = restrict_var->ruid;
> + 		    }
> + 		}
> + 
> + 	      /* ???  Accesses not based on a restrict pointer should
> + 	         all get into the clique but share a single ptr ID.
> + 		 That way they get disabiguated against restrict
> + 		 accesses but not against each other.  */
> + 	    }
> + 	}
> +     }
> + 
>     /* Deallocate memory used by aliasing data structures and the internal
>        points-to solution.  */
>     delete_points_to_sets ();
> Index: trunk/gcc/tree-data-ref.c
> ===================================================================
> *** trunk.orig/gcc/tree-data-ref.c	2014-09-05 14:05:43.799777783 +0200
> --- trunk/gcc/tree-data-ref.c	2014-09-05 14:10:10.306759435 +0200
> *************** dr_analyze_indices (struct data_referenc
> *** 980,988 ****
> --- 980,991 ----
>   	     guaranteed.
>   	     As a band-aid, mark the access so we can special-case
>   	     it in dr_may_alias_p.  */
> + 	  tree old = ref;
>   	  ref = fold_build2_loc (EXPR_LOCATION (ref),
>   				 MEM_REF, TREE_TYPE (ref),
>   				 base, memoff);
> + 	  MR_DEPENDENCE_CLIQUE (ref) = MR_DEPENDENCE_CLIQUE (old);
> + 	  MR_DEPENDENCE_BASE (ref) = MR_DEPENDENCE_BASE (old);
>   	  access_fns.safe_push (access_fn);
>   	}
>       }
> *************** dr_may_alias_p (const struct data_refere
> *** 1389,1394 ****
> --- 1392,1403 ----
>   	return false;
>       }
>   
> +   if ((TREE_CODE (addr_a) == MEM_REF || TREE_CODE (addr_a) == TARGET_MEM_REF)
> +       && (TREE_CODE (addr_b) == MEM_REF || TREE_CODE (addr_b) == TARGET_MEM_REF)
> +       && MR_DEPENDENCE_CLIQUE (addr_a) == MR_DEPENDENCE_CLIQUE (addr_b)
> +       && MR_DEPENDENCE_BASE (addr_a) != MR_DEPENDENCE_BASE (addr_b))
> +     return false;
> + 
>     /* If we had an evolution in a pointer-based MEM_REF BASE_OBJECT we
>        do not know the size of the base-object.  So we cannot do any
>        offset/overlap based analysis but have to rely on points-to
> Index: trunk/gcc/tree-core.h
> ===================================================================
> *** trunk.orig/gcc/tree-core.h	2014-09-05 14:05:43.799777783 +0200
> --- trunk/gcc/tree-core.h	2014-09-05 14:10:10.306759435 +0200
> *************** struct GTY(()) tree_base {
> *** 802,807 ****
> --- 802,817 ----
>   
>       /* Internal function code.  */
>       enum internal_fn ifn;
> + 
> +     /* The following two fields are used for MEM_REF and TARGET_MEM_REF
> +        expression trees and specify known data non-dependences.  For
> +        two memory references in a function they are known to not
> +        alias if dependence_info.clique are equal and dependence_info.base
> +        are distinct.  */
> +     struct {
> +       unsigned short clique;
> +       unsigned short base;
> +     } dependence_info;
>     } GTY((skip(""))) u;
>   };
>   
> Index: trunk/gcc/tree.h
> ===================================================================
> *** trunk.orig/gcc/tree.h	2014-09-05 14:05:43.799777783 +0200
> --- trunk/gcc/tree.h	2014-09-05 14:10:10.307759435 +0200
> *************** extern void protected_set_expr_location
> *** 1081,1086 ****
> --- 1081,1091 ----
>   #define TMR_STEP(NODE) (TREE_OPERAND (TARGET_MEM_REF_CHECK (NODE), 3))
>   #define TMR_INDEX2(NODE) (TREE_OPERAND (TARGET_MEM_REF_CHECK (NODE), 4))
>   
> + #define MR_DEPENDENCE_CLIQUE(NODE) \
> +   (TREE_CHECK2 (NODE, MEM_REF, TARGET_MEM_REF)->base.u.dependence_info.clique)
> + #define MR_DEPENDENCE_BASE(NODE) \
> +   (TREE_CHECK2 (NODE, MEM_REF, TARGET_MEM_REF)->base.u.dependence_info.base)
> + 
>   /* The operands of a BIND_EXPR.  */
>   #define BIND_EXPR_VARS(NODE) (TREE_OPERAND (BIND_EXPR_CHECK (NODE), 0))
>   #define BIND_EXPR_BODY(NODE) (TREE_OPERAND (BIND_EXPR_CHECK (NODE), 1))
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imend"orffer

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH][RFC] Restrict, take 42
  2014-09-08 10:04   ` Richard Biener
@ 2014-09-08 14:46     ` Richard Biener
  0 siblings, 0 replies; 4+ messages in thread
From: Richard Biener @ 2014-09-08 14:46 UTC (permalink / raw)
  To: gcc-patches; +Cc: Michael Matz

On Mon, 8 Sep 2014, Richard Biener wrote:

> On Fri, 5 Sep 2014, Richard Biener wrote:
> 
> > On Wed, 3 Sep 2014, Richard Biener wrote:
> > 
> > > 
> > > Ok, so with recent activity in that mgrid bug (PR55334) I tried
> > > to remember what solution we thought of after determining that
> > > ADD_RESTRICT is a no-go.
> > > 
> > > The following very prototypish patch implements the idea of
> > > computing known non-dependences and maintaining them over
> > > the compilation (mainly inlining / cloning for PR55334).
> > > 
> > > So the patch piggy-backs on PTA (bad - PTA doesn't compute
> > > must-aliases, so it will work only for a very limited set
> > > of testcases now - but at least it might work for
> > > non-register restrict stuff).
> > > 
> > > The representation of "non-dependences" is the most interesting
> > > bit of course.  We partition memory references into different
> > > cliques in which all references are independent.  And we
> > > split that clique again into sets of references based on the
> > > same pointer.
> > > 
> > > That allows us to disambiguate equal clique but distinct pointer
> > > memory references.
> > > 
> > > For restrict you'd put all(!) memory references in the BLOCK
> > > where the restrict pointer is live into the same clique
> > > and assign a distinct ptr based on which restrict pointer this
> > > is based on.  You make all references not based on any
> > > restrict pointer have ptr == 0 (not implemented in the prototype
> > > patch - they don't get a clique assigned).
> > > 
> > > The patch simplifies this by taking function scope as the
> > > only BLOCK to consider, thus inside a function the clique
> > > will be unique (before inlining).
> > > 
> > > I can see issues arising with assigning numbers in the frontend
> > > based on real BLOCKs and
> > >   
> > >   {
> > >     int * restrict q;
> > >     {
> > >       int * restrict p;
> > >       *p = *q;
> > >     }
> > >     {
> > >       int * restrict r;
> > >       *r = *q;
> > >     }
> > > 
> > > that is, non-dependences based on pointers coming from different
> > > nested scopes.  The FE in this case would need to "duplicate"
> > > the ptr value for 'q' to not make *p and *r falsely a
> > > non-dependence.  But I think we're not planning to assign
> > > this in the C frontend family (maybe the Fortran FE though).
> > > 
> > > To preserve correctness cliques from inlined functions need to
> > > be remapped to an unused clique number so struct function
> > > gets a max_clique number.  (remapping not implemented in
> > > the prototype)
> > > 
> > > Any correctness / missed-optimization holes to punch?  That is,
> > > given a set of non-dependent reference pairs, can you assign
> > > that clique, ptr values in a correct and optimal way?
> > > (it somehow assumes transitivity, if the set is *a, *b; *b, *c;
> > > then *a, *c are also non-dependent)
> > > 
> > > It all depends on a conservative but agressive must-be-based-on
> > > analysis of course.  You'd meet that with the conservative
> > > may-be-based-on analysis from PTA and compute the proper
> > > non-dependences from that.
> > > 
> > > Comments welcome.
> > > 
> > > Patch tested on
> > > 
> > > static inline void foo (int * __restrict__ p, int * __restrict__ q)
> > > {
> > >   int i;
> > >   for (i = 0; i < 1024; ++i)
> > >     p[i] = q[i];
> > > }
> > > 
> > > void bar (int *p, int *q)
> > > {
> > >   foo (p, q);
> > > }
> > > 
> > > where it still vectorizes the loop in bar without alias check.
> > 
> > Updated patch with a tiny fix to manage fixing PR55334 (IPA-CP
> > issue with mgrid) and renaming all-over-the-place.
> > 
> > Still lacks the clique remapping code in the inliner which is
> > required for correctness and still lacks code to deal with
> > disambiguating restrct vs. non-restrict accesses.
> > 
> > Improves mgrid performance by 10%, still fixes the above tiny
> > testcase but otherwise untested.
> 
> So one non-dependence the scheme cannot encode is if you have
> non-dependence pairs (x, y) (x, z) (w, z) but not (x, w) or (y, w).
> From the restrict perspective this case happens for
> 
>   {
>     restrict *q;
>     {
>       restrict *p;
>       ...
>     }
>   }
> 
> and *q and *p disambiguated against other non-restrict references.
> That is, what clique / ptr you assign to non-restrict references
> in the scope of the individual restricts.  If you see everything
> at analysis time we can handle it just fine but if you consider
> the inner scope being inlined and thus having pre-existing
> clique/ptr values then "other non-restrict references" is
> different.  Thus the question is what to do when assigning
> clique, ptr values to references if there are already such
> values assigned.  I tend to think preserving the original ones
> is better over dumping them into the "other" category of
> an outer scope restrict clique.  It would end up being
> a bad choice for very heavy C++ abstraction though where
> inner non-loop restrict stuff gets inlined into loops.

The following updates the patch to implement clique remapping at
inlining and disambiguation against other global references
(not pointer references as not-based-on check is not yet implemented).

A remaining issue with computing restrict non-dependence pairs
on the GIMPLE level is that proper scoping cannot be guaranteed
for bases that are not function parameters.  Both propagation
and common subexpression elimination can enlarge the scope
where restrict is applied to (and for "other non-restrict-based
references" anything convering not the whole function is
impossible to implement).  Thus the following disables using
loads from globals as restrict source (if not indirectly loading
from function parameters).

The patch implements clique "merging" as suggested - leave existing
clique information in place.

Still manages to fix mgrid.

Bootstrap / regtest running on x86_64-unknown-linux-gnu.

Richard.

2014-09-05  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/55334

Index: trunk/gcc/function.h
===================================================================
*** trunk.orig/gcc/function.h	2014-09-08 16:23:03.948364852 +0200
--- trunk/gcc/function.h	2014-09-08 16:36:11.789310610 +0200
*************** struct GTY(()) function {
*** 593,598 ****
--- 593,601 ----
       a string describing the reason for failure.  */
    const char * GTY((skip)) cannot_be_copied_reason;
  
+   /* Last assigned dependence info clique.  */
+   unsigned short last_clique;
+ 
    /* Collected bit flags.  */
  
    /* Number of units of general registers that need saving in stdarg
Index: trunk/gcc/tree-inline.c
===================================================================
*** trunk.orig/gcc/tree-inline.c	2014-09-08 16:23:03.948364852 +0200
--- trunk/gcc/tree-inline.c	2014-09-08 16:39:19.895297659 +0200
*************** is_parm (tree decl)
*** 831,836 ****
--- 831,854 ----
    return (TREE_CODE (decl) == PARM_DECL);
  }
  
+ /* Remap the dependence CLIQUE from the source to the destination function
+    as specified in ID.  */
+ 
+ static unsigned short
+ remap_dependence_clique (copy_body_data *id, unsigned short clique)
+ {
+   if (clique == 0)
+     return 0;
+   if (!id->dependence_map)
+     id->dependence_map
+       = new hash_map<unsigned short, unsigned short, dependence_hasher>;
+   bool existed;
+   unsigned short &newc = id->dependence_map->get_or_insert (clique, &existed);
+   if (!existed)
+     newc = ++cfun->last_clique;
+   return newc;
+ }
+ 
  /* Remap the GIMPLE operand pointed to by *TP.  DATA is really a
     'struct walk_stmt_info *'.  DATA->INFO is a 'copy_body_data *'.
     WALK_SUBTREES is used to indicate walk_gimple_op whether to keep
*************** remap_gimple_op_r (tree *tp, int *walk_s
*** 929,934 ****
--- 947,958 ----
  	  TREE_THIS_VOLATILE (*tp) = TREE_THIS_VOLATILE (old);
  	  TREE_SIDE_EFFECTS (*tp) = TREE_SIDE_EFFECTS (old);
  	  TREE_NO_WARNING (*tp) = TREE_NO_WARNING (old);
+ 	  if (MR_DEPENDENCE_CLIQUE (old) != 0)
+ 	    {
+ 	      MR_DEPENDENCE_CLIQUE (*tp)
+ 	        = remap_dependence_clique (id, MR_DEPENDENCE_CLIQUE (old));
+ 	      MR_DEPENDENCE_BASE (*tp) = MR_DEPENDENCE_BASE (old);
+ 	    }
  	  /* We cannot propagate the TREE_THIS_NOTRAP flag if we have
  	     remapped a parameter as the property might be valid only
  	     for the parameter itself.  */
*************** copy_tree_body_r (tree *tp, int *walk_su
*** 1180,1185 ****
--- 1204,1215 ----
  	  TREE_THIS_VOLATILE (*tp) = TREE_THIS_VOLATILE (old);
  	  TREE_SIDE_EFFECTS (*tp) = TREE_SIDE_EFFECTS (old);
  	  TREE_NO_WARNING (*tp) = TREE_NO_WARNING (old);
+ 	  if (MR_DEPENDENCE_CLIQUE (old) != 0)
+ 	    {
+ 	      MR_DEPENDENCE_CLIQUE (*tp)
+ 		= remap_dependence_clique (id, MR_DEPENDENCE_CLIQUE (old));
+ 	      MR_DEPENDENCE_BASE (*tp) = MR_DEPENDENCE_BASE (old);
+ 	    }
  	  /* We cannot propagate the TREE_THIS_NOTRAP flag if we have
  	     remapped a parameter as the property might be valid only
  	     for the parameter itself.  */
*************** copy_cfg_body (copy_body_data * id, gcov
*** 2632,2637 ****
--- 2662,2672 ----
        delete id->eh_map;
        id->eh_map = NULL;
      }
+   if (id->dependence_map)
+     {
+       delete id->dependence_map;
+       id->dependence_map = NULL;
+     }
  
    return new_fndecl;
  }
*************** copy_gimple_seq_and_replace_locals (gimp
*** 4947,4952 ****
--- 4982,4992 ----
    delete id.decl_map;
    if (id.debug_map)
      delete id.debug_map;
+   if (id.dependence_map)
+     {
+       delete id.dependence_map;
+       id.dependence_map = NULL;
+     }
  
    return copy;
  }
Index: trunk/gcc/tree-pretty-print.c
===================================================================
*** trunk.orig/gcc/tree-pretty-print.c	2014-09-08 16:23:03.948364852 +0200
--- trunk/gcc/tree-pretty-print.c	2014-09-08 16:36:11.792310610 +0200
*************** dump_generic_node (pretty_printer *buffe
*** 1081,1087 ****
  	    /* Same value types ignoring qualifiers.  */
  	    && (TYPE_MAIN_VARIANT (TREE_TYPE (node))
  		== TYPE_MAIN_VARIANT
! 		    (TREE_TYPE (TREE_TYPE (TREE_OPERAND (node, 1))))))
  	  {
  	    if (TREE_CODE (TREE_OPERAND (node, 0)) != ADDR_EXPR)
  	      {
--- 1081,1088 ----
  	    /* Same value types ignoring qualifiers.  */
  	    && (TYPE_MAIN_VARIANT (TREE_TYPE (node))
  		== TYPE_MAIN_VARIANT
! 		    (TREE_TYPE (TREE_TYPE (TREE_OPERAND (node, 1)))))
! 	    && MR_DEPENDENCE_CLIQUE (node) == 0)
  	  {
  	    if (TREE_CODE (TREE_OPERAND (node, 0)) != ADDR_EXPR)
  	      {
*************** dump_generic_node (pretty_printer *buffe
*** 1112,1117 ****
--- 1113,1125 ----
  		dump_generic_node (buffer, TREE_OPERAND (node, 1),
  				   spc, flags, false);
  	      }
+ 	    if (MR_DEPENDENCE_CLIQUE (node) != 0)
+ 	      {
+ 		pp_string (buffer, " clique ");
+ 		pp_unsigned_wide_integer (buffer, MR_DEPENDENCE_CLIQUE (node));
+ 		pp_string (buffer, " base ");
+ 		pp_unsigned_wide_integer (buffer, MR_DEPENDENCE_BASE (node));
+ 	      }
  	    pp_right_bracket (buffer);
  	  }
  	break;
*************** dump_generic_node (pretty_printer *buffe
*** 1450,1456 ****
  		  /* Same value types ignoring qualifiers.  */
  		  && (TYPE_MAIN_VARIANT (TREE_TYPE (op0))
  		      == TYPE_MAIN_VARIANT
! 		          (TREE_TYPE (TREE_TYPE (TREE_OPERAND (op0, 1))))))))
  	{
  	  op0 = TREE_OPERAND (op0, 0);
  	  str = "->";
--- 1458,1465 ----
  		  /* Same value types ignoring qualifiers.  */
  		  && (TYPE_MAIN_VARIANT (TREE_TYPE (op0))
  		      == TYPE_MAIN_VARIANT
! 		          (TREE_TYPE (TREE_TYPE (TREE_OPERAND (op0, 1)))))
! 		  && MR_DEPENDENCE_CLIQUE (op0) == 0)))
  	{
  	  op0 = TREE_OPERAND (op0, 0);
  	  str = "->";
Index: trunk/gcc/tree-ssa-alias.c
===================================================================
*** trunk.orig/gcc/tree-ssa-alias.c	2014-09-08 16:23:03.948364852 +0200
--- trunk/gcc/tree-ssa-alias.c	2014-09-08 16:36:11.793310610 +0200
*************** refs_may_alias_p_1 (ao_ref *ref1, ao_ref
*** 1371,1376 ****
--- 1371,1404 ----
      return decl_refs_may_alias_p (ref1->ref, base1, offset1, max_size1,
  				  ref2->ref, base2, offset2, max_size2);
  
+   /* Handle restrict based accesses.
+      ???  ao_ref_base strips inner MEM_REF [&decl], recover from that
+      here.  */
+   tree rbase1 = base1;
+   tree rbase2 = base2;
+   if (var1_p)
+     {
+       rbase1 = ref1->ref;
+       if (rbase1)
+ 	while (handled_component_p (rbase1))
+ 	  rbase1 = TREE_OPERAND (rbase1, 0);
+     }
+   if (var2_p)
+     {
+       rbase2 = ref2->ref;
+       if (rbase2)
+ 	while (handled_component_p (rbase2))
+ 	  rbase2 = TREE_OPERAND (rbase2, 0);
+     }
+   if (rbase1 && rbase2
+       && (TREE_CODE (base1) == MEM_REF || TREE_CODE (base1) == TARGET_MEM_REF)
+       && (TREE_CODE (base2) == MEM_REF || TREE_CODE (base2) == TARGET_MEM_REF)
+       /* If the accesses are in the same restrict clique... */
+       && MR_DEPENDENCE_CLIQUE (base1) == MR_DEPENDENCE_CLIQUE (base2)
+       /* But based on different pointers they do not alias.  */
+       && MR_DEPENDENCE_BASE (base1) != MR_DEPENDENCE_BASE (base2))
+     return false;
+ 
    ind1_p = (TREE_CODE (base1) == MEM_REF
  	    || TREE_CODE (base1) == TARGET_MEM_REF);
    ind2_p = (TREE_CODE (base2) == MEM_REF
Index: trunk/gcc/tree-ssa-structalias.c
===================================================================
*** trunk.orig/gcc/tree-ssa-structalias.c	2014-09-08 16:23:03.948364852 +0200
--- trunk/gcc/tree-ssa-structalias.c	2014-09-08 16:36:11.796310609 +0200
***************
*** 52,57 ****
--- 52,61 ----
  #include "splay-tree.h"
  #include "params.h"
  #include "alias.h"
+ #include "tree-phinodes.h"
+ #include "ssa-iterators.h"
+ #include "tree-pretty-print.h"
+ #include "gimple-walk.h"
  
  /* The idea behind this analyzer is to generate set constraints from the
     program, then solve the resulting constraints in order to generate the
*************** struct variable_info
*** 272,283 ****
--- 276,294 ----
    /* True if this field has only restrict qualified pointers.  */
    unsigned int only_restrict_pointers : 1;
  
+   /* True if this represents a heap var created for a restrict qualified
+      pointer.  */
+   unsigned int is_restrict_var : 1;
+ 
    /* True if this represents a global variable.  */
    unsigned int is_global_var : 1;
  
    /* True if this represents a IPA function info.  */
    unsigned int is_fn_info : 1;
  
+   /* ???  Store somewhere better.  */
+   unsigned short ruid;
+ 
    /* The ID of the variable for the next field in this structure
       or zero for the last field in this structure.  */
    unsigned next;
*************** new_var_info (tree t, const char *name)
*** 369,374 ****
--- 380,386 ----
    ret->is_heap_var = false;
    ret->may_have_pointers = true;
    ret->only_restrict_pointers = false;
+   ret->is_restrict_var = false;
    ret->is_global_var = (t == NULL_TREE);
    ret->is_fn_info = false;
    if (t && DECL_P (t))
*************** static varinfo_t
*** 3773,3778 ****
--- 3785,3791 ----
  make_constraint_from_restrict (varinfo_t lhs, const char *name)
  {
    varinfo_t vi = make_heapvar (name);
+   vi->is_restrict_var = 1;
    vi->is_global_var = 1;
    vi->may_have_pointers = 1;
    make_constraint_from (lhs, vi->id);
*************** create_variable_info_for (tree decl, con
*** 5735,5741 ****
  	   && TYPE_RESTRICT (TREE_TYPE (decl)))
  	  || vi->only_restrict_pointers)
  	{
! 	  make_constraint_from_global_restrict (vi, "GLOBAL_RESTRICT");
  	  continue;
  	}
  
--- 5748,5758 ----
  	   && TYPE_RESTRICT (TREE_TYPE (decl)))
  	  || vi->only_restrict_pointers)
  	{
! 	  varinfo_t rvi
! 	    = make_constraint_from_global_restrict (vi, "GLOBAL_RESTRICT");
! 	  /* ???  For now exclude reads from globals as restrict sources
! 	     if those are not (indirectly) from incoming parameters.  */
! 	  rvi->is_restrict_var = false;
  	  continue;
  	}
  
*************** intra_create_variable_infos (struct func
*** 5845,5850 ****
--- 5862,5868 ----
  	  tree heapvar = build_fake_var_decl (TREE_TYPE (TREE_TYPE (t)));
  	  DECL_EXTERNAL (heapvar) = 1;
  	  vi = create_variable_info_for_1 (heapvar, "PARM_NOALIAS");
+ 	  vi->is_restrict_var = 1;
  	  insert_vi_for_tree (heapvar, vi);
  	  lhsc.var = p->id;
  	  lhsc.type = SCALAR;
*************** delete_points_to_sets (void)
*** 6917,6922 ****
--- 6935,7112 ----
    obstack_free (&final_solutions_obstack, NULL);
  }
  
+ /* Mark "other" loads and stores as belonging to CLIQUE and with
+    base zero.  */
+ 
+ static bool
+ visit_loadstore (gimple, tree base, tree ref, void *clique_)
+ {
+   unsigned short clique = (uintptr_t)clique_;
+   if (TREE_CODE (base) == MEM_REF
+       || TREE_CODE (base) == TARGET_MEM_REF)
+     {
+       tree ptr = TREE_OPERAND (base, 0);
+       if (TREE_CODE (ptr) == SSA_NAME)
+ 	{
+ 	  /* ???  We need to make sure 'ptr' doesn't include any of
+ 	     the restrict tags in its points-to set.  */
+ 	  return false;
+ 	}
+ 
+       /* For now let decls through.  */
+ 
+       /* Do not overwrite existing cliques (that includes clique, base
+          pairs we just set).  */
+       if (MR_DEPENDENCE_CLIQUE (base) == 0)
+ 	{
+ 	  MR_DEPENDENCE_CLIQUE (base) = clique;
+ 	  MR_DEPENDENCE_BASE (base) = 0;
+ 	}
+     }
+ 
+   /* For plain decl accesses see whether they are accesses to globals
+      and rewrite them to MEM_REFs with { clique, 0 }.  */
+   if (TREE_CODE (base) == VAR_DECL
+       && is_global_var (base)
+       /* ???  We can't rewrite a plain decl with the walk_stmt_load_store
+ 	 ops callback.  */
+       && base != ref)
+     {
+       tree *basep = &ref;
+       while (handled_component_p (*basep))
+ 	basep = &TREE_OPERAND (*basep, 0);
+       gcc_assert (TREE_CODE (*basep) == VAR_DECL);
+       tree zero = build_int_cst (build_fold_addr_expr (*basep), 0);
+       *basep = build2 (MEM_REF, TREE_TYPE (*basep), TREE_TYPE (zero), zero);
+       MR_DEPENDENCE_CLIQUE (*basep) = clique;
+       MR_DEPENDENCE_BASE (*basep) = 0;
+     }
+ 
+   return false;
+ }
+ 
+ /* Compute the set of independend memory references based on restrict
+    tags and their conservative propagation to the points-to sets.  */
+ 
+ static void
+ compute_dependence_clique (void)
+ {
+   unsigned short clique = 0;
+   unsigned short last_ruid = 0;
+   for (unsigned i = 0; i < num_ssa_names; ++i)
+     {
+       tree ptr = ssa_name (i);
+       if (!ptr || !POINTER_TYPE_P (TREE_TYPE (ptr)))
+ 	continue;
+ 
+       /* Avoid all this when ptr is not dereferenced?  */
+       tree p = ptr;
+       if (SSA_NAME_IS_DEFAULT_DEF (ptr)
+ 	  && (TREE_CODE (SSA_NAME_VAR (ptr)) == PARM_DECL
+ 	      || TREE_CODE (SSA_NAME_VAR (ptr)) == RESULT_DECL))
+ 	p = SSA_NAME_VAR (ptr);
+       varinfo_t vi = lookup_vi_for_tree (p);
+       if (!vi)
+ 	continue;
+       vi = get_varinfo (find (vi->id));
+       bitmap_iterator bi;
+       unsigned j;
+       varinfo_t restrict_var = NULL;
+       EXECUTE_IF_SET_IN_BITMAP (vi->solution, 0, j, bi)
+ 	{
+ 	  varinfo_t oi = get_varinfo (j);
+ 	  if (oi->is_restrict_var)
+ 	    {
+ 	      if (restrict_var)
+ 		{
+ 		  if (dump_file && (dump_flags & TDF_DETAILS))
+ 		    {
+ 		      fprintf (dump_file, "found restrict pointed-to "
+ 			       "for ");
+ 		      print_generic_expr (dump_file, ptr, 0);
+ 		      fprintf (dump_file, " but not exclusively\n");
+ 		    }
+ 		  restrict_var = NULL;
+ 		  break;
+ 		}
+ 	      restrict_var = oi;
+ 	    }
+ 	  /* NULL is the only other valid points-to entry.  */
+ 	  else if (oi->id != nothing_id)
+ 	    {
+ 	      restrict_var = NULL;
+ 	      break;
+ 	    }
+ 	}
+       /* Ok, found that ptr must(!) point to a single(!) restrict
+ 	 variable.  */
+       /* ???  PTA isn't really a proper propagation engine to compute
+ 	 this property.
+ 	 ???  We can handle merging of two restricts
+ 	 by unifying them.  */
+       if (restrict_var)
+ 	{
+ 	  if (clique == 0)
+ 	    clique = ++cfun->last_clique;
+ 	  if (restrict_var->ruid == 0)
+ 	    restrict_var->ruid = ++last_ruid;
+ 
+ 	  /* Now look at possible dereferences of ptr.  */
+ 	  imm_use_iterator ui;
+ 	  gimple use_stmt;
+ 	  FOR_EACH_IMM_USE_STMT (use_stmt, ui, ptr)
+ 	    {
+ 	      /* ???  Calls and asms.  */
+ 	      if (!gimple_assign_single_p (use_stmt))
+ 		continue;
+ 	      tree *ref = gimple_assign_lhs_ptr (use_stmt);
+ 	      while (handled_component_p (*ref))
+ 		ref = &TREE_OPERAND (*ref, 0);
+ 	      if (TREE_CODE (*ref) == MEM_REF
+ 		  && TREE_OPERAND (*ref, 0) == ptr)
+ 		{
+ 		  MR_DEPENDENCE_CLIQUE (*ref) = clique;
+ 		  MR_DEPENDENCE_BASE (*ref) = restrict_var->ruid;
+ 		}
+ 	      ref = gimple_assign_rhs1_ptr (use_stmt);
+ 	      while (handled_component_p (*ref))
+ 		ref = &TREE_OPERAND (*ref, 0);
+ 	      if (TREE_CODE (*ref) == MEM_REF
+ 		  && TREE_OPERAND (*ref, 0) == ptr)
+ 		{
+ 		  MR_DEPENDENCE_CLIQUE (*ref) = clique;
+ 		  MR_DEPENDENCE_BASE (*ref) = restrict_var->ruid;
+ 		}
+ 	    }
+ 
+ 	  /* ???  But when updating accesses this way that already
+ 	     have a clique assigned we probably should leave the
+ 	     old information.  */
+ 	  /* ???  As we do PTA multiple times we probably should
+ 	     avoid (or not) our own results.  */
+ 	}
+     }
+ 
+   if (clique == 0)
+     return;
+ 
+   /* Assign the BASE id zero to all accesses not based on a restrict
+      pointer.  That way they get disabiguated against restrict
+      accesses but not against each other.  */
+   /* ???  For restricts derived from globals (thus not incoming
+      parameters) we can't restrict scoping properly thus the following
+      is too aggressive there.  For now we have excluded those globals from
+      getting into the MR_DEPENDENCE machinery.  */
+   basic_block bb;
+   FOR_EACH_BB_FN (bb, cfun)
+     for (gimple_stmt_iterator gsi = gsi_start_bb (bb);
+ 	 !gsi_end_p (gsi); gsi_next (&gsi))
+       {
+ 	gimple stmt = gsi_stmt (gsi);
+ 	walk_stmt_load_store_ops (stmt, (void *)(uintptr_t)clique,
+ 				  visit_loadstore, visit_loadstore);
+       }
+ }
  
  /* Compute points-to information for every SSA_NAME pointer in the
     current function and compute the transitive closure of escaped
*************** compute_may_aliases (void)
*** 6948,6953 ****
--- 7138,7146 ----
    if (dump_file)
      dump_alias_info (dump_file);
  
+   /* Compute restrict-based memory disambiguations.  */
+   compute_dependence_clique ();
+ 
    /* Deallocate memory used by aliasing data structures and the internal
       points-to solution.  */
    delete_points_to_sets ();
Index: trunk/gcc/tree-data-ref.c
===================================================================
*** trunk.orig/gcc/tree-data-ref.c	2014-09-08 16:23:03.948364852 +0200
--- trunk/gcc/tree-data-ref.c	2014-09-08 16:36:11.798310609 +0200
*************** dr_analyze_indices (struct data_referenc
*** 980,988 ****
--- 980,991 ----
  	     guaranteed.
  	     As a band-aid, mark the access so we can special-case
  	     it in dr_may_alias_p.  */
+ 	  tree old = ref;
  	  ref = fold_build2_loc (EXPR_LOCATION (ref),
  				 MEM_REF, TREE_TYPE (ref),
  				 base, memoff);
+ 	  MR_DEPENDENCE_CLIQUE (ref) = MR_DEPENDENCE_CLIQUE (old);
+ 	  MR_DEPENDENCE_BASE (ref) = MR_DEPENDENCE_BASE (old);
  	  access_fns.safe_push (access_fn);
  	}
      }
*************** dr_may_alias_p (const struct data_refere
*** 1389,1394 ****
--- 1392,1403 ----
  	return false;
      }
  
+   if ((TREE_CODE (addr_a) == MEM_REF || TREE_CODE (addr_a) == TARGET_MEM_REF)
+       && (TREE_CODE (addr_b) == MEM_REF || TREE_CODE (addr_b) == TARGET_MEM_REF)
+       && MR_DEPENDENCE_CLIQUE (addr_a) == MR_DEPENDENCE_CLIQUE (addr_b)
+       && MR_DEPENDENCE_BASE (addr_a) != MR_DEPENDENCE_BASE (addr_b))
+     return false;
+ 
    /* If we had an evolution in a pointer-based MEM_REF BASE_OBJECT we
       do not know the size of the base-object.  So we cannot do any
       offset/overlap based analysis but have to rely on points-to
Index: trunk/gcc/tree-core.h
===================================================================
*** trunk.orig/gcc/tree-core.h	2014-09-08 16:23:03.948364852 +0200
--- trunk/gcc/tree-core.h	2014-09-08 16:36:11.798310609 +0200
*************** struct GTY(()) tree_base {
*** 802,807 ****
--- 802,817 ----
  
      /* Internal function code.  */
      enum internal_fn ifn;
+ 
+     /* The following two fields are used for MEM_REF and TARGET_MEM_REF
+        expression trees and specify known data non-dependences.  For
+        two memory references in a function they are known to not
+        alias if dependence_info.clique are equal and dependence_info.base
+        are distinct.  */
+     struct {
+       unsigned short clique;
+       unsigned short base;
+     } dependence_info;
    } GTY((skip(""))) u;
  };
  
Index: trunk/gcc/tree.h
===================================================================
*** trunk.orig/gcc/tree.h	2014-09-08 16:23:03.948364852 +0200
--- trunk/gcc/tree.h	2014-09-08 16:36:11.799310609 +0200
*************** extern void protected_set_expr_location
*** 1081,1086 ****
--- 1081,1091 ----
  #define TMR_STEP(NODE) (TREE_OPERAND (TARGET_MEM_REF_CHECK (NODE), 3))
  #define TMR_INDEX2(NODE) (TREE_OPERAND (TARGET_MEM_REF_CHECK (NODE), 4))
  
+ #define MR_DEPENDENCE_CLIQUE(NODE) \
+   (TREE_CHECK2 (NODE, MEM_REF, TARGET_MEM_REF)->base.u.dependence_info.clique)
+ #define MR_DEPENDENCE_BASE(NODE) \
+   (TREE_CHECK2 (NODE, MEM_REF, TARGET_MEM_REF)->base.u.dependence_info.base)
+ 
  /* The operands of a BIND_EXPR.  */
  #define BIND_EXPR_VARS(NODE) (TREE_OPERAND (BIND_EXPR_CHECK (NODE), 0))
  #define BIND_EXPR_BODY(NODE) (TREE_OPERAND (BIND_EXPR_CHECK (NODE), 1))
Index: trunk/gcc/tree-inline.h
===================================================================
*** trunk.orig/gcc/tree-inline.h	2014-09-08 16:23:03.948364852 +0200
--- trunk/gcc/tree-inline.h	2014-09-08 16:36:11.799310609 +0200
*************** enum copy_body_cge_which
*** 37,42 ****
--- 37,62 ----
    CB_CGE_MOVE_CLONES
  };
  
+ struct dependence_hasher : default_hashmap_traits
+ {
+   template<typename T>
+   static void
+   mark_deleted (T &e)
+     { gcc_unreachable (); }
+ 
+   template<typename T>
+   static void
+   mark_empty (T &e)
+     { e.m_key = 0; }
+ 
+   template<typename T>
+   static bool
+   is_deleted (T &)
+     { return false; }
+ 
+   template<typename T> static bool is_empty (T &e) { return e.m_key == 0; }
+ };
+ 
  /* Data required for function body duplication.  */
  
  struct copy_body_data
*************** struct copy_body_data
*** 138,143 ****
--- 158,167 ----
    /* Cilk keywords currently need to replace some variables that
       ordinary nested functions do not.  */ 
    bool remap_var_for_cilk;
+ 
+   /* A map from the inlined functions dependence info cliques to
+      equivalents in the function into which it is being inlined.  */
+   hash_map<unsigned short, unsigned short, dependence_hasher> *dependence_map;
  };
  
  /* Weights of constructions for estimate_num_insns.  */
Index: trunk/gcc/gimple-fold.c
===================================================================
*** trunk.orig/gcc/gimple-fold.c	2014-09-08 16:23:03.948364852 +0200
--- trunk/gcc/gimple-fold.c	2014-09-08 16:36:11.800310609 +0200
*************** maybe_canonicalize_mem_ref_addr (tree *t
*** 2754,2760 ****
       accessed is a decl that has the same access semantics as the MEM_REF.  */
    if (TREE_CODE (*t) == MEM_REF
        && TREE_CODE (TREE_OPERAND (*t, 0)) == ADDR_EXPR
!       && integer_zerop (TREE_OPERAND (*t, 1)))
      {
        tree decl = TREE_OPERAND (TREE_OPERAND (*t, 0), 0);
        tree alias_type = TREE_TYPE (TREE_OPERAND (*t, 1));
--- 2754,2761 ----
       accessed is a decl that has the same access semantics as the MEM_REF.  */
    if (TREE_CODE (*t) == MEM_REF
        && TREE_CODE (TREE_OPERAND (*t, 0)) == ADDR_EXPR
!       && integer_zerop (TREE_OPERAND (*t, 1))
!       && MR_DEPENDENCE_CLIQUE (*t) == 0)
      {
        tree decl = TREE_OPERAND (TREE_OPERAND (*t, 0), 0);
        tree alias_type = TREE_TYPE (TREE_OPERAND (*t, 1));

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2014-09-08 14:46 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-09-03 13:00 [PATCH][RFC] Restrict, take 42 Richard Biener
2014-09-05 12:44 ` Richard Biener
2014-09-08 10:04   ` Richard Biener
2014-09-08 14:46     ` Richard Biener

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).