public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* Fix PR 33870
@ 2007-11-08  0:24 Diego Novillo
  2007-11-08  0:35 ` Jakub Jelinek
  2007-11-08 10:45 ` Richard Guenther
  0 siblings, 2 replies; 18+ messages in thread
From: Diego Novillo @ 2007-11-08  0:24 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 3397 bytes --]

This patch fixes PR 33870.  There were a couple of things at play here
related to the way we compute sub-variables for structures and arrays.
 Particularly in the presence of nesting.

When alias analysis computes points-to information for a structure for
which we have sub-variables, it never adds all the variables to the
set.  It only adds the first SFT reachable by the pointer.

So, if we have a pointer p_1 pointing into some structure and we
dereference a field with p_1->fld, we can find out what offset to use
by calling get_base_ref_and_extent.  However, if that field is inside
a nested structure, the offset computed by get_ref_base_and_extent is
the offset from the start of the immediately containing structure.
However, to find out what other SFTs are affected by this reference,
we need to know the offsets starting at the root structure in the
nesting hierarchy.

For instance, given the following structure:

       struct X {
         int a;
         struct Y {
           int b;
           struct Z {
             int c[3];
           } d;
         } e;
       } m;

and the following address expression:

       p_1 = &m.e.d;

This structure will receive 5 SFTs, namely 2 for fields 'a' and 'b'
and 3 for the array 'c' in struct Z.  So, the reference p_1->c[2] and
m.e.d.c[2] access the exact same memory location (ie, SFT.5).

Now, alias analysis computed the points-to set for pointer p_1 as  {
SFT.3 } because that is the first field that p_1 actually points to.
When the expression p_1->c[2] is analyzed, get_ref_base_and_extent
will return an offset of 96 because we are accessing the third element
of the array.  But the SFT we are looking for is actually at offset
160, counting from the top of struct X.

Therefore, we need to adjust the offset given by
get_base_ref_and_extent by the offset of VAR so that we can get at all
the fields starting at VAR.

Notice that this adjustment is ONLY necessary if the pointer is
pointing inside a nested structure.  If p_1 is pointing at the base
structure of the nesting hierarchy, then no adjustment is necessary.

Since this adjustment was being applied with every field reference, we
were failing to add SFTs when some of the SFTs in the structure had
been partitioned away into an MPT.

In the specific case of PR 33870, there was a single field 'pDirty'
inside a large structure that was being referenced with a pointer and
with a regular variable.  The field had the subvariable SFT.12
associated with it, with an offset of X bytes.

Since SFT.12 was the only field that had not been partitioned away,
when we were adding SFTs in the operand scanner, we would blindly
adjust the offset of SFT.12 by X bytes and then trying to see if there
was any SFT at offset X + X.

There wasn't any, of course, and so no VOP was added to the memory
load expression, which was then taken out of the loop because it was
thought to be loop independent.  Leading to the runtime failure.

The only reason this didn't occur if partitioning was disabled was
that the same blind offset adjustment was adding SFT.12 because it had
ajdusted the offset of an SFT that was X bytes earlier in the
structure.  So, it was working by chance.

By recognizing that this offset adjustment is only required for nested
structures, we do not need to blindly adjust every offset and so we
don't end up adding SFTs by accident.

Tested on x86_64.  Committed.

[-- Attachment #2: 20071107-33870.diff.txt --]
[-- Type: text/plain, Size: 7676 bytes --]

2007-11-07  Diego Novillo  <dnovillo@google.com>

	PR 33870
	* tree.h (struct tree_struct_field_tag): Add field in_nested_struct.
	(SFT_IN_NESTED_STRUCT): Define.
	* tree-dfa.c (dump_subvars_for): Show offset of each
	sub-var.
	* tree-flow.h (struct fieldoff): Add field in_nested_struct.
	* tree-ssa-structalias.c (struct variable_info): Likewise.
	(push_fields_onto_fieldstack): If OFFSET is positive,
	set in_nested_struct.
	(create_variable_info_for): Copy setting of
	in_nested_struct from the field offset object.
	(set_uids_in_ptset): Set SFT_IN_NESTED_STRUCT from the
	variable info object.
	* tree-ssa-operands.c (add_vars_for_offset): If VAR
	belongs to a nested structure, adjust OFFSET by
	SFT_OFFSET(VAR).

testsuite/ChangeLog

	* gcc.c-torture/execute/pr33870.x: Remove.

Index: tree.h
===================================================================
--- tree.h	(revision 129956)
+++ tree.h	(working copy)
@@ -2573,15 +2573,21 @@ struct tree_struct_field_tag GTY(())
   /* Size of the field.  */
   unsigned HOST_WIDE_INT size;
 
+  /* True if this SFT is for a field in a nested structure.  */
+  unsigned int in_nested_struct : 1;
+
   /* Alias set for a DECL_NONADDRESSABLE_P field.  Otherwise -1.  */
   alias_set_type alias_set;
 };
+
 #define SFT_PARENT_VAR(NODE) (STRUCT_FIELD_TAG_CHECK (NODE)->sft.parent_var)
 #define SFT_OFFSET(NODE) (STRUCT_FIELD_TAG_CHECK (NODE)->sft.offset)
 #define SFT_SIZE(NODE) (STRUCT_FIELD_TAG_CHECK (NODE)->sft.size)
 #define SFT_NONADDRESSABLE_P(NODE) \
   (STRUCT_FIELD_TAG_CHECK (NODE)->sft.alias_set != -1)
 #define SFT_ALIAS_SET(NODE) (STRUCT_FIELD_TAG_CHECK (NODE)->sft.alias_set)
+#define SFT_IN_NESTED_STRUCT(NODE) \
+  (STRUCT_FIELD_TAG_CHECK (NODE)->sft.in_nested_struct)
 
 /* Memory Partition Tags (MPTs) group memory symbols under one
    common name for the purposes of placing memory PHI nodes.  */
Index: testsuite/gcc.c-torture/execute/pr33870.x
===================================================================
--- testsuite/gcc.c-torture/execute/pr33870.x	(revision 129956)
+++ testsuite/gcc.c-torture/execute/pr33870.x	(working copy)
@@ -1,9 +0,0 @@
-# The test breaks because of wrong alias info for -O2 and -Os
-
-set torture_eval_before_compile {
-  if {[string match {*-O[2s]*} "$option"]} {
-     set torture_execute_xfail "*-*-*"
-  }
-}
-
-return 0
Index: tree-dfa.c
===================================================================
--- tree-dfa.c	(revision 129956)
+++ tree-dfa.c	(working copy)
@@ -287,7 +287,7 @@ dump_subvars_for (FILE *file, tree var)
   for (i = 0; VEC_iterate (tree, sv, i, subvar); ++i)
     {
       print_generic_expr (file, subvar, dump_flags);
-      fprintf (file, " ");
+      fprintf (file, "@" HOST_WIDE_INT_PRINT_UNSIGNED " ", SFT_OFFSET (subvar));
     }
 
   fprintf (file, "}");
Index: tree-flow.h
===================================================================
--- tree-flow.h	(revision 129956)
+++ tree-flow.h	(working copy)
@@ -1159,6 +1159,10 @@ struct fieldoff
   /* Field.  */
   tree decl;
 
+  /* True if this field is inside a structure nested inside the base
+     containing object.  */
+  unsigned int in_nested_struct : 1;
+
   /* Offset from the base of the base containing object to this field.  */
   HOST_WIDE_INT offset;  
 
Index: tree-ssa-structalias.c
===================================================================
--- tree-ssa-structalias.c	(revision 129956)
+++ tree-ssa-structalias.c	(working copy)
@@ -253,6 +253,15 @@ struct variable_info
      variable.  This is used for C++ placement new.  */
   unsigned int no_tbaa_pruning : 1;
 
+  /* True if this variable is inside a structure nested in the
+     structure for the base variable.  For instance, in 
+     struct X { int a; struct Y { int b; int c; } }, the variables for
+     fields 'b' and 'c' are inside a nested structure.  We are not
+     interested in tracking how many levels of nesting, just whether
+     there is nesting at all.  This is later used to adjust offsets
+     for pointers pointing into sub-structures.  */
+  unsigned int in_nested_struct : 1;
+
   /* Points-to set for this variable.  */
   bitmap solution;
 
@@ -4133,6 +4142,12 @@ push_fields_onto_fieldstack (tree type, 
 		pair->alias_set = get_alias_set (addressable_type);
 	      else
 		pair->alias_set = -1;
+
+	      /* If the base offset is positive, this field belongs to
+		 a structure nested inside the base structure.  */
+	      if (offset > 0)
+		pair->in_nested_struct = true;
+
 	      count++;
 	    }
 	  else
@@ -4181,6 +4196,12 @@ push_fields_onto_fieldstack (tree type, 
 	      pair->alias_set = get_alias_set (addressable_type);
 	    else
 	      pair->alias_set = -1;
+
+	    /* If the base offset is positive, this field belongs to
+	       a structure nested inside the base structure.  */
+	    if (offset > 0)
+	      pair->in_nested_struct = true;
+
 	    count++;
 	  }
 	else
@@ -4491,6 +4512,7 @@ create_variable_info_for (tree decl, con
 	  newvi->offset = fo->offset;
 	  newvi->size = TREE_INT_CST_LOW (fo->size);
 	  newvi->fullsize = vi->fullsize;
+	  newvi->in_nested_struct = fo->in_nested_struct;
 	  insert_into_field_list (vi, newvi);
 	  VEC_safe_push (varinfo_t, heap, varmap, newvi);
 	  if (is_global && (!flag_whole_program || !in_ipa_mode))
@@ -4743,6 +4765,7 @@ set_uids_in_ptset (tree ptr, bitmap into
 		      || (!is_derefed && !vi->directly_dereferenced)
 		      || alias_sets_conflict_p (ptr_alias_set, var_alias_set))
 		    bitmap_set_bit (into, DECL_UID (sft));
+		  SFT_IN_NESTED_STRUCT (sft) = vi->in_nested_struct;
 		}
 	    }
 	  else
@@ -4946,7 +4969,6 @@ find_what_p_points_to (tree p)
 	    }
 
 	  /* Share the final set of variables when possible.  */
-
 	  finished_solution = BITMAP_GGC_ALLOC ();
 	  stats.points_to_sets_created++;
 
Index: tree-ssa-operands.c
===================================================================
--- tree-ssa-operands.c	(revision 129956)
+++ tree-ssa-operands.c	(working copy)
@@ -1397,8 +1397,48 @@ add_vars_for_offset (tree var, unsigned 
   subvar_t sv;
   unsigned int i;
 
-  /* Adjust offset by the pointed-to location.  */
-  offset += SFT_OFFSET (var);
+  if (SFT_IN_NESTED_STRUCT (var))
+    {
+      /* Since VAR is an SFT inside a nested structure, the OFFSET
+	 computed by get_ref_base_and_extent is the offset from the
+	 start of the immediately containing structure.  However, to
+	 find out what other SFTs are affected by this reference, we
+	 need to know the offsets starting at the root structure in
+	 the nesting hierarchy.
+
+	 For instance, given the following structure:
+
+	 	struct X {
+		  int a;
+		  struct Y {
+		    int b;
+		    struct Z {
+		      int c[3];
+		    } d;
+		  } e;
+		} m;
+
+	 and the following address expression:
+
+		p_1 = &m.e.d;
+
+	 This structure will receive 5 SFTs, namely 2 for fields 'a'
+	 and 'b' and 3 for the array 'c' in struct Z.  So, the
+	 reference p_1->c[2] and m.e.d.c[2] access the exact same
+	 memory location (ie, SFT.5).
+
+	 Now, alias analysis computed the points-to set for pointer
+	 p_1 as  { SFT.3 } because that is the first field that p_1
+	 actually points to.  When the expression p_1->c[2] is
+	 analyzed, get_ref_base_and_extent will return an offset of 96
+	 because we are accessing the third element of the array.  But
+	 the SFT we are looking for is actually at offset 160,
+	 counting from the top of struct X.
+
+	 Therefore, we adjust OFFSET by the offset of VAR so that we
+	 can get at all the fields starting at VAR.  */
+      offset += SFT_OFFSET (var);
+    }
 
   /* Add all subvars of var that overlap with the access.
      Binary search for the first relevant SFT.  */

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Fix PR 33870
  2007-11-08  0:24 Fix PR 33870 Diego Novillo
@ 2007-11-08  0:35 ` Jakub Jelinek
  2007-11-08  2:20   ` Diego Novillo
  2007-11-08 10:45 ` Richard Guenther
  1 sibling, 1 reply; 18+ messages in thread
From: Jakub Jelinek @ 2007-11-08  0:35 UTC (permalink / raw)
  To: Diego Novillo; +Cc: gcc-patches

On Wed, Nov 07, 2007 at 07:24:32PM -0500, Diego Novillo wrote:
> --- tree.h	(revision 129956)
> +++ tree.h	(working copy)
> @@ -2573,15 +2573,21 @@ struct tree_struct_field_tag GTY(())
>    /* Size of the field.  */
>    unsigned HOST_WIDE_INT size;
>  
> +  /* True if this SFT is for a field in a nested structure.  */
> +  unsigned int in_nested_struct : 1;
> +
>    /* Alias set for a DECL_NONADDRESSABLE_P field.  Otherwise -1.  */
>    alias_set_type alias_set;
>  };

Doesn't this grow struct tree_struct_field_tag by in most cases 8 bytes?
Aren't there enough bitfields that could be used for this bit instead?

struct tree_memory_tag GTY(())
{
  struct tree_decl_minimal common;

  bitmap GTY ((skip)) aliases;

  unsigned int is_global:1;
};

31 or even 63 bits in tree_memory_tag (and then tree_base has a whole
bunch of spare bits, so perhaps if is_global moved to one of those too
(of course guarded with TREE_MEMORY_TAG_CHECK resp.
STRUCT_FIELD_TAG_CHECK), we would save even 8 more bytes here.
I know [tuples] does far more here, but even 4.3 will be used in the wild
and every byte in often used structs counts.

	Jakub

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Fix PR 33870
  2007-11-08  0:35 ` Jakub Jelinek
@ 2007-11-08  2:20   ` Diego Novillo
  0 siblings, 0 replies; 18+ messages in thread
From: Diego Novillo @ 2007-11-08  2:20 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc-patches

On Nov 7, 2007 7:35 PM, Jakub Jelinek <jakub@redhat.com> wrote:

> Doesn't this grow struct tree_struct_field_tag by in most cases 8 bytes?

Oops, silly me.  Thanks for noticing.  Fixed with:

        * tree.h (struct tree_struct_field_tag): Move field
        in_nested_struct ...
        (struct tree_memory_tag): ... here.

Index: tree.h
===================================================================
--- tree.h      (revision 129976)
+++ tree.h      (working copy)
@@ -2554,7 +2554,11 @@ struct tree_memory_tag GTY(())

   bitmap GTY ((skip)) aliases;

+  /* True if this tag has global scope.  */
   unsigned int is_global:1;
+
+  /* True if this SFT is for a field in a nested structure.  */
+  unsigned int in_nested_struct : 1;
 };

 #define MTAG_GLOBAL(NODE) (TREE_MEMORY_TAG_CHECK (NODE)->mtag.is_global)
@@ -2573,9 +2577,6 @@ struct tree_struct_field_tag GTY(())
   /* Size of the field.  */
   unsigned HOST_WIDE_INT size;

-  /* True if this SFT is for a field in a nested structure.  */
-  unsigned int in_nested_struct : 1;
-
   /* Alias set for a DECL_NONADDRESSABLE_P field.  Otherwise -1.  */
   alias_set_type alias_set;
 };
@@ -2587,7 +2588,7 @@ struct tree_struct_field_tag GTY(())
   (STRUCT_FIELD_TAG_CHECK (NODE)->sft.alias_set != -1)
 #define SFT_ALIAS_SET(NODE) (STRUCT_FIELD_TAG_CHECK (NODE)->sft.alias_set)
 #define SFT_IN_NESTED_STRUCT(NODE) \
-  (STRUCT_FIELD_TAG_CHECK (NODE)->sft.in_nested_struct)
+  (STRUCT_FIELD_TAG_CHECK (NODE)->sft.common.in_nested_struct)

 /* Memory Partition Tags (MPTs) group memory symbols under one
    common name for the purposes of placing memory PHI nodes.  */

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Fix PR 33870
  2007-11-08  0:24 Fix PR 33870 Diego Novillo
  2007-11-08  0:35 ` Jakub Jelinek
@ 2007-11-08 10:45 ` Richard Guenther
  2007-11-13 16:30   ` Diego Novillo
  1 sibling, 1 reply; 18+ messages in thread
From: Richard Guenther @ 2007-11-08 10:45 UTC (permalink / raw)
  To: Diego Novillo; +Cc: gcc-patches

On 11/8/07, Diego Novillo <dnovillo@google.com> wrote:
> This patch fixes PR 33870.  There were a couple of things at play here
> related to the way we compute sub-variables for structures and arrays.
>  Particularly in the presence of nesting.
>
> When alias analysis computes points-to information for a structure for
> which we have sub-variables, it never adds all the variables to the
> set.  It only adds the first SFT reachable by the pointer.
>
> So, if we have a pointer p_1 pointing into some structure and we
> dereference a field with p_1->fld, we can find out what offset to use
> by calling get_base_ref_and_extent.  However, if that field is inside
> a nested structure, the offset computed by get_ref_base_and_extent is
> the offset from the start of the immediately containing structure.
> However, to find out what other SFTs are affected by this reference,
> we need to know the offsets starting at the root structure in the
> nesting hierarchy.
>
> For instance, given the following structure:
>
>        struct X {
>          int a;
>          struct Y {
>            int b;
>            struct Z {
>              int c[3];
>            } d;
>          } e;
>        } m;
>
> and the following address expression:
>
>        p_1 = &m.e.d;
>
> This structure will receive 5 SFTs, namely 2 for fields 'a' and 'b'
> and 3 for the array 'c' in struct Z.  So, the reference p_1->c[2] and
> m.e.d.c[2] access the exact same memory location (ie, SFT.5).
>
> Now, alias analysis computed the points-to set for pointer p_1 as  {
> SFT.3 } because that is the first field that p_1 actually points to.
> When the expression p_1->c[2] is analyzed, get_ref_base_and_extent
> will return an offset of 96 because we are accessing the third element
> of the array.  But the SFT we are looking for is actually at offset
> 160, counting from the top of struct X.
>
> Therefore, we need to adjust the offset given by
> get_base_ref_and_extent by the offset of VAR so that we can get at all
> the fields starting at VAR.
>
> Notice that this adjustment is ONLY necessary if the pointer is
> pointing inside a nested structure.  If p_1 is pointing at the base
> structure of the nesting hierarchy, then no adjustment is necessary.
>
> Since this adjustment was being applied with every field reference, we
> were failing to add SFTs when some of the SFTs in the structure had
> been partitioned away into an MPT.
>
> In the specific case of PR 33870, there was a single field 'pDirty'
> inside a large structure that was being referenced with a pointer and
> with a regular variable.  The field had the subvariable SFT.12
> associated with it, with an offset of X bytes.
>
> Since SFT.12 was the only field that had not been partitioned away,
> when we were adding SFTs in the operand scanner, we would blindly
> adjust the offset of SFT.12 by X bytes and then trying to see if there
> was any SFT at offset X + X.
>
> There wasn't any, of course, and so no VOP was added to the memory
> load expression, which was then taken out of the loop because it was
> thought to be loop independent.  Leading to the runtime failure.
>
> The only reason this didn't occur if partitioning was disabled was
> that the same blind offset adjustment was adding SFT.12 because it had
> ajdusted the offset of an SFT that was X bytes earlier in the
> structure.  So, it was working by chance.
>
> By recognizing that this offset adjustment is only required for nested
> structures, we do not need to blindly adjust every offset and so we
> don't end up adding SFTs by accident.
>
> Tested on x86_64.  Committed.

You are only working around the problem and do not address the fundamental
issue.  The following still fails:

extern void abort (void);

typedef struct PgHdr PgHdr;
typedef unsigned char u8;
struct PgHdr {
int y;
struct {
  unsigned int pgno;
  PgHdr *pNextHash, *pPrevHash;
  PgHdr *pNextFree, *pPrevFree;
  PgHdr *pNextAll;
  u8 inJournal;
  short int nRef;
  PgHdr *pDirty, *pPrevDirty;
  unsigned int notUsed;
} x;
};
volatile PgHdr **xx;
static inline PgHdr *merge_pagelist(PgHdr *pA, PgHdr *pB)
{
  PgHdr result;
  PgHdr *pTail;
  xx = &result.x.pDirty;
  pTail = &result;
  while( pA && pB ){
    if( pA->x.pgno<pB->x.pgno ){
      pTail->x.pDirty = pA;
      pTail = pA;
      pA = pA->x.pDirty;
    }else{
      pTail->x.pDirty = pB;
      pTail = pB;
      pB = pB->x.pDirty;
    }
  }
  if( pA ){
    pTail->x.pDirty = pA;
  }else if( pB ){
    pTail->x.pDirty = pB;
  }else{
    pTail->x.pDirty = 0;
  }
  return result.x.pDirty;
}

PgHdr * __attribute__((noinline)) sort_pagelist(PgHdr *pIn)
{
  PgHdr *a[25], *p;
  int i;
  __builtin_memset (a, 0, sizeof (a));
  while( pIn ){
    p = pIn;
    pIn = p->x.pDirty;
    p->x.pDirty = 0;
    for(i=0; i<25 -1; i++){
      if( a[i]==0 ){
        a[i] = p;
        break;
      }else{
        p = merge_pagelist(a[i], p);
        a[i] = 0;
        a[i] = 0;
      }
    }
    if( i==25 -1 ){
      a[i] = merge_pagelist(a[i], p);
    }
  }
  p = a[0];
  for(i=1; i<25; i++){
    p = merge_pagelist (p, a[i]);
  }
  return p;
}

int main()
{
  PgHdr a[5];
  PgHdr *p;
  a[0].x.pgno = 5;
  a[0].x.pDirty = &a[1];
  a[1].x.pgno = 4;
  a[1].x.pDirty = &a[2];
  a[2].x.pgno = 1;
  a[2].x.pDirty = &a[3];
  a[3].x.pgno = 3;
  a[3].x.pDirty = 0;
  p = sort_pagelist (&a[0]);
  if (p->x.pDirty == p)
    abort ();
  return 0;
}

taking the address of result.x.pDirty causes the SFT to be marked as
in substructure
and the problem re-surfaces.

Richard.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Fix PR 33870
  2007-11-08 10:45 ` Richard Guenther
@ 2007-11-13 16:30   ` Diego Novillo
  2007-11-13 16:47     ` Richard Guenther
  0 siblings, 1 reply; 18+ messages in thread
From: Diego Novillo @ 2007-11-13 16:30 UTC (permalink / raw)
  To: Richard Guenther; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 4038 bytes --]

Richard Guenther wrote:

> taking the address of result.x.pDirty causes the SFT to be marked as
> in substructure and the problem re-surfaces.

Yes.  Thanks for the test case.  The original patch did not take into 
account that the nesting level of the memory reference is also important.

There are two problems here:

1- Given an arbitrary memory expression M dereferencing a pointer P into 
a nested structure, the offset computed for M is the offset starting at 
P.  However, the offsets computed for the original SFTs are offsets from 
the top of the pointed-to memory object O.  So, when we recognize this 
situation, we must adjust the offsets.  This was the situation 
recognized and fixed by the original patch.

The bug was in the detection of whether P is "pointing to the middle of 
an object".  This is where the nesting level comes in to play.  Suppose 
that M was the expression p_3->b[3] and the original pointed-to object 
was v.x.y:

		struct {
			...
			struct {
				...
				struct {
					...
					int b[4];
				} y;
			} x;
		} v;

		p_3 = &v.x.y;

If alias analysis created SFTs for this structure, the array 'b' will 
have 4 SFTs associated with it.  Say SFT.12, SFT.13, SFT.14 and SFT.15. 
  However, only SFT.12 will be added to the alias set of p_3's name tag.

So, when we see the expression 'p_3->b[3]', we compute an offset of 96, 
but we need SFT.14 which will probably have an offset much higher than 
that.  So, we adjust 96 by the offset of SFT.12 and get to SFT.14.

In the original patch, we recognized this situation by asking whether 
the SFT.12 was inside a nested structure.  However, it may happen that 
the pointer was actually pointing to the base of the original object 
'v'.  In which case, we would have the expression 'p_3->x.y.b[3]', which 
also takes us to SFT.12.

But in this case, the offset we computed from 'p_3->x.y.b[3]' is 
*already* the right offset, so adding the offset of SFT.12 to it gives 
us the wrong offset.  So, what this patch does is determine the nesting 
level of each SFT and the nesting level of the memory expression.  We 
only need to adjust the offset of an SFT if the nesting level of the 
memory expression is lower than the nesting level of the SFT that we are 
analyzing.


2- The second problem we have with this scheme is that we rely on the 
presence of key SFTs to adjust offsets when we detect references to 
nested aggregates.  This is something we cannot easily undo because 
inside the operand scanner we cannot really start doing use-def chain 
chasing to determine the original pointed-to objects.  This work has 
already been done by alias analysis, so duplicating this in the operand 
scanner would be wasteful.

This means that this scheme cannot tolerate these key SFTs to be 
partitioned away.  If they are partitioned, then inside the operand 
scanner is essentially impossible to determine where in the structure is 
an arbitrary memory expression referring to.  This is shown in the new 
test gcc.dg/tree-ssa/alias-16.c.

One way to deal with this would be to use the pointed-to set for the 
base pointer instead of the alias set of the name tag.  This has the 
problem that we'd miss the compile-time benefits of partitioning, as we 
would be traversing the original sets instead of the partitioned 
(smaller) alias sets.  Also, we may want to get rid of the duplicate 
pointed-to/alias sets in the future (not sure if they're both worth 
keeping).

The other way of dealing with this is to recognize which SFTs are 
important to not partition and make sure they are never partitioned. 
During alias analysis, if a nested SFT is added to the alias set of a 
pointer, then it is marked as unpartitionable.  The partitioner, in 
turn, gives these SFTs a very high score and makes sure that they are 
never added to a partition.  The effects of this are/should be 
negligible as it only involves a single SFT and nested structures.

The new test alias-16.c tests for these edge cases.

Tested on x86_64 and ppc64.  Committed to trunk.


Diego.

[-- Attachment #2: 20071113-33870-nesting-level.diff.txt --]
[-- Type: text/plain, Size: 21492 bytes --]

2007-11-12  Diego Novillo  <dnovillo@google.com>

	PR 33870
	* tree.h (strcut tree_memory_tag): Add field unpartitionable.
	Remove field in_nested_struct.
	(struct tree_struct_field_tag): Add field nesting_level.
	(SFT_IN_NESTED_STRUCT): Remove.
	(SFT_NESTING_LEVEL): Define.
	(SFT_UNPARTITIONABLE_P): Define.
	* tree-ssa-alias.c (mem_sym_score): If MP->VAR is not
	partitionable, return LONG_MAX.
	(compute_memory_partitions): Do not partition SFTs marked
	unpartitionable.
	(create_sft): Add argument NESTING_LEVEL.  Set
	SFT_NESTING_LEVEL with it.  Update all users.
	(create_overlap_variables_for): Show nesting level.
	* tree-dfa.c (dump_subvars_for): Likewise.
	(dump_variable): Likewise.
	Show whether the SFT is partitionable or not.
	* tree-flow.h (struct fieldoff): Remove field
	in_nested_struct.
	Add field nesting_level.
	* tree-ssa-structalias.c (struct variable_info): Remove
	field in_nested_struct.
	(push_fields_onto_fieldstack): Add argument
	NESTING_LEVEL.  Update all users.
	Update documentation.
	Update PAIR->NESTING_LEVEL with NESTING_LEVEL.
	Make recursive calls with NESTING_LEVEL + 1.
	(set_uids_in_ptset): If an SFT is added to the points-to
	set, mark it as unpartitionable.
	* tree-ssa-operands.c (ref_nesting_level): New.
	(add_vars_for_offset): Call it.
	Add argument FULL_REF.  Update
	callers.
	If VAR is inside a nested structure and the nesting level
	of FULL_REF is lower than the nesting level of VAR,
	adjust OFFSET by the offset of VAR.

testsuite/ChangeLog
	
	PR 33870
	* gcc.c-torture/execute/pr33870-1.c: New test.
	* gcc.dg/tree-ssa/alias-16.c: New test.

Index: tree.h
===================================================================
--- tree.h	(revision 130139)
+++ tree.h	(working copy)
@@ -2555,10 +2555,10 @@ struct tree_memory_tag GTY(())
   bitmap GTY ((skip)) aliases;
 
   /* True if this tag has global scope.  */
-  unsigned int is_global:1;
+  unsigned int is_global : 1;
 
-  /* True if this SFT is for a field in a nested structure.  */
-  unsigned int in_nested_struct : 1;
+  /* True if this tag should not be grouped into a memory partition.  */
+  unsigned int unpartitionable : 1;
 };
 
 #define MTAG_GLOBAL(NODE) (TREE_MEMORY_TAG_CHECK (NODE)->mtag.is_global)
@@ -2579,6 +2579,11 @@ struct tree_struct_field_tag GTY(())
 
   /* Alias set for a DECL_NONADDRESSABLE_P field.  Otherwise -1.  */
   alias_set_type alias_set;
+
+  /* Nesting level for this subvariable.  This indicates how many
+     structures are wrapping this field.  Fields at the top level have
+     a nesting level of 0.  */
+  unsigned int nesting_level;
 };
 
 #define SFT_PARENT_VAR(NODE) (STRUCT_FIELD_TAG_CHECK (NODE)->sft.parent_var)
@@ -2587,8 +2592,10 @@ struct tree_struct_field_tag GTY(())
 #define SFT_NONADDRESSABLE_P(NODE) \
   (STRUCT_FIELD_TAG_CHECK (NODE)->sft.alias_set != -1)
 #define SFT_ALIAS_SET(NODE) (STRUCT_FIELD_TAG_CHECK (NODE)->sft.alias_set)
-#define SFT_IN_NESTED_STRUCT(NODE) \
-  (STRUCT_FIELD_TAG_CHECK (NODE)->sft.common.in_nested_struct)
+#define SFT_NESTING_LEVEL(NODE) \
+  (STRUCT_FIELD_TAG_CHECK (NODE)->sft.nesting_level)
+#define SFT_UNPARTITIONABLE_P(NODE) \
+  (STRUCT_FIELD_TAG_CHECK (NODE)->sft.common.unpartitionable)
 
 /* Memory Partition Tags (MPTs) group memory symbols under one
    common name for the purposes of placing memory PHI nodes.  */
Index: testsuite/gcc.c-torture/execute/pr33870-1.c
===================================================================
--- testsuite/gcc.c-torture/execute/pr33870-1.c	(revision 0)
+++ testsuite/gcc.c-torture/execute/pr33870-1.c	(revision 0)
@@ -0,0 +1,94 @@
+extern void abort (void);
+
+typedef struct PgHdr PgHdr;
+typedef unsigned char u8;
+struct PgHdr {
+int y;
+struct {
+ unsigned int pgno;
+ PgHdr *pNextHash, *pPrevHash;
+ PgHdr *pNextFree, *pPrevFree;
+ PgHdr *pNextAll;
+ u8 inJournal;
+ short int nRef;
+ PgHdr *pDirty, *pPrevDirty;
+ unsigned int notUsed;
+} x;
+};
+PgHdr **xx;
+volatile int vx;
+static inline PgHdr *merge_pagelist(PgHdr *pA, PgHdr *pB)
+{
+ PgHdr result;
+ PgHdr *pTail;
+ xx = &result.x.pDirty;
+ pTail = &result;
+ while( pA && pB ){
+   if( pA->x.pgno<pB->x.pgno ){
+     pTail->x.pDirty = pA;
+     pTail = pA;
+     pA = pA->x.pDirty;
+   }else{
+     pTail->x.pDirty = pB;
+     pTail = pB;
+     pB = pB->x.pDirty;
+   }
+   vx = (*xx)->y;
+ }
+ if( pA ){
+   pTail->x.pDirty = pA;
+ }else if( pB ){
+   pTail->x.pDirty = pB;
+ }else{
+   pTail->x.pDirty = 0;
+ }
+ return result.x.pDirty;
+}
+
+PgHdr * __attribute__((noinline)) sort_pagelist(PgHdr *pIn)
+{
+ PgHdr *a[25], *p;
+ int i;
+ __builtin_memset (a, 0, sizeof (a));
+ while( pIn ){
+   p = pIn;
+   pIn = p->x.pDirty;
+   p->x.pDirty = 0;
+   for(i=0; i<25 -1; i++){
+     if( a[i]==0 ){
+       a[i] = p;
+       break;
+     }else{
+       p = merge_pagelist(a[i], p);
+       a[i] = 0;
+       a[i] = 0;
+     }
+   }
+   if( i==25 -1 ){
+     a[i] = merge_pagelist(a[i], p);
+   }
+ }
+ p = a[0];
+ for(i=1; i<25; i++){
+   p = merge_pagelist (p, a[i]);
+ }
+ return p;
+}
+
+int main()
+{
+ PgHdr a[5];
+ PgHdr *p;
+ a[0].x.pgno = 5;
+ a[0].x.pDirty = &a[1];
+ a[1].x.pgno = 4;
+ a[1].x.pDirty = &a[2];
+ a[2].x.pgno = 1;
+ a[2].x.pDirty = &a[3];
+ a[3].x.pgno = 3;
+ a[3].x.pDirty = 0;
+ p = sort_pagelist (&a[0]);
+ if (p->x.pDirty == p)
+   abort ();
+ return 0;
+}
Index: testsuite/gcc.dg/tree-ssa/alias-16.c
===================================================================
--- testsuite/gcc.dg/tree-ssa/alias-16.c	(revision 0)
+++ testsuite/gcc.dg/tree-ssa/alias-16.c	(revision 0)
@@ -0,0 +1,33 @@
+/* { dg-do run } */
+/* { dg-options "-O --param max-aliased-vops=1" } */
+
+/* Compile with -O --param max-aliased-vops=1.  This partitions all
+   the initial SFTs for 'm' which was causing the operand scanner to
+   miss adding the right SFTs to p->b[2].  */
+extern void abort (void);
+
+struct X {
+    int a;
+    struct Y {
+	int b[4];
+    } b;
+    struct Y c;
+} m;
+
+struct X n;
+
+foo (int i)
+{
+  struct Y *p = (i > 10) ? &m.b : &n.c;
+  p->b[2] = 10;
+  m.b.b[3] = 6;
+  n.c.b[2] = 3;
+  return p->b[2] + n.c.b[2] + m.b.b[3];
+}
+
+main()
+{
+  if (foo (3) != 12)
+    abort ();
+  return 0;
+}
Index: tree-ssa-alias.c
===================================================================
--- tree-ssa-alias.c	(revision 130139)
+++ tree-ssa-alias.c	(working copy)
@@ -828,6 +828,13 @@ count_mem_refs (long *num_vuses_p, long 
 static inline long
 mem_sym_score (mem_sym_stats_t mp)
 {
+  /* Unpartitionable SFTs are automatically thrown to the bottom of
+     the list.  They are not stored in partitions, but they are used
+     for computing overall statistics.  */
+  if (TREE_CODE (mp->var) == STRUCT_FIELD_TAG
+      && SFT_UNPARTITIONABLE_P (mp->var))
+    return LONG_MAX;
+
   return mp->frequency_writes * 64 + mp->frequency_reads * 32
          + mp->num_direct_writes * 16 + mp->num_direct_reads * 8
 	 + mp->num_indirect_writes * 4 + mp->num_indirect_reads * 2
@@ -1392,8 +1399,8 @@ update_reference_counts (struct mem_ref_
 
 static void
 build_mp_info (struct mem_ref_stats_d *mem_ref_stats,
-                 VEC(mem_sym_stats_t,heap) **mp_info_p,
-		 VEC(tree,heap) **tags_p)
+               VEC(mem_sym_stats_t,heap) **mp_info_p,
+	       VEC(tree,heap) **tags_p)
 {
   tree var;
   referenced_var_iterator rvi;
@@ -1591,6 +1598,15 @@ compute_memory_partitions (void)
       if (!need_to_partition_p (mem_ref_stats))
 	break;
 
+      /* SFTs that are marked unpartitionable should not be added to
+	 partitions.  These SFTs are special because they mark the
+	 first SFT into a structure where a pointer is pointing to.
+	 This is needed by the operand scanner to find adjacent
+	 fields.  See add_vars_for_offset for details.  */
+      if (TREE_CODE (mp_p->var) == STRUCT_FIELD_TAG
+	  && SFT_UNPARTITIONABLE_P (mp_p->var))
+	continue;
+
       mpt = find_partition_for (mp_p);
       estimate_vop_reduction (mem_ref_stats, mp_p, mpt);
     }
@@ -3774,7 +3790,8 @@ get_or_create_used_part_for (size_t uid)
 
 static tree
 create_sft (tree var, tree field, unsigned HOST_WIDE_INT offset,
-	    unsigned HOST_WIDE_INT size, alias_set_type alias_set)
+	    unsigned HOST_WIDE_INT size, alias_set_type alias_set,
+	    unsigned nesting_level)
 {
   tree subvar = create_tag_raw (STRUCT_FIELD_TAG, field, "SFT");
 
@@ -3794,6 +3811,8 @@ create_sft (tree var, tree field, unsign
   SFT_OFFSET (subvar) = offset;
   SFT_SIZE (subvar) = size;
   SFT_ALIAS_SET (subvar) = alias_set;
+  SFT_NESTING_LEVEL (subvar) = nesting_level;
+
   return subvar;
 }
 
@@ -3814,7 +3833,7 @@ create_overlap_variables_for (tree var)
     return;
 
   push_fields_onto_fieldstack (TREE_TYPE (var), &fieldstack, 0, NULL,
-			       TREE_TYPE (var));
+			       TREE_TYPE (var), 0);
   if (VEC_length (fieldoff_s, fieldstack) != 0)
     {
       subvar_t *subvars;
@@ -3897,7 +3916,6 @@ create_overlap_variables_for (tree var)
 	     field, skip it.  Note that we always need the field at
 	     offset 0 so we can properly handle pointers to the
 	     structure.  */
-
 	  if ((fo->offset != 0
 	       && ((fo->offset <= up->minused
 		    && fo->offset + fosize <= up->minused)
@@ -3906,8 +3924,9 @@ create_overlap_variables_for (tree var)
 		  && fosize == lastfosize
 		  && currfotype == lastfotype))
 	    continue;
-	  subvar = create_sft (var, fo->type, fo->offset,
-			       fosize, fo->alias_set);
+
+	  subvar = create_sft (var, fo->type, fo->offset, fosize,
+			       fo->alias_set, fo->nesting_level);
 	  VEC_quick_push (tree, *subvars, subvar);
 
 	  if (dump_file)
@@ -3918,7 +3937,8 @@ create_overlap_variables_for (tree var)
 		       SFT_OFFSET (subvar));
 	      fprintf (dump_file, " size " HOST_WIDE_INT_PRINT_DEC,
 		       SFT_SIZE (subvar));
-	      fprintf (dump_file, "\n");
+	      fprintf (dump_file, " nesting level %d\n",
+		       SFT_NESTING_LEVEL (subvar));
 	    }
 	  
 	  lastfotype = currfotype;
Index: tree-dfa.c
===================================================================
--- tree-dfa.c	(revision 130139)
+++ tree-dfa.c	(working copy)
@@ -287,7 +287,9 @@ dump_subvars_for (FILE *file, tree var)
   for (i = 0; VEC_iterate (tree, sv, i, subvar); ++i)
     {
       print_generic_expr (file, subvar, dump_flags);
-      fprintf (file, "@" HOST_WIDE_INT_PRINT_UNSIGNED " ", SFT_OFFSET (subvar));
+      fprintf (file, "@" HOST_WIDE_INT_PRINT_UNSIGNED, SFT_OFFSET (subvar));
+      fprintf (file, "[%u]", SFT_NESTING_LEVEL (subvar));
+      fprintf (file, " ");
     }
 
   fprintf (file, "}");
@@ -417,6 +419,15 @@ dump_variable (FILE *file, tree var)
 	  fprintf (file, ", partition symbols: ");
 	  dump_decl_set (file, MPT_SYMBOLS (var));
 	}
+
+      if (TREE_CODE (var) == STRUCT_FIELD_TAG)
+	{
+	  fprintf (file, ", offset: " HOST_WIDE_INT_PRINT_UNSIGNED,
+		   SFT_OFFSET (var));
+	  fprintf (file, ", nesting: %u", SFT_NESTING_LEVEL (var));
+	  fprintf (file, ", partitionable: %s",
+		   SFT_UNPARTITIONABLE_P (var) ? "NO" : "YES");
+	}
     }
 
   fprintf (file, "\n");
Index: tree-flow.h
===================================================================
--- tree-flow.h	(revision 130139)
+++ tree-flow.h	(working copy)
@@ -1159,9 +1159,9 @@ struct fieldoff
   /* Field.  */
   tree decl;
 
-  /* True if this field is inside a structure nested inside the base
-     containing object.  */
-  unsigned int in_nested_struct : 1;
+  /* Nesting level.  This number represents how many structures are
+     wrapping this field.  */
+  unsigned nesting_level;
 
   /* Offset from the base of the base containing object to this field.  */
   HOST_WIDE_INT offset;  
@@ -1173,8 +1173,8 @@ typedef struct fieldoff fieldoff_s;
 
 DEF_VEC_O(fieldoff_s);
 DEF_VEC_ALLOC_O(fieldoff_s,heap);
-int push_fields_onto_fieldstack (tree, VEC(fieldoff_s,heap) **,
-				 HOST_WIDE_INT, bool *, tree);
+int push_fields_onto_fieldstack (tree, VEC(fieldoff_s,heap) **, HOST_WIDE_INT,
+				 bool *, tree, unsigned);
 void sort_fieldstack (VEC(fieldoff_s,heap) *);
 
 void init_alias_heapvars (void);
Index: tree-ssa-structalias.c
===================================================================
--- tree-ssa-structalias.c	(revision 130139)
+++ tree-ssa-structalias.c	(working copy)
@@ -253,15 +253,6 @@ struct variable_info
      variable.  This is used for C++ placement new.  */
   unsigned int no_tbaa_pruning : 1;
 
-  /* True if this variable is inside a structure nested in the
-     structure for the base variable.  For instance, in 
-     struct X { int a; struct Y { int b; int c; } }, the variables for
-     fields 'b' and 'c' are inside a nested structure.  We are not
-     interested in tracking how many levels of nesting, just whether
-     there is nesting at all.  This is later used to adjust offsets
-     for pointers pointing into sub-structures.  */
-  unsigned int in_nested_struct : 1;
-
   /* Points-to set for this variable.  */
   bitmap solution;
 
@@ -4050,19 +4041,28 @@ sort_fieldstack (VEC(fieldoff_s,heap) *f
 	 fieldoff_compare);
 }
 
-/* Given a TYPE, and a vector of field offsets FIELDSTACK, push all the fields
-   of TYPE onto fieldstack, recording their offsets along the way.
-   OFFSET is used to keep track of the offset in this entire structure, rather
-   than just the immediately containing structure.  Returns the number
-   of fields pushed.
+/* Given a TYPE, and a vector of field offsets FIELDSTACK, push all
+   the fields of TYPE onto fieldstack, recording their offsets along
+   the way.
+
+   OFFSET is used to keep track of the offset in this entire
+   structure, rather than just the immediately containing structure.
+   Returns the number of fields pushed.
+
    HAS_UNION is set to true if we find a union type as a field of
-   TYPE.  ADDRESSABLE_TYPE is the type of the outermost object that could have
-   its address taken.  */
+   TYPE.
+
+   ADDRESSABLE_TYPE is the type of the outermost object that could
+   have its address taken.
+
+   NESTING_LEVEL indicates whether TYPE is a structure nested inside
+   another, it starts at 0 and it is incremented by one on every
+   structure recursed into.  */
 
 int
 push_fields_onto_fieldstack (tree type, VEC(fieldoff_s,heap) **fieldstack,
 			     HOST_WIDE_INT offset, bool *has_union,
-			     tree addressable_type)
+			     tree addressable_type, unsigned nesting_level)
 {
   tree field;
   int count = 0;
@@ -4119,11 +4119,14 @@ push_fields_onto_fieldstack (tree type, 
 	  if (!AGGREGATE_TYPE_P (TREE_TYPE (type))) /* var_can_have_subvars */
 	    push = true;
 	  else if (!(pushed = push_fields_onto_fieldstack
-		     (TREE_TYPE (type), fieldstack,
-		      offset + i * TREE_INT_CST_LOW (elsz), has_union,
+		     (TREE_TYPE (type),
+		      fieldstack,
+		      offset + i * TREE_INT_CST_LOW (elsz),
+		      has_union,
 		      (TYPE_NONALIASED_COMPONENT (type)
 		       ? addressable_type
-		       : TREE_TYPE (type)))))
+		       : TREE_TYPE (type)),
+		      nesting_level + 1)))
 	    /* Empty structures may have actual size, like in C++. So
 	       see if we didn't push any subfields and the size is
 	       nonzero, push the field onto the stack */
@@ -4142,12 +4145,7 @@ push_fields_onto_fieldstack (tree type, 
 		pair->alias_set = get_alias_set (addressable_type);
 	      else
 		pair->alias_set = -1;
-
-	      /* If the base offset is positive, this field belongs to
-		 a structure nested inside the base structure.  */
-	      if (offset > 0)
-		pair->in_nested_struct = true;
-
+	      pair->nesting_level = nesting_level;
 	      count++;
 	    }
 	  else
@@ -4171,11 +4169,14 @@ push_fields_onto_fieldstack (tree type, 
 	if (!var_can_have_subvars (field))
 	  push = true;
 	else if (!(pushed = push_fields_onto_fieldstack
-		   (TREE_TYPE (field), fieldstack,
-		    offset + bitpos_of_field (field), has_union,
+		   (TREE_TYPE (field),
+		    fieldstack,
+		    offset + bitpos_of_field (field),
+		    has_union,
 		    (DECL_NONADDRESSABLE_P (field)
 		     ? addressable_type
-		     : TREE_TYPE (field))))
+		     : TREE_TYPE (field)),
+		    nesting_level + 1))
 		 && DECL_SIZE (field)
 		 && !integer_zerop (DECL_SIZE (field)))
 	  /* Empty structures may have actual size, like in C++. So
@@ -4196,12 +4197,7 @@ push_fields_onto_fieldstack (tree type, 
 	      pair->alias_set = get_alias_set (addressable_type);
 	    else
 	      pair->alias_set = -1;
-
-	    /* If the base offset is positive, this field belongs to
-	       a structure nested inside the base structure.  */
-	    if (offset > 0)
-	      pair->in_nested_struct = true;
-
+	    pair->nesting_level = nesting_level;
 	    count++;
 	  }
 	else
@@ -4401,7 +4397,7 @@ create_variable_info_for (tree decl, con
   if (var_can_have_subvars (decl) && use_field_sensitive && !hasunion)
     {
       push_fields_onto_fieldstack (decltype, &fieldstack, 0, &hasunion,
-				   decltype);
+				   decltype, 0);
       if (hasunion)
 	{
 	  VEC_free (fieldoff_s, heap, fieldstack);
@@ -4512,7 +4508,6 @@ create_variable_info_for (tree decl, con
 	  newvi->offset = fo->offset;
 	  newvi->size = TREE_INT_CST_LOW (fo->size);
 	  newvi->fullsize = vi->fullsize;
-	  newvi->in_nested_struct = fo->in_nested_struct;
 	  insert_into_field_list (vi, newvi);
 	  VEC_safe_push (varinfo_t, heap, varmap, newvi);
 	  if (is_global && (!flag_whole_program || !in_ipa_mode))
@@ -4764,8 +4759,20 @@ set_uids_in_ptset (tree ptr, bitmap into
 		  if (no_tbaa_pruning
 		      || (!is_derefed && !vi->directly_dereferenced)
 		      || alias_sets_conflict_p (ptr_alias_set, var_alias_set))
-		    bitmap_set_bit (into, DECL_UID (sft));
-		  SFT_IN_NESTED_STRUCT (sft) = vi->in_nested_struct;
+		    {
+		      bitmap_set_bit (into, DECL_UID (sft));
+		      
+		      /* If SFT is inside a nested structure, it will
+			 be needed by the operand scanner to adjust
+			 offsets when adding operands to memory
+			 expressions that dereference PTR.  This means
+			 that memory partitioning may not partition
+			 this SFT because the operand scanner will not
+			 be able to find the other SFTs next to this
+			 one.  */
+		      if (SFT_NESTING_LEVEL (sft) > 0)
+			SFT_UNPARTITIONABLE_P (sft) = true;
+		    }
 		}
 	    }
 	  else
Index: tree-ssa-operands.c
===================================================================
--- tree-ssa-operands.c	(revision 130139)
+++ tree-ssa-operands.c	(working copy)
@@ -1367,8 +1367,33 @@ access_can_touch_variable (tree ref, tre
   return true;
 }
 
+
+/* Given an aggregate expression FULL_REF, return the number of
+   aggregates that are containing FULL_REF.  So, given a structure
+   reference a.b.c.d, the nesting level for this expression is 2 (the
+   number of '.' in the expression minus 1).  */
+
+static unsigned
+ref_nesting_level (tree full_ref)
+{
+  unsigned nesting_level = 0;
+
+  if (!handled_component_p (full_ref))
+    return 0;
+
+  full_ref = TREE_OPERAND (full_ref, 0);
+  while (handled_component_p (full_ref))
+    {
+      nesting_level++;
+      full_ref = TREE_OPERAND (full_ref, 0);
+    }
+
+  return nesting_level;
+}
+
+
 /* Add the actual variables FULL_REF can access, given a member of
-   full_ref's points-to set VAR, where FULL_REF is an access of SIZE at
+   FULL_REF's points-to set VAR, where FULL_REF is an access of SIZE at
    OFFSET from var. IS_CALL_SITE is true if this is a call, and IS_DEF
    is true if this is supposed to be a vdef, and false if this should
    be a VUSE.
@@ -1386,10 +1411,12 @@ access_can_touch_variable (tree ref, tre
    This is necessary because foop only actually points to foo's first
    member, so that is all the points-to set contains.  However, an access
    to foop->a may be touching some single SFT if we have created some
-   SFT's for a structure.  */
+   SFT's for a structure.
+
+   FULL_REF is the original memory expression being analyzed.  */
 
 static bool
-add_vars_for_offset (tree var, unsigned HOST_WIDE_INT offset,
+add_vars_for_offset (tree full_ref, tree var, unsigned HOST_WIDE_INT offset,
 		     unsigned HOST_WIDE_INT size, bool is_def)
 {
   bool added = false;
@@ -1397,14 +1424,21 @@ add_vars_for_offset (tree var, unsigned 
   subvar_t sv;
   unsigned int i;
 
-  if (SFT_IN_NESTED_STRUCT (var))
+  if (full_ref
+      && SFT_NESTING_LEVEL (var) > 0
+      && ref_nesting_level (full_ref) < SFT_NESTING_LEVEL (var))
     {
       /* Since VAR is an SFT inside a nested structure, the OFFSET
 	 computed by get_ref_base_and_extent is the offset from the
-	 start of the immediately containing structure.  However, to
-	 find out what other SFTs are affected by this reference, we
-	 need to know the offsets starting at the root structure in
-	 the nesting hierarchy.
+	 start of the immediately containing structure.  If VAR is an
+	 SFT inside a nested structure, then FULL_REF may be a
+	 reference to the structure immediately enclosing SFT, and so
+	 OFFSET will be the offset from the start of the immediately
+	 enclosing structure.
+
+	 However, to find out what other SFTs are affected by this
+	 reference, we need to know the offsets starting at the root
+	 structure in the nesting hierarchy.
 
 	 For instance, given the following structure:
 
@@ -1541,7 +1575,7 @@ add_virtual_operand (tree var, stmt_ann_
 	     if it is a potential points-to location.  */
 	  if (TREE_CODE (al) == STRUCT_FIELD_TAG
 	      && TREE_CODE (var) == NAME_MEMORY_TAG)
-	    none_added &= !add_vars_for_offset (al, offset, size,
+	    none_added &= !add_vars_for_offset (full_ref, al, offset, size,
 					        flags & opf_def);
 	  else
 	    {

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Fix PR 33870
  2007-11-13 16:30   ` Diego Novillo
@ 2007-11-13 16:47     ` Richard Guenther
  2007-11-14 12:36       ` Richard Guenther
  0 siblings, 1 reply; 18+ messages in thread
From: Richard Guenther @ 2007-11-13 16:47 UTC (permalink / raw)
  To: Diego Novillo; +Cc: gcc-patches

On Nov 13, 2007 4:44 PM, Diego Novillo <dnovillo@google.com> wrote:
> Richard Guenther wrote:
>
> > taking the address of result.x.pDirty causes the SFT to be marked as
> > in substructure and the problem re-surfaces.
>
> Yes.  Thanks for the test case.  The original patch did not take into
> account that the nesting level of the memory reference is also important.
>
> There are two problems here:
>
> 1- Given an arbitrary memory expression M dereferencing a pointer P into
> a nested structure, the offset computed for M is the offset starting at
> P.  However, the offsets computed for the original SFTs are offsets from
> the top of the pointed-to memory object O.  So, when we recognize this
> situation, we must adjust the offsets.  This was the situation
> recognized and fixed by the original patch.
>
> The bug was in the detection of whether P is "pointing to the middle of
> an object".  This is where the nesting level comes in to play.  Suppose
> that M was the expression p_3->b[3] and the original pointed-to object
> was v.x.y:
>
>                 struct {
>                         ...
>                         struct {
>                                 ...
>                                 struct {
>                                         ...
>                                         int b[4];
>                                 } y;
>                         } x;
>                 } v;
>
>                 p_3 = &v.x.y;
>
> If alias analysis created SFTs for this structure, the array 'b' will
> have 4 SFTs associated with it.  Say SFT.12, SFT.13, SFT.14 and SFT.15.
>   However, only SFT.12 will be added to the alias set of p_3's name tag.
>
> So, when we see the expression 'p_3->b[3]', we compute an offset of 96,
> but we need SFT.14 which will probably have an offset much higher than
> that.  So, we adjust 96 by the offset of SFT.12 and get to SFT.14.
>
> In the original patch, we recognized this situation by asking whether
> the SFT.12 was inside a nested structure.  However, it may happen that
> the pointer was actually pointing to the base of the original object
> 'v'.  In which case, we would have the expression 'p_3->x.y.b[3]', which
> also takes us to SFT.12.
>
> But in this case, the offset we computed from 'p_3->x.y.b[3]' is
> *already* the right offset, so adding the offset of SFT.12 to it gives
> us the wrong offset.  So, what this patch does is determine the nesting
> level of each SFT and the nesting level of the memory expression.  We
> only need to adjust the offset of an SFT if the nesting level of the
> memory expression is lower than the nesting level of the SFT that we are
> analyzing.
>
>
> 2- The second problem we have with this scheme is that we rely on the
> presence of key SFTs to adjust offsets when we detect references to
> nested aggregates.  This is something we cannot easily undo because
> inside the operand scanner we cannot really start doing use-def chain
> chasing to determine the original pointed-to objects.  This work has
> already been done by alias analysis, so duplicating this in the operand
> scanner would be wasteful.
>
> This means that this scheme cannot tolerate these key SFTs to be
> partitioned away.  If they are partitioned, then inside the operand
> scanner is essentially impossible to determine where in the structure is
> an arbitrary memory expression referring to.  This is shown in the new
> test gcc.dg/tree-ssa/alias-16.c.
>
> One way to deal with this would be to use the pointed-to set for the
> base pointer instead of the alias set of the name tag.  This has the
> problem that we'd miss the compile-time benefits of partitioning, as we
> would be traversing the original sets instead of the partitioned
> (smaller) alias sets.  Also, we may want to get rid of the duplicate
> pointed-to/alias sets in the future (not sure if they're both worth
> keeping).
>
> The other way of dealing with this is to recognize which SFTs are
> important to not partition and make sure they are never partitioned.
> During alias analysis, if a nested SFT is added to the alias set of a
> pointer, then it is marked as unpartitionable.  The partitioner, in
> turn, gives these SFTs a very high score and makes sure that they are
> never added to a partition.  The effects of this are/should be
> negligible as it only involves a single SFT and nested structures.
>
> The new test alias-16.c tests for these edge cases.
>
> Tested on x86_64 and ppc64.  Committed to trunk.

Thanks.  Now, you should be able to do all this without introducing
all the nesting-level stuff if you make all pointed-to SFTs unpartitionable.

In fact,

> -  if (SFT_IN_NESTED_STRUCT (var))
> +  if (full_ref
> +      && SFT_NESTING_LEVEL (var) > 0
> +      && ref_nesting_level (full_ref) < SFT_NESTING_LEVEL (var))
>      {
>        /* Since VAR is an SFT inside a nested structure, the OFFSET
>          computed by get_ref_base_and_extent is the offset from the
> -        start of the immediately containing structure.  However, to
> -        find out what other SFTs are affected by this reference, we
> -        need to know the offsets starting at the root structure in
> -        the nesting hierarchy.
> +        start of the immediately containing structure.  If VAR is an
> +        SFT inside a nested structure, then FULL_REF may be a
> +        reference to the structure immediately enclosing SFT, and so
> +        OFFSET will be the offset from the start of the immediately
> +        enclosing structure.
> +
> +        However, to find out what other SFTs are affected by this
> +        reference, we need to know the offsets starting at the root
> +        structure in the nesting hierarchy.
>
>          For instance, given the following structure:

This looks like a pure workaround for that you do not retain the offset
zero SFT as unpartitionable.  Otherwise you'd enter this function with
it (if it is pointed-to), and you can always add the pointed-to SFT offset.

I believe your patch makes it more twisted than it needs to be (it doesn't
even look like an optimization, as you are scanning for pointed-to offset
zero multiple times, for all pointed-to SFTs with a nesting level >
than the ref).

And don't you still get

struct X {
  int i;
  int a[4];
} x;

p = &x->a[0];
return p[3];

wrong?  You don't keep the SFT for a[0] from being partitioned away(?).
[it get's twisted to force GCC to generate IL with an ARRAY_REF here, but
with fortran it's possible I believe, so this is a theory example]

Thanks,
Richard.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Fix PR 33870
  2007-11-13 16:47     ` Richard Guenther
@ 2007-11-14 12:36       ` Richard Guenther
  2007-11-14 12:41         ` Richard Guenther
  0 siblings, 1 reply; 18+ messages in thread
From: Richard Guenther @ 2007-11-14 12:36 UTC (permalink / raw)
  To: Diego Novillo; +Cc: gcc-patches

On Nov 13, 2007 5:08 PM, Richard Guenther <richard.guenther@gmail.com> wrote:
>
> On Nov 13, 2007 4:44 PM, Diego Novillo <dnovillo@google.com> wrote:
> > Richard Guenther wrote:
> >
> > > taking the address of result.x.pDirty causes the SFT to be marked as
> > > in substructure and the problem re-surfaces.
> >
> > Yes.  Thanks for the test case.  The original patch did not take into
> > account that the nesting level of the memory reference is also important.
> >
> > There are two problems here:
> >
> > 1- Given an arbitrary memory expression M dereferencing a pointer P into
> > a nested structure, the offset computed for M is the offset starting at
> > P.  However, the offsets computed for the original SFTs are offsets from
> > the top of the pointed-to memory object O.  So, when we recognize this
> > situation, we must adjust the offsets.  This was the situation
> > recognized and fixed by the original patch.
> >
> > The bug was in the detection of whether P is "pointing to the middle of
> > an object".  This is where the nesting level comes in to play.  Suppose
> > that M was the expression p_3->b[3] and the original pointed-to object
> > was v.x.y:
> >
> >                 struct {
> >                         ...
> >                         struct {
> >                                 ...
> >                                 struct {
> >                                         ...
> >                                         int b[4];
> >                                 } y;
> >                         } x;
> >                 } v;
> >
> >                 p_3 = &v.x.y;
> >
> > If alias analysis created SFTs for this structure, the array 'b' will
> > have 4 SFTs associated with it.  Say SFT.12, SFT.13, SFT.14 and SFT.15.
> >   However, only SFT.12 will be added to the alias set of p_3's name tag.
> >
> > So, when we see the expression 'p_3->b[3]', we compute an offset of 96,
> > but we need SFT.14 which will probably have an offset much higher than
> > that.  So, we adjust 96 by the offset of SFT.12 and get to SFT.14.
> >
> > In the original patch, we recognized this situation by asking whether
> > the SFT.12 was inside a nested structure.  However, it may happen that
> > the pointer was actually pointing to the base of the original object
> > 'v'.  In which case, we would have the expression 'p_3->x.y.b[3]', which
> > also takes us to SFT.12.
> >
> > But in this case, the offset we computed from 'p_3->x.y.b[3]' is
> > *already* the right offset, so adding the offset of SFT.12 to it gives
> > us the wrong offset.  So, what this patch does is determine the nesting
> > level of each SFT and the nesting level of the memory expression.  We
> > only need to adjust the offset of an SFT if the nesting level of the
> > memory expression is lower than the nesting level of the SFT that we are
> > analyzing.
> >
> >
> > 2- The second problem we have with this scheme is that we rely on the
> > presence of key SFTs to adjust offsets when we detect references to
> > nested aggregates.  This is something we cannot easily undo because
> > inside the operand scanner we cannot really start doing use-def chain
> > chasing to determine the original pointed-to objects.  This work has
> > already been done by alias analysis, so duplicating this in the operand
> > scanner would be wasteful.
> >
> > This means that this scheme cannot tolerate these key SFTs to be
> > partitioned away.  If they are partitioned, then inside the operand
> > scanner is essentially impossible to determine where in the structure is
> > an arbitrary memory expression referring to.  This is shown in the new
> > test gcc.dg/tree-ssa/alias-16.c.
> >
> > One way to deal with this would be to use the pointed-to set for the
> > base pointer instead of the alias set of the name tag.  This has the
> > problem that we'd miss the compile-time benefits of partitioning, as we
> > would be traversing the original sets instead of the partitioned
> > (smaller) alias sets.  Also, we may want to get rid of the duplicate
> > pointed-to/alias sets in the future (not sure if they're both worth
> > keeping).
> >
> > The other way of dealing with this is to recognize which SFTs are
> > important to not partition and make sure they are never partitioned.
> > During alias analysis, if a nested SFT is added to the alias set of a
> > pointer, then it is marked as unpartitionable.  The partitioner, in
> > turn, gives these SFTs a very high score and makes sure that they are
> > never added to a partition.  The effects of this are/should be
> > negligible as it only involves a single SFT and nested structures.
> >
> > The new test alias-16.c tests for these edge cases.
> >
> > Tested on x86_64 and ppc64.  Committed to trunk.
>
> Thanks.  Now, you should be able to do all this without introducing
> all the nesting-level stuff if you make all pointed-to SFTs unpartitionable.
>
> In fact,
>
> > -  if (SFT_IN_NESTED_STRUCT (var))
> > +  if (full_ref
> > +      && SFT_NESTING_LEVEL (var) > 0
> > +      && ref_nesting_level (full_ref) < SFT_NESTING_LEVEL (var))
> >      {
> >        /* Since VAR is an SFT inside a nested structure, the OFFSET
> >          computed by get_ref_base_and_extent is the offset from the
> > -        start of the immediately containing structure.  However, to
> > -        find out what other SFTs are affected by this reference, we
> > -        need to know the offsets starting at the root structure in
> > -        the nesting hierarchy.
> > +        start of the immediately containing structure.  If VAR is an
> > +        SFT inside a nested structure, then FULL_REF may be a
> > +        reference to the structure immediately enclosing SFT, and so
> > +        OFFSET will be the offset from the start of the immediately
> > +        enclosing structure.
> > +
> > +        However, to find out what other SFTs are affected by this
> > +        reference, we need to know the offsets starting at the root
> > +        structure in the nesting hierarchy.
> >
> >          For instance, given the following structure:
>
> This looks like a pure workaround for that you do not retain the offset
> zero SFT as unpartitionable.  Otherwise you'd enter this function with
> it (if it is pointed-to), and you can always add the pointed-to SFT offset.
>
> I believe your patch makes it more twisted than it needs to be (it doesn't
> even look like an optimization, as you are scanning for pointed-to offset
> zero multiple times, for all pointed-to SFTs with a nesting level >
> than the ref).
>
> And don't you still get
>
> struct X {
>   int i;
>   int a[4];
> } x;
>
> p = &x->a[0];
> return p[3];
>
> wrong?  You don't keep the SFT for a[0] from being partitioned away(?).
> [it get's twisted to force GCC to generate IL with an ARRAY_REF here, but
> with fortran it's possible I believe, so this is a theory example]

Ok, I see that even for

  struct X {
     int i;
     int a[4];
  };

the SFTs for i and a have a nesting level of 1 (that is, if pointed-to, they
are disregarded for partitioning).  So the case that is not fixed yet is only
the case of a toplevel array:

struct X {
  int i;
  int a[4];
} m;

int a[4];

static inline void *fwd(void *p) { return p; }

int foo(int b)
{
  int (*p)[4] = b ? &a : &m.a;
  a[3] = 0;
  (*p)[3] = 1;
  return (*p)[3] + (*p)[2] + (*p)[1] + a[0] + a[3];
}

extern void abort (void);

int main()
{
  int i;
  for (i = 0; i < 4; ++i)
    a[i] = 0;
  if (foo(1) != 2)
    abort ();
  return 0;
}

fails with at least -O --param max-aliased-vops=1

Because we partition away the SFT for &a[0].  I'm going to fix this by removing
all the nesting level logic again and just mark all pointed-to SFTs as
unpartitionable.  That means, the following fixes this:

Index: tree-ssa-structalias.c
===================================================================
--- tree-ssa-structalias.c      (revision 130174)
+++ tree-ssa-structalias.c      (working copy)
@@ -4770,7 +4770,7 @@ set_uids_in_ptset (tree ptr, bitmap into
                         this SFT because the operand scanner will not
                         be able to find the other SFTs next to this
                         one.  */
-                     if (SFT_NESTING_LEVEL (sft) > 0)
+                     if (1 || SFT_NESTING_LEVEL (sft) > 0)
                        SFT_UNPARTITIONABLE_P (sft) = true;
                    }
                }

Richard.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Fix PR 33870
  2007-11-14 12:36       ` Richard Guenther
@ 2007-11-14 12:41         ` Richard Guenther
  2007-11-14 13:01           ` Diego Novillo
  0 siblings, 1 reply; 18+ messages in thread
From: Richard Guenther @ 2007-11-14 12:41 UTC (permalink / raw)
  To: Diego Novillo; +Cc: gcc-patches

On Nov 14, 2007 11:55 AM, Richard Guenther <richard.guenther@gmail.com> wrote:
>
> On Nov 13, 2007 5:08 PM, Richard Guenther <richard.guenther@gmail.com> wrote:
> >
> > On Nov 13, 2007 4:44 PM, Diego Novillo <dnovillo@google.com> wrote:
> > > Richard Guenther wrote:
> > >
> > > > taking the address of result.x.pDirty causes the SFT to be marked as
> > > > in substructure and the problem re-surfaces.
> > >
> > > Yes.  Thanks for the test case.  The original patch did not take into
> > > account that the nesting level of the memory reference is also important.
> > >
> > > There are two problems here:
> > >
> > > 1- Given an arbitrary memory expression M dereferencing a pointer P into
> > > a nested structure, the offset computed for M is the offset starting at
> > > P.  However, the offsets computed for the original SFTs are offsets from
> > > the top of the pointed-to memory object O.  So, when we recognize this
> > > situation, we must adjust the offsets.  This was the situation
> > > recognized and fixed by the original patch.
> > >
> > > The bug was in the detection of whether P is "pointing to the middle of
> > > an object".  This is where the nesting level comes in to play.  Suppose
> > > that M was the expression p_3->b[3] and the original pointed-to object
> > > was v.x.y:
> > >
> > >                 struct {
> > >                         ...
> > >                         struct {
> > >                                 ...
> > >                                 struct {
> > >                                         ...
> > >                                         int b[4];
> > >                                 } y;
> > >                         } x;
> > >                 } v;
> > >
> > >                 p_3 = &v.x.y;
> > >
> > > If alias analysis created SFTs for this structure, the array 'b' will
> > > have 4 SFTs associated with it.  Say SFT.12, SFT.13, SFT.14 and SFT.15.
> > >   However, only SFT.12 will be added to the alias set of p_3's name tag.
> > >
> > > So, when we see the expression 'p_3->b[3]', we compute an offset of 96,
> > > but we need SFT.14 which will probably have an offset much higher than
> > > that.  So, we adjust 96 by the offset of SFT.12 and get to SFT.14.
> > >
> > > In the original patch, we recognized this situation by asking whether
> > > the SFT.12 was inside a nested structure.  However, it may happen that
> > > the pointer was actually pointing to the base of the original object
> > > 'v'.  In which case, we would have the expression 'p_3->x.y.b[3]', which
> > > also takes us to SFT.12.
> > >
> > > But in this case, the offset we computed from 'p_3->x.y.b[3]' is
> > > *already* the right offset, so adding the offset of SFT.12 to it gives
> > > us the wrong offset.  So, what this patch does is determine the nesting
> > > level of each SFT and the nesting level of the memory expression.  We
> > > only need to adjust the offset of an SFT if the nesting level of the
> > > memory expression is lower than the nesting level of the SFT that we are
> > > analyzing.
> > >
> > >
> > > 2- The second problem we have with this scheme is that we rely on the
> > > presence of key SFTs to adjust offsets when we detect references to
> > > nested aggregates.  This is something we cannot easily undo because
> > > inside the operand scanner we cannot really start doing use-def chain
> > > chasing to determine the original pointed-to objects.  This work has
> > > already been done by alias analysis, so duplicating this in the operand
> > > scanner would be wasteful.
> > >
> > > This means that this scheme cannot tolerate these key SFTs to be
> > > partitioned away.  If they are partitioned, then inside the operand
> > > scanner is essentially impossible to determine where in the structure is
> > > an arbitrary memory expression referring to.  This is shown in the new
> > > test gcc.dg/tree-ssa/alias-16.c.
> > >
> > > One way to deal with this would be to use the pointed-to set for the
> > > base pointer instead of the alias set of the name tag.  This has the
> > > problem that we'd miss the compile-time benefits of partitioning, as we
> > > would be traversing the original sets instead of the partitioned
> > > (smaller) alias sets.  Also, we may want to get rid of the duplicate
> > > pointed-to/alias sets in the future (not sure if they're both worth
> > > keeping).
> > >
> > > The other way of dealing with this is to recognize which SFTs are
> > > important to not partition and make sure they are never partitioned.
> > > During alias analysis, if a nested SFT is added to the alias set of a
> > > pointer, then it is marked as unpartitionable.  The partitioner, in
> > > turn, gives these SFTs a very high score and makes sure that they are
> > > never added to a partition.  The effects of this are/should be
> > > negligible as it only involves a single SFT and nested structures.
> > >
> > > The new test alias-16.c tests for these edge cases.
> > >
> > > Tested on x86_64 and ppc64.  Committed to trunk.
> >
> > Thanks.  Now, you should be able to do all this without introducing
> > all the nesting-level stuff if you make all pointed-to SFTs unpartitionable.
> >
> > In fact,
> >
> > > -  if (SFT_IN_NESTED_STRUCT (var))
> > > +  if (full_ref
> > > +      && SFT_NESTING_LEVEL (var) > 0
> > > +      && ref_nesting_level (full_ref) < SFT_NESTING_LEVEL (var))
> > >      {
> > >        /* Since VAR is an SFT inside a nested structure, the OFFSET
> > >          computed by get_ref_base_and_extent is the offset from the
> > > -        start of the immediately containing structure.  However, to
> > > -        find out what other SFTs are affected by this reference, we
> > > -        need to know the offsets starting at the root structure in
> > > -        the nesting hierarchy.
> > > +        start of the immediately containing structure.  If VAR is an
> > > +        SFT inside a nested structure, then FULL_REF may be a
> > > +        reference to the structure immediately enclosing SFT, and so
> > > +        OFFSET will be the offset from the start of the immediately
> > > +        enclosing structure.
> > > +
> > > +        However, to find out what other SFTs are affected by this
> > > +        reference, we need to know the offsets starting at the root
> > > +        structure in the nesting hierarchy.
> > >
> > >          For instance, given the following structure:
> >
> > This looks like a pure workaround for that you do not retain the offset
> > zero SFT as unpartitionable.  Otherwise you'd enter this function with
> > it (if it is pointed-to), and you can always add the pointed-to SFT offset.
> >
> > I believe your patch makes it more twisted than it needs to be (it doesn't
> > even look like an optimization, as you are scanning for pointed-to offset
> > zero multiple times, for all pointed-to SFTs with a nesting level >
> > than the ref).

Oh, and as you basically go back towards my first fix that was reverted this
patch causes compile-time regressions and we hit
internal compiler error: in ssa_operand_alloc, at tree-ssa-operands.c:484
more often as well.

(I believe the latter is because the max-aliased-vops limit is a per-function
one while a per-stmt one would be more useful, though I see that it is
difficult to estimate per-stmt VOP reduction during partitioning.  But as
we do the limit per-function we are not efficient in preventing this ICE)

Richard.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Fix PR 33870
  2007-11-14 12:41         ` Richard Guenther
@ 2007-11-14 13:01           ` Diego Novillo
  2007-11-14 13:18             ` Richard Guenther
  0 siblings, 1 reply; 18+ messages in thread
From: Diego Novillo @ 2007-11-14 13:01 UTC (permalink / raw)
  To: Richard Guenther; +Cc: gcc-patches

Richard Guenther wrote:

> Oh, and as you basically go back towards my first fix that was reverted this
> patch causes compile-time regressions and we hit
> internal compiler error: in ssa_operand_alloc, at tree-ssa-operands.c:484
> more often as well.

You'll need to be more specific.  We are simply limiting the first SFT 
of a nested struct out of the partitioner, instead of recursing into MPT 
elements during operand scanning.

Where is the extra compile time coming from?  Are we generating that 
many more virtual operands?  It would help quite a bit if you'd open a 
PR with a testcase for me to reproduce this behaviour.


Thanks.  Diego.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Fix PR 33870
  2007-11-14 13:01           ` Diego Novillo
@ 2007-11-14 13:18             ` Richard Guenther
  2007-11-14 13:59               ` Diego Novillo
  0 siblings, 1 reply; 18+ messages in thread
From: Richard Guenther @ 2007-11-14 13:18 UTC (permalink / raw)
  To: Diego Novillo; +Cc: gcc-patches

On Nov 14, 2007 1:04 PM, Diego Novillo <dnovillo@google.com> wrote:
> Richard Guenther wrote:
>
> > Oh, and as you basically go back towards my first fix that was reverted this
> > patch causes compile-time regressions and we hit
> > internal compiler error: in ssa_operand_alloc, at tree-ssa-operands.c:484
> > more often as well.
>
> You'll need to be more specific.  We are simply limiting the first SFT
> of a nested struct out of the partitioner, instead of recursing into MPT
> elements during operand scanning.

No, we are limiting all pointed-to SFTs of all structs (maybe this was not
intended, but it is necessary) out of the partitioner.  Which will effectively
make the NMTs contain all pointed-to SFTs.  My initial patch fixed the
problem by recursing into MPTs that are contained in NMTs to look for
SFTs in them.  With my initial patch those MPTs could contain unrelated
symbols, but clearly you are growing the numer of symbols in NMTs
(the regression is not as bad as for my initial patch, but is the least
regression we can have there if we want to fix the miscompilation this way).

> Where is the extra compile time coming from?  Are we generating that
> many more virtual operands?  It would help quite a bit if you'd open a
> PR with a testcase for me to reproduce this behaviour.

We enlarge the NMT aliased symbols.  (And obviously we can now create
aritifical testcases with any number of aliased SFTs remaining on a stmt,
I'll come up with one of those for you)

Richard.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Fix PR 33870
  2007-11-14 13:18             ` Richard Guenther
@ 2007-11-14 13:59               ` Diego Novillo
  2007-11-14 14:07                 ` Richard Guenther
  0 siblings, 1 reply; 18+ messages in thread
From: Diego Novillo @ 2007-11-14 13:59 UTC (permalink / raw)
  To: Richard Guenther; +Cc: gcc-patches

Richard Guenther wrote:

> No, we are limiting all pointed-to SFTs of all structs (maybe this was not
> intended, but it is necessary) out of the partitioner.

*All* of them?  That's a bug.  Only the first one inside a nested 
structure should be limited.


> but clearly you are growing the numer of symbols in NMTs
> (the regression is not as bad as for my initial patch, but is the least
> regression we can have there if we want to fix the miscompilation this way).

Yes, but we should only be adding as many SFTs as nested structures we 
have.  So, to give you a ridiculous example, if we had a structure with 
300 fields and 2 nested structures, we should only have 2 
unpartitionable SFTs (the first one of each nested structure).


> We enlarge the NMT aliased symbols.  (And obviously we can now create
> aritifical testcases with any number of aliased SFTs remaining on a stmt,
> I'll come up with one of those for you)

Thanks.  The secret here is going to be one where we are marking way too 
many SFTs as unpartitionable.  My contention is that we shouldn't be 
doing this too often.  If we are, that's the bug.


Thanks.  Diego.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Fix PR 33870
  2007-11-14 13:59               ` Diego Novillo
@ 2007-11-14 14:07                 ` Richard Guenther
  2007-11-14 14:09                   ` Diego Novillo
  0 siblings, 1 reply; 18+ messages in thread
From: Richard Guenther @ 2007-11-14 14:07 UTC (permalink / raw)
  To: Diego Novillo; +Cc: gcc-patches

On Nov 14, 2007 1:41 PM, Diego Novillo <dnovillo@google.com> wrote:
> Richard Guenther wrote:
>
> > No, we are limiting all pointed-to SFTs of all structs (maybe this was not
> > intended, but it is necessary) out of the partitioner.
>
> *All* of them?  That's a bug.  Only the first one inside a nested
> structure should be limited.

Yes, for

 struct X {
    int i;
  } m;

the SFT for &m.i; has nesting level 1.  And this is required - see the example
with the toplevel array that is still failing.

> > but clearly you are growing the numer of symbols in NMTs
> > (the regression is not as bad as for my initial patch, but is the least
> > regression we can have there if we want to fix the miscompilation this way).
>
> Yes, but we should only be adding as many SFTs as nested structures we
> have.  So, to give you a ridiculous example, if we had a structure with
> 300 fields and 2 nested structures, we should only have 2
> unpartitionable SFTs (the first one of each nested structure).

Huh?  No, all pointed-to SFTs are unpartitionable. So,

 struct X {
   struct Y {
     int i1;
     int i2;
     ...

&i1;
&i2;
...

makes all SFTs that belong to i1, i2, etc. unpartitionable.  At least that's
how your code works ;)

I guess in add_uids_for_ptset instead of

!                     if (SFT_NESTING_LEVEL (sft) > 0)
!                       SFT_UNPARTITIONABLE_P (sft) = true;

we can get away with only doing that if the type of the SFT itself could
have subvariables (which is probably again what you intended?).

> > We enlarge the NMT aliased symbols.  (And obviously we can now create
> > aritifical testcases with any number of aliased SFTs remaining on a stmt,
> > I'll come up with one of those for you)
>
> Thanks.  The secret here is going to be one where we are marking way too
> many SFTs as unpartitionable.  My contention is that we shouldn't be
> doing this too often.  If we are, that's the bug.

Probably the latter then.

Richard.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Fix PR 33870
  2007-11-14 14:07                 ` Richard Guenther
@ 2007-11-14 14:09                   ` Diego Novillo
  2007-11-14 14:22                     ` Richard Guenther
  0 siblings, 1 reply; 18+ messages in thread
From: Diego Novillo @ 2007-11-14 14:09 UTC (permalink / raw)
  To: Richard Guenther; +Cc: gcc-patches

Richard Guenther wrote:

> I guess in add_uids_for_ptset instead of
> 
> !                     if (SFT_NESTING_LEVEL (sft) > 0)
> !                       SFT_UNPARTITIONABLE_P (sft) = true;
> 
> we can get away with only doing that if the type of the SFT itself could
> have subvariables (which is probably again what you intended?).

Hmm, well the SFTs themselves are usually of scalar type.  The first SFT 
of an int[] array is of type int.  Yes, the bug is around here.  Perhaps 
the easiest way to fix this is to mark the variable earlier when we are 
pushing fields in the fieldstack (in the code that computes the nesting 
level).  And then transport that marker into the SFT here.


Diego.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Fix PR 33870
  2007-11-14 14:09                   ` Diego Novillo
@ 2007-11-14 14:22                     ` Richard Guenther
  2007-11-14 15:29                       ` Richard Guenther
  0 siblings, 1 reply; 18+ messages in thread
From: Richard Guenther @ 2007-11-14 14:22 UTC (permalink / raw)
  To: Diego Novillo; +Cc: gcc-patches, Richard Guenther

On Nov 14, 2007 1:56 PM, Diego Novillo <dnovillo@google.com> wrote:
> Richard Guenther wrote:
>
> > I guess in add_uids_for_ptset instead of
> >
> > !                     if (SFT_NESTING_LEVEL (sft) > 0)
> > !                       SFT_UNPARTITIONABLE_P (sft) = true;
> >
> > we can get away with only doing that if the type of the SFT itself could
> > have subvariables (which is probably again what you intended?).
>
> Hmm, well the SFTs themselves are usually of scalar type.  The first SFT
> of an int[] array is of type int.  Yes, the bug is around here.  Perhaps
> the easiest way to fix this is to mark the variable earlier when we are
> pushing fields in the fieldstack (in the code that computes the nesting
> level).  And then transport that marker into the SFT here.

Yes, I'm doing that at the moment.  Stay tuned for a patch.

Richard.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Fix PR 33870
  2007-11-14 14:22                     ` Richard Guenther
@ 2007-11-14 15:29                       ` Richard Guenther
  2007-11-14 15:30                         ` Diego Novillo
  0 siblings, 1 reply; 18+ messages in thread
From: Richard Guenther @ 2007-11-14 15:29 UTC (permalink / raw)
  To: gcc-patches; +Cc: Diego Novillo

On Wed, 14 Nov 2007, Richard Guenther wrote:

> On Nov 14, 2007 1:56 PM, Diego Novillo <dnovillo@google.com> wrote:
> > Richard Guenther wrote:
> >
> > > I guess in add_uids_for_ptset instead of
> > >
> > > !                     if (SFT_NESTING_LEVEL (sft) > 0)
> > > !                       SFT_UNPARTITIONABLE_P (sft) = true;
> > >
> > > we can get away with only doing that if the type of the SFT itself could
> > > have subvariables (which is probably again what you intended?).
> >
> > Hmm, well the SFTs themselves are usually of scalar type.  The first SFT
> > of an int[] array is of type int.  Yes, the bug is around here.  Perhaps
> > the easiest way to fix this is to mark the variable earlier when we are
> > pushing fields in the fieldstack (in the code that computes the nesting
> > level).  And then transport that marker into the SFT here.
> 
> Yes, I'm doing that at the moment.  Stay tuned for a patch.

Like this one.  It does a few things though:

 1) Remove the nesting level stuff

 2) Mark SFTs that can be used as a base for further sub-references
    (ARRAY_REFs, REAL/IMAGPART, COMPONENT_REF, etc.)

 3) Mark all pointed-to SFTs that can be used as a base for sub-references
    as not partitionable

 4) Fix a bug in TBAA pruning (without that we TBAA prune the pointed-to
    SFT (of type unsigned int) from the NMT as unsigned int does not
    alias with struct PgHdr *.  We were wrongly using the pointer alias
    set instead of the pointed-to alias set for the pruning.

This should reduce the compile-time regression we are seeing at the 
moment.

Bootstrap and regtest on x86_64-unknown-linux-gnu in progress, I intend
to commit this if that (as expected) finishes successfully.

Richard.

2007-11-14  Richard Guenther  <rguenther@suse.de>

	* tree.h (struct tree_memory_tag): Add base_for_components flag.
	(struct tree_struct_field_tag): Remove nesting_level field.
	(SFT_NESTING_LEVEL): Remove.
	(SFT_BASE_FOR_COMPONENTS_P): Add.
	* tree-flow.h (struct fieldoff): Remove nesting_level field.  Add
	base_for_components flag.
	(push_fields_onto_fieldstack): Remove nesting_level parameter.
	* tree-ssa-alias.c (create_sft): Likewise.  Add base_for_components
	parameter.
	(create_overlap_variables_for): Deal with it.
	* tree-dfa.c (dump_subvars_for): Likewise.
	(dump_variable): Likewise.
	* tree-ssa-structalias.c (push_fields_onto_fieldstack): Likewise.
	Set base_for_components for first elements of sub-structures.
	(create_variable_info_for): Handle base_for_components.
	(set_uids_in_ptset): Always set SFT_UNPARTITIONABLE_P for
	pointed-to SFTs if SFT_BASE_FOR_COMPONENTS_P is set.
	Use the correct alias set to do TBAA pruning.
	(find_what_p_points_to): Only call set_uids_in_ptset for pointer
	variables.
	* tree-ssa-operands.c (ref_nesting_level): Remove.
	(add_vars_for_offset): Remove full_ref parameter, always add
	the offset of the pointed-to SFT.
	(add_virtual_operand): Adjust for changed signature of
	add_vars_for_offset.

	* gcc.dg/torture/pr33870.c: New testcase.
	

Index: tree.h
===================================================================
*** tree.h	(revision 130174)
--- tree.h	(working copy)
*************** struct tree_memory_tag GTY(())
*** 2557,2562 ****
--- 2557,2565 ----
    /* True if this tag has global scope.  */
    unsigned int is_global : 1;
  
+   /* True if this tag can be a base for component accesses.  */
+   unsigned int base_for_components : 1;
+ 
    /* True if this tag should not be grouped into a memory partition.  */
    unsigned int unpartitionable : 1;
  };
*************** struct tree_struct_field_tag GTY(())
*** 2579,2601 ****
  
    /* Alias set for a DECL_NONADDRESSABLE_P field.  Otherwise -1.  */
    alias_set_type alias_set;
- 
-   /* Nesting level for this subvariable.  This indicates how many
-      structures are wrapping this field.  Fields at the top level have
-      a nesting level of 0.  */
-   unsigned int nesting_level;
  };
- 
  #define SFT_PARENT_VAR(NODE) (STRUCT_FIELD_TAG_CHECK (NODE)->sft.parent_var)
  #define SFT_OFFSET(NODE) (STRUCT_FIELD_TAG_CHECK (NODE)->sft.offset)
  #define SFT_SIZE(NODE) (STRUCT_FIELD_TAG_CHECK (NODE)->sft.size)
  #define SFT_NONADDRESSABLE_P(NODE) \
    (STRUCT_FIELD_TAG_CHECK (NODE)->sft.alias_set != -1)
  #define SFT_ALIAS_SET(NODE) (STRUCT_FIELD_TAG_CHECK (NODE)->sft.alias_set)
- #define SFT_NESTING_LEVEL(NODE) \
-   (STRUCT_FIELD_TAG_CHECK (NODE)->sft.nesting_level)
  #define SFT_UNPARTITIONABLE_P(NODE) \
    (STRUCT_FIELD_TAG_CHECK (NODE)->sft.common.unpartitionable)
  
  /* Memory Partition Tags (MPTs) group memory symbols under one
     common name for the purposes of placing memory PHI nodes.  */
--- 2582,2598 ----
  
    /* Alias set for a DECL_NONADDRESSABLE_P field.  Otherwise -1.  */
    alias_set_type alias_set;
  };
  #define SFT_PARENT_VAR(NODE) (STRUCT_FIELD_TAG_CHECK (NODE)->sft.parent_var)
  #define SFT_OFFSET(NODE) (STRUCT_FIELD_TAG_CHECK (NODE)->sft.offset)
  #define SFT_SIZE(NODE) (STRUCT_FIELD_TAG_CHECK (NODE)->sft.size)
  #define SFT_NONADDRESSABLE_P(NODE) \
    (STRUCT_FIELD_TAG_CHECK (NODE)->sft.alias_set != -1)
  #define SFT_ALIAS_SET(NODE) (STRUCT_FIELD_TAG_CHECK (NODE)->sft.alias_set)
  #define SFT_UNPARTITIONABLE_P(NODE) \
    (STRUCT_FIELD_TAG_CHECK (NODE)->sft.common.unpartitionable)
+ #define SFT_BASE_FOR_COMPONENTS_P(NODE) \
+   (STRUCT_FIELD_TAG_CHECK (NODE)->sft.common.base_for_components)
  
  /* Memory Partition Tags (MPTs) group memory symbols under one
     common name for the purposes of placing memory PHI nodes.  */
Index: testsuite/gcc.dg/torture/pr33870.c
===================================================================
*** testsuite/gcc.dg/torture/pr33870.c	(revision 0)
--- testsuite/gcc.dg/torture/pr33870.c	(revision 0)
***************
*** 0 ****
--- 1,30 ----
+ /* { dg-do run } */
+ /* { dg-options "--param max-aliased-vops=1" } */
+ 
+ struct X {
+   int i;
+   int a[4];
+ } m;
+ 
+ int a[4];
+ 
+ int __attribute__((noinline)) foo(int b)
+ {
+   int (*p)[4] = b ? &a : &m.a;
+   a[3] = 0;
+   (*p)[3] = 1;
+   return (*p)[3] + (*p)[2] + (*p)[1] + a[0] + a[3];
+ }
+ 
+ extern void abort (void);
+ 
+ int main()
+ {
+   int i;
+   for (i = 0; i < 4; ++i)
+     a[i] = 0;
+   if (foo(1) != 2)
+     abort ();
+   return 0;
+ }
+ 
Index: tree-ssa-alias.c
===================================================================
*** tree-ssa-alias.c	(revision 130174)
--- tree-ssa-alias.c	(working copy)
*************** get_or_create_used_part_for (size_t uid)
*** 3791,3797 ****
  static tree
  create_sft (tree var, tree field, unsigned HOST_WIDE_INT offset,
  	    unsigned HOST_WIDE_INT size, alias_set_type alias_set,
! 	    unsigned nesting_level)
  {
    tree subvar = create_tag_raw (STRUCT_FIELD_TAG, field, "SFT");
  
--- 3791,3797 ----
  static tree
  create_sft (tree var, tree field, unsigned HOST_WIDE_INT offset,
  	    unsigned HOST_WIDE_INT size, alias_set_type alias_set,
! 	    bool base_for_components)
  {
    tree subvar = create_tag_raw (STRUCT_FIELD_TAG, field, "SFT");
  
*************** create_sft (tree var, tree field, unsign
*** 3811,3817 ****
    SFT_OFFSET (subvar) = offset;
    SFT_SIZE (subvar) = size;
    SFT_ALIAS_SET (subvar) = alias_set;
!   SFT_NESTING_LEVEL (subvar) = nesting_level;
  
    return subvar;
  }
--- 3811,3818 ----
    SFT_OFFSET (subvar) = offset;
    SFT_SIZE (subvar) = size;
    SFT_ALIAS_SET (subvar) = alias_set;
!   SFT_BASE_FOR_COMPONENTS_P (subvar) = base_for_components;
!   SFT_UNPARTITIONABLE_P (subvar) = false;
  
    return subvar;
  }
*************** create_overlap_variables_for (tree var)
*** 3833,3839 ****
      return;
  
    push_fields_onto_fieldstack (TREE_TYPE (var), &fieldstack, 0, NULL,
! 			       TREE_TYPE (var), 0);
    if (VEC_length (fieldoff_s, fieldstack) != 0)
      {
        subvar_t *subvars;
--- 3834,3840 ----
      return;
  
    push_fields_onto_fieldstack (TREE_TYPE (var), &fieldstack, 0, NULL,
! 			       TREE_TYPE (var));
    if (VEC_length (fieldoff_s, fieldstack) != 0)
      {
        subvar_t *subvars;
*************** create_overlap_variables_for (tree var)
*** 3916,3921 ****
--- 3917,3923 ----
  	     field, skip it.  Note that we always need the field at
  	     offset 0 so we can properly handle pointers to the
  	     structure.  */
+ 
  	  if ((fo->offset != 0
  	       && ((fo->offset <= up->minused
  		    && fo->offset + fosize <= up->minused)
*************** create_overlap_variables_for (tree var)
*** 3924,3932 ****
  		  && fosize == lastfosize
  		  && currfotype == lastfotype))
  	    continue;
! 
! 	  subvar = create_sft (var, fo->type, fo->offset, fosize,
! 			       fo->alias_set, fo->nesting_level);
  	  VEC_quick_push (tree, *subvars, subvar);
  
  	  if (dump_file)
--- 3926,3933 ----
  		  && fosize == lastfosize
  		  && currfotype == lastfotype))
  	    continue;
! 	  subvar = create_sft (var, fo->type, fo->offset,
! 			       fosize, fo->alias_set, fo->base_for_components);
  	  VEC_quick_push (tree, *subvars, subvar);
  
  	  if (dump_file)
*************** create_overlap_variables_for (tree var)
*** 3937,3944 ****
  		       SFT_OFFSET (subvar));
  	      fprintf (dump_file, " size " HOST_WIDE_INT_PRINT_DEC,
  		       SFT_SIZE (subvar));
! 	      fprintf (dump_file, " nesting level %d\n",
! 		       SFT_NESTING_LEVEL (subvar));
  	    }
  	  
  	  lastfotype = currfotype;
--- 3938,3944 ----
  		       SFT_OFFSET (subvar));
  	      fprintf (dump_file, " size " HOST_WIDE_INT_PRINT_DEC,
  		       SFT_SIZE (subvar));
! 	      fprintf (dump_file, "\n");
  	    }
  	  
  	  lastfotype = currfotype;
Index: tree-dfa.c
===================================================================
*** tree-dfa.c	(revision 130174)
--- tree-dfa.c	(working copy)
*************** dump_subvars_for (FILE *file, tree var)
*** 287,294 ****
    for (i = 0; VEC_iterate (tree, sv, i, subvar); ++i)
      {
        print_generic_expr (file, subvar, dump_flags);
        fprintf (file, "@" HOST_WIDE_INT_PRINT_UNSIGNED, SFT_OFFSET (subvar));
-       fprintf (file, "[%u]", SFT_NESTING_LEVEL (subvar));
        fprintf (file, " ");
      }
  
--- 287,295 ----
    for (i = 0; VEC_iterate (tree, sv, i, subvar); ++i)
      {
        print_generic_expr (file, subvar, dump_flags);
+       if (SFT_BASE_FOR_COMPONENTS_P (subvar))
+         fprintf (file, "@");
        fprintf (file, "@" HOST_WIDE_INT_PRINT_UNSIGNED, SFT_OFFSET (subvar));
        fprintf (file, " ");
      }
  
*************** dump_variable (FILE *file, tree var)
*** 424,430 ****
  	{
  	  fprintf (file, ", offset: " HOST_WIDE_INT_PRINT_UNSIGNED,
  		   SFT_OFFSET (var));
! 	  fprintf (file, ", nesting: %u", SFT_NESTING_LEVEL (var));
  	  fprintf (file, ", partitionable: %s",
  		   SFT_UNPARTITIONABLE_P (var) ? "NO" : "YES");
  	}
--- 425,432 ----
  	{
  	  fprintf (file, ", offset: " HOST_WIDE_INT_PRINT_UNSIGNED,
  		   SFT_OFFSET (var));
! 	  fprintf (file, ", base for components: %s",
! 		   SFT_BASE_FOR_COMPONENTS_P (var) ? "NO" : "YES");
  	  fprintf (file, ", partitionable: %s",
  		   SFT_UNPARTITIONABLE_P (var) ? "NO" : "YES");
  	}
Index: tree-flow.h
===================================================================
*** tree-flow.h	(revision 130174)
--- tree-flow.h	(working copy)
*************** struct fieldoff
*** 1159,1180 ****
    /* Field.  */
    tree decl;
  
-   /* Nesting level.  This number represents how many structures are
-      wrapping this field.  */
-   unsigned nesting_level;
- 
    /* Offset from the base of the base containing object to this field.  */
    HOST_WIDE_INT offset;  
  
    /* Alias set for the field.  */
    alias_set_type alias_set;
  };
  typedef struct fieldoff fieldoff_s;
  
  DEF_VEC_O(fieldoff_s);
  DEF_VEC_ALLOC_O(fieldoff_s,heap);
! int push_fields_onto_fieldstack (tree, VEC(fieldoff_s,heap) **, HOST_WIDE_INT,
! 				 bool *, tree, unsigned);
  void sort_fieldstack (VEC(fieldoff_s,heap) *);
  
  void init_alias_heapvars (void);
--- 1159,1179 ----
    /* Field.  */
    tree decl;
  
    /* Offset from the base of the base containing object to this field.  */
    HOST_WIDE_INT offset;  
  
    /* Alias set for the field.  */
    alias_set_type alias_set;
+ 
+   /* True, if this offset can be a base for further component accesses.  */
+   unsigned base_for_components : 1;
  };
  typedef struct fieldoff fieldoff_s;
  
  DEF_VEC_O(fieldoff_s);
  DEF_VEC_ALLOC_O(fieldoff_s,heap);
! int push_fields_onto_fieldstack (tree, VEC(fieldoff_s,heap) **,
! 				 HOST_WIDE_INT, bool *, tree);
  void sort_fieldstack (VEC(fieldoff_s,heap) *);
  
  void init_alias_heapvars (void);
Index: tree-ssa-structalias.c
===================================================================
*** tree-ssa-structalias.c	(revision 130174)
--- tree-ssa-structalias.c	(working copy)
*************** sort_fieldstack (VEC(fieldoff_s,heap) *f
*** 4053,4071 ****
     TYPE.
  
     ADDRESSABLE_TYPE is the type of the outermost object that could
!    have its address taken.
! 
!    NESTING_LEVEL indicates whether TYPE is a structure nested inside
!    another, it starts at 0 and it is incremented by one on every
!    structure recursed into.  */
  
  int
  push_fields_onto_fieldstack (tree type, VEC(fieldoff_s,heap) **fieldstack,
  			     HOST_WIDE_INT offset, bool *has_union,
! 			     tree addressable_type, unsigned nesting_level)
  {
    tree field;
    int count = 0;
  
    if (TREE_CODE (type) == COMPLEX_TYPE)
      {
--- 4053,4068 ----
     TYPE.
  
     ADDRESSABLE_TYPE is the type of the outermost object that could
!    have its address taken.  */
  
  int
  push_fields_onto_fieldstack (tree type, VEC(fieldoff_s,heap) **fieldstack,
  			     HOST_WIDE_INT offset, bool *has_union,
! 			     tree addressable_type)
  {
    tree field;
    int count = 0;
+   int first_element = VEC_length (fieldoff_s, *fieldstack);
  
    if (TREE_CODE (type) == COMPLEX_TYPE)
      {
*************** push_fields_onto_fieldstack (tree type, 
*** 4076,4081 ****
--- 4073,4079 ----
        real_part->offset = offset;
        real_part->decl = NULL_TREE;
        real_part->alias_set = -1;
+       real_part->base_for_components = false;
  
        img_part = VEC_safe_push (fieldoff_s, heap, *fieldstack, NULL);
        img_part->type = TREE_TYPE (type);
*************** push_fields_onto_fieldstack (tree type, 
*** 4083,4093 ****
        img_part->offset = offset + TREE_INT_CST_LOW (TYPE_SIZE (TREE_TYPE (type)));
        img_part->decl = NULL_TREE;
        img_part->alias_set = -1;
  
!       return 2;
      }
  
!   if (TREE_CODE (type) == ARRAY_TYPE)
      {
        tree sz = TYPE_SIZE (type);
        tree elsz = TYPE_SIZE (TREE_TYPE (type));
--- 4081,4092 ----
        img_part->offset = offset + TREE_INT_CST_LOW (TYPE_SIZE (TREE_TYPE (type)));
        img_part->decl = NULL_TREE;
        img_part->alias_set = -1;
+       img_part->base_for_components = false;
  
!       count = 2;
      }
  
!   else if (TREE_CODE (type) == ARRAY_TYPE)
      {
        tree sz = TYPE_SIZE (type);
        tree elsz = TYPE_SIZE (TREE_TYPE (type));
*************** push_fields_onto_fieldstack (tree type, 
*** 4125,4132 ****
  		      has_union,
  		      (TYPE_NONALIASED_COMPONENT (type)
  		       ? addressable_type
! 		       : TREE_TYPE (type)),
! 		      nesting_level + 1)))
  	    /* Empty structures may have actual size, like in C++. So
  	       see if we didn't push any subfields and the size is
  	       nonzero, push the field onto the stack */
--- 4124,4130 ----
  		      has_union,
  		      (TYPE_NONALIASED_COMPONENT (type)
  		       ? addressable_type
! 		       : TREE_TYPE (type)))))
  	    /* Empty structures may have actual size, like in C++. So
  	       see if we didn't push any subfields and the size is
  	       nonzero, push the field onto the stack */
*************** push_fields_onto_fieldstack (tree type, 
*** 4145,4160 ****
  		pair->alias_set = get_alias_set (addressable_type);
  	      else
  		pair->alias_set = -1;
! 	      pair->nesting_level = nesting_level;
  	      count++;
  	    }
  	  else
  	    count += pushed;
  	}
- 
-       return count;
      }
  
    for (field = TYPE_FIELDS (type); field; field = TREE_CHAIN (field))
      if (TREE_CODE (field) == FIELD_DECL)
        {
--- 4143,4158 ----
  		pair->alias_set = get_alias_set (addressable_type);
  	      else
  		pair->alias_set = -1;
! 	      pair->base_for_components = false;
  	      count++;
  	    }
  	  else
  	    count += pushed;
  	}
      }
  
+   else
+     {
    for (field = TYPE_FIELDS (type); field; field = TREE_CHAIN (field))
      if (TREE_CODE (field) == FIELD_DECL)
        {
*************** push_fields_onto_fieldstack (tree type, 
*** 4175,4182 ****
  		    has_union,
  		    (DECL_NONADDRESSABLE_P (field)
  		     ? addressable_type
! 		     : TREE_TYPE (field)),
! 		    nesting_level + 1))
  		 && DECL_SIZE (field)
  		 && !integer_zerop (DECL_SIZE (field)))
  	  /* Empty structures may have actual size, like in C++. So
--- 4173,4179 ----
  		    has_union,
  		    (DECL_NONADDRESSABLE_P (field)
  		     ? addressable_type
! 		     : TREE_TYPE (field))))
  		 && DECL_SIZE (field)
  		 && !integer_zerop (DECL_SIZE (field)))
  	  /* Empty structures may have actual size, like in C++. So
*************** push_fields_onto_fieldstack (tree type, 
*** 4197,4208 ****
  	      pair->alias_set = get_alias_set (addressable_type);
  	    else
  	      pair->alias_set = -1;
! 	    pair->nesting_level = nesting_level;
  	    count++;
  	  }
  	else
  	  count += pushed;
        }
  
    return count;
  }
--- 4194,4211 ----
  	      pair->alias_set = get_alias_set (addressable_type);
  	    else
  	      pair->alias_set = -1;
! 	    pair->base_for_components = false;
  	    count++;
  	  }
  	else
  	  count += pushed;
        }
+     }
+ 
+   /* Make sure the first pushed field is marked as eligible for
+      being a base for component references.  */
+   if (count > 0)
+     VEC_index (fieldoff_s, *fieldstack, first_element)->base_for_components = true;
  
    return count;
  }
*************** create_variable_info_for (tree decl, con
*** 4397,4403 ****
    if (var_can_have_subvars (decl) && use_field_sensitive && !hasunion)
      {
        push_fields_onto_fieldstack (decltype, &fieldstack, 0, &hasunion,
! 				   decltype, 0);
        if (hasunion)
  	{
  	  VEC_free (fieldoff_s, heap, fieldstack);
--- 4400,4406 ----
    if (var_can_have_subvars (decl) && use_field_sensitive && !hasunion)
      {
        push_fields_onto_fieldstack (decltype, &fieldstack, 0, &hasunion,
! 				   decltype);
        if (hasunion)
  	{
  	  VEC_free (fieldoff_s, heap, fieldstack);
*************** set_uids_in_ptset (tree ptr, bitmap into
*** 4719,4725 ****
  {
    unsigned int i;
    bitmap_iterator bi;
!   alias_set_type ptr_alias_set = get_alias_set (TREE_TYPE (ptr));
  
    EXECUTE_IF_SET_IN_BITMAP (from, 0, i, bi)
      {
--- 4722,4728 ----
  {
    unsigned int i;
    bitmap_iterator bi;
!   alias_set_type ptr_alias_set = get_alias_set (TREE_TYPE (TREE_TYPE (ptr)));
  
    EXECUTE_IF_SET_IN_BITMAP (from, 0, i, bi)
      {
*************** set_uids_in_ptset (tree ptr, bitmap into
*** 4762,4776 ****
  		    {
  		      bitmap_set_bit (into, DECL_UID (sft));
  		      
! 		      /* If SFT is inside a nested structure, it will
! 			 be needed by the operand scanner to adjust
! 			 offsets when adding operands to memory
  			 expressions that dereference PTR.  This means
  			 that memory partitioning may not partition
  			 this SFT because the operand scanner will not
  			 be able to find the other SFTs next to this
! 			 one.  */
! 		      if (SFT_NESTING_LEVEL (sft) > 0)
  			SFT_UNPARTITIONABLE_P (sft) = true;
  		    }
  		}
--- 4765,4779 ----
  		    {
  		      bitmap_set_bit (into, DECL_UID (sft));
  		      
! 		      /* Pointed-to SFTs are needed by the operand scanner
! 			 to adjust offsets when adding operands to memory
  			 expressions that dereference PTR.  This means
  			 that memory partitioning may not partition
  			 this SFT because the operand scanner will not
  			 be able to find the other SFTs next to this
! 			 one.  But we only need to do this if the pointed
! 			 to type is aggregate.  */
! 		      if (SFT_BASE_FOR_COMPONENTS_P (sft))
  			SFT_UNPARTITIONABLE_P (sft) = true;
  		    }
  		}
*************** find_what_p_points_to (tree p)
*** 4996,5004 ****
  	      pi->pt_global_mem = 1;
  	    }
  
! 	  set_uids_in_ptset (vi->decl, finished_solution, vi->solution,
! 			     vi->directly_dereferenced,
! 			     vi->no_tbaa_pruning);
  	  result = shared_bitmap_lookup (finished_solution);
  
  	  if (!result)
--- 4999,5008 ----
  	      pi->pt_global_mem = 1;
  	    }
  
! 	  if (POINTER_TYPE_P (TREE_TYPE (vi->decl)))
! 	    set_uids_in_ptset (vi->decl, finished_solution, vi->solution,
! 			       vi->directly_dereferenced,
! 			       vi->no_tbaa_pruning);
  	  result = shared_bitmap_lookup (finished_solution);
  
  	  if (!result)
Index: tree-ssa-operands.c
===================================================================
*** tree-ssa-operands.c	(revision 130174)
--- tree-ssa-operands.c	(working copy)
*************** access_can_touch_variable (tree ref, tre
*** 1367,1399 ****
    return true;
  }
  
- 
- /* Given an aggregate expression FULL_REF, return the number of
-    aggregates that are containing FULL_REF.  So, given a structure
-    reference a.b.c.d, the nesting level for this expression is 2 (the
-    number of '.' in the expression minus 1).  */
- 
- static unsigned
- ref_nesting_level (tree full_ref)
- {
-   unsigned nesting_level = 0;
- 
-   if (!handled_component_p (full_ref))
-     return 0;
- 
-   full_ref = TREE_OPERAND (full_ref, 0);
-   while (handled_component_p (full_ref))
-     {
-       nesting_level++;
-       full_ref = TREE_OPERAND (full_ref, 0);
-     }
- 
-   return nesting_level;
- }
- 
- 
  /* Add the actual variables FULL_REF can access, given a member of
!    FULL_REF's points-to set VAR, where FULL_REF is an access of SIZE at
     OFFSET from var. IS_CALL_SITE is true if this is a call, and IS_DEF
     is true if this is supposed to be a vdef, and false if this should
     be a VUSE.
--- 1367,1374 ----
    return true;
  }
  
  /* Add the actual variables FULL_REF can access, given a member of
!    full_ref's points-to set VAR, where FULL_REF is an access of SIZE at
     OFFSET from var. IS_CALL_SITE is true if this is a call, and IS_DEF
     is true if this is supposed to be a vdef, and false if this should
     be a VUSE.
*************** ref_nesting_level (tree full_ref)
*** 1411,1422 ****
     This is necessary because foop only actually points to foo's first
     member, so that is all the points-to set contains.  However, an access
     to foop->a may be touching some single SFT if we have created some
!    SFT's for a structure.
! 
!    FULL_REF is the original memory expression being analyzed.  */
  
  static bool
! add_vars_for_offset (tree full_ref, tree var, unsigned HOST_WIDE_INT offset,
  		     unsigned HOST_WIDE_INT size, bool is_def)
  {
    bool added = false;
--- 1386,1395 ----
     This is necessary because foop only actually points to foo's first
     member, so that is all the points-to set contains.  However, an access
     to foop->a may be touching some single SFT if we have created some
!    SFT's for a structure.  */
  
  static bool
! add_vars_for_offset (tree var, unsigned HOST_WIDE_INT offset,
  		     unsigned HOST_WIDE_INT size, bool is_def)
  {
    bool added = false;
*************** add_vars_for_offset (tree full_ref, tree
*** 1424,1478 ****
    subvar_t sv;
    unsigned int i;
  
!   if (full_ref
!       && SFT_NESTING_LEVEL (var) > 0
!       && ref_nesting_level (full_ref) < SFT_NESTING_LEVEL (var))
!     {
!       /* Since VAR is an SFT inside a nested structure, the OFFSET
! 	 computed by get_ref_base_and_extent is the offset from the
! 	 start of the immediately containing structure.  If VAR is an
! 	 SFT inside a nested structure, then FULL_REF may be a
! 	 reference to the structure immediately enclosing SFT, and so
! 	 OFFSET will be the offset from the start of the immediately
! 	 enclosing structure.
! 
! 	 However, to find out what other SFTs are affected by this
! 	 reference, we need to know the offsets starting at the root
! 	 structure in the nesting hierarchy.
! 
! 	 For instance, given the following structure:
! 
! 	 	struct X {
! 		  int a;
! 		  struct Y {
! 		    int b;
! 		    struct Z {
! 		      int c[3];
! 		    } d;
! 		  } e;
! 		} m;
! 
! 	 and the following address expression:
! 
! 		p_1 = &m.e.d;
! 
! 	 This structure will receive 5 SFTs, namely 2 for fields 'a'
! 	 and 'b' and 3 for the array 'c' in struct Z.  So, the
! 	 reference p_1->c[2] and m.e.d.c[2] access the exact same
! 	 memory location (ie, SFT.5).
! 
! 	 Now, alias analysis computed the points-to set for pointer
! 	 p_1 as  { SFT.3 } because that is the first field that p_1
! 	 actually points to.  When the expression p_1->c[2] is
! 	 analyzed, get_ref_base_and_extent will return an offset of 96
! 	 because we are accessing the third element of the array.  But
! 	 the SFT we are looking for is actually at offset 160,
! 	 counting from the top of struct X.
! 
! 	 Therefore, we adjust OFFSET by the offset of VAR so that we
! 	 can get at all the fields starting at VAR.  */
!       offset += SFT_OFFSET (var);
!     }
  
    /* Add all subvars of var that overlap with the access.
       Binary search for the first relevant SFT.  */
--- 1397,1404 ----
    subvar_t sv;
    unsigned int i;
  
!   /* Adjust offset by the pointed-to location.  */
!   offset += SFT_OFFSET (var);
  
    /* Add all subvars of var that overlap with the access.
       Binary search for the first relevant SFT.  */
*************** add_virtual_operand (tree var, stmt_ann_
*** 1575,1581 ****
  	     if it is a potential points-to location.  */
  	  if (TREE_CODE (al) == STRUCT_FIELD_TAG
  	      && TREE_CODE (var) == NAME_MEMORY_TAG)
! 	    none_added &= !add_vars_for_offset (full_ref, al, offset, size,
  					        flags & opf_def);
  	  else
  	    {
--- 1501,1507 ----
  	     if it is a potential points-to location.  */
  	  if (TREE_CODE (al) == STRUCT_FIELD_TAG
  	      && TREE_CODE (var) == NAME_MEMORY_TAG)
! 	    none_added &= !add_vars_for_offset (al, offset, size,
  					        flags & opf_def);
  	  else
  	    {

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Fix PR 33870
  2007-11-14 15:29                       ` Richard Guenther
@ 2007-11-14 15:30                         ` Diego Novillo
  2007-11-15 13:55                           ` Richard Guenther
  0 siblings, 1 reply; 18+ messages in thread
From: Diego Novillo @ 2007-11-14 15:30 UTC (permalink / raw)
  To: Richard Guenther; +Cc: gcc-patches

Richard Guenther wrote:

> Bootstrap and regtest on x86_64-unknown-linux-gnu in progress, I intend
> to commit this if that (as expected) finishes successfully.

No, let me study this first.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Fix PR 33870
  2007-11-14 15:30                         ` Diego Novillo
@ 2007-11-15 13:55                           ` Richard Guenther
  2007-11-15 15:47                             ` Richard Guenther
  0 siblings, 1 reply; 18+ messages in thread
From: Richard Guenther @ 2007-11-15 13:55 UTC (permalink / raw)
  To: Diego Novillo; +Cc: gcc-patches

On Wed, 14 Nov 2007, Diego Novillo wrote:

> Richard Guenther wrote:
> 
> > Bootstrap and regtest on x86_64-unknown-linux-gnu in progress, I intend
> > to commit this if that (as expected) finishes successfully.
> 
> No, let me study this first.

We can also add one additional optimization and sanity check.  In
add_virtual_operand instead of dispatching to add_vars_for_offset for
every SFT in the NMT just do so for those that have
SFT_BASE_FOR_COMPONENTS_P set.  All others can (should) be only directly
dereferenced which means offset has to be zero.  If not, we should
visit them during one of the add_vars_for_offset calls.  This avoids
redundant calls to append_vuse/vdef (so it's merely a compile-time
improvement):

         /* For SFTs we have to consider all subvariables of the parent 
var
             if it is a potential points-to location.  */
          if (TREE_CODE (al) == STRUCT_FIELD_TAG
              && TREE_CODE (var) == NAME_MEMORY_TAG)
            {
              if (SFT_BASE_FOR_COMPONENTS_P (al))
                none_added &= !add_vars_for_offset (al, offset, size,
                                                    flags & opf_def);
              else if (offset == 0)
                {
                  /* If this is a SFT that cannot be used as a base for
                     component references, we only need to consider them
                     if offset is zero.  Otherwise if it is aliased via
                     a component ref we will visit it during processing
                     of a SFT_BASE_FOR_COMPONENTS_P pointed-to SFT.
                     In this case just add this SFT as a possible direct
                     dereference target.  */
                  if (flags & opf_def)
                    append_vdef (al);
                  else
                    append_vuse (al);
                  none_added = false;
                }

I guess we don't want to do this as part of the initial patch, but I'll
check the compile-time effects if the above works out.

Richard.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Fix PR 33870
  2007-11-15 13:55                           ` Richard Guenther
@ 2007-11-15 15:47                             ` Richard Guenther
  0 siblings, 0 replies; 18+ messages in thread
From: Richard Guenther @ 2007-11-15 15:47 UTC (permalink / raw)
  To: Diego Novillo; +Cc: gcc-patches

On Thu, 15 Nov 2007, Richard Guenther wrote:

> On Wed, 14 Nov 2007, Diego Novillo wrote:
> 
> > Richard Guenther wrote:
> > 
> > > Bootstrap and regtest on x86_64-unknown-linux-gnu in progress, I intend
> > > to commit this if that (as expected) finishes successfully.
> > 
> > No, let me study this first.
> 
> We can also add one additional optimization and sanity check.  In
> add_virtual_operand instead of dispatching to add_vars_for_offset for
> every SFT in the NMT just do so for those that have
> SFT_BASE_FOR_COMPONENTS_P set.  All others can (should) be only directly
> dereferenced which means offset has to be zero.  If not, we should
> visit them during one of the add_vars_for_offset calls.  This avoids
> redundant calls to append_vuse/vdef (so it's merely a compile-time
> improvement):
> 
>          /* For SFTs we have to consider all subvariables of the parent 
> var
>              if it is a potential points-to location.  */
>           if (TREE_CODE (al) == STRUCT_FIELD_TAG
>               && TREE_CODE (var) == NAME_MEMORY_TAG)
>             {
>               if (SFT_BASE_FOR_COMPONENTS_P (al))
>                 none_added &= !add_vars_for_offset (al, offset, size,
>                                                     flags & opf_def);
>               else if (offset == 0)
>                 {
>                   /* If this is a SFT that cannot be used as a base for
>                      component references, we only need to consider them
>                      if offset is zero.  Otherwise if it is aliased via
>                      a component ref we will visit it during processing
>                      of a SFT_BASE_FOR_COMPONENTS_P pointed-to SFT.
>                      In this case just add this SFT as a possible direct
>                      dereference target.  */
>                   if (flags & opf_def)
>                     append_vdef (al);
>                   else
>                     append_vuse (al);
>                   none_added = false;
>                 }
> 
> I guess we don't want to do this as part of the initial patch, but I'll
> check the compile-time effects if the above works out.

Ok, that doesn't work.  As we have cases where we can have sub-references
on an SFT (notably in the case where we glob fields into a single
SFT as we do for arrays with more than 4 elements), the offset of the
access may be bigger than zero.  Even if we only can reach the SFT itself
we cannot avoid the append_vdef/vuse in any case, so the possible speed
difference should be a wash.

That is we can do

          if (TREE_CODE (al) == STRUCT_FIELD_TAG
              && TREE_CODE (var) == NAME_MEMORY_TAG)
            {
              if (SFT_BASE_FOR_COMPONENTS_P (al))
                none_added &= !add_vars_for_offset (al, offset, size,
                                                    flags & opf_def);
              else
                {
                  /* If this is a SFT that cannot be used as a base for
                     component references, we only need to consider them
                     if offset is zero.  Otherwise if it is aliased via
                     a component ref we will visit it during processing
                     of a SFT_BASE_FOR_COMPONENTS_P pointed-to SFT.
                     In this case just add this SFT as a possible direct
                     dereference target.  */
                  if (flags & opf_def)
                    append_vdef (al);
                  else
                    append_vuse (al);
                  none_added = false;
                }

Richard.

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2007-11-15 13:37 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-11-08  0:24 Fix PR 33870 Diego Novillo
2007-11-08  0:35 ` Jakub Jelinek
2007-11-08  2:20   ` Diego Novillo
2007-11-08 10:45 ` Richard Guenther
2007-11-13 16:30   ` Diego Novillo
2007-11-13 16:47     ` Richard Guenther
2007-11-14 12:36       ` Richard Guenther
2007-11-14 12:41         ` Richard Guenther
2007-11-14 13:01           ` Diego Novillo
2007-11-14 13:18             ` Richard Guenther
2007-11-14 13:59               ` Diego Novillo
2007-11-14 14:07                 ` Richard Guenther
2007-11-14 14:09                   ` Diego Novillo
2007-11-14 14:22                     ` Richard Guenther
2007-11-14 15:29                       ` Richard Guenther
2007-11-14 15:30                         ` Diego Novillo
2007-11-15 13:55                           ` Richard Guenther
2007-11-15 15:47                             ` Richard Guenther

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).