public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [patch] Fix ICE during RTL expansion at -O1
@ 2013-03-21 16:27 Eric Botcazou
  2013-03-22  9:44 ` Richard Biener
  0 siblings, 1 reply; 8+ messages in thread
From: Eric Botcazou @ 2013-03-21 16:27 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 3433 bytes --]

Hi,

this fixes an ICE on the mainline at -O1:

eric@polaris:~/gnat/bugs/M129-026> ~/install/gcc/bin/gcc -S p.adb -O
+===========================GNAT BUG DETECTED==============================+
| 4.9.0 20130320 (experimental) [trunk revision 196816] (x86_64-suse-linux) 
GCC error:|
| in expand_assignment, at expr.c:4761                                     |
| Error detected around p.ads:11:9      

  MEM[(struct p__rec &)&ret].d = 0;
  MEM[(struct p__rec &)&ret].b1 = 0;
  _26 = MEM[(struct p__rec &)&ret].d;
  p0.8_27 = (integer) _26;
  _28 = (sizetype) p0.8_27;
  _29 = _28 * 8;
  _30 = _29 + _29;
  _31 = _30 + 8;
  _32 = _31 /[ex] 8;
  MEM[(struct p__rec &)&ret].b2{off: _32 * 8} = 1;

The ICE occurs because ret is put into a register instead of memory (it has
an integral mode) and you cannot have a variable offset within an object in
a register.

But, if you look at the above GIMPLE from .optimized, you'll see that the
offset is actually fixed and that the GIMPLE optimizers weren't powerful
enough to see it.  Ironically enough, the RTL optimizers would have been!

So the attached patch adds a clone of nonoverlapping_component_refs_p from
alias.c to tree-ssa-alias.c to disambiguate the memory accesses.  It also
contains a small tweak to both functions to cater to a LTO-specific quirk.

With it, a bunch of testcases of the vectorization testsuite are optimized
so the patch also contains counter-measures for them.

Tested on x86_64-suse-linux, OK for the mainline?


2013-03-21  Eric Botcazou  <ebotcazou@adacore.com>

	* alias.c (nonoverlapping_component_refs_p): Protect again LTO quirk.
	* tree-ssa-alias.c (nonoverlapping_component_refs_of_decl_p): New.
	(decl_refs_may_alias_p): Add REF1 and REF2 parameters.
	Use it to disambiguate component references.
	(refs_may_alias_p_1): Adjust call to decl_refs_may_alias_p.


2013-03-21  Eric Botcazou  <ebotcazou@adacore.com>

	* gnat.dg/discr41.ad[sb]: New test.
	* gcc.dg/vect/slp-24-big-array.c: Beef up anti-vectorization trick.
	* gcc.dg/vect/slp-24.c: Likewise.
	* gcc.dg/vect/vect-strided-a-mult.c: Likewise.
	* gcc.dg/vect/vect-strided-a-u16-i2.c: Likewise.
	* gcc.dg/vect/vect-strided-a-u16-i4.c: Likewise.
	* gcc.dg/vect/vect-strided-a-u16-mult.c: Likewise.
	* gcc.dg/vect/vect-strided-a-u8-i2-gap.c: Likewise.
	* gcc.dg/vect/vect-strided-a-u8-i8-gap2-big-array.c: Likewise.
	* gcc.dg/vect/vect-strided-a-u8-i8-gap2.c: Likewise.
	* gcc.dg/vect/vect-strided-a-u8-i8-gap7-big-array.c: Likewise.
	* gcc.dg/vect/vect-strided-a-u8-i8-gap7.c: Likewise.
	* gcc.dg/vect/vect-strided-mult-char-ls.c: Likewise.
	* gcc.dg/vect/vect-strided-mult.c: Likewise.
	* gcc.dg/vect/vect-strided-same-dr.c: Likewise.
	* gcc.dg/vect/vect-strided-u16-i2.c: Likewise.
	* gcc.dg/vect/vect-strided-u16-i4.c: Likewise.
	* gcc.dg/vect/vect-strided-u32-i4.c: Likewise.
	* gcc.dg/vect/vect-strided-u32-i8.c: Likewise.
	* gcc.dg/vect/vect-strided-u8-i2-gap.c: Likewise.
	* gcc.dg/vect/vect-strided-u8-i2.c: Likewise.
	* gcc.dg/vect/vect-strided-u8-i8-gap2-big-array.c: Likewise.
	* gcc.dg/vect/vect-strided-u8-i8-gap2.c: Likewise.
	* gcc.dg/vect/vect-strided-u8-i8-gap4-big-array.c: Likewise.
	* gcc.dg/vect/vect-strided-u8-i8-gap4-unknown.c: Likewise.
	* gcc.dg/vect/vect-strided-u8-i8-gap4.c: Likewise.
	* gcc.dg/vect/vect-strided-u8-i8-gap7-big-array.c: Likewise.
	* gcc.dg/vect/vect-strided-u8-i8-gap7.c: Likewise.
	* gcc.dg/vect/vect-strided-u8-i8.c: Likewise.


-- 
Eric Botcazou

[-- Attachment #2: discr41.ads --]
[-- Type: text/x-adasrc, Size: 436 bytes --]

package Discr41 is

   type Vector is array (Positive range <>) of Long_Float;

   type Date is record
      LF : Long_Float := 0.0;
   end record;

   type Date_Vector is array (Positive range <>) of Date;

   type Rec (D : Natural) is record
      B1 : Boolean := False;
      DL : Date_Vector (1 .. D);
      VL : Vector (1 .. D) := (others => 0.0);
      B2 : Boolean := True;
   end record;

   function F return Rec;

end Discr41;

[-- Attachment #3: discr41.adb --]
[-- Type: text/x-adasrc, Size: 167 bytes --]

-- { dg-do compile }
-- { dg-options "-O" }

package body Discr41 is

   function F return Rec is
      Ret : Rec (0);
   begin
      return Ret;
   end;

end Discr41;

[-- Attachment #4: p.diff --]
[-- Type: text/x-patch, Size: 22640 bytes --]

Index: alias.c
===================================================================
--- alias.c	(revision 196816)
+++ alias.c	(working copy)
@@ -2239,8 +2239,11 @@ nonoverlapping_component_refs_p (const_r
 
     found:
       /* If we're left with accessing different fields of a structure, then no
-	 possible overlap, unless they are both bitfields.  */
-      if (TREE_CODE (typex) == RECORD_TYPE && fieldx != fieldy)
+	 possible overlap, unless they are both bitfields.
+	 ??? Pointer inequality is too fragile in the LTO compiler.  */
+      if (TREE_CODE (typex) == RECORD_TYPE
+	  && fieldx != fieldy
+	  && DECL_NAME (fieldx) != DECL_NAME (fieldy))
 	return !(DECL_BIT_FIELD (fieldx) && DECL_BIT_FIELD (fieldy));
 
       /* The comparison on the current field failed.  If we're accessing
Index: tree-ssa-alias.c
===================================================================
--- tree-ssa-alias.c	(revision 196816)
+++ tree-ssa-alias.c	(working copy)
@@ -700,14 +700,78 @@ aliasing_component_refs_p (tree ref1,
   return false;
 }
 
+/* Return true if we can determine that component references X and Y, that are
+   within a common variable, cannot overlap.  This is a generalized version of
+   nonoverlapping_component_refs_p under the common variable assumption.  */
+
+static bool
+nonoverlapping_component_refs_of_decl_p (const_tree x, const_tree y)
+{
+  const_tree fieldx, fieldy, typex, typey, orig_y = y;
+
+  do
+    {
+      /* The comparison has to be done at a common type, since we don't
+	 know how the inheritance hierarchy works.  */
+      orig_y = y;
+      do
+	{
+	  if (TREE_CODE (x) == COMPONENT_REF)
+	    {
+	      fieldx = TREE_OPERAND (x, 1);
+	      typex = TYPE_MAIN_VARIANT (DECL_FIELD_CONTEXT (fieldx));
+
+	      y = orig_y;
+	      do
+		{
+		  if (TREE_CODE (y) == COMPONENT_REF)
+		    {
+		      fieldy = TREE_OPERAND (y, 1);
+		      typey = TYPE_MAIN_VARIANT (DECL_FIELD_CONTEXT (fieldy));
+
+		      if (typex == typey)
+			goto found;
+		    }
+	          y = TREE_OPERAND (y, 0);
+	        }
+	      while (y && handled_component_p (y));
+	    }
+	  x = TREE_OPERAND (x, 0);
+	}
+      while (x && handled_component_p (x));
+      /* Never found a common type.  */
+      return false;
+
+    found:
+      /* If we're left with accessing different fields of a structure,
+	 then no overlap.
+	 ??? Pointer inequality is too fragile in the LTO compiler.  */
+      if (TREE_CODE (typex) == RECORD_TYPE
+	  && fieldx != fieldy
+	  && DECL_NAME (fieldx) != DECL_NAME (fieldy))
+	return true;
+
+      /* The comparison on the current field failed.  If we're accessing
+	 a very nested structure, look at the next outer level.  */
+      x = TREE_OPERAND (x, 0);
+      y = TREE_OPERAND (y, 0);
+    }
+  while (x && y
+	 && handled_component_p (x)
+	 && handled_component_p (y));
+
+  return false;
+}
+
 /* Return true if two memory references based on the variables BASE1
    and BASE2 constrained to [OFFSET1, OFFSET1 + MAX_SIZE1) and
-   [OFFSET2, OFFSET2 + MAX_SIZE2) may alias.  */
+   [OFFSET2, OFFSET2 + MAX_SIZE2) may alias.  REF1 and REF2
+   if non-NULL are the complete memory reference trees.  */
 
 static bool
-decl_refs_may_alias_p (tree base1,
+decl_refs_may_alias_p (tree ref1, tree base1,
 		       HOST_WIDE_INT offset1, HOST_WIDE_INT max_size1,
-		       tree base2,
+		       tree ref2, tree base2,
 		       HOST_WIDE_INT offset2, HOST_WIDE_INT max_size2)
 {
   gcc_checking_assert (DECL_P (base1) && DECL_P (base2));
@@ -718,7 +782,17 @@ decl_refs_may_alias_p (tree base1,
 
   /* If both references are based on the same variable, they cannot alias if
      the accesses do not overlap.  */
-  return ranges_overlap_p (offset1, max_size1, offset2, max_size2);
+  if (!ranges_overlap_p (offset1, max_size1, offset2, max_size2))
+    return false;
+
+  /* For components with variable position, the above test isn't sufficient,
+     so we disambiguate component references manually.  */
+  if (ref1 && ref2
+      && handled_component_p (ref1) && handled_component_p (ref2)
+      && nonoverlapping_component_refs_of_decl_p (ref1, ref2))
+    return false;
+
+  return true;     
 }
 
 /* Return true if an indirect reference based on *PTR1 constrained
@@ -1067,8 +1141,8 @@ refs_may_alias_p_1 (ao_ref *ref1, ao_ref
   var1_p = DECL_P (base1);
   var2_p = DECL_P (base2);
   if (var1_p && var2_p)
-    return decl_refs_may_alias_p (base1, offset1, max_size1,
-				  base2, offset2, max_size2);
+    return decl_refs_may_alias_p (ref1->ref, base1, offset1, max_size1,
+				  ref2->ref, base2, offset2, max_size2);
 
   ind1_p = (TREE_CODE (base1) == MEM_REF
 	    || TREE_CODE (base1) == TARGET_MEM_REF);
Index: testsuite/gcc.dg/vect/vect-strided-a-u16-mult.c
===================================================================
--- testsuite/gcc.dg/vect/vect-strided-a-u16-mult.c	(revision 196816)
+++ testsuite/gcc.dg/vect/vect-strided-a-u16-mult.c	(working copy)
@@ -10,6 +10,8 @@ typedef struct {
    unsigned short b;
 } s;
 
+volatile int y = 0;
+
 __attribute__ ((noinline)) int
 main1 ()
 {
@@ -26,8 +28,8 @@ main1 ()
       arr[i].a = i;
       arr[i].b = i * 2;
       iarr[i] = i * 3;
-      if (arr[i].a == 178)
-         abort();
+      if (y) /* Avoid vectorization.  */
+        abort ();
     }
 
   for (i = 0; i < N; i++)
Index: testsuite/gcc.dg/vect/vect-strided-u32-i4.c
===================================================================
--- testsuite/gcc.dg/vect/vect-strided-u32-i4.c	(revision 196816)
+++ testsuite/gcc.dg/vect/vect-strided-u32-i4.c	(working copy)
@@ -12,6 +12,8 @@ typedef struct {
    int d;
 } s;
 
+volatile int y = 0;
+
 __attribute__ ((noinline)) int
 main1 (s *arr)
 {
@@ -54,8 +56,8 @@ int main (void)
       arr[i].b = i * 2;
       arr[i].c = 17;
       arr[i].d = i+34;
-      if (arr[i].a == 178)
-         abort(); 
+      if (y) /* Avoid vectorization.  */
+        abort ();
     } 
 
   main1 (arr);
Index: testsuite/gcc.dg/vect/vect-strided-u8-i8-gap4-unknown.c
===================================================================
--- testsuite/gcc.dg/vect/vect-strided-u8-i8-gap4-unknown.c	(revision 196816)
+++ testsuite/gcc.dg/vect/vect-strided-u8-i8-gap4-unknown.c	(working copy)
@@ -17,6 +17,8 @@ typedef struct {
    unsigned char h;
 } s;
 
+volatile int y = 0;
+
 __attribute__ ((noinline)) int
 main1 (s *arr, int n)
 {
@@ -102,8 +104,8 @@ int main (void)
       arr[i].f = 16;
       arr[i].g = 3;
       arr[i].h = 56;
-      if (arr[i].a == 178)
-         abort(); 
+      if (y) /* Avoid vectorization.  */
+        abort ();
     } 
 
   main1 (arr, N-2);
Index: testsuite/gcc.dg/vect/vect-strided-u8-i2-gap.c
===================================================================
--- testsuite/gcc.dg/vect/vect-strided-u8-i2-gap.c	(revision 196816)
+++ testsuite/gcc.dg/vect/vect-strided-u8-i2-gap.c	(working copy)
@@ -10,6 +10,8 @@ typedef struct {
    unsigned char b;
 } s;
 
+volatile int y = 0;
+
 __attribute__ ((noinline)) int
 main1 (s *arr)
 {
@@ -62,8 +64,8 @@ int main (void)
     { 
       arr[i].a = i;
       arr[i].b = i * 2;
-      if (arr[i].a == 178)
-         abort(); 
+      if (y) /* Avoid vectorization.  */
+        abort ();
     } 
 
   main1 (arr);
Index: testsuite/gcc.dg/vect/vect-strided-u8-i8-gap4-big-array.c
===================================================================
--- testsuite/gcc.dg/vect/vect-strided-u8-i8-gap4-big-array.c	(revision 196816)
+++ testsuite/gcc.dg/vect/vect-strided-u8-i8-gap4-big-array.c	(working copy)
@@ -18,6 +18,8 @@ typedef struct {
 
 s check_res[N];
 
+volatile int y = 0;
+
 __attribute__ ((noinline)) int
 main1 (s *arr)
 {
@@ -103,8 +105,8 @@ int main (void)
       check_res[i].h = arr[i].c;
       check_res[i].g = arr[i].b + arr[i].c;
 
-      if (arr[i].a == 178)
-         abort ();
+      if (y) /* Avoid vectorization.  */
+        abort ();
     }
   main1 (arr);
 
Index: testsuite/gcc.dg/vect/slp-24-big-array.c
===================================================================
--- testsuite/gcc.dg/vect/slp-24-big-array.c	(revision 196816)
+++ testsuite/gcc.dg/vect/slp-24-big-array.c	(working copy)
@@ -84,8 +84,8 @@ int main (void)
       arr[i].b = i * 2 + 10;
       arr[i].c = 17;
       arr[i].d = i+34;
-      if (arr[i].a == 178)
-         abort ();
+      if (y) /* Avoid vectorization.  */
+        abort ();
     }
   check_vect ();
 
Index: testsuite/gcc.dg/vect/vect-strided-u8-i8.c
===================================================================
--- testsuite/gcc.dg/vect/vect-strided-u8-i8.c	(revision 196816)
+++ testsuite/gcc.dg/vect/vect-strided-u8-i8.c	(working copy)
@@ -16,6 +16,8 @@ typedef struct {
    unsigned char h;
 } s;
 
+volatile int y = 0;
+
 __attribute__ ((noinline)) int
 main1 (s *arr)
 {
@@ -76,8 +78,8 @@ int main (void)
       arr[i].f = i + 5;
       arr[i].g = i + 3;
       arr[i].h = 67;
-      if (arr[i].a == 178)
-         abort(); 
+      if (y) /* Avoid vectorization.  */
+        abort ();
     } 
 
   main1 (arr);
Index: testsuite/gcc.dg/vect/vect-strided-u16-i2.c
===================================================================
--- testsuite/gcc.dg/vect/vect-strided-u16-i2.c	(revision 196816)
+++ testsuite/gcc.dg/vect/vect-strided-u16-i2.c	(working copy)
@@ -10,6 +10,8 @@ typedef struct {
    unsigned short b;
 } s;
 
+volatile int y = 0;
+
 __attribute__ ((noinline)) int
 main1 (s *arr)
 {
@@ -46,8 +48,8 @@ int main (void)
     { 
       arr[i].a = i;
       arr[i].b = i * 2;
-      if (arr[i].a == 178)
-         abort(); 
+      if (y) /* Avoid vectorization.  */
+        abort ();
     } 
 
   main1 (arr);
Index: testsuite/gcc.dg/vect/vect-strided-a-mult.c
===================================================================
--- testsuite/gcc.dg/vect/vect-strided-a-mult.c	(revision 196816)
+++ testsuite/gcc.dg/vect/vect-strided-a-mult.c	(working copy)
@@ -15,6 +15,8 @@ typedef struct {
    unsigned int b;
 } ii;
 
+volatile int y = 0;
+
 __attribute__ ((noinline)) int
 main1 ()
 {
@@ -32,8 +34,8 @@ main1 ()
       arr[i].b = i * 2;
       iarr[i].a = i;
       iarr[i].b = i * 3;
-      if (arr[i].a == 178)
-         abort();
+      if (y) /* Avoid vectorization.  */
+        abort ();
     }
 
   for (i = 0; i < N; i++)
Index: testsuite/gcc.dg/vect/vect-strided-u8-i8-gap7-big-array.c
===================================================================
--- testsuite/gcc.dg/vect/vect-strided-u8-i8-gap7-big-array.c	(revision 196816)
+++ testsuite/gcc.dg/vect/vect-strided-u8-i8-gap7-big-array.c	(working copy)
@@ -18,6 +18,8 @@ typedef struct {
 
 s check_res[N];
 
+volatile int y = 0;
+
 __attribute__ ((noinline)) int
 main1 (s *arr)
 {
@@ -91,8 +93,8 @@ int main (void)
       check_res[i].h = arr[i].d;
       check_res[i].g = u + t;
 
-      if (arr[i].a == 178)
-         abort ();
+      if (y) /* Avoid vectorization.  */
+        abort ();
     }
 
   main1 (arr);
Index: testsuite/gcc.dg/vect/vect-strided-u8-i8-gap2.c
===================================================================
--- testsuite/gcc.dg/vect/vect-strided-u8-i8-gap2.c	(revision 196816)
+++ testsuite/gcc.dg/vect/vect-strided-u8-i8-gap2.c	(working copy)
@@ -16,6 +16,8 @@ typedef struct {
    unsigned char h;
 } s;
 
+volatile int y = 0;
+
 __attribute__ ((noinline)) int
 main1 (s *arr)
 {
@@ -69,8 +71,8 @@ int main (void)
       arr[i].f = i * 2 + 2;
       arr[i].g = i - 3;
       arr[i].h = 56;
-      if (arr[i].a == 178)
-         abort(); 
+      if (y) /* Avoid vectorization.  */
+        abort ();
     } 
 
   main1 (arr);
Index: testsuite/gcc.dg/vect/vect-strided-u16-i4.c
===================================================================
--- testsuite/gcc.dg/vect/vect-strided-u16-i4.c	(revision 196816)
+++ testsuite/gcc.dg/vect/vect-strided-u16-i4.c	(working copy)
@@ -12,6 +12,8 @@ typedef struct {
    unsigned short d;
 } s;
 
+volatile int y = 0;
+
 __attribute__ ((noinline)) int
 main1 (s *arr)
 {
@@ -59,8 +61,8 @@ int main (void)
       arr[i].b = i * 2;
       arr[i].c = 17;
       arr[i].d = i+34;
-      if (arr[i].a == 178)
-         abort(); 
+      if (y) /* Avoid vectorization.  */
+        abort ();
     } 
 
   main1 (arr);
Index: testsuite/gcc.dg/vect/vect-strided-u32-i8.c
===================================================================
--- testsuite/gcc.dg/vect/vect-strided-u32-i8.c	(revision 196816)
+++ testsuite/gcc.dg/vect/vect-strided-u32-i8.c	(working copy)
@@ -16,6 +16,8 @@ typedef struct {
    int h;
 } s;
 
+volatile int y = 0;
+
 __attribute__ ((noinline)) int
 main1 (s *arr)
 {
@@ -68,8 +70,8 @@ int main (void)
       arr[i].f = i * 5;
       arr[i].g = i - 3;
       arr[i].h = 56;
-      if (arr[i].a == 178)
-         abort(); 
+      if (y) /* Avoid vectorization.  */
+        abort ();
     } 
 
   main1 (arr);
Index: testsuite/gcc.dg/vect/vect-strided-u8-i8-gap2-big-array.c
===================================================================
--- testsuite/gcc.dg/vect/vect-strided-u8-i8-gap2-big-array.c	(revision 196816)
+++ testsuite/gcc.dg/vect/vect-strided-u8-i8-gap2-big-array.c	(working copy)
@@ -18,6 +18,8 @@ typedef struct {
 
 s check_res[N];
 
+volatile int y = 0;
+
 __attribute__ ((noinline)) int
 main1 (s *arr)
 {
@@ -80,8 +82,9 @@ int main (void)
       check_res[i].e = arr[i].f - arr[i].b;
       check_res[i].h = arr[i].f;
       check_res[i].g = arr[i].f - arr[i].b;
-      if (arr[i].a == 178)
-         abort ();
+
+      if (y) /* Avoid vectorization.  */
+        abort ();
     }
 
   main1 (arr);
Index: testsuite/gcc.dg/vect/vect-strided-a-u8-i8-gap2.c
===================================================================
--- testsuite/gcc.dg/vect/vect-strided-a-u8-i8-gap2.c	(revision 196816)
+++ testsuite/gcc.dg/vect/vect-strided-a-u8-i8-gap2.c	(working copy)
@@ -16,6 +16,8 @@ typedef struct {
    unsigned char h;
 } s;
 
+volatile int y = 0;
+
 __attribute__ ((noinline)) int
 main1 ()
 {
@@ -34,8 +36,8 @@ main1 ()
       arr[i].f = i * 2 + 2;
       arr[i].g = i - 3;
       arr[i].h = 56;
-      if (arr[i].a == 178)
-         abort();
+      if (y) /* Avoid vectorization.  */
+        abort ();
     }
 
   for (i = 0; i < N; i++)
Index: testsuite/gcc.dg/vect/vect-strided-u8-i8-gap4.c
===================================================================
--- testsuite/gcc.dg/vect/vect-strided-u8-i8-gap4.c	(revision 196816)
+++ testsuite/gcc.dg/vect/vect-strided-u8-i8-gap4.c	(working copy)
@@ -16,6 +16,8 @@ typedef struct {
    unsigned char h;
 } s;
 
+volatile int y = 0;
+
 __attribute__ ((noinline)) int
 main1 (s *arr)
 {
@@ -89,8 +91,8 @@ int main (void)
       arr[i].f = i * 5;
       arr[i].g = i - 3;
       arr[i].h = 56;
-      if (arr[i].a == 178)
-         abort(); 
+      if (y) /* Avoid vectorization.  */
+        abort ();
     } 
 
   main1 (arr);
Index: testsuite/gcc.dg/vect/slp-24.c
===================================================================
--- testsuite/gcc.dg/vect/slp-24.c	(revision 196816)
+++ testsuite/gcc.dg/vect/slp-24.c	(working copy)
@@ -16,6 +16,8 @@ typedef struct {
 unsigned char ub[N*2] = {1,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45,1,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45};
 unsigned char uc[N] = {1,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
 
+volatile int y = 0;
+
 void
 main1 (unsigned char x, unsigned char max_result, unsigned char min_result, s *arr)
 {
@@ -67,8 +69,8 @@ int main (void)
       arr[i].b = i * 2 + 10;
       arr[i].c = 17;
       arr[i].d = i+34;
-      if (arr[i].a == 178)
-         abort();
+      if (y) /* Avoid vectorization.  */
+        abort ();
     }
   check_vect ();
   
Index: testsuite/gcc.dg/vect/vect-strided-a-u16-i2.c
===================================================================
--- testsuite/gcc.dg/vect/vect-strided-a-u16-i2.c	(revision 196816)
+++ testsuite/gcc.dg/vect/vect-strided-a-u16-i2.c	(working copy)
@@ -10,6 +10,8 @@ typedef struct {
    unsigned short b;
 } s;
 
+volatile int y = 0;
+
 __attribute__ ((noinline)) int
 main1 ()
 {
@@ -22,8 +24,8 @@ main1 ()
     {
       arr[i].a = i;
       arr[i].b = i * 2;
-      if (arr[i].a == 178)
-         abort();
+      if (y) /* Avoid vectorization.  */
+        abort ();
     }
 
   for (i = 0; i < N; i++)
Index: testsuite/gcc.dg/vect/vect-strided-u8-i8-gap7.c
===================================================================
--- testsuite/gcc.dg/vect/vect-strided-u8-i8-gap7.c	(revision 196816)
+++ testsuite/gcc.dg/vect/vect-strided-u8-i8-gap7.c	(working copy)
@@ -16,6 +16,8 @@ typedef struct {
    unsigned char h;
 } s;
 
+volatile int y = 0;
+
 __attribute__ ((noinline)) int
 main1 (s *arr)
 {
@@ -74,8 +76,8 @@ int main (void)
       arr[i].f = i * 5;
       arr[i].g = i - 3;
       arr[i].h = 67;
-      if (arr[i].a == 178)
-         abort(); 
+      if (y) /* Avoid vectorization.  */
+        abort ();
     } 
 
   main1 (arr);
Index: testsuite/gcc.dg/vect/vect-strided-a-u16-i4.c
===================================================================
--- testsuite/gcc.dg/vect/vect-strided-a-u16-i4.c	(revision 196816)
+++ testsuite/gcc.dg/vect/vect-strided-a-u16-i4.c	(working copy)
@@ -12,6 +12,8 @@ typedef struct {
    unsigned short d;
 } s;
 
+volatile int y = 0;
+
 __attribute__ ((noinline)) int
 main1 ()
 {
@@ -27,8 +29,8 @@ main1 ()
       arr[i].b = i * 2;
       arr[i].c = 17;
       arr[i].d = i+34;
-      if (arr[i].a == 178)
-         abort();
+      if (y) /* Avoid vectorization.  */
+        abort ();
     }
 
   for (i = 0; i < N; i++)
Index: testsuite/gcc.dg/vect/vect-strided-a-u8-i8-gap7.c
===================================================================
--- testsuite/gcc.dg/vect/vect-strided-a-u8-i8-gap7.c	(revision 196816)
+++ testsuite/gcc.dg/vect/vect-strided-a-u8-i8-gap7.c	(working copy)
@@ -16,6 +16,8 @@ typedef struct {
    unsigned char h;
 } s;
 
+volatile int y = 0;
+
 __attribute__ ((noinline)) int
 main1 ()
 {
@@ -35,8 +37,8 @@ main1 ()
       arr[i].f = i * 5;
       arr[i].g = i - 3;
       arr[i].h = 67;
-      if (arr[i].a == 178)
-         abort();
+      if (y) /* Avoid vectorization.  */
+        abort ();
     }
 
   for (i = 0; i < N; i++)
Index: testsuite/gcc.dg/vect/vect-strided-a-u8-i8-gap7-big-array.c
===================================================================
--- testsuite/gcc.dg/vect/vect-strided-a-u8-i8-gap7-big-array.c	(revision 196816)
+++ testsuite/gcc.dg/vect/vect-strided-a-u8-i8-gap7-big-array.c	(working copy)
@@ -16,6 +16,8 @@ typedef struct {
    unsigned char h;
 } s;
 
+volatile int y = 0;
+
 __attribute__ ((noinline)) int
 main1 ()
 {
@@ -50,8 +52,8 @@ main1 ()
       check_res[i].h = arr[i].d;
       check_res[i].g = u + t;
 
-      if (arr[i].a == 178)
-         abort ();
+      if (y) /* Avoid vectorization.  */
+        abort ();
     }
 
   for (i = 0; i < N; i++)
Index: testsuite/gcc.dg/vect/vect-strided-mult.c
===================================================================
--- testsuite/gcc.dg/vect/vect-strided-mult.c	(revision 196816)
+++ testsuite/gcc.dg/vect/vect-strided-mult.c	(working copy)
@@ -15,6 +15,8 @@ typedef struct {
    unsigned int b;
 } ii;
 
+volatile int y = 0;
+
 __attribute__ ((noinline)) int
 main1 (s *arr, ii *iarr)
 {
@@ -62,8 +64,8 @@ int main (void)
       arr[i].b = i * 2;
       iarr[i].a = i;
       iarr[i].b = i * 3;
-      if (arr[i].a == 178)
-         abort();
+      if (y) /* Avoid vectorization.  */
+        abort ();
     }
 
   main1 (arr, iarr); 
Index: testsuite/gcc.dg/vect/vect-strided-u8-i2.c
===================================================================
--- testsuite/gcc.dg/vect/vect-strided-u8-i2.c	(revision 196816)
+++ testsuite/gcc.dg/vect/vect-strided-u8-i2.c	(working copy)
@@ -10,6 +10,8 @@ typedef struct {
    unsigned char b;
 } s;
 
+volatile int y = 0;
+
 __attribute__ ((noinline)) int
 main1 (s *arr)
 {
@@ -45,8 +47,8 @@ int main (void)
     { 
       arr[i].a = i;
       arr[i].b = i * 2;
-      if (arr[i].a == 178)
-         abort(); 
+      if (y) /* Avoid vectorization.  */
+        abort ();
     } 
 
   main1 (arr);
Index: testsuite/gcc.dg/vect/vect-strided-a-u8-i2-gap.c
===================================================================
--- testsuite/gcc.dg/vect/vect-strided-a-u8-i2-gap.c	(revision 196816)
+++ testsuite/gcc.dg/vect/vect-strided-a-u8-i2-gap.c	(working copy)
@@ -10,6 +10,8 @@ typedef struct {
    unsigned char b;
 } s;
 
+volatile int y = 0;
+
 __attribute__ ((noinline)) int
 main1 ()
 {
@@ -22,8 +24,8 @@ main1 ()
     {
       arr[i].a = i;
       arr[i].b = i * 2;
-      if (arr[i].a == 178)
-         abort();
+      if (y) /* Avoid vectorization.  */
+        abort ();
     }
 
   for (i = 0; i < N; i++)
Index: testsuite/gcc.dg/vect/vect-strided-same-dr.c
===================================================================
--- testsuite/gcc.dg/vect/vect-strided-same-dr.c	(revision 196816)
+++ testsuite/gcc.dg/vect/vect-strided-same-dr.c	(working copy)
@@ -12,6 +12,8 @@ typedef struct {
 
 s buffer1[N], buffer2[N];
 
+volatile int y = 0;
+
 __attribute__ ((noinline)) int
 main1 (s * __restrict__  pIn, s* __restrict__ pOut)
 {
@@ -61,8 +63,8 @@ int main (void)
       buffer1[i].b = i + 8;
       buffer2[i].a = i * 3;
       buffer2[i].b = i * 2;
-      if (buffer1[i].a == 500)
-         abort();
+      if (y) /* Avoid vectorization.  */
+        abort ();
     }
 
   check_vect ();
Index: testsuite/gcc.dg/vect/vect-strided-mult-char-ls.c
===================================================================
--- testsuite/gcc.dg/vect/vect-strided-mult-char-ls.c	(revision 196816)
+++ testsuite/gcc.dg/vect/vect-strided-mult-char-ls.c	(working copy)
@@ -15,6 +15,8 @@ typedef struct {
    unsigned int b;
 } ii;
 
+volatile int y = 0;
+
 __attribute__ ((noinline)) int
 main1 (s *arr, ii *iarr)
 {
@@ -62,8 +64,8 @@ int main (void)
       arr[i].b = i * 2;
       iarr[i].a = i;
       iarr[i].b = i * 3;
-      if (arr[i].a == 178)
-         abort();
+      if (y) /* Avoid vectorization.  */
+        abort ();
     }
 
   main1 (arr, iarr);
Index: testsuite/gcc.dg/vect/vect-strided-a-u8-i8-gap2-big-array.c
===================================================================
--- testsuite/gcc.dg/vect/vect-strided-a-u8-i8-gap2-big-array.c	(revision 196816)
+++ testsuite/gcc.dg/vect/vect-strided-a-u8-i8-gap2-big-array.c	(working copy)
@@ -16,6 +16,8 @@ typedef struct {
    unsigned char h;
 } s;
 
+volatile int y = 0;
+
 __attribute__ ((noinline)) int
 main1 ()
 {
@@ -45,8 +47,8 @@ main1 ()
       check_res[i].h = arr[i].f;
       check_res[i].g = arr[i].f - arr[i].a;
 
-      if (arr[i].a == 178)
-         abort ();
+      if (y) /* Avoid vectorization.  */
+        abort ();
     }
 
   for (i = 0; i < N; i++)

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [patch] Fix ICE during RTL expansion at -O1
  2013-03-21 16:27 [patch] Fix ICE during RTL expansion at -O1 Eric Botcazou
@ 2013-03-22  9:44 ` Richard Biener
  2013-04-14 13:34   ` Eric Botcazou
  0 siblings, 1 reply; 8+ messages in thread
From: Richard Biener @ 2013-03-22  9:44 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: gcc-patches

On Thu, Mar 21, 2013 at 5:24 PM, Eric Botcazou <ebotcazou@adacore.com> wrote:
> Hi,
>
> this fixes an ICE on the mainline at -O1:
>
> eric@polaris:~/gnat/bugs/M129-026> ~/install/gcc/bin/gcc -S p.adb -O
> +===========================GNAT BUG DETECTED==============================+
> | 4.9.0 20130320 (experimental) [trunk revision 196816] (x86_64-suse-linux)
> GCC error:|
> | in expand_assignment, at expr.c:4761                                     |
> | Error detected around p.ads:11:9
>
>   MEM[(struct p__rec &)&ret].d = 0;
>   MEM[(struct p__rec &)&ret].b1 = 0;
>   _26 = MEM[(struct p__rec &)&ret].d;
>   p0.8_27 = (integer) _26;
>   _28 = (sizetype) p0.8_27;
>   _29 = _28 * 8;
>   _30 = _29 + _29;
>   _31 = _30 + 8;
>   _32 = _31 /[ex] 8;
>   MEM[(struct p__rec &)&ret].b2{off: _32 * 8} = 1;
>
> The ICE occurs because ret is put into a register instead of memory (it has
> an integral mode) and you cannot have a variable offset within an object in
> a register.
>
> But, if you look at the above GIMPLE from .optimized, you'll see that the
> offset is actually fixed and that the GIMPLE optimizers weren't powerful
> enough to see it.  Ironically enough, the RTL optimizers would have been!
>
> So the attached patch adds a clone of nonoverlapping_component_refs_p from
> alias.c to tree-ssa-alias.c to disambiguate the memory accesses.  It also
> contains a small tweak to both functions to cater to a LTO-specific quirk.
>
> With it, a bunch of testcases of the vectorization testsuite are optimized
> so the patch also contains counter-measures for them.
>
> Tested on x86_64-suse-linux, OK for the mainline?

This is a quadratic algorithm and as such not ok.  We already have
aliasing_component_refs_p in tree-ssa-alias.c which is supposed to be
the non-quadratic replacement.  It's not used via decl_refs_may_alias_p,
so that may be the thing to fix.

nonoverlapping_component_refs_of_decl_p on RTL should go - in fact
we do call the tree oracle from all its callers so we only ever do redundant
work (after your proposed patch even more so).

(I recently tried to cache the ao_ref in mem-attrs but failed because of
weird gengtype issues ... we could avoid the repeated get_ref_base_and_extent
calls done via ao_ref_from_mem that way).

Richard.

>
> 2013-03-21  Eric Botcazou  <ebotcazou@adacore.com>
>
>         * alias.c (nonoverlapping_component_refs_p): Protect again LTO quirk.
>         * tree-ssa-alias.c (nonoverlapping_component_refs_of_decl_p): New.
>         (decl_refs_may_alias_p): Add REF1 and REF2 parameters.
>         Use it to disambiguate component references.
>         (refs_may_alias_p_1): Adjust call to decl_refs_may_alias_p.
>
>
> 2013-03-21  Eric Botcazou  <ebotcazou@adacore.com>
>
>         * gnat.dg/discr41.ad[sb]: New test.
>         * gcc.dg/vect/slp-24-big-array.c: Beef up anti-vectorization trick.
>         * gcc.dg/vect/slp-24.c: Likewise.
>         * gcc.dg/vect/vect-strided-a-mult.c: Likewise.
>         * gcc.dg/vect/vect-strided-a-u16-i2.c: Likewise.
>         * gcc.dg/vect/vect-strided-a-u16-i4.c: Likewise.
>         * gcc.dg/vect/vect-strided-a-u16-mult.c: Likewise.
>         * gcc.dg/vect/vect-strided-a-u8-i2-gap.c: Likewise.
>         * gcc.dg/vect/vect-strided-a-u8-i8-gap2-big-array.c: Likewise.
>         * gcc.dg/vect/vect-strided-a-u8-i8-gap2.c: Likewise.
>         * gcc.dg/vect/vect-strided-a-u8-i8-gap7-big-array.c: Likewise.
>         * gcc.dg/vect/vect-strided-a-u8-i8-gap7.c: Likewise.
>         * gcc.dg/vect/vect-strided-mult-char-ls.c: Likewise.
>         * gcc.dg/vect/vect-strided-mult.c: Likewise.
>         * gcc.dg/vect/vect-strided-same-dr.c: Likewise.
>         * gcc.dg/vect/vect-strided-u16-i2.c: Likewise.
>         * gcc.dg/vect/vect-strided-u16-i4.c: Likewise.
>         * gcc.dg/vect/vect-strided-u32-i4.c: Likewise.
>         * gcc.dg/vect/vect-strided-u32-i8.c: Likewise.
>         * gcc.dg/vect/vect-strided-u8-i2-gap.c: Likewise.
>         * gcc.dg/vect/vect-strided-u8-i2.c: Likewise.
>         * gcc.dg/vect/vect-strided-u8-i8-gap2-big-array.c: Likewise.
>         * gcc.dg/vect/vect-strided-u8-i8-gap2.c: Likewise.
>         * gcc.dg/vect/vect-strided-u8-i8-gap4-big-array.c: Likewise.
>         * gcc.dg/vect/vect-strided-u8-i8-gap4-unknown.c: Likewise.
>         * gcc.dg/vect/vect-strided-u8-i8-gap4.c: Likewise.
>         * gcc.dg/vect/vect-strided-u8-i8-gap7-big-array.c: Likewise.
>         * gcc.dg/vect/vect-strided-u8-i8-gap7.c: Likewise.
>         * gcc.dg/vect/vect-strided-u8-i8.c: Likewise.
>
>
> --
> Eric Botcazou

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [patch] Fix ICE during RTL expansion at -O1
  2013-03-22  9:44 ` Richard Biener
@ 2013-04-14 13:34   ` Eric Botcazou
  2013-04-15 12:18     ` Richard Biener
  0 siblings, 1 reply; 8+ messages in thread
From: Eric Botcazou @ 2013-04-14 13:34 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1452 bytes --]

> This is a quadratic algorithm and as such not ok.  We already have
> aliasing_component_refs_p in tree-ssa-alias.c which is supposed to be
> the non-quadratic replacement.  It's not used via decl_refs_may_alias_p,
> so that may be the thing to fix.

aliasing_component_refs_p isn't powerful enough, it eliminates the quadratic 
aspect by assuming that all offsets are constants, so it misses cases like 
(*p)[i].f1 vs a[j].f2.  Moreover it assumes TBAA and we don't need it here.

I can rewrite nonoverlapping_component_refs_of_decl_p to make it non-quadratic 
and catch the same cases I think, patch attached (without the vect testsuite 
adjustments, but they are still needed).

> nonoverlapping_component_refs_of_decl_p on RTL should go - in fact
> we do call the tree oracle from all its callers so we only ever do redundant
> work (after your proposed patch even more so).

Not clear if the tree oracle can catch the above case with *p and a, but, yes, 
nonoverlapping_component_refs_p should go in the long term.


	* alias.c (nonoverlapping_component_refs_p): Protect again LTO quirk.
	* tree-ssa-alias.c (nonoverlapping_component_refs_of_decl_p): New.
	(decl_refs_may_alias_p): Add REF1 and REF2 parameters.
	Use nonoverlapping_component_refs_of_decl_p to disambiguate component
	references.
	(refs_may_alias_p_1): Adjust call to decl_refs_may_alias_p.
	* tree-streamer.c (record_common_node): Adjust reference in comment.


-- 
Eric Botcazou

[-- Attachment #2: p2.diff --]
[-- Type: text/x-patch, Size: 6611 bytes --]

Index: alias.c
===================================================================
--- alias.c	(revision 197926)
+++ alias.c	(working copy)
@@ -2232,8 +2232,11 @@ nonoverlapping_component_refs_p (const_r
 
     found:
       /* If we're left with accessing different fields of a structure, then no
-	 possible overlap, unless they are both bitfields.  */
-      if (TREE_CODE (typex) == RECORD_TYPE && fieldx != fieldy)
+	 possible overlap, unless they are both bitfields.
+	 ??? Pointer inequality is too fragile in the LTO compiler.  */
+      if (TREE_CODE (typex) == RECORD_TYPE
+	  && fieldx != fieldy
+	  && DECL_NAME (fieldx) != DECL_NAME (fieldy))
 	return !(DECL_BIT_FIELD (fieldx) && DECL_BIT_FIELD (fieldy));
 
       /* The comparison on the current field failed.  If we're accessing
Index: tree-ssa-alias.c
===================================================================
--- tree-ssa-alias.c	(revision 197926)
+++ tree-ssa-alias.c	(working copy)
@@ -719,14 +719,110 @@ aliasing_component_refs_p (tree ref1,
   return false;
 }
 
+/* Return true if we can determine that component references REF1 and REF2,
+   that are within a common DECL, cannot overlap.  */
+
+static bool
+nonoverlapping_component_refs_of_decl_p (tree ref1, tree ref2)
+{
+  vec<tree, va_stack> component_refs1;
+  vec<tree, va_stack> component_refs2;
+
+  vec_stack_alloc (tree, component_refs1, 16);
+  vec_stack_alloc (tree, component_refs2, 16);
+
+  /* Create the stack of handled components for REF1.  */
+  while (handled_component_p (ref1))
+    {
+      component_refs1.safe_push (ref1);
+      ref1 = TREE_OPERAND (ref1, 0);
+    }
+  if (TREE_CODE (ref1) == MEM_REF)
+    {
+      if (!integer_zerop (TREE_OPERAND (ref1, 1)))
+	goto may_overlap;
+      ref1 = TREE_OPERAND (TREE_OPERAND (ref1, 0), 0);
+    }
+
+  /* Create the stack of handled components for REF2.  */
+  while (handled_component_p (ref2))
+    {
+      component_refs2.safe_push (ref2);
+      ref2 = TREE_OPERAND (ref2, 0);
+    }
+  if (TREE_CODE (ref2) == MEM_REF)
+    {
+      if (!integer_zerop (TREE_OPERAND (ref2, 1)))
+	goto may_overlap;
+      ref2 = TREE_OPERAND (TREE_OPERAND (ref2, 0), 0);
+    }
+
+  /* We must have the same base DECL.  */
+  gcc_assert (ref1 == ref2);
+
+  /* Pop the stacks in parallel and examine the COMPONENT_REFs of the same
+     rank.  This is sufficient because you cannot reference several fields
+     at a time (unlike for arrays with slices), unless you're in a union,
+     in which case the return value will be false in any case.  */
+  while (true)
+    {
+      do
+	{
+	  if (component_refs1.is_empty ())
+	    goto may_overlap;
+	  ref1 = component_refs1.pop ();
+	}
+      while (!RECORD_OR_UNION_TYPE_P (TREE_TYPE (TREE_OPERAND (ref1, 0))));
+
+      do
+	{
+	  if (component_refs2.is_empty ())
+	     goto may_overlap;
+	  ref2 = component_refs2.pop ();
+	}
+      while (!RECORD_OR_UNION_TYPE_P (TREE_TYPE (TREE_OPERAND (ref2, 0))));
+
+      /* Beware of BIT_FIELD_REF.  */
+      if (TREE_CODE (ref1) != COMPONENT_REF
+	  || TREE_CODE (ref2) != COMPONENT_REF)
+	goto may_overlap;
+
+      tree field1 = TREE_OPERAND (ref1, 1);
+      tree field2 = TREE_OPERAND (ref2, 1);
+
+      /* ??? We cannot simply use the type of operand #0 of the refs here
+	 as the Fortran compiler smuggles type punning into COMPONENT_REFs
+	 for common blocks instead of using unions like everyone else.  */
+      tree type1 = TYPE_MAIN_VARIANT (DECL_CONTEXT (field1));
+      tree type2 = TYPE_MAIN_VARIANT (DECL_CONTEXT (field2));
+
+      if (type1 != type2 || TREE_CODE (type1) != RECORD_TYPE)
+	 goto may_overlap;
+
+      /* ??? Pointer inequality is too fragile in the LTO compiler.  */
+      if (field1 != field2 && DECL_NAME (field1) != DECL_NAME (field2))
+	{
+	  component_refs1.release ();
+	  component_refs2.release ();
+	  return true;
+	}
+    }
+
+may_overlap:
+  component_refs1.release ();
+  component_refs2.release ();
+  return false;
+}
+
 /* Return true if two memory references based on the variables BASE1
    and BASE2 constrained to [OFFSET1, OFFSET1 + MAX_SIZE1) and
-   [OFFSET2, OFFSET2 + MAX_SIZE2) may alias.  */
+   [OFFSET2, OFFSET2 + MAX_SIZE2) may alias.  REF1 and REF2
+   if non-NULL are the complete memory reference trees.  */
 
 static bool
-decl_refs_may_alias_p (tree base1,
+decl_refs_may_alias_p (tree ref1, tree base1,
 		       HOST_WIDE_INT offset1, HOST_WIDE_INT max_size1,
-		       tree base2,
+		       tree ref2, tree base2,
 		       HOST_WIDE_INT offset2, HOST_WIDE_INT max_size2)
 {
   gcc_checking_assert (DECL_P (base1) && DECL_P (base2));
@@ -737,7 +833,17 @@ decl_refs_may_alias_p (tree base1,
 
   /* If both references are based on the same variable, they cannot alias if
      the accesses do not overlap.  */
-  return ranges_overlap_p (offset1, max_size1, offset2, max_size2);
+  if (!ranges_overlap_p (offset1, max_size1, offset2, max_size2))
+    return false;
+
+  /* For components with variable position, the above test isn't sufficient,
+     so we disambiguate component references manually.  */
+  if (ref1 && ref2
+      && handled_component_p (ref1) && handled_component_p (ref2)
+      && nonoverlapping_component_refs_of_decl_p (ref1, ref2))
+    return false;
+
+  return true;     
 }
 
 /* Return true if an indirect reference based on *PTR1 constrained
@@ -1086,8 +1192,8 @@ refs_may_alias_p_1 (ao_ref *ref1, ao_ref
   var1_p = DECL_P (base1);
   var2_p = DECL_P (base2);
   if (var1_p && var2_p)
-    return decl_refs_may_alias_p (base1, offset1, max_size1,
-				  base2, offset2, max_size2);
+    return decl_refs_may_alias_p (ref1->ref, base1, offset1, max_size1,
+				  ref2->ref, base2, offset2, max_size2);
 
   ind1_p = (TREE_CODE (base1) == MEM_REF
 	    || TREE_CODE (base1) == TARGET_MEM_REF);
Index: tree-streamer.c
===================================================================
--- tree-streamer.c	(revision 197926)
+++ tree-streamer.c	(working copy)
@@ -267,10 +267,9 @@ record_common_node (struct streamer_tree
       /* The FIELD_DECLs of structures should be shared, so that every
 	 COMPONENT_REF uses the same tree node when referencing a field.
 	 Pointer equality between FIELD_DECLs is used by the alias
-	 machinery to compute overlapping memory references (See
-	 nonoverlapping_component_refs_p).  */
-      tree f;
-      for (f = TYPE_FIELDS (node); f; f = TREE_CHAIN (f))
+	 machinery to compute overlapping component references (see
+	 nonoverlapping_component_refs_of_decl_p).  */
+      for (tree f = TYPE_FIELDS (node); f; f = TREE_CHAIN (f))
 	record_common_node (cache, f);
     }
 }

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [patch] Fix ICE during RTL expansion at -O1
  2013-04-14 13:34   ` Eric Botcazou
@ 2013-04-15 12:18     ` Richard Biener
  2013-04-15 14:44       ` Richard Biener
  2013-04-16 12:31       ` Eric Botcazou
  0 siblings, 2 replies; 8+ messages in thread
From: Richard Biener @ 2013-04-15 12:18 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: gcc-patches

On Sun, Apr 14, 2013 at 9:46 AM, Eric Botcazou <ebotcazou@adacore.com> wrote:
>> This is a quadratic algorithm and as such not ok.  We already have
>> aliasing_component_refs_p in tree-ssa-alias.c which is supposed to be
>> the non-quadratic replacement.  It's not used via decl_refs_may_alias_p,
>> so that may be the thing to fix.
>
> aliasing_component_refs_p isn't powerful enough, it eliminates the quadratic
> aspect by assuming that all offsets are constants, so it misses cases like
> (*p)[i].f1 vs a[j].f2.  Moreover it assumes TBAA and we don't need it here.

Note that looking at the access path _is_ assuming TBAA constraints as
soon as the base objects are not the same (in the above case '*p' and 'a'
are not the same and p could alias a in a way that all f1 and f2 overlap).

> I can rewrite nonoverlapping_component_refs_of_decl_p to make it non-quadratic
> and catch the same cases I think, patch attached (without the vect testsuite
> adjustments, but they are still needed).
>
>> nonoverlapping_component_refs_of_decl_p on RTL should go - in fact
>> we do call the tree oracle from all its callers so we only ever do redundant
>> work (after your proposed patch even more so).
>
> Not clear if the tree oracle can catch the above case with *p and a, but, yes,
> nonoverlapping_component_refs_p should go in the long term.
>
>
>         * alias.c (nonoverlapping_component_refs_p): Protect again LTO quirk.
>         * tree-ssa-alias.c (nonoverlapping_component_refs_of_decl_p): New.
>         (decl_refs_may_alias_p): Add REF1 and REF2 parameters.
>         Use nonoverlapping_component_refs_of_decl_p to disambiguate component
>         references.
>         (refs_may_alias_p_1): Adjust call to decl_refs_may_alias_p.
>         * tree-streamer.c (record_common_node): Adjust reference in comment.

Index: alias.c
===================================================================
--- alias.c     (revision 197926)
+++ alias.c     (working copy)
@@ -2232,8 +2232,11 @@ nonoverlapping_component_refs_p (const_r

     found:
       /* If we're left with accessing different fields of a structure, then no
-        possible overlap, unless they are both bitfields.  */
-      if (TREE_CODE (typex) == RECORD_TYPE && fieldx != fieldy)
+        possible overlap, unless they are both bitfields.
+        ??? Pointer inequality is too fragile in the LTO compiler.  */
+      if (TREE_CODE (typex) == RECORD_TYPE
+         && fieldx != fieldy
+         && DECL_NAME (fieldx) != DECL_NAME (fieldy))

this, if at all, should go in with a separate patch and a testcase.
And I think it should _not_ go in.  Instead, as the case passes

              if (typex == typey)
                goto found;

earlier you should assert that DECL_CONTEXT (fieldx) == DECL_CONTEXT
(fieldy) == typex == typey here.  Note that fails of this test are
expected even in the
non-LTO case because I cannot find any IL verification that would verify
that for a COMPONENT_REF TREE_TYPE (TREE_OPERAND (cr, 0))
== DECL_CONTEXT (TREE_OPERAND (cr, 1)) (due to sharing of the FIELD_DECL
chain between different type variants the check will fail for all
non-main-variants
I think, so refining it to look at the main variant is probably advised).

Otoh...

+      /* ??? We cannot simply use the type of operand #0 of the refs here
+        as the Fortran compiler smuggles type punning into COMPONENT_REFs
+        for common blocks instead of using unions like everyone else.  */
+      tree type1 = TYPE_MAIN_VARIANT (DECL_CONTEXT (field1));
+      tree type2 = TYPE_MAIN_VARIANT (DECL_CONTEXT (field2));
+
+      if (type1 != type2 || TREE_CODE (type1) != RECORD_TYPE)
+        goto may_overlap;
+
+      /* ??? Pointer inequality is too fragile in the LTO compiler.  */
+      if (field1 != field2 && DECL_NAME (field1) != DECL_NAME (field2))

this suggests you are seeing multiple FIELD_DECLs for the same field
in the _same_ FIELD_DECL chain ...?!  Are you sure this happens with
GCC 4.8?  There were some fixes in that area in the LTO type merging code.

Index: tree-streamer.c
===================================================================
--- tree-streamer.c     (revision 197926)
+++ tree-streamer.c     (working copy)
@@ -267,10 +267,9 @@ record_common_node (struct streamer_tree
       /* The FIELD_DECLs of structures should be shared, so that every
         COMPONENT_REF uses the same tree node when referencing a field.
         Pointer equality between FIELD_DECLs is used by the alias
-        machinery to compute overlapping memory references (See
-        nonoverlapping_component_refs_p).  */
-      tree f;
-      for (f = TYPE_FIELDS (node); f; f = TREE_CHAIN (f))
+        machinery to compute overlapping component references (see
+        nonoverlapping_component_refs_of_decl_p).  */
+      for (tree f = TYPE_FIELDS (node); f; f = TREE_CHAIN (f))
        record_common_node (cache, f);
     }
 }

without actually removing nonoverlapping_component_refs_p it still applies
to both...


Can you port the non-quadratic algorithm to the RTL
nonoverlapping_component_refs_p first?  The core code should exactly
be the same ... (and shared, and only called once from the RTL oracle).

+  /* Pop the stacks in parallel and examine the COMPONENT_REFs of the same
+     rank.  This is sufficient because you cannot reference several fields
+     at a time (unlike for arrays with slices), unless you're in a union,
+     in which case the return value will be false in any case.  */
+  while (true)

I don't understand the "reference several fields" comment.  Because I
can clearly have aggregate copies of RECORD_TYPE.  Can you try
do elaborate more on why the algorithm should be sufficient to catch
all cases?

You could enhance it to not require

+  /* We must have the same base DECL.  */
+  gcc_assert (ref1 == ref2);

for MEM_REF bases under the same conditions like aliasing_component_refs_p,
that is if the MEM_REF isn't view-converting.

That said, if the bases are the same DECLs then indeed you do not
rely on TBAA.  The RTL nonoverlapping_component_refs_p does not
disable itself properly for pointer based accesses that might be
view-converted / aliased accesses (a simple testcase with ref-all
pointers properly offsetted to alias two adjacent fields may be enough to
show that).

Also with your patch enhanced like I suggest we should be able to
remove aliasing_component_refs_p as well, no?

Thanks,
Richard.

>
> --
> Eric Botcazou

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [patch] Fix ICE during RTL expansion at -O1
  2013-04-15 12:18     ` Richard Biener
@ 2013-04-15 14:44       ` Richard Biener
  2013-04-16 12:31       ` Eric Botcazou
  1 sibling, 0 replies; 8+ messages in thread
From: Richard Biener @ 2013-04-15 14:44 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: gcc-patches

On Mon, Apr 15, 2013 at 11:47 AM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Sun, Apr 14, 2013 at 9:46 AM, Eric Botcazou <ebotcazou@adacore.com> wrote:
>>> This is a quadratic algorithm and as such not ok.  We already have
>>> aliasing_component_refs_p in tree-ssa-alias.c which is supposed to be
>>> the non-quadratic replacement.  It's not used via decl_refs_may_alias_p,
>>> so that may be the thing to fix.
>>
>> aliasing_component_refs_p isn't powerful enough, it eliminates the quadratic
>> aspect by assuming that all offsets are constants, so it misses cases like
>> (*p)[i].f1 vs a[j].f2.  Moreover it assumes TBAA and we don't need it here.
>
> Note that looking at the access path _is_ assuming TBAA constraints as
> soon as the base objects are not the same (in the above case '*p' and 'a'
> are not the same and p could alias a in a way that all f1 and f2 overlap).
>
>> I can rewrite nonoverlapping_component_refs_of_decl_p to make it non-quadratic
>> and catch the same cases I think, patch attached (without the vect testsuite
>> adjustments, but they are still needed).
>>
>>> nonoverlapping_component_refs_of_decl_p on RTL should go - in fact
>>> we do call the tree oracle from all its callers so we only ever do redundant
>>> work (after your proposed patch even more so).
>>
>> Not clear if the tree oracle can catch the above case with *p and a, but, yes,
>> nonoverlapping_component_refs_p should go in the long term.
>>
>>
>>         * alias.c (nonoverlapping_component_refs_p): Protect again LTO quirk.
>>         * tree-ssa-alias.c (nonoverlapping_component_refs_of_decl_p): New.
>>         (decl_refs_may_alias_p): Add REF1 and REF2 parameters.
>>         Use nonoverlapping_component_refs_of_decl_p to disambiguate component
>>         references.
>>         (refs_may_alias_p_1): Adjust call to decl_refs_may_alias_p.
>>         * tree-streamer.c (record_common_node): Adjust reference in comment.
>
> Index: alias.c
> ===================================================================
> --- alias.c     (revision 197926)
> +++ alias.c     (working copy)
> @@ -2232,8 +2232,11 @@ nonoverlapping_component_refs_p (const_r
>
>      found:
>        /* If we're left with accessing different fields of a structure, then no
> -        possible overlap, unless they are both bitfields.  */
> -      if (TREE_CODE (typex) == RECORD_TYPE && fieldx != fieldy)
> +        possible overlap, unless they are both bitfields.
> +        ??? Pointer inequality is too fragile in the LTO compiler.  */
> +      if (TREE_CODE (typex) == RECORD_TYPE
> +         && fieldx != fieldy
> +         && DECL_NAME (fieldx) != DECL_NAME (fieldy))
>
> this, if at all, should go in with a separate patch and a testcase.
> And I think it should _not_ go in.  Instead, as the case passes
>
>               if (typex == typey)
>                 goto found;
>
> earlier you should assert that DECL_CONTEXT (fieldx) == DECL_CONTEXT
> (fieldy) == typex == typey here.  Note that fails of this test are
> expected even in the
> non-LTO case because I cannot find any IL verification that would verify
> that for a COMPONENT_REF TREE_TYPE (TREE_OPERAND (cr, 0))
> == DECL_CONTEXT (TREE_OPERAND (cr, 1)) (due to sharing of the FIELD_DECL
> chain between different type variants the check will fail for all
> non-main-variants
> I think, so refining it to look at the main variant is probably advised).
>
> Otoh...
>
> +      /* ??? We cannot simply use the type of operand #0 of the refs here
> +        as the Fortran compiler smuggles type punning into COMPONENT_REFs
> +        for common blocks instead of using unions like everyone else.  */
> +      tree type1 = TYPE_MAIN_VARIANT (DECL_CONTEXT (field1));
> +      tree type2 = TYPE_MAIN_VARIANT (DECL_CONTEXT (field2));
> +
> +      if (type1 != type2 || TREE_CODE (type1) != RECORD_TYPE)
> +        goto may_overlap;
> +
> +      /* ??? Pointer inequality is too fragile in the LTO compiler.  */
> +      if (field1 != field2 && DECL_NAME (field1) != DECL_NAME (field2))
>
> this suggests you are seeing multiple FIELD_DECLs for the same field
> in the _same_ FIELD_DECL chain ...?!  Are you sure this happens with
> GCC 4.8?  There were some fixes in that area in the LTO type merging code.
>
> Index: tree-streamer.c
> ===================================================================
> --- tree-streamer.c     (revision 197926)
> +++ tree-streamer.c     (working copy)
> @@ -267,10 +267,9 @@ record_common_node (struct streamer_tree
>        /* The FIELD_DECLs of structures should be shared, so that every
>          COMPONENT_REF uses the same tree node when referencing a field.
>          Pointer equality between FIELD_DECLs is used by the alias
> -        machinery to compute overlapping memory references (See
> -        nonoverlapping_component_refs_p).  */
> -      tree f;
> -      for (f = TYPE_FIELDS (node); f; f = TREE_CHAIN (f))
> +        machinery to compute overlapping component references (see
> +        nonoverlapping_component_refs_of_decl_p).  */
> +      for (tree f = TYPE_FIELDS (node); f; f = TREE_CHAIN (f))
>         record_common_node (cache, f);
>      }
>  }
>
> without actually removing nonoverlapping_component_refs_p it still applies
> to both...
>
>
> Can you port the non-quadratic algorithm to the RTL
> nonoverlapping_component_refs_p first?  The core code should exactly
> be the same ... (and shared, and only called once from the RTL oracle).
>
> +  /* Pop the stacks in parallel and examine the COMPONENT_REFs of the same
> +     rank.  This is sufficient because you cannot reference several fields
> +     at a time (unlike for arrays with slices), unless you're in a union,
> +     in which case the return value will be false in any case.  */
> +  while (true)
>
> I don't understand the "reference several fields" comment.  Because I
> can clearly have aggregate copies of RECORD_TYPE.  Can you try
> do elaborate more on why the algorithm should be sufficient to catch
> all cases?
>
> You could enhance it to not require
>
> +  /* We must have the same base DECL.  */
> +  gcc_assert (ref1 == ref2);
>
> for MEM_REF bases under the same conditions like aliasing_component_refs_p,
> that is if the MEM_REF isn't view-converting.
>
> That said, if the bases are the same DECLs then indeed you do not
> rely on TBAA.  The RTL nonoverlapping_component_refs_p does not
> disable itself properly for pointer based accesses that might be
> view-converted / aliased accesses (a simple testcase with ref-all
> pointers properly offsetted to alias two adjacent fields may be enough to
> show that).

Testcase:

struct S {
    int i;
    int j;
};
struct U {
    struct S s;
} __attribute__((may_alias));
int __attribute__((noinline,noclone))
foo (struct U *p, struct U *q)
{
  int i;
  q->s.j = 1;
  i = p->s.i;
  return i;
}
int main()
{
  int *p = (int *)__builtin_alloca (sizeof (int) * 3);
  p[1] = 0;
  if (foo ((struct U *)(p + 1), (struct U *)p) != 1)
    __builtin_abort ();
  return 0;
}

fails on x86_64 with -O2 -fschedule-insns because scheduling exchanges
the store and the load.

Richard.

>
> Also with your patch enhanced like I suggest we should be able to
> remove aliasing_component_refs_p as well, no?
>
> Thanks,
> Richard.
>
>>
>> --
>> Eric Botcazou

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [patch] Fix ICE during RTL expansion at -O1
  2013-04-15 12:18     ` Richard Biener
  2013-04-15 14:44       ` Richard Biener
@ 2013-04-16 12:31       ` Eric Botcazou
  2013-04-16 16:47         ` Richard Biener
  1 sibling, 1 reply; 8+ messages in thread
From: Eric Botcazou @ 2013-04-16 12:31 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 5910 bytes --]

> Note that looking at the access path _is_ assuming TBAA constraints as
> soon as the base objects are not the same (in the above case '*p' and 'a'
> are not the same and p could alias a in a way that all f1 and f2 overlap).

Right, but here I'm assuming (and asserting) that the base objects are the 
same.

> Index: alias.c
> ===================================================================
> --- alias.c     (revision 197926)
> +++ alias.c     (working copy)
> @@ -2232,8 +2232,11 @@ nonoverlapping_component_refs_p (const_r
> 
>      found:
>        /* If we're left with accessing different fields of a structure, then
> no -        possible overlap, unless they are both bitfields.  */
> -      if (TREE_CODE (typex) == RECORD_TYPE && fieldx != fieldy)
> +        possible overlap, unless they are both bitfields.
> +        ??? Pointer inequality is too fragile in the LTO compiler.  */
> +      if (TREE_CODE (typex) == RECORD_TYPE
> +         && fieldx != fieldy
> +         && DECL_NAME (fieldx) != DECL_NAME (fieldy))
> 
> this, if at all, should go in with a separate patch and a testcase.
> And I think it should _not_ go in.

OK, I can remove the LTO related bits.

> Otoh...
> 
> +      /* ??? We cannot simply use the type of operand #0 of the refs here
> +        as the Fortran compiler smuggles type punning into COMPONENT_REFs
> +        for common blocks instead of using unions like everyone else.  */
> +      tree type1 = TYPE_MAIN_VARIANT (DECL_CONTEXT (field1));
> +      tree type2 = TYPE_MAIN_VARIANT (DECL_CONTEXT (field2));
> +
> +      if (type1 != type2 || TREE_CODE (type1) != RECORD_TYPE)
> +        goto may_overlap;
> +
> +      /* ??? Pointer inequality is too fragile in the LTO compiler.  */
> +      if (field1 != field2 && DECL_NAME (field1) != DECL_NAME (field2))
> 
> this suggests you are seeing multiple FIELD_DECLs for the same field
> in the _same_ FIELD_DECL chain ...?!  Are you sure this happens with
> GCC 4.8?  There were some fixes in that area in the LTO type merging code.

No, let's drop the LTO related bits for mainline.  But the Fortran related 
bits are necessary, it's the verification you talked about earlier: for a 
COMPONENT_REF

  TYPE_MAIN_VARIANT (TREE_TYPE (TREE_OPERAND (t, 0)))
    = TYPE_MAIN_VARIANT (DECL_CONTEXT (TREE_OPERAND (t, 1)))

is expected to be always true (but isn't checked as you pointed out).  But the 
Fortran compiler violates it (4.9, gfortran.dg/aliasing_array_result_1.f90) as 
it implements a common block by defining 2 RECORD_TYPEs with 1 field, one for 
each variables, and does an implicit type conversion of TREE_OPERAND (t, 0).

> Index: tree-streamer.c
> ===================================================================
> --- tree-streamer.c     (revision 197926)
> +++ tree-streamer.c     (working copy)
> @@ -267,10 +267,9 @@ record_common_node (struct streamer_tree
>        /* The FIELD_DECLs of structures should be shared, so that every
>          COMPONENT_REF uses the same tree node when referencing a field.
>          Pointer equality between FIELD_DECLs is used by the alias
> -        machinery to compute overlapping memory references (See
> -        nonoverlapping_component_refs_p).  */
> -      tree f;
> -      for (f = TYPE_FIELDS (node); f; f = TREE_CHAIN (f))
> +        machinery to compute overlapping component references (see
> +        nonoverlapping_component_refs_of_decl_p).  */
> +      for (tree f = TYPE_FIELDS (node); f; f = TREE_CHAIN (f))
>         record_common_node (cache, f);
>      }
>  }
> 
> without actually removing nonoverlapping_component_refs_p it still applies
> to both...

Will adjust.

> Can you port the non-quadratic algorithm to the RTL
> nonoverlapping_component_refs_p first?  The core code should exactly
> be the same ... (and shared, and only called once from the RTL oracle).

No, it cannot be ported, the non-quadratic aspect is a consequence of the same 
base object assumption.  It you don't have it, you need to be quadratic to be 
able to deal with variable offsets in this way.

> I don't understand the "reference several fields" comment.  Because I
> can clearly have aggregate copies of RECORD_TYPE.  Can you try
> do elaborate more on why the algorithm should be sufficient to catch
> all cases?

I can, but you need to keep in mind that the base objects are the same.

> You could enhance it to not require
> 
> +  /* We must have the same base DECL.  */
> +  gcc_assert (ref1 == ref2);
> 
> for MEM_REF bases under the same conditions like aliasing_component_refs_p,
> that is if the MEM_REF isn't view-converting.

The code already handles MEM_REF<ADDR_EXPR> though (and checks that the offset 
is zero).  I'm not sure that we need more (but ready to be proved wrong here).

> That said, if the bases are the same DECLs then indeed you do not
> rely on TBAA.  The RTL nonoverlapping_component_refs_p does not
> disable itself properly for pointer based accesses that might be
> view-converted / aliased accesses (a simple testcase with ref-all
> pointers properly offsetted to alias two adjacent fields may be enough to
> show that).
> 
> Also with your patch enhanced like I suggest we should be able to
> remove aliasing_component_refs_p as well, no?

My revised patch is only a safe, non-quadratic, non-TBAA based disambiguation 
for references based on the same DECL, so it's a full replacement neither for 
aliasing_component_refs_p nor for nonoverlapping_component_refs_p.

Thanks for the review.  Updated patch attached.


        * tree-ssa-alias.c (nonoverlapping_component_refs_of_decl_p): New.
        (decl_refs_may_alias_p): Add REF1 and REF2 parameters.
        Use nonoverlapping_component_refs_of_decl_p to disambiguate component
        references.
        (refs_may_alias_p_1): Adjust call to decl_refs_may_alias_p.
        * tree-streamer.c (record_common_node): Adjust reference in comment.


-- 
Eric Botcazou

[-- Attachment #2: p3.diff --]
[-- Type: text/x-patch, Size: 5949 bytes --]

Index: tree-ssa-alias.c
===================================================================
--- tree-ssa-alias.c	(revision 197926)
+++ tree-ssa-alias.c	(working copy)
@@ -719,14 +719,112 @@ aliasing_component_refs_p (tree ref1,
   return false;
 }
 
+/* Return true if we can determine that component references REF1 and REF2,
+   that are within a common DECL, cannot overlap.  */
+
+static bool
+nonoverlapping_component_refs_of_decl_p (tree ref1, tree ref2)
+{
+  vec<tree, va_stack> component_refs1;
+  vec<tree, va_stack> component_refs2;
+
+  vec_stack_alloc (tree, component_refs1, 16);
+  vec_stack_alloc (tree, component_refs2, 16);
+
+  /* Create the stack of handled components for REF1.  */
+  while (handled_component_p (ref1))
+    {
+      component_refs1.safe_push (ref1);
+      ref1 = TREE_OPERAND (ref1, 0);
+    }
+  if (TREE_CODE (ref1) == MEM_REF)
+    {
+      if (!integer_zerop (TREE_OPERAND (ref1, 1)))
+	goto may_overlap;
+      ref1 = TREE_OPERAND (TREE_OPERAND (ref1, 0), 0);
+    }
+
+  /* Create the stack of handled components for REF2.  */
+  while (handled_component_p (ref2))
+    {
+      component_refs2.safe_push (ref2);
+      ref2 = TREE_OPERAND (ref2, 0);
+    }
+  if (TREE_CODE (ref2) == MEM_REF)
+    {
+      if (!integer_zerop (TREE_OPERAND (ref2, 1)))
+	goto may_overlap;
+      ref2 = TREE_OPERAND (TREE_OPERAND (ref2, 0), 0);
+    }
+
+  /* We must have the same base DECL.  */
+  gcc_assert (ref1 == ref2);
+
+  /* Pop the stacks in parallel and examine the COMPONENT_REFs of the same
+     rank.  This is sufficient because we start from the same DECL and you
+     cannot reference several fields at a time with COMPONENT_REFs (unlike
+     with ARRAY_RANGE_REFs for arrays) so you always need the same number
+     of them to access a sub-component, unless you're in a union, in which
+     case the return value will precisely be false.  */
+  while (true)
+    {
+      do
+	{
+	  if (component_refs1.is_empty ())
+	    goto may_overlap;
+	  ref1 = component_refs1.pop ();
+	}
+      while (!RECORD_OR_UNION_TYPE_P (TREE_TYPE (TREE_OPERAND (ref1, 0))));
+
+      do
+	{
+	  if (component_refs2.is_empty ())
+	     goto may_overlap;
+	  ref2 = component_refs2.pop ();
+	}
+      while (!RECORD_OR_UNION_TYPE_P (TREE_TYPE (TREE_OPERAND (ref2, 0))));
+
+      /* Beware of BIT_FIELD_REF.  */
+      if (TREE_CODE (ref1) != COMPONENT_REF
+	  || TREE_CODE (ref2) != COMPONENT_REF)
+	goto may_overlap;
+
+      tree field1 = TREE_OPERAND (ref1, 1);
+      tree field2 = TREE_OPERAND (ref2, 1);
+
+      /* ??? We cannot simply use the type of operand #0 of the refs here
+	 as the Fortran compiler smuggles type punning into COMPONENT_REFs
+	 for common blocks instead of using unions like everyone else.  */
+      tree type1 = TYPE_MAIN_VARIANT (DECL_CONTEXT (field1));
+      tree type2 = TYPE_MAIN_VARIANT (DECL_CONTEXT (field2));
+
+      if (type1 != type2 || TREE_CODE (type1) != RECORD_TYPE)
+	 goto may_overlap;
+
+      /* Different fields of the same RECORD_TYPE cannot overlap.  */
+      if (field1 != field2)
+	{
+	  component_refs1.release ();
+	  component_refs2.release ();
+	  return true;
+	}
+    }
+
+may_overlap:
+  component_refs1.release ();
+  component_refs2.release ();
+  return false;
+}
+
 /* Return true if two memory references based on the variables BASE1
    and BASE2 constrained to [OFFSET1, OFFSET1 + MAX_SIZE1) and
-   [OFFSET2, OFFSET2 + MAX_SIZE2) may alias.  */
+   [OFFSET2, OFFSET2 + MAX_SIZE2) may alias.  REF1 and REF2
+   if non-NULL are the complete memory reference trees.  */
 
 static bool
-decl_refs_may_alias_p (tree base1,
+decl_refs_may_alias_p (tree ref1, tree base1,
 		       HOST_WIDE_INT offset1, HOST_WIDE_INT max_size1,
-		       tree base2,
+		       tree ref2, tree base2,
 		       HOST_WIDE_INT offset2, HOST_WIDE_INT max_size2)
 {
   gcc_checking_assert (DECL_P (base1) && DECL_P (base2));
@@ -737,7 +835,17 @@ decl_refs_may_alias_p (tree base1,
 
   /* If both references are based on the same variable, they cannot alias if
      the accesses do not overlap.  */
-  return ranges_overlap_p (offset1, max_size1, offset2, max_size2);
+  if (!ranges_overlap_p (offset1, max_size1, offset2, max_size2))
+    return false;
+
+  /* For components with variable position, the above test isn't sufficient,
+     so we disambiguate component references manually.  */
+  if (ref1 && ref2
+      && handled_component_p (ref1) && handled_component_p (ref2)
+      && nonoverlapping_component_refs_of_decl_p (ref1, ref2))
+    return false;
+
+  return true;     
 }
 
 /* Return true if an indirect reference based on *PTR1 constrained
@@ -1086,8 +1194,8 @@ refs_may_alias_p_1 (ao_ref *ref1, ao_ref
   var1_p = DECL_P (base1);
   var2_p = DECL_P (base2);
   if (var1_p && var2_p)
-    return decl_refs_may_alias_p (base1, offset1, max_size1,
-				  base2, offset2, max_size2);
+    return decl_refs_may_alias_p (ref1->ref, base1, offset1, max_size1,
+				  ref2->ref, base2, offset2, max_size2);
 
   ind1_p = (TREE_CODE (base1) == MEM_REF
 	    || TREE_CODE (base1) == TARGET_MEM_REF);
Index: tree-streamer.c
===================================================================
--- tree-streamer.c	(revision 197926)
+++ tree-streamer.c	(working copy)
@@ -267,10 +267,10 @@ record_common_node (struct streamer_tree
       /* The FIELD_DECLs of structures should be shared, so that every
 	 COMPONENT_REF uses the same tree node when referencing a field.
 	 Pointer equality between FIELD_DECLs is used by the alias
-	 machinery to compute overlapping memory references (See
-	 nonoverlapping_component_refs_p).  */
-      tree f;
-      for (f = TYPE_FIELDS (node); f; f = TREE_CHAIN (f))
+	 machinery to compute overlapping component references (see
+	 nonoverlapping_component_refs_p and
+	 nonoverlapping_component_refs_of_decl_p).  */
+      for (tree f = TYPE_FIELDS (node); f; f = TREE_CHAIN (f))
 	record_common_node (cache, f);
     }
 }

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [patch] Fix ICE during RTL expansion at -O1
  2013-04-16 12:31       ` Eric Botcazou
@ 2013-04-16 16:47         ` Richard Biener
  2013-04-17 13:17           ` Eric Botcazou
  0 siblings, 1 reply; 8+ messages in thread
From: Richard Biener @ 2013-04-16 16:47 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: gcc-patches

On Tue, Apr 16, 2013 at 11:55 AM, Eric Botcazou <ebotcazou@adacore.com> wrote:
>> Note that looking at the access path _is_ assuming TBAA constraints as
>> soon as the base objects are not the same (in the above case '*p' and 'a'
>> are not the same and p could alias a in a way that all f1 and f2 overlap).
>
> Right, but here I'm assuming (and asserting) that the base objects are the
> same.
>
>> Index: alias.c
>> ===================================================================
>> --- alias.c     (revision 197926)
>> +++ alias.c     (working copy)
>> @@ -2232,8 +2232,11 @@ nonoverlapping_component_refs_p (const_r
>>
>>      found:
>>        /* If we're left with accessing different fields of a structure, then
>> no -        possible overlap, unless they are both bitfields.  */
>> -      if (TREE_CODE (typex) == RECORD_TYPE && fieldx != fieldy)
>> +        possible overlap, unless they are both bitfields.
>> +        ??? Pointer inequality is too fragile in the LTO compiler.  */
>> +      if (TREE_CODE (typex) == RECORD_TYPE
>> +         && fieldx != fieldy
>> +         && DECL_NAME (fieldx) != DECL_NAME (fieldy))
>>
>> this, if at all, should go in with a separate patch and a testcase.
>> And I think it should _not_ go in.
>
> OK, I can remove the LTO related bits.
>
>> Otoh...
>>
>> +      /* ??? We cannot simply use the type of operand #0 of the refs here
>> +        as the Fortran compiler smuggles type punning into COMPONENT_REFs
>> +        for common blocks instead of using unions like everyone else.  */
>> +      tree type1 = TYPE_MAIN_VARIANT (DECL_CONTEXT (field1));
>> +      tree type2 = TYPE_MAIN_VARIANT (DECL_CONTEXT (field2));
>> +
>> +      if (type1 != type2 || TREE_CODE (type1) != RECORD_TYPE)
>> +        goto may_overlap;
>> +
>> +      /* ??? Pointer inequality is too fragile in the LTO compiler.  */
>> +      if (field1 != field2 && DECL_NAME (field1) != DECL_NAME (field2))
>>
>> this suggests you are seeing multiple FIELD_DECLs for the same field
>> in the _same_ FIELD_DECL chain ...?!  Are you sure this happens with
>> GCC 4.8?  There were some fixes in that area in the LTO type merging code.
>
> No, let's drop the LTO related bits for mainline.  But the Fortran related
> bits are necessary, it's the verification you talked about earlier: for a
> COMPONENT_REF
>
>   TYPE_MAIN_VARIANT (TREE_TYPE (TREE_OPERAND (t, 0)))
>     = TYPE_MAIN_VARIANT (DECL_CONTEXT (TREE_OPERAND (t, 1)))
>
> is expected to be always true (but isn't checked as you pointed out).  But the
> Fortran compiler violates it (4.9, gfortran.dg/aliasing_array_result_1.f90) as
> it implements a common block by defining 2 RECORD_TYPEs with 1 field, one for
> each variables, and does an implicit type conversion of TREE_OPERAND (t, 0).
>
>> Index: tree-streamer.c
>> ===================================================================
>> --- tree-streamer.c     (revision 197926)
>> +++ tree-streamer.c     (working copy)
>> @@ -267,10 +267,9 @@ record_common_node (struct streamer_tree
>>        /* The FIELD_DECLs of structures should be shared, so that every
>>          COMPONENT_REF uses the same tree node when referencing a field.
>>          Pointer equality between FIELD_DECLs is used by the alias
>> -        machinery to compute overlapping memory references (See
>> -        nonoverlapping_component_refs_p).  */
>> -      tree f;
>> -      for (f = TYPE_FIELDS (node); f; f = TREE_CHAIN (f))
>> +        machinery to compute overlapping component references (see
>> +        nonoverlapping_component_refs_of_decl_p).  */
>> +      for (tree f = TYPE_FIELDS (node); f; f = TREE_CHAIN (f))
>>         record_common_node (cache, f);
>>      }
>>  }
>>
>> without actually removing nonoverlapping_component_refs_p it still applies
>> to both...
>
> Will adjust.
>
>> Can you port the non-quadratic algorithm to the RTL
>> nonoverlapping_component_refs_p first?  The core code should exactly
>> be the same ... (and shared, and only called once from the RTL oracle).
>
> No, it cannot be ported, the non-quadratic aspect is a consequence of the same
> base object assumption.  It you don't have it, you need to be quadratic to be
> able to deal with variable offsets in this way.
>
>> I don't understand the "reference several fields" comment.  Because I
>> can clearly have aggregate copies of RECORD_TYPE.  Can you try
>> do elaborate more on why the algorithm should be sufficient to catch
>> all cases?
>
> I can, but you need to keep in mind that the base objects are the same.
>
>> You could enhance it to not require
>>
>> +  /* We must have the same base DECL.  */
>> +  gcc_assert (ref1 == ref2);
>>
>> for MEM_REF bases under the same conditions like aliasing_component_refs_p,
>> that is if the MEM_REF isn't view-converting.
>
> The code already handles MEM_REF<ADDR_EXPR> though (and checks that the offset
> is zero).  I'm not sure that we need more (but ready to be proved wrong here).
>
>> That said, if the bases are the same DECLs then indeed you do not
>> rely on TBAA.  The RTL nonoverlapping_component_refs_p does not
>> disable itself properly for pointer based accesses that might be
>> view-converted / aliased accesses (a simple testcase with ref-all
>> pointers properly offsetted to alias two adjacent fields may be enough to
>> show that).
>>
>> Also with your patch enhanced like I suggest we should be able to
>> remove aliasing_component_refs_p as well, no?
>
> My revised patch is only a safe, non-quadratic, non-TBAA based disambiguation
> for references based on the same DECL, so it's a full replacement neither for
> aliasing_component_refs_p nor for nonoverlapping_component_refs_p.

Hmm, ok.  Sad :/

> Thanks for the review.  Updated patch attached.

+      if (type1 != type2 || TREE_CODE (type1) != RECORD_TYPE)
+        goto may_overlap;

ick, can TREE_CODE (type1) != RECORD_TYPE happen as well here?
Please add a comment similar to the Fortran ??? above.

Can you please also add at least one testcase as gcc.dg/tree-ssa/ssa-fre-??.c
that tests the functionality of this and that wasn't handled before?
I suppose it
would be sth like

struct S { int i; int j; };
struct U
{
  struct S a[10];
} u;

u.a[n].i= i;
u.a[n].j = j;
return u.a[n].i;

where we miss to CSE the load from u.a[n].i.

Otherwise the patch is ok.

Thanks!
Richard.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [patch] Fix ICE during RTL expansion at -O1
  2013-04-16 16:47         ` Richard Biener
@ 2013-04-17 13:17           ` Eric Botcazou
  0 siblings, 0 replies; 8+ messages in thread
From: Eric Botcazou @ 2013-04-17 13:17 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

> +      if (type1 != type2 || TREE_CODE (type1) != RECORD_TYPE)
> +        goto may_overlap;
> 
> ick, can TREE_CODE (type1) != RECORD_TYPE happen as well here?
> Please add a comment similar to the Fortran ??? above.

It can happen because we stop at unions (and qualified unions) and for them we 
cannot disambiguate based on the fields.  I'll add a regular comment.

> Can you please also add at least one testcase as
> gcc.dg/tree-ssa/ssa-fre-??.c that tests the functionality of this and that
> wasn't handled before? I suppose it would be sth like
> 
> struct S { int i; int j; };
> struct U
> {
>   struct S a[10];
> } u;
> 
> u.a[n].i= i;
> u.a[n].j = j;
> return u.a[n].i;
> 
> where we miss to CSE the load from u.a[n].i.

Yes, the patch does eliminate the redundant load in .fre1:

  u.a[n_2(D)].i = i_3(D);
  u.a[n_2(D)].j = j_5(D);
  _7 = u.a[n_2(D)].i;
  return _7;

becomes:

  u.a[n_2(D)].i = i_3(D);
  u.a[n_2(D)].j = j_5(D);
  _7 = i_3(D);
  return _7;

> Otherwise the patch is ok.

Thanks.

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2013-04-17  9:03 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-03-21 16:27 [patch] Fix ICE during RTL expansion at -O1 Eric Botcazou
2013-03-22  9:44 ` Richard Biener
2013-04-14 13:34   ` Eric Botcazou
2013-04-15 12:18     ` Richard Biener
2013-04-15 14:44       ` Richard Biener
2013-04-16 12:31       ` Eric Botcazou
2013-04-16 16:47         ` Richard Biener
2013-04-17 13:17           ` Eric Botcazou

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).