public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] Alias analysis of zero sized arrays
@ 2017-04-26  6:34 Steve Ellcey
  2017-04-26 10:05 ` Richard Biener
  0 siblings, 1 reply; 6+ messages in thread
From: Steve Ellcey @ 2017-04-26  6:34 UTC (permalink / raw)
  To: gcc-patches; +Cc: Richard Biener

[-- Attachment #1: Type: text/plain, Size: 1255 bytes --]

This patch changes get_ref_base_and_extent to treat zero sized arrays
like C99 flexible arrays and assume that references to zero sized
arrays can also be made beyond the 'end' of the array.  C99 flexible
arrays are recognized by not having an INTEGER_CST limit/size and the
routine then sets maxsize to -1 for special handling.  This patch
checks for a value of 0 when we do have an INTEGER_CST limit/size and
handles it like a flexible array.

Tested with a bootstrap on aarch64 and a GCC test run with no
regressions.

I did not create a testcase because the test I have (attached) is not
runnable.  If you compile this test case for aarch64 with
'-UFLEX -O2 -fno-strict-aliasing' you will get two loads, then two
stores in the main loop.  In the original large program that this came
from, that generated bad code.  If compiled with -DFLEX instead of
-UFLEX, you get load, store, load, store and that was working.  With
this patch, both versions (-UFLEX and -DFLEX) generate the load, store,
load, store sequence.

OK for checkin?

Steve Ellcey
sellcey@cavium.com


2017-04-25  Steve Ellcey  <sellcey@cavium.com>

	* tree-dfa.c (get_ref_base_and_extent): Treat zero size array like
	a C99 flexible array.


[-- Attachment #2: alias.patch --]
[-- Type: text/x-patch, Size: 977 bytes --]

diff --git a/gcc/tree-dfa.c b/gcc/tree-dfa.c
index 8ee46dc..79c3489 100644
--- a/gcc/tree-dfa.c
+++ b/gcc/tree-dfa.c
@@ -438,6 +438,10 @@ get_ref_base_and_extent (tree exp, HOST_WIDE_INT *poffset,
       && TREE_CODE (size_tree) == INTEGER_CST)
     bitsize = wi::to_offset (size_tree);
 
+  /* If zero sized array, treat it like a flexible array.  */
+  if (bitsize == 0)
+    bitsize = -1;
+
   /* Initially, maxsize is the same as the accessed element size.
      In the following it will only grow (or become -1).  */
   maxsize = bitsize;
@@ -504,7 +508,13 @@ get_ref_base_and_extent (tree exp, HOST_WIDE_INT *poffset,
 		if (maxsize != -1
 		    && csize
 		    && TREE_CODE (csize) == INTEGER_CST)
-		  maxsize = wi::to_offset (csize) - bit_offset;
+		  {
+		    maxsize = wi::to_offset (csize) - bit_offset;
+
+		    /* If zero sized array, treat it like a flexible array.  */
+		    if (maxsize == 0)
+		      maxsize = -1;
+		  }
 		else
 		  maxsize = -1;
 	      }

[-- Attachment #3: test.c --]
[-- Type: text/x-csrc, Size: 430 bytes --]

struct q {
	int b;
};
struct r {
   int n;
   struct q slot[0];
};
struct s {
   int n;
#ifdef FLEX
 long int o[];
#else
 long int o[0];
#endif
};
extern int x, y, m;
extern struct s *a;
extern struct r *b;
extern void bar();
int foo() {
   int i,j;
   for (i = 0; i < m; i++) {
   	a->o[i] = sizeof(*a);
   	b = ((struct r *)(((char *)a) + a->o[a->n]));
	for (j = 0; j < 10; j++) {
		b->slot[j].b = 0;
   	}
        bar();
  }
}

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Alias analysis of zero sized arrays
  2017-04-26  6:34 [PATCH] Alias analysis of zero sized arrays Steve Ellcey
@ 2017-04-26 10:05 ` Richard Biener
  2017-04-26 10:07   ` Richard Biener
  0 siblings, 1 reply; 6+ messages in thread
From: Richard Biener @ 2017-04-26 10:05 UTC (permalink / raw)
  To: sellcey; +Cc: gcc-patches

On Wed, Apr 26, 2017 at 12:27 AM, Steve Ellcey <sellcey@cavium.com> wrote:
> This patch changes get_ref_base_and_extent to treat zero sized arrays
> like C99 flexible arrays and assume that references to zero sized
> arrays can also be made beyond the 'end' of the array.  C99 flexible
> arrays are recognized by not having an INTEGER_CST limit/size and the
> routine then sets maxsize to -1 for special handling.  This patch
> checks for a value of 0 when we do have an INTEGER_CST limit/size and
> handles it like a flexible array.
>
> Tested with a bootstrap on aarch64 and a GCC test run with no
> regressions.
>
> I did not create a testcase because the test I have (attached) is not
> runnable.  If you compile this test case for aarch64 with
> '-UFLEX -O2 -fno-strict-aliasing' you will get two loads, then two
> stores in the main loop.  In the original large program that this came
> from, that generated bad code.  If compiled with -DFLEX instead of
> -UFLEX, you get load, store, load, store and that was working.  With
> this patch, both versions (-UFLEX and -DFLEX) generate the load, store,
> load, store sequence.
>
> OK for checkin?

Huh.  This doesn't look like the correct fix for anything.

Ah, so the issue is that we prune MEM_EXPR of MEMs off ARRAY_REFs
and end up with a MEM_EXPR of a.0_1->o which has zero size.  Then
we run into

  /* If MEM_OFFSET or MEM_SIZE are unknown what we got from MEM_EXPR
     is conservative, so trust it.  */
  if (!MEM_OFFSET_KNOWN_P (mem)
      || !MEM_SIZE_KNOWN_P (mem))
    return true;

but MEM_SIZE_KNOWN_P is true so we miss to do

  /* Refine size and offset we got from analyzing MEM_EXPR by using
     MEM_SIZE and MEM_OFFSET.  */

  ref->offset += MEM_OFFSET (mem) * BITS_PER_UNIT;
  ref->size = MEM_SIZE (mem) * BITS_PER_UNIT;

thus a nearly minimal fix would be (looks like some other sanity checks
might better be moved before the early "trust it" out):

Index: gcc/alias.c
===================================================================
--- gcc/alias.c (revision 247273)
+++ gcc/alias.c (working copy)
@@ -322,6 +322,13 @@ ao_ref_from_mem (ao_ref *ref, const_rtx

   ref->ref_alias_set = MEM_ALIAS_SET (mem);

+  /* Refine size and offset we got from analyzing MEM_EXPR by using
+     MEM_SIZE and MEM_OFFSET.  */
+  if (MEM_OFFSET_KNOWN_P (mem))
+    ref->offset += MEM_OFFSET (mem) * BITS_PER_UNIT;
+  if (MEM_SIZE_KNOWN_P (mem))
+    ref->size = MEM_SIZE (mem) * BITS_PER_UNIT;
+
   /* If MEM_OFFSET or MEM_SIZE are unknown what we got from MEM_EXPR
      is conservative, so trust it.  */
   if (!MEM_OFFSET_KNOWN_P (mem)
@@ -336,12 +343,6 @@ ao_ref_from_mem (ao_ref *ref, const_rtx
              > ref->max_size)))
     ref->ref = NULL_TREE;

-  /* Refine size and offset we got from analyzing MEM_EXPR by using
-     MEM_SIZE and MEM_OFFSET.  */
-
-  ref->offset += MEM_OFFSET (mem) * BITS_PER_UNIT;
-  ref->size = MEM_SIZE (mem) * BITS_PER_UNIT;
-
   /* The MEM may extend into adjacent fields, so adjust max_size if
      necessary.  */
   if (ref->max_size != -1

note that in the end we might want to stop pruning MEM_EXPR so much...
(set_mem_attributes_minus_bitpos).  At least dropping component/array
refs in this case isn't conservative as the size zero access doesn't cover
the array ref stripped.  So arguably the bug is in
set_mem_attributes_minus_bitpos
itself.  Thus

Index: gcc/emit-rtl.c
===================================================================
--- gcc/emit-rtl.c      (revision 247273)
+++ gcc/emit-rtl.c      (working copy)
@@ -1954,7 +1954,9 @@ set_mem_attributes_minus_bitpos (rtx ref
          while (TREE_CODE (t2) == ARRAY_REF);

          if (DECL_P (t2)
-             || TREE_CODE (t2) == COMPONENT_REF)
+             || (TREE_CODE (t2) == COMPONENT_REF
+                 && (! DECL_SIZE (TREE_OPERAND (t2, 1))
+                     || ! integer_zerop (DECL_SIZE (TREE_OPERAND (t2, 1))))))
            {
              attrs.expr = t2;
              attrs.offset_known_p = false;

is an alternative / additional fix (OTOH for all trailing arrays this
isn't really a
conservative MEM_EXPR, not just for size zero ones).  Thus

Index: gcc/emit-rtl.c
===================================================================
--- gcc/emit-rtl.c      (revision 247273)
+++ gcc/emit-rtl.c      (working copy)
@@ -1954,7 +1954,10 @@ set_mem_attributes_minus_bitpos (rtx ref
          while (TREE_CODE (t2) == ARRAY_REF);

          if (DECL_P (t2)
-             || TREE_CODE (t2) == COMPONENT_REF)
+             || (TREE_CODE (t2) == COMPONENT_REF
+                 /* For trailing arrays t2 doesn't have a size that
+                    covers all valid accesses.  */
+                 && ! array_at_struct_end_p (t, false)))
            {
              attrs.expr = t2;
              attrs.offset_known_p = false;

is probably better.

Richard.

> Steve Ellcey
> sellcey@cavium.com
>
>
> 2017-04-25  Steve Ellcey  <sellcey@cavium.com>
>
>         * tree-dfa.c (get_ref_base_and_extent): Treat zero size array like
>         a C99 flexible array.
>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Alias analysis of zero sized arrays
  2017-04-26 10:05 ` Richard Biener
@ 2017-04-26 10:07   ` Richard Biener
  2017-04-26 19:09     ` Steve Ellcey
  0 siblings, 1 reply; 6+ messages in thread
From: Richard Biener @ 2017-04-26 10:07 UTC (permalink / raw)
  To: sellcey; +Cc: gcc-patches

On Wed, Apr 26, 2017 at 11:27 AM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Wed, Apr 26, 2017 at 12:27 AM, Steve Ellcey <sellcey@cavium.com> wrote:
>> This patch changes get_ref_base_and_extent to treat zero sized arrays
>> like C99 flexible arrays and assume that references to zero sized
>> arrays can also be made beyond the 'end' of the array.  C99 flexible
>> arrays are recognized by not having an INTEGER_CST limit/size and the
>> routine then sets maxsize to -1 for special handling.  This patch
>> checks for a value of 0 when we do have an INTEGER_CST limit/size and
>> handles it like a flexible array.
>>
>> Tested with a bootstrap on aarch64 and a GCC test run with no
>> regressions.
>>
>> I did not create a testcase because the test I have (attached) is not
>> runnable.  If you compile this test case for aarch64 with
>> '-UFLEX -O2 -fno-strict-aliasing' you will get two loads, then two
>> stores in the main loop.  In the original large program that this came
>> from, that generated bad code.  If compiled with -DFLEX instead of
>> -UFLEX, you get load, store, load, store and that was working.  With
>> this patch, both versions (-UFLEX and -DFLEX) generate the load, store,
>> load, store sequence.
>>
>> OK for checkin?
>
> Huh.  This doesn't look like the correct fix for anything.
>
> Ah, so the issue is that we prune MEM_EXPR of MEMs off ARRAY_REFs
> and end up with a MEM_EXPR of a.0_1->o which has zero size.  Then
> we run into
>
>   /* If MEM_OFFSET or MEM_SIZE are unknown what we got from MEM_EXPR
>      is conservative, so trust it.  */
>   if (!MEM_OFFSET_KNOWN_P (mem)
>       || !MEM_SIZE_KNOWN_P (mem))
>     return true;
>
> but MEM_SIZE_KNOWN_P is true so we miss to do
>
>   /* Refine size and offset we got from analyzing MEM_EXPR by using
>      MEM_SIZE and MEM_OFFSET.  */
>
>   ref->offset += MEM_OFFSET (mem) * BITS_PER_UNIT;
>   ref->size = MEM_SIZE (mem) * BITS_PER_UNIT;
>
> thus a nearly minimal fix would be (looks like some other sanity checks
> might better be moved before the early "trust it" out):
>
> Index: gcc/alias.c
> ===================================================================
> --- gcc/alias.c (revision 247273)
> +++ gcc/alias.c (working copy)
> @@ -322,6 +322,13 @@ ao_ref_from_mem (ao_ref *ref, const_rtx
>
>    ref->ref_alias_set = MEM_ALIAS_SET (mem);
>
> +  /* Refine size and offset we got from analyzing MEM_EXPR by using
> +     MEM_SIZE and MEM_OFFSET.  */
> +  if (MEM_OFFSET_KNOWN_P (mem))
> +    ref->offset += MEM_OFFSET (mem) * BITS_PER_UNIT;
> +  if (MEM_SIZE_KNOWN_P (mem))
> +    ref->size = MEM_SIZE (mem) * BITS_PER_UNIT;
> +
>    /* If MEM_OFFSET or MEM_SIZE are unknown what we got from MEM_EXPR
>       is conservative, so trust it.  */
>    if (!MEM_OFFSET_KNOWN_P (mem)
> @@ -336,12 +343,6 @@ ao_ref_from_mem (ao_ref *ref, const_rtx
>               > ref->max_size)))
>      ref->ref = NULL_TREE;
>
> -  /* Refine size and offset we got from analyzing MEM_EXPR by using
> -     MEM_SIZE and MEM_OFFSET.  */
> -
> -  ref->offset += MEM_OFFSET (mem) * BITS_PER_UNIT;
> -  ref->size = MEM_SIZE (mem) * BITS_PER_UNIT;
> -
>    /* The MEM may extend into adjacent fields, so adjust max_size if
>       necessary.  */
>    if (ref->max_size != -1

Actually this change is wrong!  Leaves the one below as the fix.

> note that in the end we might want to stop pruning MEM_EXPR so much...
> (set_mem_attributes_minus_bitpos).  At least dropping component/array
> refs in this case isn't conservative as the size zero access doesn't cover
> the array ref stripped.  So arguably the bug is in
> set_mem_attributes_minus_bitpos
> itself.  Thus
>
> Index: gcc/emit-rtl.c
> ===================================================================
> --- gcc/emit-rtl.c      (revision 247273)
> +++ gcc/emit-rtl.c      (working copy)
> @@ -1954,7 +1954,9 @@ set_mem_attributes_minus_bitpos (rtx ref
>           while (TREE_CODE (t2) == ARRAY_REF);
>
>           if (DECL_P (t2)
> -             || TREE_CODE (t2) == COMPONENT_REF)
> +             || (TREE_CODE (t2) == COMPONENT_REF
> +                 && (! DECL_SIZE (TREE_OPERAND (t2, 1))
> +                     || ! integer_zerop (DECL_SIZE (TREE_OPERAND (t2, 1))))))
>             {
>               attrs.expr = t2;
>               attrs.offset_known_p = false;
>
> is an alternative / additional fix (OTOH for all trailing arrays this
> isn't really a
> conservative MEM_EXPR, not just for size zero ones).  Thus
>
> Index: gcc/emit-rtl.c
> ===================================================================
> --- gcc/emit-rtl.c      (revision 247273)
> +++ gcc/emit-rtl.c      (working copy)
> @@ -1954,7 +1954,10 @@ set_mem_attributes_minus_bitpos (rtx ref
>           while (TREE_CODE (t2) == ARRAY_REF);
>
>           if (DECL_P (t2)
> -             || TREE_CODE (t2) == COMPONENT_REF)
> +             || (TREE_CODE (t2) == COMPONENT_REF
> +                 /* For trailing arrays t2 doesn't have a size that
> +                    covers all valid accesses.  */
> +                 && ! array_at_struct_end_p (t, false)))
>             {
>               attrs.expr = t2;
>               attrs.offset_known_p = false;
>
> is probably better.
>
> Richard.
>
>> Steve Ellcey
>> sellcey@cavium.com
>>
>>
>> 2017-04-25  Steve Ellcey  <sellcey@cavium.com>
>>
>>         * tree-dfa.c (get_ref_base_and_extent): Treat zero size array like
>>         a C99 flexible array.
>>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Alias analysis of zero sized arrays
  2017-04-26 10:07   ` Richard Biener
@ 2017-04-26 19:09     ` Steve Ellcey
  2017-04-26 20:08       ` Richard Biener
  0 siblings, 1 reply; 6+ messages in thread
From: Steve Ellcey @ 2017-04-26 19:09 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

On Wed, 2017-04-26 at 11:33 +0200, Richard Biener wrote:
> > 
> > Index: gcc/emit-rtl.c
> > ===================================================================
> > --- gcc/emit-rtl.c      (revision 247273)
> > +++ gcc/emit-rtl.c      (working copy)
> > @@ -1954,7 +1954,10 @@ set_mem_attributes_minus_bitpos (rtx ref
> >           while (TREE_CODE (t2) == ARRAY_REF);
> > 
> >           if (DECL_P (t2)
> > -             || TREE_CODE (t2) == COMPONENT_REF)
> > +             || (TREE_CODE (t2) == COMPONENT_REF
> > +                 /* For trailing arrays t2 doesn't have a size that
> > +                    covers all valid accesses.  */
> > +                 && ! array_at_struct_end_p (t, false)))
> >             {
> >               attrs.expr = t2;
> >               attrs.offset_known_p = false;
> > 
> > is probably better.
> > 
> > Richard.

I tested this patch and it fixed my problem.  It does seem better than
my patch because now the [0] and [] versions of my test generate
exactly the same code.  With my patch they still generated different
code though both versions worked OK.

I did a full bootstrap and gcc testsuite run on aarch64 with no
regressions.  Will you check this in or do you want me to do a
submission and checkin?

Steve Ellcey
sellcey@cavium.com

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Alias analysis of zero sized arrays
  2017-04-26 19:09     ` Steve Ellcey
@ 2017-04-26 20:08       ` Richard Biener
  2017-04-26 21:59         ` Steve Ellcey
  0 siblings, 1 reply; 6+ messages in thread
From: Richard Biener @ 2017-04-26 20:08 UTC (permalink / raw)
  To: sellcey, Steve Ellcey; +Cc: gcc-patches

On April 26, 2017 8:03:44 PM GMT+02:00, Steve Ellcey <sellcey@cavium.com> wrote:
>On Wed, 2017-04-26 at 11:33 +0200, Richard Biener wrote:
>> > 
>> > Index: gcc/emit-rtl.c
>> > ===================================================================
>> > --- gcc/emit-rtl.c      (revision 247273)
>> > +++ gcc/emit-rtl.c      (working copy)
>> > @@ -1954,7 +1954,10 @@ set_mem_attributes_minus_bitpos (rtx ref
>> >           while (TREE_CODE (t2) == ARRAY_REF);
>> > 
>> >           if (DECL_P (t2)
>> > -             || TREE_CODE (t2) == COMPONENT_REF)
>> > +             || (TREE_CODE (t2) == COMPONENT_REF
>> > +                 /* For trailing arrays t2 doesn't have a size
>that
>> > +                    covers all valid accesses.  */
>> > +                 && ! array_at_struct_end_p (t, false)))
>> >             {
>> >               attrs.expr = t2;
>> >               attrs.offset_known_p = false;
>> > 
>> > is probably better.
>> > 
>> > Richard.
>
>I tested this patch and it fixed my problem.  It does seem better than
>my patch because now the [0] and [] versions of my test generate
>exactly the same code.  With my patch they still generated different
>code though both versions worked OK.
>
>I did a full bootstrap and gcc testsuite run on aarch64 with no
>regressions.  Will you check this in or do you want me to do a
>submission and checkin?

I'll throw it on x86 testing and will commit it.  What branches are affected?  Can you open a bugreport?

I'll try to add a runtime testcase as well if you don't beat me to that.  On x86 it likely requires -fschedule-insns to trigger.

Richard.

>Steve Ellcey
>sellcey@cavium.com

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Alias analysis of zero sized arrays
  2017-04-26 20:08       ` Richard Biener
@ 2017-04-26 21:59         ` Steve Ellcey
  0 siblings, 0 replies; 6+ messages in thread
From: Steve Ellcey @ 2017-04-26 21:59 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

On Wed, 2017-04-26 at 20:52 +0200, Richard Biener wrote:
> 
> I'll throw it on x86 testing and will commit it.  What branches are
> affected?  Can you open a bugreport?
> 
> I'll try to add a runtime testcase as well if you don't beat me to
> that.  On x86 it likely requires -fschedule-insns to trigger.
> 
> Richard.

I created bug 80533.  The problem was found on the gcc-5 branch and is
also on the gcc-6 and gcc-7 branches.

Steve Ellcey

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2017-04-26 19:09 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-04-26  6:34 [PATCH] Alias analysis of zero sized arrays Steve Ellcey
2017-04-26 10:05 ` Richard Biener
2017-04-26 10:07   ` Richard Biener
2017-04-26 19:09     ` Steve Ellcey
2017-04-26 20:08       ` Richard Biener
2017-04-26 21:59         ` Steve Ellcey

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).