public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [patch] Fix PR tree-optimization/70884
@ 2016-05-07 21:22 Eric Botcazou
  2016-05-09 10:02 ` Richard Biener
  0 siblings, 1 reply; 7+ messages in thread
From: Eric Botcazou @ 2016-05-07 21:22 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 656 bytes --]

Hi,

this is a tentative fix for the regression introduced in SRA by the machinery 
which deals with the constant pool.  initialize_constant_pool_replacements is 
supposed to issues new loads from the pool for scalarized variables, but it 
fails to do so for variables that are only partially scalarized.

Tested on PowerPC/Linux and x86-64/Linux, OK for mainline and 6 branch?


2016-05-07  Eric Botcazou  <ebotcazou@adacore.com>

	PR tree-optimization/70884
	* tree-sra.c (initialize_constant_pool_replacements): Process all the
	candidate variables.


2016-05-07  Eric Botcazou  <ebotcazou@adacore.com>

	* gcc.dg/pr70884.c: New test.

-- 
Eric Botcazou

[-- Attachment #2: pr70884.diff --]
[-- Type: text/x-patch, Size: 463 bytes --]

Index: tree-sra.c
===================================================================
--- tree-sra.c	(revision 235544)
+++ tree-sra.c	(working copy)
@@ -3559,8 +3559,6 @@ initialize_constant_pool_replacements (v
   unsigned i;
 
   EXECUTE_IF_SET_IN_BITMAP (candidate_bitmap, 0, i, bi)
-    if (bitmap_bit_p (should_scalarize_away_bitmap, i)
-	&& !bitmap_bit_p (cannot_scalarize_away_bitmap, i))
       {
 	tree var = candidate (i);
 	if (!constant_decl_p (var))

[-- Attachment #3: pr70884.c --]
[-- Type: text/x-csrc, Size: 469 bytes --]

/* { dg-do run } */
/* { dg-options "-O -fno-tree-fre" } */

extern void abort (void);

typedef __UINTPTR_TYPE__ uintptr_t;

struct S { uintptr_t a; int i; };

static void __attribute__((noclone, noinline)) foo (struct S s)
{
  if (!s.a)
    abort ();
}

int i;

static void f1 (void)
{
  struct S s1 = { (uintptr_t) &i, 1 };
  foo (s1);
}

static void f2 (void)
{
  struct S s2 = { (uintptr_t) &i, 1 };
  foo (s2);
}

int main (void)
{
  f1 ();
  f2 ();
  return 0;
}

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [patch] Fix PR tree-optimization/70884
  2016-05-07 21:22 [patch] Fix PR tree-optimization/70884 Eric Botcazou
@ 2016-05-09 10:02 ` Richard Biener
  2016-05-13 11:01   ` Eric Botcazou
  0 siblings, 1 reply; 7+ messages in thread
From: Richard Biener @ 2016-05-09 10:02 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: GCC Patches, alan.lawrence

On Sat, May 7, 2016 at 11:22 PM, Eric Botcazou <ebotcazou@adacore.com> wrote:
> Hi,
>
> this is a tentative fix for the regression introduced in SRA by the machinery
> which deals with the constant pool.  initialize_constant_pool_replacements is
> supposed to issues new loads from the pool for scalarized variables, but it
> fails to do so for variables that are only partially scalarized.
>
> Tested on PowerPC/Linux and x86-64/Linux, OK for mainline and 6 branch?

Hmm, the patch looks obvious if it was the intent to allow constant
pool replacements
_not_ only when the whole constant pool entry may go away.  But I
think the intent was
to not do this otherwise it will generate worse code by forcing all
loads from the constant pool to appear at
function start.

So - the "real" issue may be a missing
should_scalarize_away_bitmap/cannot_scalarize_away_bitmap
check somewhere.

Alan?

Thanks,
Richrd.

>
> 2016-05-07  Eric Botcazou  <ebotcazou@adacore.com>
>
>         PR tree-optimization/70884
>         * tree-sra.c (initialize_constant_pool_replacements): Process all the
>         candidate variables.
>
>
> 2016-05-07  Eric Botcazou  <ebotcazou@adacore.com>
>
>         * gcc.dg/pr70884.c: New test.
>
> --
> Eric Botcazou

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [patch] Fix PR tree-optimization/70884
  2016-05-09 10:02 ` Richard Biener
@ 2016-05-13 11:01   ` Eric Botcazou
  2016-05-13 11:52     ` Richard Biener
  2016-05-13 15:24     ` Martin Jambor
  0 siblings, 2 replies; 7+ messages in thread
From: Eric Botcazou @ 2016-05-13 11:01 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches, alan.lawrence, Martin Jambor

> Hmm, the patch looks obvious if it was the intent to allow constant
> pool replacements
> _not_ only when the whole constant pool entry may go away.  But I
> think the intent was
> to not do this otherwise it will generate worse code by forcing all
> loads from the constant pool to appear at
> function start.

Do you mean when the whole constant pool entry is scalarized as opposed to 
partially scalarized?

> So - the "real" issue may be a missing
> should_scalarize_away_bitmap/cannot_scalarize_away_bitmap
> check somewhere.

This seems to work:

Index: tree-sra.c
===================================================================
--- tree-sra.c  (revision 236195)
+++ tree-sra.c  (working copy)
@@ -2680,6 +2680,10 @@ analyze_all_variable_accesses (void)
   EXECUTE_IF_SET_IN_BITMAP (tmp, 0, i, bi)
     {
       tree var = candidate (i);
+      if (constant_decl_p (var)
+         && (!bitmap_bit_p (should_scalarize_away_bitmap, i)
+             || bitmap_bit_p (cannot_scalarize_away_bitmap, i)))
+       continue;
       struct access *access;
 
       access = sort_and_splice_var_accesses (var);

but I have no idea whether this is correct or not.


Martin, are we sure to disable scalarization of constant_decl_p variables not 
covered by initialize_constant_pool_replacements that way?

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [patch] Fix PR tree-optimization/70884
  2016-05-13 11:01   ` Eric Botcazou
@ 2016-05-13 11:52     ` Richard Biener
  2016-05-13 15:24     ` Martin Jambor
  1 sibling, 0 replies; 7+ messages in thread
From: Richard Biener @ 2016-05-13 11:52 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: GCC Patches, alan.lawrence, Martin Jambor

On Fri, May 13, 2016 at 1:01 PM, Eric Botcazou <ebotcazou@adacore.com> wrote:
>> Hmm, the patch looks obvious if it was the intent to allow constant
>> pool replacements
>> _not_ only when the whole constant pool entry may go away.  But I
>> think the intent was
>> to not do this otherwise it will generate worse code by forcing all
>> loads from the constant pool to appear at
>> function start.
>
> Do you mean when the whole constant pool entry is scalarized as opposed to
> partially scalarized?

When we scalarized all constant pool accesses (even if it is not fully
accessed).
The whole point was to be able to remove the constant pool entry later.

At least if I remember correctly ... (it should in the end do what we now do
at gimplification time but with better analysis on the cost/benefit).

>> So - the "real" issue may be a missing
>> should_scalarize_away_bitmap/cannot_scalarize_away_bitmap
>> check somewhere.
>
> This seems to work:
>
> Index: tree-sra.c
> ===================================================================
> --- tree-sra.c  (revision 236195)
> +++ tree-sra.c  (working copy)
> @@ -2680,6 +2680,10 @@ analyze_all_variable_accesses (void)
>    EXECUTE_IF_SET_IN_BITMAP (tmp, 0, i, bi)
>      {
>        tree var = candidate (i);
> +      if (constant_decl_p (var)
> +         && (!bitmap_bit_p (should_scalarize_away_bitmap, i)
> +             || bitmap_bit_p (cannot_scalarize_away_bitmap, i)))
> +       continue;
>        struct access *access;
>
>        access = sort_and_splice_var_accesses (var);
>
> but I have no idea whether this is correct or not.
>
>
> Martin, are we sure to disable scalarization of constant_decl_p variables not
> covered by initialize_constant_pool_replacements that way?

Does the above "work"?  Aka, not cause testsuite regressions?  I remember
the original patch was mostly for ARM (where we don't scalarize sth at
gimplification time
for some "cost" reason).

Thanks,
Richard.

> --
> Eric Botcazou

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [patch] Fix PR tree-optimization/70884
  2016-05-13 11:01   ` Eric Botcazou
  2016-05-13 11:52     ` Richard Biener
@ 2016-05-13 15:24     ` Martin Jambor
  2016-05-20  7:34       ` Eric Botcazou
  1 sibling, 1 reply; 7+ messages in thread
From: Martin Jambor @ 2016-05-13 15:24 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: Richard Biener, gcc-patches, alan.lawrence

Hi,

On Fri, May 13, 2016 at 01:01:50PM +0200, Eric Botcazou wrote:
> > Hmm, the patch looks obvious if it was the intent to allow constant
> > pool replacements
> > _not_ only when the whole constant pool entry may go away.  But I
> > think the intent was
> > to not do this otherwise it will generate worse code by forcing all
> > loads from the constant pool to appear at
> > function start.
> 
> Do you mean when the whole constant pool entry is scalarized as opposed to 
> partially scalarized?
> 
> > So - the "real" issue may be a missing
> > should_scalarize_away_bitmap/cannot_scalarize_away_bitmap
> > check somewhere.
> 
> This seems to work:
> 
> Index: tree-sra.c
> ===================================================================
> --- tree-sra.c  (revision 236195)
> +++ tree-sra.c  (working copy)
> @@ -2680,6 +2680,10 @@ analyze_all_variable_accesses (void)
>    EXECUTE_IF_SET_IN_BITMAP (tmp, 0, i, bi)
>      {
>        tree var = candidate (i);
> +      if (constant_decl_p (var)
> +         && (!bitmap_bit_p (should_scalarize_away_bitmap, i)
> +             || bitmap_bit_p (cannot_scalarize_away_bitmap, i)))
> +       continue;
>        struct access *access;
>  
>        access = sort_and_splice_var_accesses (var);
> 
> but I have no idea whether this is correct or not.

This would skip creation of access trees for the variables without
disqualifying them and while it may "work," it is certainly ugly,
causing lookups to traverse uninitialized structures.

> 
> Martin, are we sure to disable scalarization of constant_decl_p variables not 
> covered by initialize_constant_pool_replacements that way?

Overall, I think that removal of bitmap tests in
initialize_constant_pool_replacements is certainly correct, if SRA
decides to create replacements for a constant pool decl, it has to
initialize them.  Whether we want to perform such scalarization is
another matter.

SRA creates replacement for all parts of an aggregate that is loaded
as a scalar multiple times.  If moving (multiple) scalar loads from
the constant pool to the beginning of the function is a bad idea, then
I would propose the fix below which disables that heuristics for them.
Effectively, the patch prevents late-SRA from doing anything for both
testcases (PR 70884 and PR 70919).  I have started a bootstrap and
testing on x86_64 and i686 only a few moments ago but it would be
great if someone also tried on an architecture for which the
constant-pool SRA enhancement was intended, just to be sure.

Thanks,

Martin


2016-05-13  Martin Jambor  <mjambor@suse.cz>

	PR tree-optimization/70884
	* tree-sra.c (initialize_constant_pool_replacements): Do not check
	should_scalarize_away_bitmap and cannot_scalarize_away_bitmap bits.
	(sort_and_splice_var_accesses): Do not consider multiple scalar reads
	of constant pool data as a reason for scalarization.

testsuite/
	* gcc.dg/tree-ssa/pr70919.c: New test.
---
 gcc/testsuite/gcc.dg/tree-ssa/pr70919.c | 46 ++++++++++++++++++++++++++++
 gcc/tree-sra.c                          | 54 ++++++++++++++++-----------------
 2 files changed, 73 insertions(+), 27 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr70919.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr70919.c b/gcc/testsuite/gcc.dg/tree-ssa/pr70919.c
new file mode 100644
index 0000000..bed0ab3
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr70919.c
@@ -0,0 +1,46 @@
+/* { dg-do run } */
+/* { dg-options "-O" } */
+
+#pragma pack(1)
+struct S0
+{
+  int f0:24;
+};
+
+struct S1
+{
+  int f1;
+} a;
+
+int b, c;
+
+char
+fn1 (struct S1 p1)
+{
+  return 0;
+}
+
+int
+main ()
+{
+  c = fn1 (a);
+  if (b)
+    {
+      struct S0 f[3][9] =
+	{ { { 0 }, { 0 }, { 1 }, { 1 }, { 0 }, { 0 }, { 0 }, { 1 }, { 1 } },
+	  { { 0 }, { 0 }, { 1 }, { 1 }, { 0 }, { 0 }, { 0 }, { 1 }, { 1 } },
+	  { { 0 }, { 0 }, { 1 }, { 1 }, { 0 }, { 0 }, { 0 }, { 1 }, { 1 } }
+	};
+      b = f[1][8].f0;
+    }
+  struct S0 g[3][9] =
+	{ { { 0 }, { 0 }, { 1 }, { 1 }, { 0 }, { 0 }, { 0 }, { 1 }, { 1 } },
+	  { { 0 }, { 0 }, { 1 }, { 1 }, { 0 }, { 0 }, { 0 }, { 1 }, { 1 } },
+	  { { 0 }, { 0 }, { 1 }, { 1 }, { 0 }, { 0 }, { 0 }, { 1 }, { 1 } }
+	};
+
+  if (g[1][8].f0 != 1)
+    __builtin_abort ();
+
+  return 0;
+}
diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
index 936d3a6..7c0e90d 100644
--- a/gcc/tree-sra.c
+++ b/gcc/tree-sra.c
@@ -2074,7 +2074,8 @@ sort_and_splice_var_accesses (tree var)
       access->grp_scalar_write = grp_scalar_write;
       access->grp_assignment_read = grp_assignment_read;
       access->grp_assignment_write = grp_assignment_write;
-      access->grp_hint = multiple_scalar_reads || total_scalarization;
+      access->grp_hint = total_scalarization
+	|| (multiple_scalar_reads && !constant_decl_p (var));
       access->grp_total_scalarization = total_scalarization;
       access->grp_partial_lhs = grp_partial_lhs;
       access->grp_unscalarizable_region = unscalarizable_region;
@@ -3559,32 +3560,31 @@ initialize_constant_pool_replacements (void)
   unsigned i;
 
   EXECUTE_IF_SET_IN_BITMAP (candidate_bitmap, 0, i, bi)
-    if (bitmap_bit_p (should_scalarize_away_bitmap, i)
-	&& !bitmap_bit_p (cannot_scalarize_away_bitmap, i))
-      {
-	tree var = candidate (i);
-	if (!constant_decl_p (var))
-	  continue;
-	vec<access_p> *access_vec = get_base_access_vector (var);
-	if (!access_vec)
-	  continue;
-	for (unsigned i = 0; i < access_vec->length (); i++)
-	  {
-	    struct access *access = (*access_vec)[i];
-	    if (!access->replacement_decl)
-	      continue;
-	    gassign *stmt = gimple_build_assign (
-	      get_access_replacement (access), unshare_expr (access->expr));
-	    if (dump_file && (dump_flags & TDF_DETAILS))
-	      {
-		fprintf (dump_file, "Generating constant initializer: ");
-		print_gimple_stmt (dump_file, stmt, 0, 1);
-		fprintf (dump_file, "\n");
-	      }
-	    gsi_insert_after (&gsi, stmt, GSI_NEW_STMT);
-	    update_stmt (stmt);
-	  }
-      }
+    {
+      tree var = candidate (i);
+      if (!constant_decl_p (var))
+	continue;
+      vec<access_p> *access_vec = get_base_access_vector (var);
+      if (!access_vec)
+	continue;
+      for (unsigned i = 0; i < access_vec->length (); i++)
+	{
+	  struct access *access = (*access_vec)[i];
+	  if (!access->replacement_decl)
+	    continue;
+	  gassign *stmt
+	    = gimple_build_assign (get_access_replacement (access),
+				   unshare_expr (access->expr));
+	  if (dump_file && (dump_flags & TDF_DETAILS))
+	    {
+	      fprintf (dump_file, "Generating constant initializer: ");
+	      print_gimple_stmt (dump_file, stmt, 0, 1);
+	      fprintf (dump_file, "\n");
+	    }
+	  gsi_insert_after (&gsi, stmt, GSI_NEW_STMT);
+	  update_stmt (stmt);
+	}
+    }
 
   seq = gsi_seq (gsi);
   if (seq)
-- 
2.8.2

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [patch] Fix PR tree-optimization/70884
  2016-05-13 15:24     ` Martin Jambor
@ 2016-05-20  7:34       ` Eric Botcazou
  2016-05-20  9:32         ` Richard Biener
  0 siblings, 1 reply; 7+ messages in thread
From: Eric Botcazou @ 2016-05-20  7:34 UTC (permalink / raw)
  To: Martin Jambor; +Cc: gcc-patches, Richard Biener, alan.lawrence

> Effectively, the patch prevents late-SRA from doing anything for both
> testcases (PR 70884 and PR 70919).  I have started a bootstrap and
> testing on x86_64 and i686 only a few moments ago but it would be
> great if someone also tried on an architecture for which the
> constant-pool SRA enhancement was intended, just to be sure.

Can you apply it now?  It's a wrong-code regression on the 6 branch and people 
can still chime it later in any case.

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [patch] Fix PR tree-optimization/70884
  2016-05-20  7:34       ` Eric Botcazou
@ 2016-05-20  9:32         ` Richard Biener
  0 siblings, 0 replies; 7+ messages in thread
From: Richard Biener @ 2016-05-20  9:32 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: Martin Jambor, GCC Patches, alan.lawrence

On Fri, May 20, 2016 at 9:34 AM, Eric Botcazou <ebotcazou@adacore.com> wrote:
>> Effectively, the patch prevents late-SRA from doing anything for both
>> testcases (PR 70884 and PR 70919).  I have started a bootstrap and
>> testing on x86_64 and i686 only a few moments ago but it would be
>> great if someone also tried on an architecture for which the
>> constant-pool SRA enhancement was intended, just to be sure.
>
> Can you apply it now?  It's a wrong-code regression on the 6 branch and people
> can still chime it later in any case.

The patch is ok if it passed bootstrap/regtest.  I believe at least on
ARM we had
some tree-ssa.exp testcase(s) that were no longer XFAILing with the
added support.

Thanks,
Richard.

> --
> Eric Botcazou

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2016-05-20  9:32 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-05-07 21:22 [patch] Fix PR tree-optimization/70884 Eric Botcazou
2016-05-09 10:02 ` Richard Biener
2016-05-13 11:01   ` Eric Botcazou
2016-05-13 11:52     ` Richard Biener
2016-05-13 15:24     ` Martin Jambor
2016-05-20  7:34       ` Eric Botcazou
2016-05-20  9:32         ` Richard Biener

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).