public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
@ 2012-10-12 20:14 Jan Hubicka
  2012-10-12 20:36 ` Markus Trippelsdorf
                   ` (2 more replies)
  0 siblings, 3 replies; 68+ messages in thread
From: Jan Hubicka @ 2012-10-12 20:14 UTC (permalink / raw)
  To: gcc-patches

Hi,
I finally tracked down twolf misoptimization triggered by my loop-unroll.c
changes.  It has turned out to be semi-latent wrong code issue in webizer.
What happens is:

1) gcse.c drop REG_EQUAL note on the induction variable
2) loop optimizer unrolls the loop enabling webizer to cleanup
3) webizer do not track reaching refs into REG_EQUAL note because
   the variable is dead and renames it to unused pseudo
   Program is stil valid but the REG_EQUAL is bogus.
4) CSE eventually takes the value and put it back into the code
5) init-regs initializes it to 0

and result is a segfault on the following testcase.
Fixed by making webizer to not prune dead regs.
Will commit it after testing on x86_64-linux.

Honza

/* { dg-do run } */
/* { dg-options "-O3 -funroll-loops" } */
typedef struct rowbox {
    int startx ;
    int endx ;
    int endx1 ;
    int startx2 ;
    int ypos ;
    int desiredL ;
} ROWBOX ;
ROWBOX rowArray1[2] ;
ROWBOX *rowArray = rowArray1;

int numRows = 2;

int row = 1;
int block = 0;
double ckt_size_factor ;

__attribute__ ((noinline))
configure2()
{
  block = 0 ;
  for( row = 1 ; row <= numRows ; row++ ) {
      block++ ;
    if( rowArray[row].endx1 > 0 ) {
      block++ ;
    }
  }
}

main()
{
  configure2();
}

	* web.c (web_main): Do not set DF_RD_PRUNE_DEAD_DEFS flag.
Index: web.c
===================================================================
--- web.c	(revision 192369)
+++ web.c	(working copy)
@@ -313,7 +313,8 @@ web_main (void)
   rtx insn;
 
   df_set_flags (DF_NO_HARD_REGS + DF_EQ_NOTES);
-  df_set_flags (DF_RD_PRUNE_DEAD_DEFS);
+  /* We can not RD_PRUNE_DEAD_DEFS, because we care about REG_EQUAL
+     notes.  */
   df_chain_add_problem (DF_UD_CHAIN);
   df_analyze ();
   df_set_flags (DF_DEFER_INSN_RESCAN);

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-10-12 20:14 Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug) Jan Hubicka
@ 2012-10-12 20:36 ` Markus Trippelsdorf
  2012-10-12 20:44 ` Steven Bosscher
  2012-10-12 21:05 ` Steven Bosscher
  2 siblings, 0 replies; 68+ messages in thread
From: Markus Trippelsdorf @ 2012-10-12 20:36 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: gcc-patches

On 2012.10.12 at 21:43 +0200, Jan Hubicka wrote:
> I finally tracked down twolf misoptimization triggered by my loop-unroll.c
> changes.  It has turned out to be semi-latent wrong code issue in webizer.
> What happens is:
> 
> 1) gcse.c drop REG_EQUAL note on the induction variable
> 2) loop optimizer unrolls the loop enabling webizer to cleanup
> 3) webizer do not track reaching refs into REG_EQUAL note because
>    the variable is dead and renames it to unused pseudo
>    Program is stil valid but the REG_EQUAL is bogus.
> 4) CSE eventually takes the value and put it back into the code
> 5) init-regs initializes it to 0
> 
> and result is a segfault on the following testcase.
> Fixed by making webizer to not prune dead regs.
> Will commit it after testing on x86_64-linux.

FYI this also fixes the lto/profiledbootstrap breakage, PR 54885.

Thanks.
-- 
Markus

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-10-12 20:14 Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug) Jan Hubicka
  2012-10-12 20:36 ` Markus Trippelsdorf
@ 2012-10-12 20:44 ` Steven Bosscher
  2012-10-12 21:16   ` Jan Hubicka
  2012-10-12 21:05 ` Steven Bosscher
  2 siblings, 1 reply; 68+ messages in thread
From: Steven Bosscher @ 2012-10-12 20:44 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: gcc-patches

On Fri, Oct 12, 2012 at 9:43 PM, Jan Hubicka <hubicka@ucw.cz> wrote:
> Hi,
> I finally tracked down twolf misoptimization triggered by my loop-unroll.c
> changes.  It has turned out to be semi-latent wrong code issue in webizer.
> What happens is:
>
> 1) gcse.c drop REG_EQUAL note on the induction variable
> 2) loop optimizer unrolls the loop enabling webizer to cleanup
> 3) webizer do not track reaching refs into REG_EQUAL note because
>    the variable is dead and renames it to unused pseudo
>    Program is stil valid but the REG_EQUAL is bogus.

What do you mean, "not track"? web.c does track EQ notes via DF this:

web.c:315:  df_set_flags (DF_NO_HARD_REGS + DF_EQ_NOTES);



> 4) CSE eventually takes the value and put it back into the code
> 5) init-regs initializes it to 0
>
> and result is a segfault on the following testcase.
> Fixed by making webizer to not prune dead regs.
> Will commit it after testing on x86_64-linux.
>
> Honza
>
> /* { dg-do run } */
> /* { dg-options "-O3 -funroll-loops" } */
> typedef struct rowbox {
>     int startx ;
>     int endx ;
>     int endx1 ;
>     int startx2 ;
>     int ypos ;
>     int desiredL ;
> } ROWBOX ;
> ROWBOX rowArray1[2] ;
> ROWBOX *rowArray = rowArray1;
>
> int numRows = 2;
>
> int row = 1;
> int block = 0;
> double ckt_size_factor ;
>
> __attribute__ ((noinline))
> configure2()
> {
>   block = 0 ;
>   for( row = 1 ; row <= numRows ; row++ ) {
>       block++ ;
>     if( rowArray[row].endx1 > 0 ) {
>       block++ ;
>     }
>   }
> }
>
> main()
> {
>   configure2();
> }
>
>         * web.c (web_main): Do not set DF_RD_PRUNE_DEAD_DEFS flag.
> Index: web.c
> ===================================================================
> --- web.c       (revision 192369)
> +++ web.c       (working copy)
> @@ -313,7 +313,8 @@ web_main (void)
>    rtx insn;
>
>    df_set_flags (DF_NO_HARD_REGS + DF_EQ_NOTES);
> -  df_set_flags (DF_RD_PRUNE_DEAD_DEFS);
> +  /* We can not RD_PRUNE_DEAD_DEFS, because we care about REG_EQUAL
> +     notes.  */
>    df_chain_add_problem (DF_UD_CHAIN);
>    df_analyze ();
>    df_set_flags (DF_DEFER_INSN_RESCAN);

This is bogus. You're papering over the underlying bug of an invalid
REG_EQUAL note. This patch is not OK!

I did not add DF_RD_PRUNE_DEAD_DEFS for no reason. You're
re-introducing a major compile time hog. Did you do timings on some
significant body of code with -fweb enabled?

Ciao!
Steven

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-10-12 20:14 Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug) Jan Hubicka
  2012-10-12 20:36 ` Markus Trippelsdorf
  2012-10-12 20:44 ` Steven Bosscher
@ 2012-10-12 21:05 ` Steven Bosscher
  2 siblings, 0 replies; 68+ messages in thread
From: Steven Bosscher @ 2012-10-12 21:05 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: gcc-patches

Hi,

On your test case, I have this:

Web oldreg=72 newreg=111
Web oldreg=72 newreg=112
Web oldreg=72 newreg=139
Web oldreg=72 newreg=145
Web oldreg=72 newreg=151
Web oldreg=72 newreg=157
Web oldreg=72 newreg=163
Web oldreg=72 newreg=169

--- t.c.182r.loop2_done  2012-10-12 22:23:59.000000000 +0200
+++ t.c.183r.web 2012-10-12 22:24:23.000000000 +0200

-   79 r72:DI=r102:DI
-      REG_EQUAL: r72:DI+0x18
+   79 r112:DI=r102:DI
+      REG_EQUAL: r111:DI+0x18

   102 r72:DI=r102:DI
-      REG_EQUAL: r72:DI+0x18
+      REG_EQUAL: r111:DI+0x18

   124 r72:DI=r102:DI
-      REG_EQUAL: r72:DI+0x18
+      REG_EQUAL: r111:DI+0x18

   150 r72:DI=r102:DI
-      REG_EQUAL: r72:DI+0x18
+      REG_EQUAL: r111:DI+0x18

   176 r72:DI=r102:DI
-      REG_EQUAL: r72:DI+0x18
+      REG_EQUAL: r111:DI+0x18

   202 r72:DI=r102:DI
-      REG_EQUAL: r72:DI+0x18
+      REG_EQUAL: r111:DI+0x18

   228 r72:DI=r102:DI
-      REG_EQUAL: r72:DI+0x18
+      REG_EQUAL: r111:DI+0x18

   254 r72:DI=r102:DI
-      REG_EQUAL: r72:DI+0x18
+      REG_EQUAL: r111:DI+0x18

-  285 r72:DI=r102:DI
-      REG_EQUAL: r72:DI+0x18
+  285 r139:DI=r114:DI
+      REG_EQUAL: r111:DI+0x18

-  306 r72:DI=r102:DI
-      REG_EQUAL: r72:DI+0x18
+  306 r145:DI=r141:DI
+      REG_EQUAL: r111:DI+0x18

-  327 r72:DI=r102:DI
-      REG_EQUAL: r72:DI+0x18
+  327 r151:DI=r147:DI
+      REG_EQUAL: r111:DI+0x18

-  348 r72:DI=r102:DI
-      REG_EQUAL: r72:DI+0x18
+  348 r157:DI=r153:DI
+      REG_EQUAL: r111:DI+0x18

-  369 r72:DI=r102:DI
-      REG_EQUAL: r72:DI+0x18
+  369 r163:DI=r159:DI
+      REG_EQUAL: r111:DI+0x18

-  390 r72:DI=r102:DI
-      REG_EQUAL: r72:DI+0x18
+  390 r169:DI=r165:DI
+      REG_EQUAL: r111:DI+0x18

-  411 r72:DI=r102:DI
-      REG_EQUAL: r72:DI+0x18
+  411 r72:DI=r171:DI
+      REG_EQUAL: r111:DI+0x18


So all appearances of r72 in REG_EQUAL notes have been replaced with
r111. That means the constructed webs are wrong. If the REG_EQUAL is
for the DEF operand of the insn (which must be a single_set insn, or
there wouldn't be a note) then the DEF and the EQ_USE should be in the
same web.

I don't think this has anything to do with DF_RD_PRUNE_DEAD_DEFS. This
is a pre-existing bug in web.c that just happens to be exposed by loop
unrolling.

Please do not apply the patch, it only papers over another problem in web.c.

Ciao!
Steven

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-10-12 20:44 ` Steven Bosscher
@ 2012-10-12 21:16   ` Jan Hubicka
  2012-10-12 21:19     ` Steven Bosscher
  0 siblings, 1 reply; 68+ messages in thread
From: Jan Hubicka @ 2012-10-12 21:16 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: Jan Hubicka, gcc-patches

> On Fri, Oct 12, 2012 at 9:43 PM, Jan Hubicka <hubicka@ucw.cz> wrote:
> > Hi,
> > I finally tracked down twolf misoptimization triggered by my loop-unroll.c
> > changes.  It has turned out to be semi-latent wrong code issue in webizer.
> > What happens is:
> >
> > 1) gcse.c drop REG_EQUAL note on the induction variable
> > 2) loop optimizer unrolls the loop enabling webizer to cleanup
> > 3) webizer do not track reaching refs into REG_EQUAL note because
> >    the variable is dead and renames it to unused pseudo
> >    Program is stil valid but the REG_EQUAL is bogus.
> 
> What do you mean, "not track"? web.c does track EQ notes via DF this:
> 
> web.c:315:  df_set_flags (DF_NO_HARD_REGS + DF_EQ_NOTES);
> 
> This is bogus. You're papering over the underlying bug of an invalid
> REG_EQUAL note. This patch is not OK!

The REG_EQUAL note is valid until webizer breaks it.

DF_EQ_NOTES makes UD chain problem to look into uses of REG_EQUAL and add all
reaching definitions into the list.  To do the reaching definitions it uses RD
problem solution.  This solution do not really care about REG_EQUAL notes
because it does only over definitions.  It however uses results of LIVENESS
problem via REG_DEAD notes and the pruning trick.  Now th REG_EQUAL note uses
dead register (that is aloved), because the liveness is computed ignoring
REG_EQUAL notes and thus the definition gets prunned from the list and
consequentely webizer misses the link.
> 
> I did not add DF_RD_PRUNE_DEAD_DEFS for no reason. You're
> re-introducing a major compile time hog. Did you do timings on some
> significant body of code with -fweb enabled?

Well, I can not think how to fix it without
 1) computing liveness with REG_EQUAL included prior RD that means a lot
    of shuffling of REG_DEAD notes
 2) making webizer to drop all REG_EQUAL notes that use dead register
    (for that it will need to track liveness too)

Do you have better ideas?
Honza
> 
> Ciao!
> Steven

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-10-12 21:16   ` Jan Hubicka
@ 2012-10-12 21:19     ` Steven Bosscher
  2012-10-12 21:31       ` Jan Hubicka
  0 siblings, 1 reply; 68+ messages in thread
From: Steven Bosscher @ 2012-10-12 21:19 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: gcc-patches

On Fri, Oct 12, 2012 at 10:44 PM, Jan Hubicka <hubicka@ucw.cz> wrote:
>  1) computing liveness with REG_EQUAL included prior RD that means a lot
>     of shuffling of REG_DEAD notes

I was already working on a patch for this. I'll send it here later tonight.

Ciao!
Steven

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-10-12 21:19     ` Steven Bosscher
@ 2012-10-12 21:31       ` Jan Hubicka
  2012-10-12 22:41         ` Steven Bosscher
  0 siblings, 1 reply; 68+ messages in thread
From: Jan Hubicka @ 2012-10-12 21:31 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: Jan Hubicka, gcc-patches

> On Fri, Oct 12, 2012 at 10:44 PM, Jan Hubicka <hubicka@ucw.cz> wrote:
> >  1) computing liveness with REG_EQUAL included prior RD that means a lot
> >     of shuffling of REG_DEAD notes
> 
> I was already working on a patch for this. I'll send it here later tonight.

Great, thanks!  This is probably most sensible approach even if we will need to
recompute liveness before/after webizer.

This bug really took me days to reduce into something sensible.  It tends to
reproduce only in terribly complex functions with many loops.

Honza
> 
> Ciao!
> Steven

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-10-12 21:31       ` Jan Hubicka
@ 2012-10-12 22:41         ` Steven Bosscher
  2012-10-14  9:13           ` Paolo Bonzini
  0 siblings, 1 reply; 68+ messages in thread
From: Steven Bosscher @ 2012-10-12 22:41 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 756 bytes --]

On Fri, Oct 12, 2012 at 11:16 PM, Jan Hubicka <hubicka@ucw.cz> wrote:
>> On Fri, Oct 12, 2012 at 10:44 PM, Jan Hubicka <hubicka@ucw.cz> wrote:
>> >  1) computing liveness with REG_EQUAL included prior RD that means a lot
>> >     of shuffling of REG_DEAD notes
>>
>> I was already working on a patch for this. I'll send it here later tonight.
>
> Great, thanks!  This is probably most sensible approach even if we will need to
> recompute liveness before/after webizer.

I don't think we have to touch the liveness sets. We can compute an
extra set of registers live only for REG_EQUAL/REG_EQUIV notes.
Attached is what I had in mind. Untested, etc. it's late (and the
Yankees are playing) so I'll get back to properly testing this
tomorrow.

Ciao!
Steven

[-- Attachment #2: df_rd_lr_eq.diff --]
[-- Type: application/octet-stream, Size: 7593 bytes --]

	* df.h (struct df_rd_bb_info): Add rd_lr_eq_regs field.
	* df-problems.c (df_rd_lr_eq_local_compute, df_rd_lr_eq_confluence_n,
	df_rd_lr_eq_transfer_function, df_rd_init_lr_eq_regs): New functions
	to compute registers live for REG_EQUAL and REG_EQUIV notes.
	(df_rd_free_bb_info): Set up rd_lr_eq_regs.
	(df_rd_alloc): Likewise.
	(df_rd_bb_local_compute, df_rd_init_solution): When pruning the RD
	problem to only live DEFs, and DEFs reaching REG_EQUAL and REG_EQUIV
	notes must also be computed, compute a DF_LR-like set for REG_EQUAL
	and REG_EQUIV uses.
	(df_rd_transfer_function): Keep DEFs reaching REG_EQUAL and REG_EQUIV
	notes alive if necessary.
	(df_rd_top_dump): Dump rd_lr_eq_regs.

Index: df.h
===================================================================
--- df.h	(revision 192391)
+++ df.h	(working copy)
@@ -793,6 +793,10 @@ struct df_rd_bb_info
   bitmap_head sparse_kill;
   bitmap_head gen;   /* The set of defs generated in this block.  */
 
+  /* The set of registers that are considered live for the LR&RD
+     problem because they are used in REG_EQUAL or REG_EQUIV nodes.  */
+  bitmap_head rd_lr_eq_regs;
+
   /* The results of the dataflow problem.  */
   bitmap_head in;    /* At the top of the block.  */
   bitmap_head out;   /* At the bottom of the block.  */
Index: df-problems.c
===================================================================
--- df-problems.c	(revision 192391)
+++ df-problems.c	(working copy)
@@ -200,6 +200,95 @@ struct df_rd_problem_data
   bitmap_obstack rd_bitmaps;
 };
 
+/* Liveness sets for EQ_NOTES.  This is a much simplified form of the
+   normal LR problem.  For a given basic block B:
+    * EQ_USE[B] is the set of registers in REG_EQUAL or REG_EQUIV in B
+    * DEF[B] is the DEF set from the LR problem
+    * IN[B] is the set of EQ_USES exposed upward in the CFG but that
+      are not in the LR_IN set (if they're in LR_IN already we have no
+      reason to track them as special-case EQ_USES)
+    * OUT[B] = union_of_succs(IN[B])
+
+   The RD_LR_EQ_REGS bitmap on the RD basic block info is used first
+   to hold the RD_LR_EQ_USE set, computed in df_rd_lr_eq_local_compute.
+   During df_rd_init_solution this set is used to compute the global set,
+   and after df_rd_init_solution RD_LR_EQ_REGS will be the set of registers
+   that would have been in LR_OUT if the registers would have been real
+   uses (and not just EQ_USES).  */
+
+/* Process a list of EQ_USEs for df_rd_bb_local_compute.  */
+static void
+df_rd_lr_eq_local_compute (struct df_rd_bb_info *bb_info,
+			   df_ref *eq_use_rec)
+{
+  while (*eq_use_rec)
+    {
+      df_ref eq_use = *eq_use_rec;
+      unsigned int regno = DF_REF_REGNO (eq_use);
+      if ((!(df->changeable_flags & DF_NO_HARD_REGS))
+	  || (regno >= FIRST_PSEUDO_REGISTER))
+	bitmap_set_bit (&bb_info->rd_lr_eq_regs, regno);
+      eq_use_rec++;
+    }
+}
+
+/* The IN and EQ_USE sets.  EQ_USE will be filled from RD_LR_RQ_REGS.  */
+bitmap_head *rd_lr_eq_in, *rd_lr_eq_use;
+
+static bool
+df_rd_lr_eq_confluence_n (edge e)
+{
+  bitmap out = &df_rd_get_bb_info (e->src->index)->rd_lr_eq_regs;
+  bitmap in = &rd_lr_eq_in[e->dest->index];
+  return bitmap_ior_into (out, in);
+}
+
+static bool
+df_rd_lr_eq_transfer_function (int bb_index)
+{
+  bitmap in = &rd_lr_eq_in[bb_index];
+  bitmap eq_use = &rd_lr_eq_use[bb_index];
+  bitmap out = &df_rd_get_bb_info (bb_index)->rd_lr_eq_regs;
+  bitmap lr_in = &df_lr_get_bb_info (bb_index)->in;
+  bitmap def = &df_lr_get_bb_info (bb_index)->def;
+  bool changed = bitmap_ior_and_compl (in, eq_use, out, def);
+  return bitmap_and_compl_into (in, lr_in) || changed;
+}
+
+static void
+df_rd_init_lr_eq_regs (bitmap all_blocks)
+{
+  unsigned int bb_index;
+  bitmap_iterator bi;
+  struct df_rd_problem_data *problem_data;
+
+  rd_lr_eq_in = XNEWVEC (bitmap_head, last_basic_block);
+  rd_lr_eq_use = XNEWVEC (bitmap_head, last_basic_block);
+
+  problem_data = (struct df_rd_problem_data *) df_rd->problem_data;
+  EXECUTE_IF_SET_IN_BITMAP (all_blocks, 0, bb_index, bi)
+    {
+      bitmap_initialize (&rd_lr_eq_in[bb_index], &problem_data->rd_bitmaps);
+      bitmap_initialize (&rd_lr_eq_use[bb_index], &problem_data->rd_bitmaps);
+      bitmap rd_lr_eq_regs = &df_rd_get_bb_info (bb_index)->rd_lr_eq_regs;
+      bitmap_copy (&rd_lr_eq_use[bb_index], rd_lr_eq_regs);
+      bitmap_clear (rd_lr_eq_regs);
+    }
+
+  df_simple_dataflow (DF_BACKWARD, NULL,
+		      NULL, df_rd_lr_eq_confluence_n,
+		      df_rd_lr_eq_transfer_function,
+		      all_blocks, df_get_postorder (DF_BACKWARD),
+		      df_get_n_blocks (DF_BACKWARD));
+
+  EXECUTE_IF_SET_IN_BITMAP (all_blocks, 0, bb_index, bi)
+    {
+      bitmap_clear (&rd_lr_eq_in[bb_index]);
+      bitmap_clear (&rd_lr_eq_use[bb_index]);
+    }
+  free (rd_lr_eq_in);
+  free (rd_lr_eq_use);
+}
 
 /* Free basic block info.  */
 
@@ -213,6 +302,7 @@ df_rd_free_bb_info (basic_block bb ATTRI
       bitmap_clear (&bb_info->kill);
       bitmap_clear (&bb_info->sparse_kill);
       bitmap_clear (&bb_info->gen);
+      bitmap_clear (&bb_info->rd_lr_eq_regs);
       bitmap_clear (&bb_info->in);
       bitmap_clear (&bb_info->out);
     }
@@ -262,12 +352,14 @@ df_rd_alloc (bitmap all_blocks)
 	  bitmap_clear (&bb_info->kill);
 	  bitmap_clear (&bb_info->sparse_kill);
 	  bitmap_clear (&bb_info->gen);
+	  bitmap_clear (&bb_info->rd_lr_eq_regs);
 	}
       else
 	{
 	  bitmap_initialize (&bb_info->kill, &problem_data->rd_bitmaps);
 	  bitmap_initialize (&bb_info->sparse_kill, &problem_data->rd_bitmaps);
 	  bitmap_initialize (&bb_info->gen, &problem_data->rd_bitmaps);
+	  bitmap_initialize (&bb_info->rd_lr_eq_regs, &problem_data->rd_bitmaps);
 	  bitmap_initialize (&bb_info->in, &problem_data->rd_bitmaps);
 	  bitmap_initialize (&bb_info->out, &problem_data->rd_bitmaps);
 	}
@@ -414,6 +506,12 @@ df_rd_bb_local_compute (unsigned int bb_
       df_rd_bb_local_compute_process_def (bb_info,
 					  DF_INSN_UID_DEFS (uid), 0);
 
+      /* If there are EQ_NOTES, and we're computing the LR&RD problem,
+	 we have to keep registers reaching the EQ_NOTES live.  */
+      if ((df->changeable_flags & DF_EQ_NOTES)
+	  && (df->changeable_flags & DF_RD_PRUNE_DEAD_DEFS))
+	df_rd_lr_eq_local_compute (bb_info, DF_INSN_UID_EQ_USES (uid));
+
       /* This complex dance with the two bitmaps is required because
 	 instructions can assign twice to the same pseudo.  This
 	 generally happens with calls that will have one def for the
@@ -485,6 +583,10 @@ df_rd_init_solution (bitmap all_blocks)
   unsigned int bb_index;
   bitmap_iterator bi;
 
+  if ((df->changeable_flags & DF_EQ_NOTES)
+      && (df->changeable_flags & DF_RD_PRUNE_DEAD_DEFS))
+    df_rd_init_lr_eq_regs (all_blocks);
+
   EXECUTE_IF_SET_IN_BITMAP (all_blocks, 0, bb_index, bi)
     {
       struct df_rd_bb_info *bb_info = df_rd_get_bb_info (bb_index);
@@ -597,6 +699,11 @@ df_rd_transfer_function (int bb_index)
 	bitmap_set_range (live_defs,
 			  DF_DEFS_BEGIN (regno),
 			  DF_DEFS_COUNT (regno));
+      EXECUTE_IF_SET_IN_BITMAP (&bb_info->rd_lr_eq_regs, 0, regno, bi)
+	bitmap_set_range (live_defs,
+			  DF_DEFS_BEGIN (regno),
+			  DF_DEFS_COUNT (regno));
+
       changed |= bitmap_and_into (&bb_info->out, live_defs);
       BITMAP_FREE (live_defs);
     }
@@ -710,6 +817,7 @@ df_rd_top_dump (basic_block bb, FILE *fi
   df_rd_dump_defs_set (&bb_info->in, ";; rd  in  ", file);
   df_rd_dump_defs_set (&bb_info->gen, ";; rd  gen ", file);
   df_rd_dump_defs_set (&bb_info->kill, ";; rd  kill", file);
+  df_rd_dump_defs_set (&bb_info->rd_lr_eq_regs, ";; rd lr_eq", file);
 }
 
 

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-10-12 22:41         ` Steven Bosscher
@ 2012-10-14  9:13           ` Paolo Bonzini
  2012-10-14 21:27             ` Steven Bosscher
  0 siblings, 1 reply; 68+ messages in thread
From: Paolo Bonzini @ 2012-10-14  9:13 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: Jan Hubicka, gcc-patches

Il 13/10/2012 00:25, Steven Bosscher ha scritto:
> On Fri, Oct 12, 2012 at 11:16 PM, Jan Hubicka <hubicka@ucw.cz> wrote:
>>> On Fri, Oct 12, 2012 at 10:44 PM, Jan Hubicka <hubicka@ucw.cz> wrote:
>>>>  1) computing liveness with REG_EQUAL included prior RD that means a lot
>>>>     of shuffling of REG_DEAD notes
>>>
>>> I was already working on a patch for this. I'll send it here later tonight.
>>
>> Great, thanks!  This is probably most sensible approach even if we will need to
>> recompute liveness before/after webizer.
> 
> I don't think we have to touch the liveness sets. We can compute an
> extra set of registers live only for REG_EQUAL/REG_EQUIV notes.
> Attached is what I had in mind. Untested, etc. it's late (and the
> Yankees are playing) so I'll get back to properly testing this
> tomorrow.

Can we just simulate liveness for web, and drop REG_EQUAL/REG_EQUIV
notes that refer to a dead pseudo?

Paolo

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-10-14  9:13           ` Paolo Bonzini
@ 2012-10-14 21:27             ` Steven Bosscher
  2012-10-14 21:35               ` Eric Botcazou
  2012-10-15  8:14               ` Paolo Bonzini
  0 siblings, 2 replies; 68+ messages in thread
From: Steven Bosscher @ 2012-10-14 21:27 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Jan Hubicka, gcc-patches

On Sun, Oct 14, 2012 at 9:02 AM, Paolo Bonzini wrote:
> Can we just simulate liveness for web, and drop REG_EQUAL/REG_EQUIV
> notes that refer to a dead pseudo?

I don't think we want to do that. A REG_EQUAL/REG_EQUIV note can use a
pseudo that isn't live and still be valid. Consider a simple example
like this:

a = b + 3
// b dies here
c = a {REG_EQUAL b+3}

The REG_EQUAL note is valid and may help optimization. Removing it
just because b is dead at that point would be unnecessarily
pessimistic.

I also don't want to compute DF_LR taking EQ_USES into account as real
uses for liveness, because that involves recomputing and enlarging the
DF_LR sets (all of them, both globally and locally) before LR&RD and
after LR&RD. That's why I implemented the quick-and-dirty liveness
computation for the notes: It's non-intrusive on DF_LR and it's cheap.

Ciao!
Steven

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-10-14 21:27             ` Steven Bosscher
@ 2012-10-14 21:35               ` Eric Botcazou
  2012-10-14 21:45                 ` Steven Bosscher
  2012-10-15  8:14               ` Paolo Bonzini
  1 sibling, 1 reply; 68+ messages in thread
From: Eric Botcazou @ 2012-10-14 21:35 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: gcc-patches, Paolo Bonzini, Jan Hubicka

> I don't think we want to do that. A REG_EQUAL/REG_EQUIV note can use a
> pseudo that isn't live and still be valid. Consider a simple example
> like this:
> 
> a = b + 3
> // b dies here
> c = a {REG_EQUAL b+3}
> 
> The REG_EQUAL note is valid and may help optimization. Removing it
> just because b is dead at that point would be unnecessarily
> pessimistic.

But if you have a REG_DEAD note for b on the first insn, then you cannot 
rematerialize the REG_EQUAL note after it, otherwise bad things can happen.

See PR rtl-optimization/51505 for an example.

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-10-14 21:35               ` Eric Botcazou
@ 2012-10-14 21:45                 ` Steven Bosscher
  0 siblings, 0 replies; 68+ messages in thread
From: Steven Bosscher @ 2012-10-14 21:45 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: gcc-patches, Paolo Bonzini, Jan Hubicka

On Sun, Oct 14, 2012 at 11:25 PM, Eric Botcazou wrote:
>> I don't think we want to do that. A REG_EQUAL/REG_EQUIV note can use a
>> pseudo that isn't live and still be valid. Consider a simple example
>> like this:
>>
>> a = b + 3
>> // b dies here
>> c = a {REG_EQUAL b+3}
>>
>> The REG_EQUAL note is valid and may help optimization. Removing it
>> just because b is dead at that point would be unnecessarily
>> pessimistic.
>
> But if you have a REG_DEAD note for b on the first insn, then you cannot
> rematerialize the REG_EQUAL note after it, otherwise bad things can happen.
>
> See PR rtl-optimization/51505 for an example.

That's not the case here. The register is only dead because the
webizer renamed one of its live ranges but "forgets" to rename the
EQ_NOTE use.

Ciao!
Steven

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-10-14 21:27             ` Steven Bosscher
  2012-10-14 21:35               ` Eric Botcazou
@ 2012-10-15  8:14               ` Paolo Bonzini
  2012-10-15  8:23                 ` Steven Bosscher
  1 sibling, 1 reply; 68+ messages in thread
From: Paolo Bonzini @ 2012-10-15  8:14 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: Jan Hubicka, gcc-patches

Il 14/10/2012 22:59, Steven Bosscher ha scritto:
> On Sun, Oct 14, 2012 at 9:02 AM, Paolo Bonzini wrote:
>> Can we just simulate liveness for web, and drop REG_EQUAL/REG_EQUIV
>> notes that refer to a dead pseudo?
> 
> I don't think we want to do that. A REG_EQUAL/REG_EQUIV note can use a
> pseudo that isn't live and still be valid. Consider a simple example
> like this:
> 
> a = b + 3
> // b dies here
> c = a {REG_EQUAL b+3}
> 
> The REG_EQUAL note is valid and may help optimization. Removing it
> just because b is dead at that point would be unnecessarily
> pessimistic.

I disagree that it is valid.  At least it is risky to consider it valid,
because a pass that simulates liveness might end up doing something
wrong because of that note.  If simulation is done backwards, it doesn't
even require any interaction with REG_DEAD notes.

> I also don't want to compute DF_LR taking EQ_USES into account as real
> uses for liveness, because that involves recomputing and enlarging the
> DF_LR sets (all of them, both globally and locally) before LR&RD and
> after LR&RD. That's why I implemented the quick-and-dirty liveness
> computation for the notes: It's non-intrusive on DF_LR and it's cheap.

Yes, I agree on that part of the implementation. :)

Paolo

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-10-15  8:14               ` Paolo Bonzini
@ 2012-10-15  8:23                 ` Steven Bosscher
  2012-10-15  8:35                   ` Paolo Bonzini
  0 siblings, 1 reply; 68+ messages in thread
From: Steven Bosscher @ 2012-10-15  8:23 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Jan Hubicka, gcc-patches

On Mon, Oct 15, 2012 at 9:38 AM, Paolo Bonzini wrote:
>> I don't think we want to do that. A REG_EQUAL/REG_EQUIV note can use a
>> pseudo that isn't live and still be valid. Consider a simple example
>> like this:
>>
>> a = b + 3
>> // b dies here
>> c = a {REG_EQUAL b+3}
>>
>> The REG_EQUAL note is valid and may help optimization. Removing it
>> just because b is dead at that point would be unnecessarily
>> pessimistic.
>
> I disagree that it is valid.  At least it is risky to consider it valid,
> because a pass that simulates liveness might end up doing something
> wrong because of that note.  If simulation is done backwards, it doesn't
> even require any interaction with REG_DEAD notes.

In any case, if web doesn't properly rename the register in the
REG_EQUAL note (which it doesn't do without my patch) and we declare
such a note invalid, then we should remove the note. You're right that
GCC ends up doing something wrong, that's why Honza's test case fails.

With my patch, the registers in the notes are properly renamed and
there are no REG_EQUAL notes that refer to dead registers, so the
point whether such a note would be valid is moot. After renaming, the
note is valid, and the behavior is restored that GCC had before I
added DF_RD_PRUNE_DEAD_DEFS.

I think we should come to a conclusion of this discussion: Either we
drop the notes (e.g. by re-computing the DF_NOTE problem after web) or
we update them, like my patch does. I prefer the patch I proposed
because it re-instates the behavior GCC had before.

Ciao!
Steven

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-10-15  8:23                 ` Steven Bosscher
@ 2012-10-15  8:35                   ` Paolo Bonzini
  2012-10-15  8:38                     ` Steven Bosscher
  0 siblings, 1 reply; 68+ messages in thread
From: Paolo Bonzini @ 2012-10-15  8:35 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: Jan Hubicka, gcc-patches

Il 15/10/2012 10:13, Steven Bosscher ha scritto:
> > I disagree that it is valid.  At least it is risky to consider it valid,
> > because a pass that simulates liveness might end up doing something
> > wrong because of that note.  If simulation is done backwards, it doesn't
> > even require any interaction with REG_DEAD notes.
> 
> In any case, if web doesn't properly rename the register in the
> REG_EQUAL note (which it doesn't do without my patch) and we declare
> such a note invalid, then we should remove the note. You're right that
> GCC ends up doing something wrong, that's why Honza's test case fails.
> 
> I think we should come to a conclusion of this discussion: Either we
> drop the notes (e.g. by re-computing the DF_NOTE problem after web) or
> we update them, like my patch does. I prefer the patch I proposed
> because it re-instates the behavior GCC had before.

I prefer to declare the notes invalid and drop the notes.

Paolo

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-10-15  8:35                   ` Paolo Bonzini
@ 2012-10-15  8:38                     ` Steven Bosscher
  2012-10-15 10:49                       ` Steven Bosscher
  2012-10-15 12:28                       ` Paolo Bonzini
  0 siblings, 2 replies; 68+ messages in thread
From: Steven Bosscher @ 2012-10-15  8:38 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Jan Hubicka, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 755 bytes --]

On Mon, Oct 15, 2012 at 10:19 AM, Paolo Bonzini wrote:
> I prefer to declare the notes invalid and drop the notes.

Then, afaic, our only option is to drop them all in web, as per attached patch.

I strongly disagree with this approach though. It destroys information
that is correct, that we had before DF_RD_PRUNE_DEAD_DEFS, that we can
update, and that helps with optimization.

This whole discussion about notes being dead has gone in completely
the wrong direction. With renaming these notes are valid, and do not
refer to dead regs. Perhaps you could be convinced if you look at
Honza's test case with the patch of r192413 reverted.
The test case still fails with --param max-unroll-times=3, that makes
visualizing the problem easier.

Ciao!
Steven

[-- Attachment #2: web_destroy_eq_notes.diff --]
[-- Type: application/octet-stream, Size: 3675 bytes --]

	* web.c (union_defs): Do not proces REG_EQUAL uses.
	(web_main): Drop all REG_EQUAL notes.  Add back DF_RD_PRUNE_DEAD_DEFS.

Index: web.c
===================================================================
--- web.c	(revision 192438)
+++ web.c	(working copy)
@@ -141,37 +141,22 @@ union_defs (df_ref use, struct web_entry
 {
   struct df_insn_info *insn_info = DF_REF_INSN_INFO (use);
   struct df_link *link = DF_REF_CHAIN (use);
-  df_ref *eq_use_link;
   df_ref *def_link;
   rtx set;
 
   if (insn_info)
     {
       rtx insn = insn_info->insn;
-      eq_use_link = DF_INSN_INFO_EQ_USES (insn_info);
       def_link = DF_INSN_INFO_DEFS (insn_info);
       set = single_set (insn);
     }
   else
     {
       /* An artificial use.  It links up with nothing.  */
-      eq_use_link = NULL;
       def_link = NULL;
       set = NULL;
     }
 
-  /* Union all occurrences of the same register in reg notes.  */
-
-  if (eq_use_link)
-    while (*eq_use_link)
-      {
-	if (use != *eq_use_link
-	    && DF_REF_REAL_REG (use) == DF_REF_REAL_REG (*eq_use_link))
-	  (*fun) (use_entry + DF_REF_ID (use),
-		  use_entry + DF_REF_ID (*eq_use_link));
-	eq_use_link++;
-    }
-
   /* Recognize trivial noop moves and attempt to keep them as noop.  */
 
   if (set
@@ -312,9 +297,7 @@ web_main (void)
   unsigned int uses_num = 0;
   rtx insn;
 
-  df_set_flags (DF_NO_HARD_REGS + DF_EQ_NOTES);
-  /* We can not RD_PRUNE_DEAD_DEFS, because we care about REG_EQUAL
-     notes.  */
+  df_set_flags (DF_NO_HARD_REGS + DF_RD_PRUNE_DEAD_DEFS);
   df_chain_add_problem (DF_UD_CHAIN);
   df_analyze ();
   df_set_flags (DF_DEFER_INSN_RESCAN);
@@ -333,12 +316,6 @@ web_main (void)
 	      if (DF_REF_REGNO (use) >= FIRST_PSEUDO_REGISTER)
 		DF_REF_ID (use) = uses_num++;
 	    }
-	  for (use_rec = DF_INSN_UID_EQ_USES (uid); *use_rec; use_rec++)
-	    {
-	      df_ref use = *use_rec;
-	      if (DF_REF_REGNO (use) >= FIRST_PSEUDO_REGISTER)
-		DF_REF_ID (use) = uses_num++;
-	    }
 	}
     }
 
@@ -355,6 +332,14 @@ web_main (void)
       if (NONDEBUG_INSN_P (insn))
 	{
 	  df_ref *use_rec;
+	  /* We do not update USE operands in in REG_EQUAL notes because
+	     such USEs do not make a register live and therefore they do
+	     not appear in the webs we construct.  We cannot prove whether
+	     a note remains valid after renaming, so drop them all here.
+	     USEs in REG_EQUIV notes would also not renamed, but they are,
+	     by definition, only set once so they are not candidates for
+	     renaming anyway.  */
+	  remove_note (insn, find_reg_note (insn, REG_EQUAL, NULL_RTX));
 	  union_match_dups (insn, def_entry, use_entry, unionfind_union);
 	  for (use_rec = DF_INSN_UID_USES (uid); *use_rec; use_rec++)
 	    {
@@ -362,12 +347,6 @@ web_main (void)
 	      if (DF_REF_REGNO (use) >= FIRST_PSEUDO_REGISTER)
 		union_defs (use, def_entry, used, use_entry, unionfind_union);
 	    }
-	  for (use_rec = DF_INSN_UID_EQ_USES (uid); *use_rec; use_rec++)
-	    {
-	      df_ref use = *use_rec;
-	      if (DF_REF_REGNO (use) >= FIRST_PSEUDO_REGISTER)
-		union_defs (use, def_entry, used, use_entry, unionfind_union);
-	    }
 	}
     }
 
@@ -397,12 +376,6 @@ web_main (void)
 	      if (DF_REF_REGNO (use) >= FIRST_PSEUDO_REGISTER)
 		replace_ref (use, entry_register (use_entry + DF_REF_ID (use), use, used));
 	    }
-	  for (use_rec = DF_INSN_UID_EQ_USES (uid); *use_rec; use_rec++)
-	    {
-	      df_ref use = *use_rec;
-	      if (DF_REF_REGNO (use) >= FIRST_PSEUDO_REGISTER)
-		replace_ref (use, entry_register (use_entry + DF_REF_ID (use), use, used));
-	    }
 	  for (def_rec = DF_INSN_UID_DEFS (uid); *def_rec; def_rec++)
 	    {
 	      df_ref def = *def_rec;

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-10-15  8:38                     ` Steven Bosscher
@ 2012-10-15 10:49                       ` Steven Bosscher
  2012-10-15 12:28                       ` Paolo Bonzini
  1 sibling, 0 replies; 68+ messages in thread
From: Steven Bosscher @ 2012-10-15 10:49 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Jan Hubicka, gcc-patches

On Mon, Oct 15, 2012 at 10:37 AM, Steven Bosscher <stevenb.gcc@gmail.com> wrote:
> On Mon, Oct 15, 2012 at 10:19 AM, Paolo Bonzini wrote:
>> I prefer to declare the notes invalid and drop the notes.
>
> I strongly disagree with this approach though. It destroys information
> that is correct, that we had before DF_RD_PRUNE_DEAD_DEFS, that we can
> update, and that helps with optimization.

PR54916 is a case where dropping the notes would cause a missed
optimization in cse-after-loop. With my patch to update the notes, the
optimization (CSE of a load-immediate) is retained.

Ciao!
Steven

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-10-15  8:38                     ` Steven Bosscher
  2012-10-15 10:49                       ` Steven Bosscher
@ 2012-10-15 12:28                       ` Paolo Bonzini
  2012-10-15 13:19                         ` Steven Bosscher
  1 sibling, 1 reply; 68+ messages in thread
From: Paolo Bonzini @ 2012-10-15 12:28 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: Jan Hubicka, gcc-patches

Il 15/10/2012 10:37, Steven Bosscher ha scritto:
>> I prefer to declare the notes invalid and drop the notes.
> Then, afaic, our only option is to drop them all in web, as per attached patch.
>
> I strongly disagree with this approach though. It destroys information
> that is correct, that we had before DF_RD_PRUNE_DEAD_DEFS, that we can
> update, and that helps with optimization. With renaming these notes
> are valid, and do not refer to dead regs

I agree it is bad.  But I do not understand the last sentence: I
suppose you mean that _without_ renaming these notes are valid, on the
other hand it is normal that some of the notes will be dropped if you
shorten live ranges.

Without removing all of the notes you can do something like this:

- drop the deferred rescanning from web.c.  Instead, make replace_ref
return a bool and call df_insn_rescan manually from web_main.

- attribute new registers to webs in a separate pass that happens
before rewriting, and compute a special version of LR_IN/LR_OUT that
uses the rewritten registers.

- process instructions in reverse order; before starting the visit of
a basic block, initialize the local LR bitmap with the rewritten
LR_OUT of the previous step

- after rewriting and scanning each statement, simulate liveness using
the new defs and uses.

- after rewriting each statement, look for EQ_USES referring to
registers that are dead just before the statement, and delete
REG_EQUAL notes if this is the case

Paolo

> This whole discussion about notes being dead has gone in completely
> the wrong direction. With renaming these notes are valid, and do not
> refer to dead regs. Perhaps you could be convinced if you look at
> Honza's test case with the patch of r192413 reverted.
> The test case still fails with --param max-unroll-times=3, that makes
> visualizing the problem easier.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-10-15 12:28                       ` Paolo Bonzini
@ 2012-10-15 13:19                         ` Steven Bosscher
  2012-10-15 13:29                           ` Paolo Bonzini
  0 siblings, 1 reply; 68+ messages in thread
From: Steven Bosscher @ 2012-10-15 13:19 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Jan Hubicka, gcc-patches

On Mon, Oct 15, 2012 at 1:49 PM, Paolo Bonzini wrote:
>> I strongly disagree with this approach though. It destroys information
>> that is correct, that we had before DF_RD_PRUNE_DEAD_DEFS, that we can
>> update, and that helps with optimization. With renaming these notes
>> are valid, and do not refer to dead regs
>
> I agree it is bad.  But I do not understand the last sentence: I
> suppose you mean that _without_ renaming these notes are valid, on the
> other hand it is normal that some of the notes will be dropped if you
> shorten live ranges.

OK, I now got so confused that I looked into this a bit deeper.

The situation is something like the following just after unrolling,
but before web:

   33 r72:DI=[`rowArray']
   34 {r103:DI=r72:DI+0x18;clobber flags:CC;}
   ...
   45 flags:CCNO=cmp([r72:DI+0x20],0)
      REG_DEAD: r72:DI
;; diamond shape region follows, joining up again in bb 9:
   79 r72:DI=r103:DI
      REG_EQUAL: r72:DI+0x18

On entry to bb9, r72 is not in LR_IN, so after loop unrolling this
note is already invalid if we say that a note should not refer to a
dead register.


But the register dies much earlier. The first place where insn 79
appears is in the .169r.pre dump:

;; basic block 8, loop depth 1, count 0, freq 9100, maybe hot
;;  prev block 7, next block 9, flags: (REACHABLE, RTL, MODIFIED)
;;  pred:       7
;;              6
;; bb 8 artificial_defs: { }
;; bb 8 artificial_uses: { u43(6){ }u44(7){ }u45(16){ }u46(20){ }}
;; lr  in        6 [bp] 7 [sp] 16 [argp] 20 [frame] 72 82 85 87
;; lr  use       6 [bp] 7 [sp] 16 [argp] 20 [frame] 72 87
;; lr  def       17 [flags] 72
;; live  in      6 [bp] 7 [sp] 16 [argp] 20 [frame] 72 82 85 87
;; live  gen     17 [flags] 72
;; live  kill    17 [flags]
L49:
   50 NOTE_INSN_BASIC_BLOCK
   79 r72:DI=r103:DI
      REG_EQUAL: r72:DI+0x18

Here, r72 is still in LR_IN so the note is valid.


Then in .171r.cprop2:

;; basic block 8, loop depth 1, count 0, freq 9100, maybe hot
;;  prev block 7, next block 9, flags: (REACHABLE, RTL, MODIFIED)
;;  pred:       7
;;              6
;; bb 8 artificial_defs: { }
;; bb 8 artificial_uses: { u43(6){ }u44(7){ }u45(16){ }u46(20){ }}
;; lr  in        6 [bp] 7 [sp] 16 [argp] 20 [frame] 82 85 87 103
;; lr  use       6 [bp] 7 [sp] 16 [argp] 20 [frame] 87 103
;; lr  def       17 [flags] 72
;; live  in      6 [bp] 7 [sp] 16 [argp] 20 [frame] 82 85 87 103
;; live  gen     17 [flags] 72
;; live  kill
L49:
   50 NOTE_INSN_BASIC_BLOCK
   79 r72:DI=r103:DI
      REG_EQUAL: r72:DI+0x18

So already after CPROP2, the REG_EQUAL note is invalid if we require
that they only refer to live registers. This all happens well before
any pass uses the DF_RD problem, so this is a pre-existing problem if
we consider this kind of REG_EQUAL note to be invalid.


> Without removing all of the notes you can do something like this:
>
> - drop the deferred rescanning from web.c.  Instead, make replace_ref
> return a bool and call df_insn_rescan manually from web_main.
>
> - attribute new registers to webs in a separate pass that happens
> before rewriting, and compute a special version of LR_IN/LR_OUT that
> uses the rewritten registers.
>
> - process instructions in reverse order; before starting the visit of
> a basic block, initialize the local LR bitmap with the rewritten
> LR_OUT of the previous step
>
> - after rewriting and scanning each statement, simulate liveness using
> the new defs and uses.
>
> - after rewriting each statement, look for EQ_USES referring to
> registers that are dead just before the statement, and delete
> REG_EQUAL notes if this is the case

I think I've shown above that we're all looking at the wrong pass...

Ciao!
Steven

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-10-15 13:19                         ` Steven Bosscher
@ 2012-10-15 13:29                           ` Paolo Bonzini
  2012-10-15 13:49                             ` Steven Bosscher
  0 siblings, 1 reply; 68+ messages in thread
From: Paolo Bonzini @ 2012-10-15 13:29 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: Jan Hubicka, gcc-patches

Il 15/10/2012 14:53, Steven Bosscher ha scritto:
> I think I've shown above that we're all looking at the wrong pass...

I think you have... so we want a patch like this?

Index: df-problems.c
===================================================================
--- df-problems.c	(revisione 183719)
+++ df-problems.c	(copia locale)
@@ -3480,6 +3485,18 @@ df_note_bb_compute (unsigned int bb_inde
 	    }
 	}
 
+      for (use_rec = DF_INSN_UID_EQ_USES (uid); *use_rec; use_rec++)
+	{
+	  df_ref use = *use_rec;
+	  unsigned int uregno = DF_REF_REGNO (use);
+
+	  if (!bitmap_bit_p (live, uregno))
+            {
+              remove_note (insn, find_reg_equal_equiv_note (insn));
+              break;
+            }
+	}
+
       if (debug_insn == -1)
 	{
 	  /* ??? We could probably do better here, replacing dead

Paolo

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-10-15 13:29                           ` Paolo Bonzini
@ 2012-10-15 13:49                             ` Steven Bosscher
  2012-10-16 10:35                               ` Steven Bosscher
  0 siblings, 1 reply; 68+ messages in thread
From: Steven Bosscher @ 2012-10-15 13:49 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Jan Hubicka, gcc-patches

On Mon, Oct 15, 2012 at 3:21 PM, Paolo Bonzini <bonzini@gnu.org> wrote:
> Il 15/10/2012 14:53, Steven Bosscher ha scritto:
>> I think I've shown above that we're all looking at the wrong pass...
>
> I think you have... so we want a patch like this?

I don't think so. df_kill_notes is already supposed to take care of this.

Ciao!
Steven

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-10-15 13:49                             ` Steven Bosscher
@ 2012-10-16 10:35                               ` Steven Bosscher
  2012-10-16 11:05                                 ` Steven Bosscher
  0 siblings, 1 reply; 68+ messages in thread
From: Steven Bosscher @ 2012-10-16 10:35 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Jan Hubicka, gcc-patches

On Mon, Oct 15, 2012 at 3:26 PM, Steven Bosscher wrote:
> On Mon, Oct 15, 2012 at 3:21 PM, Paolo Bonzini wrote:
>> Il 15/10/2012 14:53, Steven Bosscher ha scritto:
>>> I think I've shown above that we're all looking at the wrong pass...
>>
>> I think you have... so we want a patch like this?
>
> I don't think so. df_kill_notes is already supposed to take care of this.

But it doesn't because if the SET_DEST of an insn is the same as the
register dieing in the insn's REG_EQUAL note, the reg is live at the
end of the insn and so the note stays:

Breakpoint 2, df_kill_notes (insn=0x7ffff5e3e7e0, live=0x7fffffffda90)
at ../../trunk/gcc/df-problems.c:2833
2833      rtx *pprev = &REG_NOTES (insn);
1: debug_rtx(insn) = (insn 79 50 52 8 (set (reg:DI 72 [ ivtmp.17D.1758 ])
        (reg:DI 103 [ ivtmp.17D.1758 ])) -1
     (expr_list:REG_DEAD (reg:DI 103 [ ivtmp.17D.1758 ])
        (expr_list:REG_EQUAL (plus:DI (reg:DI 72 [ ivtmp.17D.1758 ])
                (const_int 24 [0x18]))
            (nil))))
void
(gdb) undisp 1
(gdb) p debug_bitmap(live)

first = 0x1627200 current = 0x1627200 indx = 0
        0x1627200 next = (nil) prev = (nil) indx = 0
                bits = { 6 7 16 20 72 82 85 87 }
$2 = void
(gdb)


So GCC should be looking at whether the reg in the REG_EQUAL note is
dead *before* the insn.

Bottom line: This is a bug in df_kill_notes.

Ciao!
Steven

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-10-16 10:35                               ` Steven Bosscher
@ 2012-10-16 11:05                                 ` Steven Bosscher
  2012-10-16 11:42                                   ` Paolo Bonzini
  0 siblings, 1 reply; 68+ messages in thread
From: Steven Bosscher @ 2012-10-16 11:05 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Jan Hubicka, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1588 bytes --]

On Tue, Oct 16, 2012 at 12:15 PM, Steven Bosscher wrote:
> On Mon, Oct 15, 2012 at 3:26 PM, Steven Bosscher wrote:
>> On Mon, Oct 15, 2012 at 3:21 PM, Paolo Bonzini wrote:
>>> Il 15/10/2012 14:53, Steven Bosscher ha scritto:
>>>> I think I've shown above that we're all looking at the wrong pass...
>>>
>>> I think you have... so we want a patch like this?
>>
>> I don't think so. df_kill_notes is already supposed to take care of this.
>
> But it doesn't because if the SET_DEST of an insn is the same as the
> register dieing in the insn's REG_EQUAL note, the reg is live at the
> end of the insn and so the note stays:
>
> Breakpoint 2, df_kill_notes (insn=0x7ffff5e3e7e0, live=0x7fffffffda90)
> at ../../trunk/gcc/df-problems.c:2833
> 2833      rtx *pprev = &REG_NOTES (insn);
> 1: debug_rtx(insn) = (insn 79 50 52 8 (set (reg:DI 72 [ ivtmp.17D.1758 ])
>         (reg:DI 103 [ ivtmp.17D.1758 ])) -1
>      (expr_list:REG_DEAD (reg:DI 103 [ ivtmp.17D.1758 ])
>         (expr_list:REG_EQUAL (plus:DI (reg:DI 72 [ ivtmp.17D.1758 ])
>                 (const_int 24 [0x18]))
>             (nil))))
> void
> (gdb) undisp 1
> (gdb) p debug_bitmap(live)
>
> first = 0x1627200 current = 0x1627200 indx = 0
>         0x1627200 next = (nil) prev = (nil) indx = 0
>                 bits = { 6 7 16 20 72 82 85 87 }
> $2 = void
> (gdb)
>
>
> So GCC should be looking at whether the reg in the REG_EQUAL note is
> dead *before* the insn.
>
> Bottom line: This is a bug in df_kill_notes.

I think this should fix it. Can't test it right now, so help
appreciated (Honza: hint hint! ;-)

Ciao!
Steven

[-- Attachment #2: remove_dead_eq_notes.diff --]
[-- Type: application/octet-stream, Size: 2996 bytes --]

	* df-problems.c (df_kill_notes): Split up in two functions.
	(df_remove_dead_and_unused_notes): New function, first half of
	df_kill notes to remove all REG_DEAD and REG_UNUSED notes.
	(df_remove_dead_eq_notes): New function, second half of df_kill_notes
	to remove REG_EQUAL and REG_EQUIV notes referring to dead registers.
	(df_note_bb_compute): Call df_remove_dead_and_unused_notes instead
	of df_kill_notes.  Call df_remove_dead_eq_notes after processing insn.

	* web.c (web): Re-add DF_RD_PRUNE_DEAD_DEFS;

Index: df-problems.c
===================================================================
--- df-problems.c	(revision 192490)
+++ df-problems.c	(working copy)
@@ -2822,13 +2822,10 @@ df_ignore_stack_reg (int regno ATTRIBUTE
 #endif
 
 
-/* Remove all of the REG_DEAD or REG_UNUSED notes from INSN and add
-   them to OLD_DEAD_NOTES and OLD_UNUSED_NOTES.  Remove also
-   REG_EQUAL/REG_EQUIV notes referring to dead pseudos using LIVE
-   as the bitmap of currently live registers.  */
+/* Remove all of the REG_DEAD or REG_UNUSED notes from INSN.  */
 
 static void
-df_kill_notes (rtx insn, bitmap live)
+df_remove_dead_and_unused_notes (rtx insn)
 {
   rtx *pprev = &REG_NOTES (insn);
   rtx link = *pprev;
@@ -2873,6 +2870,27 @@ df_kill_notes (rtx insn, bitmap live)
 	    }
 	  break;
 
+	default:
+	  pprev = &XEXP (link, 1);
+	  link = *pprev;
+	  break;
+	}
+    }
+}
+
+/* Remove REG_EQUAL/REG_EQUIV notes referring to dead pseudos using LIVE
+   as the bitmap of currently live registers.  */
+
+static void
+df_remove_dead_eq_notes (rtx insn, bitmap live)
+{
+  rtx *pprev = &REG_NOTES (insn);
+  rtx link = *pprev;
+
+  while (link)
+    {
+      switch (REG_NOTE_KIND (link))
+	{
 	case REG_EQUAL:
 	case REG_EQUIV:
 	  {
@@ -2913,6 +2931,7 @@ df_kill_notes (rtx insn, bitmap live)
 	      }
 	    break;
 	  }
+
 	default:
 	  pprev = &XEXP (link, 1);
 	  link = *pprev;
@@ -2921,7 +2940,6 @@ df_kill_notes (rtx insn, bitmap live)
     }
 }
 
-
 /* Set a NOTE_TYPE note for REG in INSN.  */
 
 static inline void
@@ -3195,7 +3213,7 @@ df_note_bb_compute (unsigned int bb_inde
       debug_insn = DEBUG_INSN_P (insn);
 
       bitmap_clear (do_not_gen);
-      df_kill_notes (insn, live);
+      df_remove_dead_and_unused_notes (insn);
 
       /* Process the defs.  */
       if (CALL_P (insn))
@@ -3336,6 +3354,8 @@ df_note_bb_compute (unsigned int bb_inde
 	    }
 	}
 
+      df_remove_dead_eq_notes (insn, live);
+
       if (debug_insn == -1)
 	{
 	  /* ??? We could probably do better here, replacing dead
Index: web.c
===================================================================
--- web.c	(revision 192490)
+++ web.c	(working copy)
@@ -336,8 +336,7 @@ web_main (void)
   rtx insn;
 
   df_set_flags (DF_NO_HARD_REGS + DF_EQ_NOTES);
-  /* We can not RD_PRUNE_DEAD_DEFS, because we care about REG_EQUAL
-     notes.  */
+  df_set_flags (DF_RD_PRUNE_DEAD_DEFS);
   df_chain_add_problem (DF_UD_CHAIN);
   df_analyze ();
   df_set_flags (DF_DEFER_INSN_RESCAN);

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-10-16 11:05                                 ` Steven Bosscher
@ 2012-10-16 11:42                                   ` Paolo Bonzini
  2012-10-16 22:57                                     ` Steven Bosscher
  0 siblings, 1 reply; 68+ messages in thread
From: Paolo Bonzini @ 2012-10-16 11:42 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: Jan Hubicka, gcc-patches

Il 16/10/2012 12:35, Steven Bosscher ha scritto:
> On Tue, Oct 16, 2012 at 12:15 PM, Steven Bosscher wrote:
>> On Mon, Oct 15, 2012 at 3:26 PM, Steven Bosscher wrote:
>>> On Mon, Oct 15, 2012 at 3:21 PM, Paolo Bonzini wrote:
>>>> Il 15/10/2012 14:53, Steven Bosscher ha scritto:
>>>>> I think I've shown above that we're all looking at the wrong pass...
>>>>
>>>> I think you have... so we want a patch like this?
>>>
>>> I don't think so. df_kill_notes is already supposed to take care of this.
>>
>> But it doesn't because if the SET_DEST of an insn is the same as the
>> register dieing in the insn's REG_EQUAL note, the reg is live at the
>> end of the insn and so the note stays:
>>
>> Breakpoint 2, df_kill_notes (insn=0x7ffff5e3e7e0, live=0x7fffffffda90)
>> at ../../trunk/gcc/df-problems.c:2833
>> 2833      rtx *pprev = &REG_NOTES (insn);
>> 1: debug_rtx(insn) = (insn 79 50 52 8 (set (reg:DI 72 [ ivtmp.17D.1758 ])
>>         (reg:DI 103 [ ivtmp.17D.1758 ])) -1
>>      (expr_list:REG_DEAD (reg:DI 103 [ ivtmp.17D.1758 ])
>>         (expr_list:REG_EQUAL (plus:DI (reg:DI 72 [ ivtmp.17D.1758 ])
>>                 (const_int 24 [0x18]))
>>             (nil))))
>> void
>> (gdb) undisp 1
>> (gdb) p debug_bitmap(live)
>>
>> first = 0x1627200 current = 0x1627200 indx = 0
>>         0x1627200 next = (nil) prev = (nil) indx = 0
>>                 bits = { 6 7 16 20 72 82 85 87 }
>> $2 = void
>> (gdb)
>>
>>
>> So GCC should be looking at whether the reg in the REG_EQUAL note is
>> dead *before* the insn.
>>
>> Bottom line: This is a bug in df_kill_notes.

Yep, and it could cause wrong code generation, couldn't it?  Because the
new (reg:DI 72) is obviously not equal to (plus:DI (reg:DI 72)
(const_int 24)), but it could be propagated to subsequent insns.

So I think this patch should be backported to all release branches after
regstrapping.

> I think this should fix it. Can't test it right now, so help
> appreciated (Honza: hint hint! ;-)

Ok after bootstrap, regtest and checking that it actually fixes the bug.

Paolo

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-10-16 11:42                                   ` Paolo Bonzini
@ 2012-10-16 22:57                                     ` Steven Bosscher
  2012-10-19  5:14                                       ` Bin.Cheng
  0 siblings, 1 reply; 68+ messages in thread
From: Steven Bosscher @ 2012-10-16 22:57 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Jan Hubicka, gcc-patches

On Tue, Oct 16, 2012 at 1:38 PM, Paolo Bonzini wrote:
>>> Bottom line: This is a bug in df_kill_notes.
>
> Yep, and it could cause wrong code generation, couldn't it?  Because the
> new (reg:DI 72) is obviously not equal to (plus:DI (reg:DI 72)
> (const_int 24)), but it could be propagated to subsequent insns.

Right.


> So I think this patch should be backported to all release branches after
> regstrapping.

Agreed, but let's wait a few days at least to see what happens with
the patch on the trunk.


> Ok after bootstrap, regtest and checking that it actually fixes the bug.

I made sure of that before posting the patch ;-)

Bootstrap&regtest is running on x86_64-unknown-linux-gnu and
powerpc64-unknown-linux-gnu. I'll commit the patch tomorrow if there
are no test suite regressions.

Well done to us all for probing until we found the root cause of this bug!

Ciao!
Steven

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-10-16 22:57                                     ` Steven Bosscher
@ 2012-10-19  5:14                                       ` Bin.Cheng
  0 siblings, 0 replies; 68+ messages in thread
From: Bin.Cheng @ 2012-10-19  5:14 UTC (permalink / raw)
  To: Jan Hubicka, Steven Bosscher; +Cc: Paolo Bonzini, gcc-patches

On Wed, Oct 17, 2012 at 6:54 AM, Steven Bosscher <stevenb.gcc@gmail.com> wrote:
> On Tue, Oct 16, 2012 at 1:38 PM, Paolo Bonzini wrote:
>>>> Bottom line: This is a bug in df_kill_notes.
>>
>> Yep, and it could cause wrong code generation, couldn't it?  Because the
>> new (reg:DI 72) is obviously not equal to (plus:DI (reg:DI 72)
>> (const_int 24)), but it could be propagated to subsequent insns.
>
> Right.
>
>
>> So I think this patch should be backported to all release branches after
>> regstrapping.
>
> Agreed, but let's wait a few days at least to see what happens with
> the patch on the trunk.
>
>
>> Ok after bootstrap, regtest and checking that it actually fixes the bug.
>
> I made sure of that before posting the patch ;-)
>
> Bootstrap&regtest is running on x86_64-unknown-linux-gnu and
> powerpc64-unknown-linux-gnu. I'll commit the patch tomorrow if there
> are no test suite regressions.
>
> Well done to us all for probing until we found the root cause of this bug!
>

Hi Jan,

For the test case:

main()
{
  configure2();
}

Please make main return ZERO by default.

When cross testing with qemu/simulator, the wrap exit checks whether
the case by verifying the return value.
For ARM target, R0 is checked while it may be clobbered(thus non-ZERO)
if main does not return any value, which causes failure of test case.

Thanks very much.

-- 
Best Regards.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-12-03 20:20                                                           ` Steven Bosscher
  2012-12-03 21:12                                                             ` Eric Botcazou
@ 2012-12-03 23:28                                                             ` Steven Bosscher
  1 sibling, 0 replies; 68+ messages in thread
From: Steven Bosscher @ 2012-12-03 23:28 UTC (permalink / raw)
  To: Eric Botcazou
  Cc: gcc-patches, Paolo Bonzini, Dominique Dhumieres, hubicka,
	Richard Henderson, Jeff Law

On Mon, Dec 3, 2012 at 9:19 PM, Steven Bosscher wrote:
>> So the compiler doesn't bootstrap with the gcse.c patch you posted earlier in
>> the thread?  Still this seems too bold to me, the note datum could be a
>> constant and should be preserved in this case.
>
> You mean the patch at
> http://gcc.gnu.org/ml/gcc-patches/2012-11/msg02275.html right?
>
> I haven't tried that other patch. I'll test that one.
>
...
>> The cse.c hunk is OK then.
>
> Thanks, I'll commit it separately.

This is what I'll commit in a second.
A new proposed cprop.c change will follow later.

Ciao!
Steven


        * gcse.c (struct reg_use): Remove unused struct.
        (gcse_emit_move_after): Do not create REG_EQUAL notes that reference
        the SET_DEST of the instruction the note would be attached to.
        * cse.c (cse_main): Add the DF_NOTE problem.

Index: gcse.c
===================================================================
--- gcse.c      (revision 194084)
+++ gcse.c      (working copy)
@@ -255,8 +255,6 @@ int flag_rerun_cse_after_global_opts;
 /* An obstack for our working variables.  */
 static struct obstack gcse_obstack;

-struct reg_use {rtx reg_rtx; };
-
 /* Hash table of expressions.  */

 struct expr
@@ -2491,23 +2489,27 @@ gcse_emit_move_after (rtx dest, rtx src, rtx insn)
   rtx new_rtx;
   rtx set = single_set (insn), set2;
   rtx note;
-  rtx eqv;
+  rtx eqv = NULL_RTX;

   /* This should never fail since we're creating a reg->reg copy
      we've verified to be valid.  */

   new_rtx = emit_insn_after (gen_move_insn (dest, src), insn);

-  /* Note the equivalence for local CSE pass.  */
+  /* Note the equivalence for local CSE pass.  Take the note from the old
+     set if there was one.  Otherwise record the SET_SRC from the old set
+     unless DEST is also an operand of the SET_SRC.  */
   set2 = single_set (new_rtx);
   if (!set2 || !rtx_equal_p (SET_DEST (set2), dest))
     return new_rtx;
   if ((note = find_reg_equal_equiv_note (insn)))
     eqv = XEXP (note, 0);
-  else
+  else if (! REG_P (dest)
+          || ! reg_mentioned_p (dest, SET_SRC (set)))
     eqv = SET_SRC (set);

-  set_unique_reg_note (new_rtx, REG_EQUAL, copy_insn_1 (eqv));
+  if (eqv != NULL_RTX)
+    set_unique_reg_note (new_rtx, REG_EQUAL, copy_insn_1 (eqv));

   return new_rtx;
 }
Index: cse.c
===================================================================
--- cse.c       (revision 194084)
+++ cse.c       (working copy)
@@ -6520,6 +6520,7 @@ cse_main (rtx f ATTRIBUTE_UNUSED, int nregs)
   int i, n_blocks;

   df_set_flags (DF_LR_RUN_DCE);
+  df_note_add_problem ();
   df_analyze ();
   df_set_flags (DF_DEFER_INSN_RESCAN);

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-12-03 20:20                                                           ` Steven Bosscher
@ 2012-12-03 21:12                                                             ` Eric Botcazou
  2012-12-03 23:28                                                             ` Steven Bosscher
  1 sibling, 0 replies; 68+ messages in thread
From: Eric Botcazou @ 2012-12-03 21:12 UTC (permalink / raw)
  To: Steven Bosscher
  Cc: gcc-patches, Paolo Bonzini, Dominique Dhumieres, hubicka,
	Richard Henderson, Jeff Law

> You mean the patch at
> http://gcc.gnu.org/ml/gcc-patches/2012-11/msg02275.html right?
> 
> I haven't tried that other patch. I'll test that one.

Yes, thanks.

> I suppose so. But this was all added before RTL fwprop and way before
> GIMPLE optimizations. Avoiding the self-referential case is just more
> difficult to do, quite expensive (have to scan the SET_SRC pattern),
> and AFAICT doesn't bring much pay-off.
> 
> I'll prepare something to avoid the self-referential case, but I think
> we're making our lives complicated for no good reason.

Can't you just use the same best-effort approach used for fwprop.c here?

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-12-03 18:26                                                         ` Eric Botcazou
@ 2012-12-03 20:20                                                           ` Steven Bosscher
  2012-12-03 21:12                                                             ` Eric Botcazou
  2012-12-03 23:28                                                             ` Steven Bosscher
  0 siblings, 2 replies; 68+ messages in thread
From: Steven Bosscher @ 2012-12-03 20:20 UTC (permalink / raw)
  To: Eric Botcazou
  Cc: gcc-patches, Paolo Bonzini, Dominique Dhumieres, hubicka,
	Richard Henderson, Jeff Law

On Mon, Dec 3, 2012 at 7:23 PM, Eric Botcazou wrote:
>> >> 2. gcse.c: gcse_emit_move_after added notes, but none of them was very
>> >> useful as far as I could tell, and almost all of them turned
>> >> self-referencing after CPROP. So I propose we just never add notes in
>> >> this case.
>> >
>> > gcse_emit_move_after also preserves existing notes.  Are they
>> > problematic?
>> Yes, they tend to be invalid after PRE because the registers used in
>> the PRE'd expression usually are not live anymore (making the note
>> invalid). Sometimes CPROP "re-validates" the notes, but it doesn't
>> seem wise to me to rely on that.
>
> So the compiler doesn't bootstrap with the gcse.c patch you posted earlier in
> the thread?  Still this seems too bold to me, the note datum could be a
> constant and should be preserved in this case.

You mean the patch at
http://gcc.gnu.org/ml/gcc-patches/2012-11/msg02275.html right?

I haven't tried that other patch. I'll test that one.


>> >> 3. cprop.c: It seems to me that the purpose here is to propagate
>> >> constants. If a reg could not be propagated, then the REG_EQUAL note
>> >> will not help much either. Propagating constants via REG_EQUAL notes
>> >> still helps folding comparisons sometimes, so I'm proposing we only
>> >> propagate those. As a bonus: less garbage RTL because a
>> >> cprop_constant_p can be shared.
>> >
>> > That seems a bit radical to me, especially in try_replace_reg which is
>> > used for copy propagation as well.  In cprop_jump, why is attaching a
>> > note to the jump problematic?
>>
>> Most of the time a note from copy-propagation was not valid because
>> the copy-prop'd reg was not live at the point of the note.
>
> This one I think we should drop for now, or just avoid the self-referential
> case.  There is a comment explicitly saying that the REG_EQUAL note added by
> try_replace_reg are part of the algorithm.

I suppose so. But this was all added before RTL fwprop and way before
GIMPLE optimizations. Avoiding the self-referential case is just more
difficult to do, quite expensive (have to scan the SET_SRC pattern),
and AFAICT doesn't bring much pay-off.

I'll prepare something to avoid the self-referential case, but I think
we're making our lives complicated for no good reason.


>> Not really. It uses single_set in a few places, including
>> delete_trivially_dead_insns and cse_extended_basic_block.
>>
>> > so it seems like we're back to the earlier
>> > trick of using df_note_add_problem to clean up pre-existing REG_EQ*
>> > notes.
>> Again: Not really. I also bootstrapped&tested without the cse.c change.
>
> The cse.c hunk is OK then.

Thanks, I'll commit it separately.


>> I plan (and promise ;-) to add a REG_EQ* note verifier for GCC 4.9.
>
> Thanks (no need to promise though :-), that will be helpful.  In the meantime,
> I don't think that we should aim for perfection in 4.8, these REG_EQ* notes
> and their quirks have been with us for a long time...

Well, yes they've been with us for a long time, but my LR_RD change
exposed all these problems that were simply hidden before. I think
we're safe for GCC 4.8 but I don't feel comfortable about this
situation...

Ciao!
Steven

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-12-01 14:57                                                     ` Eric Botcazou
  2012-12-01 16:45                                                       ` Steven Bosscher
@ 2012-12-03 20:15                                                       ` Paolo Bonzini
  1 sibling, 0 replies; 68+ messages in thread
From: Paolo Bonzini @ 2012-12-03 20:15 UTC (permalink / raw)
  To: Eric Botcazou
  Cc: Steven Bosscher, gcc-patches, Dominique Dhumieres, hubicka,
	Richard Henderson, Jeff Law

Il 01/12/2012 15:54, Eric Botcazou ha scritto:
>> Attached is a different fix. It splits DF_REF_IN_NOTE in two: One flag
>> > for each kind of note. This allows the dead note removal code to
>> > distinguish the source note for the EQ_USES.  I needed to remove one
>> > flag to keep the df_ref_flags 16-bit, but the DF_REF_SUBREG flag looks
>> > completely unnecessary to me, and only ira.c uses it, so it wasn't to
>> > hard to scavenge a single bit.
> I'll defer this to Paolo.

Ok, I'll review this tomorrow.

Paolo

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-12-01 16:45                                                       ` Steven Bosscher
@ 2012-12-03 18:26                                                         ` Eric Botcazou
  2012-12-03 20:20                                                           ` Steven Bosscher
  0 siblings, 1 reply; 68+ messages in thread
From: Eric Botcazou @ 2012-12-03 18:26 UTC (permalink / raw)
  To: Steven Bosscher
  Cc: gcc-patches, Paolo Bonzini, Dominique Dhumieres, hubicka,
	Richard Henderson, Jeff Law

> >> 2. gcse.c: gcse_emit_move_after added notes, but none of them was very
> >> useful as far as I could tell, and almost all of them turned
> >> self-referencing after CPROP. So I propose we just never add notes in
> >> this case.
> > 
> > gcse_emit_move_after also preserves existing notes.  Are they
> > problematic?
> Yes, they tend to be invalid after PRE because the registers used in
> the PRE'd expression usually are not live anymore (making the note
> invalid). Sometimes CPROP "re-validates" the notes, but it doesn't
> seem wise to me to rely on that.

So the compiler doesn't bootstrap with the gcse.c patch you posted earlier in 
the thread?  Still this seems too bold to me, the note datum could be a 
constant and should be preserved in this case.

> >> 3. cprop.c: It seems to me that the purpose here is to propagate
> >> constants. If a reg could not be propagated, then the REG_EQUAL note
> >> will not help much either. Propagating constants via REG_EQUAL notes
> >> still helps folding comparisons sometimes, so I'm proposing we only
> >> propagate those. As a bonus: less garbage RTL because a
> >> cprop_constant_p can be shared.
> > 
> > That seems a bit radical to me, especially in try_replace_reg which is
> > used for copy propagation as well.  In cprop_jump, why is attaching a
> > note to the jump problematic?
> 
> Most of the time a note from copy-propagation was not valid because
> the copy-prop'd reg was not live at the point of the note.

This one I think we should drop for now, or just avoid the self-referential 
case.  There is a comment explicitly saying that the REG_EQUAL note added by 
try_replace_reg are part of the algorithm.

> Not really. It uses single_set in a few places, including
> delete_trivially_dead_insns and cse_extended_basic_block.
> 
> > so it seems like we're back to the earlier
> > trick of using df_note_add_problem to clean up pre-existing REG_EQ*
> > notes.
> Again: Not really. I also bootstrapped&tested without the cse.c change.

The cse.c hunk is OK then.

> I plan (and promise ;-) to add a REG_EQ* note verifier for GCC 4.9.

Thanks (no need to promise though :-), that will be helpful.  In the meantime, 
I don't think that we should aim for perfection in 4.8, these REG_EQ* notes 
and their quirks have been with us for a long time...

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-12-01 14:57                                                     ` Eric Botcazou
@ 2012-12-01 16:45                                                       ` Steven Bosscher
  2012-12-03 18:26                                                         ` Eric Botcazou
  2012-12-03 20:15                                                       ` Paolo Bonzini
  1 sibling, 1 reply; 68+ messages in thread
From: Steven Bosscher @ 2012-12-01 16:45 UTC (permalink / raw)
  To: Eric Botcazou
  Cc: gcc-patches, Paolo Bonzini, Dominique Dhumieres, hubicka,
	Richard Henderson, Jeff Law

On Sat, Dec 1, 2012 at 3:54 PM, Eric Botcazou wrote:
>> The patch also includes all places I've found so far where the
>> compiler could create self-referencing notes:
>>
>> 1. optabs.c: Not sure what it was trying to do, but now it just
>> refuses to add a note if TARGET is mentioned in one of the source
>> operands.
>
> OK.

Thanks. I'll commit this separately.


>> 2. gcse.c: gcse_emit_move_after added notes, but none of them was very
>> useful as far as I could tell, and almost all of them turned
>> self-referencing after CPROP. So I propose we just never add notes in
>> this case.
>
> gcse_emit_move_after also preserves existing notes.  Are they problematic?

Yes, they tend to be invalid after PRE because the registers used in
the PRE'd expression usually are not live anymore (making the note
invalid). Sometimes CPROP "re-validates" the notes, but it doesn't
seem wise to me to rely on that.


>> 3. cprop.c: It seems to me that the purpose here is to propagate
>> constants. If a reg could not be propagated, then the REG_EQUAL note
>> will not help much either. Propagating constants via REG_EQUAL notes
>> still helps folding comparisons sometimes, so I'm proposing we only
>> propagate those. As a bonus: less garbage RTL because a
>> cprop_constant_p can be shared.
>
> That seems a bit radical to me, especially in try_replace_reg which is used
> for copy propagation as well.  In cprop_jump, why is attaching a note to the
> jump problematic?

Most of the time a note from copy-propagation was not valid because
the copy-prop'd reg was not live at the point of the note.


>> 4. fwprop.c: If the SET_DEST is a REG that is mentioned in the
>> SET_SRC, don' create a note. This one I'm not very happy with, because
>> it doesn't handle the case that the REG is somehow simplified out of
>> the SET_SRC, but being smarter about this would only complicate
>> things. I for one can't think of anything better for the moment,
>> anyway.
>
> OK.

I'll commit this along with the optabs.c part.


>> Finally, it makes sense to compute the NOTE problem for CSE.
>
> Why?  It only uses REG_EQ* notes,

Not really. It uses single_set in a few places, including
delete_trivially_dead_insns and cse_extended_basic_block.


> so it seems like we're back to the earlier
> trick of using df_note_add_problem to clean up pre-existing REG_EQ* notes.

Again: Not really. I also bootstrapped&tested without the cse.c change.

I plan (and promise ;-) to add a REG_EQ* note verifier for GCC 4.9.

Ciao!
Steven

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-11-28 23:54                                                   ` Steven Bosscher
@ 2012-12-01 14:57                                                     ` Eric Botcazou
  2012-12-01 16:45                                                       ` Steven Bosscher
  2012-12-03 20:15                                                       ` Paolo Bonzini
  0 siblings, 2 replies; 68+ messages in thread
From: Eric Botcazou @ 2012-12-01 14:57 UTC (permalink / raw)
  To: Steven Bosscher
  Cc: gcc-patches, Paolo Bonzini, Dominique Dhumieres, hubicka,
	Richard Henderson, Jeff Law

> Attached is a different fix. It splits DF_REF_IN_NOTE in two: One flag
> for each kind of note. This allows the dead note removal code to
> distinguish the source note for the EQ_USES.  I needed to remove one
> flag to keep the df_ref_flags 16-bit, but the DF_REF_SUBREG flag looks
> completely unnecessary to me, and only ira.c uses it, so it wasn't to
> hard to scavenge a single bit.

I'll defer this to Paolo.

> The patch also includes all places I've found so far where the
> compiler could create self-referencing notes:
> 
> 1. optabs.c: Not sure what it was trying to do, but now it just
> refuses to add a note if TARGET is mentioned in one of the source
> operands.

OK.

> 2. gcse.c: gcse_emit_move_after added notes, but none of them was very
> useful as far as I could tell, and almost all of them turned
> self-referencing after CPROP. So I propose we just never add notes in
> this case.

gcse_emit_move_after also preserves existing notes.  Are they problematic?

> 3. cprop.c: It seems to me that the purpose here is to propagate
> constants. If a reg could not be propagated, then the REG_EQUAL note
> will not help much either. Propagating constants via REG_EQUAL notes
> still helps folding comparisons sometimes, so I'm proposing we only
> propagate those. As a bonus: less garbage RTL because a
> cprop_constant_p can be shared.

That seems a bit radical to me, especially in try_replace_reg which is used 
for copy propagation as well.  In cprop_jump, why is attaching a note to the 
jump problematic?

> 4. fwprop.c: If the SET_DEST is a REG that is mentioned in the
> SET_SRC, don' create a note. This one I'm not very happy with, because
> it doesn't handle the case that the REG is somehow simplified out of
> the SET_SRC, but being smarter about this would only complicate
> things. I for one can't think of anything better for the moment,
> anyway.

OK.

> Finally, it makes sense to compute the NOTE problem for CSE.

Why?  It only uses REG_EQ* notes, so it seems like we're back to the earlier 
trick of using df_note_add_problem to clean up pre-existing REG_EQ* notes.

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-11-28 22:12                                                 ` Eric Botcazou
@ 2012-11-28 23:54                                                   ` Steven Bosscher
  2012-12-01 14:57                                                     ` Eric Botcazou
  0 siblings, 1 reply; 68+ messages in thread
From: Steven Bosscher @ 2012-11-28 23:54 UTC (permalink / raw)
  To: Eric Botcazou
  Cc: gcc-patches, Paolo Bonzini, Dominique Dhumieres, hubicka,
	Richard Henderson, Jeff Law

[-- Attachment #1: Type: text/plain, Size: 3629 bytes --]

On Wed, Nov 28, 2012 at 11:10 PM, Eric Botcazou wrote:
>> Well, I'm not sure I agree that they are wrong. Consider:
>>
>> S0: r1 = ...
>> S1: r2 = r1 + 10
>> S2: r1 = r2 + 10 { REG_EQUAL r1 + 20 }
>> S3: r3 = r1 + 10
>>
>> Clearly it would be wrong to use the REG_EQUAL note on S2 to optimize S3 to:
>>
>> S0: r1 = ...
>> S1: r2 = r1 + 10
>> S2: r1 = r1 + 20
>> S3: r3 = r1 + 30
>
> But the note is wrong by itself.  The semantics is clear: the note means that
> r1 is equal to r1 + 20 right after S2, which doesn't make any sense.

Or maybe the semantics are not what the compiler is actually doing.
Because clearly several places in the compiler can create this kind of
self-referencing note right now, and the main consumer of the notes,
CSE, has apparently not had too much trouble handing them.

But with the documented semantics, you're right. And, to be honest,
I'm of the "the fewer notes, the better" camp :-)


>> 2: debug_rtx((rtx)0x7ffff4df5e58) = (insn 638 637 639 85 (parallel [
>>             (set (reg:SI 891)
>>                 (minus:SI (reg:SI 890)
>>                     (reg:SI 546 [ D.45472 ])))
>>             (clobber (reg:CC 17 flags))
>>         ]) ../../trunk/gcc/value-prof.c:928 283 {*subsi_1}
>>      (expr_list:REG_EQUIV (mem:SI (plus:DI (reg/v/f:DI 204 [ e12 ])
>>                 (const_int 52 [0x34])) [4 e12_223->probability+0 S4 A32])
>>         (expr_list:REG_UNUSED (reg:CC 17 flags)
>>             (expr_list:REG_EQUAL (minus:SI (const_int 10000 [0x2710])
>>                     (reg:SI 546 [ D.45472 ]))
>>                 (nil)))))
>> void
>> (gdb)
>>
>> Is that valid?
>
> Yes, the comment is wrong and should have been removed in r183719:
>   http://gcc.gnu.org/ml/gcc-patches/2012-01/msg01547.html

Ugh, that doesn't look like a very solid fix.

Attached is a different fix. It splits DF_REF_IN_NOTE in two: One flag
for each kind of note. This allows the dead note removal code to
distinguish the source note for the EQ_USES.  I needed to remove one
flag to keep the df_ref_flags 16-bit, but the DF_REF_SUBREG flag looks
completely unnecessary to me, and only ira.c uses it, so it wasn't to
hard to scavenge a single bit.


The patch also includes all places I've found so far where the
compiler could create self-referencing notes:

1. optabs.c: Not sure what it was trying to do, but now it just
refuses to add a note if TARGET is mentioned in one of the source
operands.

2. gcse.c: gcse_emit_move_after added notes, but none of them was very
useful as far as I could tell, and almost all of them turned
self-referencing after CPROP. So I propose we just never add notes in
this case.

3. cprop.c: It seems to me that the purpose here is to propagate
constants. If a reg could not be propagated, then the REG_EQUAL note
will not help much either. Propagating constants via REG_EQUAL notes
still helps folding comparisons sometimes, so I'm proposing we only
propagate those. As a bonus: less garbage RTL because a
cprop_constant_p can be shared.

4. fwprop.c: If the SET_DEST is a REG that is mentioned in the
SET_SRC, don' create a note. This one I'm not very happy with, because
it doesn't handle the case that the REG is somehow simplified out of
the SET_SRC, but being smarter about this would only complicate
things. I for one can't think of anything better for the moment,
anyway.


Finally, it makes sense to compute the NOTE problem for CSE.

Bootstrap&testing in progress at the older revision I've been using to
debug the darwin10 and sparc bugs. I'll test trunk head on x86_64 and
powerpc later. In the mean time, let me hear what you think of this
one :-)

Ciao!
Steven

[-- Attachment #2: PR55006_2.diff --]
[-- Type: application/octet-stream, Size: 18801 bytes --]

	* ira.c (build_insn_chain): Look at the rtx code to determine if
	the use is actually a SUBREG.
	* df.h (enum df_ref_flags): Add new flags DF_REF_IN_EQUAL_NOTE and
	DF_REF_IN_EQUIV_NOTE, make DF_REF_IN_NOTE a combination of the two,
	and remove DF_REF_SUBREG to keep the number of flags at 16.  Renumber.
	* df-core.c (df_ref_dump): Print 'e' for a REG_EQUAL use,
	and 'q' for a REG_EQUIV use.
	* df-problems.c (df_chain_dump): Likewise.
	(df_remove_dead_eq_notes): Simplify condition.  Look only at refs
	in the kind of note we're processing.  Add sanity check to make
	sure that the SCAN problem is up-to-date, i.e. the EQ_USE reg is
	really mentioned in the note.
	* df-scan.c (df_uses_create): Adjust assert.
	(df_notes_rescan): Split handling of REG_EQUAL and REG_EQUIV notes.
	(df_insn_refs_collect): Likewise.
	(df_def_record_1): Do not set DF_REF_SUBREG.
	(df_uses_record): Likewise.
	* fwprop.c (update_df): Pass DF_REF_IN_EQUAL_NOTE.
	(forward_propagate_and_simplify): Only look in REG_EQUAL notes.
	Do not create self-referencing REG_EQUAL notes.
	(forward_propagate_into): Only look in REG_EQUAL notes.
	* gcse.c (gcse_emit_move_after): Remove.
	(pre_delete): Use emit_insn_after.
	(hoist_code): Likewise.
	* cprop.c (try_replace_reg): Only attach notes if the source was
	folded to a CPROP constant.
	(cprop_jump): Likewise.
	* optabs.c (add_equal_note): Do not create self-referencing REG_EQUAL
	notes.
	* cse.c (cse_main): Add the DF_NOTE problem.

Index: ira.c
===================================================================
--- ira.c	(revision 193394)
+++ ira.c	(working copy)
@@ -3669,7 +3669,7 @@ build_insn_chain (void)
 		       fabricated use. */
 		    if (DF_REF_FLAGS_IS_SET (use, DF_REF_READ_WRITE)
 			&& !DF_REF_FLAGS_IS_SET (use, DF_REF_ZERO_EXTRACT)
-			&& DF_REF_FLAGS_IS_SET (use, DF_REF_SUBREG))
+			&& GET_CODE (DF_REF_REG (use)) == SUBREG)
 		      continue;
 
 		    /* Add the last use of each var to dead_or_set.  */
Index: df.h
===================================================================
--- df.h	(revision 193394)
+++ df.h	(working copy)
@@ -87,30 +87,31 @@ enum df_ref_flags
        bottom of the block.  This is never set for regular refs.  */
     DF_REF_AT_TOP = 1 << 1,
 
-    /* This flag is set if the use is inside a REG_EQUAL or REG_EQUIV
-       note.  */
-    DF_REF_IN_NOTE = 1 << 2,
+    /* This flag is set if the use is inside a REG_EQUAL note,
+       or a REG_EQUIV note, or any one of the two types of note.  */
+    DF_REF_IN_EQUAL_NOTE = 1 << 2,
+    DF_REF_IN_EQUIV_NOTE = 1 << 3,
+    DF_REF_IN_NOTE = DF_REF_IN_EQUAL_NOTE | DF_REF_IN_EQUIV_NOTE,
 
     /* This bit is true if this ref can make regs_ever_live true for
        this regno.  */
-    DF_HARD_REG_LIVE = 1 << 3,
-
+    DF_HARD_REG_LIVE = 1 << 4,
 
     /* This flag is set if this ref is a partial use or def of the
        associated register.  */
-    DF_REF_PARTIAL = 1 << 4,
+    DF_REF_PARTIAL = 1 << 5,
 
     /* Read-modify-write refs generate both a use and a def and
        these are marked with this flag to show that they are not
        independent.  */
-    DF_REF_READ_WRITE = 1 << 5,
+    DF_REF_READ_WRITE = 1 << 6,
 
     /* This flag is set if this ref, generally a def, may clobber the
        referenced register.  This is generally only set for hard
        registers that cross a call site.  With better information
        about calls, some of these could be changed in the future to
        DF_REF_MUST_CLOBBER.  */
-    DF_REF_MAY_CLOBBER = 1 << 6,
+    DF_REF_MAY_CLOBBER = 1 << 7,
 
     /* This flag is set if this ref, generally a def, is a real
        clobber. This is not currently set for registers live across a
@@ -121,25 +122,20 @@ enum df_ref_flags
        clobber is to a subreg.  So in order to tell if the clobber
        wipes out the entire register, it is necessary to also check
        the DF_REF_PARTIAL flag.  */
-    DF_REF_MUST_CLOBBER = 1 << 7,
-
+    DF_REF_MUST_CLOBBER = 1 << 8,
 
     /* If the ref has one of the following two flags set, then the
        struct df_ref can be cast to struct df_ref_extract to access
        the width and offset fields.  */
 
     /* This flag is set if the ref contains a SIGN_EXTRACT.  */
-    DF_REF_SIGN_EXTRACT = 1 << 8,
+    DF_REF_SIGN_EXTRACT = 1 << 9,
 
     /* This flag is set if the ref contains a ZERO_EXTRACT.  */
-    DF_REF_ZERO_EXTRACT = 1 << 9,
+    DF_REF_ZERO_EXTRACT = 1 << 10,
 
     /* This flag is set if the ref contains a STRICT_LOW_PART.  */
-    DF_REF_STRICT_LOW_PART = 1 << 10,
-
-    /* This flag is set if the ref contains a SUBREG.  */
-    DF_REF_SUBREG = 1 << 11,
-
+    DF_REF_STRICT_LOW_PART = 1 << 11,
 
     /* This bit is true if this ref is part of a multiword hardreg.  */
     DF_REF_MW_HARDREG = 1 << 12,
Index: df-core.c
===================================================================
--- df-core.c	(revision 193394)
+++ df-core.c	(working copy)
@@ -2127,9 +2127,10 @@ static void
 df_ref_dump (df_ref ref, FILE *file)
 {
   fprintf (file, "%c%d(%d)",
-	   DF_REF_REG_DEF_P (ref)
-	   ? 'd'
-	   : (DF_REF_FLAGS (ref) & DF_REF_IN_NOTE) ? 'e' : 'u',
+	   DF_REF_REG_DEF_P (ref) ? 'd'
+	   : (DF_REF_FLAGS (ref) & DF_REF_IN_EQUAL_NOTE) ? 'e'
+	   : (DF_REF_FLAGS (ref) & DF_REF_IN_EQUIV_NOTE) ? 'q'
+	   : 'u',
 	   DF_REF_ID (ref),
 	   DF_REF_REGNO (ref));
 }
Index: df-problems.c
===================================================================
--- df-problems.c	(revision 193394)
+++ df-problems.c	(working copy)
@@ -73,9 +73,10 @@ df_chain_dump (struct df_link *link, FIL
   for (; link; link = link->next)
     {
       fprintf (file, "%c%d(bb %d insn %d) ",
-	       DF_REF_REG_DEF_P (link->ref)
-	       ? 'd'
-	       : (DF_REF_FLAGS (link->ref) & DF_REF_IN_NOTE) ? 'e' : 'u',
+	       DF_REF_REG_DEF_P (link->ref) ? 'd'
+	       : (DF_REF_FLAGS (link->ref) & DF_REF_IN_EQUAL_NOTE) ? 'e'
+	       : (DF_REF_FLAGS (link->ref) & DF_REF_IN_EQUIV_NOTE) ? 'q'
+	       : 'u',
 	       DF_REF_ID (link->ref),
 	       DF_REF_BBNO (link->ref),
 	       DF_REF_IS_ARTIFICIAL (link->ref)
@@ -2889,27 +2890,39 @@ df_remove_dead_eq_notes (rtx insn, bitma
 
   while (link)
     {
+      /* A mask for the kind of note we should be looking for.  Because
+	 there is always at most one REG_EQUAL and/or at most one REG_EQUIV
+	 note, and we visit the notes one at a time, we can tell with this
+	 flag whether the EQ_USE we're looking at is in the note we are
+	 processing.  */
+      int note_kind_flag = DF_REF_IN_EQUAL_NOTE;
       switch (REG_NOTE_KIND (link))
 	{
-	case REG_EQUAL:
 	case REG_EQUIV:
+	  note_kind_flag = DF_REF_IN_EQUIV_NOTE;
+	  /* fallthru */
+	case REG_EQUAL:
 	  {
-	    /* Remove the notes that refer to dead registers.  As we have at most
-	       one REG_EQUAL/EQUIV note, all of EQ_USES will refer to this note
-	       so we need to purge the complete EQ_USES vector when removing
-	       the note using df_notes_rescan.  */
 	    df_ref *use_rec;
 	    bool deleted = false;
 
 	    for (use_rec = DF_INSN_EQ_USES (insn); *use_rec; use_rec++)
 	      {
 		df_ref use = *use_rec;
-		if (DF_REF_REGNO (use) > FIRST_PSEUDO_REGISTER
-		    && DF_REF_LOC (use)
-		    && (DF_REF_FLAGS (use) & DF_REF_IN_NOTE)
-		    && ! bitmap_bit_p (live, DF_REF_REGNO (use))
-		    && loc_mentioned_in_p (DF_REF_LOC (use), XEXP (link, 0)))
+
+		if (! (DF_REF_FLAGS (use) & note_kind_flag))
+		  continue;
+
+		/* A REG_EQUAL/REG_EQUIV note may not refer registers that
+		   are not live at the point of the EQ_USE.  Notes referencing
+		   registers without a definition are also dead.  (And wrong,
+		   but that's another story...)  */
+		if ((DF_REF_REGNO (use) > FIRST_PSEUDO_REGISTER
+		     && ! bitmap_bit_p (live, DF_REF_REGNO (use)))
+		    || DF_REG_DEF_COUNT (DF_REF_REGNO (use)) == 0)
 		  {
+		    gcc_checking_assert (reg_mentioned_p (DF_REF_REAL_REG (use),
+							  XEXP (link, 0)));
 		    deleted = true;
 		    break;
 		  }
Index: df-scan.c
===================================================================
--- df-scan.c	(revision 193394)
+++ df-scan.c	(working copy)
@@ -682,13 +682,16 @@ df_scan_blocks (void)
 }
 
 /* Create new refs under address LOC within INSN.  This function is
-   only used externally.  REF_FLAGS must be either 0 or DF_REF_IN_NOTE,
-   depending on whether LOC is inside PATTERN (INSN) or a note.  */
+   only used externally.  REF_FLAGS must be either 0 or one of the
+   flavors of DF_REF_IN_NOTE flags, depending on whether LOC is
+   inside PATTERN (INSN) or a note.  */
 
 void
 df_uses_create (rtx *loc, rtx insn, int ref_flags)
 {
-  gcc_assert (!(ref_flags & ~DF_REF_IN_NOTE));
+  gcc_checking_assert (ref_flags == 0
+		       || ref_flags == DF_REF_IN_EQUAL_NOTE
+		       || ref_flags == DF_REF_IN_EQUIV_NOTE);
   df_uses_record (NULL, loc, DF_REF_REG_USE,
                   BLOCK_FOR_INSN (insn),
                   DF_INSN_INFO_GET (insn),
@@ -2212,10 +2215,17 @@ df_notes_rescan (rtx insn)
 	  switch (REG_NOTE_KIND (note))
 	    {
 	    case REG_EQUIV:
+	      df_uses_record (&collection_rec,
+			      &XEXP (note, 0), DF_REF_REG_USE,
+			      bb, insn_info, DF_REF_IN_EQUIV_NOTE);
+	      break;
+
 	    case REG_EQUAL:
 	      df_uses_record (&collection_rec,
 			      &XEXP (note, 0), DF_REF_REG_USE,
-			      bb, insn_info, DF_REF_IN_NOTE);
+			      bb, insn_info, DF_REF_IN_EQUAL_NOTE);
+	      break;
+
 	    default:
 	      break;
 	    }
@@ -2975,8 +2985,6 @@ df_def_record_1 (struct df_collection_re
       if (df_read_modify_subreg_p (dst))
 	flags |= DF_REF_READ_WRITE | DF_REF_PARTIAL;
 
-      flags |= DF_REF_SUBREG;
-
       df_ref_record (DF_REF_REGULAR, collection_rec,
 		     dst, loc, bb, insn_info, DF_REF_REG_DEF, flags);
     }
@@ -3190,7 +3198,7 @@ df_uses_record (struct df_collection_rec
 		{
 		  df_uses_record (collection_rec, &SUBREG_REG (dst),
 				  DF_REF_REG_USE, bb, insn_info,
-				  flags | DF_REF_READ_WRITE | DF_REF_SUBREG);
+				  flags | DF_REF_READ_WRITE);
 		  break;
 		}
 	      /* Fall through.  */
@@ -3463,17 +3471,21 @@ df_insn_refs_collect (struct df_collecti
   VEC_truncate (df_ref, collection_rec->eq_use_vec, 0);
   VEC_truncate (df_mw_hardreg_ptr, collection_rec->mw_vec, 0);
 
-  /* Process REG_EQUIV/REG_EQUAL notes.  */
+  /* Process REG_EQUIV/REG_EQUAL/REG_NON_LOCAL_GOTO notes.  */
   for (note = REG_NOTES (insn_info->insn); note;
        note = XEXP (note, 1))
     {
       switch (REG_NOTE_KIND (note))
         {
         case REG_EQUIV:
+          df_uses_record (collection_rec,
+                          &XEXP (note, 0), DF_REF_REG_USE,
+                          bb, insn_info, DF_REF_IN_EQUIV_NOTE);
+          break;
         case REG_EQUAL:
           df_uses_record (collection_rec,
                           &XEXP (note, 0), DF_REF_REG_USE,
-                          bb, insn_info, DF_REF_IN_NOTE);
+                          bb, insn_info, DF_REF_IN_EQUAL_NOTE);
           break;
         case REG_NON_LOCAL_GOTO:
           /* The frame ptr is used by a non-local goto.  */
Index: fwprop.c
===================================================================
--- fwprop.c	(revision 193394)
+++ fwprop.c	(working copy)
@@ -934,7 +934,7 @@ update_df (rtx insn, rtx note)
 
   if (note)
     {
-      df_uses_create (&XEXP (note, 0), insn, DF_REF_IN_NOTE);
+      df_uses_create (&XEXP (note, 0), insn, DF_REF_IN_EQUAL_NOTE);
       df_notes_rescan (insn);
     }
   else
@@ -1311,7 +1311,7 @@ forward_propagate_and_simplify (df_ref u
   else
     {
       rtx note = find_reg_note (use_insn, REG_EQUAL, NULL_RTX);
-      if (DF_REF_FLAGS (use) & DF_REF_IN_NOTE)
+      if (DF_REF_FLAGS (use) & DF_REF_IN_EQUAL_NOTE)
 	loc = &XEXP (note, 0);
       else
 	loc = &SET_SRC (use_set);
@@ -1319,9 +1319,12 @@ forward_propagate_and_simplify (df_ref u
       /* Do not replace an existing REG_EQUAL note if the insn is not
 	 recognized.  Either we're already replacing in the note, or we'll
 	 separately try plugging the definition in the note and simplifying.
-	 And only install a REQ_EQUAL note when the destination is a REG,
-	 as the note would be invalid otherwise.  */
-      set_reg_equal = (note == NULL_RTX && REG_P (SET_DEST (use_set)));
+	 And only install a REQ_EQUAL note when the destination is a REG
+	 that isn't mentioned in USE_SET, as the note would be invalid
+	 otherwise.  */
+      set_reg_equal = (note == NULL_RTX && REG_P (SET_DEST (use_set))
+		       && ! reg_mentioned_p (SET_DEST (use_set),
+					     SET_SRC (use_set)));
     }
 
   if (GET_MODE (*loc) == VOIDmode)
@@ -1370,7 +1373,7 @@ forward_propagate_into (df_ref use)
 
   /* Check if the use is still present in the insn!  */
   use_insn = DF_REF_INSN (use);
-  if (DF_REF_FLAGS (use) & DF_REF_IN_NOTE)
+  if (DF_REF_FLAGS (use) & DF_REF_IN_EQUAL_NOTE)
     parent = find_reg_note (use_insn, REG_EQUAL, NULL_RTX);
   else
     parent = PATTERN (use_insn);
Index: gcse.c
===================================================================
--- gcse.c	(revision 193394)
+++ gcse.c	(working copy)
@@ -501,7 +501,6 @@ static void trim_ld_motion_mems (void);
 static void update_ld_motion_stores (struct expr *);
 static void clear_modify_mem_tables (void);
 static void free_modify_mem_tables (void);
-static rtx gcse_emit_move_after (rtx, rtx, rtx);
 static bool is_too_expensive (const char *);
 
 #define GNEW(T)			((T *) gmalloc (sizeof (T)))
@@ -2473,36 +2472,6 @@ pre_insert_copies (void)
       }
 }
 
-/* Emit move from SRC to DEST noting the equivalence with expression computed
-   in INSN.  */
-
-static rtx
-gcse_emit_move_after (rtx dest, rtx src, rtx insn)
-{
-  rtx new_rtx;
-  rtx set = single_set (insn), set2;
-  rtx note;
-  rtx eqv;
-
-  /* This should never fail since we're creating a reg->reg copy
-     we've verified to be valid.  */
-
-  new_rtx = emit_insn_after (gen_move_insn (dest, src), insn);
-
-  /* Note the equivalence for local CSE pass.  */
-  set2 = single_set (new_rtx);
-  if (!set2 || !rtx_equal_p (SET_DEST (set2), dest))
-    return new_rtx;
-  if ((note = find_reg_equal_equiv_note (insn)))
-    eqv = XEXP (note, 0);
-  else
-    eqv = SET_SRC (set);
-
-  set_unique_reg_note (new_rtx, REG_EQUAL, copy_insn_1 (eqv));
-
-  return new_rtx;
-}
-
 /* Delete redundant computations.
    Deletion is done by changing the insn to copy the `reaching_reg' of
    the expression into the result of the SET.  It is left to later passes
@@ -2542,7 +2511,8 @@ pre_delete (void)
 		if (expr->reaching_reg == NULL)
 		  expr->reaching_reg = gen_reg_rtx_and_attrs (SET_DEST (set));
 
-		gcse_emit_move_after (SET_DEST (set), expr->reaching_reg, insn);
+		emit_insn_after (gen_move_insn (SET_DEST (set),
+						expr->reaching_reg), insn);
 		delete_insn (insn);
 		occr->deleted_p = 1;
 		changed = 1;
@@ -3252,8 +3222,8 @@ hoist_code (void)
 		    expr->reaching_reg
 		      = gen_reg_rtx_and_attrs (SET_DEST (set));
 
-		  gcse_emit_move_after (SET_DEST (set), expr->reaching_reg,
-					insn);
+		  emit_insn_after (gen_move_insn (SET_DEST (set),
+						  expr->reaching_reg), insn);
 		  delete_insn (insn);
 		  occr->deleted_p = 1;
 		  changed = 1;
Index: cprop.c
===================================================================
--- cprop.c	(revision 193394)
+++ cprop.c	(working copy)
@@ -774,10 +774,11 @@ try_replace_reg (rtx from, rtx to, rtx i
 	success = 1;
 
       /* If we've failed perform the replacement, have a single SET to
-	 a REG destination and don't yet have a note, add a REG_EQUAL note
-	 to not lose information.  */
-      if (!success && note == 0 && set != 0 && REG_P (SET_DEST (set)))
-	note = set_unique_reg_note (insn, REG_EQUAL, copy_rtx (src));
+	 a REG destination, don't yet have a note, and have simplified
+	 SRC to a constant, add a REG_EQUAL note to not lose information.  */
+      if (!success && note == 0 && set != 0 && REG_P (SET_DEST (set))
+	  && cprop_constant_p (src))
+	note = set_unique_reg_note (insn, REG_EQUAL, src);
     }
 
   if (set && MEM_P (SET_DEST (set)) && reg_mentioned_p (from, SET_DEST (set)))
@@ -936,8 +937,9 @@ cprop_jump (basic_block bb, rtx setcc, r
 	     we need to attach a note to the branch itself to make this
 	     optimization work.  */
 
-	  if (!rtx_equal_p (new_rtx, note_src))
-	    set_unique_reg_note (jump, REG_EQUAL, copy_rtx (new_rtx));
+	  if (!rtx_equal_p (new_rtx, note_src)
+	      && cprop_constant_p (new_rtx))
+	    set_unique_reg_note (jump, REG_EQUAL, new_rtx);
 	  return 0;
 	}
 
Index: optabs.c
===================================================================
--- optabs.c	(revision 193394)
+++ optabs.c	(working copy)
@@ -170,14 +170,14 @@ optab_libfunc (optab optab, enum machine
 
    If the last insn does not set TARGET, don't do anything, but return 1.
 
-   If a previous insn sets TARGET and TARGET is one of OP0 or OP1,
-   don't add the REG_EQUAL note but return 0.  Our caller can then try
-   again, ensuring that TARGET is not one of the operands.  */
+   If the last insn or a previous insn sets TARGET and TARGET is one of OP0
+   or OP1, don't add the REG_EQUAL note but return 0.  Our caller can then
+   try again, ensuring that TARGET is not one of the operands.  */
 
 static int
 add_equal_note (rtx insns, rtx target, enum rtx_code code, rtx op0, rtx op1)
 {
-  rtx last_insn, insn, set;
+  rtx last_insn, set;
   rtx note;
 
   gcc_assert (insns && INSN_P (insns) && NEXT_INSN (insns));
@@ -192,6 +192,12 @@ add_equal_note (rtx insns, rtx target, e
   if (GET_CODE (target) == ZERO_EXTRACT)
     return 1;
 
+  /* If TARGET is in OP0 or OP1, punt.  We'd end up with a note referencing
+     a value changing in the insn, so the note would be invalid for CSE.  */
+  if (reg_overlap_mentioned_p (target, op0)
+      || (op1 && reg_overlap_mentioned_p (target, op1)))
+    return 0;
+
   for (last_insn = insns;
        NEXT_INSN (last_insn) != NULL_RTX;
        last_insn = NEXT_INSN (last_insn))
@@ -207,21 +213,6 @@ add_equal_note (rtx insns, rtx target, e
 	  || ! rtx_equal_p (XEXP (SET_DEST (set), 0), target)))
     return 1;
 
-  /* If TARGET is in OP0 or OP1, check if anything in SEQ sets TARGET
-     besides the last insn.  */
-  if (reg_overlap_mentioned_p (target, op0)
-      || (op1 && reg_overlap_mentioned_p (target, op1)))
-    {
-      insn = PREV_INSN (last_insn);
-      while (insn != NULL_RTX)
-	{
-	  if (reg_set_p (target, insn))
-	    return 0;
-
-	  insn = PREV_INSN (insn);
-	}
-    }
-
   if (GET_RTX_CLASS (code) == RTX_UNARY)
     switch (code)
       {
Index: cse.c
===================================================================
--- cse.c	(revision 193394)
+++ cse.c	(working copy)
@@ -6521,6 +6521,7 @@ cse_main (rtx f ATTRIBUTE_UNUSED, int nr
   int i, n_blocks;
 
   df_set_flags (DF_LR_RUN_DCE);
+  df_note_add_problem ();
   df_analyze ();
   df_set_flags (DF_DEFER_INSN_RESCAN);
 

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-11-28 15:57                                               ` Steven Bosscher
@ 2012-11-28 22:12                                                 ` Eric Botcazou
  2012-11-28 23:54                                                   ` Steven Bosscher
  0 siblings, 1 reply; 68+ messages in thread
From: Eric Botcazou @ 2012-11-28 22:12 UTC (permalink / raw)
  To: Steven Bosscher
  Cc: gcc-patches, Paolo Bonzini, Dominique Dhumieres, hubicka,
	Richard Henderson, Jeff Law

> Well, I'm not sure I agree that they are wrong. Consider:
> 
> S0: r1 = ...
> S1: r2 = r1 + 10
> S2: r1 = r2 + 10 { REG_EQUAL r1 + 20 }
> S3: r3 = r1 + 10
> 
> Clearly it would be wrong to use the REG_EQUAL note on S2 to optimize S3 to:
> 
> S0: r1 = ...
> S1: r2 = r1 + 10
> S2: r1 = r1 + 20
> S3: r3 = r1 + 30

But the note is wrong by itself.  The semantics is clear: the note means that 
r1 is equal to r1 + 20 right after S2, which doesn't make any sense.

> However, It seems to me that this would be valid if e.g. the webizer
> turns the above into:
> 
> S0: r1 = ...
> S1: r2 = r1 + 10
> S2: r4 = r2 + 10 { REG_EQUAL r1 + 20 }
> S3: r3 = r4 + 10
> 
> because now the optimization would be valid:
> 
> S0: r1 = ...
> S1: r2 = r1 + 10
> S2: r4 = r1 + 20
> S3: r3 = r1 + 30

Sure, but we have no guarantee that the RTL stream will be massaged that way.

> So self-referencing REG_EQUAL notes should IMHO be considered valid,
> and transformations using the notes should just be careful to
> invalidate the equivalence before processing the insn that the note
> appears on. cse.c doesn't appear to do this, however.

IMO that's a recipe for bugs.
 
> On a completely different note:
> 
> df-problems.c has this comment:
> 
>             /* Remove the notes that refer to dead registers.  As we
> have at most
>                one REG_EQUAL/EQUIV note, all of EQ_USES will refer to this
> note so we need to purge the complete EQ_USES vector when removing the note
> using df_notes_rescan.  */
> 
> But ira.c:update_equiv_regs creates insns with a REG_EQUAL *and* a
> REG_EQUIV note:
> (gdb)
> update_equiv_regs () at ../../trunk/gcc/ira.c:3177
> 3177                    = gen_rtx_INSN_LIST (VOIDmode, insn, NULL_RTX);
> 2: debug_rtx((rtx)0x7ffff4df5e58) = (insn 638 637 639 85 (parallel [
>             (set (reg:SI 891)
>                 (minus:SI (reg:SI 890)
>                     (reg:SI 546 [ D.45472 ])))
>             (clobber (reg:CC 17 flags))
>         ]) ../../trunk/gcc/value-prof.c:928 283 {*subsi_1}
>      (expr_list:REG_EQUIV (mem:SI (plus:DI (reg/v/f:DI 204 [ e12 ])
>                 (const_int 52 [0x34])) [4 e12_223->probability+0 S4 A32])
>         (expr_list:REG_UNUSED (reg:CC 17 flags)
>             (expr_list:REG_EQUAL (minus:SI (const_int 10000 [0x2710])
>                     (reg:SI 546 [ D.45472 ]))
>                 (nil)))))
> void
> (gdb)
> 
> Is that valid?

Yes, the comment is wrong and should have been removed in r183719:
  http://gcc.gnu.org/ml/gcc-patches/2012-01/msg01547.html

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-11-28  7:46                                             ` Eric Botcazou
@ 2012-11-28 15:57                                               ` Steven Bosscher
  2012-11-28 22:12                                                 ` Eric Botcazou
  0 siblings, 1 reply; 68+ messages in thread
From: Steven Bosscher @ 2012-11-28 15:57 UTC (permalink / raw)
  To: Eric Botcazou
  Cc: gcc-patches, Paolo Bonzini, Dominique Dhumieres, hubicka,
	Richard Henderson, Jeff Law

On Wed, Nov 28, 2012 at 8:44 AM, Eric Botcazou <ebotcazou@adacore.com> wrote:
>> Again: Is this really the direction we should be going in? (I haven't
>> any better ideas...)
>
> Do you mean for self-referencing notes?  If so, definitely, these notes are
> either wrong or almost always redundant so we should eliminate them.

Well, I'm not sure I agree that they are wrong. Consider:

S0: r1 = ...
S1: r2 = r1 + 10
S2: r1 = r2 + 10 { REG_EQUAL r1 + 20 }
S3: r3 = r1 + 10

Clearly it would be wrong to use the REG_EQUAL note on S2 to optimize S3 to:

S0: r1 = ...
S1: r2 = r1 + 10
S2: r1 = r1 + 20
S3: r3 = r1 + 30

However, It seems to me that this would be valid if e.g. the webizer
turns the above into:

S0: r1 = ...
S1: r2 = r1 + 10
S2: r4 = r2 + 10 { REG_EQUAL r1 + 20 }
S3: r3 = r4 + 10

because now the optimization would be valid:

S0: r1 = ...
S1: r2 = r1 + 10
S2: r4 = r1 + 20
S3: r3 = r1 + 30


So self-referencing REG_EQUAL notes should IMHO be considered valid,
and transformations using the notes should just be careful to
invalidate the equivalence before processing the insn that the note
appears on. cse.c doesn't appear to do this, however.


On a completely different note:

df-problems.c has this comment:

            /* Remove the notes that refer to dead registers.  As we
have at most
               one REG_EQUAL/EQUIV note, all of EQ_USES will refer to this note
               so we need to purge the complete EQ_USES vector when removing
               the note using df_notes_rescan.  */

But ira.c:update_equiv_regs creates insns with a REG_EQUAL *and* a
REG_EQUIV note:
(gdb)
update_equiv_regs () at ../../trunk/gcc/ira.c:3177
3177                    = gen_rtx_INSN_LIST (VOIDmode, insn, NULL_RTX);
2: debug_rtx((rtx)0x7ffff4df5e58) = (insn 638 637 639 85 (parallel [
            (set (reg:SI 891)
                (minus:SI (reg:SI 890)
                    (reg:SI 546 [ D.45472 ])))
            (clobber (reg:CC 17 flags))
        ]) ../../trunk/gcc/value-prof.c:928 283 {*subsi_1}
     (expr_list:REG_EQUIV (mem:SI (plus:DI (reg/v/f:DI 204 [ e12 ])
                (const_int 52 [0x34])) [4 e12_223->probability+0 S4 A32])
        (expr_list:REG_UNUSED (reg:CC 17 flags)
            (expr_list:REG_EQUAL (minus:SI (const_int 10000 [0x2710])
                    (reg:SI 546 [ D.45472 ]))
                (nil)))))
void
(gdb)

Is that valid?

Ciao!
Steven




> As for the more general problem of REG_EQ* notes, perhaps it's time to devise
> a global infrastructure to handle them.  But, as far as I recall, they always
> have been problematic, before or after the DF merge.

            /* Remove the notes that refer to dead registers.  As we
have at most
               one REG_EQUAL/EQUIV note, all of EQ_USES will refer to this note
               so we need to purge the complete EQ_USES vector when removing
               the note using df_notes_rescan.  */

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-11-28  0:43                                           ` Steven Bosscher
@ 2012-11-28  7:46                                             ` Eric Botcazou
  2012-11-28 15:57                                               ` Steven Bosscher
  0 siblings, 1 reply; 68+ messages in thread
From: Eric Botcazou @ 2012-11-28  7:46 UTC (permalink / raw)
  To: Steven Bosscher
  Cc: gcc-patches, Paolo Bonzini, Dominique Dhumieres, hubicka,
	Richard Henderson, Jeff Law

> Again: Is this really the direction we should be going in? (I haven't
> any better ideas...)

Do you mean for self-referencing notes?  If so, definitely, these notes are 
either wrong or almost always redundant so we should eliminate them.

As for the more general problem of REG_EQ* notes, perhaps it's time to devise 
a global infrastructure to handle them.  But, as far as I recall, they always 
have been problematic, before or after the DF merge.

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-11-27 23:59                                         ` Steven Bosscher
@ 2012-11-28  0:43                                           ` Steven Bosscher
  2012-11-28  7:46                                             ` Eric Botcazou
  0 siblings, 1 reply; 68+ messages in thread
From: Steven Bosscher @ 2012-11-28  0:43 UTC (permalink / raw)
  To: Eric Botcazou
  Cc: gcc-patches, Paolo Bonzini, Dominique Dhumieres, hubicka,
	Richard Henderson, Jeff Law

On Wed, Nov 28, 2012 at 12:58 AM, Steven Bosscher <stevenb.gcc@gmail.com> wrote:
> On Wed, Nov 28, 2012 at 12:53 AM, Steven Bosscher wrote:
>> On Wed, Nov 28, 2012 at 12:47 AM, Eric Botcazou wrote:
>>>> Count me in, too.  So let's avoid it:
>>>>
>>>>         * gcse.c (struct reg_use): Remove unused struct.
>>>>         (gcse_emit_move_after): Do not create REG_EQUAL notes that reference
>>>> the SET_DEST of the instruction the note would be attached to.
>>>
>>> OK (with PR rtl-optimization/55006) if it passes bootstrap & regtest, thanks.
>>
>> Thanks for the quick review.
>>
>>
>>>> And perhaps a bit in emit-rtl.c for good measure? I'll see if/where
>>>> this causes breakage...
>>>
>>> I think this would need to be wrapped in a #ifdef ENABLE_CHECKING/#endif pair.
>>
>> Well, let me first try to make it pass some set of tests. This breaks
>> the compiler in too many places to name right now. Here's the first of
>> what's probably going to be a whole series of patches if I keep
>> hitting this internal_error as often as I am now in my set of cc1-i
>> files...
>>
>> I know: ChangeLog, testing and all that. That's not the point yet.
>> This is just a brain dump, to get this saved in some places safer than
>> the compile farm or (worse) my head ;-)
>
> And another one:

And one to end the night here.

Again: Is this really the direction we should be going in? (I haven't
any better ideas...)


Index: fwprop.c
===================================================================
--- fwprop.c    (revision 193394)
+++ fwprop.c    (working copy)
@@ -1321,7 +1321,9 @@ forward_propagate_and_simplify (df_ref u
         separately try plugging the definition in the note and simplifying.
         And only install a REQ_EQUAL note when the destination is a REG,
         as the note would be invalid otherwise.  */
-      set_reg_equal = (note == NULL_RTX && REG_P (SET_DEST (use_set)));
+      set_reg_equal = (note == NULL_RTX && REG_P (SET_DEST (use_set))
+                      && ! reg_mentioned_p (SET_DEST (use_set),
+                                            SET_SRC (use_set)));
     }

   if (GET_MODE (*loc) == VOIDmode)

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-11-27 23:54                                       ` Steven Bosscher
@ 2012-11-27 23:59                                         ` Steven Bosscher
  2012-11-28  0:43                                           ` Steven Bosscher
  0 siblings, 1 reply; 68+ messages in thread
From: Steven Bosscher @ 2012-11-27 23:59 UTC (permalink / raw)
  To: Eric Botcazou
  Cc: gcc-patches, Paolo Bonzini, Dominique Dhumieres, hubicka,
	Richard Henderson, Jeff Law

On Wed, Nov 28, 2012 at 12:53 AM, Steven Bosscher wrote:
> On Wed, Nov 28, 2012 at 12:47 AM, Eric Botcazou wrote:
>>> Count me in, too.  So let's avoid it:
>>>
>>>         * gcse.c (struct reg_use): Remove unused struct.
>>>         (gcse_emit_move_after): Do not create REG_EQUAL notes that reference
>>> the SET_DEST of the instruction the note would be attached to.
>>
>> OK (with PR rtl-optimization/55006) if it passes bootstrap & regtest, thanks.
>
> Thanks for the quick review.
>
>
>>> And perhaps a bit in emit-rtl.c for good measure? I'll see if/where
>>> this causes breakage...
>>
>> I think this would need to be wrapped in a #ifdef ENABLE_CHECKING/#endif pair.
>
> Well, let me first try to make it pass some set of tests. This breaks
> the compiler in too many places to name right now. Here's the first of
> what's probably going to be a whole series of patches if I keep
> hitting this internal_error as often as I am now in my set of cc1-i
> files...
>
> I know: ChangeLog, testing and all that. That's not the point yet.
> This is just a brain dump, to get this saved in some places safer than
> the compile farm or (worse) my head ;-)

And another one:

Index: cprop.c
===================================================================
--- cprop.c     (revision 193394)
+++ cprop.c     (working copy)
@@ -776,7 +776,8 @@ try_replace_reg (rtx from, rtx to, rtx i
       /* If we've failed perform the replacement, have a single SET to
         a REG destination and don't yet have a note, add a REG_EQUAL note
         to not lose information.  */
-      if (!success && note == 0 && set != 0 && REG_P (SET_DEST (set)))
+      if (!success && note == 0 && set != 0 && REG_P (SET_DEST (set))
+         && ! reg_mentioned_p (SET_DEST (set), src))
        note = set_unique_reg_note (insn, REG_EQUAL, copy_rtx (src));
     }



This one's for an interesting case of note, of a kind I've never seen
before. Had to share this gem of a note :-)

Ciao!
Steven


Breakpoint 1, internal_error (gmsgid=0x11455f8 "self-reference in
REG_EQUAL or REG_EQUIV note!\n") at ../../trunk/gcc/diagnostic.c:1087
1087      va_start (ap, gmsgid);
(gdb) up
#1  0x0000000000709dea in set_unique_reg_note (insn=0x7ffff507b3a8,
kind=REG_EQUAL, datum=0x7ffff528c2e0) at
../../trunk/gcc/emit-rtl.c:5001
5001            internal_error ("self-reference in REG_EQUAL or
REG_EQUIV note!\n");
(gdb)
#2  0x0000000000f608c4 in try_replace_reg (from=0x7ffff5301c60,
to=0x7ffff62e5460, insn=0x7ffff507b3a8) at ../../trunk/gcc/cprop.c:780
780             note = set_unique_reg_note (insn, REG_EQUAL, copy_rtx (src));
(gdb) l
775
776           /* If we've failed perform the replacement, have a single SET to
777              a REG destination and don't yet have a note, add a
REG_EQUAL note
778              to not lose information.  */
779           if (!success && note == 0 && set != 0 && REG_P (SET_DEST (set)))
780             note = set_unique_reg_note (insn, REG_EQUAL, copy_rtx (src));
781         }
782
783       if (set && MEM_P (SET_DEST (set)) && reg_mentioned_p (from,
SET_DEST (set)))
784         {
(gdb) p debug_rtx (set)
(set (reg/v:SI 173 [ xsize ])
    (if_then_else:SI (ge (reg:CCGC 17 flags)
            (const_int 0 [0]))
        (reg/v:SI 173 [ xsize ])
        (reg:SI 241)))
$1 = void
(gdb) p debug_rtx(insn)
(insn 741 739 678 81 (set (reg/v:SI 173 [ xsize ])
        (if_then_else:SI (ge (reg:CCGC 17 flags)
                (const_int 0 [0]))
            (reg/v:SI 173 [ xsize ])
            (reg:SI 241))) ../../trunk/gcc/alias.c:1940 893 {*movsicc_noc}
     (expr_list:REG_EQUAL (if_then_else:SI (ge (reg:CCGC 17 flags)
                (const_int 0 [0]))
            (reg/v:SI 173 [ xsize ])
            (const_int -1 [0xffffffffffffffff]))
        (nil)))
$2 = void
(gdb)

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-11-27 23:50                                     ` Eric Botcazou
@ 2012-11-27 23:54                                       ` Steven Bosscher
  2012-11-27 23:59                                         ` Steven Bosscher
  0 siblings, 1 reply; 68+ messages in thread
From: Steven Bosscher @ 2012-11-27 23:54 UTC (permalink / raw)
  To: Eric Botcazou
  Cc: gcc-patches, Paolo Bonzini, Dominique Dhumieres, hubicka,
	Richard Henderson, Jeff Law

On Wed, Nov 28, 2012 at 12:47 AM, Eric Botcazou wrote:
>> Count me in, too.  So let's avoid it:
>>
>>         * gcse.c (struct reg_use): Remove unused struct.
>>         (gcse_emit_move_after): Do not create REG_EQUAL notes that reference
>> the SET_DEST of the instruction the note would be attached to.
>
> OK (with PR rtl-optimization/55006) if it passes bootstrap & regtest, thanks.

Thanks for the quick review.


>> And perhaps a bit in emit-rtl.c for good measure? I'll see if/where
>> this causes breakage...
>
> I think this would need to be wrapped in a #ifdef ENABLE_CHECKING/#endif pair.

Well, let me first try to make it pass some set of tests. This breaks
the compiler in too many places to name right now. Here's the first of
what's probably going to be a whole series of patches if I keep
hitting this internal_error as often as I am now in my set of cc1-i
files...

I know: ChangeLog, testing and all that. That's not the point yet.
This is just a brain dump, to get this saved in some places safer than
the compile farm or (worse) my head ;-)


Index: optabs.c
===================================================================
--- optabs.c    (revision 193394)
+++ optabs.c    (working copy)
@@ -170,9 +170,9 @@ optab_libfunc (optab optab, enum machine

    If the last insn does not set TARGET, don't do anything, but return 1.

-   If a previous insn sets TARGET and TARGET is one of OP0 or OP1,
-   don't add the REG_EQUAL note but return 0.  Our caller can then try
-   again, ensuring that TARGET is not one of the operands.  */
+   If the last insn or a previous insn sets TARGET and TARGET is one of OP0
+   or OP1, don't add the REG_EQUAL note but return 0.  Our caller can then
+   try again, ensuring that TARGET is not one of the operands.  */

 static int
 add_equal_note (rtx insns, rtx target, enum rtx_code code, rtx op0, rtx op1)
@@ -192,6 +192,12 @@ add_equal_note (rtx insns, rtx target, e
   if (GET_CODE (target) == ZERO_EXTRACT)
     return 1;

+  /* If TARGET is in OP0 or OP1, punt.  We'd end up with a note referencing
+     a value changing in the insn, so the note would be invalid for CSE.  */
+  if (reg_overlap_mentioned_p (target, op0)
+      || (op1 && reg_overlap_mentioned_p (target, op1)))
+    return 0;
+
   for (last_insn = insns;
        NEXT_INSN (last_insn) != NULL_RTX;
        last_insn = NEXT_INSN (last_insn))
@@ -207,21 +213,6 @@ add_equal_note (rtx insns, rtx target, e
          || ! rtx_equal_p (XEXP (SET_DEST (set), 0), target)))
     return 1;

-  /* If TARGET is in OP0 or OP1, check if anything in SEQ sets TARGET
-     besides the last insn.  */
-  if (reg_overlap_mentioned_p (target, op0)
-      || (op1 && reg_overlap_mentioned_p (target, op1)))
-    {
-      insn = PREV_INSN (last_insn);
-      while (insn != NULL_RTX)
-       {
-         if (reg_set_p (target, insn))
-           return 0;
-
-         insn = PREV_INSN (insn);
-       }
-    }
-
   if (GET_RTX_CLASS (code) == RTX_UNARY)
     switch (code)
       {

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-11-27 23:29                                   ` Steven Bosscher
@ 2012-11-27 23:50                                     ` Eric Botcazou
  2012-11-27 23:54                                       ` Steven Bosscher
  0 siblings, 1 reply; 68+ messages in thread
From: Eric Botcazou @ 2012-11-27 23:50 UTC (permalink / raw)
  To: Steven Bosscher
  Cc: gcc-patches, Paolo Bonzini, Dominique Dhumieres, hubicka,
	Richard Henderson, Jeff Law

> Count me in, too.  So let's avoid it:
> 
>         * gcse.c (struct reg_use): Remove unused struct.
>         (gcse_emit_move_after): Do not create REG_EQUAL notes that reference
> the SET_DEST of the instruction the note would be attached to.

OK (with PR rtl-optimization/55006) if it passes bootstrap & regtest, thanks.

> And perhaps a bit in emit-rtl.c for good measure? I'll see if/where
> this causes breakage...

I think this would need to be wrapped in a #ifdef ENABLE_CHECKING/#endif pair.

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-11-27 16:59                                 ` Eric Botcazou
@ 2012-11-27 23:29                                   ` Steven Bosscher
  2012-11-27 23:50                                     ` Eric Botcazou
  0 siblings, 1 reply; 68+ messages in thread
From: Steven Bosscher @ 2012-11-27 23:29 UTC (permalink / raw)
  To: Eric Botcazou
  Cc: Paolo Bonzini, gcc-patches, Dominique Dhumieres, hubicka,
	Richard Henderson, Jeff Law

On Tue, Nov 27, 2012 at 5:57 PM, Eric Botcazou <ebotcazou@adacore.com> wrote:
>> This note seems very very weird. For one thing, it becomes invalid on
>> the very instruction where it is created.  I would say that it should
>> not be there.
>
> Agreed.

Count me in, too.  So let's avoid it:

        * gcse.c (struct reg_use): Remove unused struct.
        (gcse_emit_move_after): Do not create REG_EQUAL notes that reference
        the SET_DEST of the instruction the note would be attached to.

Index: gcse.c
===================================================================
--- gcse.c      (revision 193394)
+++ gcse.c      (working copy)
@@ -258,8 +258,6 @@ int flag_rerun_cse_after_global_opts;
 /* An obstack for our working variables.  */
 static struct obstack gcse_obstack;

-struct reg_use {rtx reg_rtx; };
-
 /* Hash table of expressions.  */

 struct expr
@@ -2482,23 +2480,27 @@ gcse_emit_move_after (rtx dest, rtx src,
   rtx new_rtx;
   rtx set = single_set (insn), set2;
   rtx note;
-  rtx eqv;
+  rtx eqv = NULL_RTX;

   /* This should never fail since we're creating a reg->reg copy
      we've verified to be valid.  */

   new_rtx = emit_insn_after (gen_move_insn (dest, src), insn);

-  /* Note the equivalence for local CSE pass.  */
+  /* Note the equivalence for local CSE pass.  Take the note from the old
+     set if there was one.  Otherwise record the SET_SRC from the old set
+     unless DEST is also an operand of the SET_SRC.  */
   set2 = single_set (new_rtx);
   if (!set2 || !rtx_equal_p (SET_DEST (set2), dest))
     return new_rtx;
   if ((note = find_reg_equal_equiv_note (insn)))
     eqv = XEXP (note, 0);
-  else
+  else if (! REG_P (dest)
+          || ! reg_mentioned_p (dest, SET_SRC (set)))
     eqv = SET_SRC (set);

-  set_unique_reg_note (new_rtx, REG_EQUAL, copy_insn_1 (eqv));
+  if (eqv != NULL_RTX)
+    set_unique_reg_note (new_rtx, REG_EQUAL, copy_insn_1 (eqv));

   return new_rtx;
 }


And perhaps a bit in emit-rtl.c for good measure? I'll see if/where
this causes breakage...

Index: emit-rtl.c
===================================================================
--- emit-rtl.c  (revision 193394)
+++ emit-rtl.c  (working copy)
@@ -4932,6 +4932,19 @@ gen_use (rtx x)
   return seq;
 }

+/* Return true if DATUM has a USE of the SET_DEST of INSN.  */
+static bool
+self_ref_note_p (rtx insn, rtx datum)
+{
+  rtx set = single_set (insn);
+  if (! set)
+    return false;
+  rtx dest = SET_DEST (set);
+  if (! REG_P (dest))
+    return false;
+  return reg_mentioned_p (dest, datum);
+}
+
 /* Place a note of KIND on insn INSN with DATUM as the datum. If a
    note of this type already exists, remove it first.  */

@@ -4961,6 +4974,8 @@ set_unique_reg_note (rtx insn, enum reg_

       if (note)
        {
+         if (self_ref_note_p (insn, datum))
+           internal_error ("self-reference in REG_EQUAL or REG_EQUIV note!\n");
          XEXP (note, 0) = datum;
          df_notes_rescan (insn);
          return note;
@@ -4982,6 +4997,8 @@ set_unique_reg_note (rtx insn, enum reg_
     {
     case REG_EQUAL:
     case REG_EQUIV:
+      if (self_ref_note_p (insn, datum))
+       internal_error ("self-reference in REG_EQUAL or REG_EQUIV note!\n");
       df_notes_rescan (insn);
       break;
     default:

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-11-27 14:33                               ` Paolo Bonzini
@ 2012-11-27 16:59                                 ` Eric Botcazou
  2012-11-27 23:29                                   ` Steven Bosscher
  0 siblings, 1 reply; 68+ messages in thread
From: Eric Botcazou @ 2012-11-27 16:59 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: gcc-patches, Steven Bosscher, Dominique Dhumieres, hubicka,
	Richard Henderson, Jeff Law

> This note seems very very weird.  For one thing, it becomes invalid on
> the very instruction where it is created.  I would say that it should
> not be there.

Agreed.

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-11-27 14:25                                   ` Dominique Dhumieres
@ 2012-11-27 14:49                                     ` Steven Bosscher
  0 siblings, 0 replies; 68+ messages in thread
From: Steven Bosscher @ 2012-11-27 14:49 UTC (permalink / raw)
  To: Dominique Dhumieres, Vladimir Makarov
  Cc: ebotcazou, rth, law, hubicka, gcc-patches, bonzini

On Tue, Nov 27, 2012 at 3:25 PM, Dominique Dhumieres <dominiq@lps.ens.fr> wrote:
>> And more band-aid, ...
>
> The gcc_assert triggers at bootstrap when compiling gcc/ada/ali.adb:
>
> +===========================GNAT BUG DETECTED==============================+
> | 4.8.0 20121127 (experimental) [trunk revision 193848p10] (x86_64-apple-darwin10.8.0) GCC error:|
> | in df_remove_dead_eq_notes, at df-problems.c:2917                        |
> | Error detected around ../../work/gcc/ada/ali.adb:2682:8                  |
> | Please submit a bug report; see http://gcc.gnu.org/bugs.html.            |
> | Use a subject line meaningful to you and us to track the bug.            |
> | Include the entire contents of this bug box in the report.               |
> | Include the exact gcc or gnatmake command that you entered.              |
> | Also include sources listed below in gnatchop format                     |
> | (concatenated together with no headers between files).                   |
> +==========================================================================+

Yes, I found that one already, too.

And, oh joy, we have pseudos in REG_EQUAL notes after LRA! (Probably
also after reload, btw.).  In the ICE I got, a pseudo's live range got
split and an inheritance move is injected, but REG_EQUAL notes were
not updated or removed. Finding and removing the notes is hard in IRA
and LRA because they don't use the DF caches.

Ciao!
Steven

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-11-27 12:01                             ` Steven Bosscher
  2012-11-27 12:29                               ` Steven Bosscher
@ 2012-11-27 14:33                               ` Paolo Bonzini
  2012-11-27 16:59                                 ` Eric Botcazou
  1 sibling, 1 reply; 68+ messages in thread
From: Paolo Bonzini @ 2012-11-27 14:33 UTC (permalink / raw)
  To: Steven Bosscher
  Cc: Eric Botcazou, gcc-patches, Dominique Dhumieres, hubicka,
	Richard Henderson, Jeff Law

Il 27/11/2012 13:00, Steven Bosscher ha scritto:
> It all
> starts with PRE creating a REG_EQUAL note that references the register
> that's SET in the insn the note is attached to, but the register is
> live after the insn so from that point of view the note is not
> invalid.

This note seems very very weird.  For one thing, it becomes invalid on
the very instruction where it is created.  I would say that it should
not be there.

Paolo

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-11-27 13:04                                 ` Steven Bosscher
@ 2012-11-27 14:25                                   ` Dominique Dhumieres
  2012-11-27 14:49                                     ` Steven Bosscher
  0 siblings, 1 reply; 68+ messages in thread
From: Dominique Dhumieres @ 2012-11-27 14:25 UTC (permalink / raw)
  To: stevenb.gcc, ebotcazou; +Cc: rth, law, hubicka, gcc-patches, dominiq, bonzini

> And more band-aid, ...

The gcc_assert triggers at bootstrap when compiling gcc/ada/ali.adb:

+===========================GNAT BUG DETECTED==============================+
| 4.8.0 20121127 (experimental) [trunk revision 193848p10] (x86_64-apple-darwin10.8.0) GCC error:|
| in df_remove_dead_eq_notes, at df-problems.c:2917                        |
| Error detected around ../../work/gcc/ada/ali.adb:2682:8                  |
| Please submit a bug report; see http://gcc.gnu.org/bugs.html.            |
| Use a subject line meaningful to you and us to track the bug.            |
| Include the entire contents of this bug box in the report.               |
| Include the exact gcc or gnatmake command that you entered.              |
| Also include sources listed below in gnatchop format                     |
| (concatenated together with no headers between files).                   |
+==========================================================================+

Dominique

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-11-27 12:29                               ` Steven Bosscher
  2012-11-27 13:04                                 ` Steven Bosscher
@ 2012-11-27 13:48                                 ` Dominique Dhumieres
  1 sibling, 0 replies; 68+ messages in thread
From: Dominique Dhumieres @ 2012-11-27 13:48 UTC (permalink / raw)
  To: stevenb.gcc, ebotcazou; +Cc: rth, law, hubicka, gcc-patches, dominiq, bonzini

> Dominique, could you give this a try and see if it helps?

With the patch, the miscompilation of aermod.f90 is gone.

Thanks,

Dominique

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-11-27 12:29                               ` Steven Bosscher
@ 2012-11-27 13:04                                 ` Steven Bosscher
  2012-11-27 14:25                                   ` Dominique Dhumieres
  2012-11-27 13:48                                 ` Dominique Dhumieres
  1 sibling, 1 reply; 68+ messages in thread
From: Steven Bosscher @ 2012-11-27 13:04 UTC (permalink / raw)
  To: Eric Botcazou
  Cc: gcc-patches, Dominique Dhumieres, bonzini, hubicka,
	Richard Henderson, Jeff Law

On Tue, Nov 27, 2012 at 1:28 PM, Steven Bosscher wrote:
> Dominique, could you give this a try and see if it helps?
> (But as I said up-thread: I'm not sure this is a proper fix, or just
> another band-aid...)

And more band-aid, but this time I'm not even sure where things go
wrong. In any case, we end up with a REG_EQUAL note referencing a
SUBREG of a dead register. This leaked because df_remove_dead_eq_notes
uses loc_mentioned_in_p not on the note but on the first operand of
the note.

Index: df-problems.c
===================================================================
--- df-problems.c       (revision 193394)
+++ df-problems.c       (working copy)
@@ -2907,9 +2907,10 @@ df_remove_dead_eq_notes (rtx insn, bitma
                if (DF_REF_REGNO (use) > FIRST_PSEUDO_REGISTER
                    && DF_REF_LOC (use)
                    && (DF_REF_FLAGS (use) & DF_REF_IN_NOTE)
-                   && ! bitmap_bit_p (live, DF_REF_REGNO (use))
-                   && loc_mentioned_in_p (DF_REF_LOC (use), XEXP (link, 0)))
+                   && ! bitmap_bit_p (live, DF_REF_REGNO (use)))
                  {
+                   /* Make sure that DF_SCAN is up-to-date.  */
+                   gcc_assert (loc_mentioned_in_p (DF_REF_LOC (use), link));
                    deleted = true;
                    break;
                  }

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-11-27 12:01                             ` Steven Bosscher
@ 2012-11-27 12:29                               ` Steven Bosscher
  2012-11-27 13:04                                 ` Steven Bosscher
  2012-11-27 13:48                                 ` Dominique Dhumieres
  2012-11-27 14:33                               ` Paolo Bonzini
  1 sibling, 2 replies; 68+ messages in thread
From: Steven Bosscher @ 2012-11-27 12:29 UTC (permalink / raw)
  To: Eric Botcazou
  Cc: gcc-patches, Dominique Dhumieres, bonzini, hubicka,
	Richard Henderson, Jeff Law

On Tue, Nov 27, 2012 at 1:00 PM, Steven Bosscher wrote:
> Now that USE of r719 is a use of a dead register, rendering the
> REG_EQUAL invalid. From there on the problem is the same as the ones I
> had to "fix" in loop-unroll.c. First the webizer puts the USE in a
> different web and renames the USE to r1591:
>
>  2889 r719:DI=r1562:DI
>       REG_EQUAL: r1591:DI+0x4

With this patch, the self-referencing EQ_USE is kept together with its DEF:

 2889 r719:DI=r1562:DI
      REG_EQUAL: r719:DI+0x4

Dominique, could you give this a try and see if it helps?
(But as I said up-thread: I'm not sure this is a proper fix, or just
another band-aid...)

Ciao!
Steven


        * web.c (union_eq_uses): New function to keep self-referencing
notes intact.
        (web_main): Call it.

Index: web.c
===================================================================
--- web.c       (revision 193394)
+++ web.c       (working copy)
@@ -148,6 +148,35 @@ union_match_dups (rtx insn, struct web_e
     }
 }

+/* If INSN has a REG_EQUAL note that references a DEF of INSN, union
+   them.  FUN is the function that does the union.  */
+
+static void
+union_eq_uses (rtx insn, struct web_entry *def_entry,
+              struct web_entry *use_entry,
+              bool (*fun) (struct web_entry *, struct web_entry *))
+{
+  struct df_insn_info *insn_info = DF_INSN_INFO_GET (insn);
+  df_ref *def_link = DF_INSN_INFO_DEFS (insn_info);
+  df_ref *eq_use_link = DF_INSN_INFO_EQ_USES (insn_info);
+
+  if (! (def_link && eq_use_link))
+    return;
+
+  while (*eq_use_link)
+    {
+      df_ref *def_rec = def_link;
+      while (*def_rec)
+       {
+         if (DF_REF_REAL_REG (*eq_use_link) == DF_REF_REAL_REG (*def_rec))
+           (*fun) (use_entry + DF_REF_ID (*eq_use_link),
+                   def_entry + DF_REF_ID (*def_rec));
+         def_rec++;
+       }
+      eq_use_link++;
+    }
+}
+
 /* For each use, all possible defs reaching it must come in the same
    register, union them.
    FUN is the function that does the union.
@@ -390,6 +419,7 @@ web_main (void)
              if (DF_REF_REGNO (use) >= FIRST_PSEUDO_REGISTER)
                union_defs (use, def_entry, used, use_entry, unionfind_union);
            }
+         union_eq_uses (insn, def_entry, use_entry, unionfind_union);
        }
     }

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-11-27 10:35                           ` Steven Bosscher
@ 2012-11-27 12:01                             ` Steven Bosscher
  2012-11-27 12:29                               ` Steven Bosscher
  2012-11-27 14:33                               ` Paolo Bonzini
  0 siblings, 2 replies; 68+ messages in thread
From: Steven Bosscher @ 2012-11-27 12:01 UTC (permalink / raw)
  To: Eric Botcazou
  Cc: gcc-patches, Dominique Dhumieres, bonzini, hubicka,
	Richard Henderson, Jeff Law

On Tue, Nov 27, 2012 at 11:34 AM, Steven Bosscher wrote:
>>> gcc/
>>>       * loop-unroll.c (struct iv_to_split): Add new 'orig_var' member.
>>>       (analyze_iv_to_split_insn): Record it.
>>>       (maybe_strip_eq_note_for_split_iv): New function to remove REG_EQUAL
>>>       notes that refer to IVs that are being split.
>>>       (apply_opt_in_copies): Use maybe_strip_eq_note_for_split_iv.  Twice.
>>>       Use FOR_BB_INSNS_SAFE.
>>
>> That's fine with me, thanks.  You might want to defer applying it until the
>> reason why it isn't apparently sufficient for aermod.f90 is understood though.
>
> Thanks. Yes, I'm first going to try and figure out why this doesn't
> fix aermod for Dominique. I suspect the problem is in variable
> expansion in the unroller. A previous patch of mine updates REG_EQUAL
> notes in variable expansion, but it doesn't clean up notes that refer
> to maybe dead registers.

Well, that's not it. But what I said below:


> I have to say, I've got a uncomfortable feeling about REG_EQUAL notes
> not being allowed to refer to dead registers.

applies even more after analyzing Dominique's bug.

At the start of RTL PRE we have:

 2000 {r719:DI=r719:DI+0x4;clobber flags:CC;}
      REG_UNUSED: flags:CC

where r719:DI+0x4 is found to be partially redundant. So PRE optimizes this to:

 2889 r719:DI=r1562:DI
      REG_EQUAL: r719:DI+0x4
...
 2913 r1562:DI=r719:DI+0x4

This self-reference in the REG_EQUAL note is the problem. The SET
kills r719, but there is a USE in the PRE'ed expression that keeps
r719 alive. But this use is copy-propagated away in CPROP2:

 2913 r1562:DI=r1562:DI+0x4

Now that USE of r719 is a use of a dead register, rendering the
REG_EQUAL invalid. From there on the problem is the same as the ones I
had to "fix" in loop-unroll.c. First the webizer puts the USE in a
different web and renames the USE to r1591:

 2889 r719:DI=r1562:DI
      REG_EQUAL: r1591:DI+0x4

CSE2 uses this and the equivalence of r719 and r1562 to "optimize" the
PRE expression:

 2913 r1562:DI=r1591:DI+0x8

Then init-regs notices that this reg is quasi-used uninitialized:

 3122 r1591:DI=0
 2913 r1562:DI=r1591:DI+0x8
      REG_DEAD: r1591:DI

And combine finally produces:

 2913 r1562:DI=0x8


I'm not sure what to name as the "root cause" for all of this. It all
starts with PRE creating a REG_EQUAL note that references the register
that's SET in the insn the note is attached to, but the register is
live after the insn so from that point of view the note is not
invalid. CPROP2 kills r179 by propagating it out and making the note
invalid. I think CPROP2 could do that to any register, and if passes
should make sure REG_EQUAL notes are updated or removed then this is a
bug in RTL CPROP.

Very, very messy...

Ciao!
Steven

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-11-27  9:58                         ` Eric Botcazou
@ 2012-11-27 10:35                           ` Steven Bosscher
  2012-11-27 12:01                             ` Steven Bosscher
  0 siblings, 1 reply; 68+ messages in thread
From: Steven Bosscher @ 2012-11-27 10:35 UTC (permalink / raw)
  To: Eric Botcazou
  Cc: gcc-patches, Dominique Dhumieres, bonzini, hubicka,
	Richard Henderson, Jeff Law

On Tue, Nov 27, 2012 at 10:55 AM, Eric Botcazou <ebotcazou@adacore.com> wrote:
>> Or rather this one. Same hammer, different color. It turns out that
>> the rtlanal.c change caused problems, so I've got to use a home-brewn
>> equivalent of remove_reg_equal_equiv_notes_for_regno...
>>
>> Test case is unchanged so I'm omitting it here.
>>
>> Ciao!
>> Steven
>>
>>
>> gcc/
>>       * loop-unroll.c (struct iv_to_split): Add new 'orig_var' member.
>>       (analyze_iv_to_split_insn): Record it.
>>       (maybe_strip_eq_note_for_split_iv): New function to remove REG_EQUAL
>>       notes that refer to IVs that are being split.
>>       (apply_opt_in_copies): Use maybe_strip_eq_note_for_split_iv.  Twice.
>>       Use FOR_BB_INSNS_SAFE.
>
> That's fine with me, thanks.  You might want to defer applying it until the
> reason why it isn't apparently sufficient for aermod.f90 is understood though.

Thanks. Yes, I'm first going to try and figure out why this doesn't
fix aermod for Dominique. I suspect the problem is in variable
expansion in the unroller. A previous patch of mine updates REG_EQUAL
notes in variable expansion, but it doesn't clean up notes that refer
to maybe dead registers.

I have to say, I've got a uncomfortable feeling about REG_EQUAL notes
not being allowed to refer to dead registers. In the case at hand, the
code basically does:

S1: r1 = r2 + 0x4  // r2 is the reg from split_iv, r1 was the original IV
S2: r3 = mem[r1]
S3: if ... goto L1;
S4: r4 = r3 // REG_EQUAL mem[r1]
L1:
S5: r1 = r2 + 0x8

At S4, r1 is not live in the usual meaning of register liveness, but
the DEF from S1 still reaches the REG_EQUAL note. This situation is
not only created by loop unrolling. At least gcse.c (PRE),
loop-invariant.c, cse.c, dce.c and combine.c can create situations
like the above. ifcvt.c jumps through hoops to avoid it (see e.g.
PR46315, which you fixed).

Most of the problems are papered over by occasional calls to
df_note_add_problem from some passes, so that df_remove_dead_eq_notes
cleans up any notes referencing dead registers. But if we really want
to declare this kind of REG_EQUAL note reference to a dead register
invalid RTL (which is something at least you, Paolo, and me agree on),
and we expect passes to clean up their own mess by either updating or
removing any notes they invalidate, then df_remove_dead_eq_notes
shouldn't be necessary for correctness because all notes it encounters
should be valid (being updated by passes).

Removing df_remove_dead_eq_notes breaks bootstrap on the four targets
I tried it on (x86_64, powerpc64, ia64, and hppa). That is, breaks
*without* -funroll-loops, and *without* -fweb. With a hack to disable
pass_initialize_regs, bootstrap still fails, but other files are
miscompiled. With another hack on top to disable CSE2, bootstrap still
fails, and yet other files are miscompiled.

What I'm really trying to say, is that even when I fix this particular
issue with aermod (which is apparently 3 issues: the one I fixed last
month for x86_64, the one fixed with the patch in this thread for
SPARC, and the yet-to-be-identified problem Dominique is still seeing
on darwin10) then it's likely other, similar bugs will show up. Bugs
that are hard to reproduce, dependent on the target's RTL, and
difficult to debug as they are usually wrong-code bugs on larger test
cases.

We really need a more robust solution. I've added Jeff and rth to the
CC, looking for opinions/thoughts/suggestions/$0.02s.

Ciao!
Steven

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-11-25 22:38                       ` Steven Bosscher
  2012-11-26 14:38                         ` Dominique Dhumieres
@ 2012-11-27  9:58                         ` Eric Botcazou
  2012-11-27 10:35                           ` Steven Bosscher
  1 sibling, 1 reply; 68+ messages in thread
From: Eric Botcazou @ 2012-11-27  9:58 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: gcc-patches, Dominique Dhumieres, bonzini, hubicka

> Or rather this one. Same hammer, different color. It turns out that
> the rtlanal.c change caused problems, so I've got to use a home-brewn
> equivalent of remove_reg_equal_equiv_notes_for_regno...
> 
> Test case is unchanged so I'm omitting it here.
> 
> Ciao!
> Steven
> 
> 
> gcc/
> 	* loop-unroll.c (struct iv_to_split): Add new 'orig_var' member.
> 	(analyze_iv_to_split_insn): Record it.
> 	(maybe_strip_eq_note_for_split_iv): New function to remove REG_EQUAL
> 	notes that refer to IVs that are being split.
> 	(apply_opt_in_copies): Use maybe_strip_eq_note_for_split_iv.  Twice.
> 	Use FOR_BB_INSNS_SAFE.

That's fine with me, thanks.  You might want to defer applying it until the 
reason why it isn't apparently sufficient for aermod.f90 is understood though.

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-11-26 14:38                         ` Dominique Dhumieres
@ 2012-11-26 15:46                           ` Steven Bosscher
  0 siblings, 0 replies; 68+ messages in thread
From: Steven Bosscher @ 2012-11-26 15:46 UTC (permalink / raw)
  To: Dominique Dhumieres; +Cc: ebotcazou, hubicka, gcc-patches, bonzini

On Mon, Nov 26, 2012 at 3:38 PM, Dominique Dhumieres <dominiq@lps.ens.fr> wrote:
>> Or rather this one. Same hammer, different color. It turns out that
>> the rtlanal.c change caused problems, so I've got to use a home-brewn
>> equivalent of remove_reg_equal_equiv_notes_for_regno...
>
> This does not fix pr55006 on x86_64-apple-darwin10;-(while the patch in
> http://gcc.gnu.org/ml/gcc-patches/2012-11/msg01607.html
> does).

Does it fail on x86_64-linux also? If not as-is, can you try with -fPIC?


>> Test case is unchanged so I'm omitting it here.
>
> This test passes on x86_64-apple-darwin10 for a clean trunk.

Yes, I know. As you can probably tell from the patch, I've been
working on an older revision because I can't reproduce the problem on
the trunk. It's a very elusive problem.

Ciao!
Steven

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-11-25 22:38                       ` Steven Bosscher
@ 2012-11-26 14:38                         ` Dominique Dhumieres
  2012-11-26 15:46                           ` Steven Bosscher
  2012-11-27  9:58                         ` Eric Botcazou
  1 sibling, 1 reply; 68+ messages in thread
From: Dominique Dhumieres @ 2012-11-26 14:38 UTC (permalink / raw)
  To: stevenb.gcc, ebotcazou; +Cc: hubicka, gcc-patches, dominiq, bonzini

> Or rather this one. Same hammer, different color. It turns out that
> the rtlanal.c change caused problems, so I've got to use a home-brewn
> equivalent of remove_reg_equal_equiv_notes_for_regno...

This does not fix pr55006 on x86_64-apple-darwin10;-(while the patch in
http://gcc.gnu.org/ml/gcc-patches/2012-11/msg01607.html
does).

> Test case is unchanged so I'm omitting it here.

This test passes on x86_64-apple-darwin10 for a clean trunk.

Thanks for looking at the problem.

Dominique

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-11-25 20:44                     ` Steven Bosscher
@ 2012-11-25 22:38                       ` Steven Bosscher
  2012-11-26 14:38                         ` Dominique Dhumieres
  2012-11-27  9:58                         ` Eric Botcazou
  0 siblings, 2 replies; 68+ messages in thread
From: Steven Bosscher @ 2012-11-25 22:38 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: gcc-patches, Dominique Dhumieres, bonzini, hubicka

On Sun, Nov 25, 2012 at 9:44 PM, Steven Bosscher wrote:
> On Sat, Nov 24, 2012 at 2:09 AM, Steven Bosscher wrote:
>> On Fri, Nov 23, 2012 at 11:45 PM, Steven Bosscher wrote:
>>> Removing the note is easier but it may hurt optimization later on:
>>> CSE2 puts the note back in but this introduces a pass ordering
>>> dependency. Perhaps that's not a big problem, but I'm not sure...
>>
>> Hmm, actually CSE2 fails to put the note back, I guess because it's a
>> local CSE pass only...
>
> Here's another possible fix: Just punt on all REG_EQUAL notes for the
> IV. It's a bit of a big hammer, but there are no code changes in my
> set of cc1-i files (compiled with -funroll-loops) on x86_64 and
> powerpc64 so apparently it's not all that important to keep the notes.
>
> Comments on this one?

Or rather this one. Same hammer, different color. It turns out that
the rtlanal.c change caused problems, so I've got to use a home-brewn
equivalent of remove_reg_equal_equiv_notes_for_regno...

Test case is unchanged so I'm omitting it here.

Ciao!
Steven


gcc/
	* loop-unroll.c (struct iv_to_split): Add new 'orig_var' member.
	(analyze_iv_to_split_insn): Record it.
	(maybe_strip_eq_note_for_split_iv): New function to remove REG_EQUAL
	notes that refer to IVs that are being split.
	(apply_opt_in_copies): Use maybe_strip_eq_note_for_split_iv.  Twice.
	Use FOR_BB_INSNS_SAFE.

testsuite/
	* gcc.dg/pr55006.c: New test.

Index: loop-unroll.c
===================================================================
--- loop-unroll.c	(revision 193394)
+++ loop-unroll.c	(working copy)
@@ -74,6 +74,7 @@ along with GCC; see the file COPYING3.
 struct iv_to_split
 {
   rtx insn;		/* The insn in that the induction variable occurs.  */
+  rtx orig_var;		/* The variable (register) for the IV before split.  */
   rtx base_var;		/* The variable on that the values in the further
 			   iterations are based.  */
   rtx step;		/* Step of the induction variable.  */
@@ -1833,6 +1834,7 @@ analyze_iv_to_split_insn (rtx insn)
   /* Record the insn to split.  */
   ivts = XNEW (struct iv_to_split);
   ivts->insn = insn;
+  ivts->orig_var = dest;
   ivts->base_var = NULL_RTX;
   ivts->step = iv.step;
   ivts->next = NULL;
@@ -2253,6 +2255,32 @@ combine_var_copies_in_loop_exit (struct
   emit_insn_after (seq, insn);
 }

+/* Strip away REG_EQUAL notes for IVs we're splitting.
+
+   Updating REG_EQUAL notes for IVs we split is tricky: We
+   cannot tell until after unrolling, DF-rescanning, and liveness
+   updating, whether an EQ_USE is reached by the split IV while
+   the IV reg is still live.  See PR55006.
+
+   ??? We cannot use remove_reg_equal_equiv_notes_for_regno,
+   because RTL loop-iv requires us to defer rescanning insns and
+   any notes attached to them.  So resort to old techniques...  */
+
+static void
+maybe_strip_eq_note_for_split_iv (struct opt_info *opt_info, rtx insn)
+{
+  struct iv_to_split *ivts;
+  rtx note = find_reg_equal_equiv_note (insn);
+  if (! note)
+    return;
+  for (ivts = opt_info->iv_to_split_head; ivts; ivts = ivts->next)
+    if (reg_mentioned_p (ivts->orig_var, note))
+      {
+	remove_note (insn, note);
+	return;
+      }
+}
+
 /* Apply loop optimizations in loop copies using the
    data which gathered during the unrolling.  Structure
    OPT_INFO record that data.
@@ -2293,9 +2321,8 @@ apply_opt_in_copies (struct opt_info *op
 					unrolling);
       bb->aux = 0;
       orig_insn = BB_HEAD (orig_bb);
-      for (insn = BB_HEAD (bb); insn != NEXT_INSN (BB_END (bb)); insn = next)
+      FOR_BB_INSNS_SAFE (bb, insn, next)
         {
-          next = NEXT_INSN (insn);
 	  if (!INSN_P (insn)
 	      || (DEBUG_INSN_P (insn)
 		  && TREE_CODE (INSN_VAR_LOCATION_DECL (insn)) == LABEL_DECL))
@@ -2313,6 +2340,8 @@ apply_opt_in_copies (struct opt_info *op
           /* Apply splitting iv optimization.  */
           if (opt_info->insns_to_split)
             {
+	      maybe_strip_eq_note_for_split_iv (opt_info, insn);
+
               ivts = (struct iv_to_split *)
 		htab_find (opt_info->insns_to_split, &ivts_templ);

@@ -2378,6 +2407,8 @@ apply_opt_in_copies (struct opt_info *op
           ivts_templ.insn = orig_insn;
           if (opt_info->insns_to_split)
             {
+	      maybe_strip_eq_note_for_split_iv (opt_info, orig_insn);
+
               ivts = (struct iv_to_split *)
 		htab_find (opt_info->insns_to_split, &ivts_templ);
               if (ivts)

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-11-24  1:10                   ` Steven Bosscher
@ 2012-11-25 20:44                     ` Steven Bosscher
  2012-11-25 22:38                       ` Steven Bosscher
  0 siblings, 1 reply; 68+ messages in thread
From: Steven Bosscher @ 2012-11-25 20:44 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: gcc-patches, Dominique Dhumieres, bonzini, hubicka

On Sat, Nov 24, 2012 at 2:09 AM, Steven Bosscher wrote:
> On Fri, Nov 23, 2012 at 11:45 PM, Steven Bosscher wrote:
>> Removing the note is easier but it may hurt optimization later on:
>> CSE2 puts the note back in but this introduces a pass ordering
>> dependency. Perhaps that's not a big problem, but I'm not sure...
>
> Hmm, actually CSE2 fails to put the note back, I guess because it's a
> local CSE pass only...

Here's another possible fix: Just punt on all REG_EQUAL notes for the
IV. It's a bit of a big hammer, but there are no code changes in my
set of cc1-i files (compiled with -funroll-loops) on x86_64 and
powerpc64 so apparently it's not all that important to keep the notes.

Comments on this one?

Ciao!
Steven

gcc/
	* rtlanal.c (remove_reg_equal_equiv_notes_for_regno): Fix for
	deferred re-scanning.
	* loop-unroll.c (struct iv_to_split): Add new 'orig_var' member.
	(analyze_iv_to_split_insn): Record it.
	(analyze_insns_in_loop): Remove all REG_EQUAL notes for IVs that
	will be split.
	(apply_opt_in_copies): Use FOR_BB_INSNS_SAFE.

testsuite/
	* gcc.dg/pr55006.c: New test.

Index: rtlanal.c
===================================================================
--- rtlanal.c	(revision 193394)
+++ rtlanal.c	(working copy)
@@ -2015,15 +2015,18 @@ remove_reg_equal_equiv_notes (rtx insn)
 void
 remove_reg_equal_equiv_notes_for_regno (unsigned int regno)
 {
-  df_ref eq_use;
+  df_ref eq_use, next_eq_use;

   if (!df)
     return;

   /* This loop is a little tricky.  We cannot just go down the chain because
-     it is being modified by some actions in the loop.  So we just iterate
-     over the head.  We plan to drain the list anyway.  */
-  while ((eq_use = DF_REG_EQ_USE_CHAIN (regno)) != NULL)
+     it may be modified by some actions in the loop if re-scanning is not
+     deferred.  We also can't just iterate over the head because the chain
+     may be *not* modified if re-scanning *is* deferred.  */
+  for (eq_use = DF_REG_EQ_USE_CHAIN (regno);
+       eq_use != NULL;
+       eq_use = next_eq_use)
     {
       rtx insn = DF_REF_INSN (eq_use);
       rtx note = find_reg_equal_equiv_note (insn);
@@ -2033,6 +2036,8 @@ remove_reg_equal_equiv_notes_for_regno (
 	 remove_note.  */
       gcc_assert (note);

+      next_eq_use = DF_REF_NEXT_REG (eq_use);
+
       remove_note (insn, note);
     }
 }
Index: loop-unroll.c
===================================================================
--- loop-unroll.c	(revision 193394)
+++ loop-unroll.c	(working copy)
@@ -74,6 +74,7 @@ along with GCC; see the file COPYING3.
 struct iv_to_split
 {
   rtx insn;		/* The insn in that the induction variable occurs.  */
+  rtx orig_var;		/* The variable (register) for the IV before split.  */
   rtx base_var;		/* The variable on that the values in the further
 			   iterations are based.  */
   rtx step;		/* Step of the induction variable.  */
@@ -1833,6 +1834,7 @@ analyze_iv_to_split_insn (rtx insn)
   /* Record the insn to split.  */
   ivts = XNEW (struct iv_to_split);
   ivts->insn = insn;
+  ivts->orig_var = dest;
   ivts->base_var = NULL_RTX;
   ivts->step = iv.step;
   ivts->next = NULL;
@@ -1913,6 +1915,12 @@ analyze_insns_in_loop (struct loop *loop

         if (ivts)
           {
+	    /* Updating REG_EQUAL notes for this IV is tricky: We cannot tell
+	       until after unrolling whether an EQ_USE is reached by the split
+	       IV while the IV reg is still live.  See PR55006.
+	       Be conservative, remove all such notes.  */
+	    remove_reg_equal_equiv_notes_for_regno (REGNO (ivts->orig_var));
+
             slot1 = htab_find_slot (opt_info->insns_to_split, ivts, INSERT);
 	    gcc_assert (*slot1 == NULL);
             *slot1 = ivts;
@@ -2293,9 +2301,8 @@ apply_opt_in_copies (struct opt_info *op
 					unrolling);
       bb->aux = 0;
       orig_insn = BB_HEAD (orig_bb);
-      for (insn = BB_HEAD (bb); insn != NEXT_INSN (BB_END (bb)); insn = next)
+      FOR_BB_INSNS_SAFE (bb, insn, next)
         {
-          next = NEXT_INSN (insn);
 	  if (!INSN_P (insn)
 	      || (DEBUG_INSN_P (insn)
 		  && TREE_CODE (INSN_VAR_LOCATION_DECL (insn)) == LABEL_DECL))
Index: testsuite/gcc.dg/pr55006.c
===================================================================
--- testsuite/gcc.dg/pr55006.c	(revision 0)
+++ testsuite/gcc.dg/pr55006.c	(revision 0)
@@ -0,0 +1,88 @@
+/* PR rtl-optimization/55006 */
+/* { dg-do run } */
+/* { dg-options "-O3 -fomit-frame-pointer -funroll-loops" } */
+
+extern void abort (void) __attribute__ ((__noreturn__));
+extern void *memcpy (void *__restrict, const void *__restrict, __SIZE_TYPE__);
+
+static void __attribute__ ((__noinline__, __noclone__)) ga4076 (void)
+{
+  int D833;
+  float D834;
+  int D837;
+  int D840;
+  float D841;
+  int D844;
+  float dda[100];
+  int ids;
+  float limit3;
+  int offset2;
+  int pos1;
+  int S4;
+
+  static float A0[100] =
+  { 10e+0, 20e+0, 30e+0, 40e+0, 50e+0, 60e+0, 70e+0, 80e+0, 90e+0,
+    10e+1, 11e+1, 12e+1, 13e+1, 14e+1, 15e+1, 16e+1, 17e+1, 18e+1,
+    19e+1, 20e+1, 21e+1, 22e+1, 23e+1, 24e+1, 25e+1, 26e+1, 27e+1,
+    28e+1, 29e+1, 30e+1, 31e+1, 32e+1, 33e+1, 34e+1, 35e+1, 36e+1,
+    37e+1, 38e+1, 39e+1, 40e+1, 41e+1, 42e+1, 43e+1, 44e+1, 45e+1,
+    46e+1, 47e+1, 48e+1, 49e+1, 50e+1, 51e+1, 52e+1, 53e+1, 54e+1,
+    55e+1, 56e+1, 57e+1, 58e+1, 59e+1, 60e+1, 61e+1, 62e+1, 63e+1,
+    64e+1, 65e+1, 66e+1, 67e+1, 68e+1, 69e+1, 70e+1, 71e+1, 72e+1,
+    73e+1, 74e+1, 75e+1, 76e+1, 77e+1, 78e+1, 79e+1, 80e+1, 81e+1,
+    82e+1, 83e+1, 84e+1, 85e+1, 86e+1, 87e+1, 88e+1, 89e+1, 90e+1,
+    91e+1, 92e+1, 93e+1, 94e+1, 95e+1, 96e+1, 97e+1, 98e+1, 99e+1,
+    10e+2
+  };
+
+  memcpy (dda, A0, sizeof (dda));
+
+  limit3 = -1. * __builtin_inf();
+  pos1 = 0;
+  offset2 = 0;
+  S4 = 1;
+
+D831:
+  if (S4 > 100) goto L3; else goto D832;
+D832:
+  D833 = S4 - 1;
+  D834 = dda[D833];
+  if (D834 >= limit3) goto D835; else goto D836;
+D835:
+  D837 = S4 - 1;
+  limit3 = dda[D837];
+  pos1 = S4 + offset2;
+  goto L1;
+D836:
+  S4 = S4 + 1;
+  goto D831;
+L3:
+  pos1 = 1;
+  goto L2;
+L1:
+  if (S4 > 100) goto L2; else goto D839;
+D839:
+  D840 = S4 - 1;
+  D841 = dda[D840];
+  if (D841 > limit3) goto D842; else goto D843;
+D842:
+  D844 = S4 - 1;
+  limit3 = dda[D844];
+  pos1 = S4 + offset2;
+D843:
+  S4 = S4 + 1;
+  goto L1;
+L2:
+  ids = pos1;
+  if (ids != 100)
+    abort ();
+}
+
+
+int
+main (void)
+{
+  ga4076 ();
+  return 0;
+}
+

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-11-23 22:46                 ` Steven Bosscher
@ 2012-11-24  1:10                   ` Steven Bosscher
  2012-11-25 20:44                     ` Steven Bosscher
  0 siblings, 1 reply; 68+ messages in thread
From: Steven Bosscher @ 2012-11-24  1:10 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: gcc-patches, Dominique Dhumieres, bonzini, hubicka

On Fri, Nov 23, 2012 at 11:45 PM, Steven Bosscher wrote:
> Removing the note is easier but it may hurt optimization later on:
> CSE2 puts the note back in but this introduces a pass ordering
> dependency. Perhaps that's not a big problem, but I'm not sure...

Hmm, actually CSE2 fails to put the note back, I guess because it's a
local CSE pass only...

Ciao!
Steven

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-11-19 22:48               ` Steven Bosscher
@ 2012-11-23 22:46                 ` Steven Bosscher
  2012-11-24  1:10                   ` Steven Bosscher
  0 siblings, 1 reply; 68+ messages in thread
From: Steven Bosscher @ 2012-11-23 22:46 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: gcc-patches, Dominique Dhumieres, bonzini, hubicka

[-- Attachment #1: Type: text/plain, Size: 2970 bytes --]

On Mon, Nov 19, 2012 at 11:47 PM, Steven Bosscher wrote:
> On Mon, Nov 19, 2012 at 11:42 PM, Eric Botcazou wrote:
>>> That could be done, yes. Cleaning up the REG_EQ* notes requires
>>> liveness at the insn level, so it'd require a bigger re-organization
>>> of the code. Perhaps adding a new pass (conditional on DF_EQ_NOTES)
>>> over all insn in df_lr_finalize, tracking liveness and calling
>>> df_remove_dead_eq_notes on each insn.
>>>
>>> But most passes that use the REG_EQ* notes don't set the DF_EQ_NOTES flag.
>>
>> Right.  After the DF merge, the de-facto consensus was that individual passes
>> were still responsible for cleaning up the REG_EQ* notes when they change the
>> code, whereas the REG_DEAD/REG_UNUSED notes are entirely handled by DF.  As
>> such, the bug here is in the unroller, not in the webizer.
>
> Then it's in -fsplit-ivs-in-unroller. Looking into it.

This bug is giving me head-aches, but I think I finally understand
what is going wrong here.

Compiling the C test case with --param max-unroll-times=2, you get the
attached dumps. The loop that is unrolled to wrong-code is
bb10->bb7->bb8->bb9->bb10 in the loop2_unswitch dump (the last dump
before unrolling). The IV that is being split is in r132. In basic
block 9 of the loop2_unswitch dump there is the IV increment in insn
78:

   78 r132:SI=r132:SI+0x4

In the loop2_unroll dump the loop is
bb10->bb7->bb8->bb9->bb23->bb24->bb25->bb26->bb10. The loop is
unrolled by copying the body once, and the IV increment is expanded
into r165 at the end of basic block 9 in the loop2_unroll dump. This
makes r132 die after insn 181. The incremented IV is copied back into
r132 in insn 78, but there are no no-EQ_USES of this DEF so r132 is
still dead after insn 78. The IV increment of the unrolled loop
happens in insn 175 in basic block 25, which makes r132 live again:

  181 r165:SI=r132:SI+0x4
   78 r132:SI=r165:SI
...
  175 r132:SI=r165:SI+0x4

The insn with the REG_EQUAL note using r132 is insn 172 in the copied body:

  172 r131:SF=r158:SF
      REG_EQUAL: [r132:SI]
      REG_DEAD: r158:SF

With the (assumed agreed) semantics of REG_EQ* notes being invalid if
they use a register that isn't live in the insn with the note, this
REG_EQUAL note is invalid because r132 is dead after insn 78.

The note would be valid if loop2_unroll would copy-prop out r132 for
r165, or if it would remove the note in the copied body. I'd like to
do the former but DEF-USE chains aren't available at this point. We
could run CPROP before web (instead of web before CPROP like we do
now) which would make the REG_EQUAL note valid again, but that'd be
papering over the problem that loop2_unroll leaves this bad note.
Removing the note is easier but it may hurt optimization later on:
CSE2 puts the note back in but this introduces a pass ordering
dependency. Perhaps that's not a big problem, but I'm not sure...

Anyway, just so you know I haven't forgotten about this one yet :-)

Ciao!
Steven

[-- Attachment #2: PR55006.c.179r.loop2_unswitch.svg --]
[-- Type: image/svg+xml, Size: 29901 bytes --]

[-- Attachment #3: PR55006.c.180r.loop2_unroll.svg --]
[-- Type: image/svg+xml, Size: 55210 bytes --]

[-- Attachment #4: PR55006.c.182r.loop2_done.svg --]
[-- Type: image/svg+xml, Size: 44083 bytes --]

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-11-19 22:44             ` Eric Botcazou
@ 2012-11-19 22:48               ` Steven Bosscher
  2012-11-23 22:46                 ` Steven Bosscher
  0 siblings, 1 reply; 68+ messages in thread
From: Steven Bosscher @ 2012-11-19 22:48 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: gcc-patches, Dominique Dhumieres, bonzini, hubicka

On Mon, Nov 19, 2012 at 11:42 PM, Eric Botcazou wrote:
>> That could be done, yes. Cleaning up the REG_EQ* notes requires
>> liveness at the insn level, so it'd require a bigger re-organization
>> of the code. Perhaps adding a new pass (conditional on DF_EQ_NOTES)
>> over all insn in df_lr_finalize, tracking liveness and calling
>> df_remove_dead_eq_notes on each insn.
>>
>> But most passes that use the REG_EQ* notes don't set the DF_EQ_NOTES flag.
>
> Right.  After the DF merge, the de-facto consensus was that individual passes
> were still responsible for cleaning up the REG_EQ* notes when they change the
> code, whereas the REG_DEAD/REG_UNUSED notes are entirely handled by DF.  As
> such, the bug here is in the unroller, not in the webizer.

Then it's in -fsplit-ivs-in-unroller. Looking into it.

Ciao!
Steven

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-11-19 22:03           ` Steven Bosscher
@ 2012-11-19 22:44             ` Eric Botcazou
  2012-11-19 22:48               ` Steven Bosscher
  0 siblings, 1 reply; 68+ messages in thread
From: Eric Botcazou @ 2012-11-19 22:44 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: gcc-patches, Dominique Dhumieres, bonzini, hubicka

> That could be done, yes. Cleaning up the REG_EQ* notes requires
> liveness at the insn level, so it'd require a bigger re-organization
> of the code. Perhaps adding a new pass (conditional on DF_EQ_NOTES)
> over all insn in df_lr_finalize, tracking liveness and calling
> df_remove_dead_eq_notes on each insn.
> 
> But most passes that use the REG_EQ* notes don't set the DF_EQ_NOTES flag.

Right.  After the DF merge, the de-facto consensus was that individual passes 
were still responsible for cleaning up the REG_EQ* notes when they change the 
code, whereas the REG_DEAD/REG_UNUSED notes are entirely handled by DF.  As 
such, the bug here is in the unroller, not in the webizer.

So I think that we should try to fix the unroller first.  If that's not really 
doable, then a fallback could be to shield the passes using DF_EQ_NOTES from 
dangling REG_EQ* notes by means of DF.  But using df_note_add_problem for this 
task seems to be a little far-fetched.

> Perhaps df_note_add_problem should imply setting DF_EQ_NOTES?

Do you mean the opposite, i.e setting DF_EQ_NOTES should imply adding DF_NOTE?

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-11-19 21:52         ` Eric Botcazou
@ 2012-11-19 22:03           ` Steven Bosscher
  2012-11-19 22:44             ` Eric Botcazou
  0 siblings, 1 reply; 68+ messages in thread
From: Steven Bosscher @ 2012-11-19 22:03 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: gcc-patches, Dominique Dhumieres, bonzini, hubicka

On Mon, Nov 19, 2012 at 10:50 PM, Eric Botcazou wrote:
>> The root cause is the bad REG_EQUAL note. I think the most robust
>> solution is to make the webizer re-compute notes before renaming.
>> Patch for that is attached.
>
> Thanks for the analysis.  However...
>
>> Ciao!
>> Steven
>>
>>
>>         PR rtl-optimization/55006
>>         * web.c (web_main): Add the DF_NOTE problem, and explain whatfor.
>>
>> Index: web.c
>> ===================================================================
>> --- web.c       (revision 193454)
>> +++ web.c       (working copy)
>> @@ -335,9 +335,16 @@ web_main (void)
>>    unsigned int uses_num = 0;
>>    rtx insn;
>>
>> +  /* Add the flags and problems we need.  The DF_EQ_NOTES flag is set so
>> +     that uses of registers in REG_EQUAL and REG_EQUIV notes are included
>> +     in the web that their DEF belongs to, so that these uses are also
>> +     properly renamed.  The DF_NOTE problem is added to make sure that
>> +     all notes are up-to-date and valid: Re-computing the notes problem
>> +     also cleans up all dead REG_EQUAL notes.  */
>>    df_set_flags (DF_NO_HARD_REGS + DF_EQ_NOTES);
>>    df_set_flags (DF_RD_PRUNE_DEAD_DEFS);
>>    df_chain_add_problem (DF_UD_CHAIN);
>> +  df_note_add_problem ();
>>    df_analyze ();
>>    df_set_flags (DF_DEFER_INSN_RESCAN);
>
> ... that's not very satisfactory, as web doesn't use the DF_NOTE problem, so
> adding it just to clean things up in the other kind of notes is weird.

True.

> Can't we arrange to clean up the REG_EQUAL/REG_EQUIV notes when we use them,
> i.e. when DF_EQ_NOTES is set?

That could be done, yes. Cleaning up the REG_EQ* notes requires
liveness at the insn level, so it'd require a bigger re-organization
of the code. Perhaps adding a new pass (conditional on DF_EQ_NOTES)
over all insn in df_lr_finalize, tracking liveness and calling
df_remove_dead_eq_notes on each insn.

But most passes that use the REG_EQ* notes don't set the DF_EQ_NOTES flag.

Perhaps df_note_add_problem should imply setting DF_EQ_NOTES?

Ciao!
Steven

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-11-19 21:20       ` Steven Bosscher
@ 2012-11-19 21:52         ` Eric Botcazou
  2012-11-19 22:03           ` Steven Bosscher
  0 siblings, 1 reply; 68+ messages in thread
From: Eric Botcazou @ 2012-11-19 21:52 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: gcc-patches, Dominique Dhumieres, bonzini, hubicka

> The root cause is the bad REG_EQUAL note. I think the most robust
> solution is to make the webizer re-compute notes before renaming.
> Patch for that is attached.

Thanks for the analysis.  However...

> Ciao!
> Steven
> 
> 
>         PR rtl-optimization/55006
>         * web.c (web_main): Add the DF_NOTE problem, and explain whatfor.
> 
> Index: web.c
> ===================================================================
> --- web.c       (revision 193454)
> +++ web.c       (working copy)
> @@ -335,9 +335,16 @@ web_main (void)
>    unsigned int uses_num = 0;
>    rtx insn;
> 
> +  /* Add the flags and problems we need.  The DF_EQ_NOTES flag is set so
> +     that uses of registers in REG_EQUAL and REG_EQUIV notes are included
> +     in the web that their DEF belongs to, so that these uses are also
> +     properly renamed.  The DF_NOTE problem is added to make sure that
> +     all notes are up-to-date and valid: Re-computing the notes problem
> +     also cleans up all dead REG_EQUAL notes.  */
>    df_set_flags (DF_NO_HARD_REGS + DF_EQ_NOTES);
>    df_set_flags (DF_RD_PRUNE_DEAD_DEFS);
>    df_chain_add_problem (DF_UD_CHAIN);
> +  df_note_add_problem ();
>    df_analyze ();
>    df_set_flags (DF_DEFER_INSN_RESCAN);

... that's not very satisfactory, as web doesn't use the DF_NOTE problem, so 
adding it just to clean things up in the other kind of notes is weird.

Can't we arrange to clean up the REG_EQUAL/REG_EQUIV notes when we use them, 
i.e. when DF_EQ_NOTES is set?

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-11-19 20:35     ` Steven Bosscher
  2012-11-19 21:20       ` Steven Bosscher
@ 2012-11-19 21:29       ` Eric Botcazou
  1 sibling, 0 replies; 68+ messages in thread
From: Eric Botcazou @ 2012-11-19 21:29 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: gcc-patches, Dominique Dhumieres, bonzini, hubicka

> Thanks for this reduced test case, that's saving me a lot of work!

You're welcome.

> Can you please try and see if the following C test case also fails?

Yup, with the same compilation options, there are the same dreadful lines

	ld	[%g0+0], %fn

in the assembly file, which translates into (mem:SF (const_int 0 [0])) in the 
RTL dumps.  And the executable does segfault at runtime.

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-11-19 20:35     ` Steven Bosscher
@ 2012-11-19 21:20       ` Steven Bosscher
  2012-11-19 21:52         ` Eric Botcazou
  2012-11-19 21:29       ` Eric Botcazou
  1 sibling, 1 reply; 68+ messages in thread
From: Steven Bosscher @ 2012-11-19 21:20 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: Dominique Dhumieres, gcc-patches, bonzini, hubicka

On Mon, Nov 19, 2012 at 9:34 PM, Steven Bosscher wrote:
> On Mon, Nov 19, 2012 at 12:24 PM, Eric Botcazou wrote:
>>> Yes, I'll be looking into this soon.
>>
>> We have a related regression on SPARC:
>>
>> FAIL: gfortran.dg/minmaxloc_5.f90  -O3 -fomit-frame-pointer -funroll-loops
>> execution test
>> FAIL: gfortran.dg/minmaxloc_5.f90  -O3 -fomit-frame-pointer -funroll-all-loops
>> -finline-functions  execution test
>> FAIL: gfortran.dg/minmaxloc_6.f90  -O3 -fomit-frame-pointer -funroll-loops
>> execution test
>> FAIL: gfortran.dg/minmaxloc_6.f90  -O3 -fomit-frame-pointer -funroll-all-loops
>> -finline-functions  execution test
>>
>> whose failure mode is exactly the same as for Honza's testcase.  I even have a
>> more reduced testcase (6 lines, attached) in case you'd prefer working on it.
>>
>> Reproducible with a cross to sparc-linux with -mcpu=v8 -O3 -funroll-loops and
>> grepping for mem:SF (const_int 0 [0]) in the RTL dumps.
>
> Thanks for this reduced test case, that's saving me a lot of work!
>
> Can you please try and see if the following C test case also fails?
>
> It looks like the memcpy is expanded to a loop that is unrolled, and
> then mis-optimized. I haven't found out yet why...

It's another case of an invalid REG_EQUAL note. Shown below is an
excerpt from the C test case .loop2_done dump (after calling
df_analyze to update liveness). The note on insn 172 uses (reg:SI 132)
which is not live on entry to basic block 25.

;; basic block 25, loop depth 0, count 0, freq 2485, maybe hot
;;  prev block 20, next block 26, flags: (REACHABLE, RTL, MODIFIED)
;;  pred:       11 [50.5%]
;; bb 25 artificial_defs: { }
;; bb 25 artificial_uses: { u-1(14){ }u-1(30){ }u-1(101){ }}
;; lr  in        14 [%sp] 30 [%fp] 31 [%i7] 101 [%sfp] 138 158 165
;; lr  use       14 [%sp] 30 [%fp] 101 [%sfp] 138 158
;; lr  def       131 133
;; live  in      14 [%sp] 101 [%sfp] 138 158 165
;; live  gen     131 133
;; live  kill

(code_label 179 149 173 25 22 "" [1 uses])
(note 173 179 171 25 [bb 25] NOTE_INSN_BASIC_BLOCK)
(insn 171 173 172 25 (set (reg/v:SI 133 [ idsD.1386 ])
        (reg/v:SI 138 [ idsD.1386 ])) 40 {*movsi_insn}
     (expr_list:REG_UNUSED (reg/v:SI 133 [ idsD.1386 ])
        (nil)))
(insn 172 171 176 25 (set (reg/v:SF 131 [ limit3D.1388 ])
        (reg:SF 158 [ limit3D.1388 ])) t.c:62 -1
     (expr_list:REG_EQUAL (mem:SF (reg:SI 132 [ ivtmp.7D.1419 ]) [2
MEM[base: _23, offset: 0B]+0 S4 A32])
        (expr_list:REG_DEAD (reg:SF 158 [ limit3D.1388 ])
            (nil))))
;; lr  out       14 [%sp] 30 [%fp] 31 [%i7] 101 [%sfp] 131 133 138 165
;; live  out     14 [%sp] 101 [%sfp] 131 133 138 165


The use of the dead reg is renamed in the webizer:

(insn 172 171 176 25 (set (reg/v:SF 131 [ limit3D.1388 ])
        (reg:SF 172 [ limit3D.1388 ])) t.c:62 66 {*movsf_insn}
     (expr_list:REG_EQUAL (mem:SF (reg:SI 174 [ ivtmp.7D.1419 ]) [2
MEM[base: _23, offset: 0B]+0 S4 A32])
        (expr_list:REG_DEAD (reg:SF 158 [ limit3D.1388 ])
            (nil))))


Next, cse2 thinks it's a good idea to actually use the REG_EQUAL note,
turning the EQ-use of the uninitialized reg 174 into a real use:

(insn 172 171 176 17 (set (reg/v:SF 131 [ limit3D.1388 ])
        (mem:SF (reg:SI 174 [ ivtmp.7D.1419 ]) [2 MEM[base: _23,
offset: 0B]+0 S4 A32])) t.c:62 66 {*movsf_insn}
     (expr_list:REG_EQUAL (mem:SF (reg:SI 174 [ ivtmp.7D.1419 ]) [2
MEM[base: _23, offset: 0B]+0 S4 A32])
        (expr_list:REG_DEAD (reg:SF 158 [ limit3D.1388 ])
            (nil))))


Then init-regs concludes that reg 174 is used ininitialized:

adding initialization in ga4076 of reg 174 at in block 16 for insn 172.
(insn 206 171 172 16 (set (reg:SI 174 [ ivtmp.7D.1419 ])
        (const_int 0 [0])) t.c:62 -1
     (nil))
(insn 172 206 176 16 (set (reg/v:SF 131 [ limit3D.1388 ])
        (mem:SF (reg:SI 174 [ ivtmp.7D.1419 ]) [2 MEM[base: _23,
offset: 0B]+0 S4 A32])) t.c:62 66 {*movsf_insn}
     (expr_list:REG_DEAD (reg:SI 174 [ ivtmp.7D.1419 ])
        (expr_list:REG_EQUAL (mem:SF (reg:SI 174 [ ivtmp.7D.1419 ]) [2
MEM[base: _23, offset: 0B]+0 S4 A32])
            (nil))))


And finally, combine merges the above to yield the final result:
Trying 206 -> 172:
Successfully matched this instruction:
(set (reg/v:SF 131 [ limit3D.1388 ])
    (mem:SF (const_int 0 [0]) [2 MEM[base: _23, offset: 0B]+0 S4 A32]))


The root cause is the bad REG_EQUAL note. I think the most robust
solution is to make the webizer re-compute notes before renaming.
Patch for that is attached.

Ciao!
Steven


        PR rtl-optimization/55006
        * web.c (web_main): Add the DF_NOTE problem, and explain whatfor.

Index: web.c
===================================================================
--- web.c       (revision 193454)
+++ web.c       (working copy)
@@ -335,9 +335,16 @@ web_main (void)
   unsigned int uses_num = 0;
   rtx insn;

+  /* Add the flags and problems we need.  The DF_EQ_NOTES flag is set so
+     that uses of registers in REG_EQUAL and REG_EQUIV notes are included
+     in the web that their DEF belongs to, so that these uses are also
+     properly renamed.  The DF_NOTE problem is added to make sure that
+     all notes are up-to-date and valid: Re-computing the notes problem
+     also cleans up all dead REG_EQUAL notes.  */
   df_set_flags (DF_NO_HARD_REGS + DF_EQ_NOTES);
   df_set_flags (DF_RD_PRUNE_DEAD_DEFS);
   df_chain_add_problem (DF_UD_CHAIN);
+  df_note_add_problem ();
   df_analyze ();
   df_set_flags (DF_DEFER_INSN_RESCAN);

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-11-19 11:27   ` Eric Botcazou
@ 2012-11-19 20:35     ` Steven Bosscher
  2012-11-19 21:20       ` Steven Bosscher
  2012-11-19 21:29       ` Eric Botcazou
  0 siblings, 2 replies; 68+ messages in thread
From: Steven Bosscher @ 2012-11-19 20:35 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: Dominique Dhumieres, gcc-patches, bonzini, hubicka

On Mon, Nov 19, 2012 at 12:24 PM, Eric Botcazou <ebotcazou@adacore.com> wrote:
>> Yes, I'll be looking into this soon.
>
> We have a related regression on SPARC:
>
> FAIL: gfortran.dg/minmaxloc_5.f90  -O3 -fomit-frame-pointer -funroll-loops
> execution test
> FAIL: gfortran.dg/minmaxloc_5.f90  -O3 -fomit-frame-pointer -funroll-all-loops
> -finline-functions  execution test
> FAIL: gfortran.dg/minmaxloc_6.f90  -O3 -fomit-frame-pointer -funroll-loops
> execution test
> FAIL: gfortran.dg/minmaxloc_6.f90  -O3 -fomit-frame-pointer -funroll-all-loops
> -finline-functions  execution test
>
> whose failure mode is exactly the same as for Honza's testcase.  I even have a
> more reduced testcase (6 lines, attached) in case you'd prefer working on it.
>
> Reproducible with a cross to sparc-linux with -mcpu=v8 -O3 -funroll-loops and
> grepping for mem:SF (const_int 0 [0]) in the RTL dumps.

Thanks for this reduced test case, that's saving me a lot of work!

Can you please try and see if the following C test case also fails?

It looks like the memcpy is expanded to a loop that is unrolled, and
then mis-optimized. I haven't found out yet why...

Ciao!
Steven



extern void abort (void) __attribute__ ((__noreturn__));
extern void *memcpy (void *__restrict, const void *__restrict, __SIZE_TYPE__);

static void __attribute__ ((__noinline__, __noclone__)) ga4076 (void)
{
  int D833;
  float D834;
  int D837;
  int D840;
  float D841;
  int D844;
  float dda[100];
  int ids;

  static float A0[100] =
  { 10e+0, 20e+0, 30e+0, 40e+0, 50e+0, 60e+0, 70e+0, 80e+0, 90e+0,
    10e+1, 11e+1, 12e+1, 13e+1, 14e+1, 15e+1, 16e+1, 17e+1, 18e+1,
    19e+1, 20e+1, 21e+1, 22e+1, 23e+1, 24e+1, 25e+1, 26e+1, 27e+1,
    28e+1, 29e+1, 30e+1, 31e+1, 32e+1, 33e+1, 34e+1, 35e+1, 36e+1,
    37e+1, 38e+1, 39e+1, 40e+1, 41e+1, 42e+1, 43e+1, 44e+1, 45e+1,
    46e+1, 47e+1, 48e+1, 49e+1, 50e+1, 51e+1, 52e+1, 53e+1, 54e+1,
    55e+1, 56e+1, 57e+1, 58e+1, 59e+1, 60e+1, 61e+1, 62e+1, 63e+1,
    64e+1, 65e+1, 66e+1, 67e+1, 68e+1, 69e+1, 70e+1, 71e+1, 72e+1,
    73e+1, 74e+1, 75e+1, 76e+1, 77e+1, 78e+1, 79e+1, 80e+1, 81e+1,
    82e+1, 83e+1, 84e+1, 85e+1, 86e+1, 87e+1, 88e+1, 89e+1, 90e+1,
    91e+1, 92e+1, 93e+1, 94e+1, 95e+1, 96e+1, 97e+1, 98e+1, 99e+1,
    10e+2
  };

  memcpy (dda, A0, sizeof (dda));
  float limit3;
  int offset2;
  int pos1;
  int S4;

  limit3 = -1. *  __builtin_inf();
  pos1 = 0;
  offset2 = 0;

  S4 = 1;
D831:
  if (S4 > 100) goto L3; else goto D832;
D832:
  D833 = S4 - 1;
  D834 = dda[D833];
  if (D834 >= limit3) goto D835; else goto D836;
D835:
  D837 = S4 - 1;
  limit3 = dda[D837];
  pos1 = S4 + offset2;
  goto L1;
D836:
  S4 = S4 + 1;
  goto D831;
L3:
  pos1 = 1;
  goto L2;
L1:
  if (S4 > 100) goto L2; else goto D839;
D839:
  D840 = S4 - 1;
  D841 = dda[D840];
  if (D841 > limit3) goto D842; else goto D843;
D842:
  D844 = S4 - 1;
  limit3 = dda[D844];
  pos1 = S4 + offset2;
D843:
  S4 = S4 + 1;
  goto L1;
L2:
  ids = pos1;
  if (ids != 100)
    abort ();
}

int
main (void)
{
  ga4076 ();
  return 0;
}

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-11-19  9:50 ` Steven Bosscher
@ 2012-11-19 11:27   ` Eric Botcazou
  2012-11-19 20:35     ` Steven Bosscher
  0 siblings, 1 reply; 68+ messages in thread
From: Eric Botcazou @ 2012-11-19 11:27 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: Dominique Dhumieres, gcc-patches, bonzini, hubicka

[-- Attachment #1: Type: text/plain, Size: 807 bytes --]

> Yes, I'll be looking into this soon.

We have a related regression on SPARC:

FAIL: gfortran.dg/minmaxloc_5.f90  -O3 -fomit-frame-pointer -funroll-loops  
execution test
FAIL: gfortran.dg/minmaxloc_5.f90  -O3 -fomit-frame-pointer -funroll-all-loops 
-finline-functions  execution test
FAIL: gfortran.dg/minmaxloc_6.f90  -O3 -fomit-frame-pointer -funroll-loops  
execution test
FAIL: gfortran.dg/minmaxloc_6.f90  -O3 -fomit-frame-pointer -funroll-all-loops 
-finline-functions  execution test

whose failure mode is exactly the same as for Honza's testcase.  I even have a 
more reduced testcase (6 lines, attached) in case you'd prefer working on it.

Reproducible with a cross to sparc-linux with -mcpu=v8 -O3 -funroll-loops and 
grepping for mem:SF (const_int 0 [0]) in the RTL dumps.

-- 
Eric Botcazou

[-- Attachment #2: minmaxloc_5.f90 --]
[-- Type: text/x-fortran, Size: 115 bytes --]

program GA4076

  REAL DDA(100)
  dda = (/(J1,J1=1,100)/)
  IDS = MAXLOC(DDA,1)
  if (ids.ne.100) call abort
  
END

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
  2012-11-18 23:15 Dominique Dhumieres
@ 2012-11-19  9:50 ` Steven Bosscher
  2012-11-19 11:27   ` Eric Botcazou
  0 siblings, 1 reply; 68+ messages in thread
From: Steven Bosscher @ 2012-11-19  9:50 UTC (permalink / raw)
  To: Dominique Dhumieres; +Cc: gcc-patches, bonzini, hubicka

On Mon, Nov 19, 2012 at 12:15 AM, Dominique Dhumieres wrote:
>> I think this should fix it. Can't test it right now, so help
>> appreciated (Honza: hint hint! ;-)
>
> The change at http://gcc.gnu.org/ml/gcc-patches/2012-10/msg01511/remove_dead_eq_notes.diff
> (revision 192526) caused
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55006

Yes, I'll be looking into this soon.

Ciao!
Steven

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
@ 2012-11-18 23:15 Dominique Dhumieres
  2012-11-19  9:50 ` Steven Bosscher
  0 siblings, 1 reply; 68+ messages in thread
From: Dominique Dhumieres @ 2012-11-18 23:15 UTC (permalink / raw)
  To: gcc-patches; +Cc: bonzini, hubicka, stevenb.gcc

> I think this should fix it. Can't test it right now, so help
> appreciated (Honza: hint hint! ;-)

The change at http://gcc.gnu.org/ml/gcc-patches/2012-10/msg01511/remove_dead_eq_notes.diff
(revision 192526) caused

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55006

Dominique

^ permalink raw reply	[flat|nested] 68+ messages in thread

end of thread, other threads:[~2012-12-03 23:28 UTC | newest]

Thread overview: 68+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-10-12 20:14 Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug) Jan Hubicka
2012-10-12 20:36 ` Markus Trippelsdorf
2012-10-12 20:44 ` Steven Bosscher
2012-10-12 21:16   ` Jan Hubicka
2012-10-12 21:19     ` Steven Bosscher
2012-10-12 21:31       ` Jan Hubicka
2012-10-12 22:41         ` Steven Bosscher
2012-10-14  9:13           ` Paolo Bonzini
2012-10-14 21:27             ` Steven Bosscher
2012-10-14 21:35               ` Eric Botcazou
2012-10-14 21:45                 ` Steven Bosscher
2012-10-15  8:14               ` Paolo Bonzini
2012-10-15  8:23                 ` Steven Bosscher
2012-10-15  8:35                   ` Paolo Bonzini
2012-10-15  8:38                     ` Steven Bosscher
2012-10-15 10:49                       ` Steven Bosscher
2012-10-15 12:28                       ` Paolo Bonzini
2012-10-15 13:19                         ` Steven Bosscher
2012-10-15 13:29                           ` Paolo Bonzini
2012-10-15 13:49                             ` Steven Bosscher
2012-10-16 10:35                               ` Steven Bosscher
2012-10-16 11:05                                 ` Steven Bosscher
2012-10-16 11:42                                   ` Paolo Bonzini
2012-10-16 22:57                                     ` Steven Bosscher
2012-10-19  5:14                                       ` Bin.Cheng
2012-10-12 21:05 ` Steven Bosscher
2012-11-18 23:15 Dominique Dhumieres
2012-11-19  9:50 ` Steven Bosscher
2012-11-19 11:27   ` Eric Botcazou
2012-11-19 20:35     ` Steven Bosscher
2012-11-19 21:20       ` Steven Bosscher
2012-11-19 21:52         ` Eric Botcazou
2012-11-19 22:03           ` Steven Bosscher
2012-11-19 22:44             ` Eric Botcazou
2012-11-19 22:48               ` Steven Bosscher
2012-11-23 22:46                 ` Steven Bosscher
2012-11-24  1:10                   ` Steven Bosscher
2012-11-25 20:44                     ` Steven Bosscher
2012-11-25 22:38                       ` Steven Bosscher
2012-11-26 14:38                         ` Dominique Dhumieres
2012-11-26 15:46                           ` Steven Bosscher
2012-11-27  9:58                         ` Eric Botcazou
2012-11-27 10:35                           ` Steven Bosscher
2012-11-27 12:01                             ` Steven Bosscher
2012-11-27 12:29                               ` Steven Bosscher
2012-11-27 13:04                                 ` Steven Bosscher
2012-11-27 14:25                                   ` Dominique Dhumieres
2012-11-27 14:49                                     ` Steven Bosscher
2012-11-27 13:48                                 ` Dominique Dhumieres
2012-11-27 14:33                               ` Paolo Bonzini
2012-11-27 16:59                                 ` Eric Botcazou
2012-11-27 23:29                                   ` Steven Bosscher
2012-11-27 23:50                                     ` Eric Botcazou
2012-11-27 23:54                                       ` Steven Bosscher
2012-11-27 23:59                                         ` Steven Bosscher
2012-11-28  0:43                                           ` Steven Bosscher
2012-11-28  7:46                                             ` Eric Botcazou
2012-11-28 15:57                                               ` Steven Bosscher
2012-11-28 22:12                                                 ` Eric Botcazou
2012-11-28 23:54                                                   ` Steven Bosscher
2012-12-01 14:57                                                     ` Eric Botcazou
2012-12-01 16:45                                                       ` Steven Bosscher
2012-12-03 18:26                                                         ` Eric Botcazou
2012-12-03 20:20                                                           ` Steven Bosscher
2012-12-03 21:12                                                             ` Eric Botcazou
2012-12-03 23:28                                                             ` Steven Bosscher
2012-12-03 20:15                                                       ` Paolo Bonzini
2012-11-19 21:29       ` Eric Botcazou

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).