public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [RFC] PR69195, Reload confused by invalid reg equivs
@ 2016-03-04 14:54 Alan Modra
  2016-03-04 17:40 ` Bernd Schmidt
  0 siblings, 1 reply; 18+ messages in thread
From: Alan Modra @ 2016-03-04 14:54 UTC (permalink / raw)
  To: gcc-patches

This is a fix for two testcases that show reload replacing pseudos
that don't get hard regs, with their equivalent mem initialization,
but failing to address the mem properly.  The short story is that ira
analysis creates reg equivalence info for use by reload, based on
register lifetimes that are invalidated by ira itself deleting dead
insns.

My first attempt to fix this problem was to just run
delete_trivially_dead_insns early in ira.  That's enough to cure the
testcase failures.  However, ira also deletes unreachable blocks (that
are only recognized as such after ira reg equivalence analysis), and
I figure it is possible that deleting those insns may similarly affect
register lifetimes.

So what this patch does is revalidate the reg equivalences after insns
have been deleted, recreating them as if insns had been deleted before
any analysis occurred.  The patch has been bootstrapped and regression
tested on powerpc64le-linux.  Do we go with this approach, or just run
simple delete_trivially_dead_insns early?  Or something else?

gcc/
	PR rtl-optimization/69195
	* ira.c (pdx_subregs): New static, extracted from update_equiv_regs.
	(set_paradoxical_subreg): Delete pdx_subregs param.
	(add_store_equivs): New function, extracted from..
	(update_equiv_regs): ..here.  Don't create and free pdx_subregs or
	reg_equiv.
	(validate_equiv_regs): New function.
	(ira): Create and free pdx_subregs and reg_equiv.  After deleting
	insns, call validate_equiv_regs.  Call setup_reg_equiv later.
	(setup_reg_equiv): Protect against deleted insns.
gcc/testsuite/
	* gcc.dg/pr69195.c: New.
	* gcc.dg/pr69238.c: New.

diff --git a/gcc/ira.c b/gcc/ira.c
index 0973258..047e164 100644
--- a/gcc/ira.c
+++ b/gcc/ira.c
@@ -3284,11 +3284,15 @@ no_equiv (rtx reg, const_rtx store ATTRIBUTE_UNUSED,
     }
 }
 
+/* Use pdx_subregs to show whether a reg is used in a paradoxical
+   subreg.  */
+static  bool *pdx_subregs;
+
 /* Check whether the SUBREG is a paradoxical subreg and set the result
    in PDX_SUBREGS.  */
 
 static void
-set_paradoxical_subreg (rtx_insn *insn, bool *pdx_subregs)
+set_paradoxical_subreg (rtx_insn *insn)
 {
   subrtx_iterator::array_type array;
   FOR_EACH_SUBRTX (iter, array, PATTERN (insn), NONCONST)
@@ -3319,6 +3323,100 @@ adjust_cleared_regs (rtx loc, const_rtx old_rtx ATTRIBUTE_UNUSED, void *data)
   return NULL_RTX;
 }
 
+/* For insns that set a MEM to the contents of a REG that is only used
+   in a single basic block, see if the register is always equivalent
+   to that memory location and if moving the store from INSN to the
+   insn that sets REG is safe.  If so, put a REG_EQUIV note on the
+   initializing insn.
+
+   Don't add a REG_EQUIV note if the insn already has one.  The
+   existing REG_EQUIV is likely more useful than the one we are
+   adding.
+
+   If one of the regs in the address has reg_equiv[REGNO].replace set,
+   then we can't add this REG_EQUIV note.  The reg_equiv[REGNO].replace
+   optimization may move the set of this register immediately before
+   insn, which puts it after reg_equiv[REGNO].init_insns, and hence the
+   mention in the REG_EQUIV note would be to an uninitialized pseudo.  */
+static void
+add_store_equivs (void)
+{
+  for (rtx_insn *insn = get_insns (); insn; insn = NEXT_INSN (insn))
+    {
+      if (! INSN_P (insn))
+	continue;
+
+      rtx set = single_set (insn);
+      if (! set)
+	continue;
+
+      rtx dest = SET_DEST (set);
+      rtx src = SET_SRC (set);
+      unsigned int regno;
+      rtx_insn *init_insn;
+
+      if (MEM_P (dest)
+	  && REG_P (src)
+	  && (regno = REGNO (src)) >= FIRST_PSEUDO_REGISTER
+	  && REG_BASIC_BLOCK (regno) >= NUM_FIXED_BLOCKS
+	  && DF_REG_DEF_COUNT (regno) == 1
+	  && ! pdx_subregs[regno]
+	  && reg_equiv[regno].init_insns != 0
+	  && (init_insn = reg_equiv[regno].init_insns->insn ()) != 0
+	  && ! find_reg_note (init_insn, REG_EQUIV, NULL_RTX)
+	  && ! contains_replace_regs (XEXP (dest, 0))
+	  && validate_equiv_mem (init_insn, src, dest)
+	  && ! memref_used_between_p (dest, init_insn, insn)
+	  /* Attaching a REG_EQUIV note will fail if INIT_INSN has
+	     multiple sets.  */
+	  && set_unique_reg_note (init_insn, REG_EQUIV, copy_rtx (dest)))
+	{
+	  /* This insn makes the equivalence, not the one initializing
+	     the register.  */
+	  ira_reg_equiv[regno].init_insns
+	    = gen_rtx_INSN_LIST (VOIDmode, insn, NULL_RTX);
+	  df_notes_rescan (init_insn);
+	}
+    }
+}
+
+/* After deleting dead insns, check that REG_EQUIV notes are still
+   valid.  */
+static void
+validate_equiv_regs (void)
+{
+  bool done_something = false;
+  unsigned int max = vec_safe_length (reg_equivs);
+
+  for (unsigned int regno = FIRST_PSEUDO_REGISTER; regno < max; regno++)
+    {
+      for (rtx_insn_list *elem = reg_equiv[regno].init_insns;
+	   elem;
+	   elem = elem->next ())
+	{
+	  rtx_insn *insn = elem->insn ();
+	  rtx set, note;
+
+	  if (insn != 0
+	      && ! insn->deleted ()
+	      && (set = single_set (insn)) != 0
+	      && REG_P (SET_DEST (set))
+	      && REGNO (SET_DEST (set)) == regno
+	      && MEM_P (SET_SRC (set))
+	      && (note = find_reg_note (insn, REG_EQUIV, NULL_RTX)) != 0
+	      && rtx_equal_p (SET_SRC (set), XEXP (note, 0))
+	      && ! validate_equiv_mem (insn, SET_DEST (set), SET_SRC (set)))
+	    {
+	      remove_note (insn, note);
+	      done_something = true;
+	    }
+	}
+    }
+
+  if (done_something)
+    add_store_equivs ();
+}
+
 /* Nonzero if we recorded an equivalence for a LABEL_REF.  */
 static int recorded_label_ref;
 
@@ -3341,17 +3439,11 @@ update_equiv_regs (void)
   basic_block bb;
   int loop_depth;
   bitmap cleared_regs;
-  bool *pdx_subregs;
 
   /* We need to keep track of whether or not we recorded a LABEL_REF so
      that we know if the jump optimizer needs to be rerun.  */
   recorded_label_ref = 0;
 
-  /* Use pdx_subregs to show whether a reg is used in a paradoxical
-     subreg.  */
-  pdx_subregs = XCNEWVEC (bool, max_regno);
-
-  reg_equiv = XCNEWVEC (struct equivalence, max_regno);
   grow_reg_equivs ();
 
   init_alias_analysis ();
@@ -3363,7 +3455,7 @@ update_equiv_regs (void)
   FOR_EACH_BB_FN (bb, cfun)
     FOR_BB_INSNS (bb, insn)
       if (NONDEBUG_INSN_P (insn))
-	set_paradoxical_subreg (insn, pdx_subregs);
+	set_paradoxical_subreg (insn);
 
   /* Scan the insns and find which registers have equivalences.  Do this
      in a separate scan of the insns because (due to -fcse-follow-jumps)
@@ -3625,65 +3717,7 @@ update_equiv_regs (void)
 
   /* A second pass, to gather additional equivalences with memory.  This needs
      to be done after we know which registers we are going to replace.  */
-
-  for (insn = get_insns (); insn; insn = NEXT_INSN (insn))
-    {
-      rtx set, src, dest;
-      unsigned regno;
-
-      if (! INSN_P (insn))
-	continue;
-
-      set = single_set (insn);
-      if (! set)
-	continue;
-
-      dest = SET_DEST (set);
-      src = SET_SRC (set);
-
-      /* If this sets a MEM to the contents of a REG that is only used
-	 in a single basic block, see if the register is always equivalent
-	 to that memory location and if moving the store from INSN to the
-	 insn that set REG is safe.  If so, put a REG_EQUIV note on the
-	 initializing insn.
-
-	 Don't add a REG_EQUIV note if the insn already has one.  The existing
-	 REG_EQUIV is likely more useful than the one we are adding.
-
-	 If one of the regs in the address has reg_equiv[REGNO].replace set,
-	 then we can't add this REG_EQUIV note.  The reg_equiv[REGNO].replace
-	 optimization may move the set of this register immediately before
-	 insn, which puts it after reg_equiv[REGNO].init_insns, and hence
-	 the mention in the REG_EQUIV note would be to an uninitialized
-	 pseudo.  */
-
-      if (MEM_P (dest) && REG_P (src)
-	  && (regno = REGNO (src)) >= FIRST_PSEUDO_REGISTER
-	  && REG_BASIC_BLOCK (regno) >= NUM_FIXED_BLOCKS
-	  && DF_REG_DEF_COUNT (regno) == 1
-	  && reg_equiv[regno].init_insns != NULL
-	  && reg_equiv[regno].init_insns->insn () != NULL
-	  && ! find_reg_note (XEXP (reg_equiv[regno].init_insns, 0),
-			      REG_EQUIV, NULL_RTX)
-	  && ! contains_replace_regs (XEXP (dest, 0))
-	  && ! pdx_subregs[regno])
-	{
-	  rtx_insn *init_insn =
-	    as_a <rtx_insn *> (XEXP (reg_equiv[regno].init_insns, 0));
-	  if (validate_equiv_mem (init_insn, src, dest)
-	      && ! memref_used_between_p (dest, init_insn, insn)
-	      /* Attaching a REG_EQUIV note will fail if INIT_INSN has
-		 multiple sets.  */
-	      && set_unique_reg_note (init_insn, REG_EQUIV, copy_rtx (dest)))
-	    {
-	      /* This insn makes the equivalence, not the one initializing
-		 the register.  */
-	      ira_reg_equiv[regno].init_insns
-		= gen_rtx_INSN_LIST (VOIDmode, insn, NULL_RTX);
-	      df_notes_rescan (init_insn);
-	    }
-	}
-    }
+  add_store_equivs ();
 
   cleared_regs = BITMAP_ALLOC (NULL);
   /* Now scan all regs killed in an insn to see if any of them are
@@ -3858,8 +3892,6 @@ update_equiv_regs (void)
   /* Clean up.  */
 
   end_alias_analysis ();
-  free (reg_equiv);
-  free (pdx_subregs);
   return recorded_label_ref;
 }
 
@@ -3882,11 +3914,12 @@ setup_reg_equiv (void)
       {
 	next_elem = elem->next ();
 	insn = elem->insn ();
-	set = single_set (insn);
 	
 	/* Init insns can set up equivalence when the reg is a destination or
 	   a source (in this case the destination is memory).  */
-	if (set != 0 && (REG_P (SET_DEST (set)) || REG_P (SET_SRC (set))))
+	if (! insn->deleted ()
+	    && (set = single_set (insn)) != 0
+	    && (REG_P (SET_DEST (set)) || REG_P (SET_SRC (set))))
 	  {
 	    if ((x = find_reg_note (insn, REG_EQUIV, NULL_RTX)) != NULL)
 	      {
@@ -5090,7 +5123,6 @@ ira (FILE *f)
 {
   bool loops_p;
   int ira_max_point_before_emit;
-  int rebuild_p;
   bool saved_flag_caller_saves = flag_caller_saves;
   enum ira_region saved_flag_ira_region = flag_ira_region;
 
@@ -5184,10 +5216,9 @@ ira (FILE *f)
   if (resize_reg_info () && flag_ira_loop_pressure)
     ira_set_pseudo_classes (true, ira_dump_file);
 
-  rebuild_p = update_equiv_regs ();
-  setup_reg_equiv ();
-  setup_reg_equiv_init ();
-
+  pdx_subregs = XCNEWVEC (bool, max_regno);
+  reg_equiv = XCNEWVEC (struct equivalence, max_regno);
+  bool rebuild_p = update_equiv_regs ();
   bool update_regstat = false;
 
   if (optimize && rebuild_p)
@@ -5210,6 +5241,14 @@ ira (FILE *f)
       update_regstat = true;
     }
 
+  if (update_regstat)
+    validate_equiv_regs ();
+  free (reg_equiv);
+  free (pdx_subregs);
+
+  setup_reg_equiv ();
+  setup_reg_equiv_init ();
+
   /* It is not worth to do such improvement when we use a simple
      allocation because of -O0 usage or because the function is too
      big.  */
diff --git a/gcc/testsuite/gcc.dg/pr69195.c b/gcc/testsuite/gcc.dg/pr69195.c
new file mode 100644
index 0000000..af373a1
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr69195.c
@@ -0,0 +1,27 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -fno-dce -fno-forward-propagate" } */
+
+void __attribute__ ((noinline, noclone))
+foo (int *a, int n)
+{
+  int *lasta = a + n;
+  for (; a != lasta; a++)
+    {
+      *a *= 2;
+      a[1] = a[-1] + a[-2];
+    }
+}
+
+int
+main ()
+{
+  int a[16] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 };
+  int r[16] = { 1, 2, 6, 6, 16, 24, 44, 80,
+		136, 248, 432, 768, 1360, 2400, 4256, 3760 };
+  unsigned i;
+  foo (&a[2], 13);
+  for (i = 0; i < 8; ++i)
+    if (a[i] != r[i])
+      __builtin_abort ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/pr69238.c b/gcc/testsuite/gcc.dg/pr69238.c
new file mode 100644
index 0000000..3538e63
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr69238.c
@@ -0,0 +1,28 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -fno-dce -fno-forward-propagate -fno-rerun-cse-after-loop -funroll-loops" } */
+
+
+#define N 32
+
+short sa[N];
+short sb[N];
+int ia[N];
+int ib[N];
+
+int __attribute__ ((noinline, noclone))
+main1 (int n)
+{
+  int i;
+  for (i = 0; i < n; i++)
+    {
+      sa[i+7] = sb[i];
+      ia[i+3] = ib[i+1];
+    }
+  return 0;
+}
+
+int
+main (void)
+{ 
+  return main1 (N-7);
+}

-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC] PR69195, Reload confused by invalid reg equivs
  2016-03-04 14:54 [RFC] PR69195, Reload confused by invalid reg equivs Alan Modra
@ 2016-03-04 17:40 ` Bernd Schmidt
  2016-03-07 14:47   ` Alan Modra
  0 siblings, 1 reply; 18+ messages in thread
From: Bernd Schmidt @ 2016-03-04 17:40 UTC (permalink / raw)
  To: Alan Modra, gcc-patches

On 03/04/2016 03:54 PM, Alan Modra wrote:
> This is a fix for two testcases that show reload replacing pseudos
> that don't get hard regs, with their equivalent mem initialization,
> but failing to address the mem properly.  The short story is that ira
> analysis creates reg equivalence info for use by reload, based on
> register lifetimes that are invalidated by ira itself deleting dead
> insns.
>
> My first attempt to fix this problem was to just run
> delete_trivially_dead_insns early in ira.  That's enough to cure the
> testcase failures.  However, ira also deletes unreachable blocks (that
> are only recognized as such after ira reg equivalence analysis), and
> I figure it is possible that deleting those insns may similarly affect
> register lifetimes.
>
> So what this patch does is revalidate the reg equivalences after insns
> have been deleted, recreating them as if insns had been deleted before
> any analysis occurred.  The patch has been bootstrapped and regression
> tested on powerpc64le-linux.  Do we go with this approach, or just run
> simple delete_trivially_dead_insns early?  Or something else?

I've managed to reproduce this, and I think your analysis of the problem 
is correct. So the patch is probably ok from the point of you of "will 
it work". I can probably be convinced to approve it as-is, but I wonder 
if you'd be prepared to try something else first.

This whole area in IRA is turning into spaghetti a little bit. I would 
prefer that we schedule a new small optimization pass beforehand (either 
as a real pass or just a function call) that does only the 
label-substituting trick from update_equiv_regs, then removes dead code 
as necessary. When we then get to IRA and setting up equiv regs, we 
should no longer get to a point where we need to reverify equivalences 
or update regstat.


Bernd

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC] PR69195, Reload confused by invalid reg equivs
  2016-03-04 17:40 ` Bernd Schmidt
@ 2016-03-07 14:47   ` Alan Modra
  2016-03-10  9:18     ` [PATCH] " Alan Modra
  0 siblings, 1 reply; 18+ messages in thread
From: Alan Modra @ 2016-03-07 14:47 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: gcc-patches

On Fri, Mar 04, 2016 at 06:39:54PM +0100, Bernd Schmidt wrote:
> I've managed to reproduce this, and I think your analysis of the problem is
> correct. So the patch is probably ok from the point of you of "will it
> work". I can probably be convinced to approve it as-is, but I wonder if
> you'd be prepared to try something else first.
> 
> This whole area in IRA is turning into spaghetti a little bit. I would
> prefer that we schedule a new small optimization pass beforehand (either as
> a real pass or just a function call) that does only the label-substituting
> trick from update_equiv_regs, then removes dead code as necessary. When we
> then get to IRA and setting up equiv regs, we should no longer get to a
> point where we need to reverify equivalences or update regstat.

I had the same thought myself, but wondered what I'd be getting into
so didn't mention the idea.  ;-)  I'll have a go.  (Not sure if I'd
trust me to get it right though, for stage 4.)

Something for later:  It seems to my untrained eye that more than just
the label substitution should be separated out.  All of the latter
half of update_equiv_regs doing global combine and moving insns could
be better separated from the rest of ira.  It is to some extent, by
use of reg_equiv[] vs. ira_reg_equiv[], but shares the REG_EQUIV
notes.  That means the REG_EQUIV notes created by ira need to take
into account ira, lra and reload behaviour and are thus rather more
restrictive than needed for the mini-combine pass.  eg. See the
comment about calls in validate_equiv_mem.

Incidentally, an all-langs bootstrap and regression test of gcc
triggers delete_unreachable_blocks just once for x86_64 and never for
powerpc64le.  The file is testsuite/gfortran.dg/pr46755.f, which of
course isn't that surprising since the pr46755 fix added the
delete_unreachable_blocks call.

-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH] PR69195, Reload confused by invalid reg equivs
  2016-03-07 14:47   ` Alan Modra
@ 2016-03-10  9:18     ` Alan Modra
  2016-03-10  9:48       ` Bernd Schmidt
  2016-03-11 20:40       ` Andreas Schwab
  0 siblings, 2 replies; 18+ messages in thread
From: Alan Modra @ 2016-03-10  9:18 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: gcc-patches

Doing the indirect jump optimization turned out to be quite easy.

Bootstrapped and regression tested powerpc64le-linux, gcc-6, gcc-5 and
gcc-4.9.  Bootstrap and regression test x86_64-linux still running.
OK to apply?

gcc/
	PR rtl-optimization/69195
	PR rtl-optimization/47992
	* ira.c (recorded_label_ref): Delete.
	(update_equiv_regs): Return void.
	(indirect_jump_optimize): New function.
	(ira): Call indirect_jump_optimize and delete_trivially_dead_insns
	before regstat_compute_ri.  Don't rebuild_jump_labels here.
	Delete update_regstat.
gcc/testsuite/
	* gcc.dg/pr69195.c: New.
	* gcc.dg/pr69238.c: New.

diff --git a/gcc/ira.c b/gcc/ira.c
index 0973258..5e7a2ed 100644
--- a/gcc/ira.c
+++ b/gcc/ira.c
@@ -3319,9 +3319,6 @@ adjust_cleared_regs (rtx loc, const_rtx old_rtx ATTRIBUTE_UNUSED, void *data)
   return NULL_RTX;
 }
 
-/* Nonzero if we recorded an equivalence for a LABEL_REF.  */
-static int recorded_label_ref;
-
 /* Find registers that are equivalent to a single value throughout the
    compilation (either because they can be referenced in memory or are
    set once from a single constant).  Lower their priority for a
@@ -3331,10 +3328,8 @@ static int recorded_label_ref;
    value into the using insn.  If it succeeds, we can eliminate the
    register completely.
 
-   Initialize init_insns in ira_reg_equiv array.
-
-   Return non-zero if jump label rebuilding should be done.  */
-static int
+   Initialize init_insns in ira_reg_equiv array.  */
+static void
 update_equiv_regs (void)
 {
   rtx_insn *insn;
@@ -3343,10 +3338,6 @@ update_equiv_regs (void)
   bitmap cleared_regs;
   bool *pdx_subregs;
 
-  /* We need to keep track of whether or not we recorded a LABEL_REF so
-     that we know if the jump optimizer needs to be rerun.  */
-  recorded_label_ref = 0;
-
   /* Use pdx_subregs to show whether a reg is used in a paradoxical
      subreg.  */
   pdx_subregs = XCNEWVEC (bool, max_regno);
@@ -3578,17 +3569,6 @@ update_equiv_regs (void)
 		  = gen_rtx_INSN_LIST (VOIDmode, insn,
 				       ira_reg_equiv[regno].init_insns);
 
-	      /* Record whether or not we created a REG_EQUIV note for a LABEL_REF.
-		 We might end up substituting the LABEL_REF for uses of the
-		 pseudo here or later.  That kind of transformation may turn an
-		 indirect jump into a direct jump, in which case we must rerun the
-		 jump optimizer to ensure that the JUMP_LABEL fields are valid.  */
-	      if (GET_CODE (x) == LABEL_REF
-		  || (GET_CODE (x) == CONST
-		      && GET_CODE (XEXP (x, 0)) == PLUS
-		      && (GET_CODE (XEXP (XEXP (x, 0), 0)) == LABEL_REF)))
-		recorded_label_ref = 1;
-
 	      reg_equiv[regno].replacement = x;
 	      reg_equiv[regno].src_p = &SET_SRC (set);
 	      reg_equiv[regno].loop_depth = (short) loop_depth;
@@ -3706,9 +3686,9 @@ update_equiv_regs (void)
 	  if (! INSN_P (insn))
 	    continue;
 
-	  /* Don't substitute into a non-local goto, this confuses CFG.  */
-	  if (JUMP_P (insn)
-	      && find_reg_note (insn, REG_NON_LOCAL_GOTO, NULL_RTX))
+	  /* Don't substitute into jumps.  indirect_jump_optimize does
+	     this for anything we are prepared to handle.  */
+	  if (JUMP_P (insn))
 	    continue;
 
 	  for (link = REG_NOTES (insn); link; link = XEXP (link, 1))
@@ -3860,11 +3840,50 @@ update_equiv_regs (void)
   end_alias_analysis ();
   free (reg_equiv);
   free (pdx_subregs);
-  return recorded_label_ref;
 }
 
-\f
+/* A pass over indirect jumps, converting simple cases to direct jumps.  */
+static void
+indirect_jump_optimize (void)
+{
+  basic_block bb;
+  bool rebuild_p = false;
 
+  FOR_EACH_BB_REVERSE_FN (bb, cfun)
+    {
+      rtx_insn *insn = BB_END (bb);
+      if (!JUMP_P (insn))
+	continue;
+
+      rtx x = pc_set (insn);
+      if (!x || !REG_P (SET_SRC (x)))
+	continue;
+
+      int regno = REGNO (SET_SRC (x));
+      if (DF_REG_DEF_COUNT (regno) == 1)
+	{
+	  rtx_insn *def_insn = DF_REF_INSN (DF_REG_DEF_CHAIN (regno));
+	  rtx note = find_reg_note (def_insn, REG_LABEL_OPERAND, NULL_RTX);
+
+	  if (note)
+	    {
+	      rtx lab = gen_rtx_LABEL_REF (Pmode, XEXP (note, 0));
+	      if (validate_replace_rtx (SET_SRC (x), lab, insn))
+		rebuild_p = true;
+	    }
+	}
+    }
+
+  if (rebuild_p)
+    {
+      timevar_push (TV_JUMP);
+      rebuild_jump_labels (get_insns ());
+      if (purge_all_dead_edges ())
+	delete_unreachable_blocks ();
+      timevar_pop (TV_JUMP);
+    }
+}
+\f
 /* Set up fields memory, constant, and invariant from init_insns in
    the structures of array ira_reg_equiv.  */
 static void
@@ -5090,7 +5109,6 @@ ira (FILE *f)
 {
   bool loops_p;
   int ira_max_point_before_emit;
-  int rebuild_p;
   bool saved_flag_caller_saves = flag_caller_saves;
   enum ira_region saved_flag_ira_region = flag_ira_region;
 
@@ -5167,6 +5185,10 @@ ira (FILE *f)
 
   df_clear_flags (DF_NO_INSN_RESCAN);
 
+  indirect_jump_optimize ();
+  if (delete_trivially_dead_insns (get_insns (), max_reg_num ()))
+    df_analyze ();
+
   regstat_init_n_sets_and_refs ();
   regstat_compute_ri ();
 
@@ -5184,32 +5206,12 @@ ira (FILE *f)
   if (resize_reg_info () && flag_ira_loop_pressure)
     ira_set_pseudo_classes (true, ira_dump_file);
 
-  rebuild_p = update_equiv_regs ();
+  update_equiv_regs ();
   setup_reg_equiv ();
   setup_reg_equiv_init ();
 
-  bool update_regstat = false;
-
-  if (optimize && rebuild_p)
-    {
-      timevar_push (TV_JUMP);
-      rebuild_jump_labels (get_insns ());
-      if (purge_all_dead_edges ())
-	{
-	  delete_unreachable_blocks ();
-	  update_regstat = true;
-	}
-      timevar_pop (TV_JUMP);
-    }
-
   allocated_reg_info_size = max_reg_num ();
 
-  if (delete_trivially_dead_insns (get_insns (), max_reg_num ()))
-    {
-      df_analyze ();
-      update_regstat = true;
-    }
-
   /* It is not worth to do such improvement when we use a simple
      allocation because of -O0 usage or because the function is too
      big.  */
@@ -5319,7 +5321,7 @@ ira (FILE *f)
     check_allocation ();
 #endif
 
-  if (update_regstat || max_regno != max_regno_before_ira)
+  if (max_regno != max_regno_before_ira)
     {
       regstat_free_n_sets_and_refs ();
       regstat_free_ri ();
diff --git a/gcc/testsuite/gcc.dg/pr69195.c b/gcc/testsuite/gcc.dg/pr69195.c
new file mode 100644
index 0000000..af373a1
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr69195.c
@@ -0,0 +1,27 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -fno-dce -fno-forward-propagate" } */
+
+void __attribute__ ((noinline, noclone))
+foo (int *a, int n)
+{
+  int *lasta = a + n;
+  for (; a != lasta; a++)
+    {
+      *a *= 2;
+      a[1] = a[-1] + a[-2];
+    }
+}
+
+int
+main ()
+{
+  int a[16] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 };
+  int r[16] = { 1, 2, 6, 6, 16, 24, 44, 80,
+		136, 248, 432, 768, 1360, 2400, 4256, 3760 };
+  unsigned i;
+  foo (&a[2], 13);
+  for (i = 0; i < 8; ++i)
+    if (a[i] != r[i])
+      __builtin_abort ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/pr69238.c b/gcc/testsuite/gcc.dg/pr69238.c
new file mode 100644
index 0000000..3538e63
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr69238.c
@@ -0,0 +1,28 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -fno-dce -fno-forward-propagate -fno-rerun-cse-after-loop -funroll-loops" } */
+
+
+#define N 32
+
+short sa[N];
+short sb[N];
+int ia[N];
+int ib[N];
+
+int __attribute__ ((noinline, noclone))
+main1 (int n)
+{
+  int i;
+  for (i = 0; i < n; i++)
+    {
+      sa[i+7] = sb[i];
+      ia[i+3] = ib[i+1];
+    }
+  return 0;
+}
+
+int
+main (void)
+{ 
+  return main1 (N-7);
+}

-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] PR69195, Reload confused by invalid reg equivs
  2016-03-10  9:18     ` [PATCH] " Alan Modra
@ 2016-03-10  9:48       ` Bernd Schmidt
  2016-03-11 20:40       ` Andreas Schwab
  1 sibling, 0 replies; 18+ messages in thread
From: Bernd Schmidt @ 2016-03-10  9:48 UTC (permalink / raw)
  To: Alan Modra; +Cc: gcc-patches

On 03/10/2016 10:18 AM, Alan Modra wrote:
> Doing the indirect jump optimization turned out to be quite easy.
>
> Bootstrapped and regression tested powerpc64le-linux, gcc-6, gcc-5 and
> gcc-4.9.  Bootstrap and regression test x86_64-linux still running.
> OK to apply?

So much nicer. Ok, and thanks.


Bernd

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] PR69195, Reload confused by invalid reg equivs
  2016-03-10  9:18     ` [PATCH] " Alan Modra
  2016-03-10  9:48       ` Bernd Schmidt
@ 2016-03-11 20:40       ` Andreas Schwab
  2016-03-11 22:16         ` Alan Modra
  1 sibling, 1 reply; 18+ messages in thread
From: Andreas Schwab @ 2016-03-11 20:40 UTC (permalink / raw)
  To: Alan Modra; +Cc: Bernd Schmidt, gcc-patches

I'm getting this crash on ia64 for gcc.c-torture/compile/pr58164.c:

Program received signal SIGSEGV, Segmentation fault.
0x40000000012286e0 in indirect_jump_optimize () at ../../gcc/ira.c:3865
3865              rtx_insn *def_insn = DF_REF_INSN (DF_REG_DEF_CHAIN (regno));
(gdb) bt
#0  0x40000000012286e0 in indirect_jump_optimize () at ../../gcc/ira.c:3865
#1  0x4000000001239570 in ira (f=0x0) at ../../gcc/ira.c:5188
#2  0x400000000123a6d0 in (anonymous namespace)::pass_ira::execute (
    this=0x60000000002e2f50) at ../../gcc/ira.c:5539
Python Exception <type 'exceptions.AttributeError'> 'tuple' object has no attribute 'major': 
#3  0x4000000001628ab0 in execute_one_pass (pass=) at ../../gcc/passes.c:2336
Python Exception <type 'exceptions.AttributeError'> 'tuple' object has no attribute 'major': 
#4  0x4000000001629aa0 in execute_pass_list_1 (pass=)
    at ../../gcc/passes.c:2410
Python Exception <type 'exceptions.AttributeError'> 'tuple' object has no attribute 'major': 
#5  0x4000000001629bb0 in execute_pass_list_1 (pass=)
    at ../../gcc/passes.c:2411
Python Exception <type 'exceptions.AttributeError'> 'tuple' object has no attribute 'major': 
#6  0x4000000001629cf0 in execute_pass_list (fn=0x200000000104f0d8, pass=)
    at ../../gcc/passes.c:2421
Python Exception <type 'exceptions.AttributeError'> 'tuple' object has no attribute 'major': 
#7  0x4000000000917ee0 in cgraph_node::expand (this=)
    at ../../gcc/cgraphunit.c:1980
#8  0x40000000009196c0 in output_in_order (no_reorder=false)
    at ../../gcc/cgraphunit.c:2218
#9  0x4000000000923790 in symbol_table::compile (this=0x2000000000fb0000)
    at ../../gcc/cgraphunit.c:2466
#10 0x4000000000926ae0 in symbol_table::finalize_compilation_unit (
    this=0x2000000000fb0000) at ../../gcc/cgraphunit.c:2562
#11 0x4000000001a3ea80 in compile_file () at ../../gcc/toplev.c:490
#12 0x4000000001a44870 in do_compile () at ../../gcc/toplev.c:1988
#13 0x4000000001a45120 in toplev::main (this=0x600ffffffffef090, argc=2, 
    argv=0x600ffffffffef348) at ../../gcc/toplev.c:2096
#14 0x400000000318cad0 in main (argc=2, argv=0x600ffffffffef348)
    at ../../gcc/main.c:39

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] PR69195, Reload confused by invalid reg equivs
  2016-03-11 20:40       ` Andreas Schwab
@ 2016-03-11 22:16         ` Alan Modra
  2016-03-11 23:13           ` Alan Modra
  0 siblings, 1 reply; 18+ messages in thread
From: Alan Modra @ 2016-03-11 22:16 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Bernd Schmidt, gcc-patches

On Fri, Mar 11, 2016 at 09:39:58PM +0100, Andreas Schwab wrote:
> I'm getting this crash on ia64 for gcc.c-torture/compile/pr58164.c:
> 
> Program received signal SIGSEGV, Segmentation fault.
> 0x40000000012286e0 in indirect_jump_optimize () at ../../gcc/ira.c:3865
> 3865              rtx_insn *def_insn = DF_REF_INSN (DF_REG_DEF_CHAIN (regno));

Breakpoint 1, indirect_jump_optimize () at /src/gcc.git/gcc/ira.c:3863
(gdb) s
(gdb) p regno
$1 = 328
(gdb) p df->def_regs[328]
$2 = (df_reg_info *) 0x19772d0
(gdb) p *df->def_regs[328]
$3 = {reg_chain = 0x1981e60, n_refs = 1}
(gdb) p *df->def_regs[328]->reg_chain
$4 = {base = {cl = DF_REF_ARTIFICIAL, type = DF_REF_REG_DEF, flags = 0, regno = 328, reg = 0x7ffff6c271c8, next_loc = 0x0, chain = 0x0, insn_info = 0x0, next_reg = 0x0, prev_reg = 0x0, id = -1, ref_order = 22}, regular_ref = {base = {cl = DF_REF_ARTIFICIAL, type = DF_REF_REG_DEF, flags = 0, regno = 328, reg = 0x7ffff6c271c8, next_loc = 0x0, chain = 0x0, insn_info = 0x0, next_reg = 0x0, prev_reg = 0x0, id = -1, ref_order = 22}, loc = 0x7ffff6cf8068}, artificial_ref = {base = {cl = DF_REF_ARTIFICIAL, type = DF_REF_REG_DEF, flags = 0, regno = 328, reg = 0x7ffff6c271c8, next_loc = 0x0, chain = 0x0, insn_info = 0x0, next_reg = 0x0, prev_reg = 0x0, id = -1, ref_order = 22}, bb = 0x7ffff6cf8068}}

Bootstrapping the following (diff -w).

	* ira.c (indirect_jump_optimize): Ignore artificial defs.

diff --git a/gcc/ira.c b/gcc/ira.c
index 5e7a2ed..c4d76fc 100644
--- a/gcc/ira.c
+++ b/gcc/ira.c
@@ -3862,7 +3862,10 @@ indirect_jump_optimize (void)
       int regno = REGNO (SET_SRC (x));
       if (DF_REG_DEF_COUNT (regno) == 1)
 	{
-	  rtx_insn *def_insn = DF_REF_INSN (DF_REG_DEF_CHAIN (regno));
+	  df_ref def = DF_REG_DEF_CHAIN (regno);
+	  if (!DF_REF_IS_ARTIFICIAL (def))
+	    {
+	      rtx_insn *def_insn = DF_REF_INSN (def);
 	      rtx note = find_reg_note (def_insn, REG_LABEL_OPERAND, NULL_RTX);
 
 	      if (note)
@@ -3873,6 +3876,7 @@ indirect_jump_optimize (void)
 		}
 	    }
 	}
+    }
 
   if (rebuild_p)
     {

-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] PR69195, Reload confused by invalid reg equivs
  2016-03-11 22:16         ` Alan Modra
@ 2016-03-11 23:13           ` Alan Modra
  2016-03-12  7:58             ` Jakub Jelinek
  0 siblings, 1 reply; 18+ messages in thread
From: Alan Modra @ 2016-03-11 23:13 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Bernd Schmidt, gcc-patches

The underlying problem happens somewhere in tree-ssa-dse.c.  So we get
an indirect jump to a random location instead of a jump to 0.

pr58164.c.035t.mergephi1
;; Function foo (foo, funcdef_no=0, decl_uid=1389, cgraph_uid=0, symbol_order=0)

foo ()
{
  int x;

  <bb 2>:
  x = 0;
  goto &x;

}

pr58164.c.036t.dse1
;; Function foo (foo, funcdef_no=0, decl_uid=1389, cgraph_uid=0, symbol_order=0)

  Deleted dead store 'x = 0;
'
foo ()
{
  int x;

  <bb 2>:
  goto &x;

}

-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] PR69195, Reload confused by invalid reg equivs
  2016-03-11 23:13           ` Alan Modra
@ 2016-03-12  7:58             ` Jakub Jelinek
  2016-03-12  8:26               ` Jeff Law
  0 siblings, 1 reply; 18+ messages in thread
From: Jakub Jelinek @ 2016-03-12  7:58 UTC (permalink / raw)
  To: Alan Modra; +Cc: Andreas Schwab, Bernd Schmidt, gcc-patches

On Sat, Mar 12, 2016 at 09:43:50AM +1030, Alan Modra wrote:
> The underlying problem happens somewhere in tree-ssa-dse.c.  So we get
> an indirect jump to a random location instead of a jump to 0.

Well, the testcase is there just to make sure we don't ICE on it.
And, changing just DSE can't be a complete solution, because one can use
uninitialized var from the beginning:
int
foo (void)
{
  int x;
  goto *&x;
}

	Jakub

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] PR69195, Reload confused by invalid reg equivs
  2016-03-12  7:58             ` Jakub Jelinek
@ 2016-03-12  8:26               ` Jeff Law
  2016-03-12  9:07                 ` Alan Modra
  0 siblings, 1 reply; 18+ messages in thread
From: Jeff Law @ 2016-03-12  8:26 UTC (permalink / raw)
  To: Jakub Jelinek, Alan Modra; +Cc: Andreas Schwab, Bernd Schmidt, gcc-patches

On 03/12/2016 12:58 AM, Jakub Jelinek wrote:
> On Sat, Mar 12, 2016 at 09:43:50AM +1030, Alan Modra wrote:
>> The underlying problem happens somewhere in tree-ssa-dse.c.  So we get
>> an indirect jump to a random location instead of a jump to 0.
>
> Well, the testcase is there just to make sure we don't ICE on it.
> And, changing just DSE can't be a complete solution, because one can use
> uninitialized var from the beginning:
> int
> foo (void)
> {
>    int x;
>    goto *&x;
> }
>
> 	Jakub
>
I believe Alan's point is DSE deleted the assignment to x which can't be 
right as long as we've left in goto *&x.

The goto *&x should be a use of x and thus should have kept the 
assignment live.

Jeff

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] PR69195, Reload confused by invalid reg equivs
  2016-03-12  8:26               ` Jeff Law
@ 2016-03-12  9:07                 ` Alan Modra
  2016-03-12  9:29                   ` Jakub Jelinek
  0 siblings, 1 reply; 18+ messages in thread
From: Alan Modra @ 2016-03-12  9:07 UTC (permalink / raw)
  To: Jeff Law; +Cc: Jakub Jelinek, Andreas Schwab, Bernd Schmidt, gcc-patches

On Sat, Mar 12, 2016 at 01:26:39AM -0700, Jeff Law wrote:
> On 03/12/2016 12:58 AM, Jakub Jelinek wrote:
> >On Sat, Mar 12, 2016 at 09:43:50AM +1030, Alan Modra wrote:
> >>The underlying problem happens somewhere in tree-ssa-dse.c.  So we get
> >>an indirect jump to a random location instead of a jump to 0.
> >
> >Well, the testcase is there just to make sure we don't ICE on it.
> >And, changing just DSE can't be a complete solution, because one can use
> >uninitialized var from the beginning:
> >int
> >foo (void)
> >{
> >   int x;
> >   goto *&x;
> >}
> >
> >	Jakub
> >
> I believe Alan's point is DSE deleted the assignment to x which can't be
> right as long as we've left in goto *&x.
> 
> The goto *&x should be a use of x and thus should have kept the assignment
> live.

Right, I wasn't trying to say that ira.c:indirect_jump_optimize is
OK.  It needs the patch I posted or perhaps even better a test of
DF_REF_INSN_INFO rather than !DF_REF_IS_ARTIFICIAL (simply because the
flag test is reading another field, and we need to read
DF_REF_INSN_INFO anyway).

-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] PR69195, Reload confused by invalid reg equivs
  2016-03-12  9:07                 ` Alan Modra
@ 2016-03-12  9:29                   ` Jakub Jelinek
  2016-03-12 11:10                     ` Richard Biener
  0 siblings, 1 reply; 18+ messages in thread
From: Jakub Jelinek @ 2016-03-12  9:29 UTC (permalink / raw)
  To: Alan Modra; +Cc: Jeff Law, Andreas Schwab, Bernd Schmidt, gcc-patches

On Sat, Mar 12, 2016 at 07:37:25PM +1030, Alan Modra wrote:
> > I believe Alan's point is DSE deleted the assignment to x which can't be
> > right as long as we've left in goto *&x.
> > 
> > The goto *&x should be a use of x and thus should have kept the assignment
> > live.
> 
> Right, I wasn't trying to say that ira.c:indirect_jump_optimize is
> OK.  It needs the patch I posted or perhaps even better a test of
> DF_REF_INSN_INFO rather than !DF_REF_IS_ARTIFICIAL (simply because the
> flag test is reading another field, and we need to read
> DF_REF_INSN_INFO anyway).

Ok, that was my point.  BTW, DSE isn't the only one that deletes x = 0;
cddce deletes it too.  -fno-tree-dse -fno-tree-dce preserves it till
expansion.

	Jakub

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] PR69195, Reload confused by invalid reg equivs
  2016-03-12  9:29                   ` Jakub Jelinek
@ 2016-03-12 11:10                     ` Richard Biener
  2016-03-12 17:07                       ` Jeff Law
  0 siblings, 1 reply; 18+ messages in thread
From: Richard Biener @ 2016-03-12 11:10 UTC (permalink / raw)
  To: Jakub Jelinek, Alan Modra
  Cc: Jeff Law, Andreas Schwab, Bernd Schmidt, gcc-patches

On March 12, 2016 10:29:40 AM GMT+01:00, Jakub Jelinek <jakub@redhat.com> wrote:
>On Sat, Mar 12, 2016 at 07:37:25PM +1030, Alan Modra wrote:
>> > I believe Alan's point is DSE deleted the assignment to x which
>can't be
>> > right as long as we've left in goto *&x.
>> > 
>> > The goto *&x should be a use of x and thus should have kept the
>assignment
>> > live.
>> 
>> Right, I wasn't trying to say that ira.c:indirect_jump_optimize is
>> OK.  It needs the patch I posted or perhaps even better a test of
>> DF_REF_INSN_INFO rather than !DF_REF_IS_ARTIFICIAL (simply because
>the
>> flag test is reading another field, and we need to read
>> DF_REF_INSN_INFO anyway).
>
>Ok, that was my point.  BTW, DSE isn't the only one that deletes x = 0;
>cddce deletes it too.  -fno-tree-dse -fno-tree-dce preserves it till
>expansion.

GIMPLE_GOTO doesn't have VOPs and I don't think that we'd want VUSEs on all gotos. But just having them on indirect gotos would be inconsistent.

I believe the code is undefined anyway and out of scope of a reasonable QOI.

Using alloca to create/jump to code on the stack should work (we might transform that into a decl though).

Richard.

>	Jakub


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] PR69195, Reload confused by invalid reg equivs
  2016-03-12 11:10                     ` Richard Biener
@ 2016-03-12 17:07                       ` Jeff Law
  2016-03-14  9:56                         ` Richard Biener
  0 siblings, 1 reply; 18+ messages in thread
From: Jeff Law @ 2016-03-12 17:07 UTC (permalink / raw)
  To: Richard Biener, Jakub Jelinek, Alan Modra
  Cc: Andreas Schwab, Bernd Schmidt, gcc-patches

On 03/12/2016 04:10 AM, Richard Biener wrote:
> On March 12, 2016 10:29:40 AM GMT+01:00, Jakub Jelinek <jakub@redhat.com> wrote:
>> On Sat, Mar 12, 2016 at 07:37:25PM +1030, Alan Modra wrote:
>>>> I believe Alan's point is DSE deleted the assignment to x which
>> can't be
>>>> right as long as we've left in goto *&x.
>>>>
>>>> The goto *&x should be a use of x and thus should have kept the
>> assignment
>>>> live.
>>>
>>> Right, I wasn't trying to say that ira.c:indirect_jump_optimize is
>>> OK.  It needs the patch I posted or perhaps even better a test of
>>> DF_REF_INSN_INFO rather than !DF_REF_IS_ARTIFICIAL (simply because
>> the
>>> flag test is reading another field, and we need to read
>>> DF_REF_INSN_INFO anyway).
>>
>> Ok, that was my point.  BTW, DSE isn't the only one that deletes x = 0;
>> cddce deletes it too.  -fno-tree-dse -fno-tree-dce preserves it till
>> expansion.
>
> GIMPLE_GOTO doesn't have VOPs and I don't think that we'd want VUSEs on all gotos. But just having them on indirect gotos would be inconsistent.
>
> I believe the code is undefined anyway and out of scope of a reasonable QOI.
Undefined?  Most likely.  But we still have to do something sensible. 
As Jakub noted, a user could create the problematic code just as easily 
as DCE/DSE, so IRA probably needs to be tolerant of this situation.

So it seems like you're suggesting we leave DCE/DSE alone (declaring 
this usage undefined) and fix IRA to be tolerant, right?


> Using alloca to create/jump to code on the stack should work (we might transform that into a decl though).
Given that executable stacks are a huge security hole, I'd be willing to 
go out on a limb and declare that undefined as well.  It's not as clear 
cut, but that's the argument I'd make.

And yes, I realize that goes in opposition to what GCC has allowed for 
20+ years.  I still think it'd be the right thing to do.

jeff

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] PR69195, Reload confused by invalid reg equivs
  2016-03-12 17:07                       ` Jeff Law
@ 2016-03-14  9:56                         ` Richard Biener
  2016-03-14 19:00                           ` Jeff Law
  0 siblings, 1 reply; 18+ messages in thread
From: Richard Biener @ 2016-03-14  9:56 UTC (permalink / raw)
  To: Jeff Law
  Cc: Jakub Jelinek, Alan Modra, Andreas Schwab, Bernd Schmidt, GCC Patches

On Sat, Mar 12, 2016 at 6:07 PM, Jeff Law <law@redhat.com> wrote:
> On 03/12/2016 04:10 AM, Richard Biener wrote:
>>
>> On March 12, 2016 10:29:40 AM GMT+01:00, Jakub Jelinek <jakub@redhat.com>
>> wrote:
>>>
>>> On Sat, Mar 12, 2016 at 07:37:25PM +1030, Alan Modra wrote:
>>>>>
>>>>> I believe Alan's point is DSE deleted the assignment to x which
>>>
>>> can't be
>>>>>
>>>>> right as long as we've left in goto *&x.
>>>>>
>>>>> The goto *&x should be a use of x and thus should have kept the
>>>
>>> assignment
>>>>>
>>>>> live.
>>>>
>>>>
>>>> Right, I wasn't trying to say that ira.c:indirect_jump_optimize is
>>>> OK.  It needs the patch I posted or perhaps even better a test of
>>>> DF_REF_INSN_INFO rather than !DF_REF_IS_ARTIFICIAL (simply because
>>>
>>> the
>>>>
>>>> flag test is reading another field, and we need to read
>>>> DF_REF_INSN_INFO anyway).
>>>
>>>
>>> Ok, that was my point.  BTW, DSE isn't the only one that deletes x = 0;
>>> cddce deletes it too.  -fno-tree-dse -fno-tree-dce preserves it till
>>> expansion.
>>
>>
>> GIMPLE_GOTO doesn't have VOPs and I don't think that we'd want VUSEs on
>> all gotos. But just having them on indirect gotos would be inconsistent.
>>
>> I believe the code is undefined anyway and out of scope of a reasonable
>> QOI.
>
> Undefined?  Most likely.  But we still have to do something sensible. As
> Jakub noted, a user could create the problematic code just as easily as
> DCE/DSE, so IRA probably needs to be tolerant of this situation.
>
> So it seems like you're suggesting we leave DCE/DSE alone (declaring this
> usage undefined) and fix IRA to be tolerant, right?

Tolerant as in not crash?  Yes.

Note that DCE/DSE would be happy if the stores were global memory.  After
my recent fix even addresses based on functions and labels work here.

What does not work is if you jump to automatic storage the compiler knows
how to compute liveness of as gotos are not considered here.  I can't think
of a way to make the IL handle this without severely pessimizing regular
DCE/DSE or making it very "hacky" in only giving VUSEs to gotos that
possibly may reach "local" memory.

That said, I can of course try and will once I see a testcase we break that
matters in real life from a correctness perspective - like I did for that ARM
kernel patching bug.

>> Using alloca to create/jump to code on the stack should work (we might
>> transform that into a decl though).
>
> Given that executable stacks are a huge security hole, I'd be willing to go
> out on a limb and declare that undefined as well.  It's not as clear cut,
> but that's the argument I'd make.

Well, I thought about somebody trying to build trampolines in a way exposed
to GCC.

> And yes, I realize that goes in opposition to what GCC has allowed for 20+
> years.  I still think it'd be the right thing to do.

Did we allow this?  Not by design but rather by accident I suppose.

Richard.

> jeff
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] PR69195, Reload confused by invalid reg equivs
  2016-03-14  9:56                         ` Richard Biener
@ 2016-03-14 19:00                           ` Jeff Law
  2016-03-15  2:27                             ` Alan Modra
  0 siblings, 1 reply; 18+ messages in thread
From: Jeff Law @ 2016-03-14 19:00 UTC (permalink / raw)
  To: Richard Biener
  Cc: Jakub Jelinek, Alan Modra, Andreas Schwab, Bernd Schmidt, GCC Patches

On 03/14/2016 03:56 AM, Richard Biener wrote:

>> Undefined?  Most likely.  But we still have to do something sensible. As
>> Jakub noted, a user could create the problematic code just as easily as
>> DCE/DSE, so IRA probably needs to be tolerant of this situation.
>>
>> So it seems like you're suggesting we leave DCE/DSE alone (declaring this
>> usage undefined) and fix IRA to be tolerant, right?
>
> Tolerant as in not crash?  Yes.
Right.  Tolerant as in not crash.

>
>>> Using alloca to create/jump to code on the stack should work (we might
>>> transform that into a decl though).
>>
>> Given that executable stacks are a huge security hole, I'd be willing to go
>> out on a limb and declare that undefined as well.  It's not as clear cut,
>> but that's the argument I'd make.
>
> Well, I thought about somebody trying to build trampolines in a way exposed
> to GCC.
Right or other dynamic, short-lived code fragments.

>
>> And yes, I realize that goes in opposition to what GCC has allowed for 20+
>> years.  I still think it'd be the right thing to do.
>
> Did we allow this?  Not by design but rather by accident I suppose.
I don't think it was ever specifically allowed or disallowed; like many 
of the old extensions, it was never crisply defined.

I can distinctly remember having to declare that taking the address of a 
blob of code on the stack, then calling/jumping to it after the 
containing function went out of scope as undefined.  I think it was the 
address of a trampoline, but I'm not entirely sure -- there's a small 
chance it was user-created code.  I only remember it because I was 
surprised at how controversial it was to declare that as undefined :(

That most likely predates egcs, so the discussion is not likely in the 
public archives.  It may have been a private discussion between Kenner, 
Jim, Doug and myself or some subset thereof.


jeff

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] PR69195, Reload confused by invalid reg equivs
  2016-03-14 19:00                           ` Jeff Law
@ 2016-03-15  2:27                             ` Alan Modra
  2016-03-15 12:17                               ` Bernd Schmidt
  0 siblings, 1 reply; 18+ messages in thread
From: Alan Modra @ 2016-03-15  2:27 UTC (permalink / raw)
  To: Jeff Law
  Cc: Richard Biener, Jakub Jelinek, Andreas Schwab, Bernd Schmidt,
	GCC Patches

On Mon, Mar 14, 2016 at 01:00:39PM -0600, Jeff Law wrote:
> Right.  Tolerant as in not crash.

So can someone please approve my ira.c:indirect_jump_optimize patch?
I'm not quite audacious enough to claim it is obvious.  The original
is at https://gcc.gnu.org/ml/gcc-patches/2016-03/msg00720.html,
reposted here with some additional comments.  Bootstrapped and
regression tested powerpc64le-linux and x86_64-linux.

(In the thread I mentioned we could use DF_REF_INSN_INFO in place of
!DF_REF_IS_ARTIFICAL.  I believe that is true, and it saves an access
to another field of df_ref, but !DF_REF_IS_ARTIFICIAL before access to
DF_REF_INSN or DF_REF_INSN_UID is somewhat more prevalent in the gcc
sources.)

	PR rtl-optimization/69195
	PR rtl-optimization/47992
	* ira.c (indirect_jump_optimize): Ignore artificial defs.
	Add comments.

diff --git a/gcc/ira.c b/gcc/ira.c
index 5e7a2ed..a543d90 100644
--- a/gcc/ira.c
+++ b/gcc/ira.c
@@ -3842,7 +3842,8 @@ update_equiv_regs (void)
   free (pdx_subregs);
 }
 
-/* A pass over indirect jumps, converting simple cases to direct jumps.  */
+/* A pass over indirect jumps, converting simple cases to direct jumps.
+   Combine does this optimization too, but only within a basic block.  */
 static void
 indirect_jump_optimize (void)
 {
@@ -3862,14 +3863,23 @@ indirect_jump_optimize (void)
       int regno = REGNO (SET_SRC (x));
       if (DF_REG_DEF_COUNT (regno) == 1)
 	{
-	  rtx_insn *def_insn = DF_REF_INSN (DF_REG_DEF_CHAIN (regno));
-	  rtx note = find_reg_note (def_insn, REG_LABEL_OPERAND, NULL_RTX);
-
-	  if (note)
+	  df_ref def = DF_REG_DEF_CHAIN (regno);
+	  if (!DF_REF_IS_ARTIFICIAL (def))
 	    {
-	      rtx lab = gen_rtx_LABEL_REF (Pmode, XEXP (note, 0));
-	      if (validate_replace_rtx (SET_SRC (x), lab, insn))
-		rebuild_p = true;
+	      rtx_insn *def_insn = DF_REF_INSN (def);
+	      rtx note = find_reg_note (def_insn, REG_LABEL_OPERAND, NULL_RTX);
+
+	      if (note)
+		{
+		  /* Substitute a LABEL_REF to the label given by the
+		     note rather than using SET_SRC of DEF_INSN.
+		     DEF_INSN might be loading the label constant from
+		     a constant pool, which isn't what we want in a
+		     direct branch.  */
+		  rtx lab = gen_rtx_LABEL_REF (Pmode, XEXP (note, 0));
+		  if (validate_replace_rtx (SET_SRC (x), lab, insn))
+		    rebuild_p = true;
+		}
 	    }
 	}
     }

-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] PR69195, Reload confused by invalid reg equivs
  2016-03-15  2:27                             ` Alan Modra
@ 2016-03-15 12:17                               ` Bernd Schmidt
  0 siblings, 0 replies; 18+ messages in thread
From: Bernd Schmidt @ 2016-03-15 12:17 UTC (permalink / raw)
  To: Alan Modra, Jeff Law
  Cc: Richard Biener, Jakub Jelinek, Andreas Schwab, GCC Patches

On 03/15/2016 03:27 AM, Alan Modra wrote:
> On Mon, Mar 14, 2016 at 01:00:39PM -0600, Jeff Law wrote:
>> Right.  Tolerant as in not crash.
>
> So can someone please approve my ira.c:indirect_jump_optimize patch?
> I'm not quite audacious enough to claim it is obvious.

Looks good to me.


Bernd

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2016-03-15 12:17 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-03-04 14:54 [RFC] PR69195, Reload confused by invalid reg equivs Alan Modra
2016-03-04 17:40 ` Bernd Schmidt
2016-03-07 14:47   ` Alan Modra
2016-03-10  9:18     ` [PATCH] " Alan Modra
2016-03-10  9:48       ` Bernd Schmidt
2016-03-11 20:40       ` Andreas Schwab
2016-03-11 22:16         ` Alan Modra
2016-03-11 23:13           ` Alan Modra
2016-03-12  7:58             ` Jakub Jelinek
2016-03-12  8:26               ` Jeff Law
2016-03-12  9:07                 ` Alan Modra
2016-03-12  9:29                   ` Jakub Jelinek
2016-03-12 11:10                     ` Richard Biener
2016-03-12 17:07                       ` Jeff Law
2016-03-14  9:56                         ` Richard Biener
2016-03-14 19:00                           ` Jeff Law
2016-03-15  2:27                             ` Alan Modra
2016-03-15 12:17                               ` Bernd Schmidt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).