public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH 0/2][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
@ 2018-09-26 21:14 Peter Bergner
  2018-09-26 21:16 ` [PATCH 1/2][IRA,LRA] " Peter Bergner
                   ` (2 more replies)
  0 siblings, 3 replies; 66+ messages in thread
From: Peter Bergner @ 2018-09-26 21:14 UTC (permalink / raw)
  To: GCC Patches; +Cc: Vladimir N Makarov, Jeff Law

PR86939 shows a problem where IRA (and LRA) adds an unneeded conflict between
a pseudo reg and a hard reg leading to an unnecessary copy.  The definition
of conflict that most register allocators use is that two live ranges conflict
if one is "live" at the definition of the other and they have different
values (here "live" means live and available).  In computing conflicts,
IRA/LRA do not try and represent the "...and they have different values"
which leads to the unneeded conflict.  They also add conflicts when they
handle insn operand uses, rather than at operand definitions that the
definition above implies and that other allocators I know about do.

The following 2 patches fixes (some) of the problems mentioned above
and is enough to fix the performance problem reported in the PR.
Firstly, PATCH 1 changes IRA and LRA to compute conflicts when handling
definitions, rather than at uses.  This change is required by the second
patch, since we cannot handle reg to reg copies specially (ie, skip the
addition of a conflict) if we've already made them conflict because we
noticed they're live simultaneously.  Technically, this should only
make a difference in the conflicts computed for two live ranges that
are both undefined.  Not really likely.  I also took it upon myself to
rename the *_born functions, since the live ranges are not born at the
point we call them.  They are actually deaths/last uses and we need to
add them to the "live" sets (we traverse the basic blocks in reverse).
Therefore, I changes the suffix from _born to _live.

PATCH 2 adds the special handling of pseudo to/from hard reg copies.
Specifically, we ignore the source reg of the copy when adding conflicts
for the reg that is being defined in the copy insn.  A few things I'd
like to point out.  Normally, in the other allocators I'm familiar with,
we handle conflicts for copies by removing the source reg from the "live"
set, even if it doesn't die in the copy.  Then when we process the
copies definitions, we add conflicts with everything that is live.
Since we've removed the source reg, then we've successfully not added a
conflict here.  Then we process the copies uses which will automatically
add the source reg back to the live set.  I tried that here too, but
given allocno ranges computation, I couldn't do that here.  I decided
just to create a global var that I can test for when adding conflicts.
Also, this patch does not handle pseudo to pseudo copies, since their
conflicts are computed by checking their allocno ranges for overlap
and I could not determine how to recognize and handle copies during
that process.  If someone has ideas, please let me know.  Finally, I
have explicitly disallowed special handling of copies of register pairs,
since that is difficult to get right.  I tried several solutions before
finally just giving up on them.  That could be a future work item along
with pseudo to pseudo handling if people want.

I'll note that I have bootstrapped and regtested PATCH 1 on both
powerpc64le-linux and x86_64-linux with not regressions.  I have
also bootstrapped and regtested PATCH 1 + PATCH 2 on the same two
platforms with no regressions.

Vlad, Jeff or anyone else, any comments on the approach used here
to fix PR86939?

If the patches are "ok" as is, I'd like to commit PATCH 1 first and
let it bake on trunk for a little while before committing PATCH 2.

Peter


^ permalink raw reply	[flat|nested] 66+ messages in thread

* [PATCH 1/2][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-09-26 21:14 [PATCH 0/2][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register Peter Bergner
@ 2018-09-26 21:16 ` Peter Bergner
  2018-09-26 21:36 ` [PATCH 2/2][IRA,LRA] " Peter Bergner
  2018-09-28 21:45 ` [PATCH 0/2][IRA,LRA] " Vladimir Makarov
  2 siblings, 0 replies; 66+ messages in thread
From: Peter Bergner @ 2018-09-26 21:16 UTC (permalink / raw)
  To: GCC Patches; +Cc: Vladimir N Makarov, Jeff Law

gcc/
	* ira-lives.c (make_hard_regno_born): Rename from this...
	(make_hard_regno_live): ... to this.  Remove update to conflict
	information.  Update function comment.
	(make_hard_regno_dead): Add conflict information update.  Update
	function comment.
	(make_object_born): Rename from this...
	(make_object_live): ... to this.  Remove update to conflict information.
	Update function comment.
	(make_object_dead):  Add conflict information update.  Update function
	comment.
	(mark_pseudo_regno_live): Call make_object_live.
	(mark_pseudo_regno_subword_live): Likewise.
	(mark_hard_reg_dead): Update function comment.
	(mark_hard_reg_live): Call make_hard_regno_live.
	(process_bb_node_lives): Likewise.
	* lra-lives.c (make_hard_regno_born): Rename from this...
	(make_hard_regno_live): ... to this.  Remove update to conflict
	information.  Remove now uneeded check_pic_pseudo_p argument.
	Update function comment.
	(make_hard_regno_dead): Add check_pic_pseudo_p argument and add update
	to conflict information.  Update function comment.
	(mark_pseudo_live): Remove update to conflict information.  Update
	function comment.
	(mark_pseudo_dead): Add conflict information update.
	(mark_regno_live): Call make_hard_regno_live.
	(mark_regno_dead): Call make_hard_regno_dead with new arguement.
	(process_bb_lives): Call make_hard_regno_live and make_hard_regno_dead.

Index: gcc/ira-lives.c
===================================================================
--- gcc/ira-lives.c	(revision 263506)
+++ gcc/ira-lives.c	(working copy)
@@ -84,14 +84,19 @@ static int *allocno_saved_at_call;
    supplemental to recog_data.  */
 static alternative_mask preferred_alternatives;
 
-/* Record the birth of hard register REGNO, updating hard_regs_live and
-   hard reg conflict information for living allocnos.  */
+/* Record hard register REGNO as now being live.  */
 static void
-make_hard_regno_born (int regno)
+make_hard_regno_live (int regno)
 {
-  unsigned int i;
-
   SET_HARD_REG_BIT (hard_regs_live, regno);
+}
+
+/* Process the definition of hard register REGNO.  This updates
+   hard_regs_live and hard reg conflict information for living allocnos.  */
+static void
+make_hard_regno_dead (int regno)
+{
+  unsigned int i;
   EXECUTE_IF_SET_IN_SPARSESET (objects_live, i)
     {
       ira_object_t obj = ira_object_id_map[i];
@@ -99,28 +104,17 @@ make_hard_regno_born (int regno)
       SET_HARD_REG_BIT (OBJECT_CONFLICT_HARD_REGS (obj), regno);
       SET_HARD_REG_BIT (OBJECT_TOTAL_CONFLICT_HARD_REGS (obj), regno);
     }
-}
-
-/* Process the death of hard register REGNO.  This updates
-   hard_regs_live.  */
-static void
-make_hard_regno_dead (int regno)
-{
   CLEAR_HARD_REG_BIT (hard_regs_live, regno);
 }
 
-/* Record the birth of object OBJ.  Set a bit for it in objects_live,
-   start a new live range for it if necessary and update hard register
-   conflicts.  */
+/* Record object OBJ as now being live.  Set a bit for it in objects_live,
+   and start a new live range for it if necessary.  */
 static void
-make_object_born (ira_object_t obj)
+make_object_live (ira_object_t obj)
 {
-  live_range_t lr = OBJECT_LIVE_RANGES (obj);
-
   sparseset_set_bit (objects_live, OBJECT_CONFLICT_ID (obj));
-  IOR_HARD_REG_SET (OBJECT_CONFLICT_HARD_REGS (obj), hard_regs_live);
-  IOR_HARD_REG_SET (OBJECT_TOTAL_CONFLICT_HARD_REGS (obj), hard_regs_live);
 
+  live_range_t lr = OBJECT_LIVE_RANGES (obj);
   if (lr == NULL
       || (lr->finish != curr_point && lr->finish + 1 != curr_point))
     ira_add_live_range_to_object (obj, curr_point, -1);
@@ -154,14 +148,18 @@ update_allocno_pressure_excess_length (i
     }
 }
 
-/* Process the death of object OBJ, which is associated with allocno
-   A.  This finishes the current live range for it.  */
+/* Process the definition of object OBJ, which is associated with allocno A.
+   This finishes the current live range for it.  */
 static void
 make_object_dead (ira_object_t obj)
 {
   live_range_t lr;
 
   sparseset_clear_bit (objects_live, OBJECT_CONFLICT_ID (obj));
+
+  IOR_HARD_REG_SET (OBJECT_CONFLICT_HARD_REGS (obj), hard_regs_live);
+  IOR_HARD_REG_SET (OBJECT_TOTAL_CONFLICT_HARD_REGS (obj), hard_regs_live);
+
   lr = OBJECT_LIVE_RANGES (obj);
   ira_assert (lr != NULL);
   lr->finish = curr_point;
@@ -290,7 +288,7 @@ mark_pseudo_regno_live (int regno)
 	continue;
 
       inc_register_pressure (pclass, nregs);
-      make_object_born (obj);
+      make_object_live (obj);
     }
 }
 
@@ -327,7 +325,7 @@ mark_pseudo_regno_subword_live (int regn
     return;
 
   inc_register_pressure (pclass, 1);
-  make_object_born (obj);
+  make_object_live (obj);
 }
 
 /* Mark the register REG as live.  Store a 1 in hard_regs_live for
@@ -351,7 +349,7 @@ mark_hard_reg_live (rtx reg)
 	      aclass = ira_hard_regno_allocno_class[regno];
 	      pclass = ira_pressure_class_translate[aclass];
 	      inc_register_pressure (pclass, 1);
-	      make_hard_regno_born (regno);
+	      make_hard_regno_live (regno);
 	    }
 	  regno++;
 	}
@@ -457,8 +455,8 @@ mark_pseudo_regno_subword_dead (int regn
   make_object_dead (obj);
 }
 
-/* Mark the hard register REG as dead.  Store a 0 in hard_regs_live for the
-   register.  */
+/* Process the definition of hard register REG.  This updates hard_regs_live
+   and hard reg conflict information for living allocnos.  */
 static void
 mark_hard_reg_dead (rtx reg)
 {
@@ -1298,7 +1296,7 @@ process_bb_node_lives (ira_loop_tree_nod
 	    unsigned int regno = EH_RETURN_DATA_REGNO (j);
 	    if (regno == INVALID_REGNUM)
 	      break;
-	    make_hard_regno_born (regno);
+	    make_hard_regno_live (regno);
 	  }
 
       /* Allocnos can't go in stack regs at the start of a basic block
@@ -1317,7 +1315,7 @@ process_bb_node_lives (ira_loop_tree_nod
 	      ALLOCNO_TOTAL_NO_STACK_REG_P (a) = true;
 	    }
 	  for (px = FIRST_STACK_REG; px <= LAST_STACK_REG; px++)
-	    make_hard_regno_born (px);
+	    make_hard_regno_live (px);
 #endif
 	  /* No need to record conflicts for call clobbered regs if we
 	     have nonlocal labels around, as we don't ever try to
@@ -1340,7 +1338,7 @@ process_bb_node_lives (ira_loop_tree_nod
 		      && REGNO (pic_offset_table_rtx) >= FIRST_PSEUDO_REGISTER)
 #endif
 		  )
-		make_hard_regno_born (px);
+		make_hard_regno_live (px);
 	}
 
       EXECUTE_IF_SET_IN_SPARSESET (objects_live, i)
Index: gcc/lra-lives.c
===================================================================
--- gcc/lra-lives.c	(revision 263506)
+++ gcc/lra-lives.c	(working copy)
@@ -223,42 +223,41 @@ lra_intersected_live_ranges_p (lra_live_
 /* The corresponding bitmaps of BB currently being processed.  */
 static bitmap bb_killed_pseudos, bb_gen_pseudos;
 
-/* The function processing birth of hard register REGNO.  It updates
-   living hard regs, START_LIVING, and conflict hard regs for living
-   pseudos.  Conflict hard regs for the pic pseudo is not updated if
-   REGNO is REAL_PIC_OFFSET_TABLE_REGNUM and CHECK_PIC_PSEUDO_P is
-   true.  */
+/* Record hard register REGNO as now being live.  It updates
+   living hard regs and START_LIVING.  */
 static void
-make_hard_regno_born (int regno, bool check_pic_pseudo_p ATTRIBUTE_UNUSED)
+make_hard_regno_live (int regno)
 {
-  unsigned int i;
-
   lra_assert (regno < FIRST_PSEUDO_REGISTER);
   if (TEST_HARD_REG_BIT (hard_regs_live, regno))
     return;
   SET_HARD_REG_BIT (hard_regs_live, regno);
   sparseset_set_bit (start_living, regno);
-  EXECUTE_IF_SET_IN_SPARSESET (pseudos_live, i)
-#ifdef REAL_PIC_OFFSET_TABLE_REGNUM
-    if (! check_pic_pseudo_p
-	|| regno != REAL_PIC_OFFSET_TABLE_REGNUM
-	|| pic_offset_table_rtx == NULL
-	|| i != REGNO (pic_offset_table_rtx))
-#endif
-      SET_HARD_REG_BIT (lra_reg_info[i].conflict_hard_regs, regno);
   if (fixed_regs[regno] || TEST_HARD_REG_BIT (hard_regs_spilled_into, regno))
     bitmap_set_bit (bb_gen_pseudos, regno);
 }
 
-/* Process the death of hard register REGNO.  This updates
-   hard_regs_live and START_DYING.  */
+/* Process the definition of hard register REGNO.  This updates
+   hard_regs_live, START_DYING and conflict hard regs for living
+   pseudos.  Conflict hard regs for the pic pseudo is not updated if
+   REGNO is REAL_PIC_OFFSET_TABLE_REGNUM and CHECK_PIC_PSEUDO_P is
+   true.  */
 static void
-make_hard_regno_dead (int regno)
+make_hard_regno_dead (int regno, bool check_pic_pseudo_p ATTRIBUTE_UNUSED)
 {
   lra_assert (regno < FIRST_PSEUDO_REGISTER);
   if (! TEST_HARD_REG_BIT (hard_regs_live, regno))
     return;
   sparseset_set_bit (start_dying, regno);
+  unsigned int i;
+  EXECUTE_IF_SET_IN_SPARSESET (pseudos_live, i)
+#ifdef REAL_PIC_OFFSET_TABLE_REGNUM
+    if (! check_pic_pseudo_p
+	|| regno != REAL_PIC_OFFSET_TABLE_REGNUM
+	|| pic_offset_table_rtx == NULL
+	|| i != REGNO (pic_offset_table_rtx))
+#endif
+      SET_HARD_REG_BIT (lra_reg_info[i].conflict_hard_regs, regno);
   CLEAR_HARD_REG_BIT (hard_regs_live, regno);
   if (fixed_regs[regno] || TEST_HARD_REG_BIT (hard_regs_spilled_into, regno))
     {
@@ -267,9 +266,9 @@ make_hard_regno_dead (int regno)
     }
 }
 
-/* Mark pseudo REGNO as living at program point POINT, update conflicting
-   hard registers of the pseudo and START_LIVING, and start a new live
-   range for the pseudo corresponding to REGNO if it is necessary.  */
+/* Mark pseudo REGNO as living at program point POINT, update START_LIVING
+   and start a new live range for the pseudo corresponding to REGNO if it
+   is necessary.  */
 static void
 mark_pseudo_live (int regno, int point)
 {
@@ -278,7 +277,6 @@ mark_pseudo_live (int regno, int point)
   lra_assert (regno >= FIRST_PSEUDO_REGISTER);
   lra_assert (! sparseset_bit_p (pseudos_live, regno));
   sparseset_set_bit (pseudos_live, regno);
-  IOR_HARD_REG_SET (lra_reg_info[regno].conflict_hard_regs, hard_regs_live);
 
   if ((complete_info_p || lra_get_regno_hard_regno (regno) < 0)
       && ((p = lra_reg_info[regno].live_ranges) == NULL
@@ -301,6 +299,9 @@ mark_pseudo_dead (int regno, int point)
   lra_assert (sparseset_bit_p (pseudos_live, regno));
   sparseset_clear_bit (pseudos_live, regno);
   sparseset_set_bit (start_dying, regno);
+
+  IOR_HARD_REG_SET (lra_reg_info[regno].conflict_hard_regs, hard_regs_live);
+
   if (complete_info_p || lra_get_regno_hard_regno (regno) < 0)
     {
       p = lra_reg_info[regno].live_ranges;
@@ -322,7 +323,7 @@ mark_regno_live (int regno, machine_mode
   if (regno < FIRST_PSEUDO_REGISTER)
     {
       for (last = end_hard_regno (mode, regno); regno < last; regno++)
-	make_hard_regno_born (regno, false);
+	make_hard_regno_live (regno);
     }
   else
     {
@@ -349,7 +350,7 @@ mark_regno_dead (int regno, machine_mode
   if (regno < FIRST_PSEUDO_REGISTER)
     {
       for (last = end_hard_regno (mode, regno); regno < last; regno++)
-	make_hard_regno_dead (regno);
+	make_hard_regno_dead (regno, false);
     }
   else
     {
@@ -834,13 +835,13 @@ process_bb_lives (basic_block bb, int &c
 
       for (reg = curr_static_id->hard_regs; reg != NULL; reg = reg->next)
 	if (reg->type != OP_IN)
-	  make_hard_regno_born (reg->regno, false);
+	  make_hard_regno_live (reg->regno);
 
       if (curr_id->arg_hard_regs != NULL)
 	for (i = 0; (regno = curr_id->arg_hard_regs[i]) >= 0; i++)
 	  if (regno >= FIRST_PSEUDO_REGISTER)
 	    /* It is a clobber.  */
-	    make_hard_regno_born (regno - FIRST_PSEUDO_REGISTER, false);
+	    make_hard_regno_live (regno - FIRST_PSEUDO_REGISTER);
 
       sparseset_copy (unused_set, start_living);
 
@@ -857,13 +858,14 @@ process_bb_lives (basic_block bb, int &c
       for (reg = curr_static_id->hard_regs; reg != NULL; reg = reg->next)
 	if (reg->type == OP_OUT
 	    && ! reg_early_clobber_p (reg, n_alt) && ! reg->subreg_p)
-	  make_hard_regno_dead (reg->regno);
+	  make_hard_regno_dead (reg->regno, false);
 
       if (curr_id->arg_hard_regs != NULL)
 	for (i = 0; (regno = curr_id->arg_hard_regs[i]) >= 0; i++)
 	  if (regno >= FIRST_PSEUDO_REGISTER)
-	    /* It is a clobber.  */
-	    make_hard_regno_dead (regno - FIRST_PSEUDO_REGISTER);
+	    /* It is a clobber.  Don't create conflict of used
+	       REAL_PIC_OFFSET_TABLE_REGNUM and the pic pseudo.  */
+	    make_hard_regno_dead (regno - FIRST_PSEUDO_REGISTER, true);
 
       if (call_p)
 	{
@@ -920,14 +922,14 @@ process_bb_lives (basic_block bb, int &c
 
       for (reg = curr_static_id->hard_regs; reg != NULL; reg = reg->next)
 	if (reg->type == OP_IN)
-	  make_hard_regno_born (reg->regno, false);
+	  make_hard_regno_live (reg->regno);
 
       if (curr_id->arg_hard_regs != NULL)
 	/* Make argument hard registers live.  Don't create conflict
 	   of used REAL_PIC_OFFSET_TABLE_REGNUM and the pic pseudo.  */
 	for (i = 0; (regno = curr_id->arg_hard_regs[i]) >= 0; i++)
 	  if (regno < FIRST_PSEUDO_REGISTER)
-	    make_hard_regno_born (regno, true);
+	    make_hard_regno_live (regno);
 
       sparseset_and_compl (dead_set, start_living, start_dying);
 
@@ -952,7 +954,7 @@ process_bb_lives (basic_block bb, int &c
 	      if (reg2->type != OP_OUT && reg2->regno == reg->regno)
 		break;
 	    if (reg2 == NULL)
-	      make_hard_regno_dead (reg->regno);
+	      make_hard_regno_dead (reg->regno, false);
 	  }
 
       if (need_curr_point_incr)
@@ -995,7 +997,7 @@ process_bb_lives (basic_block bb, int &c
 
 	if (regno == INVALID_REGNUM)
 	  break;
-	make_hard_regno_born (regno, false);
+	make_hard_regno_live (regno);
       }
 
   /* Pseudos can't go in stack regs at the start of a basic block that
@@ -1009,7 +1011,7 @@ process_bb_lives (basic_block bb, int &c
       EXECUTE_IF_SET_IN_SPARSESET (pseudos_live, px)
 	lra_reg_info[px].no_stack_p = true;
       for (px = FIRST_STACK_REG; px <= LAST_STACK_REG; px++)
-	make_hard_regno_born (px, false);
+	make_hard_regno_live (px);
 #endif
       /* No need to record conflicts for call clobbered regs if we
 	 have nonlocal labels around, as we don't ever try to
@@ -1029,7 +1031,7 @@ process_bb_lives (basic_block bb, int &c
 		  && REGNO (pic_offset_table_rtx) >= FIRST_PSEUDO_REGISTER)
 #endif
 	      )
-	    make_hard_regno_born (px, false);
+	    make_hard_regno_live (px);
     }
 
   bool live_change_p = false;

^ permalink raw reply	[flat|nested] 66+ messages in thread

* [PATCH 2/2][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-09-26 21:14 [PATCH 0/2][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register Peter Bergner
  2018-09-26 21:16 ` [PATCH 1/2][IRA,LRA] " Peter Bergner
@ 2018-09-26 21:36 ` Peter Bergner
  2018-09-28 21:45 ` [PATCH 0/2][IRA,LRA] " Vladimir Makarov
  2 siblings, 0 replies; 66+ messages in thread
From: Peter Bergner @ 2018-09-26 21:36 UTC (permalink / raw)
  To: GCC Patches; +Cc: Vladimir N Makarov, Jeff Law

gcc/
	* ira.h (copy_insn_p): New prototype.
	* ira-lives.c (ignore_reg_for_conflicts): New static variable.
	(make_hard_regno_dead): Don't add conflicts for register
	ignore_reg_for_conflicts.
	(make_object_dead): Likewise.
	(copy_insn_p): New function.
	(process_bb_node_lives): Set ignore_reg_for_conflicts for copies.
	* lra-lives.c (ignore_reg_for_conflicts): New static variable.
	(make_hard_regno_dead): Don't add conflicts for register
	ignore_reg_for_conflicts.
	(mark_pseudo_dead): Likewise.
	(process_bb_lives): Set ignore_reg_for_conflicts for copies.

gcc/testsuite/
	* gcc.target/powerpc/pr86939.c: New test.

Index: gcc/ira.h
===================================================================
--- gcc/ira.h     (revision 263506)
+++ gcc/ira.h     (working copy)
@@ -210,6 +210,9 @@ extern void ira_adjust_equiv_reg_cost (u
 /* ira-costs.c */
 extern void ira_costs_c_finalize (void);
 
+/* ira-lives.c */
+extern rtx copy_insn_p (rtx_insn *);
+
 /* Spilling static chain pseudo may result in generation of wrong
    non-local goto code using frame-pointer to address saved stack
    pointer value after restoring old frame pointer value.  The
Index: gcc/ira-lives.c
===================================================================
--- gcc/ira-lives.c     (revision 263506)
+++ gcc/ira-lives.c     (working copy)
@@ -84,6 +84,10 @@ static int *allocno_saved_at_call;
    supplemental to recog_data.  */
 static alternative_mask preferred_alternatives;
 
+/* If non-NULL, the source operand of a register to register copy for which
+   we should not add a conflict with the copy's destination operand.  */
+static rtx ignore_reg_for_conflicts;
+
 /* Record hard register REGNO as now being live.  */
 static void
 make_hard_regno_live (int regno)
@@ -101,6 +105,11 @@ make_hard_regno_dead (int regno)
     {
       ira_object_t obj = ira_object_id_map[i];
 
+      if (ignore_reg_for_conflicts != NULL_RTX
+	  && REGNO (ignore_reg_for_conflicts)
+	     == (unsigned int) ALLOCNO_REGNO (OBJECT_ALLOCNO (obj)))
+	continue;
+
       SET_HARD_REG_BIT (OBJECT_CONFLICT_HARD_REGS (obj), regno);
       SET_HARD_REG_BIT (OBJECT_TOTAL_CONFLICT_HARD_REGS (obj), regno);
     }
@@ -154,12 +163,38 @@ static void
 make_object_dead (ira_object_t obj)
 {
   live_range_t lr;
+  int ignore_regno = -1;
+  int last_regno = -1;
 
   sparseset_clear_bit (objects_live, OBJECT_CONFLICT_ID (obj));
 
+  /* Check whether any part of IGNORE_REG_FOR_CONFLICTS already conflicts
+     with OBJ.  */
+  if (ignore_reg_for_conflicts != NULL_RTX
+      && REGNO (ignore_reg_for_conflicts) < FIRST_PSEUDO_REGISTER)
+    {
+      last_regno = END_REGNO (ignore_reg_for_conflicts);
+      int src_regno = ignore_regno = REGNO (ignore_reg_for_conflicts);
+
+      while (src_regno < last_regno)
+	{
+	  if (TEST_HARD_REG_BIT (OBJECT_CONFLICT_HARD_REGS (obj), src_regno))
+	    {
+	      ignore_regno = last_regno = -1;
+	      break;
+	    }
+	  src_regno++;
+	}
+    }
+
   IOR_HARD_REG_SET (OBJECT_CONFLICT_HARD_REGS (obj), hard_regs_live);
   IOR_HARD_REG_SET (OBJECT_TOTAL_CONFLICT_HARD_REGS (obj), hard_regs_live);
 
+  /* If IGNORE_REG_FOR_CONFLICTS did not already conflict with OBJ, make
+     sure it still doesn't.  */
+  for (; ignore_regno < last_regno; ignore_regno++)
+    CLEAR_HARD_REG_BIT (OBJECT_CONFLICT_HARD_REGS (obj), ignore_regno);
+
   lr = OBJECT_LIVE_RANGES (obj);
   ira_assert (lr != NULL);
   lr->finish = curr_point;
@@ -1022,6 +1057,38 @@ find_call_crossed_cheap_reg (rtx_insn *i
   return cheap_reg;
 }  
 
+/* Determine whether INSN is a register to register copy of the type where
+   we do not need to make the source and destiniation registers conflict.
+   If this is a copy instruction, then return the source reg.  Otherwise,
+   return NULL_RTX.  */
+rtx
+copy_insn_p (rtx_insn *insn)
+{
+  rtx set;
+
+  /* Disallow anything other than a simple register to register copy
+     that has no side effects.  */
+  if ((set = single_set (insn)) == NULL_RTX
+      || !REG_P (SET_DEST (set))
+      || !REG_P (SET_SRC (set))
+      || side_effects_p (set))
+    return NULL_RTX;
+
+  int dst_regno = REGNO (SET_DEST (set));
+  int src_regno = REGNO (SET_SRC (set));
+  machine_mode mode = GET_MODE (SET_DEST (set));
+
+  /* Computing conflicts for register pairs is difficult to get right, so
+     for now, disallow it.  */
+  if ((dst_regno < FIRST_PSEUDO_REGISTER
+       && hard_regno_nregs (dst_regno, mode) != 1)
+      || (src_regno < FIRST_PSEUDO_REGISTER
+	  && hard_regno_nregs (src_regno, mode) != 1))
+    return NULL_RTX;
+
+  return SET_SRC (set);
+}
+
 /* Process insns of the basic block given by its LOOP_TREE_NODE to
    update allocno live ranges, allocno hard register conflicts,
    intersected calls, and register pressure info for allocnos for the
@@ -1107,6 +1174,7 @@ process_bb_node_lives (ira_loop_tree_nod
 		     curr_point);
 
 	  call_p = CALL_P (insn);
+	  ignore_reg_for_conflicts = copy_insn_p (insn);
 #ifdef REAL_PIC_OFFSET_TABLE_REGNUM
 	  int regno;
 	  bool clear_pic_use_conflict_p = false;
@@ -1289,6 +1357,7 @@ process_bb_node_lives (ira_loop_tree_nod
 #endif
 	  curr_point++;
 	}
+      ignore_reg_for_conflicts = NULL_RTX;
 
       if (bb_has_eh_pred (bb))
 	for (j = 0; ; ++j)
Index: gcc/ira-lives.c
===================================================================
--- gcc/lra-lives.c     (revision 263506)
+++ gcc/lra-lives.c     (working copy)
@@ -96,6 +96,10 @@ static bitmap_head temp_bitmap;
 /* Pool for pseudo live ranges.	 */
 static object_allocator<lra_live_range> lra_live_range_pool ("live ranges");
 
+/* If non-NULL, the source operand of a register to register copy for which
+   we should not add a conflict with the copy's destination operand.  */
+static rtx ignore_reg_for_conflicts;
+
 /* Free live range list LR.  */
 static void
 free_live_range_list (lra_live_range_t lr)
@@ -251,13 +255,18 @@ make_hard_regno_dead (int regno, bool ch
   sparseset_set_bit (start_dying, regno);
   unsigned int i;
   EXECUTE_IF_SET_IN_SPARSESET (pseudos_live, i)
+    {
+      if (ignore_reg_for_conflicts != NULL_RTX
+	  && REGNO (ignore_reg_for_conflicts) == i)
+	continue;
 #ifdef REAL_PIC_OFFSET_TABLE_REGNUM
-    if (! check_pic_pseudo_p
-	|| regno != REAL_PIC_OFFSET_TABLE_REGNUM
-	|| pic_offset_table_rtx == NULL
-	|| i != REGNO (pic_offset_table_rtx))
+      if (! check_pic_pseudo_p
+	  || regno != REAL_PIC_OFFSET_TABLE_REGNUM
+	  || pic_offset_table_rtx == NULL
+	  || i != REGNO (pic_offset_table_rtx))
 #endif
-      SET_HARD_REG_BIT (lra_reg_info[i].conflict_hard_regs, regno);
+	SET_HARD_REG_BIT (lra_reg_info[i].conflict_hard_regs, regno);
+    }
   CLEAR_HARD_REG_BIT (hard_regs_live, regno);
   if (fixed_regs[regno] || TEST_HARD_REG_BIT (hard_regs_spilled_into, regno))
     {
@@ -294,14 +303,41 @@ static void
 mark_pseudo_dead (int regno, int point)
 {
   lra_live_range_t p;
+  int ignore_regno = -1;
+  int last_regno = -1;
 
   lra_assert (regno >= FIRST_PSEUDO_REGISTER);
   lra_assert (sparseset_bit_p (pseudos_live, regno));
   sparseset_clear_bit (pseudos_live, regno);
   sparseset_set_bit (start_dying, regno);
 
+  /* Check whether any part of IGNORE_REG_FOR_CONFLICTS already conflicts
+     with REGNO.  */
+  if (ignore_reg_for_conflicts != NULL_RTX
+      && REGNO (ignore_reg_for_conflicts) < FIRST_PSEUDO_REGISTER)
+    {
+      last_regno = END_REGNO (ignore_reg_for_conflicts);
+      int src_regno = ignore_regno = REGNO (ignore_reg_for_conflicts);
+
+      while (src_regno < last_regno)
+	{
+	  if (TEST_HARD_REG_BIT (lra_reg_info[regno].conflict_hard_regs,
+				 src_regno))
+	    {
+	      ignore_regno = last_regno = -1;
+	      break;
+	    }
+	  src_regno++;
+	}
+    }
+
   IOR_HARD_REG_SET (lra_reg_info[regno].conflict_hard_regs, hard_regs_live);
 
+  /* If IGNORE_REG_FOR_CONFLICTS did not already conflict with REGNO, make
+     sure it still doesn't.  */
+  for (; ignore_regno < last_regno; ignore_regno++)
+    CLEAR_HARD_REG_BIT (lra_reg_info[regno].conflict_hard_regs, ignore_regno);
+
   if (complete_info_p || lra_get_regno_hard_regno (regno) < 0)
     {
       p = lra_reg_info[regno].live_ranges;
@@ -747,6 +783,7 @@ process_bb_lives (basic_block bb, int &c
 	}
 
       call_p = CALL_P (curr_insn);
+      ignore_reg_for_conflicts = copy_insn_p (curr_insn);
       src_regno = (set != NULL_RTX && REG_P (SET_SRC (set))
 		   ? REGNO (SET_SRC (set)) : -1);
       dst_regno = (set != NULL_RTX && REG_P (SET_DEST (set))
@@ -989,6 +1026,7 @@ process_bb_lives (basic_block bb, int &c
       EXECUTE_IF_SET_IN_SPARSESET (unused_set, j)
 	add_reg_note (curr_insn, REG_UNUSED, regno_reg_rtx[j]);
     }
+  ignore_reg_for_conflicts = NULL_RTX;
 
   if (bb_has_eh_pred (bb))
     for (j = 0; ; ++j)
Index: gcc/testsuite/gcc.target/powerpc/pr86939.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/pr86939.c	(nonexistent)
+++ gcc/testsuite/gcc.target/powerpc/pr86939.c	(working copy)
@@ -0,0 +1,12 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-options "-O2" } */
+
+typedef long (*fptr_t) (void);
+long
+func (fptr_t *p)
+{
+  if (p)
+    return (*p) ();
+  return 0;
+}
+/* { dg-final { scan-assembler-not {mr %?r?12,} } } */

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 0/2][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-09-26 21:14 [PATCH 0/2][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register Peter Bergner
  2018-09-26 21:16 ` [PATCH 1/2][IRA,LRA] " Peter Bergner
  2018-09-26 21:36 ` [PATCH 2/2][IRA,LRA] " Peter Bergner
@ 2018-09-28 21:45 ` Vladimir Makarov
  2018-09-30 20:28   ` Peter Bergner
  2 siblings, 1 reply; 66+ messages in thread
From: Vladimir Makarov @ 2018-09-28 21:45 UTC (permalink / raw)
  To: Peter Bergner; +Cc: gcc-patches

On 09/26/2018 05:14 PM, Peter Bergner wrote:
> PR86939 shows a problem where IRA (and LRA) adds an unneeded conflict between
> a pseudo reg and a hard reg leading to an unnecessary copy.  The definition
> of conflict that most register allocators use is that two live ranges conflict
> if one is "live" at the definition of the other and they have different
> values (here "live" means live and available).  In computing conflicts,
> IRA/LRA do not try and represent the "...and they have different values"
> which leads to the unneeded conflict.  They also add conflicts when they
> handle insn operand uses, rather than at operand definitions that the
> definition above implies and that other allocators I know about do.
There is some handling simple cases of the same value in LRA but such 
handling is absent in IRA which is the source of the reported problem.

I remember I experimented with value numbering and using it in building 
conflicts (that time it was global.c) but it did not improve the code as 
I expected and made RA much slower.

We still can improve simple cases though.  So thank you, Perter, for 
working on this very non-trivial issue.


> The following 2 patches fixes (some) of the problems mentioned above
> and is enough to fix the performance problem reported in the PR.
> Firstly, PATCH 1 changes IRA and LRA to compute conflicts when handling
> definitions, rather than at uses.  This change is required by the second
> patch, since we cannot handle reg to reg copies specially (ie, skip the
> addition of a conflict) if we've already made them conflict because we
> noticed they're live simultaneously.  Technically, this should only
> make a difference in the conflicts computed for two live ranges that
> are both undefined.  Not really likely.  I also took it upon myself to
> rename the *_born functions, since the live ranges are not born at the
> point we call them.  They are actually deaths/last uses and we need to
> add them to the "live" sets (we traverse the basic blocks in reverse).
> Therefore, I changes the suffix from _born to _live.
We had some PRs with old RA code generation in case of using undefined 
values.  I don't remember what problems were exactly but I remember Ken 
Zadeck worked on this too.  That is why I probably chose the current 
scheme of conflict building.

May be I was wrong because you testing shows that it is not a problem 
anymore.


> PATCH 2 adds the special handling of pseudo to/from hard reg copies.
> Specifically, we ignore the source reg of the copy when adding conflicts
> for the reg that is being defined in the copy insn.  A few things I'd
> like to point out.  Normally, in the other allocators I'm familiar with,
> we handle conflicts for copies by removing the source reg from the "live"
> set, even if it doesn't die in the copy.  Then when we process the
> copies definitions, we add conflicts with everything that is live.
> Since we've removed the source reg, then we've successfully not added a
> conflict here.  Then we process the copies uses which will automatically
> add the source reg back to the live set.  I tried that here too, but
> given allocno ranges computation, I couldn't do that here.  I decided
> just to create a global var that I can test for when adding conflicts.
> Also, this patch does not handle pseudo to pseudo copies, since their
> conflicts are computed by checking their allocno ranges for overlap
> and I could not determine how to recognize and handle copies during
> that process.  If someone has ideas, please let me know.  Finally, I
> have explicitly disallowed special handling of copies of register pairs,
> since that is difficult to get right.  I tried several solutions before
> finally just giving up on them.  That could be a future work item along
> with pseudo to pseudo handling if people want.
Using live ranges speeds up significantly IRA because we don't need to 
build the interference graph.  It also simplifies the fast register 
allocation implementation.  On the other hand, dealing with live ranges 
can be sometimes complicated.
> I'll note that I have bootstrapped and regtested PATCH 1 on both
> powerpc64le-linux and x86_64-linux with not regressions.  I have
> also bootstrapped and regtested PATCH 1 + PATCH 2 on the same two
> platforms with no regressions.
>
> Vlad, Jeff or anyone else, any comments on the approach used here
> to fix PR86939?
>
> If the patches are "ok" as is, I'd like to commit PATCH 1 first and
> let it bake on trunk for a little while before committing PATCH 2.
>
Peter, thank you again for working on the PR.  It is always interesting 
to read your comments about other register allocators and your 
experience about working on RAs.

I am reviewing the patches.  The 1st patch is ok for me.  You can commit 
it to the trunk.  I'll review the second patch on the next week.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 0/2][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-09-28 21:45 ` [PATCH 0/2][IRA,LRA] " Vladimir Makarov
@ 2018-09-30 20:28   ` Peter Bergner
  2018-10-01  0:57     ` H.J. Lu
  0 siblings, 1 reply; 66+ messages in thread
From: Peter Bergner @ 2018-09-30 20:28 UTC (permalink / raw)
  To: Vladimir Makarov; +Cc: gcc-patches, Jeff Law

On 9/28/18 4:34 PM, Vladimir Makarov wrote:
> I remember I experimented with value numbering and using it in building
> conflicts (that time it was global.c) but it did not improve the code as
> I expected and made RA much slower.

Yes, value numbering is not cheap to compute for catching special cases 
like this that probably don't happen too often.  Handling copies is fairly
cheap and easy though.


> We had some PRs with old RA code generation in case of using undefined values.
> I don't remember what problems were exactly but I remember Ken Zadeck worked
> on this too.  That is why I probably chose the current scheme of conflict
> building.
> 
> May be I was wrong because you testing shows that it is not a problem anymore.

Maybe it was due to code relying on undefined behavior?  Anyway, it'll be good
to let PATCH 1 bake on trunk for a little while before we commit PATCH 2 to
catch any (if they exist) problems related to that change.


> The 1st patch is ok for me.  You can commit it to the trunk.

Ok, PATCH 1 is now committed, thanks!


> I'll review the second patch on the next week.

Sounds great as it gives any problems caused or exposed by PATCH 1 time to
be reported.  Thanks again!


Peter

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 0/2][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-09-30 20:28   ` Peter Bergner
@ 2018-10-01  0:57     ` H.J. Lu
  2018-10-01  1:18       ` Peter Bergner
  0 siblings, 1 reply; 66+ messages in thread
From: H.J. Lu @ 2018-10-01  0:57 UTC (permalink / raw)
  To: bergner; +Cc: Vladimir Makarov, GCC Patches, Jeffrey Law

On Sun, Sep 30, 2018 at 1:21 PM Peter Bergner <bergner@linux.ibm.com> wrote:
>
> On 9/28/18 4:34 PM, Vladimir Makarov wrote:
> > I remember I experimented with value numbering and using it in building
> > conflicts (that time it was global.c) but it did not improve the code as
> > I expected and made RA much slower.
>
> Yes, value numbering is not cheap to compute for catching special cases
> like this that probably don't happen too often.  Handling copies is fairly
> cheap and easy though.
>
>
> > We had some PRs with old RA code generation in case of using undefined values.
> > I don't remember what problems were exactly but I remember Ken Zadeck worked
> > on this too.  That is why I probably chose the current scheme of conflict
> > building.
> >
> > May be I was wrong because you testing shows that it is not a problem anymore.
>
> Maybe it was due to code relying on undefined behavior?  Anyway, it'll be good
> to let PATCH 1 bake on trunk for a little while before we commit PATCH 2 to
> catch any (if they exist) problems related to that change.
>
>
> > The 1st patch is ok for me.  You can commit it to the trunk.
>
> Ok, PATCH 1 is now committed, thanks!
>
>
> > I'll review the second patch on the next week.
>
> Sounds great as it gives any problems caused or exposed by PATCH 1 time to
> be reported.  Thanks again!

This caused:

FAIL: gcc.target/i386/pr63527.c scan-assembler-not movl[ \t]%[^,]+, %ebx
FAIL: gcc.target/i386/pr63534.c scan-assembler-not movl[ \t]%[^,]+, %ebx
FAIL: gcc.target/i386/pr64317.c scan-assembler addl[
\\t]+[$]_GLOBAL_OFFSET_TABLE_, %ebx
FAIL: gcc.target/i386/pr64317.c scan-assembler movl[ \\t]+c@GOTOFF[(]%ebx[)]

for Linux/i686.


-- 
H.J.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 0/2][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-10-01  0:57     ` H.J. Lu
@ 2018-10-01  1:18       ` Peter Bergner
  2018-10-01 12:46         ` H.J. Lu
  0 siblings, 1 reply; 66+ messages in thread
From: Peter Bergner @ 2018-10-01  1:18 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Vladimir Makarov, GCC Patches, Jeffrey Law

On 9/30/18 7:57 PM, H.J. Lu wrote:
> This caused:
> 
> FAIL: gcc.target/i386/pr63527.c scan-assembler-not movl[ \t]%[^,]+, %ebx
> FAIL: gcc.target/i386/pr63534.c scan-assembler-not movl[ \t]%[^,]+, %ebx
> FAIL: gcc.target/i386/pr64317.c scan-assembler addl[
> \\t]+[$]_GLOBAL_OFFSET_TABLE_, %ebx
> FAIL: gcc.target/i386/pr64317.c scan-assembler movl[ \\t]+c@GOTOFF[(]%ebx[)]

Can you check whether the new generated code is at least as good
as the old generated code?  I'm assuming the code we generate now isn't
wrong, just different and maybe we just need to change what we expect
to see.

Peter

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 0/2][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-10-01  1:18       ` Peter Bergner
@ 2018-10-01 12:46         ` H.J. Lu
  2018-10-01 12:51           ` H.J. Lu
  0 siblings, 1 reply; 66+ messages in thread
From: H.J. Lu @ 2018-10-01 12:46 UTC (permalink / raw)
  To: bergner; +Cc: Vladimir Makarov, GCC Patches, Jeffrey Law

On Sun, Sep 30, 2018 at 6:18 PM Peter Bergner <bergner@linux.ibm.com> wrote:
>
> On 9/30/18 7:57 PM, H.J. Lu wrote:
> > This caused:
> >
> > FAIL: gcc.target/i386/pr63527.c scan-assembler-not movl[ \t]%[^,]+, %ebx
> > FAIL: gcc.target/i386/pr63534.c scan-assembler-not movl[ \t]%[^,]+, %ebx
> > FAIL: gcc.target/i386/pr64317.c scan-assembler addl[
> > \\t]+[$]_GLOBAL_OFFSET_TABLE_, %ebx
> > FAIL: gcc.target/i386/pr64317.c scan-assembler movl[ \\t]+c@GOTOFF[(]%ebx[)]
>
> Can you check whether the new generated code is at least as good
> as the old generated code?  I'm assuming the code we generate now isn't
> wrong, just different and maybe we just need to change what we expect
> to see.

I checked gcc.target/i386/pr63527.c and it has a regression.

Before:

00000000 <foo>:
   0: 53                    push   %ebx
   1: e8 fc ff ff ff        call   2 <foo+0x2>
   6: 81 c3 02 00 00 00    add    $0x2,%ebx
   c: 83 ec 08              sub    $0x8,%esp
   f: e8 fc ff ff ff        call   10 <foo+0x10>
  14: e8 fc ff ff ff        call   15 <foo+0x15>
  19: 83 c4 08              add    $0x8,%esp
  1c: 5b                    pop    %ebx
  1d: c3                    ret

Disassembly of section .text.__x86.get_pc_thunk.bx:

00000000 <__x86.get_pc_thunk.bx>:
   0: 8b 1c 24              mov    (%esp),%ebx
   3: c3                    ret

After:

00000000 <foo>:
   0: 56                    push   %esi
   1: e8 fc ff ff ff        call   2 <foo+0x2>
   6: 81 c6 02 00 00 00    add    $0x2,%esi
   c: 53                    push   %ebx
   d: 83 ec 04              sub    $0x4,%esp
  10: 89 f3                mov    %esi,%ebx
  12: e8 fc ff ff ff        call   13 <foo+0x13>
  17: e8 fc ff ff ff        call   18 <foo+0x18>
  1c: 83 c4 04              add    $0x4,%esp
  1f: 5b                    pop    %ebx
  20: 5e                    pop    %esi
  21: c3                    ret

Disassembly of section .text.__x86.get_pc_thunk.si:

00000000 <__x86.get_pc_thunk.si>:
   0: 8b 34 24              mov    (%esp),%esi
   3: c3                    ret

-- 
H.J.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 0/2][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-10-01 12:46         ` H.J. Lu
@ 2018-10-01 12:51           ` H.J. Lu
  2018-10-01 13:16             ` H.J. Lu
  2018-10-02  4:08             ` Peter Bergner
  0 siblings, 2 replies; 66+ messages in thread
From: H.J. Lu @ 2018-10-01 12:51 UTC (permalink / raw)
  To: bergner; +Cc: Vladimir Makarov, GCC Patches, Jeffrey Law

On Mon, Oct 1, 2018 at 5:44 AM H.J. Lu <hjl.tools@gmail.com> wrote:
>
> On Sun, Sep 30, 2018 at 6:18 PM Peter Bergner <bergner@linux.ibm.com> wrote:
> >
> > On 9/30/18 7:57 PM, H.J. Lu wrote:
> > > This caused:
> > >
> > > FAIL: gcc.target/i386/pr63527.c scan-assembler-not movl[ \t]%[^,]+, %ebx
> > > FAIL: gcc.target/i386/pr63534.c scan-assembler-not movl[ \t]%[^,]+, %ebx
> > > FAIL: gcc.target/i386/pr64317.c scan-assembler addl[
> > > \\t]+[$]_GLOBAL_OFFSET_TABLE_, %ebx
> > > FAIL: gcc.target/i386/pr64317.c scan-assembler movl[ \\t]+c@GOTOFF[(]%ebx[)]
> >
> > Can you check whether the new generated code is at least as good
> > as the old generated code?  I'm assuming the code we generate now isn't
> > wrong, just different and maybe we just need to change what we expect
> > to see.
>
> I checked gcc.target/i386/pr63527.c and it has a regression.
>
> Before:
>
> 00000000 <foo>:
>    0: 53                    push   %ebx
>    1: e8 fc ff ff ff        call   2 <foo+0x2>
>    6: 81 c3 02 00 00 00    add    $0x2,%ebx
>    c: 83 ec 08              sub    $0x8,%esp
>    f: e8 fc ff ff ff        call   10 <foo+0x10>
>   14: e8 fc ff ff ff        call   15 <foo+0x15>
>   19: 83 c4 08              add    $0x8,%esp
>   1c: 5b                    pop    %ebx
>   1d: c3                    ret
>
> Disassembly of section .text.__x86.get_pc_thunk.bx:
>
> 00000000 <__x86.get_pc_thunk.bx>:
>    0: 8b 1c 24              mov    (%esp),%ebx
>    3: c3                    ret
>
> After:
>
> 00000000 <foo>:
>    0: 56                    push   %esi
>    1: e8 fc ff ff ff        call   2 <foo+0x2>
>    6: 81 c6 02 00 00 00    add    $0x2,%esi
>    c: 53                    push   %ebx
>    d: 83 ec 04              sub    $0x4,%esp
>   10: 89 f3                mov    %esi,%ebx
>   12: e8 fc ff ff ff        call   13 <foo+0x13>
>   17: e8 fc ff ff ff        call   18 <foo+0x18>
>   1c: 83 c4 04              add    $0x4,%esp
>   1f: 5b                    pop    %ebx
>   20: 5e                    pop    %esi
>   21: c3                    ret
>
> Disassembly of section .text.__x86.get_pc_thunk.si:
>
> 00000000 <__x86.get_pc_thunk.si>:
>    0: 8b 34 24              mov    (%esp),%esi
>    3: c3                    ret
>

You may have undone:

https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=218059

-- 
H.J.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 0/2][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-10-01 12:51           ` H.J. Lu
@ 2018-10-01 13:16             ` H.J. Lu
  2018-10-01 14:05               ` Peter Bergner
  2018-10-02  4:08             ` Peter Bergner
  1 sibling, 1 reply; 66+ messages in thread
From: H.J. Lu @ 2018-10-01 13:16 UTC (permalink / raw)
  To: bergner; +Cc: Vladimir Makarov, GCC Patches, Jeffrey Law

On Mon, Oct 1, 2018 at 5:45 AM H.J. Lu <hjl.tools@gmail.com> wrote:
>
> On Mon, Oct 1, 2018 at 5:44 AM H.J. Lu <hjl.tools@gmail.com> wrote:
> >
> > On Sun, Sep 30, 2018 at 6:18 PM Peter Bergner <bergner@linux.ibm.com> wrote:
> > >
> > > On 9/30/18 7:57 PM, H.J. Lu wrote:
> > > > This caused:
> > > >
> > > > FAIL: gcc.target/i386/pr63527.c scan-assembler-not movl[ \t]%[^,]+, %ebx
> > > > FAIL: gcc.target/i386/pr63534.c scan-assembler-not movl[ \t]%[^,]+, %ebx
> > > > FAIL: gcc.target/i386/pr64317.c scan-assembler addl[
> > > > \\t]+[$]_GLOBAL_OFFSET_TABLE_, %ebx
> > > > FAIL: gcc.target/i386/pr64317.c scan-assembler movl[ \\t]+c@GOTOFF[(]%ebx[)]
> > >
> > > Can you check whether the new generated code is at least as good
> > > as the old generated code?  I'm assuming the code we generate now isn't
> > > wrong, just different and maybe we just need to change what we expect
> > > to see.
> >
> > I checked gcc.target/i386/pr63527.c and it has a regression.
> >
> > Before:
> >
> > 00000000 <foo>:
> >    0: 53                    push   %ebx
> >    1: e8 fc ff ff ff        call   2 <foo+0x2>
> >    6: 81 c3 02 00 00 00    add    $0x2,%ebx
> >    c: 83 ec 08              sub    $0x8,%esp
> >    f: e8 fc ff ff ff        call   10 <foo+0x10>
> >   14: e8 fc ff ff ff        call   15 <foo+0x15>
> >   19: 83 c4 08              add    $0x8,%esp
> >   1c: 5b                    pop    %ebx
> >   1d: c3                    ret
> >
> > Disassembly of section .text.__x86.get_pc_thunk.bx:
> >
> > 00000000 <__x86.get_pc_thunk.bx>:
> >    0: 8b 1c 24              mov    (%esp),%ebx
> >    3: c3                    ret
> >
> > After:
> >
> > 00000000 <foo>:
> >    0: 56                    push   %esi
> >    1: e8 fc ff ff ff        call   2 <foo+0x2>
> >    6: 81 c6 02 00 00 00    add    $0x2,%esi
> >    c: 53                    push   %ebx
> >    d: 83 ec 04              sub    $0x4,%esp
> >   10: 89 f3                mov    %esi,%ebx
> >   12: e8 fc ff ff ff        call   13 <foo+0x13>
> >   17: e8 fc ff ff ff        call   18 <foo+0x18>
> >   1c: 83 c4 04              add    $0x4,%esp
> >   1f: 5b                    pop    %ebx
> >   20: 5e                    pop    %esi
> >   21: c3                    ret
> >
> > Disassembly of section .text.__x86.get_pc_thunk.si:
> >
> > 00000000 <__x86.get_pc_thunk.si>:
> >    0: 8b 34 24              mov    (%esp),%esi
> >    3: c3                    ret
> >
>
> You may have undone:
>
> https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=218059
>

I opened:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87479

-- 
H.J.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 0/2][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-10-01 13:16             ` H.J. Lu
@ 2018-10-01 14:05               ` Peter Bergner
  0 siblings, 0 replies; 66+ messages in thread
From: Peter Bergner @ 2018-10-01 14:05 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Vladimir Makarov, GCC Patches, Jeffrey Law

On 10/1/18 7:50 AM, H.J. Lu wrote:
> On Mon, Oct 1, 2018 at 5:45 AM H.J. Lu <hjl.tools@gmail.com> wrote:
>> You may have undone:
>>
>> https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=218059
>>
> 
> I opened:
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87479


Thanks for checking.  I'll have a look.

Peter



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 0/2][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-10-01 12:51           ` H.J. Lu
  2018-10-01 13:16             ` H.J. Lu
@ 2018-10-02  4:08             ` Peter Bergner
  2018-10-02 14:50               ` Jeff Law
  2018-10-02 15:07               ` Peter Bergner
  1 sibling, 2 replies; 66+ messages in thread
From: Peter Bergner @ 2018-10-02  4:08 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Vladimir Makarov, GCC Patches, Jeffrey Law

On 10/1/18 7:45 AM, H.J. Lu wrote:
> You may have undone:
> 
> https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=218059

Yes, the code above also needed to be modified to handle conflicts being
added at definitions rather than at uses.  The patch below does that.
I don't really have access to a i686 (ie, 32-bit) system to test on and
I'm not sure how to force the test to be run in 32-bit mode on a 64-bit
build, but it does fix the assembler for the pr63534.c test case.

That said, looking at the rtl for the test case, I see the following
before RA:

(insn 5 2 6 2 (set (reg:SI 3 bx)
        (reg:SI 82)) "pr63534.c":10 85 {*movsi_internal}
     (nil))
(call_insn 6 5 7 2 (call (mem:QI (symbol_ref:SI ("bar") [flags 0x41]  <function_decl 0x7f30c7548000 bar>) [0 barD.1498 S1 A8])
        (const_int 0 [0])) "pr63534.c":10 687 {*call}
     (expr_list:REG_DEAD (reg:SI 3 bx)
        (expr_list:REG_CALL_DECL (symbol_ref:SI ("bar") [flags 0x41]  <function_decl 0x7f30c7548000 bar>)
            (nil)))
    (expr_list (use (reg:SI 3 bx))
        (nil)))
(insn 7 6 8 2 (set (reg:SI 3 bx)
        (reg:SI 82)) "pr63534.c":11 85 {*movsi_internal}
     (expr_list:REG_DEAD (reg:SI 82)
        (nil)))
(call_insn 8 7 0 2 (call (mem:QI (symbol_ref:SI ("bar") [flags 0x41]  <function_decl 0x7f30c7548000 bar>) [0 barD.1498 S1 A8])
        (const_int 0 [0])) "pr63534.c":11 687 {*call}
     (expr_list:REG_DEAD (reg:SI 3 bx)
        (expr_list:REG_CALL_DECL (symbol_ref:SI ("bar") [flags 0x41]  <function_decl 0x7f30c7548000 bar>)
            (nil)))
    (expr_list (use (reg:SI 3 bx))
        (nil)))

Now that we handle conflicts at definitions and the pic hard reg
is set via a copy from the pic pseudo, my PATCH 2 is setup to
handle exactly this scenario (ie, a copy between a pseudo and
a hard reg).  I looked at the asm output from a build with both
PATCH 1 and PATCH 2, and yes, it also does not add the conflict
between the pic pseudo and pic hard reg, so our other option to
fix PR87479 is to apply PATCH 2.  However, since PATCH 2 handles
the pic pseudo and pic hard reg conflict itself, that means we
don't need the special pic conflict code and it can be removed!
I'm going to update PATCH 2 to remove that pic handling code
and send it through bootstrap and regtesting.

H.J., can you confirm that the following patch not only fixes
the bug you opened, but also doesn't introduce any more?
Once I've updated PATCH 2, I'd like you to test/bless that
one as well.  Thanks.

Peter


gcc/
	PR rtl-optimization/87479
	* ira-lives.c (process_bb_node_lives): Move handling of pic pseudo
	and pic hard reg conflict to the insn that sets pic hard reg.
	* lra-lives.c (mark_regno_dead) <check_pic_pseudo_p>: New function
	argument.  Use it.
	(process_bb_lives): Use new argument to mark_regno_dead.
	Don't handle pic pseudo and pic hard reg conflict when processing
	function call arguments.

Index: gcc/ira-lives.c
===================================================================
--- gcc/ira-lives.c	(revision 264758)
+++ gcc/ira-lives.c	(working copy)
@@ -1108,20 +1108,25 @@ process_bb_node_lives (ira_loop_tree_node_t loop_t
 
 	  call_p = CALL_P (insn);
 #ifdef REAL_PIC_OFFSET_TABLE_REGNUM
-	  int regno;
+	  unsigned int regno;
+	  rtx set;
 	  bool clear_pic_use_conflict_p = false;
-	  /* Processing insn usage in call insn can create conflict
-	     with pic pseudo and pic hard reg and that is wrong.
-	     Check this situation and fix it at the end of the insn
-	     processing.  */
-	  if (call_p && pic_offset_table_rtx != NULL_RTX
+	  /* Processing insn definition of REAL_PIC_OFFSET_TABLE_REGNUM
+	     can create a conflict between the pic pseudo and pic hard reg
+	     and that is wrong.  Check this situation and fix it at the end
+	     of the insn processing.  */
+	  if (pic_offset_table_rtx != NULL_RTX
 	      && (regno = REGNO (pic_offset_table_rtx)) >= FIRST_PSEUDO_REGISTER
-	      && (a = ira_curr_regno_allocno_map[regno]) != NULL)
+	      && (a = ira_curr_regno_allocno_map[regno]) != NULL
+	      && (set = single_set (insn)) != NULL_RTX
+	      && REG_P (SET_DEST (set))
+	      && REGNO (SET_DEST (set)) == REAL_PIC_OFFSET_TABLE_REGNUM
+	      && REG_P (SET_SRC (set))
+	      && REGNO (SET_SRC (set)) == regno)
 	    clear_pic_use_conflict_p
-		= (find_regno_fusage (insn, USE, REAL_PIC_OFFSET_TABLE_REGNUM)
-		   && ! TEST_HARD_REG_BIT (OBJECT_CONFLICT_HARD_REGS
-					   (ALLOCNO_OBJECT (a, 0)),
-					   REAL_PIC_OFFSET_TABLE_REGNUM));
+		= ! TEST_HARD_REG_BIT (OBJECT_CONFLICT_HARD_REGS
+				       (ALLOCNO_OBJECT (a, 0)),
+				       REAL_PIC_OFFSET_TABLE_REGNUM);
 #endif
 
 	  /* Mark each defined value as live.  We need to do this for
Index: gcc/lra-lives.c
===================================================================
--- gcc/lra-lives.c	(revision 264758)
+++ gcc/lra-lives.c	(working copy)
@@ -342,7 +342,8 @@ mark_regno_live (int regno, machine_mode mode, int
    BB_GEN_PSEUDOS and BB_KILLED_PSEUDOS.  Return TRUE if the liveness
    tracking sets were modified, or FALSE if nothing changed.  */
 static bool
-mark_regno_dead (int regno, machine_mode mode, int point)
+mark_regno_dead (int regno, machine_mode mode, int point,
+		 bool check_pic_pseudo_p)
 {
   int last;
   bool changed = false;
@@ -350,7 +351,7 @@ static bool
   if (regno < FIRST_PSEUDO_REGISTER)
     {
       for (last = end_hard_regno (mode, regno); regno < last; regno++)
-	make_hard_regno_dead (regno, false);
+	make_hard_regno_dead (regno, check_pic_pseudo_p);
     }
   else
     {
@@ -851,9 +852,11 @@ process_bb_lives (basic_block bb, int &curr_point,
       for (reg = curr_id->regs; reg != NULL; reg = reg->next)
 	if (reg->type == OP_OUT
 	    && ! reg_early_clobber_p (reg, n_alt) && ! reg->subreg_p)
+	  /* Don't create a conflict between REAL_PIC_OFFSET_TABLE_REGNUM
+	     and the pic pseudo.  */
 	  need_curr_point_incr
 	    |= mark_regno_dead (reg->regno, reg->biggest_mode,
-				curr_point);
+				curr_point, true);
 
       for (reg = curr_static_id->hard_regs; reg != NULL; reg = reg->next)
 	if (reg->type == OP_OUT
@@ -863,9 +866,7 @@ process_bb_lives (basic_block bb, int &curr_point,
       if (curr_id->arg_hard_regs != NULL)
 	for (i = 0; (regno = curr_id->arg_hard_regs[i]) >= 0; i++)
 	  if (regno >= FIRST_PSEUDO_REGISTER)
-	    /* It is a clobber.  Don't create conflict of used
-	       REAL_PIC_OFFSET_TABLE_REGNUM and the pic pseudo.  */
-	    make_hard_regno_dead (regno - FIRST_PSEUDO_REGISTER, true);
+	    make_hard_regno_dead (regno - FIRST_PSEUDO_REGISTER, false);
 
       if (call_p)
 	{
@@ -939,7 +940,7 @@ process_bb_lives (basic_block bb, int &curr_point,
 	    && reg_early_clobber_p (reg, n_alt) && ! reg->subreg_p)
 	  need_curr_point_incr
 	    |= mark_regno_dead (reg->regno, reg->biggest_mode,
-				curr_point);
+				curr_point, false);
 
       for (reg = curr_static_id->hard_regs; reg != NULL; reg = reg->next)
 	if (reg->type == OP_OUT


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 0/2][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-10-02  4:08             ` Peter Bergner
@ 2018-10-02 14:50               ` Jeff Law
  2018-10-02 15:07               ` Peter Bergner
  1 sibling, 0 replies; 66+ messages in thread
From: Jeff Law @ 2018-10-02 14:50 UTC (permalink / raw)
  To: Peter Bergner, H.J. Lu; +Cc: Vladimir Makarov, GCC Patches

On 10/1/18 9:52 PM, Peter Bergner wrote:
> On 10/1/18 7:45 AM, H.J. Lu wrote:
>> You may have undone:
>>
>> https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=218059
> 
> Yes, the code above also needed to be modified to handle conflicts being
> added at definitions rather than at uses.  The patch below does that.
> I don't really have access to a i686 (ie, 32-bit) system to test on and
> I'm not sure how to force the test to be run in 32-bit mode on a 64-bit
> build, but it does fix the assembler for the pr63534.c test case.
> 
> That said, looking at the rtl for the test case, I see the following
> before RA:
> 
> (insn 5 2 6 2 (set (reg:SI 3 bx)
>         (reg:SI 82)) "pr63534.c":10 85 {*movsi_internal}
>      (nil))
> (call_insn 6 5 7 2 (call (mem:QI (symbol_ref:SI ("bar") [flags 0x41]  <function_decl 0x7f30c7548000 bar>) [0 barD.1498 S1 A8])
>         (const_int 0 [0])) "pr63534.c":10 687 {*call}
>      (expr_list:REG_DEAD (reg:SI 3 bx)
>         (expr_list:REG_CALL_DECL (symbol_ref:SI ("bar") [flags 0x41]  <function_decl 0x7f30c7548000 bar>)
>             (nil)))
>     (expr_list (use (reg:SI 3 bx))
>         (nil)))
> (insn 7 6 8 2 (set (reg:SI 3 bx)
>         (reg:SI 82)) "pr63534.c":11 85 {*movsi_internal}
>      (expr_list:REG_DEAD (reg:SI 82)
>         (nil)))
> (call_insn 8 7 0 2 (call (mem:QI (symbol_ref:SI ("bar") [flags 0x41]  <function_decl 0x7f30c7548000 bar>) [0 barD.1498 S1 A8])
>         (const_int 0 [0])) "pr63534.c":11 687 {*call}
>      (expr_list:REG_DEAD (reg:SI 3 bx)
>         (expr_list:REG_CALL_DECL (symbol_ref:SI ("bar") [flags 0x41]  <function_decl 0x7f30c7548000 bar>)
>             (nil)))
>     (expr_list (use (reg:SI 3 bx))
>         (nil)))
> 
> Now that we handle conflicts at definitions and the pic hard reg
> is set via a copy from the pic pseudo, my PATCH 2 is setup to
> handle exactly this scenario (ie, a copy between a pseudo and
> a hard reg).  I looked at the asm output from a build with both
> PATCH 1 and PATCH 2, and yes, it also does not add the conflict
> between the pic pseudo and pic hard reg, so our other option to
> fix PR87479 is to apply PATCH 2.  However, since PATCH 2 handles
> the pic pseudo and pic hard reg conflict itself, that means we
> don't need the special pic conflict code and it can be removed!
> I'm going to update PATCH 2 to remove that pic handling code
> and send it through bootstrap and regtesting.
> 
> H.J., can you confirm that the following patch not only fixes
> the bug you opened, but also doesn't introduce any more?
> Once I've updated PATCH 2, I'd like you to test/bless that
> one as well.  Thanks.
Haven't looked at the patch yet.  The easiest (but not fastest) way to
build i686 native is gcc45 in the build farm.

Jeff

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 0/2][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-10-02  4:08             ` Peter Bergner
  2018-10-02 14:50               ` Jeff Law
@ 2018-10-02 15:07               ` Peter Bergner
  2018-10-02 15:37                 ` H.J. Lu
                                   ` (2 more replies)
  1 sibling, 3 replies; 66+ messages in thread
From: Peter Bergner @ 2018-10-02 15:07 UTC (permalink / raw)
  To: GCC Patches; +Cc: H.J. Lu, Vladimir Makarov, Jeffrey Law

On 10/1/18 10:52 PM, Peter Bergner wrote:
> Now that we handle conflicts at definitions and the pic hard reg
> is set via a copy from the pic pseudo, my PATCH 2 is setup to
> handle exactly this scenario (ie, a copy between a pseudo and
> a hard reg).  I looked at the asm output from a build with both
> PATCH 1 and PATCH 2, and yes, it also does not add the conflict
> between the pic pseudo and pic hard reg, so our other option to
> fix PR87479 is to apply PATCH 2.  However, since PATCH 2 handles
> the pic pseudo and pic hard reg conflict itself, that means we
> don't need the special pic conflict code and it can be removed!
> I'm going to update PATCH 2 to remove that pic handling code
> and send it through bootstrap and regtesting.

Here is an updated PATCH 2 that adds the generic handling of copies between
pseudos and hard regs which obsoletes the special conflict handling of the
REAL_PIC_OFFSET_TABLE_REGNUM with the pic pseudo.  I have confirmed the
assembler generated by this patch for test case pr63534.c matches the code
generated before PATCH 1 was committed, so we are successfully removing the
copy of the pic pseudo into the pic hard reg with this patch.

I'm currently performing bootstrap and regtesting on powerpc64le-linux and
x86_64-linux.  H.J., could you please test this patch on i686 to verify it
doesn't expose any other problems there?  Otherwise, I'll take Jeff's
suggestion and attempt a build on gcc45, but it sounds like the results
will take a while.

Is this patch version ok for trunk assuming no regressions are found in
the testing mentioned above?

Peter


gcc/
	PR rtl-optimization/86939
	PR rtl-optimization/87479
	* ira.h (copy_insn_p): New prototype.
	* ira-lives.c (ignore_reg_for_conflicts): New static variable.
	(make_hard_regno_dead): Don't add conflicts for register
	ignore_reg_for_conflicts.
	(make_object_dead): Likewise.
	(copy_insn_p): New function.
	(process_bb_node_lives): Set ignore_reg_for_conflicts for copies.
	Remove special conflict handling of REAL_PIC_OFFSET_TABLE_REGNUM.
	* lra-lives.c (ignore_reg_for_conflicts): New static variable.
	(make_hard_regno_dead): Don't add conflicts for register
	ignore_reg_for_conflicts.  Remove special conflict handling of
	REAL_PIC_OFFSET_TABLE_REGNUM.  Remove now unused argument
	check_pic_pseudo_p and update callers.
	(mark_pseudo_dead): Don't add conflicts for register
	ignore_reg_for_conflicts.
	(process_bb_lives): Set ignore_reg_for_conflicts for copies.

gcc/testsuite/
	* gcc.target/powerpc/pr86939.c: New test.

Index: gcc/ira.h
===================================================================
--- gcc/ira.h	(revision 264789)
+++ gcc/ira.h	(working copy)
@@ -210,6 +210,9 @@ extern void ira_adjust_equiv_reg_cost (u
 /* ira-costs.c */
 extern void ira_costs_c_finalize (void);
 
+/* ira-lives.c */
+extern rtx copy_insn_p (rtx_insn *);
+
 /* Spilling static chain pseudo may result in generation of wrong
    non-local goto code using frame-pointer to address saved stack
    pointer value after restoring old frame pointer value.  The
Index: gcc/ira-lives.c
===================================================================
--- gcc/ira-lives.c	(revision 264789)
+++ gcc/ira-lives.c	(working copy)
@@ -84,6 +84,10 @@ static int *allocno_saved_at_call;
    supplemental to recog_data.  */
 static alternative_mask preferred_alternatives;
 
+/* If non-NULL, the source operand of a register to register copy for which
+   we should not add a conflict with the copy's destination operand.  */
+static rtx ignore_reg_for_conflicts;
+
 /* Record hard register REGNO as now being live.  */
 static void
 make_hard_regno_live (int regno)
@@ -101,6 +105,11 @@ make_hard_regno_dead (int regno)
     {
       ira_object_t obj = ira_object_id_map[i];
 
+      if (ignore_reg_for_conflicts != NULL_RTX
+	  && REGNO (ignore_reg_for_conflicts)
+	     == (unsigned int) ALLOCNO_REGNO (OBJECT_ALLOCNO (obj)))
+	continue;
+
       SET_HARD_REG_BIT (OBJECT_CONFLICT_HARD_REGS (obj), regno);
       SET_HARD_REG_BIT (OBJECT_TOTAL_CONFLICT_HARD_REGS (obj), regno);
     }
@@ -154,12 +163,38 @@ static void
 make_object_dead (ira_object_t obj)
 {
   live_range_t lr;
+  int ignore_regno = -1;
+  int last_regno = -1;
 
   sparseset_clear_bit (objects_live, OBJECT_CONFLICT_ID (obj));
 
+  /* Check whether any part of IGNORE_REG_FOR_CONFLICTS already conflicts
+     with OBJ.  */
+  if (ignore_reg_for_conflicts != NULL_RTX
+      && REGNO (ignore_reg_for_conflicts) < FIRST_PSEUDO_REGISTER)
+    {
+      last_regno = END_REGNO (ignore_reg_for_conflicts);
+      int src_regno = ignore_regno = REGNO (ignore_reg_for_conflicts);
+
+      while (src_regno < last_regno)
+	{
+	  if (TEST_HARD_REG_BIT (OBJECT_CONFLICT_HARD_REGS (obj), src_regno))
+	    {
+	      ignore_regno = last_regno = -1;
+	      break;
+	    }
+	  src_regno++;
+	}
+    }
+
   IOR_HARD_REG_SET (OBJECT_CONFLICT_HARD_REGS (obj), hard_regs_live);
   IOR_HARD_REG_SET (OBJECT_TOTAL_CONFLICT_HARD_REGS (obj), hard_regs_live);
 
+  /* If IGNORE_REG_FOR_CONFLICTS did not already conflict with OBJ, make
+     sure it still doesn't.  */
+  for (; ignore_regno < last_regno; ignore_regno++)
+    CLEAR_HARD_REG_BIT (OBJECT_CONFLICT_HARD_REGS (obj), ignore_regno);
+
   lr = OBJECT_LIVE_RANGES (obj);
   ira_assert (lr != NULL);
   lr->finish = curr_point;
@@ -1022,6 +1057,38 @@ find_call_crossed_cheap_reg (rtx_insn *i
   return cheap_reg;
 }  
 
+/* Determine whether INSN is a register to register copy of the type where
+   we do not need to make the source and destiniation registers conflict.
+   If this is a copy instruction, then return the source reg.  Otherwise,
+   return NULL_RTX.  */
+rtx
+copy_insn_p (rtx_insn *insn)
+{
+  rtx set;
+
+  /* Disallow anything other than a simple register to register copy
+     that has no side effects.  */
+  if ((set = single_set (insn)) == NULL_RTX
+      || !REG_P (SET_DEST (set))
+      || !REG_P (SET_SRC (set))
+      || side_effects_p (set))
+    return NULL_RTX;
+
+  int dst_regno = REGNO (SET_DEST (set));
+  int src_regno = REGNO (SET_SRC (set));
+  machine_mode mode = GET_MODE (SET_DEST (set));
+
+  /* Computing conflicts for register pairs is difficult to get right, so
+     for now, disallow it.  */
+  if ((dst_regno < FIRST_PSEUDO_REGISTER
+       && hard_regno_nregs (dst_regno, mode) != 1)
+      || (src_regno < FIRST_PSEUDO_REGISTER
+	  && hard_regno_nregs (src_regno, mode) != 1))
+    return NULL_RTX;
+
+  return SET_SRC (set);
+}
+
 /* Process insns of the basic block given by its LOOP_TREE_NODE to
    update allocno live ranges, allocno hard register conflicts,
    intersected calls, and register pressure info for allocnos for the
@@ -1107,22 +1174,7 @@ process_bb_node_lives (ira_loop_tree_nod
 		     curr_point);
 
 	  call_p = CALL_P (insn);
-#ifdef REAL_PIC_OFFSET_TABLE_REGNUM
-	  int regno;
-	  bool clear_pic_use_conflict_p = false;
-	  /* Processing insn usage in call insn can create conflict
-	     with pic pseudo and pic hard reg and that is wrong.
-	     Check this situation and fix it at the end of the insn
-	     processing.  */
-	  if (call_p && pic_offset_table_rtx != NULL_RTX
-	      && (regno = REGNO (pic_offset_table_rtx)) >= FIRST_PSEUDO_REGISTER
-	      && (a = ira_curr_regno_allocno_map[regno]) != NULL)
-	    clear_pic_use_conflict_p
-		= (find_regno_fusage (insn, USE, REAL_PIC_OFFSET_TABLE_REGNUM)
-		   && ! TEST_HARD_REG_BIT (OBJECT_CONFLICT_HARD_REGS
-					   (ALLOCNO_OBJECT (a, 0)),
-					   REAL_PIC_OFFSET_TABLE_REGNUM));
-#endif
+	  ignore_reg_for_conflicts = copy_insn_p (insn);
 
 	  /* Mark each defined value as live.  We need to do this for
 	     unused values because they still conflict with quantities
@@ -1275,20 +1327,9 @@ process_bb_node_lives (ira_loop_tree_nod
 		}
 	    }
 
-#ifdef REAL_PIC_OFFSET_TABLE_REGNUM
-	  if (clear_pic_use_conflict_p)
-	    {
-	      regno = REGNO (pic_offset_table_rtx);
-	      a = ira_curr_regno_allocno_map[regno];
-	      CLEAR_HARD_REG_BIT (OBJECT_CONFLICT_HARD_REGS (ALLOCNO_OBJECT (a, 0)),
-				  REAL_PIC_OFFSET_TABLE_REGNUM);
-	      CLEAR_HARD_REG_BIT (OBJECT_TOTAL_CONFLICT_HARD_REGS
-				  (ALLOCNO_OBJECT (a, 0)),
-				  REAL_PIC_OFFSET_TABLE_REGNUM);
-	    }
-#endif
 	  curr_point++;
 	}
+      ignore_reg_for_conflicts = NULL_RTX;
 
       if (bb_has_eh_pred (bb))
 	for (j = 0; ; ++j)
Index: gcc/lra-lives.c
===================================================================
--- gcc/lra-lives.c	(revision 264789)
+++ gcc/lra-lives.c	(working copy)
@@ -96,6 +96,10 @@ static bitmap_head temp_bitmap;
 /* Pool for pseudo live ranges.	 */
 static object_allocator<lra_live_range> lra_live_range_pool ("live ranges");
 
+/* If non-NULL, the source operand of a register to register copy for which
+   we should not add a conflict with the copy's destination operand.  */
+static rtx ignore_reg_for_conflicts;
+
 /* Free live range list LR.  */
 static void
 free_live_range_list (lra_live_range_t lr)
@@ -239,11 +243,9 @@ make_hard_regno_live (int regno)
 
 /* Process the definition of hard register REGNO.  This updates
    hard_regs_live, START_DYING and conflict hard regs for living
-   pseudos.  Conflict hard regs for the pic pseudo is not updated if
-   REGNO is REAL_PIC_OFFSET_TABLE_REGNUM and CHECK_PIC_PSEUDO_P is
-   true.  */
+   pseudos.  */
 static void
-make_hard_regno_dead (int regno, bool check_pic_pseudo_p ATTRIBUTE_UNUSED)
+make_hard_regno_dead (int regno)
 {
   lra_assert (regno < FIRST_PSEUDO_REGISTER);
   if (! TEST_HARD_REG_BIT (hard_regs_live, regno))
@@ -251,13 +253,12 @@ make_hard_regno_dead (int regno, bool ch
   sparseset_set_bit (start_dying, regno);
   unsigned int i;
   EXECUTE_IF_SET_IN_SPARSESET (pseudos_live, i)
-#ifdef REAL_PIC_OFFSET_TABLE_REGNUM
-    if (! check_pic_pseudo_p
-	|| regno != REAL_PIC_OFFSET_TABLE_REGNUM
-	|| pic_offset_table_rtx == NULL
-	|| i != REGNO (pic_offset_table_rtx))
-#endif
+    {
+      if (ignore_reg_for_conflicts != NULL_RTX
+	  && REGNO (ignore_reg_for_conflicts) == i)
+	continue;
       SET_HARD_REG_BIT (lra_reg_info[i].conflict_hard_regs, regno);
+    }
   CLEAR_HARD_REG_BIT (hard_regs_live, regno);
   if (fixed_regs[regno] || TEST_HARD_REG_BIT (hard_regs_spilled_into, regno))
     {
@@ -294,14 +295,41 @@ static void
 mark_pseudo_dead (int regno, int point)
 {
   lra_live_range_t p;
+  int ignore_regno = -1;
+  int last_regno = -1;
 
   lra_assert (regno >= FIRST_PSEUDO_REGISTER);
   lra_assert (sparseset_bit_p (pseudos_live, regno));
   sparseset_clear_bit (pseudos_live, regno);
   sparseset_set_bit (start_dying, regno);
 
+  /* Check whether any part of IGNORE_REG_FOR_CONFLICTS already conflicts
+     with REGNO.  */
+  if (ignore_reg_for_conflicts != NULL_RTX
+      && REGNO (ignore_reg_for_conflicts) < FIRST_PSEUDO_REGISTER)
+    {
+      last_regno = END_REGNO (ignore_reg_for_conflicts);
+      int src_regno = ignore_regno = REGNO (ignore_reg_for_conflicts);
+
+      while (src_regno < last_regno)
+	{
+	  if (TEST_HARD_REG_BIT (lra_reg_info[regno].conflict_hard_regs,
+				 src_regno))
+	    {
+	      ignore_regno = last_regno = -1;
+	      break;
+	    }
+	  src_regno++;
+	}
+    }
+
   IOR_HARD_REG_SET (lra_reg_info[regno].conflict_hard_regs, hard_regs_live);
 
+  /* If IGNORE_REG_FOR_CONFLICTS did not already conflict with REGNO, make
+     sure it still doesn't.  */
+  for (; ignore_regno < last_regno; ignore_regno++)
+    CLEAR_HARD_REG_BIT (lra_reg_info[regno].conflict_hard_regs, ignore_regno);
+
   if (complete_info_p || lra_get_regno_hard_regno (regno) < 0)
     {
       p = lra_reg_info[regno].live_ranges;
@@ -350,7 +378,7 @@ mark_regno_dead (int regno, machine_mode
   if (regno < FIRST_PSEUDO_REGISTER)
     {
       for (last = end_hard_regno (mode, regno); regno < last; regno++)
-	make_hard_regno_dead (regno, false);
+	make_hard_regno_dead (regno);
     }
   else
     {
@@ -747,6 +775,7 @@ process_bb_lives (basic_block bb, int &c
 	}
 
       call_p = CALL_P (curr_insn);
+      ignore_reg_for_conflicts = copy_insn_p (curr_insn);
       src_regno = (set != NULL_RTX && REG_P (SET_SRC (set))
 		   ? REGNO (SET_SRC (set)) : -1);
       dst_regno = (set != NULL_RTX && REG_P (SET_DEST (set))
@@ -858,14 +887,13 @@ process_bb_lives (basic_block bb, int &c
       for (reg = curr_static_id->hard_regs; reg != NULL; reg = reg->next)
 	if (reg->type == OP_OUT
 	    && ! reg_early_clobber_p (reg, n_alt) && ! reg->subreg_p)
-	  make_hard_regno_dead (reg->regno, false);
+	  make_hard_regno_dead (reg->regno);
 
       if (curr_id->arg_hard_regs != NULL)
 	for (i = 0; (regno = curr_id->arg_hard_regs[i]) >= 0; i++)
 	  if (regno >= FIRST_PSEUDO_REGISTER)
-	    /* It is a clobber.  Don't create conflict of used
-	       REAL_PIC_OFFSET_TABLE_REGNUM and the pic pseudo.  */
-	    make_hard_regno_dead (regno - FIRST_PSEUDO_REGISTER, true);
+	    /* It is a clobber.  */
+	    make_hard_regno_dead (regno - FIRST_PSEUDO_REGISTER);
 
       if (call_p)
 	{
@@ -925,8 +953,7 @@ process_bb_lives (basic_block bb, int &c
 	  make_hard_regno_live (reg->regno);
 
       if (curr_id->arg_hard_regs != NULL)
-	/* Make argument hard registers live.  Don't create conflict
-	   of used REAL_PIC_OFFSET_TABLE_REGNUM and the pic pseudo.  */
+	/* Make argument hard registers live.  */
 	for (i = 0; (regno = curr_id->arg_hard_regs[i]) >= 0; i++)
 	  if (regno < FIRST_PSEUDO_REGISTER)
 	    make_hard_regno_live (regno);
@@ -954,7 +981,7 @@ process_bb_lives (basic_block bb, int &c
 	      if (reg2->type != OP_OUT && reg2->regno == reg->regno)
 		break;
 	    if (reg2 == NULL)
-	      make_hard_regno_dead (reg->regno, false);
+	      make_hard_regno_dead (reg->regno);
 	  }
 
       if (need_curr_point_incr)
@@ -989,6 +1016,7 @@ process_bb_lives (basic_block bb, int &c
       EXECUTE_IF_SET_IN_SPARSESET (unused_set, j)
 	add_reg_note (curr_insn, REG_UNUSED, regno_reg_rtx[j]);
     }
+  ignore_reg_for_conflicts = NULL_RTX;
 
   if (bb_has_eh_pred (bb))
     for (j = 0; ; ++j)
Index: gcc/testsuite/gcc.target/powerpc/pr86939.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/pr86939.c	(nonexistent)
+++ gcc/testsuite/gcc.target/powerpc/pr86939.c	(working copy)
@@ -0,0 +1,12 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-options "-O2" } */
+
+typedef long (*fptr_t) (void);
+long
+func (fptr_t *p)
+{
+  if (p)
+    return (*p) ();
+  return 0;
+}
+/* { dg-final { scan-assembler-not {mr %?r?12,} } } */

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 0/2][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-10-02 15:07               ` Peter Bergner
@ 2018-10-02 15:37                 ` H.J. Lu
  2018-10-02 15:55                   ` Peter Bergner
  2018-10-02 21:53                 ` Peter Bergner
  2018-10-03 14:43                 ` [PATCH 2/2 v3][IRA,LRA] " Peter Bergner
  2 siblings, 1 reply; 66+ messages in thread
From: H.J. Lu @ 2018-10-02 15:37 UTC (permalink / raw)
  To: bergner; +Cc: GCC Patches, Vladimir Makarov, Jeffrey Law

On Tue, Oct 2, 2018 at 7:59 AM Peter Bergner <bergner@linux.ibm.com> wrote:
>
> On 10/1/18 10:52 PM, Peter Bergner wrote:
> > Now that we handle conflicts at definitions and the pic hard reg
> > is set via a copy from the pic pseudo, my PATCH 2 is setup to
> > handle exactly this scenario (ie, a copy between a pseudo and
> > a hard reg).  I looked at the asm output from a build with both
> > PATCH 1 and PATCH 2, and yes, it also does not add the conflict
> > between the pic pseudo and pic hard reg, so our other option to
> > fix PR87479 is to apply PATCH 2.  However, since PATCH 2 handles
> > the pic pseudo and pic hard reg conflict itself, that means we
> > don't need the special pic conflict code and it can be removed!
> > I'm going to update PATCH 2 to remove that pic handling code
> > and send it through bootstrap and regtesting.
>
> Here is an updated PATCH 2 that adds the generic handling of copies between
> pseudos and hard regs which obsoletes the special conflict handling of the
> REAL_PIC_OFFSET_TABLE_REGNUM with the pic pseudo.  I have confirmed the
> assembler generated by this patch for test case pr63534.c matches the code
> generated before PATCH 1 was committed, so we are successfully removing the
> copy of the pic pseudo into the pic hard reg with this patch.
>
> I'm currently performing bootstrap and regtesting on powerpc64le-linux and
> x86_64-linux.  H.J., could you please test this patch on i686 to verify it
> doesn't expose any other problems there?  Otherwise, I'll take Jeff's

I am waiting for the result of your previous patch.  I will test this one after
that.

> suggestion and attempt a build on gcc45, but it sounds like the results
> will take a while.
>
> Is this patch version ok for trunk assuming no regressions are found in
> the testing mentioned above?
>
> Peter
>
>
> gcc/
>         PR rtl-optimization/86939
>         PR rtl-optimization/87479
>         * ira.h (copy_insn_p): New prototype.
>         * ira-lives.c (ignore_reg_for_conflicts): New static variable.
>         (make_hard_regno_dead): Don't add conflicts for register
>         ignore_reg_for_conflicts.
>         (make_object_dead): Likewise.
>         (copy_insn_p): New function.
>         (process_bb_node_lives): Set ignore_reg_for_conflicts for copies.
>         Remove special conflict handling of REAL_PIC_OFFSET_TABLE_REGNUM.
>         * lra-lives.c (ignore_reg_for_conflicts): New static variable.
>         (make_hard_regno_dead): Don't add conflicts for register
>         ignore_reg_for_conflicts.  Remove special conflict handling of
>         REAL_PIC_OFFSET_TABLE_REGNUM.  Remove now unused argument
>         check_pic_pseudo_p and update callers.
>         (mark_pseudo_dead): Don't add conflicts for register
>         ignore_reg_for_conflicts.
>         (process_bb_lives): Set ignore_reg_for_conflicts for copies.
>


-- 
H.J.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 0/2][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-10-02 15:37                 ` H.J. Lu
@ 2018-10-02 15:55                   ` Peter Bergner
  2018-10-02 17:14                     ` H.J. Lu
  0 siblings, 1 reply; 66+ messages in thread
From: Peter Bergner @ 2018-10-02 15:55 UTC (permalink / raw)
  To: H.J. Lu; +Cc: GCC Patches, Vladimir Makarov, Jeffrey Law

On 10/2/18 10:32 AM, H.J. Lu wrote:
> On Tue, Oct 2, 2018 at 7:59 AM Peter Bergner <bergner@linux.ibm.com> wrote:
>> I'm currently performing bootstrap and regtesting on powerpc64le-linux and
>> x86_64-linux.  H.J., could you please test this patch on i686 to verify it
>> doesn't expose any other problems there?  Otherwise, I'll take Jeff's
> 
> I am waiting for the result of your previous patch.  I will test this one after
> that.

Great, thanks!

Peter


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 0/2][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-10-02 15:55                   ` Peter Bergner
@ 2018-10-02 17:14                     ` H.J. Lu
  0 siblings, 0 replies; 66+ messages in thread
From: H.J. Lu @ 2018-10-02 17:14 UTC (permalink / raw)
  To: bergner; +Cc: GCC Patches, Vladimir Makarov, Jeffrey Law

On Tue, Oct 2, 2018 at 8:39 AM Peter Bergner <bergner@linux.ibm.com> wrote:
>
> On 10/2/18 10:32 AM, H.J. Lu wrote:
> > On Tue, Oct 2, 2018 at 7:59 AM Peter Bergner <bergner@linux.ibm.com> wrote:
> >> I'm currently performing bootstrap and regtesting on powerpc64le-linux and
> >> x86_64-linux.  H.J., could you please test this patch on i686 to verify it
> >> doesn't expose any other problems there?  Otherwise, I'll take Jeff's
> >
> > I am waiting for the result of your previous patch.  I will test this one after
> > that.
>
> Great, thanks!

Your previous patch is OK.  I am testing the new patch now.

-- 
H.J.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 0/2][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-10-02 15:07               ` Peter Bergner
  2018-10-02 15:37                 ` H.J. Lu
@ 2018-10-02 21:53                 ` Peter Bergner
  2018-10-02 22:28                   ` H.J. Lu
  2018-10-03 14:43                 ` [PATCH 2/2 v3][IRA,LRA] " Peter Bergner
  2 siblings, 1 reply; 66+ messages in thread
From: Peter Bergner @ 2018-10-02 21:53 UTC (permalink / raw)
  To: GCC Patches; +Cc: H.J. Lu, Vladimir Makarov, Jeffrey Law

On 10/2/18 9:59 AM, Peter Bergner wrote:
> gcc/
> 	PR rtl-optimization/86939
> 	PR rtl-optimization/87479
> 	* ira.h (copy_insn_p): New prototype.
> 	* ira-lives.c (ignore_reg_for_conflicts): New static variable.
> 	(make_hard_regno_dead): Don't add conflicts for register
> 	ignore_reg_for_conflicts.
> 	(make_object_dead): Likewise.
> 	(copy_insn_p): New function.
> 	(process_bb_node_lives): Set ignore_reg_for_conflicts for copies.
> 	Remove special conflict handling of REAL_PIC_OFFSET_TABLE_REGNUM.
> 	* lra-lives.c (ignore_reg_for_conflicts): New static variable.
> 	(make_hard_regno_dead): Don't add conflicts for register
> 	ignore_reg_for_conflicts.  Remove special conflict handling of
> 	REAL_PIC_OFFSET_TABLE_REGNUM.  Remove now unused argument
> 	check_pic_pseudo_p and update callers.
> 	(mark_pseudo_dead): Don't add conflicts for register
> 	ignore_reg_for_conflicts.
> 	(process_bb_lives): Set ignore_reg_for_conflicts for copies.

So bootstrap and regtesting on powerpc64le-linux show no regressions.
Looking at the x86_64-linux results, I see one test suite regression:

  FAIL: gcc.target/i386/pr49095.c scan-assembler-times \\), % 8

I have included the function that is compiled differently with and
without the patch.  Looking at the IRA rtl dumps, there is pseudo
85 that is copied from and into hard reg 5 and we now no longer
report that they conflict.  That allows pesudo 85 to now be assigned
to hard reg 5, rather than hard reg 3 (old code) and that leads
to the code differences shown below.  I don't know x86_64 mnemonics
enough to say whether the code changes below are "better" or "similar"
or ???.  H.J., can you comment on the below code gen changes?
If they're better or similar to the old code, I could just modify
the expected results for pr49095.c.

Peter



[bergner@dagger1 PR87479]$ cat pr49095.i 
void foo (void *);

int *
f1 (int *x)
{
  if (!--*x)
    foo (x);
  return x;
}
[bergner@dagger1 PR87479]$ /data/bergner/gcc/build/gcc-fsf-mainline-pr87479-base-regtest/gcc/xgcc -B/data/bergner/gcc/build/gcc-fsf-mainline-pr87479-base-regtest/gcc/ -Os -fno-shrink-wrap -masm=att -ffat-lto-objects -fno-ident -S -o pr49095-base.s pr49095.i 
[bergner@dagger1 PR87479]$ /data/bergner/gcc/build/gcc-fsf-mainline-pr87479-regtest/gcc/xgcc -B/data/bergner/gcc/build/gcc-fsf-mainline-pr87479-regtest/gcc/ -Os -fno-shrink-wrap -masm=att -ffat-lto-objects -fno-ident -S -o pr49095-new.s pr49095.i 
[bergner@dagger1 PR87479]$ diff -u pr49095-base.s pr49095-new.s 
--- pr49095-base.s	2018-10-02 14:07:09.000000000 -0500
+++ pr49095-new.s	2018-10-02 14:07:40.000000000 -0500
@@ -5,16 +5,16 @@
 f1:
 .LFB0:
 	.cfi_startproc
+	subq	$24, %rsp
+	.cfi_def_cfa_offset 32
 	decl	(%rdi)
-	pushq	%rbx
-	.cfi_def_cfa_offset 16
-	.cfi_offset 3, -16
-	movq	%rdi, %rbx
 	jne	.L2
+	movq	%rdi, 8(%rsp)
 	call	foo
+	movq	8(%rsp), %rdi
 .L2:
-	movq	%rbx, %rax
-	popq	%rbx
+	movq	%rdi, %rax
+	addq	$24, %rsp
 	.cfi_def_cfa_offset 8
 	ret
 	.cfi_endproc

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 0/2][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-10-02 21:53                 ` Peter Bergner
@ 2018-10-02 22:28                   ` H.J. Lu
  2018-10-03  0:35                     ` Peter Bergner
  0 siblings, 1 reply; 66+ messages in thread
From: H.J. Lu @ 2018-10-02 22:28 UTC (permalink / raw)
  To: bergner; +Cc: GCC Patches, Vladimir Makarov, Jeffrey Law

On Tue, Oct 2, 2018 at 12:44 PM Peter Bergner <bergner@linux.ibm.com> wrote:
>
> On 10/2/18 9:59 AM, Peter Bergner wrote:
> > gcc/
> >       PR rtl-optimization/86939
> >       PR rtl-optimization/87479
> >       * ira.h (copy_insn_p): New prototype.
> >       * ira-lives.c (ignore_reg_for_conflicts): New static variable.
> >       (make_hard_regno_dead): Don't add conflicts for register
> >       ignore_reg_for_conflicts.
> >       (make_object_dead): Likewise.
> >       (copy_insn_p): New function.
> >       (process_bb_node_lives): Set ignore_reg_for_conflicts for copies.
> >       Remove special conflict handling of REAL_PIC_OFFSET_TABLE_REGNUM.
> >       * lra-lives.c (ignore_reg_for_conflicts): New static variable.
> >       (make_hard_regno_dead): Don't add conflicts for register
> >       ignore_reg_for_conflicts.  Remove special conflict handling of
> >       REAL_PIC_OFFSET_TABLE_REGNUM.  Remove now unused argument
> >       check_pic_pseudo_p and update callers.
> >       (mark_pseudo_dead): Don't add conflicts for register
> >       ignore_reg_for_conflicts.
> >       (process_bb_lives): Set ignore_reg_for_conflicts for copies.
>
> So bootstrap and regtesting on powerpc64le-linux show no regressions.
> Looking at the x86_64-linux results, I see one test suite regression:
>
>   FAIL: gcc.target/i386/pr49095.c scan-assembler-times \\), % 8
>
> I have included the function that is compiled differently with and
> without the patch.  Looking at the IRA rtl dumps, there is pseudo
> 85 that is copied from and into hard reg 5 and we now no longer
> report that they conflict.  That allows pesudo 85 to now be assigned
> to hard reg 5, rather than hard reg 3 (old code) and that leads
> to the code differences shown below.  I don't know x86_64 mnemonics
> enough to say whether the code changes below are "better" or "similar"
> or ???.  H.J., can you comment on the below code gen changes?
> If they're better or similar to the old code, I could just modify
> the expected results for pr49095.c.
>
> Peter
>
>
>
> [bergner@dagger1 PR87479]$ cat pr49095.i
> void foo (void *);
>
> int *
> f1 (int *x)
> {
>   if (!--*x)
>     foo (x);
>   return x;
> }
> [bergner@dagger1 PR87479]$ /data/bergner/gcc/build/gcc-fsf-mainline-pr87479-base-regtest/gcc/xgcc -B/data/bergner/gcc/build/gcc-fsf-mainline-pr87479-base-regtest/gcc/ -Os -fno-shrink-wrap -masm=att -ffat-lto-objects -fno-ident -S -o pr49095-base.s pr49095.i
> [bergner@dagger1 PR87479]$ /data/bergner/gcc/build/gcc-fsf-mainline-pr87479-regtest/gcc/xgcc -B/data/bergner/gcc/build/gcc-fsf-mainline-pr87479-regtest/gcc/ -Os -fno-shrink-wrap -masm=att -ffat-lto-objects -fno-ident -S -o pr49095-new.s pr49095.i
> [bergner@dagger1 PR87479]$ diff -u pr49095-base.s pr49095-new.s
> --- pr49095-base.s      2018-10-02 14:07:09.000000000 -0500
> +++ pr49095-new.s       2018-10-02 14:07:40.000000000 -0500
> @@ -5,16 +5,16 @@
>  f1:
>  .LFB0:
>         .cfi_startproc
> +       subq    $24, %rsp
> +       .cfi_def_cfa_offset 32
>         decl    (%rdi)
> -       pushq   %rbx
> -       .cfi_def_cfa_offset 16
> -       .cfi_offset 3, -16
> -       movq    %rdi, %rbx
>         jne     .L2
> +       movq    %rdi, 8(%rsp)
>         call    foo
> +       movq    8(%rsp), %rdi
>  .L2:
> -       movq    %rbx, %rax
> -       popq    %rbx
> +       movq    %rdi, %rax
> +       addq    $24, %rsp
>         .cfi_def_cfa_offset 8
>         ret
>         .cfi_endproc
>

I saw the same failures:

FAIL: gcc.target/i386/pr49095.c scan-assembler-times \\), % 8
FAIL: gcc.target/i386/pr49095.c scan-assembler-times \\), % 8

I think the new ones are better, especially in 32-bit case:

Old:

[hjl@gnu-cfl-1 gcc]$ ./xgcc -B./
/export/gnu/import/git/sources/gcc/gcc/testsuite/gcc.target/i386/pr49095.c
-m32 -fno-diagnostics-show-caret -fno-diagnostics-show-line-numbers
-fdiagnostics-color=never -Os -fno-shrink-wrap -masm=att -mregparm=2
-ffat-lto-objects -fno-ident -S -o pr49095.s
[hjl@gnu-cfl-1 gcc]$ wc -l pr49095.s
2314 pr49095.s
[hjl@gnu-cfl-1 gcc]$

New:

[hjl@gnu-skl-1 gcc]$ ./xgcc -B./
/export/gnu/import/git/sources/gcc/gcc/testsuite/gcc.target/i386/pr49095.c
-m32 -fno-diagnostics-show-caret -fno-diagnostics-show-line-numbers
-fdiagnostics-color=never -Os -fno-shrink-wrap -masm=att -mregparm=2
-ffat-lto-objects -fno-ident -S -o pr49095.s
[hjl@gnu-skl-1 gcc]$ wc -l pr49095.s
2163 pr49095.s
[hjl@gnu-skl-1 gcc]$

-- 
H.J.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 0/2][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-10-02 22:28                   ` H.J. Lu
@ 2018-10-03  0:35                     ` Peter Bergner
  2018-10-03  2:23                       ` H.J. Lu
  0 siblings, 1 reply; 66+ messages in thread
From: Peter Bergner @ 2018-10-03  0:35 UTC (permalink / raw)
  To: H.J. Lu; +Cc: GCC Patches, Vladimir Makarov, Jeffrey Law

On 10/2/18 4:52 PM, H.J. Lu wrote:
> I saw the same failures:
> 
> FAIL: gcc.target/i386/pr49095.c scan-assembler-times \\), % 8
> FAIL: gcc.target/i386/pr49095.c scan-assembler-times \\), % 8
> 
> I think the new ones are better, especially in 32-bit case:

Excellent!  Does the following test case patch make it so that
it PASSes again?

Peter


Index: gcc/testsuite/gcc.target/i386/pr49095.c
===================================================================
--- gcc/testsuite/gcc.target/i386/pr49095.c	(revision 264793)
+++ gcc/testsuite/gcc.target/i386/pr49095.c	(working copy)
@@ -73,4 +73,5 @@ G (long)
 /* { dg-final { scan-assembler-not "test\[lq\]" } } */
 /* The {f,h}{char,short,int,long}xor functions aren't optimized into
    a RMW instruction, so need load, modify and store.  FIXME eventually.  */
-/* { dg-final { scan-assembler-times "\\), %" 8 } } */
+/* { dg-final { scan-assembler-times "\\), %" 57 { target { ia32 } } } } */
+/* { dg-final { scan-assembler-times "\\), %" 45 { target { lp64 } } } } */

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 0/2][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-10-03  0:35                     ` Peter Bergner
@ 2018-10-03  2:23                       ` H.J. Lu
  2018-10-03  2:46                         ` Peter Bergner
  0 siblings, 1 reply; 66+ messages in thread
From: H.J. Lu @ 2018-10-03  2:23 UTC (permalink / raw)
  To: bergner; +Cc: GCC Patches, Vladimir Makarov, Jeffrey Law

On Tue, Oct 2, 2018 at 3:28 PM Peter Bergner <bergner@linux.ibm.com> wrote:
>
> On 10/2/18 4:52 PM, H.J. Lu wrote:
> > I saw the same failures:
> >
> > FAIL: gcc.target/i386/pr49095.c scan-assembler-times \\), % 8
> > FAIL: gcc.target/i386/pr49095.c scan-assembler-times \\), % 8
> >
> > I think the new ones are better, especially in 32-bit case:
>
> Excellent!  Does the following test case patch make it so that
> it PASSes again?
>
> Peter
>
>
> Index: gcc/testsuite/gcc.target/i386/pr49095.c
> ===================================================================
> --- gcc/testsuite/gcc.target/i386/pr49095.c     (revision 264793)
> +++ gcc/testsuite/gcc.target/i386/pr49095.c     (working copy)
> @@ -73,4 +73,5 @@ G (long)
>  /* { dg-final { scan-assembler-not "test\[lq\]" } } */
>  /* The {f,h}{char,short,int,long}xor functions aren't optimized into
>     a RMW instruction, so need load, modify and store.  FIXME eventually.  */
> -/* { dg-final { scan-assembler-times "\\), %" 8 } } */
> +/* { dg-final { scan-assembler-times "\\), %" 57 { target { ia32 } } } } */
> +/* { dg-final { scan-assembler-times "\\), %" 45 { target { lp64 } } } } */

                      ^^^^^ This is wrong.

It should be not ia32.  Otherwise, it will skip x32.

-- 
H.J.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 0/2][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-10-03  2:23                       ` H.J. Lu
@ 2018-10-03  2:46                         ` Peter Bergner
  0 siblings, 0 replies; 66+ messages in thread
From: Peter Bergner @ 2018-10-03  2:46 UTC (permalink / raw)
  To: H.J. Lu; +Cc: GCC Patches, Vladimir Makarov, Jeffrey Law

On 10/2/18 7:17 PM, H.J. Lu wrote:
>> Index: gcc/testsuite/gcc.target/i386/pr49095.c
>> ===================================================================
>> --- gcc/testsuite/gcc.target/i386/pr49095.c     (revision 264793)
>> +++ gcc/testsuite/gcc.target/i386/pr49095.c     (working copy)
>> @@ -73,4 +73,5 @@ G (long)
>>  /* { dg-final { scan-assembler-not "test\[lq\]" } } */
>>  /* The {f,h}{char,short,int,long}xor functions aren't optimized into
>>     a RMW instruction, so need load, modify and store.  FIXME eventually.  */
>> -/* { dg-final { scan-assembler-times "\\), %" 8 } } */
>> +/* { dg-final { scan-assembler-times "\\), %" 57 { target { ia32 } } } } */
>> +/* { dg-final { scan-assembler-times "\\), %" 45 { target { lp64 } } } } */
> 
>                       ^^^^^ This is wrong.
> 
> It should be not ia32.  Otherwise, it will skip x32.

Ok, I changed it, thanks.

Peter


Index: gcc/testsuite/gcc.target/i386/pr49095.c
===================================================================
--- gcc/testsuite/gcc.target/i386/pr49095.c	(revision 264793)
+++ gcc/testsuite/gcc.target/i386/pr49095.c	(working copy)
@@ -73,4 +73,5 @@ G (long)
 /* { dg-final { scan-assembler-not "test\[lq\]" } } */
 /* The {f,h}{char,short,int,long}xor functions aren't optimized into
    a RMW instruction, so need load, modify and store.  FIXME eventually.  */
-/* { dg-final { scan-assembler-times "\\), %" 8 } } */
+/* { dg-final { scan-assembler-times "\\), %" 57 { target { ia32 } } } } */
+/* { dg-final { scan-assembler-times "\\), %" 45 { target { ! ia32 } } } } */


^ permalink raw reply	[flat|nested] 66+ messages in thread

* [PATCH 2/2 v3][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-10-02 15:07               ` Peter Bergner
  2018-10-02 15:37                 ` H.J. Lu
  2018-10-02 21:53                 ` Peter Bergner
@ 2018-10-03 14:43                 ` Peter Bergner
  2018-10-04 22:18                   ` Vladimir Makarov
  2 siblings, 1 reply; 66+ messages in thread
From: Peter Bergner @ 2018-10-03 14:43 UTC (permalink / raw)
  To: GCC Patches; +Cc: H.J. Lu, Vladimir Makarov, Jeffrey Law

Here is another updated PATCH 2 that is the same as the previous version,
but includes the modification to the expected output of i386/pr49095.c
test case, as H.J. has confirmed the code gen changes we are seeing on
are a good thing.

This patch completed bootstrap and regtesting on powerpc64le-linux,
x86_64-linux and i686-linux (Thanks H.J.!) with no regressions.
Ok for trunk?

Peter


gcc/
	PR rtl-optimization/86939
	PR rtl-optimization/87479
	* ira.h (copy_insn_p): New prototype.
	* ira-lives.c (ignore_reg_for_conflicts): New static variable.
	(make_hard_regno_dead): Don't add conflicts for register
	ignore_reg_for_conflicts.
	(make_object_dead): Likewise.
	(copy_insn_p): New function.
	(process_bb_node_lives): Set ignore_reg_for_conflicts for copies.
	Remove special conflict handling of REAL_PIC_OFFSET_TABLE_REGNUM.
	* lra-lives.c (ignore_reg_for_conflicts): New static variable.
	(make_hard_regno_dead): Don't add conflicts for register
	ignore_reg_for_conflicts.  Remove special conflict handling of
	REAL_PIC_OFFSET_TABLE_REGNUM.  Remove now unused argument
	check_pic_pseudo_p and update callers.
	(mark_pseudo_dead): Don't add conflicts for register
	ignore_reg_for_conflicts.
	(process_bb_lives): Set ignore_reg_for_conflicts for copies.

gcc/testsuite/
	* gcc.target/powerpc/pr86939.c: New test.
	* gcc/testsuite/gcc.target/i386/pr49095.c: Fix expected results.

Index: gcc/ira.h
===================================================================
--- gcc/ira.h	(revision 264789)
+++ gcc/ira.h	(working copy)
@@ -210,6 +210,9 @@ extern void ira_adjust_equiv_reg_cost (u
 /* ira-costs.c */
 extern void ira_costs_c_finalize (void);
 
+/* ira-lives.c */
+extern rtx copy_insn_p (rtx_insn *);
+
 /* Spilling static chain pseudo may result in generation of wrong
    non-local goto code using frame-pointer to address saved stack
    pointer value after restoring old frame pointer value.  The
Index: gcc/ira-lives.c
===================================================================
--- gcc/ira-lives.c	(revision 264789)
+++ gcc/ira-lives.c	(working copy)
@@ -84,6 +84,10 @@ static int *allocno_saved_at_call;
    supplemental to recog_data.  */
 static alternative_mask preferred_alternatives;
 
+/* If non-NULL, the source operand of a register to register copy for which
+   we should not add a conflict with the copy's destination operand.  */
+static rtx ignore_reg_for_conflicts;
+
 /* Record hard register REGNO as now being live.  */
 static void
 make_hard_regno_live (int regno)
@@ -101,6 +105,11 @@ make_hard_regno_dead (int regno)
     {
       ira_object_t obj = ira_object_id_map[i];
 
+      if (ignore_reg_for_conflicts != NULL_RTX
+	  && REGNO (ignore_reg_for_conflicts)
+	     == (unsigned int) ALLOCNO_REGNO (OBJECT_ALLOCNO (obj)))
+	continue;
+
       SET_HARD_REG_BIT (OBJECT_CONFLICT_HARD_REGS (obj), regno);
       SET_HARD_REG_BIT (OBJECT_TOTAL_CONFLICT_HARD_REGS (obj), regno);
     }
@@ -154,12 +163,38 @@ static void
 make_object_dead (ira_object_t obj)
 {
   live_range_t lr;
+  int ignore_regno = -1;
+  int last_regno = -1;
 
   sparseset_clear_bit (objects_live, OBJECT_CONFLICT_ID (obj));
 
+  /* Check whether any part of IGNORE_REG_FOR_CONFLICTS already conflicts
+     with OBJ.  */
+  if (ignore_reg_for_conflicts != NULL_RTX
+      && REGNO (ignore_reg_for_conflicts) < FIRST_PSEUDO_REGISTER)
+    {
+      last_regno = END_REGNO (ignore_reg_for_conflicts);
+      int src_regno = ignore_regno = REGNO (ignore_reg_for_conflicts);
+
+      while (src_regno < last_regno)
+	{
+	  if (TEST_HARD_REG_BIT (OBJECT_CONFLICT_HARD_REGS (obj), src_regno))
+	    {
+	      ignore_regno = last_regno = -1;
+	      break;
+	    }
+	  src_regno++;
+	}
+    }
+
   IOR_HARD_REG_SET (OBJECT_CONFLICT_HARD_REGS (obj), hard_regs_live);
   IOR_HARD_REG_SET (OBJECT_TOTAL_CONFLICT_HARD_REGS (obj), hard_regs_live);
 
+  /* If IGNORE_REG_FOR_CONFLICTS did not already conflict with OBJ, make
+     sure it still doesn't.  */
+  for (; ignore_regno < last_regno; ignore_regno++)
+    CLEAR_HARD_REG_BIT (OBJECT_CONFLICT_HARD_REGS (obj), ignore_regno);
+
   lr = OBJECT_LIVE_RANGES (obj);
   ira_assert (lr != NULL);
   lr->finish = curr_point;
@@ -1022,6 +1057,38 @@ find_call_crossed_cheap_reg (rtx_insn *i
   return cheap_reg;
 }  
 
+/* Determine whether INSN is a register to register copy of the type where
+   we do not need to make the source and destiniation registers conflict.
+   If this is a copy instruction, then return the source reg.  Otherwise,
+   return NULL_RTX.  */
+rtx
+copy_insn_p (rtx_insn *insn)
+{
+  rtx set;
+
+  /* Disallow anything other than a simple register to register copy
+     that has no side effects.  */
+  if ((set = single_set (insn)) == NULL_RTX
+      || !REG_P (SET_DEST (set))
+      || !REG_P (SET_SRC (set))
+      || side_effects_p (set))
+    return NULL_RTX;
+
+  int dst_regno = REGNO (SET_DEST (set));
+  int src_regno = REGNO (SET_SRC (set));
+  machine_mode mode = GET_MODE (SET_DEST (set));
+
+  /* Computing conflicts for register pairs is difficult to get right, so
+     for now, disallow it.  */
+  if ((dst_regno < FIRST_PSEUDO_REGISTER
+       && hard_regno_nregs (dst_regno, mode) != 1)
+      || (src_regno < FIRST_PSEUDO_REGISTER
+	  && hard_regno_nregs (src_regno, mode) != 1))
+    return NULL_RTX;
+
+  return SET_SRC (set);
+}
+
 /* Process insns of the basic block given by its LOOP_TREE_NODE to
    update allocno live ranges, allocno hard register conflicts,
    intersected calls, and register pressure info for allocnos for the
@@ -1107,22 +1174,7 @@ process_bb_node_lives (ira_loop_tree_nod
 		     curr_point);
 
 	  call_p = CALL_P (insn);
-#ifdef REAL_PIC_OFFSET_TABLE_REGNUM
-	  int regno;
-	  bool clear_pic_use_conflict_p = false;
-	  /* Processing insn usage in call insn can create conflict
-	     with pic pseudo and pic hard reg and that is wrong.
-	     Check this situation and fix it at the end of the insn
-	     processing.  */
-	  if (call_p && pic_offset_table_rtx != NULL_RTX
-	      && (regno = REGNO (pic_offset_table_rtx)) >= FIRST_PSEUDO_REGISTER
-	      && (a = ira_curr_regno_allocno_map[regno]) != NULL)
-	    clear_pic_use_conflict_p
-		= (find_regno_fusage (insn, USE, REAL_PIC_OFFSET_TABLE_REGNUM)
-		   && ! TEST_HARD_REG_BIT (OBJECT_CONFLICT_HARD_REGS
-					   (ALLOCNO_OBJECT (a, 0)),
-					   REAL_PIC_OFFSET_TABLE_REGNUM));
-#endif
+	  ignore_reg_for_conflicts = copy_insn_p (insn);
 
 	  /* Mark each defined value as live.  We need to do this for
 	     unused values because they still conflict with quantities
@@ -1275,20 +1327,9 @@ process_bb_node_lives (ira_loop_tree_nod
 		}
 	    }
 
-#ifdef REAL_PIC_OFFSET_TABLE_REGNUM
-	  if (clear_pic_use_conflict_p)
-	    {
-	      regno = REGNO (pic_offset_table_rtx);
-	      a = ira_curr_regno_allocno_map[regno];
-	      CLEAR_HARD_REG_BIT (OBJECT_CONFLICT_HARD_REGS (ALLOCNO_OBJECT (a, 0)),
-				  REAL_PIC_OFFSET_TABLE_REGNUM);
-	      CLEAR_HARD_REG_BIT (OBJECT_TOTAL_CONFLICT_HARD_REGS
-				  (ALLOCNO_OBJECT (a, 0)),
-				  REAL_PIC_OFFSET_TABLE_REGNUM);
-	    }
-#endif
 	  curr_point++;
 	}
+      ignore_reg_for_conflicts = NULL_RTX;
 
       if (bb_has_eh_pred (bb))
 	for (j = 0; ; ++j)
Index: gcc/lra-lives.c
===================================================================
--- gcc/lra-lives.c	(revision 264789)
+++ gcc/lra-lives.c	(working copy)
@@ -96,6 +96,10 @@ static bitmap_head temp_bitmap;
 /* Pool for pseudo live ranges.	 */
 static object_allocator<lra_live_range> lra_live_range_pool ("live ranges");
 
+/* If non-NULL, the source operand of a register to register copy for which
+   we should not add a conflict with the copy's destination operand.  */
+static rtx ignore_reg_for_conflicts;
+
 /* Free live range list LR.  */
 static void
 free_live_range_list (lra_live_range_t lr)
@@ -239,11 +243,9 @@ make_hard_regno_live (int regno)
 
 /* Process the definition of hard register REGNO.  This updates
    hard_regs_live, START_DYING and conflict hard regs for living
-   pseudos.  Conflict hard regs for the pic pseudo is not updated if
-   REGNO is REAL_PIC_OFFSET_TABLE_REGNUM and CHECK_PIC_PSEUDO_P is
-   true.  */
+   pseudos.  */
 static void
-make_hard_regno_dead (int regno, bool check_pic_pseudo_p ATTRIBUTE_UNUSED)
+make_hard_regno_dead (int regno)
 {
   lra_assert (regno < FIRST_PSEUDO_REGISTER);
   if (! TEST_HARD_REG_BIT (hard_regs_live, regno))
@@ -251,13 +253,12 @@ make_hard_regno_dead (int regno, bool ch
   sparseset_set_bit (start_dying, regno);
   unsigned int i;
   EXECUTE_IF_SET_IN_SPARSESET (pseudos_live, i)
-#ifdef REAL_PIC_OFFSET_TABLE_REGNUM
-    if (! check_pic_pseudo_p
-	|| regno != REAL_PIC_OFFSET_TABLE_REGNUM
-	|| pic_offset_table_rtx == NULL
-	|| i != REGNO (pic_offset_table_rtx))
-#endif
+    {
+      if (ignore_reg_for_conflicts != NULL_RTX
+	  && REGNO (ignore_reg_for_conflicts) == i)
+	continue;
       SET_HARD_REG_BIT (lra_reg_info[i].conflict_hard_regs, regno);
+    }
   CLEAR_HARD_REG_BIT (hard_regs_live, regno);
   if (fixed_regs[regno] || TEST_HARD_REG_BIT (hard_regs_spilled_into, regno))
     {
@@ -294,14 +295,41 @@ static void
 mark_pseudo_dead (int regno, int point)
 {
   lra_live_range_t p;
+  int ignore_regno = -1;
+  int last_regno = -1;
 
   lra_assert (regno >= FIRST_PSEUDO_REGISTER);
   lra_assert (sparseset_bit_p (pseudos_live, regno));
   sparseset_clear_bit (pseudos_live, regno);
   sparseset_set_bit (start_dying, regno);
 
+  /* Check whether any part of IGNORE_REG_FOR_CONFLICTS already conflicts
+     with REGNO.  */
+  if (ignore_reg_for_conflicts != NULL_RTX
+      && REGNO (ignore_reg_for_conflicts) < FIRST_PSEUDO_REGISTER)
+    {
+      last_regno = END_REGNO (ignore_reg_for_conflicts);
+      int src_regno = ignore_regno = REGNO (ignore_reg_for_conflicts);
+
+      while (src_regno < last_regno)
+	{
+	  if (TEST_HARD_REG_BIT (lra_reg_info[regno].conflict_hard_regs,
+				 src_regno))
+	    {
+	      ignore_regno = last_regno = -1;
+	      break;
+	    }
+	  src_regno++;
+	}
+    }
+
   IOR_HARD_REG_SET (lra_reg_info[regno].conflict_hard_regs, hard_regs_live);
 
+  /* If IGNORE_REG_FOR_CONFLICTS did not already conflict with REGNO, make
+     sure it still doesn't.  */
+  for (; ignore_regno < last_regno; ignore_regno++)
+    CLEAR_HARD_REG_BIT (lra_reg_info[regno].conflict_hard_regs, ignore_regno);
+
   if (complete_info_p || lra_get_regno_hard_regno (regno) < 0)
     {
       p = lra_reg_info[regno].live_ranges;
@@ -350,7 +378,7 @@ mark_regno_dead (int regno, machine_mode
   if (regno < FIRST_PSEUDO_REGISTER)
     {
       for (last = end_hard_regno (mode, regno); regno < last; regno++)
-	make_hard_regno_dead (regno, false);
+	make_hard_regno_dead (regno);
     }
   else
     {
@@ -747,6 +775,7 @@ process_bb_lives (basic_block bb, int &c
 	}
 
       call_p = CALL_P (curr_insn);
+      ignore_reg_for_conflicts = copy_insn_p (curr_insn);
       src_regno = (set != NULL_RTX && REG_P (SET_SRC (set))
 		   ? REGNO (SET_SRC (set)) : -1);
       dst_regno = (set != NULL_RTX && REG_P (SET_DEST (set))
@@ -858,14 +887,13 @@ process_bb_lives (basic_block bb, int &c
       for (reg = curr_static_id->hard_regs; reg != NULL; reg = reg->next)
 	if (reg->type == OP_OUT
 	    && ! reg_early_clobber_p (reg, n_alt) && ! reg->subreg_p)
-	  make_hard_regno_dead (reg->regno, false);
+	  make_hard_regno_dead (reg->regno);
 
       if (curr_id->arg_hard_regs != NULL)
 	for (i = 0; (regno = curr_id->arg_hard_regs[i]) >= 0; i++)
 	  if (regno >= FIRST_PSEUDO_REGISTER)
-	    /* It is a clobber.  Don't create conflict of used
-	       REAL_PIC_OFFSET_TABLE_REGNUM and the pic pseudo.  */
-	    make_hard_regno_dead (regno - FIRST_PSEUDO_REGISTER, true);
+	    /* It is a clobber.  */
+	    make_hard_regno_dead (regno - FIRST_PSEUDO_REGISTER);
 
       if (call_p)
 	{
@@ -925,8 +953,7 @@ process_bb_lives (basic_block bb, int &c
 	  make_hard_regno_live (reg->regno);
 
       if (curr_id->arg_hard_regs != NULL)
-	/* Make argument hard registers live.  Don't create conflict
-	   of used REAL_PIC_OFFSET_TABLE_REGNUM and the pic pseudo.  */
+	/* Make argument hard registers live.  */
 	for (i = 0; (regno = curr_id->arg_hard_regs[i]) >= 0; i++)
 	  if (regno < FIRST_PSEUDO_REGISTER)
 	    make_hard_regno_live (regno);
@@ -954,7 +981,7 @@ process_bb_lives (basic_block bb, int &c
 	      if (reg2->type != OP_OUT && reg2->regno == reg->regno)
 		break;
 	    if (reg2 == NULL)
-	      make_hard_regno_dead (reg->regno, false);
+	      make_hard_regno_dead (reg->regno);
 	  }
 
       if (need_curr_point_incr)
@@ -989,6 +1016,7 @@ process_bb_lives (basic_block bb, int &c
       EXECUTE_IF_SET_IN_SPARSESET (unused_set, j)
 	add_reg_note (curr_insn, REG_UNUSED, regno_reg_rtx[j]);
     }
+  ignore_reg_for_conflicts = NULL_RTX;
 
   if (bb_has_eh_pred (bb))
     for (j = 0; ; ++j)
Index: gcc/testsuite/gcc.target/powerpc/pr86939.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/pr86939.c	(nonexistent)
+++ gcc/testsuite/gcc.target/powerpc/pr86939.c	(working copy)
@@ -0,0 +1,12 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-options "-O2" } */
+
+typedef long (*fptr_t) (void);
+long
+func (fptr_t *p)
+{
+  if (p)
+    return (*p) ();
+  return 0;
+}
+/* { dg-final { scan-assembler-not {mr %?r?12,} } } */
Index: gcc/testsuite/gcc.target/i386/pr49095.c
===================================================================
--- gcc/testsuite/gcc.target/i386/pr49095.c	(revision 264793)
+++ gcc/testsuite/gcc.target/i386/pr49095.c	(working copy)
@@ -73,4 +73,5 @@ G (long)
 /* { dg-final { scan-assembler-not "test\[lq\]" } } */
 /* The {f,h}{char,short,int,long}xor functions aren't optimized into
    a RMW instruction, so need load, modify and store.  FIXME eventually.  */
-/* { dg-final { scan-assembler-times "\\), %" 8 } } */
+/* { dg-final { scan-assembler-times "\\), %" 57 { target { ia32 } } } } */
+/* { dg-final { scan-assembler-times "\\), %" 45 { target { ! ia32 } } } } */

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 2/2 v3][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-10-03 14:43                 ` [PATCH 2/2 v3][IRA,LRA] " Peter Bergner
@ 2018-10-04 22:18                   ` Vladimir Makarov
  2018-10-05 16:50                     ` Peter Bergner
  0 siblings, 1 reply; 66+ messages in thread
From: Vladimir Makarov @ 2018-10-04 22:18 UTC (permalink / raw)
  To: Peter Bergner, GCC Patches; +Cc: H.J. Lu, Jeffrey Law



On 10/03/2018 10:09 AM, Peter Bergner wrote:
> Here is another updated PATCH 2 that is the same as the previous version,
> but includes the modification to the expected output of i386/pr49095.c
> test case, as H.J. has confirmed the code gen changes we are seeing on
> are a good thing.
>
> This patch completed bootstrap and regtesting on powerpc64le-linux,
> x86_64-linux and i686-linux (Thanks H.J.!) with no regressions.
> Ok for trunk?
>
The patch is ok for the trunk.  Only very minor comments.

IMHO, the name copy_insn_p is too common and confusing (we already have 
functions copy_insn and copy_insn_1 in GCC).  The name does not reflect 
its result meaning.  I would call it something like 
non_conflict_copy_source_reg although it is long.

Also I would rename last_regno to bound_regno because it is better 
reflect variable value meaning or at least to end_regno as it is a value 
of END_REGNO macro.

But you can ignore these insignificant comments.

Peter, thank you for working on the issue.  The patches may be solve 
more PRs you mentioned.
>
>
> gcc/
> 	PR rtl-optimization/86939
> 	PR rtl-optimization/87479
> 	* ira.h (copy_insn_p): New prototype.
> 	* ira-lives.c (ignore_reg_for_conflicts): New static variable.
> 	(make_hard_regno_dead): Don't add conflicts for register
> 	ignore_reg_for_conflicts.
> 	(make_object_dead): Likewise.
> 	(copy_insn_p): New function.
> 	(process_bb_node_lives): Set ignore_reg_for_conflicts for copies.
> 	Remove special conflict handling of REAL_PIC_OFFSET_TABLE_REGNUM.
> 	* lra-lives.c (ignore_reg_for_conflicts): New static variable.
> 	(make_hard_regno_dead): Don't add conflicts for register
> 	ignore_reg_for_conflicts.  Remove special conflict handling of
> 	REAL_PIC_OFFSET_TABLE_REGNUM.  Remove now unused argument
> 	check_pic_pseudo_p and update callers.
> 	(mark_pseudo_dead): Don't add conflicts for register
> 	ignore_reg_for_conflicts.
> 	(process_bb_lives): Set ignore_reg_for_conflicts for copies.
>
> gcc/testsuite/
> 	* gcc.target/powerpc/pr86939.c: New test.
> 	* gcc/testsuite/gcc.target/i386/pr49095.c: Fix expected results.
>

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 2/2 v3][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-10-04 22:18                   ` Vladimir Makarov
@ 2018-10-05 16:50                     ` Peter Bergner
  2018-10-05 18:54                       ` Vladimir Makarov
  0 siblings, 1 reply; 66+ messages in thread
From: Peter Bergner @ 2018-10-05 16:50 UTC (permalink / raw)
  To: Vladimir Makarov; +Cc: GCC Patches, H.J. Lu, Jeffrey Law

On 10/4/18 3:01 PM, Vladimir Makarov wrote:
> IMHO, the name copy_insn_p is too common and confusing (we already have
> functions copy_insn and copy_insn_1 in GCC).  The name does not reflect its
> result meaning.  I would call it something like non_conflict_copy_source_reg
> although it is long.

I'm fine with renaming it.  I'm not sure I like the use of source reg in
the name even though it is what is returned.  That is just a convenience for
the caller of the function.  Its true purpose is recognizing whether INSN
is or is not a reg to reg copy for which we can ignore their interference.

How about is_reg_copy_insn_p() or is_reg_to_reg_copy_p() or ???



> Also I would rename last_regno to bound_regno because it is better reflect
> variable value meaning or at least to end_regno as it is a value of END_REGNO
> macro.

Ok, I went with end_regno, since that seems to be used elsewhere.


Peter

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 2/2 v3][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-10-05 16:50                     ` Peter Bergner
@ 2018-10-05 18:54                       ` Vladimir Makarov
  2018-10-05 20:10                         ` Peter Bergner
  0 siblings, 1 reply; 66+ messages in thread
From: Vladimir Makarov @ 2018-10-05 18:54 UTC (permalink / raw)
  To: Peter Bergner; +Cc: GCC Patches, H.J. Lu, Jeffrey Law

On 10/05/2018 12:40 PM, Peter Bergner wrote:
> On 10/4/18 3:01 PM, Vladimir Makarov wrote:
>> IMHO, the name copy_insn_p is too common and confusing (we already have
>> functions copy_insn and copy_insn_1 in GCC).  The name does not reflect its
>> result meaning.  I would call it something like non_conflict_copy_source_reg
>> although it is long.
> I'm fine with renaming it.  I'm not sure I like the use of source reg in
> the name even though it is what is returned.  That is just a convenience for
> the caller of the function.  Its true purpose is recognizing whether INSN
> is or is not a reg to reg copy for which we can ignore their interference.
OK.
> How about is_reg_copy_insn_p() or is_reg_to_reg_copy_p() or ???
>
Personally I like the first name more.  But it is up to you.  I don't 
want to bother you anymore.
>
>> Also I would rename last_regno to bound_regno because it is better reflect
>> variable value meaning or at least to end_regno as it is a value of END_REGNO
>> macro.
> Ok, I went with end_regno, since that seems to be used elsewhere.
>
Great.

Thank you, Peter.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 2/2 v3][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-10-05 18:54                       ` Vladimir Makarov
@ 2018-10-05 20:10                         ` Peter Bergner
  2018-10-05 22:56                           ` Vladimir Makarov
  0 siblings, 1 reply; 66+ messages in thread
From: Peter Bergner @ 2018-10-05 20:10 UTC (permalink / raw)
  To: Vladimir Makarov; +Cc: GCC Patches, H.J. Lu, Jeffrey Law

On 10/5/18 1:32 PM, Vladimir Makarov wrote:
> On 10/05/2018 12:40 PM, Peter Bergner wrote:
>> On 10/4/18 3:01 PM, Vladimir Makarov wrote:
>>> IMHO, the name copy_insn_p is too common and confusing (we already have
>>> functions copy_insn and copy_insn_1 in GCC).  The name does not reflect its
>>> result meaning.  I would call it something like non_conflict_copy_source_reg
>>> although it is long.
>> How about is_reg_copy_insn_p() or is_reg_to_reg_copy_p() or ???
>>
> Personally I like the first name more.  But it is up to you.  I don't want
> to bother you anymore.

It's not a bother, so lets get something we both are ok with.
How about non_conflicting_reg_copy or non_conflicting_copy_insn?

Peter

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 2/2 v3][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-10-05 20:10                         ` Peter Bergner
@ 2018-10-05 22:56                           ` Vladimir Makarov
  2018-10-06  6:40                             ` Peter Bergner
  0 siblings, 1 reply; 66+ messages in thread
From: Vladimir Makarov @ 2018-10-05 22:56 UTC (permalink / raw)
  To: Peter Bergner; +Cc: GCC Patches, H.J. Lu, Jeffrey Law

On 10/05/2018 04:00 PM, Peter Bergner wrote:
> On 10/5/18 1:32 PM, Vladimir Makarov wrote:
>> On 10/05/2018 12:40 PM, Peter Bergner wrote:
>>> On 10/4/18 3:01 PM, Vladimir Makarov wrote:
>>>> IMHO, the name copy_insn_p is too common and confusing (we already have
>>>> functions copy_insn and copy_insn_1 in GCC).  The name does not reflect its
>>>> result meaning.  I would call it something like non_conflict_copy_source_reg
>>>> although it is long.
>>> How about is_reg_copy_insn_p() or is_reg_to_reg_copy_p() or ???
>>>
>> Personally I like the first name more.  But it is up to you.  I don't want
>> to bother you anymore.
> It's not a bother, so lets get something we both are ok with.
> How about non_conflicting_reg_copy or non_conflicting_copy_insn?
OK. I like the first name more.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 2/2 v3][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-10-05 22:56                           ` Vladimir Makarov
@ 2018-10-06  6:40                             ` Peter Bergner
  2018-10-08  9:37                               ` Christophe Lyon
  0 siblings, 1 reply; 66+ messages in thread
From: Peter Bergner @ 2018-10-06  6:40 UTC (permalink / raw)
  To: Vladimir Makarov; +Cc: GCC Patches, H.J. Lu, Jeffrey Law

On 10/5/18 4:12 PM, Vladimir Makarov wrote:
> On 10/05/2018 04:00 PM, Peter Bergner wrote:
>> How about non_conflicting_reg_copy or non_conflicting_copy_insn?
> OK. I like the first name more.

Ok, I committed the patch using the first function name.
Thank you very much for the patch reviews and approvals!

Peter



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 2/2 v3][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-10-06  6:40                             ` Peter Bergner
@ 2018-10-08  9:37                               ` Christophe Lyon
  2018-10-08 14:21                                 ` Peter Bergner
  0 siblings, 1 reply; 66+ messages in thread
From: Christophe Lyon @ 2018-10-08  9:37 UTC (permalink / raw)
  To: bergner; +Cc: Vladimir Makarov, gcc Patches, H.J. Lu, Jeff Law

On Sat, 6 Oct 2018 at 04:38, Peter Bergner <bergner@linux.ibm.com> wrote:
>
> On 10/5/18 4:12 PM, Vladimir Makarov wrote:
> > On 10/05/2018 04:00 PM, Peter Bergner wrote:
> >> How about non_conflicting_reg_copy or non_conflicting_copy_insn?
> > OK. I like the first name more.
>
> Ok, I committed the patch using the first function name.
> Thank you very much for the patch reviews and approvals!
>

Hi,

Since r264897, we are seeing lots of regressions when bootstrapping on arm.
There are execution errors as well as ICEs.
A detailed list can be found at:
https://ci.linaro.org/job/tcwg-compare-results/6498/artifact/artifacts/logs/1-diff-d05_32.tcwg-d05_32-build.txt

You can also see the results posted on gcc-testresults.

Christophe

> Peter
>
>
>

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 2/2 v3][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-10-08  9:37                               ` Christophe Lyon
@ 2018-10-08 14:21                                 ` Peter Bergner
  2018-10-08 14:46                                   ` Christophe Lyon
  0 siblings, 1 reply; 66+ messages in thread
From: Peter Bergner @ 2018-10-08 14:21 UTC (permalink / raw)
  To: Christophe Lyon; +Cc: Vladimir Makarov, gcc Patches, H.J. Lu, Jeff Law

On 10/8/18 4:14 AM, Christophe Lyon wrote:
> Since r264897, we are seeing lots of regressions when bootstrapping on arm.
> There are execution errors as well as ICEs.
> A detailed list can be found at:
> https://ci.linaro.org/job/tcwg-compare-results/6498/artifact/artifacts/logs/1-diff-d05_32.tcwg-d05_32-build.txt
> 
> You can also see the results posted on gcc-testresults.

Sorry for the breakage.  I'll try and build a cross compiler to fix the
ICEs.  Hopefully all the other issues are related.

Peter


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 2/2 v3][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-10-08 14:21                                 ` Peter Bergner
@ 2018-10-08 14:46                                   ` Christophe Lyon
  2018-10-08 15:04                                     ` Jeff Law
  0 siblings, 1 reply; 66+ messages in thread
From: Christophe Lyon @ 2018-10-08 14:46 UTC (permalink / raw)
  To: bergner; +Cc: Vladimir Makarov, gcc Patches, H.J. Lu, Jeff Law

On Mon, 8 Oct 2018 at 16:13, Peter Bergner <bergner@linux.ibm.com> wrote:
>
> On 10/8/18 4:14 AM, Christophe Lyon wrote:
> > Since r264897, we are seeing lots of regressions when bootstrapping on arm.
> > There are execution errors as well as ICEs.
> > A detailed list can be found at:
> > https://ci.linaro.org/job/tcwg-compare-results/6498/artifact/artifacts/logs/1-diff-d05_32.tcwg-d05_32-build.txt
> >
> > You can also see the results posted on gcc-testresults.
>
> Sorry for the breakage.  I'll try and build a cross compiler to fix the
> ICEs.  Hopefully all the other issues are related.
>

Note that I saw ICEs only when bootstrapping, not when testing a cross-compiler.

> Peter
>
>

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 2/2 v3][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-10-08 14:46                                   ` Christophe Lyon
@ 2018-10-08 15:04                                     ` Jeff Law
  2018-10-11  2:57                                       ` Peter Bergner
  0 siblings, 1 reply; 66+ messages in thread
From: Jeff Law @ 2018-10-08 15:04 UTC (permalink / raw)
  To: Christophe Lyon, bergner; +Cc: Vladimir Makarov, gcc Patches, H.J. Lu

On 10/8/18 8:43 AM, Christophe Lyon wrote:
> On Mon, 8 Oct 2018 at 16:13, Peter Bergner <bergner@linux.ibm.com> wrote:
>>
>> On 10/8/18 4:14 AM, Christophe Lyon wrote:
>>> Since r264897, we are seeing lots of regressions when bootstrapping on arm.
>>> There are execution errors as well as ICEs.
>>> A detailed list can be found at:
>>> https://ci.linaro.org/job/tcwg-compare-results/6498/artifact/artifacts/logs/1-diff-d05_32.tcwg-d05_32-build.txt
>>>
>>> You can also see the results posted on gcc-testresults.
>>
>> Sorry for the breakage.  I'll try and build a cross compiler to fix the
>> ICEs.  Hopefully all the other issues are related.
>>
> 
> Note that I saw ICEs only when bootstrapping, not when testing a cross-compiler.
My tester is showing a variety of problems as well.  hppa, sh4, aarch64,
aarch64_be, alpha arm* and s390x bootstraps are all failing at various
points.   Others may trickle in over time, but clearly something went
wrong recently.  I'm going to start trying to pick them off after the
morning meetings are wrapped up.


jeff

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 2/2 v3][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-10-08 15:04                                     ` Jeff Law
@ 2018-10-11  2:57                                       ` Peter Bergner
  2018-10-11 18:26                                         ` Peter Bergner
  0 siblings, 1 reply; 66+ messages in thread
From: Peter Bergner @ 2018-10-11  2:57 UTC (permalink / raw)
  To: Jeff Law, Vladimir Makarov; +Cc: Christophe Lyon, gcc Patches

On 10/8/18 9:52 AM, Jeff Law wrote:
> My tester is showing a variety of problems as well.  hppa, sh4, aarch64,
> aarch64_be, alpha arm* and s390x bootstraps are all failing at various
> points.   Others may trickle in over time, but clearly something went
> wrong recently.  I'm going to start trying to pick them off after the
> morning meetings are wrapped up.

Hi Vlad and Jeff,

I looked into the PA-RISC test case issue Jeff sent me that is caused by my
patch that handles conflicts wrt reg copies and I _think_ I have a handle
on what the problem is, but don't yet know how to fix.  Jeff, I agree with
the analysis you gave in your email to me, but to get Vlad up to speed, here
it is again along with some additional info.

Compiling the test case, we have the following RTL during IRA.  I have also
annotated the pseudos in the RTL to include their hard reg assignment:

(insn 30 19 32 3 (set (reg/f:SI 112 "%r19") ....))
(insn 32 30 20 3 (set (reg:SI 26 %r26)
		      (reg/f:SI 101 "%r26")))
[snip]
(insn 23 22 49 3 (set (reg:SI 109 "%r28")
		 (mem:SI (reg/f:SI 101 "%r26"))))
(insn 49 23 31 3 (set (reg/f:SI 100 "%r28")
	(if_then_else:SI (eq (reg:SI 109 "%r28") (const_int 15 [0xf]))
	    (reg/f:SI 101 "%r26")
	    (const_int 0 [0])))
     (expr_list:REG_DEAD (reg:SI 109 "%r28")
	(expr_list:REG_DEAD (reg/f:SI 101 "%r26"))))
(insn 31 49 33 3 (set (mem/f:SI (reg/f:SI 112 "%r19"))
		      (reg/f:SI 100 "%r28"))
     (expr_list:REG_DEAD (reg/f:SI 112 "%r19")
	(expr_list:REG_DEAD (reg/f:SI 100 "%r28"))))
(call_insn 33 31 36 3 (parallel [
	    (call (mem:SI (symbol_ref/v:SI ("@_Z3yowP11dw_cfi_node"))
		(const_int 16 [0x10]))
	    (clobber (reg:SI 1 %r1))
	    (clobber (reg:SI 2 %r2))
	    (use (const_int 0 [0]))])
     (expr_list:REG_DEAD (reg:SI 26 %r26))
    (expr_list:SI (use (reg:SI 26 %r26)))))

Before my patch, hard reg %r26 and pseudo 101 conflicted, but with my patch
they now (correctly) do not.  IRA is able to assign hard regs to all of the
pseudos, but the constraints in the pattern for insn 49 forces op0 (pseudo 100)
and op1 (pseudo 101) to have the same hard reg.  They are not, so LRA comes
along and spills insn 49 with the following reloads:

Reloads for insn # 49
Reload 0: reload_in (SI) = (reg/f:SI 26 %r26 [orig:101 _10 ] [101])
	  reload_out (SI) = (reg/f:SI 28 %r28 [orig:100 iftmp.2_5 ] [100])
	  GENERAL_REGS, RELOAD_OTHER (opnum = 0)
	  reload_in_reg: (reg/f:SI 26 %r26 [orig:101 _10 ] [101])
	  reload_out_reg: (reg/f:SI 28 %r28 [orig:100 iftmp.2_5 ] [100])
	  reload_reg_rtx: (reg/f:SI 26 %r26 [orig:101 _10 ] [101])

..giving us the following updated insn:

(insn 49 23 58 3 (set (reg/f:SI 26 %r26 [101])
	(if_then_else:SI (eq (reg:SI 28 %r28 [109])
		(const_int 15))
	    (reg/f:SI 26 %r26 [101])
	    (const_int 0 [0]))))

The problem is, that hard reg %r26 is defined in insn 32, to be used in
insn 33, so using %r26 as the reload reg is wrong, because it will clobber
the value we set in insn 32.  Looking thru LRA, it looks like LRA assumes
that for a reload, if one of the input pseudos dies in the insn, then the
hard reg it was assigned to is available to use.  That assumption is (now)
wrong, because another pseudo may be using that hard reg or in this case,
the hard reg itself is still live.

For this example, pseudo 109 also dies in insn 49 and since it's hard reg
%r28 isn't live thru the insn, we could have used that instead.  However,
we cannot just look at REG_DEAD notes for free hard regs to use for reload
regs.  We need to make sure that that hard reg isn't also assigned to another
pseudo that is live at that insn or even that the hard reg itself is live.

Vlad, you know the LRA code better than anyone.  Will it be easy to find
all the places where we create reload regs and fix them up so that we
look at more than just REG_DEAD notes?  Even though looking at REG_DEAD
notes isn't enough, I still think the majority of the time those regs
probably will be free to use, we just have to check for the special
cases like above where they are not.

Peter


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 2/2 v3][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-10-11  2:57                                       ` Peter Bergner
@ 2018-10-11 18:26                                         ` Peter Bergner
  2018-10-11 20:31                                           ` Peter Bergner
  0 siblings, 1 reply; 66+ messages in thread
From: Peter Bergner @ 2018-10-11 18:26 UTC (permalink / raw)
  To: Jeff Law, Vladimir Makarov; +Cc: Christophe Lyon, gcc Patches

On 10/10/18 7:57 PM, Peter Bergner wrote:
> The problem is, that hard reg %r26 is defined in insn 32, to be used in
> insn 33, so using %r26 as the reload reg is wrong, because it will clobber
> the value we set in insn 32.  Looking thru LRA, it looks like LRA assumes
> that for a reload, if one of the input pseudos dies in the insn, then the
> hard reg it was assigned to is available to use.  That assumption is (now)
> wrong, because another pseudo may be using that hard reg or in this case,
> the hard reg itself is still live.
> 
> For this example, pseudo 109 also dies in insn 49 and since it's hard reg
> %r28 isn't live thru the insn, we could have used that instead.  However,
> we cannot just look at REG_DEAD notes for free hard regs to use for reload
> regs.  We need to make sure that that hard reg isn't also assigned to another
> pseudo that is live at that insn or even that the hard reg itself is live.
> 
> Vlad, you know the LRA code better than anyone.  Will it be easy to find
> all the places where we create reload regs and fix them up so that we
> look at more than just REG_DEAD notes?  Even though looking at REG_DEAD
> notes isn't enough, I still think the majority of the time those regs
> probably will be free to use, we just have to check for the special
> cases like above where they are not.

Ok, after working in gdb, I see that the PA-RISC port still uses reload
and not LRA, but it too seems to have the same issue of reusing input
regs that have REG_DEAD notes, so the question still stands.  It's just
that whatever fix we come up with will have to be to both LRA and reload.

Peter



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 2/2 v3][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-10-11 18:26                                         ` Peter Bergner
@ 2018-10-11 20:31                                           ` Peter Bergner
  2018-10-11 20:46                                             ` Jeff Law
                                                               ` (2 more replies)
  0 siblings, 3 replies; 66+ messages in thread
From: Peter Bergner @ 2018-10-11 20:31 UTC (permalink / raw)
  To: Jeff Law, Vladimir Makarov; +Cc: Christophe Lyon, gcc Patches

On 10/11/18 1:18 PM, Peter Bergner wrote:
> Ok, after working in gdb, I see that the PA-RISC port still uses reload
> and not LRA, but it too seems to have the same issue of reusing input
> regs that have REG_DEAD notes, so the question still stands.  It's just
> that whatever fix we come up with will have to be to both LRA and reload.

On second thought, I'm thinking we should just leave reload alone and
only fix this in LRA.  That means we'd have to disable the reg copy
handling when not using LRA though, which might be another reason to
get targets to move to LRA?  I've verified the following patch gets
the PA-RISC test case to pass again.  Thoughts?

If ok, I still have to dig into the fails we're seeing on LRA targets.

Peter

	* ira-lives (non_conflicting_reg_copy_p): Disable for non LRA targets.

Index: gcc/ira-lives.c
===================================================================
--- gcc/ira-lives.c	(revision 264897)
+++ gcc/ira-lives.c	(working copy)
@@ -1064,6 +1064,10 @@ find_call_crossed_cheap_reg (rtx_insn *i
 rtx
 non_conflicting_reg_copy_p (rtx_insn *insn)
 {
+  /* Disallow this for non LRA targets.  */
+  if (!targetm.lra_p ())
+    return NULL_RTX;
+
   rtx set = single_set (insn);
 
   /* Disallow anything other than a simple register to register copy

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 2/2 v3][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-10-11 20:31                                           ` Peter Bergner
@ 2018-10-11 20:46                                             ` Jeff Law
  2018-10-11 21:09                                               ` Peter Bergner
  2018-10-11 21:05                                             ` Vladimir Makarov
  2018-10-12  4:44                                             ` Jeff Law
  2 siblings, 1 reply; 66+ messages in thread
From: Jeff Law @ 2018-10-11 20:46 UTC (permalink / raw)
  To: Peter Bergner, Vladimir Makarov; +Cc: Christophe Lyon, gcc Patches

On 10/11/18 1:23 PM, Peter Bergner wrote:
> On 10/11/18 1:18 PM, Peter Bergner wrote:
>> Ok, after working in gdb, I see that the PA-RISC port still uses reload
>> and not LRA, but it too seems to have the same issue of reusing input
>> regs that have REG_DEAD notes, so the question still stands.  It's just
>> that whatever fix we come up with will have to be to both LRA and reload.
> 
> On second thought, I'm thinking we should just leave reload alone and
> only fix this in LRA.  That means we'd have to disable the reg copy
> handling when not using LRA though, which might be another reason to
> get targets to move to LRA?  I've verified the following patch gets
> the PA-RISC test case to pass again.  Thoughts?
> 
> If ok, I still have to dig into the fails we're seeing on LRA targets.
Hmmm.  Interesting.  I wonder if all the failing targets were reload
targets.....  If so, this may be the way forward -- I certainly don't
want to spend much, if any, time fixing reload.

I'm in the middle of something, but will try to look at each of the
failing targets and confirm they use reload by default.

Jeff

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 2/2 v3][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-10-11 20:31                                           ` Peter Bergner
  2018-10-11 20:46                                             ` Jeff Law
@ 2018-10-11 21:05                                             ` Vladimir Makarov
  2018-10-12 16:57                                               ` Peter Bergner
  2018-10-12  4:44                                             ` Jeff Law
  2 siblings, 1 reply; 66+ messages in thread
From: Vladimir Makarov @ 2018-10-11 21:05 UTC (permalink / raw)
  To: Peter Bergner, Jeff Law; +Cc: Christophe Lyon, gcc Patches

On 10/11/2018 03:23 PM, Peter Bergner wrote:
> On 10/11/18 1:18 PM, Peter Bergner wrote:
>> Ok, after working in gdb, I see that the PA-RISC port still uses reload
>> and not LRA, but it too seems to have the same issue of reusing input
>> regs that have REG_DEAD notes, so the question still stands.  It's just
>> that whatever fix we come up with will have to be to both LRA and reload.
> On second thought, I'm thinking we should just leave reload alone and
> only fix this in LRA.  That means we'd have to disable the reg copy
> handling when not using LRA though, which might be another reason to
> get targets to move to LRA?  I've verified the following patch gets
> the PA-RISC test case to pass again.  Thoughts?
>
> If ok, I still have to dig into the fails we're seeing on LRA targets.
>
I think it has a sense because even if LRA has the same problem, it will 
be hard to fix it in reload and LRA.  Nobody worked on reload pass for a 
long time and it is not worth to fix it because we are moving from reload.

I suspect that LRA might be immune to these failures because it 
generates new reload pseudos if it is necessary for insn constraints.  
Plus there is some primitive value numbering in LRA which can avoid the 
problem.

In any case, the patch is ok for me.
>
> 	* ira-lives (non_conflicting_reg_copy_p): Disable for non LRA targets.
>
> Index: gcc/ira-lives.c
> ===================================================================
> --- gcc/ira-lives.c	(revision 264897)
> +++ gcc/ira-lives.c	(working copy)
> @@ -1064,6 +1064,10 @@ find_call_crossed_cheap_reg (rtx_insn *i
>   rtx
>   non_conflicting_reg_copy_p (rtx_insn *insn)
>   {
> +  /* Disallow this for non LRA targets.  */
> +  if (!targetm.lra_p ())
> +    return NULL_RTX;
> +
>     rtx set = single_set (insn);
>   
>     /* Disallow anything other than a simple register to register copy
>

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 2/2 v3][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-10-11 20:46                                             ` Jeff Law
@ 2018-10-11 21:09                                               ` Peter Bergner
  2018-10-11 21:36                                                 ` Jeff Law
  2018-10-12  9:50                                                 ` Eric Botcazou
  0 siblings, 2 replies; 66+ messages in thread
From: Peter Bergner @ 2018-10-11 21:09 UTC (permalink / raw)
  To: Jeff Law; +Cc: Vladimir Makarov, Christophe Lyon, gcc Patches

On 10/11/18 2:40 PM, Jeff Law wrote:
> On 10/11/18 1:23 PM, Peter Bergner wrote:
>> On 10/11/18 1:18 PM, Peter Bergner wrote:
>>> Ok, after working in gdb, I see that the PA-RISC port still uses reload
>>> and not LRA, but it too seems to have the same issue of reusing input
>>> regs that have REG_DEAD notes, so the question still stands.  It's just
>>> that whatever fix we come up with will have to be to both LRA and reload.
>>
>> On second thought, I'm thinking we should just leave reload alone and
>> only fix this in LRA.  That means we'd have to disable the reg copy
>> handling when not using LRA though, which might be another reason to
>> get targets to move to LRA?  I've verified the following patch gets
>> the PA-RISC test case to pass again.  Thoughts?
>>
>> If ok, I still have to dig into the fails we're seeing on LRA targets.
> Hmmm.  Interesting.  I wonder if all the failing targets were reload
> targets.....  If so, this may be the way forward -- I certainly don't
> want to spend much, if any, time fixing reload.
> 
> I'm in the middle of something, but will try to look at each of the
> failing targets and confirm they use reload by default.

These are the easy ones (they default to reload):

bergner@pike:~/gcc/gcc-fsf-mainline/gcc/config$ grep -r TARGET_LRA_P | grep false | sort
alpha/alpha.c:#define TARGET_LRA_P hook_bool_void_false
avr/avr.c:#define TARGET_LRA_P hook_bool_void_false
bfin/bfin.c:#define TARGET_LRA_P hook_bool_void_false
c6x/c6x.c:#define TARGET_LRA_P hook_bool_void_false
cr16/cr16.c:#define TARGET_LRA_P hook_bool_void_false
cris/cris.c:#define TARGET_LRA_P hook_bool_void_false
epiphany/epiphany.c:#define TARGET_LRA_P hook_bool_void_false
fr30/fr30.c:#define TARGET_LRA_P hook_bool_void_false
frv/frv.c:#define TARGET_LRA_P hook_bool_void_false
h8300/h8300.c:#define TARGET_LRA_P hook_bool_void_false
ia64/ia64.c:#define TARGET_LRA_P hook_bool_void_false
iq2000/iq2000.c:#define TARGET_LRA_P hook_bool_void_false
lm32/lm32.c:#define TARGET_LRA_P hook_bool_void_false
m32c/m32c.c:#define TARGET_LRA_P hook_bool_void_false
m32r/m32r.c:#define TARGET_LRA_P hook_bool_void_false
m68k/m68k.c:#define TARGET_LRA_P hook_bool_void_false
mcore/mcore.c:#define TARGET_LRA_P hook_bool_void_false
microblaze/microblaze.c:#define TARGET_LRA_P hook_bool_void_false
mmix/mmix.c:#define TARGET_LRA_P hook_bool_void_false
mn10300/mn10300.c:#define TARGET_LRA_P hook_bool_void_false
moxie/moxie.c:#define TARGET_LRA_P hook_bool_void_false
msp430/msp430.c:#define TARGET_LRA_P hook_bool_void_false
nvptx/nvptx.c:#define TARGET_LRA_P hook_bool_void_false
pa/pa.c:#define TARGET_LRA_P hook_bool_void_false
rl78/rl78.c:#define TARGET_LRA_P hook_bool_void_false
spu/spu.c:#define TARGET_LRA_P hook_bool_void_false
stormy16/stormy16.c:#define TARGET_LRA_P hook_bool_void_false
tilegx/tilegx.c:#define TARGET_LRA_P hook_bool_void_false
tilepro/tilepro.c:#define TARGET_LRA_P hook_bool_void_false
vax/vax.c:#define TARGET_LRA_P hook_bool_void_false
visium/visium.c:#define TARGET_LRA_P hook_bool_void_false
xtensa/xtensa.c:#define TARGET_LRA_P hook_bool_void_false

These are harder since they support -mlra:

arc/arc.c:#define TARGET_LRA_P arc_lra_p
ft32/ft32.c:#define TARGET_LRA_P ft32_lra_p
mips/mips.c:#define TARGET_LRA_P mips_lra_p
pdp11/pdp11.c:#define TARGET_LRA_P pdp11_lra_p
powerpcspe/powerpcspe.c:#define TARGET_LRA_P rs6000_lra_p
rx/rx.c:#define TARGET_LRA_P 				rx_enable_lra
s390/s390.c:#define TARGET_LRA_P s390_lra_p
sh/sh.c:#define TARGET_LRA_P sh_lra_p
sparc/sparc.c:#define TARGET_LRA_P sparc_lra_p

Quickly looking into their *.opt files, the follwoing default to LRA:
  mips, s390
while these default to reload:
  ft32, sh4
and these I'm not sure of without looking deeper:
  arc, pdp11, powerpcspe, rx, sparc

...if that helps.

Peter

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 2/2 v3][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-10-11 21:09                                               ` Peter Bergner
@ 2018-10-11 21:36                                                 ` Jeff Law
  2018-10-12  9:50                                                 ` Eric Botcazou
  1 sibling, 0 replies; 66+ messages in thread
From: Jeff Law @ 2018-10-11 21:36 UTC (permalink / raw)
  To: Peter Bergner; +Cc: Vladimir Makarov, Christophe Lyon, gcc Patches

On 10/11/18 3:05 PM, Peter Bergner wrote:
> On 10/11/18 2:40 PM, Jeff Law wrote:
>> On 10/11/18 1:23 PM, Peter Bergner wrote:
>>> On 10/11/18 1:18 PM, Peter Bergner wrote:
>>>> Ok, after working in gdb, I see that the PA-RISC port still uses reload
>>>> and not LRA, but it too seems to have the same issue of reusing input
>>>> regs that have REG_DEAD notes, so the question still stands.  It's just
>>>> that whatever fix we come up with will have to be to both LRA and reload.
>>>
>>> On second thought, I'm thinking we should just leave reload alone and
>>> only fix this in LRA.  That means we'd have to disable the reg copy
>>> handling when not using LRA though, which might be another reason to
>>> get targets to move to LRA?  I've verified the following patch gets
>>> the PA-RISC test case to pass again.  Thoughts?
>>>
>>> If ok, I still have to dig into the fails we're seeing on LRA targets.
>> Hmmm.  Interesting.  I wonder if all the failing targets were reload
>> targets.....  If so, this may be the way forward -- I certainly don't
>> want to spend much, if any, time fixing reload.
>>
>> I'm in the middle of something, but will try to look at each of the
>> failing targets and confirm they use reload by default.
> 
> These are the easy ones (they default to reload):
> 
> bergner@pike:~/gcc/gcc-fsf-mainline/gcc/config$ grep -r TARGET_LRA_P | grep false | sort
> alpha/alpha.c:#define TARGET_LRA_P hook_bool_void_false
> avr/avr.c:#define TARGET_LRA_P hook_bool_void_false
> bfin/bfin.c:#define TARGET_LRA_P hook_bool_void_false
> c6x/c6x.c:#define TARGET_LRA_P hook_bool_void_false
> cr16/cr16.c:#define TARGET_LRA_P hook_bool_void_false
> cris/cris.c:#define TARGET_LRA_P hook_bool_void_false
> epiphany/epiphany.c:#define TARGET_LRA_P hook_bool_void_false
> fr30/fr30.c:#define TARGET_LRA_P hook_bool_void_false
> frv/frv.c:#define TARGET_LRA_P hook_bool_void_false
> h8300/h8300.c:#define TARGET_LRA_P hook_bool_void_false
> ia64/ia64.c:#define TARGET_LRA_P hook_bool_void_false
> iq2000/iq2000.c:#define TARGET_LRA_P hook_bool_void_false
> lm32/lm32.c:#define TARGET_LRA_P hook_bool_void_false
> m32c/m32c.c:#define TARGET_LRA_P hook_bool_void_false
> m32r/m32r.c:#define TARGET_LRA_P hook_bool_void_false
> m68k/m68k.c:#define TARGET_LRA_P hook_bool_void_false
> mcore/mcore.c:#define TARGET_LRA_P hook_bool_void_false
> microblaze/microblaze.c:#define TARGET_LRA_P hook_bool_void_false
> mmix/mmix.c:#define TARGET_LRA_P hook_bool_void_false
> mn10300/mn10300.c:#define TARGET_LRA_P hook_bool_void_false
> moxie/moxie.c:#define TARGET_LRA_P hook_bool_void_false
> msp430/msp430.c:#define TARGET_LRA_P hook_bool_void_false
> nvptx/nvptx.c:#define TARGET_LRA_P hook_bool_void_false
> pa/pa.c:#define TARGET_LRA_P hook_bool_void_false
> rl78/rl78.c:#define TARGET_LRA_P hook_bool_void_false
> spu/spu.c:#define TARGET_LRA_P hook_bool_void_false
> stormy16/stormy16.c:#define TARGET_LRA_P hook_bool_void_false
> tilegx/tilegx.c:#define TARGET_LRA_P hook_bool_void_false
> tilepro/tilepro.c:#define TARGET_LRA_P hook_bool_void_false
> vax/vax.c:#define TARGET_LRA_P hook_bool_void_false
> visium/visium.c:#define TARGET_LRA_P hook_bool_void_false
> xtensa/xtensa.c:#define TARGET_LRA_P hook_bool_void_false
> 
> These are harder since they support -mlra:
> 
> arc/arc.c:#define TARGET_LRA_P arc_lra_p
> ft32/ft32.c:#define TARGET_LRA_P ft32_lra_p
> mips/mips.c:#define TARGET_LRA_P mips_lra_p
> pdp11/pdp11.c:#define TARGET_LRA_P pdp11_lra_p
> powerpcspe/powerpcspe.c:#define TARGET_LRA_P rs6000_lra_p
> rx/rx.c:#define TARGET_LRA_P 				rx_enable_lra
> s390/s390.c:#define TARGET_LRA_P s390_lra_p
> sh/sh.c:#define TARGET_LRA_P sh_lra_p
> sparc/sparc.c:#define TARGET_LRA_P sparc_lra_p
> 
> Quickly looking into their *.opt files, the follwoing default to LRA:
>   mips, s390
So the failing targets were aarch64, alpha, arm, sh4, s390, alpha and
hppa.  In theory your patch has a reasonable chance of fixing sh4, alpha
and hppa.  So I suspect we're still going to have the aarch64, arm and
s390 issues.


I've had my tester turned off while we sorted this out.   I'll put your
patch to disable the conflict pruning for non-LRA targets and see what
we get overnight.



jeff

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 2/2 v3][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-10-11 20:31                                           ` Peter Bergner
  2018-10-11 20:46                                             ` Jeff Law
  2018-10-11 21:05                                             ` Vladimir Makarov
@ 2018-10-12  4:44                                             ` Jeff Law
  2018-10-16  2:50                                               ` Peter Bergner
  2 siblings, 1 reply; 66+ messages in thread
From: Jeff Law @ 2018-10-12  4:44 UTC (permalink / raw)
  To: Peter Bergner, Vladimir Makarov; +Cc: Christophe Lyon, gcc Patches

On 10/11/18 1:23 PM, Peter Bergner wrote:
> On 10/11/18 1:18 PM, Peter Bergner wrote:
>> Ok, after working in gdb, I see that the PA-RISC port still uses reload
>> and not LRA, but it too seems to have the same issue of reusing input
>> regs that have REG_DEAD notes, so the question still stands.  It's just
>> that whatever fix we come up with will have to be to both LRA and reload.
> 
> On second thought, I'm thinking we should just leave reload alone and
> only fix this in LRA.  That means we'd have to disable the reg copy
> handling when not using LRA though, which might be another reason to
> get targets to move to LRA?  I've verified the following patch gets
> the PA-RISC test case to pass again.  Thoughts?
> 
> If ok, I still have to dig into the fails we're seeing on LRA targets.
> 
> Peter
> 
> 	* ira-lives (non_conflicting_reg_copy_p): Disable for non LRA targets.
So this helped the alpha & hppa and sh4.

I'm still seeing failures on the aarch64, s390x.  No surprise on these
since they use LRA by default and would be unaffected by this patch.


You've got state on the aarch64 issue where we generate bogus code for
the kernel which (thankfully) the assembler complained about.

For s390x the stage3 compiler fails its selftest with:

/home/nfs/law/jenkins/workspace/s390x-linux-gnu/obj/gcc/./gcc/xgcc
-B/home/nfs/law/jenkins/workspace/s390x-linux-gnu/obj/gcc/./gcc/ -xc
-nostdinc /dev/null -S -o /dev/null
-fself-test=../../../gcc/gcc/testsuite/selftests
../../../gcc/gcc/input.c:2423: test_lexer_string_locations_simple: FAIL:
ASSERT_EQ (expected_start_col, actual_start_col)
cc1: internal compiler error: in fail, at selftest.c:47
0x2171da3 selftest::fail(selftest::location const&, char const*)
	../../../gcc/gcc/selftest.c:47
0x2192bb3 assert_char_at_range
	../../../gcc/gcc/input.c:2299
0x2198ddb test_lexer_string_locations_simple
	../../../gcc/gcc/input.c:2423
0x2194a57 selftest::for_each_line_table_case(void
(*)(selftest::line_table_case const&))
	../../../gcc/gcc/input.c:3533
0x2194dcf selftest::input_c_tests()
	../../../gcc/gcc/input.c:3556
0x21061e9 selftest::run_tests()
	../../../gcc/gcc/selftest-run-tests.c:80
0x1869f6d toplev::run_self_tests()
	../../../gcc/gcc/toplev.c:2234
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.
make[3]: *** [Makefile:1979: s-selftest-c] Error 1


That's an indication that we've mis-compiled GCC along the way since the
selftest was successful with the stage1 and stage2 compiler.

Given you can see the aarch64 failure with a cross and we haven't done
any real analysis on the s390 failure, I'd focus on aarch64.  I'd ignore
any pathname weirdness and track down why we generate the bogus code the
assembler complains about.


There's also still an arm-linux-gnueabi failure, but I haven't bisected
it to your change.   GCC bootstraps, but fails with an odd error in the
linemap code while building the kernel.

jeff

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 2/2 v3][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-10-11 21:09                                               ` Peter Bergner
  2018-10-11 21:36                                                 ` Jeff Law
@ 2018-10-12  9:50                                                 ` Eric Botcazou
  1 sibling, 0 replies; 66+ messages in thread
From: Eric Botcazou @ 2018-10-12  9:50 UTC (permalink / raw)
  To: Peter Bergner, Jeff Law; +Cc: gcc-patches, Vladimir Makarov, Christophe Lyon

> These are the easy ones (they default to reload):
> 
> bergner@pike:~/gcc/gcc-fsf-mainline/gcc/config$ grep -r TARGET_LRA_P | grep
> false | sort alpha/alpha.c:#define TARGET_LRA_P hook_bool_void_false
> avr/avr.c:#define TARGET_LRA_P hook_bool_void_false
> bfin/bfin.c:#define TARGET_LRA_P hook_bool_void_false
> c6x/c6x.c:#define TARGET_LRA_P hook_bool_void_false
> cr16/cr16.c:#define TARGET_LRA_P hook_bool_void_false
> cris/cris.c:#define TARGET_LRA_P hook_bool_void_false
> epiphany/epiphany.c:#define TARGET_LRA_P hook_bool_void_false
> fr30/fr30.c:#define TARGET_LRA_P hook_bool_void_false
> frv/frv.c:#define TARGET_LRA_P hook_bool_void_false
> h8300/h8300.c:#define TARGET_LRA_P hook_bool_void_false
> ia64/ia64.c:#define TARGET_LRA_P hook_bool_void_false
> iq2000/iq2000.c:#define TARGET_LRA_P hook_bool_void_false
> lm32/lm32.c:#define TARGET_LRA_P hook_bool_void_false
> m32c/m32c.c:#define TARGET_LRA_P hook_bool_void_false
> m32r/m32r.c:#define TARGET_LRA_P hook_bool_void_false
> m68k/m68k.c:#define TARGET_LRA_P hook_bool_void_false
> mcore/mcore.c:#define TARGET_LRA_P hook_bool_void_false
> microblaze/microblaze.c:#define TARGET_LRA_P hook_bool_void_false
> mmix/mmix.c:#define TARGET_LRA_P hook_bool_void_false
> mn10300/mn10300.c:#define TARGET_LRA_P hook_bool_void_false
> moxie/moxie.c:#define TARGET_LRA_P hook_bool_void_false
> msp430/msp430.c:#define TARGET_LRA_P hook_bool_void_false
> nvptx/nvptx.c:#define TARGET_LRA_P hook_bool_void_false
> pa/pa.c:#define TARGET_LRA_P hook_bool_void_false
> rl78/rl78.c:#define TARGET_LRA_P hook_bool_void_false
> spu/spu.c:#define TARGET_LRA_P hook_bool_void_false
> stormy16/stormy16.c:#define TARGET_LRA_P hook_bool_void_false
> tilegx/tilegx.c:#define TARGET_LRA_P hook_bool_void_false
> tilepro/tilepro.c:#define TARGET_LRA_P hook_bool_void_false
> vax/vax.c:#define TARGET_LRA_P hook_bool_void_false
> visium/visium.c:#define TARGET_LRA_P hook_bool_void_false
> xtensa/xtensa.c:#define TARGET_LRA_P hook_bool_void_false
> 
> These are harder since they support -mlra:
> 
> arc/arc.c:#define TARGET_LRA_P arc_lra_p
> ft32/ft32.c:#define TARGET_LRA_P ft32_lra_p
> mips/mips.c:#define TARGET_LRA_P mips_lra_p
> pdp11/pdp11.c:#define TARGET_LRA_P pdp11_lra_p
> powerpcspe/powerpcspe.c:#define TARGET_LRA_P rs6000_lra_p
> rx/rx.c:#define TARGET_LRA_P 				rx_enable_lra
> s390/s390.c:#define TARGET_LRA_P s390_lra_p
> sh/sh.c:#define TARGET_LRA_P sh_lra_p
> sparc/sparc.c:#define TARGET_LRA_P sparc_lra_p
> 
> Quickly looking into their *.opt files, the follwoing default to LRA:
>   mips, s390
> while these default to reload:
>   ft32, sh4
> and these I'm not sure of without looking deeper:
>   arc, pdp11, powerpcspe, rx, sparc
> 
> ...if that helps.

See https://gcc.gnu.org/backends.html for a more precise summary.

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 2/2 v3][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-10-11 21:05                                             ` Vladimir Makarov
@ 2018-10-12 16:57                                               ` Peter Bergner
  2018-10-12 17:56                                                 ` Jeff Law
  0 siblings, 1 reply; 66+ messages in thread
From: Peter Bergner @ 2018-10-12 16:57 UTC (permalink / raw)
  To: Vladimir Makarov, Jeff Law; +Cc: Christophe Lyon, gcc Patches

On 10/11/18 3:51 PM, Vladimir Makarov wrote:
> I think it has a sense because even if LRA has the same problem, it will be
> hard to fix it in reload and LRA.  Nobody worked on reload pass for a long
> time and it is not worth to fix it because we are moving from reload.
[snip]
> In any case, the patch is ok for me.

Ok, I committed the patch as revision 265113 with a slightly longer comment
explaining why we're disabling it for reload.  Thanks!



> I suspect that LRA might be immune to these failures because it generates
> new reload pseudos if it is necessary for insn constraints.  Plus there is
> some primitive value numbering in LRA which can avoid the problem.

Maybe.  We still have some LRA targets with issues, but we haven't gotten
to the point of identifying what the problem is.  It could well be target
constraints, etc. that is the problem.

I built a newer binutils on our s390x box and I have now recreated the
selftest ICE in stage3 and I still have the largish aarch64 failure I'm
looking into.

Peter


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 2/2 v3][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-10-12 16:57                                               ` Peter Bergner
@ 2018-10-12 17:56                                                 ` Jeff Law
  0 siblings, 0 replies; 66+ messages in thread
From: Jeff Law @ 2018-10-12 17:56 UTC (permalink / raw)
  To: Peter Bergner, Vladimir Makarov; +Cc: Christophe Lyon, gcc Patches

On 10/12/18 10:38 AM, Peter Bergner wrote:
> On 10/11/18 3:51 PM, Vladimir Makarov wrote:
>> I think it has a sense because even if LRA has the same problem, it will be
>> hard to fix it in reload and LRA.  Nobody worked on reload pass for a long
>> time and it is not worth to fix it because we are moving from reload.
> [snip]
>> In any case, the patch is ok for me.
> 
> Ok, I committed the patch as revision 265113 with a slightly longer comment
> explaining why we're disabling it for reload.  Thanks!
> 
> 
> 
>> I suspect that LRA might be immune to these failures because it generates
>> new reload pseudos if it is necessary for insn constraints.  Plus there is
>> some primitive value numbering in LRA which can avoid the problem.
> 
> Maybe.  We still have some LRA targets with issues, but we haven't gotten
> to the point of identifying what the problem is.  It could well be target
> constraints, etc. that is the problem.
> 
> I built a newer binutils on our s390x box and I have now recreated the
> selftest ICE in stage3 and I still have the largish aarch64 failure I'm
> looking into.
ACK.  In that case I'll look into the one arm issue and see if I can
conclusively rule in or out the IRA/LRA work.

jeff

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 2/2 v3][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-10-12  4:44                                             ` Jeff Law
@ 2018-10-16  2:50                                               ` Peter Bergner
  2018-11-01 18:50                                                 ` Renlin Li
  0 siblings, 1 reply; 66+ messages in thread
From: Peter Bergner @ 2018-10-16  2:50 UTC (permalink / raw)
  To: Jeff Law
  Cc: Vladimir Makarov, Christophe Lyon, gcc Patches, Segher Boessenkool

On 10/11/18 10:40 PM, Jeff Law wrote:
> On 10/11/18 1:23 PM, Peter Bergner wrote:
>> 	* ira-lives (non_conflicting_reg_copy_p): Disable for non LRA targets.
> So this helped the alpha & hppa and sh4.
> 
> I'm still seeing failures on the aarch64, s390x.  No surprise on these
> since they use LRA by default and would be unaffected by this patch.

Ok, I was able to reduce the aarch64 test case down to the minimum test case
that still kept the kernel's __cmpxchg_double() function intact.  I tested
the patch you're currently running on your builders which changed some
of the "... == OP_OUT" to "... != OP_IN", etc and it doesn't fix the
following test case, like it seems to fix the s390 issue and segher's
small test case (both aarch64 and ppc64).

It's late here, so I'll start digging into this one in the morning.

Peter



bergner@pike:~/gcc/BUGS/PR87507/$ cat slub-min.c
long
__cmpxchg_double (unsigned long arg)
{
  unsigned long old1 = 0;
  unsigned long old2 = arg;
  unsigned long new1 = 0;
  unsigned long new2 = 0;
  volatile void *ptr = 0;

  unsigned long oldval1 = old1;
  unsigned long oldval2 = old2;
  register unsigned long x0 asm ("x0") = old1;
  register unsigned long x1 asm ("x1") = old2;
  register unsigned long x2 asm ("x2") = new1;
  register unsigned long x3 asm ("x3") = new2;
  register unsigned long x4 asm ("x4") = (unsigned long) ptr;
  asm volatile ("	casp    %[old1], %[old2], %[new1], %[new2], %[v]\n"
		"	eor	%[old1], %[old1], %[oldval1]\n"
		"	eor	%[old2], %[old2], %[oldval2]\n"
		"	orr	%[old1], %[old1], %[old2]\n"
		: [old1] "+&r" (x0), [old2] "+&r" (x1), [v] "+Q" (* (unsigned long *) ptr)
		: [new1] "r" (x2), [new2] "r" (x3), [ptr] "r" (x4), [oldval1] "r" (oldval1),[oldval2] "r" (oldval2)
		: "x16", "x17", "x30");
  return x0;
}
bergner@pike:~/gcc/BUGS/PR87507/$ /home/bergner/gcc/build/gcc-fsf-mainline-aarch64-r264897/gcc/xgcc -B/home/bergner/gcc/build/gcc-fsf-mainline-aarch64-r264897/gcc -O2 -march=armv8.1-a -c slub-min.c
/tmp/ccQCkiSG.s: Assembler messages:
/tmp/ccQCkiSG.s:24: Error: reg pair must be contiguous at operand 2 -- `casp x0,x6,x2,x3,[x5]'



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 2/2 v3][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-10-16  2:50                                               ` Peter Bergner
@ 2018-11-01 18:50                                                 ` Renlin Li
  2018-11-01 20:35                                                   ` Segher Boessenkool
  2018-11-01 22:08                                                   ` Peter Bergner
  0 siblings, 2 replies; 66+ messages in thread
From: Renlin Li @ 2018-11-01 18:50 UTC (permalink / raw)
  To: Peter Bergner, Jeff Law
  Cc: Vladimir Makarov, Christophe Lyon, gcc Patches, Segher Boessenkool

Hi Peter,

Is there any update on this issues?
arm-none-linux-gnueabihf native toolchain has been mis-compiled for a while.

I got the following dump from the test case.

x1 is an early clobber operand in the inline assembly statement,
r92 should conflict with x1?

;; a0(r93,l0) conflicts: a1(r92,l0)
;;     total conflict hard regs: 0-4 16 17 30
;;     conflict hard regs: 0-4 16 17 30


;; a1(r92,l0) conflicts: a0(r93,l0)
;;     total conflict hard regs: 0 2-4 16 17 30
;;     conflict hard regs: 0 2-4 16 17 30

Dump from ira.

(insn 2 8 6 2 (set (reg/v:DI 92 [ arg ])
         (reg:DI 97)) "test.c":3:1 47 {*movdi_aarch64}
      (expr_list:REG_DEAD (reg:DI 97)
         (nil)))
(insn 7 6 9 2 (set (reg/v:DI 1 x1 [ x1 ])
         (reg/v:DI 92 [ arg ])) "test.c":13:26 47 {*movdi_aarch64}
      (nil))
(insn 11 10 14 2 (set (reg/f:DI 93)
         (const_int 0 [0])) "test.c":17:3 47 {*movdi_aarch64}
      (expr_list:REG_EQUIV (const_int 0 [0])
         (nil)))
(insn 14 11 21 2 (parallel [
             (set (reg/v:DI 0 x0 [ x0 ])
                 (asm_operands/v:DI ("	casp    %0, %1, %3, %4, %2
	eor	%0, %0, %6
	eor	%1, %1, %7
	orr	%0, %0, %1
") ("=&r") 0 [
                         (reg/v:DI 2 x2 [ x2 ])
                         (reg/v:DI 3 x3 [ x3 ])
                         (reg/v:DI 4 x4 [ x4 ])
                         (reg/f:DI 93)
                         (reg/v:DI 92 [ arg ])
                         (reg/v:DI 0 x0 [ x0 ])
                         (reg/v:DI 1 x1 [ x1 ])
                         (mem:DI (reg/f:DI 93) [1 MEM[(long unsigned int *)0B]+0 S8 A128])
                     ]
                      [
                         (asm_input:DI ("r") test.c:17)
                         (asm_input:DI ("r") test.c:17)
                         (asm_input:DI ("r") test.c:17)
                         (asm_input:DI ("r") test.c:17)
                         (asm_input:DI ("r") test.c:17)
                         (asm_input:DI ("0") test.c:17)
                         (asm_input:DI ("1") test.c:17)
                         (asm_input:DI ("Q") test.c:17)
                     ]
                      [] test.c:17))
             (set (reg/v:DI 1 x1 [ x1 ])
                 (asm_operands/v:DI ("	casp    %0, %1, %3, %4, %2
	eor	%0, %0, %6
	eor	%1, %1, %7
	orr	%0, %0, %1
") ("=&r") 1 [
                         (reg/v:DI 2 x2 [ x2 ])
                         (reg/v:DI 3 x3 [ x3 ])
                         (reg/v:DI 4 x4 [ x4 ])
                         (reg/f:DI 93)
                         (reg/v:DI 92 [ arg ])
                         (reg/v:DI 0 x0 [ x0 ])
                         (reg/v:DI 1 x1 [ x1 ])
                         (mem:DI (reg/f:DI 93) [1 MEM[(long unsigned int *)0B]+0 S8 A128])
                     ]
                      [
                         (asm_input:DI ("r") test.c:17)
                         (asm_input:DI ("r") test.c:17)
                         (asm_input:DI ("r") test.c:17)
                         (asm_input:DI ("r") test.c:17)
                         (asm_input:DI ("r") test.c:17)
                         (asm_input:DI ("0") test.c:17)
                         (asm_input:DI ("1") test.c:17)
                         (asm_input:DI ("Q") test.c:17)
                     ]
                      [] test.c:17))
             (set (mem:DI (reg/f:DI 93) [1 MEM[(long unsigned int *)0B]+0 S8 A128])
                 (asm_operands/v:DI ("	casp    %0, %1, %3, %4, %2
	eor	%0, %0, %6
	eor	%1, %1, %7
	orr	%0, %0, %1
") ("=Q") 2 [
                         (reg/v:DI 2 x2 [ x2 ])
                         (reg/v:DI 3 x3 [ x3 ])
                         (reg/v:DI 4 x4 [ x4 ])
                         (reg/f:DI 93)
                         (reg/v:DI 92 [ arg ])
                         (reg/v:DI 0 x0 [ x0 ])
                         (reg/v:DI 1 x1 [ x1 ])
                         (mem:DI (reg/f:DI 93) [1 MEM[(long unsigned int *)0B]+0 S8 A128])
                     ]
                      [
                         (asm_input:DI ("r") test.c:17)
                         (asm_input:DI ("r") test.c:17)
                         (asm_input:DI ("r") test.c:17)
                         (asm_input:DI ("r") test.c:17)
                         (asm_input:DI ("r") test.c:17)
                         (asm_input:DI ("0") test.c:17)
                         (asm_input:DI ("1") test.c:17)
                         (asm_input:DI ("Q") test.c:17)
                     ]
                      [] test.c:17))
             (clobber (reg:DI 30 x30))
             (clobber (reg:DI 17 x17))
             (clobber (reg:DI 16 x16))
         ]) "test.c":17:3 -1
      (expr_list:REG_DEAD (reg/f:DI 93)
         (expr_list:REG_DEAD (reg/v:DI 92 [ arg ])
             (expr_list:REG_DEAD (reg/v:DI 4 x4 [ x4 ])
                 (expr_list:REG_DEAD (reg/v:DI 3 x3 [ x3 ])
                     (expr_list:REG_DEAD (reg/v:DI 2 x2 [ x2 ])
                         (expr_list:REG_UNUSED (reg:DI 30 x30)
                             (expr_list:REG_UNUSED (reg:DI 17 x17)
                                 (expr_list:REG_UNUSED (reg:DI 16 x16)
                                     (expr_list:REG_UNUSED (reg/v:DI 1 x1 [ x1 ])
                                         (nil)))))))))))


Regards,
Renlin


On 10/16/2018 03:24 AM, Peter Bergner wrote:
> On 10/11/18 10:40 PM, Jeff Law wrote:
>> On 10/11/18 1:23 PM, Peter Bergner wrote:
>>> 	* ira-lives (non_conflicting_reg_copy_p): Disable for non LRA targets.
>> So this helped the alpha & hppa and sh4.
>>
>> I'm still seeing failures on the aarch64, s390x.  No surprise on these
>> since they use LRA by default and would be unaffected by this patch.
> 
> Ok, I was able to reduce the aarch64 test case down to the minimum test case
> that still kept the kernel's __cmpxchg_double() function intact.  I tested
> the patch you're currently running on your builders which changed some
> of the "... == OP_OUT" to "... != OP_IN", etc and it doesn't fix the
> following test case, like it seems to fix the s390 issue and segher's
> small test case (both aarch64 and ppc64).
> 
> It's late here, so I'll start digging into this one in the morning.
> 
> Peter
> 
> 
> 
> bergner@pike:~/gcc/BUGS/PR87507/$ cat slub-min.c
> long
> __cmpxchg_double (unsigned long arg)
> {
>    unsigned long old1 = 0;
>    unsigned long old2 = arg;
>    unsigned long new1 = 0;
>    unsigned long new2 = 0;
>    volatile void *ptr = 0;
> 
>    unsigned long oldval1 = old1;
>    unsigned long oldval2 = old2;
>    register unsigned long x0 asm ("x0") = old1;
>    register unsigned long x1 asm ("x1") = old2;
>    register unsigned long x2 asm ("x2") = new1;
>    register unsigned long x3 asm ("x3") = new2;
>    register unsigned long x4 asm ("x4") = (unsigned long) ptr;
>    asm volatile ("	casp    %[old1], %[old2], %[new1], %[new2], %[v]\n"
> 		"	eor	%[old1], %[old1], %[oldval1]\n"
> 		"	eor	%[old2], %[old2], %[oldval2]\n"
> 		"	orr	%[old1], %[old1], %[old2]\n"
> 		: [old1] "+&r" (x0), [old2] "+&r" (x1), [v] "+Q" (* (unsigned long *) ptr)
> 		: [new1] "r" (x2), [new2] "r" (x3), [ptr] "r" (x4), [oldval1] "r" (oldval1),[oldval2] "r" (oldval2)
> 		: "x16", "x17", "x30");
>    return x0;
> }
> bergner@pike:~/gcc/BUGS/PR87507/$ /home/bergner/gcc/build/gcc-fsf-mainline-aarch64-r264897/gcc/xgcc -B/home/bergner/gcc/build/gcc-fsf-mainline-aarch64-r264897/gcc -O2 -march=armv8.1-a -c slub-min.c
> /tmp/ccQCkiSG.s: Assembler messages:
> /tmp/ccQCkiSG.s:24: Error: reg pair must be contiguous at operand 2 -- `casp x0,x6,x2,x3,[x5]'
> 
> 
> 

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 2/2 v3][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-11-01 18:50                                                 ` Renlin Li
@ 2018-11-01 20:35                                                   ` Segher Boessenkool
  2018-11-01 22:08                                                   ` Peter Bergner
  1 sibling, 0 replies; 66+ messages in thread
From: Segher Boessenkool @ 2018-11-01 20:35 UTC (permalink / raw)
  To: Renlin Li
  Cc: Peter Bergner, Jeff Law, Vladimir Makarov, Christophe Lyon, gcc Patches

Hi Renlin,

On Thu, Nov 01, 2018 at 06:50:22PM +0000, Renlin Li wrote:
> I got the following dump from the test case.
> 
> x1 is an early clobber operand in the inline assembly statement,
> r92 should conflict with x1?

Yes, I think so.  Or LRA should not pick something that was already a
hard register to spill (this should work without the earlyclobbers, too...
In this testcase that works anyhow, but that seems to be by accident).


Segher

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 2/2 v3][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-11-01 18:50                                                 ` Renlin Li
  2018-11-01 20:35                                                   ` Segher Boessenkool
@ 2018-11-01 22:08                                                   ` Peter Bergner
  2018-11-02 10:05                                                     ` Renlin Li
  2018-11-05 19:20                                                     ` Jeff Law
  1 sibling, 2 replies; 66+ messages in thread
From: Peter Bergner @ 2018-11-01 22:08 UTC (permalink / raw)
  To: Renlin Li
  Cc: Jeff Law, Vladimir Makarov, Christophe Lyon, gcc Patches,
	Segher Boessenkool

On 11/1/18 1:50 PM, Renlin Li wrote:
> Is there any update on this issues?
> arm-none-linux-gnueabihf native toolchain has been mis-compiled for a while.

From the analysis I've done, my commit is just exposing latent issues
in LRA.  Can you try the patch I submitted here to see if it helps?

  https://gcc.gnu.org/ml/gcc-patches/2018-10/msg01757.html

It survives on powerpc64le-linux, x86_64-linux and s390x-linux.
Jeff threw it on his testers and said he saw an arm issue and was
trying to come up with a test case for me to debug.

The specific issue you mentioned with the inline asm and the casp insn
is a bug in LRA where is will spill a user defined hard register and
it shouldn't do that.  My patch above stops that.  The question is
whether we've quashed the rest of the latent bugs.

Peter


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 2/2 v3][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-11-01 22:08                                                   ` Peter Bergner
@ 2018-11-02 10:05                                                     ` Renlin Li
  2018-11-05 19:20                                                     ` Jeff Law
  1 sibling, 0 replies; 66+ messages in thread
From: Renlin Li @ 2018-11-02 10:05 UTC (permalink / raw)
  To: Peter Bergner
  Cc: Jeff Law, Vladimir Makarov, Christophe Lyon, gcc Patches,
	Segher Boessenkool

Hi Peter,

On 11/01/2018 10:07 PM, Peter Bergner wrote:
> On 11/1/18 1:50 PM, Renlin Li wrote:
>> Is there any update on this issues?
>> arm-none-linux-gnueabihf native toolchain has been mis-compiled for a while.
> 
>  From the analysis I've done, my commit is just exposing latent issues
> in LRA.  

Yes, it looks like some latent issues are been exposed.


Can you try the patch I submitted here to see if it helps?
> 
>    https://gcc.gnu.org/ml/gcc-patches/2018-10/msg01757.html

Thanks for the patch! I'll help to test the patch and let you know the status.

Thanks,
Renlin

> 
> It survives on powerpc64le-linux, x86_64-linux and s390x-linux.
> Jeff threw it on his testers and said he saw an arm issue and was
> trying to come up with a test case for me to debug.
> 
> The specific issue you mentioned with the inline asm and the casp insn
> is a bug in LRA where is will spill a user defined hard register and
> it shouldn't do that.  My patch above stops that.  The question is
> whether we've quashed the rest of the latent bugs.
> 
> Peter
> 
> 

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 2/2 v3][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-11-01 22:08                                                   ` Peter Bergner
  2018-11-02 10:05                                                     ` Renlin Li
@ 2018-11-05 19:20                                                     ` Jeff Law
  2018-11-05 19:36                                                       ` Peter Bergner
  1 sibling, 1 reply; 66+ messages in thread
From: Jeff Law @ 2018-11-05 19:20 UTC (permalink / raw)
  To: Peter Bergner, Renlin Li
  Cc: Vladimir Makarov, Christophe Lyon, gcc Patches, Segher Boessenkool

On 11/1/18 4:07 PM, Peter Bergner wrote:
> On 11/1/18 1:50 PM, Renlin Li wrote:
>> Is there any update on this issues?
>> arm-none-linux-gnueabihf native toolchain has been mis-compiled for a while.
> 
> From the analysis I've done, my commit is just exposing latent issues
> in LRA.  Can you try the patch I submitted here to see if it helps?
> 
>   https://gcc.gnu.org/ml/gcc-patches/2018-10/msg01757.html
> 
> It survives on powerpc64le-linux, x86_64-linux and s390x-linux.
> Jeff threw it on his testers and said he saw an arm issue and was
> trying to come up with a test case for me to debug.
So I don't think the ARM issues are related to your patch, they may have
been related the combiner changes that went in around the same time.

At this point your patch appears to be DTRT across the board.  The only
fallout is the bogus s390 asm it caught in the kernel.

Jeff

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 2/2 v3][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-11-05 19:20                                                     ` Jeff Law
@ 2018-11-05 19:36                                                       ` Peter Bergner
  2018-11-05 19:41                                                         ` Jeff Law
  0 siblings, 1 reply; 66+ messages in thread
From: Peter Bergner @ 2018-11-05 19:36 UTC (permalink / raw)
  To: Jeff Law
  Cc: Renlin Li, Vladimir Makarov, Christophe Lyon, gcc Patches,
	Segher Boessenkool

On 11/5/18 1:20 PM, Jeff Law wrote:
> On 11/1/18 4:07 PM, Peter Bergner wrote:
>> On 11/1/18 1:50 PM, Renlin Li wrote:
>>> Is there any update on this issues?
>>> arm-none-linux-gnueabihf native toolchain has been mis-compiled for a while.
>>
>> From the analysis I've done, my commit is just exposing latent issues
>> in LRA.  Can you try the patch I submitted here to see if it helps?
>>
>>   https://gcc.gnu.org/ml/gcc-patches/2018-10/msg01757.html
>>
>> It survives on powerpc64le-linux, x86_64-linux and s390x-linux.
>> Jeff threw it on his testers and said he saw an arm issue and was
>> trying to come up with a test case for me to debug.
> So I don't think the ARM issues are related to your patch, they may have
> been related the combiner changes that went in around the same time.
> 
> At this point your patch appears to be DTRT across the board.  The only
> fallout is the bogus s390 asm it caught in the kernel.

Cool.  I will note that I contacted the s390 kernel guys and gave them a
fix to their broken constraints in that asm and they are going to fix it.

Is the above an approval to commit the patch mentioned above or do you
still want to wait until the ARM issues are fully resolved?

Peter

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 2/2 v3][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-11-05 19:36                                                       ` Peter Bergner
@ 2018-11-05 19:41                                                         ` Jeff Law
  2018-11-06 10:52                                                           ` Renlin Li
  0 siblings, 1 reply; 66+ messages in thread
From: Jeff Law @ 2018-11-05 19:41 UTC (permalink / raw)
  To: Peter Bergner
  Cc: Renlin Li, Vladimir Makarov, Christophe Lyon, gcc Patches,
	Segher Boessenkool

On 11/5/18 12:36 PM, Peter Bergner wrote:
> On 11/5/18 1:20 PM, Jeff Law wrote:
>> On 11/1/18 4:07 PM, Peter Bergner wrote:
>>> On 11/1/18 1:50 PM, Renlin Li wrote:
>>>> Is there any update on this issues?
>>>> arm-none-linux-gnueabihf native toolchain has been mis-compiled for a while.
>>>
>>> From the analysis I've done, my commit is just exposing latent issues
>>> in LRA.  Can you try the patch I submitted here to see if it helps?
>>>
>>>   https://gcc.gnu.org/ml/gcc-patches/2018-10/msg01757.html
>>>
>>> It survives on powerpc64le-linux, x86_64-linux and s390x-linux.
>>> Jeff threw it on his testers and said he saw an arm issue and was
>>> trying to come up with a test case for me to debug.
>> So I don't think the ARM issues are related to your patch, they may have
>> been related the combiner changes that went in around the same time.
>>
>> At this point your patch appears to be DTRT across the board.  The only
>> fallout is the bogus s390 asm it caught in the kernel.
> 
> Cool.  I will note that I contacted the s390 kernel guys and gave them a
> fix to their broken constraints in that asm and they are going to fix it.
Sounds good.  I've got a hack in my tester to "fix" that bogus asm until
the kernel folks do it right.


> 
> Is the above an approval to commit the patch mentioned above or do you
> still want to wait until the ARM issues are fully resolved?
I think knowing the patch addresses all the known issues related to the
earlier IRA/LRA change unblocks the review step.  I don't think we need
to wait for the other ARM issues to be resolved -- they seem to be
unrelated to the IRA/LRA changes.

jeff

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 2/2 v3][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-11-05 19:41                                                         ` Jeff Law
@ 2018-11-06 10:52                                                           ` Renlin Li
  2018-11-06 10:57                                                             ` Ramana Radhakrishnan
  2018-11-06 18:58                                                             ` Jeff Law
  0 siblings, 2 replies; 66+ messages in thread
From: Renlin Li @ 2018-11-06 10:52 UTC (permalink / raw)
  To: Jeff Law, Peter Bergner
  Cc: Vladimir Makarov, Christophe Lyon, gcc Patches,
	Segher Boessenkool, Ramana Radhakrishnan, Kyrill Tkachov

Hi Jeff & Peter,

On 11/05/2018 07:41 PM, Jeff Law wrote:
> On 11/5/18 12:36 PM, Peter Bergner wrote:
>> On 11/5/18 1:20 PM, Jeff Law wrote:
>>> On 11/1/18 4:07 PM, Peter Bergner wrote:
>>>> On 11/1/18 1:50 PM, Renlin Li wrote:
>>>>> Is there any update on this issues?
>>>>> arm-none-linux-gnueabihf native toolchain has been mis-compiled for a while.
>>>>
>>>>  From the analysis I've done, my commit is just exposing latent issues
>>>> in LRA.  Can you try the patch I submitted here to see if it helps?
>>>>
>>>>    https://gcc.gnu.org/ml/gcc-patches/2018-10/msg01757.html
>>>>
>>>> It survives on powerpc64le-linux, x86_64-linux and s390x-linux.
>>>> Jeff threw it on his testers and said he saw an arm issue and was
>>>> trying to come up with a test case for me to debug.
>>> So I don't think the ARM issues are related to your patch, they may have
>>> been related the combiner changes that went in around the same time.
Yes, there are issues related to the combiner changes.

But the IRA/LRA change dose cause the arm-none-linux-gnueabihf bootstrap native toolchain mis-compiled.
And the new patch seems not fix this problem.

I am trying to extract a test case, but it is a little bit hard as the toolchain itself is mis-compiled.
And it ICEs when compile test case with it.

I created a bugzilla ticket for this, PR87899.

./gcc/cc1 ~/gcc/./gcc/testsuite/gcc.c-torture/execute/pr36034-1.c  -O3
  test main
Analyzing compilation unit
Performing interprocedural optimizations
  <*free_lang_data> <visibility> <build_ssa_passes> <opt_local_passes> <targetclone> <free-fnsummary> <increase_alignment>Streaming LTO
  <whole-program> <profile_estimate> <icf> <devirt> <cp> <fnsummary> <inline> <pure-const> <free-fnsummary> <static-var> <single-use> 
<comdats>Assembling functions:
  <materialize-all-clones> testduring GIMPLE pass: ldist

gcc/./gcc/testsuite/gcc.c-torture/execute/pr36034-1.c: In function ‘test’:
gcc/./gcc/testsuite/gcc.c-torture/execute/pr36034-1.c:9:1: internal compiler error: Segmentation fault
     9 | test (void)
       | ^~~~
0x5c3a37 crash_signal
	../../gcc/gcc/toplev.c:325
0x63ef6b inchash::hash::add(void const*, unsigned int)
	../../gcc/gcc/inchash.h:100
0x63ef6b inchash::hash::add_ptr(void const*)
	../../gcc/gcc/inchash.h:94
0x63ef6b ddr_hasher::hash(data_dependence_relation const*)
	../../gcc/gcc/tree-loop-distribution.c:143
0x63ef6b hash_table<ddr_hasher, xcallocator>::find_slot(data_dependence_relation* const&, insert_option)
	../../gcc/gcc/hash-table.h:414
0x63ef6b get_data_dependence
	../../gcc/gcc/tree-loop-distribution.c:1184
0x63f2bd data_dep_in_cycle_p
	../../gcc/gcc/tree-loop-distribution.c:1210
0x63f2bd update_type_for_merge
	../../gcc/gcc/tree-loop-distribution.c:1255
0x64064b build_rdg_partition_for_vertex
	../../gcc/gcc/tree-loop-distribution.c:1302
0x64064b rdg_build_partitions
	../../gcc/gcc/tree-loop-distribution.c:1754
0x64064b distribute_loop
	../../gcc/gcc/tree-loop-distribution.c:2795
0x642299 execute
	../../gcc/gcc/tree-loop-distribution.c:3133
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.



Regards
Renlin



>>>
>>> At this point your patch appears to be DTRT across the board.  The only
>>> fallout is the bogus s390 asm it caught in the kernel.
>>
>> Cool.  I will note that I contacted the s390 kernel guys and gave them a
>> fix to their broken constraints in that asm and they are going to fix it.
> Sounds good.  I've got a hack in my tester to "fix" that bogus asm until
> the kernel folks do it right.
> 
> 
>>
>> Is the above an approval to commit the patch mentioned above or do you
>> still want to wait until the ARM issues are fully resolved?
> I think knowing the patch addresses all the known issues related to the
> earlier IRA/LRA change unblocks the review step.  I don't think we need
> to wait for the other ARM issues to be resolved -- they seem to be
> unrelated to the IRA/LRA changes.
> 
> jeff
> 

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 2/2 v3][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-11-06 10:52                                                           ` Renlin Li
@ 2018-11-06 10:57                                                             ` Ramana Radhakrishnan
  2018-11-06 12:23                                                               ` Renlin Li
  2018-11-06 18:58                                                             ` Jeff Law
  1 sibling, 1 reply; 66+ messages in thread
From: Ramana Radhakrishnan @ 2018-11-06 10:57 UTC (permalink / raw)
  To: Renlin Li
  Cc: Jeff Law, bergner, Vladimir Makarov, Christophe Lyon,
	gcc-patches, Segher Boessenkool, Ramana Radhakrishnan,
	Kyrylo Tkachov

On Tue, Nov 6, 2018 at 10:52 AM Renlin Li <renlin.li@foss.arm.com> wrote:
>
> Hi Jeff & Peter,
>
> On 11/05/2018 07:41 PM, Jeff Law wrote:
> > On 11/5/18 12:36 PM, Peter Bergner wrote:
> >> On 11/5/18 1:20 PM, Jeff Law wrote:
> >>> On 11/1/18 4:07 PM, Peter Bergner wrote:
> >>>> On 11/1/18 1:50 PM, Renlin Li wrote:
> >>>>> Is there any update on this issues?
> >>>>> arm-none-linux-gnueabihf native toolchain has been mis-compiled for a while.
> >>>>
> >>>>  From the analysis I've done, my commit is just exposing latent issues
> >>>> in LRA.  Can you try the patch I submitted here to see if it helps?
> >>>>
> >>>>    https://gcc.gnu.org/ml/gcc-patches/2018-10/msg01757.html
> >>>>
> >>>> It survives on powerpc64le-linux, x86_64-linux and s390x-linux.
> >>>> Jeff threw it on his testers and said he saw an arm issue and was
> >>>> trying to come up with a test case for me to debug.
> >>> So I don't think the ARM issues are related to your patch, they may have
> >>> been related the combiner changes that went in around the same time.
> Yes, there are issues related to the combiner changes.

But didn't the combiner changes come *after* these patches ? So IIUC,
Renlin has been trying to get these fixed *without* the combine
patches but just with your patch applied on top of the revision where
the problem started showing up .

Can you confirm that Renlin ?


Ramana
>
> But the IRA/LRA change dose cause the arm-none-linux-gnueabihf bootstrap native toolchain mis-compiled.
> And the new patch seems not fix this problem.
>
> I am trying to extract a test case, but it is a little bit hard as the toolchain itself is mis-compiled.
> And it ICEs when compile test case with it.
>
> I created a bugzilla ticket for this, PR87899.
>
> ./gcc/cc1 ~/gcc/./gcc/testsuite/gcc.c-torture/execute/pr36034-1.c  -O3
>   test main
> Analyzing compilation unit
> Performing interprocedural optimizations
>   <*free_lang_data> <visibility> <build_ssa_passes> <opt_local_passes> <targetclone> <free-fnsummary> <increase_alignment>Streaming LTO
>   <whole-program> <profile_estimate> <icf> <devirt> <cp> <fnsummary> <inline> <pure-const> <free-fnsummary> <static-var> <single-use>
> <comdats>Assembling functions:
>   <materialize-all-clones> testduring GIMPLE pass: ldist
>
> gcc/./gcc/testsuite/gcc.c-torture/execute/pr36034-1.c: In function ‘test’:
> gcc/./gcc/testsuite/gcc.c-torture/execute/pr36034-1.c:9:1: internal compiler error: Segmentation fault
>      9 | test (void)
>        | ^~~~
> 0x5c3a37 crash_signal
>         ../../gcc/gcc/toplev.c:325
> 0x63ef6b inchash::hash::add(void const*, unsigned int)
>         ../../gcc/gcc/inchash.h:100
> 0x63ef6b inchash::hash::add_ptr(void const*)
>         ../../gcc/gcc/inchash.h:94
> 0x63ef6b ddr_hasher::hash(data_dependence_relation const*)
>         ../../gcc/gcc/tree-loop-distribution.c:143
> 0x63ef6b hash_table<ddr_hasher, xcallocator>::find_slot(data_dependence_relation* const&, insert_option)
>         ../../gcc/gcc/hash-table.h:414
> 0x63ef6b get_data_dependence
>         ../../gcc/gcc/tree-loop-distribution.c:1184
> 0x63f2bd data_dep_in_cycle_p
>         ../../gcc/gcc/tree-loop-distribution.c:1210
> 0x63f2bd update_type_for_merge
>         ../../gcc/gcc/tree-loop-distribution.c:1255
> 0x64064b build_rdg_partition_for_vertex
>         ../../gcc/gcc/tree-loop-distribution.c:1302
> 0x64064b rdg_build_partitions
>         ../../gcc/gcc/tree-loop-distribution.c:1754
> 0x64064b distribute_loop
>         ../../gcc/gcc/tree-loop-distribution.c:2795
> 0x642299 execute
>         ../../gcc/gcc/tree-loop-distribution.c:3133
> Please submit a full bug report,
> with preprocessed source if appropriate.
> Please include the complete backtrace with any bug report.
> See <https://gcc.gnu.org/bugs/> for instructions.
>
>
>
> Regards
> Renlin
>
>
>
> >>>
> >>> At this point your patch appears to be DTRT across the board.  The only
> >>> fallout is the bogus s390 asm it caught in the kernel.
> >>
> >> Cool.  I will note that I contacted the s390 kernel guys and gave them a
> >> fix to their broken constraints in that asm and they are going to fix it.
> > Sounds good.  I've got a hack in my tester to "fix" that bogus asm until
> > the kernel folks do it right.
> >
> >
> >>
> >> Is the above an approval to commit the patch mentioned above or do you
> >> still want to wait until the ARM issues are fully resolved?
> > I think knowing the patch addresses all the known issues related to the
> > earlier IRA/LRA change unblocks the review step.  I don't think we need
> > to wait for the other ARM issues to be resolved -- they seem to be
> > unrelated to the IRA/LRA changes.
> >
> > jeff
> >

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 2/2 v3][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-11-06 10:57                                                             ` Ramana Radhakrishnan
@ 2018-11-06 12:23                                                               ` Renlin Li
  2018-11-06 18:46                                                                 ` Peter Bergner
  0 siblings, 1 reply; 66+ messages in thread
From: Renlin Li @ 2018-11-06 12:23 UTC (permalink / raw)
  To: Ramana Radhakrishnan
  Cc: Jeff Law, bergner, Vladimir Makarov, Christophe Lyon,
	gcc-patches, Segher Boessenkool, Ramana Radhakrishnan,
	Kyrylo Tkachov

Hi Ramana,

On 11/06/2018 10:57 AM, Ramana Radhakrishnan wrote:
> On Tue, Nov 6, 2018 at 10:52 AM Renlin Li <renlin.li@foss.arm.com> wrote:
>>
>> Hi Jeff & Peter,
>>
>> On 11/05/2018 07:41 PM, Jeff Law wrote:
>>> On 11/5/18 12:36 PM, Peter Bergner wrote:
>>>> On 11/5/18 1:20 PM, Jeff Law wrote:
>>>>> On 11/1/18 4:07 PM, Peter Bergner wrote:
>>>>>> On 11/1/18 1:50 PM, Renlin Li wrote:
>>>>>>> Is there any update on this issues?
>>>>>>> arm-none-linux-gnueabihf native toolchain has been mis-compiled for a while.
>>>>>>
>>>>>>   From the analysis I've done, my commit is just exposing latent issues
>>>>>> in LRA.  Can you try the patch I submitted here to see if it helps?
>>>>>>
>>>>>>     https://gcc.gnu.org/ml/gcc-patches/2018-10/msg01757.html
>>>>>>
>>>>>> It survives on powerpc64le-linux, x86_64-linux and s390x-linux.
>>>>>> Jeff threw it on his testers and said he saw an arm issue and was
>>>>>> trying to come up with a test case for me to debug.
>>>>> So I don't think the ARM issues are related to your patch, they may have
>>>>> been related the combiner changes that went in around the same time.
>> Yes, there are issues related to the combiner changes.
> 
> But didn't the combiner changes come *after* these patches ? So IIUC,
> Renlin has been trying to get these fixed *without* the combine
> patches but just with your patch applied on top of the revision where
> the problem started showing up .
> 
> Can you confirm that Renlin ?

I just did a bootstrap again with everything up to r264897 which is Oct 6.
it produce the ICE I mentioned on the PR87899.

The first combiner patch on Oct 22.

Regards,
Renlin

> 
> 
> Ramana
>>
>> But the IRA/LRA change dose cause the arm-none-linux-gnueabihf bootstrap native toolchain mis-compiled.
>> And the new patch seems not fix this problem.
>>
>> I am trying to extract a test case, but it is a little bit hard as the toolchain itself is mis-compiled.
>> And it ICEs when compile test case with it.
>>
>> I created a bugzilla ticket for this, PR87899.
>>
>> ./gcc/cc1 ~/gcc/./gcc/testsuite/gcc.c-torture/execute/pr36034-1.c  -O3
>>    test main
>> Analyzing compilation unit
>> Performing interprocedural optimizations
>>    <*free_lang_data> <visibility> <build_ssa_passes> <opt_local_passes> <targetclone> <free-fnsummary> <increase_alignment>Streaming LTO
>>    <whole-program> <profile_estimate> <icf> <devirt> <cp> <fnsummary> <inline> <pure-const> <free-fnsummary> <static-var> <single-use>
>> <comdats>Assembling functions:
>>    <materialize-all-clones> testduring GIMPLE pass: ldist
>>
>> gcc/./gcc/testsuite/gcc.c-torture/execute/pr36034-1.c: In function ‘test’:
>> gcc/./gcc/testsuite/gcc.c-torture/execute/pr36034-1.c:9:1: internal compiler error: Segmentation fault
>>       9 | test (void)
>>         | ^~~~
>> 0x5c3a37 crash_signal
>>          ../../gcc/gcc/toplev.c:325
>> 0x63ef6b inchash::hash::add(void const*, unsigned int)
>>          ../../gcc/gcc/inchash.h:100
>> 0x63ef6b inchash::hash::add_ptr(void const*)
>>          ../../gcc/gcc/inchash.h:94
>> 0x63ef6b ddr_hasher::hash(data_dependence_relation const*)
>>          ../../gcc/gcc/tree-loop-distribution.c:143
>> 0x63ef6b hash_table<ddr_hasher, xcallocator>::find_slot(data_dependence_relation* const&, insert_option)
>>          ../../gcc/gcc/hash-table.h:414
>> 0x63ef6b get_data_dependence
>>          ../../gcc/gcc/tree-loop-distribution.c:1184
>> 0x63f2bd data_dep_in_cycle_p
>>          ../../gcc/gcc/tree-loop-distribution.c:1210
>> 0x63f2bd update_type_for_merge
>>          ../../gcc/gcc/tree-loop-distribution.c:1255
>> 0x64064b build_rdg_partition_for_vertex
>>          ../../gcc/gcc/tree-loop-distribution.c:1302
>> 0x64064b rdg_build_partitions
>>          ../../gcc/gcc/tree-loop-distribution.c:1754
>> 0x64064b distribute_loop
>>          ../../gcc/gcc/tree-loop-distribution.c:2795
>> 0x642299 execute
>>          ../../gcc/gcc/tree-loop-distribution.c:3133
>> Please submit a full bug report,
>> with preprocessed source if appropriate.
>> Please include the complete backtrace with any bug report.
>> See <https://gcc.gnu.org/bugs/> for instructions.
>>
>>
>>
>> Regards
>> Renlin
>>
>>
>>
>>>>>
>>>>> At this point your patch appears to be DTRT across the board.  The only
>>>>> fallout is the bogus s390 asm it caught in the kernel.
>>>>
>>>> Cool.  I will note that I contacted the s390 kernel guys and gave them a
>>>> fix to their broken constraints in that asm and they are going to fix it.
>>> Sounds good.  I've got a hack in my tester to "fix" that bogus asm until
>>> the kernel folks do it right.
>>>
>>>
>>>>
>>>> Is the above an approval to commit the patch mentioned above or do you
>>>> still want to wait until the ARM issues are fully resolved?
>>> I think knowing the patch addresses all the known issues related to the
>>> earlier IRA/LRA change unblocks the review step.  I don't think we need
>>> to wait for the other ARM issues to be resolved -- they seem to be
>>> unrelated to the IRA/LRA changes.
>>>
>>> jeff
>>>

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 2/2 v3][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-11-06 12:23                                                               ` Renlin Li
@ 2018-11-06 18:46                                                                 ` Peter Bergner
  0 siblings, 0 replies; 66+ messages in thread
From: Peter Bergner @ 2018-11-06 18:46 UTC (permalink / raw)
  To: Renlin Li
  Cc: Ramana Radhakrishnan, Jeff Law, Vladimir Makarov,
	Christophe Lyon, gcc-patches, Segher Boessenkool,
	Ramana Radhakrishnan, Kyrylo Tkachov

On 11/6/18 6:23 AM, Renlin Li wrote:
> I just did a bootstrap again with everything up to r264897 which is Oct 6.
> it produce the ICE I mentioned on the PR87899.
> 
> The first combiner patch on Oct 22.

Do the testsuite results (for disable-bootstrap builds) differ between
r264896 and r264897?  If so, that would be much easier to track down.

If not, maybe the following patch could help to narrow down which gcc
source file(s) are being miscompiled by allowing you to disable the
special handling of copy conflicts with an option?  The option default
(ie, not using the option or -fno-ira-copies-conflict) is the same behavior
as now and -fira-copies-conflict would make things behave like they did
before my patch.

Peter


Index: gcc/common.opt
===================================================================
--- gcc/common.opt	(revision 265402)
+++ gcc/common.opt	(working copy)
@@ -1761,6 +1761,10 @@ Enum(ira_region) String(all) Value(IRA_R
 EnumValue
 Enum(ira_region) String(mixed) Value(IRA_REGION_MIXED)
 
+fira-copies-conflict
+Common Report Var(flag_ira_copies_conflict) Init(0) Optimization
+Make pseudos connected by a copy conflict
+
 fira-hoist-pressure
 Common Report Var(flag_ira_hoist_pressure) Init(1) Optimization
 Use IRA based register pressure calculation
Index: gcc/ira-lives.c
===================================================================
--- gcc/ira-lives.c	(revision 265402)
+++ gcc/ira-lives.c	(working copy)
@@ -1066,7 +1066,7 @@ non_conflicting_reg_copy_p (rtx_insn *in
 {
   /* Reload has issues with overlapping pseudos being assigned to the
      same hard register, so don't allow it.  See PR87600 for details.  */
-  if (!targetm.lra_p ())
+  if (flag_ira_copies_conflict || !targetm.lra_p ())
     return NULL_RTX;
 
   rtx set = single_set (insn);

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 2/2 v3][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-11-06 10:52                                                           ` Renlin Li
  2018-11-06 10:57                                                             ` Ramana Radhakrishnan
@ 2018-11-06 18:58                                                             ` Jeff Law
  2018-11-08 10:57                                                               ` Renlin Li
  1 sibling, 1 reply; 66+ messages in thread
From: Jeff Law @ 2018-11-06 18:58 UTC (permalink / raw)
  To: Renlin Li, Peter Bergner
  Cc: Vladimir Makarov, Christophe Lyon, gcc Patches,
	Segher Boessenkool, Ramana Radhakrishnan, Kyrill Tkachov

On 11/6/18 3:52 AM, Renlin Li wrote:
> Hi Jeff & Peter,
> 
> On 11/05/2018 07:41 PM, Jeff Law wrote:
>> On 11/5/18 12:36 PM, Peter Bergner wrote:
>>> On 11/5/18 1:20 PM, Jeff Law wrote:
>>>> On 11/1/18 4:07 PM, Peter Bergner wrote:
>>>>> On 11/1/18 1:50 PM, Renlin Li wrote:
>>>>>> Is there any update on this issues?
>>>>>> arm-none-linux-gnueabihf native toolchain has been mis-compiled
>>>>>> for a while.
>>>>>
>>>>>  From the analysis I've done, my commit is just exposing latent issues
>>>>> in LRA.  Can you try the patch I submitted here to see if it helps?
>>>>>
>>>>>    https://gcc.gnu.org/ml/gcc-patches/2018-10/msg01757.html
>>>>>
>>>>> It survives on powerpc64le-linux, x86_64-linux and s390x-linux.
>>>>> Jeff threw it on his testers and said he saw an arm issue and was
>>>>> trying to come up with a test case for me to debug.
>>>> So I don't think the ARM issues are related to your patch, they may
>>>> have
>>>> been related the combiner changes that went in around the same time.
> Yes, there are issues related to the combiner changes.
> 
> But the IRA/LRA change dose cause the arm-none-linux-gnueabihf bootstrap
> native toolchain mis-compiled.
> And the new patch seems not fix this problem.
That's strange.  I'm bootstrapping arm-linux-gnueabihf daily with qemu +
a suitable root filesystem using Peter's most recent testing patch.


> 
> I am trying to extract a test case, but it is a little bit hard as the
> toolchain itself is mis-compiled.
> And it ICEs when compile test case with it.
What I would suggest doing is to first start with running the testsuite
against the stage1 compiler before/after Peter's changes.  Sometimes
that'll turn up something useful and you can avoid debuging things
through stage2/stage3.


jeff

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 2/2 v3][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-11-06 18:58                                                             ` Jeff Law
@ 2018-11-08 10:57                                                               ` Renlin Li
  2018-11-08 11:49                                                                 ` Richard Biener
                                                                                   ` (2 more replies)
  0 siblings, 3 replies; 66+ messages in thread
From: Renlin Li @ 2018-11-08 10:57 UTC (permalink / raw)
  To: Jeff Law, Peter Bergner
  Cc: Vladimir Makarov, Christophe Lyon, gcc Patches,
	Segher Boessenkool, Ramana Radhakrishnan, Kyrill Tkachov,
	Wilco Dijkstra

[-- Attachment #1: Type: text/plain, Size: 3097 bytes --]

Hi,

On 11/06/2018 06:58 PM, Jeff Law wrote:
> On 11/6/18 3:52 AM, Renlin Li wrote:
>> Hi Jeff & Peter,
>>
>> On 11/05/2018 07:41 PM, Jeff Law wrote:
>>> On 11/5/18 12:36 PM, Peter Bergner wrote:
>>>> On 11/5/18 1:20 PM, Jeff Law wrote:
>>>>> On 11/1/18 4:07 PM, Peter Bergner wrote:
>>>>>> On 11/1/18 1:50 PM, Renlin Li wrote:
>>>>>>> Is there any update on this issues?
>>>>>>> arm-none-linux-gnueabihf native toolchain has been mis-compiled
>>>>>>> for a while.
>>>>>>
>>>>>>   From the analysis I've done, my commit is just exposing latent issues
>>>>>> in LRA.  Can you try the patch I submitted here to see if it helps?
>>>>>>
>>>>>>     https://gcc.gnu.org/ml/gcc-patches/2018-10/msg01757.html
>>>>>>
>>>>>> It survives on powerpc64le-linux, x86_64-linux and s390x-linux.
>>>>>> Jeff threw it on his testers and said he saw an arm issue and was
>>>>>> trying to come up with a test case for me to debug.
>>>>> So I don't think the ARM issues are related to your patch, they may
>>>>> have
>>>>> been related the combiner changes that went in around the same time.
>> Yes, there are issues related to the combiner changes.
>>
>> But the IRA/LRA change dose cause the arm-none-linux-gnueabihf bootstrap
>> native toolchain mis-compiled.
>> And the new patch seems not fix this problem.
> That's strange.  I'm bootstrapping arm-linux-gnueabihf daily with qemu +
> a suitable root filesystem using Peter's most recent testing patch.
> 
> 
>>
>> I am trying to extract a test case, but it is a little bit hard as the
>> toolchain itself is mis-compiled.
>> And it ICEs when compile test case with it.
> What I would suggest doing is to first start with running the testsuite
> against the stage1 compiler before/after Peter's changes.  Sometimes
> that'll turn up something useful and you can avoid debuging things
> through stage2/stage3.

Hi Jeff,
Thanks for the suggestion! I could reproduce it with stage1 compiler.



Hi Peter,

I think I found the problem!

As described in the PR, a hard register is used in
an pre/post modify expression. The hard register is live, but updated.
In this case, we should make it conflicting with all pseudos live at
that point.  Does it make sense?





It fixes the ICE of mis-compiled arm-linux-gnueabihf toolchain described in the
PR.

I attached the patch for discussion.  I haven't give a complete test on arm or
any other targets, yet. (Probably need more adjusting)

I will run arm and aarch64 regression test, cross and native.

Regards,
Renlin

BTW, The pre/post modify expression is generated by auto_inc/auto_dec pass.
somehow, it merges with hard register,  for example function argument registers.
This optimization make the life for RA harder. Probably we don't want that pass
too aggressive. @Wilco.
(This IRA/LRA and the combiner change reveals a lot of issues,
force us to work on it and improve the compiler :) .)

gcc/ChangeLog:

2018-11-08  Renlin Li  <renlin.li@arm.com>
         PR middle-end/87899
	* lra-lives.c (process_bb_lives): Make hard register of INOUT
	type conflict with all live pseudo.







> 
> 
> jeff
> 

[-- Attachment #2: lra.patch --]
[-- Type: text/x-patch, Size: 1338 bytes --]

diff --git a/gcc/lra-lives.c b/gcc/lra-lives.c
index 0bf8cd06a302c8a6fcb914b94f953cdaa86597a2..370a7254cac7dbde4e320424e09274cee66c50b9 100644
--- a/gcc/lra-lives.c
+++ b/gcc/lra-lives.c
@@ -878,11 +878,25 @@ process_bb_lives (basic_block bb, int &curr_point, bool dead_insn_p)
 
       /* See which defined values die here.  */
       for (reg = curr_id->regs; reg != NULL; reg = reg->next)
-	if (reg->type == OP_OUT
-	    && ! reg_early_clobber_p (reg, n_alt) && ! reg->subreg_p)
-	  need_curr_point_incr
-	    |= mark_regno_dead (reg->regno, reg->biggest_mode,
-				curr_point);
+	if (! reg_early_clobber_p (reg, n_alt) && ! reg->subreg_p)
+	  {
+	    if (reg->type == OP_OUT)
+	      need_curr_point_incr
+		|= mark_regno_dead (reg->regno, reg->biggest_mode,
+				    curr_point);
+
+	    // This is a hard register, and it must be live.  Keep it live and
+	    // make it conflict with all live pseudo registers.
+	    else if (reg->type == OP_INOUT && reg->regno < FIRST_PSEUDO_REGISTER)
+	      {
+		lra_assert (TEST_HARD_REG_BIT (hard_regs_live, reg->regno));
+
+		unsigned int i;
+		EXECUTE_IF_SET_IN_SPARSESET (pseudos_live, i)
+		  SET_HARD_REG_BIT (lra_reg_info[i].conflict_hard_regs,
+				    reg->regno);
+	      }
+	  }
 
       for (reg = curr_static_id->hard_regs; reg != NULL; reg = reg->next)
 	if (reg->type == OP_OUT

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 2/2 v3][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-11-08 10:57                                                               ` Renlin Li
@ 2018-11-08 11:49                                                                 ` Richard Biener
  2018-11-08 14:29                                                                   ` Peter Bergner
  2018-11-08 12:35                                                                 ` Peter Bergner
  2018-11-08 15:21                                                                 ` Peter Bergner
  2 siblings, 1 reply; 66+ messages in thread
From: Richard Biener @ 2018-11-08 11:49 UTC (permalink / raw)
  To: renlin.li
  Cc: Jeff Law, bergner, Vladimir N. Makarov, Christophe Lyon,
	GCC Patches, Segher Boessenkool, Ramana Radhakrishnan,
	kyrylo.tkachov, Wilco Dijkstra

On Thu, Nov 8, 2018 at 11:57 AM Renlin Li <renlin.li@foss.arm.com> wrote:
>
> Hi,
>
> On 11/06/2018 06:58 PM, Jeff Law wrote:
> > On 11/6/18 3:52 AM, Renlin Li wrote:
> >> Hi Jeff & Peter,
> >>
> >> On 11/05/2018 07:41 PM, Jeff Law wrote:
> >>> On 11/5/18 12:36 PM, Peter Bergner wrote:
> >>>> On 11/5/18 1:20 PM, Jeff Law wrote:
> >>>>> On 11/1/18 4:07 PM, Peter Bergner wrote:
> >>>>>> On 11/1/18 1:50 PM, Renlin Li wrote:
> >>>>>>> Is there any update on this issues?
> >>>>>>> arm-none-linux-gnueabihf native toolchain has been mis-compiled
> >>>>>>> for a while.
> >>>>>>
> >>>>>>   From the analysis I've done, my commit is just exposing latent issues
> >>>>>> in LRA.  Can you try the patch I submitted here to see if it helps?
> >>>>>>
> >>>>>>     https://gcc.gnu.org/ml/gcc-patches/2018-10/msg01757.html
> >>>>>>
> >>>>>> It survives on powerpc64le-linux, x86_64-linux and s390x-linux.
> >>>>>> Jeff threw it on his testers and said he saw an arm issue and was
> >>>>>> trying to come up with a test case for me to debug.
> >>>>> So I don't think the ARM issues are related to your patch, they may
> >>>>> have
> >>>>> been related the combiner changes that went in around the same time.
> >> Yes, there are issues related to the combiner changes.
> >>
> >> But the IRA/LRA change dose cause the arm-none-linux-gnueabihf bootstrap
> >> native toolchain mis-compiled.
> >> And the new patch seems not fix this problem.
> > That's strange.  I'm bootstrapping arm-linux-gnueabihf daily with qemu +
> > a suitable root filesystem using Peter's most recent testing patch.
> >
> >
> >>
> >> I am trying to extract a test case, but it is a little bit hard as the
> >> toolchain itself is mis-compiled.
> >> And it ICEs when compile test case with it.
> > What I would suggest doing is to first start with running the testsuite
> > against the stage1 compiler before/after Peter's changes.  Sometimes
> > that'll turn up something useful and you can avoid debuging things
> > through stage2/stage3.
>
> Hi Jeff,
> Thanks for the suggestion! I could reproduce it with stage1 compiler.
>
>
>
> Hi Peter,
>
> I think I found the problem!
>
> As described in the PR, a hard register is used in
> an pre/post modify expression. The hard register is live, but updated.
> In this case, we should make it conflicting with all pseudos live at
> that point.  Does it make sense?

Err, that looks very much like a hack that manages to hide the issue.

Esp. adding conflicts in a loop that says "See which defined values die here."
is quite fishy.

Richard.

>
>
>
>
> It fixes the ICE of mis-compiled arm-linux-gnueabihf toolchain described in the
> PR.
>
> I attached the patch for discussion.  I haven't give a complete test on arm or
> any other targets, yet. (Probably need more adjusting)
>
> I will run arm and aarch64 regression test, cross and native.
>
> Regards,
> Renlin
>
> BTW, The pre/post modify expression is generated by auto_inc/auto_dec pass.
> somehow, it merges with hard register,  for example function argument registers.
> This optimization make the life for RA harder. Probably we don't want that pass
> too aggressive. @Wilco.
> (This IRA/LRA and the combiner change reveals a lot of issues,
> force us to work on it and improve the compiler :) .)
>
> gcc/ChangeLog:
>
> 2018-11-08  Renlin Li  <renlin.li@arm.com>
>          PR middle-end/87899
>         * lra-lives.c (process_bb_lives): Make hard register of INOUT
>         type conflict with all live pseudo.
>
>
>
>
>
>
>
> >
> >
> > jeff
> >

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 2/2 v3][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-11-08 10:57                                                               ` Renlin Li
  2018-11-08 11:49                                                                 ` Richard Biener
@ 2018-11-08 12:35                                                                 ` Peter Bergner
  2018-11-08 13:42                                                                   ` Renlin Li
  2018-11-08 15:21                                                                 ` Peter Bergner
  2 siblings, 1 reply; 66+ messages in thread
From: Peter Bergner @ 2018-11-08 12:35 UTC (permalink / raw)
  To: Renlin Li
  Cc: Jeff Law, Vladimir Makarov, Christophe Lyon, gcc Patches,
	Segher Boessenkool, Ramana Radhakrishnan, Kyrill Tkachov,
	Wilco Dijkstra

On 11/8/18 4:57 AM, Renlin Li wrote:
> I think I found the problem!
> 
> As described in the PR, a hard register is used in
> an pre/post modify expression. The hard register is live, but updated.
> In this case, we should make it conflicting with all pseudos live at
> that point.  Does it make sense?

Do you have a reproducer test case I can look at?  I'd like to see the
problematical rtl to help me determine whether your patch is correct
or not.  ...and thank you for debugging this!

Peter

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 2/2 v3][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-11-08 12:35                                                                 ` Peter Bergner
@ 2018-11-08 13:42                                                                   ` Renlin Li
  0 siblings, 0 replies; 66+ messages in thread
From: Renlin Li @ 2018-11-08 13:42 UTC (permalink / raw)
  To: Peter Bergner
  Cc: Jeff Law, Vladimir Makarov, Christophe Lyon, gcc Patches,
	Segher Boessenkool, Ramana Radhakrishnan, Kyrill Tkachov,
	Wilco Dijkstra

Hi Peter,

On 11/08/2018 12:35 PM, Peter Bergner wrote:
> On 11/8/18 4:57 AM, Renlin Li wrote:
>> I think I found the problem!
>>
>> As described in the PR, a hard register is used in
>> an pre/post modify expression. The hard register is live, but updated.
>> In this case, we should make it conflicting with all pseudos live at
>> that point.  Does it make sense?
> 
> Do you have a reproducer test case I can look at?  I'd like to see the
> problematical rtl to help me determine whether your patch is correct
> or not.  ...and thank you for debugging this!
> 
> Peter
> 

Sure! (I was trying to send the mail, but it failed with large attachment.)
I attached the dump file in the bugzilla ticket: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87899
I remove the unrelated dump (tar -xzvf xxx.tgz)

The code you want to check is the following in ira pass:
insn 10905: r1 = r2040
insn 208: use and update r1 with pre_modify
insn 191: use pseudo r2040

I could not create a test case. This dump is created with stage1 compiler compiling next stage compiler.
The (not helpful) command line is:


/home/renlin/try-new/./prev-gcc/cc1plus -quiet -nostdinc++ -v -I 
/home/renlin/try-new/prev-arm-none-linux-gnueabihf/libstdc++-v3/include/arm-none-linux-gnueabihf -I 
/home/renlin/try-new/prev-arm-none-linux-gnueabihf/libstdc++-v3/include -I /home/renlin/gcc/libstdc++-v3/libsupc++ -I . -I . -I ../../gcc/gcc -I 
../../gcc/gcc/. -I ../../gcc/gcc/../include -I ../../gcc/gcc/../libcpp/include -I ../../gcc/gcc/../libdecnumber -I ../../gcc/gcc/../libdecnumber/dpd 
-I ../libdecnumber -I ../../gcc/gcc/../libbacktrace -imultilib . -imultiarch arm-linux-gnueabihf -iprefix 
/home/renlin/try-new/prev-gcc/../lib/gcc/arm-none-linux-gnueabihf/9.0.0/ -isystem /home/renlin/try-new/./prev-gcc/include -isystem 
/home/renlin/try-new/./prev-gcc/include-fixed -MMD tree-loop-distribution.d -MF ./.deps/tree-loop-distribution.TPo -MP -MT tree-loop-distribution.o 
-D_GNU_SOURCE -D IN_GCC -D HAVE_CONFIG_H ../../gcc/gcc/tree-loop-distribution.c -quiet -dumpbase tree-loop-distribution.c -mfloat-abi=hard -mfpu=neon 
-mthumb -mtls-dialect=gnu -march=armv7-a+simd -auxbase-strip tree-loop-distribution.s -g -gtoggle -O2 -Wextra -Wall -Wno-narrowing -Wwrite-strings 
-Wcast-qual -Wsuggest-attribute=format -Woverloaded-virtual -Wpedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -Werror -version 
-fno-PIE -fno-checking -fno-exceptions -fno-rtti -fasynchronous-unwind-tables -fno-common -o tree-loop-distribution.s

Regards,
Renlin

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 2/2 v3][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-11-08 11:49                                                                 ` Richard Biener
@ 2018-11-08 14:29                                                                   ` Peter Bergner
  2018-11-08 19:01                                                                     ` Peter Bergner
  0 siblings, 1 reply; 66+ messages in thread
From: Peter Bergner @ 2018-11-08 14:29 UTC (permalink / raw)
  To: Richard Biener
  Cc: renlin.li, Jeff Law, Vladimir N. Makarov, Christophe Lyon,
	GCC Patches, Segher Boessenkool, Ramana Radhakrishnan,
	kyrylo.tkachov, Wilco Dijkstra

On 11/8/18 5:48 AM, Richard Biener wrote:
> Err, that looks very much like a hack that manages to hide the issue.

It's true we do not want to hide the issue by adding unneeded conflicts,
since that can lead to unnecessary spills.  However, ...


> Esp. adding conflicts in a loop that says "See which defined values die here."
> is quite fishy.

..the original loop is dealing with some of the gory details you never read
about in academic RA papers.  This code is used to catch the case where an insn
defines a register(s) that is never used.  Because it is never used, it never
ends up in the "live" (ie, live and available) set, which can cause us to miss
some required conflicts.

That said, I still need to look at the RTL from the bad program before
determining whether the patch is correct or not.  Computing accurate
conflict information is a delicate thing.

Peter

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 2/2 v3][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-11-08 10:57                                                               ` Renlin Li
  2018-11-08 11:49                                                                 ` Richard Biener
  2018-11-08 12:35                                                                 ` Peter Bergner
@ 2018-11-08 15:21                                                                 ` Peter Bergner
  2018-11-08 16:20                                                                   ` Renlin Li
  2 siblings, 1 reply; 66+ messages in thread
From: Peter Bergner @ 2018-11-08 15:21 UTC (permalink / raw)
  To: Renlin Li
  Cc: Jeff Law, Vladimir Makarov, Christophe Lyon, gcc Patches,
	Segher Boessenkool, Ramana Radhakrishnan, Kyrill Tkachov,
	Wilco Dijkstra

On 11/8/18 4:57 AM, Renlin Li wrote:
> I think I found the problem!
> 
> As described in the PR, a hard register is used in
> an pre/post modify expression. The hard register is live, but updated.
> In this case, we should make it conflicting with all pseudos live at
> that point.  Does it make sense?
[snip]
> It fixes the ICE of mis-compiled arm-linux-gnueabihf toolchain described in the
> PR.
> 
> I attached the patch for discussion.  I haven't give a complete test on arm or
> any other targets, yet. (Probably need more adjusting)

Yes, this is the problem.  We see from the dump, that r2040 does not conflict with
hard reg r1:

;; a2040(r1597,l0) conflicts: <list of pseudo regs>
;;     total conflict hard regs:
;;     conflict hard regs:

...and we have the following RTL:

(insn 10905 179 199 2 (set (reg:SI 1 r1)
        (reg/f:SI 2040)) "../../gcc/gcc/vec.h":1654 647 {*thumb2_movsi_vfp}
     (nil))

...

(insn 208 202 182 2 (set (mem/f/c:SI (pre_modify:SI (reg:SI 1 r1)
                (plus:SI (reg:SI 1 r1)
                    (const_int 12 [0xc]))) [129 loop_nest.m_vec+0 S4 A32])
        (reg:SI 1048)) "../../gcc/gcc/vec.h":1654 647 {*thumb2_movsi_vfp}
     (expr_list:REG_INC (reg:SI 1 r1)
        (nil)))

...

(insn 191 189 192 2 (set (mem/f/c:SI (plus:SI (reg/f:SI 2040)
                (const_int 8 [0x8])) [367 ddrs_table+0 S4 A32])
        (reg/f:SI 1047)) "../../gcc/gcc/tree-loop-distribution.c":2741 647 {*thumb2_movsi_vfp}
     (nil))

So my patch caused us to (correctly) skip adding a conflict between r1 and
r2040 due to the register copy in insn 10905.  However, they really should
conflict as you found due to the definition of r1 in insn 208 and the fact
we don't add one is a latent bug in LRA.  I think your patch is on the right
track, but not totally there yet.  Imagine instead that the references to r1
and r2040 were swapped, so instead we have:

(insn 10905 179 199 2 (set (reg:SI 2040)
        (reg/f:SI 1 r1)) "../../gcc/gcc/vec.h":1654 647 {*thumb2_movsi_vfp}
     (nil))

...

(insn 208 202 182 2 (set (mem/f/c:SI (pre_modify:SI (reg:SI 2040)
                (plus:SI (reg:SI 2040)
                    (const_int 12 [0xc]))) [129 loop_nest.m_vec+0 S4 A32])
        (reg:SI 1048)) "../../gcc/gcc/vec.h":1654 647 {*thumb2_movsi_vfp}
     (expr_list:REG_INC (reg:SI 2040)
        (nil)))

...

(insn 191 189 192 2 (set (mem/f/c:SI (plus:SI (reg/f:SI 1 r1)
                (const_int 8 [0x8])) [367 ddrs_table+0 S4 A32])
        (reg/f:SI 1047)) "../../gcc/gcc/tree-loop-distribution.c":2741 647 {*thumb2_movsi_vfp}
     (nil))

Even with your patch, we'd miss adding the conflict between r1 and r2040.
Let me think about how we should solve this one.

And a *BIG* thank you for tracking down the problem!!!

Peter

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 2/2 v3][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-11-08 15:21                                                                 ` Peter Bergner
@ 2018-11-08 16:20                                                                   ` Renlin Li
  2018-11-08 17:52                                                                     ` Peter Bergner
  0 siblings, 1 reply; 66+ messages in thread
From: Renlin Li @ 2018-11-08 16:20 UTC (permalink / raw)
  To: Peter Bergner
  Cc: Jeff Law, Vladimir Makarov, Christophe Lyon, gcc Patches,
	Segher Boessenkool, Ramana Radhakrishnan, Kyrill Tkachov,
	Wilco Dijkstra

Hi Peter,

On 11/08/2018 03:21 PM, Peter Bergner wrote:
> On 11/8/18 4:57 AM, Renlin Li wrote:
>> I think I found the problem!
>>
>> As described in the PR, a hard register is used in
>> an pre/post modify expression. The hard register is live, but updated.
>> In this case, we should make it conflicting with all pseudos live at
>> that point.  Does it make sense?
> [snip]
>> It fixes the ICE of mis-compiled arm-linux-gnueabihf toolchain described in the
>> PR.
>>
>> I attached the patch for discussion.  I haven't give a complete test on arm or
>> any other targets, yet. (Probably need more adjusting)
> 
> Yes, this is the problem.  We see from the dump, that r2040 does not conflict with
> hard reg r1:
> 
> ;; a2040(r1597,l0) conflicts: <list of pseudo regs>
> ;;     total conflict hard regs:
> ;;     conflict hard regs:
I think you should look for axxx(r2040, ..)?

Maybe I am wrong (not an expert of RA), from what I observed, it is the LRA
makes the code more complex. It decides to split the live range and spill r2040.
It creates multiple instructions to reload it.
r2944 in LRA dump is the register which starts to go wrong. It is assigned as r1.


       Creating newreg=2944 from oldreg=2040, assigning class GENERAL_REGS to inheritance r2944
     Original reg change 2040->2944 (bb2):
  10905: r1:SI=r2944:SI
     Add inheritance<-original before:
  12868: r2944:SI=r2040:SI

The dump is the final state of LRA. I debug it with gdb, and there are some temporary steps
which is not observable in the final dump.

> 
> ...and we have the following RTL:
> 
> (insn 10905 179 199 2 (set (reg:SI 1 r1)
>          (reg/f:SI 2040)) "../../gcc/gcc/vec.h":1654 647 {*thumb2_movsi_vfp}
>       (nil))
> 
> ...
> 
> (insn 208 202 182 2 (set (mem/f/c:SI (pre_modify:SI (reg:SI 1 r1)
>                  (plus:SI (reg:SI 1 r1)
>                      (const_int 12 [0xc]))) [129 loop_nest.m_vec+0 S4 A32])
>          (reg:SI 1048)) "../../gcc/gcc/vec.h":1654 647 {*thumb2_movsi_vfp}
>       (expr_list:REG_INC (reg:SI 1 r1)
>          (nil)))
> 
> ...
> 
> (insn 191 189 192 2 (set (mem/f/c:SI (plus:SI (reg/f:SI 2040)
>                  (const_int 8 [0x8])) [367 ddrs_table+0 S4 A32])
>          (reg/f:SI 1047)) "../../gcc/gcc/tree-loop-distribution.c":2741 647 {*thumb2_movsi_vfp}
>       (nil))
> 
> So my patch caused us to (correctly) skip adding a conflict between r1 and
> r2040 due to the register copy in insn 10905.  However, they really should
> conflict as you found due to the definition of r1 in insn 208 and the fact
> we don't add one is a latent bug in LRA.  I think your patch is on the right
> track, but not totally there yet.  Imagine instead that the references to r1
> and r2040 were swapped, so instead we have:
> 
> (insn 10905 179 199 2 (set (reg:SI 2040)
>          (reg/f:SI 1 r1)) "../../gcc/gcc/vec.h":1654 647 {*thumb2_movsi_vfp}
>       (nil))
> 
> ...
> 
> (insn 208 202 182 2 (set (mem/f/c:SI (pre_modify:SI (reg:SI 2040)
>                  (plus:SI (reg:SI 2040)
>                      (const_int 12 [0xc]))) [129 loop_nest.m_vec+0 S4 A32])
>          (reg:SI 1048)) "../../gcc/gcc/vec.h":1654 647 {*thumb2_movsi_vfp}
>       (expr_list:REG_INC (reg:SI 2040)
>          (nil)))
> 
> ...
> 
> (insn 191 189 192 2 (set (mem/f/c:SI (plus:SI (reg/f:SI 1 r1)
>                  (const_int 8 [0x8])) [367 ddrs_table+0 S4 A32])
>          (reg/f:SI 1047)) "../../gcc/gcc/tree-loop-distribution.c":2741 647 {*thumb2_movsi_vfp}
>       (nil))
> 
> Even with your patch, we'd miss adding the conflict between r1 and r2040.
> Let me think about how we should solve this one.

Yes, I am not confident the patch will be the ultimate fix to the problem.

> 
> And a *BIG* thank you for tracking down the problem!!!
> 
Nop.

Regards,
Renlin
> Peter
> 

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 2/2 v3][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-11-08 16:20                                                                   ` Renlin Li
@ 2018-11-08 17:52                                                                     ` Peter Bergner
  0 siblings, 0 replies; 66+ messages in thread
From: Peter Bergner @ 2018-11-08 17:52 UTC (permalink / raw)
  To: Renlin Li
  Cc: Jeff Law, Vladimir Makarov, Christophe Lyon, gcc Patches,
	Segher Boessenkool, Ramana Radhakrishnan, Kyrill Tkachov,
	Wilco Dijkstra

On 11/8/18 10:19 AM, Renlin Li wrote:
>> Yes, this is the problem.  We see from the dump, that r2040 does not conflict with
>> hard reg r1:
>>
>> ;; a2040(r1597,l0) conflicts: <list of pseudo regs>
>> ;;     total conflict hard regs:
>> ;;     conflict hard regs:
> I think you should look for axxx(r2040, ..)?
> 
> Maybe I am wrong (not an expert of RA), from what I observed, it is the LRA
> makes the code more complex. It decides to split the live range and spill r2040.
> It creates multiple instructions to reload it.
> r2944 in LRA dump is the register which starts to go wrong. It is assigned as r1.

Yes, IRA and LRA have similar code to compute conflicts.  We need them both to
compute that r2040 (and the reload pseudo(s) generated for it by LRA) conflict
with r1.

Peter




^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 2/2 v3][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register
  2018-11-08 14:29                                                                   ` Peter Bergner
@ 2018-11-08 19:01                                                                     ` Peter Bergner
  0 siblings, 0 replies; 66+ messages in thread
From: Peter Bergner @ 2018-11-08 19:01 UTC (permalink / raw)
  To: Richard Biener
  Cc: renlin.li, Jeff Law, Vladimir N. Makarov, Christophe Lyon,
	GCC Patches, Segher Boessenkool, Ramana Radhakrishnan,
	kyrylo.tkachov, Wilco Dijkstra

On 11/8/18 8:29 AM, Peter Bergner wrote:
> On 11/8/18 5:48 AM, Richard Biener wrote:
>> Esp. adding conflicts in a loop that says "See which defined values die here."
>> is quite fishy.
> 
> ..the original loop is dealing with some of the gory details you never read
> about in academic RA papers.  This code is used to catch the case where an insn
> defines a register(s) that is never used.  Because it is never used, it never
> ends up in the "live" (ie, live and available) set, which can cause us to miss
> some required conflicts.

Heh, of course I confused this loop with the one I described which is
earlier in the function.  This loop is actually just the normal loop
over all definitions, removing them from the live set and adding conflicts
with all pseudos that are currently live.

Sorry for the confusion. :-(

Peter

^ permalink raw reply	[flat|nested] 66+ messages in thread

end of thread, other threads:[~2018-11-08 19:01 UTC | newest]

Thread overview: 66+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-09-26 21:14 [PATCH 0/2][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register Peter Bergner
2018-09-26 21:16 ` [PATCH 1/2][IRA,LRA] " Peter Bergner
2018-09-26 21:36 ` [PATCH 2/2][IRA,LRA] " Peter Bergner
2018-09-28 21:45 ` [PATCH 0/2][IRA,LRA] " Vladimir Makarov
2018-09-30 20:28   ` Peter Bergner
2018-10-01  0:57     ` H.J. Lu
2018-10-01  1:18       ` Peter Bergner
2018-10-01 12:46         ` H.J. Lu
2018-10-01 12:51           ` H.J. Lu
2018-10-01 13:16             ` H.J. Lu
2018-10-01 14:05               ` Peter Bergner
2018-10-02  4:08             ` Peter Bergner
2018-10-02 14:50               ` Jeff Law
2018-10-02 15:07               ` Peter Bergner
2018-10-02 15:37                 ` H.J. Lu
2018-10-02 15:55                   ` Peter Bergner
2018-10-02 17:14                     ` H.J. Lu
2018-10-02 21:53                 ` Peter Bergner
2018-10-02 22:28                   ` H.J. Lu
2018-10-03  0:35                     ` Peter Bergner
2018-10-03  2:23                       ` H.J. Lu
2018-10-03  2:46                         ` Peter Bergner
2018-10-03 14:43                 ` [PATCH 2/2 v3][IRA,LRA] " Peter Bergner
2018-10-04 22:18                   ` Vladimir Makarov
2018-10-05 16:50                     ` Peter Bergner
2018-10-05 18:54                       ` Vladimir Makarov
2018-10-05 20:10                         ` Peter Bergner
2018-10-05 22:56                           ` Vladimir Makarov
2018-10-06  6:40                             ` Peter Bergner
2018-10-08  9:37                               ` Christophe Lyon
2018-10-08 14:21                                 ` Peter Bergner
2018-10-08 14:46                                   ` Christophe Lyon
2018-10-08 15:04                                     ` Jeff Law
2018-10-11  2:57                                       ` Peter Bergner
2018-10-11 18:26                                         ` Peter Bergner
2018-10-11 20:31                                           ` Peter Bergner
2018-10-11 20:46                                             ` Jeff Law
2018-10-11 21:09                                               ` Peter Bergner
2018-10-11 21:36                                                 ` Jeff Law
2018-10-12  9:50                                                 ` Eric Botcazou
2018-10-11 21:05                                             ` Vladimir Makarov
2018-10-12 16:57                                               ` Peter Bergner
2018-10-12 17:56                                                 ` Jeff Law
2018-10-12  4:44                                             ` Jeff Law
2018-10-16  2:50                                               ` Peter Bergner
2018-11-01 18:50                                                 ` Renlin Li
2018-11-01 20:35                                                   ` Segher Boessenkool
2018-11-01 22:08                                                   ` Peter Bergner
2018-11-02 10:05                                                     ` Renlin Li
2018-11-05 19:20                                                     ` Jeff Law
2018-11-05 19:36                                                       ` Peter Bergner
2018-11-05 19:41                                                         ` Jeff Law
2018-11-06 10:52                                                           ` Renlin Li
2018-11-06 10:57                                                             ` Ramana Radhakrishnan
2018-11-06 12:23                                                               ` Renlin Li
2018-11-06 18:46                                                                 ` Peter Bergner
2018-11-06 18:58                                                             ` Jeff Law
2018-11-08 10:57                                                               ` Renlin Li
2018-11-08 11:49                                                                 ` Richard Biener
2018-11-08 14:29                                                                   ` Peter Bergner
2018-11-08 19:01                                                                     ` Peter Bergner
2018-11-08 12:35                                                                 ` Peter Bergner
2018-11-08 13:42                                                                   ` Renlin Li
2018-11-08 15:21                                                                 ` Peter Bergner
2018-11-08 16:20                                                                   ` Renlin Li
2018-11-08 17:52                                                                     ` Peter Bergner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).