public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH 2/5] combine: handle I2 a parallel of two SETs
  2014-11-14 19:21 [PATCH 0/5] some combine patches Segher Boessenkool
  2014-11-14 19:21 ` [PATCH 1/5] combine: more verbose costs output Segher Boessenkool
@ 2014-11-14 19:21 ` Segher Boessenkool
  2014-11-14 19:52   ` Bernd Schmidt
  2014-11-25 19:02   ` Jeff Law
  2014-11-14 19:29 ` [PATCH 3/5] combine: add regno field to LOG_LINKS Segher Boessenkool
                   ` (3 subsequent siblings)
  5 siblings, 2 replies; 26+ messages in thread
From: Segher Boessenkool @ 2014-11-14 19:21 UTC (permalink / raw)
  To: gcc-patches; +Cc: Segher Boessenkool

If I2 is a PARALLEL of two SETs, split it into two instructions, I1
and I2.  If there already was an I1, rename it to I0.  If there
already was an I0, don't do anything.

This surprisingly simple patch is enough to let combine handle such
PARALLELs properly.


2014-11-14  Segher Boessenkool  <segher@kernel.crashing.org>

gcc/
	* combine.c (try_combine): If I2 is a PARALLEL of two SETs,
	split it into two insns.

---
 gcc/combine.c | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/gcc/combine.c b/gcc/combine.c
index f7797e7..c4d23e3 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -2780,6 +2780,37 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, rtx_insn *i0,
 	  SUBST_LINK (LOG_LINKS (i2), alloc_insn_link (i1, LOG_LINKS (i2)));
 	}
     }
+
+  /* If I2 is a PARALLEL of two SETs of REGs (and perhaps some CLOBBERs),
+     make those two SETs separate I1 and I2 insns, and make an I0 that is
+     the original I1.  */
+  if (i0 == 0
+      && GET_CODE (PATTERN (i2)) == PARALLEL
+      && XVECLEN (PATTERN (i2), 0) >= 2
+      && GET_CODE (XVECEXP (PATTERN (i2), 0, 0)) == SET
+      && GET_CODE (XVECEXP (PATTERN (i2), 0, 1)) == SET
+      && REG_P (SET_DEST (XVECEXP (PATTERN (i2), 0, 0)))
+      && REG_P (SET_DEST (XVECEXP (PATTERN (i2), 0, 1)))
+      && !reg_used_between_p (SET_DEST (XVECEXP (PATTERN (i2), 0, 0)), i2, i3)
+      && !reg_used_between_p (SET_DEST (XVECEXP (PATTERN (i2), 0, 1)), i2, i3)
+      && (XVECLEN (PATTERN (i2), 0) == 2
+	  || GET_CODE (XVECEXP (PATTERN (i2), 0, 2)) == CLOBBER))
+    {
+      /* If there is no I1, there is no I0 either.  */
+      i0 = i1;
+
+      /* We make I1 with the same INSN_UID as I2.  This gives it
+	 the same DF_INSN_LUID for value tracking.  Our fake I1 will
+	 never appear in the insn stream so giving it the same INSN_UID
+	 as I2 will not cause a problem.  */
+
+      i1 = gen_rtx_INSN (VOIDmode, NULL, i2, BLOCK_FOR_INSN (i2),
+			 XVECEXP (PATTERN (i2), 0, 0), INSN_LOCATION (i2),
+			 -1, NULL_RTX);
+      INSN_UID (i1) = INSN_UID (i2);
+
+      SUBST (PATTERN (i2), XVECEXP (PATTERN (i2), 0, 1));
+    }
 #endif
 
   /* Verify that I2 and I1 are valid for combining.  */
-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 0/5] some combine patches
@ 2014-11-14 19:21 Segher Boessenkool
  2014-11-14 19:21 ` [PATCH 1/5] combine: more verbose costs output Segher Boessenkool
                   ` (5 more replies)
  0 siblings, 6 replies; 26+ messages in thread
From: Segher Boessenkool @ 2014-11-14 19:21 UTC (permalink / raw)
  To: gcc-patches; +Cc: Segher Boessenkool

Here are five patches that together allow combine to do more useful
work with PARALLELs of two SETs, like on many machines a set of a GPR
and one of the condition code, or a GPR and the carry bit on PowerPC,
or two GPRs on some machines.

The first patch is just for debug.

The second is the real meat: it allows combining an I2 that has two SETs.

The third adds a regno field to LOG_LINKS, which the fourth then uses in
distribute_log_links; without that, most parallels lose their log_links
early.

The fifth removes a SET from the combination result if it is (now) dead,
if what's left is a valid instruction.

Bootstrapped and tested on powerpc64-linux (tree of a week ago), all five
together, -m64,-m32,-m32/-mpowerpc64,-m64/-mlra; no regressions.  Checks
of the separate patches still running.  Is this okay for mainline if it
passes?


Segher


Segher Boessenkool (5):
  combine: more verbose costs output
  combine: handle I2 a parallel of two SETs
  combine: add regno field to LOG_LINKS
  combine: distribute_log_links for PARALLELs of SETs
  combine: preferably delete dead SETs in PARALLELs

 gcc/combine.c | 313 +++++++++++++++++++++++++++++++++++-----------------------
 1 file changed, 191 insertions(+), 122 deletions(-)

-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 1/5] combine: more verbose costs output
  2014-11-14 19:21 [PATCH 0/5] some combine patches Segher Boessenkool
@ 2014-11-14 19:21 ` Segher Boessenkool
  2014-11-17  9:56   ` Richard Biener
  2014-11-14 19:21 ` [PATCH 2/5] combine: handle I2 a parallel of two SETs Segher Boessenkool
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 26+ messages in thread
From: Segher Boessenkool @ 2014-11-14 19:21 UTC (permalink / raw)
  To: gcc-patches; +Cc: Segher Boessenkool

Output the cost calculation always, not only when the costs disallow
a combination.  Checking if your port has sane costs is much easier
this way.

Also there is no point in printing full lines at once; debug dumps
are never translated, so we can print piece by piece.


2014-11-14  Segher Boessenkool  <segher@kernel.crashing.org>

gcc/
	* combine.c (combine_validate_cost): Always print the insn costs
	to the dump file.

---
 gcc/combine.c | 63 ++++++++++++++++++++++++-----------------------------------
 1 file changed, 25 insertions(+), 38 deletions(-)

diff --git a/gcc/combine.c b/gcc/combine.c
index e240cfb..f7797e7 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -914,48 +914,35 @@ combine_validate_cost (rtx_insn *i0, rtx_insn *i1, rtx_insn *i2, rtx_insn *i3,
 
   /* Disallow this combination if both new_cost and old_cost are greater than
      zero, and new_cost is greater than old cost.  */
-  if (old_cost > 0 && new_cost > old_cost)
-    {
-      if (dump_file)
-	{
-	  if (i0)
-	    {
-	      fprintf (dump_file,
-		       "rejecting combination of insns %d, %d, %d and %d\n",
-		       INSN_UID (i0), INSN_UID (i1), INSN_UID (i2),
-		       INSN_UID (i3));
-	      fprintf (dump_file, "original costs %d + %d + %d + %d = %d\n",
-		       i0_cost, i1_cost, i2_cost, i3_cost, old_cost);
-	    }
-	  else if (i1)
-	    {
-	      fprintf (dump_file,
-		       "rejecting combination of insns %d, %d and %d\n",
-		       INSN_UID (i1), INSN_UID (i2), INSN_UID (i3));
-	      fprintf (dump_file, "original costs %d + %d + %d = %d\n",
-		       i1_cost, i2_cost, i3_cost, old_cost);
-	    }
-	  else
-	    {
-	      fprintf (dump_file,
-		       "rejecting combination of insns %d and %d\n",
-		       INSN_UID (i2), INSN_UID (i3));
-	      fprintf (dump_file, "original costs %d + %d = %d\n",
-		       i2_cost, i3_cost, old_cost);
-	    }
+  int reject = old_cost > 0 && new_cost > old_cost;
 
-	  if (newi2pat)
-	    {
-	      fprintf (dump_file, "replacement costs %d + %d = %d\n",
-		       new_i2_cost, new_i3_cost, new_cost);
-	    }
-	  else
-	    fprintf (dump_file, "replacement cost %d\n", new_cost);
-	}
+  if (dump_file)
+    {
+      fprintf (dump_file, "%s combination of insns ",
+	       reject ? "rejecting" : "allowing");
+      if (i0)
+	fprintf (dump_file, "%d, ", INSN_UID (i0));
+      if (i1)
+	fprintf (dump_file, "%d, ", INSN_UID (i1));
+      fprintf (dump_file, "%d and %d\n", INSN_UID (i2), INSN_UID (i3));
 
-      return false;
+      fprintf (dump_file, "original costs ");
+      if (i0)
+	fprintf (dump_file, "%d + ", i0_cost);
+      if (i1)
+	fprintf (dump_file, "%d + ", i1_cost);
+      fprintf (dump_file, "%d + %d = %d\n", i2_cost, i3_cost, old_cost);
+
+      if (newi2pat)
+	fprintf (dump_file, "replacement costs %d + %d = %d\n",
+		 new_i2_cost, new_i3_cost, new_cost);
+      else
+	fprintf (dump_file, "replacement cost %d\n", new_cost);
     }
 
+  if (reject)
+    return false;
+
   /* Update the uid_insn_cost array with the replacement costs.  */
   INSN_COST (i2) = new_i2_cost;
   INSN_COST (i3) = new_i3_cost;
-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 3/5] combine: add regno field to LOG_LINKS
  2014-11-14 19:21 [PATCH 0/5] some combine patches Segher Boessenkool
  2014-11-14 19:21 ` [PATCH 1/5] combine: more verbose costs output Segher Boessenkool
  2014-11-14 19:21 ` [PATCH 2/5] combine: handle I2 a parallel of two SETs Segher Boessenkool
@ 2014-11-14 19:29 ` Segher Boessenkool
  2014-11-25 19:09   ` Jeff Law
  2014-11-14 19:35 ` [PATCH 4/5] combine: distribute_log_links for PARALLELs of SETs Segher Boessenkool
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 26+ messages in thread
From: Segher Boessenkool @ 2014-11-14 19:29 UTC (permalink / raw)
  To: gcc-patches; +Cc: Segher Boessenkool

With this new field in place, we can have LOG_LINKS for insns that set
more than one register and distribute them properly in distribute_links.
This then allows many more PARALLELs to be combined.

Also split off new functions can_combine_{def,use}_p from the
create_log_links function.


2014-11-14  Segher Boessenkool  <segher@kernel.crashing.org>

gcc/
	* combine.c (struct insn_link): New field `regno'.
	(alloc_insn_link): New parameter `regno'.  Use it.
	(find_single_use): Check the new field.
	(can_combine_def_p, can_combine_use_p): New functions.  Split
	off from ...
	(create_log_links): ... here.  Correct data type of `regno'.
	Adjust call to alloc_insn_link.
	(adjust_for_new_dest): Find regno, use it in call to
	alloc_insn_link.
	(try_combine): Adjust call to alloc_insn_link.
	(distribute_links): Check the new field.

---
 gcc/combine.c | 135 ++++++++++++++++++++++++++++++++++------------------------
 1 file changed, 80 insertions(+), 55 deletions(-)

diff --git a/gcc/combine.c b/gcc/combine.c
index c4d23e3..cf184db 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -328,6 +328,7 @@ static int *uid_insn_cost;
 
 struct insn_link {
   rtx_insn *insn;
+  unsigned int regno;
   struct insn_link *next;
 };
 
@@ -346,12 +347,13 @@ static struct obstack insn_link_obstack;
 /* Allocate a link.  */
 
 static inline struct insn_link *
-alloc_insn_link (rtx_insn *insn, struct insn_link *next)
+alloc_insn_link (rtx_insn *insn, unsigned int regno, struct insn_link *next)
 {
   struct insn_link *l
     = (struct insn_link *) obstack_alloc (&insn_link_obstack,
 					  sizeof (struct insn_link));
   l->insn = insn;
+  l->regno = regno;
   l->next = next;
   return l;
 }
@@ -686,7 +688,7 @@ find_single_use (rtx dest, rtx_insn *insn, rtx_insn **ploc)
     if (INSN_P (next) && dead_or_set_p (next, dest))
       {
 	FOR_EACH_LOG_LINK (link, next)
-	  if (link->insn == insn)
+	  if (link->insn == insn && link->regno == REGNO (dest))
 	    break;
 
 	if (link)
@@ -982,6 +984,43 @@ delete_noop_moves (void)
 }
 
 \f
+/* Return false if we do not want to (or cannot) combine DEF.  */
+static bool
+can_combine_def_p (df_ref def)
+{
+  /* Do not consider if it is pre/post modification in MEM.  */
+  if (DF_REF_FLAGS (def) & DF_REF_PRE_POST_MODIFY)
+    return false;
+
+  unsigned int regno = DF_REF_REGNO (def);
+
+  /* Do not combine frame pointer adjustments.  */
+  if ((regno == FRAME_POINTER_REGNUM
+       && (!reload_completed || frame_pointer_needed))
+#if !HARD_FRAME_POINTER_IS_FRAME_POINTER
+      || (regno == HARD_FRAME_POINTER_REGNUM
+	  && (!reload_completed || frame_pointer_needed))
+#endif
+#if FRAME_POINTER_REGNUM != ARG_POINTER_REGNUM
+      || (regno == ARG_POINTER_REGNUM && fixed_regs[regno])
+#endif
+      )
+    return false;
+
+  return true;
+}
+
+/* Return false if we do not want to (or cannot) combine USE.  */
+static bool
+can_combine_use_p (df_ref use)
+{
+  /* Do not consider the usage of the stack pointer by function call.  */
+  if (DF_REF_FLAGS (use) & DF_REF_CALL_STACK_USAGE)
+    return false;
+
+  return true;
+}
+
 /* Fill in log links field for all insns.  */
 
 static void
@@ -1015,67 +1054,39 @@ create_log_links (void)
 
 	  FOR_EACH_INSN_DEF (def, insn)
             {
-              int regno = DF_REF_REGNO (def);
+              unsigned int regno = DF_REF_REGNO (def);
               rtx_insn *use_insn;
 
               if (!next_use[regno])
                 continue;
 
-              /* Do not consider if it is pre/post modification in MEM.  */
-              if (DF_REF_FLAGS (def) & DF_REF_PRE_POST_MODIFY)
-                continue;
+	      if (!can_combine_def_p (def))
+		continue;
 
-              /* Do not make the log link for frame pointer.  */
-              if ((regno == FRAME_POINTER_REGNUM
-                   && (! reload_completed || frame_pointer_needed))
-#if !HARD_FRAME_POINTER_IS_FRAME_POINTER
-                  || (regno == HARD_FRAME_POINTER_REGNUM
-                      && (! reload_completed || frame_pointer_needed))
-#endif
-#if FRAME_POINTER_REGNUM != ARG_POINTER_REGNUM
-                  || (regno == ARG_POINTER_REGNUM && fixed_regs[regno])
-#endif
-                  )
-                continue;
+	      use_insn = next_use[regno];
+	      next_use[regno] = NULL;
 
-              use_insn = next_use[regno];
-              if (BLOCK_FOR_INSN (use_insn) == bb)
-                {
-                  /* flow.c claimed:
-
-                     We don't build a LOG_LINK for hard registers contained
-                     in ASM_OPERANDs.  If these registers get replaced,
-                     we might wind up changing the semantics of the insn,
-                     even if reload can make what appear to be valid
-                     assignments later.  */
-                  if (regno >= FIRST_PSEUDO_REGISTER
-                      || asm_noperands (PATTERN (use_insn)) < 0)
-		    {
-		      /* Don't add duplicate links between instructions.  */
-		      struct insn_link *links;
-		      FOR_EACH_LOG_LINK (links, use_insn)
-		        if (insn == links->insn)
-			  break;
+	      if (BLOCK_FOR_INSN (use_insn) != bb)
+		continue;
 
-		      if (!links)
-			LOG_LINKS (use_insn)
-			  = alloc_insn_link (insn, LOG_LINKS (use_insn));
-		    }
-                }
-              next_use[regno] = NULL;
-            }
+	      /* flow.c claimed:
 
-	  FOR_EACH_INSN_USE (use, insn)
-            {
-	      int regno = DF_REF_REGNO (use);
+		 We don't build a LOG_LINK for hard registers contained
+		 in ASM_OPERANDs.  If these registers get replaced,
+		 we might wind up changing the semantics of the insn,
+		 even if reload can make what appear to be valid
+		 assignments later.  */
+	      if (regno < FIRST_PSEUDO_REGISTER
+		  && asm_noperands (PATTERN (use_insn)) >= 0)
+		continue;
 
-              /* Do not consider the usage of the stack pointer
-		 by function call.  */
-              if (DF_REF_FLAGS (use) & DF_REF_CALL_STACK_USAGE)
-                continue;
-
-              next_use[regno] = insn;
+	      LOG_LINKS (use_insn)
+		= alloc_insn_link (insn, regno, LOG_LINKS (use_insn));
             }
+
+	  FOR_EACH_INSN_USE (use, insn)
+	    if (can_combine_use_p (use))
+	      next_use[DF_REF_REGNO (use)] = insn;
         }
     }
 
@@ -2347,7 +2358,19 @@ adjust_for_new_dest (rtx_insn *insn)
   /* The new insn will have a destination that was previously the destination
      of an insn just above it.  Call distribute_links to make a LOG_LINK from
      the next use of that destination.  */
-  distribute_links (alloc_insn_link (insn, NULL));
+
+  rtx set = single_set (insn);
+  gcc_assert (set);
+
+  rtx reg = SET_DEST (set);
+
+  while (GET_CODE (reg) == ZERO_EXTRACT
+	 || GET_CODE (reg) == STRICT_LOW_PART
+	 || GET_CODE (reg) == SUBREG)
+    reg = XEXP (reg, 0);
+  gcc_assert (REG_P (reg));
+
+  distribute_links (alloc_insn_link (insn, REGNO (reg), NULL));
 
   df_insn_rescan (insn);
 }
@@ -2777,7 +2800,9 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, rtx_insn *i0,
 	  SUBST (PATTERN (i2), XVECEXP (PATTERN (i2), 0, 0));
 	  SUBST (XEXP (SET_SRC (PATTERN (i2)), 0),
 		 SET_DEST (PATTERN (i1)));
-	  SUBST_LINK (LOG_LINKS (i2), alloc_insn_link (i1, LOG_LINKS (i2)));
+	  unsigned int regno = REGNO (SET_DEST (PATTERN (i1)));
+	  SUBST_LINK (LOG_LINKS (i2),
+		      alloc_insn_link (i1, regno, LOG_LINKS (i2)));
 	}
     }
 
@@ -13865,7 +13890,7 @@ distribute_links (struct insn_link *links)
 	  struct insn_link *link2;
 
 	  FOR_EACH_LOG_LINK (link2, place)
-	    if (link2->insn == link->insn)
+	    if (link2->insn == link->insn && link2->regno == link->regno)
 	      break;
 
 	  if (link2 == NULL)
-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 4/5] combine: distribute_log_links for PARALLELs of SETs
  2014-11-14 19:21 [PATCH 0/5] some combine patches Segher Boessenkool
                   ` (2 preceding siblings ...)
  2014-11-14 19:29 ` [PATCH 3/5] combine: add regno field to LOG_LINKS Segher Boessenkool
@ 2014-11-14 19:35 ` Segher Boessenkool
  2014-11-25 20:06   ` Jeff Law
  2014-11-14 19:37 ` [PATCH 5/5] combine: preferably delete dead SETs in PARALLELs Segher Boessenkool
  2014-11-16 12:08 ` [PATCH 0/5] some combine patches Oleg Endo
  5 siblings, 1 reply; 26+ messages in thread
From: Segher Boessenkool @ 2014-11-14 19:35 UTC (permalink / raw)
  To: gcc-patches; +Cc: Segher Boessenkool

Now that LOG_LINKS are per regno, we can distribute them on PARALLELs
just fine.  Do so.  This makes PARALLELs not lose their LOG_LINKS early
when e.g. a trivial reg-reg move is combined, so that they can be used
in more useful combinations as well.


2014-11-14  Segher Boessenkool  <segher@kernel.crashing.org>

gcc/
	* combine.c (distribute_links): Handle multiple SETs.

---
 gcc/combine.c | 52 +++++++++++++++++++++++++++++++++++++---------------
 1 file changed, 37 insertions(+), 15 deletions(-)

diff --git a/gcc/combine.c b/gcc/combine.c
index cf184db..8c7f052 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -13832,24 +13832,46 @@ distribute_links (struct insn_link *links)
 
       next_link = link->next;
 
-      /* If the insn that this link points to is a NOTE or isn't a single
-	 set, ignore it.  In the latter case, it isn't clear what we
-	 can do other than ignore the link, since we can't tell which
-	 register it was for.  Such links wouldn't be used by combine
-	 anyway.
-
-	 It is not possible for the destination of the target of the link to
-	 have been changed by combine.  The only potential of this is if we
-	 replace I3, I2, and I1 by I3 and I2.  But in that case the
-	 destination of I2 also remains unchanged.  */
-
-      if (NOTE_P (link->insn)
-	  || (set = single_set (link->insn)) == 0)
+      /* If the insn that this link points to is a NOTE, ignore it.  */
+      if (NOTE_P (link->insn))
+	continue;
+
+      set = 0;
+      rtx pat = PATTERN (link->insn);
+      if (GET_CODE (pat) == SET)
+	set = pat;
+      else if (GET_CODE (pat) == PARALLEL)
+	{
+	  int i;
+	  for (i = 0; i < XVECLEN (pat, 0); i++)
+	    {
+	      set = XVECEXP (pat, 0, i);
+	      if (GET_CODE (set) != SET)
+		continue;
+
+	      reg = SET_DEST (set);
+	      while (GET_CODE (reg) == ZERO_EXTRACT
+		     || GET_CODE (reg) == STRICT_LOW_PART
+		     || GET_CODE (reg) == SUBREG)
+		reg = XEXP (reg, 0);
+
+	      if (!REG_P (reg))
+		continue;
+
+	      if (REGNO (reg) == link->regno)
+		break;
+	    }
+	  if (i == XVECLEN (pat, 0))
+	    continue;
+	}
+      else
 	continue;
 
       reg = SET_DEST (set);
-      while (GET_CODE (reg) == SUBREG || GET_CODE (reg) == ZERO_EXTRACT
-	     || GET_CODE (reg) == STRICT_LOW_PART)
+
+      while (GET_CODE (reg) == ZERO_EXTRACT
+	     || GET_CODE (reg) == STRICT_LOW_PART
+	     || GET_CODE (reg) == SUBREG)
 	reg = XEXP (reg, 0);
 
       /* A LOG_LINK is defined as being placed on the first insn that uses
-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 5/5] combine: preferably delete dead SETs in PARALLELs
  2014-11-14 19:21 [PATCH 0/5] some combine patches Segher Boessenkool
                   ` (3 preceding siblings ...)
  2014-11-14 19:35 ` [PATCH 4/5] combine: distribute_log_links for PARALLELs of SETs Segher Boessenkool
@ 2014-11-14 19:37 ` Segher Boessenkool
  2014-11-17 21:25   ` Jeff Law
  2014-11-16 12:08 ` [PATCH 0/5] some combine patches Oleg Endo
  5 siblings, 1 reply; 26+ messages in thread
From: Segher Boessenkool @ 2014-11-14 19:37 UTC (permalink / raw)
  To: gcc-patches; +Cc: Segher Boessenkool

If the result of combining some insns is a PARALLEL of two SETs, and that
is not a recognised insn, and one of the SETs is dead, combine tries to
use the remaining SET as insn.

This patch changes things around so that the one SET is preferably used,
not the PARALLEL.


2014-11-14  Segher Boessenkool  <segher@kernel.crashing.org>

gcc/
	* combine.c (try_combine): Prefer to delete dead SETs inside
	a PARALLEL over keeping them.

---
 gcc/combine.c | 36 ++++++++++++++++++++----------------
 1 file changed, 20 insertions(+), 16 deletions(-)

diff --git a/gcc/combine.c b/gcc/combine.c
index 8c7f052..d9292df 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -3318,29 +3318,25 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, rtx_insn *i0,
 	RTVEC_ELT (newpat_vec_with_clobbers, i) = XVECEXP (newpat, 0, i);
     }
 
-  /* Is the result of combination a valid instruction?  */
-  insn_code_number = recog_for_combine (&newpat, i3, &new_i3_notes);
+  /* We have recognized nothing yet.  */
+  insn_code_number = -1;
+
+  /* See if this is a PARALLEL of two SETs where one SET's destination is
+     a register that is unused and this isn't marked as an instruction that
+     might trap in an EH region.  In that case, we just need the other SET.
+     We prefer this over the PARALLEL.
 
-  /* If the result isn't valid, see if it is a PARALLEL of two SETs where
-     the second SET's destination is a register that is unused and isn't
-     marked as an instruction that might trap in an EH region.  In that case,
-     we just need the first SET.   This can occur when simplifying a divmod
-     insn.  We *must* test for this case here because the code below that
-     splits two independent SETs doesn't handle this case correctly when it
-     updates the register status.
+     This can occur when simplifying a divmod insn.  We *must* test for this
+     case here because the code below that splits two independent SETs doesn't
+     handle this case correctly when it updates the register status.
 
      It's pointless doing this if we originally had two sets, one from
      i3, and one from i2.  Combining then splitting the parallel results
      in the original i2 again plus an invalid insn (which we delete).
      The net effect is only to move instructions around, which makes
-     debug info less accurate.
+     debug info less accurate.  */
 
-     Also check the case where the first SET's destination is unused.
-     That would not cause incorrect code, but does cause an unneeded
-     insn to remain.  */
-
-  if (insn_code_number < 0
-      && !(added_sets_2 && i1 == 0)
+  if (!(added_sets_2 && i1 == 0)
       && GET_CODE (newpat) == PARALLEL
       && XVECLEN (newpat, 0) == 2
       && GET_CODE (XVECEXP (newpat, 0, 0)) == SET
@@ -3349,6 +3345,7 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, rtx_insn *i0,
     {
       rtx set0 = XVECEXP (newpat, 0, 0);
       rtx set1 = XVECEXP (newpat, 0, 1);
+      rtx oldpat = newpat;
 
       if (((REG_P (SET_DEST (set1))
 	    && find_reg_note (i3, REG_UNUSED, SET_DEST (set1)))
@@ -3375,8 +3372,15 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, rtx_insn *i0,
 	  if (insn_code_number >= 0)
 	    changed_i3_dest = 1;
 	}
+
+      if (insn_code_number < 0)
+	newpat = oldpat;
     }
 
+  /* Is the result of combination a valid instruction?  */
+  if (insn_code_number < 0)
+    insn_code_number = recog_for_combine (&newpat, i3, &new_i3_notes);
+
   /* If we were combining three insns and the result is a simple SET
      with no ASM_OPERANDS that wasn't recognized, try to split it into two
      insns.  There are two ways to do this.  It can be split using a
-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/5] combine: handle I2 a parallel of two SETs
  2014-11-14 19:21 ` [PATCH 2/5] combine: handle I2 a parallel of two SETs Segher Boessenkool
@ 2014-11-14 19:52   ` Bernd Schmidt
  2014-11-15 15:21     ` Segher Boessenkool
  2014-11-25 19:02   ` Jeff Law
  1 sibling, 1 reply; 26+ messages in thread
From: Bernd Schmidt @ 2014-11-14 19:52 UTC (permalink / raw)
  To: Segher Boessenkool, gcc-patches

On 11/14/2014 08:19 PM, Segher Boessenkool wrote:
> +  /* If I2 is a PARALLEL of two SETs of REGs (and perhaps some CLOBBERs),
> +     make those two SETs separate I1 and I2 insns, and make an I0 that is
> +     the original I1.  */
> +  if (i0 == 0
> +      && GET_CODE (PATTERN (i2)) == PARALLEL
> +      && XVECLEN (PATTERN (i2), 0) >= 2
> +      && GET_CODE (XVECEXP (PATTERN (i2), 0, 0)) == SET
> +      && GET_CODE (XVECEXP (PATTERN (i2), 0, 1)) == SET
> +      && REG_P (SET_DEST (XVECEXP (PATTERN (i2), 0, 0)))
> +      && REG_P (SET_DEST (XVECEXP (PATTERN (i2), 0, 1)))
> +      && !reg_used_between_p (SET_DEST (XVECEXP (PATTERN (i2), 0, 0)), i2, i3)
> +      && !reg_used_between_p (SET_DEST (XVECEXP (PATTERN (i2), 0, 1)), i2, i3)

Don't we have other code in combine checking the reg_used_between case?

> +      && (XVECLEN (PATTERN (i2), 0) == 2
> +	  || GET_CODE (XVECEXP (PATTERN (i2), 0, 2)) == CLOBBER))

This probably wants to test for XVECLEN == 3 for the second case. Can 
then drop the earlier test.

I think you also need to check that !reg_overlap_mentioned_p between the 
two dests and the other set's sources.


Bernd

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/5] combine: handle I2 a parallel of two SETs
  2014-11-14 19:52   ` Bernd Schmidt
@ 2014-11-15 15:21     ` Segher Boessenkool
  2014-11-26 16:24       ` Segher Boessenkool
  0 siblings, 1 reply; 26+ messages in thread
From: Segher Boessenkool @ 2014-11-15 15:21 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: gcc-patches

On Fri, Nov 14, 2014 at 08:35:48PM +0100, Bernd Schmidt wrote:
> On 11/14/2014 08:19 PM, Segher Boessenkool wrote:
> >+  /* If I2 is a PARALLEL of two SETs of REGs (and perhaps some CLOBBERs),
> >+     make those two SETs separate I1 and I2 insns, and make an I0 that is
> >+     the original I1.  */
> >+  if (i0 == 0
> >+      && GET_CODE (PATTERN (i2)) == PARALLEL
> >+      && XVECLEN (PATTERN (i2), 0) >= 2
> >+      && GET_CODE (XVECEXP (PATTERN (i2), 0, 0)) == SET
> >+      && GET_CODE (XVECEXP (PATTERN (i2), 0, 1)) == SET
> >+      && REG_P (SET_DEST (XVECEXP (PATTERN (i2), 0, 0)))
> >+      && REG_P (SET_DEST (XVECEXP (PATTERN (i2), 0, 1)))
> >+      && !reg_used_between_p (SET_DEST (XVECEXP (PATTERN (i2), 0, 0)), 
> >i2, i3)
> >+      && !reg_used_between_p (SET_DEST (XVECEXP (PATTERN (i2), 0, 1)), 
> >i2, i3)
> 
> Don't we have other code in combine checking the reg_used_between case?

It doesn't make any sense at all.  What I wanted to check is whether
splitting the parallel creates a conflict, but I woefully failed at that.
Will fix.

> >+      && (XVECLEN (PATTERN (i2), 0) == 2
> >+	  || GET_CODE (XVECEXP (PATTERN (i2), 0, 2)) == CLOBBER))
> 
> This probably wants to test for XVECLEN == 3 for the second case. Can 
> then drop the earlier test.

It needs to test there are exactly two SETs, any amount of clobbers, and
nothing else.

> I think you also need to check that !reg_overlap_mentioned_p between the 
> two dests and the other set's sources.

Only the dest of the new I1 with the sources of the new I2, but yes.


Segher

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 0/5] some combine patches
  2014-11-14 19:21 [PATCH 0/5] some combine patches Segher Boessenkool
                   ` (4 preceding siblings ...)
  2014-11-14 19:37 ` [PATCH 5/5] combine: preferably delete dead SETs in PARALLELs Segher Boessenkool
@ 2014-11-16 12:08 ` Oleg Endo
  2014-11-16 13:52   ` Segher Boessenkool
  5 siblings, 1 reply; 26+ messages in thread
From: Oleg Endo @ 2014-11-16 12:08 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: gcc-patches

Hi Segher,

On 15 Nov 2014, at 04:19, Segher Boessenkool <segher@kernel.crashing.org> wrote:

> Here are five patches that together allow combine to do more useful
> work with PARALLELs of two SETs, like on many machines a set of a GPR
> and one of the condition code, or a GPR and the carry bit on PowerPC,
> or two GPRs on some machines.
> 
> The first patch is just for debug.
> 
> The second is the real meat: it allows combining an I2 that has two SETs.
> 
> The third adds a regno field to LOG_LINKS, which the fourth then uses in
> distribute_log_links; without that, most parallels lose their log_links
> early.
> 
> The fifth removes a SET from the combination result if it is (now) dead,
> if what's left is a valid instruction.
> 
> Bootstrapped and tested on powerpc64-linux (tree of a week ago), all five
> together, -m64,-m32,-m32/-mpowerpc64,-m64/-mlra; no regressions.  Checks
> of the separate patches still running.  Is this okay for mainline if it
> passes?

When you commit those, could you please also add PR 59278 to the ChangeLog so that the commit appears in bugzilla?  After your patches are in, I'd like to add some SH specific test cases (assuming that your patches fix PR 59278).

Cheers,
Oleg

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 0/5] some combine patches
  2014-11-16 12:08 ` [PATCH 0/5] some combine patches Oleg Endo
@ 2014-11-16 13:52   ` Segher Boessenkool
  2014-11-16 14:24     ` Oleg Endo
  0 siblings, 1 reply; 26+ messages in thread
From: Segher Boessenkool @ 2014-11-16 13:52 UTC (permalink / raw)
  To: Oleg Endo; +Cc: gcc-patches

On Sun, Nov 16, 2014 at 05:45:06PM +0900, Oleg Endo wrote:
> When you commit those, could you please also add PR 59278 to the ChangeLog so
> that the commit appears in bugzilla?  After your patches are in, I'd like to
> add some SH specific test cases (assuming that your patches fix PR 59278).

It doesn't fix this testcase.  Here, recog_for_combine needs to add a
clobber of T, but it thinks T is not dead.  It doesn't say anything
about that in the debug dump :-(

Maybe the movrt_negc pattern shouldn't set T at all, just clobber it?
Or reg_dead_at_p can be taught about "unused" notes.


Segher

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 0/5] some combine patches
  2014-11-16 13:52   ` Segher Boessenkool
@ 2014-11-16 14:24     ` Oleg Endo
  0 siblings, 0 replies; 26+ messages in thread
From: Oleg Endo @ 2014-11-16 14:24 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: gcc-patches


On 16 Nov 2014, at 22:18, Segher Boessenkool <segher@kernel.crashing.org> wrote:

> On Sun, Nov 16, 2014 at 05:45:06PM +0900, Oleg Endo wrote:
>> When you commit those, could you please also add PR 59278 to the ChangeLog so
>> that the commit appears in bugzilla?  After your patches are in, I'd like to
>> add some SH specific test cases (assuming that your patches fix PR 59278).
> 
> It doesn't fix this testcase.  

Too bad.

> Here, recog_for_combine needs to add a
> clobber of T, but it thinks T is not dead.  It doesn't say anything
> about that in the debug dump :-(
> 
> Maybe the movrt_negc pattern shouldn't set T at all, just clobber it?

On SH, it's not just that particular pattern, but a couple of others, which would need to be changed from set-set to set-clobber before/during combine and then converted/split into the actual set-set patterns after combine.  E.g. some patterns set the T bit to a known zero/one value which can be good to know later on.

> Or reg_dead_at_p can be taught about "unused" notes.

Sounds the easier way from my point of view.  I don't know about side effects for other targets of "unused" reg notes are treated as "dead" reg notes.

Cheers,
Oleg

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/5] combine: more verbose costs output
  2014-11-14 19:21 ` [PATCH 1/5] combine: more verbose costs output Segher Boessenkool
@ 2014-11-17  9:56   ` Richard Biener
  0 siblings, 0 replies; 26+ messages in thread
From: Richard Biener @ 2014-11-17  9:56 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: GCC Patches

On Fri, Nov 14, 2014 at 8:19 PM, Segher Boessenkool
<segher@kernel.crashing.org> wrote:
> Output the cost calculation always, not only when the costs disallow
> a combination.  Checking if your port has sane costs is much easier
> this way.
>
> Also there is no point in printing full lines at once; debug dumps
> are never translated, so we can print piece by piece.

Ok.

Thanks,
Richard.

>
> 2014-11-14  Segher Boessenkool  <segher@kernel.crashing.org>
>
> gcc/
>         * combine.c (combine_validate_cost): Always print the insn costs
>         to the dump file.
>
> ---
>  gcc/combine.c | 63 ++++++++++++++++++++++++-----------------------------------
>  1 file changed, 25 insertions(+), 38 deletions(-)
>
> diff --git a/gcc/combine.c b/gcc/combine.c
> index e240cfb..f7797e7 100644
> --- a/gcc/combine.c
> +++ b/gcc/combine.c
> @@ -914,48 +914,35 @@ combine_validate_cost (rtx_insn *i0, rtx_insn *i1, rtx_insn *i2, rtx_insn *i3,
>
>    /* Disallow this combination if both new_cost and old_cost are greater than
>       zero, and new_cost is greater than old cost.  */
> -  if (old_cost > 0 && new_cost > old_cost)
> -    {
> -      if (dump_file)
> -       {
> -         if (i0)
> -           {
> -             fprintf (dump_file,
> -                      "rejecting combination of insns %d, %d, %d and %d\n",
> -                      INSN_UID (i0), INSN_UID (i1), INSN_UID (i2),
> -                      INSN_UID (i3));
> -             fprintf (dump_file, "original costs %d + %d + %d + %d = %d\n",
> -                      i0_cost, i1_cost, i2_cost, i3_cost, old_cost);
> -           }
> -         else if (i1)
> -           {
> -             fprintf (dump_file,
> -                      "rejecting combination of insns %d, %d and %d\n",
> -                      INSN_UID (i1), INSN_UID (i2), INSN_UID (i3));
> -             fprintf (dump_file, "original costs %d + %d + %d = %d\n",
> -                      i1_cost, i2_cost, i3_cost, old_cost);
> -           }
> -         else
> -           {
> -             fprintf (dump_file,
> -                      "rejecting combination of insns %d and %d\n",
> -                      INSN_UID (i2), INSN_UID (i3));
> -             fprintf (dump_file, "original costs %d + %d = %d\n",
> -                      i2_cost, i3_cost, old_cost);
> -           }
> +  int reject = old_cost > 0 && new_cost > old_cost;
>
> -         if (newi2pat)
> -           {
> -             fprintf (dump_file, "replacement costs %d + %d = %d\n",
> -                      new_i2_cost, new_i3_cost, new_cost);
> -           }
> -         else
> -           fprintf (dump_file, "replacement cost %d\n", new_cost);
> -       }
> +  if (dump_file)
> +    {
> +      fprintf (dump_file, "%s combination of insns ",
> +              reject ? "rejecting" : "allowing");
> +      if (i0)
> +       fprintf (dump_file, "%d, ", INSN_UID (i0));
> +      if (i1)
> +       fprintf (dump_file, "%d, ", INSN_UID (i1));
> +      fprintf (dump_file, "%d and %d\n", INSN_UID (i2), INSN_UID (i3));
>
> -      return false;
> +      fprintf (dump_file, "original costs ");
> +      if (i0)
> +       fprintf (dump_file, "%d + ", i0_cost);
> +      if (i1)
> +       fprintf (dump_file, "%d + ", i1_cost);
> +      fprintf (dump_file, "%d + %d = %d\n", i2_cost, i3_cost, old_cost);
> +
> +      if (newi2pat)
> +       fprintf (dump_file, "replacement costs %d + %d = %d\n",
> +                new_i2_cost, new_i3_cost, new_cost);
> +      else
> +       fprintf (dump_file, "replacement cost %d\n", new_cost);
>      }
>
> +  if (reject)
> +    return false;
> +
>    /* Update the uid_insn_cost array with the replacement costs.  */
>    INSN_COST (i2) = new_i2_cost;
>    INSN_COST (i3) = new_i3_cost;
> --
> 1.8.1.4
>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 5/5] combine: preferably delete dead SETs in PARALLELs
  2014-11-14 19:37 ` [PATCH 5/5] combine: preferably delete dead SETs in PARALLELs Segher Boessenkool
@ 2014-11-17 21:25   ` Jeff Law
  2014-11-18 12:52     ` Segher Boessenkool
  0 siblings, 1 reply; 26+ messages in thread
From: Jeff Law @ 2014-11-17 21:25 UTC (permalink / raw)
  To: Segher Boessenkool, gcc-patches

On 11/14/14 12:19, Segher Boessenkool wrote:
> If the result of combining some insns is a PARALLEL of two SETs, and that
> is not a recognised insn, and one of the SETs is dead, combine tries to
> use the remaining SET as insn.
>
> This patch changes things around so that the one SET is preferably used,
> not the PARALLEL.
>
>
> 2014-11-14  Segher Boessenkool  <segher@kernel.crashing.org>
>
> gcc/
> 	* combine.c (try_combine): Prefer to delete dead SETs inside
> 	a PARALLEL over keeping them.
OK.  Can this go forward independent of the other changes?  Seems like 
it should.

Does it help pr52714 where we'd like to rip apart a PARALLEL with two 
sets, one of which is dead.  If so, it might allow us to  optimize that 
code better.  Granted, it originally was an m68k issue, but presumably 
it's happening eleswhere or you wouldn't be poking at it :-)


jeff

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 5/5] combine: preferably delete dead SETs in PARALLELs
  2014-11-17 21:25   ` Jeff Law
@ 2014-11-18 12:52     ` Segher Boessenkool
  2014-11-18 13:51       ` Segher Boessenkool
  0 siblings, 1 reply; 26+ messages in thread
From: Segher Boessenkool @ 2014-11-18 12:52 UTC (permalink / raw)
  To: Jeff Law; +Cc: gcc-patches

On Mon, Nov 17, 2014 at 02:13:22PM -0700, Jeff Law wrote:
> On 11/14/14 12:19, Segher Boessenkool wrote:
> >If the result of combining some insns is a PARALLEL of two SETs, and that
> >is not a recognised insn, and one of the SETs is dead, combine tries to
> >use the remaining SET as insn.
> >
> >This patch changes things around so that the one SET is preferably used,
> >not the PARALLEL.
> >
> >
> >2014-11-14  Segher Boessenkool  <segher@kernel.crashing.org>
> >
> >gcc/
> >	* combine.c (try_combine): Prefer to delete dead SETs inside
> >	a PARALLEL over keeping them.
> OK.  Can this go forward independent of the other changes?  Seems like 
> it should.

Yes, only 4 depends on 3, the rest are independent patches.

> Does it help pr52714 where we'd like to rip apart a PARALLEL with two 
> sets, one of which is dead.  If so, it might allow us to  optimize that 
> code better.

It does not seem to fix the testcase.  I'll investigate why not.
You're talking about the

  (parallel [(set (pc) (pc))
             (set (X) (sp))])

right?  I guess the "set pc pc" is not marked as unused...

> Granted, it originally was an m68k issue, but presumably 
> it's happening eleswhere or you wouldn't be poking at it :-)

The case that made me do this is PowerPC, where (with more patches) for

  long long subfM1(long long a) { return 0x1ffffffff - a; }

we generated (-m32)

  subfic 4,4,-1 ; subfic 3,3,1

where that first subfic is

  (parallel [(set (reg 4) (not (reg 4)))
             (set (ca) (const_int 1))])

with that second set dead, so we can just do

  not 4,4 ; subfic 3,3,1

which is cheaper.


Segher

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 5/5] combine: preferably delete dead SETs in PARALLELs
  2014-11-18 12:52     ` Segher Boessenkool
@ 2014-11-18 13:51       ` Segher Boessenkool
  2014-11-18 18:59         ` Jeff Law
  0 siblings, 1 reply; 26+ messages in thread
From: Segher Boessenkool @ 2014-11-18 13:51 UTC (permalink / raw)
  To: Jeff Law; +Cc: gcc-patches

On Tue, Nov 18, 2014 at 06:38:49AM -0600, Segher Boessenkool wrote:
> > Does it help pr52714 where we'd like to rip apart a PARALLEL with two 
> > sets, one of which is dead.  If so, it might allow us to  optimize that 
> > code better.
> 
> It does not seem to fix the testcase.  I'll investigate why not.
> You're talking about the
> 
>   (parallel [(set (pc) (pc))
>              (set (X) (sp))])
> 
> right?  I guess the "set pc pc" is not marked as unused...

The very first thing that is checked for is !(added_sets_2 && i1 == 0)
which matches in this case.  I wonder what that condition is supposed
to accomplish -- "optimisation" I suppose?

Hacking that out, it still won't match because the "set pc pc" is not
considered dead.  It's a no-op, not the same thing.  I'll try to widen
the condition...


Segher

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 5/5] combine: preferably delete dead SETs in PARALLELs
  2014-11-18 13:51       ` Segher Boessenkool
@ 2014-11-18 18:59         ` Jeff Law
  0 siblings, 0 replies; 26+ messages in thread
From: Jeff Law @ 2014-11-18 18:59 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: gcc-patches

On 11/18/14 06:42, Segher Boessenkool wrote:
> On Tue, Nov 18, 2014 at 06:38:49AM -0600, Segher Boessenkool wrote:
>>> Does it help pr52714 where we'd like to rip apart a PARALLEL with two
>>> sets, one of which is dead.  If so, it might allow us to  optimize that
>>> code better.
>>
>> It does not seem to fix the testcase.  I'll investigate why not.
>> You're talking about the
>>
>>    (parallel [(set (pc) (pc))
>>               (set (X) (sp))])
>>
>> right?  I guess the "set pc pc" is not marked as unused...
>
> The very first thing that is checked for is !(added_sets_2 && i1 == 0)
> which matches in this case.  I wonder what that condition is supposed
> to accomplish -- "optimisation" I suppose?
>
> Hacking that out, it still won't match because the "set pc pc" is not
> considered dead.  It's a no-op, not the same thing.  I'll try to widen
> the condition...
Just to be clear, I wouldn't consider fixing this a requirement to get 
your change in.   I think 5/5 was ready to go as-is.  Obviously if you 
can improve on it, that's great ;-)


jeff

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/5] combine: handle I2 a parallel of two SETs
  2014-11-14 19:21 ` [PATCH 2/5] combine: handle I2 a parallel of two SETs Segher Boessenkool
  2014-11-14 19:52   ` Bernd Schmidt
@ 2014-11-25 19:02   ` Jeff Law
  2014-11-25 22:42     ` Segher Boessenkool
  1 sibling, 1 reply; 26+ messages in thread
From: Jeff Law @ 2014-11-25 19:02 UTC (permalink / raw)
  To: Segher Boessenkool, gcc-patches

On 11/14/14 12:19, Segher Boessenkool wrote:
> If I2 is a PARALLEL of two SETs, split it into two instructions, I1
> and I2.  If there already was an I1, rename it to I0.  If there
> already was an I0, don't do anything.
>
> This surprisingly simple patch is enough to let combine handle such
> PARALLELs properly.
It's clever.

>
>
> 2014-11-14  Segher Boessenkool  <segher@kernel.crashing.org>
>
> gcc/
> 	* combine.c (try_combine): If I2 is a PARALLEL of two SETs,
> 	split it into two insns.
So you're virtually serializing the PARALLEL to make combine happy if 
I'm reading this correctly.

THe first thing I worry about is preserving the semantics of a PARALLEL. 
  Specifically that all the inputs are evaluated, then all the side 
effects happen.  So I think one of the checks you need is that the 
destinations of the SETs are not used as source operands in the SETs.

The second thing I worry about handling of match_dup operands.  But 
presumably all the resulting insns must match in one way or another or 
the whole thing gets reset to its prior state.  So I suspect those are 
OK as well.

Related, I was worried about RTL structure sharing, but in the end I 
think those are a non-concern for the same basic reasons as match_dups 
aren't a real worry.


>
> ---
>   gcc/combine.c | 31 +++++++++++++++++++++++++++++++
>   1 file changed, 31 insertions(+)
>
> diff --git a/gcc/combine.c b/gcc/combine.c
> index f7797e7..c4d23e3 100644
> --- a/gcc/combine.c
> +++ b/gcc/combine.c
> @@ -2780,6 +2780,37 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, rtx_insn *i0,
>   	  SUBST_LINK (LOG_LINKS (i2), alloc_insn_link (i1, LOG_LINKS (i2)));
>   	}
>       }
> +
> +  /* If I2 is a PARALLEL of two SETs of REGs (and perhaps some CLOBBERs),
> +     make those two SETs separate I1 and I2 insns, and make an I0 that is
> +     the original I1.  */
> +  if (i0 == 0
Test for NULL.


> +      && GET_CODE (PATTERN (i2)) == PARALLEL
> +      && XVECLEN (PATTERN (i2), 0) >= 2
> +      && GET_CODE (XVECEXP (PATTERN (i2), 0, 0)) == SET
> +      && GET_CODE (XVECEXP (PATTERN (i2), 0, 1)) == SET
> +      && REG_P (SET_DEST (XVECEXP (PATTERN (i2), 0, 0)))
> +      && REG_P (SET_DEST (XVECEXP (PATTERN (i2), 0, 1)))
> +      && !reg_used_between_p (SET_DEST (XVECEXP (PATTERN (i2), 0, 0)), i2, i3)
> +      && !reg_used_between_p (SET_DEST (XVECEXP (PATTERN (i2), 0, 1)), i2, i3)
> +      && (XVECLEN (PATTERN (i2), 0) == 2
> +	  || GET_CODE (XVECEXP (PATTERN (i2), 0, 2)) == CLOBBER))
As noted above, I think you need to verify the set/clobbered operands do 
not conflict with any of the source operands.  Otherwise you run the 
risk of changing the semantics when you rip apart the PARALLEL.

Ah, just saw that Bernd made the same observation.  Good.

And I think while convention has CLOBBERs at the end of insns, I don't 
think that's a hard requirement.  So I think you need a stronger check 
for elements 2 and beyond in the vector.

OK with the direction this is going, but I think another iteration is 
going to be necessary.

Jeff





^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 3/5] combine: add regno field to LOG_LINKS
  2014-11-14 19:29 ` [PATCH 3/5] combine: add regno field to LOG_LINKS Segher Boessenkool
@ 2014-11-25 19:09   ` Jeff Law
  2014-11-25 22:33     ` Segher Boessenkool
  0 siblings, 1 reply; 26+ messages in thread
From: Jeff Law @ 2014-11-25 19:09 UTC (permalink / raw)
  To: Segher Boessenkool, gcc-patches

On 11/14/14 12:19, Segher Boessenkool wrote:
> With this new field in place, we can have LOG_LINKS for insns that set
> more than one register and distribute them properly in distribute_links.
> This then allows many more PARALLELs to be combined.
>
> Also split off new functions can_combine_{def,use}_p from the
> create_log_links function.
>
>
> 2014-11-14  Segher Boessenkool  <segher@kernel.crashing.org>
>
> gcc/
> 	* combine.c (struct insn_link): New field `regno'.
> 	(alloc_insn_link): New parameter `regno'.  Use it.
> 	(find_single_use): Check the new field.
> 	(can_combine_def_p, can_combine_use_p): New functions.  Split
> 	off from ...
> 	(create_log_links): ... here.  Correct data type of `regno'.
> 	Adjust call to alloc_insn_link.
> 	(adjust_for_new_dest): Find regno, use it in call to
> 	alloc_insn_link.
> 	(try_combine): Adjust call to alloc_insn_link.
> 	(distribute_links): Check the new field.
Didn't you lose the check that avoids duplicated LOG_LINKs?   Or is the 
claim that the check is no longer needed because there are no duplicates 
now that we include the register associated with the link?


> +
> +  rtx set = single_set (insn);
> +  gcc_assert (set);
> +
> +  rtx reg = SET_DEST (set);
> +
> +  while (GET_CODE (reg) == ZERO_EXTRACT
> +	 || GET_CODE (reg) == STRICT_LOW_PART
> +	 || GET_CODE (reg) == SUBREG)
> +    reg = XEXP (reg, 0);
> +  gcc_assert (REG_P (reg));
Can REG ever be a hard reg here?  If so, then the SUBREG case needs to 
simplify the hard reg rather than just strip off the SUBREG.


Might be OK, depends on answers to questions above -- holding final 
approval pending those answers.

Jeff

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 4/5] combine: distribute_log_links for PARALLELs of SETs
  2014-11-14 19:35 ` [PATCH 4/5] combine: distribute_log_links for PARALLELs of SETs Segher Boessenkool
@ 2014-11-25 20:06   ` Jeff Law
  2014-11-25 22:38     ` Segher Boessenkool
  0 siblings, 1 reply; 26+ messages in thread
From: Jeff Law @ 2014-11-25 20:06 UTC (permalink / raw)
  To: Segher Boessenkool, gcc-patches

On 11/14/14 12:19, Segher Boessenkool wrote:
> Now that LOG_LINKS are per regno, we can distribute them on PARALLELs
> just fine.  Do so.  This makes PARALLELs not lose their LOG_LINKS early
> when e.g. a trivial reg-reg move is combined, so that they can be used
> in more useful combinations as well.
>
>
> 2014-11-14  Segher Boessenkool  <segher@kernel.crashing.org>
>
> gcc/
> 	* combine.c (distribute_links): Handle multiple SETs.
So the code in distribute_links implies that we're not going to see hard 
register SUBREGs, so ignore my concerns with the prior patch in this 
series WRT hard register SUBREGs.


This is OK once prereqs are approved.

You might consider pushing the two LOG_LINKs related patches forward 
independently of the patch to rip apart the PARALLELs.  Though I think 
that all of the patches are pretty close to being approved.  Your call.

Jeff

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 3/5] combine: add regno field to LOG_LINKS
  2014-11-25 19:09   ` Jeff Law
@ 2014-11-25 22:33     ` Segher Boessenkool
  2014-11-26 18:21       ` Jeff Law
  0 siblings, 1 reply; 26+ messages in thread
From: Segher Boessenkool @ 2014-11-25 22:33 UTC (permalink / raw)
  To: Jeff Law; +Cc: gcc-patches

On Tue, Nov 25, 2014 at 11:46:52AM -0700, Jeff Law wrote:
> On 11/14/14 12:19, Segher Boessenkool wrote:
> >With this new field in place, we can have LOG_LINKS for insns that set
> >more than one register and distribute them properly in distribute_links.
> >This then allows many more PARALLELs to be combined.
> >
> >Also split off new functions can_combine_{def,use}_p from the
> >create_log_links function.
> >
> >
> >2014-11-14  Segher Boessenkool  <segher@kernel.crashing.org>
> >
> >gcc/
> >	* combine.c (struct insn_link): New field `regno'.
> >	(alloc_insn_link): New parameter `regno'.  Use it.
> >	(find_single_use): Check the new field.
> >	(can_combine_def_p, can_combine_use_p): New functions.  Split
> >	off from ...
> >	(create_log_links): ... here.  Correct data type of `regno'.
> >	Adjust call to alloc_insn_link.
> >	(adjust_for_new_dest): Find regno, use it in call to
> >	alloc_insn_link.
> >	(try_combine): Adjust call to alloc_insn_link.
> >	(distribute_links): Check the new field.

> Didn't you lose the check that avoids duplicated LOG_LINKs?

I don't think so; if I did, that's a bug.

> Or is the 
> claim that the check is no longer needed because there are no duplicates 
> now that we include the register associated with the link?

Are you talking about create_log_links?  There can be no duplicates there
(anymore), that would be multiple defs of the same reg in the same insn,
indeed.

I did check all the places that look at links, and adjusted everything
that needed adjusting.  Could have missed something of course...


Segher

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 4/5] combine: distribute_log_links for PARALLELs of SETs
  2014-11-25 20:06   ` Jeff Law
@ 2014-11-25 22:38     ` Segher Boessenkool
  0 siblings, 0 replies; 26+ messages in thread
From: Segher Boessenkool @ 2014-11-25 22:38 UTC (permalink / raw)
  To: Jeff Law; +Cc: gcc-patches

On Tue, Nov 25, 2014 at 12:37:57PM -0700, Jeff Law wrote:
> This is OK once prereqs are approved.

That is only 3/5 now.

> You might consider pushing the two LOG_LINKs related patches forward 
> independently of the patch to rip apart the PARALLELs.  Though I think 
> that all of the patches are pretty close to being approved.  Your call.

I probably will; 2/5 (the "rip" patch) needs updating, and my patch stack
is much too high :-)  Not that [34]/5 do terribly much without 2/5.

Thanks for all the reviews,


Segher

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/5] combine: handle I2 a parallel of two SETs
  2014-11-25 19:02   ` Jeff Law
@ 2014-11-25 22:42     ` Segher Boessenkool
  0 siblings, 0 replies; 26+ messages in thread
From: Segher Boessenkool @ 2014-11-25 22:42 UTC (permalink / raw)
  To: Jeff Law; +Cc: gcc-patches

On Tue, Nov 25, 2014 at 11:35:14AM -0700, Jeff Law wrote:
> On 11/14/14 12:19, Segher Boessenkool wrote:
> >If I2 is a PARALLEL of two SETs, split it into two instructions, I1
> >and I2.  If there already was an I1, rename it to I0.  If there
> >already was an I0, don't do anything.
> >
> >This surprisingly simple patch is enough to let combine handle such
> >PARALLELs properly.
> It's clever.

It does a very similar thing as the code right before it.

> So you're virtually serializing the PARALLEL to make combine happy if 
> I'm reading this correctly.

That is correct.

> THe first thing I worry about is preserving the semantics of a PARALLEL. 
>  Specifically that all the inputs are evaluated, then all the side 
> effects happen.  So I think one of the checks you need is that the 
> destinations of the SETs are not used as source operands in the SETs.

As you say below, Bernd also noticed this; I haven't found the time to
make a new patch yet.  "Soon".

> The second thing I worry about handling of match_dup operands.  But 
> presumably all the resulting insns must match in one way or another or 
> the whole thing gets reset to its prior state.  So I suspect those are 
> OK as well.
> 
> Related, I was worried about RTL structure sharing, but in the end I 
> think those are a non-concern for the same basic reasons as match_dups 
> aren't a real worry.

combine make a copy of any RTL before it modifies it.

> >+  /* If I2 is a PARALLEL of two SETs of REGs (and perhaps some CLOBBERs),
> >+     make those two SETs separate I1 and I2 insns, and make an I0 that is
> >+     the original I1.  */
> >+  if (i0 == 0
> Test for NULL.

All similar code in combine tests for 0.  I'd have written "if (!i0)"
otherwise.

> And I think while convention has CLOBBERs at the end of insns, I don't 
> think that's a hard requirement.  So I think you need a stronger check 
> for elements 2 and beyond in the vector.

I'll check if 0,1 are SETs and all of 2..N-1 are CLOBBERs.  I was thinking
there could be nothing else but SETs and CLOBBERs (which combine already
does expect to always be in that order), but let's be less tricky.

> OK with the direction this is going, but I think another iteration is 
> going to be necessary.

Great, new patch coming up.


Segher

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/5] combine: handle I2 a parallel of two SETs
  2014-11-15 15:21     ` Segher Boessenkool
@ 2014-11-26 16:24       ` Segher Boessenkool
  0 siblings, 0 replies; 26+ messages in thread
From: Segher Boessenkool @ 2014-11-26 16:24 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: gcc-patches

On Sat, Nov 15, 2014 at 06:59:23AM -0600, Segher Boessenkool wrote:
> On Fri, Nov 14, 2014 at 08:35:48PM +0100, Bernd Schmidt wrote:
> > On 11/14/2014 08:19 PM, Segher Boessenkool wrote:
> > >+  /* If I2 is a PARALLEL of two SETs of REGs (and perhaps some CLOBBERs),
> > >+     make those two SETs separate I1 and I2 insns, and make an I0 that is
> > >+     the original I1.  */
> > >+  if (i0 == 0
> > >+      && GET_CODE (PATTERN (i2)) == PARALLEL
> > >+      && XVECLEN (PATTERN (i2), 0) >= 2
> > >+      && GET_CODE (XVECEXP (PATTERN (i2), 0, 0)) == SET
> > >+      && GET_CODE (XVECEXP (PATTERN (i2), 0, 1)) == SET
> > >+      && REG_P (SET_DEST (XVECEXP (PATTERN (i2), 0, 0)))
> > >+      && REG_P (SET_DEST (XVECEXP (PATTERN (i2), 0, 1)))
> > >+      && !reg_used_between_p (SET_DEST (XVECEXP (PATTERN (i2), 0, 0)), 
> > >i2, i3)
> > >+      && !reg_used_between_p (SET_DEST (XVECEXP (PATTERN (i2), 0, 1)), 
> > >i2, i3)
> > 
> > Don't we have other code in combine checking the reg_used_between case?

Actually, no, no we don't.

Under the old regime (before adding the regno field to log_links), a
link links a first insn with its earliest successor that uses any of
the regs that first insn sets.  This guarantees that in try_combine
none of the insns between I2 and I3 will use any of the registers set
by I2.

Under the new regime (patches 3 and 4) a link links a first insn with
its earliest successor that uses a particular reg the first insn sets.
For single sets, this is the same as before; but not so for multiple
sets.

There are only two cases where this matters: this patch, and the code
right before it (that never seems to trigger, not on any target; trying
to figure out if that is true, and if so, why).

New patchset is bootstrapping/testing.

Cheers,


Segher

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 3/5] combine: add regno field to LOG_LINKS
  2014-11-25 22:33     ` Segher Boessenkool
@ 2014-11-26 18:21       ` Jeff Law
  2014-11-26 22:17         ` Segher Boessenkool
  0 siblings, 1 reply; 26+ messages in thread
From: Jeff Law @ 2014-11-26 18:21 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: gcc-patches

On 11/25/14 14:47, Segher Boessenkool wrote:
> On Tue, Nov 25, 2014 at 11:46:52AM -0700, Jeff Law wrote:
>> On 11/14/14 12:19, Segher Boessenkool wrote:
>>> With this new field in place, we can have LOG_LINKS for insns that set
>>> more than one register and distribute them properly in distribute_links.
>>> This then allows many more PARALLELs to be combined.
>>>
>>> Also split off new functions can_combine_{def,use}_p from the
>>> create_log_links function.
>>>
>>>
>>> 2014-11-14  Segher Boessenkool  <segher@kernel.crashing.org>
>>>
>>> gcc/
>>> 	* combine.c (struct insn_link): New field `regno'.
>>> 	(alloc_insn_link): New parameter `regno'.  Use it.
>>> 	(find_single_use): Check the new field.
>>> 	(can_combine_def_p, can_combine_use_p): New functions.  Split
>>> 	off from ...
>>> 	(create_log_links): ... here.  Correct data type of `regno'.
>>> 	Adjust call to alloc_insn_link.
>>> 	(adjust_for_new_dest): Find regno, use it in call to
>>> 	alloc_insn_link.
>>> 	(try_combine): Adjust call to alloc_insn_link.
>>> 	(distribute_links): Check the new field.
>
>> Didn't you lose the check that avoids duplicated LOG_LINKs?
>
> I don't think so; if I did, that's a bug.
>
>> Or is the
>> claim that the check is no longer needed because there are no duplicates
>> now that we include the register associated with the link?
>
> Are you talking about create_log_links?  There can be no duplicates there
> (anymore), that would be multiple defs of the same reg in the same insn,
> indeed.
Yes, I was referring to the code in create_log_links.  You dropped the 
check for duplicate links.  It caught my eye when reading the changes, 
but then I realized the check may no longer be necessary.

Hmm, what about an insn that has two destinations, which happen to be 
upper and lower SUBREGs of a pseudo.  Would that create duplicate links 
after your change?


Jeff

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 3/5] combine: add regno field to LOG_LINKS
  2014-11-26 18:21       ` Jeff Law
@ 2014-11-26 22:17         ` Segher Boessenkool
  2014-12-01 18:14           ` Jeff Law
  0 siblings, 1 reply; 26+ messages in thread
From: Segher Boessenkool @ 2014-11-26 22:17 UTC (permalink / raw)
  To: Jeff Law; +Cc: gcc-patches

On Wed, Nov 26, 2014 at 11:14:36AM -0700, Jeff Law wrote:
> >Are you talking about create_log_links?  There can be no duplicates there
> >(anymore), that would be multiple defs of the same reg in the same insn,
> >indeed.
> Yes, I was referring to the code in create_log_links.  You dropped the 
> check for duplicate links.  It caught my eye when reading the changes, 
> but then I realized the check may no longer be necessary.
> 
> Hmm, what about an insn that has two destinations, which happen to be 
> upper and lower SUBREGs of a pseudo.  Would that create duplicate links 
> after your change?

Yes it would.  And I'm not so certain distribute_log_links handles that
situation gracefully.  Rats.

IMNSHO such RTL should be invalid (it can always be written simpler as
one SET); but there seems to be no such rule.

I'll add the check back.

Wouldn't it be lovely if we could just use DF here...


Segher

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 3/5] combine: add regno field to LOG_LINKS
  2014-11-26 22:17         ` Segher Boessenkool
@ 2014-12-01 18:14           ` Jeff Law
  0 siblings, 0 replies; 26+ messages in thread
From: Jeff Law @ 2014-12-01 18:14 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: gcc-patches

On 11/26/14 14:59, Segher Boessenkool wrote:
> On Wed, Nov 26, 2014 at 11:14:36AM -0700, Jeff Law wrote:
>>> Are you talking about create_log_links?  There can be no duplicates there
>>> (anymore), that would be multiple defs of the same reg in the same insn,
>>> indeed.
>> Yes, I was referring to the code in create_log_links.  You dropped the
>> check for duplicate links.  It caught my eye when reading the changes,
>> but then I realized the check may no longer be necessary.
>>
>> Hmm, what about an insn that has two destinations, which happen to be
>> upper and lower SUBREGs of a pseudo.  Would that create duplicate links
>> after your change?
>
> Yes it would.  And I'm not so certain distribute_log_links handles that
> situation gracefully.  Rats.
>
> IMNSHO such RTL should be invalid (it can always be written simpler as
> one SET); but there seems to be no such rule.
There's no such rule.  And I was just using an example off the top of my 
head.  There may be others that can't necessarily be expressed by a 
single set.

Jeff

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2014-12-01 18:14 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-11-14 19:21 [PATCH 0/5] some combine patches Segher Boessenkool
2014-11-14 19:21 ` [PATCH 1/5] combine: more verbose costs output Segher Boessenkool
2014-11-17  9:56   ` Richard Biener
2014-11-14 19:21 ` [PATCH 2/5] combine: handle I2 a parallel of two SETs Segher Boessenkool
2014-11-14 19:52   ` Bernd Schmidt
2014-11-15 15:21     ` Segher Boessenkool
2014-11-26 16:24       ` Segher Boessenkool
2014-11-25 19:02   ` Jeff Law
2014-11-25 22:42     ` Segher Boessenkool
2014-11-14 19:29 ` [PATCH 3/5] combine: add regno field to LOG_LINKS Segher Boessenkool
2014-11-25 19:09   ` Jeff Law
2014-11-25 22:33     ` Segher Boessenkool
2014-11-26 18:21       ` Jeff Law
2014-11-26 22:17         ` Segher Boessenkool
2014-12-01 18:14           ` Jeff Law
2014-11-14 19:35 ` [PATCH 4/5] combine: distribute_log_links for PARALLELs of SETs Segher Boessenkool
2014-11-25 20:06   ` Jeff Law
2014-11-25 22:38     ` Segher Boessenkool
2014-11-14 19:37 ` [PATCH 5/5] combine: preferably delete dead SETs in PARALLELs Segher Boessenkool
2014-11-17 21:25   ` Jeff Law
2014-11-18 12:52     ` Segher Boessenkool
2014-11-18 13:51       ` Segher Boessenkool
2014-11-18 18:59         ` Jeff Law
2014-11-16 12:08 ` [PATCH 0/5] some combine patches Oleg Endo
2014-11-16 13:52   ` Segher Boessenkool
2014-11-16 14:24     ` Oleg Endo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).