public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH v2] combine: Don't turn (mult (extend x) 2^n) into extract [PR96998]
@ 2020-09-30 10:39 Alex Coplan
  2020-10-08 10:21 ` [PING][PATCH " Alex Coplan
  0 siblings, 1 reply; 8+ messages in thread
From: Alex Coplan @ 2020-09-30 10:39 UTC (permalink / raw)
  To: gcc-patches
  Cc: Segher Boessenkool, Richard Earnshaw, Richard Sandiford, Kyrylo Tkachov

[-- Attachment #1: Type: text/plain, Size: 4924 bytes --]

Currently, make_extraction() identifies where we can emit an ASHIFT of
an extend in place of an extraction, but fails to make the corresponding
canonicalization/simplification when presented with a MULT by a power of
two. Such a representation is canonical when representing a left-shifted
address inside a MEM.

This patch remedies this situation: after the patch, make_extraction()
now also identifies RTXs such as:

(mult:DI (subreg:DI (reg:SI r)) (const_int 2^n))

and rewrites this as:

(mult:DI (sign_extend:DI (reg:SI r)) (const_int 2^n))

instead of using a sign_extract.

(This patch also fixes up a comment in expand_compound_operation() which
appears to have suffered from bitrot.)

This fixes PR96998: an ICE on AArch64 due to an unrecognised
sign_extract insn which was exposed by
r11-2903-g6b3034eaba83935d9f6dfb20d2efbdb34b5b00bf. That change
introduced a canonicalisation in LRA to rewrite mult to shift in address
reloads.

Prior to this patch, the flow was as follows. We start with the
following insn going into combine:

(insn 9 8 10 3 (set (mem:SI (plus:DI (mult:DI (reg:DI 98 [ g ])
                    (const_int 4 [0x4]))
                (reg/f:DI 96)) [3 *i_5+0 S4 A32])
        (asm_operands:SI ("") ("=Q") 0 []
             []
             [] test.c:11)) "test.c":11:5 -1
     (expr_list:REG_DEAD (reg:DI 98 [ g ])
        (nil)))

Then combine turns this into a sign_extract:

(insn 9 8 10 3 (set (mem:SI (plus:DI (sign_extract:DI (mult:DI (subreg:DI (reg/v:SI 92 [ g ]) 0)
                        (const_int 4 [0x4]))
                    (const_int 34 [0x22])
                    (const_int 0 [0]))
                (reg/f:DI 96)) [3 *i_5+0 S4 A32])
        (asm_operands:SI ("") ("=Q") 0 []
             []
             [] test.c:11)) "test.c":11:5 -1
     (expr_list:REG_DEAD (reg/v:SI 92 [ g ])
        (nil)))

Then LRA reloads the address and (prior to the LRA change) we get:

(insn 32 8 9 3 (set (reg:DI 0 x0 [100])
        (plus:DI (sign_extract:DI (mult:DI (reg:DI 0 x0 [orig:92 g ] [92])
                    (const_int 4 [0x4]))
                (const_int 34 [0x22])
                (const_int 0 [0]))
            (reg/f:DI 19 x19 [96]))) "test.c":11:5 283 {*add_extvdi_multp2}
     (nil))
(insn 9 32 10 3 (set (mem:SI (reg:DI 0 x0 [100]) [3 *i_5+0 S4 A32])
        (asm_operands:SI ("") ("=Q") 0 []
             []
             [] test.c:11)) "test.c":11:5 -1
     (nil))

Now observe that insn 32 here is not canonical: firstly, we should be
using an ASHIFT by 2 instead of a MULT by 4, since we're outside of a
MEM. Indeed, the LRA change remedies this, and support for such insns in
the AArch64 backend was dropped in
r11-3033-g2f8ae301f6a125f50b0a758047fcddae7b68daa8.

Now the reason we ICE after the LRA change here is that AArch64 has
never supported the ASHIFT variant of this sign_extract insn. Inspecting
the unrecognised reloaded insn confirms this:

(gdb) p debug(insn)
(insn 33 8 34 3 (set (reg:DI 100)
        (sign_extract:DI (ashift:DI (subreg:DI (reg/v:SI 92 [ g ]) 0)
                (const_int 2 [0x2]))
            (const_int 34 [0x22])
            (const_int 0 [0]))) "test.c":11:5 -1
     (nil))

The thesis of this patch is that combine should _never_ be producing
such an insn. Clearly this should be canonicalised as an extend
operation instead (as combine already does in make_extraction() for the
ASHIFT form). After this change to combine, we get:

(insn 9 8 10 3 (set (mem:SI (plus:DI (mult:DI (sign_extend:DI (reg/v:SI 92 [ g ]))
                    (const_int 4 [0x4]))
                (reg/f:DI 96)) [3 *i_5+0 S4 A32])
        (asm_operands:SI ("") ("=Q") 0 []
             []
             [] test.c:11)) "test.c":11:5 -1
     (expr_list:REG_DEAD (reg/v:SI 92 [ g ])
        (nil)))

coming out of combine, and LRA can happily reload the address:

(insn 32 8 9 3 (set (reg:DI 0 x0 [100])
        (plus:DI (ashift:DI (sign_extend:DI (reg/v:SI 0 x0 [orig:92 g ] [92]))
                (const_int 2 [0x2]))
            (reg/f:DI 19 x19 [96]))) "test.c":11:5 245 {*add_extendsi_shft_di}
     (nil))
(insn 9 32 10 3 (set (mem:SI (reg:DI 0 x0 [100]) [3 *i_5+0 S4 A32])
        (asm_operands:SI ("") ("=Q") 0 []
             []
             [] test.c:11)) "test.c":11:5 -1
     (nil))

and all is well, with nice simple and canonical RTL being used
throughout.

Testing:
 * Bootstrap and regtest on aarch64-linux-gnu, arm-linux-gnueabihf, and
   x86-linux-gnu in progress.

OK for trunk (with AArch64 changes discussed here [0] as a follow-on
patch) provided it passes testing?

Thanks,
Alex

[0] : https://gcc.gnu.org/pipermail/gcc-patches/2020-September/554257.html

---

gcc/ChangeLog:

	PR target/96998
	* combine.c (expand_compound_operation): Tweak variable name in
	comment to match source.
	(make_extraction): Handle mult by power of two in addition to
	ashift.

gcc/testsuite/ChangeLog:

	PR target/96998
	* gcc.c-torture/compile/pr96998.c: New test.

[-- Attachment #2: combine.diff --]
[-- Type: text/x-diff, Size: 2885 bytes --]

diff --git a/gcc/combine.c b/gcc/combine.c
index c88382efbd3..fe8eff2b464 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -7419,8 +7419,8 @@ expand_compound_operation (rtx x)
     }
 
   /* If we reach here, we want to return a pair of shifts.  The inner
-     shift is a left shift of BITSIZE - POS - LEN bits.  The outer
-     shift is a right shift of BITSIZE - LEN bits.  It is arithmetic or
+     shift is a left shift of MODEWIDTH - POS - LEN bits.  The outer
+     shift is a right shift of MODEWIDTH - LEN bits.  It is arithmetic or
      logical depending on the value of UNSIGNEDP.
 
      If this was a ZERO_EXTEND or ZERO_EXTRACT, this pair of shifts will be
@@ -7650,20 +7650,27 @@ make_extraction (machine_mode mode, rtx inner, HOST_WIDE_INT pos,
 	is_mode = GET_MODE (SUBREG_REG (inner));
       inner = SUBREG_REG (inner);
     }
-  else if (GET_CODE (inner) == ASHIFT
+  else if ((GET_CODE (inner) == ASHIFT || GET_CODE (inner) == MULT)
 	   && CONST_INT_P (XEXP (inner, 1))
-	   && pos_rtx == 0 && pos == 0
-	   && len > UINTVAL (XEXP (inner, 1)))
-    {
-      /* We're extracting the least significant bits of an rtx
-	 (ashift X (const_int C)), where LEN > C.  Extract the
-	 least significant (LEN - C) bits of X, giving an rtx
-	 whose mode is MODE, then shift it left C times.  */
-      new_rtx = make_extraction (mode, XEXP (inner, 0),
-			     0, 0, len - INTVAL (XEXP (inner, 1)),
-			     unsignedp, in_dest, in_compare);
-      if (new_rtx != 0)
-	return gen_rtx_ASHIFT (mode, new_rtx, XEXP (inner, 1));
+	   && pos_rtx == 0 && pos == 0)
+    {
+      const HOST_WIDE_INT ci = INTVAL (XEXP (inner, 1));
+      const auto code = GET_CODE (inner);
+      const HOST_WIDE_INT shift_amt = (code == MULT) ? exact_log2 (ci) : ci;
+
+      if (shift_amt > 0 && len > (unsigned HOST_WIDE_INT)shift_amt)
+	{
+	  /* We're extracting the least significant bits of an rtx
+	     (ashift X (const_int C)) or (mult X (const_int (2^C))),
+	     where LEN > C.  Extract the least significant (LEN - C) bits
+	     of X, giving an rtx whose mode is MODE, then shift it left
+	     C times.  */
+	  new_rtx = make_extraction (mode, XEXP (inner, 0),
+				 0, 0, len - shift_amt,
+				 unsignedp, in_dest, in_compare);
+	  if (new_rtx)
+	    return gen_rtx_fmt_ee (code, mode, new_rtx, XEXP (inner, 1));
+	}
     }
   else if (GET_CODE (inner) == TRUNCATE
 	   /* If trying or potentionally trying to extract
diff --git a/gcc/testsuite/gcc.c-torture/compile/pr96998.c b/gcc/testsuite/gcc.c-torture/compile/pr96998.c
new file mode 100644
index 00000000000..a75d5dcfe08
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/compile/pr96998.c
@@ -0,0 +1,15 @@
+/* { dg-do compile { target arm*-*-* aarch64*-*-* } } */
+
+int h(void);
+struct c d;
+struct c {
+  int e[1];
+};
+
+void f(void) {
+  int g;
+  for (;; g = h()) {
+    int *i = &d.e[g];
+    asm("" : "=Q"(*i));
+  }
+}

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2020-10-15  9:14 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-30 10:39 [PATCH v2] combine: Don't turn (mult (extend x) 2^n) into extract [PR96998] Alex Coplan
2020-10-08 10:21 ` [PING][PATCH " Alex Coplan
2020-10-08 20:20   ` Segher Boessenkool
2020-10-09  8:38     ` Alex Coplan
2020-10-09 23:02       ` Segher Boessenkool
2020-10-12 16:19         ` Richard Sandiford
2020-10-12 17:14           ` Segher Boessenkool
2020-10-15  9:14             ` Alex Coplan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).