* [PATCH] postreload: Fix autoinc handling in reload_cse_move2add [PR94516]
@ 2020-04-07 22:55 Jakub Jelinek
2020-04-08 0:05 ` Jeff Law
From: Jakub Jelinek @ 2020-04-07 22:55 UTC
To: Jeff Law; +Cc: gcc-patches
Hi!

The following testcase shows two separate issues caused by the cselib
changes.
One is that through the cselib sp tracking improvements on
... r12 = rsp; rsp -= 8; push cst1; push cst2; push cst3; call
rsp += 32; rsp -= 8; push cst4; push cst5; push cst6; call
rsp += 32; rsp -= 8; push cst7; push cst8; push cst9; call
rsp += 32
reload_cse_simplify_set decides to optimize the rsp += 32 insns
into rsp = r12 because cselib figures that the r12 register holds the right
value. From the pure cost perspective that seems like a win and on its own
at least for -Os that would be beneficial, except that there are those
rsp -= 8 stack adjustments after it, where rsp += 32; rsp -= 8; is optimized
into rsp += 24; by the csa pass, but rsp = r12; rsp -= 8 can't. Dunno
what to do about this part, the PR has a hack in a comment.
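
To make the csa interaction above concrete, here is a minimal sketch in C. It is a toy model and not GCC code: `struct sp_insn`, `SP_ADD`, `SP_COPY` and `combine_sp_adjusts` are hypothetical names invented for illustration. It assumes, as described above, that two constant adjustments of the stack pointer can be merged, while a copy from another register cannot be merged with a following adjustment.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model, not GCC code: a stack-pointer insn is either
   "sp += delta" or "sp = <some other register>".  */
enum sp_kind { SP_ADD, SP_COPY };

struct sp_insn
{
  enum sp_kind kind;
  long delta;			/* Only meaningful for SP_ADD.  */
};

/* Merge adjacent SP_ADD insns in place and return the new insn count.
   An SP_COPY is never merged with its neighbours, because its effect
   is not a known constant offset.  */
static size_t
combine_sp_adjusts (struct sp_insn *insns, size_t n)
{
  size_t out = 0;

  assert (insns != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    {
      if (out > 0
	  && insns[i].kind == SP_ADD
	  && insns[out - 1].kind == SP_ADD)
	insns[out - 1].delta += insns[i].delta;	/* Fold into previous.  */
      else
	insns[out++] = insns[i];		/* Keep as a separate insn.  */
    }
  return out;
}
```

In this model the sequence rsp += 32; rsp -= 8 collapses into a single adjustment of 24, while rsp = r12; rsp -= 8 stays as two insns, which is the lost opportunity described above.
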
Anyway, the following patch fixes the other part, which isn't a missed
optimization, but a wrong-code issue. The problem is that the pushes of
constant are on x86 represented through PRE_MODIFY and while
move2add_note_store has some code to handle {PRE,POST}_{INC,DEC} without
REG_INC note, it doesn't handle {PRE,POST}_MODIFY (that would be enough
to fix this testcase). But additionally it looks misplaced, because
move2add_note_store is only called on the rtxes that are stored into,
while RTX_AUTOINC can happen not just in those, but anywhere else in the
instruction (e.g. pop insn can have autoinc in the SET_SRC MEM).
REG_INC note seems to be required for any autoinc except for stack pointer
autoinc which doesn't have those notes, so this patch just handles
the sp autoinc after the REG_INC note handling loop.
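
The idea of looking at every sub-expression of the insn, rather than only at store destinations, can be sketched with a toy expression tree. This is only an illustration under stated assumptions: `struct rtx`, `enum code`, `SP_REGNUM`, `autoinc_code_p` and `modifies_sp_via_autoinc` below are hypothetical stand-ins and not the real GCC definitions; the actual patch uses FOR_EACH_SUBRTX_VAR and GET_RTX_CLASS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for a few RTL codes; not GCC's real definitions.  */
enum code { REG, CONST_INT, PLUS, MEM, SET, PRE_DEC, POST_INC, PRE_MODIFY };

struct rtx
{
  enum code code;
  int regno;			/* Only meaningful for REG.  */
  const struct rtx *op[2];	/* Operands, NULL when unused.  */
};

#define SP_REGNUM 7		/* Hypothetical stack pointer number.  */

/* True for the address codes that modify their base register.  */
static bool
autoinc_code_p (enum code c)
{
  return c == PRE_DEC || c == POST_INC || c == PRE_MODIFY;
}

/* Return true if X contains, at any depth, a MEM whose address is an
   autoinc expression with the stack pointer as its base.  Both the
   destination and the source side of a SET are visited.  */
static bool
modifies_sp_via_autoinc (const struct rtx *x)
{
  if (x == NULL)
    return false;
  if (x->code == MEM
      && x->op[0] != NULL
      && autoinc_code_p (x->op[0]->code))
    {
      const struct rtx *base = x->op[0]->op[0];
      if (base != NULL && base->code == REG && base->regno == SP_REGNUM)
	return true;
    }
  return (modifies_sp_via_autoinc (x->op[0])
	  || modifies_sp_via_autoinc (x->op[1]));
}
```

With this walker a pop-like insn, whose autoinc sits in the source MEM, is detected just like a push, which a check limited to store destinations would miss. In the patch itself a match invalidates the tracked state by setting reg_mode[STACK_POINTER_REGNUM] to VOIDmode.
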

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2020-04-07  Jakub Jelinek  <jakub@redhat.com>

        PR rtl-optimization/94516
        * postreload.c: Include rtl-iter.h.
        (reload_cse_move2add): Handle SP autoinc here by FOR_EACH_SUBRTX_VAR
        looking for all MEMs with RTX_AUTOINC operand.
        (move2add_note_store): Remove {PRE,POST}_{INC,DEC} handling.

        * gcc.dg/torture/pr94516.c: New test.

--- gcc/postreload.c.jj 2020-01-23 20:07:55.041981751 +0100
+++ gcc/postreload.c 2020-04-07 16:13:56.812293400 +0200
@@ -41,6 +41,7 @@ along with GCC; see the file COPYING3.
#include "tree-pass.h"
#include "dbgcnt.h"
#include "function-abi.h"
+#include "rtl-iter.h"
static int reload_cse_noop_set_p (rtx);
static bool reload_cse_simplify (rtx_insn *, rtx);
@@ -2090,6 +2091,21 @@ reload_cse_move2add (rtx_insn *first)
}
}
}
+
+ /* There are no REG_INC notes for SP autoinc. */
+ subrtx_var_iterator::array_type array;
+ FOR_EACH_SUBRTX_VAR (iter, array, PATTERN (insn), NONCONST)
+ {
+ rtx mem = *iter;
+ if (mem
+ && MEM_P (mem)
+ && GET_RTX_CLASS (GET_CODE (XEXP (mem, 0))) == RTX_AUTOINC)
+ {
+ if (XEXP (XEXP (mem, 0), 0) == stack_pointer_rtx)
+ reg_mode[STACK_POINTER_REGNUM] = VOIDmode;
+ }
+ }
+
note_stores (insn, move2add_note_store, insn);
/* If INSN is a conditional branch, we try to extract an
@@ -2144,17 +2160,6 @@ move2add_note_store (rtx dst, const_rtx
unsigned int regno = 0;
scalar_int_mode mode;
- /* Some targets do argument pushes without adding REG_INC notes. */
-
- if (MEM_P (dst))
- {
- dst = XEXP (dst, 0);
- if (GET_CODE (dst) == PRE_INC || GET_CODE (dst) == POST_INC
- || GET_CODE (dst) == PRE_DEC || GET_CODE (dst) == POST_DEC)
- reg_mode[REGNO (XEXP (dst, 0))] = VOIDmode;
- return;
- }
-
if (GET_CODE (dst) == SUBREG)
regno = subreg_regno (dst);
else if (REG_P (dst))
--- gcc/testsuite/gcc.dg/torture/pr94516.c.jj 2020-04-07 16:50:25.531448655 +0200
+++ gcc/testsuite/gcc.dg/torture/pr94516.c 2020-04-07 16:52:57.703196275 +0200
@@ -0,0 +1,31 @@
+/* PR rtl-optimization/94516 */
+/* { dg-do run } */
+/* { dg-additional-options "-fpie" { target pie } } */
+
+struct S { unsigned char *a; unsigned int b; };
+typedef int V __attribute__((vector_size (sizeof (int) * 4)));
+
+__attribute__((noipa)) void
+foo (const char *a, const char *b, const char *c, const struct S *d, int e, int f, int g, int h, int i)
+{
+ V v = { 1, 2, 3, 4 };
+ asm volatile ("" : : "g" (&v) : "memory");
+ v += (V) { 5, 6, 7, 8 };
+ asm volatile ("" : : "g" (&v) : "memory");
+}
+
+__attribute__((noipa)) void
+bar (void)
+{
+ const struct S s = { "foobarbaz", 9 };
+ foo ("foo", (const char *) 0, "corge", &s, 0, 1, 0, -12, -31);
+ foo ("bar", "quux", "qux", &s, 0, 0, 9, 0, 0);
+ foo ("baz", (const char *) 0, "qux", &s, 1, 0, 0, -12, -32);
+}
+
+int
+main ()
+{
+ bar ();
+ return 0;
+}
Jakub
* Re: [PATCH] postreload: Fix autoinc handling in reload_cse_move2add [PR94516]
2020-04-07 22:55 [PATCH] postreload: Fix autoinc handling in reload_cse_move2add [PR94516] Jakub Jelinek
@ 2020-04-08 0:05 ` Jeff Law
From: Jeff Law @ 2020-04-08 0:05 UTC
To: Jakub Jelinek; +Cc: gcc-patches
On Wed, 2020-04-08 at 00:55 +0200, Jakub Jelinek wrote:
> Hi!
>
> The following testcase shows two separate issues caused by the cselib
> changes.
> One is that through the cselib sp tracking improvements on
> ... r12 = rsp; rsp -= 8; push cst1; push cst2; push cst3; call
> rsp += 32; rsp -= 8; push cst4; push cst5; push cst6; call
> rsp += 32; rsp -= 8; push cst7; push cst8; push cst9; call
> rsp += 32
> reload_cse_simplify_set decides to optimize the rsp += 32 insns
> into rsp = r12 because cselib figures that the r12 register holds the right
> value. From the pure cost perspective that seems like a win and on its own
> at least for -Os that would be beneficial, except that there are those
> rsp -= 8 stack adjustments after it, where rsp += 32; rsp -= 8; is optimized
> into rsp += 24; by the csa pass, but rsp = r12; rsp -= 8 can't. Dunno
> what to do about this part, the PR has a hack in a comment.
Yea, what to do here is a bit unclear. I'd guess the CSA improvements happen
more often than cselib finding a register that just happens to have the right
value lying around. But that's speculation with no hard data on my part...
>
> Anyway, the following patch fixes the other part, which isn't a missed
> optimization, but a wrong-code issue. The problem is that the pushes of
> constant are on x86 represented through PRE_MODIFY and while
> move2add_note_store has some code to handle {PRE,POST}_{INC,DEC} without
> REG_INC note, it doesn't handle {PRE,POST}_MODIFY (that would be enough
> to fix this testcase). But additionally it looks misplaced, because
> move2add_note_store is only called on the rtxes that are stored into,
> while RTX_AUTOINC can happen not just in those, but anywhere else in the
> instruction (e.g. pop insn can have autoinc in the SET_SRC MEM).
> REG_INC note seems to be required for any autoinc except for stack pointer
> autoinc which doesn't have those notes, so this patch just handles
> the sp autoinc after the REG_INC note handling loop.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2020-04-07 Jakub Jelinek <jakub@redhat.com>
>
> PR rtl-optimization/94516
> * postreload.c: Include rtl-iter.h.
> (reload_cse_move2add): Handle SP autoinc here by FOR_EACH_SUBRTX_VAR
> looking for all MEMs with RTX_AUTOINC operand.
> (move2add_note_store): Remove {PRE,POST}_{INC,DEC} handling.
>
> * gcc.dg/torture/pr94516.c: New test.
OK
jeff