From: Vineet Gupta <vineetg@rivosinc.com>
To: gcc-patches@gcc.gnu.org
Cc: Jeff Law <jeffreyalaw@gmail.com>,
kito.cheng@gmail.com, Palmer Dabbelt <palmer@rivosinc.com>,
gnu-toolchain@rivosinc.com, Robin Dapp <rdapp.gcc@gmail.com>,
Vineet Gupta <vineetg@rivosinc.com>
Subject: [gcc-15 3/3] RISC-V: avoid LUI based const mat in prologue/epilogue expansion [PR/105733]
Date: Sat, 16 Mar 2024 10:35:24 -0700 [thread overview]
Message-ID: <20240316173524.1147760-4-vineetg@rivosinc.com> (raw)
In-Reply-To: <20240316173524.1147760-1-vineetg@rivosinc.com>
If the constant used for stack offset can be expressed as sum of two S12
values, the constant need not be materialized (in a reg) and instead the
two S12 bits can be added to instructions involved with frame pointer.
This avoids burning a register and more importantly can often get down
to be 2 insn vs. 3.
The prev patches to generally avoid LUI based const materialization didn't
fix this PR and need this directed fix in funcion prologue/epilogue
expansion.
This fix doesn't move the neddle for SPEC, at all, but it is still a
win considering gcc generates one insn fewer than llvm for the test ;-)
gcc-13.1 release | gcc 230823 | |
| g6619b3d4c15c | This patch | clang/llvm
---------------------------------------------------------------------------------
li t0,-4096 | li t0,-4096 | addi sp,sp,-2048 | addi sp,sp,-2048
addi t0,t0,2016 | addi t0,t0,2032 | add sp,sp,-16 | addi sp,sp,-32
li a4,4096 | add sp,sp,t0 | add a5,sp,a0 | add a1,sp,16
add sp,sp,t0 | addi a5,sp,-2032 | sb zero,0(a5) | add a0,a0,a1
li a5,-4096 | add a0,a5,a0 | addi sp,sp,2032 | sb zero,0(a0)
addi a4,a4,-2032 | li t0, 4096 | addi sp,sp,32 | addi sp,sp,2032
add a4,a4,a5 | sb zero,2032(a0) | ret | addi sp,sp,48
addi a5,sp,16 | addi t0,t0,-2032 | | ret
add a5,a4,a5 | add sp,sp,t0 |
add a0,a5,a0 | ret |
li t0,4096 |
sd a5,8(sp) |
sb zero,2032(a0)|
addi t0,t0,-2016 |
add sp,sp,t0 |
ret |
gcc/ChangeLog:
PR target/105733
* config/riscv/riscv.cc (riscv_split_sum_of_two_s12): New
function to split a sum of two s12 values into constituents.
(riscv_expand_prologue): Handle offset being sum of two S12.
(riscv_expand_epilogue): Ditto.
* config/riscv/riscv-protos.h (riscv_split_sum_of_two_s12): New.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/pr105733.c: New Test.
* gcc.target/riscv/rvv/autovec/vls/spill-1.c: Adjust to not
expect LUI 4096.
* gcc.target/riscv/rvv/autovec/vls/spill-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls/spill-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls/spill-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls/spill-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls/spill-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls/spill-7.c: Ditto.
Signed-off-by: Vineet Gupta <vineetg@rivosinc.com>
---
gcc/config/riscv/riscv-protos.h | 2 +
gcc/config/riscv/riscv.cc | 80 +++++++++++++++++--
gcc/testsuite/gcc.target/riscv/pr105733.c | 15 ++++
.../riscv/rvv/autovec/vls/spill-1.c | 4 +-
.../riscv/rvv/autovec/vls/spill-2.c | 4 +-
.../riscv/rvv/autovec/vls/spill-3.c | 4 +-
.../riscv/rvv/autovec/vls/spill-4.c | 4 +-
.../riscv/rvv/autovec/vls/spill-5.c | 4 +-
.../riscv/rvv/autovec/vls/spill-6.c | 4 +-
.../riscv/rvv/autovec/vls/spill-7.c | 4 +-
10 files changed, 104 insertions(+), 21 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/riscv/pr105733.c
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index f9e407bf5768..8b3f7ce181cc 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -165,6 +165,8 @@ extern void riscv_subword_address (rtx, rtx *, rtx *, rtx *, rtx *);
extern void riscv_lshift_subword (machine_mode, rtx, rtx, rtx *);
extern enum memmodel riscv_union_memmodels (enum memmodel, enum memmodel);
extern bool riscv_reg_frame_related (rtx);
+extern void riscv_split_sum_of_two_s12 (HOST_WIDE_INT, bool,
+ HOST_WIDE_INT *, HOST_WIDE_INT *);
/* Routines implemented in riscv-c.cc. */
void riscv_cpu_cpp_builtins (cpp_reader *);
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 38aebefa2590..b4a7947433a7 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -3810,6 +3810,38 @@ riscv_split_doubleword_move (rtx dest, rtx src)
riscv_emit_move (riscv_subword (dest, true), riscv_subword (src, true));
}
}
+
+/* Constant VAL is known to be sum of two S12 constants. Break it into
+ comprising BASE and OFF.
+ Numerically S12 is -2048 to 2047, however if ALIGN_BASE is true, use the
+ more conservative -2048 to 2032 range for offsets pertaining to stack
+ related registers. */
+
+void
+riscv_split_sum_of_two_s12 (HOST_WIDE_INT val, bool align_base,
+ HOST_WIDE_INT *base, HOST_WIDE_INT *off)
+{
+ if (SUM_OF_TWO_S12_N (val))
+ {
+ *base = -2048;
+ *off = val - (-2048);
+ }
+ else if (SUM_OF_TWO_S12_P (val, true) && align_base)
+ {
+ *base = 2032;
+ *off = val - 2032;
+ }
+ else if (SUM_OF_TWO_S12_P (val, false) && !align_base)
+ {
+ *base = 2047;
+ *off = val - 2047;
+ }
+ else
+ {
+ gcc_unreachable ();
+ }
+}
+
\f
/* Return the appropriate instructions to move SRC into DEST. Assume
that SRC is operand 1 and DEST is operand 0. */
@@ -7366,6 +7398,17 @@ riscv_expand_prologue (void)
GEN_INT (-constant_frame));
RTX_FRAME_RELATED_P (emit_insn (insn)) = 1;
}
+ else if (SUM_OF_TWO_S12 (-constant_frame, true))
+ {
+ HOST_WIDE_INT base, off;
+ riscv_split_sum_of_two_s12 (-constant_frame, true, &base, &off);
+ insn = gen_add3_insn (stack_pointer_rtx, stack_pointer_rtx,
+ GEN_INT (base));
+ RTX_FRAME_RELATED_P (emit_insn (insn)) = 1;
+ insn = gen_add3_insn (stack_pointer_rtx, stack_pointer_rtx,
+ GEN_INT (off));
+ RTX_FRAME_RELATED_P (emit_insn (insn)) = 1;
+ }
else
{
riscv_emit_move (RISCV_PROLOGUE_TEMP (Pmode), GEN_INT (-constant_frame));
@@ -7581,14 +7624,26 @@ riscv_expand_epilogue (int style)
}
else
{
- if (!SMALL_OPERAND (adjust_offset.to_constant ()))
+ HOST_WIDE_INT adj_off_value = adjust_offset.to_constant ();
+ if (SMALL_OPERAND (adj_off_value))
+ {
+ adjust = GEN_INT (adj_off_value);
+ }
+ else if (SUM_OF_TWO_S12 (adj_off_value, true))
+ {
+ HOST_WIDE_INT base, off;
+ riscv_split_sum_of_two_s12 (adj_off_value, true, &base, &off);
+ insn = gen_add3_insn (stack_pointer_rtx, hard_frame_pointer_rtx,
+ GEN_INT (base));
+ RTX_FRAME_RELATED_P (insn) = 1;
+ adjust = GEN_INT (off);
+ }
+ else
{
riscv_emit_move (RISCV_PROLOGUE_TEMP (Pmode),
- GEN_INT (adjust_offset.to_constant ()));
+ GEN_INT (adj_off_value));
adjust = RISCV_PROLOGUE_TEMP (Pmode);
}
- else
- adjust = GEN_INT (adjust_offset.to_constant ());
}
insn = emit_insn (
@@ -7655,10 +7710,21 @@ riscv_expand_epilogue (int style)
/* Get an rtx for STEP1 that we can add to BASE.
Skip if adjust equal to zero. */
- if (step1.to_constant () != 0)
+ HOST_WIDE_INT step1_value = step1.to_constant ();
+ if (step1_value != 0)
{
- rtx adjust = GEN_INT (step1.to_constant ());
- if (!SMALL_OPERAND (step1.to_constant ()))
+ rtx adjust = GEN_INT (step1_value);
+ if (SUM_OF_TWO_S12 (step1_value, true))
+ {
+ HOST_WIDE_INT base, off;
+ riscv_split_sum_of_two_s12 (step1_value, true, &base, &off);
+ insn = emit_insn (gen_add3_insn (stack_pointer_rtx,
+ stack_pointer_rtx,
+ GEN_INT (base)));
+ RTX_FRAME_RELATED_P (insn) = 1;
+ adjust = GEN_INT (off);
+ }
+ else if (!SMALL_OPERAND (step1_value))
{
riscv_emit_move (RISCV_PROLOGUE_TEMP (Pmode), adjust);
adjust = RISCV_PROLOGUE_TEMP (Pmode);
diff --git a/gcc/testsuite/gcc.target/riscv/pr105733.c b/gcc/testsuite/gcc.target/riscv/pr105733.c
new file mode 100644
index 000000000000..6156c36dc7ef
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/pr105733.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options { -march=rv64gcv -mabi=lp64d } } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" "-Os" "-Oz" } } */
+
+#define BUF_SIZE 2064
+
+void
+foo(unsigned long i)
+{
+ volatile char buf[BUF_SIZE];
+
+ buf[i] = 0;
+}
+
+/* { dg-final { scan-assembler-not {li\t[a-x0-9]+,4096} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-1.c
index 842bb630be56..82f4e3c29a7e 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-1.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-1.c
@@ -129,5 +129,5 @@ spill_12 (int8_t *in, int8_t *out)
/* { dg-final { scan-assembler-times {addi\tsp,sp,-256} 1 } } */
/* { dg-final { scan-assembler-times {addi\tsp,sp,-512} 1 } } */
/* { dg-final { scan-assembler-times {addi\tsp,sp,-1024} 1 } } */
-/* { dg-final { scan-assembler-times {addi\tsp,sp,-2048} 1 } } */
-/* { dg-final { scan-assembler-times {li\t[a-x0-9]+,-4096\s+add\tsp,sp,[a-x0-9]+} 1 } } */
+/* { dg-final { scan-assembler-times {addi\tsp,sp,-2048} 3 } } */
+/* { dg-final { scan-assembler-times {addi\tsp,sp,2032} 1 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-2.c
index 8f6ee81b98f2..56cc4cc0c0b8 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-2.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-2.c
@@ -120,5 +120,5 @@ spill_11 (int16_t *in, int16_t *out)
/* { dg-final { scan-assembler-times {addi\tsp,sp,-256} 1 } } */
/* { dg-final { scan-assembler-times {addi\tsp,sp,-512} 1 } } */
/* { dg-final { scan-assembler-times {addi\tsp,sp,-1024} 1 } } */
-/* { dg-final { scan-assembler-times {addi\tsp,sp,-2048} 1 } } */
-/* { dg-final { scan-assembler-times {li\t[a-x0-9]+,-4096\s+add\tsp,sp,[a-x0-9]+} 1 } } */
+/* { dg-final { scan-assembler-times {addi\tsp,sp,-2048} 3 } } */
+/* { dg-final { scan-assembler-times {addi\tsp,sp,2032} 1 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-3.c
index 0f317d6cce5c..9159cc6162df 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-3.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-3.c
@@ -111,5 +111,5 @@ spill_10 (int32_t *in, int32_t *out)
/* { dg-final { scan-assembler-times {addi\tsp,sp,-256} 1 } } */
/* { dg-final { scan-assembler-times {addi\tsp,sp,-512} 1 } } */
/* { dg-final { scan-assembler-times {addi\tsp,sp,-1024} 1 } } */
-/* { dg-final { scan-assembler-times {addi\tsp,sp,-2048} 1 } } */
-/* { dg-final { scan-assembler-times {li\t[a-x0-9]+,-4096\s+add\tsp,sp,[a-x0-9]+} 1 } } */
+/* { dg-final { scan-assembler-times {addi\tsp,sp,-2048} 3 } } */
+/* { dg-final { scan-assembler-times {addi\tsp,sp,2032} 1 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-4.c
index ef61d9a2c0c3..1faf31ffd8e0 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-4.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-4.c
@@ -102,5 +102,5 @@ spill_9 (int64_t *in, int64_t *out)
/* { dg-final { scan-assembler-times {addi\tsp,sp,-256} 1 } } */
/* { dg-final { scan-assembler-times {addi\tsp,sp,-512} 1 } } */
/* { dg-final { scan-assembler-times {addi\tsp,sp,-1024} 1 } } */
-/* { dg-final { scan-assembler-times {addi\tsp,sp,-2048} 1 } } */
-/* { dg-final { scan-assembler-times {li\t[a-x0-9]+,-4096\s+add\tsp,sp,[a-x0-9]+} 1 } } */
+/* { dg-final { scan-assembler-times {addi\tsp,sp,-2048} 3 } } */
+/* { dg-final { scan-assembler-times {addi\tsp,sp,2032} 1 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-5.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-5.c
index b366a4649d87..1d322a27ccc1 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-5.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-5.c
@@ -120,5 +120,5 @@ spill_11 (_Float16 *in, _Float16 *out)
/* { dg-final { scan-assembler-times {addi\tsp,sp,-256} 1 } } */
/* { dg-final { scan-assembler-times {addi\tsp,sp,-512} 1 } } */
/* { dg-final { scan-assembler-times {addi\tsp,sp,-1024} 1 } } */
-/* { dg-final { scan-assembler-times {addi\tsp,sp,-2048} 1 } } */
-/* { dg-final { scan-assembler-times {li\t[a-x0-9]+,-4096\s+add\tsp,sp,[a-x0-9]+} 1 } } */
+/* { dg-final { scan-assembler-times {addi\tsp,sp,-2048} 3 } } */
+/* { dg-final { scan-assembler-times {addi\tsp,sp,2032} 1 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-6.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-6.c
index d35e2a44f79b..487624c7b696 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-6.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-6.c
@@ -111,5 +111,5 @@ spill_10 (float *in, float *out)
/* { dg-final { scan-assembler-times {addi\tsp,sp,-256} 1 } } */
/* { dg-final { scan-assembler-times {addi\tsp,sp,-512} 1 } } */
/* { dg-final { scan-assembler-times {addi\tsp,sp,-1024} 1 } } */
-/* { dg-final { scan-assembler-times {addi\tsp,sp,-2048} 1 } } */
-/* { dg-final { scan-assembler-times {li\t[a-x0-9]+,-4096\s+add\tsp,sp,[a-x0-9]+} 1 } } */
+/* { dg-final { scan-assembler-times {addi\tsp,sp,-2048} 3 } } */
+/* { dg-final { scan-assembler-times {addi\tsp,sp,2032} 1 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-7.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-7.c
index 70ca683908db..e3980a295406 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-7.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-7.c
@@ -102,5 +102,5 @@ spill_9 (int64_t *in, int64_t *out)
/* { dg-final { scan-assembler-times {addi\tsp,sp,-256} 1 } } */
/* { dg-final { scan-assembler-times {addi\tsp,sp,-512} 1 } } */
/* { dg-final { scan-assembler-times {addi\tsp,sp,-1024} 1 } } */
-/* { dg-final { scan-assembler-times {addi\tsp,sp,-2048} 1 } } */
-/* { dg-final { scan-assembler-times {li\t[a-x0-9]+,-4096\s+add\tsp,sp,[a-x0-9]+} 1 } } */
+/* { dg-final { scan-assembler-times {addi\tsp,sp,-2048} 3 } } */
+/* { dg-final { scan-assembler-times {addi\tsp,sp,2032} 1 } } */
--
2.34.1
next prev parent reply other threads:[~2024-03-16 17:35 UTC|newest]
Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-03-16 17:35 [gcc-15 0/3] RISC-V improve stack/array access by constant mat tweak Vineet Gupta
2024-03-16 17:35 ` [gcc-15 1/3] RISC-V: avoid LUI based const materialization ... [part of PR/106265] Vineet Gupta
2024-03-16 20:28 ` Jeff Law
2024-03-19 0:07 ` Vineet Gupta
2024-03-23 5:59 ` Jeff Law
2024-03-16 17:35 ` [gcc-15 2/3] RISC-V: avoid LUI based const mat: keep stack offsets aligned Vineet Gupta
2024-03-16 20:21 ` Jeff Law
2024-03-19 0:27 ` Vineet Gupta
2024-03-19 6:48 ` Andrew Waterman
2024-03-19 13:10 ` Jeff Law
2024-03-19 20:05 ` Vineet Gupta
2024-03-19 20:58 ` Andrew Waterman
2024-03-19 21:17 ` Palmer Dabbelt
2024-03-20 18:57 ` Jeff Law
2024-03-23 6:05 ` Jeff Law
2024-03-16 17:35 ` Vineet Gupta [this message]
2024-03-16 20:27 ` [gcc-15 3/3] RISC-V: avoid LUI based const mat in prologue/epilogue expansion [PR/105733] Jeff Law
2024-03-19 4:41 ` [gcc-15 0/3] RISC-V improve stack/array access by constant mat tweak Jeff Law
2024-03-21 0:45 ` Vineet Gupta
2024-03-21 14:36 ` scheduler queue flush (was Re: [gcc-15 0/3] RISC-V improve stack/array access by constant mat tweak) Vineet Gupta
2024-03-21 14:45 ` Jeff Law
2024-03-21 17:19 ` Vineet Gupta
2024-03-21 19:56 ` Jeff Law
2024-03-22 0:34 ` scheduler queue flush Vineet Gupta
2024-03-22 8:47 ` scheduler queue flush (was Re: [gcc-15 0/3] RISC-V improve stack/array access by constant mat tweak) Richard Biener
2024-03-22 12:29 ` Jeff Law
2024-03-22 16:56 ` Vineet Gupta
2024-03-25 3:05 ` Jeff Law
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240316173524.1147760-4-vineetg@rivosinc.com \
--to=vineetg@rivosinc.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=gnu-toolchain@rivosinc.com \
--cc=jeffreyalaw@gmail.com \
--cc=kito.cheng@gmail.com \
--cc=palmer@rivosinc.com \
--cc=rdapp.gcc@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).