* [PATCH v3 1/2] RISC-V: avoid LUI based const mat in prologue/epilogue expansion [PR/105733]
@ 2024-05-20 23:32 Vineet Gupta
2024-05-20 23:32 ` [PATCH v3 2/2] RISC-V: avoid LUI based const mat in alloca epilogue expansion Vineet Gupta
2024-05-21 17:30 ` [PATCH v3 1/2] RISC-V: avoid LUI based const mat in prologue/epilogue expansion [PR/105733] Jeff Law
0 siblings, 2 replies; 6+ messages in thread
From: Vineet Gupta @ 2024-05-20 23:32 UTC (permalink / raw)
To: gcc-patches
Cc: Jeff Law, kito.cheng, Palmer Dabbelt, gnu-toolchain, Vineet Gupta
Changes since v2:
- Broke out the hunk corresponding to alloca in epilogue expansion in
a seperate patch.
---
If the constant used for stack offset can be expressed as sum of two S12
values, the constant need not be materialized (in a reg) and instead the
two S12 bits can be added to instructions involved with frame pointer.
This avoids burning a register and more importantly can often get down
to be 2 insn vs. 3.
The prev patches to generally avoid LUI based const materialization didn't
fix this PR and need this directed fix in funcion prologue/epilogue
expansion.
This fix doesn't move the neddle for SPEC, at all, but it is still a
win considering gcc generates one insn fewer than llvm for the test ;-)
gcc-13.1 release | gcc 230823 | |
| g6619b3d4c15c | This patch | clang/llvm
---------------------------------------------------------------------------------
li t0,-4096 | li t0,-4096 | addi sp,sp,-2048 | addi sp,sp,-2048
addi t0,t0,2016 | addi t0,t0,2032 | add sp,sp,-16 | addi sp,sp,-32
li a4,4096 | add sp,sp,t0 | add a5,sp,a0 | add a1,sp,16
add sp,sp,t0 | addi a5,sp,-2032 | sb zero,0(a5) | add a0,a0,a1
li a5,-4096 | add a0,a5,a0 | addi sp,sp,2032 | sb zero,0(a0)
addi a4,a4,-2032 | li t0, 4096 | addi sp,sp,32 | addi sp,sp,2032
add a4,a4,a5 | sb zero,2032(a0) | ret | addi sp,sp,48
addi a5,sp,16 | addi t0,t0,-2032 | | ret
add a5,a4,a5 | add sp,sp,t0 |
add a0,a5,a0 | ret |
li t0,4096 |
sd a5,8(sp) |
sb zero,2032(a0)|
addi t0,t0,-2016 |
add sp,sp,t0 |
ret |
gcc/ChangeLog:
PR target/105733
* config/riscv/riscv.h: New macros for with aligned offsets.
* config/riscv/riscv.cc (riscv_split_sum_of_two_s12): New
function to split a sum of two s12 values into constituents.
(riscv_expand_prologue): Handle offset being sum of two S12.
(riscv_expand_epilogue): Ditto.
* config/riscv/riscv-protos.h (riscv_split_sum_of_two_s12): New.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/pr105733.c: New Test.
* gcc.target/riscv/rvv/autovec/vls/spill-1.c: Adjust to not
expect LUI 4096.
* gcc.target/riscv/rvv/autovec/vls/spill-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls/spill-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls/spill-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls/spill-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls/spill-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls/spill-7.c: Ditto.
Signed-off-by: Vineet Gupta <vineetg@rivosinc.com>
---
gcc/config/riscv/riscv-protos.h | 2 +
gcc/config/riscv/riscv.cc | 54 +++++++++++++++++--
gcc/config/riscv/riscv.h | 7 +++
gcc/testsuite/gcc.target/riscv/pr105733.c | 15 ++++++
.../riscv/rvv/autovec/vls/spill-1.c | 4 +-
.../riscv/rvv/autovec/vls/spill-2.c | 4 +-
.../riscv/rvv/autovec/vls/spill-3.c | 4 +-
.../riscv/rvv/autovec/vls/spill-4.c | 4 +-
.../riscv/rvv/autovec/vls/spill-5.c | 4 +-
.../riscv/rvv/autovec/vls/spill-6.c | 4 +-
.../riscv/rvv/autovec/vls/spill-7.c | 4 +-
11 files changed, 89 insertions(+), 17 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/riscv/pr105733.c
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index c64aae18deb9..0704968561bb 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -167,6 +167,8 @@ extern void riscv_subword_address (rtx, rtx *, rtx *, rtx *, rtx *);
extern void riscv_lshift_subword (machine_mode, rtx, rtx, rtx *);
extern enum memmodel riscv_union_memmodels (enum memmodel, enum memmodel);
extern bool riscv_reg_frame_related (rtx);
+extern void riscv_split_sum_of_two_s12 (HOST_WIDE_INT, HOST_WIDE_INT *,
+ HOST_WIDE_INT *);
/* Routines implemented in riscv-c.cc. */
void riscv_cpu_cpp_builtins (cpp_reader *);
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index d0c22058b8c3..2ecbcf1d0af8 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -4075,6 +4075,32 @@ riscv_split_doubleword_move (rtx dest, rtx src)
riscv_emit_move (riscv_subword (dest, true), riscv_subword (src, true));
}
}
+
+/* Constant VAL is known to be sum of two S12 constants. Break it into
+ comprising BASE and OFF.
+ Numerically S12 is -2048 to 2047, however it uses the more conservative
+ range -2048 to 2032 as offsets pertain to stack related registers. */
+
+void
+riscv_split_sum_of_two_s12 (HOST_WIDE_INT val, HOST_WIDE_INT *base,
+ HOST_WIDE_INT *off)
+{
+ if (SUM_OF_TWO_S12_N (val))
+ {
+ *base = -2048;
+ *off = val - (-2048);
+ }
+ else if (SUM_OF_TWO_S12_P_ALGN (val))
+ {
+ *base = 2032;
+ *off = val - 2032;
+ }
+ else
+ {
+ gcc_unreachable ();
+ }
+}
+
\f
/* Return the appropriate instructions to move SRC into DEST. Assume
that SRC is operand 1 and DEST is operand 0. */
@@ -7864,6 +7890,17 @@ riscv_expand_prologue (void)
GEN_INT (-constant_frame));
RTX_FRAME_RELATED_P (emit_insn (insn)) = 1;
}
+ else if (SUM_OF_TWO_S12_ALGN (-constant_frame))
+ {
+ HOST_WIDE_INT one, two;
+ riscv_split_sum_of_two_s12 (-constant_frame, &one, &two);
+ insn = gen_add3_insn (stack_pointer_rtx, stack_pointer_rtx,
+ GEN_INT (one));
+ RTX_FRAME_RELATED_P (emit_insn (insn)) = 1;
+ insn = gen_add3_insn (stack_pointer_rtx, stack_pointer_rtx,
+ GEN_INT (two));
+ RTX_FRAME_RELATED_P (emit_insn (insn)) = 1;
+ }
else
{
riscv_emit_move (RISCV_PROLOGUE_TEMP (Pmode), GEN_INT (-constant_frame));
@@ -8160,10 +8197,21 @@ riscv_expand_epilogue (int style)
/* Get an rtx for STEP1 that we can add to BASE.
Skip if adjust equal to zero. */
- if (step1.to_constant () != 0)
+ HOST_WIDE_INT step1_value = step1.to_constant ();
+ if (step1_value != 0)
{
- rtx adjust = GEN_INT (step1.to_constant ());
- if (!SMALL_OPERAND (step1.to_constant ()))
+ rtx adjust = GEN_INT (step1_value);
+ if (SUM_OF_TWO_S12_ALGN (step1_value))
+ {
+ HOST_WIDE_INT one, two;
+ riscv_split_sum_of_two_s12 (step1_value, &one, &two);
+ insn = emit_insn (gen_add3_insn (stack_pointer_rtx,
+ stack_pointer_rtx,
+ GEN_INT (one)));
+ RTX_FRAME_RELATED_P (insn) = 1;
+ adjust = GEN_INT (two);
+ }
+ else if (!SMALL_OPERAND (step1_value))
{
riscv_emit_move (RISCV_PROLOGUE_TEMP (Pmode), adjust);
adjust = RISCV_PROLOGUE_TEMP (Pmode);
diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
index 0d27c0d378df..d6b14c4d6205 100644
--- a/gcc/config/riscv/riscv.h
+++ b/gcc/config/riscv/riscv.h
@@ -641,6 +641,13 @@ enum reg_class
#define SUM_OF_TWO_S12(VALUE) \
(SUM_OF_TWO_S12_N (VALUE) || SUM_OF_TWO_S12_P (VALUE))
+/* Variant with first value 8 byte aligned if involving stack regs. */
+#define SUM_OF_TWO_S12_P_ALGN(VALUE) \
+ (((VALUE) >= (2032 + 1)) && ((VALUE) <= (2032 * 2)))
+
+#define SUM_OF_TWO_S12_ALGN(VALUE) \
+ (SUM_OF_TWO_S12_N (VALUE) || SUM_OF_TWO_S12_P_ALGN (VALUE))
+
/* If this is a single bit mask, then we can load it with bseti. Special
handling of SImode 0x80000000 on RV64 is done in riscv_build_integer_1. */
#define SINGLE_BIT_MASK_OPERAND(VALUE) \
diff --git a/gcc/testsuite/gcc.target/riscv/pr105733.c b/gcc/testsuite/gcc.target/riscv/pr105733.c
new file mode 100644
index 000000000000..6156c36dc7ef
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/pr105733.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options { -march=rv64gcv -mabi=lp64d } } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" "-Os" "-Oz" } } */
+
+#define BUF_SIZE 2064
+
+void
+foo(unsigned long i)
+{
+ volatile char buf[BUF_SIZE];
+
+ buf[i] = 0;
+}
+
+/* { dg-final { scan-assembler-not {li\t[a-x0-9]+,4096} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-1.c
index b64c73f34f13..6afcf1db593b 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-1.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-1.c
@@ -129,5 +129,5 @@ spill_12 (int8_t *in, int8_t *out)
/* { dg-final { scan-assembler-times {addi\tsp,sp,-256} 1 } } */
/* { dg-final { scan-assembler-times {addi\tsp,sp,-512} 1 } } */
/* { dg-final { scan-assembler-times {addi\tsp,sp,-1024} 1 } } */
-/* { dg-final { scan-assembler-times {addi\tsp,sp,-2048} 1 } } */
-/* { dg-final { scan-assembler-times {li\t[a-x0-9]+,-4096\s+add\tsp,sp,[a-x0-9]+} 1 } } */
+/* { dg-final { scan-assembler-times {addi\tsp,sp,-2048} 3 } } */
+/* { dg-final { scan-assembler-times {addi\tsp,sp,2032} 1 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-2.c
index 8fcdca705384..544e8628a27b 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-2.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-2.c
@@ -120,5 +120,5 @@ spill_11 (int16_t *in, int16_t *out)
/* { dg-final { scan-assembler-times {addi\tsp,sp,-256} 1 } } */
/* { dg-final { scan-assembler-times {addi\tsp,sp,-512} 1 } } */
/* { dg-final { scan-assembler-times {addi\tsp,sp,-1024} 1 } } */
-/* { dg-final { scan-assembler-times {addi\tsp,sp,-2048} 1 } } */
-/* { dg-final { scan-assembler-times {li\t[a-x0-9]+,-4096\s+add\tsp,sp,[a-x0-9]+} 1 } } */
+/* { dg-final { scan-assembler-times {addi\tsp,sp,-2048} 3 } } */
+/* { dg-final { scan-assembler-times {addi\tsp,sp,2032} 1 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-3.c
index ca296ce02d66..4bfeb07e9aca 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-3.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-3.c
@@ -111,5 +111,5 @@ spill_10 (int32_t *in, int32_t *out)
/* { dg-final { scan-assembler-times {addi\tsp,sp,-256} 1 } } */
/* { dg-final { scan-assembler-times {addi\tsp,sp,-512} 1 } } */
/* { dg-final { scan-assembler-times {addi\tsp,sp,-1024} 1 } } */
-/* { dg-final { scan-assembler-times {addi\tsp,sp,-2048} 1 } } */
-/* { dg-final { scan-assembler-times {li\t[a-x0-9]+,-4096\s+add\tsp,sp,[a-x0-9]+} 1 } } */
+/* { dg-final { scan-assembler-times {addi\tsp,sp,-2048} 3 } } */
+/* { dg-final { scan-assembler-times {addi\tsp,sp,2032} 1 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-4.c
index ef61d9a2c0c3..1faf31ffd8e0 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-4.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-4.c
@@ -102,5 +102,5 @@ spill_9 (int64_t *in, int64_t *out)
/* { dg-final { scan-assembler-times {addi\tsp,sp,-256} 1 } } */
/* { dg-final { scan-assembler-times {addi\tsp,sp,-512} 1 } } */
/* { dg-final { scan-assembler-times {addi\tsp,sp,-1024} 1 } } */
-/* { dg-final { scan-assembler-times {addi\tsp,sp,-2048} 1 } } */
-/* { dg-final { scan-assembler-times {li\t[a-x0-9]+,-4096\s+add\tsp,sp,[a-x0-9]+} 1 } } */
+/* { dg-final { scan-assembler-times {addi\tsp,sp,-2048} 3 } } */
+/* { dg-final { scan-assembler-times {addi\tsp,sp,2032} 1 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-5.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-5.c
index 150135a91103..0c8dccc518e3 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-5.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-5.c
@@ -120,5 +120,5 @@ spill_11 (_Float16 *in, _Float16 *out)
/* { dg-final { scan-assembler-times {addi\tsp,sp,-256} 1 } } */
/* { dg-final { scan-assembler-times {addi\tsp,sp,-512} 1 } } */
/* { dg-final { scan-assembler-times {addi\tsp,sp,-1024} 1 } } */
-/* { dg-final { scan-assembler-times {addi\tsp,sp,-2048} 1 } } */
-/* { dg-final { scan-assembler-times {li\t[a-x0-9]+,-4096\s+add\tsp,sp,[a-x0-9]+} 1 } } */
+/* { dg-final { scan-assembler-times {addi\tsp,sp,-2048} 3 } } */
+/* { dg-final { scan-assembler-times {addi\tsp,sp,2032} 1 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-6.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-6.c
index c5d2d0194348..8bf53b84d1cd 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-6.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-6.c
@@ -111,5 +111,5 @@ spill_10 (float *in, float *out)
/* { dg-final { scan-assembler-times {addi\tsp,sp,-256} 1 } } */
/* { dg-final { scan-assembler-times {addi\tsp,sp,-512} 1 } } */
/* { dg-final { scan-assembler-times {addi\tsp,sp,-1024} 1 } } */
-/* { dg-final { scan-assembler-times {addi\tsp,sp,-2048} 1 } } */
-/* { dg-final { scan-assembler-times {li\t[a-x0-9]+,-4096\s+add\tsp,sp,[a-x0-9]+} 1 } } */
+/* { dg-final { scan-assembler-times {addi\tsp,sp,-2048} 3 } } */
+/* { dg-final { scan-assembler-times {addi\tsp,sp,2032} 1 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-7.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-7.c
index 70ca683908db..e3980a295406 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-7.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-7.c
@@ -102,5 +102,5 @@ spill_9 (int64_t *in, int64_t *out)
/* { dg-final { scan-assembler-times {addi\tsp,sp,-256} 1 } } */
/* { dg-final { scan-assembler-times {addi\tsp,sp,-512} 1 } } */
/* { dg-final { scan-assembler-times {addi\tsp,sp,-1024} 1 } } */
-/* { dg-final { scan-assembler-times {addi\tsp,sp,-2048} 1 } } */
-/* { dg-final { scan-assembler-times {li\t[a-x0-9]+,-4096\s+add\tsp,sp,[a-x0-9]+} 1 } } */
+/* { dg-final { scan-assembler-times {addi\tsp,sp,-2048} 3 } } */
+/* { dg-final { scan-assembler-times {addi\tsp,sp,2032} 1 } } */
--
2.34.1
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH v3 2/2] RISC-V: avoid LUI based const mat in alloca epilogue expansion
2024-05-20 23:32 [PATCH v3 1/2] RISC-V: avoid LUI based const mat in prologue/epilogue expansion [PR/105733] Vineet Gupta
@ 2024-05-20 23:32 ` Vineet Gupta
2024-05-21 3:54 ` Jeff Law
2024-05-21 17:38 ` Jeff Law
2024-05-21 17:30 ` [PATCH v3 1/2] RISC-V: avoid LUI based const mat in prologue/epilogue expansion [PR/105733] Jeff Law
1 sibling, 2 replies; 6+ messages in thread
From: Vineet Gupta @ 2024-05-20 23:32 UTC (permalink / raw)
To: gcc-patches
Cc: Jeff Law, kito.cheng, Palmer Dabbelt, gnu-toolchain, Vineet Gupta
This is testsuite clean however there's a dwarf quirk which I want to
run by the experts. The test that was tripping CI has following
fragment:
Before patch | After Patch
------------------------------------------------------
li t0,-4096 | addi sp,s0,-2048
addi t0,t0,560 | .cfi_def_cfa 2, 2048 <- #1
add sp,s0,t0 | addi sp,sp,-1488
.cfi_def_cfa 2, 3536 | .cfi_def_cfa_offset 3536 <- #2
addi sp,sp,1504 | addi sp,sp,1504
.cfi_def_cfa_offset 2032 | .cfi_def_cfa_offset 2032 <- #3
The dwarf insn #1 and #3 seem ok, however #2 seems dubious to me.
---
This is continuing on the prev patch in function epilogue expansion.
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_expand_epilogue): Handle offset
being sum of two S12.
Signed-off-by: Vineet Gupta <vineetg@rivosinc.com>
---
gcc/config/riscv/riscv.cc | 33 ++++++++++++++++++++++++++-------
1 file changed, 26 insertions(+), 7 deletions(-)
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 2ecbcf1d0af8..85df5b7ab498 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -8111,7 +8111,10 @@ riscv_expand_epilogue (int style)
need_barrier_p = false;
poly_int64 adjust_offset = -frame->hard_frame_pointer_offset;
+ rtx dwarf_adj = gen_int_mode (adjust_offset, Pmode);
rtx adjust = NULL_RTX;
+ bool sum_of_two_s12 = false;
+ HOST_WIDE_INT one, two;
if (!adjust_offset.is_constant ())
{
@@ -8123,14 +8126,23 @@ riscv_expand_epilogue (int style)
}
else
{
- if (!SMALL_OPERAND (adjust_offset.to_constant ()))
+ HOST_WIDE_INT adj_off_value = adjust_offset.to_constant ();
+ if (SMALL_OPERAND (adj_off_value))
+ {
+ adjust = GEN_INT (adj_off_value);
+ }
+ else if (SUM_OF_TWO_S12_ALGN (adj_off_value))
+ {
+ riscv_split_sum_of_two_s12 (adj_off_value, &one, &two);
+ dwarf_adj = adjust = GEN_INT (one);
+ sum_of_two_s12 = true;
+ }
+ else
{
riscv_emit_move (RISCV_PROLOGUE_TEMP (Pmode),
- GEN_INT (adjust_offset.to_constant ()));
+ GEN_INT (adj_off_value));
adjust = RISCV_PROLOGUE_TEMP (Pmode);
}
- else
- adjust = GEN_INT (adjust_offset.to_constant ());
}
insn = emit_insn (
@@ -8138,14 +8150,21 @@ riscv_expand_epilogue (int style)
adjust));
rtx dwarf = NULL_RTX;
- rtx cfa_adjust_value = gen_rtx_PLUS (
- Pmode, hard_frame_pointer_rtx,
- gen_int_mode (-frame->hard_frame_pointer_offset, Pmode));
+ rtx cfa_adjust_value = gen_rtx_PLUS (Pmode, hard_frame_pointer_rtx,
+ dwarf_adj);
rtx cfa_adjust_rtx = gen_rtx_SET (stack_pointer_rtx, cfa_adjust_value);
dwarf = alloc_reg_note (REG_CFA_ADJUST_CFA, cfa_adjust_rtx, dwarf);
+
RTX_FRAME_RELATED_P (insn) = 1;
REG_NOTES (insn) = dwarf;
+
+ if (sum_of_two_s12)
+ {
+ insn = emit_insn (gen_add3_insn (stack_pointer_rtx, stack_pointer_rtx,
+ GEN_INT (two)));
+ RTX_FRAME_RELATED_P (insn) = 1;
+ }
}
if (use_restore_libcall || use_multi_pop)
--
2.34.1
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v3 2/2] RISC-V: avoid LUI based const mat in alloca epilogue expansion
2024-05-20 23:32 ` [PATCH v3 2/2] RISC-V: avoid LUI based const mat in alloca epilogue expansion Vineet Gupta
@ 2024-05-21 3:54 ` Jeff Law
2024-05-21 15:15 ` Vineet Gupta
2024-05-21 17:38 ` Jeff Law
1 sibling, 1 reply; 6+ messages in thread
From: Jeff Law @ 2024-05-21 3:54 UTC (permalink / raw)
To: Vineet Gupta, gcc-patches; +Cc: kito.cheng, Palmer Dabbelt, gnu-toolchain
On 5/20/24 5:32 PM, Vineet Gupta wrote:
> This is testsuite clean however there's a dwarf quirk which I want to
> run by the experts. The test that was tripping CI has following
> fragment:
>
> Before patch | After Patch
> ------------------------------------------------------
> li t0,-4096 | addi sp,s0,-2048
> addi t0,t0,560 | .cfi_def_cfa 2, 2048 <- #1
> add sp,s0,t0 | addi sp,sp,-1488
> .cfi_def_cfa 2, 3536 | .cfi_def_cfa_offset 3536 <- #2
> addi sp,sp,1504 | addi sp,sp,1504
> .cfi_def_cfa_offset 2032 | .cfi_def_cfa_offset 2032 <- #3
>
> The dwarf insn #1 and #3 seem ok, however #2 seems dubious to me.
What about it seems dubious? We need a CFA adjustment on each insn that
modifies the stack pointer so that we can unwind at any arbitrary point.
The first adjustment says the prior frame is at sp + 2048. Then it's at
sp + 3536. Then after the final insn the prior frame is at sp+2032.
Jeff
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v3 2/2] RISC-V: avoid LUI based const mat in alloca epilogue expansion
2024-05-21 3:54 ` Jeff Law
@ 2024-05-21 15:15 ` Vineet Gupta
0 siblings, 0 replies; 6+ messages in thread
From: Vineet Gupta @ 2024-05-21 15:15 UTC (permalink / raw)
To: Jeff Law, gcc-patches; +Cc: kito.cheng, Palmer Dabbelt, gnu-toolchain
On 5/20/24 20:54, Jeff Law wrote:
> On 5/20/24 5:32 PM, Vineet Gupta wrote:
>> This is testsuite clean however there's a dwarf quirk which I want to
>> run by the experts. The test that was tripping CI has following
>> fragment:
>>
>> Before patch | After Patch
>> ------------------------------------------------------
>> li t0,-4096 | addi sp,s0,-2048
>> addi t0,t0,560 | .cfi_def_cfa 2, 2048 <- #1
>> add sp,s0,t0 | addi sp,sp,-1488
>> .cfi_def_cfa 2, 3536 | .cfi_def_cfa_offset 3536 <- #2
>> addi sp,sp,1504 | addi sp,sp,1504
>> .cfi_def_cfa_offset 2032 | .cfi_def_cfa_offset 2032 <- #3
>>
>> The dwarf insn #1 and #3 seem ok, however #2 seems dubious to me.
> What about it seems dubious?
My discomfort at claiming I understand dwarf, despite debugging/fixing
the ARC Linux port's in kernel dwarf unwinder :-)
> We need a CFA adjustment on each insn that
> modifies the stack pointer so that we can unwind at any arbitrary point.
Of course.
> The first adjustment says the prior frame is at sp + 2048. Then it's at
> sp + 3536. Then after the final insn the prior frame is at sp+2032.
Yeah I got confused with second one since once it gets anchored to SP
from S0, but you are right it is farther from base CFA now.
-Vineet
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v3 1/2] RISC-V: avoid LUI based const mat in prologue/epilogue expansion [PR/105733]
2024-05-20 23:32 [PATCH v3 1/2] RISC-V: avoid LUI based const mat in prologue/epilogue expansion [PR/105733] Vineet Gupta
2024-05-20 23:32 ` [PATCH v3 2/2] RISC-V: avoid LUI based const mat in alloca epilogue expansion Vineet Gupta
@ 2024-05-21 17:30 ` Jeff Law
1 sibling, 0 replies; 6+ messages in thread
From: Jeff Law @ 2024-05-21 17:30 UTC (permalink / raw)
To: Vineet Gupta, gcc-patches; +Cc: kito.cheng, Palmer Dabbelt, gnu-toolchain
On 5/20/24 5:32 PM, Vineet Gupta wrote:
> Changes since v2:
> - Broke out the hunk corresponding to alloca in epilogue expansion in
> a seperate patch.
> ---
>
> If the constant used for stack offset can be expressed as sum of two S12
> values, the constant need not be materialized (in a reg) and instead the
> two S12 bits can be added to instructions involved with frame pointer.
> This avoids burning a register and more importantly can often get down
> to be 2 insn vs. 3.
>
> The prev patches to generally avoid LUI based const materialization didn't
> fix this PR and need this directed fix in funcion prologue/epilogue
> expansion.
>
> This fix doesn't move the neddle for SPEC, at all, but it is still a
> win considering gcc generates one insn fewer than llvm for the test ;-)
>
> gcc-13.1 release | gcc 230823 | |
> | g6619b3d4c15c | This patch | clang/llvm
> ---------------------------------------------------------------------------------
> li t0,-4096 | li t0,-4096 | addi sp,sp,-2048 | addi sp,sp,-2048
> addi t0,t0,2016 | addi t0,t0,2032 | add sp,sp,-16 | addi sp,sp,-32
> li a4,4096 | add sp,sp,t0 | add a5,sp,a0 | add a1,sp,16
> add sp,sp,t0 | addi a5,sp,-2032 | sb zero,0(a5) | add a0,a0,a1
> li a5,-4096 | add a0,a5,a0 | addi sp,sp,2032 | sb zero,0(a0)
> addi a4,a4,-2032 | li t0, 4096 | addi sp,sp,32 | addi sp,sp,2032
> add a4,a4,a5 | sb zero,2032(a0) | ret | addi sp,sp,48
> addi a5,sp,16 | addi t0,t0,-2032 | | ret
> add a5,a4,a5 | add sp,sp,t0 |
> add a0,a5,a0 | ret |
> li t0,4096 |
> sd a5,8(sp) |
> sb zero,2032(a0)|
> addi t0,t0,-2016 |
> add sp,sp,t0 |
> ret |
>
> gcc/ChangeLog:
> PR target/105733
> * config/riscv/riscv.h: New macros for with aligned offsets.
> * config/riscv/riscv.cc (riscv_split_sum_of_two_s12): New
> function to split a sum of two s12 values into constituents.
> (riscv_expand_prologue): Handle offset being sum of two S12.
> (riscv_expand_epilogue): Ditto.
> * config/riscv/riscv-protos.h (riscv_split_sum_of_two_s12): New.
>
> gcc/testsuite/ChangeLog:
> * gcc.target/riscv/pr105733.c: New Test.
> * gcc.target/riscv/rvv/autovec/vls/spill-1.c: Adjust to not
> expect LUI 4096.
> * gcc.target/riscv/rvv/autovec/vls/spill-2.c: Ditto.
> * gcc.target/riscv/rvv/autovec/vls/spill-3.c: Ditto.
> * gcc.target/riscv/rvv/autovec/vls/spill-4.c: Ditto.
> * gcc.target/riscv/rvv/autovec/vls/spill-5.c: Ditto.
> * gcc.target/riscv/rvv/autovec/vls/spill-6.c: Ditto.
> * gcc.target/riscv/rvv/autovec/vls/spill-7.c: Ditto.
OK
Jeff
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v3 2/2] RISC-V: avoid LUI based const mat in alloca epilogue expansion
2024-05-20 23:32 ` [PATCH v3 2/2] RISC-V: avoid LUI based const mat in alloca epilogue expansion Vineet Gupta
2024-05-21 3:54 ` Jeff Law
@ 2024-05-21 17:38 ` Jeff Law
1 sibling, 0 replies; 6+ messages in thread
From: Jeff Law @ 2024-05-21 17:38 UTC (permalink / raw)
To: Vineet Gupta, gcc-patches; +Cc: kito.cheng, Palmer Dabbelt, gnu-toolchain
On 5/20/24 5:32 PM, Vineet Gupta wrote:
> This is testsuite clean however there's a dwarf quirk which I want to
> run by the experts. The test that was tripping CI has following
> fragment:
>
> Before patch | After Patch
> ------------------------------------------------------
> li t0,-4096 | addi sp,s0,-2048
> addi t0,t0,560 | .cfi_def_cfa 2, 2048 <- #1
> add sp,s0,t0 | addi sp,sp,-1488
> .cfi_def_cfa 2, 3536 | .cfi_def_cfa_offset 3536 <- #2
> addi sp,sp,1504 | addi sp,sp,1504
> .cfi_def_cfa_offset 2032 | .cfi_def_cfa_offset 2032 <- #3
>
> The dwarf insn #1 and #3 seem ok, however #2 seems dubious to me.
>
> ---
>
> This is continuing on the prev patch in function epilogue expansion.
>
> gcc/ChangeLog:
> * config/riscv/riscv.cc (riscv_expand_epilogue): Handle offset
> being sum of two S12.
OK.
jeff
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2024-05-21 17:38 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-05-20 23:32 [PATCH v3 1/2] RISC-V: avoid LUI based const mat in prologue/epilogue expansion [PR/105733] Vineet Gupta
2024-05-20 23:32 ` [PATCH v3 2/2] RISC-V: avoid LUI based const mat in alloca epilogue expansion Vineet Gupta
2024-05-21 3:54 ` Jeff Law
2024-05-21 15:15 ` Vineet Gupta
2024-05-21 17:38 ` Jeff Law
2024-05-21 17:30 ` [PATCH v3 1/2] RISC-V: avoid LUI based const mat in prologue/epilogue expansion [PR/105733] Jeff Law
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).