* [PATCH 0/4] [RISC-V] support zcmp extention
@ 2023-06-07 5:52 Fei Gao
2023-06-07 5:52 ` [PATCH 1/4][V4][RISC-V] support cm.push cm.pop cm.popret in zcmp Fei Gao
` (3 more replies)
0 siblings, 4 replies; 17+ messages in thread
From: Fei Gao @ 2023-06-07 5:52 UTC
To: gcc-patches; +Cc: kito.cheng, palmer, jeffreyalaw, sinan.lin, jiawei, Fei Gao
Please note that this series depends on the Zcmp switch that Jiawei posted:
https://gcc.gnu.org/pipermail/gcc-patches/2023-April/615289.html
The first patch follows up on Kito's V3 review.
The other three patches are new.
Fei Gao (4):
[RISC-V] support cm.push cm.pop cm.popret in zcmp
[RISC-V] support cm.popretz in zcmp
[RISC-V] resolve confilct between zcmp multi push/pop and shrink-wrap-separate
[RISC-V] support cm.mva01s cm.mvsa01 in zcmp
gcc/config/riscv/iterators.md | 15 +
gcc/config/riscv/peephole.md | 28 +
gcc/config/riscv/predicates.md | 107 ++
gcc/config/riscv/riscv-protos.h | 1 +
gcc/config/riscv/riscv.cc | 445 ++++-
gcc/config/riscv/riscv.h | 23 +
gcc/config/riscv/riscv.md | 4 +
gcc/config/riscv/zc.md | 1457 +++++++++++++++++
gcc/shrink-wrap.cc | 25 +-
gcc/shrink-wrap.h | 1 +
gcc/testsuite/gcc.target/riscv/cm_mv_rv32.c | 21 +
gcc/testsuite/gcc.target/riscv/rv32e_zcmp.c | 251 +++
gcc/testsuite/gcc.target/riscv/rv32i_zcmp.c | 251 +++
.../riscv/zcmp_shrink_wrap_separate.c | 97 ++
.../riscv/zcmp_shrink_wrap_separate2.c | 97 ++
.../gcc.target/riscv/zcmp_stack_alignment.c | 23 +
16 files changed, 2795 insertions(+), 51 deletions(-)
create mode 100644 gcc/config/riscv/zc.md
create mode 100644 gcc/testsuite/gcc.target/riscv/cm_mv_rv32.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rv32e_zcmp.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rv32i_zcmp.c
create mode 100644 gcc/testsuite/gcc.target/riscv/zcmp_shrink_wrap_separate.c
create mode 100644 gcc/testsuite/gcc.target/riscv/zcmp_shrink_wrap_separate2.c
create mode 100644 gcc/testsuite/gcc.target/riscv/zcmp_stack_alignment.c
--
2.17.1
* [PATCH 1/4][V4][RISC-V] support cm.push cm.pop cm.popret in zcmp
2023-06-07 5:52 [PATCH 0/4] [RISC-V] support zcmp extention Fei Gao
@ 2023-06-07 5:52 ` Fei Gao
2023-06-07 10:11 ` jiawei
2023-08-16 8:33 ` Kito Cheng
2023-06-07 5:52 ` [PATCH 2/4] [RISC-V] support cm.popretz " Fei Gao
` (2 subsequent siblings)
3 siblings, 2 replies; 17+ messages in thread
From: Fei Gao @ 2023-06-07 5:52 UTC
To: gcc-patches; +Cc: kito.cheng, palmer, jeffreyalaw, sinan.lin, jiawei, Fei Gao
Zcmp can share the stack allocation logic already used by save-restore:
pre-allocation by cm.push, followed by step 1 and step 2.
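The sizing rule behind that pre-allocation can be sketched in standalone C. This is only an illustration of the arithmetic in zcmp_base_adj, riscv_zcmp_valid_stack_adj_bytes_p and the spimm selection in riscv_expand_prologue from the diff below; the names align16, base_adj, valid_stack_adj and pick_spimm, and the explicit word_bytes parameter, are hypothetical and not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Mirror ZCMP_SP_INC_STEP and ZCMP_MAX_SPIMM from the riscv.h hunk.  */
enum { SP_INC_STEP = 16, MAX_SPIMM = 3 };

/* Round V up to the next multiple of 16.  */
static long
align16 (long v)
{
  return (v + 15) & ~15L;
}

/* Base adjustment: the register slots (ra plus the s registers) rounded
   up to 16 bytes.  WORD_BYTES is assumed to be 4 on RV32 and 8 on RV64.  */
static long
base_adj (int regs_num, int word_bytes)
{
  assert (regs_num >= 1 && regs_num <= 13);
  return align16 ((long) regs_num * word_bytes);
}

/* A total adjustment is encodable when the part beyond the base is
   0, 16, 32 or 48 bytes, i.e. spimm in 0..3.  */
static bool
valid_stack_adj (long total, int regs_num, int word_bytes)
{
  long extra = total - base_adj (regs_num, word_bytes);
  return extra >= 0
	 && extra <= MAX_SPIMM * SP_INC_STEP
	 && extra % SP_INC_STEP == 0;
}

/* Choose spimm from the frame bytes still to allocate after the base
   adjustment; anything beyond 48 bytes is left to step 1 and step 2.  */
static int
pick_spimm (long remaining)
{
  if (remaining > 2 * SP_INC_STEP)
    return 3;
  if (remaining > SP_INC_STEP)
    return 2;
  if (remaining > 0)
    return 1;
  return 0;
}
```

For example, five registers on RV32 occupy 20 bytes, so the base adjustment is 32; a frame needing 17 more bytes would then select spimm 2 and push with a total adjustment of 64.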
Please note that cm.push stores ra and s0-s11 in the reverse order to that used by save-restore,
so the .cfi directives have been adapted accordingly in this patch.
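The resulting slot layout can be sketched as follows. This is an illustrative standalone C version of the offset computation in riscv_adjust_multi_push_cfi_prologue from the diff below, not code from the patch; the function name cm_push_slot_offset and the register table are hypothetical, and the x-register numbers (ra = x1, s0 = x8, s1 = x9, s2-s11 = x18-x27) are taken from the standard RISC-V ABI register naming.

```c
#include <assert.h>

/* Registers cm.push can save, listed from the highest address slot
   to the lowest: s11 ... s2, s1, s0, ra.  */
static const int push_regs_high_to_low[13] = {
  27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 9, 8, 1
};

/* Return the offset of register REGNO from the new sp after a push of
   SAVED_SIZE bytes, or -1 if REGNO is not part of the push.  MASK has
   bit N set when xN is saved.  */
static int
cm_push_slot_offset (unsigned mask, int regno, int saved_size,
		     int word_bytes)
{
  int saved_cnt = 0;

  assert (word_bytes == 4 || word_bytes == 8);

  /* {ra, s0-s10} is not encodable, so s10 implies s11, as in the patch.  */
  if (mask & (1u << 26))
    mask |= 1u << 27;

  for (int i = 0; i < 13; i++)
    {
      int r = push_regs_high_to_low[i];
      if (!(mask & (1u << r)))
	continue;
      saved_cnt++;
      if (r == regno)
	return saved_size - word_bytes * saved_cnt;
    }
  return -1;
}
```

With ra, s0 and s1 saved on RV32 and a 16-byte adjustment, s1 lands at sp+12, s0 at sp+8 and ra at sp+4, which matches the slot0/slot1/slot2 offsets used by the cm.pop {ra, s0-s1} pattern in zc.md.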
Signed-off-by: Fei Gao <gaofei@eswincomputing.com>
gcc/ChangeLog:
* config/riscv/iterators.md
slot0_offset: slot 0 offset in stack GPRs area in bytes
slot1_offset: slot 1 offset in stack GPRs area in bytes
slot2_offset: likewise
slot3_offset: likewise
slot4_offset: likewise
slot5_offset: likewise
slot6_offset: likewise
slot7_offset: likewise
slot8_offset: likewise
slot9_offset: likewise
slot10_offset: likewise
slot11_offset: likewise
slot12_offset: likewise
* config/riscv/predicates.md
(stack_push_up_to_ra_operand): predicates of stack adjust pushing ra
(stack_push_up_to_s0_operand): predicates of stack adjust pushing ra, s0
(stack_push_up_to_s1_operand): likewise
(stack_push_up_to_s2_operand): likewise
(stack_push_up_to_s3_operand): likewise
(stack_push_up_to_s4_operand): likewise
(stack_push_up_to_s5_operand): likewise
(stack_push_up_to_s6_operand): likewise
(stack_push_up_to_s7_operand): likewise
(stack_push_up_to_s8_operand): likewise
(stack_push_up_to_s9_operand): likewise
(stack_push_up_to_s11_operand): likewise
(stack_pop_up_to_ra_operand): predicates of stack adjust popping ra
(stack_pop_up_to_s0_operand): predicates of stack adjust popping ra, s0
(stack_pop_up_to_s1_operand): likewise
(stack_pop_up_to_s2_operand): likewise
(stack_pop_up_to_s3_operand): likewise
(stack_pop_up_to_s4_operand): likewise
(stack_pop_up_to_s5_operand): likewise
(stack_pop_up_to_s6_operand): likewise
(stack_pop_up_to_s7_operand): likewise
(stack_pop_up_to_s8_operand): likewise
(stack_pop_up_to_s9_operand): likewise
(stack_pop_up_to_s11_operand): likewise
* config/riscv/riscv-protos.h
(riscv_zcmp_valid_stack_adj_bytes_p): declaration
* config/riscv/riscv.cc (struct riscv_frame_info): comment change
(riscv_avoid_multi_push): helper function of riscv_use_multi_push
(riscv_use_multi_push): true if multi push is used
(riscv_multi_push_sregs_count): num of sregs in multi-push
(riscv_multi_push_regs_count): num of regs in multi-push
(riscv_16bytes_align): align to 16 bytes
(riscv_stack_align): moved to a better place
(riscv_save_libcall_count): no functional change
(riscv_compute_frame_info): add zcmp frame info
(riscv_adjust_multi_push_cfi_prologue): adjust cfi for cm.push
(riscv_gen_multi_push_pop_insn): gen function for multi push and pop
(riscv_expand_prologue): allocate stack by cm.push
(riscv_adjust_multi_pop_cfi_epilogue): adjust cfi for cm.pop[ret]
(riscv_expand_epilogue): deallocate stack by cm.pop[ret]
(zcmp_base_adj): calculate stack adjustment base size
(zcmp_additional_adj): calculate stack adjustment additional size
(riscv_zcmp_valid_stack_adj_bytes_p): check if stack adjustment valid
* config/riscv/riscv.h (RETURN_ADDR_MASK): mask of ra
(S0_MASK): likewise
(S1_MASK): likewise
(S2_MASK): likewise
(S3_MASK): likewise
(S4_MASK): likewise
(S5_MASK): likewise
(S6_MASK): likewise
(S7_MASK): likewise
(S8_MASK): likewise
(S9_MASK): likewise
(S10_MASK): likewise
(S11_MASK): likewise
(MULTI_PUSH_GPR_MASK): GPR_MASK that cm.push can cover at most
(ZCMP_MAX_SPIMM): max spimm value
(ZCMP_SP_INC_STEP): zcmp sp increment step
(ZCMP_INVALID_S0S10_SREGS_COUNTS): num of s0-s10
(ZCMP_S0S11_SREGS_COUNTS): num of s0-s11
(ZCMP_MAX_GRP_SLOTS): max slots of pushing and popping in zcmp
* config/riscv/riscv.md: include zc.md
* config/riscv/zc.md: New file. machine description for zcmp
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rv32e_zcmp.c: New test.
* gcc.target/riscv/rv32i_zcmp.c: New test.
* gcc.target/riscv/zcmp_stack_alignment.c: New test.
---
gcc/config/riscv/iterators.md | 15 +
gcc/config/riscv/predicates.md | 96 ++
gcc/config/riscv/riscv-protos.h | 1 +
gcc/config/riscv/riscv.cc | 360 +++++-
gcc/config/riscv/riscv.h | 23 +
gcc/config/riscv/riscv.md | 2 +
gcc/config/riscv/zc.md | 1042 +++++++++++++++++
gcc/testsuite/gcc.target/riscv/rv32e_zcmp.c | 239 ++++
gcc/testsuite/gcc.target/riscv/rv32i_zcmp.c | 239 ++++
.../gcc.target/riscv/zcmp_stack_alignment.c | 23 +
10 files changed, 2000 insertions(+), 40 deletions(-)
create mode 100644 gcc/config/riscv/zc.md
create mode 100644 gcc/testsuite/gcc.target/riscv/rv32e_zcmp.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rv32i_zcmp.c
create mode 100644 gcc/testsuite/gcc.target/riscv/zcmp_stack_alignment.c
diff --git a/gcc/config/riscv/iterators.md b/gcc/config/riscv/iterators.md
index d374a10810c..6ed4174f9cc 100644
--- a/gcc/config/riscv/iterators.md
+++ b/gcc/config/riscv/iterators.md
@@ -120,6 +120,21 @@
(define_mode_attr shiftm1 [(SI "const_si_mask_operand") (DI "const_di_mask_operand")])
(define_mode_attr shiftm1p [(SI "DsS") (DI "DsD")])
+; zcmp mode attribute
+(define_mode_attr slot0_offset [(SI "-4") (DI "-8")])
+(define_mode_attr slot1_offset [(SI "-8") (DI "-16")])
+(define_mode_attr slot2_offset [(SI "-12") (DI "-24")])
+(define_mode_attr slot3_offset [(SI "-16") (DI "-32")])
+(define_mode_attr slot4_offset [(SI "-20") (DI "-40")])
+(define_mode_attr slot5_offset [(SI "-24") (DI "-48")])
+(define_mode_attr slot6_offset [(SI "-28") (DI "-56")])
+(define_mode_attr slot7_offset [(SI "-32") (DI "-64")])
+(define_mode_attr slot8_offset [(SI "-36") (DI "-72")])
+(define_mode_attr slot9_offset [(SI "-40") (DI "-80")])
+(define_mode_attr slot10_offset [(SI "-44") (DI "-88")])
+(define_mode_attr slot11_offset [(SI "-48") (DI "-96")])
+(define_mode_attr slot12_offset [(SI "-52") (DI "-104")])
+
;; -------------------------------------------------------------------
;; Code Iterators
;; -------------------------------------------------------------------
diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
index 04ca6ceabc7..ab67b3332f0 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -65,6 +65,102 @@
(ior (match_operand 0 "const_0_operand")
(match_operand 0 "register_operand")))
+(define_predicate "stack_push_up_to_ra_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 1)")))
+
+(define_predicate "stack_push_up_to_s0_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 2)")))
+
+(define_predicate "stack_push_up_to_s1_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 3)")))
+
+(define_predicate "stack_push_up_to_s2_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 4)")))
+
+(define_predicate "stack_push_up_to_s3_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 5)")))
+
+(define_predicate "stack_push_up_to_s4_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 6)")))
+
+(define_predicate "stack_push_up_to_s5_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 7)")))
+
+(define_predicate "stack_push_up_to_s6_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 8)")))
+
+(define_predicate "stack_push_up_to_s7_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 9)")))
+
+(define_predicate "stack_push_up_to_s8_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 10)")))
+
+(define_predicate "stack_push_up_to_s9_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 11)")))
+
+(define_predicate "stack_push_up_to_s11_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 13)")))
+
+(define_predicate "stack_pop_up_to_ra_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 1)")))
+
+(define_predicate "stack_pop_up_to_s0_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 2)")))
+
+(define_predicate "stack_pop_up_to_s1_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 3)")))
+
+(define_predicate "stack_pop_up_to_s2_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 4)")))
+
+(define_predicate "stack_pop_up_to_s3_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 5)")))
+
+(define_predicate "stack_pop_up_to_s4_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 6)")))
+
+(define_predicate "stack_pop_up_to_s5_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 7)")))
+
+(define_predicate "stack_pop_up_to_s6_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 8)")))
+
+(define_predicate "stack_pop_up_to_s7_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 9)")))
+
+(define_predicate "stack_pop_up_to_s8_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 10)")))
+
+(define_predicate "stack_pop_up_to_s9_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 11)")))
+
+(define_predicate "stack_pop_up_to_s11_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 13)")))
+
;; Only use branch-on-bit sequences when the mask is not an ANDI immediate.
(define_predicate "branch_on_bit_operand"
(and (match_code "const_int")
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 00e1b20c6c6..f23b11622a2 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -56,6 +56,7 @@ extern bool riscv_split_64bit_move_p (rtx, rtx);
extern void riscv_split_doubleword_move (rtx, rtx);
extern const char *riscv_output_move (rtx, rtx);
extern const char *riscv_output_return ();
+extern bool riscv_zcmp_valid_stack_adj_bytes_p(HOST_WIDE_INT, int);
#ifdef RTX_CODE
extern void riscv_expand_int_scc (rtx, enum rtx_code, rtx, rtx);
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 3954c89a039..c476c699f4c 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -126,6 +126,14 @@ struct GTY(()) riscv_frame_info {
/* How much the GPR save/restore routines adjust sp (or 0 if unused). */
unsigned save_libcall_adjustment;
+ /* the minimum number of bytes, in multiples of 16-byte address increments,
+ required to cover the registers in a multi push & pop. */
+ unsigned multi_push_adj_base;
+
+ /* the number of additional 16-byte address increments allocated for the stack frame
+ in a multi push & pop. */
+ unsigned multi_push_adj_addi;
+
/* Offsets of fixed-point and floating-point save areas from frame bottom */
poly_int64 gp_sp_offset;
poly_int64 fp_sp_offset;
@@ -422,6 +430,16 @@ static const struct riscv_tune_info riscv_tune_info_table[] = {
#include "riscv-cores.def"
};
+typedef enum
+{
+ PUSH_IDX = 0,
+ POP_IDX,
+ POPRET_IDX,
+ ZCMP_OP_NUM
+} riscv_zcmp_op_t;
+
+typedef insn_code (* code_for_push_pop_t)(machine_mode);
+
void riscv_frame_info::reset(void)
{
total_size = 0;
@@ -4876,6 +4894,37 @@ riscv_save_reg_p (unsigned int regno)
return false;
}
+/* Return TRUE if Zcmp push and pop insns should be
+ avoided. FALSE otherwise.
+ Only use multi push & pop if all GPRs masked can be covered,
+ and stack access is SP based,
+ and GPRs are at top of the stack frame,
+ and no conflicts in stack allocation with other features */
+static bool
+riscv_avoid_multi_push(const struct riscv_frame_info *frame)
+{
+ if (!TARGET_ZCMP
+ || crtl->calls_eh_return
+ || frame_pointer_needed
+ || cfun->machine->interrupt_handler_p
+ || cfun->machine->varargs_size != 0
+ || crtl->args.pretend_args_size != 0
+ || (frame->mask & ~ MULTI_PUSH_GPR_MASK))
+ return true;
+
+ return false;
+}
+
+/* Determine whether to use multi push insn. */
+static bool
+riscv_use_multi_push(const struct riscv_frame_info *frame)
+{
+ if (riscv_avoid_multi_push (frame))
+ return false;
+
+ return (frame->multi_push_adj_base != 0);
+}
+
/* Return TRUE if a libcall to save/restore GPRs should be
avoided. FALSE otherwise. */
static bool
@@ -4913,6 +4962,51 @@ riscv_save_libcall_count (unsigned mask)
abort ();
}
+/* calculate number of s regs in multi push and pop.
+ Note that {s0-s10} is not valid in Zcmp, use {s0-s11} instead. */
+static unsigned
+riscv_multi_push_sregs_count (unsigned mask)
+{
+ unsigned num = riscv_save_libcall_count (mask);
+ return (num == ZCMP_INVALID_S0S10_SREGS_COUNTS)
+ ? ZCMP_S0S11_SREGS_COUNTS
+ : num;
+}
+
+/* calculate number of regs(ra, s0-sx) in multi push and pop. */
+static unsigned
+riscv_multi_push_regs_count (unsigned mask)
+{
+ /* 1 is for ra */
+ return riscv_multi_push_sregs_count (mask) + 1;
+}
+
+/* Handle 16 bytes align for poly_int. */
+static poly_int64
+riscv_16bytes_align (poly_int64 value)
+{
+ return aligned_upper_bound (value, 16);
+}
+
+static HOST_WIDE_INT
+riscv_16bytes_align (HOST_WIDE_INT value)
+{
+ return ROUND_UP(value, 16);
+}
+
+/* Handle stack align for poly_int. */
+static poly_int64
+riscv_stack_align (poly_int64 value)
+{
+ return aligned_upper_bound (value, PREFERRED_STACK_BOUNDARY / 8);
+}
+
+static HOST_WIDE_INT
+riscv_stack_align (HOST_WIDE_INT value)
+{
+ return RISCV_STACK_ALIGN (value);
+}
+
/* Populate the current function's riscv_frame_info structure.
RISC-V stack frames grown downward. High addresses are at the top.
@@ -4938,7 +5032,7 @@ riscv_save_libcall_count (unsigned mask)
| GPR save area | + UNITS_PER_WORD
| |
+-------------------------------+ <-- stack_pointer_rtx + fp_sp_offset
- | | + UNITS_PER_HWVALUE
+ | | + UNITS_PER_FP_REG
| FPR save area |
| |
+-------------------------------+ <-- frame_pointer_rtx (virtual)
@@ -4957,19 +5051,6 @@ riscv_save_libcall_count (unsigned mask)
static HOST_WIDE_INT riscv_first_stack_step (struct riscv_frame_info *frame, poly_int64 remaining_size);
-/* Handle stack align for poly_int. */
-static poly_int64
-riscv_stack_align (poly_int64 value)
-{
- return aligned_upper_bound (value, PREFERRED_STACK_BOUNDARY / 8);
-}
-
-static HOST_WIDE_INT
-riscv_stack_align (HOST_WIDE_INT value)
-{
- return RISCV_STACK_ALIGN (value);
-}
-
static void
riscv_compute_frame_info (void)
{
@@ -5017,8 +5098,9 @@ riscv_compute_frame_info (void)
if (frame->mask)
{
x_save_size = riscv_stack_align (num_x_saved * UNITS_PER_WORD);
- unsigned num_save_restore = 1 + riscv_save_libcall_count (frame->mask);
+ /* 1 is for ra */
+ unsigned num_save_restore = 1 + riscv_save_libcall_count (frame->mask);
/* Only use save/restore routines if they don't alter the stack size. */
if (riscv_stack_align (num_save_restore * UNITS_PER_WORD) == x_save_size
&& !riscv_avoid_save_libcall ())
@@ -5030,6 +5112,15 @@ riscv_compute_frame_info (void)
frame->save_libcall_adjustment = x_save_size;
}
+
+ if (!riscv_avoid_multi_push (frame))
+ {
+ /* num(ra, s0-sx) */
+ unsigned num_multi_push =
+ riscv_multi_push_regs_count (frame->mask);
+ x_save_size = riscv_stack_align (num_multi_push * UNITS_PER_WORD);
+ frame->multi_push_adj_base = riscv_16bytes_align (x_save_size);
+ }
}
/* At the bottom of the frame are any outgoing stack arguments. */
@@ -5044,7 +5135,15 @@ riscv_compute_frame_info (void)
frame->fp_sp_offset = offset - UNITS_PER_FP_REG;
/* Next are the callee-saved GPRs. */
if (frame->mask)
- offset += x_save_size;
+ {
+ offset += x_save_size;
+ /* align to 16 bytes and add paddings to GPR part to honor
+ both stack alignment and zcmp push/pop size alignment. */
+ if (riscv_use_multi_push (frame)
+ && known_lt(offset,
+ frame->multi_push_adj_base + ZCMP_SP_INC_STEP * ZCMP_MAX_SPIMM))
+ offset = riscv_16bytes_align (offset);
+ }
frame->gp_sp_offset = offset - UNITS_PER_WORD;
/* The hard frame pointer points above the callee-saved GPRs. */
frame->hard_frame_pointer_offset = offset;
@@ -5388,6 +5487,42 @@ riscv_adjust_libcall_cfi_prologue ()
return dwarf;
}
+static rtx
+riscv_adjust_multi_push_cfi_prologue (int saved_size)
+{
+ rtx dwarf = NULL_RTX;
+ rtx adjust_sp_rtx, reg, mem, insn;
+ unsigned int mask = cfun->machine->frame.mask;
+ int offset;
+ int saved_cnt = 0;
+
+ if (mask & S10_MASK)
+ mask |= S11_MASK;
+
+ for (int regno = GP_REG_LAST; regno >= GP_REG_FIRST; regno--)
+ if (BITSET_P (mask & MULTI_PUSH_GPR_MASK, regno - GP_REG_FIRST))
+ {
+ /* The save order is s11-s0, ra
+ from high to low addr. */
+ offset = saved_size - UNITS_PER_WORD * (++saved_cnt);
+
+ reg = gen_rtx_REG (Pmode, regno);
+ mem = gen_frame_mem (Pmode, plus_constant (Pmode,
+ stack_pointer_rtx,
+ offset));
+
+ insn = gen_rtx_SET (mem, reg);
+ dwarf = alloc_reg_note (REG_CFA_OFFSET, insn, dwarf);
+ }
+
+ /* Debug info for adjust sp. */
+ adjust_sp_rtx = gen_rtx_SET (stack_pointer_rtx,
+ plus_constant(Pmode, stack_pointer_rtx, -saved_size));
+ dwarf = alloc_reg_note (REG_CFA_ADJUST_CFA, adjust_sp_rtx,
+ dwarf);
+ return dwarf;
+}
+
static void
riscv_emit_stack_tie (void)
{
@@ -5397,6 +5532,45 @@ riscv_emit_stack_tie (void)
emit_insn (gen_stack_tiedi (stack_pointer_rtx, hard_frame_pointer_rtx));
}
+/*zcmp multi push and pop code_for_push_pop function ptr array */
+const code_for_push_pop_t code_for_push_pop [ZCMP_MAX_GRP_SLOTS][ZCMP_OP_NUM] = {
+ {code_for_gpr_multi_push_up_to_ra, code_for_gpr_multi_pop_up_to_ra,
+ code_for_gpr_multi_popret_up_to_ra},
+ {code_for_gpr_multi_push_up_to_s0, code_for_gpr_multi_pop_up_to_s0,
+ code_for_gpr_multi_popret_up_to_s0},
+ {code_for_gpr_multi_push_up_to_s1, code_for_gpr_multi_pop_up_to_s1,
+ code_for_gpr_multi_popret_up_to_s1},
+ {code_for_gpr_multi_push_up_to_s2, code_for_gpr_multi_pop_up_to_s2,
+ code_for_gpr_multi_popret_up_to_s2},
+ {code_for_gpr_multi_push_up_to_s3, code_for_gpr_multi_pop_up_to_s3,
+ code_for_gpr_multi_popret_up_to_s3},
+ {code_for_gpr_multi_push_up_to_s4, code_for_gpr_multi_pop_up_to_s4,
+ code_for_gpr_multi_popret_up_to_s4},
+ {code_for_gpr_multi_push_up_to_s5, code_for_gpr_multi_pop_up_to_s5,
+ code_for_gpr_multi_popret_up_to_s5},
+ {code_for_gpr_multi_push_up_to_s6, code_for_gpr_multi_pop_up_to_s6,
+ code_for_gpr_multi_popret_up_to_s6},
+ {code_for_gpr_multi_push_up_to_s7, code_for_gpr_multi_pop_up_to_s7,
+ code_for_gpr_multi_popret_up_to_s7},
+ {code_for_gpr_multi_push_up_to_s8, code_for_gpr_multi_pop_up_to_s8,
+ code_for_gpr_multi_popret_up_to_s8},
+ {code_for_gpr_multi_push_up_to_s9, code_for_gpr_multi_pop_up_to_s9,
+ code_for_gpr_multi_popret_up_to_s9},
+ {nullptr, nullptr, nullptr},
+ {code_for_gpr_multi_push_up_to_s11, code_for_gpr_multi_pop_up_to_s11,
+ code_for_gpr_multi_popret_up_to_s11}};
+
+static rtx
+riscv_gen_multi_push_pop_insn (riscv_zcmp_op_t op, HOST_WIDE_INT adj_size,
+ unsigned int regs_num)
+{
+ gcc_assert (op < ZCMP_OP_NUM);
+ gcc_assert (regs_num <= ZCMP_MAX_GRP_SLOTS
+ && regs_num != ZCMP_INVALID_S0S10_SREGS_COUNTS + 1); /* 1 for ra*/
+ rtx stack_adj = GEN_INT (adj_size);
+ return GEN_FCN (code_for_push_pop[regs_num - 1][op] (Pmode)) (stack_adj);
+}
+
/* Expand the "prologue" pattern. */
void
@@ -5405,7 +5579,8 @@ riscv_expand_prologue (void)
struct riscv_frame_info *frame = &cfun->machine->frame;
poly_int64 remaining_size = frame->total_size;
unsigned mask = frame->mask;
- rtx insn;
+ int spimm, multi_push_additional, stack_adj;
+ rtx insn, dwarf = NULL_RTX;
if (flag_stack_usage_info)
current_function_static_stack_size = constant_lower_bound (remaining_size);
@@ -5413,8 +5588,35 @@ riscv_expand_prologue (void)
if (cfun->machine->naked_p)
return;
+ /* prefer multi-push to save-restore libcall. */
+ if (riscv_use_multi_push(frame))
+ {
+ remaining_size -= frame->multi_push_adj_base;
+ if (known_gt(remaining_size, 2 * ZCMP_SP_INC_STEP))
+ spimm = 3;
+ else if (known_gt(remaining_size, ZCMP_SP_INC_STEP))
+ spimm = 2;
+ else if (known_gt(remaining_size, 0))
+ spimm = 1;
+ else
+ spimm = 0;
+ multi_push_additional = spimm * ZCMP_SP_INC_STEP;
+ frame->multi_push_adj_addi = multi_push_additional;
+ remaining_size -= multi_push_additional;
+
+ /* emit multi push insn & dwarf along with it. */
+ stack_adj = frame->multi_push_adj_base + multi_push_additional;
+ insn = emit_insn (riscv_gen_multi_push_pop_insn(PUSH_IDX,
+ -stack_adj, riscv_multi_push_regs_count(frame->mask)));
+ dwarf = riscv_adjust_multi_push_cfi_prologue (stack_adj);
+ RTX_FRAME_RELATED_P (insn) = 1;
+ REG_NOTES (insn) = dwarf;
+
+ /* Temporarily fib that we need not save GPRs. */
+ frame->mask = 0;
+ }
/* When optimizing for size, call a subroutine to save the registers. */
- if (riscv_use_save_libcall (frame))
+ else if (riscv_use_save_libcall (frame))
{
rtx dwarf = NULL_RTX;
dwarf = riscv_adjust_libcall_cfi_prologue ();
@@ -5430,13 +5632,15 @@ riscv_expand_prologue (void)
/* Save the registers. */
if ((frame->mask | frame->fmask) != 0)
{
- HOST_WIDE_INT step1 = riscv_first_stack_step (frame, remaining_size);
-
- insn = gen_add3_insn (stack_pointer_rtx,
- stack_pointer_rtx,
- GEN_INT (-step1));
- RTX_FRAME_RELATED_P (emit_insn (insn)) = 1;
- remaining_size -= step1;
+ if (known_gt (remaining_size, frame->frame_pointer_offset))
+ {
+ HOST_WIDE_INT step1 = riscv_first_stack_step (frame, remaining_size);
+ remaining_size -= step1;
+ insn = gen_add3_insn (stack_pointer_rtx,
+ stack_pointer_rtx,
+ GEN_INT (-step1));
+ RTX_FRAME_RELATED_P (emit_insn (insn)) = 1;
+ }
riscv_for_each_saved_reg (remaining_size, riscv_save_reg, false, false);
}
@@ -5493,6 +5697,32 @@ riscv_expand_prologue (void)
}
}
+static rtx
+riscv_adjust_multi_pop_cfi_epilogue (int saved_size)
+{
+ rtx dwarf = NULL_RTX;
+ rtx adjust_sp_rtx, reg;
+ unsigned int mask = cfun->machine->frame.mask;
+
+ if (mask & S10_MASK)
+ mask |= S11_MASK;
+
+ /* Debug info for adjust sp. */
+ adjust_sp_rtx = gen_rtx_SET (stack_pointer_rtx,
+ plus_constant(Pmode, stack_pointer_rtx, saved_size));
+ dwarf = alloc_reg_note (REG_CFA_ADJUST_CFA, adjust_sp_rtx,
+ dwarf);
+
+ for (int regno = GP_REG_FIRST; regno <= GP_REG_LAST; regno++)
+ if (BITSET_P (mask, regno - GP_REG_FIRST))
+ {
+ reg = gen_rtx_REG (Pmode, regno);
+ dwarf = alloc_reg_note (REG_CFA_RESTORE, reg, dwarf);
+ }
+
+ return dwarf;
+}
+
static rtx
riscv_adjust_libcall_cfi_epilogue ()
{
@@ -5532,10 +5762,18 @@ riscv_expand_epilogue (int style)
struct riscv_frame_info *frame = &cfun->machine->frame;
unsigned mask = frame->mask;
HOST_WIDE_INT step2 = 0;
- bool use_restore_libcall = ((style == NORMAL_RETURN)
- && riscv_use_save_libcall (frame));
- unsigned libcall_size = (use_restore_libcall
- ? frame->save_libcall_adjustment : 0);
+ bool use_multi_pop_normal = ((style == NORMAL_RETURN)
+ && riscv_use_multi_push (frame));
+ bool use_multi_pop_sibcall = ((style == SIBCALL_RETURN)
+ && riscv_use_multi_push (frame));
+ bool use_multi_pop = use_multi_pop_normal || use_multi_pop_sibcall;
+
+ bool use_restore_libcall = !use_multi_pop && ((style == NORMAL_RETURN)
+ && riscv_use_save_libcall (frame));
+ unsigned libcall_size = use_restore_libcall && !use_multi_pop ?
+ frame->save_libcall_adjustment : 0;
+ unsigned multipop_size = use_multi_pop ?
+ frame->multi_push_adj_base + frame->multi_push_adj_addi : 0;
rtx ra = gen_rtx_REG (Pmode, RETURN_ADDR_REGNUM);
rtx insn;
@@ -5606,18 +5844,25 @@ riscv_expand_epilogue (int style)
REG_NOTES (insn) = dwarf;
}
- if (use_restore_libcall)
- frame->mask = 0; /* Temporarily fib for GPRs. */
+ if (use_restore_libcall || use_multi_pop)
+ frame->mask = 0; /* Temporarily fib that we need not save GPRs. */
/* If we need to restore registers, deallocate as much stack as
possible in the second step without going out of range. */
- if ((frame->mask | frame->fmask) != 0)
+ if (use_multi_pop)
+ {
+ if (frame->fmask
+ && known_gt (frame->total_size - multipop_size,
+ frame->frame_pointer_offset))
+ step2 = riscv_first_stack_step (frame, frame->total_size - multipop_size);
+ }
+ else if ((frame->mask | frame->fmask) != 0)
step2 = riscv_first_stack_step (frame, frame->total_size - libcall_size);
- if (use_restore_libcall)
+ if (use_restore_libcall || use_multi_pop)
frame->mask = mask; /* Undo the above fib. */
- poly_int64 step1 = frame->total_size - step2 - libcall_size;
+ poly_int64 step1 = frame->total_size - step2 - libcall_size - multipop_size ;
/* Set TARGET to BASE + STEP1. */
if (known_gt (step1, 0))
@@ -5652,7 +5897,7 @@ riscv_expand_epilogue (int style)
adjust));
rtx dwarf = NULL_RTX;
rtx cfa_adjust_rtx = gen_rtx_PLUS (Pmode, stack_pointer_rtx,
- GEN_INT (step2 + libcall_size));
+ GEN_INT (step2 + libcall_size + multipop_size));
dwarf = alloc_reg_note (REG_CFA_DEF_CFA, cfa_adjust_rtx, dwarf);
RTX_FRAME_RELATED_P (insn) = 1;
@@ -5667,15 +5912,15 @@ riscv_expand_epilogue (int style)
epilogue_cfa_sp_offset = step2;
}
- if (use_restore_libcall)
+ if (use_restore_libcall || use_multi_pop)
frame->mask = 0; /* Temporarily fib that we need not save GPRs. */
/* Restore the registers. */
- riscv_for_each_saved_reg (frame->total_size - step2 - libcall_size,
+ riscv_for_each_saved_reg (frame->total_size - step2 - libcall_size - multipop_size,
riscv_restore_reg,
true, style == EXCEPTION_RETURN);
- if (use_restore_libcall)
+ if (use_restore_libcall || use_multi_pop)
frame->mask = mask; /* Undo the above fib. */
if (need_barrier_p)
@@ -5689,14 +5934,30 @@ riscv_expand_epilogue (int style)
rtx dwarf = NULL_RTX;
rtx cfa_adjust_rtx = gen_rtx_PLUS (Pmode, stack_pointer_rtx,
- GEN_INT (libcall_size));
+ GEN_INT (libcall_size + multipop_size));
dwarf = alloc_reg_note (REG_CFA_DEF_CFA, cfa_adjust_rtx, dwarf);
RTX_FRAME_RELATED_P (insn) = 1;
REG_NOTES (insn) = dwarf;
}
- if (use_restore_libcall)
+ if (use_multi_pop)
+ {
+ unsigned regs_count = riscv_multi_push_regs_count (frame->mask);
+ if (use_multi_pop_normal)
+ insn = emit_jump_insn (
+ riscv_gen_multi_push_pop_insn (POPRET_IDX, multipop_size, regs_count));
+ else
+ insn= emit_insn (
+ riscv_gen_multi_push_pop_insn(POP_IDX, multipop_size, regs_count));
+
+ rtx dwarf = riscv_adjust_multi_pop_cfi_epilogue (multipop_size);
+ RTX_FRAME_RELATED_P (insn) = 1;
+ REG_NOTES (insn) = dwarf;
+ if (use_multi_pop_normal)
+ return;
+ }
+ else if (use_restore_libcall)
{
rtx dwarf = riscv_adjust_libcall_cfi_epilogue ();
insn = emit_insn (gen_gpr_restore (GEN_INT (riscv_save_libcall_count (mask))));
@@ -6980,6 +7241,25 @@ riscv_gen_gpr_save_insn (struct riscv_frame_info *frame)
return gen_rtx_PARALLEL (VOIDmode, vec);
}
+static HOST_WIDE_INT zcmp_base_adj(int regs_num)
+{
+ return riscv_16bytes_align ((regs_num) * GET_MODE_SIZE (word_mode));
+}
+
+static HOST_WIDE_INT zcmp_additional_adj(HOST_WIDE_INT total, int regs_num)
+{
+ return total - zcmp_base_adj(regs_num);
+}
+
+bool riscv_zcmp_valid_stack_adj_bytes_p (HOST_WIDE_INT total, int regs_num)
+{
+ HOST_WIDE_INT additional_bytes = zcmp_additional_adj(total, regs_num);
+ return additional_bytes == 0
+ || additional_bytes == 1 * ZCMP_SP_INC_STEP
+ || additional_bytes == 2 * ZCMP_SP_INC_STEP
+ || additional_bytes == ZCMP_MAX_SPIMM * ZCMP_SP_INC_STEP;
+}
+
/* Return true if it's valid gpr_save pattern. */
bool
diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
index 4541255a8ae..2fa555dce2d 100644
--- a/gcc/config/riscv/riscv.h
+++ b/gcc/config/riscv/riscv.h
@@ -420,6 +420,29 @@ ASM_MISA_SPEC
#define RISCV_CALL_ADDRESS_TEMP(MODE) \
gen_rtx_REG (MODE, RISCV_CALL_ADDRESS_TEMP_REGNUM)
+#define RETURN_ADDR_MASK ( 1 << RETURN_ADDR_REGNUM)
+#define S0_MASK ( 1 << S0_REGNUM)
+#define S1_MASK ( 1 << S1_REGNUM)
+#define S2_MASK ( 1 << S2_REGNUM)
+#define S3_MASK ( 1 << S3_REGNUM)
+#define S4_MASK ( 1 << S4_REGNUM)
+#define S5_MASK ( 1 << S5_REGNUM)
+#define S6_MASK ( 1 << S6_REGNUM)
+#define S7_MASK ( 1 << S7_REGNUM)
+#define S8_MASK ( 1 << S8_REGNUM)
+#define S9_MASK ( 1 << S9_REGNUM)
+#define S10_MASK ( 1 << S10_REGNUM)
+#define S11_MASK ( 1 << S11_REGNUM)
+
+#define MULTI_PUSH_GPR_MASK ( RETURN_ADDR_MASK | S0_MASK | S1_MASK | S2_MASK | S3_MASK \
+ | S4_MASK | S5_MASK | S6_MASK | S7_MASK \
+ | S8_MASK | S9_MASK | S10_MASK | S11_MASK )
+#define ZCMP_MAX_SPIMM 3
+#define ZCMP_SP_INC_STEP 16
+#define ZCMP_INVALID_S0S10_SREGS_COUNTS 11
+#define ZCMP_S0S11_SREGS_COUNTS 12
+#define ZCMP_MAX_GRP_SLOTS 13
+
#define MCOUNT_NAME "_mcount"
#define NO_PROFILE_COUNTERS 1
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index be960583101..c858b3bc9ef 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -113,6 +113,7 @@
(define_constants
[(RETURN_ADDR_REGNUM 1)
+ (SP_REGNUM 2)
(GP_REGNUM 3)
(TP_REGNUM 4)
(T0_REGNUM 5)
@@ -3163,3 +3164,4 @@
(include "sifive-7.md")
(include "thead.md")
(include "vector.md")
+(include "zc.md")
diff --git a/gcc/config/riscv/zc.md b/gcc/config/riscv/zc.md
new file mode 100644
index 00000000000..5c1bf031b8d
--- /dev/null
+++ b/gcc/config/riscv/zc.md
@@ -0,0 +1,1042 @@
+;; Machine description for RISC-V Zc extension.
+;; Copyright (C) 2023 Free Software Foundation, Inc.
+;; Contributed by Fei Gao (gaofei@eswincomputing.com).
+
+;; This file is part of GCC.
+
+;; GCC is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation; either version 3, or (at your option)
+;; any later version.
+
+;; GCC is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3. If not see
+;; <http://www.gnu.org/licenses/>.
+
+(define_insn "@gpr_multi_pop_up_to_ra_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_ra_operand" "I")))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot0_offset>))))]
+ "TARGET_ZCMP"
+ "cm.pop {ra}, %0"
+)
+
+(define_insn "@gpr_multi_pop_up_to_s0_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_s0_operand" "I")))
+ (set (reg:X S0_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot0_offset>))))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot1_offset>))))]
+ "TARGET_ZCMP"
+ "cm.pop {ra, s0}, %0"
+)
+
+(define_insn "@gpr_multi_pop_up_to_s1_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_s1_operand" "I")))
+ (set (reg:X S1_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot0_offset>))))
+ (set (reg:X S0_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot1_offset>))))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot2_offset>))))]
+ "TARGET_ZCMP"
+ "cm.pop {ra, s0-s1}, %0"
+)
+
+(define_insn "@gpr_multi_pop_up_to_s2_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_s2_operand" "I")))
+ (set (reg:X S2_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot0_offset>))))
+ (set (reg:X S1_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot1_offset>))))
+ (set (reg:X S0_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot2_offset>))))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot3_offset>))))]
+ "TARGET_ZCMP"
+ "cm.pop {ra, s0-s2}, %0"
+)
+
+(define_insn "@gpr_multi_pop_up_to_s3_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_s3_operand" "I")))
+ (set (reg:X S3_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot0_offset>))))
+ (set (reg:X S2_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot1_offset>))))
+ (set (reg:X S1_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot2_offset>))))
+ (set (reg:X S0_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot3_offset>))))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot4_offset>))))]
+ "TARGET_ZCMP"
+ "cm.pop {ra, s0-s3}, %0"
+)
+
+(define_insn "@gpr_multi_pop_up_to_s4_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_s4_operand" "I")))
+ (set (reg:X S4_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot0_offset>))))
+ (set (reg:X S3_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot1_offset>))))
+ (set (reg:X S2_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot2_offset>))))
+ (set (reg:X S1_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot3_offset>))))
+ (set (reg:X S0_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot4_offset>))))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot5_offset>))))]
+ "TARGET_ZCMP"
+ "cm.pop {ra, s0-s4}, %0"
+)
+
+(define_insn "@gpr_multi_pop_up_to_s5_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_s5_operand" "I")))
+ (set (reg:X S5_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot0_offset>))))
+ (set (reg:X S4_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot1_offset>))))
+ (set (reg:X S3_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot2_offset>))))
+ (set (reg:X S2_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot3_offset>))))
+ (set (reg:X S1_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot4_offset>))))
+ (set (reg:X S0_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot5_offset>))))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot6_offset>))))]
+ "TARGET_ZCMP"
+ "cm.pop {ra, s0-s5}, %0"
+)
+
+(define_insn "@gpr_multi_pop_up_to_s6_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_s6_operand" "I")))
+ (set (reg:X S6_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot0_offset>))))
+ (set (reg:X S5_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot1_offset>))))
+ (set (reg:X S4_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot2_offset>))))
+ (set (reg:X S3_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot3_offset>))))
+ (set (reg:X S2_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot4_offset>))))
+ (set (reg:X S1_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot5_offset>))))
+ (set (reg:X S0_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot6_offset>))))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot7_offset>))))]
+ "TARGET_ZCMP"
+ "cm.pop {ra, s0-s6}, %0"
+)
+
+(define_insn "@gpr_multi_pop_up_to_s7_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_s7_operand" "I")))
+ (set (reg:X S7_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot0_offset>))))
+ (set (reg:X S6_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot1_offset>))))
+ (set (reg:X S5_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot2_offset>))))
+ (set (reg:X S4_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot3_offset>))))
+ (set (reg:X S3_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot4_offset>))))
+ (set (reg:X S2_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot5_offset>))))
+ (set (reg:X S1_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot6_offset>))))
+ (set (reg:X S0_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot7_offset>))))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot8_offset>))))]
+ "TARGET_ZCMP"
+ "cm.pop {ra, s0-s7}, %0"
+)
+
+(define_insn "@gpr_multi_pop_up_to_s8_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_s8_operand" "I")))
+ (set (reg:X S8_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot0_offset>))))
+ (set (reg:X S7_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot1_offset>))))
+ (set (reg:X S6_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot2_offset>))))
+ (set (reg:X S5_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot3_offset>))))
+ (set (reg:X S4_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot4_offset>))))
+ (set (reg:X S3_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot5_offset>))))
+ (set (reg:X S2_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot6_offset>))))
+ (set (reg:X S1_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot7_offset>))))
+ (set (reg:X S0_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot8_offset>))))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot9_offset>))))]
+ "TARGET_ZCMP"
+ "cm.pop {ra, s0-s8}, %0"
+)
+
+(define_insn "@gpr_multi_pop_up_to_s9_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_s9_operand" "I")))
+ (set (reg:X S9_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot0_offset>))))
+ (set (reg:X S8_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot1_offset>))))
+ (set (reg:X S7_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot2_offset>))))
+ (set (reg:X S6_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot3_offset>))))
+ (set (reg:X S5_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot4_offset>))))
+ (set (reg:X S4_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot5_offset>))))
+ (set (reg:X S3_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot6_offset>))))
+ (set (reg:X S2_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot7_offset>))))
+ (set (reg:X S1_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot8_offset>))))
+ (set (reg:X S0_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot9_offset>))))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot10_offset>))))]
+ "TARGET_ZCMP"
+ "cm.pop {ra, s0-s9}, %0"
+)
+
+(define_insn "@gpr_multi_pop_up_to_s11_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_s11_operand" "I")))
+ (set (reg:X S11_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot0_offset>))))
+ (set (reg:X S10_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot1_offset>))))
+ (set (reg:X S9_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot2_offset>))))
+ (set (reg:X S8_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot3_offset>))))
+ (set (reg:X S7_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot4_offset>))))
+ (set (reg:X S6_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot5_offset>))))
+ (set (reg:X S5_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot6_offset>))))
+ (set (reg:X S4_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot7_offset>))))
+ (set (reg:X S3_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot8_offset>))))
+ (set (reg:X S2_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot9_offset>))))
+ (set (reg:X S1_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot10_offset>))))
+ (set (reg:X S0_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot11_offset>))))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot12_offset>))))]
+ "TARGET_ZCMP"
+ "cm.pop {ra, s0-s11}, %0"
+)
+
+(define_insn "@gpr_multi_popret_up_to_ra_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_ra_operand" "I")))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot0_offset>))))
+ (return)
+ (use (reg:SI RETURN_ADDR_REGNUM))]
+ "TARGET_ZCMP"
+ "cm.popret {ra}, %0"
+)
+
+(define_insn "@gpr_multi_popret_up_to_s0_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_s0_operand" "I")))
+ (set (reg:X S0_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot0_offset>))))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot1_offset>))))
+ (return)
+ (use (reg:SI RETURN_ADDR_REGNUM))]
+ "TARGET_ZCMP"
+ "cm.popret {ra, s0}, %0"
+)
+
+(define_insn "@gpr_multi_popret_up_to_s1_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_s1_operand" "I")))
+ (set (reg:X S1_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot0_offset>))))
+ (set (reg:X S0_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot1_offset>))))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot2_offset>))))
+ (return)
+ (use (reg:SI RETURN_ADDR_REGNUM))]
+ "TARGET_ZCMP"
+ "cm.popret {ra, s0-s1}, %0"
+)
+
+(define_insn "@gpr_multi_popret_up_to_s2_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_s2_operand" "I")))
+ (set (reg:X S2_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot0_offset>))))
+ (set (reg:X S1_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot1_offset>))))
+ (set (reg:X S0_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot2_offset>))))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot3_offset>))))
+ (return)
+ (use (reg:SI RETURN_ADDR_REGNUM))]
+ "TARGET_ZCMP"
+ "cm.popret {ra, s0-s2}, %0"
+)
+
+(define_insn "@gpr_multi_popret_up_to_s3_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_s3_operand" "I")))
+ (set (reg:X S3_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot0_offset>))))
+ (set (reg:X S2_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot1_offset>))))
+ (set (reg:X S1_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot2_offset>))))
+ (set (reg:X S0_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot3_offset>))))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot4_offset>))))
+ (return)
+ (use (reg:SI RETURN_ADDR_REGNUM))]
+ "TARGET_ZCMP"
+ "cm.popret {ra, s0-s3}, %0"
+)
+
+(define_insn "@gpr_multi_popret_up_to_s4_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_s4_operand" "I")))
+ (set (reg:X S4_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot0_offset>))))
+ (set (reg:X S3_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot1_offset>))))
+ (set (reg:X S2_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot2_offset>))))
+ (set (reg:X S1_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot3_offset>))))
+ (set (reg:X S0_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot4_offset>))))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot5_offset>))))
+ (return)
+ (use (reg:SI RETURN_ADDR_REGNUM))]
+ "TARGET_ZCMP"
+ "cm.popret {ra, s0-s4}, %0"
+)
+
+(define_insn "@gpr_multi_popret_up_to_s5_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_s5_operand" "I")))
+ (set (reg:X S5_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot0_offset>))))
+ (set (reg:X S4_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot1_offset>))))
+ (set (reg:X S3_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot2_offset>))))
+ (set (reg:X S2_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot3_offset>))))
+ (set (reg:X S1_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot4_offset>))))
+ (set (reg:X S0_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot5_offset>))))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot6_offset>))))
+ (return)
+ (use (reg:SI RETURN_ADDR_REGNUM))]
+ "TARGET_ZCMP"
+ "cm.popret {ra, s0-s5}, %0"
+)
+
+(define_insn "@gpr_multi_popret_up_to_s6_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_s6_operand" "I")))
+ (set (reg:X S6_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot0_offset>))))
+ (set (reg:X S5_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot1_offset>))))
+ (set (reg:X S4_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot2_offset>))))
+ (set (reg:X S3_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot3_offset>))))
+ (set (reg:X S2_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot4_offset>))))
+ (set (reg:X S1_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot5_offset>))))
+ (set (reg:X S0_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot6_offset>))))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot7_offset>))))
+ (return)
+ (use (reg:SI RETURN_ADDR_REGNUM))]
+ "TARGET_ZCMP"
+ "cm.popret {ra, s0-s6}, %0"
+)
+
+(define_insn "@gpr_multi_popret_up_to_s7_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_s7_operand" "I")))
+ (set (reg:X S7_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot0_offset>))))
+ (set (reg:X S6_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot1_offset>))))
+ (set (reg:X S5_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot2_offset>))))
+ (set (reg:X S4_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot3_offset>))))
+ (set (reg:X S3_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot4_offset>))))
+ (set (reg:X S2_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot5_offset>))))
+ (set (reg:X S1_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot6_offset>))))
+ (set (reg:X S0_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot7_offset>))))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot8_offset>))))
+ (return)
+ (use (reg:SI RETURN_ADDR_REGNUM))]
+ "TARGET_ZCMP"
+ "cm.popret {ra, s0-s7}, %0"
+)
+
+(define_insn "@gpr_multi_popret_up_to_s8_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_s8_operand" "I")))
+ (set (reg:X S8_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot0_offset>))))
+ (set (reg:X S7_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot1_offset>))))
+ (set (reg:X S6_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot2_offset>))))
+ (set (reg:X S5_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot3_offset>))))
+ (set (reg:X S4_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot4_offset>))))
+ (set (reg:X S3_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot5_offset>))))
+ (set (reg:X S2_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot6_offset>))))
+ (set (reg:X S1_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot7_offset>))))
+ (set (reg:X S0_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot8_offset>))))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot9_offset>))))
+ (return)
+ (use (reg:SI RETURN_ADDR_REGNUM))]
+ "TARGET_ZCMP"
+ "cm.popret {ra, s0-s8}, %0"
+)
+
+(define_insn "@gpr_multi_popret_up_to_s9_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_s9_operand" "I")))
+ (set (reg:X S9_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot0_offset>))))
+ (set (reg:X S8_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot1_offset>))))
+ (set (reg:X S7_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot2_offset>))))
+ (set (reg:X S6_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot3_offset>))))
+ (set (reg:X S5_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot4_offset>))))
+ (set (reg:X S4_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot5_offset>))))
+ (set (reg:X S3_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot6_offset>))))
+ (set (reg:X S2_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot7_offset>))))
+ (set (reg:X S1_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot8_offset>))))
+ (set (reg:X S0_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot9_offset>))))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot10_offset>))))
+ (return)
+ (use (reg:SI RETURN_ADDR_REGNUM))]
+ "TARGET_ZCMP"
+ "cm.popret {ra, s0-s9}, %0"
+)
+
+(define_insn "@gpr_multi_popret_up_to_s11_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_s11_operand" "I")))
+ (set (reg:X S11_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot0_offset>))))
+ (set (reg:X S10_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot1_offset>))))
+ (set (reg:X S9_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot2_offset>))))
+ (set (reg:X S8_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot3_offset>))))
+ (set (reg:X S7_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot4_offset>))))
+ (set (reg:X S6_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot5_offset>))))
+ (set (reg:X S5_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot6_offset>))))
+ (set (reg:X S4_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot7_offset>))))
+ (set (reg:X S3_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot8_offset>))))
+ (set (reg:X S2_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot9_offset>))))
+ (set (reg:X S1_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot10_offset>))))
+ (set (reg:X S0_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot11_offset>))))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot12_offset>))))
+ (return)
+ (use (reg:SI RETURN_ADDR_REGNUM))]
+ "TARGET_ZCMP"
+ "cm.popret {ra, s0-s11}, %0"
+)
+
+(define_insn "@gpr_multi_push_up_to_ra_<mode>"
+ [(set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot0_offset>)))
+ (reg:X RETURN_ADDR_REGNUM))
+ (set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_push_up_to_ra_operand" "I")))]
+ "TARGET_ZCMP"
+ "cm.push {ra}, %0"
+)
+
+(define_insn "@gpr_multi_push_up_to_s0_<mode>"
+ [(set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot0_offset>)))
+ (reg:X S0_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot1_offset>)))
+ (reg:X RETURN_ADDR_REGNUM))
+ (set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_push_up_to_s0_operand" "I")))]
+ "TARGET_ZCMP"
+ "cm.push {ra, s0}, %0"
+)
+
+(define_insn "@gpr_multi_push_up_to_s1_<mode>"
+ [(set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot0_offset>)))
+ (reg:X S1_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot1_offset>)))
+ (reg:X S0_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot2_offset>)))
+ (reg:X RETURN_ADDR_REGNUM))
+ (set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_push_up_to_s1_operand" "I")))]
+ "TARGET_ZCMP"
+ "cm.push {ra, s0-s1}, %0"
+)
+
+(define_insn "@gpr_multi_push_up_to_s2_<mode>"
+ [(set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot0_offset>)))
+ (reg:X S2_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot1_offset>)))
+ (reg:X S1_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot2_offset>)))
+ (reg:X S0_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot3_offset>)))
+ (reg:X RETURN_ADDR_REGNUM))
+ (set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_push_up_to_s2_operand" "I")))]
+ "TARGET_ZCMP"
+ "cm.push {ra, s0-s2}, %0"
+)
+
+(define_insn "@gpr_multi_push_up_to_s3_<mode>"
+ [(set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot0_offset>)))
+ (reg:X S3_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot1_offset>)))
+ (reg:X S2_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot2_offset>)))
+ (reg:X S1_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot3_offset>)))
+ (reg:X S0_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot4_offset>)))
+ (reg:X RETURN_ADDR_REGNUM))
+ (set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_push_up_to_s3_operand" "I")))]
+ "TARGET_ZCMP"
+ "cm.push {ra, s0-s3}, %0"
+)
+
+(define_insn "@gpr_multi_push_up_to_s4_<mode>"
+ [(set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot0_offset>)))
+ (reg:X S4_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot1_offset>)))
+ (reg:X S3_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot2_offset>)))
+ (reg:X S2_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot3_offset>)))
+ (reg:X S1_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot4_offset>)))
+ (reg:X S0_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot5_offset>)))
+ (reg:X RETURN_ADDR_REGNUM))
+ (set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_push_up_to_s4_operand" "I")))]
+ "TARGET_ZCMP"
+ "cm.push {ra, s0-s4}, %0"
+)
+
+(define_insn "@gpr_multi_push_up_to_s5_<mode>"
+ [(set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot0_offset>)))
+ (reg:X S5_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot1_offset>)))
+ (reg:X S4_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot2_offset>)))
+ (reg:X S3_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot3_offset>)))
+ (reg:X S2_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot4_offset>)))
+ (reg:X S1_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot5_offset>)))
+ (reg:X S0_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot6_offset>)))
+ (reg:X RETURN_ADDR_REGNUM))
+ (set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_push_up_to_s5_operand" "I")))]
+ "TARGET_ZCMP"
+ "cm.push {ra, s0-s5}, %0"
+)
+
+(define_insn "@gpr_multi_push_up_to_s6_<mode>"
+ [(set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot0_offset>)))
+ (reg:X S6_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot1_offset>)))
+ (reg:X S5_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot2_offset>)))
+ (reg:X S4_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot3_offset>)))
+ (reg:X S3_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot4_offset>)))
+ (reg:X S2_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot5_offset>)))
+ (reg:X S1_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot6_offset>)))
+ (reg:X S0_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot7_offset>)))
+ (reg:X RETURN_ADDR_REGNUM))
+ (set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_push_up_to_s6_operand" "I")))]
+ "TARGET_ZCMP"
+ "cm.push {ra, s0-s6}, %0"
+)
+
+(define_insn "@gpr_multi_push_up_to_s7_<mode>"
+ [(set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot0_offset>)))
+ (reg:X S7_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot1_offset>)))
+ (reg:X S6_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot2_offset>)))
+ (reg:X S5_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot3_offset>)))
+ (reg:X S4_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot4_offset>)))
+ (reg:X S3_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot5_offset>)))
+ (reg:X S2_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot6_offset>)))
+ (reg:X S1_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot7_offset>)))
+ (reg:X S0_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot8_offset>)))
+ (reg:X RETURN_ADDR_REGNUM))
+ (set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_push_up_to_s7_operand" "I")))]
+ "TARGET_ZCMP"
+ "cm.push {ra, s0-s7}, %0"
+)
+
+(define_insn "@gpr_multi_push_up_to_s8_<mode>"
+ [(set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot0_offset>)))
+ (reg:X S8_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot1_offset>)))
+ (reg:X S7_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot2_offset>)))
+ (reg:X S6_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot3_offset>)))
+ (reg:X S5_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot4_offset>)))
+ (reg:X S4_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot5_offset>)))
+ (reg:X S3_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot6_offset>)))
+ (reg:X S2_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot7_offset>)))
+ (reg:X S1_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot8_offset>)))
+ (reg:X S0_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot9_offset>)))
+ (reg:X RETURN_ADDR_REGNUM))
+ (set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_push_up_to_s8_operand" "I")))]
+ "TARGET_ZCMP"
+ "cm.push {ra, s0-s8}, %0"
+)
+
+(define_insn "@gpr_multi_push_up_to_s9_<mode>"
+ [(set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot0_offset>)))
+ (reg:X S9_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot1_offset>)))
+ (reg:X S8_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot2_offset>)))
+ (reg:X S7_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot3_offset>)))
+ (reg:X S6_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot4_offset>)))
+ (reg:X S5_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot5_offset>)))
+ (reg:X S4_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot6_offset>)))
+ (reg:X S3_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot7_offset>)))
+ (reg:X S2_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot8_offset>)))
+ (reg:X S1_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot9_offset>)))
+ (reg:X S0_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot10_offset>)))
+ (reg:X RETURN_ADDR_REGNUM))
+ (set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_push_up_to_s9_operand" "I")))]
+ "TARGET_ZCMP"
+ "cm.push {ra, s0-s9}, %0"
+)
+
+(define_insn "@gpr_multi_push_up_to_s11_<mode>"
+ [(set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot0_offset>)))
+ (reg:X S11_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot1_offset>)))
+ (reg:X S10_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot2_offset>)))
+ (reg:X S9_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot3_offset>)))
+ (reg:X S8_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot4_offset>)))
+ (reg:X S7_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot5_offset>)))
+ (reg:X S6_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot6_offset>)))
+ (reg:X S5_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot7_offset>)))
+ (reg:X S4_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot8_offset>)))
+ (reg:X S3_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot9_offset>)))
+ (reg:X S2_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot10_offset>)))
+ (reg:X S1_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot11_offset>)))
+ (reg:X S0_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot12_offset>)))
+ (reg:X RETURN_ADDR_REGNUM))
+ (set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_push_up_to_s11_operand" "I")))]
+ "TARGET_ZCMP"
+ "cm.push {ra, s0-s11}, %0"
+)
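
Note on the slot layout used by the zc.md patterns above: every pop, popret and push pattern places the highest-numbered s-register in slot 0 and ra in the last slot, and there is no "up_to_s10" variant (the list jumps from s0-s9 to s0-s11). The sketch below only restates that mapping in C. It is not part of the patch, the helper name zcmp_slot_to_sreg is hypothetical, and it deliberately says nothing about the numeric values of the <slotN_offset> attributes, which are defined outside this hunk.

```c
/* Mirrors ZCMP_INVALID_S0S10_SREGS_COUNTS and ZCMP_S0S11_SREGS_COUNTS from
   riscv.h: a list of exactly 11 s-registers has no pattern.  */
#define INVALID_SREGS_COUNT 11
#define MAX_SREGS_COUNT 12

/* Hypothetical helper (not part of the patch).  For a register list made
   of ra plus N_SREGS s-registers (s0 .. s<N_SREGS-1>), return the index of
   the s-register stored in SLOT, -1 if SLOT holds ra, or -2 if the request
   does not correspond to any pattern.  */
static int
zcmp_slot_to_sreg (int n_sregs, int slot)
{
  if (n_sregs < 0 || n_sregs > MAX_SREGS_COUNT
      || n_sregs == INVALID_SREGS_COUNT)
    return -2;
  if (slot < 0 || slot > n_sregs)
    return -2;
  if (slot == n_sregs)
    return -1;			/* ra always occupies the last slot.  */
  return n_sregs - 1 - slot;	/* Slot 0 holds the highest s-register.  */
}
```

For example, with n_sregs = 3 this gives s2, s1, s0, ra for slots 0 to 3, matching gpr_multi_pop_up_to_s2 and gpr_multi_push_up_to_s2.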
diff --git a/gcc/testsuite/gcc.target/riscv/rv32e_zcmp.c b/gcc/testsuite/gcc.target/riscv/rv32e_zcmp.c
new file mode 100644
index 00000000000..6dbe489da9b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rv32e_zcmp.c
@@ -0,0 +1,239 @@
+/* { dg-do compile } */
+/* { dg-options " -Os -march=rv32e_zca_zcmp -mabi=ilp32e -mcmodel=medlow" } */
+/* { dg-skip-if "" { *-*-* } {"-O0" "-O1" "-O2" "-Og" "-O3" "-Oz" "-flto"} } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+char my_getchar();
+float getf();
+int __attribute__((noinline)) incoming_stack_args
+ (int arg0, int arg1, int arg2, int arg3,
+ int arg4, int arg5, int arg6, int arg7, int arg8);
+int getint();
+void PrintInts (int n, ...); // varargs
+void __attribute__((noinline)) PrintIntsNoVaStart (int n, ...); // varargs
+void PrintInts2 (int arg0, int arg1, int arg2, int arg3, int arg4, int arg5, int n, ...);
+extern void f1(void);
+extern void f2(void);
+
+/*
+**test1:
+** ...
+** cm.push {ra, s0-s1}, -64
+** ...
+** cm.popret {ra, s0-s1}, 64
+** ...
+*/
+int test1()
+{
+ char volatile array[3120];
+ float volatile farray[3120];
+
+ float sum = 0;
+ for (int i = 0; i < 3120; i++)
+ {
+ array[i] = my_getchar();
+ farray[i] = my_getchar() * 1.2;
+ sum += array[i] + farray[i];
+ }
+ return sum;
+}
+
+/*
+**test2_step1_0_size:
+** ...
+** cm.push {ra, s0}, -64
+** ...
+** cm.popret {ra, s0}, 64
+** ...
+*/
+int test2_step1_0_size()
+{
+ int volatile iarray[3120 + 1824/4 -8];
+
+ for (int i = 0; i < 3120 + 1824/4 - 8; i++)
+ {
+ iarray[i] = my_getchar() * 2;
+ }
+ return iarray[0] + iarray[1];
+}
+
+/*
+**test3:
+** ...
+** cm.push {ra, s0-s1}, -64
+** ...
+** cm.popret {ra, s0-s1}, 64
+** ...
+*/
+float test3()
+{
+ char volatile array[3120];
+ float volatile farray[3120];
+
+ float sum = 0, f1 = 0, f2 = 0, f3 = 0, f4 = 0, f5 = 0, f6 = 0, f7 = 0;
+
+ for (int i = 0; i < 3120; i++)
+ {
+ f1 = getf();
+ f2 = getf();
+ f3 = getf();
+ f4 = getf();
+ array[i] = my_getchar();
+ farray[i] = my_getchar() * 1.2;
+ sum += array[i] + farray[i] + f1 + f2 + f3 + f4;
+ }
+ return sum;
+}
+
+/*
+**outgoing_stack_args:
+** ...
+** cm.push {ra, s0}, -32
+** ...
+** cm.popret {ra, s0}, 32
+** ...
+*/
+int outgoing_stack_args()
+{
+ int local = getint();
+ return local +incoming_stack_args(0, 1, 2, 3, 4, 5, 6, 7, 8);
+}
+
+/*
+**callPrintInts:
+** ...
+** cm.push {ra}, -32
+** ...
+** cm.popret {ra}, 32
+** ...
+*/
+float callPrintInts()
+{
+ volatile float f = getf(); // f in local
+ PrintInts(9,1,2,3,4,5,6,7,8,9);
+ return f;
+}
+
+/*
+**callPrint:
+** ...
+** cm.push {ra}, -32
+** ...
+** cm.popret {ra}, 32
+** ...
+*/
+float callPrint()
+{
+ volatile float f = getf(); // f in local
+ PrintIntsNoVaStart(0,1,2,3,4,5,6,7,8,9);
+ return f;
+}
+
+/*
+**callPrint_S:
+** ...
+** cm.push {ra, s0}, -32
+** ...
+** cm.popret {ra, s0}, 32
+** ...
+*/
+float callPrint_S()
+{
+ float f = getf();
+ PrintIntsNoVaStart(0,1,2,3,4,5,6,7,8,9);
+ return f;
+}
+
+/*
+**callPrint_2:
+** ...
+** cm.push {ra, s0}, -32
+** ...
+** cm.popret {ra, s0}, 32
+** ...
+*/
+float callPrint_2()
+{
+ float f = getf();
+ PrintInts2(0,1,2,3,4,5,6,7,8,9);
+ return f;
+}
+
+/*
+**test_step1_0bytes_save_restore:
+** ...
+** cm.push {ra}, -16
+** ...
+** cm.popret {ra}, 16
+** ...
+*/
+int test_step1_0bytes_save_restore()
+{
+
+ int a = 9;
+ int b = my_getchar();
+ return a +b;
+}
+
+/*
+**test_s0:
+** ...
+** cm.push {ra, s0}, -16
+** ...
+** cm.popret {ra, s0}, 16
+** ...
+*/
+int test_s0()
+{
+
+ int a = my_getchar();
+ int b = my_getchar();
+ return a +b;
+}
+
+/*
+**test_s1:
+** ...
+** cm.push {ra, s0-s1}, -16
+** ...
+** cm.popret {ra, s0-s1}, 16
+** ...
+*/
+int test_s1()
+{
+
+ int s0 = my_getchar();
+ int s1 = my_getchar();
+ int b = my_getchar();
+ return s1 +s0 +b;
+}
+
+/*
+**test_f0:
+** ...
+** cm.push {ra, s0-s1}, -16
+** ...
+** cm.popret {ra, s0-s1}, 16
+** ...
+*/
+int test_f0()
+{
+
+ int s0 = my_getchar();
+ float f0 = getf();
+ int b = my_getchar();
+ return f0 +s0 +b;
+}
+
+/*
+**foo:
+** cm.push {ra}, -16
+** call f1
+** cm.pop {ra}, 16
+** tail f2
+*/
+void foo(void)
+{
+ f1();
+ f2();
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rv32i_zcmp.c b/gcc/testsuite/gcc.target/riscv/rv32i_zcmp.c
new file mode 100644
index 00000000000..924197cb3c4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rv32i_zcmp.c
@@ -0,0 +1,239 @@
+/* { dg-do compile } */
+/* { dg-options " -Os -march=rv32imaf_zca_zcmp -mabi=ilp32f -mcmodel=medlow" } */
+/* { dg-skip-if "" { *-*-* } {"-O0" "-O1" "-O2" "-Og" "-O3" "-Oz" "-flto"} } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+char my_getchar();
+float getf();
+int __attribute__((noinline)) incoming_stack_args
+ (int arg0, int arg1, int arg2, int arg3,
+ int arg4, int arg5, int arg6, int arg7, int arg8);
+int getint();
+void PrintInts (int n, ...); // varargs
+void __attribute__((noinline)) PrintIntsNoVaStart (int n, ...); // varargs
+void PrintInts2 (int arg0, int arg1, int arg2, int arg3, int arg4, int arg5, int n, ...);
+extern void f1(void);
+extern void f2(void);
+
+/*
+**test1:
+** ...
+** cm.push {ra, s0-s4}, -80
+** ...
+** cm.popret {ra, s0-s4}, 80
+** ...
+*/
+int test1()
+{
+ char volatile array[3120];
+ float volatile farray[3120];
+
+ float sum = 0;
+ for (int i = 0; i < 3120; i++)
+ {
+ array[i] = my_getchar();
+ farray[i] = my_getchar() * 1.2;
+ sum += array[i] + farray[i];
+ }
+ return sum;
+}
+
+/*
+**test2_step1_0_size:
+** ...
+** cm.push {ra, s0-s1}, -64
+** ...
+** cm.popret {ra, s0-s1}, 64
+** ...
+*/
+int test2_step1_0_size()
+{
+ int volatile iarray[3120 + 1824/4 -8];
+
+ for (int i = 0; i < 3120 + 1824/4 - 8; i++)
+ {
+ iarray[i] = my_getchar() * 2;
+ }
+ return iarray[0] + iarray[1];
+}
+
+/*
+**test3:
+** ...
+** cm.push {ra, s0-s4}, -80
+** ...
+** cm.popret {ra, s0-s4}, 80
+** ...
+*/
+float test3()
+{
+ char volatile array[3120];
+ float volatile farray[3120];
+
+ float sum = 0, f1 = 0, f2 = 0, f3 = 0, f4 = 0, f5 = 0, f6 = 0, f7 = 0;
+
+ for (int i = 0; i < 3120; i++)
+ {
+ f1 = getf();
+ f2 = getf();
+ f3 = getf();
+ f4 = getf();
+ array[i] = my_getchar();
+ farray[i] = my_getchar() * 1.2;
+ sum += array[i] + farray[i] + f1 + f2 + f3 + f4;
+ }
+ return sum;
+}
+
+/*
+**outgoing_stack_args:
+** ...
+** cm.push {ra, s0}, -32
+** ...
+** cm.popret {ra, s0}, 32
+** ...
+*/
+int outgoing_stack_args()
+{
+ int local = getint();
+ return local +incoming_stack_args(0, 1, 2, 3, 4, 5, 6, 7, 8);
+}
+
+/*
+**callPrintInts:
+** ...
+** cm.push {ra}, -48
+** ...
+** cm.popret {ra}, 48
+** ...
+*/
+float callPrintInts()
+{
+ volatile float f = getf(); // f in local
+ PrintInts(9,1,2,3,4,5,6,7,8,9);
+ return f;
+}
+
+/*
+**callPrint:
+** ...
+** cm.push {ra}, -48
+** ...
+** cm.popret {ra}, 48
+** ...
+*/
+float callPrint()
+{
+ volatile float f = getf(); // f in local
+ PrintIntsNoVaStart(0,1,2,3,4,5,6,7,8,9);
+ return f;
+}
+
+/*
+**callPrint_S:
+** ...
+** cm.push {ra}, -48
+** ...
+** cm.popret {ra}, 48
+** ...
+*/
+float callPrint_S()
+{
+ float f = getf();
+ PrintIntsNoVaStart(0,1,2,3,4,5,6,7,8,9);
+ return f;
+}
+
+/*
+**callPrint_2:
+** ...
+** cm.push {ra}, -48
+** ...
+** cm.popret {ra}, 48
+** ...
+*/
+float callPrint_2()
+{
+ float f = getf();
+ PrintInts2(0,1,2,3,4,5,6,7,8,9);
+ return f;
+}
+
+/*
+**test_step1_0bytes_save_restore:
+** ...
+** cm.push {ra}, -16
+** ...
+** cm.popret {ra}, 16
+** ...
+*/
+int test_step1_0bytes_save_restore()
+{
+
+ int a = 9;
+ int b = my_getchar();
+ return a +b;
+}
+
+/*
+**test_s0:
+** ...
+** cm.push {ra, s0}, -16
+** ...
+** cm.popret {ra, s0}, 16
+** ...
+*/
+int test_s0()
+{
+
+ int a = my_getchar();
+ int b = my_getchar();
+ return a +b;
+}
+
+/*
+**test_s1:
+** ...
+** cm.push {ra, s0-s1}, -16
+** ...
+** cm.popret {ra, s0-s1}, 16
+** ...
+*/
+int test_s1()
+{
+
+ int s0 = my_getchar();
+ int s1 = my_getchar();
+ int b = my_getchar();
+ return s1 +s0 +b;
+}
+
+/*
+**test_f0:
+** ...
+** cm.push {ra, s0}, -32
+** ...
+** cm.popret {ra, s0}, 32
+** ...
+*/
+int test_f0()
+{
+
+ int s0 = my_getchar();
+ float f0 = getf();
+ int b = my_getchar();
+ return f0 +s0 +b;
+}
+
+/*
+**foo:
+** cm.push {ra}, -16
+** call f1
+** cm.pop {ra}, 16
+** tail f2
+*/
+void foo(void)
+{
+ f1();
+ f2();
+}
diff --git a/gcc/testsuite/gcc.target/riscv/zcmp_stack_alignment.c b/gcc/testsuite/gcc.target/riscv/zcmp_stack_alignment.c
new file mode 100644
index 00000000000..05602302a8f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zcmp_stack_alignment.c
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options " -O0 -march=rv32e_zca_zcb_zcmp -mabi=ilp32e -mcmodel=medlow -fomit-frame-pointer" } */
+/* { dg-skip-if "" { *-*-* } {"-O2" "-O1" "-Os" "-Og" "-O3" "-Oz" "-flto"} } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+void bar();
+
+/*
+**fool_rv32e:
+** cm.push {ra}, -32
+** ...
+** call bar
+** ...
+** lw a5,32\(sp\)
+** ...
+** cm.popret {ra}, 32
+*/
+int fool_rv32e ( int a0, int a1, int a2, int a3, int a4, int a5,
+ int incoming0)
+{
+ bar();
+ return a0 + a1 + a2 + a3 + a4 + a5 + incoming0;
+}
--
2.17.1
^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH 2/4] [RISC-V] support cm.popretz in zcmp
2023-06-07 5:52 [PATCH 0/4] [RISC-V] support zcmp extention Fei Gao
2023-06-07 5:52 ` [PATCH 1/4][V4][RISC-V] support cm.push cm.pop cm.popret in zcmp Fei Gao
@ 2023-06-07 5:52 ` Fei Gao
2023-07-13 8:31 ` Kito Cheng
2023-06-07 5:52 ` [PATCH 3/4] [RISC-V] resolve confilct between zcmp multi push/pop and shrink-wrap-separate Fei Gao
2023-06-07 5:52 ` [PATCH 4/4] [RISC-V] support cm.mva01s cm.mvsa01 in zcmp Fei Gao
3 siblings, 1 reply; 17+ messages in thread
From: Fei Gao @ 2023-06-07 5:52 UTC (permalink / raw)
To: gcc-patches; +Cc: kito.cheng, palmer, jeffreyalaw, sinan.lin, jiawei, Fei Gao
Generate cm.popretz instead of cm.popret when the function returns 0, so
that the separate instruction clearing a0 before the return can be dropped.
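The choice between the three pop forms can be sketched as below. This is only
a toy model of the decision made in the epilogue, not the GCC code itself: the
enum, the function names and the two boolean inputs are hypothetical and stand
in for use_multi_pop_normal and for the result of the a0-clearing check.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for illustration; not GCC's riscv_zcmp_op_t.  */
enum zcmp_pop_kind { ZCMP_POP, ZCMP_POPRET, ZCMP_POPRETZ };

/* Pick the epilogue instruction.
   returns_here:   the epilogue ends in a return (not a sibcall).
   a0_set_to_zero: the insns right before the epilogue are
                   "a0 = 0" followed by a use of a0.  */
static enum zcmp_pop_kind
choose_zcmp_pop (bool returns_here, bool a0_set_to_zero)
{
  if (!returns_here)
    return ZCMP_POP;
  return a0_set_to_zero ? ZCMP_POPRETZ : ZCMP_POPRET;
}

/* Map the choice to its assembler mnemonic.  */
static const char *
zcmp_pop_mnemonic (enum zcmp_pop_kind kind)
{
  switch (kind)
    {
    case ZCMP_POP:     return "cm.pop";
    case ZCMP_POPRET:  return "cm.popret";
    case ZCMP_POPRETZ: return "cm.popretz";
    }
  assert (!"unknown kind");
  return "";
}
```

With this model, a function ending in "return 0;" maps to cm.popretz, any
other returning function to cm.popret, and a function ending in a tail call
to cm.pop.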
Signed-off-by: Fei Gao <gaofei@eswincomputing.com>
gcc/ChangeLog:
* config/riscv/riscv.cc
(riscv_zcmp_can_use_popretz): New function. Return the insn that
clears a0 if cm.popretz can be used, NULL otherwise.
(riscv_gen_multi_pop_insn): New function. Generate cm.pop[ret][z].
(riscv_expand_epilogue): Expand cm.pop[ret][z] in the epilogue.
* config/riscv/riscv.md (A0_REGNUM): New constant.
* config/riscv/zc.md
(@gpr_multi_popretz_up_to_ra_<mode>): md for popretz ra
(@gpr_multi_popretz_up_to_s0_<mode>): md for popretz ra, s0
(@gpr_multi_popretz_up_to_s1_<mode>): likewise
(@gpr_multi_popretz_up_to_s2_<mode>): likewise
(@gpr_multi_popretz_up_to_s3_<mode>): likewise
(@gpr_multi_popretz_up_to_s4_<mode>): likewise
(@gpr_multi_popretz_up_to_s5_<mode>): likewise
(@gpr_multi_popretz_up_to_s6_<mode>): likewise
(@gpr_multi_popretz_up_to_s7_<mode>): likewise
(@gpr_multi_popretz_up_to_s8_<mode>): likewise
(@gpr_multi_popretz_up_to_s9_<mode>): likewise
(@gpr_multi_popretz_up_to_s11_<mode>): likewise
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rv32e_zcmp.c: add testcase for cm.popretz in rv32e
* gcc.target/riscv/rv32i_zcmp.c: add testcase for cm.popretz in rv32i
---
gcc/config/riscv/riscv.cc | 114 ++++--
gcc/config/riscv/riscv.md | 1 +
gcc/config/riscv/zc.md | 393 ++++++++++++++++++++
gcc/testsuite/gcc.target/riscv/rv32e_zcmp.c | 12 +
gcc/testsuite/gcc.target/riscv/rv32i_zcmp.c | 12 +
5 files changed, 508 insertions(+), 24 deletions(-)
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index c476c699f4c..f60c241a526 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -435,6 +435,7 @@ typedef enum
PUSH_IDX = 0,
POP_IDX,
POPRET_IDX,
+ POPRETZ_IDX,
ZCMP_OP_NUM
} riscv_zcmp_op_t;
@@ -5535,30 +5536,30 @@ riscv_emit_stack_tie (void)
/*zcmp multi push and pop code_for_push_pop function ptr array */
const code_for_push_pop_t code_for_push_pop [ZCMP_MAX_GRP_SLOTS][ZCMP_OP_NUM] = {
{code_for_gpr_multi_push_up_to_ra, code_for_gpr_multi_pop_up_to_ra,
- code_for_gpr_multi_popret_up_to_ra},
+ code_for_gpr_multi_popret_up_to_ra, code_for_gpr_multi_popretz_up_to_ra},
{code_for_gpr_multi_push_up_to_s0, code_for_gpr_multi_pop_up_to_s0,
- code_for_gpr_multi_popret_up_to_s0},
+ code_for_gpr_multi_popret_up_to_s0, code_for_gpr_multi_popretz_up_to_s0},
{code_for_gpr_multi_push_up_to_s1, code_for_gpr_multi_pop_up_to_s1,
- code_for_gpr_multi_popret_up_to_s1},
+ code_for_gpr_multi_popret_up_to_s1, code_for_gpr_multi_popretz_up_to_s1},
{code_for_gpr_multi_push_up_to_s2, code_for_gpr_multi_pop_up_to_s2,
- code_for_gpr_multi_popret_up_to_s2},
+ code_for_gpr_multi_popret_up_to_s2, code_for_gpr_multi_popretz_up_to_s2},
{code_for_gpr_multi_push_up_to_s3, code_for_gpr_multi_pop_up_to_s3,
- code_for_gpr_multi_popret_up_to_s3},
+ code_for_gpr_multi_popret_up_to_s3, code_for_gpr_multi_popretz_up_to_s3},
{code_for_gpr_multi_push_up_to_s4, code_for_gpr_multi_pop_up_to_s4,
- code_for_gpr_multi_popret_up_to_s4},
+ code_for_gpr_multi_popret_up_to_s4, code_for_gpr_multi_popretz_up_to_s4},
{code_for_gpr_multi_push_up_to_s5, code_for_gpr_multi_pop_up_to_s5,
- code_for_gpr_multi_popret_up_to_s5},
+ code_for_gpr_multi_popret_up_to_s5, code_for_gpr_multi_popretz_up_to_s5},
{code_for_gpr_multi_push_up_to_s6, code_for_gpr_multi_pop_up_to_s6,
- code_for_gpr_multi_popret_up_to_s6},
+ code_for_gpr_multi_popret_up_to_s6, code_for_gpr_multi_popretz_up_to_s6},
{code_for_gpr_multi_push_up_to_s7, code_for_gpr_multi_pop_up_to_s7,
- code_for_gpr_multi_popret_up_to_s7},
+ code_for_gpr_multi_popret_up_to_s7, code_for_gpr_multi_popretz_up_to_s7},
{code_for_gpr_multi_push_up_to_s8, code_for_gpr_multi_pop_up_to_s8,
- code_for_gpr_multi_popret_up_to_s8},
+ code_for_gpr_multi_popret_up_to_s8, code_for_gpr_multi_popretz_up_to_s8},
{code_for_gpr_multi_push_up_to_s9, code_for_gpr_multi_pop_up_to_s9,
- code_for_gpr_multi_popret_up_to_s9},
- {nullptr, nullptr, nullptr},
+ code_for_gpr_multi_popret_up_to_s9, code_for_gpr_multi_popretz_up_to_s9},
+ {nullptr, nullptr, nullptr, nullptr},
{code_for_gpr_multi_push_up_to_s11, code_for_gpr_multi_pop_up_to_s11,
- code_for_gpr_multi_popret_up_to_s11}};
+ code_for_gpr_multi_popret_up_to_s11, code_for_gpr_multi_popretz_up_to_s11}};
static rtx
riscv_gen_multi_push_pop_insn (riscv_zcmp_op_t op, HOST_WIDE_INT adj_size,
@@ -5747,6 +5748,80 @@ riscv_adjust_libcall_cfi_epilogue ()
return dwarf;
}
+/* Return the insn clearing a0 if popretz can be matched, else NULL.
+ set (reg 10 a0) (const_int 0)
+ use (reg 10 a0)
+ NOTE_INSN_EPILOGUE_BEG */
+static rtx_insn *
+riscv_zcmp_can_use_popretz(void)
+{
+ rtx_insn *insn = NULL, *use = NULL, *clear = NULL;
+
+ /* sequence stack for NOTE_INSN_EPILOGUE_BEG*/
+ struct sequence_stack * outer_seq = get_current_sequence ()->next;
+ if (!outer_seq)
+ return NULL;
+ insn = outer_seq->first;
+ if(!insn || !NOTE_P (insn) || NOTE_KIND (insn) != NOTE_INSN_EPILOGUE_BEG)
+ return NULL;
+
+ /* sequence stack for the insn before NOTE_INSN_EPILOGUE_BEG*/
+ outer_seq = outer_seq->next;
+ if (outer_seq)
+ insn = outer_seq->last;
+
+ /* skip notes */
+ while (insn && NOTE_P (insn))
+ {
+ insn = PREV_INSN (insn);
+ }
+ use = insn;
+
+ /* match use (reg 10 a0) */
+ if (use == NULL || !INSN_P (use)
+ || GET_CODE (PATTERN (use)) != USE
+ || !REG_P(XEXP(PATTERN (use), 0))
+ || REGNO(XEXP(PATTERN (use), 0)) != A0_REGNUM)
+ return NULL;
+
+ /* match set (reg 10 a0) (const_int 0 [0]) */
+ clear = PREV_INSN (use);
+ if (clear != NULL && INSN_P (clear)
+ && GET_CODE (PATTERN (clear)) == SET
+ && REG_P (SET_DEST (PATTERN (clear)))
+ && REGNO (SET_DEST (PATTERN (clear))) == A0_REGNUM
+ && SET_SRC (PATTERN (clear)) == const0_rtx)
+ return clear;
+
+ return NULL;
+}
+
+static void
+riscv_gen_multi_pop_insn(bool use_multi_pop_normal, unsigned mask,
+ unsigned multipop_size)
+{
+ rtx insn;
+ unsigned regs_count = riscv_multi_push_regs_count (mask);
+
+ if (!use_multi_pop_normal)
+ insn= emit_insn (
+ riscv_gen_multi_push_pop_insn(POP_IDX, multipop_size, regs_count));
+ else if(rtx_insn * clear_a0_insn = riscv_zcmp_can_use_popretz())
+ {
+ delete_insn (NEXT_INSN (clear_a0_insn));
+ delete_insn (clear_a0_insn);
+ insn = emit_jump_insn (
+ riscv_gen_multi_push_pop_insn (POPRETZ_IDX, multipop_size, regs_count));
+ }
+ else
+ insn = emit_jump_insn (
+ riscv_gen_multi_push_pop_insn (POPRET_IDX, multipop_size, regs_count));
+
+ rtx dwarf = riscv_adjust_multi_pop_cfi_epilogue (multipop_size);
+ RTX_FRAME_RELATED_P (insn) = 1;
+ REG_NOTES (insn) = dwarf;
+}
+
/* Expand an "epilogue", "sibcall_epilogue", or "eh_return_internal" pattern;
style says which. */
@@ -5943,17 +6018,8 @@ riscv_expand_epilogue (int style)
if (use_multi_pop)
{
- unsigned regs_count = riscv_multi_push_regs_count (frame->mask);
- if (use_multi_pop_normal)
- insn = emit_jump_insn (
- riscv_gen_multi_push_pop_insn (POPRET_IDX, multipop_size, regs_count));
- else
- insn= emit_insn (
- riscv_gen_multi_push_pop_insn(POP_IDX, multipop_size, regs_count));
-
- rtx dwarf = riscv_adjust_multi_pop_cfi_epilogue (multipop_size);
- RTX_FRAME_RELATED_P (insn) = 1;
- REG_NOTES (insn) = dwarf;
+ riscv_gen_multi_pop_insn(use_multi_pop_normal,
+ frame->mask, multipop_size);
if (use_multi_pop_normal)
return;
}
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index c858b3bc9ef..b2e1f82f627 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -120,6 +120,7 @@
(T1_REGNUM 6)
(S0_REGNUM 8)
(S1_REGNUM 9)
+ (A0_REGNUM 10)
(S2_REGNUM 18)
(S3_REGNUM 19)
(S4_REGNUM 20)
diff --git a/gcc/config/riscv/zc.md b/gcc/config/riscv/zc.md
index 5c1bf031b8d..8d7de97daad 100644
--- a/gcc/config/riscv/zc.md
+++ b/gcc/config/riscv/zc.md
@@ -708,6 +708,399 @@
"cm.popret {ra, s0-s11}, %0"
)
+(define_insn "@gpr_multi_popretz_up_to_ra_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_ra_operand" "I")))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot0_offset>))))
+ (set (reg:X A0_REGNUM)
+ (const_int 0))
+ (use (reg:X A0_REGNUM))
+ (return)
+ (use (reg:SI RETURN_ADDR_REGNUM))]
+ "TARGET_ZCMP"
+ "cm.popretz {ra}, %0"
+)
+
+(define_insn "@gpr_multi_popretz_up_to_s0_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_s0_operand" "I")))
+ (set (reg:X S0_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot0_offset>))))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot1_offset>))))
+ (set (reg:X A0_REGNUM)
+ (const_int 0))
+ (use (reg:X A0_REGNUM))
+ (return)
+ (use (reg:SI RETURN_ADDR_REGNUM))]
+ "TARGET_ZCMP"
+ "cm.popretz {ra, s0}, %0"
+)
+
+(define_insn "@gpr_multi_popretz_up_to_s1_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_s1_operand" "I")))
+ (set (reg:X S1_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot0_offset>))))
+ (set (reg:X S0_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot1_offset>))))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot2_offset>))))
+ (set (reg:X A0_REGNUM)
+ (const_int 0))
+ (use (reg:X A0_REGNUM))
+ (return)
+ (use (reg:SI RETURN_ADDR_REGNUM))]
+ "TARGET_ZCMP"
+ "cm.popretz {ra, s0-s1}, %0"
+)
+
+(define_insn "@gpr_multi_popretz_up_to_s2_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_s2_operand" "I")))
+ (set (reg:X S2_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot0_offset>))))
+ (set (reg:X S1_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot1_offset>))))
+ (set (reg:X S0_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot2_offset>))))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot3_offset>))))
+ (set (reg:X A0_REGNUM)
+ (const_int 0))
+ (use (reg:X A0_REGNUM))
+ (return)
+ (use (reg:SI RETURN_ADDR_REGNUM))]
+ "TARGET_ZCMP"
+ "cm.popretz {ra, s0-s2}, %0"
+)
+
+(define_insn "@gpr_multi_popretz_up_to_s3_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_s3_operand" "I")))
+ (set (reg:X S3_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot0_offset>))))
+ (set (reg:X S2_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot1_offset>))))
+ (set (reg:X S1_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot2_offset>))))
+ (set (reg:X S0_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot3_offset>))))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot4_offset>))))
+ (set (reg:X A0_REGNUM)
+ (const_int 0))
+ (use (reg:X A0_REGNUM))
+ (return)
+ (use (reg:SI RETURN_ADDR_REGNUM))]
+ "TARGET_ZCMP"
+ "cm.popretz {ra, s0-s3}, %0"
+)
+
+(define_insn "@gpr_multi_popretz_up_to_s4_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_s4_operand" "I")))
+ (set (reg:X S4_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot0_offset>))))
+ (set (reg:X S3_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot1_offset>))))
+ (set (reg:X S2_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot2_offset>))))
+ (set (reg:X S1_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot3_offset>))))
+ (set (reg:X S0_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot4_offset>))))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot5_offset>))))
+ (set (reg:X A0_REGNUM)
+ (const_int 0))
+ (use (reg:X A0_REGNUM))
+ (return)
+ (use (reg:SI RETURN_ADDR_REGNUM))]
+ "TARGET_ZCMP"
+ "cm.popretz {ra, s0-s4}, %0"
+)
+
+(define_insn "@gpr_multi_popretz_up_to_s5_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_s5_operand" "I")))
+ (set (reg:X S5_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot0_offset>))))
+ (set (reg:X S4_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot1_offset>))))
+ (set (reg:X S3_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot2_offset>))))
+ (set (reg:X S2_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot3_offset>))))
+ (set (reg:X S1_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot4_offset>))))
+ (set (reg:X S0_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot5_offset>))))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot6_offset>))))
+ (set (reg:X A0_REGNUM)
+ (const_int 0))
+ (use (reg:X A0_REGNUM))
+ (return)
+ (use (reg:SI RETURN_ADDR_REGNUM))]
+ "TARGET_ZCMP"
+ "cm.popretz {ra, s0-s5}, %0"
+)
+
+(define_insn "@gpr_multi_popretz_up_to_s6_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_s6_operand" "I")))
+ (set (reg:X S6_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot0_offset>))))
+ (set (reg:X S5_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot1_offset>))))
+ (set (reg:X S4_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot2_offset>))))
+ (set (reg:X S3_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot3_offset>))))
+ (set (reg:X S2_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot4_offset>))))
+ (set (reg:X S1_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot5_offset>))))
+ (set (reg:X S0_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot6_offset>))))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot7_offset>))))
+ (set (reg:X A0_REGNUM)
+ (const_int 0))
+ (use (reg:X A0_REGNUM))
+ (return)
+ (use (reg:SI RETURN_ADDR_REGNUM))]
+ "TARGET_ZCMP"
+ "cm.popretz {ra, s0-s6}, %0"
+)
+
+(define_insn "@gpr_multi_popretz_up_to_s7_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_s7_operand" "I")))
+ (set (reg:X S7_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot0_offset>))))
+ (set (reg:X S6_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot1_offset>))))
+ (set (reg:X S5_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot2_offset>))))
+ (set (reg:X S4_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot3_offset>))))
+ (set (reg:X S3_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot4_offset>))))
+ (set (reg:X S2_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot5_offset>))))
+ (set (reg:X S1_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot6_offset>))))
+ (set (reg:X S0_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot7_offset>))))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot8_offset>))))
+ (set (reg:X A0_REGNUM)
+ (const_int 0))
+ (use (reg:X A0_REGNUM))
+ (return)
+ (use (reg:SI RETURN_ADDR_REGNUM))]
+ "TARGET_ZCMP"
+ "cm.popretz {ra, s0-s7}, %0"
+)
+
+(define_insn "@gpr_multi_popretz_up_to_s8_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_s8_operand" "I")))
+ (set (reg:X S8_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot0_offset>))))
+ (set (reg:X S7_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot1_offset>))))
+ (set (reg:X S6_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot2_offset>))))
+ (set (reg:X S5_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot3_offset>))))
+ (set (reg:X S4_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot4_offset>))))
+ (set (reg:X S3_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot5_offset>))))
+ (set (reg:X S2_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot6_offset>))))
+ (set (reg:X S1_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot7_offset>))))
+ (set (reg:X S0_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot8_offset>))))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot9_offset>))))
+ (set (reg:X A0_REGNUM)
+ (const_int 0))
+ (use (reg:X A0_REGNUM))
+ (return)
+ (use (reg:SI RETURN_ADDR_REGNUM))]
+ "TARGET_ZCMP"
+ "cm.popretz {ra, s0-s8}, %0"
+)
+
+(define_insn "@gpr_multi_popretz_up_to_s9_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_s9_operand" "I")))
+ (set (reg:X S9_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot0_offset>))))
+ (set (reg:X S8_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot1_offset>))))
+ (set (reg:X S7_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot2_offset>))))
+ (set (reg:X S6_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot3_offset>))))
+ (set (reg:X S5_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot4_offset>))))
+ (set (reg:X S4_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot5_offset>))))
+ (set (reg:X S3_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot6_offset>))))
+ (set (reg:X S2_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot7_offset>))))
+ (set (reg:X S1_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot8_offset>))))
+ (set (reg:X S0_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot9_offset>))))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot10_offset>))))
+ (set (reg:X A0_REGNUM)
+ (const_int 0))
+ (use (reg:X A0_REGNUM))
+ (return)
+ (use (reg:SI RETURN_ADDR_REGNUM))]
+ "TARGET_ZCMP"
+ "cm.popretz {ra, s0-s9}, %0"
+)
+
+(define_insn "@gpr_multi_popretz_up_to_s11_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_s11_operand" "I")))
+ (set (reg:X S11_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot0_offset>))))
+ (set (reg:X S10_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot1_offset>))))
+ (set (reg:X S9_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot2_offset>))))
+ (set (reg:X S8_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot3_offset>))))
+ (set (reg:X S7_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot4_offset>))))
+ (set (reg:X S6_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot5_offset>))))
+ (set (reg:X S5_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot6_offset>))))
+ (set (reg:X S4_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot7_offset>))))
+ (set (reg:X S3_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot8_offset>))))
+ (set (reg:X S2_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot9_offset>))))
+ (set (reg:X S1_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot10_offset>))))
+ (set (reg:X S0_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot11_offset>))))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (const_int <slot12_offset>))))
+ (set (reg:X A0_REGNUM)
+ (const_int 0))
+ (use (reg:X A0_REGNUM))
+ (return)
+ (use (reg:SI RETURN_ADDR_REGNUM))]
+ "TARGET_ZCMP"
+ "cm.popretz {ra, s0-s11}, %0"
+)
+
(define_insn "@gpr_multi_push_up_to_ra_<mode>"
[(set (mem:X (plus:X (reg:X SP_REGNUM)
(const_int <slot0_offset>)))
diff --git a/gcc/testsuite/gcc.target/riscv/rv32e_zcmp.c b/gcc/testsuite/gcc.target/riscv/rv32e_zcmp.c
index 6dbe489da9b..05e52df99c2 100644
--- a/gcc/testsuite/gcc.target/riscv/rv32e_zcmp.c
+++ b/gcc/testsuite/gcc.target/riscv/rv32e_zcmp.c
@@ -237,3 +237,15 @@ void foo(void)
f1();
f2();
}
+
+/*
+**test_popretz:
+** cm.push {ra}, -16
+** call f1
+** cm.popretz {ra}, 16
+*/
+long test_popretz()
+{
+ f1();
+ return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rv32i_zcmp.c b/gcc/testsuite/gcc.target/riscv/rv32i_zcmp.c
index 924197cb3c4..7d5c1121c35 100644
--- a/gcc/testsuite/gcc.target/riscv/rv32i_zcmp.c
+++ b/gcc/testsuite/gcc.target/riscv/rv32i_zcmp.c
@@ -237,3 +237,15 @@ void foo(void)
f1();
f2();
}
+
+/*
+**test_popretz:
+** cm.push {ra}, -16
+** call f1
+** cm.popretz {ra}, 16
+*/
+long test_popretz()
+{
+ f1();
+ return 0;
+}
--
2.17.1
^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH 3/4] [RISC-V] resolve confilct between zcmp multi push/pop and shrink-wrap-separate
2023-06-07 5:52 [PATCH 0/4] [RISC-V] support zcmp extention Fei Gao
2023-06-07 5:52 ` [PATCH 1/4][V4][RISC-V] support cm.push cm.pop cm.popret in zcmp Fei Gao
2023-06-07 5:52 ` [PATCH 2/4] [RISC-V] support cm.popretz " Fei Gao
@ 2023-06-07 5:52 ` Fei Gao
2023-06-12 15:17 ` Kito Cheng
2023-06-12 19:26 ` Jeff Law
2023-06-07 5:52 ` [PATCH 4/4] [RISC-V] support cm.mva01s cm.mvsa01 in zcmp Fei Gao
3 siblings, 2 replies; 17+ messages in thread
From: Fei Gao @ 2023-06-07 5:52 UTC (permalink / raw)
To: gcc-patches; +Cc: kito.cheng, palmer, jeffreyalaw, sinan.lin, jiawei, Fei Gao
Disable zcmp multi push/pop when shrink-wrap-separate is active.
At -Os, which prefers smaller code size, shrink-wrap-separate is off by
default, so zcmp multi push/pop is enabled.
At -O2 and other levels that prefer speed, shrink-wrap-separate is on by
default, so zcmp multi push/pop is disabled. To force zcmp multi push/pop
in this case, -fno-shrink-wrap-separate has to be given explicitly.
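The policy above can be sketched as a small predicate. This is a simplified
model under stated assumptions: the struct and its field names are
hypothetical, and the other reasons for avoiding multi push (interrupt
handlers, varargs, registers outside the push list, and so on) are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-function state; field names are illustrative only.  */
struct fn_state
{
  bool shrink_wrap_separate_flag; /* -fshrink-wrap-separate in effect.  */
  bool optimized_for_speed;       /* Function is optimized for speed.  */
  bool target_vetoes_separate;    /* Target refuses separate wrapping,
                                     e.g. save/restore libcalls in use.  */
};

/* True if separate shrink-wrapping would actually be applied.  */
static bool
separate_shrink_wrap_active (const struct fn_state *f)
{
  assert (f != NULL);
  return f->shrink_wrap_separate_flag
         && f->optimized_for_speed
         && !f->target_vetoes_separate;
}

/* zcmp multi push/pop is only allowed when the two cannot clash.  */
static bool
zcmp_multi_push_allowed (const struct fn_state *f)
{
  return !separate_shrink_wrap_active (f);
}
```

A size-optimized function or one built with -fno-shrink-wrap-separate gets
multi push/pop in this model; a speed-optimized function with the flag left
on does not.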
The following test case shows the issues at -O2 before this patch, when both
shrink-wrap-separate and zcmp multi push/pop are active:
1. The s registers are stored twice.
2. cm.push stores ra, s0-s11 in the reverse order of the normal prologue,
which corrupts the stack and makes restoring the s registers fail.
Test case: zcmp_shrink_wrap_separate.c, included in this patch.
output asm before this patch:
calc_func:
cm.push {ra, s0-s3}, -32
...
beq a5,zero,.L2
...
.L2:
...
sw s1,20(sp) //issue here
sw s3,12(sp) //issue here
...
sw s2,16(sp) //issue here
output asm after this patch:
calc_func:
addi sp,sp,-32
sw s0,24(sp)
...
beq a5,zero,.L2
...
.L2:
...
sw s1,20(sp)
sw s3,12(sp)
...
sw s2,16(sp)
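The slot clash behind the second issue can be made concrete with a small
sketch for the {ra, s0-s3} list and a 32-byte frame on rv32. Both layouts are
assumptions: the normal-prologue offsets follow the stores in the listings
above (with ra assumed to sit in the top word), and the cm.push offsets
follow my reading of the Zcmp layout, where the highest-numbered register
lands in the highest word. All names are hypothetical.

```c
#include <assert.h>

/* Registers in the list {ra, s0-s3}; names are for illustration only.  */
enum reg { RA, S0, S1, S2, S3, NREGS };

/* Offsets from the new sp in a normal prologue: ra sits in the highest
   word of the frame and s0, s1, ... follow below it (assumed layout).  */
static void
normal_prologue_offsets (int frame_size, int word, int off[NREGS])
{
  for (int r = RA; r < NREGS; r++)
    off[r] = frame_size - word * (r + 1);
}

/* Offsets from the new sp after cm.push: the highest-numbered register
   sits in the highest word and ra in the lowest used slot
   (assumed layout).  */
static void
cm_push_offsets (int frame_size, int word, int off[NREGS])
{
  for (int r = RA; r < NREGS; r++)
    off[r] = frame_size - word * (NREGS - r);
}

/* Count registers whose save slot differs between the two layouts.  */
static int
count_slot_mismatches (int frame_size, int word)
{
  int a[NREGS], b[NREGS], n = 0;
  assert (frame_size >= word * NREGS);
  normal_prologue_offsets (frame_size, word, a);
  cm_push_offsets (frame_size, word, b);
  for (int r = RA; r < NREGS; r++)
    n += (a[r] != b[r]);
  return n;
}
```

Under these assumptions the separate "sw s3,12(sp)" lands on the slot where
cm.push put ra, and only s1 happens to share a slot in both layouts.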
gcc/ChangeLog:
* config/riscv/riscv.cc
(riscv_avoid_shrink_wrapping_separate): New function. Wrap the
condition check formerly in riscv_get_separate_components.
(riscv_avoid_multi_push): Avoid multi push if shrink-wrap-separate
is active.
(riscv_get_separate_components): Call riscv_avoid_shrink_wrapping_separate.
* shrink-wrap.cc (try_shrink_wrapping_separate): Call
use_shrink_wrapping_separate.
(use_shrink_wrapping_separate): New function. Wrap the condition
check formerly in try_shrink_wrapping_separate.
* shrink-wrap.h (use_shrink_wrapping_separate): Declare.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/zcmp_shrink_wrap_separate.c: New test.
* gcc.target/riscv/zcmp_shrink_wrap_separate2.c: New test.
Signed-off-by: Fei Gao <gaofei@eswincomputing.com>
Co-Authored-By: Zhangjin Liao <liaozhangjin@eswincomputing.com>
---
gcc/config/riscv/riscv.cc | 19 +++-
gcc/shrink-wrap.cc | 25 +++--
gcc/shrink-wrap.h | 1 +
.../riscv/zcmp_shrink_wrap_separate.c | 97 +++++++++++++++++++
.../riscv/zcmp_shrink_wrap_separate2.c | 97 +++++++++++++++++++
5 files changed, 228 insertions(+), 11 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/riscv/zcmp_shrink_wrap_separate.c
create mode 100644 gcc/testsuite/gcc.target/riscv/zcmp_shrink_wrap_separate2.c
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index f60c241a526..b505cdeca34 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -64,6 +64,7 @@ along with GCC; see the file COPYING3. If not see
#include "cfghooks.h"
#include "cfgloop.h"
#include "cfgrtl.h"
+#include "shrink-wrap.h"
#include "sel-sched.h"
#include "fold-const.h"
#include "gimple-iterator.h"
@@ -389,6 +390,7 @@ static const struct riscv_tune_param optimize_size_tune_info = {
false, /* use_divmod_expansion */
};
+static bool riscv_avoid_shrink_wrapping_separate ();
static tree riscv_handle_fndecl_attribute (tree *, tree, tree, int, bool *);
static tree riscv_handle_type_attribute (tree *, tree, tree, int, bool *);
@@ -4910,6 +4912,8 @@ riscv_avoid_multi_push(const struct riscv_frame_info *frame)
|| cfun->machine->interrupt_handler_p
|| cfun->machine->varargs_size != 0
|| crtl->args.pretend_args_size != 0
+ || (use_shrink_wrapping_separate ()
+ && !riscv_avoid_shrink_wrapping_separate ())
|| (frame->mask & ~ MULTI_PUSH_GPR_MASK))
return true;
@@ -6077,6 +6081,17 @@ riscv_epilogue_uses (unsigned int regno)
return false;
}
+static bool
+riscv_avoid_shrink_wrapping_separate ()
+{
+ if (riscv_use_save_libcall (&cfun->machine->frame)
+ || cfun->machine->interrupt_handler_p
+ || !cfun->machine->frame.gp_sp_offset.is_constant ())
+ return true;
+
+ return false;
+}
+
/* Implement TARGET_SHRINK_WRAP_GET_SEPARATE_COMPONENTS. */
static sbitmap
@@ -6086,9 +6101,7 @@ riscv_get_separate_components (void)
sbitmap components = sbitmap_alloc (FIRST_PSEUDO_REGISTER);
bitmap_clear (components);
- if (riscv_use_save_libcall (&cfun->machine->frame)
- || cfun->machine->interrupt_handler_p
- || !cfun->machine->frame.gp_sp_offset.is_constant ())
+ if (riscv_avoid_shrink_wrapping_separate ())
return components;
offset = cfun->machine->frame.gp_sp_offset.to_constant ();
diff --git a/gcc/shrink-wrap.cc b/gcc/shrink-wrap.cc
index b8d7b557130..d534964321a 100644
--- a/gcc/shrink-wrap.cc
+++ b/gcc/shrink-wrap.cc
@@ -1776,16 +1776,14 @@ insert_prologue_epilogue_for_components (sbitmap components)
commit_edge_insertions ();
}
-/* The main entry point to this subpass. FIRST_BB is where the prologue
- would be normally put. */
-void
-try_shrink_wrapping_separate (basic_block first_bb)
+bool
+use_shrink_wrapping_separate (void)
{
if (!(SHRINK_WRAPPING_ENABLED
- && flag_shrink_wrap_separate
- && optimize_function_for_speed_p (cfun)
- && targetm.shrink_wrap.get_separate_components))
- return;
+ && flag_shrink_wrap_separate
+ && optimize_function_for_speed_p (cfun)
+ && targetm.shrink_wrap.get_separate_components))
+ return false;
/* We don't handle "strange" functions. */
if (cfun->calls_alloca
@@ -1794,6 +1792,17 @@ try_shrink_wrapping_separate (basic_block first_bb)
|| crtl->calls_eh_return
|| crtl->has_nonlocal_goto
|| crtl->saves_all_registers)
+ return false;
+
+ return true;
+}
+
+/* The main entry point to this subpass. FIRST_BB is where the prologue
+ would be normally put. */
+void
+try_shrink_wrapping_separate (basic_block first_bb)
+{
+ if (!use_shrink_wrapping_separate ())
return;
/* Ask the target what components there are. If it returns NULL, don't
diff --git a/gcc/shrink-wrap.h b/gcc/shrink-wrap.h
index 161647711a3..82386c2b712 100644
--- a/gcc/shrink-wrap.h
+++ b/gcc/shrink-wrap.h
@@ -26,6 +26,7 @@ along with GCC; see the file COPYING3. If not see
extern bool requires_stack_frame_p (rtx_insn *, HARD_REG_SET, HARD_REG_SET);
extern void try_shrink_wrapping (edge *entry_edge, rtx_insn *prologue_seq);
extern void try_shrink_wrapping_separate (basic_block first_bb);
+extern bool use_shrink_wrapping_separate (void);
#define SHRINK_WRAPPING_ENABLED \
(flag_shrink_wrap && targetm.have_simple_return ())
diff --git a/gcc/testsuite/gcc.target/riscv/zcmp_shrink_wrap_separate.c b/gcc/testsuite/gcc.target/riscv/zcmp_shrink_wrap_separate.c
new file mode 100644
index 00000000000..11f87aee607
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zcmp_shrink_wrap_separate.c
@@ -0,0 +1,97 @@
+/* { dg-do compile } */
+/* { dg-options " -O2 -march=rv32imaf_zca_zcmp -mabi=ilp32f" } */
+/* { dg-skip-if "" { *-*-* } {"-O0" "-O1" "-Os" "-Og" "-O3" "-Oz" "-flto"} } */
+
+typedef struct MAT_PARAMS_S
+{
+ int N;
+ signed short *A;
+ signed short *B;
+ signed int *C;
+} mat_params;
+
+typedef struct CORE_PORTABLE_S
+{
+ unsigned char portable_id;
+} core_portable;
+
+typedef struct RESULTS_S
+{
+ /* inputs */
+ signed short seed1; /* Initializing seed */
+ signed short seed2; /* Initializing seed */
+ signed short seed3; /* Initializing seed */
+ void * memblock[4]; /* Pointer to safe memory location */
+ unsigned int size; /* Size of the data */
+ unsigned int iterations; /* Number of iterations to execute */
+ unsigned int execs; /* Bitmask of operations to execute */
+ struct list_head_s *list;
+ mat_params mat;
+ /* outputs */
+ unsigned short crc;
+ unsigned short crclist;
+ unsigned short crcmatrix;
+ unsigned short crcstate;
+ signed short err;
+ /* Multithread specific */
+ core_portable port;
+} core_results;
+
+extern signed short
+core_bench_state(unsigned int, void *, signed short, signed short, signed short, unsigned short);
+
+extern signed short
+core_bench_matrix(mat_params *, signed short, unsigned short);
+
+extern unsigned short
+crcu16(signed short, unsigned short);
+
+signed short
+calc_func(signed short *pdata, core_results *res)
+{
+ signed short data = *pdata;
+ signed short retval;
+ unsigned char optype
+ = (data >> 7)
+ & 1; /* bit 7 indicates if the function result has been cached */
+ if (optype) /* if cached, use cache */
+ return (data & 0x007f);
+ else
+ { /* otherwise calculate and cache the result */
+ signed short flag = data & 0x7; /* bits 0-2 is type of function to perform */
+ signed short dtype
+ = ((data >> 3)
+ & 0xf); /* bits 3-6 is specific data for the operation */
+ dtype |= dtype << 4; /* replicate the lower 4 bits to get an 8b value */
+ switch (flag)
+ {
+ case 0:
+ if (dtype < 0x22) /* set min period for bit corruption */
+ dtype = 0x22;
+ retval = core_bench_state(res->size,
+ res->memblock[3],
+ res->seed1,
+ res->seed2,
+ dtype,
+ res->crc);
+ if (res->crcstate == 0)
+ res->crcstate = retval;
+ break;
+ case 1:
+ retval = core_bench_matrix(&(res->mat), dtype, res->crc);
+ if (res->crcmatrix == 0)
+ res->crcmatrix = retval;
+ break;
+ default:
+ retval = data;
+ break;
+ }
+ res->crc = crcu16(retval, res->crc);
+ retval &= 0x007f;
+ *pdata = (data & 0xff00) | 0x0080 | retval; /* cache the result */
+ return retval;
+ }
+}
+
+/* { dg-final { scan-assembler-not "cm\.push" } } */
+
diff --git a/gcc/testsuite/gcc.target/riscv/zcmp_shrink_wrap_separate2.c b/gcc/testsuite/gcc.target/riscv/zcmp_shrink_wrap_separate2.c
new file mode 100644
index 00000000000..ec7e9c39b5d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zcmp_shrink_wrap_separate2.c
@@ -0,0 +1,97 @@
+/* { dg-do compile } */
+/* { dg-options " -O2 -fno-shrink-wrap-separate -march=rv32imaf_zca_zcmp -mabi=ilp32f" } */
+/* { dg-skip-if "" { *-*-* } {"-O0" "-O1" "-Os" "-Og" "-O3" "-Oz" "-flto"} } */
+
+typedef struct MAT_PARAMS_S
+{
+ int N;
+ signed short *A;
+ signed short *B;
+ signed int *C;
+} mat_params;
+
+typedef struct CORE_PORTABLE_S
+{
+ unsigned char portable_id;
+} core_portable;
+
+typedef struct RESULTS_S
+{
+ /* inputs */
+ signed short seed1; /* Initializing seed */
+ signed short seed2; /* Initializing seed */
+ signed short seed3; /* Initializing seed */
+ void * memblock[4]; /* Pointer to safe memory location */
+ unsigned int size; /* Size of the data */
+ unsigned int iterations; /* Number of iterations to execute */
+ unsigned int execs; /* Bitmask of operations to execute */
+ struct list_head_s *list;
+ mat_params mat;
+ /* outputs */
+ unsigned short crc;
+ unsigned short crclist;
+ unsigned short crcmatrix;
+ unsigned short crcstate;
+ signed short err;
+ /* Multithread specific */
+ core_portable port;
+} core_results;
+
+extern signed short
+core_bench_state(unsigned int, void *, signed short, signed short, signed short, unsigned short);
+
+extern signed short
+core_bench_matrix(mat_params *, signed short, unsigned short);
+
+extern unsigned short
+crcu16(signed short, unsigned short);
+
+signed short
+calc_func(signed short *pdata, core_results *res)
+{
+ signed short data = *pdata;
+ signed short retval;
+ unsigned char optype
+ = (data >> 7)
+ & 1; /* bit 7 indicates if the function result has been cached */
+ if (optype) /* if cached, use cache */
+ return (data & 0x007f);
+ else
+ { /* otherwise calculate and cache the result */
+ signed short flag = data & 0x7; /* bits 0-2 is type of function to perform */
+ signed short dtype
+ = ((data >> 3)
+ & 0xf); /* bits 3-6 is specific data for the operation */
+ dtype |= dtype << 4; /* replicate the lower 4 bits to get an 8b value */
+ switch (flag)
+ {
+ case 0:
+ if (dtype < 0x22) /* set min period for bit corruption */
+ dtype = 0x22;
+ retval = core_bench_state(res->size,
+ res->memblock[3],
+ res->seed1,
+ res->seed2,
+ dtype,
+ res->crc);
+ if (res->crcstate == 0)
+ res->crcstate = retval;
+ break;
+ case 1:
+ retval = core_bench_matrix(&(res->mat), dtype, res->crc);
+ if (res->crcmatrix == 0)
+ res->crcmatrix = retval;
+ break;
+ default:
+ retval = data;
+ break;
+ }
+ res->crc = crcu16(retval, res->crc);
+ retval &= 0x007f;
+ *pdata = (data & 0xff00) | 0x0080 | retval; /* cache the result */
+ return retval;
+ }
+}
+
+/* { dg-final { scan-assembler "cm\.push" } } */
+
--
2.17.1
^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH 4/4] [RISC-V] support cm.mva01s cm.mvsa01 in zcmp
2023-06-07 5:52 [PATCH 0/4] [RISC-V] support zcmp extention Fei Gao
` (2 preceding siblings ...)
2023-06-07 5:52 ` [PATCH 3/4] [RISC-V] resolve confilct between zcmp multi push/pop and shrink-wrap-separate Fei Gao
@ 2023-06-07 5:52 ` Fei Gao
2023-07-13 8:18 ` Kito Cheng
3 siblings, 1 reply; 17+ messages in thread
From: Fei Gao @ 2023-06-07 5:52 UTC (permalink / raw)
To: gcc-patches; +Cc: kito.cheng, palmer, jeffreyalaw, sinan.lin, jiawei, Die Li
From: Die Li <lidie@eswincomputing.com>
Signed-off-by: Die Li <lidie@eswincomputing.com>
Co-Authored-By: Fei Gao <gaofei@eswincomputing.com>
gcc/ChangeLog:
* config/riscv/peephole.md: New pattern.
* config/riscv/predicates.md (a0a1_reg_operand): New predicate.
(zcmp_mv_sreg_operand): New predicate.
* config/riscv/riscv.md (A1_REGNUM): New constant.
* config/riscv/zc.md (*mva01s<X:mode>): New pattern.
(*mvsa01<X:mode>): New pattern.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/cm_mv_rv32.c: New test.
---
gcc/config/riscv/peephole.md | 28 +++++++++++++++++++++
gcc/config/riscv/predicates.md | 11 ++++++++
gcc/config/riscv/riscv.md | 1 +
gcc/config/riscv/zc.md | 22 ++++++++++++++++
gcc/testsuite/gcc.target/riscv/cm_mv_rv32.c | 21 ++++++++++++++++
5 files changed, 83 insertions(+)
create mode 100644 gcc/testsuite/gcc.target/riscv/cm_mv_rv32.c
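
Note for reviewers: the matching rule behind the cm.mva01s pattern can be
sketched as a standalone C model. This is a hedged illustration, not the
machine description itself. The function names are hypothetical, the
register numbers follow the constants in riscv.md (s0=8, s1=9, a0=10, a1=11,
s2=18) with s7=23 assumed from the standard x-register numbering, and the
`rve` flag stands in for TARGET_RVE.

```c
#include <stdbool.h>

/* Hard register numbers, mirroring the *_REGNUM constants.  */
enum { S0 = 8, S1 = 9, A0 = 10, A1 = 11, S2 = 18, S7 = 23 };

/* Models the a0a1_reg_operand predicate.  */
static bool
is_a0a1 (int regno)
{
  return regno >= A0 && regno <= A1;
}

/* Models the zcmp_mv_sreg_operand predicate: s0-s1 always qualify,
   s2-s7 only when the target is not RVE.  */
static bool
is_zcmp_sreg (int regno, bool rve)
{
  if (regno >= S0 && regno <= S1)
    return true;
  return !rve && regno >= S2 && regno <= S7;
}

/* Models *mva01s for the pair of moves dst0 <- src0, dst1 <- src1.
   On success, stores the two source registers in the order they would
   be printed (the source of a0 first) and returns true.  */
static bool
model_mva01s (int dst0, int src0, int dst1, int src1, bool rve,
              int *first, int *second)
{
  if (!is_a0a1 (dst0) || !is_a0a1 (dst1) || dst0 == dst1)
    return false;
  if (!is_zcmp_sreg (src0, rve) || !is_zcmp_sreg (src1, rve))
    return false;

  *first = (dst0 == A0) ? src0 : src1;
  *second = (dst0 == A0) ? src1 : src0;
  return true;
}
```

For example, the pair "mv a1,s2; mv a0,s1" qualifies on a non-RVE target and
would print as "cm.mva01s s1,s2", while the same pair is rejected for RVE
because s2 is not available there.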
diff --git a/gcc/config/riscv/peephole.md b/gcc/config/riscv/peephole.md
index 67e7046d7e6..e8cb1ba4838 100644
--- a/gcc/config/riscv/peephole.md
+++ b/gcc/config/riscv/peephole.md
@@ -94,3 +94,31 @@
{
th_mempair_order_operands (operands, true, SImode);
})
+
+;; ZCMP
+(define_peephole2
+ [(set (match_operand:X 0 "a0a1_reg_operand")
+ (match_operand:X 1 "zcmp_mv_sreg_operand"))
+ (set (match_operand:X 2 "a0a1_reg_operand")
+ (match_operand:X 3 "zcmp_mv_sreg_operand"))]
+ "TARGET_ZCMP
+ && (REGNO (operands[2]) != REGNO (operands[0]))"
+ [(parallel [(set (match_dup 0)
+ (match_dup 1))
+ (set (match_dup 2)
+ (match_dup 3))])]
+)
+
+(define_peephole2
+ [(set (match_operand:X 0 "zcmp_mv_sreg_operand")
+ (match_operand:X 1 "a0a1_reg_operand"))
+ (set (match_operand:X 2 "zcmp_mv_sreg_operand")
+ (match_operand:X 3 "a0a1_reg_operand"))]
+ "TARGET_ZCMP
+ && (REGNO (operands[0]) != REGNO (operands[2]))
+ && (REGNO (operands[1]) != REGNO (operands[3]))"
+ [(parallel [(set (match_dup 0)
+ (match_dup 1))
+ (set (match_dup 2)
+ (match_dup 3))])]
+)
diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
index a1b9367b997..6d5e8630cb5 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -207,6 +207,17 @@
(and (match_code "const_int")
(match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 13)")))
+;; ZCMP predicates
+(define_predicate "a0a1_reg_operand"
+ (and (match_operand 0 "register_operand")
+ (match_test "IN_RANGE (REGNO (op), A0_REGNUM, A1_REGNUM)")))
+
+(define_predicate "zcmp_mv_sreg_operand"
+ (and (match_operand 0 "register_operand")
+ (match_test "TARGET_RVE ? IN_RANGE (REGNO (op), S0_REGNUM, S1_REGNUM)
+ : IN_RANGE (REGNO (op), S0_REGNUM, S1_REGNUM)
+ || IN_RANGE (REGNO (op), S2_REGNUM, S7_REGNUM)")))
+
;; Only use branch-on-bit sequences when the mask is not an ANDI immediate.
(define_predicate "branch_on_bit_operand"
(and (match_code "const_int")
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 02802d2685d..25bc3e6ab4c 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -121,6 +121,7 @@
(S0_REGNUM 8)
(S1_REGNUM 9)
(A0_REGNUM 10)
+ (A1_REGNUM 11)
(S2_REGNUM 18)
(S3_REGNUM 19)
(S4_REGNUM 20)
diff --git a/gcc/config/riscv/zc.md b/gcc/config/riscv/zc.md
index 217e115035b..bb4975cd333 100644
--- a/gcc/config/riscv/zc.md
+++ b/gcc/config/riscv/zc.md
@@ -1433,3 +1433,25 @@
"TARGET_ZCMP"
"cm.push {ra, s0-s11}, %0"
)
+
+;; ZCMP mv
+(define_insn "*mva01s<X:mode>"
+ [(set (match_operand:X 0 "a0a1_reg_operand" "=r")
+ (match_operand:X 1 "zcmp_mv_sreg_operand" "r"))
+ (set (match_operand:X 2 "a0a1_reg_operand" "=r")
+ (match_operand:X 3 "zcmp_mv_sreg_operand" "r"))]
+ "TARGET_ZCMP
+ && (REGNO (operands[2]) != REGNO (operands[0]))"
+ { return (REGNO (operands[0]) == A0_REGNUM)?"cm.mva01s\t%1,%3":"cm.mva01s\t%3,%1"; }
+ [(set_attr "mode" "<X:MODE>")])
+
+(define_insn "*mvsa01<X:mode>"
+ [(set (match_operand:X 0 "zcmp_mv_sreg_operand" "=r")
+ (match_operand:X 1 "a0a1_reg_operand" "r"))
+ (set (match_operand:X 2 "zcmp_mv_sreg_operand" "=r")
+ (match_operand:X 3 "a0a1_reg_operand" "r"))]
+ "TARGET_ZCMP
+ && (REGNO (operands[0]) != REGNO (operands[2]))
+ && (REGNO (operands[1]) != REGNO (operands[3]))"
+ { return (REGNO (operands[1]) == A0_REGNUM)?"cm.mvsa01\t%0,%2":"cm.mvsa01\t%2,%0"; }
+ [(set_attr "mode" "<X:MODE>")])
diff --git a/gcc/testsuite/gcc.target/riscv/cm_mv_rv32.c b/gcc/testsuite/gcc.target/riscv/cm_mv_rv32.c
new file mode 100644
index 00000000000..49c94c01603
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/cm_mv_rv32.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options " -Os -march=rv32i_zca_zcmp -mabi=ilp32 " } */
+/* { dg-skip-if "" { *-*-* } {"-O0" "-O1" "-O2" "-Og" "-O3" "-Oz" "-flto"} } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+int func (int a, int b);
+
+/*
+**sum:
+** ...
+** cm.mvsa01 s1,s2
+** call func
+** mv s0,a0
+** cm.mva01s s1,s2
+** call func
+** ...
+*/
+int sum (int a, int b)
+{
+ return func (a, b) + func (a, b);
+}
--
2.17.1
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 1/4][V4][RISC-V] support cm.push cm.pop cm.popret in zcmp
2023-06-07 5:52 ` [PATCH 1/4][V4][RISC-V] support cm.push cm.pop cm.popret in zcmp Fei Gao
@ 2023-06-07 10:11 ` jiawei
2023-08-16 8:33 ` Kito Cheng
1 sibling, 0 replies; 17+ messages in thread
From: jiawei @ 2023-06-07 10:11 UTC (permalink / raw)
To: Fei Gao; +Cc: gcc-patches, kito.cheng, palmer, jeffreyalaw, sinan.lin
There seem to be some indentation format problems in the patch; could you fix them? :)
```
patch:509: indent with spaces.
x_save_size = riscv_stack_align (num_multi_push * UNITS_PER_WORD);
error: patch failed: gcc/config/riscv/riscv.cc:5652
error: gcc/config/riscv/riscv.cc: patch does not apply
```
> -----Original Message-----
> From: "Fei Gao" <gaofei@eswincomputing.com>
> Sent: 2023-06-07 13:52:12 (Wednesday)
> To: gcc-patches@gcc.gnu.org
> Cc: kito.cheng@gmail.com, palmer@dabbelt.com, jeffreyalaw@gmail.com, sinan.lin@linux.alibaba.com, jiawei@iscas.ac.cn, "Fei Gao" <gaofei@eswincomputing.com>
> Subject: [PATCH 1/4][V4][RISC-V] support cm.push cm.pop cm.popret in zcmp
>
> Zcmp can share the same stack allocation logic as save-restore: pre-allocation
> by cm.push, followed by step 1 and step 2.
>
> Please note that cm.push pushes ra, s0-s11 in the reverse order of what save-restore
> does, so the .cfi directives have been adapted accordingly in my patch.
>
> Signed-off-by: Fei Gao <gaofei@eswincomputing.com>
>
> gcc/ChangeLog:
>
> * config/riscv/iterators.md
> slot0_offset: slot 0 offset in stack GPRs area in bytes
> slot1_offset: slot 1 offset in stack GPRs area in bytes
> slot2_offset: likewise
> slot3_offset: likewise
> slot4_offset: likewise
> slot5_offset: likewise
> slot6_offset: likewise
> slot7_offset: likewise
> slot8_offset: likewise
> slot9_offset: likewise
> slot10_offset: likewise
> slot11_offset: likewise
> slot12_offset: likewise
> * config/riscv/predicates.md
> (stack_push_up_to_ra_operand): predicates of stack adjust pushing ra
> (stack_push_up_to_s0_operand): predicates of stack adjust pushing ra, s0
> (stack_push_up_to_s1_operand): likewise
> (stack_push_up_to_s2_operand): likewise
> (stack_push_up_to_s3_operand): likewise
> (stack_push_up_to_s4_operand): likewise
> (stack_push_up_to_s5_operand): likewise
> (stack_push_up_to_s6_operand): likewise
> (stack_push_up_to_s7_operand): likewise
> (stack_push_up_to_s8_operand): likewise
> (stack_push_up_to_s9_operand): likewise
> (stack_push_up_to_s11_operand): likewise
> (stack_pop_up_to_ra_operand): predicates of stack adjust popping ra
> (stack_pop_up_to_s0_operand): predicates of stack adjust popping ra, s0
> (stack_pop_up_to_s1_operand): likewise
> (stack_pop_up_to_s2_operand): likewise
> (stack_pop_up_to_s3_operand): likewise
> (stack_pop_up_to_s4_operand): likewise
> (stack_pop_up_to_s5_operand): likewise
> (stack_pop_up_to_s6_operand): likewise
> (stack_pop_up_to_s7_operand): likewise
> (stack_pop_up_to_s8_operand): likewise
> (stack_pop_up_to_s9_operand): likewise
> (stack_pop_up_to_s11_operand): likewise
> * config/riscv/riscv-protos.h
> (riscv_zcmp_valid_stack_adj_bytes_p): declaration
> * config/riscv/riscv.cc (struct riscv_frame_info): comment change
> (riscv_avoid_multi_push): helper function of riscv_use_multi_push
> (riscv_use_multi_push): true if multi push is used
> (riscv_multi_push_sregs_count): num of sregs in multi-push
> (riscv_multi_push_regs_count): num of regs in multi-push
> (riscv_16bytes_align): align to 16 bytes
> (riscv_stack_align): moved to a better place
> (riscv_save_libcall_count): no functional change
> (riscv_compute_frame_info): add zcmp frame info
> (riscv_adjust_multi_push_cfi_prologue): adjust cfi for cm.push
> (riscv_gen_multi_push_pop_insn): gen function for multi push and pop
> (riscv_expand_prologue): allocate stack by cm.push
> (riscv_adjust_multi_pop_cfi_epilogue): adjust cfi for cm.pop[ret]
> (riscv_expand_epilogue): deallocate stack by cm.pop[ret]
> (zcmp_base_adj): calculate stack adjustment base size
> (zcmp_additional_adj): calculate stack adjustment additional size
> (riscv_zcmp_valid_stack_adj_bytes_p): check if stack adjustment valid
> * config/riscv/riscv.h (RETURN_ADDR_MASK): mask of ra
> (S0_MASK): likewise
> (S1_MASK): likewise
> (S2_MASK): likewise
> (S3_MASK): likewise
> (S4_MASK): likewise
> (S5_MASK): likewise
> (S6_MASK): likewise
> (S7_MASK): likewise
> (S8_MASK): likewise
> (S9_MASK): likewise
> (S10_MASK): likewise
> (S11_MASK): likewise
> (MULTI_PUSH_GPR_MASK): GPR_MASK that cm.push can cover at most
> (ZCMP_MAX_SPIMM): max spimm value
> (ZCMP_SP_INC_STEP): zcmp sp increment step
> (ZCMP_INVALID_S0S10_SREGS_COUNTS): num of s0-s10
> (ZCMP_S0S11_SREGS_COUNTS): num of s0-s11
> (ZCMP_MAX_GRP_SLOTS): max slots of pushing and popping in zcmp
> * config/riscv/riscv.md: include zc.md
> * config/riscv/zc.md: New file. machine description for zcmp
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rv32e_zcmp.c: New test.
> * gcc.target/riscv/rv32i_zcmp.c: New test.
> * gcc.target/riscv/zcmp_stack_alignment.c: New test.
> ---
> gcc/config/riscv/iterators.md | 15 +
> gcc/config/riscv/predicates.md | 96 ++
> gcc/config/riscv/riscv-protos.h | 1 +
> gcc/config/riscv/riscv.cc | 360 +++++-
> gcc/config/riscv/riscv.h | 23 +
> gcc/config/riscv/riscv.md | 2 +
> gcc/config/riscv/zc.md | 1042 +++++++++++++++++
> gcc/testsuite/gcc.target/riscv/rv32e_zcmp.c | 239 ++++
> gcc/testsuite/gcc.target/riscv/rv32i_zcmp.c | 239 ++++
> .../gcc.target/riscv/zcmp_stack_alignment.c | 23 +
> 10 files changed, 2000 insertions(+), 40 deletions(-)
> create mode 100644 gcc/config/riscv/zc.md
> create mode 100644 gcc/testsuite/gcc.target/riscv/rv32e_zcmp.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/rv32i_zcmp.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/zcmp_stack_alignment.c
>
> diff --git a/gcc/config/riscv/iterators.md b/gcc/config/riscv/iterators.md
> index d374a10810c..6ed4174f9cc 100644
> --- a/gcc/config/riscv/iterators.md
> +++ b/gcc/config/riscv/iterators.md
> @@ -120,6 +120,21 @@
> (define_mode_attr shiftm1 [(SI "const_si_mask_operand") (DI "const_di_mask_operand")])
> (define_mode_attr shiftm1p [(SI "DsS") (DI "DsD")])
>
> +; zcmp mode attribute
> +(define_mode_attr slot0_offset [(SI "-4") (DI "-8")])
> +(define_mode_attr slot1_offset [(SI "-8") (DI "-16")])
> +(define_mode_attr slot2_offset [(SI "-12") (DI "-24")])
> +(define_mode_attr slot3_offset [(SI "-16") (DI "-32")])
> +(define_mode_attr slot4_offset [(SI "-20") (DI "-40")])
> +(define_mode_attr slot5_offset [(SI "-24") (DI "-48")])
> +(define_mode_attr slot6_offset [(SI "-28") (DI "-56")])
> +(define_mode_attr slot7_offset [(SI "-32") (DI "-64")])
> +(define_mode_attr slot8_offset [(SI "-36") (DI "-72")])
> +(define_mode_attr slot9_offset [(SI "-40") (DI "-80")])
> +(define_mode_attr slot10_offset [(SI "-44") (DI "-88")])
> +(define_mode_attr slot11_offset [(SI "-48") (DI "-96")])
> +(define_mode_attr slot12_offset [(SI "-52") (DI "-104")])
> +
> ;; -------------------------------------------------------------------
> ;; Code Iterators
> ;; -------------------------------------------------------------------
> diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
> index 04ca6ceabc7..ab67b3332f0 100644
> --- a/gcc/config/riscv/predicates.md
> +++ b/gcc/config/riscv/predicates.md
> @@ -65,6 +65,102 @@
> (ior (match_operand 0 "const_0_operand")
> (match_operand 0 "register_operand")))
>
> +(define_predicate "stack_push_up_to_ra_operand"
> + (and (match_code "const_int")
> + (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 1)")))
> +
> +(define_predicate "stack_push_up_to_s0_operand"
> + (and (match_code "const_int")
> + (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 2)")))
> +
> +(define_predicate "stack_push_up_to_s1_operand"
> + (and (match_code "const_int")
> + (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 3)")))
> +
> +(define_predicate "stack_push_up_to_s2_operand"
> + (and (match_code "const_int")
> + (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 4)")))
> +
> +(define_predicate "stack_push_up_to_s3_operand"
> + (and (match_code "const_int")
> + (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 5)")))
> +
> +(define_predicate "stack_push_up_to_s4_operand"
> + (and (match_code "const_int")
> + (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 6)")))
> +
> +(define_predicate "stack_push_up_to_s5_operand"
> + (and (match_code "const_int")
> + (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 7)")))
> +
> +(define_predicate "stack_push_up_to_s6_operand"
> + (and (match_code "const_int")
> + (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 8)")))
> +
> +(define_predicate "stack_push_up_to_s7_operand"
> + (and (match_code "const_int")
> + (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 9)")))
> +
> +(define_predicate "stack_push_up_to_s8_operand"
> + (and (match_code "const_int")
> + (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 10)")))
> +
> +(define_predicate "stack_push_up_to_s9_operand"
> + (and (match_code "const_int")
> + (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 11)")))
> +
> +(define_predicate "stack_push_up_to_s11_operand"
> + (and (match_code "const_int")
> + (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 13)")))
> +
> +(define_predicate "stack_pop_up_to_ra_operand"
> + (and (match_code "const_int")
> + (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 1)")))
> +
> +(define_predicate "stack_pop_up_to_s0_operand"
> + (and (match_code "const_int")
> + (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 2)")))
> +
> +(define_predicate "stack_pop_up_to_s1_operand"
> + (and (match_code "const_int")
> + (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 3)")))
> +
> +(define_predicate "stack_pop_up_to_s2_operand"
> + (and (match_code "const_int")
> + (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 4)")))
> +
> +(define_predicate "stack_pop_up_to_s3_operand"
> + (and (match_code "const_int")
> + (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 5)")))
> +
> +(define_predicate "stack_pop_up_to_s4_operand"
> + (and (match_code "const_int")
> + (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 6)")))
> +
> +(define_predicate "stack_pop_up_to_s5_operand"
> + (and (match_code "const_int")
> + (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 7)")))
> +
> +(define_predicate "stack_pop_up_to_s6_operand"
> + (and (match_code "const_int")
> + (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 8)")))
> +
> +(define_predicate "stack_pop_up_to_s7_operand"
> + (and (match_code "const_int")
> + (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 9)")))
> +
> +(define_predicate "stack_pop_up_to_s8_operand"
> + (and (match_code "const_int")
> + (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 10)")))
> +
> +(define_predicate "stack_pop_up_to_s9_operand"
> + (and (match_code "const_int")
> + (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 11)")))
> +
> +(define_predicate "stack_pop_up_to_s11_operand"
> + (and (match_code "const_int")
> + (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 13)")))
> +
> ;; Only use branch-on-bit sequences when the mask is not an ANDI immediate.
> (define_predicate "branch_on_bit_operand"
> (and (match_code "const_int")
> diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
> index 00e1b20c6c6..f23b11622a2 100644
> --- a/gcc/config/riscv/riscv-protos.h
> +++ b/gcc/config/riscv/riscv-protos.h
> @@ -56,6 +56,7 @@ extern bool riscv_split_64bit_move_p (rtx, rtx);
> extern void riscv_split_doubleword_move (rtx, rtx);
> extern const char *riscv_output_move (rtx, rtx);
> extern const char *riscv_output_return ();
> +extern bool riscv_zcmp_valid_stack_adj_bytes_p(HOST_WIDE_INT, int);
>
> #ifdef RTX_CODE
> extern void riscv_expand_int_scc (rtx, enum rtx_code, rtx, rtx);
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> index 3954c89a039..c476c699f4c 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -126,6 +126,14 @@ struct GTY(()) riscv_frame_info {
> /* How much the GPR save/restore routines adjust sp (or 0 if unused). */
> unsigned save_libcall_adjustment;
>
> + /* the minimum number of bytes, in multiples of 16-byte address increments,
> + required to cover the registers in a multi push & pop. */
> + unsigned multi_push_adj_base;
> +
> + /* the number of additional 16-byte address increments allocated for the stack frame
> + in a multi push & pop. */
> + unsigned multi_push_adj_addi;
> +
> /* Offsets of fixed-point and floating-point save areas from frame bottom */
> poly_int64 gp_sp_offset;
> poly_int64 fp_sp_offset;
> @@ -422,6 +430,16 @@ static const struct riscv_tune_info riscv_tune_info_table[] = {
> #include "riscv-cores.def"
> };
>
> +typedef enum
> +{
> + PUSH_IDX = 0,
> + POP_IDX,
> + POPRET_IDX,
> + ZCMP_OP_NUM
> +} riscv_zcmp_op_t;
> +
> +typedef insn_code (* code_for_push_pop_t)(machine_mode);
> +
> void riscv_frame_info::reset(void)
> {
> total_size = 0;
> @@ -4876,6 +4894,37 @@ riscv_save_reg_p (unsigned int regno)
> return false;
> }
>
> +/* Return TRUE if Zcmp push and pop insns should be
> + avoided. FALSE otherwise.
> + Only use multi push & pop if all GPRs masked can be covered,
> + and stack access is SP based,
> + and GPRs are at top of the stack frame,
> + and no conflicts in stack allocation with other features */
> +static bool
> +riscv_avoid_multi_push(const struct riscv_frame_info *frame)
> +{
> + if (!TARGET_ZCMP
> + || crtl->calls_eh_return
> + || frame_pointer_needed
> + || cfun->machine->interrupt_handler_p
> + || cfun->machine->varargs_size != 0
> + || crtl->args.pretend_args_size != 0
> + || (frame->mask & ~ MULTI_PUSH_GPR_MASK))
> + return true;
> +
> + return false;
> +}
> +
> +/* Determine whether to use multi push insn. */
> +static bool
> +riscv_use_multi_push(const struct riscv_frame_info *frame)
> +{
> + if (riscv_avoid_multi_push (frame))
> + return false;
> +
> + return (frame->multi_push_adj_base != 0);
> +}
> +
> /* Return TRUE if a libcall to save/restore GPRs should be
> avoided. FALSE otherwise. */
> static bool
> @@ -4913,6 +4962,51 @@ riscv_save_libcall_count (unsigned mask)
> abort ();
> }
>
> +/* calculate number of s regs in multi push and pop.
> + Note that {s0-s10} is not valid in Zcmp, use {s0-s11} instead. */
> +static unsigned
> +riscv_multi_push_sregs_count (unsigned mask)
> +{
> + unsigned num = riscv_save_libcall_count (mask);
> + return (num == ZCMP_INVALID_S0S10_SREGS_COUNTS)
> + ? ZCMP_S0S11_SREGS_COUNTS
> + : num;
> +}
> +
> +/* calculate number of regs(ra, s0-sx) in multi push and pop. */
> +static unsigned
> +riscv_multi_push_regs_count (unsigned mask)
> +{
> + /* 1 is for ra */
> + return riscv_multi_push_sregs_count (mask) + 1;
> +}
> +
> +/* Handle 16 bytes align for poly_int. */
> +static poly_int64
> +riscv_16bytes_align (poly_int64 value)
> +{
> + return aligned_upper_bound (value, 16);
> +}
> +
> +static HOST_WIDE_INT
> +riscv_16bytes_align (HOST_WIDE_INT value)
> +{
> + return ROUND_UP(value, 16);
> +}
> +
> +/* Handle stack align for poly_int. */
> +static poly_int64
> +riscv_stack_align (poly_int64 value)
> +{
> + return aligned_upper_bound (value, PREFERRED_STACK_BOUNDARY / 8);
> +}
> +
> +static HOST_WIDE_INT
> +riscv_stack_align (HOST_WIDE_INT value)
> +{
> + return RISCV_STACK_ALIGN (value);
> +}
> +
> /* Populate the current function's riscv_frame_info structure.
>
> RISC-V stack frames grown downward. High addresses are at the top.
> @@ -4938,7 +5032,7 @@ riscv_save_libcall_count (unsigned mask)
> | GPR save area | + UNITS_PER_WORD
> | |
> +-------------------------------+ <-- stack_pointer_rtx + fp_sp_offset
> - | | + UNITS_PER_HWVALUE
> + | | + UNITS_PER_FP_REG
> | FPR save area |
> | |
> +-------------------------------+ <-- frame_pointer_rtx (virtual)
> @@ -4957,19 +5051,6 @@ riscv_save_libcall_count (unsigned mask)
>
> static HOST_WIDE_INT riscv_first_stack_step (struct riscv_frame_info *frame, poly_int64 remaining_size);
>
> -/* Handle stack align for poly_int. */
> -static poly_int64
> -riscv_stack_align (poly_int64 value)
> -{
> - return aligned_upper_bound (value, PREFERRED_STACK_BOUNDARY / 8);
> -}
> -
> -static HOST_WIDE_INT
> -riscv_stack_align (HOST_WIDE_INT value)
> -{
> - return RISCV_STACK_ALIGN (value);
> -}
> -
> static void
> riscv_compute_frame_info (void)
> {
> @@ -5017,8 +5098,9 @@ riscv_compute_frame_info (void)
> if (frame->mask)
> {
> x_save_size = riscv_stack_align (num_x_saved * UNITS_PER_WORD);
> - unsigned num_save_restore = 1 + riscv_save_libcall_count (frame->mask);
>
> + /* 1 is for ra */
> + unsigned num_save_restore = 1 + riscv_save_libcall_count (frame->mask);
> /* Only use save/restore routines if they don't alter the stack size. */
> if (riscv_stack_align (num_save_restore * UNITS_PER_WORD) == x_save_size
> && !riscv_avoid_save_libcall ())
> @@ -5030,6 +5112,15 @@ riscv_compute_frame_info (void)
>
> frame->save_libcall_adjustment = x_save_size;
> }
> +
> + if (!riscv_avoid_multi_push (frame))
> + {
> + /* num(ra, s0-sx) */
> + unsigned num_multi_push =
> + riscv_multi_push_regs_count (frame->mask);
> + x_save_size = riscv_stack_align (num_multi_push * UNITS_PER_WORD);
> + frame->multi_push_adj_base = riscv_16bytes_align (x_save_size);
> + }
> }
>
> /* At the bottom of the frame are any outgoing stack arguments. */
> @@ -5044,7 +5135,15 @@ riscv_compute_frame_info (void)
> frame->fp_sp_offset = offset - UNITS_PER_FP_REG;
> /* Next are the callee-saved GPRs. */
> if (frame->mask)
> - offset += x_save_size;
> + {
> + offset += x_save_size;
> + /* Align to 16 bytes and add padding to the GPR part to honor
> + both stack alignment and zcmp push/pop size alignment. */
> + if (riscv_use_multi_push (frame)
> + && known_lt(offset,
> + frame->multi_push_adj_base + ZCMP_SP_INC_STEP * ZCMP_MAX_SPIMM))
> + offset = riscv_16bytes_align (offset);
> + }
> frame->gp_sp_offset = offset - UNITS_PER_WORD;
> /* The hard frame pointer points above the callee-saved GPRs. */
> frame->hard_frame_pointer_offset = offset;
> @@ -5388,6 +5487,42 @@ riscv_adjust_libcall_cfi_prologue ()
> return dwarf;
> }
>
> +static rtx
> +riscv_adjust_multi_push_cfi_prologue (int saved_size)
> +{
> + rtx dwarf = NULL_RTX;
> + rtx adjust_sp_rtx, reg, mem, insn;
> + unsigned int mask = cfun->machine->frame.mask;
> + int offset;
> + int saved_cnt = 0;
> +
> + if (mask & S10_MASK)
> + mask |= S11_MASK;
> +
> + for (int regno = GP_REG_LAST; regno >= GP_REG_FIRST; regno--)
> + if (BITSET_P (mask & MULTI_PUSH_GPR_MASK, regno - GP_REG_FIRST))
> + {
> + /* The save order is s11-s0, ra
> + from high to low addr. */
> + offset = saved_size - UNITS_PER_WORD * (++saved_cnt);
> +
> + reg = gen_rtx_REG (Pmode, regno);
> + mem = gen_frame_mem (Pmode, plus_constant (Pmode,
> + stack_pointer_rtx,
> + offset));
> +
> + insn = gen_rtx_SET (mem, reg);
> + dwarf = alloc_reg_note (REG_CFA_OFFSET, insn, dwarf);
> + }
> +
> + /* Debug info for adjust sp. */
> + adjust_sp_rtx = gen_rtx_SET (stack_pointer_rtx,
> + plus_constant(Pmode, stack_pointer_rtx, -saved_size));
> + dwarf = alloc_reg_note (REG_CFA_ADJUST_CFA, adjust_sp_rtx,
> + dwarf);
> + return dwarf;
> +}
> +
> static void
> riscv_emit_stack_tie (void)
> {
> @@ -5397,6 +5532,45 @@ riscv_emit_stack_tie (void)
> emit_insn (gen_stack_tiedi (stack_pointer_rtx, hard_frame_pointer_rtx));
> }
>
> +/* Zcmp multi push and pop code_for_push_pop function pointer array. */
> +const code_for_push_pop_t code_for_push_pop [ZCMP_MAX_GRP_SLOTS][ZCMP_OP_NUM] = {
> + {code_for_gpr_multi_push_up_to_ra, code_for_gpr_multi_pop_up_to_ra,
> + code_for_gpr_multi_popret_up_to_ra},
> + {code_for_gpr_multi_push_up_to_s0, code_for_gpr_multi_pop_up_to_s0,
> + code_for_gpr_multi_popret_up_to_s0},
> + {code_for_gpr_multi_push_up_to_s1, code_for_gpr_multi_pop_up_to_s1,
> + code_for_gpr_multi_popret_up_to_s1},
> + {code_for_gpr_multi_push_up_to_s2, code_for_gpr_multi_pop_up_to_s2,
> + code_for_gpr_multi_popret_up_to_s2},
> + {code_for_gpr_multi_push_up_to_s3, code_for_gpr_multi_pop_up_to_s3,
> + code_for_gpr_multi_popret_up_to_s3},
> + {code_for_gpr_multi_push_up_to_s4, code_for_gpr_multi_pop_up_to_s4,
> + code_for_gpr_multi_popret_up_to_s4},
> + {code_for_gpr_multi_push_up_to_s5, code_for_gpr_multi_pop_up_to_s5,
> + code_for_gpr_multi_popret_up_to_s5},
> + {code_for_gpr_multi_push_up_to_s6, code_for_gpr_multi_pop_up_to_s6,
> + code_for_gpr_multi_popret_up_to_s6},
> + {code_for_gpr_multi_push_up_to_s7, code_for_gpr_multi_pop_up_to_s7,
> + code_for_gpr_multi_popret_up_to_s7},
> + {code_for_gpr_multi_push_up_to_s8, code_for_gpr_multi_pop_up_to_s8,
> + code_for_gpr_multi_popret_up_to_s8},
> + {code_for_gpr_multi_push_up_to_s9, code_for_gpr_multi_pop_up_to_s9,
> + code_for_gpr_multi_popret_up_to_s9},
> + {nullptr, nullptr, nullptr},
> + {code_for_gpr_multi_push_up_to_s11, code_for_gpr_multi_pop_up_to_s11,
> + code_for_gpr_multi_popret_up_to_s11}};
> +
> +static rtx
> +riscv_gen_multi_push_pop_insn (riscv_zcmp_op_t op, HOST_WIDE_INT adj_size,
> + unsigned int regs_num)
> +{
> + gcc_assert (op < ZCMP_OP_NUM);
> + gcc_assert (regs_num <= ZCMP_MAX_GRP_SLOTS
> + && regs_num != ZCMP_INVALID_S0S10_SREGS_COUNTS + 1); /* 1 for ra */
> + rtx stack_adj = GEN_INT (adj_size);
> + return GEN_FCN (code_for_push_pop[regs_num - 1][op] (Pmode)) (stack_adj);
> +}
> +
> /* Expand the "prologue" pattern. */
>
> void
> @@ -5405,7 +5579,8 @@ riscv_expand_prologue (void)
> struct riscv_frame_info *frame = &cfun->machine->frame;
> poly_int64 remaining_size = frame->total_size;
> unsigned mask = frame->mask;
> - rtx insn;
> + int spimm, multi_push_additional, stack_adj;
> + rtx insn, dwarf = NULL_RTX;
>
> if (flag_stack_usage_info)
> current_function_static_stack_size = constant_lower_bound (remaining_size);
> @@ -5413,8 +5588,35 @@ riscv_expand_prologue (void)
> if (cfun->machine->naked_p)
> return;
>
> + /* Prefer multi-push to save-restore libcall. */
> + if (riscv_use_multi_push(frame))
> + {
> + remaining_size -= frame->multi_push_adj_base;
> + if (known_gt(remaining_size, 2 * ZCMP_SP_INC_STEP))
> + spimm = 3;
> + else if (known_gt(remaining_size, ZCMP_SP_INC_STEP))
> + spimm = 2;
> + else if (known_gt(remaining_size, 0))
> + spimm = 1;
> + else
> + spimm = 0;
> + multi_push_additional = spimm * ZCMP_SP_INC_STEP;
> + frame->multi_push_adj_addi = multi_push_additional;
> + remaining_size -= multi_push_additional;
> +
> + /* emit multi push insn & dwarf along with it. */
> + stack_adj = frame->multi_push_adj_base + multi_push_additional;
> + insn = emit_insn (riscv_gen_multi_push_pop_insn(PUSH_IDX,
> + -stack_adj, riscv_multi_push_regs_count(frame->mask)));
> + dwarf = riscv_adjust_multi_push_cfi_prologue (stack_adj);
> + RTX_FRAME_RELATED_P (insn) = 1;
> + REG_NOTES (insn) = dwarf;
> +
> + /* Temporarily fib that we need not save GPRs. */
> + frame->mask = 0;
> + }
> /* When optimizing for size, call a subroutine to save the registers. */
> - if (riscv_use_save_libcall (frame))
> + else if (riscv_use_save_libcall (frame))
> {
> rtx dwarf = NULL_RTX;
> dwarf = riscv_adjust_libcall_cfi_prologue ();
> @@ -5430,13 +5632,15 @@ riscv_expand_prologue (void)
> /* Save the registers. */
> if ((frame->mask | frame->fmask) != 0)
> {
> - HOST_WIDE_INT step1 = riscv_first_stack_step (frame, remaining_size);
> -
> - insn = gen_add3_insn (stack_pointer_rtx,
> - stack_pointer_rtx,
> - GEN_INT (-step1));
> - RTX_FRAME_RELATED_P (emit_insn (insn)) = 1;
> - remaining_size -= step1;
> + if (known_gt (remaining_size, frame->frame_pointer_offset))
> + {
> + HOST_WIDE_INT step1 = riscv_first_stack_step (frame, remaining_size);
> + remaining_size -= step1;
> + insn = gen_add3_insn (stack_pointer_rtx,
> + stack_pointer_rtx,
> + GEN_INT (-step1));
> + RTX_FRAME_RELATED_P (emit_insn (insn)) = 1;
> + }
> riscv_for_each_saved_reg (remaining_size, riscv_save_reg, false, false);
> }
>
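
The spimm selection in the riscv_expand_prologue hunk above can be modelled in isolation, which may help when checking the arithmetic. This is only a sketch in plain C outside GCC: `split_adjust` and `struct zcmp_split` are hypothetical names, `long` stands in for poly_int64, and the constant mirrors ZCMP_SP_INC_STEP from riscv.h. It assumes both sizes are multiples of 16, as the frame layout code is meant to guarantee.

```c
#include <assert.h>

#define ZCMP_SP_INC_STEP 16   /* mirrors the riscv.h macro in the patch */

/* Hypothetical result type: how a frame is split between the
   cm.push adjustment and whatever is left for a later addi.  */
struct zcmp_split
{
  int spimm;          /* number of extra 16-byte steps, 0..3 */
  long push_adjust;   /* bytes allocated by cm.push itself */
  long remaining;     /* bytes still to allocate afterwards */
};

/* Model of the spimm selection: TOTAL_SIZE is the whole frame,
   ADJ_BASE the 16-byte-aligned size of the pushed registers.  */
static struct zcmp_split
split_adjust (long total_size, long adj_base)
{
  struct zcmp_split s;
  long remaining = total_size - adj_base;

  assert (adj_base % ZCMP_SP_INC_STEP == 0);

  if (remaining > 2 * ZCMP_SP_INC_STEP)
    s.spimm = 3;
  else if (remaining > ZCMP_SP_INC_STEP)
    s.spimm = 2;
  else if (remaining > 0)
    s.spimm = 1;
  else
    s.spimm = 0;

  s.push_adjust = adj_base + (long) s.spimm * ZCMP_SP_INC_STEP;
  s.remaining = total_size - s.push_adjust;
  return s;
}
```

Under these assumptions, a 112-byte frame with a 16-byte base gives spimm 3, so cm.push allocates 64 bytes and 48 bytes are left for the following stack adjustment.
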
> @@ -5493,6 +5697,32 @@ riscv_expand_prologue (void)
> }
> }
>
> +static rtx
> +riscv_adjust_multi_pop_cfi_epilogue (int saved_size)
> +{
> + rtx dwarf = NULL_RTX;
> + rtx adjust_sp_rtx, reg;
> + unsigned int mask = cfun->machine->frame.mask;
> +
> + if (mask & S10_MASK)
> + mask |= S11_MASK;
> +
> + /* Debug info for adjust sp. */
> + adjust_sp_rtx = gen_rtx_SET (stack_pointer_rtx,
> + plus_constant(Pmode, stack_pointer_rtx, saved_size));
> + dwarf = alloc_reg_note (REG_CFA_ADJUST_CFA, adjust_sp_rtx,
> + dwarf);
> +
> + for (int regno = GP_REG_FIRST; regno <= GP_REG_LAST; regno++)
> + if (BITSET_P (mask, regno - GP_REG_FIRST))
> + {
> + reg = gen_rtx_REG (Pmode, regno);
> + dwarf = alloc_reg_note (REG_CFA_RESTORE, reg, dwarf);
> + }
> +
> + return dwarf;
> +}
> +
> static rtx
> riscv_adjust_libcall_cfi_epilogue ()
> {
> @@ -5532,10 +5762,18 @@ riscv_expand_epilogue (int style)
> struct riscv_frame_info *frame = &cfun->machine->frame;
> unsigned mask = frame->mask;
> HOST_WIDE_INT step2 = 0;
> - bool use_restore_libcall = ((style == NORMAL_RETURN)
> - && riscv_use_save_libcall (frame));
> - unsigned libcall_size = (use_restore_libcall
> - ? frame->save_libcall_adjustment : 0);
> + bool use_multi_pop_normal = ((style == NORMAL_RETURN)
> + && riscv_use_multi_push (frame));
> + bool use_multi_pop_sibcall = ((style == SIBCALL_RETURN)
> + && riscv_use_multi_push (frame));
> + bool use_multi_pop = use_multi_pop_normal || use_multi_pop_sibcall;
> +
> + bool use_restore_libcall = !use_multi_pop && ((style == NORMAL_RETURN)
> + && riscv_use_save_libcall (frame));
> + unsigned libcall_size = use_restore_libcall && !use_multi_pop ?
> + frame->save_libcall_adjustment : 0;
> + unsigned multipop_size = use_multi_pop ?
> + frame->multi_push_adj_base + frame->multi_push_adj_addi : 0;
> rtx ra = gen_rtx_REG (Pmode, RETURN_ADDR_REGNUM);
> rtx insn;
>
> @@ -5606,18 +5844,25 @@ riscv_expand_epilogue (int style)
> REG_NOTES (insn) = dwarf;
> }
>
> - if (use_restore_libcall)
> - frame->mask = 0; /* Temporarily fib for GPRs. */
> + if (use_restore_libcall || use_multi_pop)
> + frame->mask = 0; /* Temporarily fib that we need not save GPRs. */
>
> /* If we need to restore registers, deallocate as much stack as
> possible in the second step without going out of range. */
> - if ((frame->mask | frame->fmask) != 0)
> + if (use_multi_pop)
> + {
> + if (frame->fmask
> + && known_gt (frame->total_size - multipop_size,
> + frame->frame_pointer_offset))
> + step2 = riscv_first_stack_step (frame, frame->total_size - multipop_size);
> + }
> + else if ((frame->mask | frame->fmask) != 0)
> step2 = riscv_first_stack_step (frame, frame->total_size - libcall_size);
>
> - if (use_restore_libcall)
> + if (use_restore_libcall || use_multi_pop)
> frame->mask = mask; /* Undo the above fib. */
>
> - poly_int64 step1 = frame->total_size - step2 - libcall_size;
> + poly_int64 step1 = frame->total_size - step2 - libcall_size - multipop_size;
>
> /* Set TARGET to BASE + STEP1. */
> if (known_gt (step1, 0))
> @@ -5652,7 +5897,7 @@ riscv_expand_epilogue (int style)
> adjust));
> rtx dwarf = NULL_RTX;
> rtx cfa_adjust_rtx = gen_rtx_PLUS (Pmode, stack_pointer_rtx,
> - GEN_INT (step2 + libcall_size));
> + GEN_INT (step2 + libcall_size + multipop_size));
>
> dwarf = alloc_reg_note (REG_CFA_DEF_CFA, cfa_adjust_rtx, dwarf);
> RTX_FRAME_RELATED_P (insn) = 1;
> @@ -5667,15 +5912,15 @@ riscv_expand_epilogue (int style)
> epilogue_cfa_sp_offset = step2;
> }
>
> - if (use_restore_libcall)
> + if (use_restore_libcall || use_multi_pop)
> frame->mask = 0; /* Temporarily fib that we need not save GPRs. */
>
> /* Restore the registers. */
> - riscv_for_each_saved_reg (frame->total_size - step2 - libcall_size,
> + riscv_for_each_saved_reg (frame->total_size - step2 - libcall_size - multipop_size,
> riscv_restore_reg,
> true, style == EXCEPTION_RETURN);
>
> - if (use_restore_libcall)
> + if (use_restore_libcall || use_multi_pop)
> frame->mask = mask; /* Undo the above fib. */
>
> if (need_barrier_p)
> @@ -5689,14 +5934,30 @@ riscv_expand_epilogue (int style)
>
> rtx dwarf = NULL_RTX;
> rtx cfa_adjust_rtx = gen_rtx_PLUS (Pmode, stack_pointer_rtx,
> - GEN_INT (libcall_size));
> + GEN_INT (libcall_size + multipop_size));
> dwarf = alloc_reg_note (REG_CFA_DEF_CFA, cfa_adjust_rtx, dwarf);
> RTX_FRAME_RELATED_P (insn) = 1;
>
> REG_NOTES (insn) = dwarf;
> }
>
> - if (use_restore_libcall)
> + if (use_multi_pop)
> + {
> + unsigned regs_count = riscv_multi_push_regs_count (frame->mask);
> + if (use_multi_pop_normal)
> + insn = emit_jump_insn (
> + riscv_gen_multi_push_pop_insn (POPRET_IDX, multipop_size, regs_count));
> + else
> + insn = emit_insn (
> + riscv_gen_multi_push_pop_insn (POP_IDX, multipop_size, regs_count));
> +
> + rtx dwarf = riscv_adjust_multi_pop_cfi_epilogue (multipop_size);
> + RTX_FRAME_RELATED_P (insn) = 1;
> + REG_NOTES (insn) = dwarf;
> + if (use_multi_pop_normal)
> + return;
> + }
> + else if (use_restore_libcall)
> {
> rtx dwarf = riscv_adjust_libcall_cfi_epilogue ();
> insn = emit_insn (gen_gpr_restore (GEN_INT (riscv_save_libcall_count (mask))));
> @@ -6980,6 +7241,25 @@ riscv_gen_gpr_save_insn (struct riscv_frame_info *frame)
> return gen_rtx_PARALLEL (VOIDmode, vec);
> }
>
> +static HOST_WIDE_INT zcmp_base_adj(int regs_num)
> +{
> + return riscv_16bytes_align ((regs_num) * GET_MODE_SIZE (word_mode));
> +}
> +
> +static HOST_WIDE_INT zcmp_additional_adj(HOST_WIDE_INT total, int regs_num)
> +{
> + return total - zcmp_base_adj(regs_num);
> +}
> +
> +bool riscv_zcmp_valid_stack_adj_bytes_p (HOST_WIDE_INT total, int regs_num)
> +{
> + HOST_WIDE_INT additional_bytes = zcmp_additional_adj(total, regs_num);
> + return additional_bytes == 0
> + || additional_bytes == 1 * ZCMP_SP_INC_STEP
> + || additional_bytes == 2 * ZCMP_SP_INC_STEP
> + || additional_bytes == ZCMP_MAX_SPIMM * ZCMP_SP_INC_STEP;
> +}
> +
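
As a cross-check of riscv_zcmp_valid_stack_adj_bytes_p above, the same test can be written as a range-and-multiple check. This is a standalone sketch, not GCC code: `align16` and `valid_stack_adj_bytes` are hypothetical names, `align16` assumes riscv_16bytes_align rounds up to a multiple of 16 (its definition is not in this hunk), `word_bytes` stands in for GET_MODE_SIZE (word_mode), and `total` is taken as the magnitude of the adjustment.

```c
#include <stdbool.h>

#define ZCMP_MAX_SPIMM 3      /* mirrors the riscv.h macros in the patch */
#define ZCMP_SP_INC_STEP 16

/* Assumed behaviour of riscv_16bytes_align: round up to 16.  */
static long
align16 (long value)
{
  return (value + 15) & ~15L;
}

/* True if TOTAL equals the aligned register save size plus
   0, 16, 32 or 48 extra bytes.  */
static bool
valid_stack_adj_bytes (long total, int regs_num, int word_bytes)
{
  long extra = total - align16 ((long) regs_num * word_bytes);

  return extra >= 0
	 && extra <= ZCMP_MAX_SPIMM * ZCMP_SP_INC_STEP
	 && extra % ZCMP_SP_INC_STEP == 0;
}
```

With ZCMP_MAX_SPIMM equal to 3, this accepts exactly the four values listed in the patch, so either form should reject an adjustment such as 80 bytes for a lone ra on RV32.
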
> /* Return true if it's valid gpr_save pattern. */
>
> bool
> diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
> index 4541255a8ae..2fa555dce2d 100644
> --- a/gcc/config/riscv/riscv.h
> +++ b/gcc/config/riscv/riscv.h
> @@ -420,6 +420,29 @@ ASM_MISA_SPEC
> #define RISCV_CALL_ADDRESS_TEMP(MODE) \
> gen_rtx_REG (MODE, RISCV_CALL_ADDRESS_TEMP_REGNUM)
>
> +#define RETURN_ADDR_MASK ( 1 << RETURN_ADDR_REGNUM)
> +#define S0_MASK ( 1 << S0_REGNUM)
> +#define S1_MASK ( 1 << S1_REGNUM)
> +#define S2_MASK ( 1 << S2_REGNUM)
> +#define S3_MASK ( 1 << S3_REGNUM)
> +#define S4_MASK ( 1 << S4_REGNUM)
> +#define S5_MASK ( 1 << S5_REGNUM)
> +#define S6_MASK ( 1 << S6_REGNUM)
> +#define S7_MASK ( 1 << S7_REGNUM)
> +#define S8_MASK ( 1 << S8_REGNUM)
> +#define S9_MASK ( 1 << S9_REGNUM)
> +#define S10_MASK ( 1 << S10_REGNUM)
> +#define S11_MASK ( 1 << S11_REGNUM)
> +
> +#define MULTI_PUSH_GPR_MASK ( RETURN_ADDR_MASK | S0_MASK | S1_MASK | S2_MASK | S3_MASK \
> + | S4_MASK | S5_MASK | S6_MASK | S7_MASK \
> + | S8_MASK | S9_MASK | S10_MASK | S11_MASK )
> +#define ZCMP_MAX_SPIMM 3
> +#define ZCMP_SP_INC_STEP 16
> +#define ZCMP_INVALID_S0S10_SREGS_COUNTS 11
> +#define ZCMP_S0S11_SREGS_COUNTS 12
> +#define ZCMP_MAX_GRP_SLOTS 13
> +
> #define MCOUNT_NAME "_mcount"
>
> #define NO_PROFILE_COUNTERS 1
> diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
> index be960583101..c858b3bc9ef 100644
> --- a/gcc/config/riscv/riscv.md
> +++ b/gcc/config/riscv/riscv.md
> @@ -113,6 +113,7 @@
>
> (define_constants
> [(RETURN_ADDR_REGNUM 1)
> + (SP_REGNUM 2)
> (GP_REGNUM 3)
> (TP_REGNUM 4)
> (T0_REGNUM 5)
> @@ -3163,3 +3164,4 @@
> (include "sifive-7.md")
> (include "thead.md")
> (include "vector.md")
> +(include "zc.md")
> diff --git a/gcc/config/riscv/zc.md b/gcc/config/riscv/zc.md
> new file mode 100644
> index 00000000000..5c1bf031b8d
> --- /dev/null
> +++ b/gcc/config/riscv/zc.md
> @@ -0,0 +1,1042 @@
> +;; Machine description for RISC-V Zc extension.
> +;; Copyright (C) 2023 Free Software Foundation, Inc.
> +;; Contributed by Fei Gao (gaofei@eswincomputing.com).
> +
> +;; This file is part of GCC.
> +
> +;; GCC is free software; you can redistribute it and/or modify
> +;; it under the terms of the GNU General Public License as published by
> +;; the Free Software Foundation; either version 3, or (at your option)
> +;; any later version.
> +
> +;; GCC is distributed in the hope that it will be useful,
> +;; but WITHOUT ANY WARRANTY; without even the implied warranty of
> +;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> +;; GNU General Public License for more details.
> +
> +;; You should have received a copy of the GNU General Public License
> +;; along with GCC; see the file COPYING3. If not see
> +;; <http://www.gnu.org/licenses/>.
> +
> +(define_insn "@gpr_multi_pop_up_to_ra_<mode>"
> + [(set (reg:X SP_REGNUM)
> + (plus:X (reg:X SP_REGNUM)
> + (match_operand 0 "stack_pop_up_to_ra_operand" "I")))
> + (set (reg:X RETURN_ADDR_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot0_offset>))))]
> + "TARGET_ZCMP"
> + "cm.pop {ra}, %0"
> +)
> +
> +(define_insn "@gpr_multi_pop_up_to_s0_<mode>"
> + [(set (reg:X SP_REGNUM)
> + (plus:X (reg:X SP_REGNUM)
> + (match_operand 0 "stack_pop_up_to_s0_operand" "I")))
> + (set (reg:X S0_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot0_offset>))))
> + (set (reg:X RETURN_ADDR_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot1_offset>))))]
> + "TARGET_ZCMP"
> + "cm.pop {ra, s0}, %0"
> +)
> +
> +(define_insn "@gpr_multi_pop_up_to_s1_<mode>"
> + [(set (reg:X SP_REGNUM)
> + (plus:X (reg:X SP_REGNUM)
> + (match_operand 0 "stack_pop_up_to_s1_operand" "I")))
> + (set (reg:X S1_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot0_offset>))))
> + (set (reg:X S0_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot1_offset>))))
> + (set (reg:X RETURN_ADDR_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot2_offset>))))]
> + "TARGET_ZCMP"
> + "cm.pop {ra, s0-s1}, %0"
> +)
> +
> +(define_insn "@gpr_multi_pop_up_to_s2_<mode>"
> + [(set (reg:X SP_REGNUM)
> + (plus:X (reg:X SP_REGNUM)
> + (match_operand 0 "stack_pop_up_to_s2_operand" "I")))
> + (set (reg:X S2_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot0_offset>))))
> + (set (reg:X S1_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot1_offset>))))
> + (set (reg:X S0_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot2_offset>))))
> + (set (reg:X RETURN_ADDR_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot3_offset>))))]
> + "TARGET_ZCMP"
> + "cm.pop {ra, s0-s2}, %0"
> +)
> +
> +(define_insn "@gpr_multi_pop_up_to_s3_<mode>"
> + [(set (reg:X SP_REGNUM)
> + (plus:X (reg:X SP_REGNUM)
> + (match_operand 0 "stack_pop_up_to_s3_operand" "I")))
> + (set (reg:X S3_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot0_offset>))))
> + (set (reg:X S2_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot1_offset>))))
> + (set (reg:X S1_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot2_offset>))))
> + (set (reg:X S0_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot3_offset>))))
> + (set (reg:X RETURN_ADDR_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot4_offset>))))]
> + "TARGET_ZCMP"
> + "cm.pop {ra, s0-s3}, %0"
> +)
> +
> +(define_insn "@gpr_multi_pop_up_to_s4_<mode>"
> + [(set (reg:X SP_REGNUM)
> + (plus:X (reg:X SP_REGNUM)
> + (match_operand 0 "stack_pop_up_to_s4_operand" "I")))
> + (set (reg:X S4_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot0_offset>))))
> + (set (reg:X S3_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot1_offset>))))
> + (set (reg:X S2_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot2_offset>))))
> + (set (reg:X S1_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot3_offset>))))
> + (set (reg:X S0_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot4_offset>))))
> + (set (reg:X RETURN_ADDR_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot5_offset>))))]
> + "TARGET_ZCMP"
> + "cm.pop {ra, s0-s4}, %0"
> +)
> +
> +(define_insn "@gpr_multi_pop_up_to_s5_<mode>"
> + [(set (reg:X SP_REGNUM)
> + (plus:X (reg:X SP_REGNUM)
> + (match_operand 0 "stack_pop_up_to_s5_operand" "I")))
> + (set (reg:X S5_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot0_offset>))))
> + (set (reg:X S4_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot1_offset>))))
> + (set (reg:X S3_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot2_offset>))))
> + (set (reg:X S2_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot3_offset>))))
> + (set (reg:X S1_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot4_offset>))))
> + (set (reg:X S0_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot5_offset>))))
> + (set (reg:X RETURN_ADDR_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot6_offset>))))]
> + "TARGET_ZCMP"
> + "cm.pop {ra, s0-s5}, %0"
> +)
> +
> +(define_insn "@gpr_multi_pop_up_to_s6_<mode>"
> + [(set (reg:X SP_REGNUM)
> + (plus:X (reg:X SP_REGNUM)
> + (match_operand 0 "stack_pop_up_to_s6_operand" "I")))
> + (set (reg:X S6_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot0_offset>))))
> + (set (reg:X S5_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot1_offset>))))
> + (set (reg:X S4_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot2_offset>))))
> + (set (reg:X S3_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot3_offset>))))
> + (set (reg:X S2_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot4_offset>))))
> + (set (reg:X S1_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot5_offset>))))
> + (set (reg:X S0_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot6_offset>))))
> + (set (reg:X RETURN_ADDR_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot7_offset>))))]
> + "TARGET_ZCMP"
> + "cm.pop {ra, s0-s6}, %0"
> +)
> +
> +(define_insn "@gpr_multi_pop_up_to_s7_<mode>"
> + [(set (reg:X SP_REGNUM)
> + (plus:X (reg:X SP_REGNUM)
> + (match_operand 0 "stack_pop_up_to_s7_operand" "I")))
> + (set (reg:X S7_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot0_offset>))))
> + (set (reg:X S6_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot1_offset>))))
> + (set (reg:X S5_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot2_offset>))))
> + (set (reg:X S4_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot3_offset>))))
> + (set (reg:X S3_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot4_offset>))))
> + (set (reg:X S2_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot5_offset>))))
> + (set (reg:X S1_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot6_offset>))))
> + (set (reg:X S0_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot7_offset>))))
> + (set (reg:X RETURN_ADDR_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot8_offset>))))]
> + "TARGET_ZCMP"
> + "cm.pop {ra, s0-s7}, %0"
> +)
> +
> +(define_insn "@gpr_multi_pop_up_to_s8_<mode>"
> + [(set (reg:X SP_REGNUM)
> + (plus:X (reg:X SP_REGNUM)
> + (match_operand 0 "stack_pop_up_to_s8_operand" "I")))
> + (set (reg:X S8_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot0_offset>))))
> + (set (reg:X S7_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot1_offset>))))
> + (set (reg:X S6_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot2_offset>))))
> + (set (reg:X S5_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot3_offset>))))
> + (set (reg:X S4_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot4_offset>))))
> + (set (reg:X S3_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot5_offset>))))
> + (set (reg:X S2_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot6_offset>))))
> + (set (reg:X S1_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot7_offset>))))
> + (set (reg:X S0_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot8_offset>))))
> + (set (reg:X RETURN_ADDR_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot9_offset>))))]
> + "TARGET_ZCMP"
> + "cm.pop {ra, s0-s8}, %0"
> +)
> +
> +(define_insn "@gpr_multi_pop_up_to_s9_<mode>"
> + [(set (reg:X SP_REGNUM)
> + (plus:X (reg:X SP_REGNUM)
> + (match_operand 0 "stack_pop_up_to_s9_operand" "I")))
> + (set (reg:X S9_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot0_offset>))))
> + (set (reg:X S8_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot1_offset>))))
> + (set (reg:X S7_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot2_offset>))))
> + (set (reg:X S6_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot3_offset>))))
> + (set (reg:X S5_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot4_offset>))))
> + (set (reg:X S4_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot5_offset>))))
> + (set (reg:X S3_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot6_offset>))))
> + (set (reg:X S2_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot7_offset>))))
> + (set (reg:X S1_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot8_offset>))))
> + (set (reg:X S0_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot9_offset>))))
> + (set (reg:X RETURN_ADDR_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot10_offset>))))]
> + "TARGET_ZCMP"
> + "cm.pop {ra, s0-s9}, %0"
> +)
> +
> +(define_insn "@gpr_multi_pop_up_to_s11_<mode>"
> + [(set (reg:X SP_REGNUM)
> + (plus:X (reg:X SP_REGNUM)
> + (match_operand 0 "stack_pop_up_to_s11_operand" "I")))
> + (set (reg:X S11_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot0_offset>))))
> + (set (reg:X S10_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot1_offset>))))
> + (set (reg:X S9_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot2_offset>))))
> + (set (reg:X S8_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot3_offset>))))
> + (set (reg:X S7_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot4_offset>))))
> + (set (reg:X S6_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot5_offset>))))
> + (set (reg:X S5_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot6_offset>))))
> + (set (reg:X S4_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot7_offset>))))
> + (set (reg:X S3_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot8_offset>))))
> + (set (reg:X S2_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot9_offset>))))
> + (set (reg:X S1_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot10_offset>))))
> + (set (reg:X S0_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot11_offset>))))
> + (set (reg:X RETURN_ADDR_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot12_offset>))))]
> + "TARGET_ZCMP"
> + "cm.pop {ra, s0-s11}, %0"
> +)
> +
> +(define_insn "@gpr_multi_popret_up_to_ra_<mode>"
> + [(set (reg:X SP_REGNUM)
> + (plus:X (reg:X SP_REGNUM)
> + (match_operand 0 "stack_pop_up_to_ra_operand" "I")))
> + (set (reg:X RETURN_ADDR_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot0_offset>))))
> + (return)
> + (use (reg:SI RETURN_ADDR_REGNUM))]
> + "TARGET_ZCMP"
> + "cm.popret {ra}, %0"
> +)
> +
> +(define_insn "@gpr_multi_popret_up_to_s0_<mode>"
> + [(set (reg:X SP_REGNUM)
> + (plus:X (reg:X SP_REGNUM)
> + (match_operand 0 "stack_pop_up_to_s0_operand" "I")))
> + (set (reg:X S0_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot0_offset>))))
> + (set (reg:X RETURN_ADDR_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot1_offset>))))
> + (return)
> + (use (reg:SI RETURN_ADDR_REGNUM))]
> + "TARGET_ZCMP"
> + "cm.popret {ra, s0}, %0"
> +)
> +
> +(define_insn "@gpr_multi_popret_up_to_s1_<mode>"
> + [(set (reg:X SP_REGNUM)
> + (plus:X (reg:X SP_REGNUM)
> + (match_operand 0 "stack_pop_up_to_s1_operand" "I")))
> + (set (reg:X S1_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot0_offset>))))
> + (set (reg:X S0_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot1_offset>))))
> + (set (reg:X RETURN_ADDR_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot2_offset>))))
> + (return)
> + (use (reg:SI RETURN_ADDR_REGNUM))]
> + "TARGET_ZCMP"
> + "cm.popret {ra, s0-s1}, %0"
> +)
> +
> +(define_insn "@gpr_multi_popret_up_to_s2_<mode>"
> + [(set (reg:X SP_REGNUM)
> + (plus:X (reg:X SP_REGNUM)
> + (match_operand 0 "stack_pop_up_to_s2_operand" "I")))
> + (set (reg:X S2_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot0_offset>))))
> + (set (reg:X S1_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot1_offset>))))
> + (set (reg:X S0_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot2_offset>))))
> + (set (reg:X RETURN_ADDR_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot3_offset>))))
> + (return)
> + (use (reg:SI RETURN_ADDR_REGNUM))]
> + "TARGET_ZCMP"
> + "cm.popret {ra, s0-s2}, %0"
> +)
> +
> +(define_insn "@gpr_multi_popret_up_to_s3_<mode>"
> + [(set (reg:X SP_REGNUM)
> + (plus:X (reg:X SP_REGNUM)
> + (match_operand 0 "stack_pop_up_to_s3_operand" "I")))
> + (set (reg:X S3_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot0_offset>))))
> + (set (reg:X S2_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot1_offset>))))
> + (set (reg:X S1_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot2_offset>))))
> + (set (reg:X S0_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot3_offset>))))
> + (set (reg:X RETURN_ADDR_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot4_offset>))))
> + (return)
> + (use (reg:SI RETURN_ADDR_REGNUM))]
> + "TARGET_ZCMP"
> + "cm.popret {ra, s0-s3}, %0"
> +)
> +
> +(define_insn "@gpr_multi_popret_up_to_s4_<mode>"
> + [(set (reg:X SP_REGNUM)
> + (plus:X (reg:X SP_REGNUM)
> + (match_operand 0 "stack_pop_up_to_s4_operand" "I")))
> + (set (reg:X S4_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot0_offset>))))
> + (set (reg:X S3_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot1_offset>))))
> + (set (reg:X S2_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot2_offset>))))
> + (set (reg:X S1_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot3_offset>))))
> + (set (reg:X S0_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot4_offset>))))
> + (set (reg:X RETURN_ADDR_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot5_offset>))))
> + (return)
> + (use (reg:SI RETURN_ADDR_REGNUM))]
> + "TARGET_ZCMP"
> + "cm.popret {ra, s0-s4}, %0"
> +)
> +
> +(define_insn "@gpr_multi_popret_up_to_s5_<mode>"
> + [(set (reg:X SP_REGNUM)
> + (plus:X (reg:X SP_REGNUM)
> + (match_operand 0 "stack_pop_up_to_s5_operand" "I")))
> + (set (reg:X S5_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot0_offset>))))
> + (set (reg:X S4_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot1_offset>))))
> + (set (reg:X S3_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot2_offset>))))
> + (set (reg:X S2_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot3_offset>))))
> + (set (reg:X S1_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot4_offset>))))
> + (set (reg:X S0_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot5_offset>))))
> + (set (reg:X RETURN_ADDR_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot6_offset>))))
> + (return)
> + (use (reg:SI RETURN_ADDR_REGNUM))]
> + "TARGET_ZCMP"
> + "cm.popret {ra, s0-s5}, %0"
> +)
> +
> +(define_insn "@gpr_multi_popret_up_to_s6_<mode>"
> + [(set (reg:X SP_REGNUM)
> + (plus:X (reg:X SP_REGNUM)
> + (match_operand 0 "stack_pop_up_to_s6_operand" "I")))
> + (set (reg:X S6_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot0_offset>))))
> + (set (reg:X S5_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot1_offset>))))
> + (set (reg:X S4_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot2_offset>))))
> + (set (reg:X S3_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot3_offset>))))
> + (set (reg:X S2_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot4_offset>))))
> + (set (reg:X S1_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot5_offset>))))
> + (set (reg:X S0_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot6_offset>))))
> + (set (reg:X RETURN_ADDR_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot7_offset>))))
> + (return)
> + (use (reg:SI RETURN_ADDR_REGNUM))]
> + "TARGET_ZCMP"
> + "cm.popret {ra, s0-s6}, %0"
> +)
> +
> +(define_insn "@gpr_multi_popret_up_to_s7_<mode>"
> + [(set (reg:X SP_REGNUM)
> + (plus:X (reg:X SP_REGNUM)
> + (match_operand 0 "stack_pop_up_to_s7_operand" "I")))
> + (set (reg:X S7_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot0_offset>))))
> + (set (reg:X S6_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot1_offset>))))
> + (set (reg:X S5_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot2_offset>))))
> + (set (reg:X S4_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot3_offset>))))
> + (set (reg:X S3_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot4_offset>))))
> + (set (reg:X S2_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot5_offset>))))
> + (set (reg:X S1_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot6_offset>))))
> + (set (reg:X S0_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot7_offset>))))
> + (set (reg:X RETURN_ADDR_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot8_offset>))))
> + (return)
> + (use (reg:SI RETURN_ADDR_REGNUM))]
> + "TARGET_ZCMP"
> + "cm.popret {ra, s0-s7}, %0"
> +)
> +
> +(define_insn "@gpr_multi_popret_up_to_s8_<mode>"
> + [(set (reg:X SP_REGNUM)
> + (plus:X (reg:X SP_REGNUM)
> + (match_operand 0 "stack_pop_up_to_s8_operand" "I")))
> + (set (reg:X S8_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot0_offset>))))
> + (set (reg:X S7_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot1_offset>))))
> + (set (reg:X S6_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot2_offset>))))
> + (set (reg:X S5_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot3_offset>))))
> + (set (reg:X S4_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot4_offset>))))
> + (set (reg:X S3_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot5_offset>))))
> + (set (reg:X S2_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot6_offset>))))
> + (set (reg:X S1_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot7_offset>))))
> + (set (reg:X S0_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot8_offset>))))
> + (set (reg:X RETURN_ADDR_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot9_offset>))))
> + (return)
> + (use (reg:SI RETURN_ADDR_REGNUM))]
> + "TARGET_ZCMP"
> + "cm.popret {ra, s0-s8}, %0"
> +)
> +
> +(define_insn "@gpr_multi_popret_up_to_s9_<mode>"
> + [(set (reg:X SP_REGNUM)
> + (plus:X (reg:X SP_REGNUM)
> + (match_operand 0 "stack_pop_up_to_s9_operand" "I")))
> + (set (reg:X S9_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot0_offset>))))
> + (set (reg:X S8_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot1_offset>))))
> + (set (reg:X S7_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot2_offset>))))
> + (set (reg:X S6_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot3_offset>))))
> + (set (reg:X S5_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot4_offset>))))
> + (set (reg:X S4_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot5_offset>))))
> + (set (reg:X S3_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot6_offset>))))
> + (set (reg:X S2_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot7_offset>))))
> + (set (reg:X S1_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot8_offset>))))
> + (set (reg:X S0_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot9_offset>))))
> + (set (reg:X RETURN_ADDR_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot10_offset>))))
> + (return)
> + (use (reg:SI RETURN_ADDR_REGNUM))]
> + "TARGET_ZCMP"
> + "cm.popret {ra, s0-s9}, %0"
> +)
> +
> +(define_insn "@gpr_multi_popret_up_to_s11_<mode>"
> + [(set (reg:X SP_REGNUM)
> + (plus:X (reg:X SP_REGNUM)
> + (match_operand 0 "stack_pop_up_to_s11_operand" "I")))
> + (set (reg:X S11_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot0_offset>))))
> + (set (reg:X S10_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot1_offset>))))
> + (set (reg:X S9_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot2_offset>))))
> + (set (reg:X S8_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot3_offset>))))
> + (set (reg:X S7_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot4_offset>))))
> + (set (reg:X S6_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot5_offset>))))
> + (set (reg:X S5_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot6_offset>))))
> + (set (reg:X S4_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot7_offset>))))
> + (set (reg:X S3_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot8_offset>))))
> + (set (reg:X S2_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot9_offset>))))
> + (set (reg:X S1_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot10_offset>))))
> + (set (reg:X S0_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot11_offset>))))
> + (set (reg:X RETURN_ADDR_REGNUM)
> + (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot12_offset>))))
> + (return)
> + (use (reg:SI RETURN_ADDR_REGNUM))]
> + "TARGET_ZCMP"
> + "cm.popret {ra, s0-s11}, %0"
> +)
> +
> +(define_insn "@gpr_multi_push_up_to_ra_<mode>"
> + [(set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot0_offset>)))
> + (reg:X RETURN_ADDR_REGNUM))
> + (set (reg:X SP_REGNUM)
> + (plus:X (reg:X SP_REGNUM)
> + (match_operand 0 "stack_push_up_to_ra_operand" "I")))]
> + "TARGET_ZCMP"
> + "cm.push {ra}, %0"
> +)
> +
> +(define_insn "@gpr_multi_push_up_to_s0_<mode>"
> + [(set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot0_offset>)))
> + (reg:X S0_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot1_offset>)))
> + (reg:X RETURN_ADDR_REGNUM))
> + (set (reg:X SP_REGNUM)
> + (plus:X (reg:X SP_REGNUM)
> + (match_operand 0 "stack_push_up_to_s0_operand" "I")))]
> + "TARGET_ZCMP"
> + "cm.push {ra, s0}, %0"
> +)
> +
> +(define_insn "@gpr_multi_push_up_to_s1_<mode>"
> + [(set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot0_offset>)))
> + (reg:X S1_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot1_offset>)))
> + (reg:X S0_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot2_offset>)))
> + (reg:X RETURN_ADDR_REGNUM))
> + (set (reg:X SP_REGNUM)
> + (plus:X (reg:X SP_REGNUM)
> + (match_operand 0 "stack_push_up_to_s1_operand" "I")))]
> + "TARGET_ZCMP"
> + "cm.push {ra, s0-s1}, %0"
> +)
> +
> +(define_insn "@gpr_multi_push_up_to_s2_<mode>"
> + [(set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot0_offset>)))
> + (reg:X S2_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot1_offset>)))
> + (reg:X S1_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot2_offset>)))
> + (reg:X S0_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot3_offset>)))
> + (reg:X RETURN_ADDR_REGNUM))
> + (set (reg:X SP_REGNUM)
> + (plus:X (reg:X SP_REGNUM)
> + (match_operand 0 "stack_push_up_to_s2_operand" "I")))]
> + "TARGET_ZCMP"
> + "cm.push {ra, s0-s2}, %0"
> +)
> +
> +(define_insn "@gpr_multi_push_up_to_s3_<mode>"
> + [(set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot0_offset>)))
> + (reg:X S3_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot1_offset>)))
> + (reg:X S2_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot2_offset>)))
> + (reg:X S1_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot3_offset>)))
> + (reg:X S0_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot4_offset>)))
> + (reg:X RETURN_ADDR_REGNUM))
> + (set (reg:X SP_REGNUM)
> + (plus:X (reg:X SP_REGNUM)
> + (match_operand 0 "stack_push_up_to_s3_operand" "I")))]
> + "TARGET_ZCMP"
> + "cm.push {ra, s0-s3}, %0"
> +)
> +
> +(define_insn "@gpr_multi_push_up_to_s4_<mode>"
> + [(set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot0_offset>)))
> + (reg:X S4_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot1_offset>)))
> + (reg:X S3_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot2_offset>)))
> + (reg:X S2_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot3_offset>)))
> + (reg:X S1_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot4_offset>)))
> + (reg:X S0_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot5_offset>)))
> + (reg:X RETURN_ADDR_REGNUM))
> + (set (reg:X SP_REGNUM)
> + (plus:X (reg:X SP_REGNUM)
> + (match_operand 0 "stack_push_up_to_s4_operand" "I")))]
> + "TARGET_ZCMP"
> + "cm.push {ra, s0-s4}, %0"
> +)
> +
> +(define_insn "@gpr_multi_push_up_to_s5_<mode>"
> + [(set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot0_offset>)))
> + (reg:X S5_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot1_offset>)))
> + (reg:X S4_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot2_offset>)))
> + (reg:X S3_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot3_offset>)))
> + (reg:X S2_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot4_offset>)))
> + (reg:X S1_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot5_offset>)))
> + (reg:X S0_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot6_offset>)))
> + (reg:X RETURN_ADDR_REGNUM))
> + (set (reg:X SP_REGNUM)
> + (plus:X (reg:X SP_REGNUM)
> + (match_operand 0 "stack_push_up_to_s5_operand" "I")))]
> + "TARGET_ZCMP"
> + "cm.push {ra, s0-s5}, %0"
> +)
> +
> +(define_insn "@gpr_multi_push_up_to_s6_<mode>"
> + [(set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot0_offset>)))
> + (reg:X S6_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot1_offset>)))
> + (reg:X S5_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot2_offset>)))
> + (reg:X S4_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot3_offset>)))
> + (reg:X S3_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot4_offset>)))
> + (reg:X S2_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot5_offset>)))
> + (reg:X S1_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot6_offset>)))
> + (reg:X S0_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot7_offset>)))
> + (reg:X RETURN_ADDR_REGNUM))
> + (set (reg:X SP_REGNUM)
> + (plus:X (reg:X SP_REGNUM)
> + (match_operand 0 "stack_push_up_to_s6_operand" "I")))]
> + "TARGET_ZCMP"
> + "cm.push {ra, s0-s6}, %0"
> +)
> +
> +(define_insn "@gpr_multi_push_up_to_s7_<mode>"
> + [(set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot0_offset>)))
> + (reg:X S7_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot1_offset>)))
> + (reg:X S6_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot2_offset>)))
> + (reg:X S5_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot3_offset>)))
> + (reg:X S4_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot4_offset>)))
> + (reg:X S3_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot5_offset>)))
> + (reg:X S2_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot6_offset>)))
> + (reg:X S1_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot7_offset>)))
> + (reg:X S0_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot8_offset>)))
> + (reg:X RETURN_ADDR_REGNUM))
> + (set (reg:X SP_REGNUM)
> + (plus:X (reg:X SP_REGNUM)
> + (match_operand 0 "stack_push_up_to_s7_operand" "I")))]
> + "TARGET_ZCMP"
> + "cm.push {ra, s0-s7}, %0"
> +)
> +
> +(define_insn "@gpr_multi_push_up_to_s8_<mode>"
> + [(set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot0_offset>)))
> + (reg:X S8_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot1_offset>)))
> + (reg:X S7_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot2_offset>)))
> + (reg:X S6_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot3_offset>)))
> + (reg:X S5_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot4_offset>)))
> + (reg:X S4_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot5_offset>)))
> + (reg:X S3_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot6_offset>)))
> + (reg:X S2_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot7_offset>)))
> + (reg:X S1_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot8_offset>)))
> + (reg:X S0_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot9_offset>)))
> + (reg:X RETURN_ADDR_REGNUM))
> + (set (reg:X SP_REGNUM)
> + (plus:X (reg:X SP_REGNUM)
> + (match_operand 0 "stack_push_up_to_s8_operand" "I")))]
> + "TARGET_ZCMP"
> + "cm.push {ra, s0-s8}, %0"
> +)
> +
> +(define_insn "@gpr_multi_push_up_to_s9_<mode>"
> + [(set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot0_offset>)))
> + (reg:X S9_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot1_offset>)))
> + (reg:X S8_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot2_offset>)))
> + (reg:X S7_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot3_offset>)))
> + (reg:X S6_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot4_offset>)))
> + (reg:X S5_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot5_offset>)))
> + (reg:X S4_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot6_offset>)))
> + (reg:X S3_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot7_offset>)))
> + (reg:X S2_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot8_offset>)))
> + (reg:X S1_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot9_offset>)))
> + (reg:X S0_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot10_offset>)))
> + (reg:X RETURN_ADDR_REGNUM))
> + (set (reg:X SP_REGNUM)
> + (plus:X (reg:X SP_REGNUM)
> + (match_operand 0 "stack_push_up_to_s9_operand" "I")))]
> + "TARGET_ZCMP"
> + "cm.push {ra, s0-s9}, %0"
> +)
> +
> +(define_insn "@gpr_multi_push_up_to_s11_<mode>"
> + [(set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot0_offset>)))
> + (reg:X S11_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot1_offset>)))
> + (reg:X S10_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot2_offset>)))
> + (reg:X S9_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot3_offset>)))
> + (reg:X S8_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot4_offset>)))
> + (reg:X S7_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot5_offset>)))
> + (reg:X S6_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot6_offset>)))
> + (reg:X S5_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot7_offset>)))
> + (reg:X S4_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot8_offset>)))
> + (reg:X S3_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot9_offset>)))
> + (reg:X S2_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot10_offset>)))
> + (reg:X S1_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot11_offset>)))
> + (reg:X S0_REGNUM))
> + (set (mem:X (plus:X (reg:X SP_REGNUM)
> + (const_int <slot12_offset>)))
> + (reg:X RETURN_ADDR_REGNUM))
> + (set (reg:X SP_REGNUM)
> + (plus:X (reg:X SP_REGNUM)
> + (match_operand 0 "stack_push_up_to_s11_operand" "I")))]
> + "TARGET_ZCMP"
> + "cm.push {ra, s0-s11}, %0"
> +)
> diff --git a/gcc/testsuite/gcc.target/riscv/rv32e_zcmp.c b/gcc/testsuite/gcc.target/riscv/rv32e_zcmp.c
> new file mode 100644
> index 00000000000..6dbe489da9b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rv32e_zcmp.c
> @@ -0,0 +1,239 @@
> +/* { dg-do compile } */
> +/* { dg-options " -Os -march=rv32e_zca_zcmp -mabi=ilp32e -mcmodel=medlow" } */
> +/* { dg-skip-if "" { *-*-* } {"-O0" "-O1" "-O2" "-Og" "-O3" "-Oz" "-flto"} } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> +
> +char my_getchar();
> +float getf();
> +int __attribute__((noinline)) incoming_stack_args
> + (int arg0, int arg1, int arg2, int arg3,
> + int arg4, int arg5, int arg6, int arg7, int arg8);
> +int getint();
> +void PrintInts (int n, ...); // varargs
> +void __attribute__((noinline)) PrintIntsNoVaStart (int n, ...); // varargs
> +void PrintInts2 (int arg0, int arg1, int arg2, int arg3, int arg4, int arg5, int n, ...);
> +extern void f1(void);
> +extern void f2(void);
> +
> +/*
> +**test1:
> +** ...
> +** cm.push {ra, s0-s1}, -64
> +** ...
> +** cm.popret {ra, s0-s1}, 64
> +** ...
> +*/
> +int test1()
> +{
> + char volatile array[3120];
> + float volatile farray[3120];
> +
> + float sum = 0;
> + for (int i = 0; i < 3120; i++)
> + {
> + array[i] = my_getchar();
> + farray[i] = my_getchar() * 1.2;
> + sum += array[i] + farray[i];
> + }
> + return sum;
> +}
> +
> +/*
> +**test2_step1_0_size:
> +** ...
> +** cm.push {ra, s0}, -64
> +** ...
> +** cm.popret {ra, s0}, 64
> +** ...
> +*/
> +int test2_step1_0_size()
> +{
> + int volatile iarray[3120 + 1824/4 -8];
> +
> + for (int i = 0; i < 3120 + 1824/4 - 8; i++)
> + {
> + iarray[i] = my_getchar() * 2;
> + }
> + return iarray[0] + iarray[1];
> +}
> +
> +/*
> +**test3:
> +** ...
> +** cm.push {ra, s0-s1}, -64
> +** ...
> +** cm.popret {ra, s0-s1}, 64
> +** ...
> +*/
> +float test3()
> +{
> + char volatile array[3120];
> + float volatile farray[3120];
> +
> + float sum = 0, f1 = 0, f2 = 0, f3 = 0, f4 = 0, f5 = 0, f6 = 0, f7 = 0;
> +
> + for (int i = 0; i < 3120; i++)
> + {
> + f1 = getf();
> + f2 = getf();
> + f3 = getf();
> + f4 = getf();
> + array[i] = my_getchar();
> + farray[i] = my_getchar() * 1.2;
> + sum += array[i] + farray[i] + f1 + f2 + f3 + f4;
> + }
> + return sum;
> +}
> +
> +/*
> +**outgoing_stack_args:
> +** ...
> +** cm.push {ra, s0}, -32
> +** ...
> +** cm.popret {ra, s0}, 32
> +** ...
> +*/
> +int outgoing_stack_args()
> +{
> + int local = getint();
> + return local +incoming_stack_args(0, 1, 2, 3, 4, 5, 6, 7, 8);
> +}
> +
> +/*
> +**callPrintInts:
> +** ...
> +** cm.push {ra}, -32
> +** ...
> +** cm.popret {ra}, 32
> +** ...
> +*/
> +float callPrintInts()
> +{
> + volatile float f = getf(); // f in local
> + PrintInts(9,1,2,3,4,5,6,7,8,9);
> + return f;
> +}
> +
> +/*
> +**callPrint:
> +** ...
> +** cm.push {ra}, -32
> +** ...
> +** cm.popret {ra}, 32
> +** ...
> +*/
> +float callPrint()
> +{
> + volatile float f = getf(); // f in local
> + PrintIntsNoVaStart(0,1,2,3,4,5,6,7,8,9);
> + return f;
> +}
> +
> +/*
> +**callPrint_S:
> +** ...
> +** cm.push {ra, s0}, -32
> +** ...
> +** cm.popret {ra, s0}, 32
> +** ...
> +*/
> +float callPrint_S()
> +{
> + float f = getf();
> + PrintIntsNoVaStart(0,1,2,3,4,5,6,7,8,9);
> + return f;
> +}
> +
> +/*
> +**callPrint_2:
> +** ...
> +** cm.push {ra, s0}, -32
> +** ...
> +** cm.popret {ra, s0}, 32
> +** ...
> +*/
> +float callPrint_2()
> +{
> + float f = getf();
> + PrintInts2(0,1,2,3,4,5,6,7,8,9);
> + return f;
> +}
> +
> +/*
> +**test_step1_0bytes_save_restore:
> +** ...
> +** cm.push {ra}, -16
> +** ...
> +** cm.popret {ra}, 16
> +** ...
> +*/
> +int test_step1_0bytes_save_restore()
> +{
> +
> + int a = 9;
> + int b = my_getchar();
> + return a +b;
> +}
> +
> +/*
> +**test_s0:
> +** ...
> +** cm.push {ra, s0}, -16
> +** ...
> +** cm.popret {ra, s0}, 16
> +** ...
> +*/
> +int test_s0()
> +{
> +
> + int a = my_getchar();
> + int b = my_getchar();
> + return a +b;
> +}
> +
> +/*
> +**test_s1:
> +** ...
> +** cm.push {ra, s0-s1}, -16
> +** ...
> +** cm.popret {ra, s0-s1}, 16
> +** ...
> +*/
> +int test_s1()
> +{
> +
> + int s0 = my_getchar();
> + int s1 = my_getchar();
> + int b = my_getchar();
> + return s1 +s0 +b;
> +}
> +
> +/*
> +**test_f0:
> +** ...
> +** cm.push {ra, s0-s1}, -16
> +** ...
> +** cm.popret {ra, s0-s1}, 16
> +** ...
> +*/
> +int test_f0()
> +{
> +
> + int s0 = my_getchar();
> + float f0 = getf();
> + int b = my_getchar();
> + return f0 +s0 +b;
> +}
> +
> +/*
> +**foo:
> +** cm.push {ra}, -16
> +** call f1
> +** cm.pop {ra}, 16
> +** tail f2
> +*/
> +void foo(void)
> +{
> + f1();
> + f2();
> +}
> diff --git a/gcc/testsuite/gcc.target/riscv/rv32i_zcmp.c b/gcc/testsuite/gcc.target/riscv/rv32i_zcmp.c
> new file mode 100644
> index 00000000000..924197cb3c4
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rv32i_zcmp.c
> @@ -0,0 +1,239 @@
> +/* { dg-do compile } */
> +/* { dg-options " -Os -march=rv32imaf_zca_zcmp -mabi=ilp32f -mcmodel=medlow" } */
> +/* { dg-skip-if "" { *-*-* } {"-O0" "-O1" "-O2" "-Og" "-O3" "-Oz" "-flto"} } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> +
> +char my_getchar();
> +float getf();
> +int __attribute__((noinline)) incoming_stack_args
> + (int arg0, int arg1, int arg2, int arg3,
> + int arg4, int arg5, int arg6, int arg7, int arg8);
> +int getint();
> +void PrintInts (int n, ...); // varargs
> +void __attribute__((noinline)) PrintIntsNoVaStart (int n, ...); // varargs
> +void PrintInts2 (int arg0, int arg1, int arg2, int arg3, int arg4, int arg5, int n, ...);
> +extern void f1(void);
> +extern void f2(void);
> +
> +/*
> +**test1:
> +** ...
> +** cm.push {ra, s0-s4}, -80
> +** ...
> +** cm.popret {ra, s0-s4}, 80
> +** ...
> +*/
> +int test1()
> +{
> + char volatile array[3120];
> + float volatile farray[3120];
> +
> + float sum = 0;
> + for (int i = 0; i < 3120; i++)
> + {
> + array[i] = my_getchar();
> + farray[i] = my_getchar() * 1.2;
> + sum += array[i] + farray[i];
> + }
> + return sum;
> +}
> +
> +/*
> +**test2_step1_0_size:
> +** ...
> +** cm.push {ra, s0-s1}, -64
> +** ...
> +** cm.popret {ra, s0-s1}, 64
> +** ...
> +*/
> +int test2_step1_0_size()
> +{
> + int volatile iarray[3120 + 1824/4 -8];
> +
> + for (int i = 0; i < 3120 + 1824/4 - 8; i++)
> + {
> + iarray[i] = my_getchar() * 2;
> + }
> + return iarray[0] + iarray[1];
> +}
> +
> +/*
> +**test3:
> +** ...
> +** cm.push {ra, s0-s4}, -80
> +** ...
> +** cm.popret {ra, s0-s4}, 80
> +** ...
> +*/
> +float test3()
> +{
> + char volatile array[3120];
> + float volatile farray[3120];
> +
> + float sum = 0, f1 = 0, f2 = 0, f3 = 0, f4 = 0, f5 = 0, f6 = 0, f7 = 0;
> +
> + for (int i = 0; i < 3120; i++)
> + {
> + f1 = getf();
> + f2 = getf();
> + f3 = getf();
> + f4 = getf();
> + array[i] = my_getchar();
> + farray[i] = my_getchar() * 1.2;
> + sum += array[i] + farray[i] + f1 + f2 + f3 + f4;
> + }
> + return sum;
> +}
> +
> +/*
> +**outgoing_stack_args:
> +** ...
> +** cm.push {ra, s0}, -32
> +** ...
> +** cm.popret {ra, s0}, 32
> +** ...
> +*/
> +int outgoing_stack_args()
> +{
> + int local = getint();
> + return local +incoming_stack_args(0, 1, 2, 3, 4, 5, 6, 7, 8);
> +}
> +
> +/*
> +**callPrintInts:
> +** ...
> +** cm.push {ra}, -48
> +** ...
> +** cm.popret {ra}, 48
> +** ...
> +*/
> +float callPrintInts()
> +{
> + volatile float f = getf(); // f in local
> + PrintInts(9,1,2,3,4,5,6,7,8,9);
> + return f;
> +}
> +
> +/*
> +**callPrint:
> +** ...
> +** cm.push {ra}, -48
> +** ...
> +** cm.popret {ra}, 48
> +** ...
> +*/
> +float callPrint()
> +{
> + volatile float f = getf(); // f in local
> + PrintIntsNoVaStart(0,1,2,3,4,5,6,7,8,9);
> + return f;
> +}
> +
> +/*
> +**callPrint_S:
> +** ...
> +** cm.push {ra}, -48
> +** ...
> +** cm.popret {ra}, 48
> +** ...
> +*/
> +float callPrint_S()
> +{
> + float f = getf();
> + PrintIntsNoVaStart(0,1,2,3,4,5,6,7,8,9);
> + return f;
> +}
> +
> +/*
> +**callPrint_2:
> +** ...
> +** cm.push {ra}, -48
> +** ...
> +** cm.popret {ra}, 48
> +** ...
> +*/
> +float callPrint_2()
> +{
> + float f = getf();
> + PrintInts2(0,1,2,3,4,5,6,7,8,9);
> + return f;
> +}
> +
> +/*
> +**test_step1_0bytes_save_restore:
> +** ...
> +** cm.push {ra}, -16
> +** ...
> +** cm.popret {ra}, 16
> +** ...
> +*/
> +int test_step1_0bytes_save_restore()
> +{
> +
> + int a = 9;
> + int b = my_getchar();
> + return a +b;
> +}
> +
> +/*
> +**test_s0:
> +** ...
> +** cm.push {ra, s0}, -16
> +** ...
> +** cm.popret {ra, s0}, 16
> +** ...
> +*/
> +int test_s0()
> +{
> +
> + int a = my_getchar();
> + int b = my_getchar();
> + return a +b;
> +}
> +
> +/*
> +**test_s1:
> +** ...
> +** cm.push {ra, s0-s1}, -16
> +** ...
> +** cm.popret {ra, s0-s1}, 16
> +** ...
> +*/
> +int test_s1()
> +{
> +
> + int s0 = my_getchar();
> + int s1 = my_getchar();
> + int b = my_getchar();
> + return s1 +s0 +b;
> +}
> +
> +/*
> +**test_f0:
> +** ...
> +** cm.push {ra, s0}, -32
> +** ...
> +** cm.popret {ra, s0}, 32
> +** ...
> +*/
> +int test_f0()
> +{
> +
> + int s0 = my_getchar();
> + float f0 = getf();
> + int b = my_getchar();
> + return f0 +s0 +b;
> +}
> +
> +/*
> +**foo:
> +** cm.push {ra}, -16
> +** call f1
> +** cm.pop {ra}, 16
> +** tail f2
> +*/
> +void foo(void)
> +{
> + f1();
> + f2();
> +}
> diff --git a/gcc/testsuite/gcc.target/riscv/zcmp_stack_alignment.c b/gcc/testsuite/gcc.target/riscv/zcmp_stack_alignment.c
> new file mode 100644
> index 00000000000..05602302a8f
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/zcmp_stack_alignment.c
> @@ -0,0 +1,23 @@
> +/* { dg-do compile } */
> +/* { dg-options " -O0 -march=rv32e_zca_zcb_zcmp -mabi=ilp32e -mcmodel=medlow -fomit-frame-pointer" } */
> +/* { dg-skip-if "" { *-*-* } {"-O2" "-O1" "-Os" "-Og" "-O3" "-Oz" "-flto"} } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> +
> +void bar();
> +
> +/*
> +**fool_rv32e:
> +** cm.push {ra}, -32
> +** ...
> +** call bar
> +** ...
> +** lw a5,32\(sp\)
> +** ...
> +** cm.popret {ra}, 32
> +*/
> +int fool_rv32e ( int a0, int a1, int a2, int a3, int a4, int a5,
> + int incoming0)
> +{
> + bar();
> + return a0 + a1 + a2 + a3 + a4 + a5 + incoming0;
> +}
> --
> 2.17.1
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 3/4] [RISC-V] resolve confilct between zcmp multi push/pop and shrink-wrap-separate
2023-06-07 5:52 ` [PATCH 3/4] [RISC-V] resolve confilct between zcmp multi push/pop and shrink-wrap-separate Fei Gao
@ 2023-06-12 15:17 ` Kito Cheng
2023-06-12 19:26 ` Jeff Law
1 sibling, 0 replies; 17+ messages in thread
From: Kito Cheng @ 2023-06-12 15:17 UTC (permalink / raw)
To: Fei Gao; +Cc: gcc-patches, palmer, jeffreyalaw, sinan.lin, jiawei
I would suggest splitting this patch into two parts: the RISC-V part and
the rest (shrink-wrap.h / shrink-wrap.cc).
On Wed, Jun 7, 2023 at 1:55 PM Fei Gao <gaofei@eswincomputing.com> wrote:
>
> Disable zcmp multi push/pop if shrink-wrap-separate is active.
>
> So at -Os, which prefers smaller code size, shrink-wrap-separate is
> disabled by default while zcmp multi push/pop is enabled.
>
> At -O2 and the other levels that prefer speed, shrink-wrap-separate is
> enabled by default while zcmp multi push/pop is disabled. To force zcmp multi
> push/pop on in this case, -fno-shrink-wrap-separate has to be given explicitly.
>
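The interaction described above can be sketched as a small stand-alone C
model. This is not GCC code: the struct and the function name are
hypothetical, and only the boolean condition mirrors the check this patch
adds to riscv_avoid_multi_push.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the compiler state consulted by the patch.  */
struct model_state
{
  /* Models the result of use_shrink_wrapping_separate ().  */
  bool shrink_wrap_separate_enabled;
  /* Models the result of riscv_avoid_shrink_wrapping_separate ().  */
  bool target_avoids_separate;
};

/* Return true when multi push/pop should be avoided because separate
   shrink-wrapping is enabled and the target does not opt out of it.  */
static bool
avoid_multi_push_for_shrink_wrap (const struct model_state *s)
{
  return s->shrink_wrap_separate_enabled && !s->target_avoids_separate;
}
```

With shrink_wrap_separate_enabled set to false (the -Os default according
to the commit message) the function returns false, so multi push/pop stays
available; with it set to true and no target opt-out it returns true.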
> The following TC shows the issues at -O2 before this patch, with both
> shrink-wrap-separate and zcmp multi push/pop active.
> 1. duplicated stores of s regs.
> 2. cm.push pushes ra, s0-s11 in the reverse order from what the normal
> prologue uses, causing stack corruption and failure to restore s regs.
>
> TC: zcmp_shrink_wrap_separate.c included in this patch.
>
> output asm before this patch:
> calc_func:
> cm.push {ra, s0-s3}, -32
> ...
> beq a5,zero,.L2
> ...
> .L2:
> ...
> sw s1,20(sp) //issue here
> sw s3,12(sp) //issue here
> ...
> sw s2,16(sp) //issue here
>
> output asm after this patch:
> calc_func:
> addi sp,sp,-32
> sw s0,24(sp)
> ...
> beq a5,zero,.L2
> ...
> .L2:
> ...
> sw s1,20(sp)
> sw s3,12(sp)
> ...
> sw s2,16(sp)
> gcc/ChangeLog:
>
> * config/riscv/riscv.cc
> (riscv_avoid_shrink_wrapping_separate): wrap the condition check in
> riscv_avoid_shrink_wrapping_separate.
> (riscv_avoid_multi_push): avoid multi push if shrink_wrapping_separate
> is active.
> (riscv_get_separate_components): call riscv_avoid_shrink_wrapping_separate
> * shrink-wrap.cc (try_shrink_wrapping_separate): call
> use_shrink_wrapping_separate.
> (use_shrink_wrapping_separate): wrap the condition
> check in use_shrink_wrapping_separate.
> * shrink-wrap.h (use_shrink_wrapping_separate): add to extern.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/zcmp_shrink_wrap_separate.c: New test.
> * gcc.target/riscv/zcmp_shrink_wrap_separate2.c: New test.
>
> Signed-off-by: Fei Gao <gaofei@eswincomputing.com>
> Co-Authored-By: Zhangjin Liao <liaozhangjin@eswincomputing.com>
> ---
> gcc/config/riscv/riscv.cc | 19 +++-
> gcc/shrink-wrap.cc | 25 +++--
> gcc/shrink-wrap.h | 1 +
> .../riscv/zcmp_shrink_wrap_separate.c | 97 +++++++++++++++++++
> .../riscv/zcmp_shrink_wrap_separate2.c | 97 +++++++++++++++++++
> 5 files changed, 228 insertions(+), 11 deletions(-)
> create mode 100644 gcc/testsuite/gcc.target/riscv/zcmp_shrink_wrap_separate.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/zcmp_shrink_wrap_separate2.c
>
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> index f60c241a526..b505cdeca34 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -64,6 +64,7 @@ along with GCC; see the file COPYING3. If not see
> #include "cfghooks.h"
> #include "cfgloop.h"
> #include "cfgrtl.h"
> +#include "shrink-wrap.h"
> #include "sel-sched.h"
> #include "fold-const.h"
> #include "gimple-iterator.h"
> @@ -389,6 +390,7 @@ static const struct riscv_tune_param optimize_size_tune_info = {
> false, /* use_divmod_expansion */
> };
>
> +static bool riscv_avoid_shrink_wrapping_separate ();
> static tree riscv_handle_fndecl_attribute (tree *, tree, tree, int, bool *);
> static tree riscv_handle_type_attribute (tree *, tree, tree, int, bool *);
>
> @@ -4910,6 +4912,8 @@ riscv_avoid_multi_push(const struct riscv_frame_info *frame)
> || cfun->machine->interrupt_handler_p
> || cfun->machine->varargs_size != 0
> || crtl->args.pretend_args_size != 0
> + || (use_shrink_wrapping_separate ()
> + && !riscv_avoid_shrink_wrapping_separate ())
> || (frame->mask & ~ MULTI_PUSH_GPR_MASK))
> return true;
>
> @@ -6077,6 +6081,17 @@ riscv_epilogue_uses (unsigned int regno)
> return false;
> }
>
> +static bool
> +riscv_avoid_shrink_wrapping_separate ()
> +{
> + if (riscv_use_save_libcall (&cfun->machine->frame)
> + || cfun->machine->interrupt_handler_p
> + || !cfun->machine->frame.gp_sp_offset.is_constant ())
> + return true;
> +
> + return false;
> +}
> +
> /* Implement TARGET_SHRINK_WRAP_GET_SEPARATE_COMPONENTS. */
>
> static sbitmap
> @@ -6086,9 +6101,7 @@ riscv_get_separate_components (void)
> sbitmap components = sbitmap_alloc (FIRST_PSEUDO_REGISTER);
> bitmap_clear (components);
>
> - if (riscv_use_save_libcall (&cfun->machine->frame)
> - || cfun->machine->interrupt_handler_p
> - || !cfun->machine->frame.gp_sp_offset.is_constant ())
> + if (riscv_avoid_shrink_wrapping_separate ())
> return components;
>
> offset = cfun->machine->frame.gp_sp_offset.to_constant ();
> diff --git a/gcc/shrink-wrap.cc b/gcc/shrink-wrap.cc
> index b8d7b557130..d534964321a 100644
> --- a/gcc/shrink-wrap.cc
> +++ b/gcc/shrink-wrap.cc
> @@ -1776,16 +1776,14 @@ insert_prologue_epilogue_for_components (sbitmap components)
> commit_edge_insertions ();
> }
>
> -/* The main entry point to this subpass. FIRST_BB is where the prologue
> - would be normally put. */
> -void
> -try_shrink_wrapping_separate (basic_block first_bb)
> +bool
> +use_shrink_wrapping_separate (void)
> {
> if (!(SHRINK_WRAPPING_ENABLED
> - && flag_shrink_wrap_separate
> - && optimize_function_for_speed_p (cfun)
> - && targetm.shrink_wrap.get_separate_components))
> - return;
> + && flag_shrink_wrap_separate
> + && optimize_function_for_speed_p (cfun)
> + && targetm.shrink_wrap.get_separate_components))
> + return false;
>
> /* We don't handle "strange" functions. */
> if (cfun->calls_alloca
> @@ -1794,6 +1792,17 @@ try_shrink_wrapping_separate (basic_block first_bb)
> || crtl->calls_eh_return
> || crtl->has_nonlocal_goto
> || crtl->saves_all_registers)
> + return false;
> +
> + return true;
> +}
> +
> +/* The main entry point to this subpass. FIRST_BB is where the prologue
> + would be normally put. */
> +void
> +try_shrink_wrapping_separate (basic_block first_bb)
> +{
> + if (!use_shrink_wrapping_separate ())
> return;
>
> /* Ask the target what components there are. If it returns NULL, don't
> diff --git a/gcc/shrink-wrap.h b/gcc/shrink-wrap.h
> index 161647711a3..82386c2b712 100644
> --- a/gcc/shrink-wrap.h
> +++ b/gcc/shrink-wrap.h
> @@ -26,6 +26,7 @@ along with GCC; see the file COPYING3. If not see
> extern bool requires_stack_frame_p (rtx_insn *, HARD_REG_SET, HARD_REG_SET);
> extern void try_shrink_wrapping (edge *entry_edge, rtx_insn *prologue_seq);
> extern void try_shrink_wrapping_separate (basic_block first_bb);
> +extern bool use_shrink_wrapping_separate (void);
> #define SHRINK_WRAPPING_ENABLED \
> (flag_shrink_wrap && targetm.have_simple_return ())
>
> diff --git a/gcc/testsuite/gcc.target/riscv/zcmp_shrink_wrap_separate.c b/gcc/testsuite/gcc.target/riscv/zcmp_shrink_wrap_separate.c
> new file mode 100644
> index 00000000000..11f87aee607
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/zcmp_shrink_wrap_separate.c
> @@ -0,0 +1,97 @@
> +/* { dg-do compile } */
> +/* { dg-options " -O2 -march=rv32imaf_zca_zcmp -mabi=ilp32f" } */
> +/* { dg-skip-if "" { *-*-* } {"-O0" "-O1" "-Os" "-Og" "-O3" "-Oz" "-flto"} } */
> +
> +typedef struct MAT_PARAMS_S
> +{
> + int N;
> + signed short *A;
> + signed short *B;
> + signed int *C;
> +} mat_params;
> +
> +typedef struct CORE_PORTABLE_S
> +{
> + unsigned char portable_id;
> +} core_portable;
> +
> +typedef struct RESULTS_S
> +{
> + /* inputs */
> + signed short seed1; /* Initializing seed */
> + signed short seed2; /* Initializing seed */
> + signed short seed3; /* Initializing seed */
> + void * memblock[4]; /* Pointer to safe memory location */
> + unsigned int size; /* Size of the data */
> + unsigned int iterations; /* Number of iterations to execute */
> + unsigned int execs; /* Bitmask of operations to execute */
> + struct list_head_s *list;
> + mat_params mat;
> + /* outputs */
> + unsigned short crc;
> + unsigned short crclist;
> + unsigned short crcmatrix;
> + unsigned short crcstate;
> + signed short err;
> + /* ultithread specific */
> + core_portable port;
> +} core_results;
> +
> +extern signed short
> +core_bench_state(unsigned int, void *, signed short, signed short, signed short, unsigned short);
> +
> +extern signed short
> +core_bench_matrix(mat_params *, signed short, unsigned short);
> +
> +extern unsigned short
> +crcu16(signed short, unsigned short);
> +
> +signed short
> +calc_func(signed short *pdata, core_results *res)
> +{
> + signed short data = *pdata;
> + signed short retval;
> + unsigned char optype
> + = (data >> 7)
> + & 1; /* bit 7 indicates if the function result has been cached */
> + if (optype) /* if cached, use cache */
> + return (data & 0x007f);
> + else
> + { /* otherwise calculate and cache the result */
> + signed short flag = data & 0x7; /* bits 0-2 is type of function to perform */
> + signed short dtype
> + = ((data >> 3)
> + & 0xf); /* bits 3-6 is specific data for the operation */
> + dtype |= dtype << 4; /* replicate the lower 4 bits to get an 8b value */
> + switch (flag)
> + {
> + case 0:
> + if (dtype < 0x22) /* set min period for bit corruption */
> + dtype = 0x22;
> + retval = core_bench_state(res->size,
> + res->memblock[3],
> + res->seed1,
> + res->seed2,
> + dtype,
> + res->crc);
> + if (res->crcstate == 0)
> + res->crcstate = retval;
> + break;
> + case 1:
> + retval = core_bench_matrix(&(res->mat), dtype, res->crc);
> + if (res->crcmatrix == 0)
> + res->crcmatrix = retval;
> + break;
> + default:
> + retval = data;
> + break;
> + }
> + res->crc = crcu16(retval, res->crc);
> + retval &= 0x007f;
> + *pdata = (data & 0xff00) | 0x0080 | retval; /* cache the result */
> + return retval;
> + }
> +}
> +
> +/* { dg-final { scan-assembler-not "cm\.push" } } */
> +
> diff --git a/gcc/testsuite/gcc.target/riscv/zcmp_shrink_wrap_separate2.c b/gcc/testsuite/gcc.target/riscv/zcmp_shrink_wrap_separate2.c
> new file mode 100644
> index 00000000000..ec7e9c39b5d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/zcmp_shrink_wrap_separate2.c
> @@ -0,0 +1,97 @@
> +/* { dg-do compile } */
> +/* { dg-options " -O2 -fno-shrink-wrap-separate -march=rv32imaf_zca_zcmp -mabi=ilp32f" } */
> +/* { dg-skip-if "" { *-*-* } {"-O0" "-O1" "-Os" "-Og" "-O3" "-Oz" "-flto"} } */
> +
> +typedef struct MAT_PARAMS_S
> +{
> + int N;
> + signed short *A;
> + signed short *B;
> + signed int *C;
> +} mat_params;
> +
> +typedef struct CORE_PORTABLE_S
> +{
> + unsigned char portable_id;
> +} core_portable;
> +
> +typedef struct RESULTS_S
> +{
> + /* inputs */
> + signed short seed1; /* Initializing seed */
> + signed short seed2; /* Initializing seed */
> + signed short seed3; /* Initializing seed */
> + void * memblock[4]; /* Pointer to safe memory location */
> + unsigned int size; /* Size of the data */
> + unsigned int iterations; /* Number of iterations to execute */
> + unsigned int execs; /* Bitmask of operations to execute */
> + struct list_head_s *list;
> + mat_params mat;
> + /* outputs */
> + unsigned short crc;
> + unsigned short crclist;
> + unsigned short crcmatrix;
> + unsigned short crcstate;
> + signed short err;
> + /* ultithread specific */
> + core_portable port;
> +} core_results;
> +
> +extern signed short
> +core_bench_state(unsigned int, void *, signed short, signed short, signed short, unsigned short);
> +
> +extern signed short
> +core_bench_matrix(mat_params *, signed short, unsigned short);
> +
> +extern unsigned short
> +crcu16(signed short, unsigned short);
> +
> +signed short
> +calc_func(signed short *pdata, core_results *res)
> +{
> + signed short data = *pdata;
> + signed short retval;
> + unsigned char optype
> + = (data >> 7)
> + & 1; /* bit 7 indicates if the function result has been cached */
> + if (optype) /* if cached, use cache */
> + return (data & 0x007f);
> + else
> + { /* otherwise calculate and cache the result */
> + signed short flag = data & 0x7; /* bits 0-2 is type of function to perform */
> + signed short dtype
> + = ((data >> 3)
> + & 0xf); /* bits 3-6 is specific data for the operation */
> + dtype |= dtype << 4; /* replicate the lower 4 bits to get an 8b value */
> + switch (flag)
> + {
> + case 0:
> + if (dtype < 0x22) /* set min period for bit corruption */
> + dtype = 0x22;
> + retval = core_bench_state(res->size,
> + res->memblock[3],
> + res->seed1,
> + res->seed2,
> + dtype,
> + res->crc);
> + if (res->crcstate == 0)
> + res->crcstate = retval;
> + break;
> + case 1:
> + retval = core_bench_matrix(&(res->mat), dtype, res->crc);
> + if (res->crcmatrix == 0)
> + res->crcmatrix = retval;
> + break;
> + default:
> + retval = data;
> + break;
> + }
> + res->crc = crcu16(retval, res->crc);
> + retval &= 0x007f;
> + *pdata = (data & 0xff00) | 0x0080 | retval; /* cache the result */
> + return retval;
> + }
> +}
> +
> +/* { dg-final { scan-assembler "cm\.push" } } */
> +
> --
> 2.17.1
>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 3/4] [RISC-V] resolve confilct between zcmp multi push/pop and shrink-wrap-separate
2023-06-07 5:52 ` [PATCH 3/4] [RISC-V] resolve confilct between zcmp multi push/pop and shrink-wrap-separate Fei Gao
2023-06-12 15:17 ` Kito Cheng
@ 2023-06-12 19:26 ` Jeff Law
2023-06-13 2:35 ` Fei Gao
1 sibling, 1 reply; 17+ messages in thread
From: Jeff Law @ 2023-06-12 19:26 UTC (permalink / raw)
To: Fei Gao, gcc-patches; +Cc: kito.cheng, palmer, sinan.lin, jiawei
On 6/6/23 23:52, Fei Gao wrote:
> Disable zcmp multi push/pop if shrink-wrap-separate is active.
>
> So in -Os that prefers smaller code size, by default shrink-wrap-separate
> is disabled while zcmp multi push/pop is enabled.
>
> And in -O2 and others that prefers speed, by default shrink-wrap-separate
> is enabled while zcmp multi push/pop is disabled. To force enabling zcmp multi
> push/pop in this case, -fno-shrink-wrap-separate has to be explicitly given.
>
> The following TC shows the issues in -O2 before this patch with both
> shrink-wrap-separate and zcmp multi push/pop active.
> 1. duplicated store of s regs.
> 2. cm.push pushes ra, s0-s11 in the reverse order from what the normal
> prologue does, causing stack corruption and failure to restore s regs.
>
> TC: zcmp_shrink_wrap_separate.c included in this patch.
>
> output asm before this patch:
> calc_func:
> cm.push {ra, s0-s3}, -32
> ...
> beq a5,zero,.L2
> ...
> .L2:
> ...
> sw s1,20(sp) //issue here
> sw s3,12(sp) //issue here
> ...
> sw s2,16(sp) //issue here
>
> output asm after this patch:
> calc_func:
> addi sp,sp,-32
> sw s0,24(sp)
> ...
> beq a5,zero,.L2
> ...
> .L2:
> ...
> sw s1,20(sp)
> sw s3,12(sp)
> ...
> sw s2,16(sp)
> gcc/ChangeLog:
>
> * config/riscv/riscv.cc
> (riscv_avoid_shrink_wrapping_separate): wrap the condition check in
> riscv_avoid_shrink_wrapping_separate.
> (riscv_avoid_multi_push): avoid multi push if shrink_wrapping_separate
> is active.
> (riscv_get_separate_components): call riscv_avoid_shrink_wrapping_separate
> * shrink-wrap.cc (try_shrink_wrapping_separate): call
> use_shrink_wrapping_separate.
> (use_shrink_wrapping_separate):wrap the condition
> check in use_shrink_wrapping_separate
> * shrink-wrap.h (use_shrink_wrapping_separate): add to extern
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/zcmp_shrink_wrap_separate.c: New test.
> * gcc.target/riscv/zcmp_shrink_wrap_separate2.c: New test.
I know Kito asked for this to be broken up into target-dependent and
target-independent changes; that's a good ask.
Can't we use the get_separate_components hook to accomplish what
you're trying to do? That is, put the logic to avoid shrink wrapping for
this case within the existing RISC-V hook?
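For readers following along, here is a minimal sketch of what handling this
inside the hook could look like. It is a simplified standalone C model, not
GCC code: struct fn_state, its fields, and the bitmask return value are
assumptions standing in for cfun->machine and the sbitmap the real hook
returns. The first three conditions mirror the check visible in the quoted
diff; the prefer_multi_push flag is purely hypothetical and only illustrates
the alternative being asked about.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-function state; the real hook reads cfun->machine.  */
struct fn_state
{
  bool uses_save_libcall;        /* riscv_use_save_libcall () in the patch */
  bool interrupt_handler;        /* cfun->machine->interrupt_handler_p */
  bool frame_offset_is_constant; /* gp_sp_offset.is_constant () */
  bool prefer_multi_push;        /* hypothetical: zcmp push/pop wanted */
  uint32_t saved_gpr_mask;       /* one bit per saved GPR */
};

/* Return the set of registers whose saves may be shrink-wrapped
   separately.  An empty set (0) opts the function out entirely.  */
static uint32_t
get_separate_components (const struct fn_state *fn)
{
  if (fn->uses_save_libcall
      || fn->interrupt_handler
      || !fn->frame_offset_is_constant
      || fn->prefer_multi_push)
    return 0;

  /* Simplification: per-register filtering is not modelled here.  */
  return fn->saved_gpr_mask;
}
```

With this shape the decision stays in the target hook, at the cost of
giving up separate shrink wrapping whenever multi push/pop is preferred.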
jeff
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Re: [PATCH 3/4] [RISC-V] resolve confilct between zcmp multi push/pop and shrink-wrap-separate
2023-06-12 19:26 ` Jeff Law
@ 2023-06-13 2:35 ` Fei Gao
0 siblings, 0 replies; 17+ messages in thread
From: Fei Gao @ 2023-06-13 2:35 UTC (permalink / raw)
To: jeffreyalaw, gcc-patches; +Cc: Kito Cheng, Palmer Dabbelt, Sinan, jiawei
On 2023-06-13 03:26 Jeff Law <jeffreyalaw@gmail.com> wrote:
>
>
>
>On 6/6/23 23:52, Fei Gao wrote:
>> Disable zcmp multi push/pop if shrink-wrap-separate is active.
>>
>> So in -Os that prefers smaller code size, by default shrink-wrap-separate
>> is disabled while zcmp multi push/pop is enabled.
>>
>> And in -O2 and others that prefers speed, by default shrink-wrap-separate
>> is enabled while zcmp multi push/pop is disabled. To force enabling zcmp multi
>> push/pop in this case, -fno-shrink-wrap-separate has to be explicitly given.
>>
>> The following TC shows the issues in -O2 before this patch with both
>> shrink-wrap-separate and zcmp multi push/pop active.
>> 1. duplicated store of s regs.
>> 2. cm.push pushes ra, s0-s11 in the reverse order from what the normal
>> prologue does, causing stack corruption and failure to restore s regs.
>>
>> TC: zcmp_shrink_wrap_separate.c included in this patch.
>>
>> output asm before this patch:
>> calc_func:
>> cm.push {ra, s0-s3}, -32
>> ...
>> beq a5,zero,.L2
>> ...
>> .L2:
>> ...
>> sw s1,20(sp) //issue here
>> sw s3,12(sp) //issue here
>> ...
>> sw s2,16(sp) //issue here
>>
>> output asm after this patch:
>> calc_func:
>> addi sp,sp,-32
>> sw s0,24(sp)
>> ...
>> beq a5,zero,.L2
>> ...
>> .L2:
>> ...
>> sw s1,20(sp)
>> sw s3,12(sp)
>> ...
>> sw s2,16(sp)
>> gcc/ChangeLog:
>>
>> * config/riscv/riscv.cc
>> (riscv_avoid_shrink_wrapping_separate): wrap the condition check in
>> riscv_avoid_shrink_wrapping_separate.
>> (riscv_avoid_multi_push): avoid multi push if shrink_wrapping_separate
>> is active.
>> (riscv_get_separate_components): call riscv_avoid_shrink_wrapping_separate
>> * shrink-wrap.cc (try_shrink_wrapping_separate): call
>> use_shrink_wrapping_separate.
>> (use_shrink_wrapping_separate):wrap the condition
>> check in use_shrink_wrapping_separate
>> * shrink-wrap.h (use_shrink_wrapping_separate): add to extern
>>
>> gcc/testsuite/ChangeLog:
>>
>> * gcc.target/riscv/zcmp_shrink_wrap_separate.c: New test.
>> * gcc.target/riscv/zcmp_shrink_wrap_separate2.c: New test.
>I know Kito asked for this to be broken up into target dependent vs
>target independent changes, that's a good ask.
>
>Can't we utilize the get_separate_components hook to accomplish what
>you're trying to do? ie, put the logic to avoid shrink wrapping for
>this case within the existing risc-v hook?
Thanks, Jeff and Kito, for your comments.
My first attempt was to avoid shrink wrapping if zcmp is enabled.
But after discussion with Kito and Andrew Pinski, I realized it's better to disable
zcmp push and pop if shrink wrapping is active.
For the detailed discussion, please see the link below.
thread: [PATCH 1/2] [RISC-V] disable shrink-wrap-separate if zcmp enabled.
link: https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg307203.html
I will go ahead with Kito's advice if you're fine with the current solution.
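To make the chosen direction concrete, here is a small standalone C model
of the clause the quoted patch adds to riscv_avoid_multi_push. It is only a
sketch: struct sw_flags and its field names are assumptions that stand in
for the results of use_shrink_wrapping_separate () and
riscv_avoid_shrink_wrapping_separate (), and the other conditions in the
real function are left out.

```c
#include <stdbool.h>

/* Hypothetical inputs; each field stands for one predicate in the patch.  */
struct sw_flags
{
  bool generic_pass_usable;     /* use_shrink_wrapping_separate () */
  bool target_avoids_separate;  /* riscv_avoid_shrink_wrapping_separate () */
};

/* Multi push/pop is given up only when separate shrink wrapping will
   really run: the generic pass is usable and the target does not opt out.  */
static bool
multi_push_blocked_by_shrink_wrap (const struct sw_flags *f)
{
  return f->generic_pass_usable && !f->target_avoids_separate;
}
```

So with -Os, or with -fno-shrink-wrap-separate, the first flag is false and
cm.push/cm.pop stay available, which matches the behaviour described in the
commit message.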
Thanks.
BR,
Fei
>
>jeff
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 4/4] [RISC-V] support cm.mva01s cm.mvsa01 in zcmp
2023-06-07 5:52 ` [PATCH 4/4] [RISC-V] support cm.mva01s cm.mvsa01 in zcmp Fei Gao
@ 2023-07-13 8:18 ` Kito Cheng
0 siblings, 0 replies; 17+ messages in thread
From: Kito Cheng @ 2023-07-13 8:18 UTC (permalink / raw)
To: Fei Gao; +Cc: gcc-patches, palmer, jeffreyalaw, sinan.lin, jiawei, Die Li
LGTM, thanks. As with the other zc* patches, I would like to defer this
until the binutils part has landed :)
On Wed, Jun 7, 2023 at 1:54 PM Fei Gao <gaofei@eswincomputing.com> wrote:
>
> From: Die Li <lidie@eswincomputing.com>
>
> Signed-off-by: Die Li <lidie@eswincomputing.com>
> Co-Authored-By: Fei Gao <gaofei@eswincomputing.com>
>
> gcc/ChangeLog:
>
> * config/riscv/peephole.md: New pattern.
> * config/riscv/predicates.md (a0a1_reg_operand): New predicate.
> (zcmp_mv_sreg_operand): New predicate.
> * config/riscv/riscv.md (A1_REGNUM): New constant.
> * config/riscv/zc.md (*mva01s<X:mode>): New pattern.
> (*mvsa01<X:mode>): New pattern.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/cm_mv_rv32.c: New test.
> ---
> gcc/config/riscv/peephole.md | 28 +++++++++++++++++++++
> gcc/config/riscv/predicates.md | 11 ++++++++
> gcc/config/riscv/riscv.md | 1 +
> gcc/config/riscv/zc.md | 22 ++++++++++++++++
> gcc/testsuite/gcc.target/riscv/cm_mv_rv32.c | 21 ++++++++++++++++
> 5 files changed, 83 insertions(+)
> create mode 100644 gcc/testsuite/gcc.target/riscv/cm_mv_rv32.c
>
> diff --git a/gcc/config/riscv/peephole.md b/gcc/config/riscv/peephole.md
> index 67e7046d7e6..e8cb1ba4838 100644
> --- a/gcc/config/riscv/peephole.md
> +++ b/gcc/config/riscv/peephole.md
> @@ -94,3 +94,31 @@
> {
> th_mempair_order_operands (operands, true, SImode);
> })
> +
> +;; ZCMP
> +(define_peephole2
> + [(set (match_operand:X 0 "a0a1_reg_operand")
> + (match_operand:X 1 "zcmp_mv_sreg_operand"))
> + (set (match_operand:X 2 "a0a1_reg_operand")
> + (match_operand:X 3 "zcmp_mv_sreg_operand"))]
> + "TARGET_ZCMP
> + && (REGNO (operands[2]) != REGNO (operands[0]))"
> + [(parallel [(set (match_dup 0)
> + (match_dup 1))
> + (set (match_dup 2)
> + (match_dup 3))])]
> +)
> +
> +(define_peephole2
> + [(set (match_operand:X 0 "zcmp_mv_sreg_operand")
> + (match_operand:X 1 "a0a1_reg_operand"))
> + (set (match_operand:X 2 "zcmp_mv_sreg_operand")
> + (match_operand:X 3 "a0a1_reg_operand"))]
> + "TARGET_ZCMP
> + && (REGNO (operands[0]) != REGNO (operands[2]))
> + && (REGNO (operands[1]) != REGNO (operands[3]))"
> + [(parallel [(set (match_dup 0)
> + (match_dup 1))
> + (set (match_dup 2)
> + (match_dup 3))])]
> +)
> diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
> index a1b9367b997..6d5e8630cb5 100644
> --- a/gcc/config/riscv/predicates.md
> +++ b/gcc/config/riscv/predicates.md
> @@ -207,6 +207,17 @@
> (and (match_code "const_int")
> (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 13)")))
>
> +;; ZCMP predicates
> +(define_predicate "a0a1_reg_operand"
> + (and (match_operand 0 "register_operand")
> + (match_test "IN_RANGE (REGNO (op), A0_REGNUM, A1_REGNUM)")))
> +
> +(define_predicate "zcmp_mv_sreg_operand"
> + (and (match_operand 0 "register_operand")
> + (match_test "TARGET_RVE ? IN_RANGE (REGNO (op), S0_REGNUM, S1_REGNUM)
> + : IN_RANGE (REGNO (op), S0_REGNUM, S1_REGNUM)
> + || IN_RANGE (REGNO (op), S2_REGNUM, S7_REGNUM)")))
> +
> ;; Only use branch-on-bit sequences when the mask is not an ANDI immediate.
> (define_predicate "branch_on_bit_operand"
> (and (match_code "const_int")
> diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
> index 02802d2685d..25bc3e6ab4c 100644
> --- a/gcc/config/riscv/riscv.md
> +++ b/gcc/config/riscv/riscv.md
> @@ -121,6 +121,7 @@
> (S0_REGNUM 8)
> (S1_REGNUM 9)
> (A0_REGNUM 10)
> + (A1_REGNUM 11)
> (S2_REGNUM 18)
> (S3_REGNUM 19)
> (S4_REGNUM 20)
> diff --git a/gcc/config/riscv/zc.md b/gcc/config/riscv/zc.md
> index 217e115035b..bb4975cd333 100644
> --- a/gcc/config/riscv/zc.md
> +++ b/gcc/config/riscv/zc.md
> @@ -1433,3 +1433,25 @@
> "TARGET_ZCMP"
> "cm.push {ra, s0-s11}, %0"
> )
> +
> +;; ZCMP mv
> +(define_insn "*mva01s<X:mode>"
> + [(set (match_operand:X 0 "a0a1_reg_operand" "=r")
> + (match_operand:X 1 "zcmp_mv_sreg_operand" "r"))
> + (set (match_operand:X 2 "a0a1_reg_operand" "=r")
> + (match_operand:X 3 "zcmp_mv_sreg_operand" "r"))]
> + "TARGET_ZCMP
> + && (REGNO (operands[2]) != REGNO (operands[0]))"
> + { return (REGNO (operands[0]) == A0_REGNUM)?"cm.mva01s\t%1,%3":"cm.mva01s\t%3,%1"; }
> + [(set_attr "mode" "<X:MODE>")])
> +
> +(define_insn "*mvsa01<X:mode>"
> + [(set (match_operand:X 0 "zcmp_mv_sreg_operand" "=r")
> + (match_operand:X 1 "a0a1_reg_operand" "r"))
> + (set (match_operand:X 2 "zcmp_mv_sreg_operand" "=r")
> + (match_operand:X 3 "a0a1_reg_operand" "r"))]
> + "TARGET_ZCMP
> + && (REGNO (operands[0]) != REGNO (operands[2]))
> + && (REGNO (operands[1]) != REGNO (operands[3]))"
> + { return (REGNO (operands[1]) == A0_REGNUM)?"cm.mvsa01\t%0,%2":"cm.mvsa01\t%2,%0"; }
> + [(set_attr "mode" "<X:MODE>")])
> diff --git a/gcc/testsuite/gcc.target/riscv/cm_mv_rv32.c b/gcc/testsuite/gcc.target/riscv/cm_mv_rv32.c
> new file mode 100644
> index 00000000000..49c94c01603
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/cm_mv_rv32.c
> @@ -0,0 +1,21 @@
> +/* { dg-do compile } */
> +/* { dg-options " -Os -march=rv32i_zca_zcmp -mabi=ilp32 " } */
> +/* { dg-skip-if "" { *-*-* } {"-O0" "-O1" "-O2" "-Og" "-O3" "-Oz" "-flto"} } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> +
> +int func (int a, int b);
> +
> +/*
> +**sum:
> +** ...
> +** cm.mvsa01 s1,s2
> +** call func
> +** mv s0,a0
> +** cm.mva01s s1,s2
> +** call func
> +** ...
> +*/
> +int sum (int a, int b)
> +{
> + return func (a, b) + func (a, b);
> +}
> --
> 2.17.1
>
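A standalone C sketch of the register checks and operand ordering used by
the quoted *mva01s pattern, for anyone who wants to see the logic outside
of machine-description syntax. The register numbers for s0, s1, a0, a1 and
s2 come from the quoted riscv.md; S7_REGNUM = 23 follows the standard
RISC-V numbering. struct mv_pair and order_mva01s are hypothetical helper
names, not part of the patch.

```c
#include <stdbool.h>

enum
{
  S0_REGNUM = 8, S1_REGNUM = 9, A0_REGNUM = 10, A1_REGNUM = 11,
  S2_REGNUM = 18, S7_REGNUM = 23
};

static bool
in_range (int v, int lo, int hi)
{
  return v >= lo && v <= hi;
}

/* Same test as the a0a1_reg_operand predicate.  */
static bool
is_a0a1 (int regno)
{
  return in_range (regno, A0_REGNUM, A1_REGNUM);
}

/* Same test as zcmp_mv_sreg_operand: s0-s1 always, s2-s7 unless RVE.  */
static bool
is_zcmp_mv_sreg (int regno, bool rve)
{
  if (in_range (regno, S0_REGNUM, S1_REGNUM))
    return true;
  return !rve && in_range (regno, S2_REGNUM, S7_REGNUM);
}

/* Hypothetical result type: the two s-registers in assembly order.  */
struct mv_pair
{
  int first;   /* source moved into a0 */
  int second;  /* source moved into a1 */
};

/* Decide the operand order for "cm.mva01s first,second" given two moves
   dst0 <- src0 and dst1 <- src1.  Returns false if the pair of moves
   does not fit the instruction.  */
static bool
order_mva01s (int dst0, int src0, int dst1, int src1, bool rve,
              struct mv_pair *out)
{
  if (!is_a0a1 (dst0) || !is_a0a1 (dst1) || dst0 == dst1)
    return false;
  if (!is_zcmp_mv_sreg (src0, rve) || !is_zcmp_mv_sreg (src1, rve))
    return false;

  /* The source that feeds a0 is always printed first.  */
  out->first = (dst0 == A0_REGNUM) ? src0 : src1;
  out->second = (dst0 == A0_REGNUM) ? src1 : src0;
  return true;
}
```

For the two moves a1 <- s2 and a0 <- s1, this yields the order s1,s2, which
is the form the cm_mv_rv32.c test expects.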
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/4] [RISC-V] support cm.popretz in zcmp
2023-06-07 5:52 ` [PATCH 2/4] [RISC-V] support cm.popretz " Fei Gao
@ 2023-07-13 8:31 ` Kito Cheng
0 siblings, 0 replies; 17+ messages in thread
From: Kito Cheng @ 2023-07-13 8:31 UTC (permalink / raw)
To: Fei Gao; +Cc: gcc-patches, palmer, jeffreyalaw, sinan.lin, jiawei
I was wondering whether it is possible to use peephole2 to optimize this
case, but I realized there are several barriers, like the stack tie and
notes, so it seems hard to just leverage peephole2.
The patch LGTM, with only a few minor coding format issues. You
don't need to send a new patch; I can fix those when I push. I
would strongly suggest you set up git-format-patch; <gcc-src>/contrib
has a clang-format configuration that can free you from the boring
coding format issues.
# Copy to <gcc-src>/.clang-format so that clang-format can find it
automatically.
$ cp contrib/clang-format .clang-format
> @@ -5747,6 +5748,80 @@ riscv_adjust_libcall_cfi_epilogue ()
> return dwarf;
> }
>
> +/* return true if popretz pattern can be matched.
> + set (reg 10 a0) (const_int 0)
> + use (reg 10 a0)
> + NOTE_INSN_EPILOGUE_BEG */
> +static rtx_insn *
> +riscv_zcmp_can_use_popretz(void)
Need a space between the function name and (void).
> +{
> + rtx_insn *insn = NULL, *use = NULL, *clear = NULL;
> +
> + /* sequence stack for NOTE_INSN_EPILOGUE_BEG*/
> + struct sequence_stack * outer_seq = get_current_sequence ()->next;
> + if (!outer_seq)
> + return NULL;
> + insn = outer_seq->first;
> + if(!insn || !NOTE_P (insn) || NOTE_KIND (insn) != NOTE_INSN_EPILOGUE_BEG)
> + return NULL;
> +
> + /* sequence stack for the insn before NOTE_INSN_EPILOGUE_BEG*/
> + outer_seq = outer_seq->next;
> + if (outer_seq)
> + insn = outer_seq->last;
> +
> + /* skip notes */
> + while (insn && NOTE_P (insn))
> + {
> + insn = PREV_INSN (insn);
> + }
> + use = insn;
> +
> + /* match use (reg 10 a0) */
> + if (use == NULL || !INSN_P (use)
> + || GET_CODE (PATTERN (use)) != USE
> + || !REG_P(XEXP(PATTERN (use), 0))
> + || REGNO(XEXP(PATTERN (use), 0)) != A0_REGNUM)
> + return NULL;
> +
> + /* match set (reg 10 a0) (const_int 0 [0]) */
> + clear = PREV_INSN (use);
> + if (clear != NULL && INSN_P (clear)
> + && GET_CODE (PATTERN (clear)) == SET
> + && REG_P (SET_DEST (PATTERN (clear)))
> + && REGNO (SET_DEST (PATTERN (clear))) == A0_REGNUM
> + && SET_SRC (PATTERN (clear)) == const0_rtx)
> + return clear;
> +
> + return NULL;
> +}
> +
> +static void
> +riscv_gen_multi_pop_insn(bool use_multi_pop_normal, unsigned mask,
> + unsigned multipop_size)
Same issue here: need a space between the function name and the argument list.
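As an aside, the backward scan that the quoted riscv_zcmp_can_use_popretz
performs can be modelled on a flat array. This is only a sketch under
simplifying assumptions: enum insn_kind and find_popretz_clear are
hypothetical names, and the real code walks GCC's sequence stack and RTL
patterns rather than an array of tags.

```c
#include <stddef.h>

/* Hypothetical, heavily simplified insn classification.  */
enum insn_kind
{
  INSN_NOTE,         /* any note */
  INSN_USE_A0,       /* (use (reg a0)) */
  INSN_SET_A0_ZERO,  /* (set (reg a0) (const_int 0)) */
  INSN_OTHER
};

/* Look at the end of the insns that precede the epilogue.  If they end
   with "a0 = 0; use a0" followed only by notes, return the index of the
   "a0 = 0" insn, otherwise return -1.  As in the quoted code, notes are
   skipped only after the use, not between the clear and the use.  */
static int
find_popretz_clear (const enum insn_kind *insns, size_t n)
{
  size_t i = n;

  /* Skip trailing notes.  */
  while (i > 0 && insns[i - 1] == INSN_NOTE)
    i--;

  /* The last real insn must be the use of a0.  */
  if (i == 0 || insns[i - 1] != INSN_USE_A0)
    return -1;
  i--;

  /* The insn directly before it must clear a0.  */
  if (i == 0 || insns[i - 1] != INSN_SET_A0_ZERO)
    return -1;

  return (int) (i - 1);
}
```

A peephole2 would need the clear, the use, and the pop to be adjacent
insns, which is what the stack tie and the notes get in the way of.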
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 1/4][V4][RISC-V] support cm.push cm.pop cm.popret in zcmp
2023-06-07 5:52 ` [PATCH 1/4][V4][RISC-V] support cm.push cm.pop cm.popret in zcmp Fei Gao
2023-06-07 10:11 ` jiawei
@ 2023-08-16 8:33 ` Kito Cheng
2023-08-16 8:38 ` Kito Cheng
2023-08-17 11:39 ` Fei Gao
1 sibling, 2 replies; 17+ messages in thread
From: Kito Cheng @ 2023-08-16 8:33 UTC (permalink / raw)
To: Fei Gao; +Cc: gcc-patches, palmer, jeffreyalaw, sinan.lin, jiawei
Hi Fei:
I tried to use Jiawei's patch to test this patch and found some issues:
> @@ -5430,13 +5632,15 @@ riscv_expand_prologue (void)
> /* Save the registers. */
> if ((frame->mask | frame->fmask) != 0)
> {
> - HOST_WIDE_INT step1 = riscv_first_stack_step (frame, remaining_size);
> -
> - insn = gen_add3_insn (stack_pointer_rtx,
> - stack_pointer_rtx,
> - GEN_INT (-step1));
> - RTX_FRAME_RELATED_P (emit_insn (insn)) = 1;
> - remaining_size -= step1;
> + if (known_gt (remaining_size, frame->frame_pointer_offset))
> + {
> + HOST_WIDE_INT step1 = riscv_first_stack_step (frame, remaining_size);
> + remaining_size -= step1;
> + insn = gen_add3_insn (stack_pointer_rtx,
> + stack_pointer_rtx,
> + GEN_INT (-step1));
> + RTX_FRAME_RELATED_P (emit_insn (insn)) = 1;
> + }
> riscv_for_each_saved_reg (remaining_size, riscv_save_reg, false, false);
> }
>
I hit an issue here while building libgcc. I use
riscv-gnu-toolchain with --with-arch=rv64gzca_zcmp.
The error message is:
In file included from
../../../../../riscv-gnu-toolchain-trunk/gcc/libgcc/unwind-dw2.c:1471:
../../../../../riscv-gnu-toolchain-trunk/gcc/libgcc/unwind.inc: In
function '_Unwind_Backtrace':
../../../../../riscv-gnu-toolchain-trunk/gcc/libgcc/unwind.inc:330:1:
internal compiler error: in gen_reg_rtx, at emit-rtl.cc:1176
330 | }
| ^
0x83753a gen_reg_rtx(machine_mode)
../../../../riscv-gnu-toolchain-trunk/gcc/gcc/emit-rtl.cc:1176
0xf5566f maybe_legitimize_operand
../../../../riscv-gnu-toolchain-trunk/gcc/gcc/optabs.cc:8047
0xf5566f maybe_legitimize_operands(insn_code, unsigned int, unsigned
int, expand_operand*)
../../../../riscv-gnu-toolchain-trunk/gcc/gcc/optabs.cc:8191
0xf511d9 maybe_gen_insn(insn_code, unsigned int, expand_operand*)
../../../../riscv-gnu-toolchain-trunk/gcc/gcc/optabs.cc:8210
0xf58539 expand_binop_directly
../../../../riscv-gnu-toolchain-trunk/gcc/gcc/optabs.cc:1452
0xf56666 expand_binop(machine_mode, optab_tag, rtx_def*, rtx_def*,
rtx_def*, int, optab_methods)
../../../../riscv-gnu-toolchain-trunk/gcc/gcc/optabs.cc:1539
0xcbfdd0 force_operand(rtx_def*, rtx_def*)
../../../../riscv-gnu-toolchain-trunk/gcc/gcc/expr.cc:8231
0xc8fca1 force_reg(machine_mode, rtx_def*)
../../../../riscv-gnu-toolchain-trunk/gcc/gcc/explow.cc:687
0x144b8cd riscv_force_temporary
../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:1531
0x144b8cd riscv_force_address
../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:1528
0x144b8cd riscv_legitimize_move(machine_mode, rtx_def*, rtx_def*)
../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:2387
0x1af063e gen_movdf(rtx_def*, rtx_def*)
../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.md:2107
0xcba503 rtx_insn* insn_gen_fn::operator()<rtx_def*,
rtx_def*>(rtx_def*, rtx_def*) const
../../../../riscv-gnu-toolchain-trunk/gcc/gcc/recog.h:411
0xcba503 emit_move_insn_1(rtx_def*, rtx_def*)
../../../../riscv-gnu-toolchain-trunk/gcc/gcc/expr.cc:4164
0x143d6c4 riscv_emit_move(rtx_def*, rtx_def*)
../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:1486
0x143d6c4 riscv_save_reg
../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:5715
0x143e2b9 riscv_for_each_saved_reg
../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:5904
0x14480d0 riscv_expand_prologue()
../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:6156
0x1af57fb gen_prologue()
../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.md:2816
0x143c746 target_gen_prologue
../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.md:3302
Reduced case:
$ riscv64-unknown-elf-gcc -march=rv64imafd_zicsr_zifencei_zca_zcmp
-mabi=lp64d unwind-dw2.i -Os
typedef struct {
struct {
struct {
struct {
long a
}
} a[129]
}
} b;
struct c {
void *a[129]
} d() {
struct c a;
__builtin_unwind_init();
b e;
f(a, &e);
}
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 1/4][V4][RISC-V] support cm.push cm.pop cm.popret in zcmp
2023-08-16 8:33 ` Kito Cheng
@ 2023-08-16 8:38 ` Kito Cheng
2023-08-16 9:03 ` Fei Gao
2023-08-20 10:53 ` Fei Gao
2023-08-17 11:39 ` Fei Gao
1 sibling, 2 replies; 17+ messages in thread
From: Kito Cheng @ 2023-08-16 8:38 UTC (permalink / raw)
To: Fei Gao; +Cc: gcc-patches, palmer, jeffreyalaw, sinan.lin, jiawei
Another failing case, this time in CFI generation:
$ riscv64-unknown-elf-gcc _mulhc3.i
-march=rv64imafd_zicsr_zifencei_zca_zcmp -mabi=lp64d -g -O2 -o
_mulhc3.s
typedef float a __attribute__((mode(HF)));
b, c;
f() {
a a, d, e = a + d;
if (g() && e)
c = b;
}
0x10e508a maybe_record_trace_start
../../../../riscv-gnu-toolchain-trunk/gcc/gcc/dwarf2cfi.cc:2584
0x10e58fb scan_trace
../../../../riscv-gnu-toolchain-trunk/gcc/gcc/dwarf2cfi.cc:2784
0x10e5fab create_cfi_notes
../../../../riscv-gnu-toolchain-trunk/gcc/gcc/dwarf2cfi.cc:2938
0x10e6ee4 execute_dwarf2_frame
../../../../riscv-gnu-toolchain-trunk/gcc/gcc/dwarf2cfi.cc:3309
0x10e7c5a execute
../../../../riscv-gnu-toolchain-trunk/gcc/gcc/dwarf2cfi.cc:3797
On Wed, Aug 16, 2023 at 4:33 PM Kito Cheng <kito.cheng@gmail.com> wrote:
>
> Hi Fei:
>
> I tried to use Jiawei's patch to test this patch and found some issues:
>
>
> > @@ -5430,13 +5632,15 @@ riscv_expand_prologue (void)
> > /* Save the registers. */
> > if ((frame->mask | frame->fmask) != 0)
> > {
> > - HOST_WIDE_INT step1 = riscv_first_stack_step (frame, remaining_size);
> > -
> > - insn = gen_add3_insn (stack_pointer_rtx,
> > - stack_pointer_rtx,
> > - GEN_INT (-step1));
> > - RTX_FRAME_RELATED_P (emit_insn (insn)) = 1;
> > - remaining_size -= step1;
> > + if (known_gt (remaining_size, frame->frame_pointer_offset))
> > + {
> > + HOST_WIDE_INT step1 = riscv_first_stack_step (frame, remaining_size);
> > + remaining_size -= step1;
> > + insn = gen_add3_insn (stack_pointer_rtx,
> > + stack_pointer_rtx,
> > + GEN_INT (-step1));
> > + RTX_FRAME_RELATED_P (emit_insn (insn)) = 1;
> > + }
> > riscv_for_each_saved_reg (remaining_size, riscv_save_reg, false, false);
> > }
> >
>
> I hit an issue here while building libgcc. I use
> riscv-gnu-toolchain with --with-arch=rv64gzca_zcmp.
>
> The error message is:
>
> In file included from
> ../../../../../riscv-gnu-toolchain-trunk/gcc/libgcc/unwind-dw2.c:1471:
> ../../../../../riscv-gnu-toolchain-trunk/gcc/libgcc/unwind.inc: In
> function '_Unwind_Backtrace':
> ../../../../../riscv-gnu-toolchain-trunk/gcc/libgcc/unwind.inc:330:1:
> internal compiler error: in gen_reg_rtx, at emit-rtl.cc:1176
> 330 | }
> | ^
> 0x83753a gen_reg_rtx(machine_mode)
> ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/emit-rtl.cc:1176
> 0xf5566f maybe_legitimize_operand
> ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/optabs.cc:8047
> 0xf5566f maybe_legitimize_operands(insn_code, unsigned int, unsigned
> int, expand_operand*)
> ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/optabs.cc:8191
> 0xf511d9 maybe_gen_insn(insn_code, unsigned int, expand_operand*)
> ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/optabs.cc:8210
> 0xf58539 expand_binop_directly
> ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/optabs.cc:1452
> 0xf56666 expand_binop(machine_mode, optab_tag, rtx_def*, rtx_def*,
> rtx_def*, int, optab_methods)
> ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/optabs.cc:1539
> 0xcbfdd0 force_operand(rtx_def*, rtx_def*)
> ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/expr.cc:8231
> 0xc8fca1 force_reg(machine_mode, rtx_def*)
> ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/explow.cc:687
> 0x144b8cd riscv_force_temporary
> ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:1531
> 0x144b8cd riscv_force_address
> ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:1528
> 0x144b8cd riscv_legitimize_move(machine_mode, rtx_def*, rtx_def*)
> ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:2387
> 0x1af063e gen_movdf(rtx_def*, rtx_def*)
> ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.md:2107
> 0xcba503 rtx_insn* insn_gen_fn::operator()<rtx_def*,
> rtx_def*>(rtx_def*, rtx_def*) const
> ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/recog.h:411
> 0xcba503 emit_move_insn_1(rtx_def*, rtx_def*)
> ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/expr.cc:4164
> 0x143d6c4 riscv_emit_move(rtx_def*, rtx_def*)
> ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:1486
> 0x143d6c4 riscv_save_reg
> ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:5715
> 0x143e2b9 riscv_for_each_saved_reg
> ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:5904
> 0x14480d0 riscv_expand_prologue()
> ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:6156
> 0x1af57fb gen_prologue()
> ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.md:2816
> 0x143c746 target_gen_prologue
> ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.md:3302
>
>
> Reduced case:
>
> $ riscv64-unknown-elf-gcc -march=rv64imafd_zicsr_zifencei_zca_zcmp
> -mabi=lp64d unwind-dw2.i -Os
>
> typedef struct {
> struct {
> struct {
> struct {
> long a
> }
> } a[129]
> }
> } b;
> struct c {
> void *a[129]
> } d() {
> struct c a;
> __builtin_unwind_init();
> b e;
> f(a, &e);
> }
* Re: Re: [PATCH 1/4][V4][RISC-V] support cm.push cm.pop cm.popret in zcmp
2023-08-16 8:38 ` Kito Cheng
@ 2023-08-16 9:03 ` Fei Gao
2023-08-20 10:53 ` Fei Gao
1 sibling, 0 replies; 17+ messages in thread
From: Fei Gao @ 2023-08-16 9:03 UTC (permalink / raw)
To: Kito Cheng; +Cc: gcc-patches, Palmer Dabbelt, jeffreyalaw, Sinan, jiawei
Hi Kito
Thanks for reporting these 2 issues.
Let me check and get back to you soon.
BR
Fei
On 2023-08-16 16:38 Kito Cheng <kito.cheng@gmail.com> wrote:
>
>Another fail case for CFI:
>
>$ riscv64-unknown-elf-gcc _mulhc3.i
>-march=rv64imafd_zicsr_zifencei_zca_zcmp -mabi=lp64d -g -O2 -o
>_mulhc3.s
>
>typedef float a __attribute__((mode(HF)));
>b, c;
>f() {
> a a, d, e = a + d;
> if (g() && e)
> c = b;
>}
>
>
>0x10e508a maybe_record_trace_start
> ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/dwarf2cfi.cc:2584
>0x10e58fb scan_trace
> ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/dwarf2cfi.cc:2784
>0x10e5fab create_cfi_notes
> ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/dwarf2cfi.cc:2938
>0x10e6ee4 execute_dwarf2_frame
> ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/dwarf2cfi.cc:3309
>0x10e7c5a execute
> ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/dwarf2cfi.cc:3797
* Re: Re: [PATCH 1/4][V4][RISC-V] support cm.push cm.pop cm.popret in zcmp
2023-08-16 8:33 ` Kito Cheng
2023-08-16 8:38 ` Kito Cheng
@ 2023-08-17 11:39 ` Fei Gao
1 sibling, 0 replies; 17+ messages in thread
From: Fei Gao @ 2023-08-17 11:39 UTC (permalink / raw)
To: Kito Cheng
Cc: gcc-patches, Palmer Dabbelt, jeffreyalaw, Sinan, jiawei,
eri-sw-toolchain
Hi Kito
The root cause has been identified.
Here's the frame layout for the test case (please view it in a monospaced font such as Courier :)
+-------------------------------+
|                               |
|  GPR save area 112 B          |
|                               |
+-------------------------------+
|                               |<-- fs0 is beyond sp based 12-bit range
|  FPR save area 96 B           |
|                               |
+-------------------------------+
|                               |
|  local variables              |<-- stack_pointer_rtx after riscv_first_stack_step
|                               |
+-------------------------------+
During stack frame allocation:
1. cm.push reserves 160 bytes: 112 for ra and the s-registers (with 128-bit alignment as per the ABI), plus an additional 48 bytes for the first 6 FPRs.
2. riscv_first_stack_step reserves 2032 bytes for the remaining 6 FPRs and the local variables.
3. riscv_for_each_saved_reg then tries to save fs0, whose offset from sp is beyond the 12-bit immediate range.
The address has to be forced into a register, which trips gcc_assert (can_create_pseudo_p ()) in gen_reg_rtx because reload has already completed.
I tried a solution that saves the first 6 FPRs immediately after cm.push. It seems to work :)
I will fix the epilogue correspondingly as well.
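As a rough illustration of the arithmetic behind step 3 and the proposed fix, here is a minimal sketch. It is not GCC code. The byte counts come from the description above; the helper names are hypothetical, and it assumes 8-byte FPR slots (lp64d) with fs0 in the slot directly below the GPR save area, as the diagram suggests.

```c
#include <stdbool.h>

/* Figures quoted in the description above (all in bytes).  */
enum {
  GPR_SAVE_AREA = 112,   /* ra and s-registers, 16-byte aligned */
  PUSH_EXTRA    = 48,    /* extra space cm.push reserves for the first 6 FPRs */
  FIRST_STEP    = 2032,  /* adjustment made via riscv_first_stack_step */
  FPR_SLOT      = 8      /* assumption: 8-byte FPR save slots (lp64d) */
};

/* Assumption: fs0 occupies the FPR slot directly below the GPR save
   area, i.e. GPR_SAVE_AREA + FPR_SLOT bytes below sp at function entry.  */
static long fs0_below_entry_sp (void)
{
  return GPR_SAVE_AREA + FPR_SLOT;
}

/* Offset of the fs0 slot from sp once sp has been lowered by
   sp_adjust bytes from its value at function entry.  */
static long fs0_offset_from_sp (long sp_adjust)
{
  return sp_adjust - fs0_below_entry_sp ();
}

/* RISC-V loads and stores take a signed 12-bit immediate offset.  */
static bool fits_simm12 (long offset)
{
  return offset >= -2048 && offset <= 2047;
}

/* Total sp adjustment right after cm.push.  */
static long adjust_after_push (void)
{
  return GPR_SAVE_AREA + PUSH_EXTRA;
}

/* Total sp adjustment after cm.push plus the first stack step.  */
static long adjust_after_first_step (void)
{
  return adjust_after_push () + FIRST_STEP;
}
```

Under these assumptions, fs0_offset_from_sp (adjust_after_first_step ()) is 2072, which does not fit a signed 12-bit immediate, while fs0_offset_from_sp (adjust_after_push ()) is only 40, which is why saving the first 6 FPRs right after cm.push avoids the need for a temporary register.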
Thanks again for your testing.
BR,
Fei
On 2023-08-16 16:33 Kito Cheng <kito.cheng@gmail.com> wrote:
>
>Hi Fei:
>
>Tried to use Jiawei's patch to test this patch and found some issue:
>
>
>> @@ -5430,13 +5632,15 @@ riscv_expand_prologue (void)
>> /* Save the registers. */
>> if ((frame->mask | frame->fmask) != 0)
>> {
>> - HOST_WIDE_INT step1 = riscv_first_stack_step (frame, remaining_size);
>> -
>> - insn = gen_add3_insn (stack_pointer_rtx,
>> - stack_pointer_rtx,
>> - GEN_INT (-step1));
>> - RTX_FRAME_RELATED_P (emit_insn (insn)) = 1;
>> - remaining_size -= step1;
>> + if (known_gt (remaining_size, frame->frame_pointer_offset))
>> + {
>> + HOST_WIDE_INT step1 = riscv_first_stack_step (frame, remaining_size);
>> + remaining_size -= step1;
>> + insn = gen_add3_insn (stack_pointer_rtx,
>> + stack_pointer_rtx,
>> + GEN_INT (-step1));
>> + RTX_FRAME_RELATED_P (emit_insn (insn)) = 1;
>> + }
>> riscv_for_each_saved_reg (remaining_size, riscv_save_reg, false, false);
>> }
>>
>
>I hit some issue here during building libgcc, I use
>riscv-gnu-toolchain with --with-arch=rv64gzca_zcmp
>
>And the error message is:
>
>In file included from
>../../../../../riscv-gnu-toolchain-trunk/gcc/libgcc/unwind-dw2.c:1471:
>../../../../../riscv-gnu-toolchain-trunk/gcc/libgcc/unwind.inc: In
>function '_Unwind_Backtrace':
>../../../../../riscv-gnu-toolchain-trunk/gcc/libgcc/unwind.inc:330:1:
>internal compiler error: in gen_reg_rtx, at emit-rtl.cc:1176
> 330 | }
> | ^
>0x83753a gen_reg_rtx(machine_mode)
> ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/emit-rtl.cc:1176
>0xf5566f maybe_legitimize_operand
> ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/optabs.cc:8047
>0xf5566f maybe_legitimize_operands(insn_code, unsigned int, unsigned
>int, expand_operand*)
> ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/optabs.cc:8191
>0xf511d9 maybe_gen_insn(insn_code, unsigned int, expand_operand*)
> ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/optabs.cc:8210
>0xf58539 expand_binop_directly
> ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/optabs.cc:1452
>0xf56666 expand_binop(machine_mode, optab_tag, rtx_def*, rtx_def*,
>rtx_def*, int, optab_methods)
> ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/optabs.cc:1539
>0xcbfdd0 force_operand(rtx_def*, rtx_def*)
> ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/expr.cc:8231
>0xc8fca1 force_reg(machine_mode, rtx_def*)
> ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/explow.cc:687
>0x144b8cd riscv_force_temporary
> ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:1531
>0x144b8cd riscv_force_address
> ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:1528
>0x144b8cd riscv_legitimize_move(machine_mode, rtx_def*, rtx_def*)
> ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:2387
>0x1af063e gen_movdf(rtx_def*, rtx_def*)
> ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.md:2107
>0xcba503 rtx_insn* insn_gen_fn::operator()<rtx_def*,
>rtx_def*>(rtx_def*, rtx_def*) const
> ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/recog.h:411
>0xcba503 emit_move_insn_1(rtx_def*, rtx_def*)
> ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/expr.cc:4164
>0x143d6c4 riscv_emit_move(rtx_def*, rtx_def*)
> ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:1486
>0x143d6c4 riscv_save_reg
> ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:5715
>0x143e2b9 riscv_for_each_saved_reg
> ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:5904
>0x14480d0 riscv_expand_prologue()
> ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:6156
>0x1af57fb gen_prologue()
> ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.md:2816
>0x143c746 target_gen_prologue
> ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.md:3302
>
>
>Reduced case:
>
>$ riscv64-unknown-elf-gcc -march=rv64imafd_zicsr_zifencei_zca_zcmp
>-mabi=lp64d unwind-dw2.i -Os
>
>typedef struct {
> struct {
> struct {
> struct {
> long a
> }
> } a[129]
> }
>} b;
>struct c {
> void *a[129]
>} d() {
> struct c a;
> __builtin_unwind_init();
> b e;
> f(a, &e);
>}
* Re: Re: [PATCH 1/4][V4][RISC-V] support cm.push cm.pop cm.popret in zcmp
2023-08-16 8:38 ` Kito Cheng
2023-08-16 9:03 ` Fei Gao
@ 2023-08-20 10:53 ` Fei Gao
2023-08-28 8:04 ` Fei Gao
1 sibling, 1 reply; 17+ messages in thread
From: Fei Gao @ 2023-08-20 10:53 UTC (permalink / raw)
To: Kito Cheng
Cc: gcc-patches, Palmer Dabbelt, jeffreyalaw, Sinan, jiawei,
eri-sw-toolchain
Hi Kito
This issue is caused by a conflict between zcmp and shrink-wrap-separate,
which is addressed by a patch currently under review:
[PATCH 0/2] resolve confilct between RISC-V zcmp and shrink-wrap-separate
https://patchwork.sourceware.org/project/gcc/list/?series=21577
https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg311487.html
I'm preparing [PATCH 1/4][V5][RISC-V] support cm.push cm.pop cm.popret in zcmp to fix the 1st issue you caught.
Please let me know if you want me to merge
https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg311486.html
into [PATCH 1/4][V5][RISC-V].
BR,
Fei
On 2023-08-16 16:38 Kito Cheng <kito.cheng@gmail.com> wrote:
>
>Another fail case for CFI:
>
>$ riscv64-unknown-elf-gcc _mulhc3.i
>-march=rv64imafd_zicsr_zifencei_zca_zcmp -mabi=lp64d -g -O2 -o
>_mulhc3.s
>
>typedef float a __attribute__((mode(HF)));
>b, c;
>f() {
> a a, d, e = a + d;
> if (g() && e)
> c = b;
>}
>
>
>0x10e508a maybe_record_trace_start
> ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/dwarf2cfi.cc:2584
>0x10e58fb scan_trace
> ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/dwarf2cfi.cc:2784
>0x10e5fab create_cfi_notes
> ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/dwarf2cfi.cc:2938
>0x10e6ee4 execute_dwarf2_frame
> ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/dwarf2cfi.cc:3309
>0x10e7c5a execute
> ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/dwarf2cfi.cc:3797
* Re: Re: [PATCH 1/4][V4][RISC-V] support cm.push cm.pop cm.popret in zcmp
2023-08-20 10:53 ` Fei Gao
@ 2023-08-28 8:04 ` Fei Gao
0 siblings, 0 replies; 17+ messages in thread
From: Fei Gao @ 2023-08-28 8:04 UTC (permalink / raw)
To: Kito Cheng, jeffreyalaw; +Cc: gcc-patches, Palmer Dabbelt, Sinan, jiawei
Hi Kito & Jeff
Here is a new series for zcmp (https://patchwork.sourceware.org/project/gcc/list/?series=23929) to:
1. solve the 2 issues Kito caught
2. rebase
The new series replaces the following:
https://patchwork.sourceware.org/project/gcc/list/?series=21577
https://patchwork.sourceware.org/project/gcc/patch/20230607055215.29332-2-gaofei@eswincomputing.com/
The rest of the zcmp patches will be sent out after the new series is accepted, to avoid rebasing again and again.
BR,
Fei
On 2023-08-20 18:53 Fei Gao <gaofei@eswincomputing.com> wrote:
>
>
>Hi Kito
>
>This issue is caused by a conflict between zcmp and shrink-wrap-separate,
>which is addressed by a patch currently under review:
>[PATCH 0/2] resolve confilct between RISC-V zcmp and shrink-wrap-separate
>https://patchwork.sourceware.org/project/gcc/list/?series=21577
>https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg311487.html
>
>I'm preparing [PATCH 1/4][V5][RISC-V] support cm.push cm.pop cm.popret in zcmp to fix the 1st issue you caught.
>Please let me know if you want me to merge
>https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg311486.html
>into [PATCH 1/4][V5][RISC-V].
>
>BR,
>Fei
Thread overview: 17+ messages
2023-06-07 5:52 [PATCH 0/4] [RISC-V] support zcmp extention Fei Gao
2023-06-07 5:52 ` [PATCH 1/4][V4][RISC-V] support cm.push cm.pop cm.popret in zcmp Fei Gao
2023-06-07 10:11 ` jiawei
2023-08-16 8:33 ` Kito Cheng
2023-08-16 8:38 ` Kito Cheng
2023-08-16 9:03 ` Fei Gao
2023-08-20 10:53 ` Fei Gao
2023-08-28 8:04 ` Fei Gao
2023-08-17 11:39 ` Fei Gao
2023-06-07 5:52 ` [PATCH 2/4] [RISC-V] support cm.popretz " Fei Gao
2023-07-13 8:31 ` Kito Cheng
2023-06-07 5:52 ` [PATCH 3/4] [RISC-V] resolve confilct between zcmp multi push/pop and shrink-wrap-separate Fei Gao
2023-06-12 15:17 ` Kito Cheng
2023-06-12 19:26 ` Jeff Law
2023-06-13 2:35 ` Fei Gao
2023-06-07 5:52 ` [PATCH 4/4] [RISC-V] support cm.mva01s cm.mvsa01 in zcmp Fei Gao
2023-07-13 8:18 ` Kito Cheng