public inbox for gcc-patches@gcc.gnu.org
* Re: [PATCH 1/1] [V2] [RISC-V] support cm.push cm.pop cm.popret in zcmp
From: Sinan @ 2023-05-30  5:26 UTC
  To: Fei Gao; +Cc: kito.cheng, Jiawei, Die Li, Liao Shihua, gcc-patches


>> +/* Return TRUE if Zcmp push and pop insns should be
>> + avoided. FALSE otherwise.
>> + Only use multi push & pop if all GPRs masked can be covered,
>> + and stack access is SP based,
>> + and GPRs are at top of the stack frame,
>> + and no conflicts in stack allocation with other features */
>> +static bool
>> +riscv_avoid_multi_push(const struct riscv_frame_info *frame)
>> +{
>> + if (!TARGET_ZCMP
>> + || crtl->calls_eh_return
>> + || frame_pointer_needed
>> + || cfun->machine->interrupt_handler_p
>> + || cfun->machine->varargs_size != 0
>> + || crtl->args.pretend_args_size != 0
>> + || (frame->mask & ~ MULTI_PUSH_GPR_MASK))
>> + return true;
>> +
>> + return false;
>> +}
Is there a reason to skip generating cm.push/cm.pop when a frame pointer is needed?
IIRC, the frame pointer is omitted only at -O1 and above, so code compiled at
-O0 would never use cm.push/cm.pop.
The same question applies to interrupt_handler_p. I think cm.push/cm.pop can handle that case; see, e.g.,
the test case zc-zcmp-push-pop-6.c from Jiawei's patch.
BR,
Sinan
------------------------------------------------------------------
Sender:Fei Gao <gaofei@eswincomputing.com>
Sent At:2023 May 16 (Tue.) 17:34
Recipient:sinan.lin <sinan.lin@linux.alibaba.com>; jiawei <jiawei@iscas.ac.cn>; shihua <shihua@iscas.ac.cn>; lidie <lidie@eswincomputing.com>
Cc:Fei Gao <gaofei@eswincomputing.com>
Subject:[PATCH 1/1] [V2] [RISC-V] support cm.push cm.pop cm.popret in zcmp
Zcmp can share the save-restore logic for stack allocation: pre-allocation
by cm.push, followed by step 1 and step 2.
Please note that cm.push pushes ra, s0-s11 in the reverse order of what save-restore does,
so the .cfi directives have been adapted accordingly in my patch.
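As a rough illustration of the allocation scheme described above, here is a minimal standalone C sketch, not the patch's code. It models how the cm.push stack adjustment splits into a 16-byte-aligned base plus spimm extra 16-byte steps, and where each register ends up relative to the new sp. The function names with the _sketch suffix are hypothetical, and the constants 16 and 3 are assumed values for ZCMP_SP_INC_STEP and ZCMP_MAX_SPIMM, since the riscv.h hunk is not quoted here.

```c
#include <assert.h>

/* Assumed values: 16-byte sp increment per spimm step, spimm in 0..3. */
#define SP_INC_STEP_SKETCH 16
#define MAX_SPIMM_SKETCH 3

/* Round VALUE up to the next multiple of 16 bytes. */
static long
round_up_16_sketch (long value)
{
  return (value + 15) & ~15L;
}

/* Smallest 16-byte-aligned adjustment that covers REGS_NUM registers
   (ra plus the s-registers), each WORD_BYTES wide.  */
static long
zcmp_base_adj_sketch (int regs_num, int word_bytes)
{
  assert (regs_num >= 1 && regs_num <= 13);
  return round_up_16_sketch ((long) regs_num * word_bytes);
}

/* Choose spimm from the frame bytes still to allocate after the base
   adjustment, following the comparison chain in the prologue hunk.  */
static int
zcmp_pick_spimm_sketch (long remaining)
{
  for (int spimm = MAX_SPIMM_SKETCH; spimm > 0; spimm--)
    if (remaining > (long) (spimm - 1) * SP_INC_STEP_SKETCH)
      return spimm;
  return 0;
}

/* Offset from the post-push sp at which x-register REGNO is saved,
   or -1 if REGNO is not in MASK.  The highest-numbered register sits
   at the highest address, so ra (x1) is lowest.  */
static long
zcmp_saved_offset_sketch (unsigned mask, int regno, long saved_size,
                          int word_bytes)
{
  int saved_cnt = 0;
  for (int r = 31; r >= 0; r--)
    {
      if (!(mask & (1u << r)))
        continue;
      saved_cnt++;
      if (r == regno)
        return saved_size - (long) word_bytes * saved_cnt;
    }
  return -1;
}
```

For RV32 with ra, s0 and s1 saved (3 registers, 4-byte words), the base adjustment is 16 bytes; if 32 bytes of frame remain, spimm is 2 and cm.push moves sp by 48 in total. With a 16-byte push alone, s1 (x9) lands at offset 12 from the new sp, s0 (x8) at 8 and ra (x1) at 4.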
gcc/ChangeLog:
 * config/riscv/predicates.md (slot_0_offset_operand): predicates for slot 0 offset.
 (slot_1_offset_operand): likewise
 (slot_2_offset_operand): likewise
 (slot_3_offset_operand): likewise
 (slot_4_offset_operand): likewise
 (slot_5_offset_operand): likewise
 (slot_6_offset_operand): likewise
 (slot_7_offset_operand): likewise
 (slot_8_offset_operand): likewise
 (slot_9_offset_operand): likewise
 (slot_10_offset_operand): likewise
 (slot_11_offset_operand): likewise
 (slot_12_offset_operand): likewise
 (stack_push_up_to_ra_operand): predicates for stack adjust of pushing ra
 (stack_push_up_to_s0_operand): predicates for stack adjust of pushing ra, s0
 (stack_push_up_to_s1_operand): likewise
 (stack_push_up_to_s2_operand): likewise
 (stack_push_up_to_s3_operand): likewise
 (stack_push_up_to_s4_operand): likewise
 (stack_push_up_to_s5_operand): likewise
 (stack_push_up_to_s6_operand): likewise
 (stack_push_up_to_s7_operand): likewise
 (stack_push_up_to_s8_operand): likewise
 (stack_push_up_to_s9_operand): likewise
 (stack_push_up_to_s11_operand): likewise
 (stack_pop_up_to_ra_operand): predicates for stack adjust of popping ra
 (stack_pop_up_to_s0_operand): predicates for stack adjust of popping ra, s0
 (stack_pop_up_to_s1_operand): likewise
 (stack_pop_up_to_s2_operand): likewise
 (stack_pop_up_to_s3_operand): likewise
 (stack_pop_up_to_s4_operand): likewise
 (stack_pop_up_to_s5_operand): likewise
 (stack_pop_up_to_s6_operand): likewise
 (stack_pop_up_to_s7_operand): likewise
 (stack_pop_up_to_s8_operand): likewise
 (stack_pop_up_to_s9_operand): likewise
 (stack_pop_up_to_s11_operand): likewise
 * config/riscv/riscv-protos.h (riscv_zcmp_valid_slot_offset_p): declaration
 (riscv_zcmp_valid_stack_adj_bytes_p): declaration
 * config/riscv/riscv.cc (struct riscv_frame_info): comment change
 (riscv_avoid_multi_push): helper function of riscv_use_multi_push
 (riscv_use_multi_push): true if multi push is used
 (riscv_multi_push_sregs_count): num of sregs in multi-push
 (riscv_multi_push_regs_count): num of regs in multi-push
 (riscv_16bytes_align): align to 16 bytes
 (riscv_stack_align): moved to a better place
 (riscv_save_libcall_count): no functional change
 (riscv_compute_frame_info): add zcmp frame info
 (riscv_adjust_multi_push_cfi_prologue): adjust cfi for cm.push
 (get_slot_offset_rtx): get the rtx of slot to push or pop
 (riscv_gen_multi_push_pop_insn): gen function for multi push and pop
 (riscv_expand_prologue): allocate stack by cm.push
 (riscv_adjust_multi_pop_cfi_epilogue): adjust cfi for cm.pop[ret]
 (riscv_expand_epilogue): allocate stack by cm.pop[ret]
 (zcmp_base_adj): calculate stack adjustment base size
 (zcmp_additional_adj): calculate stack adjustment additional size
 (riscv_zcmp_valid_slot_offset_p): check if offset is valid for a slot
 (riscv_zcmp_valid_stack_adj_bytes_p): check if stack adjustment size is valid
 * config/riscv/riscv.h (RETURN_ADDR_MASK): mask of ra
 (S0_MASK): likewise
 (S1_MASK): likewise
 (S2_MASK): likewise
 (S3_MASK): likewise
 (S4_MASK): likewise
 (S5_MASK): likewise
 (S6_MASK): likewise
 (S7_MASK): likewise
 (S8_MASK): likewise
 (S9_MASK): likewise
 (S10_MASK): likewise
 (S11_MASK): likewise
 (MULTI_PUSH_GPR_MASK): GPR_MASK that cm.push can cover at most
 (ZCMP_MAX_SPIMM): max spimm value
 (ZCMP_SP_INC_STEP): zcmp sp increment step
 (ZCMP_INVALID_S0S10_SREGS_COUNTS): num of s0-s10
 (ZCMP_S0S11_SREGS_COUNTS): num of s0-s11
 (ZCMP_MAX_GRP_SLOTS): max slots of pushing and popping in zcmp
 * config/riscv/riscv.md: include zc.md
 * config/riscv/zc.md: New file. machine description for zcmp
gcc/testsuite/ChangeLog:
 * gcc.target/riscv/rv32e_zcmp.c: New test.
 * gcc.target/riscv/rv32i_zcmp.c: New test.
 * gcc.target/riscv/zcmp_stack_alignment.c: New test.
---
 gcc/config/riscv/predicates.md | 148 +++
 gcc/config/riscv/riscv-protos.h | 2 +
 gcc/config/riscv/riscv.cc | 477 +++++++-
 gcc/config/riscv/riscv.h | 23 +
 gcc/config/riscv/riscv.md | 2 +
 gcc/config/riscv/zc.md | 1042 +++++++++++++++++
 gcc/testsuite/gcc.target/riscv/rv32e_zcmp.c | 239 ++++
 gcc/testsuite/gcc.target/riscv/rv32i_zcmp.c | 239 ++++
 .../gcc.target/riscv/zcmp_stack_alignment.c | 23 +
 9 files changed, 2155 insertions(+), 40 deletions(-)
 create mode 100644 gcc/config/riscv/zc.md
 create mode 100644 gcc/testsuite/gcc.target/riscv/rv32e_zcmp.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rv32i_zcmp.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zcmp_stack_alignment.c
diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
index e5adf06fa25..d3d30dc67f7 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -59,6 +59,154 @@
 (ior (match_operand 0 "const_0_operand")
 (match_operand 0 "register_operand")))
+(define_predicate "slot_0_offset_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_slot_offset_p (INTVAL (op), 0)")))
+
+(define_predicate "slot_1_offset_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_slot_offset_p (INTVAL (op), 1)")))
+
+(define_predicate "slot_2_offset_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_slot_offset_p (INTVAL (op), 2)")))
+
+(define_predicate "slot_3_offset_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_slot_offset_p (INTVAL (op), 3)")))
+
+(define_predicate "slot_4_offset_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_slot_offset_p (INTVAL (op), 4)")))
+
+(define_predicate "slot_5_offset_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_slot_offset_p (INTVAL (op), 5)")))
+
+(define_predicate "slot_6_offset_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_slot_offset_p (INTVAL (op), 6)")))
+
+(define_predicate "slot_7_offset_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_slot_offset_p (INTVAL (op), 7)")))
+
+(define_predicate "slot_8_offset_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_slot_offset_p (INTVAL (op), 8)")))
+
+(define_predicate "slot_9_offset_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_slot_offset_p (INTVAL (op), 9)")))
+
+(define_predicate "slot_10_offset_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_slot_offset_p (INTVAL (op), 10)")))
+
+(define_predicate "slot_11_offset_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_slot_offset_p (INTVAL (op), 11)")))
+
+(define_predicate "slot_12_offset_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_slot_offset_p (INTVAL (op), 12)")))
+
+(define_predicate "stack_push_up_to_ra_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 1)")))
+
+(define_predicate "stack_push_up_to_s0_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 2)")))
+
+(define_predicate "stack_push_up_to_s1_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 3)")))
+
+(define_predicate "stack_push_up_to_s2_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 4)")))
+
+(define_predicate "stack_push_up_to_s3_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 5)")))
+
+(define_predicate "stack_push_up_to_s4_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 6)")))
+
+(define_predicate "stack_push_up_to_s5_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 7)")))
+
+(define_predicate "stack_push_up_to_s6_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 8)")))
+
+(define_predicate "stack_push_up_to_s7_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 9)")))
+
+(define_predicate "stack_push_up_to_s8_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 10)")))
+
+(define_predicate "stack_push_up_to_s9_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 11)")))
+
+(define_predicate "stack_push_up_to_s11_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 13)")))
+
+(define_predicate "stack_pop_up_to_ra_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 1)")))
+
+(define_predicate "stack_pop_up_to_s0_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 2)")))
+
+(define_predicate "stack_pop_up_to_s1_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 3)")))
+
+(define_predicate "stack_pop_up_to_s2_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 4)")))
+
+(define_predicate "stack_pop_up_to_s3_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 5)")))
+
+(define_predicate "stack_pop_up_to_s4_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 6)")))
+
+(define_predicate "stack_pop_up_to_s5_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 7)")))
+
+(define_predicate "stack_pop_up_to_s6_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 8)")))
+
+(define_predicate "stack_pop_up_to_s7_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 9)")))
+
+(define_predicate "stack_pop_up_to_s8_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 10)")))
+
+(define_predicate "stack_pop_up_to_s9_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 11)")))
+
+(define_predicate "stack_pop_up_to_s11_operand"
+ (and (match_code "const_int")
+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 13)")))
+
 ;; Only use branch-on-bit sequences when the mask is not an ANDI immediate.
 (define_predicate "branch_on_bit_operand"
 (and (match_code "const_int")
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 7760a9cac8d..f0ea14f05be 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -56,6 +56,8 @@ extern bool riscv_split_64bit_move_p (rtx, rtx);
 extern void riscv_split_doubleword_move (rtx, rtx);
 extern const char *riscv_output_move (rtx, rtx);
 extern const char *riscv_output_return ();
+extern bool riscv_zcmp_valid_slot_offset_p (HOST_WIDE_INT, int);
+extern bool riscv_zcmp_valid_stack_adj_bytes_p(HOST_WIDE_INT, int);
 #ifdef RTX_CODE
 extern void riscv_expand_int_scc (rtx, enum rtx_code, rtx, rtx);
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 629e5e45cac..a0a2db1f594 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -117,6 +117,14 @@ struct GTY(()) riscv_frame_info {
 /* How much the GPR save/restore routines adjust sp (or 0 if unused). */
 unsigned save_libcall_adjustment;
+ /* the minimum number of bytes, in multiples of 16-byte address increments,
+ required to cover the registers in a multi push & pop. */
+ unsigned multi_push_adj_base;
+
+ /* the number of additional 16-byte address increments allocated for the stack frame
+ in a multi push & pop. */
+ unsigned multi_push_adj_addi;
+
 /* Offsets of fixed-point and floating-point save areas from frame bottom */
 poly_int64 gp_sp_offset;
 poly_int64 fp_sp_offset;
@@ -413,6 +421,21 @@ static const struct riscv_tune_info riscv_tune_info_table[] = {
 #include "riscv-cores.def"
 };
+typedef enum
+{
+ SI_IDX = 0,
+ DI_IDX,
+ ZCMP_MODE_NUM = DI_IDX
+} mode_idx;
+
+typedef enum
+{
+ PUSH_IDX = 0,
+ POP_IDX,
+ POPRET_IDX,
+ ZCMP_OP_NUM = POPRET_IDX
+} op_idx;
+
 void riscv_frame_info::reset(void)
 {
 total_size = 0;
@@ -4844,6 +4867,37 @@ riscv_save_reg_p (unsigned int regno)
 return false;
 }
+/* Return TRUE if Zcmp push and pop insns should be
+ avoided. FALSE otherwise.
+ Only use multi push & pop if all GPRs masked can be covered,
+ and stack access is SP based,
+ and GPRs are at top of the stack frame,
+ and no conflicts in stack allocation with other features */
+static bool
+riscv_avoid_multi_push(const struct riscv_frame_info *frame)
+{
+ if (!TARGET_ZCMP
+ || crtl->calls_eh_return
+ || frame_pointer_needed
+ || cfun->machine->interrupt_handler_p
+ || cfun->machine->varargs_size != 0
+ || crtl->args.pretend_args_size != 0
+ || (frame->mask & ~ MULTI_PUSH_GPR_MASK))
+ return true;
+
+ return false;
+}
+
+/* Determine whether to use multi push insn. */
+static bool
+riscv_use_multi_push(const struct riscv_frame_info *frame)
+{
+ if (riscv_avoid_multi_push (frame))
+ return false;
+
+ return (frame->multi_push_adj_base != 0);
+}
+
 /* Return TRUE if a libcall to save/restore GPRs should be
 avoided. FALSE otherwise. */
 static bool
@@ -4881,6 +4935,51 @@ riscv_save_libcall_count (unsigned mask)
 abort ();
 }
+/* calculate number of s regs in multi push and pop.
+ Note that {s0-s10} is not valid in Zcmp, use {s0-s11} instead. */
+static unsigned
+riscv_multi_push_sregs_count (unsigned mask)
+{
+ unsigned num = riscv_save_libcall_count (mask);
+ return (num == ZCMP_INVALID_S0S10_SREGS_COUNTS)
+ ? ZCMP_S0S11_SREGS_COUNTS
+ : num;
+}
+
+/* calculate number of regs(ra, s0-sx) in multi push and pop. */
+static unsigned
+riscv_multi_push_regs_count (unsigned mask)
+{
+ /* 1 is for ra */
+ return riscv_multi_push_sregs_count (mask) + 1;
+}
+
+/* Handle 16 bytes align for poly_int. */
+static poly_int64
+riscv_16bytes_align (poly_int64 value)
+{
+ return aligned_upper_bound (value, 16);
+}
+
+static HOST_WIDE_INT
+riscv_16bytes_align (HOST_WIDE_INT value)
+{
+ return ROUND_UP(value, 16);
+}
+
+/* Handle stack align for poly_int. */
+static poly_int64
+riscv_stack_align (poly_int64 value)
+{
+ return aligned_upper_bound (value, PREFERRED_STACK_BOUNDARY / 8);
+}
+
+static HOST_WIDE_INT
+riscv_stack_align (HOST_WIDE_INT value)
+{
+ return RISCV_STACK_ALIGN (value);
+}
+
 /* Populate the current function's riscv_frame_info structure.
 RISC-V stack frames grown downward. High addresses are at the top.
@@ -4906,7 +5005,7 @@ riscv_save_libcall_count (unsigned mask)
 | GPR save area | + UNITS_PER_WORD
 | |
 +-------------------------------+ <-- stack_pointer_rtx + fp_sp_offset
- | | + UNITS_PER_HWVALUE
+ | | + UNITS_PER_FP_REG
 | FPR save area |
 | |
 +-------------------------------+ <-- frame_pointer_rtx (virtual)
@@ -4925,19 +5024,6 @@ riscv_save_libcall_count (unsigned mask)
 static HOST_WIDE_INT riscv_first_stack_step (struct riscv_frame_info *frame, poly_int64 remaining_size);
-/* Handle stack align for poly_int. */
-static poly_int64
-riscv_stack_align (poly_int64 value)
-{
- return aligned_upper_bound (value, PREFERRED_STACK_BOUNDARY / 8);
-}
-
-static HOST_WIDE_INT
-riscv_stack_align (HOST_WIDE_INT value)
-{
- return RISCV_STACK_ALIGN (value);
-}
-
 static void
 riscv_compute_frame_info (void)
 {
@@ -4985,8 +5071,9 @@ riscv_compute_frame_info (void)
 if (frame->mask)
 {
 x_save_size = riscv_stack_align (num_x_saved * UNITS_PER_WORD);
- unsigned num_save_restore = 1 + riscv_save_libcall_count (frame->mask);
+ /* 1 is for ra */
+ unsigned num_save_restore = 1 + riscv_save_libcall_count (frame->mask);
 /* Only use save/restore routines if they don't alter the stack size. */
 if (riscv_stack_align (num_save_restore * UNITS_PER_WORD) == x_save_size
 && !riscv_avoid_save_libcall ())
@@ -4998,6 +5085,15 @@ riscv_compute_frame_info (void)
 frame->save_libcall_adjustment = x_save_size;
 }
+
+ if (!riscv_avoid_multi_push (frame))
+ {
+ /* num(ra, s0-sx) */
+ unsigned num_multi_push =
+ riscv_multi_push_regs_count (frame->mask);
+ x_save_size = riscv_stack_align (num_multi_push * UNITS_PER_WORD);
+ frame->multi_push_adj_base = riscv_16bytes_align (x_save_size);
+ }
 }
 /* At the bottom of the frame are any outgoing stack arguments. */
@@ -5012,7 +5108,15 @@ riscv_compute_frame_info (void)
 frame->fp_sp_offset = offset - UNITS_PER_FP_REG;
 /* Next are the callee-saved GPRs. */
 if (frame->mask)
- offset += x_save_size;
+ {
+ offset += x_save_size;
+ /* align to 16 bytes and add paddings to GPR part to honor
+ both stack alignment and zcmp push/pop size alignment. */
+ if (riscv_use_multi_push (frame)
+ && known_lt(offset,
+ frame->multi_push_adj_base + ZCMP_SP_INC_STEP * ZCMP_MAX_SPIMM))
+ offset = riscv_16bytes_align (offset);
+ }
 frame->gp_sp_offset = offset - UNITS_PER_WORD;
 /* The hard frame pointer points above the callee-saved GPRs. */
 frame->hard_frame_pointer_offset = offset;
@@ -5356,6 +5460,42 @@ riscv_adjust_libcall_cfi_prologue ()
 return dwarf;
 }
+static rtx
+riscv_adjust_multi_push_cfi_prologue (int saved_size)
+{
+ rtx dwarf = NULL_RTX;
+ rtx adjust_sp_rtx, reg, mem, insn;
+ unsigned int mask = cfun->machine->frame.mask;
+ int offset;
+ int saved_cnt = 0;
+
+ if (mask & S10_MASK)
+ mask |= S11_MASK;
+
+ for (int regno = GP_REG_LAST; regno >= GP_REG_FIRST; regno--)
+ if (BITSET_P (mask & MULTI_PUSH_GPR_MASK, regno - GP_REG_FIRST))
+ {
+ /* The save order is s11-s0, ra
+ from high to low addr. */
+ offset = saved_size - UNITS_PER_WORD * (++saved_cnt);
+
+ reg = gen_rtx_REG (SImode, regno);
+ mem = gen_frame_mem (SImode, plus_constant (Pmode,
+ stack_pointer_rtx,
+ offset));
+
+ insn = gen_rtx_SET (mem, reg);
+ dwarf = alloc_reg_note (REG_CFA_OFFSET, insn, dwarf);
+ }
+
+ /* Debug info for adjust sp. */
+ adjust_sp_rtx = gen_rtx_SET (stack_pointer_rtx,
+ plus_constant(Pmode, stack_pointer_rtx, -saved_size));
+ dwarf = alloc_reg_note (REG_CFA_ADJUST_CFA, adjust_sp_rtx,
+ dwarf);
+ return dwarf;
+}
+
 static void
 riscv_emit_stack_tie (void)
 {
@@ -5365,6 +5505,152 @@ riscv_emit_stack_tie (void)
 emit_insn (gen_stack_tiedi (stack_pointer_rtx, hard_frame_pointer_rtx));
 }
+static rtx
+get_slot_offset_rtx (int slot_idx)
+{
+ HOST_WIDE_INT slot_offset = -1 * (slot_idx + 1) * GET_MODE_SIZE (word_mode);
+ return GEN_INT (slot_offset);
+}
+
+/*zcmp multi push and pop function ptr array */
+const insn_gen_fn gen_push_pop [ZCMP_OP_NUM + 1][ZCMP_MODE_NUM + 1][ZCMP_MAX_GRP_SLOTS] =
+{{{(insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_ra_si,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s0_si,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s1_si,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s2_si,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s3_si,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s4_si,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s5_si,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s6_si,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s7_si,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s8_si,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s9_si,
+ NULL,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s11_si},
+ {(insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_ra_di,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s0_di,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s1_di,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s2_di,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s3_di,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s4_di,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s5_di,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s6_di,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s7_di,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s8_di,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s9_di,
+ NULL,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s11_di}},
+ {{(insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_ra_si,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s0_si,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s1_si,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s2_si,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s3_si,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s4_si,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s5_si,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s6_si,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s7_si,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s8_si,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s9_si,
+ NULL,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s11_si},
+ {(insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_ra_di,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s0_di,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s1_di,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s2_di,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s3_di,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s4_di,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s5_di,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s6_di,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s7_di,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s8_di,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s9_di,
+ NULL,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s11_di}},
+ {{(insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_ra_si,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s0_si,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s1_si,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s2_si,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s3_si,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s4_si,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s5_si,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s6_si,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s7_si,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s8_si,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s9_si,
+ NULL,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s11_si},
+ {(insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_ra_di,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s0_di,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s1_di,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s2_di,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s3_di,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s4_di,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s5_di,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s6_di,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s7_di,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s8_di,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s9_di,
+ NULL,
+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s11_di}}};
+
+static rtx
+riscv_gen_multi_push_pop_insn (op_idx op, HOST_WIDE_INT adj_size, unsigned int regs_num)
+{
+ rtx stack_adj = GEN_INT (adj_size);
+ rtx slots[ZCMP_MAX_GRP_SLOTS];
+
+ for (int slot_idx = 0; slot_idx < ZCMP_MAX_GRP_SLOTS; slot_idx++)
+ slots[slot_idx] = get_slot_offset_rtx (slot_idx);
+
+ switch (regs_num)
+ {
+ case 1:
+ return (gen_push_pop[op][TARGET_64BIT][regs_num - 1])
+ (stack_adj, slots[0]);
+ case 2:
+ return (gen_push_pop[op][TARGET_64BIT][regs_num - 1])
+ (stack_adj, slots[0], slots[1]);
+ case 3:
+ return (gen_push_pop[op][TARGET_64BIT][regs_num - 1])
+ (stack_adj, slots[0], slots[1], slots[2]);
+ case 4:
+ return (gen_push_pop[op][TARGET_64BIT][regs_num - 1])
+ (stack_adj, slots[0], slots[1], slots[2], slots[3]);
+ case 5:
+ return (gen_push_pop[op][TARGET_64BIT][regs_num - 1])
+ (stack_adj, slots[0], slots[1], slots[2], slots[3], slots[4]);
+ case 6:
+ return (gen_push_pop[op][TARGET_64BIT][regs_num - 1])
+ (stack_adj, slots[0], slots[1], slots[2], slots[3], slots[4], slots[5]);
+ case 7:
+ return (gen_push_pop[op][TARGET_64BIT][regs_num - 1])
+ (stack_adj, slots[0], slots[1], slots[2], slots[3], slots[4], slots[5],
+ slots[6]);
+ case 8:
+ return (gen_push_pop[op][TARGET_64BIT][regs_num - 1])
+ (stack_adj, slots[0], slots[1], slots[2], slots[3], slots[4], slots[5],
+ slots[6], slots[7]);
+ case 9:
+ return (gen_push_pop[op][TARGET_64BIT][regs_num - 1])
+ (stack_adj, slots[0], slots[1], slots[2], slots[3], slots[4], slots[5],
+ slots[6], slots[7], slots[8]);
+ case 10:
+ return (gen_push_pop[op][TARGET_64BIT][regs_num - 1])
+ (stack_adj, slots[0], slots[1], slots[2], slots[3], slots[4], slots[5],
+ slots[6], slots[7], slots[8], slots[9]);
+ case 11:
+ return (gen_push_pop[op][TARGET_64BIT][regs_num - 1])
+ (stack_adj, slots[0], slots[1], slots[2], slots[3], slots[4], slots[5],
+ slots[6], slots[7], slots[8], slots[9], slots[10]);
+ case 13:
+ return (gen_push_pop[op][TARGET_64BIT][regs_num - 1])
+ (stack_adj, slots[0], slots[1], slots[2], slots[3], slots[4], slots[5],
+ slots[6], slots[7], slots[8], slots[9], slots[10], slots[11], slots[12]);
+ default:
+ gcc_unreachable ();
+ }
+}
+
 /* Expand the "prologue" pattern. */
 void
@@ -5373,7 +5659,8 @@ riscv_expand_prologue (void)
 struct riscv_frame_info *frame = &cfun->machine->frame;
 poly_int64 remaining_size = frame->total_size;
 unsigned mask = frame->mask;
- rtx insn;
+ int spimm, multi_push_additional, stack_adj;
+ rtx insn, dwarf = NULL_RTX;
 if (flag_stack_usage_info)
 current_function_static_stack_size = constant_lower_bound (remaining_size);
@@ -5381,8 +5668,35 @@ riscv_expand_prologue (void)
 if (cfun->machine->naked_p)
 return;
+ /* prefer multi-push to save-restore libcall. */
+ if (riscv_use_multi_push(frame))
+ {
+ remaining_size -= frame->multi_push_adj_base;
+ if (known_gt(remaining_size, 2 * ZCMP_SP_INC_STEP))
+ spimm = 3;
+ else if (known_gt(remaining_size, ZCMP_SP_INC_STEP))
+ spimm = 2;
+ else if (known_gt(remaining_size, 0))
+ spimm = 1;
+ else
+ spimm = 0;
+ multi_push_additional = spimm * ZCMP_SP_INC_STEP;
+ frame->multi_push_adj_addi = multi_push_additional;
+ remaining_size -= multi_push_additional;
+
+ /* emit multi push insn & dwarf along with it. */
+ stack_adj = frame->multi_push_adj_base + multi_push_additional;
+ insn = emit_insn (riscv_gen_multi_push_pop_insn(PUSH_IDX,
+ -stack_adj, riscv_multi_push_regs_count(frame->mask)));
+ dwarf = riscv_adjust_multi_push_cfi_prologue (stack_adj);
+ RTX_FRAME_RELATED_P (insn) = 1;
+ REG_NOTES (insn) = dwarf;
+
+ /* Temporarily fib that we need not save GPRs. */
+ frame->mask = 0; 
+ }
 /* When optimizing for size, call a subroutine to save the registers. */
- if (riscv_use_save_libcall (frame))
+ else if (riscv_use_save_libcall (frame))
 {
 rtx dwarf = NULL_RTX;
 dwarf = riscv_adjust_libcall_cfi_prologue ();
@@ -5398,13 +5712,15 @@ riscv_expand_prologue (void)
 /* Save the registers. */
 if ((frame->mask | frame->fmask) != 0)
 {
- HOST_WIDE_INT step1 = riscv_first_stack_step (frame, remaining_size);
-
- insn = gen_add3_insn (stack_pointer_rtx,
- stack_pointer_rtx,
- GEN_INT (-step1));
- RTX_FRAME_RELATED_P (emit_insn (insn)) = 1;
- remaining_size -= step1;
+ if (known_gt (remaining_size, frame->frame_pointer_offset))
+ {
+ HOST_WIDE_INT step1 = riscv_first_stack_step (frame, remaining_size);
+ remaining_size -= step1;
+ insn = gen_add3_insn (stack_pointer_rtx,
+ stack_pointer_rtx,
+ GEN_INT (-step1));
+ RTX_FRAME_RELATED_P (emit_insn (insn)) = 1;
+ }
 riscv_for_each_saved_reg (remaining_size, riscv_save_reg, false, false);
 }
@@ -5461,6 +5777,32 @@ riscv_expand_prologue (void)
 }
 }
+static rtx
+riscv_adjust_multi_pop_cfi_epilogue (int saved_size)
+{
+ rtx dwarf = NULL_RTX;
+ rtx adjust_sp_rtx, reg;
+ unsigned int mask = cfun->machine->frame.mask;
+
+ if (mask & S10_MASK)
+ mask |= S11_MASK;
+
+ /* Debug info for adjust sp. */
+ adjust_sp_rtx = gen_rtx_SET (stack_pointer_rtx,
+ plus_constant(Pmode, stack_pointer_rtx, saved_size));
+ dwarf = alloc_reg_note (REG_CFA_ADJUST_CFA, adjust_sp_rtx,
+ dwarf);
+
+ for (int regno = GP_REG_FIRST; regno <= GP_REG_LAST; regno++)
+ if (BITSET_P (mask, regno - GP_REG_FIRST))
+ {
+ reg = gen_rtx_REG (SImode, regno);
+ dwarf = alloc_reg_note (REG_CFA_RESTORE, reg, dwarf);
+ }
+
+ return dwarf;
+}
+
 static rtx
 riscv_adjust_libcall_cfi_epilogue ()
 {
@@ -5500,10 +5842,18 @@ riscv_expand_epilogue (int style)
 struct riscv_frame_info *frame = &cfun->machine->frame;
 unsigned mask = frame->mask;
 HOST_WIDE_INT step2 = 0;
- bool use_restore_libcall = ((style == NORMAL_RETURN)
- && riscv_use_save_libcall (frame));
- unsigned libcall_size = (use_restore_libcall
- ? frame->save_libcall_adjustment : 0);
+ bool use_multi_pop_normal = ((style == NORMAL_RETURN)
+ && riscv_use_multi_push (frame));
+ bool use_multi_pop_sibcall = ((style == SIBCALL_RETURN)
+ && riscv_use_multi_push (frame));
+ bool use_multi_pop = use_multi_pop_normal || use_multi_pop_sibcall;
+
+ bool use_restore_libcall = !use_multi_pop && ((style == NORMAL_RETURN)
+ && riscv_use_save_libcall (frame));
+ unsigned libcall_size = use_restore_libcall && !use_multi_pop ?
+ frame->save_libcall_adjustment : 0;
+ unsigned multipop_size = use_multi_pop ?
+ frame->multi_push_adj_base + frame->multi_push_adj_addi : 0;
 rtx ra = gen_rtx_REG (Pmode, RETURN_ADDR_REGNUM);
 rtx insn;
@@ -5574,18 +5924,25 @@ riscv_expand_epilogue (int style)
 REG_NOTES (insn) = dwarf;
 }
- if (use_restore_libcall)
- frame->mask = 0; /* Temporarily fib for GPRs. */
+ if (use_restore_libcall || use_multi_pop)
+ frame->mask = 0; /* Temporarily fib that we need not save GPRs. */
 /* If we need to restore registers, deallocate as much stack as
 possible in the second step without going out of range. */
- if ((frame->mask | frame->fmask) != 0)
+ if (use_multi_pop)
+ {
+ if (frame->fmask
+ && known_gt (frame->total_size - multipop_size,
+ frame->frame_pointer_offset))
+ step2 = riscv_first_stack_step (frame, frame->total_size - multipop_size);
+ }
+ else if ((frame->mask | frame->fmask) != 0)
 step2 = riscv_first_stack_step (frame, frame->total_size - libcall_size);
- if (use_restore_libcall)
+ if (use_restore_libcall || use_multi_pop)
 frame->mask = mask; /* Undo the above fib. */
- poly_int64 step1 = frame->total_size - step2 - libcall_size;
+ poly_int64 step1 = frame->total_size - step2 - libcall_size - multipop_size;
 /* Set TARGET to BASE + STEP1. */
 if (known_gt (step1, 0))
@@ -5620,7 +5977,7 @@ riscv_expand_epilogue (int style)
 adjust));
 rtx dwarf = NULL_RTX;
 rtx cfa_adjust_rtx = gen_rtx_PLUS (Pmode, stack_pointer_rtx,
- GEN_INT (step2));
+ GEN_INT (step2 + libcall_size + multipop_size));
 dwarf = alloc_reg_note (REG_CFA_DEF_CFA, cfa_adjust_rtx, dwarf);
 RTX_FRAME_RELATED_P (insn) = 1;
@@ -5635,15 +5992,15 @@ riscv_expand_epilogue (int style)
 epilogue_cfa_sp_offset = step2;
 }
- if (use_restore_libcall)
+ if (use_restore_libcall || use_multi_pop)
 frame->mask = 0; /* Temporarily fib that we need not save GPRs. */
 /* Restore the registers. */
- riscv_for_each_saved_reg (frame->total_size - step2 - libcall_size,
+ riscv_for_each_saved_reg (frame->total_size - step2 - libcall_size - multipop_size,
 riscv_restore_reg,
 true, style == EXCEPTION_RETURN);
- if (use_restore_libcall)
+ if (use_restore_libcall || use_multi_pop)
 frame->mask = mask; /* Undo the above fib. */
 if (need_barrier_p)
@@ -5657,14 +6014,30 @@ riscv_expand_epilogue (int style)
 rtx dwarf = NULL_RTX;
 rtx cfa_adjust_rtx = gen_rtx_PLUS (Pmode, stack_pointer_rtx,
- const0_rtx);
+ GEN_INT (libcall_size + multipop_size));
 dwarf = alloc_reg_note (REG_CFA_DEF_CFA, cfa_adjust_rtx, dwarf);
 RTX_FRAME_RELATED_P (insn) = 1;
 REG_NOTES (insn) = dwarf;
 }
- if (use_restore_libcall)
+ if (use_multi_pop)
+ {
+ unsigned regs_count = riscv_multi_push_regs_count (frame->mask);
+ if (use_multi_pop_normal)
+ insn = emit_jump_insn (
+ riscv_gen_multi_push_pop_insn (POPRET_IDX, multipop_size, regs_count));
+ else
+ insn = emit_insn (
+ riscv_gen_multi_push_pop_insn (POP_IDX, multipop_size, regs_count));
+
+ rtx dwarf = riscv_adjust_multi_pop_cfi_epilogue (multipop_size);
+ RTX_FRAME_RELATED_P (insn) = 1;
+ REG_NOTES (insn) = dwarf;
+ if (use_multi_pop_normal)
+ return;
+ }
+ else if (use_restore_libcall)
 {
 rtx dwarf = riscv_adjust_libcall_cfi_epilogue ();
 insn = emit_insn (gen_gpr_restore (GEN_INT (riscv_save_libcall_count (mask))));
@@ -6937,6 +7310,30 @@ riscv_gen_gpr_save_insn (struct riscv_frame_info *frame)
 return gen_rtx_PARALLEL (VOIDmode, vec);
 }
+static HOST_WIDE_INT zcmp_base_adj (int regs_num)
+{
+ return riscv_16bytes_align ((regs_num) * GET_MODE_SIZE (word_mode));
+}
+
+static HOST_WIDE_INT zcmp_additional_adj (HOST_WIDE_INT total, int regs_num)
+{
+ return total - zcmp_base_adj (regs_num);
+}
+
+bool riscv_zcmp_valid_slot_offset_p (HOST_WIDE_INT offset, int slot_idx)
+{
+ return offset == -1 * (slot_idx + 1) * GET_MODE_SIZE (word_mode);
+}
+
+bool riscv_zcmp_valid_stack_adj_bytes_p (HOST_WIDE_INT total, int regs_num)
+{
+ HOST_WIDE_INT additional_bytes = zcmp_additional_adj (total, regs_num);
+ return additional_bytes == 0
+ || additional_bytes == 1 * ZCMP_SP_INC_STEP
+ || additional_bytes == 2 * ZCMP_SP_INC_STEP
+ || additional_bytes == ZCMP_MAX_SPIMM * ZCMP_SP_INC_STEP;
+}
+
 /* Return true if it's valid gpr_save pattern. */
 bool
diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
index 13038a39e5c..ff210083004 100644
--- a/gcc/config/riscv/riscv.h
+++ b/gcc/config/riscv/riscv.h
@@ -413,6 +413,29 @@ ASM_MISA_SPEC
 #define RISCV_CALL_ADDRESS_TEMP(MODE) \
 gen_rtx_REG (MODE, RISCV_CALL_ADDRESS_TEMP_REGNUM)
+#define RETURN_ADDR_MASK ( 1 << RETURN_ADDR_REGNUM)
+#define S0_MASK ( 1 << S0_REGNUM)
+#define S1_MASK ( 1 << S1_REGNUM)
+#define S2_MASK ( 1 << S2_REGNUM)
+#define S3_MASK ( 1 << S3_REGNUM)
+#define S4_MASK ( 1 << S4_REGNUM)
+#define S5_MASK ( 1 << S5_REGNUM)
+#define S6_MASK ( 1 << S6_REGNUM)
+#define S7_MASK ( 1 << S7_REGNUM)
+#define S8_MASK ( 1 << S8_REGNUM)
+#define S9_MASK ( 1 << S9_REGNUM)
+#define S10_MASK ( 1 << S10_REGNUM)
+#define S11_MASK ( 1 << S11_REGNUM)
+
+#define MULTI_PUSH_GPR_MASK ( RETURN_ADDR_MASK | S0_MASK | S1_MASK | S2_MASK | S3_MASK \
+ | S4_MASK | S5_MASK | S6_MASK | S7_MASK \
+ | S8_MASK | S9_MASK | S10_MASK | S11_MASK )
+#define ZCMP_MAX_SPIMM 3
+#define ZCMP_SP_INC_STEP 16
+#define ZCMP_INVALID_S0S10_SREGS_COUNTS 11
+#define ZCMP_S0S11_SREGS_COUNTS 12
+#define ZCMP_MAX_GRP_SLOTS 13
+
 #define MCOUNT_NAME "_mcount"
 #define NO_PROFILE_COUNTERS 1
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 7065e68c0b7..73fc8cb69bc 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -113,6 +113,7 @@
 (define_constants
 [(RETURN_ADDR_REGNUM 1)
+ (SP_REGNUM 2)
 (GP_REGNUM 3)
 (TP_REGNUM 4)
 (T0_REGNUM 5)
@@ -3205,3 +3206,4 @@
 (include "sifive-7.md")
 (include "thead.md")
 (include "vector.md")
+(include "zc.md")
diff --git a/gcc/config/riscv/zc.md b/gcc/config/riscv/zc.md
new file mode 100644
index 00000000000..6e6c87983fb
--- /dev/null
+++ b/gcc/config/riscv/zc.md
@@ -0,0 +1,1042 @@
+;; Machine description for RISC-V Zc extension.
+;; Copyright (C) 2011-2023 Free Software Foundation, Inc.
+;; Contributed by Fei Gao (gaofei@eswincomputing.com).
+
+;; This file is part of GCC.
+
+;; GCC is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation; either version 3, or (at your option)
+;; any later version.
+
+;; GCC is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3. If not see
+;; <http://www.gnu.org/licenses/>.
+
+(define_insn "gpr_multi_pop_up_to_ra_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_ra_operand" "I")))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 1 "slot_0_offset_operand" "I"))))]
+ "TARGET_ZCMP"
+ "cm.pop {ra}, %0"
+)
+
+(define_insn "gpr_multi_pop_up_to_s0_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_s0_operand" "I")))
+ (set (reg:X S0_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 1 "slot_0_offset_operand" "I"))))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 2 "slot_1_offset_operand" "I"))))]
+ "TARGET_ZCMP"
+ "cm.pop {ra, s0}, %0"
+)
+
+(define_insn "gpr_multi_pop_up_to_s1_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_s1_operand" "I")))
+ (set (reg:X S1_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 1 "slot_0_offset_operand" "I"))))
+ (set (reg:X S0_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 2 "slot_1_offset_operand" "I"))))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 3 "slot_2_offset_operand" "I"))))]
+ "TARGET_ZCMP"
+ "cm.pop {ra, s0-s1}, %0"
+)
+
+(define_insn "gpr_multi_pop_up_to_s2_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_s2_operand" "I")))
+ (set (reg:X S2_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 1 "slot_0_offset_operand" "I"))))
+ (set (reg:X S1_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 2 "slot_1_offset_operand" "I"))))
+ (set (reg:X S0_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 3 "slot_2_offset_operand" "I"))))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 4 "slot_3_offset_operand" "I"))))]
+ "TARGET_ZCMP"
+ "cm.pop {ra, s0-s2}, %0"
+)
+
+(define_insn "gpr_multi_pop_up_to_s3_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_s3_operand" "I")))
+ (set (reg:X S3_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 1 "slot_0_offset_operand" "I"))))
+ (set (reg:X S2_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 2 "slot_1_offset_operand" "I"))))
+ (set (reg:X S1_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 3 "slot_2_offset_operand" "I"))))
+ (set (reg:X S0_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 4 "slot_3_offset_operand" "I"))))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 5 "slot_4_offset_operand" "I"))))]
+ "TARGET_ZCMP"
+ "cm.pop {ra, s0-s3}, %0"
+)
+
+(define_insn "gpr_multi_pop_up_to_s4_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_s4_operand" "I")))
+ (set (reg:X S4_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 1 "slot_0_offset_operand" "I"))))
+ (set (reg:X S3_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 2 "slot_1_offset_operand" "I"))))
+ (set (reg:X S2_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 3 "slot_2_offset_operand" "I"))))
+ (set (reg:X S1_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 4 "slot_3_offset_operand" "I"))))
+ (set (reg:X S0_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 5 "slot_4_offset_operand" "I"))))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 6 "slot_5_offset_operand" "I"))))]
+ "TARGET_ZCMP"
+ "cm.pop {ra, s0-s4}, %0"
+)
+
+(define_insn "gpr_multi_pop_up_to_s5_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_s5_operand" "I")))
+ (set (reg:X S5_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 1 "slot_0_offset_operand" "I"))))
+ (set (reg:X S4_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 2 "slot_1_offset_operand" "I"))))
+ (set (reg:X S3_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 3 "slot_2_offset_operand" "I"))))
+ (set (reg:X S2_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 4 "slot_3_offset_operand" "I"))))
+ (set (reg:X S1_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 5 "slot_4_offset_operand" "I"))))
+ (set (reg:X S0_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 6 "slot_5_offset_operand" "I"))))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 7 "slot_6_offset_operand" "I"))))]
+ "TARGET_ZCMP"
+ "cm.pop {ra, s0-s5}, %0"
+)
+
+(define_insn "gpr_multi_pop_up_to_s6_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_s6_operand" "I")))
+ (set (reg:X S6_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 1 "slot_0_offset_operand" "I"))))
+ (set (reg:X S5_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 2 "slot_1_offset_operand" "I"))))
+ (set (reg:X S4_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 3 "slot_2_offset_operand" "I"))))
+ (set (reg:X S3_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 4 "slot_3_offset_operand" "I"))))
+ (set (reg:X S2_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 5 "slot_4_offset_operand" "I"))))
+ (set (reg:X S1_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 6 "slot_5_offset_operand" "I"))))
+ (set (reg:X S0_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 7 "slot_6_offset_operand" "I"))))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 8 "slot_7_offset_operand" "I"))))]
+ "TARGET_ZCMP"
+ "cm.pop {ra, s0-s6}, %0"
+)
+
+(define_insn "gpr_multi_pop_up_to_s7_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_s7_operand" "I")))
+ (set (reg:X S7_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 1 "slot_0_offset_operand" "I"))))
+ (set (reg:X S6_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 2 "slot_1_offset_operand" "I"))))
+ (set (reg:X S5_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 3 "slot_2_offset_operand" "I"))))
+ (set (reg:X S4_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 4 "slot_3_offset_operand" "I"))))
+ (set (reg:X S3_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 5 "slot_4_offset_operand" "I"))))
+ (set (reg:X S2_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 6 "slot_5_offset_operand" "I"))))
+ (set (reg:X S1_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 7 "slot_6_offset_operand" "I"))))
+ (set (reg:X S0_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 8 "slot_7_offset_operand" "I"))))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 9 "slot_8_offset_operand" "I"))))]
+ "TARGET_ZCMP"
+ "cm.pop {ra, s0-s7}, %0"
+)
+
+(define_insn "gpr_multi_pop_up_to_s8_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_s8_operand" "I")))
+ (set (reg:X S8_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 1 "slot_0_offset_operand" "I"))))
+ (set (reg:X S7_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 2 "slot_1_offset_operand" "I"))))
+ (set (reg:X S6_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 3 "slot_2_offset_operand" "I"))))
+ (set (reg:X S5_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 4 "slot_3_offset_operand" "I"))))
+ (set (reg:X S4_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 5 "slot_4_offset_operand" "I"))))
+ (set (reg:X S3_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 6 "slot_5_offset_operand" "I"))))
+ (set (reg:X S2_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 7 "slot_6_offset_operand" "I"))))
+ (set (reg:X S1_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 8 "slot_7_offset_operand" "I"))))
+ (set (reg:X S0_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 9 "slot_8_offset_operand" "I"))))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 10 "slot_9_offset_operand" "I"))))]
+ "TARGET_ZCMP"
+ "cm.pop {ra, s0-s8}, %0"
+)
+
+(define_insn "gpr_multi_pop_up_to_s9_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_s9_operand" "I")))
+ (set (reg:X S9_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 1 "slot_0_offset_operand" "I"))))
+ (set (reg:X S8_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 2 "slot_1_offset_operand" "I"))))
+ (set (reg:X S7_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 3 "slot_2_offset_operand" "I"))))
+ (set (reg:X S6_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 4 "slot_3_offset_operand" "I"))))
+ (set (reg:X S5_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 5 "slot_4_offset_operand" "I"))))
+ (set (reg:X S4_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 6 "slot_5_offset_operand" "I"))))
+ (set (reg:X S3_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 7 "slot_6_offset_operand" "I"))))
+ (set (reg:X S2_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 8 "slot_7_offset_operand" "I"))))
+ (set (reg:X S1_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 9 "slot_8_offset_operand" "I"))))
+ (set (reg:X S0_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 10 "slot_9_offset_operand" "I"))))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 11 "slot_10_offset_operand" "I"))))]
+ "TARGET_ZCMP"
+ "cm.pop {ra, s0-s9}, %0"
+)
+
+(define_insn "gpr_multi_pop_up_to_s11_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_s11_operand" "I")))
+ (set (reg:X S11_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 1 "slot_0_offset_operand" "I"))))
+ (set (reg:X S10_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 2 "slot_1_offset_operand" "I"))))
+ (set (reg:X S9_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 3 "slot_2_offset_operand" "I"))))
+ (set (reg:X S8_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 4 "slot_3_offset_operand" "I"))))
+ (set (reg:X S7_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 5 "slot_4_offset_operand" "I"))))
+ (set (reg:X S6_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 6 "slot_5_offset_operand" "I"))))
+ (set (reg:X S5_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 7 "slot_6_offset_operand" "I"))))
+ (set (reg:X S4_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 8 "slot_7_offset_operand" "I"))))
+ (set (reg:X S3_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 9 "slot_8_offset_operand" "I"))))
+ (set (reg:X S2_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 10 "slot_9_offset_operand" "I"))))
+ (set (reg:X S1_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 11 "slot_10_offset_operand" "I"))))
+ (set (reg:X S0_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 12 "slot_11_offset_operand" "I"))))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 13 "slot_12_offset_operand" "I"))))]
+ "TARGET_ZCMP"
+ "cm.pop {ra, s0-s11}, %0"
+)
+
+(define_insn "gpr_multi_popret_up_to_ra_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_ra_operand" "I")))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 1 "slot_0_offset_operand" "I"))))
+ (return)
+ (use (reg:SI RETURN_ADDR_REGNUM))]
+ "TARGET_ZCMP"
+ "cm.popret {ra}, %0"
+)
+
+(define_insn "gpr_multi_popret_up_to_s0_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_s0_operand" "I")))
+ (set (reg:X S0_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 1 "slot_0_offset_operand" "I"))))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 2 "slot_1_offset_operand" "I"))))
+ (return)
+ (use (reg:SI RETURN_ADDR_REGNUM))]
+ "TARGET_ZCMP"
+ "cm.popret {ra, s0}, %0"
+)
+
+(define_insn "gpr_multi_popret_up_to_s1_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_s1_operand" "I")))
+ (set (reg:X S1_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 1 "slot_0_offset_operand" "I"))))
+ (set (reg:X S0_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 2 "slot_1_offset_operand" "I"))))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 3 "slot_2_offset_operand" "I"))))
+ (return)
+ (use (reg:SI RETURN_ADDR_REGNUM))]
+ "TARGET_ZCMP"
+ "cm.popret {ra, s0-s1}, %0"
+)
+
+(define_insn "gpr_multi_popret_up_to_s2_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_s2_operand" "I")))
+ (set (reg:X S2_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 1 "slot_0_offset_operand" "I"))))
+ (set (reg:X S1_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 2 "slot_1_offset_operand" "I"))))
+ (set (reg:X S0_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 3 "slot_2_offset_operand" "I"))))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 4 "slot_3_offset_operand" "I"))))
+ (return)
+ (use (reg:SI RETURN_ADDR_REGNUM))]
+ "TARGET_ZCMP"
+ "cm.popret {ra, s0-s2}, %0"
+)
+
+(define_insn "gpr_multi_popret_up_to_s3_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_s3_operand" "I")))
+ (set (reg:X S3_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 1 "slot_0_offset_operand" "I"))))
+ (set (reg:X S2_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 2 "slot_1_offset_operand" "I"))))
+ (set (reg:X S1_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 3 "slot_2_offset_operand" "I"))))
+ (set (reg:X S0_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 4 "slot_3_offset_operand" "I"))))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 5 "slot_4_offset_operand" "I"))))
+ (return)
+ (use (reg:SI RETURN_ADDR_REGNUM))]
+ "TARGET_ZCMP"
+ "cm.popret {ra, s0-s3}, %0"
+)
+
+(define_insn "gpr_multi_popret_up_to_s4_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_s4_operand" "I")))
+ (set (reg:X S4_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 1 "slot_0_offset_operand" "I"))))
+ (set (reg:X S3_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 2 "slot_1_offset_operand" "I"))))
+ (set (reg:X S2_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 3 "slot_2_offset_operand" "I"))))
+ (set (reg:X S1_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 4 "slot_3_offset_operand" "I"))))
+ (set (reg:X S0_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 5 "slot_4_offset_operand" "I"))))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 6 "slot_5_offset_operand" "I"))))
+ (return)
+ (use (reg:SI RETURN_ADDR_REGNUM))]
+ "TARGET_ZCMP"
+ "cm.popret {ra, s0-s4}, %0"
+)
+
+(define_insn "gpr_multi_popret_up_to_s5_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_s5_operand" "I")))
+ (set (reg:X S5_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 1 "slot_0_offset_operand" "I"))))
+ (set (reg:X S4_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 2 "slot_1_offset_operand" "I"))))
+ (set (reg:X S3_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 3 "slot_2_offset_operand" "I"))))
+ (set (reg:X S2_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 4 "slot_3_offset_operand" "I"))))
+ (set (reg:X S1_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 5 "slot_4_offset_operand" "I"))))
+ (set (reg:X S0_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 6 "slot_5_offset_operand" "I"))))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 7 "slot_6_offset_operand" "I"))))
+ (return)
+ (use (reg:SI RETURN_ADDR_REGNUM))]
+ "TARGET_ZCMP"
+ "cm.popret {ra, s0-s5}, %0"
+)
+
+(define_insn "gpr_multi_popret_up_to_s6_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_s6_operand" "I")))
+ (set (reg:X S6_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 1 "slot_0_offset_operand" "I"))))
+ (set (reg:X S5_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 2 "slot_1_offset_operand" "I"))))
+ (set (reg:X S4_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 3 "slot_2_offset_operand" "I"))))
+ (set (reg:X S3_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 4 "slot_3_offset_operand" "I"))))
+ (set (reg:X S2_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 5 "slot_4_offset_operand" "I"))))
+ (set (reg:X S1_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 6 "slot_5_offset_operand" "I"))))
+ (set (reg:X S0_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 7 "slot_6_offset_operand" "I"))))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 8 "slot_7_offset_operand" "I"))))
+ (return)
+ (use (reg:SI RETURN_ADDR_REGNUM))]
+ "TARGET_ZCMP"
+ "cm.popret {ra, s0-s6}, %0"
+)
+
+(define_insn "gpr_multi_popret_up_to_s7_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_s7_operand" "I")))
+ (set (reg:X S7_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 1 "slot_0_offset_operand" "I"))))
+ (set (reg:X S6_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 2 "slot_1_offset_operand" "I"))))
+ (set (reg:X S5_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 3 "slot_2_offset_operand" "I"))))
+ (set (reg:X S4_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 4 "slot_3_offset_operand" "I"))))
+ (set (reg:X S3_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 5 "slot_4_offset_operand" "I"))))
+ (set (reg:X S2_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 6 "slot_5_offset_operand" "I"))))
+ (set (reg:X S1_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 7 "slot_6_offset_operand" "I"))))
+ (set (reg:X S0_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 8 "slot_7_offset_operand" "I"))))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 9 "slot_8_offset_operand" "I"))))
+ (return)
+ (use (reg:SI RETURN_ADDR_REGNUM))]
+ "TARGET_ZCMP"
+ "cm.popret {ra, s0-s7}, %0"
+)
+
+(define_insn "gpr_multi_popret_up_to_s8_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_s8_operand" "I")))
+ (set (reg:X S8_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 1 "slot_0_offset_operand" "I"))))
+ (set (reg:X S7_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 2 "slot_1_offset_operand" "I"))))
+ (set (reg:X S6_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 3 "slot_2_offset_operand" "I"))))
+ (set (reg:X S5_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 4 "slot_3_offset_operand" "I"))))
+ (set (reg:X S4_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 5 "slot_4_offset_operand" "I"))))
+ (set (reg:X S3_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 6 "slot_5_offset_operand" "I"))))
+ (set (reg:X S2_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 7 "slot_6_offset_operand" "I"))))
+ (set (reg:X S1_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 8 "slot_7_offset_operand" "I"))))
+ (set (reg:X S0_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 9 "slot_8_offset_operand" "I"))))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 10 "slot_9_offset_operand" "I"))))
+ (return)
+ (use (reg:SI RETURN_ADDR_REGNUM))]
+ "TARGET_ZCMP"
+ "cm.popret {ra, s0-s8}, %0"
+)
+
+(define_insn "gpr_multi_popret_up_to_s9_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_s9_operand" "I")))
+ (set (reg:X S9_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 1 "slot_0_offset_operand" "I"))))
+ (set (reg:X S8_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 2 "slot_1_offset_operand" "I"))))
+ (set (reg:X S7_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 3 "slot_2_offset_operand" "I"))))
+ (set (reg:X S6_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 4 "slot_3_offset_operand" "I"))))
+ (set (reg:X S5_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 5 "slot_4_offset_operand" "I"))))
+ (set (reg:X S4_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 6 "slot_5_offset_operand" "I"))))
+ (set (reg:X S3_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 7 "slot_6_offset_operand" "I"))))
+ (set (reg:X S2_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 8 "slot_7_offset_operand" "I"))))
+ (set (reg:X S1_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 9 "slot_8_offset_operand" "I"))))
+ (set (reg:X S0_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 10 "slot_9_offset_operand" "I"))))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 11 "slot_10_offset_operand" "I"))))
+ (return)
+ (use (reg:SI RETURN_ADDR_REGNUM))]
+ "TARGET_ZCMP"
+ "cm.popret {ra, s0-s9}, %0"
+)
+
+(define_insn "gpr_multi_popret_up_to_s11_<mode>"
+ [(set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_pop_up_to_s11_operand" "I")))
+ (set (reg:X S11_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 1 "slot_0_offset_operand" "I"))))
+ (set (reg:X S10_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 2 "slot_1_offset_operand" "I"))))
+ (set (reg:X S9_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 3 "slot_2_offset_operand" "I"))))
+ (set (reg:X S8_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 4 "slot_3_offset_operand" "I"))))
+ (set (reg:X S7_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 5 "slot_4_offset_operand" "I"))))
+ (set (reg:X S6_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 6 "slot_5_offset_operand" "I"))))
+ (set (reg:X S5_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 7 "slot_6_offset_operand" "I"))))
+ (set (reg:X S4_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 8 "slot_7_offset_operand" "I"))))
+ (set (reg:X S3_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 9 "slot_8_offset_operand" "I"))))
+ (set (reg:X S2_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 10 "slot_9_offset_operand" "I"))))
+ (set (reg:X S1_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 11 "slot_10_offset_operand" "I"))))
+ (set (reg:X S0_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 12 "slot_11_offset_operand" "I"))))
+ (set (reg:X RETURN_ADDR_REGNUM)
+ (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 13 "slot_12_offset_operand" "I"))))
+ (return)
+ (use (reg:SI RETURN_ADDR_REGNUM))]
+ "TARGET_ZCMP"
+ "cm.popret {ra, s0-s11}, %0"
+)
+
+(define_insn "gpr_multi_push_up_to_ra_<mode>"
+ [(set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 1 "slot_0_offset_operand" "I")))
+ (reg:X RETURN_ADDR_REGNUM))
+ (set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_push_up_to_ra_operand" "I")))]
+ "TARGET_ZCMP"
+ "cm.push {ra}, %0"
+)
+
+(define_insn "gpr_multi_push_up_to_s0_<mode>"
+ [(set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 1 "slot_0_offset_operand" "I")))
+ (reg:X S0_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 2 "slot_1_offset_operand" "I")))
+ (reg:X RETURN_ADDR_REGNUM))
+ (set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_push_up_to_s0_operand" "I")))]
+ "TARGET_ZCMP"
+ "cm.push {ra, s0}, %0"
+)
+
+(define_insn "gpr_multi_push_up_to_s1_<mode>"
+ [(set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 1 "slot_0_offset_operand" "I")))
+ (reg:X S1_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 2 "slot_1_offset_operand" "I")))
+ (reg:X S0_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 3 "slot_2_offset_operand" "I")))
+ (reg:X RETURN_ADDR_REGNUM))
+ (set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_push_up_to_s1_operand" "I")))]
+ "TARGET_ZCMP"
+ "cm.push {ra, s0-s1}, %0"
+)
+
+(define_insn "gpr_multi_push_up_to_s2_<mode>"
+ [(set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 1 "slot_0_offset_operand" "I")))
+ (reg:X S2_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 2 "slot_1_offset_operand" "I")))
+ (reg:X S1_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 3 "slot_2_offset_operand" "I")))
+ (reg:X S0_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 4 "slot_3_offset_operand" "I")))
+ (reg:X RETURN_ADDR_REGNUM))
+ (set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_push_up_to_s2_operand" "I")))]
+ "TARGET_ZCMP"
+ "cm.push {ra, s0-s2}, %0"
+)
+
+(define_insn "gpr_multi_push_up_to_s3_<mode>"
+ [(set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 1 "slot_0_offset_operand" "I")))
+ (reg:X S3_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 2 "slot_1_offset_operand" "I")))
+ (reg:X S2_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 3 "slot_2_offset_operand" "I")))
+ (reg:X S1_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 4 "slot_3_offset_operand" "I")))
+ (reg:X S0_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 5 "slot_4_offset_operand" "I")))
+ (reg:X RETURN_ADDR_REGNUM))
+ (set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_push_up_to_s3_operand" "I")))]
+ "TARGET_ZCMP"
+ "cm.push {ra, s0-s3}, %0"
+)
+
+(define_insn "gpr_multi_push_up_to_s4_<mode>"
+ [(set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 1 "slot_0_offset_operand" "I")))
+ (reg:X S4_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 2 "slot_1_offset_operand" "I")))
+ (reg:X S3_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 3 "slot_2_offset_operand" "I")))
+ (reg:X S2_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 4 "slot_3_offset_operand" "I")))
+ (reg:X S1_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 5 "slot_4_offset_operand" "I")))
+ (reg:X S0_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 6 "slot_5_offset_operand" "I")))
+ (reg:X RETURN_ADDR_REGNUM))
+ (set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_push_up_to_s4_operand" "I")))]
+ "TARGET_ZCMP"
+ "cm.push {ra, s0-s4}, %0"
+)
+
+(define_insn "gpr_multi_push_up_to_s5_<mode>"
+ [(set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 1 "slot_0_offset_operand" "I")))
+ (reg:X S5_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 2 "slot_1_offset_operand" "I")))
+ (reg:X S4_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 3 "slot_2_offset_operand" "I")))
+ (reg:X S3_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 4 "slot_3_offset_operand" "I")))
+ (reg:X S2_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 5 "slot_4_offset_operand" "I")))
+ (reg:X S1_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 6 "slot_5_offset_operand" "I")))
+ (reg:X S0_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 7 "slot_6_offset_operand" "I")))
+ (reg:X RETURN_ADDR_REGNUM))
+ (set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_push_up_to_s5_operand" "I")))]
+ "TARGET_ZCMP"
+ "cm.push {ra, s0-s5}, %0"
+)
+
+(define_insn "gpr_multi_push_up_to_s6_<mode>"
+ [(set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 1 "slot_0_offset_operand" "I")))
+ (reg:X S6_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 2 "slot_1_offset_operand" "I")))
+ (reg:X S5_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 3 "slot_2_offset_operand" "I")))
+ (reg:X S4_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 4 "slot_3_offset_operand" "I")))
+ (reg:X S3_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 5 "slot_4_offset_operand" "I")))
+ (reg:X S2_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 6 "slot_5_offset_operand" "I")))
+ (reg:X S1_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 7 "slot_6_offset_operand" "I")))
+ (reg:X S0_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 8 "slot_7_offset_operand" "I")))
+ (reg:X RETURN_ADDR_REGNUM))
+ (set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_push_up_to_s6_operand" "I")))]
+ "TARGET_ZCMP"
+ "cm.push {ra, s0-s6}, %0"
+)
+
+(define_insn "gpr_multi_push_up_to_s7_<mode>"
+ [(set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 1 "slot_0_offset_operand" "I")))
+ (reg:X S7_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 2 "slot_1_offset_operand" "I")))
+ (reg:X S6_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 3 "slot_2_offset_operand" "I")))
+ (reg:X S5_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 4 "slot_3_offset_operand" "I")))
+ (reg:X S4_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 5 "slot_4_offset_operand" "I")))
+ (reg:X S3_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 6 "slot_5_offset_operand" "I")))
+ (reg:X S2_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 7 "slot_6_offset_operand" "I")))
+ (reg:X S1_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 8 "slot_7_offset_operand" "I")))
+ (reg:X S0_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 9 "slot_8_offset_operand" "I")))
+ (reg:X RETURN_ADDR_REGNUM))
+ (set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_push_up_to_s7_operand" "I")))]
+ "TARGET_ZCMP"
+ "cm.push {ra, s0-s7}, %0"
+)
+
+(define_insn "gpr_multi_push_up_to_s8_<mode>"
+ [(set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 1 "slot_0_offset_operand" "I")))
+ (reg:X S8_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 2 "slot_1_offset_operand" "I")))
+ (reg:X S7_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 3 "slot_2_offset_operand" "I")))
+ (reg:X S6_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 4 "slot_3_offset_operand" "I")))
+ (reg:X S5_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 5 "slot_4_offset_operand" "I")))
+ (reg:X S4_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 6 "slot_5_offset_operand" "I")))
+ (reg:X S3_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 7 "slot_6_offset_operand" "I")))
+ (reg:X S2_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 8 "slot_7_offset_operand" "I")))
+ (reg:X S1_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 9 "slot_8_offset_operand" "I")))
+ (reg:X S0_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 10 "slot_9_offset_operand" "I")))
+ (reg:X RETURN_ADDR_REGNUM))
+ (set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_push_up_to_s8_operand" "I")))]
+ "TARGET_ZCMP"
+ "cm.push {ra, s0-s8}, %0"
+)
+
+(define_insn "gpr_multi_push_up_to_s9_<mode>"
+ [(set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 1 "slot_0_offset_operand" "I")))
+ (reg:X S9_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 2 "slot_1_offset_operand" "I")))
+ (reg:X S8_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 3 "slot_2_offset_operand" "I")))
+ (reg:X S7_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 4 "slot_3_offset_operand" "I")))
+ (reg:X S6_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 5 "slot_4_offset_operand" "I")))
+ (reg:X S5_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 6 "slot_5_offset_operand" "I")))
+ (reg:X S4_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 7 "slot_6_offset_operand" "I")))
+ (reg:X S3_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 8 "slot_7_offset_operand" "I")))
+ (reg:X S2_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 9 "slot_8_offset_operand" "I")))
+ (reg:X S1_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 10 "slot_9_offset_operand" "I")))
+ (reg:X S0_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 11 "slot_10_offset_operand" "I")))
+ (reg:X RETURN_ADDR_REGNUM))
+ (set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_push_up_to_s9_operand" "I")))]
+ "TARGET_ZCMP"
+ "cm.push {ra, s0-s9}, %0"
+)
+
+(define_insn "gpr_multi_push_up_to_s11_<mode>"
+ [(set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 1 "slot_0_offset_operand" "I")))
+ (reg:X S11_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 2 "slot_1_offset_operand" "I")))
+ (reg:X S10_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 3 "slot_2_offset_operand" "I")))
+ (reg:X S9_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 4 "slot_3_offset_operand" "I")))
+ (reg:X S8_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 5 "slot_4_offset_operand" "I")))
+ (reg:X S7_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 6 "slot_5_offset_operand" "I")))
+ (reg:X S6_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 7 "slot_6_offset_operand" "I")))
+ (reg:X S5_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 8 "slot_7_offset_operand" "I")))
+ (reg:X S4_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 9 "slot_8_offset_operand" "I")))
+ (reg:X S3_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 10 "slot_9_offset_operand" "I")))
+ (reg:X S2_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 11 "slot_10_offset_operand" "I")))
+ (reg:X S1_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 12 "slot_11_offset_operand" "I")))
+ (reg:X S0_REGNUM))
+ (set (mem:X (plus:X (reg:X SP_REGNUM)
+ (match_operand:X 13 "slot_12_offset_operand" "I")))
+ (reg:X RETURN_ADDR_REGNUM))
+ (set (reg:X SP_REGNUM)
+ (plus:X (reg:X SP_REGNUM)
+ (match_operand 0 "stack_push_up_to_s11_operand" "I")))]
+ "TARGET_ZCMP"
+ "cm.push {ra, s0-s11}, %0"
+)
diff --git a/gcc/testsuite/gcc.target/riscv/rv32e_zcmp.c b/gcc/testsuite/gcc.target/riscv/rv32e_zcmp.c
new file mode 100644
index 00000000000..6dbe489da9b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rv32e_zcmp.c
@@ -0,0 +1,239 @@
+/* { dg-do compile } */
+/* { dg-options " -Os -march=rv32e_zca_zcmp -mabi=ilp32e -mcmodel=medlow" } */
+/* { dg-skip-if "" { *-*-* } {"-O0" "-O1" "-O2" "-Og" "-O3" "-Oz" "-flto"} } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+char my_getchar();
+float getf();
+int __attribute__((noinline)) incoming_stack_args
+ (int arg0, int arg1, int arg2, int arg3,
+ int arg4, int arg5, int arg6, int arg7, int arg8);
+int getint();
+void PrintInts (int n, ...); // varargs
+void __attribute__((noinline)) PrintIntsNoVaStart (int n, ...); // varargs
+void PrintInts2 (int arg0, int arg1, int arg2, int arg3, int arg4, int arg5, int n, ...);
+extern void f1(void);
+extern void f2(void);
+
+/*
+**test1:
+** ...
+** cm.push {ra, s0-s1}, -64
+** ...
+** cm.popret {ra, s0-s1}, 64
+** ...
+*/
+int test1()
+{
+ char volatile array[3120];
+ float volatile farray[3120];
+
+ float sum = 0;
+ for (int i = 0; i < 3120; i++)
+ {
+ array[i] = my_getchar();
+ farray[i] = my_getchar() * 1.2;
+ sum += array[i] + farray[i];
+ }
+ return sum;
+}
+
+/*
+**test2_step1_0_size:
+** ...
+** cm.push {ra, s0}, -64
+** ...
+** cm.popret {ra, s0}, 64
+** ...
+*/
+int test2_step1_0_size()
+{
+ int volatile iarray[3120 + 1824/4 -8];
+
+ for (int i = 0; i < 3120 + 1824/4 - 8; i++)
+ {
+ iarray[i] = my_getchar() * 2;
+ }
+ return iarray[0] + iarray[1];
+}
+
+/*
+**test3:
+** ...
+** cm.push {ra, s0-s1}, -64
+** ...
+** cm.popret {ra, s0-s1}, 64
+** ...
+*/
+float test3()
+{
+ char volatile array[3120];
+ float volatile farray[3120];
+
+ float sum = 0, f1 = 0, f2 = 0, f3 = 0, f4 = 0, f5 = 0, f6 = 0, f7 = 0;
+
+ for (int i = 0; i < 3120; i++)
+ {
+ f1 = getf();
+ f2 = getf();
+ f3 = getf();
+ f4 = getf();
+ array[i] = my_getchar();
+ farray[i] = my_getchar() * 1.2;
+ sum += array[i] + farray[i] + f1 + f2 + f3 + f4;
+ }
+ return sum;
+}
+
+/*
+**outgoing_stack_args:
+** ...
+** cm.push {ra, s0}, -32
+** ...
+** cm.popret {ra, s0}, 32
+** ...
+*/
+int outgoing_stack_args()
+{
+ int local = getint();
+ return local +incoming_stack_args(0, 1, 2, 3, 4, 5, 6, 7, 8);
+}
+
+/*
+**callPrintInts:
+** ...
+** cm.push {ra}, -32
+** ...
+** cm.popret {ra}, 32
+** ...
+*/
+float callPrintInts()
+{
+ volatile float f = getf(); // f in local
+ PrintInts(9,1,2,3,4,5,6,7,8,9);
+ return f;
+}
+
+/*
+**callPrint:
+** ...
+** cm.push {ra}, -32
+** ...
+** cm.popret {ra}, 32
+** ...
+*/
+float callPrint()
+{
+ volatile float f = getf(); // f in local
+ PrintIntsNoVaStart(0,1,2,3,4,5,6,7,8,9);
+ return f;
+}
+
+/*
+**callPrint_S:
+** ...
+** cm.push {ra, s0}, -32
+** ...
+** cm.popret {ra, s0}, 32
+** ...
+*/
+float callPrint_S()
+{
+ float f = getf();
+ PrintIntsNoVaStart(0,1,2,3,4,5,6,7,8,9);
+ return f;
+}
+
+/*
+**callPrint_2:
+** ...
+** cm.push {ra, s0}, -32
+** ...
+** cm.popret {ra, s0}, 32
+** ...
+*/
+float callPrint_2()
+{
+ float f = getf();
+ PrintInts2(0,1,2,3,4,5,6,7,8,9);
+ return f;
+}
+
+/*
+**test_step1_0bytes_save_restore:
+** ...
+** cm.push {ra}, -16
+** ...
+** cm.popret {ra}, 16
+** ...
+*/
+int test_step1_0bytes_save_restore()
+{
+
+ int a = 9;
+ int b = my_getchar();
+ return a +b;
+}
+
+/*
+**test_s0:
+** ...
+** cm.push {ra, s0}, -16
+** ...
+** cm.popret {ra, s0}, 16
+** ...
+*/
+int test_s0()
+{
+
+ int a = my_getchar();
+ int b = my_getchar();
+ return a +b;
+}
+
+/*
+**test_s1:
+** ...
+** cm.push {ra, s0-s1}, -16
+** ...
+** cm.popret {ra, s0-s1}, 16
+** ...
+*/
+int test_s1()
+{
+
+ int s0 = my_getchar();
+ int s1 = my_getchar();
+ int b = my_getchar();
+ return s1 +s0 +b;
+}
+
+/*
+**test_f0:
+** ...
+** cm.push {ra, s0-s1}, -16
+** ...
+** cm.popret {ra, s0-s1}, 16
+** ...
+*/
+int test_f0()
+{
+
+ int s0 = my_getchar();
+ float f0 = getf(); 
+ int b = my_getchar();
+ return f0 +s0 +b;
+}
+
+/*
+**foo:
+** cm.push {ra}, -16
+** call f1
+** cm.pop {ra}, 16
+** tail f2
+*/
+void foo(void)
+{
+ f1();
+ f2();
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rv32i_zcmp.c b/gcc/testsuite/gcc.target/riscv/rv32i_zcmp.c
new file mode 100644
index 00000000000..924197cb3c4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rv32i_zcmp.c
@@ -0,0 +1,239 @@
+/* { dg-do compile } */
+/* { dg-options " -Os -march=rv32imaf_zca_zcmp -mabi=ilp32f -mcmodel=medlow" } */
+/* { dg-skip-if "" { *-*-* } {"-O0" "-O1" "-O2" "-Og" "-O3" "-Oz" "-flto"} } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+char my_getchar();
+float getf();
+int __attribute__((noinline)) incoming_stack_args
+ (int arg0, int arg1, int arg2, int arg3,
+ int arg4, int arg5, int arg6, int arg7, int arg8);
+int getint();
+void PrintInts (int n, ...); // varargs
+void __attribute__((noinline)) PrintIntsNoVaStart (int n, ...); // varargs
+void PrintInts2 (int arg0, int arg1, int arg2, int arg3, int arg4, int arg5, int n, ...);
+extern void f1(void);
+extern void f2(void);
+
+/*
+**test1:
+** ...
+** cm.push {ra, s0-s4}, -80
+** ...
+** cm.popret {ra, s0-s4}, 80
+** ...
+*/
+int test1()
+{
+ char volatile array[3120];
+ float volatile farray[3120];
+
+ float sum = 0;
+ for (int i = 0; i < 3120; i++)
+ {
+ array[i] = my_getchar();
+ farray[i] = my_getchar() * 1.2;
+ sum += array[i] + farray[i];
+ }
+ return sum;
+}
+
+/*
+**test2_step1_0_size:
+** ...
+** cm.push {ra, s0-s1}, -64
+** ...
+** cm.popret {ra, s0-s1}, 64
+** ...
+*/
+int test2_step1_0_size()
+{
+ int volatile iarray[3120 + 1824/4 -8];
+
+ for (int i = 0; i < 3120 + 1824/4 - 8; i++)
+ {
+ iarray[i] = my_getchar() * 2;
+ }
+ return iarray[0] + iarray[1];
+}
+
+/*
+**test3:
+** ...
+** cm.push {ra, s0-s4}, -80
+** ...
+** cm.popret {ra, s0-s4}, 80
+** ...
+*/
+float test3()
+{
+ char volatile array[3120];
+ float volatile farray[3120];
+
+ float sum = 0, f1 = 0, f2 = 0, f3 = 0, f4 = 0, f5 = 0, f6 = 0, f7 = 0;
+
+ for (int i = 0; i < 3120; i++)
+ {
+ f1 = getf();
+ f2 = getf();
+ f3 = getf();
+ f4 = getf();
+ array[i] = my_getchar();
+ farray[i] = my_getchar() * 1.2;
+ sum += array[i] + farray[i] + f1 + f2 + f3 + f4;
+ }
+ return sum;
+}
+
+/*
+**outgoing_stack_args:
+** ...
+** cm.push {ra, s0}, -32
+** ...
+** cm.popret {ra, s0}, 32
+** ...
+*/
+int outgoing_stack_args()
+{
+ int local = getint();
+ return local +incoming_stack_args(0, 1, 2, 3, 4, 5, 6, 7, 8);
+}
+
+/*
+**callPrintInts:
+** ...
+** cm.push {ra}, -48
+** ...
+** cm.popret {ra}, 48
+** ...
+*/
+float callPrintInts()
+{
+ volatile float f = getf(); // f in local
+ PrintInts(9,1,2,3,4,5,6,7,8,9);
+ return f;
+}
+
+/*
+**callPrint:
+** ...
+** cm.push {ra}, -48
+** ...
+** cm.popret {ra}, 48
+** ...
+*/
+float callPrint()
+{
+ volatile float f = getf(); // f in local
+ PrintIntsNoVaStart(0,1,2,3,4,5,6,7,8,9);
+ return f;
+}
+
+/*
+**callPrint_S:
+** ...
+** cm.push {ra}, -48
+** ...
+** cm.popret {ra}, 48
+** ...
+*/
+float callPrint_S()
+{
+ float f = getf();
+ PrintIntsNoVaStart(0,1,2,3,4,5,6,7,8,9);
+ return f;
+}
+
+/*
+**callPrint_2:
+** ...
+** cm.push {ra}, -48
+** ...
+** cm.popret {ra}, 48
+** ...
+*/
+float callPrint_2()
+{
+ float f = getf();
+ PrintInts2(0,1,2,3,4,5,6,7,8,9);
+ return f;
+}
+
+/*
+**test_step1_0bytes_save_restore:
+** ...
+** cm.push {ra}, -16
+** ...
+** cm.popret {ra}, 16
+** ...
+*/
+int test_step1_0bytes_save_restore()
+{
+
+ int a = 9;
+ int b = my_getchar();
+ return a +b;
+}
+
+/*
+**test_s0:
+** ...
+** cm.push {ra, s0}, -16
+** ...
+** cm.popret {ra, s0}, 16
+** ...
+*/
+int test_s0()
+{
+
+ int a = my_getchar();
+ int b = my_getchar();
+ return a +b;
+}
+
+/*
+**test_s1:
+** ...
+** cm.push {ra, s0-s1}, -16
+** ...
+** cm.popret {ra, s0-s1}, 16
+** ...
+*/
+int test_s1()
+{
+
+ int s0 = my_getchar();
+ int s1 = my_getchar();
+ int b = my_getchar();
+ return s1 +s0 +b;
+}
+
+/*
+**test_f0:
+** ...
+** cm.push {ra, s0}, -32
+** ...
+** cm.popret {ra, s0}, 32
+** ...
+*/
+int test_f0()
+{
+
+ int s0 = my_getchar();
+ float f0 = getf(); 
+ int b = my_getchar();
+ return f0 +s0 +b;
+}
+
+/*
+**foo:
+** cm.push {ra}, -16
+** call f1
+** cm.pop {ra}, 16
+** tail f2
+*/
+void foo(void)
+{
+ f1();
+ f2();
+}
diff --git a/gcc/testsuite/gcc.target/riscv/zcmp_stack_alignment.c b/gcc/testsuite/gcc.target/riscv/zcmp_stack_alignment.c
new file mode 100644
index 00000000000..8e481522f89
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zcmp_stack_alignment.c
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options " -O0 -march=rv32e_zca_zcb_zcmp -mabi=ilp32e -mcmodel=medlow -fomit-frame-pointer" } */
+/* { dg-skip-if "" { *-*-* } {"-O2" "-O1" "-Os" "-Og" "-O3" "-Oz" "-flto"} } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+void bar();
+
+/*
+**fool_rv32e:
+** cm.push {ra}, -32
+** ...
+** call bar
+** ...
+** lw a5,32\(sp\)
+** ...
+** cm.popret {ra}, 32
+*/
+int fool_rv32e ( int a0, int a1, int a2, int a3, int a4, int a5,
+ int incoming0)
+{
+ bar();
+ return a0 + a1 + a2 + a3 + a4 + a5 + incoming0;
+}
\ No newline at end of file
-- 
2.17.1


* Re: Re: [PATCH 1/1] [V2] [RISC-V] support cm.push cm.pop cm.popret in zcmp
  2023-05-30  5:26   ` [PATCH 1/1] [V2] [RISC-V] support cm.push cm.pop cm.popret in zcmp Sinan
@ 2023-05-30  7:44     ` Fei Gao
  0 siblings, 0 replies; 4+ messages in thread
From: Fei Gao @ 2023-05-30  7:44 UTC (permalink / raw)
  To: Sinan; +Cc: Kito Cheng, jiawei, lidie, Liao Shihua, gcc-patches

On 2023-05-30 13:26  Sinan <sinan.lin@linux.alibaba.com> wrote:
>
>>> +/* Return TRUE if Zcmp push and pop insns should be
>>> + avoided. FALSE otherwise.
>>> + Only use multi push & pop if all GPRs masked can be covered,
>>> + and stack access is SP based,
>>> + and GPRs are at top of the stack frame,
>>> + and no conflicts in stack allocation with other features */
>>> +static bool
>>> +riscv_avoid_multi_push(const struct riscv_frame_info *frame)
>>> +{
>>> + if (!TARGET_ZCMP
>>> + || crtl->calls_eh_return
>>> + || frame_pointer_needed
>>> + || cfun->machine->interrupt_handler_p
>>> + || cfun->machine->varargs_size != 0
>>> + || crtl->args.pretend_args_size != 0
>>> + || (frame->mask & ~ MULTI_PUSH_GPR_MASK))
>>> + return true;
>>> +
>>> + return false;
>>> +}
>Any reason to skip generating push/pop in the cases where a frame pointer is needed?
>IIRC, only code compiled with O1 and above will omit frame pointer, if so then code with
>O0 will never generate cm.push/pop. 
Without -fomit-frame-pointer at -O0, stack accesses are s0-based, whereas cm.push/pop access the stack relative to sp.
So cm.push/pop will not be generated. This is the same logic that save-restore follows.
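
To make "sp based" concrete, here is a minimal sketch of the arithmetic behind cm.push as I read the Zcmp specification: the registers are stored just below the incoming sp, highest-numbered register first and ra last, and sp is then lowered by a 16-byte-aligned base plus 0, 16, 32 or 48 extra bytes. All sketch_* names are hypothetical and are not the helpers from the patch. Measuring slot offsets from the incoming sp is also an assumption taken from the specification's pseudo-code, so riscv_zcmp_valid_slot_offset_p may use a different convention.

```c
#include <stdbool.h>

/* All sketch_* names are hypothetical; none of them come from the patch.  */

/* Round BYTES up to the next multiple of 16.  */
int sketch_align16 (int bytes)
{
  return (bytes + 15) & ~15;
}

/* Smallest stack adjustment that covers NREGS registers (ra counts as
   one) of WORD_BYTES each: 4 on RV32, 8 on RV64.  */
int sketch_base_adj (int nregs, int word_bytes)
{
  return sketch_align16 (nregs * word_bytes);
}

/* True if ADJ_BYTES can be encoded for a list of NREGS registers:
   base + {0, 16, 32, 48}.  A list of 12 registers (ra, s0-s10) has no
   encoding, so it is rejected.  */
bool sketch_valid_stack_adj (int adj_bytes, int nregs, int word_bytes)
{
  if (nregs < 1 || nregs > 13 || nregs == 12)
    return false;
  int extra = adj_bytes - sketch_base_adj (nregs, word_bytes);
  return extra >= 0 && extra <= 48 && extra % 16 == 0;
}

/* Offset of save slot SLOT from the incoming sp.  Slot 0 holds the
   highest-numbered register in the list; the last slot holds ra.  */
int sketch_slot_offset (int slot, int word_bytes)
{
  return -(slot + 1) * word_bytes;
}
```

For instance, the "cm.push {ra, s0-s1}, -64" expected in rv32e_zcmp.c is three registers of 4 bytes, giving a base of 16 plus the maximum 48 extra bytes. None of this addressing involves s0, which is why a frame that is addressed through s0 does not fit this scheme directly.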

>Same question for interrupt_handler_p. I think cm.push/pop can handle this case. e.g.
>the test case zc-zcmp-push-pop-6.c from Jiawei's patch. 
This follows the same logic as save-restore. I don't know the exact reason why save-restore cannot be used in interrupt handlers.
In riscv_compute_frame_info, the check riscv_stack_align (num_save_restore * UNITS_PER_WORD) == x_save_size fails in most cases for interrupt handlers:
          if (riscv_stack_align (num_save_restore * UNITS_PER_WORD) == x_save_size
              && !riscv_avoid_save_libcall ())
            {
              ...
              frame->save_libcall_adjustment = x_save_size;
            }
My understanding is that save-restore is used only if it can cover all the registers to be saved. That's why I added (frame->mask & ~ MULTI_PUSH_GPR_MASK)
to riscv_avoid_multi_push.
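
The coverage test can be sketched in isolation as below. The mask here is rebuilt from the standard RISC-V register numbering (ra = x1, s0 = x8, s1 = x9, s2-s11 = x18-x27). The SKETCH_* and sketch_* names are hypothetical, and the real MULTI_PUSH_GPR_MASK in riscv.h is assumed, not confirmed, to have the same value.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; the bit positions follow the RISC-V integer
   register numbering, not the definitions in the patch.  */
#define SKETCH_REG_BIT(n)   (UINT32_C (1) << (n))
#define SKETCH_RA_MASK      SKETCH_REG_BIT (1)                          /* x1 */
#define SKETCH_S0_S1_MASK   (SKETCH_REG_BIT (8) | SKETCH_REG_BIT (9))   /* x8, x9 */
#define SKETCH_S2_S11_MASK  (UINT32_C (0x3ff) << 18)                    /* x18..x27 */

/* Every GPR that a single cm.push can store.  */
#define SKETCH_MULTI_PUSH_GPR_MASK \
  (SKETCH_RA_MASK | SKETCH_S0_S1_MASK | SKETCH_S2_S11_MASK)

/* True if every register in SAVE_MASK can be stored by cm.push, i.e.
   no bit falls outside the multi-push mask.  */
bool sketch_all_saves_covered (uint32_t save_mask)
{
  return (save_mask & ~SKETCH_MULTI_PUSH_GPR_MASK) == 0;
}
```

A function that saves only ra and s0 passes this test. A function that must also save a temporary or argument register such as t0 (x5) or a0 (x10), which is what I would expect for most interrupt handlers, does not.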

BR, 
Fei

>BR,
>Sinan 

>------------------------------------------------------------------
>Sender:Fei Gao <gaofei@eswincomputing.com>
>Sent At:2023 May 16 (Tue.) 17:34
>Recipient:sinan.lin <sinan.lin@linux.alibaba.com>; jiawei <jiawei@iscas.ac.cn>; shihua <shihua@iscas.ac.cn>; lidie <lidie@eswincomputing.com>
>Cc:Fei Gao <gaofei@eswincomputing.com>
>Subject:[PATCH 1/1] [V2] [RISC-V] support cm.push cm.pop cm.popret in zcmp
>Zcmp can share the same logic as save-restore in stack allocation: pre-allocation
>by cm.push, step 1 and step 2.
>Please note that cm.push pushes ra, s0-s11 in the reverse order of what save-restore does,
>so the .cfi directives have been adapted accordingly in my patch.
>gcc/ChangeLog:
> * config/riscv/predicates.md (slot_0_offset_operand): predicates for slot 0 offset.
> (slot_1_offset_operand): likewise
> (slot_2_offset_operand): likewise
> (slot_3_offset_operand): likewise
> (slot_4_offset_operand): likewise
> (slot_5_offset_operand): likewise
> (slot_6_offset_operand): likewise
> (slot_7_offset_operand): likewise
> (slot_8_offset_operand): likewise
> (slot_9_offset_operand): likewise
> (slot_10_offset_operand): likewise
> (slot_11_offset_operand): likewise
> (slot_12_offset_operand): likewise
> (stack_push_up_to_ra_operand): predicates for stack adjust of pushing ra
> (stack_push_up_to_s0_operand): predicates for stack adjust of pushing ra, s0
> (stack_push_up_to_s1_operand): likewise
> (stack_push_up_to_s2_operand): likewise
> (stack_push_up_to_s3_operand): likewise
> (stack_push_up_to_s4_operand): likewise
> (stack_push_up_to_s5_operand): likewise
> (stack_push_up_to_s6_operand): likewise
> (stack_push_up_to_s7_operand): likewise
> (stack_push_up_to_s8_operand): likewise
> (stack_push_up_to_s9_operand): likewise
> (stack_push_up_to_s11_operand): likewise
> (stack_pop_up_to_ra_operand): predicates for stack adjust of popping ra
> (stack_pop_up_to_s0_operand): predicates for stack adjust of popping ra, s0
> (stack_pop_up_to_s1_operand): likewise
> (stack_pop_up_to_s2_operand): likewise
> (stack_pop_up_to_s3_operand): likewise
> (stack_pop_up_to_s4_operand): likewise
> (stack_pop_up_to_s5_operand): likewise
> (stack_pop_up_to_s6_operand): likewise
> (stack_pop_up_to_s7_operand): likewise
> (stack_pop_up_to_s8_operand): likewise
> (stack_pop_up_to_s9_operand): likewise
> (stack_pop_up_to_s11_operand): likewise
> * config/riscv/riscv-protos.h (riscv_zcmp_valid_slot_offset_p): declaration
> (riscv_zcmp_valid_stack_adj_bytes_p): declaration
> * config/riscv/riscv.cc (struct riscv_frame_info): comment change
> (riscv_avoid_multi_push): helper function of riscv_use_multi_push
> (riscv_use_multi_push): true if multi push is used
> (riscv_multi_push_sregs_count): num of sregs in multi-push
> (riscv_multi_push_regs_count): num of regs in multi-push
> (riscv_16bytes_align): align to 16 bytes
> (riscv_stack_align): moved to a better place
> (riscv_save_libcall_count): no functional change
> (riscv_compute_frame_info): add zcmp frame info
> (riscv_adjust_multi_push_cfi_prologue): adjust cfi for cm.push
> (get_slot_offset_rtx): get the rtx of slot to push or pop
> (riscv_gen_multi_push_pop_insn): gen function for multi push and pop
> (riscv_expand_prologue): allocate stack by cm.push
> (riscv_adjust_multi_pop_cfi_epilogue): adjust cfi for cm.pop[ret]
> (riscv_expand_epilogue): allocate stack by cm.pop[ret]
> (zcmp_base_adj): calculate stack adjustment base size
> (zcmp_additional_adj): calculate stack adjustment additional size
> (riscv_zcmp_valid_slot_offset_p): check if offset is valid for a slot
> (riscv_zcmp_valid_stack_adj_bytes_p): check if stack adjustment size is valid
> * config/riscv/riscv.h (RETURN_ADDR_MASK): mask of ra
> (S0_MASK): likewise
> (S1_MASK): likewise
> (S2_MASK): likewise
> (S3_MASK): likewise
> (S4_MASK): likewise
> (S5_MASK): likewise
> (S6_MASK): likewise
> (S7_MASK): likewise
> (S8_MASK): likewise
> (S9_MASK): likewise
> (S10_MASK): likewise
> (S11_MASK): likewise
> (MULTI_PUSH_GPR_MASK): GPR_MASK that cm.push can cover at most
> (ZCMP_MAX_SPIMM): max spimm value
> (ZCMP_SP_INC_STEP): zcmp sp increment step
> (ZCMP_INVALID_S0S10_SREGS_COUNTS): num of s0-s10
> (ZCMP_S0S11_SREGS_COUNTS): num of s0-s11
> (ZCMP_MAX_GRP_SLOTS): max slots of pushing and popping in zcmp
> * config/riscv/riscv.md: include zc.md
> * config/riscv/zc.md: New file. machine description for zcmp
>gcc/testsuite/ChangeLog:
> * gcc.target/riscv/rv32e_zcmp.c: New test.
> * gcc.target/riscv/rv32i_zcmp.c: New test.
> * gcc.target/riscv/zcmp_stack_alignment.c: New test.
>---
> gcc/config/riscv/predicates.md | 148 +++
> gcc/config/riscv/riscv-protos.h | 2 +
> gcc/config/riscv/riscv.cc | 477 +++++++-
> gcc/config/riscv/riscv.h | 23 +
> gcc/config/riscv/riscv.md | 2 +
> gcc/config/riscv/zc.md | 1042 +++++++++++++++++
> gcc/testsuite/gcc.target/riscv/rv32e_zcmp.c | 239 ++++
> gcc/testsuite/gcc.target/riscv/rv32i_zcmp.c | 239 ++++
> .../gcc.target/riscv/zcmp_stack_alignment.c | 23 +
> 9 files changed, 2155 insertions(+), 40 deletions(-)
> create mode 100644 gcc/config/riscv/zc.md
> create mode 100644 gcc/testsuite/gcc.target/riscv/rv32e_zcmp.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/rv32i_zcmp.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/zcmp_stack_alignment.c
>diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
>index e5adf06fa25..d3d30dc67f7 100644
>--- a/gcc/config/riscv/predicates.md
>+++ b/gcc/config/riscv/predicates.md
>@@ -59,6 +59,154 @@
> (ior (match_operand 0 "const_0_operand")
> (match_operand 0 "register_operand")))
>+(define_predicate "slot_0_offset_operand"
>+ (and (match_code "const_int")
>+ (match_test "riscv_zcmp_valid_slot_offset_p (INTVAL (op), 0)")))
>+
>+(define_predicate "slot_1_offset_operand"
>+ (and (match_code "const_int")
>+ (match_test "riscv_zcmp_valid_slot_offset_p (INTVAL (op), 1)")))
>+
>+(define_predicate "slot_2_offset_operand"
>+ (and (match_code "const_int")
>+ (match_test "riscv_zcmp_valid_slot_offset_p (INTVAL (op), 2)")))
>+
>+(define_predicate "slot_3_offset_operand"
>+ (and (match_code "const_int")
>+ (match_test "riscv_zcmp_valid_slot_offset_p (INTVAL (op), 3)")))
>+
>+(define_predicate "slot_4_offset_operand"
>+ (and (match_code "const_int")
>+ (match_test "riscv_zcmp_valid_slot_offset_p (INTVAL (op), 4)")))
>+
>+(define_predicate "slot_5_offset_operand"
>+ (and (match_code "const_int")
>+ (match_test "riscv_zcmp_valid_slot_offset_p (INTVAL (op), 5)")))
>+
>+(define_predicate "slot_6_offset_operand"
>+ (and (match_code "const_int")
>+ (match_test "riscv_zcmp_valid_slot_offset_p (INTVAL (op), 6)")))
>+
>+(define_predicate "slot_7_offset_operand"
>+ (and (match_code "const_int")
>+ (match_test "riscv_zcmp_valid_slot_offset_p (INTVAL (op), 7)")))
>+
>+(define_predicate "slot_8_offset_operand"
>+ (and (match_code "const_int")
>+ (match_test "riscv_zcmp_valid_slot_offset_p (INTVAL (op), 8)")))
>+
>+(define_predicate "slot_9_offset_operand"
>+ (and (match_code "const_int")
>+ (match_test "riscv_zcmp_valid_slot_offset_p (INTVAL (op), 9)")))
>+
>+(define_predicate "slot_10_offset_operand"
>+ (and (match_code "const_int")
>+ (match_test "riscv_zcmp_valid_slot_offset_p (INTVAL (op), 10)")))
>+
>+(define_predicate "slot_11_offset_operand"
>+ (and (match_code "const_int")
>+ (match_test "riscv_zcmp_valid_slot_offset_p (INTVAL (op), 11)")))
>+
>+(define_predicate "slot_12_offset_operand"
>+ (and (match_code "const_int")
>+ (match_test "riscv_zcmp_valid_slot_offset_p (INTVAL (op), 12)")))
>+
>+(define_predicate "stack_push_up_to_ra_operand"
>+ (and (match_code "const_int")
>+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 1)")))
>+
>+(define_predicate "stack_push_up_to_s0_operand"
>+ (and (match_code "const_int")
>+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 2)")))
>+
>+(define_predicate "stack_push_up_to_s1_operand"
>+ (and (match_code "const_int")
>+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 3)")))
>+
>+(define_predicate "stack_push_up_to_s2_operand"
>+ (and (match_code "const_int")
>+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 4)")))
>+
>+(define_predicate "stack_push_up_to_s3_operand"
>+ (and (match_code "const_int")
>+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 5)")))
>+
>+(define_predicate "stack_push_up_to_s4_operand"
>+ (and (match_code "const_int")
>+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 6)")))
>+
>+(define_predicate "stack_push_up_to_s5_operand"
>+ (and (match_code "const_int")
>+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 7)")))
>+
>+(define_predicate "stack_push_up_to_s6_operand"
>+ (and (match_code "const_int")
>+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 8)")))
>+
>+(define_predicate "stack_push_up_to_s7_operand"
>+ (and (match_code "const_int")
>+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 9)")))
>+
>+(define_predicate "stack_push_up_to_s8_operand"
>+ (and (match_code "const_int")
>+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 10)")))
>+
>+(define_predicate "stack_push_up_to_s9_operand"
>+ (and (match_code "const_int")
>+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 11)")))
>+
>+(define_predicate "stack_push_up_to_s11_operand"
>+ (and (match_code "const_int")
>+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 13)")))
>+
>+(define_predicate "stack_pop_up_to_ra_operand"
>+ (and (match_code "const_int")
>+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 1)")))
>+
>+(define_predicate "stack_pop_up_to_s0_operand"
>+ (and (match_code "const_int")
>+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 2)")))
>+
>+(define_predicate "stack_pop_up_to_s1_operand"
>+ (and (match_code "const_int")
>+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 3)")))
>+
>+(define_predicate "stack_pop_up_to_s2_operand"
>+ (and (match_code "const_int")
>+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 4)")))
>+
>+(define_predicate "stack_pop_up_to_s3_operand"
>+ (and (match_code "const_int")
>+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 5)")))
>+
>+(define_predicate "stack_pop_up_to_s4_operand"
>+ (and (match_code "const_int")
>+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 6)")))
>+
>+(define_predicate "stack_pop_up_to_s5_operand"
>+ (and (match_code "const_int")
>+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 7)")))
>+
>+(define_predicate "stack_pop_up_to_s6_operand"
>+ (and (match_code "const_int")
>+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 8)")))
>+
>+(define_predicate "stack_pop_up_to_s7_operand"
>+ (and (match_code "const_int")
>+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 9)")))
>+
>+(define_predicate "stack_pop_up_to_s8_operand"
>+ (and (match_code "const_int")
>+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 10)")))
>+
>+(define_predicate "stack_pop_up_to_s9_operand"
>+ (and (match_code "const_int")
>+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 11)")))
>+
>+(define_predicate "stack_pop_up_to_s11_operand"
>+ (and (match_code "const_int")
>+ (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 13)")))
>+
> ;; Only use branch-on-bit sequences when the mask is not an ANDI immediate.
> (define_predicate "branch_on_bit_operand"
> (and (match_code "const_int")
>diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
>index 7760a9cac8d..f0ea14f05be 100644
>--- a/gcc/config/riscv/riscv-protos.h
>+++ b/gcc/config/riscv/riscv-protos.h
>@@ -56,6 +56,8 @@ extern bool riscv_split_64bit_move_p (rtx, rtx);
> extern void riscv_split_doubleword_move (rtx, rtx);
> extern const char *riscv_output_move (rtx, rtx);
> extern const char *riscv_output_return ();
>+extern bool riscv_zcmp_valid_slot_offset_p (HOST_WIDE_INT, int);
>+extern bool riscv_zcmp_valid_stack_adj_bytes_p(HOST_WIDE_INT, int);
> #ifdef RTX_CODE
> extern void riscv_expand_int_scc (rtx, enum rtx_code, rtx, rtx);
>diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
>index 629e5e45cac..a0a2db1f594 100644
>--- a/gcc/config/riscv/riscv.cc
>+++ b/gcc/config/riscv/riscv.cc
>@@ -117,6 +117,14 @@ struct GTY(()) riscv_frame_info {
> /* How much the GPR save/restore routines adjust sp (or 0 if unused). */
> unsigned save_libcall_adjustment;
>+ /* the minimum number of bytes, in multiples of 16-byte address increments,
>+ required to cover the registers in a multi push & pop. */
>+ unsigned multi_push_adj_base;
>+
>+ /* the number of additional 16-byte address increments allocated for the stack frame
>+ in a multi push & pop. */
>+ unsigned multi_push_adj_addi;
>+
> /* Offsets of fixed-point and floating-point save areas from frame bottom */
> poly_int64 gp_sp_offset;
> poly_int64 fp_sp_offset;
>@@ -413,6 +421,21 @@ static const struct riscv_tune_info riscv_tune_info_table[] = {
> #include "riscv-cores.def"
> };
>+typedef enum
>+{
>+ SI_IDX = 0,
>+ DI_IDX,
>+ ZCMP_MODE_NUM = DI_IDX
>+} mode_idx;
>+
>+typedef enum
>+{
>+ PUSH_IDX = 0,
>+ POP_IDX,
>+ POPRET_IDX,
>+ ZCMP_OP_NUM = POPRET_IDX
>+} op_idx;
>+
> void riscv_frame_info::reset(void)
> {
> total_size = 0;
>@@ -4844,6 +4867,37 @@ riscv_save_reg_p (unsigned int regno)
> return false;
> }
>+/* Return TRUE if Zcmp push and pop insns should be
>+ avoided. FALSE otherwise.
>+ Only use multi push & pop if all GPRs masked can be covered,
>+ and stack access is SP based,
>+ and GPRs are at top of the stack frame,
>+ and no conflicts in stack allocation with other features */
>+static bool
>+riscv_avoid_multi_push(const struct riscv_frame_info *frame)
>+{
>+ if (!TARGET_ZCMP
>+ || crtl->calls_eh_return
>+ || frame_pointer_needed
>+ || cfun->machine->interrupt_handler_p
>+ || cfun->machine->varargs_size != 0
>+ || crtl->args.pretend_args_size != 0
>+ || (frame->mask & ~ MULTI_PUSH_GPR_MASK))
>+ return true;
>+
>+ return false;
>+}
>+
>+/* Determine whether to use multi push insn. */
>+static bool
>+riscv_use_multi_push(const struct riscv_frame_info *frame)
>+{
>+ if (riscv_avoid_multi_push (frame))
>+ return false;
>+
>+ return (frame->multi_push_adj_base != 0);
>+}
>+
> /* Return TRUE if a libcall to save/restore GPRs should be
> avoided. FALSE otherwise. */
> static bool
>@@ -4881,6 +4935,51 @@ riscv_save_libcall_count (unsigned mask)
> abort ();
> }
>+/* calculate number of s regs in multi push and pop.
>+ Note that {s0-s10} is not valid in Zcmp, use {s0-s11} instead. */
>+static unsigned
>+riscv_multi_push_sregs_count (unsigned mask)
>+{
>+ unsigned num = riscv_save_libcall_count (mask);
>+ return (num == ZCMP_INVALID_S0S10_SREGS_COUNTS)
>+ ? ZCMP_S0S11_SREGS_COUNTS
>+ : num;
>+}
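For readers following the register counting: a minimal stand-alone sketch of how the count appears to be derived, assuming the mask holds only ra and s-registers. The helper names (sreg_index, saved_sregs, zcmp_regs) are hypothetical and are not part of the patch; the register numbers are the standard RISC-V ABI ones (ra = x1, s0 = x8, s1 = x9, s2..s11 = x18..x27).

```cpp
#include <cassert>
#include <cstdint>

// Illustrative model only; not GCC code.
const int kInvalidS0S10Count = 11;  // mirrors ZCMP_INVALID_S0S10_SREGS_COUNTS
const int kS0S11Count = 12;         // mirrors ZCMP_S0S11_SREGS_COUNTS

// Index of an s-register (s0 -> 0 ... s11 -> 11), or -1 for any other register.
int sreg_index(int regno) {
  if (regno == 8 || regno == 9) return regno - 8;
  if (regno >= 18 && regno <= 27) return regno - 16;
  return -1;
}

// s-registers the register list must cover: highest saved s-register index + 1.
// Assumes the mask contains only ra and s-registers.
int saved_sregs(uint32_t mask) {
  for (int regno = 31; regno > 0; --regno)
    if (mask & (1u << regno))
      return sreg_index(regno) + 1;
  return 0;
}

// {ra, s0-s10} cannot be encoded, so a list ending at s10 is widened to s11.
int zcmp_sregs(uint32_t mask) {
  int n = saved_sregs(mask);
  return n == kInvalidS0S10Count ? kS0S11Count : n;
}

// Total registers in the list, including ra.
int zcmp_regs(uint32_t mask) { return zcmp_sregs(mask) + 1; }
```

Note that a mask containing only ra and s2 still yields four registers, because the list always runs contiguously from ra through the highest saved s-register.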
>+
>+/* calculate number of regs(ra, s0-sx) in multi push and pop. */
>+static unsigned
>+riscv_multi_push_regs_count (unsigned mask)
>+{
>+ /* 1 is for ra */
>+ return riscv_multi_push_sregs_count (mask) + 1;
>+}
>+
>+/* Handle 16 bytes align for poly_int. */
>+static poly_int64
>+riscv_16bytes_align (poly_int64 value)
>+{
>+ return aligned_upper_bound (value, 16);
>+}
>+
>+static HOST_WIDE_INT
>+riscv_16bytes_align (HOST_WIDE_INT value)
>+{
>+ return ROUND_UP(value, 16);
>+}
>+
>+/* Handle stack align for poly_int. */
>+static poly_int64
>+riscv_stack_align (poly_int64 value)
>+{
>+ return aligned_upper_bound (value, PREFERRED_STACK_BOUNDARY / 8);
>+}
>+
>+static HOST_WIDE_INT
>+riscv_stack_align (HOST_WIDE_INT value)
>+{
>+ return RISCV_STACK_ALIGN (value);
>+}
>+
> /* Populate the current function's riscv_frame_info structure.
> RISC-V stack frames grown downward. High addresses are at the top.
>@@ -4906,7 +5005,7 @@ riscv_save_libcall_count (unsigned mask)
> | GPR save area | + UNITS_PER_WORD
> | |
> +-------------------------------+ <-- stack_pointer_rtx + fp_sp_offset
>- | | + UNITS_PER_HWVALUE
>+ | | + UNITS_PER_FP_REG
> | FPR save area |
> | |
> +-------------------------------+ <-- frame_pointer_rtx (virtual)
>@@ -4925,19 +5024,6 @@ riscv_save_libcall_count (unsigned mask)
> static HOST_WIDE_INT riscv_first_stack_step (struct riscv_frame_info *frame, poly_int64 remaining_size);
>-/* Handle stack align for poly_int. */
>-static poly_int64
>-riscv_stack_align (poly_int64 value)
>-{
>- return aligned_upper_bound (value, PREFERRED_STACK_BOUNDARY / 8);
>-}
>-
>-static HOST_WIDE_INT
>-riscv_stack_align (HOST_WIDE_INT value)
>-{
>- return RISCV_STACK_ALIGN (value);
>-}
>-
> static void
> riscv_compute_frame_info (void)
> {
>@@ -4985,8 +5071,9 @@ riscv_compute_frame_info (void)
> if (frame->mask)
> {
> x_save_size = riscv_stack_align (num_x_saved * UNITS_PER_WORD);
>- unsigned num_save_restore = 1 + riscv_save_libcall_count (frame->mask);
>+ /* 1 is for ra */
>+ unsigned num_save_restore = 1 + riscv_save_libcall_count (frame->mask);
> /* Only use save/restore routines if they don't alter the stack size. */
> if (riscv_stack_align (num_save_restore * UNITS_PER_WORD) == x_save_size
> && !riscv_avoid_save_libcall ())
>@@ -4998,6 +5085,15 @@ riscv_compute_frame_info (void)
> frame->save_libcall_adjustment = x_save_size;
> }
>+
>+ if (!riscv_avoid_multi_push (frame))
>+ {
>+ /* num(ra, s0-sx) */
>+ unsigned num_multi_push =
>+ riscv_multi_push_regs_count (frame->mask);
>+ x_save_size = riscv_stack_align (num_multi_push * UNITS_PER_WORD);
>+ frame->multi_push_adj_base = riscv_16bytes_align (x_save_size);
>+ }
> }
> /* At the bottom of the frame are any outgoing stack arguments. */
>@@ -5012,7 +5108,15 @@ riscv_compute_frame_info (void)
> frame->fp_sp_offset = offset - UNITS_PER_FP_REG;
> /* Next are the callee-saved GPRs. */
> if (frame->mask)
>- offset += x_save_size;
>+ {
>+ offset += x_save_size;
>+ /* Align to 16 bytes and add padding to the GPR part to honor
>+ both stack alignment and Zcmp push/pop size alignment. */
>+ if (riscv_use_multi_push (frame)
>+ && known_lt(offset,
>+ frame->multi_push_adj_base + ZCMP_SP_INC_STEP * ZCMP_MAX_SPIMM))
>+ offset = riscv_16bytes_align (offset);
>+ }
> frame->gp_sp_offset = offset - UNITS_PER_WORD;
> /* The hard frame pointer points above the callee-saved GPRs. */
> frame->hard_frame_pointer_offset = offset;
>@@ -5356,6 +5460,42 @@ riscv_adjust_libcall_cfi_prologue ()
> return dwarf;
> }
>+static rtx
>+riscv_adjust_multi_push_cfi_prologue (int saved_size)
>+{
>+ rtx dwarf = NULL_RTX;
>+ rtx adjust_sp_rtx, reg, mem, insn;
>+ unsigned int mask = cfun->machine->frame.mask;
>+ int offset;
>+ int saved_cnt = 0;
>+
>+ if (mask & S10_MASK)
>+ mask |= S11_MASK;
>+
>+ for (int regno = GP_REG_LAST; regno >= GP_REG_FIRST; regno--)
>+ if (BITSET_P (mask & MULTI_PUSH_GPR_MASK, regno - GP_REG_FIRST))
>+ {
>+ /* The save order is s11-s0, ra
>+ from high to low addr. */
>+ offset = saved_size - UNITS_PER_WORD * (++saved_cnt);
>+
>+ reg = gen_rtx_REG (SImode, regno);
>+ mem = gen_frame_mem (SImode, plus_constant (Pmode,
>+ stack_pointer_rtx,
>+ offset));
>+
>+ insn = gen_rtx_SET (mem, reg);
>+ dwarf = alloc_reg_note (REG_CFA_OFFSET, insn, dwarf);
>+ }
>+
>+ /* Debug info for adjust sp. */
>+ adjust_sp_rtx = gen_rtx_SET (stack_pointer_rtx,
>+ plus_constant(Pmode, stack_pointer_rtx, -saved_size));
>+ dwarf = alloc_reg_note (REG_CFA_ADJUST_CFA, adjust_sp_rtx,
>+ dwarf);
>+ return dwarf;
>+}
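To make the CFI slot layout in the function above concrete, here is a small sketch under stated assumptions: plain integers instead of rtx, and a hypothetical helper name (push_slot_offsets) that does not exist in the patch. It reproduces only the offset arithmetic, offset = saved_size - word_bytes * (++saved_cnt), walking from the highest register number down.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Illustrative model only; not GCC code.
// Returns (register number, offset from the post-push sp) pairs, highest
// register first, so the highest register lands at the highest address.
// The caller is assumed to have already added s11 to the mask if s10 is set.
std::vector<std::pair<int, int>> push_slot_offsets(uint32_t mask,
                                                   int saved_size,
                                                   int word_bytes) {
  std::vector<std::pair<int, int>> slots;
  int saved_cnt = 0;
  for (int regno = 31; regno >= 0; --regno) {
    if (mask & (1u << regno)) {
      ++saved_cnt;
      slots.emplace_back(regno, saved_size - word_bytes * saved_cnt);
    }
  }
  return slots;
}
```

For {ra, s0-s1} on RV32 with a 16-byte adjustment, this places s1 at sp+12, s0 at sp+8 and ra at sp+4, leaving the lowest word as padding.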
>+
> static void
> riscv_emit_stack_tie (void)
> {
>@@ -5365,6 +5505,152 @@ riscv_emit_stack_tie (void)
> emit_insn (gen_stack_tiedi (stack_pointer_rtx, hard_frame_pointer_rtx));
> }
>+static rtx
>+get_slot_offset_rtx (int slot_idx)
>+{
>+ HOST_WIDE_INT slot_offset = -1 * (slot_idx + 1) * GET_MODE_SIZE (word_mode);
>+ return GEN_INT (slot_offset);
>+}
>+
>+/* Zcmp multi push and pop function pointer array. */
>+const insn_gen_fn gen_push_pop [ZCMP_OP_NUM + 1][ZCMP_MODE_NUM + 1][ZCMP_MAX_GRP_SLOTS] =
>+{{{(insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_ra_si,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s0_si,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s1_si,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s2_si,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s3_si,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s4_si,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s5_si,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s6_si,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s7_si,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s8_si,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s9_si,
>+ NULL,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s11_si},
>+ {(insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_ra_di,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s0_di,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s1_di,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s2_di,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s3_di,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s4_di,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s5_di,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s6_di,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s7_di,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s8_di,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s9_di,
>+ NULL,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s11_di}},
>+ {{(insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_ra_si,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s0_si,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s1_si,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s2_si,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s3_si,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s4_si,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s5_si,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s6_si,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s7_si,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s8_si,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s9_si,
>+ NULL,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s11_si},
>+ {(insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_ra_di,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s0_di,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s1_di,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s2_di,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s3_di,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s4_di,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s5_di,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s6_di,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s7_di,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s8_di,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s9_di,
>+ NULL,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s11_di}},
>+ {{(insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_ra_si,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s0_si,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s1_si,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s2_si,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s3_si,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s4_si,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s5_si,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s6_si,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s7_si,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s8_si,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s9_si,
>+ NULL,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s11_si},
>+ {(insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_ra_di,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s0_di,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s1_di,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s2_di,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s3_di,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s4_di,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s5_di,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s6_di,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s7_di,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s8_di,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s9_di,
>+ NULL,
>+ (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s11_di}}};
>+
>+static rtx
>+riscv_gen_multi_push_pop_insn (op_idx op, HOST_WIDE_INT adj_size, unsigned int regs_num)
>+{
>+ rtx stack_adj = GEN_INT (adj_size);
>+ rtx slots[ZCMP_MAX_GRP_SLOTS];
>+
>+ for (int slot_idx = 0; slot_idx < ZCMP_MAX_GRP_SLOTS; slot_idx++)
>+ slots[slot_idx] = get_slot_offset_rtx (slot_idx);
>+
>+ switch (regs_num)
>+ {
>+ case 1:
>+ return (gen_push_pop[op][TARGET_64BIT][regs_num - 1])
>+ (stack_adj, slots[0]);
>+ case 2:
>+ return (gen_push_pop[op][TARGET_64BIT][regs_num - 1])
>+ (stack_adj, slots[0], slots[1]);
>+ case 3:
>+ return (gen_push_pop[op][TARGET_64BIT][regs_num - 1])
>+ (stack_adj, slots[0], slots[1], slots[2]);
>+ case 4:
>+ return (gen_push_pop[op][TARGET_64BIT][regs_num - 1])
>+ (stack_adj, slots[0], slots[1], slots[2], slots[3]);
>+ case 5:
>+ return (gen_push_pop[op][TARGET_64BIT][regs_num - 1])
>+ (stack_adj, slots[0], slots[1], slots[2], slots[3], slots[4]);
>+ case 6:
>+ return (gen_push_pop[op][TARGET_64BIT][regs_num - 1])
>+ (stack_adj, slots[0], slots[1], slots[2], slots[3], slots[4], slots[5]);
>+ case 7:
>+ return (gen_push_pop[op][TARGET_64BIT][regs_num - 1])
>+ (stack_adj, slots[0], slots[1], slots[2], slots[3], slots[4], slots[5],
>+ slots[6]);
>+ case 8:
>+ return (gen_push_pop[op][TARGET_64BIT][regs_num - 1])
>+ (stack_adj, slots[0], slots[1], slots[2], slots[3], slots[4], slots[5],
>+ slots[6], slots[7]);
>+ case 9:
>+ return (gen_push_pop[op][TARGET_64BIT][regs_num - 1])
>+ (stack_adj, slots[0], slots[1], slots[2], slots[3], slots[4], slots[5],
>+ slots[6], slots[7], slots[8]);
>+ case 10:
>+ return (gen_push_pop[op][TARGET_64BIT][regs_num - 1])
>+ (stack_adj, slots[0], slots[1], slots[2], slots[3], slots[4], slots[5],
>+ slots[6], slots[7], slots[8], slots[9]);
>+ case 11:
>+ return (gen_push_pop[op][TARGET_64BIT][regs_num - 1])
>+ (stack_adj, slots[0], slots[1], slots[2], slots[3], slots[4], slots[5],
>+ slots[6], slots[7], slots[8], slots[9], slots[10]);
>+ case 13:
>+ return (gen_push_pop[op][TARGET_64BIT][regs_num - 1])
>+ (stack_adj, slots[0], slots[1], slots[2], slots[3], slots[4], slots[5],
>+ slots[6], slots[7], slots[8], slots[9], slots[10], slots[11], slots[12]);
>+ default:
>+ gcc_unreachable ();
>+ }
>+}
>+
> /* Expand the "prologue" pattern. */
> void
>@@ -5373,7 +5659,8 @@ riscv_expand_prologue (void)
> struct riscv_frame_info *frame = &cfun->machine->frame;
> poly_int64 remaining_size = frame->total_size;
> unsigned mask = frame->mask;
>- rtx insn;
>+ int spimm, multi_push_additional, stack_adj;
>+ rtx insn, dwarf = NULL_RTX;
> if (flag_stack_usage_info)
> current_function_static_stack_size = constant_lower_bound (remaining_size);
>@@ -5381,8 +5668,35 @@ riscv_expand_prologue (void)
> if (cfun->machine->naked_p)
> return;
>+ /* Prefer multi-push to save-restore libcall. */
>+ if (riscv_use_multi_push(frame))
>+ {
>+ remaining_size -= frame->multi_push_adj_base;
>+ if (known_gt(remaining_size, 2 * ZCMP_SP_INC_STEP))
>+ spimm = 3;
>+ else if (known_gt(remaining_size, ZCMP_SP_INC_STEP))
>+ spimm = 2;
>+ else if (known_gt(remaining_size, 0))
>+ spimm = 1;
>+ else
>+ spimm = 0;
>+ multi_push_additional = spimm * ZCMP_SP_INC_STEP;
>+ frame->multi_push_adj_addi = multi_push_additional;
>+ remaining_size -= multi_push_additional;
>+
>+ /* emit multi push insn & dwarf along with it. */
>+ stack_adj = frame->multi_push_adj_base + multi_push_additional;
>+ insn = emit_insn (riscv_gen_multi_push_pop_insn(PUSH_IDX,
>+ -stack_adj, riscv_multi_push_regs_count(frame->mask)));
>+ dwarf = riscv_adjust_multi_push_cfi_prologue (stack_adj);
>+ RTX_FRAME_RELATED_P (insn) = 1;
>+ REG_NOTES (insn) = dwarf;
>+
>+ /* Temporarily fib that we need not save GPRs. */
>+ frame->mask = 0;
>+ }
> /* When optimizing for size, call a subroutine to save the registers. */
>- if (riscv_use_save_libcall (frame))
>+ else if (riscv_use_save_libcall (frame))
> {
> rtx dwarf = NULL_RTX;
> dwarf = riscv_adjust_libcall_cfi_prologue ();
>@@ -5398,13 +5712,15 @@ riscv_expand_prologue (void)
> /* Save the registers. */
> if ((frame->mask | frame->fmask) != 0)
> {
>- HOST_WIDE_INT step1 = riscv_first_stack_step (frame, remaining_size);
>-
>- insn = gen_add3_insn (stack_pointer_rtx,
>- stack_pointer_rtx,
>- GEN_INT (-step1));
>- RTX_FRAME_RELATED_P (emit_insn (insn)) = 1;
>- remaining_size -= step1;
>+ if (known_gt (remaining_size, frame->frame_pointer_offset))
>+ {
>+ HOST_WIDE_INT step1 = riscv_first_stack_step (frame, remaining_size);
>+ remaining_size -= step1;
>+ insn = gen_add3_insn (stack_pointer_rtx,
>+ stack_pointer_rtx,
>+ GEN_INT (-step1));
>+ RTX_FRAME_RELATED_P (emit_insn (insn)) = 1;
>+ }
> riscv_for_each_saved_reg (remaining_size, riscv_save_reg, false, false);
> }
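The spimm selection in riscv_expand_prologue above can be sketched in isolation. This is a simplified model, assuming plain integer sizes that are multiples of 16 (the patch works on poly_int64); the struct and function names (PushPlan, plan_push) are hypothetical.

```cpp
#include <cassert>

// Illustrative model only; not GCC code.
const long kSpIncStep = 16;  // mirrors ZCMP_SP_INC_STEP

struct PushPlan {
  int spimm;       // number of extra 16-byte steps folded into cm.push (0..3)
  long push_adj;   // total sp adjustment performed by cm.push
  long remaining;  // frame bytes still to be allocated after the push
};

// Fold as much of the rest of the frame as possible into the push itself.
PushPlan plan_push(long total_size, long base_adj) {
  long remaining = total_size - base_adj;
  int spimm = 0;
  if (remaining > 2 * kSpIncStep)
    spimm = 3;
  else if (remaining > kSpIncStep)
    spimm = 2;
  else if (remaining > 0)
    spimm = 1;
  long extra = spimm * kSpIncStep;
  return {spimm, base_adj + extra, remaining - extra};
}
```

With a 112-byte frame and a 16-byte base adjustment, the push takes 64 bytes (spimm = 3) and 48 bytes remain for a separate sp adjustment.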
>@@ -5461,6 +5777,32 @@ riscv_expand_prologue (void)
> }
> }
>+static rtx
>+riscv_adjust_multi_pop_cfi_epilogue (int saved_size)
>+{
>+ rtx dwarf = NULL_RTX;
>+ rtx adjust_sp_rtx, reg;
>+ unsigned int mask = cfun->machine->frame.mask;
>+
>+ if (mask & S10_MASK)
>+ mask |= S11_MASK;
>+
>+ /* Debug info for adjust sp. */
>+ adjust_sp_rtx = gen_rtx_SET (stack_pointer_rtx,
>+ plus_constant(Pmode, stack_pointer_rtx, saved_size));
>+ dwarf = alloc_reg_note (REG_CFA_ADJUST_CFA, adjust_sp_rtx,
>+ dwarf);
>+
>+ for (int regno = GP_REG_FIRST; regno <= GP_REG_LAST; regno++)
>+ if (BITSET_P (mask, regno - GP_REG_FIRST))
>+ {
>+ reg = gen_rtx_REG (SImode, regno);
>+ dwarf = alloc_reg_note (REG_CFA_RESTORE, reg, dwarf);
>+ }
>+
>+ return dwarf;
>+}
>+
> static rtx
> riscv_adjust_libcall_cfi_epilogue ()
> {
>@@ -5500,10 +5842,18 @@ riscv_expand_epilogue (int style)
> struct riscv_frame_info *frame = &cfun->machine->frame;
> unsigned mask = frame->mask;
> HOST_WIDE_INT step2 = 0;
>- bool use_restore_libcall = ((style == NORMAL_RETURN)
>- && riscv_use_save_libcall (frame));
>- unsigned libcall_size = (use_restore_libcall
>- ? frame->save_libcall_adjustment : 0);
>+ bool use_multi_pop_normal = ((style == NORMAL_RETURN)
>+ && riscv_use_multi_push (frame));
>+ bool use_multi_pop_sibcall = ((style == SIBCALL_RETURN)
>+ && riscv_use_multi_push (frame));
>+ bool use_multi_pop = use_multi_pop_normal || use_multi_pop_sibcall;
>+
>+ bool use_restore_libcall = !use_multi_pop && ((style == NORMAL_RETURN)
>+ && riscv_use_save_libcall (frame));
>+ unsigned libcall_size = use_restore_libcall ?
>+ frame->save_libcall_adjustment : 0;
>+ unsigned multipop_size = use_multi_pop ?
>+ frame->multi_push_adj_base + frame->multi_push_adj_addi : 0;
> rtx ra = gen_rtx_REG (Pmode, RETURN_ADDR_REGNUM);
> rtx insn;
>@@ -5574,18 +5924,25 @@ riscv_expand_epilogue (int style)
> REG_NOTES (insn) = dwarf;
> }
>- if (use_restore_libcall)
>- frame->mask = 0; /* Temporarily fib for GPRs. */
>+ if (use_restore_libcall || use_multi_pop)
>+ frame->mask = 0; /* Temporarily fib that we need not save GPRs. */
> /* If we need to restore registers, deallocate as much stack as
> possible in the second step without going out of range. */
>- if ((frame->mask | frame->fmask) != 0)
>+ if (use_multi_pop)
>+ {
>+ if (frame->fmask
>+ && known_gt (frame->total_size - multipop_size,
>+ frame->frame_pointer_offset))
>+ step2 = riscv_first_stack_step (frame, frame->total_size - multipop_size);
>+ }
>+ else if ((frame->mask | frame->fmask) != 0)
> step2 = riscv_first_stack_step (frame, frame->total_size - libcall_size);
>- if (use_restore_libcall)
>+ if (use_restore_libcall || use_multi_pop)
> frame->mask = mask; /* Undo the above fib. */
>- poly_int64 step1 = frame->total_size - step2 - libcall_size;
>+ poly_int64 step1 = frame->total_size - step2 - libcall_size - multipop_size;
> /* Set TARGET to BASE + STEP1. */
> if (known_gt (step1, 0))
>@@ -5620,7 +5977,7 @@ riscv_expand_epilogue (int style)
> adjust));
> rtx dwarf = NULL_RTX;
> rtx cfa_adjust_rtx = gen_rtx_PLUS (Pmode, stack_pointer_rtx,
>- GEN_INT (step2));
>+ GEN_INT (step2 + libcall_size + multipop_size));
> dwarf = alloc_reg_note (REG_CFA_DEF_CFA, cfa_adjust_rtx, dwarf);
> RTX_FRAME_RELATED_P (insn) = 1;
>@@ -5635,15 +5992,15 @@ riscv_expand_epilogue (int style)
> epilogue_cfa_sp_offset = step2;
> }
>- if (use_restore_libcall)
>+ if (use_restore_libcall || use_multi_pop)
> frame->mask = 0; /* Temporarily fib that we need not save GPRs. */
> /* Restore the registers. */
>- riscv_for_each_saved_reg (frame->total_size - step2 - libcall_size,
>+ riscv_for_each_saved_reg (frame->total_size - step2 - libcall_size - multipop_size,
> riscv_restore_reg,
> true, style == EXCEPTION_RETURN);
>- if (use_restore_libcall)
>+ if (use_restore_libcall || use_multi_pop)
> frame->mask = mask; /* Undo the above fib. */
> if (need_barrier_p)
>@@ -5657,14 +6014,30 @@ riscv_expand_epilogue (int style)
> rtx dwarf = NULL_RTX;
> rtx cfa_adjust_rtx = gen_rtx_PLUS (Pmode, stack_pointer_rtx,
>- const0_rtx);
>+ GEN_INT (libcall_size + multipop_size));
> dwarf = alloc_reg_note (REG_CFA_DEF_CFA, cfa_adjust_rtx, dwarf);
> RTX_FRAME_RELATED_P (insn) = 1;
> REG_NOTES (insn) = dwarf;
> }
>- if (use_restore_libcall)
>+ if (use_multi_pop)
>+ {
>+ unsigned regs_count = riscv_multi_push_regs_count (frame->mask);
>+ if (use_multi_pop_normal)
>+ insn = emit_jump_insn (
>+ riscv_gen_multi_push_pop_insn (POPRET_IDX, multipop_size, regs_count));
>+ else
>+ insn = emit_insn (
>+ riscv_gen_multi_push_pop_insn(POP_IDX, multipop_size, regs_count));
>+
>+ rtx dwarf = riscv_adjust_multi_pop_cfi_epilogue (multipop_size);
>+ RTX_FRAME_RELATED_P (insn) = 1;
>+ REG_NOTES (insn) = dwarf;
>+ if (use_multi_pop_normal)
>+ return;
>+ }
>+ else if (use_restore_libcall)
> {
> rtx dwarf = riscv_adjust_libcall_cfi_epilogue ();
> insn = emit_insn (gen_gpr_restore (GEN_INT (riscv_save_libcall_count (mask))));
>@@ -6937,6 +7310,30 @@ riscv_gen_gpr_save_insn (struct riscv_frame_info *frame)
> return gen_rtx_PARALLEL (VOIDmode, vec);
> }
>+static HOST_WIDE_INT zcmp_base_adj(int regs_num)
>+{
>+ return riscv_16bytes_align ((regs_num) * GET_MODE_SIZE (word_mode));
>+}
>+
>+static HOST_WIDE_INT zcmp_additional_adj(HOST_WIDE_INT total, int regs_num)
>+{
>+ return total - zcmp_base_adj(regs_num);
>+}
>+
>+bool riscv_zcmp_valid_slot_offset_p (HOST_WIDE_INT offset, int slot_idx)
>+{
>+ return offset == -1 * (slot_idx + 1) * GET_MODE_SIZE (word_mode);
>+}
>+
>+bool riscv_zcmp_valid_stack_adj_bytes_p (HOST_WIDE_INT total, int regs_num)
>+{
>+ HOST_WIDE_INT additional_bytes = zcmp_additional_adj (total, regs_num);
>+ return additional_bytes == 0
>+ || additional_bytes == 1 * ZCMP_SP_INC_STEP
>+ || additional_bytes == 2 * ZCMP_SP_INC_STEP
>+ || additional_bytes == ZCMP_MAX_SPIMM * ZCMP_SP_INC_STEP;
>+}
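A worked model of the stack-adjustment check used by the predicates may help when reviewing them. It is a sketch with plain integers and an explicit word size parameter instead of GET_MODE_SIZE (word_mode); the function names below are hypothetical stand-ins for the helpers in the patch.

```cpp
#include <cassert>

// Illustrative model only; not GCC code.
const long kStep = 16;     // mirrors ZCMP_SP_INC_STEP
const long kMaxSpimm = 3;  // mirrors ZCMP_MAX_SPIMM

// Round a non-negative byte count up to the next multiple of 16.
long round_up_16(long value) { return (value + 15) & ~15L; }

// Smallest adjustment that covers regs_num registers of word_bytes each.
long base_adj(int regs_num, int word_bytes) {
  return round_up_16(static_cast<long>(regs_num) * word_bytes);
}

// An adjustment is encodable if it is the base plus 0, 16, 32 or 48 bytes.
bool valid_stack_adj(long total, int regs_num, int word_bytes) {
  long extra = total - base_adj(regs_num, word_bytes);
  return extra >= 0 && extra <= kMaxSpimm * kStep && extra % kStep == 0;
}
```

For example, {ra, s0-s2} on RV32 has a 16-byte base, so 16, 32, 48 and 64 are accepted and 80 is rejected; {ra, s0-s11} on RV64 has a 112-byte base, so the largest accepted adjustment is 160.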
>+
> /* Return true if it's valid gpr_save pattern. */
> bool
>diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
>index 13038a39e5c..ff210083004 100644
>--- a/gcc/config/riscv/riscv.h
>+++ b/gcc/config/riscv/riscv.h
>@@ -413,6 +413,29 @@ ASM_MISA_SPEC
> #define RISCV_CALL_ADDRESS_TEMP(MODE) \
> gen_rtx_REG (MODE, RISCV_CALL_ADDRESS_TEMP_REGNUM)
>+#define RETURN_ADDR_MASK ( 1 << RETURN_ADDR_REGNUM)
>+#define S0_MASK ( 1 << S0_REGNUM)
>+#define S1_MASK ( 1 << S1_REGNUM)
>+#define S2_MASK ( 1 << S2_REGNUM)
>+#define S3_MASK ( 1 << S3_REGNUM)
>+#define S4_MASK ( 1 << S4_REGNUM)
>+#define S5_MASK ( 1 << S5_REGNUM)
>+#define S6_MASK ( 1 << S6_REGNUM)
>+#define S7_MASK ( 1 << S7_REGNUM)
>+#define S8_MASK ( 1 << S8_REGNUM)
>+#define S9_MASK ( 1 << S9_REGNUM)
>+#define S10_MASK ( 1 << S10_REGNUM)
>+#define S11_MASK ( 1 << S11_REGNUM)
>+
>+#define MULTI_PUSH_GPR_MASK ( RETURN_ADDR_MASK | S0_MASK | S1_MASK | S2_MASK | S3_MASK \
>+ | S4_MASK | S5_MASK | S6_MASK | S7_MASK \
>+ | S8_MASK | S9_MASK | S10_MASK | S11_MASK )
>+#define ZCMP_MAX_SPIMM 3
>+#define ZCMP_SP_INC_STEP 16
>+#define ZCMP_INVALID_S0S10_SREGS_COUNTS 11
>+#define ZCMP_S0S11_SREGS_COUNTS 12
>+#define ZCMP_MAX_GRP_SLOTS 13
>+
> #define MCOUNT_NAME "_mcount"
> #define NO_PROFILE_COUNTERS 1
>diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
>index 7065e68c0b7..73fc8cb69bc 100644
>--- a/gcc/config/riscv/riscv.md
>+++ b/gcc/config/riscv/riscv.md
>@@ -113,6 +113,7 @@
> (define_constants
> [(RETURN_ADDR_REGNUM 1)
>+ (SP_REGNUM 2)
> (GP_REGNUM 3)
> (TP_REGNUM 4)
> (T0_REGNUM 5)
>@@ -3205,3 +3206,4 @@
> (include "sifive-7.md")
> (include "thead.md")
> (include "vector.md")
>+(include "zc.md")
>diff --git a/gcc/config/riscv/zc.md b/gcc/config/riscv/zc.md
>new file mode 100644
>index 00000000000..6e6c87983fb
>--- /dev/null
>+++ b/gcc/config/riscv/zc.md
>@@ -0,0 +1,1042 @@
>+;; Machine description for RISC-V Zc extension.
>+;; Copyright (C) 2011-2023 Free Software Foundation, Inc.
>+;; Contributed by Fei Gao (gaofei@eswincomputing.com).
>+
>+;; This file is part of GCC.
>+
>+;; GCC is free software; you can redistribute it and/or modify
>+;; it under the terms of the GNU General Public License as published by
>+;; the Free Software Foundation; either version 3, or (at your option)
>+;; any later version.
>+
>+;; GCC is distributed in the hope that it will be useful,
>+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
>+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
>+;; GNU General Public License for more details.
>+
>+;; You should have received a copy of the GNU General Public License
>+;; along with GCC; see the file COPYING3. If not see
>+;; <http://www.gnu.org/licenses/>.
>+
>+(define_insn "gpr_multi_pop_up_to_ra_<mode>"
>+ [(set (reg:X SP_REGNUM)
>+ (plus:X (reg:X SP_REGNUM)
>+ (match_operand 0 "stack_pop_up_to_ra_operand" "I")))
>+ (set (reg:X RETURN_ADDR_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 1 "slot_0_offset_operand" "I"))))]
>+ "TARGET_ZCMP"
>+ "cm.pop {ra}, %0"
>+)
>+
>+(define_insn "gpr_multi_pop_up_to_s0_<mode>"
>+ [(set (reg:X SP_REGNUM)
>+ (plus:X (reg:X SP_REGNUM)
>+ (match_operand 0 "stack_pop_up_to_s0_operand" "I")))
>+ (set (reg:X S0_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 1 "slot_0_offset_operand" "I"))))
>+ (set (reg:X RETURN_ADDR_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 2 "slot_1_offset_operand" "I"))))]
>+ "TARGET_ZCMP"
>+ "cm.pop {ra, s0}, %0"
>+)
>+
>+(define_insn "gpr_multi_pop_up_to_s1_<mode>"
>+ [(set (reg:X SP_REGNUM)
>+ (plus:X (reg:X SP_REGNUM)
>+ (match_operand 0 "stack_pop_up_to_s1_operand" "I")))
>+ (set (reg:X S1_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 1 "slot_0_offset_operand" "I"))))
>+ (set (reg:X S0_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 2 "slot_1_offset_operand" "I"))))
>+ (set (reg:X RETURN_ADDR_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 3 "slot_2_offset_operand" "I"))))]
>+ "TARGET_ZCMP"
>+ "cm.pop {ra, s0-s1}, %0"
>+)
>+
>+(define_insn "gpr_multi_pop_up_to_s2_<mode>"
>+ [(set (reg:X SP_REGNUM)
>+ (plus:X (reg:X SP_REGNUM)
>+ (match_operand 0 "stack_pop_up_to_s2_operand" "I")))
>+ (set (reg:X S2_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 1 "slot_0_offset_operand" "I"))))
>+ (set (reg:X S1_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 2 "slot_1_offset_operand" "I"))))
>+ (set (reg:X S0_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 3 "slot_2_offset_operand" "I"))))
>+ (set (reg:X RETURN_ADDR_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 4 "slot_3_offset_operand" "I"))))]
>+ "TARGET_ZCMP"
>+ "cm.pop {ra, s0-s2}, %0"
>+)
>+
>+(define_insn "gpr_multi_pop_up_to_s3_<mode>"
>+ [(set (reg:X SP_REGNUM)
>+ (plus:X (reg:X SP_REGNUM)
>+ (match_operand 0 "stack_pop_up_to_s3_operand" "I")))
>+ (set (reg:X S3_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 1 "slot_0_offset_operand" "I"))))
>+ (set (reg:X S2_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 2 "slot_1_offset_operand" "I"))))
>+ (set (reg:X S1_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 3 "slot_2_offset_operand" "I"))))
>+ (set (reg:X S0_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 4 "slot_3_offset_operand" "I"))))
>+ (set (reg:X RETURN_ADDR_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 5 "slot_4_offset_operand" "I"))))]
>+ "TARGET_ZCMP"
>+ "cm.pop {ra, s0-s3}, %0"
>+)
>+
>+(define_insn "gpr_multi_pop_up_to_s4_<mode>"
>+ [(set (reg:X SP_REGNUM)
>+ (plus:X (reg:X SP_REGNUM)
>+ (match_operand 0 "stack_pop_up_to_s4_operand" "I")))
>+ (set (reg:X S4_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 1 "slot_0_offset_operand" "I"))))
>+ (set (reg:X S3_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 2 "slot_1_offset_operand" "I"))))
>+ (set (reg:X S2_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 3 "slot_2_offset_operand" "I"))))
>+ (set (reg:X S1_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 4 "slot_3_offset_operand" "I"))))
>+ (set (reg:X S0_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 5 "slot_4_offset_operand" "I"))))
>+ (set (reg:X RETURN_ADDR_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 6 "slot_5_offset_operand" "I"))))]
>+ "TARGET_ZCMP"
>+ "cm.pop {ra, s0-s4}, %0"
>+)
>+
>+(define_insn "gpr_multi_pop_up_to_s5_<mode>"
>+ [(set (reg:X SP_REGNUM)
>+ (plus:X (reg:X SP_REGNUM)
>+ (match_operand 0 "stack_pop_up_to_s5_operand" "I")))
>+ (set (reg:X S5_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 1 "slot_0_offset_operand" "I"))))
>+ (set (reg:X S4_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 2 "slot_1_offset_operand" "I"))))
>+ (set (reg:X S3_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 3 "slot_2_offset_operand" "I"))))
>+ (set (reg:X S2_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 4 "slot_3_offset_operand" "I"))))
>+ (set (reg:X S1_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 5 "slot_4_offset_operand" "I"))))
>+ (set (reg:X S0_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 6 "slot_5_offset_operand" "I"))))
>+ (set (reg:X RETURN_ADDR_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 7 "slot_6_offset_operand" "I"))))]
>+ "TARGET_ZCMP"
>+ "cm.pop {ra, s0-s5}, %0"
>+)
>+
>+(define_insn "gpr_multi_pop_up_to_s6_<mode>"
>+ [(set (reg:X SP_REGNUM)
>+ (plus:X (reg:X SP_REGNUM)
>+ (match_operand 0 "stack_pop_up_to_s6_operand" "I")))
>+ (set (reg:X S6_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 1 "slot_0_offset_operand" "I"))))
>+ (set (reg:X S5_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 2 "slot_1_offset_operand" "I"))))
>+ (set (reg:X S4_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 3 "slot_2_offset_operand" "I"))))
>+ (set (reg:X S3_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 4 "slot_3_offset_operand" "I"))))
>+ (set (reg:X S2_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 5 "slot_4_offset_operand" "I"))))
>+ (set (reg:X S1_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 6 "slot_5_offset_operand" "I"))))
>+ (set (reg:X S0_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 7 "slot_6_offset_operand" "I"))))
>+ (set (reg:X RETURN_ADDR_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 8 "slot_7_offset_operand" "I"))))]
>+ "TARGET_ZCMP"
>+ "cm.pop {ra, s0-s6}, %0"
>+)
>+
>+(define_insn "gpr_multi_pop_up_to_s7_<mode>"
>+ [(set (reg:X SP_REGNUM)
>+ (plus:X (reg:X SP_REGNUM)
>+ (match_operand 0 "stack_pop_up_to_s7_operand" "I")))
>+ (set (reg:X S7_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 1 "slot_0_offset_operand" "I"))))
>+ (set (reg:X S6_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 2 "slot_1_offset_operand" "I"))))
>+ (set (reg:X S5_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 3 "slot_2_offset_operand" "I"))))
>+ (set (reg:X S4_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 4 "slot_3_offset_operand" "I"))))
>+ (set (reg:X S3_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 5 "slot_4_offset_operand" "I"))))
>+ (set (reg:X S2_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 6 "slot_5_offset_operand" "I"))))
>+ (set (reg:X S1_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 7 "slot_6_offset_operand" "I"))))
>+ (set (reg:X S0_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 8 "slot_7_offset_operand" "I"))))
>+ (set (reg:X RETURN_ADDR_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 9 "slot_8_offset_operand" "I"))))]
>+ "TARGET_ZCMP"
>+ "cm.pop {ra, s0-s7}, %0"
>+)
>+
>+(define_insn "gpr_multi_pop_up_to_s8_<mode>"
>+ [(set (reg:X SP_REGNUM)
>+ (plus:X (reg:X SP_REGNUM)
>+ (match_operand 0 "stack_pop_up_to_s8_operand" "I")))
>+ (set (reg:X S8_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 1 "slot_0_offset_operand" "I"))))
>+ (set (reg:X S7_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 2 "slot_1_offset_operand" "I"))))
>+ (set (reg:X S6_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 3 "slot_2_offset_operand" "I"))))
>+ (set (reg:X S5_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 4 "slot_3_offset_operand" "I"))))
>+ (set (reg:X S4_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 5 "slot_4_offset_operand" "I"))))
>+ (set (reg:X S3_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 6 "slot_5_offset_operand" "I"))))
>+ (set (reg:X S2_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 7 "slot_6_offset_operand" "I"))))
>+ (set (reg:X S1_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 8 "slot_7_offset_operand" "I"))))
>+ (set (reg:X S0_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 9 "slot_8_offset_operand" "I"))))
>+ (set (reg:X RETURN_ADDR_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 10 "slot_9_offset_operand" "I"))))]
>+ "TARGET_ZCMP"
>+ "cm.pop {ra, s0-s8}, %0"
>+)
>+
>+(define_insn "gpr_multi_pop_up_to_s9_<mode>"
>+ [(set (reg:X SP_REGNUM)
>+ (plus:X (reg:X SP_REGNUM)
>+ (match_operand 0 "stack_pop_up_to_s9_operand" "I")))
>+ (set (reg:X S9_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 1 "slot_0_offset_operand" "I"))))
>+ (set (reg:X S8_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 2 "slot_1_offset_operand" "I"))))
>+ (set (reg:X S7_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 3 "slot_2_offset_operand" "I"))))
>+ (set (reg:X S6_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 4 "slot_3_offset_operand" "I"))))
>+ (set (reg:X S5_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 5 "slot_4_offset_operand" "I"))))
>+ (set (reg:X S4_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 6 "slot_5_offset_operand" "I"))))
>+ (set (reg:X S3_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 7 "slot_6_offset_operand" "I"))))
>+ (set (reg:X S2_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 8 "slot_7_offset_operand" "I"))))
>+ (set (reg:X S1_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 9 "slot_8_offset_operand" "I"))))
>+ (set (reg:X S0_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 10 "slot_9_offset_operand" "I"))))
>+ (set (reg:X RETURN_ADDR_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 11 "slot_10_offset_operand" "I"))))]
>+ "TARGET_ZCMP"
>+ "cm.pop {ra, s0-s9}, %0"
>+)
>+
>+(define_insn "gpr_multi_pop_up_to_s11_<mode>"
>+ [(set (reg:X SP_REGNUM)
>+ (plus:X (reg:X SP_REGNUM)
>+ (match_operand 0 "stack_pop_up_to_s11_operand" "I")))
>+ (set (reg:X S11_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 1 "slot_0_offset_operand" "I"))))
>+ (set (reg:X S10_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 2 "slot_1_offset_operand" "I"))))
>+ (set (reg:X S9_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 3 "slot_2_offset_operand" "I"))))
>+ (set (reg:X S8_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 4 "slot_3_offset_operand" "I"))))
>+ (set (reg:X S7_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 5 "slot_4_offset_operand" "I"))))
>+ (set (reg:X S6_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 6 "slot_5_offset_operand" "I"))))
>+ (set (reg:X S5_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 7 "slot_6_offset_operand" "I"))))
>+ (set (reg:X S4_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 8 "slot_7_offset_operand" "I"))))
>+ (set (reg:X S3_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 9 "slot_8_offset_operand" "I"))))
>+ (set (reg:X S2_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 10 "slot_9_offset_operand" "I"))))
>+ (set (reg:X S1_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 11 "slot_10_offset_operand" "I"))))
>+ (set (reg:X S0_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 12 "slot_11_offset_operand" "I"))))
>+ (set (reg:X RETURN_ADDR_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 13 "slot_12_offset_operand" "I"))))]
>+ "TARGET_ZCMP"
>+ "cm.pop {ra, s0-s11}, %0"
>+)
>+
>+(define_insn "gpr_multi_popret_up_to_ra_<mode>"
>+ [(set (reg:X SP_REGNUM)
>+ (plus:X (reg:X SP_REGNUM)
>+ (match_operand 0 "stack_pop_up_to_ra_operand" "I")))
>+ (set (reg:X RETURN_ADDR_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 1 "slot_0_offset_operand" "I"))))
>+ (return)
>+ (use (reg:SI RETURN_ADDR_REGNUM))]
>+ "TARGET_ZCMP"
>+ "cm.popret {ra}, %0"
>+)
>+
>+(define_insn "gpr_multi_popret_up_to_s0_<mode>"
>+ [(set (reg:X SP_REGNUM)
>+ (plus:X (reg:X SP_REGNUM)
>+ (match_operand 0 "stack_pop_up_to_s0_operand" "I")))
>+ (set (reg:X S0_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 1 "slot_0_offset_operand" "I"))))
>+ (set (reg:X RETURN_ADDR_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 2 "slot_1_offset_operand" "I"))))
>+ (return)
>+ (use (reg:SI RETURN_ADDR_REGNUM))]
>+ "TARGET_ZCMP"
>+ "cm.popret {ra, s0}, %0"
>+)
>+
>+(define_insn "gpr_multi_popret_up_to_s1_<mode>"
>+ [(set (reg:X SP_REGNUM)
>+ (plus:X (reg:X SP_REGNUM)
>+ (match_operand 0 "stack_pop_up_to_s1_operand" "I")))
>+ (set (reg:X S1_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 1 "slot_0_offset_operand" "I"))))
>+ (set (reg:X S0_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 2 "slot_1_offset_operand" "I"))))
>+ (set (reg:X RETURN_ADDR_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 3 "slot_2_offset_operand" "I"))))
>+ (return)
>+ (use (reg:SI RETURN_ADDR_REGNUM))]
>+ "TARGET_ZCMP"
>+ "cm.popret {ra, s0-s1}, %0"
>+)
>+
>+(define_insn "gpr_multi_popret_up_to_s2_<mode>"
>+ [(set (reg:X SP_REGNUM)
>+ (plus:X (reg:X SP_REGNUM)
>+ (match_operand 0 "stack_pop_up_to_s2_operand" "I")))
>+ (set (reg:X S2_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 1 "slot_0_offset_operand" "I"))))
>+ (set (reg:X S1_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 2 "slot_1_offset_operand" "I"))))
>+ (set (reg:X S0_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 3 "slot_2_offset_operand" "I"))))
>+ (set (reg:X RETURN_ADDR_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 4 "slot_3_offset_operand" "I"))))
>+ (return)
>+ (use (reg:SI RETURN_ADDR_REGNUM))]
>+ "TARGET_ZCMP"
>+ "cm.popret {ra, s0-s2}, %0"
>+)
>+
>+(define_insn "gpr_multi_popret_up_to_s3_<mode>"
>+ [(set (reg:X SP_REGNUM)
>+ (plus:X (reg:X SP_REGNUM)
>+ (match_operand 0 "stack_pop_up_to_s3_operand" "I")))
>+ (set (reg:X S3_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 1 "slot_0_offset_operand" "I"))))
>+ (set (reg:X S2_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 2 "slot_1_offset_operand" "I"))))
>+ (set (reg:X S1_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 3 "slot_2_offset_operand" "I"))))
>+ (set (reg:X S0_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 4 "slot_3_offset_operand" "I"))))
>+ (set (reg:X RETURN_ADDR_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 5 "slot_4_offset_operand" "I"))))
>+ (return)
>+ (use (reg:SI RETURN_ADDR_REGNUM))]
>+ "TARGET_ZCMP"
>+ "cm.popret {ra, s0-s3}, %0"
>+)
>+
>+(define_insn "gpr_multi_popret_up_to_s4_<mode>"
>+ [(set (reg:X SP_REGNUM)
>+ (plus:X (reg:X SP_REGNUM)
>+ (match_operand 0 "stack_pop_up_to_s4_operand" "I")))
>+ (set (reg:X S4_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 1 "slot_0_offset_operand" "I"))))
>+ (set (reg:X S3_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 2 "slot_1_offset_operand" "I"))))
>+ (set (reg:X S2_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 3 "slot_2_offset_operand" "I"))))
>+ (set (reg:X S1_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 4 "slot_3_offset_operand" "I"))))
>+ (set (reg:X S0_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 5 "slot_4_offset_operand" "I"))))
>+ (set (reg:X RETURN_ADDR_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 6 "slot_5_offset_operand" "I"))))
>+ (return)
>+ (use (reg:SI RETURN_ADDR_REGNUM))]
>+ "TARGET_ZCMP"
>+ "cm.popret {ra, s0-s4}, %0"
>+)
>+
>+(define_insn "gpr_multi_popret_up_to_s5_<mode>"
>+ [(set (reg:X SP_REGNUM)
>+ (plus:X (reg:X SP_REGNUM)
>+ (match_operand 0 "stack_pop_up_to_s5_operand" "I")))
>+ (set (reg:X S5_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 1 "slot_0_offset_operand" "I"))))
>+ (set (reg:X S4_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 2 "slot_1_offset_operand" "I"))))
>+ (set (reg:X S3_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 3 "slot_2_offset_operand" "I"))))
>+ (set (reg:X S2_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 4 "slot_3_offset_operand" "I"))))
>+ (set (reg:X S1_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 5 "slot_4_offset_operand" "I"))))
>+ (set (reg:X S0_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 6 "slot_5_offset_operand" "I"))))
>+ (set (reg:X RETURN_ADDR_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 7 "slot_6_offset_operand" "I"))))
>+ (return)
>+ (use (reg:SI RETURN_ADDR_REGNUM))]
>+ "TARGET_ZCMP"
>+ "cm.popret {ra, s0-s5}, %0"
>+)
>+
>+(define_insn "gpr_multi_popret_up_to_s6_<mode>"
>+ [(set (reg:X SP_REGNUM)
>+ (plus:X (reg:X SP_REGNUM)
>+ (match_operand 0 "stack_pop_up_to_s6_operand" "I")))
>+ (set (reg:X S6_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 1 "slot_0_offset_operand" "I"))))
>+ (set (reg:X S5_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 2 "slot_1_offset_operand" "I"))))
>+ (set (reg:X S4_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 3 "slot_2_offset_operand" "I"))))
>+ (set (reg:X S3_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 4 "slot_3_offset_operand" "I"))))
>+ (set (reg:X S2_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 5 "slot_4_offset_operand" "I"))))
>+ (set (reg:X S1_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 6 "slot_5_offset_operand" "I"))))
>+ (set (reg:X S0_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 7 "slot_6_offset_operand" "I"))))
>+ (set (reg:X RETURN_ADDR_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 8 "slot_7_offset_operand" "I"))))
>+ (return)
>+ (use (reg:SI RETURN_ADDR_REGNUM))]
>+ "TARGET_ZCMP"
>+ "cm.popret {ra, s0-s6}, %0"
>+)
>+
>+(define_insn "gpr_multi_popret_up_to_s7_<mode>"
>+ [(set (reg:X SP_REGNUM)
>+ (plus:X (reg:X SP_REGNUM)
>+ (match_operand 0 "stack_pop_up_to_s7_operand" "I")))
>+ (set (reg:X S7_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 1 "slot_0_offset_operand" "I"))))
>+ (set (reg:X S6_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 2 "slot_1_offset_operand" "I"))))
>+ (set (reg:X S5_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 3 "slot_2_offset_operand" "I"))))
>+ (set (reg:X S4_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 4 "slot_3_offset_operand" "I"))))
>+ (set (reg:X S3_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 5 "slot_4_offset_operand" "I"))))
>+ (set (reg:X S2_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 6 "slot_5_offset_operand" "I"))))
>+ (set (reg:X S1_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 7 "slot_6_offset_operand" "I"))))
>+ (set (reg:X S0_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 8 "slot_7_offset_operand" "I"))))
>+ (set (reg:X RETURN_ADDR_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 9 "slot_8_offset_operand" "I"))))
>+ (return)
>+ (use (reg:SI RETURN_ADDR_REGNUM))]
>+ "TARGET_ZCMP"
>+ "cm.popret {ra, s0-s7}, %0"
>+)
>+
>+(define_insn "gpr_multi_popret_up_to_s8_<mode>"
>+ [(set (reg:X SP_REGNUM)
>+ (plus:X (reg:X SP_REGNUM)
>+ (match_operand 0 "stack_pop_up_to_s8_operand" "I")))
>+ (set (reg:X S8_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 1 "slot_0_offset_operand" "I"))))
>+ (set (reg:X S7_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 2 "slot_1_offset_operand" "I"))))
>+ (set (reg:X S6_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 3 "slot_2_offset_operand" "I"))))
>+ (set (reg:X S5_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 4 "slot_3_offset_operand" "I"))))
>+ (set (reg:X S4_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 5 "slot_4_offset_operand" "I"))))
>+ (set (reg:X S3_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 6 "slot_5_offset_operand" "I"))))
>+ (set (reg:X S2_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 7 "slot_6_offset_operand" "I"))))
>+ (set (reg:X S1_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 8 "slot_7_offset_operand" "I"))))
>+ (set (reg:X S0_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 9 "slot_8_offset_operand" "I"))))
>+ (set (reg:X RETURN_ADDR_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 10 "slot_9_offset_operand" "I"))))
>+ (return)
>+ (use (reg:SI RETURN_ADDR_REGNUM))]
>+ "TARGET_ZCMP"
>+ "cm.popret {ra, s0-s8}, %0"
>+)
>+
>+(define_insn "gpr_multi_popret_up_to_s9_<mode>"
>+ [(set (reg:X SP_REGNUM)
>+ (plus:X (reg:X SP_REGNUM)
>+ (match_operand 0 "stack_pop_up_to_s9_operand" "I")))
>+ (set (reg:X S9_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 1 "slot_0_offset_operand" "I"))))
>+ (set (reg:X S8_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 2 "slot_1_offset_operand" "I"))))
>+ (set (reg:X S7_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 3 "slot_2_offset_operand" "I"))))
>+ (set (reg:X S6_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 4 "slot_3_offset_operand" "I"))))
>+ (set (reg:X S5_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 5 "slot_4_offset_operand" "I"))))
>+ (set (reg:X S4_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 6 "slot_5_offset_operand" "I"))))
>+ (set (reg:X S3_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 7 "slot_6_offset_operand" "I"))))
>+ (set (reg:X S2_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 8 "slot_7_offset_operand" "I"))))
>+ (set (reg:X S1_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 9 "slot_8_offset_operand" "I"))))
>+ (set (reg:X S0_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 10 "slot_9_offset_operand" "I"))))
>+ (set (reg:X RETURN_ADDR_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 11 "slot_10_offset_operand" "I"))))
>+ (return)
>+ (use (reg:SI RETURN_ADDR_REGNUM))]
>+ "TARGET_ZCMP"
>+ "cm.popret {ra, s0-s9}, %0"
>+)
>+
>+(define_insn "gpr_multi_popret_up_to_s11_<mode>"
>+ [(set (reg:X SP_REGNUM)
>+ (plus:X (reg:X SP_REGNUM)
>+ (match_operand 0 "stack_pop_up_to_s11_operand" "I")))
>+ (set (reg:X S11_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 1 "slot_0_offset_operand" "I"))))
>+ (set (reg:X S10_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 2 "slot_1_offset_operand" "I"))))
>+ (set (reg:X S9_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 3 "slot_2_offset_operand" "I"))))
>+ (set (reg:X S8_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 4 "slot_3_offset_operand" "I"))))
>+ (set (reg:X S7_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 5 "slot_4_offset_operand" "I"))))
>+ (set (reg:X S6_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 6 "slot_5_offset_operand" "I"))))
>+ (set (reg:X S5_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 7 "slot_6_offset_operand" "I"))))
>+ (set (reg:X S4_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 8 "slot_7_offset_operand" "I"))))
>+ (set (reg:X S3_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 9 "slot_8_offset_operand" "I"))))
>+ (set (reg:X S2_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 10 "slot_9_offset_operand" "I"))))
>+ (set (reg:X S1_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 11 "slot_10_offset_operand" "I"))))
>+ (set (reg:X S0_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 12 "slot_11_offset_operand" "I"))))
>+ (set (reg:X RETURN_ADDR_REGNUM)
>+ (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 13 "slot_12_offset_operand" "I"))))
>+ (return)
>+ (use (reg:SI RETURN_ADDR_REGNUM))]
>+ "TARGET_ZCMP"
>+ "cm.popret {ra, s0-s11}, %0"
>+)
>+
>+(define_insn "gpr_multi_push_up_to_ra_<mode>"
>+ [(set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 1 "slot_0_offset_operand" "I")))
>+ (reg:X RETURN_ADDR_REGNUM))
>+ (set (reg:X SP_REGNUM)
>+ (plus:X (reg:X SP_REGNUM)
>+ (match_operand 0 "stack_push_up_to_ra_operand" "I")))]
>+ "TARGET_ZCMP"
>+ "cm.push {ra}, %0"
>+)
>+
>+(define_insn "gpr_multi_push_up_to_s0_<mode>"
>+ [(set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 1 "slot_0_offset_operand" "I")))
>+ (reg:X S0_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 2 "slot_1_offset_operand" "I")))
>+ (reg:X RETURN_ADDR_REGNUM))
>+ (set (reg:X SP_REGNUM)
>+ (plus:X (reg:X SP_REGNUM)
>+ (match_operand 0 "stack_push_up_to_s0_operand" "I")))]
>+ "TARGET_ZCMP"
>+ "cm.push {ra, s0}, %0"
>+)
>+
>+(define_insn "gpr_multi_push_up_to_s1_<mode>"
>+ [(set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 1 "slot_0_offset_operand" "I")))
>+ (reg:X S1_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 2 "slot_1_offset_operand" "I")))
>+ (reg:X S0_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 3 "slot_2_offset_operand" "I")))
>+ (reg:X RETURN_ADDR_REGNUM))
>+ (set (reg:X SP_REGNUM)
>+ (plus:X (reg:X SP_REGNUM)
>+ (match_operand 0 "stack_push_up_to_s1_operand" "I")))]
>+ "TARGET_ZCMP"
>+ "cm.push {ra, s0-s1}, %0"
>+)
>+
>+(define_insn "gpr_multi_push_up_to_s2_<mode>"
>+ [(set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 1 "slot_0_offset_operand" "I")))
>+ (reg:X S2_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 2 "slot_1_offset_operand" "I")))
>+ (reg:X S1_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 3 "slot_2_offset_operand" "I")))
>+ (reg:X S0_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 4 "slot_3_offset_operand" "I")))
>+ (reg:X RETURN_ADDR_REGNUM))
>+ (set (reg:X SP_REGNUM)
>+ (plus:X (reg:X SP_REGNUM)
>+ (match_operand 0 "stack_push_up_to_s2_operand" "I")))]
>+ "TARGET_ZCMP"
>+ "cm.push {ra, s0-s2}, %0"
>+)
>+
>+(define_insn "gpr_multi_push_up_to_s3_<mode>"
>+ [(set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 1 "slot_0_offset_operand" "I")))
>+ (reg:X S3_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 2 "slot_1_offset_operand" "I")))
>+ (reg:X S2_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 3 "slot_2_offset_operand" "I")))
>+ (reg:X S1_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 4 "slot_3_offset_operand" "I")))
>+ (reg:X S0_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 5 "slot_4_offset_operand" "I")))
>+ (reg:X RETURN_ADDR_REGNUM))
>+ (set (reg:X SP_REGNUM)
>+ (plus:X (reg:X SP_REGNUM)
>+ (match_operand 0 "stack_push_up_to_s3_operand" "I")))]
>+ "TARGET_ZCMP"
>+ "cm.push {ra, s0-s3}, %0"
>+)
>+
>+(define_insn "gpr_multi_push_up_to_s4_<mode>"
>+ [(set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 1 "slot_0_offset_operand" "I")))
>+ (reg:X S4_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 2 "slot_1_offset_operand" "I")))
>+ (reg:X S3_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 3 "slot_2_offset_operand" "I")))
>+ (reg:X S2_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 4 "slot_3_offset_operand" "I")))
>+ (reg:X S1_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 5 "slot_4_offset_operand" "I")))
>+ (reg:X S0_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 6 "slot_5_offset_operand" "I")))
>+ (reg:X RETURN_ADDR_REGNUM))
>+ (set (reg:X SP_REGNUM)
>+ (plus:X (reg:X SP_REGNUM)
>+ (match_operand 0 "stack_push_up_to_s4_operand" "I")))]
>+ "TARGET_ZCMP"
>+ "cm.push {ra, s0-s4}, %0"
>+)
>+
>+(define_insn "gpr_multi_push_up_to_s5_<mode>"
>+ [(set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 1 "slot_0_offset_operand" "I")))
>+ (reg:X S5_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 2 "slot_1_offset_operand" "I")))
>+ (reg:X S4_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 3 "slot_2_offset_operand" "I")))
>+ (reg:X S3_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 4 "slot_3_offset_operand" "I")))
>+ (reg:X S2_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 5 "slot_4_offset_operand" "I")))
>+ (reg:X S1_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 6 "slot_5_offset_operand" "I")))
>+ (reg:X S0_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 7 "slot_6_offset_operand" "I")))
>+ (reg:X RETURN_ADDR_REGNUM))
>+ (set (reg:X SP_REGNUM)
>+ (plus:X (reg:X SP_REGNUM)
>+ (match_operand 0 "stack_push_up_to_s5_operand" "I")))]
>+ "TARGET_ZCMP"
>+ "cm.push {ra, s0-s5}, %0"
>+)
>+
>+(define_insn "gpr_multi_push_up_to_s6_<mode>"
>+ [(set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 1 "slot_0_offset_operand" "I")))
>+ (reg:X S6_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 2 "slot_1_offset_operand" "I")))
>+ (reg:X S5_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 3 "slot_2_offset_operand" "I")))
>+ (reg:X S4_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 4 "slot_3_offset_operand" "I")))
>+ (reg:X S3_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 5 "slot_4_offset_operand" "I")))
>+ (reg:X S2_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 6 "slot_5_offset_operand" "I")))
>+ (reg:X S1_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 7 "slot_6_offset_operand" "I")))
>+ (reg:X S0_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 8 "slot_7_offset_operand" "I")))
>+ (reg:X RETURN_ADDR_REGNUM))
>+ (set (reg:X SP_REGNUM)
>+ (plus:X (reg:X SP_REGNUM)
>+ (match_operand 0 "stack_push_up_to_s6_operand" "I")))]
>+ "TARGET_ZCMP"
>+ "cm.push {ra, s0-s6}, %0"
>+)
>+
>+(define_insn "gpr_multi_push_up_to_s7_<mode>"
>+ [(set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 1 "slot_0_offset_operand" "I")))
>+ (reg:X S7_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 2 "slot_1_offset_operand" "I")))
>+ (reg:X S6_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 3 "slot_2_offset_operand" "I")))
>+ (reg:X S5_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 4 "slot_3_offset_operand" "I")))
>+ (reg:X S4_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 5 "slot_4_offset_operand" "I")))
>+ (reg:X S3_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 6 "slot_5_offset_operand" "I")))
>+ (reg:X S2_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 7 "slot_6_offset_operand" "I")))
>+ (reg:X S1_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 8 "slot_7_offset_operand" "I")))
>+ (reg:X S0_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 9 "slot_8_offset_operand" "I")))
>+ (reg:X RETURN_ADDR_REGNUM))
>+ (set (reg:X SP_REGNUM)
>+ (plus:X (reg:X SP_REGNUM)
>+ (match_operand 0 "stack_push_up_to_s7_operand" "I")))]
>+ "TARGET_ZCMP"
>+ "cm.push {ra, s0-s7}, %0"
>+)
>+
>+(define_insn "gpr_multi_push_up_to_s8_<mode>"
>+ [(set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 1 "slot_0_offset_operand" "I")))
>+ (reg:X S8_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 2 "slot_1_offset_operand" "I")))
>+ (reg:X S7_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 3 "slot_2_offset_operand" "I")))
>+ (reg:X S6_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 4 "slot_3_offset_operand" "I")))
>+ (reg:X S5_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 5 "slot_4_offset_operand" "I")))
>+ (reg:X S4_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 6 "slot_5_offset_operand" "I")))
>+ (reg:X S3_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 7 "slot_6_offset_operand" "I")))
>+ (reg:X S2_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 8 "slot_7_offset_operand" "I")))
>+ (reg:X S1_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 9 "slot_8_offset_operand" "I")))
>+ (reg:X S0_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 10 "slot_9_offset_operand" "I")))
>+ (reg:X RETURN_ADDR_REGNUM))
>+ (set (reg:X SP_REGNUM)
>+ (plus:X (reg:X SP_REGNUM)
>+ (match_operand 0 "stack_push_up_to_s8_operand" "I")))]
>+ "TARGET_ZCMP"
>+ "cm.push {ra, s0-s8}, %0"
>+)
>+
>+(define_insn "gpr_multi_push_up_to_s9_<mode>"
>+ [(set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 1 "slot_0_offset_operand" "I")))
>+ (reg:X S9_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 2 "slot_1_offset_operand" "I")))
>+ (reg:X S8_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 3 "slot_2_offset_operand" "I")))
>+ (reg:X S7_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 4 "slot_3_offset_operand" "I")))
>+ (reg:X S6_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 5 "slot_4_offset_operand" "I")))
>+ (reg:X S5_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 6 "slot_5_offset_operand" "I")))
>+ (reg:X S4_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 7 "slot_6_offset_operand" "I")))
>+ (reg:X S3_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 8 "slot_7_offset_operand" "I")))
>+ (reg:X S2_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 9 "slot_8_offset_operand" "I")))
>+ (reg:X S1_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 10 "slot_9_offset_operand" "I")))
>+ (reg:X S0_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 11 "slot_10_offset_operand" "I")))
>+ (reg:X RETURN_ADDR_REGNUM))
>+ (set (reg:X SP_REGNUM)
>+ (plus:X (reg:X SP_REGNUM)
>+ (match_operand 0 "stack_push_up_to_s9_operand" "I")))]
>+ "TARGET_ZCMP"
>+ "cm.push {ra, s0-s9}, %0"
>+)
>+
>+(define_insn "gpr_multi_push_up_to_s11_<mode>"
>+ [(set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 1 "slot_0_offset_operand" "I")))
>+ (reg:X S11_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 2 "slot_1_offset_operand" "I")))
>+ (reg:X S10_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 3 "slot_2_offset_operand" "I")))
>+ (reg:X S9_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 4 "slot_3_offset_operand" "I")))
>+ (reg:X S8_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 5 "slot_4_offset_operand" "I")))
>+ (reg:X S7_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 6 "slot_5_offset_operand" "I")))
>+ (reg:X S6_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 7 "slot_6_offset_operand" "I")))
>+ (reg:X S5_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 8 "slot_7_offset_operand" "I")))
>+ (reg:X S4_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 9 "slot_8_offset_operand" "I")))
>+ (reg:X S3_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 10 "slot_9_offset_operand" "I")))
>+ (reg:X S2_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 11 "slot_10_offset_operand" "I")))
>+ (reg:X S1_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 12 "slot_11_offset_operand" "I")))
>+ (reg:X S0_REGNUM))
>+ (set (mem:X (plus:X (reg:X SP_REGNUM)
>+ (match_operand:X 13 "slot_12_offset_operand" "I")))
>+ (reg:X RETURN_ADDR_REGNUM))
>+ (set (reg:X SP_REGNUM)
>+ (plus:X (reg:X SP_REGNUM)
>+ (match_operand 0 "stack_push_up_to_s11_operand" "I")))]
>+ "TARGET_ZCMP"
>+ "cm.push {ra, s0-s11}, %0"
>+)
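A note for readers checking the immediates expected by the tests that follow. This is a minimal sketch of how I understand the Zcmp stack-adjustment rule, not code from the patch: the base adjustment is the saved-register area rounded up to 16 bytes, and the instruction can add up to three further 16-byte steps. All helper names here (zcmp_reg_count, zcmp_list_valid, zcmp_stack_adj_base, zcmp_stack_adj_valid) are hypothetical, and the RV32E restriction to {ra}, {ra, s0} and {ra, s0-s1} is my reading of the extension, so please check it against the spec.

```c
#include <stdbool.h>

/* Hypothetical helpers, not part of the patch.
   max_sreg is the N in {ra, s0-sN}; -1 stands for the list {ra}.  */

/* Number of registers saved by the list: ra plus s0..sN.  */
static int zcmp_reg_count(int max_sreg)
{
  return max_sreg + 2;
}

/* Assumption: there is no {ra, s0-s10} encoding, and RV32E only has
   s0 and s1 available, so its lists stop at {ra, s0-s1}.  */
static bool zcmp_list_valid(int max_sreg, bool rve)
{
  if (rve)
    return max_sreg >= -1 && max_sreg <= 1;
  return max_sreg >= -1 && max_sreg <= 11 && max_sreg != 10;
}

/* Base stack adjustment: register save area rounded up to 16 bytes.
   xlen_bytes is 4 for RV32 and 8 for RV64.  */
static int zcmp_stack_adj_base(int max_sreg, int xlen_bytes)
{
  int bytes = zcmp_reg_count(max_sreg) * xlen_bytes;
  return (bytes + 15) & ~15;
}

/* An adjustment is encodable if it is the base plus 0, 16, 32 or 48.  */
static bool zcmp_stack_adj_valid(int max_sreg, int xlen_bytes, int adj)
{
  int base = zcmp_stack_adj_base(max_sreg, xlen_bytes);
  return adj >= base && adj <= base + 48 && (adj - base) % 16 == 0;
}
```

Under these assumptions, {ra, s0-s1} on RV32E has a base of 16, so 64 is the largest adjustment a single cm.push can make. That is consistent with the -64 / 64 pairs expected in rv32e_zcmp.c for the large-frame tests, where the rest of the frame would have to be allocated separately.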
>diff --git a/gcc/testsuite/gcc.target/riscv/rv32e_zcmp.c b/gcc/testsuite/gcc.target/riscv/rv32e_zcmp.c
>new file mode 100644
>index 00000000000..6dbe489da9b
>--- /dev/null
>+++ b/gcc/testsuite/gcc.target/riscv/rv32e_zcmp.c
>@@ -0,0 +1,239 @@
>+/* { dg-do compile } */
>+/* { dg-options " -Os -march=rv32e_zca_zcmp -mabi=ilp32e -mcmodel=medlow" } */
>+/* { dg-skip-if "" { *-*-* } {"-O0" "-O1" "-O2" "-Og" "-O3" "-Oz" "-flto"} } */
>+/* { dg-final { check-function-bodies "**" "" } } */
>+
>+char my_getchar();
>+float getf();
>+int __attribute__((noinline)) incoming_stack_args
>+ (int arg0, int arg1, int arg2, int arg3,
>+ int arg4, int arg5, int arg6, int arg7, int arg8);
>+int getint();
>+void PrintInts (int n, ...); // varargs
>+void __attribute__((noinline)) PrintIntsNoVaStart (int n, ...); // varargs
>+void PrintInts2 (int arg0, int arg1, int arg2, int arg3, int arg4, int arg5, int n, ...);
>+extern void f1(void);
>+extern void f2(void);
>+
>+/*
>+**test1:
>+** ...
>+** cm.push {ra, s0-s1}, -64
>+** ...
>+** cm.popret {ra, s0-s1}, 64
>+** ...
>+*/
>+int test1()
>+{
>+ char volatile array[3120];
>+ float volatile farray[3120];
>+
>+ float sum = 0;
>+ for (int i = 0; i < 3120; i++)
>+ {
>+ array[i] = my_getchar();
>+ farray[i] = my_getchar() * 1.2;
>+ sum += array[i] + farray[i];
>+ }
>+ return sum;
>+}
>+
>+/*
>+**test2_step1_0_size:
>+** ...
>+** cm.push {ra, s0}, -64
>+** ...
>+** cm.popret {ra, s0}, 64
>+** ...
>+*/
>+int test2_step1_0_size()
>+{
>+ int volatile iarray[3120 + 1824/4 -8];
>+
>+ for (int i = 0; i < 3120 + 1824/4 - 8; i++)
>+ {
>+ iarray[i] = my_getchar() * 2;
>+ }
>+ return iarray[0] + iarray[1];
>+}
>+
>+/*
>+**test3:
>+** ...
>+** cm.push {ra, s0-s1}, -64
>+** ...
>+** cm.popret {ra, s0-s1}, 64
>+** ...
>+*/
>+float test3()
>+{
>+ char volatile array[3120];
>+ float volatile farray[3120];
>+
>+ float sum = 0, f1 = 0, f2 = 0, f3 = 0, f4 = 0, f5 = 0, f6 = 0, f7 = 0;
>+
>+ for (int i = 0; i < 3120; i++)
>+ {
>+ f1 = getf();
>+ f2 = getf();
>+ f3 = getf();
>+ f4 = getf();
>+ array[i] = my_getchar();
>+ farray[i] = my_getchar() * 1.2;
>+ sum += array[i] + farray[i] + f1 + f2 + f3 + f4;
>+ }
>+ return sum;
>+}
>+
>+/*
>+**outgoing_stack_args:
>+** ...
>+** cm.push {ra, s0}, -32
>+** ...
>+** cm.popret {ra, s0}, 32
>+** ...
>+*/
>+int outgoing_stack_args()
>+{
>+ int local = getint();
>+ return local +incoming_stack_args(0, 1, 2, 3, 4, 5, 6, 7, 8);
>+}
>+
>+/*
>+**callPrintInts:
>+** ...
>+** cm.push {ra}, -32
>+** ...
>+** cm.popret {ra}, 32
>+** ...
>+*/
>+float callPrintInts()
>+{
>+ volatile float f = getf(); // f in local
>+ PrintInts(9,1,2,3,4,5,6,7,8,9);
>+ return f;
>+}
>+
>+/*
>+**callPrint:
>+** ...
>+** cm.push {ra}, -32
>+** ...
>+** cm.popret {ra}, 32
>+** ...
>+*/
>+float callPrint()
>+{
>+ volatile float f = getf(); // f in local
>+ PrintIntsNoVaStart(0,1,2,3,4,5,6,7,8,9);
>+ return f;
>+}
>+
>+/*
>+**callPrint_S:
>+** ...
>+** cm.push {ra, s0}, -32
>+** ...
>+** cm.popret {ra, s0}, 32
>+** ...
>+*/
>+float callPrint_S()
>+{
>+ float f = getf();
>+ PrintIntsNoVaStart(0,1,2,3,4,5,6,7,8,9);
>+ return f;
>+}
>+
>+/*
>+**callPrint_2:
>+** ...
>+** cm.push {ra, s0}, -32
>+** ...
>+** cm.popret {ra, s0}, 32
>+** ...
>+*/
>+float callPrint_2()
>+{
>+ float f = getf();
>+ PrintInts2(0,1,2,3,4,5,6,7,8,9);
>+ return f;
>+}
>+
>+/*
>+**test_step1_0bytes_save_restore:
>+** ...
>+** cm.push {ra}, -16
>+** ...
>+** cm.popret {ra}, 16
>+** ...
>+*/
>+int test_step1_0bytes_save_restore()
>+{
>+
>+ int a = 9;
>+ int b = my_getchar();
>+ return a +b;
>+}
>+
>+/*
>+**test_s0:
>+** ...
>+** cm.push {ra, s0}, -16
>+** ...
>+** cm.popret {ra, s0}, 16
>+** ...
>+*/
>+int test_s0()
>+{
>+
>+ int a = my_getchar();
>+ int b = my_getchar();
>+ return a +b;
>+}
>+
>+/*
>+**test_s1:
>+** ...
>+** cm.push {ra, s0-s1}, -16
>+** ...
>+** cm.popret {ra, s0-s1}, 16
>+** ...
>+*/
>+int test_s1()
>+{
>+
>+ int s0 = my_getchar();
>+ int s1 = my_getchar();
>+ int b = my_getchar();
>+ return s1 +s0 +b;
>+}
>+
>+/*
>+**test_f0:
>+** ...
>+** cm.push {ra, s0-s1}, -16
>+** ...
>+** cm.popret {ra, s0-s1}, 16
>+** ...
>+*/
>+int test_f0()
>+{
>+
>+ int s0 = my_getchar();
>+ float f0 = getf();
>+ int b = my_getchar();
>+ return f0 +s0 +b;
>+}
>+
>+/*
>+**foo:
>+** cm.push {ra}, -16
>+** call f1
>+** cm.pop {ra}, 16
>+** tail f2
>+*/
>+void foo(void)
>+{
>+ f1();
>+ f2();
>+}
>diff --git a/gcc/testsuite/gcc.target/riscv/rv32i_zcmp.c b/gcc/testsuite/gcc.target/riscv/rv32i_zcmp.c
>new file mode 100644
>index 00000000000..924197cb3c4
>--- /dev/null
>+++ b/gcc/testsuite/gcc.target/riscv/rv32i_zcmp.c
>@@ -0,0 +1,239 @@
>+/* { dg-do compile } */
>+/* { dg-options " -Os -march=rv32imaf_zca_zcmp -mabi=ilp32f -mcmodel=medlow" } */
>+/* { dg-skip-if "" { *-*-* } {"-O0" "-O1" "-O2" "-Og" "-O3" "-Oz" "-flto"} } */
>+/* { dg-final { check-function-bodies "**" "" } } */
>+
>+char my_getchar();
>+float getf();
>+int __attribute__((noinline)) incoming_stack_args
>+ (int arg0, int arg1, int arg2, int arg3,
>+ int arg4, int arg5, int arg6, int arg7, int arg8);
>+int getint();
>+void PrintInts (int n, ...); // varargs
>+void __attribute__((noinline)) PrintIntsNoVaStart (int n, ...); // varargs
>+void PrintInts2 (int arg0, int arg1, int arg2, int arg3, int arg4, int arg5, int n, ...);
>+extern void f1(void);
>+extern void f2(void);
>+
>+/*
>+**test1:
>+** ...
>+** cm.push {ra, s0-s4}, -80
>+** ...
>+** cm.popret {ra, s0-s4}, 80
>+** ...
>+*/
>+int test1()
>+{
>+ char volatile array[3120];
>+ float volatile farray[3120];
>+
>+ float sum = 0;
>+ for (int i = 0; i < 3120; i++)
>+ {
>+ array[i] = my_getchar();
>+ farray[i] = my_getchar() * 1.2;
>+ sum += array[i] + farray[i];
>+ }
>+ return sum;
>+}
>+
>+/*
>+**test2_step1_0_size:
>+** ...
>+** cm.push {ra, s0-s1}, -64
>+** ...
>+** cm.popret {ra, s0-s1}, 64
>+** ...
>+*/
>+int test2_step1_0_size()
>+{
>+ int volatile iarray[3120 + 1824/4 -8];
>+
>+ for (int i = 0; i < 3120 + 1824/4 - 8; i++)
>+ {
>+ iarray[i] = my_getchar() * 2;
>+ }
>+ return iarray[0] + iarray[1];
>+}
>+
>+/*
>+**test3:
>+** ...
>+** cm.push {ra, s0-s4}, -80
>+** ...
>+** cm.popret {ra, s0-s4}, 80
>+** ...
>+*/
>+float test3()
>+{
>+ char volatile array[3120];
>+ float volatile farray[3120];
>+
>+ float sum = 0, f1 = 0, f2 = 0, f3 = 0, f4 = 0, f5 = 0, f6 = 0, f7 = 0;
>+
>+ for (int i = 0; i < 3120; i++)
>+ {
>+ f1 = getf();
>+ f2 = getf();
>+ f3 = getf();
>+ f4 = getf();
>+ array[i] = my_getchar();
>+ farray[i] = my_getchar() * 1.2;
>+ sum += array[i] + farray[i] + f1 + f2 + f3 + f4;
>+ }
>+ return sum;
>+}
>+
>+/*
>+**outgoing_stack_args:
>+** ...
>+** cm.push {ra, s0}, -32
>+** ...
>+** cm.popret {ra, s0}, 32
>+** ...
>+*/
>+int outgoing_stack_args()
>+{
>+ int local = getint();
>+ return local +incoming_stack_args(0, 1, 2, 3, 4, 5, 6, 7, 8);
>+}
>+
>+/*
>+**callPrintInts:
>+** ...
>+** cm.push {ra}, -48
>+** ...
>+** cm.popret {ra}, 48
>+** ...
>+*/
>+float callPrintInts()
>+{
>+ volatile float f = getf(); // f in local
>+ PrintInts(9,1,2,3,4,5,6,7,8,9);
>+ return f;
>+}
>+
>+/*
>+**callPrint:
>+** ...
>+** cm.push {ra}, -48
>+** ...
>+** cm.popret {ra}, 48
>+** ...
>+*/
>+float callPrint()
>+{
>+ volatile float f = getf(); // f in local
>+ PrintIntsNoVaStart(0,1,2,3,4,5,6,7,8,9);
>+ return f;
>+}
>+
>+/*
>+**callPrint_S:
>+** ...
>+** cm.push {ra}, -48
>+** ...
>+** cm.popret {ra}, 48
>+** ...
>+*/
>+float callPrint_S()
>+{
>+ float f = getf();
>+ PrintIntsNoVaStart(0,1,2,3,4,5,6,7,8,9);
>+ return f;
>+}
>+
>+/*
>+**callPrint_2:
>+** ...
>+** cm.push {ra}, -48
>+** ...
>+** cm.popret {ra}, 48
>+** ...
>+*/
>+float callPrint_2()
>+{
>+ float f = getf();
>+ PrintInts2(0,1,2,3,4,5,6,7,8,9);
>+ return f;
>+}
>+
>+/*
>+**test_step1_0bytes_save_restore:
>+** ...
>+** cm.push {ra}, -16
>+** ...
>+** cm.popret {ra}, 16
>+** ...
>+*/
>+int test_step1_0bytes_save_restore()
>+{
>+
>+ int a = 9;
>+ int b = my_getchar();
>+ return a +b;
>+}
>+
>+/*
>+**test_s0:
>+** ...
>+** cm.push {ra, s0}, -16
>+** ...
>+** cm.popret {ra, s0}, 16
>+** ...
>+*/
>+int test_s0()
>+{
>+
>+ int a = my_getchar();
>+ int b = my_getchar();
>+ return a +b;
>+}
>+
>+/*
>+**test_s1:
>+** ...
>+** cm.push {ra, s0-s1}, -16
>+** ...
>+** cm.popret {ra, s0-s1}, 16
>+** ...
>+*/
>+int test_s1()
>+{
>+
>+ int s0 = my_getchar();
>+ int s1 = my_getchar();
>+ int b = my_getchar();
>+ return s1 +s0 +b;
>+}
>+
>+/*
>+**test_f0:
>+** ...
>+** cm.push {ra, s0}, -32
>+** ...
>+** cm.popret {ra, s0}, 32
>+** ...
>+*/
>+int test_f0()
>+{
>+
>+ int s0 = my_getchar();
>+ float f0 = getf();
>+ int b = my_getchar();
>+ return f0 +s0 +b;
>+}
>+
>+/*
>+**foo:
>+** cm.push {ra}, -16
>+** call f1
>+** cm.pop {ra}, 16
>+** tail f2
>+*/
>+void foo(void)
>+{
>+ f1();
>+ f2();
>+}
>diff --git a/gcc/testsuite/gcc.target/riscv/zcmp_stack_alignment.c b/gcc/testsuite/gcc.target/riscv/zcmp_stack_alignment.c
>new file mode 100644
>index 00000000000..8e481522f89
>--- /dev/null
>+++ b/gcc/testsuite/gcc.target/riscv/zcmp_stack_alignment.c
>@@ -0,0 +1,23 @@
>+/* { dg-do compile } */
>+/* { dg-options " -O0 -march=rv32e_zca_zcb_zcmp -mabi=ilp32e -mcmodel=medlow -fomit-frame-pointer" } */
>+/* { dg-skip-if "" { *-*-* } {"-O2" "-O1" "-Os" "-Og" "-O3" "-Oz" "-flto"} } */
>+/* { dg-final { check-function-bodies "**" "" } } */
>+
>+void bar();
>+
>+/*
>+**fool_rv32e:
>+** cm.push {ra}, -32
>+** ...
>+** call bar
>+** ...
>+** lw a5,32\(sp\)
>+** ...
>+** cm.popret {ra}, 32
>+*/
>+int fool_rv32e ( int a0, int a1, int a2, int a3, int a4, int a5,
>+ int incoming0)
>+{
>+ bar();
>+ return a0 + a1 + a2 + a3 + a4 + a5 + incoming0;
>+}
>\ No newline at end of file
>--
>2.17.1

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH 1/1] [V2] [RISC-V] support cm.push cm.pop cm.popret in zcmp
  2023-05-12  9:04 ` [PATCH 1/1] [V2] [RISC-V] support cm.push cm.pop cm.popret in zcmp Fei Gao
@ 2023-05-29  3:05   ` Kito Cheng
  0 siblings, 0 replies; 4+ messages in thread
From: Kito Cheng @ 2023-05-29  3:05 UTC (permalink / raw)
  To: Fei Gao; +Cc: gcc-patches, palmer, jeffreyalaw, sinan.lin, jiawei

Thanks for this patch. Just a few minor comments; I think this is pretty
close to being accepted :)

Could you follow JiaWei's match_parallel approach[1] to avoid adding a bunch
of *_offset_operand and stack_push_up_to_*_operand predicates?


[1] https://patchwork.sourceware.org/project/gcc/patch/20230406062118.47431-5-jiawei@iscas.ac.cn/


> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> index 629e5e45cac..a0a2db1f594 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -117,6 +117,14 @@ struct GTY(())  riscv_frame_info {
>    /* How much the GPR save/restore routines adjust sp (or 0 if unused).  */
>    unsigned save_libcall_adjustment;
>
> +  /* the minimum number of bytes, in multiples of 16-byte address increments,
> +     required to cover the registers in a multi push & pop.  */
> +  unsigned multi_push_adj_base;
> +
> +  /* the number of additional 16-byte address increments allocated for the stack frame
> +     in a multi push & pop.  */
> +  unsigned multi_push_adj_addi;
> +
>    /* Offsets of fixed-point and floating-point save areas from frame bottom */
>    poly_int64 gp_sp_offset;
>    poly_int64 fp_sp_offset;
> @@ -413,6 +421,21 @@ static const struct riscv_tune_info riscv_tune_info_table[] = {
>  #include "riscv-cores.def"
>  };
>
> +typedef enum
> +{
> +  SI_IDX = 0,
> +  DI_IDX,
> +  MAX_MODE_IDX = DI_IDX
> +} mode_idx;
> +

I don't see this used anywhere in this version?

> @@ -5574,18 +5924,25 @@ riscv_expand_epilogue (int style)
>        REG_NOTES (insn) = dwarf;
>      }
>
> -  if (use_restore_libcall)
> -    frame->mask = 0; /* Temporarily fib for GPRs.  */
> +  if (use_restore_libcall || use_multi_pop)
> +    frame->mask = 0; /* Temporarily fib that we need not save GPRs.  */
>
>    /* If we need to restore registers, deallocate as much stack as
>       possible in the second step without going out of range.  */
> -  if ((frame->mask | frame->fmask) != 0)
> +  if (use_multi_pop)
> +    {
> +      if (frame->fmask
> +          && known_gt (frame->total_size - multipop_size,
> +                      frame->frame_pointer_offset))
> +        step2 = riscv_first_stack_step (frame, frame->total_size - multipop_size);
> +    }
> +  else if ((frame->mask | frame->fmask) != 0)
>      step2 = riscv_first_stack_step (frame, frame->total_size - libcall_size);
>
> -  if (use_restore_libcall)
> +  if (use_restore_libcall || use_multi_pop)
>      frame->mask = mask; /* Undo the above fib.  */
>
> -  poly_int64 step1 = frame->total_size - step2 - libcall_size;
> +  poly_int64 step1 = frame->total_size - step2 - libcall_size - multipop_size ;
>
>    /* Set TARGET to BASE + STEP1.  */
>    if (known_gt (step1, 0))
> @@ -5620,7 +5977,7 @@ riscv_expand_epilogue (int style)
>                                            adjust));
>           rtx dwarf = NULL_RTX;
>           rtx cfa_adjust_rtx = gen_rtx_PLUS (Pmode, stack_pointer_rtx,
> -                                            GEN_INT (step2));
> +                                            GEN_INT (step2 + libcall_size + multipop_size));

Why do we need `+ libcall_size` here? Or... why didn't we need it before?

>
>           dwarf = alloc_reg_note (REG_CFA_DEF_CFA, cfa_adjust_rtx, dwarf);
>           RTX_FRAME_RELATED_P (insn) = 1;
> @@ -5635,15 +5992,15 @@ riscv_expand_epilogue (int style)
>        epilogue_cfa_sp_offset = step2;
>      }
>
> -  if (use_restore_libcall)
> +  if (use_restore_libcall || use_multi_pop)
>      frame->mask = 0; /* Temporarily fib that we need not save GPRs.  */
>
>    /* Restore the registers.  */
> -  riscv_for_each_saved_reg (frame->total_size - step2 - libcall_size,
> +  riscv_for_each_saved_reg (frame->total_size - step2 - libcall_size - multipop_size,
>                             riscv_restore_reg,
>                             true, style == EXCEPTION_RETURN);
>
> -  if (use_restore_libcall)
> +  if (use_restore_libcall || use_multi_pop)
>        frame->mask = mask; /* Undo the above fib.  */
>
>    if (need_barrier_p)
> @@ -5657,14 +6014,30 @@ riscv_expand_epilogue (int style)
>
>        rtx dwarf = NULL_RTX;
>        rtx cfa_adjust_rtx = gen_rtx_PLUS (Pmode, stack_pointer_rtx,
> -                                        const0_rtx);
> +                                        GEN_INT (libcall_size + multipop_size));

Same question for the `libcall_size` part.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH 1/1] [V2] [RISC-V] support cm.push cm.pop cm.popret in zcmp
  2023-05-12  9:04 [PATCH 0/1] [V2] RISC-V: support Zcmp extension Fei Gao
@ 2023-05-12  9:04 ` Fei Gao
  2023-05-29  3:05   ` Kito Cheng
  0 siblings, 1 reply; 4+ messages in thread
From: Fei Gao @ 2023-05-12  9:04 UTC (permalink / raw)
  To: gcc-patches; +Cc: kito.cheng, palmer, jeffreyalaw, sinan.lin, jiawei, Fei Gao

Zcmp can share the same logic as save-restore in stack allocation: pre-allocation
by cm.push, step 1 and step 2.

Please note that cm.push pushes ra, s0-s11 in the reverse order of what save-restore does,
so the .cfi directives have been adapted accordingly in this patch.
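
To make the ordering concrete, here is a minimal sketch (not part of the patch)
of how the sp-relative slot offsets fall out when the registers are stored from
the highest-numbered one downwards. It follows the
`offset = saved_size - UNITS_PER_WORD * (++saved_cnt)` computation in
riscv_adjust_multi_push_cfi_prologue. The function name `zcmp_slot_offset` and
its parameters are hypothetical and exist only for illustration.

```c
#include <assert.h>

/* Hypothetical helper, for illustration only (not in the patch).
   RANK is the 1-based position of a register when the pushed set is
   walked from the highest-numbered register down to ra, so for
   {ra, s0-s1} the ranks are s1 = 1, s0 = 2, ra = 3.
   SAVED_SIZE is the number of bytes by which cm.push lowers sp and
   WORD_BYTES is 4 on RV32 or 8 on RV64.
   Returns the slot offset relative to sp after the push.  */
static int
zcmp_slot_offset (int saved_size, int word_bytes, int rank)
{
  assert (rank >= 1 && rank * word_bytes <= saved_size);
  return saved_size - word_bytes * rank;
}
```

For `cm.push {ra, s0-s1}, -16` on RV32 this places s1 at 12(sp), s0 at 8(sp)
and ra at 4(sp), i.e. ra ends up at the lowest address of the group.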

gcc/ChangeLog:

        * config/riscv/predicates.md (slot_0_offset_operand): predicates for slot 0 offset.
        (slot_1_offset_operand): likewise
        (slot_2_offset_operand): likewise
        (slot_3_offset_operand): likewise
        (slot_4_offset_operand): likewise
        (slot_5_offset_operand): likewise
        (slot_6_offset_operand): likewise
        (slot_7_offset_operand): likewise
        (slot_8_offset_operand): likewise
        (slot_9_offset_operand): likewise
        (slot_10_offset_operand): likewise
        (slot_11_offset_operand): likewise
        (slot_12_offset_operand): likewise
        (stack_push_up_to_ra_operand): predicates for stack adjust of pushing ra
        (stack_push_up_to_s0_operand): predicates for stack adjust of pushing ra, s0
        (stack_push_up_to_s1_operand): likewise
        (stack_push_up_to_s2_operand): likewise
        (stack_push_up_to_s3_operand): likewise
        (stack_push_up_to_s4_operand): likewise
        (stack_push_up_to_s5_operand): likewise
        (stack_push_up_to_s6_operand): likewise
        (stack_push_up_to_s7_operand): likewise
        (stack_push_up_to_s8_operand): likewise
        (stack_push_up_to_s9_operand): likewise
        (stack_push_up_to_s11_operand): likewise
        (stack_pop_up_to_ra_operand): predicates for stack adjust of popping ra
        (stack_pop_up_to_s0_operand): predicates for stack adjust of popping ra, s0
        (stack_pop_up_to_s1_operand): likewise
        (stack_pop_up_to_s2_operand): likewise
        (stack_pop_up_to_s3_operand): likewise
        (stack_pop_up_to_s4_operand): likewise
        (stack_pop_up_to_s5_operand): likewise
        (stack_pop_up_to_s6_operand): likewise
        (stack_pop_up_to_s7_operand): likewise
        (stack_pop_up_to_s8_operand): likewise
        (stack_pop_up_to_s9_operand): likewise
        (stack_pop_up_to_s11_operand): likewise
        * config/riscv/riscv-protos.h (riscv_zcmp_valid_slot_offset_p): declaration
        (riscv_zcmp_valid_stack_adj_bytes_p): declaration
        * config/riscv/riscv.cc (struct riscv_frame_info): comment change
        (riscv_avoid_multi_push): helper function of riscv_use_multi_push
        (riscv_use_multi_push): true if multi push is used
        (riscv_multi_push_sregs_count): num of sregs in multi-push
        (riscv_multi_push_regs_count): num of regs in multi-push
        (riscv_16bytes_align): align to 16 bytes
        (riscv_stack_align): moved to a better place
        (riscv_save_libcall_count): no functional change
        (riscv_compute_frame_info): add zcmp frame info
        (riscv_adjust_multi_push_cfi_prologue): adjust cfi for cm.push
        (get_slot_offset_rtx): get the rtx of slot to push or pop
        (riscv_gen_multi_push_pop_insn): gen function for multi push and pop
        (riscv_expand_prologue): allocate stack by cm.push
        (riscv_adjust_multi_pop_cfi_epilogue): adjust cfi for cm.pop[ret]
        (riscv_expand_epilogue): allocate stack by cm.pop[ret]
        (zcmp_base_adj): calculate stack adjustment base size
        (zcmp_additional_adj): calculate stack adjustment additional size
        (riscv_zcmp_valid_slot_offset_p): check if offset is valid for a slot
        (riscv_zcmp_valid_stack_adj_bytes_p): check if stack adjustment size is valid
        * config/riscv/riscv.h (RETURN_ADDR_MASK): mask of ra
        (S0_MASK): likewise
        (S1_MASK): likewise
        (S2_MASK): likewise
        (S3_MASK): likewise
        (S4_MASK): likewise
        (S5_MASK): likewise
        (S6_MASK): likewise
        (S7_MASK): likewise
        (S8_MASK): likewise
        (S9_MASK): likewise
        (S10_MASK): likewise
        (S11_MASK): likewise
        (MULTI_PUSH_GPR_MASK): GPR_MASK that cm.push can cover at most
        (ZCMP_MAX_SPIMM): max spimm value
        (ZCMP_SP_INC_STEP): zcmp sp increment step
        (ZCMP_INVALID_S0S10_SREGS_COUNTS): num of s0-s10
        (ZCMP_S0S11_SREGS_COUNTS): num of s0-s11
        (ZCMP_MAX_GRP_SLOTS): max slots of pushing and popping in zcmp
        * config/riscv/riscv.md: include zc.md
        * config/riscv/zc.md: New file. machine description for zcmp

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/rv32e_zcmp.c: New test.
        * gcc.target/riscv/rv32i_zcmp.c: New test.
        * gcc.target/riscv/zcmp_stack_alignment.c: New test.
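
As an illustration of the stack-adjustment entries above
(riscv_multi_push_sregs_count, riscv_16bytes_align), the following sketch shows
how the base adjustment could be derived from the number of saved s-registers.
It is a simplification, not the patch's code: it skips the riscv_stack_align
step that the patch applies first, and it assumes
ZCMP_INVALID_S0S10_SREGS_COUNTS is 11 and ZCMP_S0S11_SREGS_COUNTS is 12, which
their descriptions suggest but this message does not show. All helper and macro
names in the sketch are hypothetical.

```c
#include <assert.h>

/* Assumed values; the ChangeLog only describes the real macros as
   "num of s0-s10" and "num of s0-s11".  */
#define SREGS_S0_S10 11
#define SREGS_S0_S11 12

/* Hypothetical helper: {s0-s10} cannot be encoded by Zcmp, so a request
   for 11 s-registers is widened to {s0-s11}.  */
static unsigned
sregs_for_multi_push (unsigned sregs)
{
  return sregs == SREGS_S0_S10 ? SREGS_S0_S11 : sregs;
}

/* Hypothetical helper: bytes needed for ra plus SREGS s-registers,
   rounded up to the 16-byte granularity of the cm.push adjustment.  */
static unsigned
multi_push_adj_base (unsigned sregs, unsigned word_bytes)
{
  unsigned regs = sregs_for_multi_push (sregs) + 1; /* 1 is for ra.  */
  unsigned bytes = regs * word_bytes;
  return (bytes + 15u) & ~15u;
}
```

On RV32, {ra} and {ra, s0-s1} both fit in a 16-byte base adjustment, while
{ra, s0-s4} needs 32 bytes; any further frame space is then covered by the
additional 16-byte increments or by a separate sp adjustment.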
---
 gcc/config/riscv/predicates.md                |  148 +++
 gcc/config/riscv/riscv-protos.h               |    2 +
 gcc/config/riscv/riscv.cc                     |  477 +++++++-
 gcc/config/riscv/riscv.h                      |   23 +
 gcc/config/riscv/riscv.md                     |    2 +
 gcc/config/riscv/zc.md                        | 1042 +++++++++++++++++
 gcc/testsuite/gcc.target/riscv/rv32e_zcmp.c   |  239 ++++
 gcc/testsuite/gcc.target/riscv/rv32i_zcmp.c   |  239 ++++
 .../gcc.target/riscv/zcmp_stack_alignment.c   |   23 +
 9 files changed, 2155 insertions(+), 40 deletions(-)
 create mode 100644 gcc/config/riscv/zc.md
 create mode 100644 gcc/testsuite/gcc.target/riscv/rv32e_zcmp.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rv32i_zcmp.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zcmp_stack_alignment.c

diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
index e5adf06fa25..d3d30dc67f7 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -59,6 +59,154 @@
   (ior (match_operand 0 "const_0_operand")
        (match_operand 0 "register_operand")))
 
+(define_predicate "slot_0_offset_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_slot_offset_p (INTVAL (op), 0)")))
+
+(define_predicate "slot_1_offset_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_slot_offset_p (INTVAL (op), 1)")))
+
+(define_predicate "slot_2_offset_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_slot_offset_p (INTVAL (op), 2)")))
+
+(define_predicate "slot_3_offset_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_slot_offset_p (INTVAL (op), 3)")))
+
+(define_predicate "slot_4_offset_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_slot_offset_p (INTVAL (op), 4)")))
+
+(define_predicate "slot_5_offset_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_slot_offset_p (INTVAL (op), 5)")))
+
+(define_predicate "slot_6_offset_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_slot_offset_p (INTVAL (op), 6)")))
+
+(define_predicate "slot_7_offset_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_slot_offset_p (INTVAL (op), 7)")))
+
+(define_predicate "slot_8_offset_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_slot_offset_p (INTVAL (op), 8)")))
+
+(define_predicate "slot_9_offset_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_slot_offset_p (INTVAL (op), 9)")))
+
+(define_predicate "slot_10_offset_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_slot_offset_p (INTVAL (op), 10)")))
+
+(define_predicate "slot_11_offset_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_slot_offset_p (INTVAL (op), 11)")))
+
+(define_predicate "slot_12_offset_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_slot_offset_p (INTVAL (op), 12)")))
+
+(define_predicate "stack_push_up_to_ra_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 1)")))
+
+(define_predicate "stack_push_up_to_s0_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 2)")))
+
+(define_predicate "stack_push_up_to_s1_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 3)")))
+
+(define_predicate "stack_push_up_to_s2_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 4)")))
+
+(define_predicate "stack_push_up_to_s3_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 5)")))
+
+(define_predicate "stack_push_up_to_s4_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 6)")))
+
+(define_predicate "stack_push_up_to_s5_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 7)")))
+
+(define_predicate "stack_push_up_to_s6_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 8)")))
+
+(define_predicate "stack_push_up_to_s7_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 9)")))
+
+(define_predicate "stack_push_up_to_s8_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 10)")))
+
+(define_predicate "stack_push_up_to_s9_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 11)")))
+
+(define_predicate "stack_push_up_to_s11_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 13)")))
+
+(define_predicate "stack_pop_up_to_ra_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 1)")))
+
+(define_predicate "stack_pop_up_to_s0_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 2)")))
+
+(define_predicate "stack_pop_up_to_s1_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 3)")))
+
+(define_predicate "stack_pop_up_to_s2_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 4)")))
+
+(define_predicate "stack_pop_up_to_s3_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 5)")))
+
+(define_predicate "stack_pop_up_to_s4_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 6)")))
+
+(define_predicate "stack_pop_up_to_s5_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 7)")))
+
+(define_predicate "stack_pop_up_to_s6_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 8)")))
+
+(define_predicate "stack_pop_up_to_s7_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 9)")))
+
+(define_predicate "stack_pop_up_to_s8_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 10)")))
+
+(define_predicate "stack_pop_up_to_s9_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 11)")))
+
+(define_predicate "stack_pop_up_to_s11_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 13)")))
+
 ;; Only use branch-on-bit sequences when the mask is not an ANDI immediate.
 (define_predicate "branch_on_bit_operand"
   (and (match_code "const_int")
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 7760a9cac8d..f0ea14f05be 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -56,6 +56,8 @@ extern bool riscv_split_64bit_move_p (rtx, rtx);
 extern void riscv_split_doubleword_move (rtx, rtx);
 extern const char *riscv_output_move (rtx, rtx);
 extern const char *riscv_output_return ();
+extern bool riscv_zcmp_valid_slot_offset_p (HOST_WIDE_INT, int);
+extern bool riscv_zcmp_valid_stack_adj_bytes_p(HOST_WIDE_INT, int);
 
 #ifdef RTX_CODE
 extern void riscv_expand_int_scc (rtx, enum rtx_code, rtx, rtx);
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 629e5e45cac..a0a2db1f594 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -117,6 +117,14 @@ struct GTY(())  riscv_frame_info {
   /* How much the GPR save/restore routines adjust sp (or 0 if unused).  */
   unsigned save_libcall_adjustment;
 
+  /* the minimum number of bytes, in multiples of 16-byte address increments,
+     required to cover the registers in a multi push & pop.  */
+  unsigned multi_push_adj_base;
+
+  /* the number of additional 16-byte address increments allocated for the stack frame
+     in a multi push & pop.  */
+  unsigned multi_push_adj_addi;
+
   /* Offsets of fixed-point and floating-point save areas from frame bottom */
   poly_int64 gp_sp_offset;
   poly_int64 fp_sp_offset;
@@ -413,6 +421,21 @@ static const struct riscv_tune_info riscv_tune_info_table[] = {
 #include "riscv-cores.def"
 };
 
+typedef enum
+{
+  SI_IDX = 0,
+  DI_IDX,
+  MAX_MODE_IDX = DI_IDX
+} mode_idx;
+
+typedef enum
+{
+  PUSH_IDX = 0,
+  POP_IDX,
+  POPRET_IDX,
+  MAX_OP_IDX = POPRET_IDX
+} op_idx;
+
 void riscv_frame_info::reset(void)
 {
   total_size = 0;
@@ -4844,6 +4867,37 @@ riscv_save_reg_p (unsigned int regno)
   return false;
 }
 
+/* Return TRUE if Zcmp push and pop insns should be
+   avoided. FALSE otherwise.
+   Only use multi push & pop if all GPRs masked can be covered,
+   and stack access is SP based,
+   and GPRs are at top of the stack frame,
+   and no conflicts in stack allocation with other features  */
+static bool
+riscv_avoid_multi_push(const struct riscv_frame_info *frame)
+{
+  if (!TARGET_ZCMP
+      || crtl->calls_eh_return
+      || frame_pointer_needed
+      || cfun->machine->interrupt_handler_p
+      || cfun->machine->varargs_size != 0
+      || crtl->args.pretend_args_size != 0
+      || (frame->mask & ~ MULTI_PUSH_GPR_MASK))
+    return true;
+
+  return false;
+}
+
+/* Determine whether to use multi push insn.  */
+static bool
+riscv_use_multi_push(const struct riscv_frame_info *frame)
+{
+  if (riscv_avoid_multi_push (frame))
+    return false;
+
+  return (frame->multi_push_adj_base != 0);
+}
+
 /* Return TRUE if a libcall to save/restore GPRs should be
    avoided.  FALSE otherwise.  */
 static bool
@@ -4881,6 +4935,51 @@ riscv_save_libcall_count (unsigned mask)
   abort ();
 }
 
+/* calculate number of s regs in multi push and pop.
+   Note that {s0-s10} is not valid in Zcmp, use {s0-s11} instead.  */
+static unsigned
+riscv_multi_push_sregs_count (unsigned mask)
+{
+  unsigned num = riscv_save_libcall_count (mask);
+  return (num == ZCMP_INVALID_S0S10_SREGS_COUNTS)
+    ? ZCMP_S0S11_SREGS_COUNTS
+    : num;
+}
+
+/* calculate number of regs(ra, s0-sx) in multi push and pop.  */
+static unsigned
+riscv_multi_push_regs_count (unsigned mask)
+{
+  /* 1 is for ra  */
+  return riscv_multi_push_sregs_count (mask) + 1;
+}
+
+/* Handle 16 bytes align for poly_int.  */
+static poly_int64
+riscv_16bytes_align (poly_int64 value)
+{
+  return aligned_upper_bound (value, 16);
+}
+
+static HOST_WIDE_INT
+riscv_16bytes_align (HOST_WIDE_INT value)
+{
+  return ROUND_UP(value, 16);
+}
+
+/* Handle stack align for poly_int.  */
+static poly_int64
+riscv_stack_align (poly_int64 value)
+{
+  return aligned_upper_bound (value, PREFERRED_STACK_BOUNDARY / 8);
+}
+
+static HOST_WIDE_INT
+riscv_stack_align (HOST_WIDE_INT value)
+{
+  return RISCV_STACK_ALIGN (value);
+}
+
 /* Populate the current function's riscv_frame_info structure.
 
    RISC-V stack frames grown downward.  High addresses are at the top.
@@ -4906,7 +5005,7 @@ riscv_save_libcall_count (unsigned mask)
 	|  GPR save area                |       + UNITS_PER_WORD
 	|                               |
 	+-------------------------------+ <-- stack_pointer_rtx + fp_sp_offset
-	|                               |       + UNITS_PER_HWVALUE
+	|                               |       + UNITS_PER_FP_REG
 	|  FPR save area                |
 	|                               |
 	+-------------------------------+ <-- frame_pointer_rtx (virtual)
@@ -4925,19 +5024,6 @@ riscv_save_libcall_count (unsigned mask)
 
 static HOST_WIDE_INT riscv_first_stack_step (struct riscv_frame_info *frame, poly_int64 remaining_size);
 
-/* Handle stack align for poly_int.  */
-static poly_int64
-riscv_stack_align (poly_int64 value)
-{
-  return aligned_upper_bound (value, PREFERRED_STACK_BOUNDARY / 8);
-}
-
-static HOST_WIDE_INT
-riscv_stack_align (HOST_WIDE_INT value)
-{
-  return RISCV_STACK_ALIGN (value);
-}
-
 static void
 riscv_compute_frame_info (void)
 {
@@ -4985,8 +5071,9 @@ riscv_compute_frame_info (void)
   if (frame->mask)
     {
       x_save_size = riscv_stack_align (num_x_saved * UNITS_PER_WORD);
-      unsigned num_save_restore = 1 + riscv_save_libcall_count (frame->mask);
 
+      /* 1 is for ra  */
+      unsigned num_save_restore = 1 + riscv_save_libcall_count (frame->mask);
       /* Only use save/restore routines if they don't alter the stack size.  */
       if (riscv_stack_align (num_save_restore * UNITS_PER_WORD) == x_save_size
           && !riscv_avoid_save_libcall ())
@@ -4998,6 +5085,15 @@ riscv_compute_frame_info (void)
 
 	  frame->save_libcall_adjustment = x_save_size;
 	}
+
+      if (!riscv_avoid_multi_push (frame))
+        {
+          /* num(ra, s0-sx)  */
+          unsigned num_multi_push =
+            riscv_multi_push_regs_count (frame->mask);
+          x_save_size = riscv_stack_align (num_multi_push * UNITS_PER_WORD);
+          frame->multi_push_adj_base = riscv_16bytes_align (x_save_size);
+        }
     }
 
   /* At the bottom of the frame are any outgoing stack arguments. */
@@ -5012,7 +5108,15 @@ riscv_compute_frame_info (void)
   frame->fp_sp_offset = offset - UNITS_PER_FP_REG;
   /* Next are the callee-saved GPRs. */
   if (frame->mask)
-    offset += x_save_size;
+    {
+      offset += x_save_size;
+      /* align to 16 bytes and add paddings to GPR part to honor
+         both stack alignment and zcmp push/pop size alignment. */
+      if (riscv_use_multi_push (frame)
+          && known_lt(offset,
+                      frame->multi_push_adj_base + ZCMP_SP_INC_STEP * ZCMP_MAX_SPIMM))
+        offset = riscv_16bytes_align (offset);
+    }
   frame->gp_sp_offset = offset - UNITS_PER_WORD;
   /* The hard frame pointer points above the callee-saved GPRs. */
   frame->hard_frame_pointer_offset = offset;
@@ -5356,6 +5460,42 @@ riscv_adjust_libcall_cfi_prologue ()
   return dwarf;
 }
 
+static rtx
+riscv_adjust_multi_push_cfi_prologue (int saved_size)
+{
+  rtx dwarf = NULL_RTX;
+  rtx adjust_sp_rtx, reg, mem, insn;
+  unsigned int mask = cfun->machine->frame.mask;
+  int offset;
+  int saved_cnt = 0;
+
+  if (mask & S10_MASK)
+    mask |= S11_MASK;
+
+  for (int regno = GP_REG_LAST; regno >= GP_REG_FIRST; regno--)
+    if (BITSET_P (mask & MULTI_PUSH_GPR_MASK, regno - GP_REG_FIRST))
+      {
+        /* The save order is s11-s0, ra
+           from high to low addr.  */
+        offset = saved_size - UNITS_PER_WORD * (++saved_cnt);
+
+        reg = gen_rtx_REG (SImode, regno);
+        mem = gen_frame_mem (SImode, plus_constant (Pmode,
+                                                    stack_pointer_rtx,
+                                                    offset));
+
+        insn = gen_rtx_SET (mem, reg);
+        dwarf = alloc_reg_note (REG_CFA_OFFSET, insn, dwarf);
+      }
+
+  /* Debug info for adjust sp.  */
+  adjust_sp_rtx = gen_rtx_SET (stack_pointer_rtx,
+                               plus_constant(Pmode, stack_pointer_rtx, -saved_size));
+  dwarf = alloc_reg_note (REG_CFA_ADJUST_CFA, adjust_sp_rtx,
+                          dwarf);
+  return dwarf;
+}
+
 static void
 riscv_emit_stack_tie (void)
 {
@@ -5365,6 +5505,152 @@ riscv_emit_stack_tie (void)
     emit_insn (gen_stack_tiedi (stack_pointer_rtx, hard_frame_pointer_rtx));
 }
 
+static rtx
+get_slot_offset_rtx (int slot_idx)
+{
+  HOST_WIDE_INT slot_offset = -1 * (slot_idx + 1) * GET_MODE_SIZE (word_mode);
+  return GEN_INT (slot_offset);
+}
+
+/* Zcmp multi push and pop function pointer array.  */
+const insn_gen_fn gen_push_pop [MAX_OP_IDX + 1][MAX_MODE_IDX + 1][ZCMP_MAX_GRP_SLOTS] =
+{{{(insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_ra_si,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s0_si,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s1_si,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s2_si,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s3_si,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s4_si,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s5_si,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s6_si,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s7_si,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s8_si,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s9_si,
+   NULL,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s11_si},
+  {(insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_ra_di,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s0_di,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s1_di,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s2_di,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s3_di,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s4_di,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s5_di,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s6_di,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s7_di,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s8_di,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s9_di,
+   NULL,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_push_up_to_s11_di}},
+ {{(insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_ra_si,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s0_si,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s1_si,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s2_si,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s3_si,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s4_si,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s5_si,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s6_si,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s7_si,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s8_si,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s9_si,
+   NULL,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s11_si},
+  {(insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_ra_di,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s0_di,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s1_di,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s2_di,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s3_di,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s4_di,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s5_di,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s6_di,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s7_di,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s8_di,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s9_di,
+   NULL,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_pop_up_to_s11_di}},
+ {{(insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_ra_si,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s0_si,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s1_si,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s2_si,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s3_si,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s4_si,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s5_si,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s6_si,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s7_si,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s8_si,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s9_si,
+   NULL,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s11_si},
+  {(insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_ra_di,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s0_di,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s1_di,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s2_di,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s3_di,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s4_di,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s5_di,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s6_di,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s7_di,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s8_di,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s9_di,
+   NULL,
+   (insn_gen_fn::stored_funcptr) gen_gpr_multi_popret_up_to_s11_di}}};
+
+static rtx
+riscv_gen_multi_push_pop_insn (op_idx op, HOST_WIDE_INT adj_size, unsigned int regs_num)
+{
+  rtx stack_adj = GEN_INT (adj_size);
+  rtx slots[ZCMP_MAX_GRP_SLOTS];
+
+  for (int slot_idx = 0; slot_idx < ZCMP_MAX_GRP_SLOTS; slot_idx++)
+    slots[slot_idx] = get_slot_offset_rtx (slot_idx);
+
+  switch (regs_num)
+    {
+    case 1:
+      return (gen_push_pop[op][TARGET_64BIT][regs_num - 1])
+        (stack_adj, slots[0]);
+    case 2:
+      return (gen_push_pop[op][TARGET_64BIT][regs_num - 1])
+        (stack_adj, slots[0], slots[1]);
+    case 3:
+      return (gen_push_pop[op][TARGET_64BIT][regs_num - 1])
+        (stack_adj, slots[0], slots[1], slots[2]);
+    case 4:
+      return (gen_push_pop[op][TARGET_64BIT][regs_num - 1])
+        (stack_adj, slots[0], slots[1], slots[2], slots[3]);
+    case 5:
+      return (gen_push_pop[op][TARGET_64BIT][regs_num - 1])
+        (stack_adj, slots[0], slots[1], slots[2], slots[3], slots[4]);
+    case 6:
+      return (gen_push_pop[op][TARGET_64BIT][regs_num - 1])
+        (stack_adj, slots[0], slots[1], slots[2], slots[3], slots[4], slots[5]);
+    case 7:
+      return (gen_push_pop[op][TARGET_64BIT][regs_num - 1])
+        (stack_adj, slots[0], slots[1], slots[2], slots[3], slots[4], slots[5],
+        slots[6]);
+    case 8:
+      return (gen_push_pop[op][TARGET_64BIT][regs_num - 1])
+        (stack_adj, slots[0], slots[1], slots[2], slots[3], slots[4], slots[5],
+        slots[6], slots[7]);
+    case 9:
+      return (gen_push_pop[op][TARGET_64BIT][regs_num - 1])
+        (stack_adj, slots[0], slots[1], slots[2], slots[3], slots[4], slots[5],
+        slots[6], slots[7], slots[8]);
+    case 10:
+      return (gen_push_pop[op][TARGET_64BIT][regs_num - 1])
+        (stack_adj, slots[0], slots[1], slots[2], slots[3], slots[4], slots[5],
+        slots[6], slots[7], slots[8], slots[9]);
+    case 11:
+      return (gen_push_pop[op][TARGET_64BIT][regs_num - 1])
+        (stack_adj, slots[0], slots[1], slots[2], slots[3], slots[4], slots[5],
+        slots[6], slots[7], slots[8], slots[9], slots[10]);
+    case 13:
+      return (gen_push_pop[op][TARGET_64BIT][regs_num - 1])
+        (stack_adj, slots[0], slots[1], slots[2], slots[3], slots[4], slots[5],
+        slots[6], slots[7], slots[8], slots[9], slots[10], slots[11], slots[12]);
+    default:
+      gcc_unreachable ();
+    }
+}
+
 /* Expand the "prologue" pattern.  */
 
 void
@@ -5373,7 +5659,8 @@ riscv_expand_prologue (void)
   struct riscv_frame_info *frame = &cfun->machine->frame;
   poly_int64 remaining_size = frame->total_size;
   unsigned mask = frame->mask;
-  rtx insn;
+  int spimm, multi_push_additional, stack_adj;
+  rtx insn, dwarf = NULL_RTX;
 
   if (flag_stack_usage_info)
     current_function_static_stack_size = constant_lower_bound (remaining_size);
@@ -5381,8 +5668,35 @@ riscv_expand_prologue (void)
   if (cfun->machine->naked_p)
     return;
 
+  /* Prefer multi-push to save-restore libcall.  */
+  if (riscv_use_multi_push (frame))
+    {
+      remaining_size -= frame->multi_push_adj_base;
+      if (known_gt(remaining_size, 2 * ZCMP_SP_INC_STEP))
+        spimm = 3;
+      else if (known_gt(remaining_size, ZCMP_SP_INC_STEP))
+        spimm = 2;
+      else if (known_gt(remaining_size, 0))
+        spimm = 1;
+      else
+        spimm = 0;
+      multi_push_additional = spimm * ZCMP_SP_INC_STEP;
+      frame->multi_push_adj_addi = multi_push_additional;
+      remaining_size -= multi_push_additional;
+
+      /* Emit the multi push insn and the dwarf info along with it.  */
+      stack_adj = frame->multi_push_adj_base + multi_push_additional;
+      insn = emit_insn (riscv_gen_multi_push_pop_insn(PUSH_IDX,
+        -stack_adj, riscv_multi_push_regs_count(frame->mask)));
+      dwarf = riscv_adjust_multi_push_cfi_prologue (stack_adj);
+      RTX_FRAME_RELATED_P (insn) = 1;
+      REG_NOTES (insn) = dwarf;
+
+      /* Temporarily fib that we need not save GPRs.  */
+      frame->mask = 0;
+    }
   /* When optimizing for size, call a subroutine to save the registers.  */
-  if (riscv_use_save_libcall (frame))
+  else if (riscv_use_save_libcall (frame))
     {
       rtx dwarf = NULL_RTX;
       dwarf = riscv_adjust_libcall_cfi_prologue ();
@@ -5398,13 +5712,15 @@ riscv_expand_prologue (void)
   /* Save the registers.  */
   if ((frame->mask | frame->fmask) != 0)
     {
-      HOST_WIDE_INT step1 = riscv_first_stack_step (frame, remaining_size);
-
-      insn = gen_add3_insn (stack_pointer_rtx,
-			    stack_pointer_rtx,
-			    GEN_INT (-step1));
-      RTX_FRAME_RELATED_P (emit_insn (insn)) = 1;
-      remaining_size -= step1;
+      if (known_gt (remaining_size, frame->frame_pointer_offset))
+        {
+          HOST_WIDE_INT step1 = riscv_first_stack_step (frame, remaining_size);
+          remaining_size -= step1;
+          insn = gen_add3_insn (stack_pointer_rtx,
+                                stack_pointer_rtx,
+                                GEN_INT (-step1));
+          RTX_FRAME_RELATED_P (emit_insn (insn)) = 1;
+        }
       riscv_for_each_saved_reg (remaining_size, riscv_save_reg, false, false);
     }
 
@@ -5461,6 +5777,32 @@ riscv_expand_prologue (void)
     }
 }
 
+static rtx
+riscv_adjust_multi_pop_cfi_epilogue (int saved_size)
+{
+  rtx dwarf = NULL_RTX;
+  rtx adjust_sp_rtx, reg;
+  unsigned int mask = cfun->machine->frame.mask;
+
+  if (mask & S10_MASK)
+    mask |= S11_MASK;
+
+  /* Debug info for adjust sp.  */
+  adjust_sp_rtx = gen_rtx_SET (stack_pointer_rtx,
+                               plus_constant(Pmode, stack_pointer_rtx, saved_size));
+  dwarf = alloc_reg_note (REG_CFA_ADJUST_CFA, adjust_sp_rtx,
+                          dwarf);
+
+  for (int regno = GP_REG_FIRST; regno <= GP_REG_LAST; regno++)
+    if (BITSET_P (mask, regno - GP_REG_FIRST))
+      {
+        reg = gen_rtx_REG (SImode, regno);
+        dwarf = alloc_reg_note (REG_CFA_RESTORE, reg, dwarf);
+      }
+
+  return dwarf;
+}
+
 static rtx
 riscv_adjust_libcall_cfi_epilogue ()
 {
@@ -5500,10 +5842,18 @@ riscv_expand_epilogue (int style)
   struct riscv_frame_info *frame = &cfun->machine->frame;
   unsigned mask = frame->mask;
   HOST_WIDE_INT step2 = 0;
-  bool use_restore_libcall = ((style == NORMAL_RETURN)
-			      && riscv_use_save_libcall (frame));
-  unsigned libcall_size = (use_restore_libcall
-			   ? frame->save_libcall_adjustment : 0);
+  bool use_multi_pop_normal = ((style == NORMAL_RETURN)
+                              && riscv_use_multi_push (frame));
+  bool use_multi_pop_sibcall = ((style == SIBCALL_RETURN)
+                              && riscv_use_multi_push (frame));
+  bool use_multi_pop = use_multi_pop_normal || use_multi_pop_sibcall;
+
+  bool use_restore_libcall = !use_multi_pop && ((style == NORMAL_RETURN)
+                              && riscv_use_save_libcall (frame));
+  unsigned libcall_size = use_restore_libcall && !use_multi_pop ?
+                            frame->save_libcall_adjustment : 0;
+  unsigned multipop_size = use_multi_pop ?
+                            frame->multi_push_adj_base + frame->multi_push_adj_addi : 0;
   rtx ra = gen_rtx_REG (Pmode, RETURN_ADDR_REGNUM);
   rtx insn;
 
@@ -5574,18 +5924,25 @@ riscv_expand_epilogue (int style)
       REG_NOTES (insn) = dwarf;
     }
 
-  if (use_restore_libcall)
-    frame->mask = 0; /* Temporarily fib for GPRs.  */
+  if (use_restore_libcall || use_multi_pop)
+    frame->mask = 0; /* Temporarily fib that we need not save GPRs.  */
 
   /* If we need to restore registers, deallocate as much stack as
      possible in the second step without going out of range.  */
-  if ((frame->mask | frame->fmask) != 0)
+  if (use_multi_pop)
+    {
+      if (frame->fmask
+          && known_gt (frame->total_size - multipop_size,
+                      frame->frame_pointer_offset))
+        step2 = riscv_first_stack_step (frame, frame->total_size - multipop_size);
+    }
+  else if ((frame->mask | frame->fmask) != 0)
     step2 = riscv_first_stack_step (frame, frame->total_size - libcall_size);
 
-  if (use_restore_libcall)
+  if (use_restore_libcall || use_multi_pop)
     frame->mask = mask; /* Undo the above fib.  */
 
-  poly_int64 step1 = frame->total_size - step2 - libcall_size;
+  poly_int64 step1 = frame->total_size - step2 - libcall_size - multipop_size;
 
   /* Set TARGET to BASE + STEP1.  */
   if (known_gt (step1, 0))
@@ -5620,7 +5977,7 @@ riscv_expand_epilogue (int style)
 					   adjust));
 	  rtx dwarf = NULL_RTX;
 	  rtx cfa_adjust_rtx = gen_rtx_PLUS (Pmode, stack_pointer_rtx,
-					     GEN_INT (step2));
+					     GEN_INT (step2 + libcall_size + multipop_size));
 
 	  dwarf = alloc_reg_note (REG_CFA_DEF_CFA, cfa_adjust_rtx, dwarf);
 	  RTX_FRAME_RELATED_P (insn) = 1;
@@ -5635,15 +5992,15 @@ riscv_expand_epilogue (int style)
       epilogue_cfa_sp_offset = step2;
     }
 
-  if (use_restore_libcall)
+  if (use_restore_libcall || use_multi_pop)
     frame->mask = 0; /* Temporarily fib that we need not save GPRs.  */
 
   /* Restore the registers.  */
-  riscv_for_each_saved_reg (frame->total_size - step2 - libcall_size,
+  riscv_for_each_saved_reg (frame->total_size - step2 - libcall_size - multipop_size,
 			    riscv_restore_reg,
 			    true, style == EXCEPTION_RETURN);
 
-  if (use_restore_libcall)
+  if (use_restore_libcall || use_multi_pop)
       frame->mask = mask; /* Undo the above fib.  */
 
   if (need_barrier_p)
@@ -5657,14 +6014,30 @@ riscv_expand_epilogue (int style)
 
       rtx dwarf = NULL_RTX;
       rtx cfa_adjust_rtx = gen_rtx_PLUS (Pmode, stack_pointer_rtx,
-					 const0_rtx);
+					 GEN_INT (libcall_size + multipop_size));
       dwarf = alloc_reg_note (REG_CFA_DEF_CFA, cfa_adjust_rtx, dwarf);
       RTX_FRAME_RELATED_P (insn) = 1;
 
       REG_NOTES (insn) = dwarf;
     }
 
-  if (use_restore_libcall)
+  if (use_multi_pop)
+    {
+      unsigned regs_count = riscv_multi_push_regs_count (frame->mask);
+      if (use_multi_pop_normal)
+        insn = emit_jump_insn (
+          riscv_gen_multi_push_pop_insn (POPRET_IDX, multipop_size, regs_count));
+      else
+        insn = emit_insn (
+          riscv_gen_multi_push_pop_insn(POP_IDX, multipop_size, regs_count));
+
+      rtx dwarf = riscv_adjust_multi_pop_cfi_epilogue (multipop_size);
+      RTX_FRAME_RELATED_P (insn) = 1;
+      REG_NOTES (insn) = dwarf;
+      if (use_multi_pop_normal)
+        return;
+    }
+  else if (use_restore_libcall)
     {
       rtx dwarf = riscv_adjust_libcall_cfi_epilogue ();
       insn = emit_insn (gen_gpr_restore (GEN_INT (riscv_save_libcall_count (mask))));
@@ -6937,6 +7310,30 @@ riscv_gen_gpr_save_insn (struct riscv_frame_info *frame)
   return gen_rtx_PARALLEL (VOIDmode, vec);
 }
 
+static HOST_WIDE_INT zcmp_base_adj(int regs_num)
+{
+  return riscv_16bytes_align ((regs_num) * GET_MODE_SIZE (word_mode));
+}
+
+static HOST_WIDE_INT zcmp_additional_adj(HOST_WIDE_INT total, int regs_num)
+{
+  return total - zcmp_base_adj(regs_num);
+}
+
+bool riscv_zcmp_valid_slot_offset_p (HOST_WIDE_INT offset, int slot_idx)
+{
+  return offset == -1 * (slot_idx + 1) * GET_MODE_SIZE (word_mode);
+}
+
+bool riscv_zcmp_valid_stack_adj_bytes_p (HOST_WIDE_INT total, int regs_num)
+{
+  HOST_WIDE_INT additional_bytes = zcmp_additional_adj(total, regs_num);
+  return additional_bytes == 0
+         || additional_bytes == 1 * ZCMP_SP_INC_STEP
+         || additional_bytes == 2 * ZCMP_SP_INC_STEP
+         || additional_bytes == ZCMP_MAX_SPIMM * ZCMP_SP_INC_STEP;
+}
+
 /* Return true if it's valid gpr_save pattern.  */
 
 bool
diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
index 13038a39e5c..ff210083004 100644
--- a/gcc/config/riscv/riscv.h
+++ b/gcc/config/riscv/riscv.h
@@ -413,6 +413,29 @@ ASM_MISA_SPEC
 #define RISCV_CALL_ADDRESS_TEMP(MODE) \
   gen_rtx_REG (MODE, RISCV_CALL_ADDRESS_TEMP_REGNUM)
 
+#define RETURN_ADDR_MASK        ( 1 << RETURN_ADDR_REGNUM)
+#define S0_MASK                 ( 1 << S0_REGNUM)
+#define S1_MASK                 ( 1 << S1_REGNUM)
+#define S2_MASK                 ( 1 << S2_REGNUM)
+#define S3_MASK                 ( 1 << S3_REGNUM)
+#define S4_MASK                 ( 1 << S4_REGNUM)
+#define S5_MASK                 ( 1 << S5_REGNUM)
+#define S6_MASK                 ( 1 << S6_REGNUM)
+#define S7_MASK                 ( 1 << S7_REGNUM)
+#define S8_MASK                 ( 1 << S8_REGNUM)
+#define S9_MASK                 ( 1 << S9_REGNUM)
+#define S10_MASK                ( 1 << S10_REGNUM)
+#define S11_MASK                ( 1 << S11_REGNUM)
+
+#define MULTI_PUSH_GPR_MASK ( RETURN_ADDR_MASK | S0_MASK | S1_MASK | S2_MASK  | S3_MASK \
+                                               | S4_MASK | S5_MASK | S6_MASK  | S7_MASK \
+                                               | S8_MASK | S9_MASK | S10_MASK | S11_MASK )
+#define ZCMP_MAX_SPIMM 3
+#define ZCMP_SP_INC_STEP 16
+#define ZCMP_INVALID_S0S10_SREGS_COUNTS 11
+#define ZCMP_S0S11_SREGS_COUNTS 12
+#define ZCMP_MAX_GRP_SLOTS 13
+
 #define MCOUNT_NAME "_mcount"
 
 #define NO_PROFILE_COUNTERS 1
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 7065e68c0b7..73fc8cb69bc 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -113,6 +113,7 @@
 
 (define_constants
   [(RETURN_ADDR_REGNUM		1)
+   (SP_REGNUM 			2)
    (GP_REGNUM 			3)
    (TP_REGNUM			4)
    (T0_REGNUM			5)
@@ -3205,3 +3206,4 @@
 (include "sifive-7.md")
 (include "thead.md")
 (include "vector.md")
+(include "zc.md")
diff --git a/gcc/config/riscv/zc.md b/gcc/config/riscv/zc.md
new file mode 100644
index 00000000000..6e6c87983fb
--- /dev/null
+++ b/gcc/config/riscv/zc.md
@@ -0,0 +1,1042 @@
+;; Machine description for RISC-V Zc extension.
+;; Copyright (C) 2011-2023 Free Software Foundation, Inc.
+;; Contributed by Fei Gao (gaofei@eswincomputing.com).
+
+;; This file is part of GCC.
+
+;; GCC is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation; either version 3, or (at your option)
+;; any later version.
+
+;; GCC is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; <http://www.gnu.org/licenses/>.
+
+(define_insn "gpr_multi_pop_up_to_ra_<mode>"
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_ra_operand" "I")))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 1 "slot_0_offset_operand" "I"))))]
+  "TARGET_ZCMP"
+  "cm.pop	{ra}, %0"
+)
+
+(define_insn "gpr_multi_pop_up_to_s0_<mode>"
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_s0_operand" "I")))
+   (set (reg:X S0_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 1 "slot_0_offset_operand" "I"))))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 2 "slot_1_offset_operand" "I"))))]
+  "TARGET_ZCMP"
+  "cm.pop	{ra, s0}, %0"
+)
+
+(define_insn "gpr_multi_pop_up_to_s1_<mode>"
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_s1_operand" "I")))
+   (set (reg:X S1_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 1 "slot_0_offset_operand" "I"))))
+   (set (reg:X S0_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 2 "slot_1_offset_operand" "I"))))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 3 "slot_2_offset_operand" "I"))))]
+  "TARGET_ZCMP"
+  "cm.pop	{ra, s0-s1}, %0"
+)
+
+(define_insn "gpr_multi_pop_up_to_s2_<mode>"
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_s2_operand" "I")))
+   (set (reg:X S2_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 1 "slot_0_offset_operand" "I"))))
+   (set (reg:X S1_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 2 "slot_1_offset_operand" "I"))))
+   (set (reg:X S0_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 3 "slot_2_offset_operand" "I"))))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 4 "slot_3_offset_operand" "I"))))]
+  "TARGET_ZCMP"
+  "cm.pop	{ra, s0-s2}, %0"
+)
+
+(define_insn "gpr_multi_pop_up_to_s3_<mode>"
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_s3_operand" "I")))
+   (set (reg:X S3_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 1 "slot_0_offset_operand" "I"))))
+   (set (reg:X S2_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 2 "slot_1_offset_operand" "I"))))
+   (set (reg:X S1_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 3 "slot_2_offset_operand" "I"))))
+   (set (reg:X S0_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 4 "slot_3_offset_operand" "I"))))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 5 "slot_4_offset_operand" "I"))))]
+  "TARGET_ZCMP"
+  "cm.pop	{ra, s0-s3}, %0"
+)
+
+(define_insn "gpr_multi_pop_up_to_s4_<mode>"
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_s4_operand" "I")))
+   (set (reg:X S4_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 1 "slot_0_offset_operand" "I"))))
+   (set (reg:X S3_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 2 "slot_1_offset_operand" "I"))))
+   (set (reg:X S2_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 3 "slot_2_offset_operand" "I"))))
+   (set (reg:X S1_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 4 "slot_3_offset_operand" "I"))))
+   (set (reg:X S0_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 5 "slot_4_offset_operand" "I"))))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 6 "slot_5_offset_operand" "I"))))]
+  "TARGET_ZCMP"
+  "cm.pop	{ra, s0-s4}, %0"
+)
+
+(define_insn "gpr_multi_pop_up_to_s5_<mode>"
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_s5_operand" "I")))
+   (set (reg:X S5_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 1 "slot_0_offset_operand" "I"))))
+   (set (reg:X S4_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 2 "slot_1_offset_operand" "I"))))
+   (set (reg:X S3_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 3 "slot_2_offset_operand" "I"))))
+   (set (reg:X S2_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 4 "slot_3_offset_operand" "I"))))
+   (set (reg:X S1_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 5 "slot_4_offset_operand" "I"))))
+   (set (reg:X S0_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 6 "slot_5_offset_operand" "I"))))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 7 "slot_6_offset_operand" "I"))))]
+  "TARGET_ZCMP"
+  "cm.pop	{ra, s0-s5}, %0"
+)
+
+(define_insn "gpr_multi_pop_up_to_s6_<mode>"
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_s6_operand" "I")))
+   (set (reg:X S6_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 1 "slot_0_offset_operand" "I"))))
+   (set (reg:X S5_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 2 "slot_1_offset_operand" "I"))))
+   (set (reg:X S4_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 3 "slot_2_offset_operand" "I"))))
+   (set (reg:X S3_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 4 "slot_3_offset_operand" "I"))))
+   (set (reg:X S2_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 5 "slot_4_offset_operand" "I"))))
+   (set (reg:X S1_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 6 "slot_5_offset_operand" "I"))))
+   (set (reg:X S0_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 7 "slot_6_offset_operand" "I"))))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 8 "slot_7_offset_operand" "I"))))]
+  "TARGET_ZCMP"
+  "cm.pop	{ra, s0-s6}, %0"
+)
+
+(define_insn "gpr_multi_pop_up_to_s7_<mode>"
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_s7_operand" "I")))
+   (set (reg:X S7_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 1 "slot_0_offset_operand" "I"))))
+   (set (reg:X S6_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 2 "slot_1_offset_operand" "I"))))
+   (set (reg:X S5_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 3 "slot_2_offset_operand" "I"))))
+   (set (reg:X S4_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 4 "slot_3_offset_operand" "I"))))
+   (set (reg:X S3_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 5 "slot_4_offset_operand" "I"))))
+   (set (reg:X S2_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 6 "slot_5_offset_operand" "I"))))
+   (set (reg:X S1_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 7 "slot_6_offset_operand" "I"))))
+   (set (reg:X S0_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 8 "slot_7_offset_operand" "I"))))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 9 "slot_8_offset_operand" "I"))))]
+  "TARGET_ZCMP"
+  "cm.pop	{ra, s0-s7}, %0"
+)
+
+(define_insn "gpr_multi_pop_up_to_s8_<mode>"
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_s8_operand" "I")))
+   (set (reg:X S8_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 1 "slot_0_offset_operand" "I"))))
+   (set (reg:X S7_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 2 "slot_1_offset_operand" "I"))))
+   (set (reg:X S6_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 3 "slot_2_offset_operand" "I"))))
+   (set (reg:X S5_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 4 "slot_3_offset_operand" "I"))))
+   (set (reg:X S4_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 5 "slot_4_offset_operand" "I"))))
+   (set (reg:X S3_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 6 "slot_5_offset_operand" "I"))))
+   (set (reg:X S2_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 7 "slot_6_offset_operand" "I"))))
+   (set (reg:X S1_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 8 "slot_7_offset_operand" "I"))))
+   (set (reg:X S0_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 9 "slot_8_offset_operand" "I"))))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 10 "slot_9_offset_operand" "I"))))]
+  "TARGET_ZCMP"
+  "cm.pop	{ra, s0-s8}, %0"
+)
+
+(define_insn "gpr_multi_pop_up_to_s9_<mode>"
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_s9_operand" "I")))
+   (set (reg:X S9_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 1 "slot_0_offset_operand" "I"))))
+   (set (reg:X S8_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 2 "slot_1_offset_operand" "I"))))
+   (set (reg:X S7_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 3 "slot_2_offset_operand" "I"))))
+   (set (reg:X S6_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 4 "slot_3_offset_operand" "I"))))
+   (set (reg:X S5_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 5 "slot_4_offset_operand" "I"))))
+   (set (reg:X S4_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 6 "slot_5_offset_operand" "I"))))
+   (set (reg:X S3_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 7 "slot_6_offset_operand" "I"))))
+   (set (reg:X S2_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 8 "slot_7_offset_operand" "I"))))
+   (set (reg:X S1_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 9 "slot_8_offset_operand" "I"))))
+   (set (reg:X S0_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 10 "slot_9_offset_operand" "I"))))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 11 "slot_10_offset_operand" "I"))))]
+  "TARGET_ZCMP"
+  "cm.pop	{ra, s0-s9}, %0"
+)
+
+(define_insn "gpr_multi_pop_up_to_s11_<mode>"
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_s11_operand" "I")))
+   (set (reg:X S11_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 1 "slot_0_offset_operand" "I"))))
+   (set (reg:X S10_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 2 "slot_1_offset_operand" "I"))))
+   (set (reg:X S9_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 3 "slot_2_offset_operand" "I"))))
+   (set (reg:X S8_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 4 "slot_3_offset_operand" "I"))))
+   (set (reg:X S7_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 5 "slot_4_offset_operand" "I"))))
+   (set (reg:X S6_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 6 "slot_5_offset_operand" "I"))))
+   (set (reg:X S5_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 7 "slot_6_offset_operand" "I"))))
+   (set (reg:X S4_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 8 "slot_7_offset_operand" "I"))))
+   (set (reg:X S3_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 9 "slot_8_offset_operand" "I"))))
+   (set (reg:X S2_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 10 "slot_9_offset_operand" "I"))))
+   (set (reg:X S1_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 11 "slot_10_offset_operand" "I"))))
+   (set (reg:X S0_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 12 "slot_11_offset_operand" "I"))))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 13 "slot_12_offset_operand" "I"))))]
+  "TARGET_ZCMP"
+  "cm.pop	{ra, s0-s11}, %0"
+)
+
+(define_insn "gpr_multi_popret_up_to_ra_<mode>" ;; Pop ra, free frame, return.
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_ra_operand" "I")))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 1 "slot_0_offset_operand" "I"))))
+   (return)
+   (use (reg:SI RETURN_ADDR_REGNUM))]
+  "TARGET_ZCMP"
+  "cm.popret	{ra}, %0"
+)
+
+(define_insn "gpr_multi_popret_up_to_s0_<mode>"
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_s0_operand" "I")))
+   (set (reg:X S0_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 1 "slot_0_offset_operand" "I"))))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 2 "slot_1_offset_operand" "I"))))
+   (return)
+   (use (reg:SI RETURN_ADDR_REGNUM))]
+  "TARGET_ZCMP"
+  "cm.popret	{ra, s0}, %0"
+)
+
+(define_insn "gpr_multi_popret_up_to_s1_<mode>"
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_s1_operand" "I")))
+   (set (reg:X S1_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 1 "slot_0_offset_operand" "I"))))
+   (set (reg:X S0_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 2 "slot_1_offset_operand" "I"))))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 3 "slot_2_offset_operand" "I"))))
+   (return)
+   (use (reg:SI RETURN_ADDR_REGNUM))]
+  "TARGET_ZCMP"
+  "cm.popret	{ra, s0-s1}, %0"
+)
+
+(define_insn "gpr_multi_popret_up_to_s2_<mode>"
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_s2_operand" "I")))
+   (set (reg:X S2_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 1 "slot_0_offset_operand" "I"))))
+   (set (reg:X S1_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 2 "slot_1_offset_operand" "I"))))
+   (set (reg:X S0_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 3 "slot_2_offset_operand" "I"))))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 4 "slot_3_offset_operand" "I"))))
+   (return)
+   (use (reg:SI RETURN_ADDR_REGNUM))]
+  "TARGET_ZCMP"
+  "cm.popret	{ra, s0-s2}, %0"
+)
+
+(define_insn "gpr_multi_popret_up_to_s3_<mode>"
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_s3_operand" "I")))
+   (set (reg:X S3_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 1 "slot_0_offset_operand" "I"))))
+   (set (reg:X S2_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 2 "slot_1_offset_operand" "I"))))
+   (set (reg:X S1_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 3 "slot_2_offset_operand" "I"))))
+   (set (reg:X S0_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 4 "slot_3_offset_operand" "I"))))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 5 "slot_4_offset_operand" "I"))))
+   (return)
+   (use (reg:SI RETURN_ADDR_REGNUM))]
+  "TARGET_ZCMP"
+  "cm.popret	{ra, s0-s3}, %0"
+)
+
+(define_insn "gpr_multi_popret_up_to_s4_<mode>"
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_s4_operand" "I")))
+   (set (reg:X S4_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 1 "slot_0_offset_operand" "I"))))
+   (set (reg:X S3_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 2 "slot_1_offset_operand" "I"))))
+   (set (reg:X S2_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 3 "slot_2_offset_operand" "I"))))
+   (set (reg:X S1_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 4 "slot_3_offset_operand" "I"))))
+   (set (reg:X S0_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 5 "slot_4_offset_operand" "I"))))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 6 "slot_5_offset_operand" "I"))))
+   (return)
+   (use (reg:SI RETURN_ADDR_REGNUM))]
+  "TARGET_ZCMP"
+  "cm.popret	{ra, s0-s4}, %0"
+)
+
+(define_insn "gpr_multi_popret_up_to_s5_<mode>"
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_s5_operand" "I")))
+   (set (reg:X S5_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 1 "slot_0_offset_operand" "I"))))
+   (set (reg:X S4_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 2 "slot_1_offset_operand" "I"))))
+   (set (reg:X S3_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 3 "slot_2_offset_operand" "I"))))
+   (set (reg:X S2_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 4 "slot_3_offset_operand" "I"))))
+   (set (reg:X S1_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 5 "slot_4_offset_operand" "I"))))
+   (set (reg:X S0_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 6 "slot_5_offset_operand" "I"))))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 7 "slot_6_offset_operand" "I"))))
+   (return)
+   (use (reg:SI RETURN_ADDR_REGNUM))]
+  "TARGET_ZCMP"
+  "cm.popret	{ra, s0-s5}, %0"
+)
+
+(define_insn "gpr_multi_popret_up_to_s6_<mode>"
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_s6_operand" "I")))
+   (set (reg:X S6_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 1 "slot_0_offset_operand" "I"))))
+   (set (reg:X S5_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 2 "slot_1_offset_operand" "I"))))
+   (set (reg:X S4_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 3 "slot_2_offset_operand" "I"))))
+   (set (reg:X S3_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 4 "slot_3_offset_operand" "I"))))
+   (set (reg:X S2_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 5 "slot_4_offset_operand" "I"))))
+   (set (reg:X S1_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 6 "slot_5_offset_operand" "I"))))
+   (set (reg:X S0_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 7 "slot_6_offset_operand" "I"))))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 8 "slot_7_offset_operand" "I"))))
+   (return)
+   (use (reg:SI RETURN_ADDR_REGNUM))]
+  "TARGET_ZCMP"
+  "cm.popret	{ra, s0-s6}, %0"
+)
+
+(define_insn "gpr_multi_popret_up_to_s7_<mode>"
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_s7_operand" "I")))
+   (set (reg:X S7_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 1 "slot_0_offset_operand" "I"))))
+   (set (reg:X S6_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 2 "slot_1_offset_operand" "I"))))
+   (set (reg:X S5_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 3 "slot_2_offset_operand" "I"))))
+   (set (reg:X S4_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 4 "slot_3_offset_operand" "I"))))
+   (set (reg:X S3_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 5 "slot_4_offset_operand" "I"))))
+   (set (reg:X S2_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 6 "slot_5_offset_operand" "I"))))
+   (set (reg:X S1_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 7 "slot_6_offset_operand" "I"))))
+   (set (reg:X S0_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 8 "slot_7_offset_operand" "I"))))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 9 "slot_8_offset_operand" "I"))))
+   (return)
+   (use (reg:SI RETURN_ADDR_REGNUM))]
+  "TARGET_ZCMP"
+  "cm.popret	{ra, s0-s7}, %0"
+)
+
+(define_insn "gpr_multi_popret_up_to_s8_<mode>"
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_s8_operand" "I")))
+   (set (reg:X S8_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 1 "slot_0_offset_operand" "I"))))
+   (set (reg:X S7_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 2 "slot_1_offset_operand" "I"))))
+   (set (reg:X S6_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 3 "slot_2_offset_operand" "I"))))
+   (set (reg:X S5_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 4 "slot_3_offset_operand" "I"))))
+   (set (reg:X S4_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 5 "slot_4_offset_operand" "I"))))
+   (set (reg:X S3_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 6 "slot_5_offset_operand" "I"))))
+   (set (reg:X S2_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 7 "slot_6_offset_operand" "I"))))
+   (set (reg:X S1_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 8 "slot_7_offset_operand" "I"))))
+   (set (reg:X S0_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 9 "slot_8_offset_operand" "I"))))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 10 "slot_9_offset_operand" "I"))))
+   (return)
+   (use (reg:SI RETURN_ADDR_REGNUM))]
+  "TARGET_ZCMP"
+  "cm.popret	{ra, s0-s8}, %0"
+)
+
+(define_insn "gpr_multi_popret_up_to_s9_<mode>"
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_s9_operand" "I")))
+   (set (reg:X S9_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 1 "slot_0_offset_operand" "I"))))
+   (set (reg:X S8_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 2 "slot_1_offset_operand" "I"))))
+   (set (reg:X S7_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 3 "slot_2_offset_operand" "I"))))
+   (set (reg:X S6_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 4 "slot_3_offset_operand" "I"))))
+   (set (reg:X S5_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 5 "slot_4_offset_operand" "I"))))
+   (set (reg:X S4_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 6 "slot_5_offset_operand" "I"))))
+   (set (reg:X S3_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 7 "slot_6_offset_operand" "I"))))
+   (set (reg:X S2_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 8 "slot_7_offset_operand" "I"))))
+   (set (reg:X S1_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 9 "slot_8_offset_operand" "I"))))
+   (set (reg:X S0_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 10 "slot_9_offset_operand" "I"))))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 11 "slot_10_offset_operand" "I"))))
+   (return)
+   (use (reg:SI RETURN_ADDR_REGNUM))]
+  "TARGET_ZCMP"
+  "cm.popret	{ra, s0-s9}, %0"
+)
+
+(define_insn "gpr_multi_popret_up_to_s11_<mode>"
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_s11_operand" "I")))
+   (set (reg:X S11_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 1 "slot_0_offset_operand" "I"))))
+   (set (reg:X S10_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 2 "slot_1_offset_operand" "I"))))
+   (set (reg:X S9_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 3 "slot_2_offset_operand" "I"))))
+   (set (reg:X S8_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 4 "slot_3_offset_operand" "I"))))
+   (set (reg:X S7_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 5 "slot_4_offset_operand" "I"))))
+   (set (reg:X S6_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 6 "slot_5_offset_operand" "I"))))
+   (set (reg:X S5_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 7 "slot_6_offset_operand" "I"))))
+   (set (reg:X S4_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 8 "slot_7_offset_operand" "I"))))
+   (set (reg:X S3_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 9 "slot_8_offset_operand" "I"))))
+   (set (reg:X S2_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 10 "slot_9_offset_operand" "I"))))
+   (set (reg:X S1_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 11 "slot_10_offset_operand" "I"))))
+   (set (reg:X S0_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 12 "slot_11_offset_operand" "I"))))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 13 "slot_12_offset_operand" "I"))))
+   (return)
+   (use (reg:SI RETURN_ADDR_REGNUM))]
+  "TARGET_ZCMP"
+  "cm.popret	{ra, s0-s11}, %0"
+)
+
+(define_insn "gpr_multi_push_up_to_ra_<mode>" ;; Save ra, add operand 0 to sp.
+  [(set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 1 "slot_0_offset_operand" "I")))
+        (reg:X RETURN_ADDR_REGNUM))
+   (set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_push_up_to_ra_operand" "I")))]
+  "TARGET_ZCMP"
+  "cm.push	{ra}, %0"
+)
+
+(define_insn "gpr_multi_push_up_to_s0_<mode>"
+  [(set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 1 "slot_0_offset_operand" "I")))
+        (reg:X S0_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 2 "slot_1_offset_operand" "I")))
+        (reg:X RETURN_ADDR_REGNUM))
+   (set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_push_up_to_s0_operand" "I")))]
+  "TARGET_ZCMP"
+  "cm.push	{ra, s0}, %0"
+)
+
+(define_insn "gpr_multi_push_up_to_s1_<mode>"
+  [(set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 1 "slot_0_offset_operand" "I")))
+        (reg:X S1_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 2 "slot_1_offset_operand" "I")))
+        (reg:X S0_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 3 "slot_2_offset_operand" "I")))
+        (reg:X RETURN_ADDR_REGNUM))
+   (set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_push_up_to_s1_operand" "I")))]
+  "TARGET_ZCMP"
+  "cm.push	{ra, s0-s1}, %0"
+)
+
+(define_insn "gpr_multi_push_up_to_s2_<mode>"
+  [(set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 1 "slot_0_offset_operand" "I")))
+        (reg:X S2_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 2 "slot_1_offset_operand" "I")))
+        (reg:X S1_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 3 "slot_2_offset_operand" "I")))
+        (reg:X S0_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 4 "slot_3_offset_operand" "I")))
+        (reg:X RETURN_ADDR_REGNUM))
+   (set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_push_up_to_s2_operand" "I")))]
+  "TARGET_ZCMP"
+  "cm.push	{ra, s0-s2}, %0"
+)
+
+(define_insn "gpr_multi_push_up_to_s3_<mode>"
+  [(set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 1 "slot_0_offset_operand" "I")))
+        (reg:X S3_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 2 "slot_1_offset_operand" "I")))
+        (reg:X S2_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 3 "slot_2_offset_operand" "I")))
+        (reg:X S1_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 4 "slot_3_offset_operand" "I")))
+        (reg:X S0_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 5 "slot_4_offset_operand" "I")))
+        (reg:X RETURN_ADDR_REGNUM))
+   (set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_push_up_to_s3_operand" "I")))]
+  "TARGET_ZCMP"
+  "cm.push	{ra, s0-s3}, %0"
+)
+
+(define_insn "gpr_multi_push_up_to_s4_<mode>"
+  [(set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 1 "slot_0_offset_operand" "I")))
+        (reg:X S4_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 2 "slot_1_offset_operand" "I")))
+        (reg:X S3_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 3 "slot_2_offset_operand" "I")))
+        (reg:X S2_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 4 "slot_3_offset_operand" "I")))
+        (reg:X S1_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 5 "slot_4_offset_operand" "I")))
+        (reg:X S0_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 6 "slot_5_offset_operand" "I")))
+        (reg:X RETURN_ADDR_REGNUM))
+   (set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_push_up_to_s4_operand" "I")))]
+  "TARGET_ZCMP"
+  "cm.push	{ra, s0-s4}, %0"
+)
+
+(define_insn "gpr_multi_push_up_to_s5_<mode>"
+  [(set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 1 "slot_0_offset_operand" "I")))
+        (reg:X S5_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 2 "slot_1_offset_operand" "I")))
+        (reg:X S4_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 3 "slot_2_offset_operand" "I")))
+        (reg:X S3_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 4 "slot_3_offset_operand" "I")))
+        (reg:X S2_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 5 "slot_4_offset_operand" "I")))
+        (reg:X S1_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 6 "slot_5_offset_operand" "I")))
+        (reg:X S0_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 7 "slot_6_offset_operand" "I")))
+        (reg:X RETURN_ADDR_REGNUM))
+   (set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_push_up_to_s5_operand" "I")))]
+  "TARGET_ZCMP"
+  "cm.push	{ra, s0-s5}, %0"
+)
+
+(define_insn "gpr_multi_push_up_to_s6_<mode>"
+  [(set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 1 "slot_0_offset_operand" "I")))
+        (reg:X S6_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 2 "slot_1_offset_operand" "I")))
+        (reg:X S5_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 3 "slot_2_offset_operand" "I")))
+        (reg:X S4_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 4 "slot_3_offset_operand" "I")))
+        (reg:X S3_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 5 "slot_4_offset_operand" "I")))
+        (reg:X S2_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 6 "slot_5_offset_operand" "I")))
+        (reg:X S1_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 7 "slot_6_offset_operand" "I")))
+        (reg:X S0_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 8 "slot_7_offset_operand" "I")))
+        (reg:X RETURN_ADDR_REGNUM))
+   (set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_push_up_to_s6_operand" "I")))]
+  "TARGET_ZCMP"
+  "cm.push	{ra, s0-s6}, %0"
+)
+
+(define_insn "gpr_multi_push_up_to_s7_<mode>"
+  [(set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 1 "slot_0_offset_operand" "I")))
+        (reg:X S7_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 2 "slot_1_offset_operand" "I")))
+        (reg:X S6_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 3 "slot_2_offset_operand" "I")))
+        (reg:X S5_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 4 "slot_3_offset_operand" "I")))
+        (reg:X S4_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 5 "slot_4_offset_operand" "I")))
+        (reg:X S3_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 6 "slot_5_offset_operand" "I")))
+        (reg:X S2_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 7 "slot_6_offset_operand" "I")))
+        (reg:X S1_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 8 "slot_7_offset_operand" "I")))
+        (reg:X S0_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 9 "slot_8_offset_operand" "I")))
+        (reg:X RETURN_ADDR_REGNUM))
+   (set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_push_up_to_s7_operand" "I")))]
+  "TARGET_ZCMP"
+  "cm.push	{ra, s0-s7}, %0"
+)
+
+(define_insn "gpr_multi_push_up_to_s8_<mode>"
+  [(set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 1 "slot_0_offset_operand" "I")))
+        (reg:X S8_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 2 "slot_1_offset_operand" "I")))
+        (reg:X S7_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 3 "slot_2_offset_operand" "I")))
+        (reg:X S6_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 4 "slot_3_offset_operand" "I")))
+        (reg:X S5_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 5 "slot_4_offset_operand" "I")))
+        (reg:X S4_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 6 "slot_5_offset_operand" "I")))
+        (reg:X S3_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 7 "slot_6_offset_operand" "I")))
+        (reg:X S2_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 8 "slot_7_offset_operand" "I")))
+        (reg:X S1_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 9 "slot_8_offset_operand" "I")))
+        (reg:X S0_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 10 "slot_9_offset_operand" "I")))
+        (reg:X RETURN_ADDR_REGNUM))
+   (set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_push_up_to_s8_operand" "I")))]
+  "TARGET_ZCMP"
+  "cm.push	{ra, s0-s8}, %0"
+)
+
+(define_insn "gpr_multi_push_up_to_s9_<mode>"
+  [(set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 1 "slot_0_offset_operand" "I")))
+        (reg:X S9_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 2 "slot_1_offset_operand" "I")))
+        (reg:X S8_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 3 "slot_2_offset_operand" "I")))
+        (reg:X S7_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 4 "slot_3_offset_operand" "I")))
+        (reg:X S6_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 5 "slot_4_offset_operand" "I")))
+        (reg:X S5_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 6 "slot_5_offset_operand" "I")))
+        (reg:X S4_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 7 "slot_6_offset_operand" "I")))
+        (reg:X S3_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 8 "slot_7_offset_operand" "I")))
+        (reg:X S2_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 9 "slot_8_offset_operand" "I")))
+        (reg:X S1_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 10 "slot_9_offset_operand" "I")))
+        (reg:X S0_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 11 "slot_10_offset_operand" "I")))
+        (reg:X RETURN_ADDR_REGNUM))
+   (set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_push_up_to_s9_operand" "I")))]
+  "TARGET_ZCMP"
+  "cm.push	{ra, s0-s9}, %0"
+)
+
+(define_insn "gpr_multi_push_up_to_s11_<mode>"
+  [(set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 1 "slot_0_offset_operand" "I")))
+        (reg:X S11_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 2 "slot_1_offset_operand" "I")))
+        (reg:X S10_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 3 "slot_2_offset_operand" "I")))
+        (reg:X S9_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 4 "slot_3_offset_operand" "I")))
+        (reg:X S8_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 5 "slot_4_offset_operand" "I")))
+        (reg:X S7_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 6 "slot_5_offset_operand" "I")))
+        (reg:X S6_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 7 "slot_6_offset_operand" "I")))
+        (reg:X S5_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 8 "slot_7_offset_operand" "I")))
+        (reg:X S4_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 9 "slot_8_offset_operand" "I")))
+        (reg:X S3_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 10 "slot_9_offset_operand" "I")))
+        (reg:X S2_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 11 "slot_10_offset_operand" "I")))
+        (reg:X S1_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 12 "slot_11_offset_operand" "I")))
+        (reg:X S0_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (match_operand:X 13 "slot_12_offset_operand" "I")))
+        (reg:X RETURN_ADDR_REGNUM))
+   (set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_push_up_to_s11_operand" "I")))]
+  "TARGET_ZCMP"
+  "cm.push	{ra, s0-s11}, %0"
+)
diff --git a/gcc/testsuite/gcc.target/riscv/rv32e_zcmp.c b/gcc/testsuite/gcc.target/riscv/rv32e_zcmp.c
new file mode 100644
index 00000000000..6dbe489da9b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rv32e_zcmp.c
@@ -0,0 +1,239 @@
+/* { dg-do compile } */
+/* { dg-options " -Os -march=rv32e_zca_zcmp -mabi=ilp32e -mcmodel=medlow" } */
+/* { dg-skip-if "" { *-*-* } {"-O0" "-O1" "-O2" "-Og" "-O3" "-Oz" "-flto"} } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+char my_getchar();
+float getf();
+int __attribute__((noinline)) incoming_stack_args
+  (int arg0, int arg1, int arg2, int arg3,
+   int arg4, int arg5, int arg6, int arg7, int arg8);
+int getint();
+void PrintInts (int n, ...); // varargs
+void __attribute__((noinline)) PrintIntsNoVaStart (int n, ...); // varargs
+void PrintInts2 (int arg0, int arg1, int arg2, int arg3, int arg4, int arg5, int n, ...);
+extern void f1(void);
+extern void f2(void);
+
+/*
+**test1:
+**	...
+**	cm.push	{ra, s0-s1}, -64
+**	...
+**	cm.popret	{ra, s0-s1}, 64
+**	...
+*/
+int test1()
+{
+  char volatile array[3120];
+  float volatile farray[3120];
+
+  float sum = 0;
+  for (int i = 0; i < 3120; i++)
+  {
+    array[i] = my_getchar();
+    farray[i] = my_getchar() * 1.2;
+    sum += array[i] + farray[i];
+  }
+  return sum;
+}
+
+/*
+**test2_step1_0_size:
+**	...
+**	cm.push	{ra, s0}, -64
+**	...
+**	cm.popret	{ra, s0}, 64
+**	...
+*/
+int test2_step1_0_size()
+{
+  int volatile iarray[3120 + 1824/4 -8];
+
+  for (int i = 0; i < 3120 + 1824/4 - 8; i++)
+  {
+    iarray[i] = my_getchar() * 2;
+  }
+  return iarray[0] + iarray[1];
+}
+
+/*
+**test3:
+**	...
+**	cm.push	{ra, s0-s1}, -64
+**	...
+**	cm.popret	{ra, s0-s1}, 64
+**	...
+*/
+float test3()
+{
+  char volatile array[3120];
+  float volatile farray[3120];
+
+  float sum = 0, f1 = 0, f2 = 0, f3 = 0, f4 = 0, f5 = 0, f6 = 0, f7 = 0;
+
+  for (int i = 0; i < 3120; i++)
+  {
+    f1 = getf();
+    f2 = getf();
+    f3 = getf();
+    f4 = getf();
+    array[i] = my_getchar();
+    farray[i] = my_getchar() * 1.2;
+    sum += array[i] + farray[i] + f1 + f2 + f3 + f4;
+  }
+  return sum;
+}
+
+/*
+**outgoing_stack_args:
+**	...
+**	cm.push	{ra, s0}, -32
+**	...
+**	cm.popret	{ra, s0}, 32
+**	...
+*/
+int outgoing_stack_args()
+{
+  int  local = getint();
+  return local +incoming_stack_args(0, 1, 2, 3, 4, 5, 6, 7, 8);
+}
+
+/*
+**callPrintInts:
+**	...
+**	cm.push	{ra}, -32
+**	...
+**	cm.popret	{ra}, 32
+**	...
+*/
+float callPrintInts()
+{
+  volatile float f = getf(); // f in local
+  PrintInts(9,1,2,3,4,5,6,7,8,9);
+  return f;
+}
+
+/*
+**callPrint:
+**	...
+**	cm.push	{ra}, -32
+**	...
+**	cm.popret	{ra}, 32
+**	...
+*/
+float callPrint()
+{
+  volatile float f = getf(); // f in local
+  PrintIntsNoVaStart(0,1,2,3,4,5,6,7,8,9);
+  return f;
+}
+
+/*
+**callPrint_S:
+**	...
+**	cm.push	{ra, s0}, -32
+**	...
+**	cm.popret	{ra, s0}, 32
+**	...
+*/
+float callPrint_S()
+{
+  float f = getf();
+  PrintIntsNoVaStart(0,1,2,3,4,5,6,7,8,9);
+  return f;
+}
+
+/*
+**callPrint_2:
+**	...
+**	cm.push	{ra, s0}, -32
+**	...
+**	cm.popret	{ra, s0}, 32
+**	...
+*/
+float callPrint_2()
+{
+  float f = getf();
+  PrintInts2(0,1,2,3,4,5,6,7,8,9);
+  return f;
+}
+
+/*
+**test_step1_0bytes_save_restore:
+**	...
+**	cm.push	{ra}, -16
+**	...
+**	cm.popret	{ra}, 16
+**	...
+*/
+int test_step1_0bytes_save_restore()
+{
+
+  int a  =  9;
+  int b  =  my_getchar();
+  return a +b;
+}
+
+/*
+**test_s0:
+**	...
+**	cm.push	{ra, s0}, -16
+**	...
+**	cm.popret	{ra, s0}, 16
+**	...
+*/
+int test_s0()
+{
+
+  int a  =  my_getchar();
+  int b  =  my_getchar();
+  return a +b;
+}
+
+/*
+**test_s1:
+**	...
+**	cm.push	{ra, s0-s1}, -16
+**	...
+**	cm.popret	{ra, s0-s1}, 16
+**	...
+*/
+int test_s1()
+{
+
+  int s0  =  my_getchar();
+  int s1  =  my_getchar();
+  int b  =  my_getchar();
+  return s1 +s0 +b;
+}
+
+/*
+**test_f0:
+**	...
+**	cm.push	{ra, s0-s1}, -16
+**	...
+**	cm.popret	{ra, s0-s1}, 16
+**	...
+*/
+int test_f0()
+{
+
+  int s0  =  my_getchar();
+  float f0  =  getf(); 
+  int b  =  my_getchar();
+  return f0 +s0 +b;
+}
+
+/*
+**foo:
+**	cm.push	{ra}, -16
+**	call	f1
+**	cm.pop	{ra}, 16
+**	tail	f2
+*/
+void foo(void)
+{
+  f1();
+  f2();
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rv32i_zcmp.c b/gcc/testsuite/gcc.target/riscv/rv32i_zcmp.c
new file mode 100644
index 00000000000..924197cb3c4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rv32i_zcmp.c
@@ -0,0 +1,239 @@
+/* { dg-do compile } */
+/* { dg-options " -Os -march=rv32imaf_zca_zcmp -mabi=ilp32f -mcmodel=medlow" } */
+/* { dg-skip-if "" { *-*-* } {"-O0" "-O1" "-O2" "-Og" "-O3" "-Oz" "-flto"} } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+char my_getchar();
+float getf();
+int __attribute__((noinline)) incoming_stack_args
+  (int arg0, int arg1, int arg2, int arg3,
+   int arg4, int arg5, int arg6, int arg7, int arg8);
+int getint();
+void PrintInts (int n, ...); // varargs
+void __attribute__((noinline)) PrintIntsNoVaStart (int n, ...); // varargs
+void PrintInts2 (int arg0, int arg1, int arg2, int arg3, int arg4, int arg5, int n, ...);
+extern void f1(void);
+extern void f2(void);
+
+/*
+**test1:
+**	...
+**	cm.push	{ra, s0-s4}, -80
+**	...
+**	cm.popret	{ra, s0-s4}, 80
+**	...
+*/
+int test1()
+{
+  char volatile array[3120];
+  float volatile farray[3120];
+
+  float sum = 0;
+  for (int i = 0; i < 3120; i++)
+  {
+    array[i] = my_getchar();
+    farray[i] = my_getchar() * 1.2;
+    sum += array[i] + farray[i];
+  }
+  return sum;
+}
+
+/*
+**test2_step1_0_size:
+**	...
+**	cm.push	{ra, s0-s1}, -64
+**	...
+**	cm.popret	{ra, s0-s1}, 64
+**	...
+*/
+int test2_step1_0_size()
+{
+  int volatile iarray[3120 + 1824/4 -8];
+
+  for (int i = 0; i < 3120 + 1824/4 - 8; i++)
+  {
+    iarray[i] = my_getchar() * 2;
+  }
+  return iarray[0] + iarray[1];
+}
+
+/*
+**test3:
+**	...
+**	cm.push	{ra, s0-s4}, -80
+**	...
+**	cm.popret	{ra, s0-s4}, 80
+**	...
+*/
+float test3()
+{
+  char volatile array[3120];
+  float volatile farray[3120];
+
+  float sum = 0, f1 = 0, f2 = 0, f3 = 0, f4 = 0, f5 = 0, f6 = 0, f7 = 0;
+
+  for (int i = 0; i < 3120; i++)
+  {
+    f1 = getf();
+    f2 = getf();
+    f3 = getf();
+    f4 = getf();
+    array[i] = my_getchar();
+    farray[i] = my_getchar() * 1.2;
+    sum += array[i] + farray[i] + f1 + f2 + f3 + f4;
+  }
+  return sum;
+}
+
+/*
+**outgoing_stack_args:
+**	...
+**	cm.push	{ra, s0}, -32
+**	...
+**	cm.popret	{ra, s0}, 32
+**	...
+*/
+int outgoing_stack_args()
+{
+  int  local = getint();
+  return local +incoming_stack_args(0, 1, 2, 3, 4, 5, 6, 7, 8);
+}
+
+/*
+**callPrintInts:
+**	...
+**	cm.push	{ra}, -48
+**	...
+**	cm.popret	{ra}, 48
+**	...
+*/
+float callPrintInts()
+{
+  volatile float f = getf(); // f in local
+  PrintInts(9,1,2,3,4,5,6,7,8,9);
+  return f;
+}
+
+/*
+**callPrint:
+**	...
+**	cm.push	{ra}, -48
+**	...
+**	cm.popret	{ra}, 48
+**	...
+*/
+float callPrint()
+{
+  volatile float f = getf(); // f in local
+  PrintIntsNoVaStart(0,1,2,3,4,5,6,7,8,9);
+  return f;
+}
+
+/*
+**callPrint_S:
+**	...
+**	cm.push	{ra}, -48
+**	...
+**	cm.popret	{ra}, 48
+**	...
+*/
+float callPrint_S()
+{
+  float f = getf();
+  PrintIntsNoVaStart(0,1,2,3,4,5,6,7,8,9);
+  return f;
+}
+
+/*
+**callPrint_2:
+**	...
+**	cm.push	{ra}, -48
+**	...
+**	cm.popret	{ra}, 48
+**	...
+*/
+float callPrint_2()
+{
+  float f = getf();
+  PrintInts2(0,1,2,3,4,5,6,7,8,9);
+  return f;
+}
+
+/*
+**test_step1_0bytes_save_restore:
+**	...
+**	cm.push	{ra}, -16
+**	...
+**	cm.popret	{ra}, 16
+**	...
+*/
+int test_step1_0bytes_save_restore()
+{
+
+  int a  =  9;
+  int b  =  my_getchar();
+  return a +b;
+}
+
+/*
+**test_s0:
+**	...
+**	cm.push	{ra, s0}, -16
+**	...
+**	cm.popret	{ra, s0}, 16
+**	...
+*/
+int test_s0()
+{
+
+  int a  =  my_getchar();
+  int b  =  my_getchar();
+  return a +b;
+}
+
+/*
+**test_s1:
+**	...
+**	cm.push	{ra, s0-s1}, -16
+**	...
+**	cm.popret	{ra, s0-s1}, 16
+**	...
+*/
+int test_s1()
+{
+
+  int s0  =  my_getchar();
+  int s1  =  my_getchar();
+  int b  =  my_getchar();
+  return s1 +s0 +b;
+}
+
+/*
+**test_f0:
+**	...
+**	cm.push	{ra, s0}, -32
+**	...
+**	cm.popret	{ra, s0}, 32
+**	...
+*/
+int test_f0()
+{
+
+  int s0  =  my_getchar();
+  float f0  =  getf(); 
+  int b  =  my_getchar();
+  return f0 +s0 +b;
+}
+
+/*
+**foo:
+**	cm.push	{ra}, -16
+**	call	f1
+**	cm.pop	{ra}, 16
+**	tail	f2
+*/
+void foo(void)
+{
+  f1();
+  f2();
+}
diff --git a/gcc/testsuite/gcc.target/riscv/zcmp_stack_alignment.c b/gcc/testsuite/gcc.target/riscv/zcmp_stack_alignment.c
new file mode 100644
index 00000000000..8e481522f89
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zcmp_stack_alignment.c
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options " -O0 -march=rv32e_zca_zcb_zcmp -mabi=ilp32e -mcmodel=medlow -fomit-frame-pointer" } */
+/* { dg-skip-if "" { *-*-* } {"-O2" "-O1" "-Os" "-Og" "-O3" "-Oz" "-flto"} } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+void bar();
+
+/*
+**fool_rv32e:
+**	cm.push	{ra}, -32
+**	...
+**	call	bar
+**	...
+**	lw	a5,32\(sp\)
+**	...
+**	cm.popret	{ra}, 32
+*/
+int fool_rv32e ( int a0, int a1, int a2, int a3, int a4, int a5,
+                  int incoming0)
+{
+  bar();
+  return a0 + a1 + a2 + a3 + a4 + a5 + incoming0;
+}
\ No newline at end of file
-- 
2.17.1
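
A note on the frame sizes expected in the tests above (e.g. "cm.push {ra, s0-s4}, -80" on rv32): as far as I read the Zcmp specification, the immediate is a base adjustment (the saved registers' size rounded up to 16 bytes) plus 0, 16, 32 or 48 extra bytes. The sketch below only illustrates that arithmetic. Both helper names are hypothetical and are not functions from this patch.

```c
#include <stdbool.h>

/* Hypothetical helper, not part of the patch: the smallest stack
   adjustment a cm.push/cm.pop can encode when it saves NUM_REGS
   registers of REG_BYTES bytes each (4 on rv32, 8 on rv64).  The
   saved area is rounded up to a 16-byte multiple.  */
static int
zcmp_stack_adj_base (int num_regs, int reg_bytes)
{
  int bytes = num_regs * reg_bytes;
  return (bytes + 15) & ~15;
}

/* Hypothetical helper: true if ADJ (a positive byte count) can be
   encoded, i.e. it is the base plus 0, 16, 32 or 48 bytes.  */
static bool
zcmp_stack_adj_ok (int num_regs, int reg_bytes, int adj)
{
  int base = zcmp_stack_adj_base (num_regs, reg_bytes);
  return adj % 16 == 0 && adj >= base && adj <= base + 48;
}
```

For example, {ra, s0-s4} on rv32 is 6 registers, so the base is 32 and 80 = 32 + 48 is the largest adjustment a single cm.push can make. A larger frame needs a separate sp adjustment after the push.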



end of thread, other threads:[~2023-05-30  7:44 UTC | newest]

Thread overview: 4+ messages
     [not found] <20230516093354.1521-1-gaofei@eswincomputing.com>
     [not found] ` <20230516093354.1521-2-gaofei@eswincomputing.com>
2023-05-30  5:26   ` [PATCH 1/1] [V2] [RISC-V] support cm.push cm.pop cm.popret in zcmp Sinan
2023-05-30  7:44     ` Fei Gao
2023-05-12  9:04 [PATCH 0/1] [V2] RISC-V: support Zcmp extension Fei Gao
2023-05-12  9:04 ` [PATCH 1/1] [V2] [RISC-V] support cm.push cm.pop cm.popret in zcmp Fei Gao
2023-05-29  3:05   ` Kito Cheng
