public inbox for gcc-patches@gcc.gnu.org
* [PATCH 0/4] [RISC-V] support zcmp extension
@ 2023-06-07  5:52 Fei Gao
  2023-06-07  5:52 ` [PATCH 1/4][V4][RISC-V] support cm.push cm.pop cm.popret in zcmp Fei Gao
                   ` (3 more replies)
  0 siblings, 4 replies; 17+ messages in thread
From: Fei Gao @ 2023-06-07  5:52 UTC
  To: gcc-patches; +Cc: kito.cheng, palmer, jeffreyalaw, sinan.lin, jiawei, Fei Gao

Please note that this series depends on the zcmp switch patch that Jiawei posted:
https://gcc.gnu.org/pipermail/gcc-patches/2023-April/615289.html

The first patch is a follow-up on Kito's V3 review.
The others are new.

Fei Gao (4):
  [RISC-V] support cm.push cm.pop cm.popret in zcmp
  [RISC-V] support cm.popretz in zcmp
  [RISC-V] resolve conflict between zcmp multi push/pop and shrink-wrap-separate
  [RISC-V] support cm.mva01s cm.mvsa01 in zcmp

 gcc/config/riscv/iterators.md                 |   15 +
 gcc/config/riscv/peephole.md                  |   28 +
 gcc/config/riscv/predicates.md                |  107 ++
 gcc/config/riscv/riscv-protos.h               |    1 +
 gcc/config/riscv/riscv.cc                     |  445 ++++-
 gcc/config/riscv/riscv.h                      |   23 +
 gcc/config/riscv/riscv.md                     |    4 +
 gcc/config/riscv/zc.md                        | 1457 +++++++++++++++++
 gcc/shrink-wrap.cc                            |   25 +-
 gcc/shrink-wrap.h                             |    1 +
 gcc/testsuite/gcc.target/riscv/cm_mv_rv32.c   |   21 +
 gcc/testsuite/gcc.target/riscv/rv32e_zcmp.c   |  251 +++
 gcc/testsuite/gcc.target/riscv/rv32i_zcmp.c   |  251 +++
 .../riscv/zcmp_shrink_wrap_separate.c         |   97 ++
 .../riscv/zcmp_shrink_wrap_separate2.c        |   97 ++
 .../gcc.target/riscv/zcmp_stack_alignment.c   |   23 +
 16 files changed, 2795 insertions(+), 51 deletions(-)
 create mode 100644 gcc/config/riscv/zc.md
 create mode 100644 gcc/testsuite/gcc.target/riscv/cm_mv_rv32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rv32e_zcmp.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rv32i_zcmp.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zcmp_shrink_wrap_separate.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zcmp_shrink_wrap_separate2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zcmp_stack_alignment.c

-- 
2.17.1



* [PATCH 1/4][V4][RISC-V] support cm.push cm.pop cm.popret in zcmp
  2023-06-07  5:52 [PATCH 0/4] [RISC-V] support zcmp extension Fei Gao
@ 2023-06-07  5:52 ` Fei Gao
  2023-06-07 10:11   ` jiawei
  2023-08-16  8:33   ` Kito Cheng
  2023-06-07  5:52 ` [PATCH 2/4] [RISC-V] support cm.popretz " Fei Gao
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 17+ messages in thread
From: Fei Gao @ 2023-06-07  5:52 UTC
  To: gcc-patches; +Cc: kito.cheng, palmer, jeffreyalaw, sinan.lin, jiawei, Fei Gao

Zcmp can share the same stack-allocation logic as save-restore: a pre-allocation
done by cm.push, followed by step 1 and step 2.
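
A rough sketch of how the cm.push pre-allocation is sized, mirroring
zcmp_base_adj, riscv_zcmp_valid_stack_adj_bytes_p and the spimm selection in
riscv_expand_prologue from this patch. It assumes a 16-byte sp increment step
and a maximum spimm of 3, as in the Zcmp spec, and that the remaining frame
size is 16-byte aligned. The function names are illustrative only and are not
part of the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed values (Zcmp spec): sp moves in 16-byte steps, spimm is 0..3.
   They stand in for ZCMP_SP_INC_STEP and ZCMP_MAX_SPIMM.  */
enum { SP_INC_STEP = 16, MAX_SPIMM = 3 };

/* Round VALUE up to a multiple of 16.  */
static long round_up_16 (long value)
{
  return (value + 15) & ~15L;
}

/* Minimum bytes cm.push must allocate to hold REGS_NUM registers
   (ra plus s-registers) of WORD_BYTES each.  {ra, s0-s10} (12 registers)
   is not encodable in Zcmp, so it is rejected here.  */
static long base_adj (int regs_num, int word_bytes)
{
  assert (regs_num >= 1 && regs_num <= 13 && regs_num != 12);
  return round_up_16 ((long) regs_num * word_bytes);
}

/* Number of extra 16-byte steps folded into cm.push, given the REMAINING
   frame bytes after the base adjustment (assumed 16-byte aligned).  */
static int pick_spimm (long remaining)
{
  if (remaining > 2 * SP_INC_STEP)
    return 3;
  if (remaining > SP_INC_STEP)
    return 2;
  if (remaining > 0)
    return 1;
  return 0;
}

/* True if TOTAL bytes is an adjustment cm.push/cm.pop can encode for
   REGS_NUM registers: base plus 0..MAX_SPIMM steps.  */
static bool valid_stack_adj (long total, int regs_num, int word_bytes)
{
  long extra = total - base_adj (regs_num, word_bytes);
  for (int spimm = 0; spimm <= MAX_SPIMM; spimm++)
    if (extra == (long) spimm * SP_INC_STEP)
      return true;
  return false;
}
```

For example, on RV32 with ra and s0-s3 saved (5 registers) the base adjustment
is 32 bytes. A frame that needs 32 more bytes picks spimm = 2, so a single
cm.push with a 64-byte adjustment covers the whole frame; anything beyond
base + 48 bytes is left to step 1 and step 2.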

Please note that cm.push stores ra, s0-s11 in the reverse order of what save-restore does,
so the .cfi directives are adapted accordingly in this patch.
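
To illustrate the slot layout the CFI notes have to describe, here is a small
sketch of the offset computation used in riscv_adjust_multi_push_cfi_prologue:
the highest-numbered saved register sits in the top slot and ra in the lowest.
The helper name and its index convention (0 = ra, 1 = s0, 2 = s1, ...) are
made up for this example and do not appear in the patch.

```c
#include <assert.h>

/* Offset, from the new sp, of the slot holding the register at position IDX
   in the push list (0 = ra, 1 = s0, 2 = s1, ...), after cm.push has saved
   REGS_NUM registers of WORD_BYTES each and moved sp down by SAVED_SIZE.
   Hypothetical helper, for illustration only.  */
static long slot_offset (int idx, int regs_num, int word_bytes,
                         long saved_size)
{
  assert (idx >= 0 && idx < regs_num && regs_num <= 13);
  /* The last register in the list is stored first, just below the old sp,
     so its rank from the top of the save area is 1; ra has rank REGS_NUM.  */
  int rank_from_top = regs_num - idx;
  return saved_size - (long) word_bytes * rank_from_top;
}
```

On RV32, cm.push {ra, s0-s1}, -16 therefore places s1 at sp+12, s0 at sp+8 and
ra at sp+4, which matches the "s11-s0, ra from high to low addr" comment in
the patch.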

Signed-off-by: Fei Gao <gaofei@eswincomputing.com>

gcc/ChangeLog:

        * config/riscv/iterators.md
        slot0_offset: slot 0 offset in stack GPRs area in bytes
        slot1_offset: slot 1 offset in stack GPRs area in bytes
        slot2_offset: likewise
        slot3_offset: likewise
        slot4_offset: likewise
        slot5_offset: likewise
        slot6_offset: likewise
        slot7_offset: likewise
        slot8_offset: likewise
        slot9_offset: likewise
        slot10_offset: likewise
        slot11_offset: likewise
        slot12_offset: likewise
        * config/riscv/predicates.md
        (stack_push_up_to_ra_operand): predicates of stack adjust pushing ra
        (stack_push_up_to_s0_operand): predicates of stack adjust pushing ra, s0
        (stack_push_up_to_s1_operand): likewise
        (stack_push_up_to_s2_operand): likewise
        (stack_push_up_to_s3_operand): likewise
        (stack_push_up_to_s4_operand): likewise
        (stack_push_up_to_s5_operand): likewise
        (stack_push_up_to_s6_operand): likewise
        (stack_push_up_to_s7_operand): likewise
        (stack_push_up_to_s8_operand): likewise
        (stack_push_up_to_s9_operand): likewise
        (stack_push_up_to_s11_operand): likewise
        (stack_pop_up_to_ra_operand): predicates of stack adjust popping ra
        (stack_pop_up_to_s0_operand): predicates of stack adjust popping ra, s0
        (stack_pop_up_to_s1_operand): likewise
        (stack_pop_up_to_s2_operand): likewise
        (stack_pop_up_to_s3_operand): likewise
        (stack_pop_up_to_s4_operand): likewise
        (stack_pop_up_to_s5_operand): likewise
        (stack_pop_up_to_s6_operand): likewise
        (stack_pop_up_to_s7_operand): likewise
        (stack_pop_up_to_s8_operand): likewise
        (stack_pop_up_to_s9_operand): likewise
        (stack_pop_up_to_s11_operand): likewise
        * config/riscv/riscv-protos.h
        (riscv_zcmp_valid_stack_adj_bytes_p): declaration
        * config/riscv/riscv.cc (struct riscv_frame_info): comment change
        (riscv_avoid_multi_push): helper function of riscv_use_multi_push
        (riscv_use_multi_push): true if multi push is used
        (riscv_multi_push_sregs_count): num of sregs in multi-push
        (riscv_multi_push_regs_count): num of regs in multi-push
        (riscv_16bytes_align): align to 16 bytes
        (riscv_stack_align): moved to a better place
        (riscv_save_libcall_count): no functional change
        (riscv_compute_frame_info): add zcmp frame info
        (riscv_adjust_multi_push_cfi_prologue): adjust cfi for cm.push
        (riscv_gen_multi_push_pop_insn): gen function for multi push and pop
        (riscv_expand_prologue): allocate stack by cm.push
        (riscv_adjust_multi_pop_cfi_epilogue): adjust cfi for cm.pop[ret]
        (riscv_expand_epilogue): deallocate stack by cm.pop[ret]
        (zcmp_base_adj): calculate stack adjustment base size
        (zcmp_additional_adj): calculate stack adjustment additional size
        (riscv_zcmp_valid_stack_adj_bytes_p): check if stack adjustment valid
        * config/riscv/riscv.h (RETURN_ADDR_MASK): mask of ra
        (S0_MASK): likewise
        (S1_MASK): likewise
        (S2_MASK): likewise
        (S3_MASK): likewise
        (S4_MASK): likewise
        (S5_MASK): likewise
        (S6_MASK): likewise
        (S7_MASK): likewise
        (S8_MASK): likewise
        (S9_MASK): likewise
        (S10_MASK): likewise
        (S11_MASK): likewise
        (MULTI_PUSH_GPR_MASK): GPR_MASK that cm.push can cover at most
        (ZCMP_MAX_SPIMM): max spimm value
        (ZCMP_SP_INC_STEP): zcmp sp increment step
        (ZCMP_INVALID_S0S10_SREGS_COUNTS): num of s0-s10
        (ZCMP_S0S11_SREGS_COUNTS): num of s0-s11
        (ZCMP_MAX_GRP_SLOTS): max slots of pushing and popping in zcmp
        * config/riscv/riscv.md: include zc.md
        * config/riscv/zc.md: New file. machine description for zcmp

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/rv32e_zcmp.c: New test.
        * gcc.target/riscv/rv32i_zcmp.c: New test.
        * gcc.target/riscv/zcmp_stack_alignment.c: New test.
---
 gcc/config/riscv/iterators.md                 |   15 +
 gcc/config/riscv/predicates.md                |   96 ++
 gcc/config/riscv/riscv-protos.h               |    1 +
 gcc/config/riscv/riscv.cc                     |  360 +++++-
 gcc/config/riscv/riscv.h                      |   23 +
 gcc/config/riscv/riscv.md                     |    2 +
 gcc/config/riscv/zc.md                        | 1042 +++++++++++++++++
 gcc/testsuite/gcc.target/riscv/rv32e_zcmp.c   |  239 ++++
 gcc/testsuite/gcc.target/riscv/rv32i_zcmp.c   |  239 ++++
 .../gcc.target/riscv/zcmp_stack_alignment.c   |   23 +
 10 files changed, 2000 insertions(+), 40 deletions(-)
 create mode 100644 gcc/config/riscv/zc.md
 create mode 100644 gcc/testsuite/gcc.target/riscv/rv32e_zcmp.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rv32i_zcmp.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zcmp_stack_alignment.c

diff --git a/gcc/config/riscv/iterators.md b/gcc/config/riscv/iterators.md
index d374a10810c..6ed4174f9cc 100644
--- a/gcc/config/riscv/iterators.md
+++ b/gcc/config/riscv/iterators.md
@@ -120,6 +120,21 @@
 (define_mode_attr shiftm1 [(SI "const_si_mask_operand") (DI "const_di_mask_operand")])
 (define_mode_attr shiftm1p [(SI "DsS") (DI "DsD")])
 
+; zcmp mode attribute
+(define_mode_attr slot0_offset  [(SI "-4")  (DI "-8")])
+(define_mode_attr slot1_offset  [(SI "-8")  (DI "-16")])
+(define_mode_attr slot2_offset  [(SI "-12") (DI "-24")])
+(define_mode_attr slot3_offset  [(SI "-16") (DI "-32")])
+(define_mode_attr slot4_offset  [(SI "-20") (DI "-40")])
+(define_mode_attr slot5_offset  [(SI "-24") (DI "-48")])
+(define_mode_attr slot6_offset  [(SI "-28") (DI "-56")])
+(define_mode_attr slot7_offset  [(SI "-32") (DI "-64")])
+(define_mode_attr slot8_offset  [(SI "-36") (DI "-72")])
+(define_mode_attr slot9_offset  [(SI "-40") (DI "-80")])
+(define_mode_attr slot10_offset [(SI "-44") (DI "-88")])
+(define_mode_attr slot11_offset [(SI "-48") (DI "-96")])
+(define_mode_attr slot12_offset [(SI "-52") (DI "-104")])
+
 ;; -------------------------------------------------------------------
 ;; Code Iterators
 ;; -------------------------------------------------------------------
diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
index 04ca6ceabc7..ab67b3332f0 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -65,6 +65,102 @@
   (ior (match_operand 0 "const_0_operand")
        (match_operand 0 "register_operand")))
 
+(define_predicate "stack_push_up_to_ra_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 1)")))
+
+(define_predicate "stack_push_up_to_s0_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 2)")))
+
+(define_predicate "stack_push_up_to_s1_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 3)")))
+
+(define_predicate "stack_push_up_to_s2_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 4)")))
+
+(define_predicate "stack_push_up_to_s3_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 5)")))
+
+(define_predicate "stack_push_up_to_s4_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 6)")))
+
+(define_predicate "stack_push_up_to_s5_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 7)")))
+
+(define_predicate "stack_push_up_to_s6_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 8)")))
+
+(define_predicate "stack_push_up_to_s7_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 9)")))
+
+(define_predicate "stack_push_up_to_s8_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 10)")))
+
+(define_predicate "stack_push_up_to_s9_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 11)")))
+
+(define_predicate "stack_push_up_to_s11_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 13)")))
+
+(define_predicate "stack_pop_up_to_ra_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 1)")))
+
+(define_predicate "stack_pop_up_to_s0_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 2)")))
+
+(define_predicate "stack_pop_up_to_s1_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 3)")))
+
+(define_predicate "stack_pop_up_to_s2_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 4)")))
+
+(define_predicate "stack_pop_up_to_s3_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 5)")))
+
+(define_predicate "stack_pop_up_to_s4_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 6)")))
+
+(define_predicate "stack_pop_up_to_s5_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 7)")))
+
+(define_predicate "stack_pop_up_to_s6_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 8)")))
+
+(define_predicate "stack_pop_up_to_s7_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 9)")))
+
+(define_predicate "stack_pop_up_to_s8_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 10)")))
+
+(define_predicate "stack_pop_up_to_s9_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 11)")))
+
+(define_predicate "stack_pop_up_to_s11_operand"
+  (and (match_code "const_int")
+       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 13)")))
+
 ;; Only use branch-on-bit sequences when the mask is not an ANDI immediate.
 (define_predicate "branch_on_bit_operand"
   (and (match_code "const_int")
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 00e1b20c6c6..f23b11622a2 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -56,6 +56,7 @@ extern bool riscv_split_64bit_move_p (rtx, rtx);
 extern void riscv_split_doubleword_move (rtx, rtx);
 extern const char *riscv_output_move (rtx, rtx);
 extern const char *riscv_output_return ();
+extern bool riscv_zcmp_valid_stack_adj_bytes_p(HOST_WIDE_INT, int);
 
 #ifdef RTX_CODE
 extern void riscv_expand_int_scc (rtx, enum rtx_code, rtx, rtx);
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 3954c89a039..c476c699f4c 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -126,6 +126,14 @@ struct GTY(())  riscv_frame_info {
   /* How much the GPR save/restore routines adjust sp (or 0 if unused).  */
   unsigned save_libcall_adjustment;
 
+  /* the minimum number of bytes, in multiples of 16-byte address increments,
+     required to cover the registers in a multi push & pop.  */
+  unsigned multi_push_adj_base;
+
+  /* the number of additional 16-byte address increments allocated for the stack frame
+     in a multi push & pop.  */
+  unsigned multi_push_adj_addi;
+
   /* Offsets of fixed-point and floating-point save areas from frame bottom */
   poly_int64 gp_sp_offset;
   poly_int64 fp_sp_offset;
@@ -422,6 +430,16 @@ static const struct riscv_tune_info riscv_tune_info_table[] = {
 #include "riscv-cores.def"
 };
 
+typedef enum
+{
+  PUSH_IDX = 0,
+  POP_IDX,
+  POPRET_IDX,
+  ZCMP_OP_NUM
+} riscv_zcmp_op_t;
+
+typedef insn_code (* code_for_push_pop_t)(machine_mode);
+
 void riscv_frame_info::reset(void)
 {
   total_size = 0;
@@ -4876,6 +4894,37 @@ riscv_save_reg_p (unsigned int regno)
   return false;
 }
 
+/* Return TRUE if Zcmp push and pop insns should be
+   avoided. FALSE otherwise.
+   Only use multi push & pop if all GPRs masked can be covered,
+   and stack access is SP based,
+   and GPRs are at top of the stack frame,
+   and no conflicts in stack allocation with other features  */
+static bool
+riscv_avoid_multi_push(const struct riscv_frame_info *frame)
+{
+  if (!TARGET_ZCMP
+      || crtl->calls_eh_return
+      || frame_pointer_needed
+      || cfun->machine->interrupt_handler_p
+      || cfun->machine->varargs_size != 0
+      || crtl->args.pretend_args_size != 0
+      || (frame->mask & ~ MULTI_PUSH_GPR_MASK))
+    return true;
+
+  return false;
+}
+
+/* Determine whether to use multi push insn.  */
+static bool
+riscv_use_multi_push(const struct riscv_frame_info *frame)
+{
+  if (riscv_avoid_multi_push (frame))
+    return false;
+
+  return (frame->multi_push_adj_base != 0);
+}
+
 /* Return TRUE if a libcall to save/restore GPRs should be
    avoided.  FALSE otherwise.  */
 static bool
@@ -4913,6 +4962,51 @@ riscv_save_libcall_count (unsigned mask)
   abort ();
 }
 
+/* calculate number of s regs in multi push and pop.
+   Note that {s0-s10} is not valid in Zcmp, use {s0-s11} instead.  */
+static unsigned
+riscv_multi_push_sregs_count (unsigned mask)
+{
+  unsigned num = riscv_save_libcall_count (mask);
+  return (num == ZCMP_INVALID_S0S10_SREGS_COUNTS)
+    ? ZCMP_S0S11_SREGS_COUNTS
+    : num;
+}
+
+/* calculate number of regs(ra, s0-sx) in multi push and pop.  */
+static unsigned
+riscv_multi_push_regs_count (unsigned mask)
+{
+  /* 1 is for ra  */
+  return riscv_multi_push_sregs_count (mask) + 1;
+}
+
+/* Handle 16 bytes align for poly_int.  */
+static poly_int64
+riscv_16bytes_align (poly_int64 value)
+{
+  return aligned_upper_bound (value, 16);
+}
+
+static HOST_WIDE_INT
+riscv_16bytes_align (HOST_WIDE_INT value)
+{
+  return ROUND_UP(value, 16);
+}
+
+/* Handle stack align for poly_int.  */
+static poly_int64
+riscv_stack_align (poly_int64 value)
+{
+  return aligned_upper_bound (value, PREFERRED_STACK_BOUNDARY / 8);
+}
+
+static HOST_WIDE_INT
+riscv_stack_align (HOST_WIDE_INT value)
+{
+  return RISCV_STACK_ALIGN (value);
+}
+
 /* Populate the current function's riscv_frame_info structure.
 
    RISC-V stack frames grown downward.  High addresses are at the top.
@@ -4938,7 +5032,7 @@ riscv_save_libcall_count (unsigned mask)
 	|  GPR save area                |       + UNITS_PER_WORD
 	|                               |
 	+-------------------------------+ <-- stack_pointer_rtx + fp_sp_offset
-	|                               |       + UNITS_PER_HWVALUE
+	|                               |       + UNITS_PER_FP_REG
 	|  FPR save area                |
 	|                               |
 	+-------------------------------+ <-- frame_pointer_rtx (virtual)
@@ -4957,19 +5051,6 @@ riscv_save_libcall_count (unsigned mask)
 
 static HOST_WIDE_INT riscv_first_stack_step (struct riscv_frame_info *frame, poly_int64 remaining_size);
 
-/* Handle stack align for poly_int.  */
-static poly_int64
-riscv_stack_align (poly_int64 value)
-{
-  return aligned_upper_bound (value, PREFERRED_STACK_BOUNDARY / 8);
-}
-
-static HOST_WIDE_INT
-riscv_stack_align (HOST_WIDE_INT value)
-{
-  return RISCV_STACK_ALIGN (value);
-}
-
 static void
 riscv_compute_frame_info (void)
 {
@@ -5017,8 +5098,9 @@ riscv_compute_frame_info (void)
   if (frame->mask)
     {
       x_save_size = riscv_stack_align (num_x_saved * UNITS_PER_WORD);
-      unsigned num_save_restore = 1 + riscv_save_libcall_count (frame->mask);
 
+      /* 1 is for ra  */
+      unsigned num_save_restore = 1 + riscv_save_libcall_count (frame->mask);
       /* Only use save/restore routines if they don't alter the stack size.  */
       if (riscv_stack_align (num_save_restore * UNITS_PER_WORD) == x_save_size
           && !riscv_avoid_save_libcall ())
@@ -5030,6 +5112,15 @@ riscv_compute_frame_info (void)
 
 	  frame->save_libcall_adjustment = x_save_size;
 	}
+
+      if (!riscv_avoid_multi_push (frame))
+        {
+          /* num(ra, s0-sx)  */
+          unsigned num_multi_push =
+            riscv_multi_push_regs_count (frame->mask);
+          x_save_size = riscv_stack_align (num_multi_push * UNITS_PER_WORD);
+          frame->multi_push_adj_base = riscv_16bytes_align (x_save_size);
+        }
     }
 
   /* At the bottom of the frame are any outgoing stack arguments. */
@@ -5044,7 +5135,15 @@ riscv_compute_frame_info (void)
   frame->fp_sp_offset = offset - UNITS_PER_FP_REG;
   /* Next are the callee-saved GPRs. */
   if (frame->mask)
-    offset += x_save_size;
+    {
+      offset += x_save_size;
+      /* align to 16 bytes and add paddings to GPR part to honor
+         both stack alignment and zcmp push/pop size alignment. */
+      if (riscv_use_multi_push (frame)
+          && known_lt(offset,
+                      frame->multi_push_adj_base + ZCMP_SP_INC_STEP * ZCMP_MAX_SPIMM))
+        offset = riscv_16bytes_align (offset);
+    }
   frame->gp_sp_offset = offset - UNITS_PER_WORD;
   /* The hard frame pointer points above the callee-saved GPRs. */
   frame->hard_frame_pointer_offset = offset;
@@ -5388,6 +5487,42 @@ riscv_adjust_libcall_cfi_prologue ()
   return dwarf;
 }
 
+static rtx
+riscv_adjust_multi_push_cfi_prologue (int saved_size)
+{
+  rtx dwarf = NULL_RTX;
+  rtx adjust_sp_rtx, reg, mem, insn;
+  unsigned int mask = cfun->machine->frame.mask;
+  int offset;
+  int saved_cnt = 0;
+
+  if (mask & S10_MASK)
+    mask |= S11_MASK;
+
+  for (int regno = GP_REG_LAST; regno >= GP_REG_FIRST; regno--)
+    if (BITSET_P (mask & MULTI_PUSH_GPR_MASK, regno - GP_REG_FIRST))
+      {
+        /* The save order is s11-s0, ra
+           from high to low addr.  */
+        offset = saved_size - UNITS_PER_WORD * (++saved_cnt);
+
+        reg = gen_rtx_REG (Pmode, regno);
+        mem = gen_frame_mem (Pmode, plus_constant (Pmode,
+                                                   stack_pointer_rtx,
+                                                   offset));
+
+        insn = gen_rtx_SET (mem, reg);
+        dwarf = alloc_reg_note (REG_CFA_OFFSET, insn, dwarf);
+      }
+
+  /* Debug info for adjust sp.  */
+  adjust_sp_rtx = gen_rtx_SET (stack_pointer_rtx,
+                               plus_constant(Pmode, stack_pointer_rtx, -saved_size));
+  dwarf = alloc_reg_note (REG_CFA_ADJUST_CFA, adjust_sp_rtx,
+                          dwarf);
+  return dwarf;
+}
+
 static void
 riscv_emit_stack_tie (void)
 {
@@ -5397,6 +5532,45 @@ riscv_emit_stack_tie (void)
     emit_insn (gen_stack_tiedi (stack_pointer_rtx, hard_frame_pointer_rtx));
 }
 
+/*zcmp multi push and pop code_for_push_pop function ptr array  */
+const code_for_push_pop_t code_for_push_pop [ZCMP_MAX_GRP_SLOTS][ZCMP_OP_NUM] = {
+  {code_for_gpr_multi_push_up_to_ra,    code_for_gpr_multi_pop_up_to_ra,
+   code_for_gpr_multi_popret_up_to_ra},
+  {code_for_gpr_multi_push_up_to_s0,    code_for_gpr_multi_pop_up_to_s0,
+   code_for_gpr_multi_popret_up_to_s0},
+  {code_for_gpr_multi_push_up_to_s1,    code_for_gpr_multi_pop_up_to_s1,
+   code_for_gpr_multi_popret_up_to_s1},
+  {code_for_gpr_multi_push_up_to_s2,    code_for_gpr_multi_pop_up_to_s2,
+   code_for_gpr_multi_popret_up_to_s2},
+  {code_for_gpr_multi_push_up_to_s3,    code_for_gpr_multi_pop_up_to_s3,
+   code_for_gpr_multi_popret_up_to_s3},
+  {code_for_gpr_multi_push_up_to_s4,    code_for_gpr_multi_pop_up_to_s4,
+   code_for_gpr_multi_popret_up_to_s4},
+  {code_for_gpr_multi_push_up_to_s5,    code_for_gpr_multi_pop_up_to_s5,
+   code_for_gpr_multi_popret_up_to_s5},
+  {code_for_gpr_multi_push_up_to_s6,    code_for_gpr_multi_pop_up_to_s6,
+   code_for_gpr_multi_popret_up_to_s6},
+  {code_for_gpr_multi_push_up_to_s7,    code_for_gpr_multi_pop_up_to_s7,
+   code_for_gpr_multi_popret_up_to_s7},
+  {code_for_gpr_multi_push_up_to_s8,    code_for_gpr_multi_pop_up_to_s8,
+   code_for_gpr_multi_popret_up_to_s8},
+  {code_for_gpr_multi_push_up_to_s9,    code_for_gpr_multi_pop_up_to_s9,
+   code_for_gpr_multi_popret_up_to_s9},
+  {nullptr, nullptr, nullptr},
+  {code_for_gpr_multi_push_up_to_s11,   code_for_gpr_multi_pop_up_to_s11,
+   code_for_gpr_multi_popret_up_to_s11}};
+
+static rtx
+riscv_gen_multi_push_pop_insn (riscv_zcmp_op_t op, HOST_WIDE_INT adj_size,
+                               unsigned int regs_num)
+{
+  gcc_assert (op < ZCMP_OP_NUM);
+  gcc_assert (regs_num <= ZCMP_MAX_GRP_SLOTS
+              && regs_num != ZCMP_INVALID_S0S10_SREGS_COUNTS + 1); /* 1 for ra*/
+  rtx stack_adj = GEN_INT (adj_size);
+  return GEN_FCN (code_for_push_pop[regs_num - 1][op] (Pmode)) (stack_adj);
+}
+
 /* Expand the "prologue" pattern.  */
 
 void
@@ -5405,7 +5579,8 @@ riscv_expand_prologue (void)
   struct riscv_frame_info *frame = &cfun->machine->frame;
   poly_int64 remaining_size = frame->total_size;
   unsigned mask = frame->mask;
-  rtx insn;
+  int spimm, multi_push_additional, stack_adj;
+  rtx insn, dwarf = NULL_RTX;
 
   if (flag_stack_usage_info)
     current_function_static_stack_size = constant_lower_bound (remaining_size);
@@ -5413,8 +5588,35 @@ riscv_expand_prologue (void)
   if (cfun->machine->naked_p)
     return;
 
+  /* prefer multi-push to save-restore libcall.  */
+  if (riscv_use_multi_push(frame))
+    {
+      remaining_size -= frame->multi_push_adj_base;
+      if (known_gt(remaining_size, 2 * ZCMP_SP_INC_STEP))
+        spimm = 3;
+      else if (known_gt(remaining_size, ZCMP_SP_INC_STEP))
+        spimm = 2;
+      else if (known_gt(remaining_size, 0))
+        spimm = 1;
+      else
+        spimm = 0;
+      multi_push_additional = spimm * ZCMP_SP_INC_STEP;
+      frame->multi_push_adj_addi = multi_push_additional;
+      remaining_size -= multi_push_additional;
+
+      /* emit multi push insn & dwarf along with it.  */
+      stack_adj = frame->multi_push_adj_base + multi_push_additional;
+      insn = emit_insn (riscv_gen_multi_push_pop_insn(PUSH_IDX,
+        -stack_adj, riscv_multi_push_regs_count(frame->mask)));
+      dwarf = riscv_adjust_multi_push_cfi_prologue (stack_adj);
+      RTX_FRAME_RELATED_P (insn) = 1;
+      REG_NOTES (insn) = dwarf;
+
+      /* Temporarily fib that we need not save GPRs.  */
+      frame->mask = 0;
+    }
   /* When optimizing for size, call a subroutine to save the registers.  */
-  if (riscv_use_save_libcall (frame))
+  else if (riscv_use_save_libcall (frame))
     {
       rtx dwarf = NULL_RTX;
       dwarf = riscv_adjust_libcall_cfi_prologue ();
@@ -5430,13 +5632,15 @@ riscv_expand_prologue (void)
   /* Save the registers.  */
   if ((frame->mask | frame->fmask) != 0)
     {
-      HOST_WIDE_INT step1 = riscv_first_stack_step (frame, remaining_size);
-
-      insn = gen_add3_insn (stack_pointer_rtx,
-			    stack_pointer_rtx,
-			    GEN_INT (-step1));
-      RTX_FRAME_RELATED_P (emit_insn (insn)) = 1;
-      remaining_size -= step1;
+      if (known_gt (remaining_size, frame->frame_pointer_offset))
+        {
+          HOST_WIDE_INT step1 = riscv_first_stack_step (frame, remaining_size);
+          remaining_size -= step1;
+          insn = gen_add3_insn (stack_pointer_rtx,
+                                stack_pointer_rtx,
+                                GEN_INT (-step1));
+          RTX_FRAME_RELATED_P (emit_insn (insn)) = 1;
+        }
       riscv_for_each_saved_reg (remaining_size, riscv_save_reg, false, false);
     }
 
@@ -5493,6 +5697,32 @@ riscv_expand_prologue (void)
     }
 }
 
+static rtx
+riscv_adjust_multi_pop_cfi_epilogue (int saved_size)
+{
+  rtx dwarf = NULL_RTX;
+  rtx adjust_sp_rtx, reg;
+  unsigned int mask = cfun->machine->frame.mask;
+
+  if (mask & S10_MASK)
+    mask |= S11_MASK;
+
+  /* Debug info for adjust sp.  */
+  adjust_sp_rtx = gen_rtx_SET (stack_pointer_rtx,
+                               plus_constant(Pmode, stack_pointer_rtx, saved_size));
+  dwarf = alloc_reg_note (REG_CFA_ADJUST_CFA, adjust_sp_rtx,
+                          dwarf);
+
+  for (int regno = GP_REG_FIRST; regno <= GP_REG_LAST; regno++)
+    if (BITSET_P (mask, regno - GP_REG_FIRST))
+      {
+        reg = gen_rtx_REG (Pmode, regno);
+        dwarf = alloc_reg_note (REG_CFA_RESTORE, reg, dwarf);
+      }
+
+  return dwarf;
+}
+
 static rtx
 riscv_adjust_libcall_cfi_epilogue ()
 {
@@ -5532,10 +5762,18 @@ riscv_expand_epilogue (int style)
   struct riscv_frame_info *frame = &cfun->machine->frame;
   unsigned mask = frame->mask;
   HOST_WIDE_INT step2 = 0;
-  bool use_restore_libcall = ((style == NORMAL_RETURN)
-			      && riscv_use_save_libcall (frame));
-  unsigned libcall_size = (use_restore_libcall
-			   ? frame->save_libcall_adjustment : 0);
+  bool use_multi_pop_normal = ((style == NORMAL_RETURN)
+                              && riscv_use_multi_push (frame));
+  bool use_multi_pop_sibcall = ((style == SIBCALL_RETURN)
+                              && riscv_use_multi_push (frame));
+  bool use_multi_pop = use_multi_pop_normal || use_multi_pop_sibcall;
+
+  bool use_restore_libcall = !use_multi_pop && ((style == NORMAL_RETURN)
+                              && riscv_use_save_libcall (frame));
+  unsigned libcall_size = use_restore_libcall && !use_multi_pop ?
+                            frame->save_libcall_adjustment : 0;
+  unsigned multipop_size = use_multi_pop ?
+                            frame->multi_push_adj_base + frame->multi_push_adj_addi : 0;
   rtx ra = gen_rtx_REG (Pmode, RETURN_ADDR_REGNUM);
   rtx insn;
 
@@ -5606,18 +5844,25 @@ riscv_expand_epilogue (int style)
       REG_NOTES (insn) = dwarf;
     }
 
-  if (use_restore_libcall)
-    frame->mask = 0; /* Temporarily fib for GPRs.  */
+  if (use_restore_libcall || use_multi_pop)
+    frame->mask = 0; /* Temporarily fib that we need not save GPRs.  */
 
   /* If we need to restore registers, deallocate as much stack as
      possible in the second step without going out of range.  */
-  if ((frame->mask | frame->fmask) != 0)
+  if (use_multi_pop)
+    {
+      if (frame->fmask
+          && known_gt (frame->total_size - multipop_size,
+                      frame->frame_pointer_offset))
+        step2 = riscv_first_stack_step (frame, frame->total_size - multipop_size);
+    }
+  else if ((frame->mask | frame->fmask) != 0)
     step2 = riscv_first_stack_step (frame, frame->total_size - libcall_size);
 
-  if (use_restore_libcall)
+  if (use_restore_libcall || use_multi_pop)
     frame->mask = mask; /* Undo the above fib.  */
 
-  poly_int64 step1 = frame->total_size - step2 - libcall_size;
+  poly_int64 step1 = frame->total_size - step2 - libcall_size - multipop_size ;
 
   /* Set TARGET to BASE + STEP1.  */
   if (known_gt (step1, 0))
@@ -5652,7 +5897,7 @@ riscv_expand_epilogue (int style)
 					   adjust));
 	  rtx dwarf = NULL_RTX;
 	  rtx cfa_adjust_rtx = gen_rtx_PLUS (Pmode, stack_pointer_rtx,
-					     GEN_INT (step2 + libcall_size));
+					     GEN_INT (step2 + libcall_size + multipop_size));
 
 	  dwarf = alloc_reg_note (REG_CFA_DEF_CFA, cfa_adjust_rtx, dwarf);
 	  RTX_FRAME_RELATED_P (insn) = 1;
@@ -5667,15 +5912,15 @@ riscv_expand_epilogue (int style)
       epilogue_cfa_sp_offset = step2;
     }
 
-  if (use_restore_libcall)
+  if (use_restore_libcall || use_multi_pop)
     frame->mask = 0; /* Temporarily fib that we need not save GPRs.  */
 
   /* Restore the registers.  */
-  riscv_for_each_saved_reg (frame->total_size - step2 - libcall_size,
+  riscv_for_each_saved_reg (frame->total_size - step2 - libcall_size - multipop_size,
 			    riscv_restore_reg,
 			    true, style == EXCEPTION_RETURN);
 
-  if (use_restore_libcall)
+  if (use_restore_libcall || use_multi_pop)
       frame->mask = mask; /* Undo the above fib.  */
 
   if (need_barrier_p)
@@ -5689,14 +5934,30 @@ riscv_expand_epilogue (int style)
 
       rtx dwarf = NULL_RTX;
       rtx cfa_adjust_rtx = gen_rtx_PLUS (Pmode, stack_pointer_rtx,
-					 GEN_INT (libcall_size));
+					 GEN_INT (libcall_size + multipop_size));
       dwarf = alloc_reg_note (REG_CFA_DEF_CFA, cfa_adjust_rtx, dwarf);
       RTX_FRAME_RELATED_P (insn) = 1;
 
       REG_NOTES (insn) = dwarf;
     }
 
-  if (use_restore_libcall)
+  if (use_multi_pop)
+    {
+      unsigned regs_count = riscv_multi_push_regs_count (frame->mask);
+      if (use_multi_pop_normal)
+        insn = emit_jump_insn (
+          riscv_gen_multi_push_pop_insn (POPRET_IDX, multipop_size, regs_count));
+      else
+        insn = emit_insn (
+          riscv_gen_multi_push_pop_insn (POP_IDX, multipop_size, regs_count));
+
+      rtx dwarf = riscv_adjust_multi_pop_cfi_epilogue (multipop_size);
+      RTX_FRAME_RELATED_P (insn) = 1;
+      REG_NOTES (insn) = dwarf;
+      if (use_multi_pop_normal)
+        return;
+    }
+  else if (use_restore_libcall)
     {
       rtx dwarf = riscv_adjust_libcall_cfi_epilogue ();
       insn = emit_insn (gen_gpr_restore (GEN_INT (riscv_save_libcall_count (mask))));
@@ -6980,6 +7241,25 @@ riscv_gen_gpr_save_insn (struct riscv_frame_info *frame)
   return gen_rtx_PARALLEL (VOIDmode, vec);
 }
 
+static HOST_WIDE_INT zcmp_base_adj (int regs_num)
+{
+  return riscv_16bytes_align (regs_num * GET_MODE_SIZE (word_mode));
+}
+
+static HOST_WIDE_INT zcmp_additional_adj (HOST_WIDE_INT total, int regs_num)
+{
+  return total - zcmp_base_adj (regs_num);
+}
+
+bool riscv_zcmp_valid_stack_adj_bytes_p (HOST_WIDE_INT total, int regs_num)
+{
+  HOST_WIDE_INT additional_bytes = zcmp_additional_adj (total, regs_num);
+  return additional_bytes == 0
+         || additional_bytes == 1 * ZCMP_SP_INC_STEP
+         || additional_bytes == 2 * ZCMP_SP_INC_STEP
+         || additional_bytes == ZCMP_MAX_SPIMM * ZCMP_SP_INC_STEP;
+}
+
 /* Return true if it's valid gpr_save pattern.  */
 
 bool
diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
index 4541255a8ae..2fa555dce2d 100644
--- a/gcc/config/riscv/riscv.h
+++ b/gcc/config/riscv/riscv.h
@@ -420,6 +420,29 @@ ASM_MISA_SPEC
 #define RISCV_CALL_ADDRESS_TEMP(MODE) \
   gen_rtx_REG (MODE, RISCV_CALL_ADDRESS_TEMP_REGNUM)
 
+#define RETURN_ADDR_MASK        (1 << RETURN_ADDR_REGNUM)
+#define S0_MASK                 (1 << S0_REGNUM)
+#define S1_MASK                 (1 << S1_REGNUM)
+#define S2_MASK                 (1 << S2_REGNUM)
+#define S3_MASK                 (1 << S3_REGNUM)
+#define S4_MASK                 (1 << S4_REGNUM)
+#define S5_MASK                 (1 << S5_REGNUM)
+#define S6_MASK                 (1 << S6_REGNUM)
+#define S7_MASK                 (1 << S7_REGNUM)
+#define S8_MASK                 (1 << S8_REGNUM)
+#define S9_MASK                 (1 << S9_REGNUM)
+#define S10_MASK                (1 << S10_REGNUM)
+#define S11_MASK                (1 << S11_REGNUM)
+
+#define MULTI_PUSH_GPR_MASK (RETURN_ADDR_MASK | S0_MASK | S1_MASK | S2_MASK  | S3_MASK \
+                                              | S4_MASK | S5_MASK | S6_MASK  | S7_MASK \
+                                              | S8_MASK | S9_MASK | S10_MASK | S11_MASK)
+#define ZCMP_MAX_SPIMM 3
+#define ZCMP_SP_INC_STEP 16
+#define ZCMP_INVALID_S0S10_SREGS_COUNTS 11
+#define ZCMP_S0S11_SREGS_COUNTS 12
+#define ZCMP_MAX_GRP_SLOTS 13
+
 #define MCOUNT_NAME "_mcount"
 
 #define NO_PROFILE_COUNTERS 1
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index be960583101..c858b3bc9ef 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -113,6 +113,7 @@
 
 (define_constants
   [(RETURN_ADDR_REGNUM		1)
+   (SP_REGNUM 			2)
    (GP_REGNUM 			3)
    (TP_REGNUM			4)
    (T0_REGNUM			5)
@@ -3163,3 +3164,4 @@
 (include "sifive-7.md")
 (include "thead.md")
 (include "vector.md")
+(include "zc.md")
diff --git a/gcc/config/riscv/zc.md b/gcc/config/riscv/zc.md
new file mode 100644
index 00000000000..5c1bf031b8d
--- /dev/null
+++ b/gcc/config/riscv/zc.md
@@ -0,0 +1,1042 @@
+;; Machine description for RISC-V Zc extension.
+;; Copyright (C) 2023 Free Software Foundation, Inc.
+;; Contributed by Fei Gao (gaofei@eswincomputing.com).
+
+;; This file is part of GCC.
+
+;; GCC is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation; either version 3, or (at your option)
+;; any later version.
+
+;; GCC is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; <http://www.gnu.org/licenses/>.
+
+(define_insn "@gpr_multi_pop_up_to_ra_<mode>"
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_ra_operand" "I")))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot0_offset>))))]
+  "TARGET_ZCMP"
+  "cm.pop	{ra}, %0"
+)
+
+(define_insn "@gpr_multi_pop_up_to_s0_<mode>"
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_s0_operand" "I")))
+   (set (reg:X S0_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot0_offset>))))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot1_offset>))))]
+  "TARGET_ZCMP"
+  "cm.pop	{ra, s0}, %0"
+)
+
+(define_insn "@gpr_multi_pop_up_to_s1_<mode>"
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_s1_operand" "I")))
+   (set (reg:X S1_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot0_offset>))))
+   (set (reg:X S0_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot1_offset>))))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot2_offset>))))]
+  "TARGET_ZCMP"
+  "cm.pop	{ra, s0-s1}, %0"
+)
+
+(define_insn "@gpr_multi_pop_up_to_s2_<mode>"
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_s2_operand" "I")))
+   (set (reg:X S2_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot0_offset>))))
+   (set (reg:X S1_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot1_offset>))))
+   (set (reg:X S0_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot2_offset>))))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot3_offset>))))]
+  "TARGET_ZCMP"
+  "cm.pop	{ra, s0-s2}, %0"
+)
+
+(define_insn "@gpr_multi_pop_up_to_s3_<mode>"
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_s3_operand" "I")))
+   (set (reg:X S3_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot0_offset>))))
+   (set (reg:X S2_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot1_offset>))))
+   (set (reg:X S1_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot2_offset>))))
+   (set (reg:X S0_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot3_offset>))))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot4_offset>))))]
+  "TARGET_ZCMP"
+  "cm.pop	{ra, s0-s3}, %0"
+)
+
+(define_insn "@gpr_multi_pop_up_to_s4_<mode>"
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_s4_operand" "I")))
+   (set (reg:X S4_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot0_offset>))))
+   (set (reg:X S3_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot1_offset>))))
+   (set (reg:X S2_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot2_offset>))))
+   (set (reg:X S1_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot3_offset>))))
+   (set (reg:X S0_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot4_offset>))))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot5_offset>))))]
+  "TARGET_ZCMP"
+  "cm.pop	{ra, s0-s4}, %0"
+)
+
+(define_insn "@gpr_multi_pop_up_to_s5_<mode>"
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_s5_operand" "I")))
+   (set (reg:X S5_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot0_offset>))))
+   (set (reg:X S4_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot1_offset>))))
+   (set (reg:X S3_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot2_offset>))))
+   (set (reg:X S2_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot3_offset>))))
+   (set (reg:X S1_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot4_offset>))))
+   (set (reg:X S0_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot5_offset>))))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot6_offset>))))]
+  "TARGET_ZCMP"
+  "cm.pop	{ra, s0-s5}, %0"
+)
+
+(define_insn "@gpr_multi_pop_up_to_s6_<mode>"
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_s6_operand" "I")))
+   (set (reg:X S6_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot0_offset>))))
+   (set (reg:X S5_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot1_offset>))))
+   (set (reg:X S4_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot2_offset>))))
+   (set (reg:X S3_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot3_offset>))))
+   (set (reg:X S2_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot4_offset>))))
+   (set (reg:X S1_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot5_offset>))))
+   (set (reg:X S0_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot6_offset>))))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot7_offset>))))]
+  "TARGET_ZCMP"
+  "cm.pop	{ra, s0-s6}, %0"
+)
+
+(define_insn "@gpr_multi_pop_up_to_s7_<mode>"
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_s7_operand" "I")))
+   (set (reg:X S7_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot0_offset>))))
+   (set (reg:X S6_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot1_offset>))))
+   (set (reg:X S5_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot2_offset>))))
+   (set (reg:X S4_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot3_offset>))))
+   (set (reg:X S3_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot4_offset>))))
+   (set (reg:X S2_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot5_offset>))))
+   (set (reg:X S1_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot6_offset>))))
+   (set (reg:X S0_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot7_offset>))))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot8_offset>))))]
+  "TARGET_ZCMP"
+  "cm.pop	{ra, s0-s7}, %0"
+)
+
+(define_insn "@gpr_multi_pop_up_to_s8_<mode>"
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_s8_operand" "I")))
+   (set (reg:X S8_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot0_offset>))))
+   (set (reg:X S7_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot1_offset>))))
+   (set (reg:X S6_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot2_offset>))))
+   (set (reg:X S5_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot3_offset>))))
+   (set (reg:X S4_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot4_offset>))))
+   (set (reg:X S3_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot5_offset>))))
+   (set (reg:X S2_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot6_offset>))))
+   (set (reg:X S1_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot7_offset>))))
+   (set (reg:X S0_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot8_offset>))))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot9_offset>))))]
+  "TARGET_ZCMP"
+  "cm.pop	{ra, s0-s8}, %0"
+)
+
+(define_insn "@gpr_multi_pop_up_to_s9_<mode>"
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_s9_operand" "I")))
+   (set (reg:X S9_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot0_offset>))))
+   (set (reg:X S8_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot1_offset>))))
+   (set (reg:X S7_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot2_offset>))))
+   (set (reg:X S6_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot3_offset>))))
+   (set (reg:X S5_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot4_offset>))))
+   (set (reg:X S4_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot5_offset>))))
+   (set (reg:X S3_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot6_offset>))))
+   (set (reg:X S2_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot7_offset>))))
+   (set (reg:X S1_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot8_offset>))))
+   (set (reg:X S0_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot9_offset>))))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot10_offset>))))]
+  "TARGET_ZCMP"
+  "cm.pop	{ra, s0-s9}, %0"
+)
+
+(define_insn "@gpr_multi_pop_up_to_s11_<mode>"
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_s11_operand" "I")))
+   (set (reg:X S11_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot0_offset>))))
+   (set (reg:X S10_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot1_offset>))))
+   (set (reg:X S9_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot2_offset>))))
+   (set (reg:X S8_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot3_offset>))))
+   (set (reg:X S7_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot4_offset>))))
+   (set (reg:X S6_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot5_offset>))))
+   (set (reg:X S5_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot6_offset>))))
+   (set (reg:X S4_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot7_offset>))))
+   (set (reg:X S3_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot8_offset>))))
+   (set (reg:X S2_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot9_offset>))))
+   (set (reg:X S1_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot10_offset>))))
+   (set (reg:X S0_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot11_offset>))))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot12_offset>))))]
+  "TARGET_ZCMP"
+  "cm.pop	{ra, s0-s11}, %0"
+)
+
+(define_insn "@gpr_multi_popret_up_to_ra_<mode>"
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_ra_operand" "I")))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot0_offset>))))
+   (return)
+   (use (reg:SI RETURN_ADDR_REGNUM))]
+  "TARGET_ZCMP"
+  "cm.popret	{ra}, %0"
+)
+
+(define_insn "@gpr_multi_popret_up_to_s0_<mode>"
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_s0_operand" "I")))
+   (set (reg:X S0_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot0_offset>))))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot1_offset>))))
+   (return)
+   (use (reg:SI RETURN_ADDR_REGNUM))]
+  "TARGET_ZCMP"
+  "cm.popret	{ra, s0}, %0"
+)
+
+(define_insn "@gpr_multi_popret_up_to_s1_<mode>"
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_s1_operand" "I")))
+   (set (reg:X S1_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot0_offset>))))
+   (set (reg:X S0_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot1_offset>))))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot2_offset>))))
+   (return)
+   (use (reg:SI RETURN_ADDR_REGNUM))]
+  "TARGET_ZCMP"
+  "cm.popret	{ra, s0-s1}, %0"
+)
+
+(define_insn "@gpr_multi_popret_up_to_s2_<mode>"
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_s2_operand" "I")))
+   (set (reg:X S2_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot0_offset>))))
+   (set (reg:X S1_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot1_offset>))))
+   (set (reg:X S0_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot2_offset>))))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot3_offset>))))
+   (return)
+   (use (reg:SI RETURN_ADDR_REGNUM))]
+  "TARGET_ZCMP"
+  "cm.popret	{ra, s0-s2}, %0"
+)
+
+(define_insn "@gpr_multi_popret_up_to_s3_<mode>"
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_s3_operand" "I")))
+   (set (reg:X S3_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot0_offset>))))
+   (set (reg:X S2_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot1_offset>))))
+   (set (reg:X S1_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot2_offset>))))
+   (set (reg:X S0_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot3_offset>))))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot4_offset>))))
+   (return)
+   (use (reg:SI RETURN_ADDR_REGNUM))]
+  "TARGET_ZCMP"
+  "cm.popret	{ra, s0-s3}, %0"
+)
+
+(define_insn "@gpr_multi_popret_up_to_s4_<mode>"
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_s4_operand" "I")))
+   (set (reg:X S4_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot0_offset>))))
+   (set (reg:X S3_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot1_offset>))))
+   (set (reg:X S2_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot2_offset>))))
+   (set (reg:X S1_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot3_offset>))))
+   (set (reg:X S0_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot4_offset>))))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot5_offset>))))
+   (return)
+   (use (reg:SI RETURN_ADDR_REGNUM))]
+  "TARGET_ZCMP"
+  "cm.popret	{ra, s0-s4}, %0"
+)
+
+(define_insn "@gpr_multi_popret_up_to_s5_<mode>"
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_s5_operand" "I")))
+   (set (reg:X S5_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot0_offset>))))
+   (set (reg:X S4_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot1_offset>))))
+   (set (reg:X S3_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot2_offset>))))
+   (set (reg:X S2_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot3_offset>))))
+   (set (reg:X S1_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot4_offset>))))
+   (set (reg:X S0_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot5_offset>))))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot6_offset>))))
+   (return)
+   (use (reg:SI RETURN_ADDR_REGNUM))]
+  "TARGET_ZCMP"
+  "cm.popret	{ra, s0-s5}, %0"
+)
+
+(define_insn "@gpr_multi_popret_up_to_s6_<mode>"
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_s6_operand" "I")))
+   (set (reg:X S6_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot0_offset>))))
+   (set (reg:X S5_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot1_offset>))))
+   (set (reg:X S4_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot2_offset>))))
+   (set (reg:X S3_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot3_offset>))))
+   (set (reg:X S2_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot4_offset>))))
+   (set (reg:X S1_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot5_offset>))))
+   (set (reg:X S0_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot6_offset>))))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot7_offset>))))
+   (return)
+   (use (reg:SI RETURN_ADDR_REGNUM))]
+  "TARGET_ZCMP"
+  "cm.popret	{ra, s0-s6}, %0"
+)
+
+(define_insn "@gpr_multi_popret_up_to_s7_<mode>"
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_s7_operand" "I")))
+   (set (reg:X S7_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot0_offset>))))
+   (set (reg:X S6_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot1_offset>))))
+   (set (reg:X S5_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot2_offset>))))
+   (set (reg:X S4_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot3_offset>))))
+   (set (reg:X S3_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot4_offset>))))
+   (set (reg:X S2_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot5_offset>))))
+   (set (reg:X S1_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot6_offset>))))
+   (set (reg:X S0_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot7_offset>))))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot8_offset>))))
+   (return)
+   (use (reg:SI RETURN_ADDR_REGNUM))]
+  "TARGET_ZCMP"
+  "cm.popret	{ra, s0-s7}, %0"
+)
+
+(define_insn "@gpr_multi_popret_up_to_s8_<mode>"
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_s8_operand" "I")))
+   (set (reg:X S8_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot0_offset>))))
+   (set (reg:X S7_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot1_offset>))))
+   (set (reg:X S6_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot2_offset>))))
+   (set (reg:X S5_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot3_offset>))))
+   (set (reg:X S4_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot4_offset>))))
+   (set (reg:X S3_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot5_offset>))))
+   (set (reg:X S2_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot6_offset>))))
+   (set (reg:X S1_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot7_offset>))))
+   (set (reg:X S0_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot8_offset>))))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot9_offset>))))
+   (return)
+   (use (reg:SI RETURN_ADDR_REGNUM))]
+  "TARGET_ZCMP"
+  "cm.popret	{ra, s0-s8}, %0"
+)
+
+(define_insn "@gpr_multi_popret_up_to_s9_<mode>"
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_s9_operand" "I")))
+   (set (reg:X S9_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot0_offset>))))
+   (set (reg:X S8_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot1_offset>))))
+   (set (reg:X S7_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot2_offset>))))
+   (set (reg:X S6_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot3_offset>))))
+   (set (reg:X S5_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot4_offset>))))
+   (set (reg:X S4_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot5_offset>))))
+   (set (reg:X S3_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot6_offset>))))
+   (set (reg:X S2_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot7_offset>))))
+   (set (reg:X S1_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot8_offset>))))
+   (set (reg:X S0_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot9_offset>))))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot10_offset>))))
+   (return)
+   (use (reg:SI RETURN_ADDR_REGNUM))]
+  "TARGET_ZCMP"
+  "cm.popret	{ra, s0-s9}, %0"
+)
+
+(define_insn "@gpr_multi_popret_up_to_s11_<mode>"
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_s11_operand" "I")))
+   (set (reg:X S11_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot0_offset>))))
+   (set (reg:X S10_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot1_offset>))))
+   (set (reg:X S9_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot2_offset>))))
+   (set (reg:X S8_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot3_offset>))))
+   (set (reg:X S7_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot4_offset>))))
+   (set (reg:X S6_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot5_offset>))))
+   (set (reg:X S5_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot6_offset>))))
+   (set (reg:X S4_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot7_offset>))))
+   (set (reg:X S3_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot8_offset>))))
+   (set (reg:X S2_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot9_offset>))))
+   (set (reg:X S1_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot10_offset>))))
+   (set (reg:X S0_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot11_offset>))))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot12_offset>))))
+   (return)
+   (use (reg:SI RETURN_ADDR_REGNUM))]
+  "TARGET_ZCMP"
+  "cm.popret	{ra, s0-s11}, %0"
+)
+
+(define_insn "@gpr_multi_push_up_to_ra_<mode>"
+  [(set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot0_offset>)))
+        (reg:X RETURN_ADDR_REGNUM))
+   (set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_push_up_to_ra_operand" "I")))]
+  "TARGET_ZCMP"
+  "cm.push	{ra}, %0"
+)
+
+(define_insn "@gpr_multi_push_up_to_s0_<mode>"
+  [(set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot0_offset>)))
+        (reg:X S0_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot1_offset>)))
+        (reg:X RETURN_ADDR_REGNUM))
+   (set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_push_up_to_s0_operand" "I")))]
+  "TARGET_ZCMP"
+  "cm.push	{ra, s0}, %0"
+)
+
+(define_insn "@gpr_multi_push_up_to_s1_<mode>"
+  [(set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot0_offset>)))
+        (reg:X S1_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot1_offset>)))
+        (reg:X S0_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot2_offset>)))
+        (reg:X RETURN_ADDR_REGNUM))
+   (set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_push_up_to_s1_operand" "I")))]
+  "TARGET_ZCMP"
+  "cm.push	{ra, s0-s1}, %0"
+)
+
+(define_insn "@gpr_multi_push_up_to_s2_<mode>"
+  [(set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot0_offset>)))
+        (reg:X S2_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot1_offset>)))
+        (reg:X S1_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot2_offset>)))
+        (reg:X S0_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot3_offset>)))
+        (reg:X RETURN_ADDR_REGNUM))
+   (set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_push_up_to_s2_operand" "I")))]
+  "TARGET_ZCMP"
+  "cm.push	{ra, s0-s2}, %0"
+)
+
+(define_insn "@gpr_multi_push_up_to_s3_<mode>"
+  [(set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot0_offset>)))
+        (reg:X S3_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot1_offset>)))
+        (reg:X S2_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot2_offset>)))
+        (reg:X S1_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot3_offset>)))
+        (reg:X S0_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot4_offset>)))
+        (reg:X RETURN_ADDR_REGNUM))
+   (set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_push_up_to_s3_operand" "I")))]
+  "TARGET_ZCMP"
+  "cm.push	{ra, s0-s3}, %0"
+)
+
+(define_insn "@gpr_multi_push_up_to_s4_<mode>"
+  [(set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot0_offset>)))
+        (reg:X S4_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot1_offset>)))
+        (reg:X S3_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot2_offset>)))
+        (reg:X S2_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot3_offset>)))
+        (reg:X S1_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot4_offset>)))
+        (reg:X S0_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot5_offset>)))
+        (reg:X RETURN_ADDR_REGNUM))
+   (set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_push_up_to_s4_operand" "I")))]
+  "TARGET_ZCMP"
+  "cm.push	{ra, s0-s4}, %0"
+)
+
+(define_insn "@gpr_multi_push_up_to_s5_<mode>"
+  [(set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot0_offset>)))
+        (reg:X S5_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot1_offset>)))
+        (reg:X S4_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot2_offset>)))
+        (reg:X S3_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot3_offset>)))
+        (reg:X S2_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot4_offset>)))
+        (reg:X S1_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot5_offset>)))
+        (reg:X S0_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot6_offset>)))
+        (reg:X RETURN_ADDR_REGNUM))
+   (set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_push_up_to_s5_operand" "I")))]
+  "TARGET_ZCMP"
+  "cm.push	{ra, s0-s5}, %0"
+)
+
+(define_insn "@gpr_multi_push_up_to_s6_<mode>"
+  [(set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot0_offset>)))
+        (reg:X S6_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot1_offset>)))
+        (reg:X S5_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot2_offset>)))
+        (reg:X S4_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot3_offset>)))
+        (reg:X S3_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot4_offset>)))
+        (reg:X S2_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot5_offset>)))
+        (reg:X S1_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot6_offset>)))
+        (reg:X S0_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot7_offset>)))
+        (reg:X RETURN_ADDR_REGNUM))
+   (set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_push_up_to_s6_operand" "I")))]
+  "TARGET_ZCMP"
+  "cm.push	{ra, s0-s6}, %0"
+)
+
+(define_insn "@gpr_multi_push_up_to_s7_<mode>"
+  [(set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot0_offset>)))
+        (reg:X S7_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot1_offset>)))
+        (reg:X S6_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot2_offset>)))
+        (reg:X S5_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot3_offset>)))
+        (reg:X S4_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot4_offset>)))
+        (reg:X S3_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot5_offset>)))
+        (reg:X S2_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot6_offset>)))
+        (reg:X S1_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot7_offset>)))
+        (reg:X S0_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot8_offset>)))
+        (reg:X RETURN_ADDR_REGNUM))
+   (set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_push_up_to_s7_operand" "I")))]
+  "TARGET_ZCMP"
+  "cm.push	{ra, s0-s7}, %0"
+)
+
+(define_insn "@gpr_multi_push_up_to_s8_<mode>"
+  [(set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot0_offset>)))
+        (reg:X S8_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot1_offset>)))
+        (reg:X S7_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot2_offset>)))
+        (reg:X S6_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot3_offset>)))
+        (reg:X S5_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot4_offset>)))
+        (reg:X S4_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot5_offset>)))
+        (reg:X S3_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot6_offset>)))
+        (reg:X S2_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot7_offset>)))
+        (reg:X S1_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot8_offset>)))
+        (reg:X S0_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot9_offset>)))
+        (reg:X RETURN_ADDR_REGNUM))
+   (set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_push_up_to_s8_operand" "I")))]
+  "TARGET_ZCMP"
+  "cm.push	{ra, s0-s8}, %0"
+)
+
+(define_insn "@gpr_multi_push_up_to_s9_<mode>"
+  [(set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot0_offset>)))
+        (reg:X S9_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot1_offset>)))
+        (reg:X S8_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot2_offset>)))
+        (reg:X S7_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot3_offset>)))
+        (reg:X S6_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot4_offset>)))
+        (reg:X S5_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot5_offset>)))
+        (reg:X S4_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot6_offset>)))
+        (reg:X S3_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot7_offset>)))
+        (reg:X S2_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot8_offset>)))
+        (reg:X S1_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot9_offset>)))
+        (reg:X S0_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot10_offset>)))
+        (reg:X RETURN_ADDR_REGNUM))
+   (set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_push_up_to_s9_operand" "I")))]
+  "TARGET_ZCMP"
+  "cm.push	{ra, s0-s9}, %0"
+)
+
+(define_insn "@gpr_multi_push_up_to_s11_<mode>"
+  [(set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot0_offset>)))
+        (reg:X S11_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot1_offset>)))
+        (reg:X S10_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot2_offset>)))
+        (reg:X S9_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot3_offset>)))
+        (reg:X S8_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot4_offset>)))
+        (reg:X S7_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot5_offset>)))
+        (reg:X S6_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot6_offset>)))
+        (reg:X S5_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot7_offset>)))
+        (reg:X S4_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot8_offset>)))
+        (reg:X S3_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot9_offset>)))
+        (reg:X S2_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot10_offset>)))
+        (reg:X S1_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot11_offset>)))
+        (reg:X S0_REGNUM))
+   (set (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot12_offset>)))
+        (reg:X RETURN_ADDR_REGNUM))
+   (set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_push_up_to_s11_operand" "I")))]
+  "TARGET_ZCMP"
+  "cm.push	{ra, s0-s11}, %0"
+)
diff --git a/gcc/testsuite/gcc.target/riscv/rv32e_zcmp.c b/gcc/testsuite/gcc.target/riscv/rv32e_zcmp.c
new file mode 100644
index 00000000000..6dbe489da9b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rv32e_zcmp.c
@@ -0,0 +1,239 @@
+/* { dg-do compile } */
+/* { dg-options " -Os -march=rv32e_zca_zcmp -mabi=ilp32e -mcmodel=medlow" } */
+/* { dg-skip-if "" { *-*-* } {"-O0" "-O1" "-O2" "-Og" "-O3" "-Oz" "-flto"} } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+char my_getchar();
+float getf();
+int __attribute__((noinline)) incoming_stack_args
+  (int arg0, int arg1, int arg2, int arg3,
+   int arg4, int arg5, int arg6, int arg7, int arg8);
+int getint();
+void PrintInts (int n, ...); // varargs
+void __attribute__((noinline)) PrintIntsNoVaStart (int n, ...); // varargs
+void PrintInts2 (int arg0, int arg1, int arg2, int arg3, int arg4, int arg5, int n, ...);
+extern void f1(void);
+extern void f2(void);
+
+/*
+**test1:
+**	...
+**	cm.push	{ra, s0-s1}, -64
+**	...
+**	cm.popret	{ra, s0-s1}, 64
+**	...
+*/
+int test1()
+{
+  char volatile array[3120];
+  float volatile farray[3120];
+
+  float sum = 0;
+  for (int i = 0; i < 3120; i++)
+  {
+    array[i] = my_getchar();
+    farray[i] = my_getchar() * 1.2;
+    sum += array[i] + farray[i];
+  }
+  return sum;
+}
+
+/*
+**test2_step1_0_size:
+**	...
+**	cm.push	{ra, s0}, -64
+**	...
+**	cm.popret	{ra, s0}, 64
+**	...
+*/
+int test2_step1_0_size()
+{
+  int volatile iarray[3120 + 1824/4 -8];
+
+  for (int i = 0; i < 3120 + 1824/4 - 8; i++)
+  {
+    iarray[i] = my_getchar() * 2;
+  }
+  return iarray[0] + iarray[1];
+}
+
+/*
+**test3:
+**	...
+**	cm.push	{ra, s0-s1}, -64
+**	...
+**	cm.popret	{ra, s0-s1}, 64
+**	...
+*/
+float test3()
+{
+  char volatile array[3120];
+  float volatile farray[3120];
+
+  float sum = 0, f1 = 0, f2 = 0, f3 = 0, f4 = 0, f5 = 0, f6 = 0, f7 = 0;
+
+  for (int i = 0; i < 3120; i++)
+  {
+    f1 = getf();
+    f2 = getf();
+    f3 = getf();
+    f4 = getf();
+    array[i] = my_getchar();
+    farray[i] = my_getchar() * 1.2;
+    sum += array[i] + farray[i] + f1 + f2 + f3 + f4;
+  }
+  return sum;
+}
+
+/*
+**outgoing_stack_args:
+**	...
+**	cm.push	{ra, s0}, -32
+**	...
+**	cm.popret	{ra, s0}, 32
+**	...
+*/
+int outgoing_stack_args()
+{
+  int  local = getint();
+  return local +incoming_stack_args(0, 1, 2, 3, 4, 5, 6, 7, 8);
+}
+
+/*
+**callPrintInts:
+**	...
+**	cm.push	{ra}, -32
+**	...
+**	cm.popret	{ra}, 32
+**	...
+*/
+float callPrintInts()
+{
+  volatile float f = getf(); // f in local
+  PrintInts(9,1,2,3,4,5,6,7,8,9);
+  return f;
+}
+
+/*
+**callPrint:
+**	...
+**	cm.push	{ra}, -32
+**	...
+**	cm.popret	{ra}, 32
+**	...
+*/
+float callPrint()
+{
+  volatile float f = getf(); // f in local
+  PrintIntsNoVaStart(0,1,2,3,4,5,6,7,8,9);
+  return f;
+}
+
+/*
+**callPrint_S:
+**	...
+**	cm.push	{ra, s0}, -32
+**	...
+**	cm.popret	{ra, s0}, 32
+**	...
+*/
+float callPrint_S()
+{
+  float f = getf();
+  PrintIntsNoVaStart(0,1,2,3,4,5,6,7,8,9);
+  return f;
+}
+
+/*
+**callPrint_2:
+**	...
+**	cm.push	{ra, s0}, -32
+**	...
+**	cm.popret	{ra, s0}, 32
+**	...
+*/
+float callPrint_2()
+{
+  float f = getf();
+  PrintInts2(0,1,2,3,4,5,6,7,8,9);
+  return f;
+}
+
+/*
+**test_step1_0bytes_save_restore:
+**	...
+**	cm.push	{ra}, -16
+**	...
+**	cm.popret	{ra}, 16
+**	...
+*/
+int test_step1_0bytes_save_restore()
+{
+
+  int a  =  9;
+  int b  =  my_getchar();
+  return a +b;
+}
+
+/*
+**test_s0:
+**	...
+**	cm.push	{ra, s0}, -16
+**	...
+**	cm.popret	{ra, s0}, 16
+**	...
+*/
+int test_s0()
+{
+
+  int a  =  my_getchar();
+  int b  =  my_getchar();
+  return a +b;
+}
+
+/*
+**test_s1:
+**	...
+**	cm.push	{ra, s0-s1}, -16
+**	...
+**	cm.popret	{ra, s0-s1}, 16
+**	...
+*/
+int test_s1()
+{
+
+  int s0  =  my_getchar();
+  int s1  =  my_getchar();
+  int b  =  my_getchar();
+  return s1 +s0 +b;
+}
+
+/*
+**test_f0:
+**	...
+**	cm.push	{ra, s0-s1}, -16
+**	...
+**	cm.popret	{ra, s0-s1}, 16
+**	...
+*/
+int test_f0()
+{
+
+  int s0  =  my_getchar();
+  float f0  =  getf(); 
+  int b  =  my_getchar();
+  return f0 +s0 +b;
+}
+
+/*
+**foo:
+**	cm.push	{ra}, -16
+**	call	f1
+**	cm.pop	{ra}, 16
+**	tail	f2
+*/
+void foo(void)
+{
+  f1();
+  f2();
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rv32i_zcmp.c b/gcc/testsuite/gcc.target/riscv/rv32i_zcmp.c
new file mode 100644
index 00000000000..924197cb3c4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rv32i_zcmp.c
@@ -0,0 +1,239 @@
+/* { dg-do compile } */
+/* { dg-options " -Os -march=rv32imaf_zca_zcmp -mabi=ilp32f -mcmodel=medlow" } */
+/* { dg-skip-if "" { *-*-* } {"-O0" "-O1" "-O2" "-Og" "-O3" "-Oz" "-flto"} } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+char my_getchar();
+float getf();
+int __attribute__((noinline)) incoming_stack_args
+  (int arg0, int arg1, int arg2, int arg3,
+   int arg4, int arg5, int arg6, int arg7, int arg8);
+int getint();
+void PrintInts (int n, ...); // varargs
+void __attribute__((noinline)) PrintIntsNoVaStart (int n, ...); // varargs
+void PrintInts2 (int arg0, int arg1, int arg2, int arg3, int arg4, int arg5, int n, ...);
+extern void f1(void);
+extern void f2(void);
+
+/*
+**test1:
+**	...
+**	cm.push	{ra, s0-s4}, -80
+**	...
+**	cm.popret	{ra, s0-s4}, 80
+**	...
+*/
+int test1()
+{
+  char volatile array[3120];
+  float volatile farray[3120];
+
+  float sum = 0;
+  for (int i = 0; i < 3120; i++)
+  {
+    array[i] = my_getchar();
+    farray[i] = my_getchar() * 1.2;
+    sum += array[i] + farray[i];
+  }
+  return sum;
+}
+
+/*
+**test2_step1_0_size:
+**	...
+**	cm.push	{ra, s0-s1}, -64
+**	...
+**	cm.popret	{ra, s0-s1}, 64
+**	...
+*/
+int test2_step1_0_size()
+{
+  int volatile iarray[3120 + 1824/4 -8];
+
+  for (int i = 0; i < 3120 + 1824/4 - 8; i++)
+  {
+    iarray[i] = my_getchar() * 2;
+  }
+  return iarray[0] + iarray[1];
+}
+
+/*
+**test3:
+**	...
+**	cm.push	{ra, s0-s4}, -80
+**	...
+**	cm.popret	{ra, s0-s4}, 80
+**	...
+*/
+float test3()
+{
+  char volatile array[3120];
+  float volatile farray[3120];
+
+  float sum = 0, f1 = 0, f2 = 0, f3 = 0, f4 = 0, f5 = 0, f6 = 0, f7 = 0;
+
+  for (int i = 0; i < 3120; i++)
+  {
+    f1 = getf();
+    f2 = getf();
+    f3 = getf();
+    f4 = getf();
+    array[i] = my_getchar();
+    farray[i] = my_getchar() * 1.2;
+    sum += array[i] + farray[i] + f1 + f2 + f3 + f4;
+  }
+  return sum;
+}
+
+/*
+**outgoing_stack_args:
+**	...
+**	cm.push	{ra, s0}, -32
+**	...
+**	cm.popret	{ra, s0}, 32
+**	...
+*/
+int outgoing_stack_args()
+{
+  int  local = getint();
+  return local +incoming_stack_args(0, 1, 2, 3, 4, 5, 6, 7, 8);
+}
+
+/*
+**callPrintInts:
+**	...
+**	cm.push	{ra}, -48
+**	...
+**	cm.popret	{ra}, 48
+**	...
+*/
+float callPrintInts()
+{
+  volatile float f = getf(); // f in local
+  PrintInts(9,1,2,3,4,5,6,7,8,9);
+  return f;
+}
+
+/*
+**callPrint:
+**	...
+**	cm.push	{ra}, -48
+**	...
+**	cm.popret	{ra}, 48
+**	...
+*/
+float callPrint()
+{
+  volatile float f = getf(); // f in local
+  PrintIntsNoVaStart(0,1,2,3,4,5,6,7,8,9);
+  return f;
+}
+
+/*
+**callPrint_S:
+**	...
+**	cm.push	{ra}, -48
+**	...
+**	cm.popret	{ra}, 48
+**	...
+*/
+float callPrint_S()
+{
+  float f = getf();
+  PrintIntsNoVaStart(0,1,2,3,4,5,6,7,8,9);
+  return f;
+}
+
+/*
+**callPrint_2:
+**	...
+**	cm.push	{ra}, -48
+**	...
+**	cm.popret	{ra}, 48
+**	...
+*/
+float callPrint_2()
+{
+  float f = getf();
+  PrintInts2(0,1,2,3,4,5,6,7,8,9);
+  return f;
+}
+
+/*
+**test_step1_0bytes_save_restore:
+**	...
+**	cm.push	{ra}, -16
+**	...
+**	cm.popret	{ra}, 16
+**	...
+*/
+int test_step1_0bytes_save_restore()
+{
+
+  int a  =  9;
+  int b  =  my_getchar();
+  return a +b;
+}
+
+/*
+**test_s0:
+**	...
+**	cm.push	{ra, s0}, -16
+**	...
+**	cm.popret	{ra, s0}, 16
+**	...
+*/
+int test_s0()
+{
+
+  int a  =  my_getchar();
+  int b  =  my_getchar();
+  return a +b;
+}
+
+/*
+**test_s1:
+**	...
+**	cm.push	{ra, s0-s1}, -16
+**	...
+**	cm.popret	{ra, s0-s1}, 16
+**	...
+*/
+int test_s1()
+{
+
+  int s0  =  my_getchar();
+  int s1  =  my_getchar();
+  int b  =  my_getchar();
+  return s1 +s0 +b;
+}
+
+/*
+**test_f0:
+**	...
+**	cm.push	{ra, s0}, -32
+**	...
+**	cm.popret	{ra, s0}, 32
+**	...
+*/
+int test_f0()
+{
+
+  int s0  =  my_getchar();
+  float f0  =  getf(); 
+  int b  =  my_getchar();
+  return f0 +s0 +b;
+}
+
+/*
+**foo:
+**	cm.push	{ra}, -16
+**	call	f1
+**	cm.pop	{ra}, 16
+**	tail	f2
+*/
+void foo(void)
+{
+  f1();
+  f2();
+}
diff --git a/gcc/testsuite/gcc.target/riscv/zcmp_stack_alignment.c b/gcc/testsuite/gcc.target/riscv/zcmp_stack_alignment.c
new file mode 100644
index 00000000000..05602302a8f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zcmp_stack_alignment.c
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options " -O0 -march=rv32e_zca_zcb_zcmp -mabi=ilp32e -mcmodel=medlow -fomit-frame-pointer" } */
+/* { dg-skip-if "" { *-*-* } {"-O2" "-O1" "-Os" "-Og" "-O3" "-Oz" "-flto"} } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+void bar();
+
+/*
+**fool_rv32e:
+**	cm.push	{ra}, -32
+**	...
+**	call	bar
+**	...
+**	lw	a5,32\(sp\)
+**	...
+**	cm.popret	{ra}, 32
+*/
+int fool_rv32e ( int a0, int a1, int a2, int a3, int a4, int a5,
+                  int incoming0)
+{
+  bar();
+  return a0 + a1 + a2 + a3 + a4 + a5 + incoming0;
+}
-- 
2.17.1



* [PATCH 2/4] [RISC-V] support cm.popretz in zcmp
  2023-06-07  5:52 [PATCH 0/4] [RISC-V] support zcmp extention Fei Gao
  2023-06-07  5:52 ` [PATCH 1/4][V4][RISC-V] support cm.push cm.pop cm.popret in zcmp Fei Gao
@ 2023-06-07  5:52 ` Fei Gao
  2023-07-13  8:31   ` Kito Cheng
  2023-06-07  5:52 ` [PATCH 3/4] [RISC-V] resolve confilct between zcmp multi push/pop and shrink-wrap-separate Fei Gao
  2023-06-07  5:52 ` [PATCH 4/4] [RISC-V] support cm.mva01s cm.mvsa01 in zcmp Fei Gao
  3 siblings, 1 reply; 17+ messages in thread
From: Fei Gao @ 2023-06-07  5:52 UTC (permalink / raw)
  To: gcc-patches; +Cc: kito.cheng, palmer, jeffreyalaw, sinan.lin, jiawei, Fei Gao

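Generate cm.popretz instead of cm.popret when the function returns 0.
The epilogue expander looks for "set a0, 0" followed by "use a0" just
before NOTE_INSN_EPILOGUE_BEG, deletes both insns, and emits cm.popretz,
whose pattern clears a0 itself.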
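
For illustration only (not part of the patch), a minimal sketch of the
kind of function this is aimed at.  The names callee and
returns_zero_after_call are hypothetical and do not appear in the
testsuite changes:

```c
/* Illustrative only: `callee` and `returns_zero_after_call` are
   hypothetical names, not part of this patch or its testsuite.  */

static volatile int calls;

/* noinline keeps a real call in the caller's body, so ra has to be
   saved and the caller needs a push/pop pair.  */
__attribute__ ((noinline)) static void
callee (void)
{
  calls++;
}

/* Returning the constant 0 after a call is what gives rise to the
   "set a0, 0; use a0" pair that the epilogue expander looks for.  */
int
returns_zero_after_call (void)
{
  callee ();
  return 0;
}
```

Built with the options the rv32i test uses (-Os
-march=rv32imaf_zca_zcmp -mabi=ilp32f), the expectation is that the
epilogue ends in a single cm.popretz rather than "li a0,0" followed by
cm.popret.  The exact register list and stack adjustment depend on the
frame layout and are not asserted here.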

Signed-off-by: Fei Gao <gaofei@eswincomputing.com>

gcc/ChangeLog:

        * config/riscv/riscv.cc (riscv_zcmp_op_t): Add POPRETZ_IDX.
        (code_for_push_pop): Add cm.popretz entries.
        (riscv_zcmp_can_use_popretz): New function.  Return the insn
        that clears a0 if cm.popretz can be used, otherwise NULL.
        (riscv_gen_multi_pop_insn): New function.  Generate
        cm.pop[ret][z].
        (riscv_expand_epilogue): Use it to expand cm.pop[ret][z].
        * config/riscv/riscv.md (A0_REGNUM): New constant.
        * config/riscv/zc.md
        (@gpr_multi_popretz_up_to_ra_<mode>): New pattern for cm.popretz {ra}.
        (@gpr_multi_popretz_up_to_s0_<mode>): New pattern for cm.popretz
        {ra, s0}.
        (@gpr_multi_popretz_up_to_s1_<mode>): Likewise.
        (@gpr_multi_popretz_up_to_s2_<mode>): Likewise.
        (@gpr_multi_popretz_up_to_s3_<mode>): Likewise.
        (@gpr_multi_popretz_up_to_s4_<mode>): Likewise.
        (@gpr_multi_popretz_up_to_s5_<mode>): Likewise.
        (@gpr_multi_popretz_up_to_s6_<mode>): Likewise.
        (@gpr_multi_popretz_up_to_s7_<mode>): Likewise.
        (@gpr_multi_popretz_up_to_s8_<mode>): Likewise.
        (@gpr_multi_popretz_up_to_s9_<mode>): Likewise.
        (@gpr_multi_popretz_up_to_s11_<mode>): Likewise.

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/rv32e_zcmp.c: Add test case for cm.popretz on rv32e.
        * gcc.target/riscv/rv32i_zcmp.c: Add test case for cm.popretz on rv32i.
---
 gcc/config/riscv/riscv.cc                   | 114 ++++--
 gcc/config/riscv/riscv.md                   |   1 +
 gcc/config/riscv/zc.md                      | 393 ++++++++++++++++++++
 gcc/testsuite/gcc.target/riscv/rv32e_zcmp.c |  12 +
 gcc/testsuite/gcc.target/riscv/rv32i_zcmp.c |  12 +
 5 files changed, 508 insertions(+), 24 deletions(-)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index c476c699f4c..f60c241a526 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -435,6 +435,7 @@ typedef enum
   PUSH_IDX = 0,
   POP_IDX,
   POPRET_IDX,
+  POPRETZ_IDX,
   ZCMP_OP_NUM
 } riscv_zcmp_op_t;
 
@@ -5535,30 +5536,30 @@ riscv_emit_stack_tie (void)
 /*zcmp multi push and pop code_for_push_pop function ptr array  */
 const code_for_push_pop_t code_for_push_pop [ZCMP_MAX_GRP_SLOTS][ZCMP_OP_NUM] = {
   {code_for_gpr_multi_push_up_to_ra,    code_for_gpr_multi_pop_up_to_ra,
-   code_for_gpr_multi_popret_up_to_ra},
+   code_for_gpr_multi_popret_up_to_ra,  code_for_gpr_multi_popretz_up_to_ra},
   {code_for_gpr_multi_push_up_to_s0,    code_for_gpr_multi_pop_up_to_s0,
-   code_for_gpr_multi_popret_up_to_s0},
+   code_for_gpr_multi_popret_up_to_s0,  code_for_gpr_multi_popretz_up_to_s0},
   {code_for_gpr_multi_push_up_to_s1,    code_for_gpr_multi_pop_up_to_s1,
-   code_for_gpr_multi_popret_up_to_s1},
+   code_for_gpr_multi_popret_up_to_s1,  code_for_gpr_multi_popretz_up_to_s1},
   {code_for_gpr_multi_push_up_to_s2,    code_for_gpr_multi_pop_up_to_s2,
-   code_for_gpr_multi_popret_up_to_s2},
+   code_for_gpr_multi_popret_up_to_s2,  code_for_gpr_multi_popretz_up_to_s2},
   {code_for_gpr_multi_push_up_to_s3,    code_for_gpr_multi_pop_up_to_s3,
-   code_for_gpr_multi_popret_up_to_s3},
+   code_for_gpr_multi_popret_up_to_s3,  code_for_gpr_multi_popretz_up_to_s3},
   {code_for_gpr_multi_push_up_to_s4,    code_for_gpr_multi_pop_up_to_s4,
-   code_for_gpr_multi_popret_up_to_s4},
+   code_for_gpr_multi_popret_up_to_s4,  code_for_gpr_multi_popretz_up_to_s4},
   {code_for_gpr_multi_push_up_to_s5,    code_for_gpr_multi_pop_up_to_s5,
-   code_for_gpr_multi_popret_up_to_s5},
+   code_for_gpr_multi_popret_up_to_s5,  code_for_gpr_multi_popretz_up_to_s5},
   {code_for_gpr_multi_push_up_to_s6,    code_for_gpr_multi_pop_up_to_s6,
-   code_for_gpr_multi_popret_up_to_s6},
+   code_for_gpr_multi_popret_up_to_s6,  code_for_gpr_multi_popretz_up_to_s6},
   {code_for_gpr_multi_push_up_to_s7,    code_for_gpr_multi_pop_up_to_s7,
-   code_for_gpr_multi_popret_up_to_s7},
+   code_for_gpr_multi_popret_up_to_s7,  code_for_gpr_multi_popretz_up_to_s7},
   {code_for_gpr_multi_push_up_to_s8,    code_for_gpr_multi_pop_up_to_s8,
-   code_for_gpr_multi_popret_up_to_s8},
+   code_for_gpr_multi_popret_up_to_s8,  code_for_gpr_multi_popretz_up_to_s8},
   {code_for_gpr_multi_push_up_to_s9,    code_for_gpr_multi_pop_up_to_s9,
-   code_for_gpr_multi_popret_up_to_s9},
-  {nullptr, nullptr, nullptr},
+   code_for_gpr_multi_popret_up_to_s9,  code_for_gpr_multi_popretz_up_to_s9},
+  {nullptr, nullptr, nullptr, nullptr},
   {code_for_gpr_multi_push_up_to_s11,   code_for_gpr_multi_pop_up_to_s11,
-   code_for_gpr_multi_popret_up_to_s11}};
+   code_for_gpr_multi_popret_up_to_s11, code_for_gpr_multi_popretz_up_to_s11}};
 
 static rtx
 riscv_gen_multi_push_pop_insn (riscv_zcmp_op_t op, HOST_WIDE_INT adj_size,
@@ -5747,6 +5748,80 @@ riscv_adjust_libcall_cfi_epilogue ()
   return dwarf;
 }
 
+/* Return the insn clearing a0 if the popretz pattern matches, else NULL:
+   set (reg 10 a0) (const_int 0)
+   use (reg 10 a0)
+   NOTE_INSN_EPILOGUE_BEG  */
+static rtx_insn *
+riscv_zcmp_can_use_popretz (void)
+{
+  rtx_insn *insn = NULL, *use = NULL, *clear = NULL;
+
+  /* Sequence stack for NOTE_INSN_EPILOGUE_BEG.  */
+  struct sequence_stack *outer_seq = get_current_sequence ()->next;
+  if (!outer_seq)
+    return NULL;
+  insn = outer_seq->first;
+  if (!insn || !NOTE_P (insn) || NOTE_KIND (insn) != NOTE_INSN_EPILOGUE_BEG)
+    return NULL;
+
+  /* Sequence stack for the insn before NOTE_INSN_EPILOGUE_BEG.  */
+  outer_seq = outer_seq->next;
+  if (outer_seq)
+    insn = outer_seq->last;
+
+  /* Skip notes.  */
+  while (insn && NOTE_P (insn))
+    {
+      insn = PREV_INSN (insn);
+    }
+  use = insn;
+
+  /* Match use (reg 10 a0).  */
+  if (use == NULL || !INSN_P (use)
+      || GET_CODE (PATTERN (use)) != USE
+      || !REG_P (XEXP (PATTERN (use), 0))
+      || REGNO (XEXP (PATTERN (use), 0)) != A0_REGNUM)
+    return NULL;
+
+  /* Match set (reg 10 a0) (const_int 0 [0]).  */
+  clear = PREV_INSN (use);
+  if (clear != NULL && INSN_P (clear)
+      && GET_CODE (PATTERN (clear)) == SET
+      && REG_P (SET_DEST (PATTERN (clear)))
+      && REGNO (SET_DEST (PATTERN (clear))) == A0_REGNUM
+      && SET_SRC (PATTERN (clear)) == const0_rtx)
+    return clear;
+
+  return NULL;
+}
+
+static void
+riscv_gen_multi_pop_insn (bool use_multi_pop_normal, unsigned mask,
+                          unsigned multipop_size)
+{
+  rtx insn;
+  unsigned regs_count = riscv_multi_push_regs_count (mask);
+
+  if (!use_multi_pop_normal)
+    insn = emit_insn (
+      riscv_gen_multi_push_pop_insn (POP_IDX, multipop_size, regs_count));
+  else if (rtx_insn *clear_a0_insn = riscv_zcmp_can_use_popretz ())
+    {
+      delete_insn (NEXT_INSN (clear_a0_insn));
+      delete_insn (clear_a0_insn);
+      insn = emit_jump_insn (
+        riscv_gen_multi_push_pop_insn (POPRETZ_IDX, multipop_size, regs_count));
+    }
+  else
+    insn = emit_jump_insn (
+      riscv_gen_multi_push_pop_insn (POPRET_IDX, multipop_size, regs_count));
+
+  rtx dwarf = riscv_adjust_multi_pop_cfi_epilogue (multipop_size);
+  RTX_FRAME_RELATED_P (insn) = 1;
+  REG_NOTES (insn) = dwarf;
+}
+
 /* Expand an "epilogue", "sibcall_epilogue", or "eh_return_internal" pattern;
    style says which.  */
 
@@ -5943,17 +6018,8 @@ riscv_expand_epilogue (int style)
 
   if (use_multi_pop)
     {
-      unsigned regs_count = riscv_multi_push_regs_count (frame->mask);
-      if (use_multi_pop_normal)
-        insn = emit_jump_insn (
-          riscv_gen_multi_push_pop_insn (POPRET_IDX, multipop_size, regs_count));
-      else
-        insn= emit_insn (
-          riscv_gen_multi_push_pop_insn(POP_IDX, multipop_size, regs_count));
-
-      rtx dwarf = riscv_adjust_multi_pop_cfi_epilogue (multipop_size);
-      RTX_FRAME_RELATED_P (insn) = 1;
-      REG_NOTES (insn) = dwarf;
+      riscv_gen_multi_pop_insn (use_multi_pop_normal,
+                                frame->mask, multipop_size);
       if (use_multi_pop_normal)
         return;
     }
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index c858b3bc9ef..b2e1f82f627 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -120,6 +120,7 @@
    (T1_REGNUM			6)
    (S0_REGNUM			8)
    (S1_REGNUM			9)
+   (A0_REGNUM			10)
    (S2_REGNUM			18)
    (S3_REGNUM			19)
    (S4_REGNUM			20)
diff --git a/gcc/config/riscv/zc.md b/gcc/config/riscv/zc.md
index 5c1bf031b8d..8d7de97daad 100644
--- a/gcc/config/riscv/zc.md
+++ b/gcc/config/riscv/zc.md
@@ -708,6 +708,399 @@
   "cm.popret	{ra, s0-s11}, %0"
 )
 
+(define_insn "@gpr_multi_popretz_up_to_ra_<mode>"
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_ra_operand" "I")))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot0_offset>))))
+   (set (reg:X A0_REGNUM)
+        (const_int 0))
+   (use (reg:X A0_REGNUM))
+   (return)
+   (use (reg:SI RETURN_ADDR_REGNUM))]
+  "TARGET_ZCMP"
+  "cm.popretz	{ra}, %0"
+)
+
+(define_insn "@gpr_multi_popretz_up_to_s0_<mode>"
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_s0_operand" "I")))
+   (set (reg:X S0_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot0_offset>))))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot1_offset>))))
+   (set (reg:X A0_REGNUM)
+        (const_int 0))
+   (use (reg:X A0_REGNUM))
+   (return)
+   (use (reg:SI RETURN_ADDR_REGNUM))]
+  "TARGET_ZCMP"
+  "cm.popretz	{ra, s0}, %0"
+)
+
+(define_insn "@gpr_multi_popretz_up_to_s1_<mode>"
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_s1_operand" "I")))
+   (set (reg:X S1_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot0_offset>))))
+   (set (reg:X S0_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot1_offset>))))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot2_offset>))))
+   (set (reg:X A0_REGNUM)
+        (const_int 0))
+   (use (reg:X A0_REGNUM))
+   (return)
+   (use (reg:SI RETURN_ADDR_REGNUM))]
+  "TARGET_ZCMP"
+  "cm.popretz	{ra, s0-s1}, %0"
+)
+
+(define_insn "@gpr_multi_popretz_up_to_s2_<mode>"
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_s2_operand" "I")))
+   (set (reg:X S2_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot0_offset>))))
+   (set (reg:X S1_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot1_offset>))))
+   (set (reg:X S0_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot2_offset>))))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot3_offset>))))
+   (set (reg:X A0_REGNUM)
+        (const_int 0))
+   (use (reg:X A0_REGNUM))
+   (return)
+   (use (reg:SI RETURN_ADDR_REGNUM))]
+  "TARGET_ZCMP"
+  "cm.popretz	{ra, s0-s2}, %0"
+)
+
+(define_insn "@gpr_multi_popretz_up_to_s3_<mode>"
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_s3_operand" "I")))
+   (set (reg:X S3_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot0_offset>))))
+   (set (reg:X S2_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot1_offset>))))
+   (set (reg:X S1_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot2_offset>))))
+   (set (reg:X S0_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot3_offset>))))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot4_offset>))))
+   (set (reg:X A0_REGNUM)
+        (const_int 0))
+   (use (reg:X A0_REGNUM))
+   (return)
+   (use (reg:SI RETURN_ADDR_REGNUM))]
+  "TARGET_ZCMP"
+  "cm.popretz	{ra, s0-s3}, %0"
+)
+
+(define_insn "@gpr_multi_popretz_up_to_s4_<mode>"
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_s4_operand" "I")))
+   (set (reg:X S4_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot0_offset>))))
+   (set (reg:X S3_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot1_offset>))))
+   (set (reg:X S2_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot2_offset>))))
+   (set (reg:X S1_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot3_offset>))))
+   (set (reg:X S0_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot4_offset>))))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot5_offset>))))
+   (set (reg:X A0_REGNUM)
+        (const_int 0))
+   (use (reg:X A0_REGNUM))
+   (return)
+   (use (reg:SI RETURN_ADDR_REGNUM))]
+  "TARGET_ZCMP"
+  "cm.popretz	{ra, s0-s4}, %0"
+)
+
+(define_insn "@gpr_multi_popretz_up_to_s5_<mode>"
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_s5_operand" "I")))
+   (set (reg:X S5_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot0_offset>))))
+   (set (reg:X S4_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot1_offset>))))
+   (set (reg:X S3_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot2_offset>))))
+   (set (reg:X S2_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot3_offset>))))
+   (set (reg:X S1_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot4_offset>))))
+   (set (reg:X S0_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot5_offset>))))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot6_offset>))))
+   (set (reg:X A0_REGNUM)
+        (const_int 0))
+   (use (reg:X A0_REGNUM))
+   (return)
+   (use (reg:SI RETURN_ADDR_REGNUM))]
+  "TARGET_ZCMP"
+  "cm.popretz	{ra, s0-s5}, %0"
+)
+
+(define_insn "@gpr_multi_popretz_up_to_s6_<mode>"
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_s6_operand" "I")))
+   (set (reg:X S6_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot0_offset>))))
+   (set (reg:X S5_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot1_offset>))))
+   (set (reg:X S4_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot2_offset>))))
+   (set (reg:X S3_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot3_offset>))))
+   (set (reg:X S2_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot4_offset>))))
+   (set (reg:X S1_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot5_offset>))))
+   (set (reg:X S0_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot6_offset>))))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot7_offset>))))
+   (set (reg:X A0_REGNUM)
+        (const_int 0))
+   (use (reg:X A0_REGNUM))
+   (return)
+   (use (reg:SI RETURN_ADDR_REGNUM))]
+  "TARGET_ZCMP"
+  "cm.popretz	{ra, s0-s6}, %0"
+)
+
+(define_insn "@gpr_multi_popretz_up_to_s7_<mode>"
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_s7_operand" "I")))
+   (set (reg:X S7_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot0_offset>))))
+   (set (reg:X S6_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot1_offset>))))
+   (set (reg:X S5_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot2_offset>))))
+   (set (reg:X S4_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot3_offset>))))
+   (set (reg:X S3_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot4_offset>))))
+   (set (reg:X S2_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot5_offset>))))
+   (set (reg:X S1_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot6_offset>))))
+   (set (reg:X S0_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot7_offset>))))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot8_offset>))))
+   (set (reg:X A0_REGNUM)
+        (const_int 0))
+   (use (reg:X A0_REGNUM))
+   (return)
+   (use (reg:SI RETURN_ADDR_REGNUM))]
+  "TARGET_ZCMP"
+  "cm.popretz	{ra, s0-s7}, %0"
+)
+
+(define_insn "@gpr_multi_popretz_up_to_s8_<mode>"
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_s8_operand" "I")))
+   (set (reg:X S8_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot0_offset>))))
+   (set (reg:X S7_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot1_offset>))))
+   (set (reg:X S6_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot2_offset>))))
+   (set (reg:X S5_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot3_offset>))))
+   (set (reg:X S4_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot4_offset>))))
+   (set (reg:X S3_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot5_offset>))))
+   (set (reg:X S2_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot6_offset>))))
+   (set (reg:X S1_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot7_offset>))))
+   (set (reg:X S0_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot8_offset>))))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot9_offset>))))
+   (set (reg:X A0_REGNUM)
+        (const_int 0))
+   (use (reg:X A0_REGNUM))
+   (return)
+   (use (reg:SI RETURN_ADDR_REGNUM))]
+  "TARGET_ZCMP"
+  "cm.popretz	{ra, s0-s8}, %0"
+)
+
+(define_insn "@gpr_multi_popretz_up_to_s9_<mode>"
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_s9_operand" "I")))
+   (set (reg:X S9_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot0_offset>))))
+   (set (reg:X S8_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot1_offset>))))
+   (set (reg:X S7_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot2_offset>))))
+   (set (reg:X S6_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot3_offset>))))
+   (set (reg:X S5_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot4_offset>))))
+   (set (reg:X S4_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot5_offset>))))
+   (set (reg:X S3_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot6_offset>))))
+   (set (reg:X S2_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot7_offset>))))
+   (set (reg:X S1_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot8_offset>))))
+   (set (reg:X S0_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot9_offset>))))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot10_offset>))))
+   (set (reg:X A0_REGNUM)
+        (const_int 0))
+   (use (reg:X A0_REGNUM))
+   (return)
+   (use (reg:SI RETURN_ADDR_REGNUM))]
+  "TARGET_ZCMP"
+  "cm.popretz	{ra, s0-s9}, %0"
+)
+
+(define_insn "@gpr_multi_popretz_up_to_s11_<mode>"
+  [(set (reg:X SP_REGNUM)
+        (plus:X (reg:X SP_REGNUM)
+                 (match_operand 0 "stack_pop_up_to_s11_operand" "I")))
+   (set (reg:X S11_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot0_offset>))))
+   (set (reg:X S10_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot1_offset>))))
+   (set (reg:X S9_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot2_offset>))))
+   (set (reg:X S8_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot3_offset>))))
+   (set (reg:X S7_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot4_offset>))))
+   (set (reg:X S6_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot5_offset>))))
+   (set (reg:X S5_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot6_offset>))))
+   (set (reg:X S4_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot7_offset>))))
+   (set (reg:X S3_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                      (const_int <slot8_offset>))))
+   (set (reg:X S2_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot9_offset>))))
+   (set (reg:X S1_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot10_offset>))))
+   (set (reg:X S0_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot11_offset>))))
+   (set (reg:X RETURN_ADDR_REGNUM)
+        (mem:X (plus:X (reg:X SP_REGNUM)
+                       (const_int <slot12_offset>))))
+   (set (reg:X A0_REGNUM)
+        (const_int 0))
+   (use (reg:X A0_REGNUM))
+   (return)
+   (use (reg:SI RETURN_ADDR_REGNUM))]
+  "TARGET_ZCMP"
+  "cm.popretz	{ra, s0-s11}, %0"
+)
+
 (define_insn "@gpr_multi_push_up_to_ra_<mode>"
   [(set (mem:X (plus:X (reg:X SP_REGNUM)
                        (const_int <slot0_offset>)))
diff --git a/gcc/testsuite/gcc.target/riscv/rv32e_zcmp.c b/gcc/testsuite/gcc.target/riscv/rv32e_zcmp.c
index 6dbe489da9b..05e52df99c2 100644
--- a/gcc/testsuite/gcc.target/riscv/rv32e_zcmp.c
+++ b/gcc/testsuite/gcc.target/riscv/rv32e_zcmp.c
@@ -237,3 +237,15 @@ void foo(void)
   f1();
   f2();
 }
+
+/*
+**test_popretz:
+**	cm.push	{ra}, -16
+**	call	f1
+**	cm.popretz	{ra}, 16
+*/
+long test_popretz()
+{
+        f1();
+        return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rv32i_zcmp.c b/gcc/testsuite/gcc.target/riscv/rv32i_zcmp.c
index 924197cb3c4..7d5c1121c35 100644
--- a/gcc/testsuite/gcc.target/riscv/rv32i_zcmp.c
+++ b/gcc/testsuite/gcc.target/riscv/rv32i_zcmp.c
@@ -237,3 +237,15 @@ void foo(void)
   f1();
   f2();
 }
+
+/*
+**test_popretz:
+**	cm.push	{ra}, -16
+**	call	f1
+**	cm.popretz	{ra}, 16
+*/
+long test_popretz()
+{
+        f1();
+        return 0;
+}
-- 
2.17.1


* [PATCH 3/4] [RISC-V] resolve confilct between zcmp multi push/pop and shrink-wrap-separate
  2023-06-07  5:52 [PATCH 0/4] [RISC-V] support zcmp extention Fei Gao
  2023-06-07  5:52 ` [PATCH 1/4][V4][RISC-V] support cm.push cm.pop cm.popret in zcmp Fei Gao
  2023-06-07  5:52 ` [PATCH 2/4] [RISC-V] support cm.popretz " Fei Gao
@ 2023-06-07  5:52 ` Fei Gao
  2023-06-12 15:17   ` Kito Cheng
  2023-06-12 19:26   ` Jeff Law
  2023-06-07  5:52 ` [PATCH 4/4] [RISC-V] support cm.mva01s cm.mvsa01 in zcmp Fei Gao
  3 siblings, 2 replies; 17+ messages in thread
From: Fei Gao @ 2023-06-07  5:52 UTC (permalink / raw)
  To: gcc-patches; +Cc: kito.cheng, palmer, jeffreyalaw, sinan.lin, jiawei, Fei Gao

Disable zcmp multi push/pop when shrink-wrap-separate is active.

With -Os, which prefers smaller code size, shrink-wrap-separate is
disabled by default and zcmp multi push/pop is enabled.

With -O2 and the other levels that prefer speed, shrink-wrap-separate is
enabled by default and zcmp multi push/pop is disabled. To force zcmp multi
push/pop in this case, -fno-shrink-wrap-separate has to be given explicitly.

The following test case shows the issues at -O2 before this patch, with both
shrink-wrap-separate and zcmp multi push/pop active:
1. The s registers are stored twice.
2. cm.push pushes ra, s0-s11 in the reverse order of the normal
   prologue, causing stack corruption and failure to restore the s registers.

Test case: zcmp_shrink_wrap_separate.c, included in this patch.

output asm before this patch:
calc_func:
	cm.push	{ra, s0-s3}, -32
	...
	beq	a5,zero,.L2
	...
.L2:
	...
	sw	s1,20(sp) //issue here
	sw	s3,12(sp) //issue here
	...
	sw	s2,16(sp) //issue here

output asm after this patch:
calc_func:
	addi	sp,sp,-32
	sw	s0,24(sp)
	...
	beq	a5,zero,.L2
	...
.L2:
	...
	sw	s1,20(sp)
	sw	s3,12(sp)
	...
	sw	s2,16(sp)

gcc/ChangeLog:

        * config/riscv/riscv.cc
        (riscv_avoid_shrink_wrapping_separate): New function wrapping the
        condition check from riscv_get_separate_components.
        (riscv_avoid_multi_push): Avoid multi push if shrink-wrap-separate
        is active.
        (riscv_get_separate_components): Call
        riscv_avoid_shrink_wrapping_separate.
        * shrink-wrap.cc (try_shrink_wrapping_separate): Call
        use_shrink_wrapping_separate.
        (use_shrink_wrapping_separate): New function wrapping the condition
        check from try_shrink_wrapping_separate.
        * shrink-wrap.h (use_shrink_wrapping_separate): Declare.

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/zcmp_shrink_wrap_separate.c: New test.
        * gcc.target/riscv/zcmp_shrink_wrap_separate2.c: New test.

Signed-off-by: Fei Gao <gaofei@eswincomputing.com>
Co-Authored-By: Zhangjin Liao <liaozhangjin@eswincomputing.com>
---
 gcc/config/riscv/riscv.cc                     | 19 +++-
 gcc/shrink-wrap.cc                            | 25 +++--
 gcc/shrink-wrap.h                             |  1 +
 .../riscv/zcmp_shrink_wrap_separate.c         | 97 +++++++++++++++++++
 .../riscv/zcmp_shrink_wrap_separate2.c        | 97 +++++++++++++++++++
 5 files changed, 228 insertions(+), 11 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/zcmp_shrink_wrap_separate.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zcmp_shrink_wrap_separate2.c
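
Note for reviewers: the option interaction described in the commit message
can be sketched in plain C as below. This is only a simplified model, not the
code in riscv.cc: struct fn_state and its fields are hypothetical stand-ins
for the flag, cfun and crtl checks, and the other conditions in
riscv_avoid_multi_push (interrupt handlers, varargs and so on) are left out.

```c
#include <stdbool.h>

/* Hypothetical, simplified per-function state.  Each field stands in for
   one of the checks made by the patch; none of these names exist in GCC.  */
struct fn_state
{
  bool shrink_wrapping_enabled; /* models SHRINK_WRAPPING_ENABLED and the
                                   presence of the target hook */
  bool flag_separate;           /* models flag_shrink_wrap_separate */
  bool optimize_for_speed;      /* models optimize_function_for_speed_p */
  bool strange_function;        /* models the alloca/eh_return/... checks */
  bool target_avoids_separate;  /* models
                                   riscv_avoid_shrink_wrapping_separate () */
};

/* Model of use_shrink_wrapping_separate (): true when the generic pass
   would go on to ask the target for separate components.  */
static bool
use_separate (const struct fn_state *f)
{
  return f->shrink_wrapping_enabled
         && f->flag_separate
         && f->optimize_for_speed
         && !f->strange_function;
}

/* Model of the condition added to riscv_avoid_multi_push (): give up
   cm.push/cm.pop only when separate shrink-wrapping will really run.  */
static bool
avoid_multi_push (const struct fn_state *f)
{
  return use_separate (f) && !f->target_avoids_separate;
}
```

Under this model, -Os (not optimizing for speed) and
-O2 -fno-shrink-wrap-separate both keep multi push/pop, while plain -O2
avoids it.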

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index f60c241a526..b505cdeca34 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -64,6 +64,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "cfghooks.h"
 #include "cfgloop.h"
 #include "cfgrtl.h"
+#include "shrink-wrap.h"
 #include "sel-sched.h"
 #include "fold-const.h"
 #include "gimple-iterator.h"
@@ -389,6 +390,7 @@ static const struct riscv_tune_param optimize_size_tune_info = {
   false,					/* use_divmod_expansion */
 };
 
+static bool riscv_avoid_shrink_wrapping_separate ();
 static tree riscv_handle_fndecl_attribute (tree *, tree, tree, int, bool *);
 static tree riscv_handle_type_attribute (tree *, tree, tree, int, bool *);
 
@@ -4910,6 +4912,8 @@ riscv_avoid_multi_push(const struct riscv_frame_info *frame)
       || cfun->machine->interrupt_handler_p
       || cfun->machine->varargs_size != 0
       || crtl->args.pretend_args_size != 0
+      || (use_shrink_wrapping_separate ()
+          && !riscv_avoid_shrink_wrapping_separate ())
       || (frame->mask & ~ MULTI_PUSH_GPR_MASK))
     return true;
 
@@ -6077,6 +6081,17 @@ riscv_epilogue_uses (unsigned int regno)
   return false;
 }
 
+static bool
+riscv_avoid_shrink_wrapping_separate ()
+{
+  if (riscv_use_save_libcall (&cfun->machine->frame)
+      || cfun->machine->interrupt_handler_p
+      || !cfun->machine->frame.gp_sp_offset.is_constant ())
+    return true;
+
+  return false;
+}
+
 /* Implement TARGET_SHRINK_WRAP_GET_SEPARATE_COMPONENTS.  */
 
 static sbitmap
@@ -6086,9 +6101,7 @@ riscv_get_separate_components (void)
   sbitmap components = sbitmap_alloc (FIRST_PSEUDO_REGISTER);
   bitmap_clear (components);
 
-  if (riscv_use_save_libcall (&cfun->machine->frame)
-      || cfun->machine->interrupt_handler_p
-      || !cfun->machine->frame.gp_sp_offset.is_constant ())
+  if (riscv_avoid_shrink_wrapping_separate ())
     return components;
 
   offset = cfun->machine->frame.gp_sp_offset.to_constant ();
diff --git a/gcc/shrink-wrap.cc b/gcc/shrink-wrap.cc
index b8d7b557130..d534964321a 100644
--- a/gcc/shrink-wrap.cc
+++ b/gcc/shrink-wrap.cc
@@ -1776,16 +1776,14 @@ insert_prologue_epilogue_for_components (sbitmap components)
   commit_edge_insertions ();
 }
 
-/* The main entry point to this subpass.  FIRST_BB is where the prologue
-   would be normally put.  */
-void
-try_shrink_wrapping_separate (basic_block first_bb)
+bool
+use_shrink_wrapping_separate (void)
 {
   if (!(SHRINK_WRAPPING_ENABLED
-	&& flag_shrink_wrap_separate
-	&& optimize_function_for_speed_p (cfun)
-	&& targetm.shrink_wrap.get_separate_components))
-    return;
+        && flag_shrink_wrap_separate
+        && optimize_function_for_speed_p (cfun)
+        && targetm.shrink_wrap.get_separate_components))
+    return false;
 
   /* We don't handle "strange" functions.  */
   if (cfun->calls_alloca
@@ -1794,6 +1792,17 @@ try_shrink_wrapping_separate (basic_block first_bb)
       || crtl->calls_eh_return
       || crtl->has_nonlocal_goto
       || crtl->saves_all_registers)
+    return false;
+
+  return true;
+}
+
+/* The main entry point to this subpass.  FIRST_BB is where the prologue
+   would be normally put.  */
+void
+try_shrink_wrapping_separate (basic_block first_bb)
+{
+  if (!use_shrink_wrapping_separate ())
     return;
 
   /* Ask the target what components there are.  If it returns NULL, don't
diff --git a/gcc/shrink-wrap.h b/gcc/shrink-wrap.h
index 161647711a3..82386c2b712 100644
--- a/gcc/shrink-wrap.h
+++ b/gcc/shrink-wrap.h
@@ -26,6 +26,7 @@ along with GCC; see the file COPYING3.  If not see
 extern bool requires_stack_frame_p (rtx_insn *, HARD_REG_SET, HARD_REG_SET);
 extern void try_shrink_wrapping (edge *entry_edge, rtx_insn *prologue_seq);
 extern void try_shrink_wrapping_separate (basic_block first_bb);
+extern bool use_shrink_wrapping_separate (void);
 #define SHRINK_WRAPPING_ENABLED \
   (flag_shrink_wrap && targetm.have_simple_return ())
 
diff --git a/gcc/testsuite/gcc.target/riscv/zcmp_shrink_wrap_separate.c b/gcc/testsuite/gcc.target/riscv/zcmp_shrink_wrap_separate.c
new file mode 100644
index 00000000000..11f87aee607
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zcmp_shrink_wrap_separate.c
@@ -0,0 +1,97 @@
+/* { dg-do compile } */
+/* { dg-options " -O2 -march=rv32imaf_zca_zcmp -mabi=ilp32f" } */
+/* { dg-skip-if "" { *-*-* } {"-O0" "-O1" "-Os" "-Og" "-O3" "-Oz" "-flto"} } */
+
+typedef struct MAT_PARAMS_S
+{
+    int     N;
+    signed short *A;
+    signed short *B;
+    signed int *C;
+} mat_params;
+
+typedef struct CORE_PORTABLE_S
+{
+    unsigned char portable_id;
+} core_portable;
+
+typedef struct RESULTS_S
+{
+    /* inputs */
+    signed short              seed1;       /* Initializing seed */
+    signed short              seed2;       /* Initializing seed */
+    signed short              seed3;       /* Initializing seed */
+    void *              memblock[4]; /* Pointer to safe memory location */
+    unsigned int              size;        /* Size of the data */
+    unsigned int              iterations;  /* Number of iterations to execute */
+    unsigned int              execs;       /* Bitmask of operations to execute */
+    struct list_head_s *list;
+    mat_params          mat;
+    /* outputs */
+    unsigned short crc;
+    unsigned short crclist;
+    unsigned short crcmatrix;
+    unsigned short crcstate;
+    signed short err;
+    /* multithread specific */
+    core_portable port;
+} core_results;
+
+extern signed short
+core_bench_state(unsigned int, void *, signed short, signed short, signed short, unsigned short);
+
+extern signed short
+core_bench_matrix(mat_params *, signed short, unsigned short);
+
+extern unsigned short
+crcu16(signed short, unsigned short);
+
+signed short
+calc_func(signed short *pdata, core_results *res)
+{
+    signed short data = *pdata;
+    signed short retval;
+    unsigned char  optype
+        = (data >> 7)
+          & 1;  /* bit 7 indicates if the function result has been cached */
+    if (optype) /* if cached, use cache */
+        return (data & 0x007f);
+    else
+    {                             /* otherwise calculate and cache the result */
+        signed short flag = data & 0x7; /* bits 0-2 is type of function to perform */
+        signed short dtype
+            = ((data >> 3)
+               & 0xf);       /* bits 3-6 is specific data for the operation */
+        dtype |= dtype << 4; /* replicate the lower 4 bits to get an 8b value */
+        switch (flag)
+        {
+            case 0:
+                if (dtype < 0x22) /* set min period for bit corruption */
+                    dtype = 0x22;
+                retval = core_bench_state(res->size,
+                                          res->memblock[3],
+                                          res->seed1,
+                                          res->seed2,
+                                          dtype,
+                                          res->crc);
+                if (res->crcstate == 0)
+                    res->crcstate = retval;
+                break;
+            case 1:
+                retval = core_bench_matrix(&(res->mat), dtype, res->crc);
+                if (res->crcmatrix == 0)
+                    res->crcmatrix = retval;
+                break;
+            default:
+                retval = data;
+                break;
+        }
+        res->crc = crcu16(retval, res->crc);
+        retval &= 0x007f;
+        *pdata = (data & 0xff00) | 0x0080 | retval; /* cache the result */
+        return retval;
+    }
+}
+
+/* { dg-final { scan-assembler-not "cm\.push" } } */
+
diff --git a/gcc/testsuite/gcc.target/riscv/zcmp_shrink_wrap_separate2.c b/gcc/testsuite/gcc.target/riscv/zcmp_shrink_wrap_separate2.c
new file mode 100644
index 00000000000..ec7e9c39b5d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zcmp_shrink_wrap_separate2.c
@@ -0,0 +1,97 @@
+/* { dg-do compile } */
+/* { dg-options " -O2 -fno-shrink-wrap-separate -march=rv32imaf_zca_zcmp -mabi=ilp32f" } */
+/* { dg-skip-if "" { *-*-* } {"-O0" "-O1" "-Os" "-Og" "-O3" "-Oz" "-flto"} } */
+
+typedef struct MAT_PARAMS_S
+{
+    int     N;
+    signed short *A;
+    signed short *B;
+    signed int *C;
+} mat_params;
+
+typedef struct CORE_PORTABLE_S
+{
+    unsigned char portable_id;
+} core_portable;
+
+typedef struct RESULTS_S
+{
+    /* inputs */
+    signed short              seed1;       /* Initializing seed */
+    signed short              seed2;       /* Initializing seed */
+    signed short              seed3;       /* Initializing seed */
+    void *              memblock[4]; /* Pointer to safe memory location */
+    unsigned int              size;        /* Size of the data */
+    unsigned int              iterations;  /* Number of iterations to execute */
+    unsigned int              execs;       /* Bitmask of operations to execute */
+    struct list_head_s *list;
+    mat_params          mat;
+    /* outputs */
+    unsigned short crc;
+    unsigned short crclist;
+    unsigned short crcmatrix;
+    unsigned short crcstate;
+    signed short err;
+    /* multithread specific */
+    core_portable port;
+} core_results;
+
+extern signed short
+core_bench_state(unsigned int, void *, signed short, signed short, signed short, unsigned short);
+
+extern signed short
+core_bench_matrix(mat_params *, signed short, unsigned short);
+
+extern unsigned short
+crcu16(signed short, unsigned short);
+
+signed short
+calc_func(signed short *pdata, core_results *res)
+{
+    signed short data = *pdata;
+    signed short retval;
+    unsigned char  optype
+        = (data >> 7)
+          & 1;  /* bit 7 indicates if the function result has been cached */
+    if (optype) /* if cached, use cache */
+        return (data & 0x007f);
+    else
+    {                             /* otherwise calculate and cache the result */
+        signed short flag = data & 0x7; /* bits 0-2 is type of function to perform */
+        signed short dtype
+            = ((data >> 3)
+               & 0xf);       /* bits 3-6 is specific data for the operation */
+        dtype |= dtype << 4; /* replicate the lower 4 bits to get an 8b value */
+        switch (flag)
+        {
+            case 0:
+                if (dtype < 0x22) /* set min period for bit corruption */
+                    dtype = 0x22;
+                retval = core_bench_state(res->size,
+                                          res->memblock[3],
+                                          res->seed1,
+                                          res->seed2,
+                                          dtype,
+                                          res->crc);
+                if (res->crcstate == 0)
+                    res->crcstate = retval;
+                break;
+            case 1:
+                retval = core_bench_matrix(&(res->mat), dtype, res->crc);
+                if (res->crcmatrix == 0)
+                    res->crcmatrix = retval;
+                break;
+            default:
+                retval = data;
+                break;
+        }
+        res->crc = crcu16(retval, res->crc);
+        retval &= 0x007f;
+        *pdata = (data & 0xff00) | 0x0080 | retval; /* cache the result */
+        return retval;
+    }
+}
+
+/* { dg-final { scan-assembler "cm\.push" } } */
+
-- 
2.17.1


* [PATCH 4/4] [RISC-V] support cm.mva01s cm.mvsa01 in zcmp
  2023-06-07  5:52 [PATCH 0/4] [RISC-V] support zcmp extention Fei Gao
                   ` (2 preceding siblings ...)
  2023-06-07  5:52 ` [PATCH 3/4] [RISC-V] resolve confilct between zcmp multi push/pop and shrink-wrap-separate Fei Gao
@ 2023-06-07  5:52 ` Fei Gao
  2023-07-13  8:18   ` Kito Cheng
  3 siblings, 1 reply; 17+ messages in thread
From: Fei Gao @ 2023-06-07  5:52 UTC (permalink / raw)
  To: gcc-patches; +Cc: kito.cheng, palmer, jeffreyalaw, sinan.lin, jiawei, Die Li

From: Die Li <lidie@eswincomputing.com>

Signed-off-by: Die Li <lidie@eswincomputing.com>
Co-Authored-By: Fei Gao <gaofei@eswincomputing.com>

gcc/ChangeLog:

        * config/riscv/peephole.md: New pattern.
        * config/riscv/predicates.md (a0a1_reg_operand): New predicate.
        (zcmp_mv_sreg_operand): New predicate.
        * config/riscv/riscv.md (A1_REGNUM): New constant.
        * config/riscv/zc.md (*mva01s<X:mode>): New pattern.
        (*mvsa01<X:mode>): New pattern.

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/cm_mv_rv32.c: New test.
---
 gcc/config/riscv/peephole.md                | 28 +++++++++++++++++++++
 gcc/config/riscv/predicates.md              | 11 ++++++++
 gcc/config/riscv/riscv.md                   |  1 +
 gcc/config/riscv/zc.md                      | 22 ++++++++++++++++
 gcc/testsuite/gcc.target/riscv/cm_mv_rv32.c | 21 ++++++++++++++++
 5 files changed, 83 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/cm_mv_rv32.c
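
Note for reviewers: the operand selection done by the *mva01s output template
can be modelled in plain C as below. This is an illustrative sketch only:
order_mva01s and is_zcmp_sreg are hypothetical helper names, the register
numbers are the ones from riscv.md, and only the non-RVE register set is
modelled.

```c
#include <stdbool.h>

/* Hard register numbers, as in riscv.md.  */
enum { S0 = 8, S1 = 9, A0 = 10, A1 = 11, S2 = 18, S7 = 23 };

/* Hypothetical model of zcmp_mv_sreg_operand for the non-RVE case:
   accepts s0-s1 and s2-s7.  */
static bool
is_zcmp_sreg (int regno)
{
  return (regno >= S0 && regno <= S1) || (regno >= S2 && regno <= S7);
}

/* Hypothetical model of the *mva01s pattern for the two moves
   dst0 <- src0 and dst1 <- src1.  On success, store the source for a0 in
   *FIRST and the source for a1 in *SECOND (the order in which they are
   printed) and return true.  */
static bool
order_mva01s (int dst0, int src0, int dst1, int src1, int *first, int *second)
{
  bool dsts_ok = (dst0 == A0 || dst0 == A1)
                 && (dst1 == A0 || dst1 == A1)
                 && dst0 != dst1;

  if (!dsts_ok || !is_zcmp_sreg (src0) || !is_zcmp_sreg (src1))
    return false;

  *first = (dst0 == A0) ? src0 : src1;
  *second = (dst0 == A0) ? src1 : src0;
  return true;
}
```

In this model the moves a0 <- s1 and a1 <- s2 give the operand order s1,s2
whichever move comes first, which matches the cm.mva01s s1,s2 line expected
in cm_mv_rv32.c.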

diff --git a/gcc/config/riscv/peephole.md b/gcc/config/riscv/peephole.md
index 67e7046d7e6..e8cb1ba4838 100644
--- a/gcc/config/riscv/peephole.md
+++ b/gcc/config/riscv/peephole.md
@@ -94,3 +94,31 @@
 {
   th_mempair_order_operands (operands, true, SImode);
 })
+
+;; ZCMP
+(define_peephole2
+  [(set (match_operand:X 0 "a0a1_reg_operand")
+        (match_operand:X 1 "zcmp_mv_sreg_operand"))
+   (set (match_operand:X 2 "a0a1_reg_operand")
+        (match_operand:X 3 "zcmp_mv_sreg_operand"))]
+  "TARGET_ZCMP
+   && (REGNO (operands[2]) != REGNO (operands[0]))"
+  [(parallel [(set (match_dup 0)
+                   (match_dup 1))
+              (set (match_dup 2)
+                   (match_dup 3))])]
+)
+
+(define_peephole2
+  [(set (match_operand:X 0 "zcmp_mv_sreg_operand")
+        (match_operand:X 1 "a0a1_reg_operand"))
+   (set (match_operand:X 2 "zcmp_mv_sreg_operand")
+        (match_operand:X 3 "a0a1_reg_operand"))]
+  "TARGET_ZCMP
+   && (REGNO (operands[0]) != REGNO (operands[2]))
+   && (REGNO (operands[1]) != REGNO (operands[3]))"
+  [(parallel [(set (match_dup 0)
+                   (match_dup 1))
+              (set (match_dup 2)
+                   (match_dup 3))])]
+)
diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
index a1b9367b997..6d5e8630cb5 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -207,6 +207,17 @@
   (and (match_code "const_int")
        (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 13)")))
 
+;; ZCMP predicates
+(define_predicate "a0a1_reg_operand"
+  (and (match_operand 0 "register_operand")
+       (match_test "IN_RANGE (REGNO (op), A0_REGNUM, A1_REGNUM)")))
+
+(define_predicate "zcmp_mv_sreg_operand"
+  (and (match_operand 0 "register_operand")
+       (match_test "TARGET_RVE ? IN_RANGE (REGNO (op), S0_REGNUM, S1_REGNUM)
+                    : IN_RANGE (REGNO (op), S0_REGNUM, S1_REGNUM)
+                    || IN_RANGE (REGNO (op), S2_REGNUM, S7_REGNUM)")))
+
 ;; Only use branch-on-bit sequences when the mask is not an ANDI immediate.
 (define_predicate "branch_on_bit_operand"
   (and (match_code "const_int")
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 02802d2685d..25bc3e6ab4c 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -121,6 +121,7 @@
    (S0_REGNUM			8)
    (S1_REGNUM			9)
    (A0_REGNUM			10)
+   (A1_REGNUM			11)
    (S2_REGNUM			18)
    (S3_REGNUM			19)
    (S4_REGNUM			20)
diff --git a/gcc/config/riscv/zc.md b/gcc/config/riscv/zc.md
index 217e115035b..bb4975cd333 100644
--- a/gcc/config/riscv/zc.md
+++ b/gcc/config/riscv/zc.md
@@ -1433,3 +1433,25 @@
   "TARGET_ZCMP"
   "cm.push	{ra, s0-s11}, %0"
 )
+
+;; ZCMP mv
+(define_insn "*mva01s<X:mode>"
+  [(set (match_operand:X 0 "a0a1_reg_operand" "=r")
+        (match_operand:X 1 "zcmp_mv_sreg_operand" "r"))
+   (set (match_operand:X 2 "a0a1_reg_operand" "=r")
+        (match_operand:X 3 "zcmp_mv_sreg_operand" "r"))]
+  "TARGET_ZCMP
+   && (REGNO (operands[2]) != REGNO (operands[0]))"
+  { return (REGNO (operands[0]) == A0_REGNUM)?"cm.mva01s\t%1,%3":"cm.mva01s\t%3,%1"; }
+  [(set_attr "mode" "<X:MODE>")])
+
+(define_insn "*mvsa01<X:mode>"
+  [(set (match_operand:X 0 "zcmp_mv_sreg_operand" "=r")
+        (match_operand:X 1 "a0a1_reg_operand" "r"))
+   (set (match_operand:X 2 "zcmp_mv_sreg_operand" "=r")
+        (match_operand:X 3 "a0a1_reg_operand" "r"))]
+  "TARGET_ZCMP
+   && (REGNO (operands[0]) != REGNO (operands[2]))
+   && (REGNO (operands[1]) != REGNO (operands[3]))"
+  { return (REGNO (operands[1]) == A0_REGNUM)?"cm.mvsa01\t%0,%2":"cm.mvsa01\t%2,%0"; }
+  [(set_attr "mode" "<X:MODE>")])
diff --git a/gcc/testsuite/gcc.target/riscv/cm_mv_rv32.c b/gcc/testsuite/gcc.target/riscv/cm_mv_rv32.c
new file mode 100644
index 00000000000..49c94c01603
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/cm_mv_rv32.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options " -Os -march=rv32i_zca_zcmp -mabi=ilp32 " } */
+/* { dg-skip-if "" { *-*-* } {"-O0" "-O1" "-O2" "-Og" "-O3" "-Oz" "-flto"} } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+int func (int a, int b);
+
+/*
+**sum:
+**	...
+**	cm.mvsa01	s1,s2
+**	call	func
+**	mv	s0,a0
+**	cm.mva01s	s1,s2
+**	call	func
+**	...
+*/
+int sum (int a, int b)
+{
+        return func (a, b) + func (a, b);
+}
-- 
2.17.1
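
The two predicates and the operand ordering in the `*mva01s` pattern above can be modelled outside of GCC. The following is only a minimal sketch, not code from the patch: the function names `a0a1_reg_p`, `zcmp_mv_sreg_p` and `mva01s_first_source` are hypothetical, the `rve` flag stands in for TARGET_RVE, and S7_REGNUM = 23 is assumed from the standard RISC-V register numbering (it is not shown in the hunk above).

```c
#include <assert.h>
#include <stdbool.h>

/* Register numbers as in riscv.md; S7_REGNUM is assumed to be x23.  */
enum
{
  S0_REGNUM = 8, S1_REGNUM = 9, A0_REGNUM = 10, A1_REGNUM = 11,
  S2_REGNUM = 18, S7_REGNUM = 23
};

/* Plain-C stand-in for GCC's IN_RANGE macro.  */
static bool
in_range (int value, int lo, int hi)
{
  return value >= lo && value <= hi;
}

/* Models a0a1_reg_operand: only a0 and a1 are accepted.  */
bool
a0a1_reg_p (int regno)
{
  return in_range (regno, A0_REGNUM, A1_REGNUM);
}

/* Models zcmp_mv_sreg_operand: s0-s1 always, s2-s7 only without RVE.  */
bool
zcmp_mv_sreg_p (int regno, bool rve)
{
  if (in_range (regno, S0_REGNUM, S1_REGNUM))
    return true;
  return !rve && in_range (regno, S2_REGNUM, S7_REGNUM);
}

/* Given the destination DST0 of the first move (a0 or a1) and the two
   source registers, return the source that is printed first, i.e. the
   one that ends up in a0.  */
int
mva01s_first_source (int dst0, int src0, int src1)
{
  assert (a0a1_reg_p (dst0));
  return dst0 == A0_REGNUM ? src0 : src1;
}
```

For example, if the first set in the parallel writes a1 from s2 and the second writes a0 from s1, the sketch picks s1 as the first printed operand, which corresponds to the `%3,%1` branch of the output template.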



* Re: [PATCH 1/4][V4][RISC-V] support cm.push cm.pop cm.popret in zcmp
  2023-06-07  5:52 ` [PATCH 1/4][V4][RISC-V] support cm.push cm.pop cm.popret in zcmp Fei Gao
@ 2023-06-07 10:11   ` jiawei
  2023-08-16  8:33   ` Kito Cheng
  1 sibling, 0 replies; 17+ messages in thread
From: jiawei @ 2023-06-07 10:11 UTC (permalink / raw)
  To: Fei Gao; +Cc: gcc-patches, kito.cheng, palmer, jeffreyalaw, sinan.lin

There seem to be some indentation problems in the patch; could you fix them? :)

```
patch:509: indent with spaces.
          x_save_size = riscv_stack_align (num_multi_push * UNITS_PER_WORD);
error: patch failed: gcc/config/riscv/riscv.cc:5652
error: gcc/config/riscv/riscv.cc: patch does not apply
```
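
The "indent with spaces" message above looks like the output of git's whitespace check when applying the patch. As a rough sketch of how such a line could be detected before sending, the hypothetical helper below flags a line whose leading whitespace ends in a run of at least `tab_width` spaces. Both the function name and the exact rule (run of spaces after the last tab, tab width 8) are assumptions for illustration, not a description of git's implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if the leading whitespace of LINE ends in a run of at
   least TAB_WIDTH consecutive spaces.  LINE is the text of an added
   line without the leading '+' of the diff.  */
bool
indent_with_spaces_p (const char *line, size_t tab_width)
{
  assert (tab_width > 0);

  size_t run = 0;
  for (const char *p = line; *p == ' ' || *p == '\t'; p++)
    /* A tab resets the run; only spaces after the last tab count.  */
    run = (*p == '\t') ? 0 : run + 1;

  return run >= tab_width;
}
```

With a tab width of 8, a line indented by ten spaces is flagged, while a line indented by one tab followed by two spaces is not.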

&gt; -----Original Message-----
&gt; From: "Fei Gao" <gaofei@eswincomputing.com>
&gt; Sent: 2023-06-07 13:52:12 (Wednesday)
&gt; To: gcc-patches@gcc.gnu.org
&gt; Cc: kito.cheng@gmail.com, palmer@dabbelt.com, jeffreyalaw@gmail.com, sinan.lin@linux.alibaba.com, jiawei@iscas.ac.cn, "Fei Gao" <gaofei@eswincomputing.com>
&gt; Subject: [PATCH 1/4][V4][RISC-V] support cm.push cm.pop cm.popret in zcmp
&gt; 
&gt; Zcmp can share the same logic as save-restore in stack allocation: pre-allocation
&gt; by cm.push, step 1 and step 2.
&gt; 
&gt; Please note that cm.push pushes ra, s0-s11 in the reverse order of what save-restore does,
&gt; so the .cfi directives have been adapted accordingly in this patch.
&gt; 
&gt; Signed-off-by: Fei Gao <gaofei@eswincomputing.com>
&gt; 
&gt; gcc/ChangeLog:
&gt; 
&gt;         * config/riscv/iterators.md
&gt;         slot0_offset: slot 0 offset in stack GPRs area in bytes
&gt;         slot1_offset: slot 1 offset in stack GPRs area in bytes
&gt;         slot2_offset: likewise
&gt;         slot3_offset: likewise
&gt;         slot4_offset: likewise
&gt;         slot5_offset: likewise
&gt;         slot6_offset: likewise
&gt;         slot7_offset: likewise
&gt;         slot8_offset: likewise
&gt;         slot9_offset: likewise
&gt;         slot10_offset: likewise
&gt;         slot11_offset: likewise
&gt;         slot12_offset: likewise
&gt;         * config/riscv/predicates.md
&gt;         (stack_push_up_to_ra_operand): predicates of stack adjust pushing ra
&gt;         (stack_push_up_to_s0_operand): predicates of stack adjust pushing ra, s0
&gt;         (stack_push_up_to_s1_operand): likewise
&gt;         (stack_push_up_to_s2_operand): likewise
&gt;         (stack_push_up_to_s3_operand): likewise
&gt;         (stack_push_up_to_s4_operand): likewise
&gt;         (stack_push_up_to_s5_operand): likewise
&gt;         (stack_push_up_to_s6_operand): likewise
&gt;         (stack_push_up_to_s7_operand): likewise
&gt;         (stack_push_up_to_s8_operand): likewise
&gt;         (stack_push_up_to_s9_operand): likewise
&gt;         (stack_push_up_to_s11_operand): likewise
&gt;         (stack_pop_up_to_ra_operand): predicates of stack adjust popping ra
&gt;         (stack_pop_up_to_s0_operand): predicates of stack adjust popping ra, s0
&gt;         (stack_pop_up_to_s1_operand): likewise
&gt;         (stack_pop_up_to_s2_operand): likewise
&gt;         (stack_pop_up_to_s3_operand): likewise
&gt;         (stack_pop_up_to_s4_operand): likewise
&gt;         (stack_pop_up_to_s5_operand): likewise
&gt;         (stack_pop_up_to_s6_operand): likewise
&gt;         (stack_pop_up_to_s7_operand): likewise
&gt;         (stack_pop_up_to_s8_operand): likewise
&gt;         (stack_pop_up_to_s9_operand): likewise
&gt;         (stack_pop_up_to_s11_operand): likewise
&gt;         * config/riscv/riscv-protos.h
&gt;         (riscv_zcmp_valid_stack_adj_bytes_p): declaration
&gt;         * config/riscv/riscv.cc (struct riscv_frame_info): comment change
&gt;         (riscv_avoid_multi_push): helper function of riscv_use_multi_push
&gt;         (riscv_use_multi_push): true if multi push is used
&gt;         (riscv_multi_push_sregs_count): num of sregs in multi-push
&gt;         (riscv_multi_push_regs_count): num of regs in multi-push
&gt;         (riscv_16bytes_align): align to 16 bytes
&gt;         (riscv_stack_align): moved to a better place
&gt;         (riscv_save_libcall_count): no functional change
&gt;         (riscv_compute_frame_info): add zcmp frame info
&gt;         (riscv_adjust_multi_push_cfi_prologue): adjust cfi for cm.push
&gt;         (riscv_gen_multi_push_pop_insn): gen function for multi push and pop
&gt;         (riscv_expand_prologue): allocate stack by cm.push
&gt;         (riscv_adjust_multi_pop_cfi_epilogue): adjust cfi for cm.pop[ret]
&gt;         (riscv_expand_epilogue): deallocate stack by cm.pop[ret]
&gt;         (zcmp_base_adj): calculate stack adjustment base size
&gt;         (zcmp_additional_adj): calculate stack adjustment additional size
&gt;         (riscv_zcmp_valid_stack_adj_bytes_p): check if stack adjustment valid
&gt;         * config/riscv/riscv.h (RETURN_ADDR_MASK): mask of ra
&gt;         (S0_MASK): likewise
&gt;         (S1_MASK): likewise
&gt;         (S2_MASK): likewise
&gt;         (S3_MASK): likewise
&gt;         (S4_MASK): likewise
&gt;         (S5_MASK): likewise
&gt;         (S6_MASK): likewise
&gt;         (S7_MASK): likewise
&gt;         (S8_MASK): likewise
&gt;         (S9_MASK): likewise
&gt;         (S10_MASK): likewise
&gt;         (S11_MASK): likewise
&gt;         (MULTI_PUSH_GPR_MASK): GPR_MASK that cm.push can cover at most
&gt;         (ZCMP_MAX_SPIMM): max spimm value
&gt;         (ZCMP_SP_INC_STEP): zcmp sp increment step
&gt;         (ZCMP_INVALID_S0S10_SREGS_COUNTS): num of s0-s10
&gt;         (ZCMP_S0S11_SREGS_COUNTS): num of s0-s11
&gt;         (ZCMP_MAX_GRP_SLOTS): max slots of pushing and popping in zcmp
&gt;         * config/riscv/riscv.md: include zc.md
&gt;         * config/riscv/zc.md: New file. machine description for zcmp
&gt; 
&gt; gcc/testsuite/ChangeLog:
&gt; 
&gt;         * gcc.target/riscv/rv32e_zcmp.c: New test.
&gt;         * gcc.target/riscv/rv32i_zcmp.c: New test.
&gt;         * gcc.target/riscv/zcmp_stack_alignment.c: New test.
&gt; ---
&gt;  gcc/config/riscv/iterators.md                 |   15 +
&gt;  gcc/config/riscv/predicates.md                |   96 ++
&gt;  gcc/config/riscv/riscv-protos.h               |    1 +
&gt;  gcc/config/riscv/riscv.cc                     |  360 +++++-
&gt;  gcc/config/riscv/riscv.h                      |   23 +
&gt;  gcc/config/riscv/riscv.md                     |    2 +
&gt;  gcc/config/riscv/zc.md                        | 1042 +++++++++++++++++
&gt;  gcc/testsuite/gcc.target/riscv/rv32e_zcmp.c   |  239 ++++
&gt;  gcc/testsuite/gcc.target/riscv/rv32i_zcmp.c   |  239 ++++
&gt;  .../gcc.target/riscv/zcmp_stack_alignment.c   |   23 +
&gt;  10 files changed, 2000 insertions(+), 40 deletions(-)
&gt;  create mode 100644 gcc/config/riscv/zc.md
&gt;  create mode 100644 gcc/testsuite/gcc.target/riscv/rv32e_zcmp.c
&gt;  create mode 100644 gcc/testsuite/gcc.target/riscv/rv32i_zcmp.c
&gt;  create mode 100644 gcc/testsuite/gcc.target/riscv/zcmp_stack_alignment.c
&gt; 
&gt; diff --git a/gcc/config/riscv/iterators.md b/gcc/config/riscv/iterators.md
&gt; index d374a10810c..6ed4174f9cc 100644
&gt; --- a/gcc/config/riscv/iterators.md
&gt; +++ b/gcc/config/riscv/iterators.md
&gt; @@ -120,6 +120,21 @@
&gt;  (define_mode_attr shiftm1 [(SI "const_si_mask_operand") (DI "const_di_mask_operand")])
&gt;  (define_mode_attr shiftm1p [(SI "DsS") (DI "DsD")])
&gt;  
&gt; +; zcmp mode attribute
&gt; +(define_mode_attr slot0_offset  [(SI "-4")  (DI "-8")])
&gt; +(define_mode_attr slot1_offset  [(SI "-8")  (DI "-16")])
&gt; +(define_mode_attr slot2_offset  [(SI "-12") (DI "-24")])
&gt; +(define_mode_attr slot3_offset  [(SI "-16") (DI "-32")])
&gt; +(define_mode_attr slot4_offset  [(SI "-20") (DI "-40")])
&gt; +(define_mode_attr slot5_offset  [(SI "-24") (DI "-48")])
&gt; +(define_mode_attr slot6_offset  [(SI "-28") (DI "-56")])
&gt; +(define_mode_attr slot7_offset  [(SI "-32") (DI "-64")])
&gt; +(define_mode_attr slot8_offset  [(SI "-36") (DI "-72")])
&gt; +(define_mode_attr slot9_offset  [(SI "-40") (DI "-80")])
&gt; +(define_mode_attr slot10_offset [(SI "-44") (DI "-88")])
&gt; +(define_mode_attr slot11_offset [(SI "-48") (DI "-96")])
&gt; +(define_mode_attr slot12_offset [(SI "-52") (DI "-104")])
&gt; +
&gt;  ;; -------------------------------------------------------------------
&gt;  ;; Code Iterators
&gt;  ;; -------------------------------------------------------------------
&gt; diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
&gt; index 04ca6ceabc7..ab67b3332f0 100644
&gt; --- a/gcc/config/riscv/predicates.md
&gt; +++ b/gcc/config/riscv/predicates.md
&gt; @@ -65,6 +65,102 @@
&gt;    (ior (match_operand 0 "const_0_operand")
&gt;         (match_operand 0 "register_operand")))
&gt;  
&gt; +(define_predicate "stack_push_up_to_ra_operand"
&gt; +  (and (match_code "const_int")
&gt; +       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 1)")))
&gt; +
&gt; +(define_predicate "stack_push_up_to_s0_operand"
&gt; +  (and (match_code "const_int")
&gt; +       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 2)")))
&gt; +
&gt; +(define_predicate "stack_push_up_to_s1_operand"
&gt; +  (and (match_code "const_int")
&gt; +       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 3)")))
&gt; +
&gt; +(define_predicate "stack_push_up_to_s2_operand"
&gt; +  (and (match_code "const_int")
&gt; +       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 4)")))
&gt; +
&gt; +(define_predicate "stack_push_up_to_s3_operand"
&gt; +  (and (match_code "const_int")
&gt; +       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 5)")))
&gt; +
&gt; +(define_predicate "stack_push_up_to_s4_operand"
&gt; +  (and (match_code "const_int")
&gt; +       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 6)")))
&gt; +
&gt; +(define_predicate "stack_push_up_to_s5_operand"
&gt; +  (and (match_code "const_int")
&gt; +       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 7)")))
&gt; +
&gt; +(define_predicate "stack_push_up_to_s6_operand"
&gt; +  (and (match_code "const_int")
&gt; +       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 8)")))
&gt; +
&gt; +(define_predicate "stack_push_up_to_s7_operand"
&gt; +  (and (match_code "const_int")
&gt; +       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 9)")))
&gt; +
&gt; +(define_predicate "stack_push_up_to_s8_operand"
&gt; +  (and (match_code "const_int")
&gt; +       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 10)")))
&gt; +
&gt; +(define_predicate "stack_push_up_to_s9_operand"
&gt; +  (and (match_code "const_int")
&gt; +       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 11)")))
&gt; +
&gt; +(define_predicate "stack_push_up_to_s11_operand"
&gt; +  (and (match_code "const_int")
&gt; +       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op) * -1, 13)")))
&gt; +
&gt; +(define_predicate "stack_pop_up_to_ra_operand"
&gt; +  (and (match_code "const_int")
&gt; +       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 1)")))
&gt; +
&gt; +(define_predicate "stack_pop_up_to_s0_operand"
&gt; +  (and (match_code "const_int")
&gt; +       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 2)")))
&gt; +
&gt; +(define_predicate "stack_pop_up_to_s1_operand"
&gt; +  (and (match_code "const_int")
&gt; +       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 3)")))
&gt; +
&gt; +(define_predicate "stack_pop_up_to_s2_operand"
&gt; +  (and (match_code "const_int")
&gt; +       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 4)")))
&gt; +
&gt; +(define_predicate "stack_pop_up_to_s3_operand"
&gt; +  (and (match_code "const_int")
&gt; +       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 5)")))
&gt; +
&gt; +(define_predicate "stack_pop_up_to_s4_operand"
&gt; +  (and (match_code "const_int")
&gt; +       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 6)")))
&gt; +
&gt; +(define_predicate "stack_pop_up_to_s5_operand"
&gt; +  (and (match_code "const_int")
&gt; +       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 7)")))
&gt; +
&gt; +(define_predicate "stack_pop_up_to_s6_operand"
&gt; +  (and (match_code "const_int")
&gt; +       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 8)")))
&gt; +
&gt; +(define_predicate "stack_pop_up_to_s7_operand"
&gt; +  (and (match_code "const_int")
&gt; +       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 9)")))
&gt; +
&gt; +(define_predicate "stack_pop_up_to_s8_operand"
&gt; +  (and (match_code "const_int")
&gt; +       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 10)")))
&gt; +
&gt; +(define_predicate "stack_pop_up_to_s9_operand"
&gt; +  (and (match_code "const_int")
&gt; +       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 11)")))
&gt; +
&gt; +(define_predicate "stack_pop_up_to_s11_operand"
&gt; +  (and (match_code "const_int")
&gt; +       (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 13)")))
&gt; +
&gt;  ;; Only use branch-on-bit sequences when the mask is not an ANDI immediate.
&gt;  (define_predicate "branch_on_bit_operand"
&gt;    (and (match_code "const_int")
&gt; diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
&gt; index 00e1b20c6c6..f23b11622a2 100644
&gt; --- a/gcc/config/riscv/riscv-protos.h
&gt; +++ b/gcc/config/riscv/riscv-protos.h
&gt; @@ -56,6 +56,7 @@ extern bool riscv_split_64bit_move_p (rtx, rtx);
&gt;  extern void riscv_split_doubleword_move (rtx, rtx);
&gt;  extern const char *riscv_output_move (rtx, rtx);
&gt;  extern const char *riscv_output_return ();
&gt; +extern bool riscv_zcmp_valid_stack_adj_bytes_p(HOST_WIDE_INT, int);
&gt;  
&gt;  #ifdef RTX_CODE
&gt;  extern void riscv_expand_int_scc (rtx, enum rtx_code, rtx, rtx);
&gt; diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
&gt; index 3954c89a039..c476c699f4c 100644
&gt; --- a/gcc/config/riscv/riscv.cc
&gt; +++ b/gcc/config/riscv/riscv.cc
&gt; @@ -126,6 +126,14 @@ struct GTY(())  riscv_frame_info {
&gt;    /* How much the GPR save/restore routines adjust sp (or 0 if unused).  */
&gt;    unsigned save_libcall_adjustment;
&gt;  
&gt; +  /* the minimum number of bytes, in multiples of 16-byte address increments,
&gt; +     required to cover the registers in a multi push &amp; pop.  */
&gt; +  unsigned multi_push_adj_base;
&gt; +
&gt; +  /* the number of additional 16-byte address increments allocated for the stack frame
&gt; +     in a multi push &amp; pop.  */
&gt; +  unsigned multi_push_adj_addi;
&gt; +
&gt;    /* Offsets of fixed-point and floating-point save areas from frame bottom */
&gt;    poly_int64 gp_sp_offset;
&gt;    poly_int64 fp_sp_offset;
&gt; @@ -422,6 +430,16 @@ static const struct riscv_tune_info riscv_tune_info_table[] = {
&gt;  #include "riscv-cores.def"
&gt;  };
&gt;  
&gt; +typedef enum
&gt; +{
&gt; +  PUSH_IDX = 0,
&gt; +  POP_IDX,
&gt; +  POPRET_IDX,
&gt; +  ZCMP_OP_NUM
&gt; +} riscv_zcmp_op_t;
&gt; +
&gt; +typedef insn_code (* code_for_push_pop_t)(machine_mode);
&gt; +
&gt;  void riscv_frame_info::reset(void)
&gt;  {
&gt;    total_size = 0;
&gt; @@ -4876,6 +4894,37 @@ riscv_save_reg_p (unsigned int regno)
&gt;    return false;
&gt;  }
&gt;  
&gt; +/* Return TRUE if Zcmp push and pop insns should be
&gt; +   avoided. FALSE otherwise.
&gt; +   Only use multi push &amp; pop if all GPRs masked can be covered,
&gt; +   and stack access is SP based,
&gt; +   and GPRs are at top of the stack frame,
&gt; +   and no conflicts in stack allocation with other features  */
&gt; +static bool
&gt; +riscv_avoid_multi_push(const struct riscv_frame_info *frame)
&gt; +{
&gt; +  if (!TARGET_ZCMP
&gt; +      || crtl-&gt;calls_eh_return
&gt; +      || frame_pointer_needed
&gt; +      || cfun-&gt;machine-&gt;interrupt_handler_p
&gt; +      || cfun-&gt;machine-&gt;varargs_size != 0
&gt; +      || crtl-&gt;args.pretend_args_size != 0
&gt; +      || (frame-&gt;mask &amp; ~ MULTI_PUSH_GPR_MASK))
&gt; +    return true;
&gt; +
&gt; +  return false;
&gt; +}
&gt; +
&gt; +/* Determine whether to use multi push insn.  */
&gt; +static bool
&gt; +riscv_use_multi_push(const struct riscv_frame_info *frame)
&gt; +{
&gt; +  if (riscv_avoid_multi_push (frame))
&gt; +    return false;
&gt; +
&gt; +  return (frame-&gt;multi_push_adj_base != 0);
&gt; +}
&gt; +
&gt;  /* Return TRUE if a libcall to save/restore GPRs should be
&gt;     avoided.  FALSE otherwise.  */
&gt;  static bool
&gt; @@ -4913,6 +4962,51 @@ riscv_save_libcall_count (unsigned mask)
&gt;    abort ();
&gt;  }
&gt;  
&gt; +/* calculate number of s regs in multi push and pop.
&gt; +   Note that {s0-s10} is not valid in Zcmp, use {s0-s11} instead.  */
&gt; +static unsigned
&gt; +riscv_multi_push_sregs_count (unsigned mask)
&gt; +{
&gt; +  unsigned num = riscv_save_libcall_count (mask);
&gt; +  return (num == ZCMP_INVALID_S0S10_SREGS_COUNTS)
&gt; +    ? ZCMP_S0S11_SREGS_COUNTS
&gt; +    : num;
&gt; +}
&gt; +
&gt; +/* calculate number of regs(ra, s0-sx) in multi push and pop.  */
&gt; +static unsigned
&gt; +riscv_multi_push_regs_count (unsigned mask)
&gt; +{
&gt; +  /* 1 is for ra  */
&gt; +  return riscv_multi_push_sregs_count (mask) + 1;
&gt; +}
&gt; +
&gt; +/* Handle 16 bytes align for poly_int.  */
&gt; +static poly_int64
&gt; +riscv_16bytes_align (poly_int64 value)
&gt; +{
&gt; +  return aligned_upper_bound (value, 16);
&gt; +}
&gt; +
&gt; +static HOST_WIDE_INT
&gt; +riscv_16bytes_align (HOST_WIDE_INT value)
&gt; +{
&gt; +  return ROUND_UP(value, 16);
&gt; +}
&gt; +
&gt; +/* Handle stack align for poly_int.  */
&gt; +static poly_int64
&gt; +riscv_stack_align (poly_int64 value)
&gt; +{
&gt; +  return aligned_upper_bound (value, PREFERRED_STACK_BOUNDARY / 8);
&gt; +}
&gt; +
&gt; +static HOST_WIDE_INT
&gt; +riscv_stack_align (HOST_WIDE_INT value)
&gt; +{
&gt; +  return RISCV_STACK_ALIGN (value);
&gt; +}
&gt; +
&gt;  /* Populate the current function's riscv_frame_info structure.
&gt;  
&gt;     RISC-V stack frames grown downward.  High addresses are at the top.
&gt; @@ -4938,7 +5032,7 @@ riscv_save_libcall_count (unsigned mask)
&gt;  	|  GPR save area                |       + UNITS_PER_WORD
&gt;  	|                               |
&gt;  	+-------------------------------+ &lt;-- stack_pointer_rtx + fp_sp_offset
&gt; -	|                               |       + UNITS_PER_HWVALUE
&gt; +	|                               |       + UNITS_PER_FP_REG
&gt;  	|  FPR save area                |
&gt;  	|                               |
&gt;  	+-------------------------------+ &lt;-- frame_pointer_rtx (virtual)
&gt; @@ -4957,19 +5051,6 @@ riscv_save_libcall_count (unsigned mask)
&gt;  
&gt;  static HOST_WIDE_INT riscv_first_stack_step (struct riscv_frame_info *frame, poly_int64 remaining_size);
&gt;  
&gt; -/* Handle stack align for poly_int.  */
&gt; -static poly_int64
&gt; -riscv_stack_align (poly_int64 value)
&gt; -{
&gt; -  return aligned_upper_bound (value, PREFERRED_STACK_BOUNDARY / 8);
&gt; -}
&gt; -
&gt; -static HOST_WIDE_INT
&gt; -riscv_stack_align (HOST_WIDE_INT value)
&gt; -{
&gt; -  return RISCV_STACK_ALIGN (value);
&gt; -}
&gt; -
&gt;  static void
&gt;  riscv_compute_frame_info (void)
&gt;  {
&gt; @@ -5017,8 +5098,9 @@ riscv_compute_frame_info (void)
&gt;    if (frame-&gt;mask)
&gt;      {
&gt;        x_save_size = riscv_stack_align (num_x_saved * UNITS_PER_WORD);
&gt; -      unsigned num_save_restore = 1 + riscv_save_libcall_count (frame-&gt;mask);
&gt;  
&gt; +      /* 1 is for ra  */
&gt; +      unsigned num_save_restore = 1 + riscv_save_libcall_count (frame-&gt;mask);
&gt;        /* Only use save/restore routines if they don't alter the stack size.  */
&gt;        if (riscv_stack_align (num_save_restore * UNITS_PER_WORD) == x_save_size
&gt;            &amp;&amp; !riscv_avoid_save_libcall ())
&gt; @@ -5030,6 +5112,15 @@ riscv_compute_frame_info (void)
&gt;  
&gt;  	  frame-&gt;save_libcall_adjustment = x_save_size;
&gt;  	}
&gt; +
&gt; +      if (!riscv_avoid_multi_push (frame))
&gt; +        {
&gt; +          /* num(ra, s0-sx)  */
&gt; +          unsigned num_multi_push =
&gt; +            riscv_multi_push_regs_count (frame-&gt;mask);
&gt; +          x_save_size = riscv_stack_align (num_multi_push * UNITS_PER_WORD);
&gt; +          frame-&gt;multi_push_adj_base = riscv_16bytes_align (x_save_size);
&gt; +        }
&gt;      }
&gt;  
&gt;    /* At the bottom of the frame are any outgoing stack arguments. */
&gt; @@ -5044,7 +5135,15 @@ riscv_compute_frame_info (void)
&gt;    frame-&gt;fp_sp_offset = offset - UNITS_PER_FP_REG;
&gt;    /* Next are the callee-saved GPRs. */
&gt;    if (frame-&gt;mask)
&gt; -    offset += x_save_size;
&gt; +    {
&gt; +      offset += x_save_size;
&gt; +      /* Align to 16 bytes and add padding to the GPR part to honor
&gt; +         both stack alignment and zcmp push/pop size alignment.  */
&gt; +      if (riscv_use_multi_push (frame)
&gt; +          &amp;&amp; known_lt(offset,
&gt; +                      frame-&gt;multi_push_adj_base + ZCMP_SP_INC_STEP * ZCMP_MAX_SPIMM))
&gt; +        offset = riscv_16bytes_align (offset);
&gt; +    }
&gt;    frame-&gt;gp_sp_offset = offset - UNITS_PER_WORD;
&gt;    /* The hard frame pointer points above the callee-saved GPRs. */
&gt;    frame-&gt;hard_frame_pointer_offset = offset;
&gt; @@ -5388,6 +5487,42 @@ riscv_adjust_libcall_cfi_prologue ()
&gt;    return dwarf;
&gt;  }
&gt;  
&gt; +static rtx
&gt; +riscv_adjust_multi_push_cfi_prologue (int saved_size)
&gt; +{
&gt; +  rtx dwarf = NULL_RTX;
&gt; +  rtx adjust_sp_rtx, reg, mem, insn;
&gt; +  unsigned int mask = cfun-&gt;machine-&gt;frame.mask;
&gt; +  int offset;
&gt; +  int saved_cnt = 0;
&gt; +
&gt; +  if (mask &amp; S10_MASK)
&gt; +    mask |= S11_MASK;
&gt; +
&gt; +  for (int regno = GP_REG_LAST; regno &gt;= GP_REG_FIRST; regno--)
&gt; +    if (BITSET_P (mask &amp; MULTI_PUSH_GPR_MASK, regno - GP_REG_FIRST))
&gt; +      {
&gt; +        /* The save order is s11-s0, ra
&gt; +           from high to low addr.  */
&gt; +        offset = saved_size - UNITS_PER_WORD * (++saved_cnt);
&gt; +
&gt; +        reg = gen_rtx_REG (Pmode, regno);
&gt; +        mem = gen_frame_mem (Pmode, plus_constant (Pmode,
&gt; +                                                   stack_pointer_rtx,
&gt; +                                                   offset));
&gt; +
&gt; +        insn = gen_rtx_SET (mem, reg);
&gt; +        dwarf = alloc_reg_note (REG_CFA_OFFSET, insn, dwarf);
&gt; +      }
&gt; +
&gt; +  /* Debug info for adjust sp.  */
&gt; +  adjust_sp_rtx = gen_rtx_SET (stack_pointer_rtx,
&gt; +                               plus_constant(Pmode, stack_pointer_rtx, -saved_size));
&gt; +  dwarf = alloc_reg_note (REG_CFA_ADJUST_CFA, adjust_sp_rtx,
&gt; +                          dwarf);
&gt; +  return dwarf;
&gt; +}
&gt; +
&gt;  static void
&gt;  riscv_emit_stack_tie (void)
&gt;  {
&gt; @@ -5397,6 +5532,45 @@ riscv_emit_stack_tie (void)
&gt;      emit_insn (gen_stack_tiedi (stack_pointer_rtx, hard_frame_pointer_rtx));
&gt;  }
&gt;  
&gt; +/*zcmp multi push and pop code_for_push_pop function ptr array  */
&gt; +const code_for_push_pop_t code_for_push_pop [ZCMP_MAX_GRP_SLOTS][ZCMP_OP_NUM] = {
&gt; +  {code_for_gpr_multi_push_up_to_ra,    code_for_gpr_multi_pop_up_to_ra,
&gt; +   code_for_gpr_multi_popret_up_to_ra},
&gt; +  {code_for_gpr_multi_push_up_to_s0,    code_for_gpr_multi_pop_up_to_s0,
&gt; +   code_for_gpr_multi_popret_up_to_s0},
&gt; +  {code_for_gpr_multi_push_up_to_s1,    code_for_gpr_multi_pop_up_to_s1,
&gt; +   code_for_gpr_multi_popret_up_to_s1},
&gt; +  {code_for_gpr_multi_push_up_to_s2,    code_for_gpr_multi_pop_up_to_s2,
&gt; +   code_for_gpr_multi_popret_up_to_s2},
&gt; +  {code_for_gpr_multi_push_up_to_s3,    code_for_gpr_multi_pop_up_to_s3,
&gt; +   code_for_gpr_multi_popret_up_to_s3},
&gt; +  {code_for_gpr_multi_push_up_to_s4,    code_for_gpr_multi_pop_up_to_s4,
&gt; +   code_for_gpr_multi_popret_up_to_s4},
&gt; +  {code_for_gpr_multi_push_up_to_s5,    code_for_gpr_multi_pop_up_to_s5,
&gt; +   code_for_gpr_multi_popret_up_to_s5},
&gt; +  {code_for_gpr_multi_push_up_to_s6,    code_for_gpr_multi_pop_up_to_s6,
&gt; +   code_for_gpr_multi_popret_up_to_s6},
&gt; +  {code_for_gpr_multi_push_up_to_s7,    code_for_gpr_multi_pop_up_to_s7,
&gt; +   code_for_gpr_multi_popret_up_to_s7},
&gt; +  {code_for_gpr_multi_push_up_to_s8,    code_for_gpr_multi_pop_up_to_s8,
&gt; +   code_for_gpr_multi_popret_up_to_s8},
&gt; +  {code_for_gpr_multi_push_up_to_s9,    code_for_gpr_multi_pop_up_to_s9,
&gt; +   code_for_gpr_multi_popret_up_to_s9},
&gt; +  {nullptr, nullptr, nullptr},
&gt; +  {code_for_gpr_multi_push_up_to_s11,   code_for_gpr_multi_pop_up_to_s11,
&gt; +   code_for_gpr_multi_popret_up_to_s11}};
&gt; +
&gt; +static rtx
&gt; +riscv_gen_multi_push_pop_insn (riscv_zcmp_op_t op, HOST_WIDE_INT adj_size,
&gt; +                               unsigned int regs_num)
&gt; +{
&gt; +  gcc_assert (op &lt; ZCMP_OP_NUM);
&gt; +  gcc_assert (regs_num &lt;= ZCMP_MAX_GRP_SLOTS
&gt; +              &amp;&amp; regs_num != ZCMP_INVALID_S0S10_SREGS_COUNTS + 1); /* 1 for ra*/
&gt; +  rtx stack_adj = GEN_INT (adj_size);
&gt; +  return GEN_FCN (code_for_push_pop[regs_num - 1][op] (Pmode)) (stack_adj);
&gt; +}
&gt; +
&gt;  /* Expand the "prologue" pattern.  */
&gt;  
&gt;  void
&gt; @@ -5405,7 +5579,8 @@ riscv_expand_prologue (void)
&gt;    struct riscv_frame_info *frame = &amp;cfun-&gt;machine-&gt;frame;
&gt;    poly_int64 remaining_size = frame-&gt;total_size;
&gt;    unsigned mask = frame-&gt;mask;
&gt; -  rtx insn;
&gt; +  int spimm, multi_push_additional, stack_adj;
&gt; +  rtx insn, dwarf = NULL_RTX;
&gt;  
&gt;    if (flag_stack_usage_info)
&gt;      current_function_static_stack_size = constant_lower_bound (remaining_size);
&gt; @@ -5413,8 +5588,35 @@ riscv_expand_prologue (void)
&gt;    if (cfun-&gt;machine-&gt;naked_p)
&gt;      return;
&gt;  
&gt; +  /* Prefer multi-push to save-restore libcall.  */
&gt; +  if (riscv_use_multi_push(frame))
&gt; +    {
&gt; +      remaining_size -= frame-&gt;multi_push_adj_base;
&gt; +      if (known_gt(remaining_size, 2 * ZCMP_SP_INC_STEP))
&gt; +        spimm = 3;
&gt; +      else if (known_gt(remaining_size, ZCMP_SP_INC_STEP))
&gt; +        spimm = 2;
&gt; +      else if (known_gt(remaining_size, 0))
&gt; +        spimm = 1;
&gt; +      else
&gt; +        spimm = 0;
&gt; +      multi_push_additional = spimm * ZCMP_SP_INC_STEP;
&gt; +      frame-&gt;multi_push_adj_addi = multi_push_additional;
&gt; +      remaining_size -= multi_push_additional;
&gt; +
&gt; +      /* emit multi push insn &amp; dwarf along with it.  */
&gt; +      stack_adj = frame-&gt;multi_push_adj_base + multi_push_additional;
&gt; +      insn = emit_insn (riscv_gen_multi_push_pop_insn(PUSH_IDX,
&gt; +        -stack_adj, riscv_multi_push_regs_count(frame-&gt;mask)));
&gt; +      dwarf = riscv_adjust_multi_push_cfi_prologue (stack_adj);
&gt; +      RTX_FRAME_RELATED_P (insn) = 1;
&gt; +      REG_NOTES (insn) = dwarf;
&gt; +
&gt; +      /* Temporarily fib that we need not save GPRs.  */
&gt; +      frame-&gt;mask = 0;
&gt; +    }
&gt;    /* When optimizing for size, call a subroutine to save the registers.  */
&gt; -  if (riscv_use_save_libcall (frame))
&gt; +  else if (riscv_use_save_libcall (frame))
&gt;      {
&gt;        rtx dwarf = NULL_RTX;
&gt;        dwarf = riscv_adjust_libcall_cfi_prologue ();
&gt; @@ -5430,13 +5632,15 @@ riscv_expand_prologue (void)
&gt;    /* Save the registers.  */
&gt;    if ((frame-&gt;mask | frame-&gt;fmask) != 0)
&gt;      {
&gt; -      HOST_WIDE_INT step1 = riscv_first_stack_step (frame, remaining_size);
&gt; -
&gt; -      insn = gen_add3_insn (stack_pointer_rtx,
&gt; -			    stack_pointer_rtx,
&gt; -			    GEN_INT (-step1));
&gt; -      RTX_FRAME_RELATED_P (emit_insn (insn)) = 1;
&gt; -      remaining_size -= step1;
&gt; +      if (known_gt (remaining_size, frame-&gt;frame_pointer_offset))
&gt; +        {
&gt; +          HOST_WIDE_INT step1 = riscv_first_stack_step (frame, remaining_size);
&gt; +          remaining_size -= step1;
&gt; +          insn = gen_add3_insn (stack_pointer_rtx,
&gt; +                                stack_pointer_rtx,
&gt; +                                GEN_INT (-step1));
&gt; +          RTX_FRAME_RELATED_P (emit_insn (insn)) = 1;
&gt; +        }
&gt;        riscv_for_each_saved_reg (remaining_size, riscv_save_reg, false, false);
&gt;      }
&gt;  
&gt; @@ -5493,6 +5697,32 @@ riscv_expand_prologue (void)
&gt;      }
&gt;  }
&gt;  
&gt; +static rtx
&gt; +riscv_adjust_multi_pop_cfi_epilogue (int saved_size)
&gt; +{
&gt; +  rtx dwarf = NULL_RTX;
&gt; +  rtx adjust_sp_rtx, reg;
&gt; +  unsigned int mask = cfun-&gt;machine-&gt;frame.mask;
&gt; +
&gt; +  if (mask &amp; S10_MASK)
&gt; +    mask |= S11_MASK;
&gt; +
&gt; +  /* Debug info for adjust sp.  */
&gt; +  adjust_sp_rtx = gen_rtx_SET (stack_pointer_rtx,
&gt; +                               plus_constant(Pmode, stack_pointer_rtx, saved_size));
&gt; +  dwarf = alloc_reg_note (REG_CFA_ADJUST_CFA, adjust_sp_rtx,
&gt; +                          dwarf);
&gt; +
&gt; +  for (int regno = GP_REG_FIRST; regno &lt;= GP_REG_LAST; regno++)
&gt; +    if (BITSET_P (mask, regno - GP_REG_FIRST))
&gt; +      {
&gt; +        reg = gen_rtx_REG (Pmode, regno);
&gt; +        dwarf = alloc_reg_note (REG_CFA_RESTORE, reg, dwarf);
&gt; +      }
&gt; +
&gt; +  return dwarf;
&gt; +}
&gt; +
&gt;  static rtx
&gt;  riscv_adjust_libcall_cfi_epilogue ()
&gt;  {
&gt; @@ -5532,10 +5762,18 @@ riscv_expand_epilogue (int style)
&gt;    struct riscv_frame_info *frame = &amp;cfun-&gt;machine-&gt;frame;
&gt;    unsigned mask = frame-&gt;mask;
&gt;    HOST_WIDE_INT step2 = 0;
&gt; -  bool use_restore_libcall = ((style == NORMAL_RETURN)
&gt; -			      &amp;&amp; riscv_use_save_libcall (frame));
&gt; -  unsigned libcall_size = (use_restore_libcall
&gt; -			   ? frame-&gt;save_libcall_adjustment : 0);
&gt; +  bool use_multi_pop_normal = ((style == NORMAL_RETURN)
&gt; +                              &amp;&amp; riscv_use_multi_push (frame));
&gt; +  bool use_multi_pop_sibcall = ((style == SIBCALL_RETURN)
&gt; +                              &amp;&amp; riscv_use_multi_push (frame));
&gt; +  bool use_multi_pop = use_multi_pop_normal || use_multi_pop_sibcall;
&gt; +
&gt; +  bool use_restore_libcall = !use_multi_pop &amp;&amp; ((style == NORMAL_RETURN)
&gt; +                              &amp;&amp; riscv_use_save_libcall (frame));
&gt; +  unsigned libcall_size = use_restore_libcall &amp;&amp; !use_multi_pop ?
&gt; +                            frame-&gt;save_libcall_adjustment : 0;
&gt; +  unsigned multipop_size = use_multi_pop ?
&gt; +                            frame-&gt;multi_push_adj_base + frame-&gt;multi_push_adj_addi : 0;
&gt;    rtx ra = gen_rtx_REG (Pmode, RETURN_ADDR_REGNUM);
&gt;    rtx insn;
&gt;  
&gt; @@ -5606,18 +5844,25 @@ riscv_expand_epilogue (int style)
&gt;        REG_NOTES (insn) = dwarf;
&gt;      }
&gt;  
&gt; -  if (use_restore_libcall)
&gt; -    frame-&gt;mask = 0; /* Temporarily fib for GPRs.  */
&gt; +  if (use_restore_libcall || use_multi_pop)
&gt; +    frame-&gt;mask = 0; /* Temporarily fib that we need not save GPRs.  */
&gt;  
&gt;    /* If we need to restore registers, deallocate as much stack as
&gt;       possible in the second step without going out of range.  */
&gt; -  if ((frame-&gt;mask | frame-&gt;fmask) != 0)
&gt; +  if (use_multi_pop)
&gt; +    {
&gt; +      if (frame-&gt;fmask
&gt; +          &amp;&amp; known_gt (frame-&gt;total_size - multipop_size,
&gt; +                      frame-&gt;frame_pointer_offset))
&gt; +        step2 = riscv_first_stack_step (frame, frame-&gt;total_size - multipop_size);
&gt; +    }
&gt; +  else if ((frame-&gt;mask | frame-&gt;fmask) != 0)
&gt;      step2 = riscv_first_stack_step (frame, frame-&gt;total_size - libcall_size);
&gt;  
&gt; -  if (use_restore_libcall)
&gt; +  if (use_restore_libcall || use_multi_pop)
&gt;      frame-&gt;mask = mask; /* Undo the above fib.  */
&gt;  
&gt; -  poly_int64 step1 = frame-&gt;total_size - step2 - libcall_size;
&gt; +  poly_int64 step1 = frame-&gt;total_size - step2 - libcall_size - multipop_size;
&gt;  
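
To make the arithmetic in this hunk concrete: in the multi-pop case the frame is released in three pieces, with cm.pop/cm.popret taking multipop_size, an explicit sp adjustment taking step2, and step1 covering whatever remains. A minimal sketch with plain integers instead of poly_int64 (the struct and function names are hypothetical and not part of the patch):

```c
#include <assert.h>

/* Hypothetical illustration of the three-way frame split used by the
   epilogue when cm.pop/cm.popret is selected (libcall_size is 0 then).  */
struct epilogue_split
{
  long step1; /* first sp adjustment, before registers are restored */
  long step2; /* second sp adjustment, after the remaining restores */
  long pop;   /* bytes released by cm.pop / cm.popret itself */
};

static struct epilogue_split
split_frame (long total_size, long step2, long multipop_size)
{
  /* The pieces must never exceed the frame they are carved from.  */
  assert (total_size >= step2 + multipop_size);
  struct epilogue_split s
    = { total_size - step2 - multipop_size, step2, multipop_size };
  return s;
}
```

For example, a 96-byte frame with step2 = 16 and multipop_size = 32 leaves step1 = 48, and the three pieces always sum back to total_size.
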
&gt;    /* Set TARGET to BASE + STEP1.  */
&gt;    if (known_gt (step1, 0))
&gt; @@ -5652,7 +5897,7 @@ riscv_expand_epilogue (int style)
&gt;  					   adjust));
&gt;  	  rtx dwarf = NULL_RTX;
&gt;  	  rtx cfa_adjust_rtx = gen_rtx_PLUS (Pmode, stack_pointer_rtx,
&gt; -					     GEN_INT (step2 + libcall_size));
&gt; +					     GEN_INT (step2 + libcall_size + multipop_size));
&gt;  
&gt;  	  dwarf = alloc_reg_note (REG_CFA_DEF_CFA, cfa_adjust_rtx, dwarf);
&gt;  	  RTX_FRAME_RELATED_P (insn) = 1;
&gt; @@ -5667,15 +5912,15 @@ riscv_expand_epilogue (int style)
&gt;        epilogue_cfa_sp_offset = step2;
&gt;      }
&gt;  
&gt; -  if (use_restore_libcall)
&gt; +  if (use_restore_libcall || use_multi_pop)
&gt;      frame-&gt;mask = 0; /* Temporarily fib that we need not save GPRs.  */
&gt;  
&gt;    /* Restore the registers.  */
&gt; -  riscv_for_each_saved_reg (frame-&gt;total_size - step2 - libcall_size,
&gt; +  riscv_for_each_saved_reg (frame-&gt;total_size - step2 - libcall_size - multipop_size,
&gt;  			    riscv_restore_reg,
&gt;  			    true, style == EXCEPTION_RETURN);
&gt;  
&gt; -  if (use_restore_libcall)
&gt; +  if (use_restore_libcall || use_multi_pop)
&gt;        frame-&gt;mask = mask; /* Undo the above fib.  */
&gt;  
&gt;    if (need_barrier_p)
&gt; @@ -5689,14 +5934,30 @@ riscv_expand_epilogue (int style)
&gt;  
&gt;        rtx dwarf = NULL_RTX;
&gt;        rtx cfa_adjust_rtx = gen_rtx_PLUS (Pmode, stack_pointer_rtx,
&gt; -					 GEN_INT (libcall_size));
&gt; +					 GEN_INT (libcall_size + multipop_size));
&gt;        dwarf = alloc_reg_note (REG_CFA_DEF_CFA, cfa_adjust_rtx, dwarf);
&gt;        RTX_FRAME_RELATED_P (insn) = 1;
&gt;  
&gt;        REG_NOTES (insn) = dwarf;
&gt;      }
&gt;  
&gt; -  if (use_restore_libcall)
&gt; +  if (use_multi_pop)
&gt; +    {
&gt; +      unsigned regs_count = riscv_multi_push_regs_count (frame-&gt;mask);
&gt; +      if (use_multi_pop_normal)
&gt; +        insn = emit_jump_insn (
&gt; +          riscv_gen_multi_push_pop_insn (POPRET_IDX, multipop_size, regs_count));
&gt; +      else
&gt; +        insn = emit_insn (
&gt; +          riscv_gen_multi_push_pop_insn (POP_IDX, multipop_size, regs_count));
&gt; +
&gt; +      rtx dwarf = riscv_adjust_multi_pop_cfi_epilogue (multipop_size);
&gt; +      RTX_FRAME_RELATED_P (insn) = 1;
&gt; +      REG_NOTES (insn) = dwarf;
&gt; +      if (use_multi_pop_normal)
&gt; +        return;
&gt; +    }
&gt; +  else if (use_restore_libcall)
&gt;      {
&gt;        rtx dwarf = riscv_adjust_libcall_cfi_epilogue ();
&gt;        insn = emit_insn (gen_gpr_restore (GEN_INT (riscv_save_libcall_count (mask))));
&gt; @@ -6980,6 +7241,25 @@ riscv_gen_gpr_save_insn (struct riscv_frame_info *frame)
&gt;    return gen_rtx_PARALLEL (VOIDmode, vec);
&gt;  }
&gt;  
&gt; +static HOST_WIDE_INT zcmp_base_adj(int regs_num)
&gt; +{
&gt; +  return riscv_16bytes_align ((regs_num) * GET_MODE_SIZE (word_mode));
&gt; +}
&gt; +
&gt; +static HOST_WIDE_INT zcmp_additional_adj(HOST_WIDE_INT total, int regs_num)
&gt; +{
&gt; +  return total - zcmp_base_adj(regs_num);
&gt; +}
&gt; +
&gt; +bool riscv_zcmp_valid_stack_adj_bytes_p (HOST_WIDE_INT total, int regs_num)
&gt; +{
&gt; +  HOST_WIDE_INT additional_bytes = zcmp_additional_adj(total, regs_num);
&gt; +  return additional_bytes == 0
&gt; +         || additional_bytes == 1 * ZCMP_SP_INC_STEP
&gt; +         || additional_bytes == 2 * ZCMP_SP_INC_STEP
&gt; +         || additional_bytes == ZCMP_MAX_SPIMM * ZCMP_SP_INC_STEP;
&gt; +}
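
As a reading aid for the check above: the stack adjustment is valid when it equals the 16-byte-aligned register save area plus zero to three extra 16-byte steps. The following is a standalone sketch, not the patch's code. It assumes riscv_16bytes_align rounds up to the next multiple of 16 (its definition is not quoted here), and the names base_adj, valid_stack_adj, SP_INC_STEP and MAX_SPIMM are hypothetical stand-ins for the patch's helpers and macros.

```c
#include <assert.h>
#include <stdbool.h>

#define SP_INC_STEP 16 /* stand-in for ZCMP_SP_INC_STEP */
#define MAX_SPIMM 3    /* stand-in for ZCMP_MAX_SPIMM */

/* Size of the register save area, rounded up to a 16-byte multiple.
   ASSUMPTION: this mirrors what riscv_16bytes_align does.  */
static long
base_adj (int regs_num, int word_bytes)
{
  assert (regs_num > 0 && word_bytes > 0);
  long bytes = (long) regs_num * word_bytes;
  return (bytes + 15) & ~15L;
}

/* True if TOTAL is the base adjustment plus 0..MAX_SPIMM extra steps.  */
static bool
valid_stack_adj (long total, int regs_num, int word_bytes)
{
  long extra = total - base_adj (regs_num, word_bytes);
  return extra >= 0
         && extra % SP_INC_STEP == 0
         && extra / SP_INC_STEP <= MAX_SPIMM;
}
```

With one 4-byte register the base is 16, so 16, 32, 48 and 64 are accepted while 24 and 80 are rejected. Written as a range test, the sketch accepts exactly the four values that the quoted function enumerates.
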
&gt; +
&gt;  /* Return true if it's valid gpr_save pattern.  */
&gt;  
&gt;  bool
&gt; diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
&gt; index 4541255a8ae..2fa555dce2d 100644
&gt; --- a/gcc/config/riscv/riscv.h
&gt; +++ b/gcc/config/riscv/riscv.h
&gt; @@ -420,6 +420,29 @@ ASM_MISA_SPEC
&gt;  #define RISCV_CALL_ADDRESS_TEMP(MODE) \
&gt;    gen_rtx_REG (MODE, RISCV_CALL_ADDRESS_TEMP_REGNUM)
&gt;  
&gt; +#define RETURN_ADDR_MASK        ( 1 &lt;&lt; RETURN_ADDR_REGNUM)
&gt; +#define S0_MASK                 ( 1 &lt;&lt; S0_REGNUM)
&gt; +#define S1_MASK                 ( 1 &lt;&lt; S1_REGNUM)
&gt; +#define S2_MASK                 ( 1 &lt;&lt; S2_REGNUM)
&gt; +#define S3_MASK                 ( 1 &lt;&lt; S3_REGNUM)
&gt; +#define S4_MASK                 ( 1 &lt;&lt; S4_REGNUM)
&gt; +#define S5_MASK                 ( 1 &lt;&lt; S5_REGNUM)
&gt; +#define S6_MASK                 ( 1 &lt;&lt; S6_REGNUM)
&gt; +#define S7_MASK                 ( 1 &lt;&lt; S7_REGNUM)
&gt; +#define S8_MASK                 ( 1 &lt;&lt; S8_REGNUM)
&gt; +#define S9_MASK                 ( 1 &lt;&lt; S9_REGNUM)
&gt; +#define S10_MASK                ( 1 &lt;&lt; S10_REGNUM)
&gt; +#define S11_MASK                ( 1 &lt;&lt; S11_REGNUM)
&gt; +
&gt; +#define MULTI_PUSH_GPR_MASK ( RETURN_ADDR_MASK | S0_MASK | S1_MASK | S2_MASK  | S3_MASK \
&gt; +                                               | S4_MASK | S5_MASK | S6_MASK  | S7_MASK \
&gt; +                                               | S8_MASK | S9_MASK | S10_MASK | S11_MASK )
&gt; +#define ZCMP_MAX_SPIMM 3
&gt; +#define ZCMP_SP_INC_STEP 16
&gt; +#define ZCMP_INVALID_S0S10_SREGS_COUNTS 11
&gt; +#define ZCMP_S0S11_SREGS_COUNTS 12
&gt; +#define ZCMP_MAX_GRP_SLOTS 13
&gt; +
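
For reference, MULTI_PUSH_GPR_MASK selects ra and s0-s11. A small standalone sketch of the resulting bit pattern follows, using the standard RISC-V integer register numbers (ra = x1, s0 = x8, s1 = x9, s2-s11 = x18-x27). The helper names are hypothetical and the sketch assumes GCC's *_REGNUM constants match those x-register numbers for GPRs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Standard RISC-V x-register numbers for the registers cm.push can save.  */
enum { RA = 1, S0 = 8, S1 = 9, S2 = 18, S11 = 27 };

/* Single-bit mask for integer register REGNO.  */
static uint32_t
reg_bit (int regno)
{
  assert (regno >= 0 && regno < 32);
  return UINT32_C (1) << regno;
}

/* Hypothetical equivalent of MULTI_PUSH_GPR_MASK: ra, s0, s1, s2..s11.  */
static uint32_t
multi_push_gpr_mask (void)
{
  uint32_t mask = reg_bit (RA) | reg_bit (S0) | reg_bit (S1);
  for (int r = S2; r <= S11; r++)
    mask |= reg_bit (r);
  return mask;
}

/* True if REGNO is one of the registers cm.push/cm.pop can handle.  */
static bool
in_multi_push_set (int regno)
{
  return (multi_push_gpr_mask () & reg_bit (regno)) != 0;
}
```

Under those assumptions the mask works out to 0x0FFC0302: 13 bits set, which lines up with ZCMP_MAX_GRP_SLOTS being 13.
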
&gt;  #define MCOUNT_NAME "_mcount"
&gt;  
&gt;  #define NO_PROFILE_COUNTERS 1
&gt; diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
&gt; index be960583101..c858b3bc9ef 100644
&gt; --- a/gcc/config/riscv/riscv.md
&gt; +++ b/gcc/config/riscv/riscv.md
&gt; @@ -113,6 +113,7 @@
&gt;  
&gt;  (define_constants
&gt;    [(RETURN_ADDR_REGNUM		1)
&gt; +   (SP_REGNUM 			2)
&gt;     (GP_REGNUM 			3)
&gt;     (TP_REGNUM			4)
&gt;     (T0_REGNUM			5)
&gt; @@ -3163,3 +3164,4 @@
&gt;  (include "sifive-7.md")
&gt;  (include "thead.md")
&gt;  (include "vector.md")
&gt; +(include "zc.md")
&gt; diff --git a/gcc/config/riscv/zc.md b/gcc/config/riscv/zc.md
&gt; new file mode 100644
&gt; index 00000000000..5c1bf031b8d
&gt; --- /dev/null
&gt; +++ b/gcc/config/riscv/zc.md
&gt; @@ -0,0 +1,1042 @@
&gt; +;; Machine description for RISC-V Zc extension.
&gt; +;; Copyright (C) 2023 Free Software Foundation, Inc.
&gt; +;; Contributed by Fei Gao (gaofei@eswincomputing.com).
&gt; +
&gt; +;; This file is part of GCC.
&gt; +
&gt; +;; GCC is free software; you can redistribute it and/or modify
&gt; +;; it under the terms of the GNU General Public License as published by
&gt; +;; the Free Software Foundation; either version 3, or (at your option)
&gt; +;; any later version.
&gt; +
&gt; +;; GCC is distributed in the hope that it will be useful,
&gt; +;; but WITHOUT ANY WARRANTY; without even the implied warranty of
&gt; +;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
&gt; +;; GNU General Public License for more details.
&gt; +
&gt; +;; You should have received a copy of the GNU General Public License
&gt; +;; along with GCC; see the file COPYING3.  If not see
&gt; +;; <http://www.gnu.org/licenses/>.
&gt; +
&gt; +(define_insn "@gpr_multi_pop_up_to_ra_<mode>"
&gt; +  [(set (reg:X SP_REGNUM)
&gt; +        (plus:X (reg:X SP_REGNUM)
&gt; +                 (match_operand 0 "stack_pop_up_to_ra_operand" "I")))
&gt; +   (set (reg:X RETURN_ADDR_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot0_offset>))))]
&gt; +  "TARGET_ZCMP"
&gt; +  "cm.pop	{ra}, %0"
&gt; +)
&gt; +
&gt; +(define_insn "@gpr_multi_pop_up_to_s0_<mode>"
&gt; +  [(set (reg:X SP_REGNUM)
&gt; +        (plus:X (reg:X SP_REGNUM)
&gt; +                 (match_operand 0 "stack_pop_up_to_s0_operand" "I")))
&gt; +   (set (reg:X S0_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot0_offset>))))
&gt; +   (set (reg:X RETURN_ADDR_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot1_offset>))))]
&gt; +  "TARGET_ZCMP"
&gt; +  "cm.pop	{ra, s0}, %0"
&gt; +)
&gt; +
&gt; +(define_insn "@gpr_multi_pop_up_to_s1_<mode>"
&gt; +  [(set (reg:X SP_REGNUM)
&gt; +        (plus:X (reg:X SP_REGNUM)
&gt; +                 (match_operand 0 "stack_pop_up_to_s1_operand" "I")))
&gt; +   (set (reg:X S1_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot0_offset>))))
&gt; +   (set (reg:X S0_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot1_offset>))))
&gt; +   (set (reg:X RETURN_ADDR_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot2_offset>))))]
&gt; +  "TARGET_ZCMP"
&gt; +  "cm.pop	{ra, s0-s1}, %0"
&gt; +)
&gt; +
&gt; +(define_insn "@gpr_multi_pop_up_to_s2_<mode>"
&gt; +  [(set (reg:X SP_REGNUM)
&gt; +        (plus:X (reg:X SP_REGNUM)
&gt; +                 (match_operand 0 "stack_pop_up_to_s2_operand" "I")))
&gt; +   (set (reg:X S2_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot0_offset>))))
&gt; +   (set (reg:X S1_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot1_offset>))))
&gt; +   (set (reg:X S0_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot2_offset>))))
&gt; +   (set (reg:X RETURN_ADDR_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot3_offset>))))]
&gt; +  "TARGET_ZCMP"
&gt; +  "cm.pop	{ra, s0-s2}, %0"
&gt; +)
&gt; +
&gt; +(define_insn "@gpr_multi_pop_up_to_s3_<mode>"
&gt; +  [(set (reg:X SP_REGNUM)
&gt; +        (plus:X (reg:X SP_REGNUM)
&gt; +                 (match_operand 0 "stack_pop_up_to_s3_operand" "I")))
&gt; +   (set (reg:X S3_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot0_offset>))))
&gt; +   (set (reg:X S2_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot1_offset>))))
&gt; +   (set (reg:X S1_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot2_offset>))))
&gt; +   (set (reg:X S0_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot3_offset>))))
&gt; +   (set (reg:X RETURN_ADDR_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot4_offset>))))]
&gt; +  "TARGET_ZCMP"
&gt; +  "cm.pop	{ra, s0-s3}, %0"
&gt; +)
&gt; +
&gt; +(define_insn "@gpr_multi_pop_up_to_s4_<mode>"
&gt; +  [(set (reg:X SP_REGNUM)
&gt; +        (plus:X (reg:X SP_REGNUM)
&gt; +                 (match_operand 0 "stack_pop_up_to_s4_operand" "I")))
&gt; +   (set (reg:X S4_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot0_offset>))))
&gt; +   (set (reg:X S3_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot1_offset>))))
&gt; +   (set (reg:X S2_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot2_offset>))))
&gt; +   (set (reg:X S1_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot3_offset>))))
&gt; +   (set (reg:X S0_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot4_offset>))))
&gt; +   (set (reg:X RETURN_ADDR_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot5_offset>))))]
&gt; +  "TARGET_ZCMP"
&gt; +  "cm.pop	{ra, s0-s4}, %0"
&gt; +)
&gt; +
&gt; +(define_insn "@gpr_multi_pop_up_to_s5_<mode>"
&gt; +  [(set (reg:X SP_REGNUM)
&gt; +        (plus:X (reg:X SP_REGNUM)
&gt; +                 (match_operand 0 "stack_pop_up_to_s5_operand" "I")))
&gt; +   (set (reg:X S5_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot0_offset>))))
&gt; +   (set (reg:X S4_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot1_offset>))))
&gt; +   (set (reg:X S3_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot2_offset>))))
&gt; +   (set (reg:X S2_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot3_offset>))))
&gt; +   (set (reg:X S1_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot4_offset>))))
&gt; +   (set (reg:X S0_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot5_offset>))))
&gt; +   (set (reg:X RETURN_ADDR_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot6_offset>))))]
&gt; +  "TARGET_ZCMP"
&gt; +  "cm.pop	{ra, s0-s5}, %0"
&gt; +)
&gt; +
&gt; +(define_insn "@gpr_multi_pop_up_to_s6_<mode>"
&gt; +  [(set (reg:X SP_REGNUM)
&gt; +        (plus:X (reg:X SP_REGNUM)
&gt; +                 (match_operand 0 "stack_pop_up_to_s6_operand" "I")))
&gt; +   (set (reg:X S6_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot0_offset>))))
&gt; +   (set (reg:X S5_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot1_offset>))))
&gt; +   (set (reg:X S4_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot2_offset>))))
&gt; +   (set (reg:X S3_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot3_offset>))))
&gt; +   (set (reg:X S2_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot4_offset>))))
&gt; +   (set (reg:X S1_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot5_offset>))))
&gt; +   (set (reg:X S0_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot6_offset>))))
&gt; +   (set (reg:X RETURN_ADDR_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot7_offset>))))]
&gt; +  "TARGET_ZCMP"
&gt; +  "cm.pop	{ra, s0-s6}, %0"
&gt; +)
&gt; +
&gt; +(define_insn "@gpr_multi_pop_up_to_s7_<mode>"
&gt; +  [(set (reg:X SP_REGNUM)
&gt; +        (plus:X (reg:X SP_REGNUM)
&gt; +                 (match_operand 0 "stack_pop_up_to_s7_operand" "I")))
&gt; +   (set (reg:X S7_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot0_offset>))))
&gt; +   (set (reg:X S6_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot1_offset>))))
&gt; +   (set (reg:X S5_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot2_offset>))))
&gt; +   (set (reg:X S4_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot3_offset>))))
&gt; +   (set (reg:X S3_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot4_offset>))))
&gt; +   (set (reg:X S2_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot5_offset>))))
&gt; +   (set (reg:X S1_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot6_offset>))))
&gt; +   (set (reg:X S0_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot7_offset>))))
&gt; +   (set (reg:X RETURN_ADDR_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                      (const_int <slot8_offset>))))]
&gt; +  "TARGET_ZCMP"
&gt; +  "cm.pop	{ra, s0-s7}, %0"
&gt; +)
&gt; +
&gt; +(define_insn "@gpr_multi_pop_up_to_s8_<mode>"
&gt; +  [(set (reg:X SP_REGNUM)
&gt; +        (plus:X (reg:X SP_REGNUM)
&gt; +                 (match_operand 0 "stack_pop_up_to_s8_operand" "I")))
&gt; +   (set (reg:X S8_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot0_offset>))))
&gt; +   (set (reg:X S7_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot1_offset>))))
&gt; +   (set (reg:X S6_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot2_offset>))))
&gt; +   (set (reg:X S5_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot3_offset>))))
&gt; +   (set (reg:X S4_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot4_offset>))))
&gt; +   (set (reg:X S3_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot5_offset>))))
&gt; +   (set (reg:X S2_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot6_offset>))))
&gt; +   (set (reg:X S1_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot7_offset>))))
&gt; +   (set (reg:X S0_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot8_offset>))))
&gt; +   (set (reg:X RETURN_ADDR_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot9_offset>))))]
&gt; +  "TARGET_ZCMP"
&gt; +  "cm.pop	{ra, s0-s8}, %0"
&gt; +)
&gt; +
&gt; +(define_insn "@gpr_multi_pop_up_to_s9_<mode>"
&gt; +  [(set (reg:X SP_REGNUM)
&gt; +        (plus:X (reg:X SP_REGNUM)
&gt; +                 (match_operand 0 "stack_pop_up_to_s9_operand" "I")))
&gt; +   (set (reg:X S9_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot0_offset>))))
&gt; +   (set (reg:X S8_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot1_offset>))))
&gt; +   (set (reg:X S7_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot2_offset>))))
&gt; +   (set (reg:X S6_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot3_offset>))))
&gt; +   (set (reg:X S5_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot4_offset>))))
&gt; +   (set (reg:X S4_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot5_offset>))))
&gt; +   (set (reg:X S3_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot6_offset>))))
&gt; +   (set (reg:X S2_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot7_offset>))))
&gt; +   (set (reg:X S1_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                      (const_int <slot8_offset>))))
&gt; +   (set (reg:X S0_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot9_offset>))))
&gt; +   (set (reg:X RETURN_ADDR_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot10_offset>))))]
&gt; +  "TARGET_ZCMP"
&gt; +  "cm.pop	{ra, s0-s9}, %0"
&gt; +)
&gt; +
&gt; +(define_insn "@gpr_multi_pop_up_to_s11_<mode>"
&gt; +  [(set (reg:X SP_REGNUM)
&gt; +        (plus:X (reg:X SP_REGNUM)
&gt; +                 (match_operand 0 "stack_pop_up_to_s11_operand" "I")))
&gt; +   (set (reg:X S11_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot0_offset>))))
&gt; +   (set (reg:X S10_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot1_offset>))))
&gt; +   (set (reg:X S9_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot2_offset>))))
&gt; +   (set (reg:X S8_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot3_offset>))))
&gt; +   (set (reg:X S7_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot4_offset>))))
&gt; +   (set (reg:X S6_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot5_offset>))))
&gt; +   (set (reg:X S5_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot6_offset>))))
&gt; +   (set (reg:X S4_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot7_offset>))))
&gt; +   (set (reg:X S3_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                      (const_int <slot8_offset>))))
&gt; +   (set (reg:X S2_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot9_offset>))))
&gt; +   (set (reg:X S1_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot10_offset>))))
&gt; +   (set (reg:X S0_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot11_offset>))))
&gt; +   (set (reg:X RETURN_ADDR_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot12_offset>))))]
&gt; +  "TARGET_ZCMP"
&gt; +  "cm.pop	{ra, s0-s11}, %0"
&gt; +)
&gt; +
&gt; +(define_insn "@gpr_multi_popret_up_to_ra_<mode>"
&gt; +  [(set (reg:X SP_REGNUM)
&gt; +        (plus:X (reg:X SP_REGNUM)
&gt; +                 (match_operand 0 "stack_pop_up_to_ra_operand" "I")))
&gt; +   (set (reg:X RETURN_ADDR_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot0_offset>))))
&gt; +   (return)
&gt; +   (use (reg:SI RETURN_ADDR_REGNUM))]
&gt; +  "TARGET_ZCMP"
&gt; +  "cm.popret	{ra}, %0"
&gt; +)
&gt; +
&gt; +(define_insn "@gpr_multi_popret_up_to_s0_<mode>"
&gt; +  [(set (reg:X SP_REGNUM)
&gt; +        (plus:X (reg:X SP_REGNUM)
&gt; +                 (match_operand 0 "stack_pop_up_to_s0_operand" "I")))
&gt; +   (set (reg:X S0_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot0_offset>))))
&gt; +   (set (reg:X RETURN_ADDR_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot1_offset>))))
&gt; +   (return)
&gt; +   (use (reg:SI RETURN_ADDR_REGNUM))]
&gt; +  "TARGET_ZCMP"
&gt; +  "cm.popret	{ra, s0}, %0"
&gt; +)
&gt; +
&gt; +(define_insn "@gpr_multi_popret_up_to_s1_<mode>"
&gt; +  [(set (reg:X SP_REGNUM)
&gt; +        (plus:X (reg:X SP_REGNUM)
&gt; +                 (match_operand 0 "stack_pop_up_to_s1_operand" "I")))
&gt; +   (set (reg:X S1_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot0_offset>))))
&gt; +   (set (reg:X S0_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot1_offset>))))
&gt; +   (set (reg:X RETURN_ADDR_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot2_offset>))))
&gt; +   (return)
&gt; +   (use (reg:SI RETURN_ADDR_REGNUM))]
&gt; +  "TARGET_ZCMP"
&gt; +  "cm.popret	{ra, s0-s1}, %0"
&gt; +)
&gt; +
&gt; +(define_insn "@gpr_multi_popret_up_to_s2_<mode>"
&gt; +  [(set (reg:X SP_REGNUM)
&gt; +        (plus:X (reg:X SP_REGNUM)
&gt; +                 (match_operand 0 "stack_pop_up_to_s2_operand" "I")))
&gt; +   (set (reg:X S2_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot0_offset>))))
&gt; +   (set (reg:X S1_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot1_offset>))))
&gt; +   (set (reg:X S0_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot2_offset>))))
&gt; +   (set (reg:X RETURN_ADDR_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot3_offset>))))
&gt; +   (return)
&gt; +   (use (reg:SI RETURN_ADDR_REGNUM))]
&gt; +  "TARGET_ZCMP"
&gt; +  "cm.popret	{ra, s0-s2}, %0"
&gt; +)
&gt; +
&gt; +(define_insn "@gpr_multi_popret_up_to_s3_<mode>"
&gt; +  [(set (reg:X SP_REGNUM)
&gt; +        (plus:X (reg:X SP_REGNUM)
&gt; +                 (match_operand 0 "stack_pop_up_to_s3_operand" "I")))
&gt; +   (set (reg:X S3_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot0_offset>))))
&gt; +   (set (reg:X S2_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot1_offset>))))
&gt; +   (set (reg:X S1_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot2_offset>))))
&gt; +   (set (reg:X S0_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot3_offset>))))
&gt; +   (set (reg:X RETURN_ADDR_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot4_offset>))))
&gt; +   (return)
&gt; +   (use (reg:SI RETURN_ADDR_REGNUM))]
&gt; +  "TARGET_ZCMP"
&gt; +  "cm.popret	{ra, s0-s3}, %0"
&gt; +)
&gt; +
&gt; +(define_insn "@gpr_multi_popret_up_to_s4_<mode>"
&gt; +  [(set (reg:X SP_REGNUM)
&gt; +        (plus:X (reg:X SP_REGNUM)
&gt; +                 (match_operand 0 "stack_pop_up_to_s4_operand" "I")))
&gt; +   (set (reg:X S4_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot0_offset>))))
&gt; +   (set (reg:X S3_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot1_offset>))))
&gt; +   (set (reg:X S2_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot2_offset>))))
&gt; +   (set (reg:X S1_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot3_offset>))))
&gt; +   (set (reg:X S0_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot4_offset>))))
&gt; +   (set (reg:X RETURN_ADDR_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot5_offset>))))
&gt; +   (return)
&gt; +   (use (reg:SI RETURN_ADDR_REGNUM))]
&gt; +  "TARGET_ZCMP"
&gt; +  "cm.popret	{ra, s0-s4}, %0"
&gt; +)
&gt; +
&gt; +(define_insn "@gpr_multi_popret_up_to_s5_<mode>"
&gt; +  [(set (reg:X SP_REGNUM)
&gt; +        (plus:X (reg:X SP_REGNUM)
&gt; +                 (match_operand 0 "stack_pop_up_to_s5_operand" "I")))
&gt; +   (set (reg:X S5_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot0_offset>))))
&gt; +   (set (reg:X S4_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot1_offset>))))
&gt; +   (set (reg:X S3_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot2_offset>))))
&gt; +   (set (reg:X S2_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot3_offset>))))
&gt; +   (set (reg:X S1_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot4_offset>))))
&gt; +   (set (reg:X S0_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot5_offset>))))
&gt; +   (set (reg:X RETURN_ADDR_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot6_offset>))))
&gt; +   (return)
&gt; +   (use (reg:SI RETURN_ADDR_REGNUM))]
&gt; +  "TARGET_ZCMP"
&gt; +  "cm.popret	{ra, s0-s5}, %0"
&gt; +)
&gt; +
&gt; +(define_insn "@gpr_multi_popret_up_to_s6_<mode>"
&gt; +  [(set (reg:X SP_REGNUM)
&gt; +        (plus:X (reg:X SP_REGNUM)
&gt; +                 (match_operand 0 "stack_pop_up_to_s6_operand" "I")))
&gt; +   (set (reg:X S6_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot0_offset>))))
&gt; +   (set (reg:X S5_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot1_offset>))))
&gt; +   (set (reg:X S4_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot2_offset>))))
&gt; +   (set (reg:X S3_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot3_offset>))))
&gt; +   (set (reg:X S2_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot4_offset>))))
&gt; +   (set (reg:X S1_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot5_offset>))))
&gt; +   (set (reg:X S0_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot6_offset>))))
&gt; +   (set (reg:X RETURN_ADDR_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot7_offset>))))
&gt; +   (return)
&gt; +   (use (reg:SI RETURN_ADDR_REGNUM))]
&gt; +  "TARGET_ZCMP"
&gt; +  "cm.popret	{ra, s0-s6}, %0"
&gt; +)
&gt; +
&gt; +(define_insn "@gpr_multi_popret_up_to_s7_<mode>"
&gt; +  [(set (reg:X SP_REGNUM)
&gt; +        (plus:X (reg:X SP_REGNUM)
&gt; +                 (match_operand 0 "stack_pop_up_to_s7_operand" "I")))
&gt; +   (set (reg:X S7_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot0_offset>))))
&gt; +   (set (reg:X S6_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot1_offset>))))
&gt; +   (set (reg:X S5_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot2_offset>))))
&gt; +   (set (reg:X S4_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot3_offset>))))
&gt; +   (set (reg:X S3_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot4_offset>))))
&gt; +   (set (reg:X S2_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot5_offset>))))
&gt; +   (set (reg:X S1_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot6_offset>))))
&gt; +   (set (reg:X S0_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot7_offset>))))
&gt; +   (set (reg:X RETURN_ADDR_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot8_offset>))))
&gt; +   (return)
&gt; +   (use (reg:SI RETURN_ADDR_REGNUM))]
&gt; +  "TARGET_ZCMP"
&gt; +  "cm.popret	{ra, s0-s7}, %0"
&gt; +)
&gt; +
&gt; +(define_insn "@gpr_multi_popret_up_to_s8_<mode>"
&gt; +  [(set (reg:X SP_REGNUM)
&gt; +        (plus:X (reg:X SP_REGNUM)
&gt; +                 (match_operand 0 "stack_pop_up_to_s8_operand" "I")))
&gt; +   (set (reg:X S8_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot0_offset>))))
&gt; +   (set (reg:X S7_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot1_offset>))))
&gt; +   (set (reg:X S6_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot2_offset>))))
&gt; +   (set (reg:X S5_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot3_offset>))))
&gt; +   (set (reg:X S4_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot4_offset>))))
&gt; +   (set (reg:X S3_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot5_offset>))))
&gt; +   (set (reg:X S2_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot6_offset>))))
&gt; +   (set (reg:X S1_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot7_offset>))))
&gt; +   (set (reg:X S0_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot8_offset>))))
&gt; +   (set (reg:X RETURN_ADDR_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot9_offset>))))
&gt; +   (return)
&gt; +   (use (reg:SI RETURN_ADDR_REGNUM))]
&gt; +  "TARGET_ZCMP"
&gt; +  "cm.popret	{ra, s0-s8}, %0"
&gt; +)
&gt; +
&gt; +(define_insn "@gpr_multi_popret_up_to_s9_<mode>"
&gt; +  [(set (reg:X SP_REGNUM)
&gt; +        (plus:X (reg:X SP_REGNUM)
&gt; +                 (match_operand 0 "stack_pop_up_to_s9_operand" "I")))
&gt; +   (set (reg:X S9_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot0_offset>))))
&gt; +   (set (reg:X S8_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot1_offset>))))
&gt; +   (set (reg:X S7_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot2_offset>))))
&gt; +   (set (reg:X S6_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot3_offset>))))
&gt; +   (set (reg:X S5_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot4_offset>))))
&gt; +   (set (reg:X S4_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot5_offset>))))
&gt; +   (set (reg:X S3_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot6_offset>))))
&gt; +   (set (reg:X S2_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot7_offset>))))
&gt; +   (set (reg:X S1_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot8_offset>))))
&gt; +   (set (reg:X S0_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot9_offset>))))
&gt; +   (set (reg:X RETURN_ADDR_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot10_offset>))))
&gt; +   (return)
&gt; +   (use (reg:SI RETURN_ADDR_REGNUM))]
&gt; +  "TARGET_ZCMP"
&gt; +  "cm.popret	{ra, s0-s9}, %0"
&gt; +)
&gt; +
&gt; +(define_insn "@gpr_multi_popret_up_to_s11_<mode>"
&gt; +  [(set (reg:X SP_REGNUM)
&gt; +        (plus:X (reg:X SP_REGNUM)
&gt; +                 (match_operand 0 "stack_pop_up_to_s11_operand" "I")))
&gt; +   (set (reg:X S11_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot0_offset>))))
&gt; +   (set (reg:X S10_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot1_offset>))))
&gt; +   (set (reg:X S9_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot2_offset>))))
&gt; +   (set (reg:X S8_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot3_offset>))))
&gt; +   (set (reg:X S7_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot4_offset>))))
&gt; +   (set (reg:X S6_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot5_offset>))))
&gt; +   (set (reg:X S5_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot6_offset>))))
&gt; +   (set (reg:X S4_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot7_offset>))))
&gt; +   (set (reg:X S3_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot8_offset>))))
&gt; +   (set (reg:X S2_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot9_offset>))))
&gt; +   (set (reg:X S1_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot10_offset>))))
&gt; +   (set (reg:X S0_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot11_offset>))))
&gt; +   (set (reg:X RETURN_ADDR_REGNUM)
&gt; +        (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot12_offset>))))
&gt; +   (return)
&gt; +   (use (reg:SI RETURN_ADDR_REGNUM))]
&gt; +  "TARGET_ZCMP"
&gt; +  "cm.popret	{ra, s0-s11}, %0"
&gt; +)
&gt; +
&gt; +(define_insn "@gpr_multi_push_up_to_ra_<mode>"
&gt; +  [(set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot0_offset>)))
&gt; +        (reg:X RETURN_ADDR_REGNUM))
&gt; +   (set (reg:X SP_REGNUM)
&gt; +        (plus:X (reg:X SP_REGNUM)
&gt; +                 (match_operand 0 "stack_push_up_to_ra_operand" "I")))]
&gt; +  "TARGET_ZCMP"
&gt; +  "cm.push	{ra}, %0"
&gt; +)
&gt; +
&gt; +(define_insn "@gpr_multi_push_up_to_s0_<mode>"
&gt; +  [(set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot0_offset>)))
&gt; +        (reg:X S0_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot1_offset>)))
&gt; +        (reg:X RETURN_ADDR_REGNUM))
&gt; +   (set (reg:X SP_REGNUM)
&gt; +        (plus:X (reg:X SP_REGNUM)
&gt; +                 (match_operand 0 "stack_push_up_to_s0_operand" "I")))]
&gt; +  "TARGET_ZCMP"
&gt; +  "cm.push	{ra, s0}, %0"
&gt; +)
&gt; +
&gt; +(define_insn "@gpr_multi_push_up_to_s1_<mode>"
&gt; +  [(set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot0_offset>)))
&gt; +        (reg:X S1_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot1_offset>)))
&gt; +        (reg:X S0_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot2_offset>)))
&gt; +        (reg:X RETURN_ADDR_REGNUM))
&gt; +   (set (reg:X SP_REGNUM)
&gt; +        (plus:X (reg:X SP_REGNUM)
&gt; +                 (match_operand 0 "stack_push_up_to_s1_operand" "I")))]
&gt; +  "TARGET_ZCMP"
&gt; +  "cm.push	{ra, s0-s1}, %0"
&gt; +)
&gt; +
&gt; +(define_insn "@gpr_multi_push_up_to_s2_<mode>"
&gt; +  [(set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot0_offset>)))
&gt; +        (reg:X S2_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot1_offset>)))
&gt; +        (reg:X S1_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot2_offset>)))
&gt; +        (reg:X S0_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot3_offset>)))
&gt; +        (reg:X RETURN_ADDR_REGNUM))
&gt; +   (set (reg:X SP_REGNUM)
&gt; +        (plus:X (reg:X SP_REGNUM)
&gt; +                 (match_operand 0 "stack_push_up_to_s2_operand" "I")))]
&gt; +  "TARGET_ZCMP"
&gt; +  "cm.push	{ra, s0-s2}, %0"
&gt; +)
&gt; +
&gt; +(define_insn "@gpr_multi_push_up_to_s3_<mode>"
&gt; +  [(set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot0_offset>)))
&gt; +        (reg:X S3_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot1_offset>)))
&gt; +        (reg:X S2_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot2_offset>)))
&gt; +        (reg:X S1_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot3_offset>)))
&gt; +        (reg:X S0_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot4_offset>)))
&gt; +        (reg:X RETURN_ADDR_REGNUM))
&gt; +   (set (reg:X SP_REGNUM)
&gt; +        (plus:X (reg:X SP_REGNUM)
&gt; +                 (match_operand 0 "stack_push_up_to_s3_operand" "I")))]
&gt; +  "TARGET_ZCMP"
&gt; +  "cm.push	{ra, s0-s3}, %0"
&gt; +)
&gt; +
&gt; +(define_insn "@gpr_multi_push_up_to_s4_<mode>"
&gt; +  [(set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot0_offset>)))
&gt; +        (reg:X S4_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot1_offset>)))
&gt; +        (reg:X S3_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot2_offset>)))
&gt; +        (reg:X S2_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot3_offset>)))
&gt; +        (reg:X S1_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot4_offset>)))
&gt; +        (reg:X S0_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot5_offset>)))
&gt; +        (reg:X RETURN_ADDR_REGNUM))
&gt; +   (set (reg:X SP_REGNUM)
&gt; +        (plus:X (reg:X SP_REGNUM)
&gt; +                 (match_operand 0 "stack_push_up_to_s4_operand" "I")))]
&gt; +  "TARGET_ZCMP"
&gt; +  "cm.push	{ra, s0-s4}, %0"
&gt; +)
&gt; +
&gt; +(define_insn "@gpr_multi_push_up_to_s5_<mode>"
&gt; +  [(set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot0_offset>)))
&gt; +        (reg:X S5_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot1_offset>)))
&gt; +        (reg:X S4_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot2_offset>)))
&gt; +        (reg:X S3_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot3_offset>)))
&gt; +        (reg:X S2_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot4_offset>)))
&gt; +        (reg:X S1_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot5_offset>)))
&gt; +        (reg:X S0_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot6_offset>)))
&gt; +        (reg:X RETURN_ADDR_REGNUM))
&gt; +   (set (reg:X SP_REGNUM)
&gt; +        (plus:X (reg:X SP_REGNUM)
&gt; +                 (match_operand 0 "stack_push_up_to_s5_operand" "I")))]
&gt; +  "TARGET_ZCMP"
&gt; +  "cm.push	{ra, s0-s5}, %0"
&gt; +)
&gt; +
&gt; +(define_insn "@gpr_multi_push_up_to_s6_<mode>"
&gt; +  [(set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot0_offset>)))
&gt; +        (reg:X S6_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot1_offset>)))
&gt; +        (reg:X S5_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot2_offset>)))
&gt; +        (reg:X S4_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot3_offset>)))
&gt; +        (reg:X S3_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot4_offset>)))
&gt; +        (reg:X S2_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot5_offset>)))
&gt; +        (reg:X S1_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot6_offset>)))
&gt; +        (reg:X S0_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot7_offset>)))
&gt; +        (reg:X RETURN_ADDR_REGNUM))
&gt; +   (set (reg:X SP_REGNUM)
&gt; +        (plus:X (reg:X SP_REGNUM)
&gt; +                 (match_operand 0 "stack_push_up_to_s6_operand" "I")))]
&gt; +  "TARGET_ZCMP"
&gt; +  "cm.push	{ra, s0-s6}, %0"
&gt; +)
&gt; +
&gt; +(define_insn "@gpr_multi_push_up_to_s7_<mode>"
&gt; +  [(set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot0_offset>)))
&gt; +        (reg:X S7_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot1_offset>)))
&gt; +        (reg:X S6_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot2_offset>)))
&gt; +        (reg:X S5_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot3_offset>)))
&gt; +        (reg:X S4_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot4_offset>)))
&gt; +        (reg:X S3_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot5_offset>)))
&gt; +        (reg:X S2_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot6_offset>)))
&gt; +        (reg:X S1_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot7_offset>)))
&gt; +        (reg:X S0_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot8_offset>)))
&gt; +        (reg:X RETURN_ADDR_REGNUM))
&gt; +   (set (reg:X SP_REGNUM)
&gt; +        (plus:X (reg:X SP_REGNUM)
&gt; +                 (match_operand 0 "stack_push_up_to_s7_operand" "I")))]
&gt; +  "TARGET_ZCMP"
&gt; +  "cm.push	{ra, s0-s7}, %0"
&gt; +)
&gt; +
&gt; +(define_insn "@gpr_multi_push_up_to_s8_<mode>"
&gt; +  [(set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot0_offset>)))
&gt; +        (reg:X S8_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot1_offset>)))
&gt; +        (reg:X S7_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot2_offset>)))
&gt; +        (reg:X S6_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot3_offset>)))
&gt; +        (reg:X S5_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot4_offset>)))
&gt; +        (reg:X S4_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot5_offset>)))
&gt; +        (reg:X S3_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot6_offset>)))
&gt; +        (reg:X S2_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot7_offset>)))
&gt; +        (reg:X S1_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot8_offset>)))
&gt; +        (reg:X S0_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot9_offset>)))
&gt; +        (reg:X RETURN_ADDR_REGNUM))
&gt; +   (set (reg:X SP_REGNUM)
&gt; +        (plus:X (reg:X SP_REGNUM)
&gt; +                 (match_operand 0 "stack_push_up_to_s8_operand" "I")))]
&gt; +  "TARGET_ZCMP"
&gt; +  "cm.push	{ra, s0-s8}, %0"
&gt; +)
&gt; +
&gt; +(define_insn "@gpr_multi_push_up_to_s9_<mode>"
&gt; +  [(set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot0_offset>)))
&gt; +        (reg:X S9_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot1_offset>)))
&gt; +        (reg:X S8_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot2_offset>)))
&gt; +        (reg:X S7_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot3_offset>)))
&gt; +        (reg:X S6_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot4_offset>)))
&gt; +        (reg:X S5_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot5_offset>)))
&gt; +        (reg:X S4_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot6_offset>)))
&gt; +        (reg:X S3_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot7_offset>)))
&gt; +        (reg:X S2_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot8_offset>)))
&gt; +        (reg:X S1_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot9_offset>)))
&gt; +        (reg:X S0_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot10_offset>)))
&gt; +        (reg:X RETURN_ADDR_REGNUM))
&gt; +   (set (reg:X SP_REGNUM)
&gt; +        (plus:X (reg:X SP_REGNUM)
&gt; +                 (match_operand 0 "stack_push_up_to_s9_operand" "I")))]
&gt; +  "TARGET_ZCMP"
&gt; +  "cm.push	{ra, s0-s9}, %0"
&gt; +)
&gt; +
&gt; +(define_insn "@gpr_multi_push_up_to_s11_<mode>"
&gt; +  [(set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot0_offset>)))
&gt; +        (reg:X S11_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot1_offset>)))
&gt; +        (reg:X S10_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot2_offset>)))
&gt; +        (reg:X S9_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot3_offset>)))
&gt; +        (reg:X S8_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot4_offset>)))
&gt; +        (reg:X S7_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot5_offset>)))
&gt; +        (reg:X S6_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot6_offset>)))
&gt; +        (reg:X S5_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot7_offset>)))
&gt; +        (reg:X S4_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot8_offset>)))
&gt; +        (reg:X S3_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot9_offset>)))
&gt; +        (reg:X S2_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot10_offset>)))
&gt; +        (reg:X S1_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot11_offset>)))
&gt; +        (reg:X S0_REGNUM))
&gt; +   (set (mem:X (plus:X (reg:X SP_REGNUM)
&gt; +                       (const_int <slot12_offset>)))
&gt; +        (reg:X RETURN_ADDR_REGNUM))
&gt; +   (set (reg:X SP_REGNUM)
&gt; +        (plus:X (reg:X SP_REGNUM)
&gt; +                 (match_operand 0 "stack_push_up_to_s11_operand" "I")))]
&gt; +  "TARGET_ZCMP"
&gt; +  "cm.push	{ra, s0-s11}, %0"
&gt; +)
&gt; diff --git a/gcc/testsuite/gcc.target/riscv/rv32e_zcmp.c b/gcc/testsuite/gcc.target/riscv/rv32e_zcmp.c
&gt; new file mode 100644
&gt; index 00000000000..6dbe489da9b
&gt; --- /dev/null
&gt; +++ b/gcc/testsuite/gcc.target/riscv/rv32e_zcmp.c
&gt; @@ -0,0 +1,239 @@
&gt; +/* { dg-do compile } */
&gt; +/* { dg-options " -Os -march=rv32e_zca_zcmp -mabi=ilp32e -mcmodel=medlow" } */
&gt; +/* { dg-skip-if "" { *-*-* } {"-O0" "-O1" "-O2" "-Og" "-O3" "-Oz" "-flto"} } */
&gt; +/* { dg-final { check-function-bodies "**" "" } } */
&gt; +
&gt; +char my_getchar();
&gt; +float getf();
&gt; +int __attribute__((noinline)) incoming_stack_args
&gt; +  (int arg0, int arg1, int arg2, int arg3,
&gt; +   int arg4, int arg5, int arg6, int arg7, int arg8);
&gt; +int getint();
&gt; +void PrintInts (int n, ...); // varargs
&gt; +void __attribute__((noinline)) PrintIntsNoVaStart (int n, ...); // varargs
&gt; +void PrintInts2 (int arg0, int arg1, int arg2, int arg3, int arg4, int arg5, int n, ...);
&gt; +extern void f1(void);
&gt; +extern void f2(void);
&gt; +
&gt; +/*
&gt; +**test1:
&gt; +**	...
&gt; +**	cm.push	{ra, s0-s1}, -64
&gt; +**	...
&gt; +**	cm.popret	{ra, s0-s1}, 64
&gt; +**	...
&gt; +*/
&gt; +int test1()
&gt; +{
&gt; +  char volatile array[3120];
&gt; +  float volatile farray[3120];
&gt; +
&gt; +  float sum = 0;
&gt; +  for (int i = 0; i &lt; 3120; i++)
&gt; +  {
&gt; +    array[i] = my_getchar();
&gt; +    farray[i] = my_getchar() * 1.2;
&gt; +    sum += array[i] + farray[i];
&gt; +  }
&gt; +  return sum;
&gt; +}
&gt; +
&gt; +/*
&gt; +**test2_step1_0_size:
&gt; +**	...
&gt; +**	cm.push	{ra, s0}, -64
&gt; +**	...
&gt; +**	cm.popret	{ra, s0}, 64
&gt; +**	...
&gt; +*/
&gt; +int test2_step1_0_size()
&gt; +{
&gt; +  int volatile iarray[3120 + 1824/4 -8];
&gt; +
&gt; +  for (int i = 0; i &lt; 3120 + 1824/4 - 8; i++)
&gt; +  {
&gt; +    iarray[i] = my_getchar() * 2;
&gt; +  }
&gt; +  return iarray[0] + iarray[1];
&gt; +}
&gt; +
&gt; +/*
&gt; +**test3:
&gt; +**	...
&gt; +**	cm.push	{ra, s0-s1}, -64
&gt; +**	...
&gt; +**	cm.popret	{ra, s0-s1}, 64
&gt; +**	...
&gt; +*/
&gt; +float test3()
&gt; +{
&gt; +  char volatile array[3120];
&gt; +  float volatile farray[3120];
&gt; +
&gt; +  float sum = 0, f1 = 0, f2 = 0, f3 = 0, f4 = 0, f5 = 0, f6 = 0, f7 = 0;
&gt; +
&gt; +  for (int i = 0; i &lt; 3120; i++)
&gt; +  {
&gt; +    f1 = getf();
&gt; +    f2 = getf();
&gt; +    f3 = getf();
&gt; +    f4 = getf();
&gt; +    array[i] = my_getchar();
&gt; +    farray[i] = my_getchar() * 1.2;
&gt; +    sum += array[i] + farray[i] + f1 + f2 + f3 + f4;
&gt; +  }
&gt; +  return sum;
&gt; +}
&gt; +
&gt; +/*
&gt; +**outgoing_stack_args:
&gt; +**	...
&gt; +**	cm.push	{ra, s0}, -32
&gt; +**	...
&gt; +**	cm.popret	{ra, s0}, 32
&gt; +**	...
&gt; +*/
&gt; +int outgoing_stack_args()
&gt; +{
&gt; +  int  local = getint();
&gt; +  return local +incoming_stack_args(0, 1, 2, 3, 4, 5, 6, 7, 8);
&gt; +}
&gt; +
&gt; +/*
&gt; +**callPrintInts:
&gt; +**	...
&gt; +**	cm.push	{ra}, -32
&gt; +**	...
&gt; +**	cm.popret	{ra}, 32
&gt; +**	...
&gt; +*/
&gt; +float callPrintInts()
&gt; +{
&gt; +  volatile float f = getf(); // f in local
&gt; +  PrintInts(9,1,2,3,4,5,6,7,8,9);
&gt; +  return f;
&gt; +}
&gt; +
&gt; +/*
&gt; +**callPrint:
&gt; +**	...
&gt; +**	cm.push	{ra}, -32
&gt; +**	...
&gt; +**	cm.popret	{ra}, 32
&gt; +**	...
&gt; +*/
&gt; +float callPrint()
&gt; +{
&gt; +  volatile float f = getf(); // f in local
&gt; +  PrintIntsNoVaStart(0,1,2,3,4,5,6,7,8,9);
&gt; +  return f;
&gt; +}
&gt; +
&gt; +/*
&gt; +**callPrint_S:
&gt; +**	...
&gt; +**	cm.push	{ra, s0}, -32
&gt; +**	...
&gt; +**	cm.popret	{ra, s0}, 32
&gt; +**	...
&gt; +*/
&gt; +float callPrint_S()
&gt; +{
&gt; +  float f = getf();
&gt; +  PrintIntsNoVaStart(0,1,2,3,4,5,6,7,8,9);
&gt; +  return f;
&gt; +}
&gt; +
&gt; +/*
&gt; +**callPrint_2:
&gt; +**	...
&gt; +**	cm.push	{ra, s0}, -32
&gt; +**	...
&gt; +**	cm.popret	{ra, s0}, 32
&gt; +**	...
&gt; +*/
&gt; +float callPrint_2()
&gt; +{
&gt; +  float f = getf();
&gt; +  PrintInts2(0,1,2,3,4,5,6,7,8,9);
&gt; +  return f;
&gt; +}
&gt; +
&gt; +/*
&gt; +**test_step1_0bytes_save_restore:
&gt; +**	...
&gt; +**	cm.push	{ra}, -16
&gt; +**	...
&gt; +**	cm.popret	{ra}, 16
&gt; +**	...
&gt; +*/
&gt; +int test_step1_0bytes_save_restore()
&gt; +{
&gt; +
&gt; +  int a  =  9;
&gt; +  int b  =  my_getchar();
&gt; +  return a +b;
&gt; +}
&gt; +
&gt; +/*
&gt; +**test_s0:
&gt; +**	...
&gt; +**	cm.push	{ra, s0}, -16
&gt; +**	...
&gt; +**	cm.popret	{ra, s0}, 16
&gt; +**	...
&gt; +*/
&gt; +int test_s0()
&gt; +{
&gt; +
&gt; +  int a  =  my_getchar();
&gt; +  int b  =  my_getchar();
&gt; +  return a +b;
&gt; +}
&gt; +
&gt; +/*
&gt; +**test_s1:
&gt; +**	...
&gt; +**	cm.push	{ra, s0-s1}, -16
&gt; +**	...
&gt; +**	cm.popret	{ra, s0-s1}, 16
&gt; +**	...
&gt; +*/
&gt; +int test_s1()
&gt; +{
&gt; +
&gt; +  int s0  =  my_getchar();
&gt; +  int s1  =  my_getchar();
&gt; +  int b  =  my_getchar();
&gt; +  return s1 +s0 +b;
&gt; +}
&gt; +
&gt; +/*
&gt; +**test_f0:
&gt; +**	...
&gt; +**	cm.push	{ra, s0-s1}, -16
&gt; +**	...
&gt; +**	cm.popret	{ra, s0-s1}, 16
&gt; +**	...
&gt; +*/
&gt; +int test_f0()
&gt; +{
&gt; +
&gt; +  int s0  =  my_getchar();
&gt; +  float f0  =  getf(); 
&gt; +  int b  =  my_getchar();
&gt; +  return f0 +s0 +b;
&gt; +}
&gt; +
&gt; +/*
&gt; +**foo:
&gt; +**	cm.push	{ra}, -16
&gt; +**	call	f1
&gt; +**	cm.pop	{ra}, 16
&gt; +**	tail	f2
&gt; +*/
&gt; +void foo(void)
&gt; +{
&gt; +  f1();
&gt; +  f2();
&gt; +}
&gt; diff --git a/gcc/testsuite/gcc.target/riscv/rv32i_zcmp.c b/gcc/testsuite/gcc.target/riscv/rv32i_zcmp.c
&gt; new file mode 100644
&gt; index 00000000000..924197cb3c4
&gt; --- /dev/null
&gt; +++ b/gcc/testsuite/gcc.target/riscv/rv32i_zcmp.c
&gt; @@ -0,0 +1,239 @@
&gt; +/* { dg-do compile } */
&gt; +/* { dg-options " -Os -march=rv32imaf_zca_zcmp -mabi=ilp32f -mcmodel=medlow" } */
&gt; +/* { dg-skip-if "" { *-*-* } {"-O0" "-O1" "-O2" "-Og" "-O3" "-Oz" "-flto"} } */
&gt; +/* { dg-final { check-function-bodies "**" "" } } */
&gt; +
&gt; +char my_getchar();
&gt; +float getf();
&gt; +int __attribute__((noinline)) incoming_stack_args
&gt; +  (int arg0, int arg1, int arg2, int arg3,
&gt; +   int arg4, int arg5, int arg6, int arg7, int arg8);
&gt; +int getint();
&gt; +void PrintInts (int n, ...); // varargs
&gt; +void __attribute__((noinline)) PrintIntsNoVaStart (int n, ...); // varargs
&gt; +void PrintInts2 (int arg0, int arg1, int arg2, int arg3, int arg4, int arg5, int n, ...);
&gt; +extern void f1(void);
&gt; +extern void f2(void);
&gt; +
&gt; +/*
&gt; +**test1:
&gt; +**	...
&gt; +**	cm.push	{ra, s0-s4}, -80
&gt; +**	...
&gt; +**	cm.popret	{ra, s0-s4}, 80
&gt; +**	...
&gt; +*/
&gt; +int test1()
&gt; +{
&gt; +  char volatile array[3120];
&gt; +  float volatile farray[3120];
&gt; +
&gt; +  float sum = 0;
&gt; +  for (int i = 0; i &lt; 3120; i++)
&gt; +  {
&gt; +    array[i] = my_getchar();
&gt; +    farray[i] = my_getchar() * 1.2;
&gt; +    sum += array[i] + farray[i];
&gt; +  }
&gt; +  return sum;
&gt; +}
&gt; +
&gt; +/*
&gt; +**test2_step1_0_size:
&gt; +**	...
&gt; +**	cm.push	{ra, s0-s1}, -64
&gt; +**	...
&gt; +**	cm.popret	{ra, s0-s1}, 64
&gt; +**	...
&gt; +*/
&gt; +int test2_step1_0_size()
&gt; +{
&gt; +  int volatile iarray[3120 + 1824/4 -8];
&gt; +
&gt; +  for (int i = 0; i &lt; 3120 + 1824/4 - 8; i++)
&gt; +  {
&gt; +    iarray[i] = my_getchar() * 2;
&gt; +  }
&gt; +  return iarray[0] + iarray[1];
&gt; +}
&gt; +
&gt; +/*
&gt; +**test3:
&gt; +**	...
&gt; +**	cm.push	{ra, s0-s4}, -80
&gt; +**	...
&gt; +**	cm.popret	{ra, s0-s4}, 80
&gt; +**	...
&gt; +*/
&gt; +float test3()
&gt; +{
&gt; +  char volatile array[3120];
&gt; +  float volatile farray[3120];
&gt; +
&gt; +  float sum = 0, f1 = 0, f2 = 0, f3 = 0, f4 = 0, f5 = 0, f6 = 0, f7 = 0;
&gt; +
&gt; +  for (int i = 0; i &lt; 3120; i++)
&gt; +  {
&gt; +    f1 = getf();
&gt; +    f2 = getf();
&gt; +    f3 = getf();
&gt; +    f4 = getf();
&gt; +    array[i] = my_getchar();
&gt; +    farray[i] = my_getchar() * 1.2;
&gt; +    sum += array[i] + farray[i] + f1 + f2 + f3 + f4;
&gt; +  }
&gt; +  return sum;
&gt; +}
&gt; +
&gt; +/*
&gt; +**outgoing_stack_args:
&gt; +**	...
&gt; +**	cm.push	{ra, s0}, -32
&gt; +**	...
&gt; +**	cm.popret	{ra, s0}, 32
&gt; +**	...
&gt; +*/
&gt; +int outgoing_stack_args()
&gt; +{
&gt; +  int  local = getint();
&gt; +  return local +incoming_stack_args(0, 1, 2, 3, 4, 5, 6, 7, 8);
&gt; +}
&gt; +
&gt; +/*
&gt; +**callPrintInts:
&gt; +**	...
&gt; +**	cm.push	{ra}, -48
&gt; +**	...
&gt; +**	cm.popret	{ra}, 48
&gt; +**	...
&gt; +*/
&gt; +float callPrintInts()
&gt; +{
&gt; +  volatile float f = getf(); // f in local
&gt; +  PrintInts(9,1,2,3,4,5,6,7,8,9);
&gt; +  return f;
&gt; +}
&gt; +
&gt; +/*
&gt; +**callPrint:
&gt; +**	...
&gt; +**	cm.push	{ra}, -48
&gt; +**	...
&gt; +**	cm.popret	{ra}, 48
&gt; +**	...
&gt; +*/
&gt; +float callPrint()
&gt; +{
&gt; +  volatile float f = getf(); // f in local
&gt; +  PrintIntsNoVaStart(0,1,2,3,4,5,6,7,8,9);
&gt; +  return f;
&gt; +}
&gt; +
&gt; +/*
&gt; +**callPrint_S:
&gt; +**	...
&gt; +**	cm.push	{ra}, -48
&gt; +**	...
&gt; +**	cm.popret	{ra}, 48
&gt; +**	...
&gt; +*/
&gt; +float callPrint_S()
&gt; +{
&gt; +  float f = getf();
&gt; +  PrintIntsNoVaStart(0,1,2,3,4,5,6,7,8,9);
&gt; +  return f;
&gt; +}
&gt; +
&gt; +/*
&gt; +**callPrint_2:
&gt; +**	...
&gt; +**	cm.push	{ra}, -48
&gt; +**	...
&gt; +**	cm.popret	{ra}, 48
&gt; +**	...
&gt; +*/
&gt; +float callPrint_2()
&gt; +{
&gt; +  float f = getf();
&gt; +  PrintInts2(0,1,2,3,4,5,6,7,8,9);
&gt; +  return f;
&gt; +}
&gt; +
&gt; +/*
&gt; +**test_step1_0bytes_save_restore:
&gt; +**	...
&gt; +**	cm.push	{ra}, -16
&gt; +**	...
&gt; +**	cm.popret	{ra}, 16
&gt; +**	...
&gt; +*/
&gt; +int test_step1_0bytes_save_restore()
&gt; +{
&gt; +
&gt; +  int a  =  9;
&gt; +  int b  =  my_getchar();
&gt; +  return a +b;
&gt; +}
&gt; +
&gt; +/*
&gt; +**test_s0:
&gt; +**	...
&gt; +**	cm.push	{ra, s0}, -16
&gt; +**	...
&gt; +**	cm.popret	{ra, s0}, 16
&gt; +**	...
&gt; +*/
&gt; +int test_s0()
&gt; +{
&gt; +
&gt; +  int a  =  my_getchar();
&gt; +  int b  =  my_getchar();
&gt; +  return a +b;
&gt; +}
&gt; +
&gt; +/*
&gt; +**test_s1:
&gt; +**	...
&gt; +**	cm.push	{ra, s0-s1}, -16
&gt; +**	...
&gt; +**	cm.popret	{ra, s0-s1}, 16
&gt; +**	...
&gt; +*/
&gt; +int test_s1()
&gt; +{
&gt; +
&gt; +  int s0  =  my_getchar();
&gt; +  int s1  =  my_getchar();
&gt; +  int b  =  my_getchar();
&gt; +  return s1 +s0 +b;
&gt; +}
&gt; +
&gt; +/*
&gt; +**test_f0:
&gt; +**	...
&gt; +**	cm.push	{ra, s0}, -32
&gt; +**	...
&gt; +**	cm.popret	{ra, s0}, 32
&gt; +**	...
&gt; +*/
&gt; +int test_f0()
&gt; +{
&gt; +
&gt; +  int s0  =  my_getchar();
&gt; +  float f0  =  getf(); 
&gt; +  int b  =  my_getchar();
&gt; +  return f0 +s0 +b;
&gt; +}
&gt; +
&gt; +/*
&gt; +**foo:
&gt; +**	cm.push	{ra}, -16
&gt; +**	call	f1
&gt; +**	cm.pop	{ra}, 16
&gt; +**	tail	f2
&gt; +*/
&gt; +void foo(void)
&gt; +{
&gt; +  f1();
&gt; +  f2();
&gt; +}
&gt; diff --git a/gcc/testsuite/gcc.target/riscv/zcmp_stack_alignment.c b/gcc/testsuite/gcc.target/riscv/zcmp_stack_alignment.c
&gt; new file mode 100644
&gt; index 00000000000..05602302a8f
&gt; --- /dev/null
&gt; +++ b/gcc/testsuite/gcc.target/riscv/zcmp_stack_alignment.c
&gt; @@ -0,0 +1,23 @@
&gt; +/* { dg-do compile } */
&gt; +/* { dg-options " -O0 -march=rv32e_zca_zcb_zcmp -mabi=ilp32e -mcmodel=medlow -fomit-frame-pointer" } */
&gt; +/* { dg-skip-if "" { *-*-* } {"-O2" "-O1" "-Os" "-Og" "-O3" "-Oz" "-flto"} } */
&gt; +/* { dg-final { check-function-bodies "**" "" } } */
&gt; +
&gt; +void bar();
&gt; +
&gt; +/*
&gt; +**fool_rv32e:
&gt; +**	cm.push	{ra}, -32
&gt; +**	...
&gt; +**	call	bar
&gt; +**	...
&gt; +**	lw	a5,32\(sp\)
&gt; +**	...
&gt; +**	cm.popret	{ra}, 32
&gt; +*/
&gt; +int fool_rv32e ( int a0, int a1, int a2, int a3, int a4, int a5,
&gt; +                  int incoming0)
&gt; +{
&gt; +  bar();
&gt; +  return a0 + a1 + a2 + a3 + a4 + a5 + incoming0;
&gt; +}
&gt; -- 
&gt; 2.17.1

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 3/4] [RISC-V] resolve confilct between zcmp multi push/pop and shrink-wrap-separate
  2023-06-07  5:52 ` [PATCH 3/4] [RISC-V] resolve confilct between zcmp multi push/pop and shrink-wrap-separate Fei Gao
@ 2023-06-12 15:17   ` Kito Cheng
  2023-06-12 19:26   ` Jeff Law
  1 sibling, 0 replies; 17+ messages in thread
From: Kito Cheng @ 2023-06-12 15:17 UTC (permalink / raw)
  To: Fei Gao; +Cc: gcc-patches, palmer, jeffreyalaw, sinan.lin, jiawei

I would suggest breaking this patch into two parts: the RISC-V part and
the target-independent part (shrink-wrap.h / shrink-wrap.cc).


On Wed, Jun 7, 2023 at 1:55 PM Fei Gao <gaofei@eswincomputing.com> wrote:
>
> Disable zcmp multi push/pop if shrink-wrap-separate is active.
>
> So at -Os, which prefers smaller code size, shrink-wrap-separate is disabled
> by default while zcmp multi push/pop is enabled.
>
> At -O2 and other levels that prefer speed, shrink-wrap-separate is enabled
> by default while zcmp multi push/pop is disabled. To force zcmp multi
> push/pop in this case, -fno-shrink-wrap-separate has to be given explicitly.
>
> The following test case shows the issues at -O2 before this patch with both
> shrink-wrap-separate and zcmp multi push/pop active.
> 1. Duplicated stores of s regs.
> 2. cm.push pushes ra, s0-s11 in the reverse order of the normal
>    prologue, causing stack corruption and failure to restore s regs.
>
> TC: zcmp_shrink_wrap_separate.c included in this patch.
>
> output asm before this patch:
> calc_func:
>         cm.push {ra, s0-s3}, -32
>         ...
>         beq     a5,zero,.L2
>         ...
> .L2:
>         ...
>         sw      s1,20(sp) //issue here
>         sw      s3,12(sp) //issue here
>         ...
>         sw      s2,16(sp) //issue here
>
> output asm after this patch:
> calc_func:
>         addi    sp,sp,-32
>         sw      s0,24(sp)
>         ...
>         beq     a5,zero,.L2
>         ...
> .L2:
>         ...
>         sw      s1,20(sp)
>         sw      s3,12(sp)
>         ...
>         sw      s2,16(sp)
> gcc/ChangeLog:
>
>         * config/riscv/riscv.cc
>         (riscv_avoid_shrink_wrapping_separate): wrap the condition check in
>         riscv_avoid_shrink_wrapping_separate.
>         (riscv_avoid_multi_push): avoid multi push if shrink_wrapping_separate
>           is active.
>         (riscv_get_separate_components): call riscv_avoid_shrink_wrapping_separate
>         * shrink-wrap.cc (try_shrink_wrapping_separate): call
>           use_shrink_wrapping_separate.
>         (use_shrink_wrapping_separate):wrap the condition
>           check in use_shrink_wrapping_separate
>         * shrink-wrap.h (use_shrink_wrapping_separate): add to extern
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.target/riscv/zcmp_shrink_wrap_separate.c: New test.
>         * gcc.target/riscv/zcmp_shrink_wrap_separate2.c: New test.
>
> Signed-off-by: Fei Gao <gaofei@eswincomputing.com>
> Co-Authored-By: Zhangjin Liao <liaozhangjin@eswincomputing.com>
> ---
>  gcc/config/riscv/riscv.cc                     | 19 +++-
>  gcc/shrink-wrap.cc                            | 25 +++--
>  gcc/shrink-wrap.h                             |  1 +
>  .../riscv/zcmp_shrink_wrap_separate.c         | 97 +++++++++++++++++++
>  .../riscv/zcmp_shrink_wrap_separate2.c        | 97 +++++++++++++++++++
>  5 files changed, 228 insertions(+), 11 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/zcmp_shrink_wrap_separate.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/zcmp_shrink_wrap_separate2.c
>
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> index f60c241a526..b505cdeca34 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -64,6 +64,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "cfghooks.h"
>  #include "cfgloop.h"
>  #include "cfgrtl.h"
> +#include "shrink-wrap.h"
>  #include "sel-sched.h"
>  #include "fold-const.h"
>  #include "gimple-iterator.h"
> @@ -389,6 +390,7 @@ static const struct riscv_tune_param optimize_size_tune_info = {
>    false,                                       /* use_divmod_expansion */
>  };
>
> +static bool riscv_avoid_shrink_wrapping_separate ();
>  static tree riscv_handle_fndecl_attribute (tree *, tree, tree, int, bool *);
>  static tree riscv_handle_type_attribute (tree *, tree, tree, int, bool *);
>
> @@ -4910,6 +4912,8 @@ riscv_avoid_multi_push(const struct riscv_frame_info *frame)
>        || cfun->machine->interrupt_handler_p
>        || cfun->machine->varargs_size != 0
>        || crtl->args.pretend_args_size != 0
> +      || (use_shrink_wrapping_separate ()
> +          && !riscv_avoid_shrink_wrapping_separate ())
>        || (frame->mask & ~ MULTI_PUSH_GPR_MASK))
>      return true;
>
> @@ -6077,6 +6081,17 @@ riscv_epilogue_uses (unsigned int regno)
>    return false;
>  }
>
> +static bool
> +riscv_avoid_shrink_wrapping_separate ()
> +{
> +  if (riscv_use_save_libcall (&cfun->machine->frame)
> +      || cfun->machine->interrupt_handler_p
> +      || !cfun->machine->frame.gp_sp_offset.is_constant ())
> +    return true;
> +
> +  return false;
> +}
> +
>  /* Implement TARGET_SHRINK_WRAP_GET_SEPARATE_COMPONENTS.  */
>
>  static sbitmap
> @@ -6086,9 +6101,7 @@ riscv_get_separate_components (void)
>    sbitmap components = sbitmap_alloc (FIRST_PSEUDO_REGISTER);
>    bitmap_clear (components);
>
> -  if (riscv_use_save_libcall (&cfun->machine->frame)
> -      || cfun->machine->interrupt_handler_p
> -      || !cfun->machine->frame.gp_sp_offset.is_constant ())
> +  if (riscv_avoid_shrink_wrapping_separate ())
>      return components;
>
>    offset = cfun->machine->frame.gp_sp_offset.to_constant ();
> diff --git a/gcc/shrink-wrap.cc b/gcc/shrink-wrap.cc
> index b8d7b557130..d534964321a 100644
> --- a/gcc/shrink-wrap.cc
> +++ b/gcc/shrink-wrap.cc
> @@ -1776,16 +1776,14 @@ insert_prologue_epilogue_for_components (sbitmap components)
>    commit_edge_insertions ();
>  }
>
> -/* The main entry point to this subpass.  FIRST_BB is where the prologue
> -   would be normally put.  */
> -void
> -try_shrink_wrapping_separate (basic_block first_bb)
> +bool
> +use_shrink_wrapping_separate (void)
>  {
>    if (!(SHRINK_WRAPPING_ENABLED
> -       && flag_shrink_wrap_separate
> -       && optimize_function_for_speed_p (cfun)
> -       && targetm.shrink_wrap.get_separate_components))
> -    return;
> +        && flag_shrink_wrap_separate
> +        && optimize_function_for_speed_p (cfun)
> +        && targetm.shrink_wrap.get_separate_components))
> +    return false;
>
>    /* We don't handle "strange" functions.  */
>    if (cfun->calls_alloca
> @@ -1794,6 +1792,17 @@ try_shrink_wrapping_separate (basic_block first_bb)
>        || crtl->calls_eh_return
>        || crtl->has_nonlocal_goto
>        || crtl->saves_all_registers)
> +    return false;
> +
> +  return true;
> +}
> +
> +/* The main entry point to this subpass.  FIRST_BB is where the prologue
> +   would be normally put.  */
> +void
> +try_shrink_wrapping_separate (basic_block first_bb)
> +{
> +  if (!use_shrink_wrapping_separate ())
>      return;
>
>    /* Ask the target what components there are.  If it returns NULL, don't
> diff --git a/gcc/shrink-wrap.h b/gcc/shrink-wrap.h
> index 161647711a3..82386c2b712 100644
> --- a/gcc/shrink-wrap.h
> +++ b/gcc/shrink-wrap.h
> @@ -26,6 +26,7 @@ along with GCC; see the file COPYING3.  If not see
>  extern bool requires_stack_frame_p (rtx_insn *, HARD_REG_SET, HARD_REG_SET);
>  extern void try_shrink_wrapping (edge *entry_edge, rtx_insn *prologue_seq);
>  extern void try_shrink_wrapping_separate (basic_block first_bb);
> +extern bool use_shrink_wrapping_separate (void);
>  #define SHRINK_WRAPPING_ENABLED \
>    (flag_shrink_wrap && targetm.have_simple_return ())
>
> diff --git a/gcc/testsuite/gcc.target/riscv/zcmp_shrink_wrap_separate.c b/gcc/testsuite/gcc.target/riscv/zcmp_shrink_wrap_separate.c
> new file mode 100644
> index 00000000000..11f87aee607
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/zcmp_shrink_wrap_separate.c
> @@ -0,0 +1,97 @@
> +/* { dg-do compile } */
> +/* { dg-options " -O2 -march=rv32imaf_zca_zcmp -mabi=ilp32f" } */
> +/* { dg-skip-if "" { *-*-* } {"-O0" "-O1" "-Os" "-Og" "-O3" "-Oz" "-flto"} } */
> +
> +typedef struct MAT_PARAMS_S
> +{
> +    int     N;
> +    signed short *A;
> +    signed short *B;
> +    signed int *C;
> +} mat_params;
> +
> +typedef struct CORE_PORTABLE_S
> +{
> +    unsigned char portable_id;
> +} core_portable;
> +
> +typedef struct RESULTS_S
> +{
> +    /* inputs */
> +    signed short              seed1;       /* Initializing seed */
> +    signed short              seed2;       /* Initializing seed */
> +    signed short              seed3;       /* Initializing seed */
> +    void *              memblock[4]; /* Pointer to safe memory location */
> +    unsigned int              size;        /* Size of the data */
> +    unsigned int              iterations;  /* Number of iterations to execute */
> +    unsigned int              execs;       /* Bitmask of operations to execute */
> +    struct list_head_s *list;
> +    mat_params          mat;
> +    /* outputs */
> +    unsigned short crc;
> +    unsigned short crclist;
> +    unsigned short crcmatrix;
> +    unsigned short crcstate;
> +    signed short err;
> +    /* multithread specific */
> +    core_portable port;
> +} core_results;
> +
> +extern signed short
> +core_bench_state(unsigned int, void *, signed short, signed short, signed short, unsigned short);
> +
> +extern signed short
> +core_bench_matrix(mat_params *, signed short, unsigned short);
> +
> +extern unsigned short
> +crcu16(signed short, unsigned short);
> +
> +signed short
> +calc_func(signed short *pdata, core_results *res)
> +{
> +    signed short data = *pdata;
> +    signed short retval;
> +    unsigned char  optype
> +        = (data >> 7)
> +          & 1;  /* bit 7 indicates if the function result has been cached */
> +    if (optype) /* if cached, use cache */
> +        return (data & 0x007f);
> +    else
> +    {                             /* otherwise calculate and cache the result */
> +        signed short flag = data & 0x7; /* bits 0-2 is type of function to perform */
> +        signed short dtype
> +            = ((data >> 3)
> +               & 0xf);       /* bits 3-6 is specific data for the operation */
> +        dtype |= dtype << 4; /* replicate the lower 4 bits to get an 8b value */
> +        switch (flag)
> +        {
> +            case 0:
> +                if (dtype < 0x22) /* set min period for bit corruption */
> +                    dtype = 0x22;
> +                retval = core_bench_state(res->size,
> +                                          res->memblock[3],
> +                                          res->seed1,
> +                                          res->seed2,
> +                                          dtype,
> +                                          res->crc);
> +                if (res->crcstate == 0)
> +                    res->crcstate = retval;
> +                break;
> +            case 1:
> +                retval = core_bench_matrix(&(res->mat), dtype, res->crc);
> +                if (res->crcmatrix == 0)
> +                    res->crcmatrix = retval;
> +                break;
> +            default:
> +                retval = data;
> +                break;
> +        }
> +        res->crc = crcu16(retval, res->crc);
> +        retval &= 0x007f;
> +        *pdata = (data & 0xff00) | 0x0080 | retval; /* cache the result */
> +        return retval;
> +    }
> +}
> +
> +/* { dg-final { scan-assembler-not "cm\.push" } } */
> +
> diff --git a/gcc/testsuite/gcc.target/riscv/zcmp_shrink_wrap_separate2.c b/gcc/testsuite/gcc.target/riscv/zcmp_shrink_wrap_separate2.c
> new file mode 100644
> index 00000000000..ec7e9c39b5d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/zcmp_shrink_wrap_separate2.c
> @@ -0,0 +1,97 @@
> +/* { dg-do compile } */
> +/* { dg-options " -O2 -fno-shrink-wrap-separate -march=rv32imaf_zca_zcmp -mabi=ilp32f" } */
> +/* { dg-skip-if "" { *-*-* } {"-O0" "-O1" "-Os" "-Og" "-O3" "-Oz" "-flto"} } */
> +
> +typedef struct MAT_PARAMS_S
> +{
> +    int     N;
> +    signed short *A;
> +    signed short *B;
> +    signed int *C;
> +} mat_params;
> +
> +typedef struct CORE_PORTABLE_S
> +{
> +    unsigned char portable_id;
> +} core_portable;
> +
> +typedef struct RESULTS_S
> +{
> +    /* inputs */
> +    signed short              seed1;       /* Initializing seed */
> +    signed short              seed2;       /* Initializing seed */
> +    signed short              seed3;       /* Initializing seed */
> +    void *              memblock[4]; /* Pointer to safe memory location */
> +    unsigned int              size;        /* Size of the data */
> +    unsigned int              iterations;  /* Number of iterations to execute */
> +    unsigned int              execs;       /* Bitmask of operations to execute */
> +    struct list_head_s *list;
> +    mat_params          mat;
> +    /* outputs */
> +    unsigned short crc;
> +    unsigned short crclist;
> +    unsigned short crcmatrix;
> +    unsigned short crcstate;
> +    signed short err;
> +    /* multithread specific */
> +    core_portable port;
> +} core_results;
> +
> +extern signed short
> +core_bench_state(unsigned int, void *, signed short, signed short, signed short, unsigned short);
> +
> +extern signed short
> +core_bench_matrix(mat_params *, signed short, unsigned short);
> +
> +extern unsigned short
> +crcu16(signed short, unsigned short);
> +
> +signed short
> +calc_func(signed short *pdata, core_results *res)
> +{
> +    signed short data = *pdata;
> +    signed short retval;
> +    unsigned char  optype
> +        = (data >> 7)
> +          & 1;  /* bit 7 indicates if the function result has been cached */
> +    if (optype) /* if cached, use cache */
> +        return (data & 0x007f);
> +    else
> +    {                             /* otherwise calculate and cache the result */
> +        signed short flag = data & 0x7; /* bits 0-2 is type of function to perform */
> +        signed short dtype
> +            = ((data >> 3)
> +               & 0xf);       /* bits 3-6 is specific data for the operation */
> +        dtype |= dtype << 4; /* replicate the lower 4 bits to get an 8b value */
> +        switch (flag)
> +        {
> +            case 0:
> +                if (dtype < 0x22) /* set min period for bit corruption */
> +                    dtype = 0x22;
> +                retval = core_bench_state(res->size,
> +                                          res->memblock[3],
> +                                          res->seed1,
> +                                          res->seed2,
> +                                          dtype,
> +                                          res->crc);
> +                if (res->crcstate == 0)
> +                    res->crcstate = retval;
> +                break;
> +            case 1:
> +                retval = core_bench_matrix(&(res->mat), dtype, res->crc);
> +                if (res->crcmatrix == 0)
> +                    res->crcmatrix = retval;
> +                break;
> +            default:
> +                retval = data;
> +                break;
> +        }
> +        res->crc = crcu16(retval, res->crc);
> +        retval &= 0x007f;
> +        *pdata = (data & 0xff00) | 0x0080 | retval; /* cache the result */
> +        return retval;
> +    }
> +}
> +
> +/* { dg-final { scan-assembler "cm\.push" } } */
> +
> --
> 2.17.1
>
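
For readers following issue 2 in the description quoted above, here is a
minimal sketch (not part of the patch) of why the two save layouts clash.
The helper names are hypothetical, and the cm.push layout is modelled only
from the statement that it uses the reverse order of the normal prologue;
the Zcmp specification is the authority on the exact semantics.

```c
#include <assert.h>

/* Registers of the example list {ra, s0-s3}, listed in the order the normal
   prologue lays them out from the top of the frame downwards.  */
enum example_reg { REG_RA, REG_S0, REG_S1, REG_S2, REG_S3, NUM_EXAMPLE_REGS };

#define WORD_BYTES 4 /* rv32 register size */

/* Offset from the adjusted sp at which the normal prologue saves REG:
   ra takes the highest slot, then s0, s1, ...  (hypothetical helper).  */
static int
normal_slot_offset (int frame_size, enum example_reg reg)
{
  assert (reg < NUM_EXAMPLE_REGS);
  return frame_size - WORD_BYTES * ((int) reg + 1);
}

/* Offset from the adjusted sp at which cm.push saves REG, ASSUMING it fills
   the same slots in reverse order: the highest-numbered register takes the
   highest slot and ra the lowest (hypothetical helper).  */
static int
cm_push_slot_offset (int frame_size, enum example_reg reg)
{
  assert (reg < NUM_EXAMPLE_REGS);
  return frame_size - WORD_BYTES * (NUM_EXAMPLE_REGS - (int) reg);
}
```

With the 32-byte frame from the example, the normal layout puts s0 at
24(sp) and s2 at 16(sp), matching the "after" assembly, while the modelled
cm.push layout puts s2 at 24(sp) and s0 at 16(sp). A separately
shrink-wrapped save or restore that assumes the normal offsets would
therefore touch another register's slot.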

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 3/4] [RISC-V] resolve confilct between zcmp multi push/pop and shrink-wrap-separate
  2023-06-07  5:52 ` [PATCH 3/4] [RISC-V] resolve confilct between zcmp multi push/pop and shrink-wrap-separate Fei Gao
  2023-06-12 15:17   ` Kito Cheng
@ 2023-06-12 19:26   ` Jeff Law
  2023-06-13  2:35     ` Fei Gao
  1 sibling, 1 reply; 17+ messages in thread
From: Jeff Law @ 2023-06-12 19:26 UTC (permalink / raw)
  To: Fei Gao, gcc-patches; +Cc: kito.cheng, palmer, sinan.lin, jiawei



On 6/6/23 23:52, Fei Gao wrote:
> Disable zcmp multi push/pop if shrink-wrap-separate is active.
> 
> So at -Os, which prefers smaller code size, shrink-wrap-separate is disabled
> by default while zcmp multi push/pop is enabled.
> 
> At -O2 and other levels that prefer speed, shrink-wrap-separate is enabled
> by default while zcmp multi push/pop is disabled. To force zcmp multi
> push/pop in this case, -fno-shrink-wrap-separate has to be given explicitly.
> 
> The following test case shows the issues at -O2 before this patch with both
> shrink-wrap-separate and zcmp multi push/pop active.
> 1. Duplicated stores of s regs.
> 2. cm.push pushes ra, s0-s11 in the reverse order of the normal
>     prologue, causing stack corruption and failure to restore s regs.
> 
> TC: zcmp_shrink_wrap_separate.c included in this patch.
> 
> output asm before this patch:
> calc_func:
> 	cm.push	{ra, s0-s3}, -32
> 	...
> 	beq	a5,zero,.L2
> 	...
> .L2:
> 	...
> 	sw	s1,20(sp) //issue here
> 	sw	s3,12(sp) //issue here
> 	...
> 	sw	s2,16(sp) //issue here
> 
> output asm after this patch:
> calc_func:
> 	addi	sp,sp,-32
> 	sw	s0,24(sp)
> 	...
> 	beq	a5,zero,.L2
> 	...
> .L2:
> 	...
> 	sw	s1,20(sp)
> 	sw	s3,12(sp)
> 	...
> 	sw	s2,16(sp)
> gcc/ChangeLog:
> 
>          * config/riscv/riscv.cc
>          (riscv_avoid_shrink_wrapping_separate): wrap the condition check in
>          riscv_avoid_shrink_wrapping_separate.
>          (riscv_avoid_multi_push): avoid multi push if shrink_wrapping_separate
>            is active.
>          (riscv_get_separate_components): call riscv_avoid_shrink_wrapping_separate
>          * shrink-wrap.cc (try_shrink_wrapping_separate): call
>            use_shrink_wrapping_separate.
>          (use_shrink_wrapping_separate):wrap the condition
>            check in use_shrink_wrapping_separate
>          * shrink-wrap.h (use_shrink_wrapping_separate): add to extern
> 
> gcc/testsuite/ChangeLog:
> 
>          * gcc.target/riscv/zcmp_shrink_wrap_separate.c: New test.
>          * gcc.target/riscv/zcmp_shrink_wrap_separate2.c: New test.
I know Kito asked for this to be broken up into target dependent vs 
target independent changes, that's a good ask.

Can't we utilize the get_separate_components hook to accomplish what 
you're trying to do?  ie, put the logic to avoid shrink wrapping for 
this case within the existing risc-v hook?

jeff

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Re: [PATCH 3/4] [RISC-V] resolve confilct between zcmp multi push/pop and shrink-wrap-separate
  2023-06-12 19:26   ` Jeff Law
@ 2023-06-13  2:35     ` Fei Gao
  0 siblings, 0 replies; 17+ messages in thread
From: Fei Gao @ 2023-06-13  2:35 UTC (permalink / raw)
  To: jeffreyalaw, gcc-patches; +Cc: Kito Cheng, Palmer Dabbelt, Sinan, jiawei

On 2023-06-13 03:26  Jeff Law <jeffreyalaw@gmail.com> wrote:
>
>
>
>On 6/6/23 23:52, Fei Gao wrote:
>> Disable zcmp multi push/pop if shrink-wrap-separate is active.
>>
>> So at -Os, which prefers smaller code size, shrink-wrap-separate is disabled
>> by default while zcmp multi push/pop is enabled.
>>
>> At -O2 and other levels that prefer speed, shrink-wrap-separate is enabled
>> by default while zcmp multi push/pop is disabled. To force zcmp multi
>> push/pop in this case, -fno-shrink-wrap-separate has to be given explicitly.
>>
>> The following test case shows the issues at -O2 before this patch with both
>> shrink-wrap-separate and zcmp multi push/pop active.
>> 1. Duplicated stores of s regs.
>> 2. cm.push pushes ra, s0-s11 in the reverse order of the normal
>>     prologue, causing stack corruption and failure to restore s regs.
>>
>> TC: zcmp_shrink_wrap_separate.c included in this patch.
>>
>> output asm before this patch:
>> calc_func:
>> cm.push	{ra, s0-s3}, -32
>> ...
>> beq	a5,zero,.L2
>> ...
>> .L2:
>> ...
>> sw	s1,20(sp) //issue here
>> sw	s3,12(sp) //issue here
>> ...
>> sw	s2,16(sp) //issue here
>>
>> output asm after this patch:
>> calc_func:
>> addi	sp,sp,-32
>> sw	s0,24(sp)
>> ...
>> beq	a5,zero,.L2
>> ...
>> .L2:
>> ...
>> sw	s1,20(sp)
>> sw	s3,12(sp)
>> ...
>> sw	s2,16(sp)
>> gcc/ChangeLog:
>>
>>          * config/riscv/riscv.cc
>>          (riscv_avoid_shrink_wrapping_separate): wrap the condition check in
>>          riscv_avoid_shrink_wrapping_separate.
>>          (riscv_avoid_multi_push): avoid multi push if shrink_wrapping_separate
>>            is active.
>>          (riscv_get_separate_components): call riscv_avoid_shrink_wrapping_separate
>>          * shrink-wrap.cc (try_shrink_wrapping_separate): call
>>            use_shrink_wrapping_separate.
>>          (use_shrink_wrapping_separate):wrap the condition
>>            check in use_shrink_wrapping_separate
>>          * shrink-wrap.h (use_shrink_wrapping_separate): add to extern
>>
>> gcc/testsuite/ChangeLog:
>>
>>          * gcc.target/riscv/zcmp_shrink_wrap_separate.c: New test.
>>          * gcc.target/riscv/zcmp_shrink_wrap_separate2.c: New test.
>I know Kito asked for this to be broken up into target dependent vs
>target independent changes, that's a good ask.
>
>Can't we utilize the get_separate_components hook to accomplish what
>you're trying to do?  ie, put the logic to avoid shrink wrapping for
>this case within the existing risc-v hook? 

Thanks, Jeff and Kito, for your comments.

My first attempt was to avoid shrink wrapping if zcmp is enabled.
But after discussion with Kito and Andrew Pinski, I realized it's better to disable
zcmp push and pop if shrink wrapping is active.
For the detailed discussion, please see the link below.
thread: [PATCH 1/2] [RISC-V] disable shrink-wrap-separate if zcmp enabled.
link: https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg307203.html
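
The resulting decision can be sketched roughly as below. This is a
simplified, self-contained model rather than the GCC code: struct
func_state and both *_model functions are hypothetical stand-ins for the
checks in use_shrink_wrapping_separate () and riscv_avoid_multi_push ()
from the patch, and the three example states only mirror the -Os / -O2
behaviour described in the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, flattened view of the state the real checks consult.  */
struct func_state
{
  bool shrink_wrap_separate_enabled; /* SHRINK_WRAPPING_ENABLED, the
                                        -fshrink-wrap-separate flag and
                                        the target hook being present */
  bool optimize_for_speed;           /* optimize_function_for_speed_p */
  bool strange_function;             /* alloca, eh_return, nonlocal goto... */
  bool target_avoids_separate;       /* save libcall, interrupt handler... */
  bool other_multi_push_blocker;     /* varargs, pretend args, bad mask... */
};

/* Model of use_shrink_wrapping_separate (): would the generic pass run?  */
static bool
use_shrink_wrapping_separate_model (const struct func_state *f)
{
  assert (f != NULL);
  return f->shrink_wrap_separate_enabled
         && f->optimize_for_speed
         && !f->strange_function;
}

/* Model of the new condition in riscv_avoid_multi_push (): give up on
   cm.push/cm.pop whenever separate shrink wrapping will really be used.  */
static bool
avoid_multi_push_model (const struct func_state *f)
{
  return f->other_multi_push_blocker
         || (use_shrink_wrapping_separate_model (f)
             && !f->target_avoids_separate);
}

/* Example states (assumed flag combinations, for illustration only).  */
static const struct func_state state_Os = { true, false, false, false, false };
static const struct func_state state_O2 = { true, true, false, false, false };
static const struct func_state state_O2_no_sws
  = { false, true, false, false, false };
```

In this model only the plain -O2 state ends up avoiding multi push/pop;
the size-optimized state and the -O2 -fno-shrink-wrap-separate state keep
cm.push available, which is what the two new tests check for.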

I will go ahead with Kito's advice if you're fine with the current solution.
Thanks.

BR, 
Fei


>
>jeff

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/4] [RISC-V] support cm.mva01s cm.mvsa01 in zcmp
  2023-06-07  5:52 ` [PATCH 4/4] [RISC-V] support cm.mva01s cm.mvsa01 in zcmp Fei Gao
@ 2023-07-13  8:18   ` Kito Cheng
  0 siblings, 0 replies; 17+ messages in thread
From: Kito Cheng @ 2023-07-13  8:18 UTC (permalink / raw)
  To: Fei Gao; +Cc: gcc-patches, palmer, jeffreyalaw, sinan.lin, jiawei, Die Li

LGTM, thanks. Just like the other zc* patches, I would like to defer this
until the binutils part has landed :)

On Wed, Jun 7, 2023 at 1:54 PM Fei Gao <gaofei@eswincomputing.com> wrote:
>
> From: Die Li <lidie@eswincomputing.com>
>
> Signed-off-by: Die Li <lidie@eswincomputing.com>
> Co-Authored-By: Fei Gao <gaofei@eswincomputing.com>
>
> gcc/ChangeLog:
>
>         * config/riscv/peephole.md: New pattern.
>         * config/riscv/predicates.md (a0a1_reg_operand): New predicate.
>         (zcmp_mv_sreg_operand): New predicate.
>         * config/riscv/riscv.md (A1_REGNUM): New constant.
>         * config/riscv/zc.md (*mva01s<X:mode>): New pattern.
>         (*mvsa01<X:mode>): New pattern.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.target/riscv/cm_mv_rv32.c: New test.
> ---
>  gcc/config/riscv/peephole.md                | 28 +++++++++++++++++++++
>  gcc/config/riscv/predicates.md              | 11 ++++++++
>  gcc/config/riscv/riscv.md                   |  1 +
>  gcc/config/riscv/zc.md                      | 22 ++++++++++++++++
>  gcc/testsuite/gcc.target/riscv/cm_mv_rv32.c | 21 ++++++++++++++++
>  5 files changed, 83 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/cm_mv_rv32.c
>
> diff --git a/gcc/config/riscv/peephole.md b/gcc/config/riscv/peephole.md
> index 67e7046d7e6..e8cb1ba4838 100644
> --- a/gcc/config/riscv/peephole.md
> +++ b/gcc/config/riscv/peephole.md
> @@ -94,3 +94,31 @@
>  {
>    th_mempair_order_operands (operands, true, SImode);
>  })
> +
> +;; ZCMP
> +(define_peephole2
> +  [(set (match_operand:X 0 "a0a1_reg_operand")
> +        (match_operand:X 1 "zcmp_mv_sreg_operand"))
> +   (set (match_operand:X 2 "a0a1_reg_operand")
> +        (match_operand:X 3 "zcmp_mv_sreg_operand"))]
> +  "TARGET_ZCMP
> +   && (REGNO (operands[2]) != REGNO (operands[0]))"
> +  [(parallel [(set (match_dup 0)
> +                   (match_dup 1))
> +              (set (match_dup 2)
> +                   (match_dup 3))])]
> +)
> +
> +(define_peephole2
> +  [(set (match_operand:X 0 "zcmp_mv_sreg_operand")
> +        (match_operand:X 1 "a0a1_reg_operand"))
> +   (set (match_operand:X 2 "zcmp_mv_sreg_operand")
> +        (match_operand:X 3 "a0a1_reg_operand"))]
> +  "TARGET_ZCMP
> +   && (REGNO (operands[0]) != REGNO (operands[2]))
> +   && (REGNO (operands[1]) != REGNO (operands[3]))"
> +  [(parallel [(set (match_dup 0)
> +                   (match_dup 1))
> +              (set (match_dup 2)
> +                   (match_dup 3))])]
> +)
> diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
> index a1b9367b997..6d5e8630cb5 100644
> --- a/gcc/config/riscv/predicates.md
> +++ b/gcc/config/riscv/predicates.md
> @@ -207,6 +207,17 @@
>    (and (match_code "const_int")
>         (match_test "riscv_zcmp_valid_stack_adj_bytes_p (INTVAL (op), 13)")))
>
> +;; ZCMP predicates
> +(define_predicate "a0a1_reg_operand"
> +  (and (match_operand 0 "register_operand")
> +       (match_test "IN_RANGE (REGNO (op), A0_REGNUM, A1_REGNUM)")))
> +
> +(define_predicate "zcmp_mv_sreg_operand"
> +  (and (match_operand 0 "register_operand")
> +       (match_test "TARGET_RVE ? IN_RANGE (REGNO (op), S0_REGNUM, S1_REGNUM)
> +                    : IN_RANGE (REGNO (op), S0_REGNUM, S1_REGNUM)
> +                    || IN_RANGE (REGNO (op), S2_REGNUM, S7_REGNUM)")))
> +
>  ;; Only use branch-on-bit sequences when the mask is not an ANDI immediate.
>  (define_predicate "branch_on_bit_operand"
>    (and (match_code "const_int")
> diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
> index 02802d2685d..25bc3e6ab4c 100644
> --- a/gcc/config/riscv/riscv.md
> +++ b/gcc/config/riscv/riscv.md
> @@ -121,6 +121,7 @@
>     (S0_REGNUM                  8)
>     (S1_REGNUM                  9)
>     (A0_REGNUM                  10)
> +   (A1_REGNUM                  11)
>     (S2_REGNUM                  18)
>     (S3_REGNUM                  19)
>     (S4_REGNUM                  20)
> diff --git a/gcc/config/riscv/zc.md b/gcc/config/riscv/zc.md
> index 217e115035b..bb4975cd333 100644
> --- a/gcc/config/riscv/zc.md
> +++ b/gcc/config/riscv/zc.md
> @@ -1433,3 +1433,25 @@
>    "TARGET_ZCMP"
>    "cm.push     {ra, s0-s11}, %0"
>  )
> +
> +;; ZCMP mv
> +(define_insn "*mva01s<X:mode>"
> +  [(set (match_operand:X 0 "a0a1_reg_operand" "=r")
> +        (match_operand:X 1 "zcmp_mv_sreg_operand" "r"))
> +   (set (match_operand:X 2 "a0a1_reg_operand" "=r")
> +        (match_operand:X 3 "zcmp_mv_sreg_operand" "r"))]
> +  "TARGET_ZCMP
> +   && (REGNO (operands[2]) != REGNO (operands[0]))"
> +  { return (REGNO (operands[0]) == A0_REGNUM)?"cm.mva01s\t%1,%3":"cm.mva01s\t%3,%1"; }
> +  [(set_attr "mode" "<X:MODE>")])
> +
> +(define_insn "*mvsa01<X:mode>"
> +  [(set (match_operand:X 0 "zcmp_mv_sreg_operand" "=r")
> +        (match_operand:X 1 "a0a1_reg_operand" "r"))
> +   (set (match_operand:X 2 "zcmp_mv_sreg_operand" "=r")
> +        (match_operand:X 3 "a0a1_reg_operand" "r"))]
> +  "TARGET_ZCMP
> +   && (REGNO (operands[0]) != REGNO (operands[2]))
> +   && (REGNO (operands[1]) != REGNO (operands[3]))"
> +  { return (REGNO (operands[1]) == A0_REGNUM)?"cm.mvsa01\t%0,%2":"cm.mvsa01\t%2,%0"; }
> +  [(set_attr "mode" "<X:MODE>")])
> diff --git a/gcc/testsuite/gcc.target/riscv/cm_mv_rv32.c b/gcc/testsuite/gcc.target/riscv/cm_mv_rv32.c
> new file mode 100644
> index 00000000000..49c94c01603
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/cm_mv_rv32.c
> @@ -0,0 +1,21 @@
> +/* { dg-do compile } */
> +/* { dg-options " -Os -march=rv32i_zca_zcmp -mabi=ilp32 " } */
> +/* { dg-skip-if "" { *-*-* } {"-O0" "-O1" "-O2" "-Og" "-O3" "-Oz" "-flto"} } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> +
> +int func (int a, int b);
> +
> +/*
> +**sum:
> +**     ...
> +**     cm.mvsa01       s1,s2
> +**     call    func
> +**     mv      s0,a0
> +**     cm.mva01s       s1,s2
> +**     call    func
> +**     ...
> +*/
> +int sum (int a, int b)
> +{
> +        return func (a, b) + func (a, b);
> +}
> --
> 2.17.1
>
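
As a reading aid for the *mva01s output template in the patch quoted above,
here is a small standalone sketch of the operand-order selection. The
function mva01s_text is hypothetical (it is not a GCC interface) and takes
register names as plain strings; it only mirrors the rule that the source
copied into a0 is printed first, whichever of the two sets names a0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

enum { A0_REGNUM = 10, A1_REGNUM = 11 };

/* Return the assembly text for the pair of moves
     dest0 <- src0, dest1 <- src1
   where dest0/dest1 must be a0 and a1 in either order, or NULL if they are
   not.  The result lives in a static buffer, so this toy helper is not
   reentrant.  */
static const char *
mva01s_text (int dest0, const char *src0, int dest1, const char *src1)
{
  static char buf[32];

  assert (src0 != NULL && src1 != NULL);

  bool dests_ok = dest0 != dest1
                  && (dest0 == A0_REGNUM || dest0 == A1_REGNUM)
                  && (dest1 == A0_REGNUM || dest1 == A1_REGNUM);
  if (!dests_ok)
    return NULL;

  /* The first printed operand is always the one that ends up in a0.  */
  if (dest0 == A0_REGNUM)
    snprintf (buf, sizeof buf, "cm.mva01s\t%s,%s", src0, src1);
  else
    snprintf (buf, sizeof buf, "cm.mva01s\t%s,%s", src1, src0);
  return buf;
}
```

Both orderings of the same two moves produce the same text, and a pair that
targets the same destination twice is rejected, which corresponds to the
REGNO inequality in the insn condition.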

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/4] [RISC-V] support cm.popretz in zcmp
  2023-06-07  5:52 ` [PATCH 2/4] [RISC-V] support cm.popretz " Fei Gao
@ 2023-07-13  8:31   ` Kito Cheng
  0 siblings, 0 replies; 17+ messages in thread
From: Kito Cheng @ 2023-07-13  8:31 UTC (permalink / raw)
  To: Fei Gao; +Cc: gcc-patches, palmer, jeffreyalaw, sinan.lin, jiawei

I was wondering whether it would be possible to use peephole2 to optimize
this case, but I realized there are several barriers, like the stack tie and
notes, so it seems hard to just leverage peephole2.

The patch LGTM, with only a few minor coding format issues. You don't need
to send a new patch; I can fix those when I push. I would strongly suggest
you set up git-format-patch: <gcc-src>/contrib has a clang-format
configuration that can spare you the boring coding format issues.

# Copy to <gcc-src>/.clang-format, so that clang-format can find it
# automatically.
$ cp contrib/clang-format .clang-format


> @@ -5747,6 +5748,80 @@ riscv_adjust_libcall_cfi_epilogue ()
>    return dwarf;
>  }
>
> +/* return true if popretz pattern can be matched.
> +   set (reg 10 a0) (const_int 0)
> +   use (reg 10 a0)
> +   NOTE_INSN_EPILOGUE_BEG  */
> +static rtx_insn *
> +riscv_zcmp_can_use_popretz(void)

Need space between function name and (void)

> +{
> +  rtx_insn *insn = NULL, *use = NULL, *clear = NULL;
> +
> +  /* sequence stack for NOTE_INSN_EPILOGUE_BEG*/
> +  struct sequence_stack * outer_seq = get_current_sequence ()->next;
> +  if (!outer_seq)
> +    return NULL;
> +  insn = outer_seq->first;
> +  if(!insn || !NOTE_P (insn) || NOTE_KIND (insn) != NOTE_INSN_EPILOGUE_BEG)
> +    return NULL;
> +
> +  /* sequence stack for the insn before NOTE_INSN_EPILOGUE_BEG*/
> +  outer_seq = outer_seq->next;
> +  if (outer_seq)
> +    insn = outer_seq->last;
> +
> +  /* skip notes  */
> +  while (insn && NOTE_P (insn))
> +    {
> +      insn = PREV_INSN (insn);
> +    }
> +  use = insn;
> +
> +  /* match use (reg 10 a0)  */
> +  if (use == NULL || !INSN_P (use)
> +      || GET_CODE (PATTERN (use)) != USE
> +      || !REG_P(XEXP(PATTERN (use), 0))
> +      || REGNO(XEXP(PATTERN (use), 0)) != A0_REGNUM)
> +    return NULL;
> +
> +  /* match set (reg 10 a0) (const_int 0 [0])  */
> +  clear = PREV_INSN (use);
> +  if (clear != NULL && INSN_P (clear)
> +      && GET_CODE (PATTERN (clear)) == SET
> +      && REG_P (SET_DEST (PATTERN (clear)))
> +      && REGNO (SET_DEST (PATTERN (clear))) == A0_REGNUM
> +      && SET_SRC (PATTERN (clear)) == const0_rtx)
> +    return clear;
> +
> +  return NULL;
> +}
> +
> +static void
> +riscv_gen_multi_pop_insn(bool use_multi_pop_normal, unsigned mask,
> +                         unsigned multipop_size)

Same issue here: need a space between the function name and the argument list.
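
For readers following the review, the matching done by riscv_zcmp_can_use_popretz in the quoted hunk can be modelled outside GCC. The sketch below is a toy, not the patch's code: struct toy_insn, enum toy_kind, TOY_A0_REGNUM and toy_find_a0_clear are hypothetical stand-ins for rtx_insn, the RTL codes, A0_REGNUM and the real function, and the sequence-stack handling around NOTE_INSN_EPILOGUE_BEG is left out. It only shows the backward walk: skip trailing notes, require a use of a0, then require that the insn immediately before it sets a0 to 0.

```c
#include <stddef.h>

/* Toy stand-ins for RTL insn kinds (hypothetical, not GCC types).  */
enum toy_kind { TOY_NOTE, TOY_USE, TOY_SET_CONST, TOY_OTHER };

struct toy_insn
{
  enum toy_kind kind;
  int regno;              /* register used or set; ignored for notes */
  long value;             /* constant stored by TOY_SET_CONST */
  struct toy_insn *prev;  /* previous insn in the stream, or NULL */
};

#define TOY_A0_REGNUM 10  /* a0 is x10 on RISC-V */

/* Return the insn that clears a0 if the stream ending at LAST looks like
   "set a0, 0; use a0; <notes>", otherwise return NULL.  As in the quoted
   hunk, notes are skipped only before the use, not between the use and
   the set.  */
static struct toy_insn *
toy_find_a0_clear (struct toy_insn *last)
{
  struct toy_insn *insn = last;

  /* Skip trailing notes.  */
  while (insn && insn->kind == TOY_NOTE)
    insn = insn->prev;

  /* Match "use a0".  */
  if (!insn || insn->kind != TOY_USE || insn->regno != TOY_A0_REGNUM)
    return NULL;

  /* Match "set a0, 0" directly before the use.  */
  struct toy_insn *clear = insn->prev;
  if (clear && clear->kind == TOY_SET_CONST
      && clear->regno == TOY_A0_REGNUM && clear->value == 0)
    return clear;

  return NULL;
}
```

Returning the clearing insn rather than a boolean mirrors the quoted function, which hands back the matched insn (or NULL) to its caller.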

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/4][V4][RISC-V] support cm.push cm.pop cm.popret in zcmp
  2023-06-07  5:52 ` [PATCH 1/4][V4][RISC-V] support cm.push cm.pop cm.popret in zcmp Fei Gao
  2023-06-07 10:11   ` jiawei
@ 2023-08-16  8:33   ` Kito Cheng
  2023-08-16  8:38     ` Kito Cheng
  2023-08-17 11:39     ` Fei Gao
  1 sibling, 2 replies; 17+ messages in thread
From: Kito Cheng @ 2023-08-16  8:33 UTC (permalink / raw)
  To: Fei Gao; +Cc: gcc-patches, palmer, jeffreyalaw, sinan.lin, jiawei

Hi Fei:

I tried testing this patch together with Jiawei's patch and found some issues:


> @@ -5430,13 +5632,15 @@ riscv_expand_prologue (void)
>    /* Save the registers.  */
>    if ((frame->mask | frame->fmask) != 0)
>      {
> -      HOST_WIDE_INT step1 = riscv_first_stack_step (frame, remaining_size);
> -
> -      insn = gen_add3_insn (stack_pointer_rtx,
> -                           stack_pointer_rtx,
> -                           GEN_INT (-step1));
> -      RTX_FRAME_RELATED_P (emit_insn (insn)) = 1;
> -      remaining_size -= step1;
> +      if (known_gt (remaining_size, frame->frame_pointer_offset))
> +        {
> +          HOST_WIDE_INT step1 = riscv_first_stack_step (frame, remaining_size);
> +          remaining_size -= step1;
> +          insn = gen_add3_insn (stack_pointer_rtx,
> +                                stack_pointer_rtx,
> +                                GEN_INT (-step1));
> +          RTX_FRAME_RELATED_P (emit_insn (insn)) = 1;
> +        }
>        riscv_for_each_saved_reg (remaining_size, riscv_save_reg, false, false);
>      }
>

I hit an issue here while building libgcc. I used
riscv-gnu-toolchain with --with-arch=rv64gzca_zcmp.

And the error message is:

In file included from
../../../../../riscv-gnu-toolchain-trunk/gcc/libgcc/unwind-dw2.c:1471:
../../../../../riscv-gnu-toolchain-trunk/gcc/libgcc/unwind.inc: In
function '_Unwind_Backtrace':
../../../../../riscv-gnu-toolchain-trunk/gcc/libgcc/unwind.inc:330:1:
internal compiler error: in gen_reg_rtx, at emit-rtl.cc:1176
 330 | }
     | ^
0x83753a gen_reg_rtx(machine_mode)
       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/emit-rtl.cc:1176
0xf5566f maybe_legitimize_operand
       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/optabs.cc:8047
0xf5566f maybe_legitimize_operands(insn_code, unsigned int, unsigned
int, expand_operand*)
       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/optabs.cc:8191
0xf511d9 maybe_gen_insn(insn_code, unsigned int, expand_operand*)
       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/optabs.cc:8210
0xf58539 expand_binop_directly
       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/optabs.cc:1452
0xf56666 expand_binop(machine_mode, optab_tag, rtx_def*, rtx_def*,
rtx_def*, int, optab_methods)
       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/optabs.cc:1539
0xcbfdd0 force_operand(rtx_def*, rtx_def*)
       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/expr.cc:8231
0xc8fca1 force_reg(machine_mode, rtx_def*)
       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/explow.cc:687
0x144b8cd riscv_force_temporary
       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:1531
0x144b8cd riscv_force_address
       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:1528
0x144b8cd riscv_legitimize_move(machine_mode, rtx_def*, rtx_def*)
       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:2387
0x1af063e gen_movdf(rtx_def*, rtx_def*)
       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.md:2107
0xcba503 rtx_insn* insn_gen_fn::operator()<rtx_def*,
rtx_def*>(rtx_def*, rtx_def*) const
       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/recog.h:411
0xcba503 emit_move_insn_1(rtx_def*, rtx_def*)
       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/expr.cc:4164
0x143d6c4 riscv_emit_move(rtx_def*, rtx_def*)
       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:1486
0x143d6c4 riscv_save_reg
       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:5715
0x143e2b9 riscv_for_each_saved_reg
       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:5904
0x14480d0 riscv_expand_prologue()
       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:6156
0x1af57fb gen_prologue()
       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.md:2816
0x143c746 target_gen_prologue
       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.md:3302


Reduced case:

$ riscv64-unknown-elf-gcc -march=rv64imafd_zicsr_zifencei_zca_zcmp
-mabi=lp64d  unwind-dw2.i -Os

typedef struct {
 struct {
   struct {
     struct {
       long a
     }
   } a[129]
 }
} b;
struct c {
 void *a[129]
} d() {
 struct c a;
 __builtin_unwind_init();
 b e;
 f(a, &e);
}

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/4][V4][RISC-V] support cm.push cm.pop cm.popret in zcmp
  2023-08-16  8:33   ` Kito Cheng
@ 2023-08-16  8:38     ` Kito Cheng
  2023-08-16  9:03       ` Fei Gao
  2023-08-20 10:53       ` Fei Gao
  2023-08-17 11:39     ` Fei Gao
  1 sibling, 2 replies; 17+ messages in thread
From: Kito Cheng @ 2023-08-16  8:38 UTC (permalink / raw)
  To: Fei Gao; +Cc: gcc-patches, palmer, jeffreyalaw, sinan.lin, jiawei

Another failing case, this one for CFI:

$ riscv64-unknown-elf-gcc _mulhc3.i
-march=rv64imafd_zicsr_zifencei_zca_zcmp -mabi=lp64d -g  -O2  -o
_mulhc3.s

typedef float a __attribute__((mode(HF)));
b, c;
f() {
 a a, d, e = a + d;
 if (g() && e)
   c = b;
}


0x10e508a maybe_record_trace_start
       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/dwarf2cfi.cc:2584
0x10e58fb scan_trace
       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/dwarf2cfi.cc:2784
0x10e5fab create_cfi_notes
       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/dwarf2cfi.cc:2938
0x10e6ee4 execute_dwarf2_frame
       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/dwarf2cfi.cc:3309
0x10e7c5a execute
       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/dwarf2cfi.cc:3797

On Wed, Aug 16, 2023 at 4:33 PM Kito Cheng <kito.cheng@gmail.com> wrote:
>
> Hi Fei:
>
> Tried to use Jiawei's patch to test this patch and found some issue:
>
>
> > @@ -5430,13 +5632,15 @@ riscv_expand_prologue (void)
> >    /* Save the registers.  */
> >    if ((frame->mask | frame->fmask) != 0)
> >      {
> > -      HOST_WIDE_INT step1 = riscv_first_stack_step (frame, remaining_size);
> > -
> > -      insn = gen_add3_insn (stack_pointer_rtx,
> > -                           stack_pointer_rtx,
> > -                           GEN_INT (-step1));
> > -      RTX_FRAME_RELATED_P (emit_insn (insn)) = 1;
> > -      remaining_size -= step1;
> > +      if (known_gt (remaining_size, frame->frame_pointer_offset))
> > +        {
> > +          HOST_WIDE_INT step1 = riscv_first_stack_step (frame, remaining_size);
> > +          remaining_size -= step1;
> > +          insn = gen_add3_insn (stack_pointer_rtx,
> > +                                stack_pointer_rtx,
> > +                                GEN_INT (-step1));
> > +          RTX_FRAME_RELATED_P (emit_insn (insn)) = 1;
> > +        }
> >        riscv_for_each_saved_reg (remaining_size, riscv_save_reg, false, false);
> >      }
> >
>
> I hit some issue here during building libgcc, I use
> riscv-gnu-toolchain with --with-arch=rv64gzca_zcmp
>
> And the error message is:
>
> In file included from
> ../../../../../riscv-gnu-toolchain-trunk/gcc/libgcc/unwind-dw2.c:1471:
> ../../../../../riscv-gnu-toolchain-trunk/gcc/libgcc/unwind.inc: In
> function '_Unwind_Backtrace':
> ../../../../../riscv-gnu-toolchain-trunk/gcc/libgcc/unwind.inc:330:1:
> internal compiler error: in gen_reg_rtx, at emit-rtl.cc:1176
>  330 | }
>      | ^
> 0x83753a gen_reg_rtx(machine_mode)
>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/emit-rtl.cc:1176
> 0xf5566f maybe_legitimize_operand
>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/optabs.cc:8047
> 0xf5566f maybe_legitimize_operands(insn_code, unsigned int, unsigned
> int, expand_operand*)
>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/optabs.cc:8191
> 0xf511d9 maybe_gen_insn(insn_code, unsigned int, expand_operand*)
>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/optabs.cc:8210
> 0xf58539 expand_binop_directly
>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/optabs.cc:1452
> 0xf56666 expand_binop(machine_mode, optab_tag, rtx_def*, rtx_def*,
> rtx_def*, int, optab_methods)
>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/optabs.cc:1539
> 0xcbfdd0 force_operand(rtx_def*, rtx_def*)
>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/expr.cc:8231
> 0xc8fca1 force_reg(machine_mode, rtx_def*)
>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/explow.cc:687
> 0x144b8cd riscv_force_temporary
>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:1531
> 0x144b8cd riscv_force_address
>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:1528
> 0x144b8cd riscv_legitimize_move(machine_mode, rtx_def*, rtx_def*)
>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:2387
> 0x1af063e gen_movdf(rtx_def*, rtx_def*)
>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.md:2107
> 0xcba503 rtx_insn* insn_gen_fn::operator()<rtx_def*,
> rtx_def*>(rtx_def*, rtx_def*) const
>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/recog.h:411
> 0xcba503 emit_move_insn_1(rtx_def*, rtx_def*)
>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/expr.cc:4164
> 0x143d6c4 riscv_emit_move(rtx_def*, rtx_def*)
>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:1486
> 0x143d6c4 riscv_save_reg
>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:5715
> 0x143e2b9 riscv_for_each_saved_reg
>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:5904
> 0x14480d0 riscv_expand_prologue()
>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:6156
> 0x1af57fb gen_prologue()
>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.md:2816
> 0x143c746 target_gen_prologue
>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.md:3302
>
>
> Reduced case:
>
> $ riscv64-unknown-elf-gcc -march=rv64imafd_zicsr_zifencei_zca_zcmp
> -mabi=lp64d  unwind-dw2.i -Os
>
> typedef struct {
>  struct {
>    struct {
>      struct {
>        long a
>      }
>    } a[129]
>  }
> } b;
> struct c {
>  void *a[129]
> } d() {
>  struct c a;
>  __builtin_unwind_init();
>  b e;
>  f(a, &e);
> }

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Re: [PATCH 1/4][V4][RISC-V] support cm.push cm.pop cm.popret in zcmp
  2023-08-16  8:38     ` Kito Cheng
@ 2023-08-16  9:03       ` Fei Gao
  2023-08-20 10:53       ` Fei Gao
  1 sibling, 0 replies; 17+ messages in thread
From: Fei Gao @ 2023-08-16  9:03 UTC (permalink / raw)
  To: Kito Cheng; +Cc: gcc-patches, Palmer Dabbelt, jeffreyalaw, Sinan, jiawei

Hi Kito

Thanks for reporting these 2 issues.
Let me check and get back to you soon.

BR
Fei

On 2023-08-16 16:38  Kito Cheng <kito.cheng@gmail.com> wrote:
>
>Another fail case for CFI:
>
>$ riscv64-unknown-elf-gcc _mulhc3.i
>-march=rv64imafd_zicsr_zifencei_zca_zcmp -mabi=lp64d -g  -O2  -o
>_mulhc3.s
>
>typedef float a __attribute__((mode(HF)));
>b, c;
>f() {
> a a, d, e = a + d;
> if (g() && e)
>   c = b;
>}
>
>
>0x10e508a maybe_record_trace_start
>       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/dwarf2cfi.cc:2584
>0x10e58fb scan_trace
>       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/dwarf2cfi.cc:2784
>0x10e5fab create_cfi_notes
>       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/dwarf2cfi.cc:2938
>0x10e6ee4 execute_dwarf2_frame
>       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/dwarf2cfi.cc:3309
>0x10e7c5a execute
>       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/dwarf2cfi.cc:3797
>
>On Wed, Aug 16, 2023 at 4:33 PM Kito Cheng <kito.cheng@gmail.com> wrote:
>>
>> Hi Fei:
>>
>> Tried to use Jiawei's patch to test this patch and found some issue:
>>
>>
>> > @@ -5430,13 +5632,15 @@ riscv_expand_prologue (void)
>> >    /* Save the registers.  */
>> >    if ((frame->mask | frame->fmask) != 0)
>> >      {
>> > -      HOST_WIDE_INT step1 = riscv_first_stack_step (frame, remaining_size);
>> > -
>> > -      insn = gen_add3_insn (stack_pointer_rtx,
>> > -                           stack_pointer_rtx,
>> > -                           GEN_INT (-step1));
>> > -      RTX_FRAME_RELATED_P (emit_insn (insn)) = 1;
>> > -      remaining_size -= step1;
>> > +      if (known_gt (remaining_size, frame->frame_pointer_offset))
>> > +        {
>> > +          HOST_WIDE_INT step1 = riscv_first_stack_step (frame, remaining_size);
>> > +          remaining_size -= step1;
>> > +          insn = gen_add3_insn (stack_pointer_rtx,
>> > +                                stack_pointer_rtx,
>> > +                                GEN_INT (-step1));
>> > +          RTX_FRAME_RELATED_P (emit_insn (insn)) = 1;
>> > +        }
>> >        riscv_for_each_saved_reg (remaining_size, riscv_save_reg, false, false);
>> >      }
>> >
>>
>> I hit some issue here during building libgcc, I use
>> riscv-gnu-toolchain with --with-arch=rv64gzca_zcmp
>>
>> And the error message is:
>>
>> In file included from
>> ../../../../../riscv-gnu-toolchain-trunk/gcc/libgcc/unwind-dw2.c:1471:
>> ../../../../../riscv-gnu-toolchain-trunk/gcc/libgcc/unwind.inc: In
>> function '_Unwind_Backtrace':
>> ../../../../../riscv-gnu-toolchain-trunk/gcc/libgcc/unwind.inc:330:1:
>> internal compiler error: in gen_reg_rtx, at emit-rtl.cc:1176
>>  330 | }
>>      | ^
>> 0x83753a gen_reg_rtx(machine_mode)
>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/emit-rtl.cc:1176
>> 0xf5566f maybe_legitimize_operand
>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/optabs.cc:8047
>> 0xf5566f maybe_legitimize_operands(insn_code, unsigned int, unsigned
>> int, expand_operand*)
>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/optabs.cc:8191
>> 0xf511d9 maybe_gen_insn(insn_code, unsigned int, expand_operand*)
>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/optabs.cc:8210
>> 0xf58539 expand_binop_directly
>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/optabs.cc:1452
>> 0xf56666 expand_binop(machine_mode, optab_tag, rtx_def*, rtx_def*,
>> rtx_def*, int, optab_methods)
>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/optabs.cc:1539
>> 0xcbfdd0 force_operand(rtx_def*, rtx_def*)
>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/expr.cc:8231
>> 0xc8fca1 force_reg(machine_mode, rtx_def*)
>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/explow.cc:687
>> 0x144b8cd riscv_force_temporary
>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:1531
>> 0x144b8cd riscv_force_address
>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:1528
>> 0x144b8cd riscv_legitimize_move(machine_mode, rtx_def*, rtx_def*)
>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:2387
>> 0x1af063e gen_movdf(rtx_def*, rtx_def*)
>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.md:2107
>> 0xcba503 rtx_insn* insn_gen_fn::operator()<rtx_def*,
>> rtx_def*>(rtx_def*, rtx_def*) const
>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/recog.h:411
>> 0xcba503 emit_move_insn_1(rtx_def*, rtx_def*)
>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/expr.cc:4164
>> 0x143d6c4 riscv_emit_move(rtx_def*, rtx_def*)
>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:1486
>> 0x143d6c4 riscv_save_reg
>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:5715
>> 0x143e2b9 riscv_for_each_saved_reg
>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:5904
>> 0x14480d0 riscv_expand_prologue()
>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:6156
>> 0x1af57fb gen_prologue()
>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.md:2816
>> 0x143c746 target_gen_prologue
>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.md:3302
>>
>>
>> Reduced case:
>>
>> $ riscv64-unknown-elf-gcc -march=rv64imafd_zicsr_zifencei_zca_zcmp
>> -mabi=lp64d  unwind-dw2.i -Os
>>
>> typedef struct {
>>  struct {
>>    struct {
>>      struct {
>>        long a
>>      }
>>    } a[129]
>>  }
>> } b;
>> struct c {
>>  void *a[129]
>> } d() {
>>  struct c a;
>>  __builtin_unwind_init();
>>  b e;
>>  f(a, &e);
>> }

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Re: [PATCH 1/4][V4][RISC-V] support cm.push cm.pop cm.popret in zcmp
  2023-08-16  8:33   ` Kito Cheng
  2023-08-16  8:38     ` Kito Cheng
@ 2023-08-17 11:39     ` Fei Gao
  1 sibling, 0 replies; 17+ messages in thread
From: Fei Gao @ 2023-08-17 11:39 UTC (permalink / raw)
  To: Kito Cheng
  Cc: gcc-patches, Palmer Dabbelt, jeffreyalaw, Sinan, jiawei,
	eri-sw-toolchain

Hi Kito

The root cause has been identified.

Here's the frame layout for the test case (please view it in a monospaced font such as Courier :)
	+-------------------------------+ 
	|                               | 
	|  GPR save area  112 B         | 
	|                               |
	+-------------------------------+ 
	|                               |<-- fs0 is beyond sp based 12-bit range 
	|  FPR save area  96 B          |
	|                               |
	+-------------------------------+ 
	|                               |
	|  local variables              |<-- stack_pointer_rtx after riscv_first_stack_step
	|                               |
	+-------------------------------+ 

During stack frame allocation:
1. cm.push reserves 160 bytes: 112 for ra and the s-registers, aligned to 128 bits as per the ABI, plus an additional 48 bytes for the first 6 FPRs.
2. riscv_first_stack_step reserves 2032 bytes for the remaining 6 FPRs and the local variables.
3. riscv_for_each_saved_reg tries to save fs0, which is beyond the sp-based 12-bit range,
    so force_reg needs a new pseudo and trips gcc_assert (can_create_pseudo_p ()) in gen_reg_rtx, because reload has already completed.
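
The arithmetic behind step 3 can be checked with a small standalone sketch. It is an illustration only, not code from the patch. The sizes 112, 160 and 2032 come from the description above; the 8-byte FPR slot follows from 96 bytes holding 12 FPRs; placing fs0 in the highest FPR slot, directly below the GPR save area, is an assumption read off the diagram. The helper names fits_simm12 and fs0_offset_from_sp are hypothetical.

```c
#include <stdbool.h>

enum
{
  GPR_SAVE_AREA = 112,   /* ra and s-registers, from the layout above */
  FPR_SLOT = 8,          /* 96 bytes / 12 FPRs */
  CM_PUSH_ADJUST = 160,  /* stack adjustment done by cm.push */
  FIRST_STEP = 2032      /* adjustment from riscv_first_stack_step */
};

/* True if OFFSET fits the signed 12-bit immediate of a RISC-V
   load/store, i.e. lies in [-2048, 2047].  */
static bool
fits_simm12 (long offset)
{
  return offset >= -2048 && offset <= 2047;
}

/* Offset of the fs0 slot from sp once sp has dropped SP_DROP bytes below
   its value at function entry.  Assumes fs0 occupies the highest FPR
   slot, which is this sketch's reading of the diagram.  */
static long
fs0_offset_from_sp (long sp_drop)
{
  long fs0_below_entry_sp = GPR_SAVE_AREA + FPR_SLOT;  /* 120 */
  return sp_drop - fs0_below_entry_sp;
}
```

Under these assumptions, fs0 sits at sp + 40 right after cm.push (160 - 120), which is easily addressable, but at sp + 2072 once the further 2032 bytes are reserved (2192 - 120), which is past the 2047 limit. That is consistent with saving the first FPRs immediately after cm.push, before the second adjustment.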

I tried a solution that saves the first 6 FPRs immediately after cm.push, and it seems to work :)
I will fix the epilogue correspondingly as well.

Thanks again for your testing.

BR, 
Fei

On 2023-08-16 16:33  Kito Cheng <kito.cheng@gmail.com> wrote:
>
>Hi Fei:
>
>Tried to use Jiawei's patch to test this patch and found some issue:
>
>
>> @@ -5430,13 +5632,15 @@ riscv_expand_prologue (void)
>>    /* Save the registers.  */
>>    if ((frame->mask | frame->fmask) != 0)
>>      {
>> -      HOST_WIDE_INT step1 = riscv_first_stack_step (frame, remaining_size);
>> -
>> -      insn = gen_add3_insn (stack_pointer_rtx,
>> -                           stack_pointer_rtx,
>> -                           GEN_INT (-step1));
>> -      RTX_FRAME_RELATED_P (emit_insn (insn)) = 1;
>> -      remaining_size -= step1;
>> +      if (known_gt (remaining_size, frame->frame_pointer_offset))
>> +        {
>> +          HOST_WIDE_INT step1 = riscv_first_stack_step (frame, remaining_size);
>> +          remaining_size -= step1;
>> +          insn = gen_add3_insn (stack_pointer_rtx,
>> +                                stack_pointer_rtx,
>> +                                GEN_INT (-step1));
>> +          RTX_FRAME_RELATED_P (emit_insn (insn)) = 1;
>> +        }
>>        riscv_for_each_saved_reg (remaining_size, riscv_save_reg, false, false);
>>      }
>>
>
>I hit some issue here during building libgcc, I use
>riscv-gnu-toolchain with --with-arch=rv64gzca_zcmp
>
>And the error message is:
>
>In file included from
>../../../../../riscv-gnu-toolchain-trunk/gcc/libgcc/unwind-dw2.c:1471:
>../../../../../riscv-gnu-toolchain-trunk/gcc/libgcc/unwind.inc: In
>function '_Unwind_Backtrace':
>../../../../../riscv-gnu-toolchain-trunk/gcc/libgcc/unwind.inc:330:1:
>internal compiler error: in gen_reg_rtx, at emit-rtl.cc:1176
> 330 | }
>     | ^
>0x83753a gen_reg_rtx(machine_mode)
>       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/emit-rtl.cc:1176
>0xf5566f maybe_legitimize_operand
>       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/optabs.cc:8047
>0xf5566f maybe_legitimize_operands(insn_code, unsigned int, unsigned
>int, expand_operand*)
>       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/optabs.cc:8191
>0xf511d9 maybe_gen_insn(insn_code, unsigned int, expand_operand*)
>       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/optabs.cc:8210
>0xf58539 expand_binop_directly
>       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/optabs.cc:1452
>0xf56666 expand_binop(machine_mode, optab_tag, rtx_def*, rtx_def*,
>rtx_def*, int, optab_methods)
>       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/optabs.cc:1539
>0xcbfdd0 force_operand(rtx_def*, rtx_def*)
>       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/expr.cc:8231
>0xc8fca1 force_reg(machine_mode, rtx_def*)
>       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/explow.cc:687
>0x144b8cd riscv_force_temporary
>       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:1531
>0x144b8cd riscv_force_address
>       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:1528
>0x144b8cd riscv_legitimize_move(machine_mode, rtx_def*, rtx_def*)
>       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:2387
>0x1af063e gen_movdf(rtx_def*, rtx_def*)
>       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.md:2107
>0xcba503 rtx_insn* insn_gen_fn::operator()<rtx_def*,
>rtx_def*>(rtx_def*, rtx_def*) const
>       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/recog.h:411
>0xcba503 emit_move_insn_1(rtx_def*, rtx_def*)
>       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/expr.cc:4164
>0x143d6c4 riscv_emit_move(rtx_def*, rtx_def*)
>       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:1486
>0x143d6c4 riscv_save_reg
>       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:5715
>0x143e2b9 riscv_for_each_saved_reg
>       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:5904
>0x14480d0 riscv_expand_prologue()
>       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:6156
>0x1af57fb gen_prologue()
>       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.md:2816
>0x143c746 target_gen_prologue
>       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.md:3302
>
>
>Reduced case:
>
>$ riscv64-unknown-elf-gcc -march=rv64imafd_zicsr_zifencei_zca_zcmp
>-mabi=lp64d  unwind-dw2.i -Os
>
>typedef struct {
> struct {
>   struct {
>     struct {
>       long a
>     }
>   } a[129]
> }
>} b;
>struct c {
> void *a[129]
>} d() {
> struct c a;
> __builtin_unwind_init();
> b e;
> f(a, &e);
>}

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Re: [PATCH 1/4][V4][RISC-V] support cm.push cm.pop cm.popret in zcmp
  2023-08-16  8:38     ` Kito Cheng
  2023-08-16  9:03       ` Fei Gao
@ 2023-08-20 10:53       ` Fei Gao
  2023-08-28  8:04         ` Fei Gao
  1 sibling, 1 reply; 17+ messages in thread
From: Fei Gao @ 2023-08-20 10:53 UTC (permalink / raw)
  To: Kito Cheng
  Cc: gcc-patches, Palmer Dabbelt, jeffreyalaw, Sinan, jiawei,
	eri-sw-toolchain


Hi Kito

This issue is due to a conflict between zcmp and shrink-wrap-separate,
which is addressed by a patch that is currently under review:
[PATCH 0/2] resolve confilct between RISC-V zcmp and shrink-wrap-separate
https://patchwork.sourceware.org/project/gcc/list/?series=21577
https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg311487.html

I'm preparing [PATCH 1/4][V5][RISC-V] support cm.push cm.pop cm.popret in zcmp to fix the 1st issue you caught.
Please let me know if you want me to merge
https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg311486.html
into [PATCH 1/4][V5][RISC-V].

BR, 
Fei
On 2023-08-16 16:38  Kito Cheng <kito.cheng@gmail.com> wrote:
>
>Another fail case for CFI:
>
>$ riscv64-unknown-elf-gcc _mulhc3.i
>-march=rv64imafd_zicsr_zifencei_zca_zcmp -mabi=lp64d -g  -O2  -o
>_mulhc3.s
>
>typedef float a __attribute__((mode(HF)));
>b, c;
>f() {
> a a, d, e = a + d;
> if (g() && e)
>   c = b;
>}
>
>
>0x10e508a maybe_record_trace_start
>       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/dwarf2cfi.cc:2584
>0x10e58fb scan_trace
>       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/dwarf2cfi.cc:2784
>0x10e5fab create_cfi_notes
>       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/dwarf2cfi.cc:2938
>0x10e6ee4 execute_dwarf2_frame
>       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/dwarf2cfi.cc:3309
>0x10e7c5a execute
>       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/dwarf2cfi.cc:3797
>
>On Wed, Aug 16, 2023 at 4:33 PM Kito Cheng <kito.cheng@gmail.com> wrote:
>>
>> Hi Fei:
>>
>> Tried to use Jiawei's patch to test this patch and found some issue:
>>
>>
>> > @@ -5430,13 +5632,15 @@ riscv_expand_prologue (void)
>> >    /* Save the registers.  */
>> >    if ((frame->mask | frame->fmask) != 0)
>> >      {
>> > -      HOST_WIDE_INT step1 = riscv_first_stack_step (frame, remaining_size);
>> > -
>> > -      insn = gen_add3_insn (stack_pointer_rtx,
>> > -                           stack_pointer_rtx,
>> > -                           GEN_INT (-step1));
>> > -      RTX_FRAME_RELATED_P (emit_insn (insn)) = 1;
>> > -      remaining_size -= step1;
>> > +      if (known_gt (remaining_size, frame->frame_pointer_offset))
>> > +        {
>> > +          HOST_WIDE_INT step1 = riscv_first_stack_step (frame, remaining_size);
>> > +          remaining_size -= step1;
>> > +          insn = gen_add3_insn (stack_pointer_rtx,
>> > +                                stack_pointer_rtx,
>> > +                                GEN_INT (-step1));
>> > +          RTX_FRAME_RELATED_P (emit_insn (insn)) = 1;
>> > +        }
>> >        riscv_for_each_saved_reg (remaining_size, riscv_save_reg, false, false);
>> >      }
>> >
>>
>> I hit some issue here during building libgcc, I use
>> riscv-gnu-toolchain with --with-arch=rv64gzca_zcmp
>>
>> And the error message is:
>>
>> In file included from
>> ../../../../../riscv-gnu-toolchain-trunk/gcc/libgcc/unwind-dw2.c:1471:
>> ../../../../../riscv-gnu-toolchain-trunk/gcc/libgcc/unwind.inc: In
>> function '_Unwind_Backtrace':
>> ../../../../../riscv-gnu-toolchain-trunk/gcc/libgcc/unwind.inc:330:1:
>> internal compiler error: in gen_reg_rtx, at emit-rtl.cc:1176
>>  330 | }
>>      | ^
>> 0x83753a gen_reg_rtx(machine_mode)
>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/emit-rtl.cc:1176
>> 0xf5566f maybe_legitimize_operand
>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/optabs.cc:8047
>> 0xf5566f maybe_legitimize_operands(insn_code, unsigned int, unsigned
>> int, expand_operand*)
>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/optabs.cc:8191
>> 0xf511d9 maybe_gen_insn(insn_code, unsigned int, expand_operand*)
>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/optabs.cc:8210
>> 0xf58539 expand_binop_directly
>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/optabs.cc:1452
>> 0xf56666 expand_binop(machine_mode, optab_tag, rtx_def*, rtx_def*,
>> rtx_def*, int, optab_methods)
>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/optabs.cc:1539
>> 0xcbfdd0 force_operand(rtx_def*, rtx_def*)
>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/expr.cc:8231
>> 0xc8fca1 force_reg(machine_mode, rtx_def*)
>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/explow.cc:687
>> 0x144b8cd riscv_force_temporary
>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:1531
>> 0x144b8cd riscv_force_address
>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:1528
>> 0x144b8cd riscv_legitimize_move(machine_mode, rtx_def*, rtx_def*)
>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:2387
>> 0x1af063e gen_movdf(rtx_def*, rtx_def*)
>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.md:2107
>> 0xcba503 rtx_insn* insn_gen_fn::operator()<rtx_def*,
>> rtx_def*>(rtx_def*, rtx_def*) const
>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/recog.h:411
>> 0xcba503 emit_move_insn_1(rtx_def*, rtx_def*)
>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/expr.cc:4164
>> 0x143d6c4 riscv_emit_move(rtx_def*, rtx_def*)
>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:1486
>> 0x143d6c4 riscv_save_reg
>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:5715
>> 0x143e2b9 riscv_for_each_saved_reg
>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:5904
>> 0x14480d0 riscv_expand_prologue()
>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:6156
>> 0x1af57fb gen_prologue()
>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.md:2816
>> 0x143c746 target_gen_prologue
>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.md:3302
>>
>>
>> Reduced case:
>>
>> $ riscv64-unknown-elf-gcc -march=rv64imafd_zicsr_zifencei_zca_zcmp
>> -mabi=lp64d  unwind-dw2.i -Os
>>
>> typedef struct {
>>  struct {
>>    struct {
>>      struct {
>>        long a
>>      }
>>    } a[129]
>>  }
>> } b;
>> struct c {
>>  void *a[129]
>> } d() {
>>  struct c a;
>>  __builtin_unwind_init();
>>  b e;
>>  f(a, &e);
>> }

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Re: [PATCH 1/4][V4][RISC-V] support cm.push cm.pop cm.popret in zcmp
  2023-08-20 10:53       ` Fei Gao
@ 2023-08-28  8:04         ` Fei Gao
  0 siblings, 0 replies; 17+ messages in thread
From: Fei Gao @ 2023-08-28  8:04 UTC (permalink / raw)
  To: Kito Cheng, jeffreyalaw; +Cc: gcc-patches, Palmer Dabbelt, Sinan, jiawei

Hi Kito & Jeff

Here is a new series for zcmp (https://patchwork.sourceware.org/project/gcc/list/?series=23929) that:
1. fixes the 2 issues Kito caught
2. is rebased

The new series replaces the following:
https://patchwork.sourceware.org/project/gcc/list/?series=21577
https://patchwork.sourceware.org/project/gcc/patch/20230607055215.29332-2-gaofei@eswincomputing.com/

The rest of the zcmp patches will be sent out after the new series is accepted, to avoid rebasing again and again.

BR, 
Fei


On 2023-08-20 18:53  Fei Gao <gaofei@eswincomputing.com> wrote:
>
>
>Hi Kito
>
>This issue is due to zcmp and shrink-wrap-separate conflict,
>which has been addressed by an under-review patch.
>[PATCH 0/2] resolve confilct between RISC-V zcmp and shrink-wrap-separate
>https://patchwork.sourceware.org/project/gcc/list/?series=21577
>https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg311487.html
>
>I'm making  [PATCH 1/4][V5][RISC-V] support cm.push cm.pop cm.popret in zcmp for the 1st issue you catched.
>Please let me know if you want me to merge 
>https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg311486.html
>into [PATCH 1/4][V5][RISC-V]. 
>
>BR, 
>Fei
>On 2023-08-16 16:38  Kito Cheng <kito.cheng@gmail.com> wrote:
>>
>>Another fail case for CFI:
>>
>>$ riscv64-unknown-elf-gcc _mulhc3.i
>>-march=rv64imafd_zicsr_zifencei_zca_zcmp -mabi=lp64d -g  -O2  -o
>>_mulhc3.s
>>
>>typedef float a __attribute__((mode(HF)));
>>b, c;
>>f() {
>> a a, d, e = a + d;
>> if (g() && e)
>>   c = b;
>>}
>>
>>
>>0x10e508a maybe_record_trace_start
>>       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/dwarf2cfi.cc:2584
>>0x10e58fb scan_trace
>>       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/dwarf2cfi.cc:2784
>>0x10e5fab create_cfi_notes
>>       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/dwarf2cfi.cc:2938
>>0x10e6ee4 execute_dwarf2_frame
>>       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/dwarf2cfi.cc:3309
>>0x10e7c5a execute
>>       ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/dwarf2cfi.cc:3797
>>
>>On Wed, Aug 16, 2023 at 4:33 PM Kito Cheng <kito.cheng@gmail.com> wrote:
>>>
>>> Hi Fei:
>>>
>>> Tried to use Jiawei's patch to test this patch and found some issue:
>>>
>>>
>>> > @@ -5430,13 +5632,15 @@ riscv_expand_prologue (void)
>>> >    /* Save the registers.  */
>>> >    if ((frame->mask | frame->fmask) != 0)
>>> >      {
>>> > -      HOST_WIDE_INT step1 = riscv_first_stack_step (frame, remaining_size);
>>> > -
>>> > -      insn = gen_add3_insn (stack_pointer_rtx,
>>> > -                           stack_pointer_rtx,
>>> > -                           GEN_INT (-step1));
>>> > -      RTX_FRAME_RELATED_P (emit_insn (insn)) = 1;
>>> > -      remaining_size -= step1;
>>> > +      if (known_gt (remaining_size, frame->frame_pointer_offset))
>>> > +        {
>>> > +          HOST_WIDE_INT step1 = riscv_first_stack_step (frame, remaining_size);
>>> > +          remaining_size -= step1;
>>> > +          insn = gen_add3_insn (stack_pointer_rtx,
>>> > +                                stack_pointer_rtx,
>>> > +                                GEN_INT (-step1));
>>> > +          RTX_FRAME_RELATED_P (emit_insn (insn)) = 1;
>>> > +        }
>>> >        riscv_for_each_saved_reg (remaining_size, riscv_save_reg, false, false);
>>> >      }
>>> >
>>>
>>> I hit some issue here during building libgcc, I use
>>> riscv-gnu-toolchain with --with-arch=rv64gzca_zcmp
>>>
>>> And the error message is:
>>>
>>> In file included from
>>> ../../../../../riscv-gnu-toolchain-trunk/gcc/libgcc/unwind-dw2.c:1471:
>>> ../../../../../riscv-gnu-toolchain-trunk/gcc/libgcc/unwind.inc: In
>>> function '_Unwind_Backtrace':
>>> ../../../../../riscv-gnu-toolchain-trunk/gcc/libgcc/unwind.inc:330:1:
>>> internal compiler error: in gen_reg_rtx, at emit-rtl.cc:1176
>>>  330 | }
>>>      | ^
>>> 0x83753a gen_reg_rtx(machine_mode)
>>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/emit-rtl.cc:1176
>>> 0xf5566f maybe_legitimize_operand
>>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/optabs.cc:8047
>>> 0xf5566f maybe_legitimize_operands(insn_code, unsigned int, unsigned
>>> int, expand_operand*)
>>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/optabs.cc:8191
>>> 0xf511d9 maybe_gen_insn(insn_code, unsigned int, expand_operand*)
>>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/optabs.cc:8210
>>> 0xf58539 expand_binop_directly
>>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/optabs.cc:1452
>>> 0xf56666 expand_binop(machine_mode, optab_tag, rtx_def*, rtx_def*,
>>> rtx_def*, int, optab_methods)
>>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/optabs.cc:1539
>>> 0xcbfdd0 force_operand(rtx_def*, rtx_def*)
>>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/expr.cc:8231
>>> 0xc8fca1 force_reg(machine_mode, rtx_def*)
>>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/explow.cc:687
>>> 0x144b8cd riscv_force_temporary
>>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:1531
>>> 0x144b8cd riscv_force_address
>>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:1528
>>> 0x144b8cd riscv_legitimize_move(machine_mode, rtx_def*, rtx_def*)
>>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:2387
>>> 0x1af063e gen_movdf(rtx_def*, rtx_def*)
>>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.md:2107
>>> 0xcba503 rtx_insn* insn_gen_fn::operator()<rtx_def*,
>>> rtx_def*>(rtx_def*, rtx_def*) const
>>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/recog.h:411
>>> 0xcba503 emit_move_insn_1(rtx_def*, rtx_def*)
>>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/expr.cc:4164
>>> 0x143d6c4 riscv_emit_move(rtx_def*, rtx_def*)
>>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:1486
>>> 0x143d6c4 riscv_save_reg
>>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:5715
>>> 0x143e2b9 riscv_for_each_saved_reg
>>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:5904
>>> 0x14480d0 riscv_expand_prologue()
>>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.cc:6156
>>> 0x1af57fb gen_prologue()
>>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.md:2816
>>> 0x143c746 target_gen_prologue
>>>        ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/config/riscv/riscv.md:3302
>>>
>>>
>>> Reduced case:
>>>
>>> $ riscv64-unknown-elf-gcc -march=rv64imafd_zicsr_zifencei_zca_zcmp
>>> -mabi=lp64d  unwind-dw2.i -Os
>>>
>>> typedef struct {
>>>  struct {
>>>    struct {
>>>      struct {
>>>        long a
>>>      }
>>>    } a[129]
>>>  }
>>> } b;
>>> struct c {
>>>  void *a[129]
>>> } d() {
>>>  struct c a;
>>>  __builtin_unwind_init();
>>>  b e;
>>>  f(a, &e);
>>> }


end of thread, other threads:[~2023-08-28  8:04 UTC | newest]

Thread overview: 17+ messages
2023-06-07  5:52 [PATCH 0/4] [RISC-V] support zcmp extention Fei Gao
2023-06-07  5:52 ` [PATCH 1/4][V4][RISC-V] support cm.push cm.pop cm.popret in zcmp Fei Gao
2023-06-07 10:11   ` jiawei
2023-08-16  8:33   ` Kito Cheng
2023-08-16  8:38     ` Kito Cheng
2023-08-16  9:03       ` Fei Gao
2023-08-20 10:53       ` Fei Gao
2023-08-28  8:04         ` Fei Gao
2023-08-17 11:39     ` Fei Gao
2023-06-07  5:52 ` [PATCH 2/4] [RISC-V] support cm.popretz " Fei Gao
2023-07-13  8:31   ` Kito Cheng
2023-06-07  5:52 ` [PATCH 3/4] [RISC-V] resolve confilct between zcmp multi push/pop and shrink-wrap-separate Fei Gao
2023-06-12 15:17   ` Kito Cheng
2023-06-12 19:26   ` Jeff Law
2023-06-13  2:35     ` Fei Gao
2023-06-07  5:52 ` [PATCH 4/4] [RISC-V] support cm.mva01s cm.mvsa01 in zcmp Fei Gao
2023-07-13  8:18   ` Kito Cheng
