public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: "Fei Gao" <gaofei@eswincomputing.com>
To: Sinan <sinan.lin@linux.alibaba.com>
Cc: gcc-patches <gcc-patches@gcc.gnu.org>,  jiawei <jiawei@iscas.ac.cn>
Subject: Re: Re: [PATCH 4/5] RISC-V: Add Zcmp extension supports.
Date: Fri, 12 May 2023 17:10:21 +0800	[thread overview]
Message-ID: <20230512171020765870170@eswincomputing.com> (raw)
In-Reply-To: <10fd153f-397b-4372-9c43-96d7ce30a2c5.sinan.lin@linux.alibaba.com>

On 2023-05-12 16:12  Sinan <sinan.lin@linux.alibaba.com> wrote:
>
>Hi Fei,
>Sorry for the late reply, I've been busy with moving these days :(.
>Thanks for working on it. I would prefer removing the extra pass for popretz if possible ... I will test your patches ASAP.
>BR,
>Sinan 

hi Sinan

I posted V2 based on Kito's comment just now.
https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg307507.html

For popretz, we can discuss further offline if it's convenient to you.

BR, 
Fei
>------------------------------------------------------------------
>Sender:Fei Gao <gaofei@eswincomputing.com>
>Sent At:2023 May 6 (Sat.) 16:53
>Recipient:Sinan <sinan.lin@linux.alibaba.com>
>Cc:jiawei <jiawei@iscas.ac.cn>; gcc-patches <gcc-patches@gcc.gnu.org>
>Subject:Re: Re: [PATCH 4/5] RISC-V: Add Zcmp extension supports.
>On 2023-05-05 23:57 Sinan <sinan.lin@linux.alibaba.com> wrote:
>>
>>> hi Jiawei
>>>
>>> Please ignore my previous reply. I accidently sent the email before I finished it.
>>> Sorry for that!
>>>
>>> I downloaded the series of patches from you and found in some cases
>>> it fails to generate zcmp push and pop insns.
>>>
>>> TC:
>>>
>>> char my_getchar();
>>> int test_s0()
>>> {
>>>
>>> int a = my_getchar();
>>> int b = my_getchar();
>>> return a+b;
>>> }
>>>
>>> cc1 -fno-shrink-wrap-separate -O2 -march=rv32e_zca_zcmp -mabi=ilp32e -mcmodel=medlow test.c
>>>
>>> -fno-shrink-wrap-separate is used here to avoid the impact from shrink-wrap-separate that is by default
>>> enabled in O2.
>>>
>>> As i'm also interested in Zc*, i did some changes mainly in prologue and epilogue pass quite simliar to
>>> what has been done for save and restore except the CFI directives due to reversed order that zcmp
>>> pushes and pops ra, s regs than what save and restore do.
>>>
>>> I will refine and share the code soon for your review.
>>>
>>> BR
>>> Fei
>>Hi Fei,
>>In the current implementation, cm.push will not increase the original adjustment size of the stack pointer. As cm.push uses a minimum adjustment size of 16, and in your example, the adjustment size of sp is 12, so cm.push will not be generated.
>>you can find the check at riscv_use_push_pop
>>> > + */
>>> > + if (base_size > frame_size)
>>> > + return false;
>>> > +
>>And if this check is removed, then you can get the output that you expect.
>>```
>> cm.push {ra,s0},-16
>> call my_getchar
>> mv s0,a0
>> call my_getchar
>> add a0,s0,a0
>> cm.popret {ra,s0},16
>>```
>>In many scenarios of rv32e, cm.push cannot be generated as a result. Perhaps we can remove this check? I haven't tested if it is ok to remove this check, and CC jiawei to help test it.
>>BR,
>>Sinan
>hi Sinan
>Thanks for your reply.
>I posted my codes at https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg306921.html
>In the cover letter, i did some comparision.
>Could you please review?
>Thanks & BR,
>Fei
>>------------------------------------------------------------------
>>Sender:Fei Gao <gaofei@eswincomputing.com>
>>Sent At:2023 Apr. 25 (Tue.) 18:12
>>Recipient:jiawei <jiawei@iscas.ac.cn>
>>Cc:gcc-patches <gcc-patches@gcc.gnu.org>
>>Subject:[PATCH 4/5] RISC-V: Add Zcmp extension supports.
>>hi Jiawei
>>Please ignore my previous reply. I accidently sent the email before I finished it.
>>Sorry for that!
>>I downloaded the series of patches from you and found in some cases
>>it fails to generate zcmp push and pop insns.
>>TC:
>>char my_getchar();
>>int test_s0()
>>{
>> int a = my_getchar();
>> int b = my_getchar();
>> return a+b;
>>}
>>cc1 -fno-shrink-wrap-separate -O2 -march=rv32e_zca_zcmp -mabi=ilp32e -mcmodel=medlow test.c
>>-fno-shrink-wrap-separate is used here to avoid the impact from shrink-wrap-separate that is by default
>>enabled in O2.
>>As i'm also interested in Zc*, i did some changes mainly in prologue and epilogue pass quite simliar to
>>what has been done for save and restore except the CFI directives due to reversed order that zcmp
>>pushes and pops ra, s regs than what save and restore do.
>>I will refine and share the code soon for your review.
>>BR
>>Fei
>>On Thu Apr 6 06:21:17 GMT 2023 Jiawei jiawei@iscas.ac.cn wrote:
>>>
>>>Add Zcmp extension instructions support. Generate push/pop
>>>with follow steps:
>>>
>>> 1. preprocessing:
>>> 1.1. if there is no push rtx, then just return. e.g.
>>> (note 5 1 22 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
>>> (insn/f 22 5 23 2 (set (reg/f:SI 2 sp)
>>> (plus:SI (reg/f:SI 2 sp)
>>> (const_int -32 [0xffffffffffffffe0])))
>>> (nil))
>>> (note 23 22 2 2 NOTE_INSN_PROLOGUE_END)
>>> 1.2. if push rtx exists, then we compute the number of
>>> pushed s-registers, n_sreg.
>>>
>>> push rtx should be find before NOTE_INSN_PROLOGUE_END tag
>>>
>>> [2 and 3 happend simultaneously]
>>>
>>> 2. find valid move pattern, mv sN, aN, where N < n_sreg,
>>> and aN is not used the move pattern, and sN is not
>>> defined before the move pattern (from prologue to the
>>> position of move pattern).
>>>
>>> 3. analysis use and reach of every instruction from prologue
>>> to the position of move pattern.
>>> if any sN is used, then we mark the corresponding argument list
>>> candidate as invalid.
>>> e.g.
>>> push {ra,s0-s3}, {}, -32
>>> sw s0,44(sp) # s0 is used, then argument list is invalid
>>> mv a0,a5 # a0 is defined, then argument list is invalid
>>> ...
>>> mv s0,a0
>>> mv s1,a1
>>> mv s2,a2
>>>
>>> 4. if there is a valid argument list, then replace the pop
>>> push parallel insn, and delete mv pattern.
>>> if not, skip.
>>>
>>>All "zcmpe" means Zcmp with RVE extension.
>>>The push/pop instrunction implement is mostly finished by Sinan Lin.
>>>
>>>Co-Authored by: Sinan Lin <sinan....@linux.alibaba.com>
>>>Co-Authored by: Simon Cook <simon.c...@embecosm.com>
>>>Co-Authored by: Shihua Liao <shi...@iscas.ac.cn>
>>>
>>>gcc/ChangeLog:
>>>
>>> * config.gcc: New object.
>>> * config/riscv/predicates.md (riscv_stack_push_operation):
>>> New predicate.
>>> (riscv_stack_pop_operation): Ditto.
>>> (pop_return_value_constant): Ditto.
>>> * config/riscv/riscv-passes.def (INSERT_PASS_AFTER): New pass.
>>> * config/riscv/riscv-protos.h (riscv_output_popret_p):
>>> New routine.
>>> (riscv_valid_stack_push_pop_p): Ditto.
>>> (riscv_check_regno): Ditto.
>>> (make_pass_zcmp_popret): New pass.
>>> * config/riscv/riscv.cc (struct riscv_frame_info): New variable.
>>> (riscv_output_popret_p): New function.
>>> (riscv_print_pop_size): Ditto.
>>> (riscv_print_reglist): Ditto.
>>> (riscv_print_operand): New case symbols.
>>> (riscv_save_push_pop_count): New function.
>>> (riscv_push_pop_base_sp_adjust): Ditto.
>>> (riscv_use_push_pop): Ditto.
>>> (riscv_compute_frame_info): Adjust frame value.
>>> (riscv_emit_pop_insn): New function.
>>> (riscv_check_regno): Ditto.
>>> (riscv_valid_stack_push_pop_p): Ditto.
>>> (riscv_emit_push_insn): Ditto.
>>> (riscv_expand_prologue): Modify frame pattern.
>>> (riscv_expand_epilogue): Ditto.
>>> * config/riscv/riscv.h (RETURN_VALUE_REGNUM):
>>> (RISCV_ZCE_PUSH_POP_MASK): New mask.
>>> (RISCV_ZCMPE_PUSH_POP_MASK): Ditto.
>>> * config/riscv/riscv.md: Add new reg number and include info.
>>> * config/riscv/t-riscv: New object rules.
>>> * config/riscv/riscv-zcmp-popret.cc: New file.
>>> * config/riscv/zc.md: New file.
>>>---
>>> gcc/config.gcc | 2 +-
>>> gcc/config/riscv/predicates.md | 16 +
>>> gcc/config/riscv/riscv-passes.def | 1 +
>>> gcc/config/riscv/riscv-protos.h | 4 +
>>> gcc/config/riscv/riscv-zcmp-popret.cc | 260 +++++++++++++++
>>> gcc/config/riscv/riscv.cc | 437 +++++++++++++++++++++++++-
>>> gcc/config/riscv/riscv.h | 4 +
>>> gcc/config/riscv/riscv.md | 3 +
>>> gcc/config/riscv/t-riscv | 4 +
>>> gcc/config/riscv/zc.md | 47 +++
>>> 10 files changed, 767 insertions(+), 11 deletions(-)
>>> create mode 100644 gcc/config/riscv/riscv-zcmp-popret.cc
>>> create mode 100644 gcc/config/riscv/zc.md
>>>
>>>diff --git a/gcc/config.gcc b/gcc/config.gcc
>>>index 629d324b5ef..a991c5273f9 100644
>>>--- a/gcc/config.gcc
>>>+++ b/gcc/config.gcc
>>>@@ -529,7 +529,7 @@ pru-*-*)
>>> ;;
>>> riscv*)
>>> cpu_type=riscv
>>>- extra_objs="riscv-builtins.o riscv-c.o riscv-sr.o
>>>riscv-shorten-memrefs.o riscv-selftests.o riscv-v.o riscv-vsetvl.o"
>>>+ extra_objs="riscv-builtins.o riscv-c.o riscv-sr.o
>>>riscv-shorten-memrefs.o riscv-selftests.o riscv-v.o riscv-vsetvl.o
>>>riscv-zcmp-popret.o"
>>> extra_objs="${extra_objs} riscv-vector-builtins.o
>>>riscv-vector-builtins-shapes.o riscv-vector-builtins-bases.o"
>>> extra_objs="${extra_objs} thead.o"
>>> d_target_objs="riscv-d.o"
>>>diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
>>>index 0d9d7701c7e..6bff6cd047a 100644
>>>--- a/gcc/config/riscv/predicates.md
>>>+++ b/gcc/config/riscv/predicates.md
>>>@@ -412,3 +412,19 @@
>>> (and (match_code "const_int")
>>> (ior (match_operand 0 "not_uimm_extra_bit_operand")
>>> (match_operand 0 "const_nottwobits_operand"))))
>>>+
>>>+(define_special_predicate "riscv_stack_push_operation"
>>>+ (match_code "parallel")
>>>+{
>>>+ return riscv_valid_stack_push_pop_p (op, true);
>>>+})
>>>+
>>>+(define_special_predicate "riscv_stack_pop_operation"
>>>+ (match_code "parallel")
>>>+{
>>>+ return riscv_valid_stack_push_pop_p (op, false);
>>>+})
>>>+
>>>+(define_predicate "pop_return_value_constant"
>>>+ (and (match_code "const_int")
>>>+ (match_test "INTVAL (op) == 0")))
>>>diff --git a/gcc/config/riscv/riscv-passes.def
>>>b/gcc/config/riscv/riscv-passes.def
>>>index 4084122cf0a..25625b9af3e 100644
>>>--- a/gcc/config/riscv/riscv-passes.def
>>>+++ b/gcc/config/riscv/riscv-passes.def
>>>@@ -19,3 +19,4 @@
>>>
>>> INSERT_PASS_AFTER (pass_rtl_store_motion, 1, pass_shorten_memrefs);
>>> INSERT_PASS_BEFORE (pass_fast_rtl_dce, 1, pass_vsetvl);
>>>+INSERT_PASS_AFTER (pass_cprop_hardreg, 1, pass_zcmp_popret);
>>>diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
>>>index 4611447ddde..8f243cd5f44 100644
>>>--- a/gcc/config/riscv/riscv-protos.h
>>>+++ b/gcc/config/riscv/riscv-protos.h
>>>@@ -54,6 +54,7 @@ extern bool riscv_split_64bit_move_p (rtx, rtx);
>>> extern void riscv_split_doubleword_move (rtx, rtx);
>>> extern const char *riscv_output_move (rtx, rtx);
>>> extern const char *riscv_output_return ();
>>>+extern bool riscv_output_popret_p (rtx);
>>>
>>> #ifdef RTX_CODE
>>> extern void riscv_expand_int_scc (rtx, enum rtx_code, rtx, rtx);
>>>@@ -79,6 +80,8 @@ extern void riscv_reinit (void);
>>> extern poly_uint64 riscv_regmode_natural_size (machine_mode);
>>> extern bool riscv_v_ext_vector_mode_p (machine_mode);
>>> extern bool riscv_shamt_matches_mask_p (int, HOST_WIDE_INT);
>>>+extern bool riscv_valid_stack_push_pop_p (rtx, bool);
>>>+extern bool riscv_check_regno(rtx, unsigned);
>>>
>>> /* Routines implemented in riscv-c.cc. */
>>> void riscv_cpu_cpp_builtins (cpp_reader *);
>>>@@ -99,6 +102,7 @@ extern bool riscv_hard_regno_rename_ok (unsigned, unsigned);
>>>
>>> rtl_opt_pass * make_pass_shorten_memrefs (gcc::context *ctxt);
>>> rtl_opt_pass * make_pass_vsetvl (gcc::context *ctxt);
>>>+rtl_opt_pass * make_pass_zcmp_popret (gcc::context *ctxt);
>>>
>>> /* Information about one CPU we know about. */
>>> struct riscv_cpu_info {
>>>diff --git a/gcc/config/riscv/riscv-zcmp-popret.cc
>>>b/gcc/config/riscv/riscv-zcmp-popret.cc
>>>new file mode 100644
>>>index 00000000000..d7b40f6a3e2
>>>--- /dev/null
>>>+++ b/gcc/config/riscv/riscv-zcmp-popret.cc
>>>@@ -0,0 +1,260 @@
>>>+#include "config.h"
>>>+#include "system.h"
>>>+#include "coretypes.h"
>>>+#include "tm.h"
>>>+#include "rtl.h"
>>>+#include "backend.h"
>>>+#include "regs.h"
>>>+#include "target.h"
>>>+#include "memmodel.h"
>>>+#include "emit-rtl.h"
>>>+#include "df.h"
>>>+#include "predict.h"
>>>+#include "tree-pass.h"
>>>+#include "tree.h"
>>>+#include "tm_p.h"
>>>+#include "optabs.h"
>>>+#include "recog.h"
>>>+#include "cfgrtl.h"
>>>+
>>>+#define IN_TARGET_CODE 1
>>>+
>>>+namespace {
>>>+
>>>+/*
>>>+ 1. preprocessing:
>>>+ 1.1. if there is no push rtx, then just return. e.g.
>>>+ (note 5 1 22 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
>>>+ (insn/f 22 5 23 2 (set (reg/f:SI 2 sp)
>>>+ (plus:SI (reg/f:SI 2 sp)
>>>+ (const_int -32 [0xffffffffffffffe0])))
>>>+ (nil))
>>>+ (note 23 22 2 2 NOTE_INSN_PROLOGUE_END)
>>>+ 1.2. if push rtx exists, then we compute the number of
>>>+ pushed s-registers, n_sreg.
>>>+
>>>+ push rtx should be find before NOTE_INSN_PROLOGUE_END tag
>>>+
>>>+ [2 and 3 happend simultaneously]
>>>+ 2. find valid move pattern, mv sN, aN, where N < n_sreg,
>>>+ and aN is not used the move pattern, and sN is not
>>>+ defined before the move pattern (from prologue to the
>>>+ position of move pattern).
>>>+ 3. analysis use and reach of every instruction from prologue
>>>+ to the position of move pattern.
>>>+ if any sN is used, then we mark the corresponding argument list
>>>+ candidate as invalid.
>>>+ e.g.
>>>+ push {ra,s0-s3}, {}, -32
>>>+ sw s0,44(sp) # s0 is used, then argument list is invalid
>>>+ mv a0,a5 # a0 is defined, then argument list is invalid
>>>+ ...
>>>+ mv s0,a0
>>>+ mv s1,a1
>>>+ mv s2,a2
>>>+
>>>+ 4. if there is a valid argument list, then replace the pop
>>>+ push parallel insn, and delete mv pattern.
>>>+ if not, skip.
>>>+*/
>>>+
>>>+static void
>>>+emit_zcmp_popret (rtx_insn *pop_rtx,
>>>+ rtx_insn **candidates,
>>>+ basic_block bb)
>>>+{
>>>+ bool gen_popretz_p = candidates [0];
>>>+ bool gen_popret_p = candidates [2];
>>>+
>>>+ if (!(gen_popret_p || gen_popretz_p))
>>>+ return;
>>>+
>>>+ gcc_assert ((gen_popret_p && !gen_popretz_p)
>>>+ || (gen_popretz_p && gen_popret_p));
>>>+
>>>+ rtx pop_pat = PATTERN (pop_rtx);
>>>+ unsigned pop_idx = 0, popret_idx = 0;
>>>+ unsigned n_pop_par = XVECLEN (pop_pat, 0);
>>>+ unsigned n_popret_par = n_pop_par
>>>+ + (gen_popretz_p ? 2 : 0)
>>>+ + (gen_popret_p ? 2 : 0);
>>>+
>>>+ rtx popret_par = gen_rtx_PARALLEL (VOIDmode,
>>>+ rtvec_alloc (n_popret_par));
>>>+
>>>+ /* return zero pattern */
>>>+ if (gen_popretz_p)
>>>+ {
>>>+ XVECEXP (popret_par, 0, 0) = PATTERN (candidates[0]);
>>>+ XVECEXP (popret_par, 0, 1) = PATTERN (candidates[1]);
>>>+ popret_idx += 2;
>>>+ delete_insn (candidates[0]);
>>>+ delete_insn (candidates[1]);
>>>+ }
>>>+
>>>+ /* copy pop paruence. */
>>>+ for (; pop_idx < n_pop_par;
>>>+ pop_idx ++, popret_idx ++)
>>>+ {
>>>+ XVECEXP (popret_par, 0, popret_idx) =
>>>+ XVECEXP (pop_pat, 0, pop_idx);
>>>+ }
>>>+
>>>+ /* ret pattern. */
>>>+ rtx ret_pat = PATTERN (candidates[2]);
>>>+ gcc_assert (GET_CODE (ret_pat) == PARALLEL);
>>>+
>>>+ for (int i = 0; i < XVECLEN (ret_pat, 0);
>>>+ i++, popret_idx++)
>>>+ {
>>>+ XVECEXP (popret_par, 0, popret_idx) =
>>>+ XVECEXP (ret_pat, 0, i);
>>>+ }
>>>+
>>>+ rtx_insn *insn = emit_jump_insn_after (
>>>+ popret_par,
>>>+ BB_END (bb));
>>>+ JUMP_LABEL (insn) = simple_return_rtx;
>>>+
>>>+ REG_NOTES (insn) = REG_NOTES (pop_rtx);
>>>+ RTX_FRAME_RELATED_P (insn) = 1;
>>>+
>>>+ if (dump_file)
>>>+ {
>>>+ fprintf(dump_file, "new insn:\n");
>>>+ print_rtl (dump_file, insn);
>>>+ }
>>>+
>>>+ delete_insn (candidates [2]);
>>>+ delete_insn (pop_rtx);
>>>+}
>>>+
>>>+static void
>>>+zcmp_popret (void)
>>>+{
>>>+ basic_block bb;
>>>+ rtx_insn *insn = NULL, *pop_rtx = NULL;
>>>+ rtx_insn *pop_candidates[3] = {NULL, };
>>>+ /*
>>>+ find NOTE_INSN_EPILOGUE_BEG, but pop_rtx not found => return
>>>+ find NOTE_INSN_EPILOGUE_BEG, and pop_rtx is found => looking for a0
>>>+ */
>>>+
>>>+ FOR_EACH_BB_REVERSE_FN (bb, cfun)
>>>+ {
>>>+ FOR_BB_INSNS_REVERSE (bb, insn)
>>>+ {
>>>+ if (!pop_rtx
>>>+ && NOTE_P (insn)
>>>+ && NOTE_KIND (insn) == NOTE_INSN_EPILOGUE_BEG)
>>>+ return;
>>>+
>>>+ if (NOTE_P (insn)
>>>+ && NOTE_KIND (insn) == NOTE_INSN_FUNCTION_BEG)
>>>+ {
>>>+ if (pop_rtx)
>>>+ emit_zcmp_popret (pop_rtx, pop_candidates, bb);
>>>+ return;
>>>+ };
>>>+
>>>+ if (!(NONDEBUG_INSN_P (insn)
>>>+ || CALL_P (insn)))
>>>+ continue;
>>>+
>>>+ rtx pop_pat = PATTERN (insn);
>>>+
>>>+ if (GET_CODE (pop_pat) == PARALLEL
>>>+ && riscv_valid_stack_push_pop_p (pop_pat, false))
>>>+ {
>>>+ pop_rtx = insn;
>>>+ continue;
>>>+ }
>>>+
>>>+ /* pattern for `ret`. */
>>>+ if (JUMP_P (insn)
>>>+ && GET_CODE (pop_pat) == PARALLEL
>>>+ && XVECLEN (pop_pat, 0) == 2
>>>+ && GET_CODE (XVECEXP (pop_pat, 0, 0)) == SIMPLE_RETURN
>>>+ && GET_CODE (XVECEXP (pop_pat, 0, 1)) == USE)
>>>+ {
>>>+ rtx use_reg = XEXP (XVECEXP (pop_pat, 0, 1), 0);
>>>+ if (REG_P (use_reg)
>>>+ && REGNO (use_reg) == RETURN_ADDR_REGNUM)
>>>+ {
>>>+ pop_candidates [2] = insn;
>>>+ continue;
>>>+ }
>>>+ }
>>>+
>>>+ if (!pop_rtx)
>>>+ continue;
>>>+
>>>+ /* pattern for return value. */
>>>+ if (!pop_candidates [0]
>>>+ && GET_CODE (pop_pat) == USE)
>>>+ {
>>>+ rtx_insn *set_insn = PREV_INSN (insn);
>>>+ rtx pat_set = PATTERN (set_insn);
>>>+
>>>+ if (riscv_check_regno (XEXP (pop_pat, 0),
>>>+ RETURN_VALUE_REGNUM)
>>>+ && insn
>>>+ && pat_set != NULL
>>>+ && GET_CODE (pat_set) == SET
>>>+ && riscv_check_regno (SET_DEST (pat_set),
>>>+ RETURN_VALUE_REGNUM)
>>>+ && CONST_INT_P (SET_SRC (pat_set))
>>>+ && INTVAL (SET_SRC (pat_set)) == 0)
>>>+ {
>>>+ pop_candidates [0] = set_insn;
>>>+ pop_candidates [1] = insn;
>>>+ break;
>>>+ }
>>>+ }
>>>+ }
>>>+
>>>+ if (pop_rtx)
>>>+ {
>>>+ emit_zcmp_popret (pop_rtx, pop_candidates, bb);
>>>+ return;
>>>+ }
>>>+ }
>>>+}
>>>+
>>>+const pass_data pass_data_zcmp_popret =
>>>+{
>>>+ RTL_PASS, /* type */
>>>+ "zcmp-popret", /* name */
>>>+ OPTGROUP_NONE, /* optinfo_flags */
>>>+ TV_NONE, /* tv_id */
>>>+ 0, /* properties_required */
>>>+ 0, /* properties_provided */
>>>+ 0, /* properties_destroyed */
>>>+ 0, /* todo_flags_start */
>>>+ 0, /* todo_flags_finish */
>>>+};
>>>+
>>>+class pass_zcmp_popret : public rtl_opt_pass
>>>+{
>>>+public:
>>>+ pass_zcmp_popret (gcc::context *ctxt)
>>>+ : rtl_opt_pass (pass_data_zcmp_popret, ctxt)
>>>+ {}
>>>+
>>>+ /* opt_pass methods: */
>>>+ virtual bool gate (function *)
>>>+ { return TARGET_ZCMP; }
>>>+ virtual unsigned int execute (function *)
>>>+ {
>>>+ zcmp_popret ();
>>>+ return 0;
>>>+ }
>>>+}; // class pass_zcmp_popret
>>>+
>>>+} // anon namespace
>>>+
>>>+rtl_opt_pass *
>>>+make_pass_zcmp_popret (gcc::context *ctxt)
>>>+{
>>>+ return new pass_zcmp_popret (ctxt);
>>>+}
>>>diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
>>>index 5f8cbfc15ed..17df2f3f8cf 100644
>>>--- a/gcc/config/riscv/riscv.cc
>>>+++ b/gcc/config/riscv/riscv.cc
>>>@@ -114,6 +114,9 @@ struct GTY(()) riscv_frame_info {
>>> /* Likewise FPR X. */
>>> unsigned int fmask;
>>>
>>>+ /* How much the push/pop routines adjust sp (or 0 if unused). */
>>>+ unsigned push_pop_sp_adjust;
>>>+
>>> /* How much the GPR save/restore routines adjust sp (or 0 if unused). */
>>> unsigned save_libcall_adjustment;
>>>
>>>@@ -401,6 +404,20 @@ static const unsigned gpr_save_reg_order[] = {
>>> S10_REGNUM, S11_REGNUM
>>> };
>>>
>>>+/* Order for the CLOBBERs/USEs of push/pop. */
>>>+static const unsigned push_save_reg_order[] = {
>>>+ INVALID_REGNUM, RETURN_ADDR_REGNUM, S0_REGNUM,
>>>+ S1_REGNUM, S2_REGNUM, S3_REGNUM, S4_REGNUM,
>>>+ S5_REGNUM, S6_REGNUM, S7_REGNUM, S8_REGNUM,
>>>+ S9_REGNUM, S10_REGNUM, S11_REGNUM
>>>+};
>>>+
>>>+/* Order for the CLOBBERs/USEs of push/pop in rve. */
>>>+static const unsigned push_save_reg_order_zcmpe[] = {
>>>+ INVALID_REGNUM, RETURN_ADDR_REGNUM, S0_REGNUM,
>>>+ S1_REGNUM
>>>+};
>>>+
>>> /* A table describing all the processors GCC knows about. */
>>> static const struct riscv_tune_info riscv_tune_info_table[] = {
>>> #define RISCV_TUNE(TUNE_NAME, PIPELINE_MODEL, TUNE_INFO) \
>>>@@ -2989,6 +3006,17 @@ riscv_output_return ()
>>> return "ret";
>>> }
>>>
>>>+bool
>>>+riscv_output_popret_p (rtx op)
>>>+{
>>>+ unsigned n_rtx = XVECLEN (op, 0);
>>>+ rtx use = XVECEXP (op, 0, n_rtx - 1);
>>>+ rtx ret = XVECEXP (op, 0, n_rtx - 2);
>>>+
>>>+ return GET_CODE (ret) == SIMPLE_RETURN
>>>+ && GET_CODE (use) == USE;
>>>+}
>>>+
>>>
>>>
>>> /* Return true if CMP1 is a suitable second operand for integer ordering
>>> test CODE. See also the *sCC patterns in riscv.md. */
>>>@@ -4306,6 +4334,74 @@ riscv_memmodel_needs_amo_acquire (enum memmodel model)
>>> }
>>> }
>>>
>>>+/* Print Sp adjustment field of pop instruction. */
>>>+
>>>+static void
>>>+riscv_print_pop_size (FILE *file, rtx op)
>>>+{
>>>+ unsigned sp_adjust_idx = XVECLEN (op, 0) - 1;
>>>+ rtx sp_adjust_rtx = XVECEXP (op, 0, sp_adjust_idx);
>>>+
>>>+ /* Skip ret or pattern. */
>>>+ while (GET_CODE (sp_adjust_rtx) != SET)
>>>+ sp_adjust_rtx = XVECEXP (op, 0, --sp_adjust_idx);
>>>+
>>>+ rtx elt_plus = SET_SRC (sp_adjust_rtx);
>>>+ fprintf (file, "%ld", INTVAL (XEXP (elt_plus, 1)));
>>>+}
>>>+
>>>+/* Print push/pop register list. */
>>>+
>>>+static void
>>>+riscv_print_reglist (FILE *file, rtx op)
>>>+{
>>>+ /* we only deal with three formats:
>>>+ push {ra}
>>>+ push {ra, s0}
>>>+ push {ra, s0-sN}
>>>+ or
>>>+ pop {ra}
>>>+ pop {ra, s0}
>>>+ pop {ra, s0-sN}
>>>+ registers except ra has to be continuous s-register,
>>>+ and it is supposed to be checked before.
>>>+ register list patterns in push:
>>>+ (set/f (mem/c:SI
>>>+ (plus:SI (reg/f:SI 2 sp)
>>>+ (const_int 28 [0x1c])) [2 S4 A32])
>>>+ (reg:SI 1 ra))
>>>+ register list patterns in pop:
>>>+ (set/f (reg:DI 1 ra)
>>>+ (mem/c:DI (plus:DI (reg/f:DI 2 sp)
>>>+ (const_int 8 [0x8])) [2 S8 A64]))
>>>+ */
>>>+ int total_count = XVECLEN (op, 0);
>>>+ int n_regs = 0;
>>>+ bool push_p = GET_CODE (XVECEXP (op, 0, 0)) == SET
>>>+ && GET_CODE (SET_SRC (XVECEXP (op, 0, 0))) == PLUS;
>>>+
>>>+ for (int idx = 0; idx < total_count; ++idx)
>>>+ {
>>>+ rtx ele = XVECEXP (op, 0, idx);
>>>+ if (GET_CODE (ele) != SET)
>>>+ continue;
>>>+
>>>+ bool restore_save_p = push_p ?
>>>+ MEM_P (SET_DEST (ele)) :
>>>+ MEM_P (SET_SRC (ele));
>>>+
>>>+ if (restore_save_p)
>>>+ n_regs ++;
>>>+ }
>>>+
>>>+ if (n_regs > 2)
>>>+ fprintf (file, "ra,s0-s%u", n_regs - 2);
>>>+ else if (n_regs > 1)
>>>+ fprintf (file, "ra,s0");
>>>+ else
>>>+ fputs("ra", file);
>>>+}
>>>+
>>> /* Return true if a FENCE should be emitted to before a memory access to
>>> implement the release portion of memory model MODEL. */
>>>
>>>@@ -4517,6 +4613,14 @@ riscv_print_operand (FILE *file, rtx op, int letter)
>>> fputs (GET_RTX_NAME (code), file);
>>> break;
>>>
>>>+ case 'L':
>>>+ riscv_print_reglist (file, op);
>>>+ break;
>>>+
>>>+ case 's':
>>>+ riscv_print_pop_size (file, op);
>>>+ break;
>>>+
>>> case 'S':
>>> {
>>> rtx newop = GEN_INT (ctz_hwi (INTVAL (op)));
>>>@@ -4777,6 +4881,66 @@ riscv_use_save_libcall (const struct riscv_frame_info
>>>*frame)
>>> return frame->save_libcall_adjustment != 0;
>>> }
>>>
>>>+/* Determine how many instructions related to push/pop instructions. */
>>>+
>>>+static unsigned
>>>+riscv_save_push_pop_count (unsigned mask)
>>>+{
>>>+ if (!BITSET_P (mask, GP_REG_FIRST + RETURN_ADDR_REGNUM))
>>>+ return 0;
>>>+ for (unsigned n = GP_REG_LAST; n > GP_REG_FIRST; n--)
>>>+ if (BITSET_P (mask, n)
>>>+ && !call_used_regs [n])
>>>+ /* add ra saving and sp adjust. */
>>>+ return CALLEE_SAVED_REG_NUMBER (n) + 1 + 2;
>>>+ abort ();
>>>+}
>>>+
>>>+/* Calculate the maximum sp adjustment of push/pop instruction. */
>>>+
>>>+static unsigned
>>>+riscv_push_pop_base_sp_adjust (unsigned mask)
>>>+{
>>>+ unsigned n_regs = riscv_save_push_pop_count (mask) - 1;
>>>+ return (n_regs * UNITS_PER_WORD + 15) & (~0xf);
>>>+}
>>>+
>>>+/* Determine whether to call push/pop routines. */
>>>+
>>>+static bool
>>>+riscv_use_push_pop (const struct riscv_frame_info *frame, const HOST_WIDE_INT
>>>frame_size)
>>>+{
>>>+ if (!TARGET_ZCMP)
>>>+ return false;
>>>+
>>>+ /* We do not handler variable argument cases currently. */
>>>+ if (cfun->machine->varargs_size != 0)
>>>+ return false;
>>>+
>>>+ HOST_WIDE_INT base_size = riscv_push_pop_base_sp_adjust (frame->mask);
>>>+ /*
>>>+ Pr 960215-1.c in rv64 ouputs
>>>+
>>>+ addi sp,sp,-32
>>>+ sd ra,24(sp)
>>>+ sd s0,16(sp)
>>>+ sd s2,8(sp)
>>>+ sd s3,0(sp)
>>>+ it is a rare case that callee saved registers are not non-continous,
>>>+ which breaks the old push implementation, and we just reject this case
>>>+ like save-restore does now.
>>>+ */
>>>+ if (base_size > frame_size)
>>>+ return false;
>>>+
>>>+ /* {ra,s0-s10} is invalid. */
>>>+ if (frame->mask & (1 << (S10_REGNUM - GP_REG_FIRST))
>>>+ && !(frame->mask & (1 << (S11_REGNUM - GP_REG_FIRST))))
>>>+ return false;
>>>+
>>>+ return frame->mask & (1 << (RETURN_ADDR_REGNUM - GP_REG_FIRST));
>>>+}
>>>+
>>> /* Determine which GPR save/restore routine to call. */
>>>
>>> static unsigned
>>>@@ -4934,6 +5098,8 @@ riscv_compute_frame_info (void)
>>> /* Only use save/restore routines when the GPRs are atop the frame. */
>>> if (known_ne (frame->hard_frame_pointer_offset, frame->total_size))
>>> frame->save_libcall_adjustment = 0;
>>>+
>>>+ frame->push_pop_sp_adjust = 0;
>>> }
>>>
>>> /* Make sure that we're not trying to eliminate to the wrong hard frame
>>>@@ -5171,6 +5337,86 @@ riscv_for_each_saved_reg (poly_int64 sp_offset,
>>>riscv_save_restore_fn fn,
>>> }
>>> }
>>>
>>>+static void
>>>+riscv_emit_pop_insn (struct riscv_frame_info *frame, HOST_WIDE_INT offset,
>>>HOST_WIDE_INT size)
>>>+{
>>>+ unsigned int veclen = riscv_save_push_pop_count (frame->mask);
>>>+ unsigned int n_reg = veclen - 1;
>>>+ rtvec vec = rtvec_alloc (veclen);
>>>+ HOST_WIDE_INT sp_adjust;
>>>+ rtx dwarf = NULL_RTX;
>>>+
>>>+ const unsigned *reg_order = (TARGET_ZCMP && TARGET_RVE)
>>>+ ? push_save_reg_order_zcmpe
>>>+ : push_save_reg_order;
>>>+
>>>+ gcc_assert (n_reg >= 1
>>>+ && TARGET_ZCMP
>>>+ && ((TARGET_RVE && (n_reg <= ARRAY_SIZE (push_save_reg_order_zcmpe)))
>>>+ || (TARGET_ZCMP && (n_reg <= ARRAY_SIZE (push_save_reg_order)))));
>>>+
>>>+ /* sp adjust pattern */
>>>+ int max_allow_sp_adjust = riscv_push_pop_base_sp_adjust (frame->mask) + 48;
>>>+ int aligned_size = size;
>>>+
>>>+ /* if sp adjustment is too large, we should split it first. */
>>>+ if (aligned_size > max_allow_sp_adjust)
>>>+ {
>>>+ rtx dwarf_pre_sp_adjust = NULL_RTX;
>>>+ rtx pre_adjust_rtx = gen_add3_insn (stack_pointer_rtx,
>>>+ stack_pointer_rtx,
>>>+ GEN_INT (aligned_size - max_allow_sp_adjust));
>>>+ rtx insn = emit_insn (pre_adjust_rtx);
>>>+
>>>+ rtx cfa_pre_adjust_rtx = gen_rtx_PLUS (Pmode, stack_pointer_rtx,
>>>+ GEN_INT (aligned_size - max_allow_sp_adjust));
>>>+ dwarf_pre_sp_adjust = alloc_reg_note (REG_CFA_DEF_CFA,
>>>+ cfa_pre_adjust_rtx,
>>>+ dwarf_pre_sp_adjust);
>>>+
>>>+ RTX_FRAME_RELATED_P (insn) = 1;
>>>+ REG_NOTES (insn) = dwarf_pre_sp_adjust;
>>>+
>>>+ sp_adjust = max_allow_sp_adjust;
>>>+ }
>>>+ else
>>>+ sp_adjust = (aligned_size + 15) & (~0xf);
>>>+
>>>+ /* register save sequence. */
>>>+ for (unsigned i = 1; i < veclen; ++i)
>>>+ {
>>>+ offset -= UNITS_PER_WORD;
>>>+ unsigned regno = reg_order[i];
>>>+ rtx reg = gen_rtx_REG (Pmode, regno);
>>>+ rtx mem = gen_frame_mem (Pmode, plus_constant (Pmode,
>>>+ stack_pointer_rtx,
>>>+ offset));
>>>+ rtx set = gen_rtx_SET (reg, mem);
>>>+ RTVEC_ELT (vec, i - 1) = set;
>>>+ RTX_FRAME_RELATED_P (set) = 1;
>>>+ dwarf = alloc_reg_note (REG_CFA_RESTORE, reg, dwarf);
>>>+ }
>>>+
>>>+ /* sp adjust pattern */
>>>+ rtx adjust_sp_rtx
>>>+ = gen_rtx_SET (stack_pointer_rtx,
>>>+ plus_constant (Pmode,
>>>+ stack_pointer_rtx,
>>>+ sp_adjust));
>>>+ RTVEC_ELT (vec, veclen - 1) = adjust_sp_rtx;
>>>+
>>>+ rtx cfa_adjust_rtx = gen_rtx_PLUS (Pmode, stack_pointer_rtx,
>>>+ const0_rtx);
>>>+ dwarf = alloc_reg_note (REG_CFA_DEF_CFA, cfa_adjust_rtx, dwarf);
>>>+
>>>+ frame->gp_sp_offset -= (veclen - 1) * UNITS_PER_WORD;
>>>+ frame->push_pop_sp_adjust = sp_adjust;
>>>+
>>>+ rtx insn = emit_insn (gen_rtx_PARALLEL (VOIDmode, vec));
>>>+ RTX_FRAME_RELATED_P (insn) = 1;
>>>+ REG_NOTES (insn) = dwarf;
>>>+}
>>>+
>>> /* For stack frames that can't be allocated with a single ADDI instruction,
>>> compute the best value to initially allocate. It must at a minimum
>>> allocate enough space to spill the callee-saved registers. If TARGET_RVC,
>>>@@ -5270,6 +5516,146 @@ riscv_emit_stack_tie (void)
>>> emit_insn (gen_stack_tiedi (stack_pointer_rtx, hard_frame_pointer_rtx));
>>> }
>>>
>>>+bool
>>>+riscv_check_regno(rtx pat, unsigned regno)
>>>+{
>>>+ return REG_P (pat)
>>>+ && REGNO (pat) == regno;
>>>+}
>>>+
>>>+/* Function to check whether the OP is a valid stack push/pop operation.
>>>+ This part is borrowed from nds32 nds32_valid_stack_push_pop_p */
>>>+
>>>+bool
>>>+riscv_valid_stack_push_pop_p (rtx op, bool push_p)
>>>+{
>>>+ int index;
>>>+ int total_count;
>>>+ int sp_adjust_rtx_index;
>>>+ rtx elt;
>>>+ rtx elt_reg;
>>>+ rtx elt_plus;
>>>+
>>>+ if (!TARGET_ZCMP)
>>>+ return false;
>>>+
>>>+ total_count = XVECLEN (op, 0);
>>>+ sp_adjust_rtx_index = push_p ? 0 : total_count - 1;
>>>+
>>>+ /* At least sp + one callee save/restore register rtx */
>>>+ if (total_count < 2)
>>>+ return false;
>>>+
>>>+ /* Perform some quick check for that every element should be 'set',
>>>+ for pop, it might contain `ret` and `ret value` pattern. */
>>>+ for (index = 0; index < total_count; index++)
>>>+ {
>>>+ elt = XVECEXP (op, 0, index);
>>>+
>>>+ /* skip pop return value rtx */
>>>+ if (!push_p && GET_CODE (elt) == SET
>>>+ && riscv_check_regno (SET_DEST (elt), RETURN_VALUE_REGNUM)
>>>+ && total_count >= 4
>>>+ && index + 1 < total_count
>>>+ && GET_CODE (XVECEXP (op, 0, index + 1)) == USE)
>>>+ {
>>>+ rtx use_reg = XEXP (XVECEXP (op, 0, index + 1), 0);
>>>+
>>>+ if (!riscv_check_regno (use_reg, RETURN_VALUE_REGNUM))
>>>+ return false;
>>>+
>>>+ index += 1;
>>>+ continue;
>>>+ }
>>>+
>>>+ /* skip ret rtx */
>>>+ if (!push_p && GET_CODE (elt) == SIMPLE_RETURN
>>>+ && total_count >= 4
>>>+ && index + 1 < total_count
>>>+ && GET_CODE (XVECEXP (op, 0, index + 1)) == USE)
>>>+ {
>>>+ rtx use_reg = XEXP (XVECEXP (op, 0, index + 1), 0);
>>>+
>>>+ if (!riscv_check_regno (use_reg, RETURN_ADDR_REGNUM))
>>>+ return false;
>>>+
>>>+ index += 1;
>>>+ sp_adjust_rtx_index -= 2;
>>>+ continue;
>>>+ }
>>>+
>>>+ if (GET_CODE (elt) != SET)
>>>+ return false;
>>>+ }
>>>+
>>>+ elt = XVECEXP (op, 0, sp_adjust_rtx_index);
>>>+ elt_reg = SET_DEST (elt);
>>>+ elt_plus = SET_SRC (elt);
>>>+
>>>+ /* Check this is (set (stack_reg) (plus stack_reg const)) pattern. */
>>>+ if (GET_CODE (elt_plus) != PLUS
>>>+ || !riscv_check_regno (elt_reg, STACK_POINTER_REGNUM))
>>>+ return false;
>>>+
>>>+ /* Pass all test, this is a valid rtx. */
>>>+ return true;
>>>+}
>>>+
>>>+/* Generate push/pop rtx */
>>>+
>>>+static void
>>>+riscv_emit_push_insn (struct riscv_frame_info *frame, HOST_WIDE_INT size)
>>>+{
>>>+ unsigned int veclen = riscv_save_push_pop_count (frame->mask);
>>>+ unsigned int n_reg = veclen - 1;
>>>+ rtvec vec = rtvec_alloc (veclen);
>>>+
>>>+ const unsigned *reg_order = (TARGET_ZCMP && TARGET_RVE)
>>>+ ? push_save_reg_order_zcmpe
>>>+ : push_save_reg_order;
>>>+
>>>+ int aligned_size = (size + 15) & (~0xf);
>>>+
>>>+ gcc_assert (n_reg >= 1
>>>+ && TARGET_ZCMP
>>>+ && ((TARGET_RVE && (n_reg <= ARRAY_SIZE (push_save_reg_order_zcmpe)))
>>>+ || (TARGET_ZCMP && (n_reg <= ARRAY_SIZE (push_save_reg_order)))));
>>>+
>>>+ /* sp adjust pattern */
>>>+ int max_allow_sp_adjust = riscv_push_pop_base_sp_adjust (frame->mask) + 48;
>>>+ int sp_adjust = aligned_size > max_allow_sp_adjust ?
>>>+ max_allow_sp_adjust
>>>+ : aligned_size;
>>>+
>>>+ /*TODO: move this part to frame computation function. */
>>>+ frame->gp_sp_offset = (veclen - 1) * UNITS_PER_WORD;
>>>+ frame->push_pop_sp_adjust = sp_adjust;
>>>+
>>>+ rtx adjust_sp_rtx
>>>+ = gen_rtx_SET (stack_pointer_rtx,
>>>+ plus_constant (Pmode,
>>>+ stack_pointer_rtx,
>>>+ -sp_adjust));
>>>+ RTVEC_ELT (vec, 0) = adjust_sp_rtx;
>>>+
>>>+ /* Register save sequence. */
>>>+ for (unsigned i = 1; i < veclen; ++i)
>>>+ {
>>>+ sp_adjust -= UNITS_PER_WORD;
>>>+ unsigned regno = reg_order[i];
>>>+ rtx reg = gen_rtx_REG (Pmode, regno);
>>>+ rtx mem = gen_frame_mem (Pmode, plus_constant (Pmode,
>>>+ stack_pointer_rtx,
>>>+ sp_adjust));
>>>+ rtx set = gen_rtx_SET (mem, reg);
>>>+ RTVEC_ELT (vec, i) = set;
>>>+ RTX_FRAME_RELATED_P (set) = 1;
>>>+ }
>>>+
>>>+ rtx insn = emit_insn (gen_rtx_PARALLEL (VOIDmode, vec));
>>>+ RTX_FRAME_RELATED_P (insn) = 1;
>>>+}
>>>+
>>> /* Expand the "prologue" pattern. */
>>>
>>> void
>>>@@ -5278,6 +5664,7 @@ riscv_expand_prologue (void)
>>> struct riscv_frame_info *frame = &cfun->machine->frame;
>>> poly_int64 size = frame->total_size;
>>> unsigned mask = frame->mask;
>>>+ HOST_WIDE_INT step1 = riscv_first_stack_step (frame);
>>> rtx insn;
>>>
>>> if (flag_stack_usage_info)
>>>@@ -5300,19 +5687,32 @@ riscv_expand_prologue (void)
>>> REG_NOTES (insn) = dwarf;
>>> }
>>>
>>>+ if (size.is_constant ())
>>>+ step1 = MIN (size.to_constant(), step1);
>>>+ if (riscv_use_push_pop (frame, step1))
>>>+ {
>>>+ riscv_emit_push_insn (frame, step1);
>>>+
>>>+ step1 = MAX (step1 - frame->push_pop_sp_adjust, 0);
>>>+ size = MAX (size.to_constant() - frame->push_pop_sp_adjust, 0);
>>>+ frame->mask &= ~ ((TARGET_ZCMP && TARGET_RVE) ?
>>>+ RISCV_ZCMPE_PUSH_POP_MASK
>>>+ : RISCV_ZCE_PUSH_POP_MASK);
>>>+ }
>>>+
>>> /* Save the registers. */
>>> if ((frame->mask | frame->fmask) != 0)
>>> {
>>>- HOST_WIDE_INT step1 = riscv_first_stack_step (frame);
>>>- if (size.is_constant ())
>>>- step1 = MIN (size.to_constant(), step1);
>>>-
>>>- insn = gen_add3_insn (stack_pointer_rtx,
>>>- stack_pointer_rtx,
>>>- GEN_INT (-step1));
>>>- RTX_FRAME_RELATED_P (emit_insn (insn)) = 1;
>>>- size -= step1;
>>>- riscv_for_each_saved_reg (size, riscv_save_reg, false, false);
>>>+ if (step1 > 0)
>>>+ {
>>>+ insn = gen_add3_insn (stack_pointer_rtx,
>>>+ stack_pointer_rtx,
>>>+ GEN_INT (-step1));
>>>+ RTX_FRAME_RELATED_P (emit_insn (insn)) = 1;
>>>+ size -= step1;
>>>+ }
>>>+ riscv_for_each_saved_reg (size, riscv_save_reg,
>>>+ false /* bool epilogue */, false /* bool maybe_eh_return */);
>>> }
>>>
>>> frame->mask = mask; /* Undo the above fib. */
>>>@@ -5412,6 +5812,8 @@ riscv_expand_epilogue (int style)
>>> rtx ra = gen_rtx_REG (Pmode, RETURN_ADDR_REGNUM);
>>> rtx insn;
>>>
>>>+ bool use_zcmp_pop = !use_restore_libcall && !(crtl->calls_eh_return);
>>>+
>>> /* We need to add memory barrier to prevent read from deallocated stack. */
>>> bool need_barrier_p = known_ne (get_frame_size ()
>>> + cfun->machine->frame.arg_pointer_offset, 0);
>>>@@ -5538,6 +5940,18 @@ riscv_expand_epilogue (int style)
>>> if (use_restore_libcall)
>>> frame->mask = 0; /* Temporarily fib that we need not save GPRs. */
>>>
>>>+ if (use_zcmp_pop && riscv_use_push_pop (frame, step2))
>>>+ {
>>>+ /* Emit a barrier to prevent loads from a deallocated stack. */
>>>+ riscv_emit_stack_tie ();
>>>+ need_barrier_p = false;
>>>+ riscv_emit_pop_insn (frame, frame->total_size.to_constant(), step2);
>>>+ frame->mask &= ~ ((TARGET_ZCMP && TARGET_RVE) ?
>>>+ RISCV_ZCMPE_PUSH_POP_MASK
>>>+ : RISCV_ZCE_PUSH_POP_MASK);
>>>+ step2 = 0;
>>>+ }
>>>+
>>> /* Restore the registers. */
>>> riscv_for_each_saved_reg (frame->total_size - step2, riscv_restore_reg,
>>> true, style == EXCEPTION_RETURN);
>>>@@ -5552,6 +5966,9 @@ riscv_expand_epilogue (int style)
>>> if (need_barrier_p)
>>> riscv_emit_stack_tie ();
>>>
>>>+ if (use_zcmp_pop)
>>>+ frame->mask = mask;
>>>+
>>> /* Deallocate the final bit of the frame. */
>>> if (step2 > 0)
>>> {
>>>diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
>>>index d05b1d59853..6e6e3ee2c25 100644
>>>--- a/gcc/config/riscv/riscv.h
>>>+++ b/gcc/config/riscv/riscv.h
>>>@@ -383,6 +383,7 @@ ASM_MISA_SPEC
>>> #define HARD_FRAME_POINTER_REGNUM 8
>>> #define STACK_POINTER_REGNUM 2
>>> #define THREAD_POINTER_REGNUM 4
>>>+#define RETURN_VALUE_REGNUM 10
>>>
>>> /* These two registers don't really exist: they get eliminated to either
>>> the stack or hard frame pointer. */
>>>@@ -1097,4 +1098,7 @@ extern void riscv_remove_unneeded_save_restore_calls
>>>(void);
>>> #define DWARF_REG_TO_UNWIND_COLUMN(REGNO) \
>>> ((REGNO == RISCV_DWARF_VLENB) ? (FIRST_PSEUDO_REGISTER + 1) : REGNO)
>>>
>>>+#define RISCV_ZCE_PUSH_POP_MASK 0x0ffc0302u
>>>+#define RISCV_ZCMPE_PUSH_POP_MASK 0x302u
>>>+
>>> #endif /* ! GCC_RISCV_H */
>>>diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
>>>index bc384d9aedf..b9f2a426e48 100644
>>>--- a/gcc/config/riscv/riscv.md
>>>+++ b/gcc/config/riscv/riscv.md
>>>@@ -108,12 +108,14 @@
>>>
>>> (define_constants
>>> [(RETURN_ADDR_REGNUM 1)
>>>+ (SP_REGNUM 2)
>>> (GP_REGNUM 3)
>>> (TP_REGNUM 4)
>>> (T0_REGNUM 5)
>>> (T1_REGNUM 6)
>>> (S0_REGNUM 8)
>>> (S1_REGNUM 9)
>>>+ (A0_REGNUM 10)
>>> (S2_REGNUM 18)
>>> (S3_REGNUM 19)
>>> (S4_REGNUM 20)
>>>@@ -3147,3 +3149,4 @@
>>> (include "sifive-7.md")
>>> (include "thead.md")
>>> (include "vector.md")
>>>+(include "zc.md")
>>>diff --git a/gcc/config/riscv/t-riscv b/gcc/config/riscv/t-riscv
>>>index 6e326fc7e02..9ef522306a5 100644
>>>--- a/gcc/config/riscv/t-riscv
>>>+++ b/gcc/config/riscv/t-riscv
>>>@@ -90,6 +90,10 @@ riscv-v.o: $(srcdir)/config/riscv/riscv-v.cc \
>>> $(COMPILE) $<
>>> $(POSTCOMPILE)
>>>
>>>+riscv-zcmp-popret.o: $(srcdir)/config/riscv/riscv-zcmp-popret.cc
>>>+ $(COMPILE) $<
>>>+ $(POSTCOMPILE)
>>>+
>>> thead.o: $(srcdir)/config/riscv/thead.cc \
>>> $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TARGET_H) backend.h $(RTL_H) \
>>> memmodel.h $(EMIT_RTL_H) poly-int.h output.h
>>>diff --git a/gcc/config/riscv/zc.md b/gcc/config/riscv/zc.md
>>>new file mode 100644
>>>index 00000000000..3ad34dacd49
>>>--- /dev/null
>>>+++ b/gcc/config/riscv/zc.md
>>>@@ -0,0 +1,47 @@
>>>+;; Machine description for ZCE extension.
>>>+;; Copyright (C) 2021 Free Software Foundation, Inc.
>>>+
>>>+;; This file is part of GCC.
>>>+
>>>+;; GCC is free software; you can redistribute it and/or modify
>>>+;; it under the terms of the GNU General Public License as published by
>>>+;; the Free Software Foundation; either version 3, or (at your option)
>>>+;; any later version.
>>>+
>>>+;; GCC is distributed in the hope that it will be useful,
>>>+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
>>>+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
>>>+;; GNU General Public License for more details.
>>>+
>>>+;; You should have received a copy of the GNU General Public License
>>>+;; along with GCC; see the file COPYING3. If not see
>>>+;; <http://www.gnu.org/licenses/>.
>>>+
>>>+(define_insn "*stack_push<mode>"
>>>+ [(match_parallel 0 "riscv_stack_push_operation"
>>>+ [(set (reg:X SP_REGNUM) (plus:X (reg:X SP_REGNUM)
>>>+ (match_operand:X 1 "const_int_operand" "")))])]
>>>+ "TARGET_ZCMP"
>>>+ "cm.push\t{%L0},%1")
>>>+
>>>+(define_insn "*stack_pop<mode>"
>>>+ [(match_parallel 0 "riscv_stack_pop_operation"
>>>+ [(set (match_operand:X 1 "register_operand" "")
>>>+ (mem:X (plus:X (reg:X SP_REGNUM)
>>>+ (match_operand:X 2 "const_int_operand" ""))))])]
>>>+ "TARGET_ZCMP"
>>>+ {
>>>+ return riscv_output_popret_p (operands[0]) ?
>>>+ "cm.popret\t{%L0},%s0" :
>>>+ "cm.pop\t{%L0},%s0";
>>>+ })
>>>+
>>>+(define_insn "*stack_pop_with_return_value<mode>"
>>>+ [(match_parallel 0 "riscv_stack_pop_operation"
>>>+ [(set (reg:ANYI A0_REGNUM)
>>>+ (match_operand:ANYI 1 "pop_return_value_constant" ""))])]
>>>+ "TARGET_ZCMP"
>>>+ {
>>>+ gcc_assert (riscv_output_popret_p (operands[0]));
>>>+ return "cm.popretz\t{%L0},%s0";
>>>+ })
>>>--
>>>2.25.1

      reply	other threads:[~2023-05-12  9:10 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-04-25 10:11 Fei Gao
2023-05-05 15:57 ` Sinan
2023-05-06  8:53   ` Fei Gao
2023-05-12  8:12     ` Sinan
2023-05-12  9:10       ` Fei Gao [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230512171020765870170@eswincomputing.com \
    --to=gaofei@eswincomputing.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jiawei@iscas.ac.cn \
    --cc=sinan.lin@linux.alibaba.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).