From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ua1-x935.google.com (mail-ua1-x935.google.com [IPv6:2607:f8b0:4864:20::935]) by sourceware.org (Postfix) with ESMTPS id E1BC33858D28 for ; Thu, 4 May 2023 09:04:02 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org E1BC33858D28 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com Received: by mail-ua1-x935.google.com with SMTP id a1e0cc1a2514c-77d46c7dd10so7815773241.0 for ; Thu, 04 May 2023 02:04:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1683191042; x=1685783042; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=Xf57rSdzHVO2AgY2fLEpEuY+n1VrY61ey0sY8ZVCXr0=; b=cmag+fJKVIzJwD/7Iq7Hgybcq0JrZcDvxBaYcoAs+OeeR1Pxm20HGMdlMijJRO66zh Tt2Hiam2U531jjKu1pF2hF5iXkV8cVF2AUgCoihYf2vu9f+A5vt0htotrGUo5YhwLBm8 n5LJX5vYX0IxKZjqVWyUWU0m9AdWYukcDxpev1r4h62k/cXdRA9nbdXHXAJ135uo9fTl KupbJbq0HJv/2qBoPuxaaZq4uVlOt4DtRx5xdNQMMXm8KKb/Bsjd1ZkgqZQ98yIC0B+t NNx6NcqoPhTnTSZtoh6L/cDW0S2VNvUMsbjJDSnrVoJlI9rG0BymrPNzznvn4rjonGbw z2tw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1683191042; x=1685783042; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=Xf57rSdzHVO2AgY2fLEpEuY+n1VrY61ey0sY8ZVCXr0=; b=PovUzXd9ppBxFb0vR1h3nM4OqShylfkVPEn/Zde+5/3PLepAam1CC/TOlhX9FkDNJV L812jcAEvE3oMNWW86lkOS+b8Rgb9yqmNaZIKsnat7rp+bbG7rckEn8AgX+T8sPocYJ2 1Zkr6P2nBP2AigWNZShQW5GazLfIrdbw3BWJlX/NJX1UxylbhLTNepIXAWMvaSuxEVAg Y6p/DP29AUPdDtsRIoZBuM0+1dcLyjn98cn9Cc6kzNlyGF0ByR1QZ24R9BBRlX6akIwU UQIBvdCQ7hjOqZjhAy87eTQ28upUUlJdjYtb5Rv3d4aRtsu5s8Jr58+rNuO0tTKu2EAh xpDQ== X-Gm-Message-State: AC+VfDwfhboDEUZVGKC0cntsc3F2VDH22bKLL8zTM4VZBBQj+EoumPy4 QGhn8UXOS44Kfuz8FkdGIjx8vr6u8jJb2667lO4= X-Google-Smtp-Source: ACHHUZ5C1qK0ZofpKivr3EIFEv5a3e34x99AcQfRY235XQ+XZX6pyM+Yxv1P2+qjKyu98WozjQmqRbVG1DRKseMcu+M= X-Received: by 2002:a05:6102:161f:b0:42f:68d8:f83e with SMTP id cu31-20020a056102161f00b0042f68d8f83emr2027942vsb.12.1683191041683; Thu, 04 May 2023 02:04:01 -0700 (PDT) MIME-Version: 1.0 References: <20230406062118.47431-1-jiawei@iscas.ac.cn> <20230406062118.47431-5-jiawei@iscas.ac.cn> In-Reply-To: <20230406062118.47431-5-jiawei@iscas.ac.cn> From: Kito Cheng Date: Thu, 4 May 2023 17:03:50 +0800 Message-ID: Subject: Re: [PATCH 4/5] RISC-V: Add Zcmp extension supports. To: Jiawei Cc: gcc-patches@gcc.gnu.org, kito.cheng@sifive.com, palmer@dabbelt.com, christoph.muellner@vrull.eu, jeremy.bennett@embecosm.com, mary.bennett@embecosm.com, nandni.jamnadas@embecosm.com, charlie.keaney@embecosm.com, simon.cook@embecosm.com, tariq.kurd@codasip.com, ibrahim.abu.kharmeh1@huawei.com, sinan.lin@linux.alibaba.com, wuwei2016@iscas.ac.cn, shihua@iscas.ac.cn, shiyulong@iscas.ac.cn, chenyixuan@iscas.ac.cn Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-8.8 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,GIT_PATCH_0,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: Could you rebase this patch, we have some changes on > All "zcmpe" means Zcmp with RVE extension. Use zcmp_rve instead, zcmpe seems like a new ext. name > diff --git a/gcc/config/riscv/riscv-zcmp-popret.cc b/gcc/config/riscv/riscv-zcmp-popret.cc > new file mode 100644 > index 00000000000..d7b40f6a3e2 > --- /dev/null > +++ b/gcc/config/riscv/riscv-zcmp-popret.cc > @@ -0,0 +1,260 @@ Need a header here like "^#$% for RISC-V Copyright (C) 2023 Free Software Foundation, Inc." here > +#include "config.h" ... > +#include "cfgrtl.h" > + > +#define IN_TARGET_CODE 1 This should appear before include anything. > + > +namespace { > + > +/* > + 1. preprocessing: > + 1.1. if there is no push rtx, then just return. e.g. > + (note 5 1 22 2 [bb 2] NOTE_INSN_BASIC_BLOCK) > + (insn/f 22 5 23 2 (set (reg/f:SI 2 sp) > + (plus:SI (reg/f:SI 2 sp) > + (const_int -32 [0xffffffffffffffe0]))) > + (nil)) > + (note 23 22 2 2 NOTE_INSN_PROLOGUE_END) > + 1.2. if push rtx exists, then we compute the number of > + pushed s-registers, n_sreg. > + > + push rtx should be find before NOTE_INSN_PROLOGUE_END tag > + > + [2 and 3 happend simultaneously] > + 2. find valid move pattern, mv sN, aN, where N < n_sreg, > + and aN is not used the move pattern, and sN is not > + defined before the move pattern (from prologue to the > + position of move pattern). > + 3. analysis use and reach of every instruction from prologue > + to the position of move pattern. > + if any sN is used, then we mark the corresponding argument list > + candidate as invalid. > + e.g. > + push {ra,s0-s3}, {}, -32 > + sw s0,44(sp) # s0 is used, then argument list is invalid > + mv a0,a5 # a0 is defined, then argument list is invalid > + ... > + mv s0,a0 > + mv s1,a1 > + mv s2,a2 > + > + 4. if there is a valid argument list, then replace the pop > + push parallel insn, and delete mv pattern. > + if not, skip. > +*/ I am not sure I understand this optimization pass correctly, could you give more example or indicate which testcase can demonstrate this pass? And I would prefer this pass split from this patch, let it become a separated patch including testcase. > diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc > index 5f8cbfc15ed..17df2f3f8cf 100644 > +/* Order for the CLOBBERs/USEs of push/pop. */ > +static const unsigned push_save_reg_order[] = { push_save_reg_order -> zcmp_push_save_reg_order > + INVALID_REGNUM, RETURN_ADDR_REGNUM, S0_REGNUM, > + S1_REGNUM, S2_REGNUM, S3_REGNUM, S4_REGNUM, > + S5_REGNUM, S6_REGNUM, S7_REGNUM, S8_REGNUM, > + S9_REGNUM, S10_REGNUM, S11_REGNUM > +}; > + > +/* Order for the CLOBBERs/USEs of push/pop in rve. */ > +static const unsigned push_save_reg_order_zcmpe[] = { push_save_reg_order_zcmpe -> zcmp_rve_push_save_reg_order > @@ -4777,6 +4881,66 @@ riscv_use_save_libcall (const struct riscv_frame_info *frame) > return frame->save_libcall_adjustment != 0; > } > > +/* Determine how many instructions related to push/pop instructions. */ > + > +static unsigned > +riscv_save_push_pop_count (unsigned mask) > +{ > + if (!BITSET_P (mask, GP_REG_FIRST + RETURN_ADDR_REGNUM)) > + return 0; > + for (unsigned n = GP_REG_LAST; n > GP_REG_FIRST; n--) > + if (BITSET_P (mask, n) > + && !call_used_regs [n]) > + /* add ra saving and sp adjust. */ > + return CALLEE_SAVED_REG_NUMBER (n) + 1 + 2; What the magic number of `+ 1 + 2`? > + abort (); > +} > + > +/* Calculate the maximum sp adjustment of push/pop instruction. */ > + > +static unsigned > +riscv_push_pop_base_sp_adjust (unsigned mask) > +{ > + unsigned n_regs = riscv_save_push_pop_count (mask) - 1; > + return (n_regs * UNITS_PER_WORD + 15) & (~0xf); Use ROUND_UP > @@ -5171,6 +5337,86 @@ riscv_for_each_saved_reg (poly_int64 sp_offset, riscv_save_restore_fn fn, > } > } > > +static void > +riscv_emit_pop_insn (struct riscv_frame_info *frame, HOST_WIDE_INT offset, HOST_WIDE_INT size) > +{ > + unsigned int veclen = riscv_save_push_pop_count (frame->mask); > + unsigned int n_reg = veclen - 1; > + rtvec vec = rtvec_alloc (veclen); > + HOST_WIDE_INT sp_adjust; > + rtx dwarf = NULL_RTX; > + > + const unsigned *reg_order = (TARGET_ZCMP && TARGET_RVE) > + ? push_save_reg_order_zcmpe > + : push_save_reg_order; > + > + gcc_assert (n_reg >= 1 > + && TARGET_ZCMP > + && ((TARGET_RVE && (n_reg <= ARRAY_SIZE (push_save_reg_order_zcmpe))) > + || (TARGET_ZCMP && (n_reg <= ARRAY_SIZE (push_save_reg_order))))); > + > + /* sp adjust pattern */ > + int max_allow_sp_adjust = riscv_push_pop_base_sp_adjust (frame->mask) + 48; > + int aligned_size = size; > + > + /* if sp adjustment is too large, we should split it first. */ > + if (aligned_size > max_allow_sp_adjust) > + { > + rtx dwarf_pre_sp_adjust = NULL_RTX; > + rtx pre_adjust_rtx = gen_add3_insn (stack_pointer_rtx, > + stack_pointer_rtx, > + GEN_INT (aligned_size - max_allow_sp_adjust)); > + rtx insn = emit_insn (pre_adjust_rtx); > + > + rtx cfa_pre_adjust_rtx = gen_rtx_PLUS (Pmode, stack_pointer_rtx, > + GEN_INT (aligned_size - max_allow_sp_adjust)); > + dwarf_pre_sp_adjust = alloc_reg_note (REG_CFA_DEF_CFA, > + cfa_pre_adjust_rtx, > + dwarf_pre_sp_adjust); > + > + RTX_FRAME_RELATED_P (insn) = 1; > + REG_NOTES (insn) = dwarf_pre_sp_adjust; > + > + sp_adjust = max_allow_sp_adjust; > + } > + else > + sp_adjust = (aligned_size + 15) & (~0xf); Use ROUND_UP > @@ -5270,6 +5516,146 @@ riscv_emit_stack_tie (void) > emit_insn (gen_stack_tiedi (stack_pointer_rtx, hard_frame_pointer_rtx)); > } > > +bool > +riscv_check_regno(rtx pat, unsigned regno) > +{ > + return REG_P (pat) > + && REGNO (pat) == regno; > +} > + > +/* Function to check whether the OP is a valid stack push/pop operation. > + This part is borrowed from nds32 nds32_valid_stack_push_pop_p */ > + > +bool > +riscv_valid_stack_push_pop_p (rtx op, bool push_p) > +{ > + int index; > + int total_count; > + int sp_adjust_rtx_index; > + rtx elt; > + rtx elt_reg; > + rtx elt_plus; > + > + if (!TARGET_ZCMP) > + return false; > + > + total_count = XVECLEN (op, 0); > + sp_adjust_rtx_index = push_p ? 0 : total_count - 1; > + > + /* At least sp + one callee save/restore register rtx */ > + if (total_count < 2) > + return false; > + > + /* Perform some quick check for that every element should be 'set', > + for pop, it might contain `ret` and `ret value` pattern. */ > + for (index = 0; index < total_count; index++) > + { > + elt = XVECEXP (op, 0, index); > + > + /* skip pop return value rtx */ > + if (!push_p && GET_CODE (elt) == SET > + && riscv_check_regno (SET_DEST (elt), RETURN_VALUE_REGNUM) > + && total_count >= 4 > + && index + 1 < total_count > + && GET_CODE (XVECEXP (op, 0, index + 1)) == USE) > + { > + rtx use_reg = XEXP (XVECEXP (op, 0, index + 1), 0); > + > + if (!riscv_check_regno (use_reg, RETURN_VALUE_REGNUM)) > + return false; > + > + index += 1; > + continue; > + } > + > + /* skip ret rtx */ > + if (!push_p && GET_CODE (elt) == SIMPLE_RETURN > + && total_count >= 4 > + && index + 1 < total_count > + && GET_CODE (XVECEXP (op, 0, index + 1)) == USE) > + { > + rtx use_reg = XEXP (XVECEXP (op, 0, index + 1), 0); > + > + if (!riscv_check_regno (use_reg, RETURN_ADDR_REGNUM)) > + return false; > + > + index += 1; > + sp_adjust_rtx_index -= 2; > + continue; > + } > + > + if (GET_CODE (elt) != SET) > + return false; > + } > + > + elt = XVECEXP (op, 0, sp_adjust_rtx_index); > + elt_reg = SET_DEST (elt); > + elt_plus = SET_SRC (elt); > + > + /* Check this is (set (stack_reg) (plus stack_reg const)) pattern. */ > + if (GET_CODE (elt_plus) != PLUS > + || !riscv_check_regno (elt_reg, STACK_POINTER_REGNUM)) > + return false; > + > + /* Pass all test, this is a valid rtx. */ > + return true; > +} > + > +/* Generate push/pop rtx */ > + > +static void > +riscv_emit_push_insn (struct riscv_frame_info *frame, HOST_WIDE_INT size) > +{ > + unsigned int veclen = riscv_save_push_pop_count (frame->mask); > + unsigned int n_reg = veclen - 1; Need comment to explain why `- 1` here. > + rtvec vec = rtvec_alloc (veclen); > + > + const unsigned *reg_order = (TARGET_ZCMP && TARGET_RVE) > + ? push_save_reg_order_zcmpe > + : push_save_reg_order; > + > + int aligned_size = (size + 15) & (~0xf); Use ROUND_UP > + > + gcc_assert (n_reg >= 1 > + && TARGET_ZCMP > + && ((TARGET_RVE && (n_reg <= ARRAY_SIZE (push_save_reg_order_zcmpe))) > + || (TARGET_ZCMP && (n_reg <= ARRAY_SIZE (push_save_reg_order))))); > + > + /* sp adjust pattern */ > + int max_allow_sp_adjust = riscv_push_pop_base_sp_adjust (frame->mask) + 48; What's the magic number of 48? > + int sp_adjust = aligned_size > max_allow_sp_adjust ? > + max_allow_sp_adjust > + : aligned_size; > + > + /*TODO: move this part to frame computation function. */ Is it possible to resolve this TODO? > + frame->gp_sp_offset = (veclen - 1) * UNITS_PER_WORD; > + frame->push_pop_sp_adjust = sp_adjust; > diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h > index d05b1d59853..6e6e3ee2c25 100644 > --- a/gcc/config/riscv/riscv.h > +++ b/gcc/config/riscv/riscv.h > @@ -383,6 +383,7 @@ ASM_MISA_SPEC > #define HARD_FRAME_POINTER_REGNUM 8 > #define STACK_POINTER_REGNUM 2 > #define THREAD_POINTER_REGNUM 4 > +#define RETURN_VALUE_REGNUM 10 > > /* These two registers don't really exist: they get eliminated to either > the stack or hard frame pointer. */ > @@ -1097,4 +1098,7 @@ extern void riscv_remove_unneeded_save_restore_calls (void); > #define DWARF_REG_TO_UNWIND_COLUMN(REGNO) \ > ((REGNO == RISCV_DWARF_VLENB) ? (FIRST_PSEUDO_REGISTER + 1) : REGNO) > > +#define RISCV_ZCE_PUSH_POP_MASK 0x0ffc0302u RISCV_ZCE_PUSH_POP_MASK -> RISCV_ZCMP_PUSH_POP_MASK > +#define RISCV_ZCMPE_PUSH_POP_MASK 0x302u RISCV_ZCMPE_PUSH_POP_MASK -> RISCV_ZCMP_RVE_PUSH_POP_MASK > diff --git a/gcc/config/riscv/t-riscv b/gcc/config/riscv/t-riscv > index 6e326fc7e02..9ef522306a5 100644 > --- a/gcc/config/riscv/t-riscv > +++ b/gcc/config/riscv/t-riscv > @@ -90,6 +90,10 @@ riscv-v.o: $(srcdir)/config/riscv/riscv-v.cc \ > $(COMPILE) $< > $(POSTCOMPILE) > > +riscv-zcmp-popret.o: $(srcdir)/config/riscv/riscv-zcmp-popret.cc Plz add right dependency here. > diff --git a/gcc/config/riscv/zc.md b/gcc/config/riscv/zc.md > new file mode 100644 > index 00000000000..3ad34dacd49 > --- /dev/null > +++ b/gcc/config/riscv/zc.md > @@ -0,0 +1,47 @@ > +;; Machine description for ZCE extension. ZCE extension. -> Zc* extension