From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
From: Kito Cheng
Date: Tue, 20 Sep 2022 00:16:41 +0200
Subject: Re: [PATCH] RISC-V modified add3 for large stack frame optimization [PR105733]
To: Kevin Lee
Cc: gcc-patches@gcc.gnu.org, gnu-toolchain@rivosinc.com
Content-Type: multipart/alternative; boundary="000000000000a33e3005e90f11a8"

--000000000000a33e3005e90f11a8
Content-Type: text/plain; charset="UTF-8"

Could you provide some data, including code size and performance?
The add pattern is used frequently, so we should be more careful when
changing it.

On Mon, Sep 19, 2022 at 18:07, Kevin Lee wrote:

> Hello GCC,
> Starting from Jim Wilson's patch in
>
> https://github.com/riscv-admin/riscv-code-speed-optimization/blob/main/projects/gcc-optimizations.adoc
> for the large stack frame optimization problem, this augmented patch
> generates fewer instructions for cases such as
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105733.
> Original:
> foo:
>     li t0,-4096
>     addi t0,t0,2016
>     li a4,4096
>     add sp,sp,t0
>     li a5,-4096
>     addi a4,a4,-2032
>     add a4,a4,a5
>     addi a5,sp,16
>     add a5,a4,a5
>     add a0,a5,a0
>     li t0,4096
>     sd a5,8(sp)
>     sb zero,2032(a0)
>     addi t0,t0,-2016
>     add sp,sp,t0
>     jr ra
> After Patch:
> foo:
>     li t0,-4096
>     addi t0,t0,2032
>     add sp,sp,t0
>     addi a5,sp,-2032
>     add a0,a5,a0
>     li t0,4096
>     sb zero,2032(a0)
>     addi t0,t0,-2032
>     add sp,sp,t0
>     jr ra
>
> ========= Summary of gcc testsuite =========
>                         | # of unexpected case / # of unique unexpected case
>                         |    gcc |     g++ | gfortran |
>  rv64gc/ lp64d/ medlow  |  4 / 4 |  13 / 4 |    0 / 0 |
> No additional failures were created from the testsuite.
>
> gcc/ChangeLog:
>     Jim Wilson
>     Michael Collison
>     Kevin Lee
>
>     * config/riscv/predicates.md (const_lui_operand): New predicate.
>     (add_operand): Ditto.
>     (reg_or_const_int_operand): Ditto.
>     * config/riscv/riscv-protos.h (riscv_eliminable_reg): New function.
>     * config/riscv/riscv.cc (riscv_eliminable_reg): New Function.
>     (riscv_adjust_libcall_cfi_prologue): Use gen_rtx_SET and
>     gen_rtx_fmt_ee instead of gen_add3_insn.
>     (riscv_adjust_libcall_cfi_epilogue): ditto.
>     * config/riscv/riscv.md (addsi3): Remove.
>     (adddi3): ditto.
>     (add3): New instruction for large stack frame optimization.
>     (add3_internal): ditto
>     (add3_internal2): New instruction for insns generated in
>     the prologue and epilogue pass.
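For readers following the two listings above: each li/addi pair is a hi/lo
split of a constant that does not fit in a signed 12-bit immediate. The
sketch below reproduces that arithmetic. It assumes CONST_HIGH_PART rounds to
the nearest multiple of 4096, which is how I read gcc/config/riscv/riscv.h.
The helper names and the elimination offset of -4096 are made up for
illustration and are not taken from the patch.

```python
# Sketch of the hi/lo constant split behind the li/addi pairs.
# Assumption: CONST_HIGH_PART(v) == (v + IMM_REACH/2) & ~(IMM_REACH - 1).

IMM_BITS = 12
IMM_REACH = 1 << IMM_BITS  # 4096


def const_high_part(value):
    """Round value to the nearest multiple of 4096 (the part li/lui can load)."""
    return (value + IMM_REACH // 2) & ~(IMM_REACH - 1)


def split_const(value):
    """Return (high, low) with high + low == value and low a signed 12-bit value."""
    high = const_high_part(value)
    return high, value - high


def fits_simm12(value):
    """True if value can be encoded as an addi immediate."""
    return -(IMM_REACH // 2) <= value < IMM_REACH // 2


# Constants taken from the listings: the two stack adjustments and the
# 4096 / -2032 pair used for the address computation.
for value in (-2080, -2064, 2064):
    high, low = split_const(value)
    assert high + low == value and fits_simm12(low)
    print(f"{value:6d} -> li {high}; addi {low}")

# Hypothetical elimination step: if the eliminable register turns into
# sp + (-4096), the high part cancels and only one addi is left.
high, low = split_const(2064 + (-4096))
print(f"after elimination: high={high}, low={low}")
```

If the high part cancels like this after register elimination, the remaining
constant fits a single addi, which would explain the `addi a5,sp,-2032` in
the second listing.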
> ---
>  gcc/config/riscv/predicates.md  | 13 +++++
>  gcc/config/riscv/riscv-protos.h |  1 +
>  gcc/config/riscv/riscv.cc       | 20 ++++++--
>  gcc/config/riscv/riscv.md       | 84 ++++++++++++++++++++++++++++-----
>  4 files changed, 101 insertions(+), 17 deletions(-)
>
> diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
> index 862e72b0983..b98bb5a9768 100644
> --- a/gcc/config/riscv/predicates.md
> +++ b/gcc/config/riscv/predicates.md
> @@ -35,6 +35,14 @@ (define_predicate "sfb_alu_operand"
>    (ior (match_operand 0 "arith_operand")
>         (match_operand 0 "lui_operand")))
>
> +(define_predicate "const_lui_operand"
> +  (and (match_code "const_int")
> +       (match_test "(INTVAL (op) & 0xFFF) == 0 && INTVAL (op) != 0")))
> +
> +(define_predicate "add_operand"
> +  (ior (match_operand 0 "arith_operand")
> +       (match_operand 0 "const_lui_operand")))
> +
>  (define_predicate "const_csr_operand"
>    (and (match_code "const_int")
>         (match_test "IN_RANGE (INTVAL (op), 0, 31)")))
> @@ -59,6 +67,11 @@ (define_predicate "reg_or_0_operand"
>    (ior (match_operand 0 "const_0_operand")
>         (match_operand 0 "register_operand")))
>
> +;; For use in adds, when adding to an eliminable register.
> +(define_predicate "reg_or_const_int_operand"
> +  (ior (match_code "const_int")
> +       (match_operand 0 "register_operand")))
> +
>  ;; Only use branch-on-bit sequences when the mask is not an ANDI immediate.
>  (define_predicate "branch_on_bit_operand"
>    (and (match_code "const_int")
> diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
> index 649c5c977e1..8f0aa8114be 100644
> --- a/gcc/config/riscv/riscv-protos.h
> +++ b/gcc/config/riscv/riscv-protos.h
> @@ -63,6 +63,7 @@ extern void riscv_expand_conditional_move (rtx, rtx, rtx, rtx_code, rtx, rtx);
>  extern rtx riscv_legitimize_call_address (rtx);
>  extern void riscv_set_return_address (rtx, rtx);
>  extern bool riscv_expand_block_move (rtx, rtx, rtx);
> +extern bool riscv_eliminable_reg (rtx);
>  extern rtx riscv_return_addr (int, rtx);
>  extern poly_int64 riscv_initial_elimination_offset (int, int);
>  extern void riscv_expand_prologue (void);
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> index 675d92c0961..b5577a4f366 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -4320,6 +4320,16 @@ riscv_initial_elimination_offset (int from, int to)
>    return src - dest;
>  }
>
> +/* Return true if X is a register that will be eliminated later on.  */
> +bool
> +riscv_eliminable_reg (rtx x)
> +{
> +  return REG_P (x) && (REGNO (x) == FRAME_POINTER_REGNUM
> +                       || REGNO (x) == ARG_POINTER_REGNUM
> +                       || (REGNO (x) >= FIRST_VIRTUAL_REGISTER
> +                           && REGNO (x) <= LAST_VIRTUAL_REGISTER));
> +}
> +
>  /* Implement RETURN_ADDR_RTX.  We do not support moving back to a
>     previous frame.  */
>
> @@ -4521,8 +4531,9 @@ riscv_adjust_libcall_cfi_prologue ()
>    }
>
>    /* Debug info for adjust sp.  */
> -  adjust_sp_rtx = gen_add3_insn (stack_pointer_rtx,
> -                                 stack_pointer_rtx, GEN_INT (-saved_size));
> +  adjust_sp_rtx = gen_rtx_SET (stack_pointer_rtx,
> +                               gen_rtx_fmt_ee (PLUS, GET_MODE (stack_pointer_rtx),
> +                                               stack_pointer_rtx, GEN_INT (saved_size)));
>    dwarf = alloc_reg_note (REG_CFA_ADJUST_CFA, adjust_sp_rtx,
>                            dwarf);
>    return dwarf;
> @@ -4624,8 +4635,9 @@ riscv_adjust_libcall_cfi_epilogue ()
>    int saved_size = cfun->machine->frame.save_libcall_adjustment;
>
>    /* Debug info for adjust sp.  */
> -  adjust_sp_rtx = gen_add3_insn (stack_pointer_rtx,
> -                                 stack_pointer_rtx, GEN_INT (saved_size));
> +  adjust_sp_rtx = gen_rtx_SET (stack_pointer_rtx,
> +                               gen_rtx_fmt_ee (PLUS, GET_MODE (stack_pointer_rtx),
> +                                               stack_pointer_rtx, GEN_INT (saved_size)));
>    dwarf = alloc_reg_note (REG_CFA_ADJUST_CFA, adjust_sp_rtx,
>                            dwarf);
>
> diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
> index 014206fb8bd..0285ac67b2a 100644
> --- a/gcc/config/riscv/riscv.md
> +++ b/gcc/config/riscv/riscv.md
> @@ -438,23 +438,80 @@ (define_insn "add3"
>    [(set_attr "type" "fadd")
>     (set_attr "mode" "")])
>
> -(define_insn "addsi3"
> -  [(set (match_operand:SI 0 "register_operand" "=r,r")
> -        (plus:SI (match_operand:SI 1 "register_operand" " r,r")
> -                 (match_operand:SI 2 "arith_operand" " r,I")))]
> +(define_expand "add3"
> +  [(parallel
> +    [(set (match_operand:GPR 0 "register_operand" "")
> +          (plus:GPR (match_operand:GPR 1 "register_operand" "")
> +                    (match_operand:GPR 2 "add_operand" "")))
> +    (clobber (match_scratch:GPR 3 ""))])]
>    ""
> -  "add%i2%~\t%0,%1,%2"
> +{
> +  if (riscv_eliminable_reg (operands[1]))
> +    {
> +      if (splittable_const_int_operand (operands[2], mode))
> +        {
> +          /* The idea here is that we emit
> +               add op0, op1, %hi(op2)
> +               addi op0, op0, %lo(op2)
> +             Then when op1, the eliminable reg, gets replaced with sp+offset,
> +             we can simplify the constants.  */
> +          HOST_WIDE_INT high_part = CONST_HIGH_PART (INTVAL (operands[2]));
> +          emit_insn (gen_add3_internal (operands[0], operands[1],
> +                                        GEN_INT (high_part)));
> +          operands[1] = operands[0];
> +          operands[2] = GEN_INT (INTVAL (operands[2]) - high_part);
> +        }
> +      else if (! const_arith_operand (operands[2], mode))
> +        operands[2] = force_reg (mode, operands[2]);
> +    }
> +})
> +
> +(define_insn_and_split "add3_internal"
> +  [(set (match_operand:GPR 0 "register_operand" "=r,r,&r,!&r")
> +        (plus:GPR (match_operand:GPR 1 "register_operand" " %r,r,r,0")
> +                  (match_operand:GPR 2 "add_operand" " r,I,L,L")))
> +   (clobber (match_scratch:GPR 3 "=X,X,X,&r"))]
> +  ""
> +{
> +  if ((which_alternative == 2) || (which_alternative == 3))
> +    return "#";
> +
> +  if (TARGET_64BIT && mode == SImode)
> +    return "add%i2w\t%0,%1,%2";
> +  else
> +    return "add%i2\t%0,%1,%2";
> +}
> +  "&& reload_completed && const_lui_operand (operands[2], mode)"
> +  [(const_int 0)]
> +{
> +  if (REGNO (operands[0]) != REGNO (operands[1]))
> +    {
> +      emit_insn (gen_mov (operands[0], operands[2]));
> +      emit_insn (gen_add3_internal (operands[0], operands[0], operands[1]));
> +    }
> +  else
> +    {
> +      emit_insn (gen_mov (operands[3], operands[2]));
> +      emit_insn (gen_add3_internal (operands[0], operands[0], operands[3]));
> +    }
> +  DONE;
> +}
>    [(set_attr "type" "arith")
> -   (set_attr "mode" "SI")])
> +   (set_attr "mode" "")])
>
> -(define_insn "adddi3"
> -  [(set (match_operand:DI 0 "register_operand" "=r,r")
> -        (plus:DI (match_operand:DI 1 "register_operand" " r,r")
> -                 (match_operand:DI 2 "arith_operand" " r,I")))]
> -  "TARGET_64BIT"
> -  "add%i2\t%0,%1,%2"
> +(define_insn "add3_internal2"
> +  [(set (match_operand:GPR 0 "register_operand" "=r,r")
> +        (plus:GPR (match_operand:GPR 1 "register_operand" " %r,r")
> +                  (match_operand:GPR 2 "arith_operand" " r,I")))]
> +  ""
> +  {
> +    if (TARGET_64BIT && mode == SImode)
> +      return "add%i2w\t%0,%1,%2";
> +    else
> +      return "add%i2\t%0,%1,%2";
> +  }
>    [(set_attr "type" "arith")
> -   (set_attr "mode" "DI")])
> +   (set_attr "mode" "")])
>
>  (define_expand "addv4"
>    [(set (match_operand:GPR 0 "register_operand" "=r,r")
> @@ -500,6 +557,7 @@ (define_expand "addv4"
>    DONE;
>  })
>
> +
>  (define_expand "uaddv4"
>    [(set (match_operand:GPR 0 "register_operand" "=r,r")
>          (plus:GPR (match_operand:GPR 1 "register_operand" " r,r")
> --
> 2.25.1
>
--000000000000a33e3005e90f11a8--