From: Kevin Lee
Date: Tue, 1 Nov 2022 10:25:13 -0700
Subject: [PATCH v2] RISC-V modified add3 for large stack frame optimization [PR105733]
To: gcc-patches@gcc.gnu.org
Cc: gnu-toolchain@rivosinc.com, Michael Collison
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="0000000000006801f705ec6c02ca"

--0000000000006801f705ec6c02ca
Content-Type: text/plain; charset="UTF-8"

This is an updated version of the patch posted at
https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601824.html.
Now that riscv-selftests.cc has been added, this version also updates
calculate_x_in_sequence in riscv-selftests.cc to handle PARALLEL insns.

The patch has been tested with rv64imafdc, rv64imac, rv32imafdc and
rv32imac; no additional failures were detected in the testsuite.

gcc/ChangeLog:

	Jim Wilson
	Michael Collison
	Kevin Lee
	* config/riscv/predicates.md (const_lui_operand): New predicate.
	(add_operand): Ditto.
	(reg_or_const_int_operand): Ditto.
	* config/riscv/riscv-protos.h (riscv_eliminable_reg): New function.
	* config/riscv/riscv-selftests.cc (calculate_x_in_sequence): Consider
	PARALLEL insns.
	* config/riscv/riscv.cc (riscv_eliminable_reg): New function.
	(riscv_adjust_libcall_cfi_prologue): Use gen_rtx_SET and
	gen_rtx_fmt_ee instead of gen_add3_insn.
	(riscv_adjust_libcall_cfi_epilogue): Ditto.
	* config/riscv/riscv.md (addsi3): Remove.
	(add<mode>3): New instruction for large stack frame optimization.
	(add<mode>3_internal): Ditto.
	(adddi3): Remove.
	(add<mode>3_internal2): New instruction for insns generated in the
	prologue and epilogue pass.
---
 gcc/config/riscv/predicates.md      | 13 +++++
 gcc/config/riscv/riscv-protos.h     |  1 +
 gcc/config/riscv/riscv-selftests.cc |  3 ++
 gcc/config/riscv/riscv.cc           | 20 +++++--
 gcc/config/riscv/riscv.md           | 84 ++++++++++++++++++++++++-----
 5 files changed, 104 insertions(+), 17 deletions(-)

diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
index c2ff41bb0fd..3149f7227ac 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -35,6 +35,14 @@
   (ior (match_operand 0 "arith_operand")
        (match_operand 0 "lui_operand")))
 
+(define_predicate "const_lui_operand"
+  (and (match_code "const_int")
+       (match_test "(INTVAL (op) & 0xFFF) == 0 && INTVAL (op) != 0")))
+
+(define_predicate "add_operand"
+  (ior (match_operand 0 "arith_operand")
+       (match_operand 0 "const_lui_operand")))
+
 (define_predicate "const_csr_operand"
   (and (match_code "const_int")
        (match_test "IN_RANGE (INTVAL (op), 0, 31)")))
@@ -59,6 +67,11 @@
   (ior (match_operand 0 "const_0_operand")
        (match_operand 0 "register_operand")))
 
+;; For use in adds, when adding to an eliminable register.
+(define_predicate "reg_or_const_int_operand"
+  (ior (match_code "const_int")
+       (match_operand 0 "register_operand")))
+
 ;; Only use branch-on-bit sequences when the mask is not an ANDI immediate.
 (define_predicate "branch_on_bit_operand"
   (and (match_code "const_int")
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 5a718bb62b4..9348ac71956 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -63,6 +63,7 @@ extern void riscv_expand_conditional_move (rtx, rtx, rtx, rtx_code, rtx, rtx);
 extern rtx riscv_legitimize_call_address (rtx);
 extern void riscv_set_return_address (rtx, rtx);
 extern bool riscv_expand_block_move (rtx, rtx, rtx);
+extern bool riscv_eliminable_reg (rtx);
 extern rtx riscv_return_addr (int, rtx);
 extern poly_int64 riscv_initial_elimination_offset (int, int);
 extern void riscv_expand_prologue (void);
diff --git a/gcc/config/riscv/riscv-selftests.cc b/gcc/config/riscv/riscv-selftests.cc
index 636874ebc0f..50457db708e 100644
--- a/gcc/config/riscv/riscv-selftests.cc
+++ b/gcc/config/riscv/riscv-selftests.cc
@@ -116,6 +116,9 @@ calculate_x_in_sequence (rtx reg)
       rtx pat = PATTERN (insn);
       rtx dest = SET_DEST (pat);
 
+      if (GET_CODE (pat) == PARALLEL)
+        dest = SET_DEST (XVECEXP (pat, 0, 0));
+
       if (GET_CODE (pat) == CLOBBER)
         continue;
 
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 32f9ef9ade9..de9344b37a3 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -4686,6 +4686,16 @@ riscv_initial_elimination_offset (int from, int to)
   return src - dest;
 }
 
+/* Return true if X is a register that will be eliminated later on.  */
+bool
+riscv_eliminable_reg (rtx x)
+{
+  return REG_P (x) && (REGNO (x) == FRAME_POINTER_REGNUM
+                       || REGNO (x) == ARG_POINTER_REGNUM
+                       || (REGNO (x) >= FIRST_VIRTUAL_REGISTER
+                           && REGNO (x) <= LAST_VIRTUAL_REGISTER));
+}
+
 /* Implement RETURN_ADDR_RTX.  We do not support moving back to a
    previous frame.  */
 
@@ -4887,8 +4897,9 @@ riscv_adjust_libcall_cfi_prologue ()
     }
 
   /* Debug info for adjust sp.  */
-  adjust_sp_rtx = gen_add3_insn (stack_pointer_rtx,
-                                 stack_pointer_rtx, GEN_INT (-saved_size));
+  adjust_sp_rtx = gen_rtx_SET (stack_pointer_rtx,
+                               gen_rtx_fmt_ee (PLUS, GET_MODE (stack_pointer_rtx),
+                                               stack_pointer_rtx, GEN_INT (-saved_size)));
   dwarf = alloc_reg_note (REG_CFA_ADJUST_CFA, adjust_sp_rtx,
                           dwarf);
   return dwarf;
@@ -4990,8 +5001,9 @@ riscv_adjust_libcall_cfi_epilogue ()
   int saved_size = cfun->machine->frame.save_libcall_adjustment;
 
   /* Debug info for adjust sp.  */
-  adjust_sp_rtx = gen_add3_insn (stack_pointer_rtx,
-                                 stack_pointer_rtx, GEN_INT (saved_size));
+  adjust_sp_rtx = gen_rtx_SET (stack_pointer_rtx,
+                               gen_rtx_fmt_ee (PLUS, GET_MODE (stack_pointer_rtx),
+                                               stack_pointer_rtx, GEN_INT (saved_size)));
   dwarf = alloc_reg_note (REG_CFA_ADJUST_CFA, adjust_sp_rtx,
                           dwarf);
 
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 798f7370a08..985dbdd50c4 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -446,23 +446,80 @@
   [(set_attr "type" "fadd")
    (set_attr "mode" "<UNITMODE>")])
 
-(define_insn "addsi3"
-  [(set (match_operand:SI 0 "register_operand" "=r,r")
-        (plus:SI (match_operand:SI 1 "register_operand" " r,r")
-                 (match_operand:SI 2 "arith_operand" " r,I")))]
+(define_expand "add<mode>3"
+  [(parallel
+    [(set (match_operand:GPR 0 "register_operand" "")
+          (plus:GPR (match_operand:GPR 1 "register_operand" "")
+                    (match_operand:GPR 2 "add_operand" "")))
+    (clobber (match_scratch:GPR 3 ""))])]
   ""
-  "add%i2%~\t%0,%1,%2"
+{
+  if (riscv_eliminable_reg (operands[1]))
+    {
+      if (splittable_const_int_operand (operands[2], <MODE>mode))
+        {
+          /* The idea here is that we emit
+               add op0, op1, %hi(op2)
+               addi op0, op0, %lo(op2)
+             Then when op1, the eliminable reg, gets replaced with sp+offset,
+             we can simplify the constants.  */
+          HOST_WIDE_INT high_part = CONST_HIGH_PART (INTVAL (operands[2]));
+          emit_insn (gen_add<mode>3_internal (operands[0], operands[1],
+                                              GEN_INT (high_part)));
+          operands[1] = operands[0];
+          operands[2] = GEN_INT (INTVAL (operands[2]) - high_part);
+        }
+      else if (! const_arith_operand (operands[2], <MODE>mode))
+        operands[2] = force_reg (<MODE>mode, operands[2]);
+    }
+})
+
+(define_insn_and_split "add<mode>3_internal"
+  [(set (match_operand:GPR 0 "register_operand" "=r,r,&r,!&r")
+        (plus:GPR (match_operand:GPR 1 "register_operand" " %r,r,r,0")
+                  (match_operand:GPR 2 "add_operand" " r,I,L,L")))
+   (clobber (match_scratch:GPR 3 "=X,X,X,&r"))]
+  ""
+{
+  if ((which_alternative == 2) || (which_alternative == 3))
+    return "#";
+
+  if (TARGET_64BIT && <MODE>mode == SImode)
+    return "add%i2w\t%0,%1,%2";
+  else
+    return "add%i2\t%0,%1,%2";
+}
+  "&& reload_completed && const_lui_operand (operands[2], <MODE>mode)"
+  [(const_int 0)]
+{
+  if (REGNO (operands[0]) != REGNO (operands[1]))
+    {
+      emit_insn (gen_mov<mode> (operands[0], operands[2]));
+      emit_insn (gen_add<mode>3_internal (operands[0], operands[0], operands[1]));
+    }
+  else
+    {
+      emit_insn (gen_mov<mode> (operands[3], operands[2]));
+      emit_insn (gen_add<mode>3_internal (operands[0], operands[0], operands[3]));
+    }
+  DONE;
+}
   [(set_attr "type" "arith")
-   (set_attr "mode" "SI")])
+   (set_attr "mode" "<MODE>")])
 
-(define_insn "adddi3"
-  [(set (match_operand:DI 0 "register_operand" "=r,r")
-        (plus:DI (match_operand:DI 1 "register_operand" " r,r")
-                 (match_operand:DI 2 "arith_operand" " r,I")))]
-  "TARGET_64BIT"
-  "add%i2\t%0,%1,%2"
+(define_insn "add<mode>3_internal2"
+  [(set (match_operand:GPR 0 "register_operand" "=r,r")
+        (plus:GPR (match_operand:GPR 1 "register_operand" " %r,r")
+                  (match_operand:GPR 2 "arith_operand" " r,I")))]
+  ""
+  {
+    if (TARGET_64BIT && <MODE>mode == SImode)
+      return "add%i2w\t%0,%1,%2";
+    else
+      return "add%i2\t%0,%1,%2";
+  }
   [(set_attr "type" "arith")
-   (set_attr "mode" "DI")])
+   (set_attr "mode" "<MODE>")])
 
 (define_expand "addv<mode>4"
   [(set (match_operand:GPR 0 "register_operand" "=r,r")
@@ -508,6 +565,7 @@
   DONE;
 })
 
+
 (define_expand "uaddv<mode>4"
   [(set (match_operand:GPR 0 "register_operand" "=r,r")
         (plus:GPR (match_operand:GPR 1 "register_operand" " r,r")
-- 
2.25.1

--0000000000006801f705ec6c02ca--
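
For readers who want to check the arithmetic behind the %hi/%lo split used in the add expander, here is a minimal standalone sketch. It is not part of the patch. The names sketch_high_part, sketch_low_part and SKETCH_IMM_REACH are hypothetical; they are assumed to mirror the CONST_HIGH_PART, CONST_LOW_PART and IMM_REACH macros in riscv.h, with a 12-bit signed immediate. The high part is rounded so that the remainder always fits an addi immediate, and its low 12 bits are zero, which is the shape const_lui_operand tests for.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: a 12-bit signed I-type immediate, as IMM_REACH in riscv.h.  */
#define SKETCH_IMM_REACH (INT64_C (1) << 12)

/* Hypothetical helper mirroring CONST_HIGH_PART: round VALUE to the
   nearest multiple of 4096 so the remainder fits in [-2048, 2047].  */
static int64_t
sketch_high_part (int64_t value)
{
  return (value + SKETCH_IMM_REACH / 2) & ~(SKETCH_IMM_REACH - 1);
}

/* Hypothetical helper mirroring CONST_LOW_PART: what is left for the
   trailing addi once the high part has been added.  */
static int64_t
sketch_low_part (int64_t value)
{
  int64_t low = value - sketch_high_part (value);

  /* The remainder must be a valid addi immediate.  */
  assert (low >= -SKETCH_IMM_REACH / 2 && low < SKETCH_IMM_REACH / 2);
  return low;
}
```

Under these assumptions, an offset of 4100 splits into a high part of 4096 and a low part of 4, while 2048 (one past the largest positive addi immediate) splits into 4096 and -2048. Once the eliminable register is replaced by sp plus an offset, only the small low part has to be merged with that offset.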