From: Patrick O'Neill <patrick@rivosinc.com>
To: gcc-patches@gcc.gnu.org
Cc: Patrick O'Neill <patrick@rivosinc.com>
Subject: [RFC] RISCV: Combine Pass Clobber Ops
Date: Thu, 10 Mar 2022 09:35:13 -0800 [thread overview]
Message-ID: <20220310173513.3176646-1-patrick@rivosinc.com> (raw)
RISC-V's C-extension describes 2-byte instructions with special
constraints. One of those constraints is that one of the sources/dest
registers are equal (op will clobber one of it's operands). This patch
adds support for combining simple sequences:
r1 = r2 + r3 (4 bytes)
r2 DEAD
r4 = r1 + r5 (4 bytes)
r1 DEAD
Combine pass now generates:
r2 = r2 + r3 (2 bytes)
r4 = r2 + r5 (4 bytes)
r2 DEAD
This change results in a ~150 Byte decrease in the linux kernel's
compiled size (text: 5327254 Bytes -> 5327102 Bytes).
I added this enforcement during the combine pass since it looks at the
cost of certian expressions and can rely on the target to tell the
pass that clobber-ops are cheaper than regular ops.
The main thing holding this RFC back is the combine pass's behavior for
sequences like this:
b = a << 5;
c = b + 2;
Normally the combine pass modifies the RTL to be:
c = (a << 5) + 2
before expanding it back to the original statement.
With my changes, the RTL is prevented from being combined like that and
instead results in RTL like this:
c = 2
which is clearly wrong.
I think that the next step would be to figure out where this
re-expansion takes place and implement the same-register constraint
there. However, I'm opening the RFC for any input:
1. Are there better ways to enforce same-register constraints during the
combine pass other than declaring the source/dest register to be the
same in RTL? Specifically, I'm concerned that this addition may
restrict subsequent RTL pass optimizations.
2. Are there other concerns with implementing source-dest constraints
within the combine pass?
3. Any other thoughts/input you have is welcome!
2022-03-10 Patrick O'Neill <patrick@rivosinc.com>
* combine.cc: Add register equality replacement.
* riscv.cc (riscv_insn_cost): Add in order to tell combine pass
that clobber-ops are cheaper.
* riscv.h: Add c extension argument macros.
Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
---
gcc/combine.cc | 71 +++++++++++++++++++++++++++++++++++++++
gcc/config/riscv/riscv.cc | 42 +++++++++++++++++++++++
gcc/config/riscv/riscv.h | 7 ++++
3 files changed, 120 insertions(+)
diff --git a/gcc/combine.cc b/gcc/combine.cc
index 8f06ee0e54f..be910f8c734 100644
--- a/gcc/combine.cc
+++ b/gcc/combine.cc
@@ -3280,6 +3280,77 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, rtx_insn *i0,
i2_is_used = n_occurrences;
}
+/* Attempt to replace ops with clobber-ops.
+ If the target implements clobber ops (set r1 (plus (r1)(r2))) as cheaper,
+ this pattern allows the combine pass to optimize with that in mind.
+ NOTE: This conditional is not triggered in most cases. Ideally we would be
+ able to move it above the if (i2_is_used == 0), but that breaks the
+ testsuite.
+ See RFC blurb for more info. */
+ if (!i0 && !i1 && i2 && i3 && GET_CODE(PATTERN(i2)) == SET && GET_CODE(SET_DEST(PATTERN(i2))) == REG
+ && GET_CODE(PATTERN(i3)) == SET
+ && (GET_RTX_CLASS(GET_CODE(SET_SRC(PATTERN(i3)))) == RTX_BIN_ARITH || GET_RTX_CLASS(GET_CODE(SET_SRC(PATTERN(i3)))) == RTX_COMM_ARITH || GET_RTX_CLASS(GET_CODE(SET_SRC(PATTERN(i3)))) == RTX_UNARY)
+ && GET_CODE(SET_DEST(PATTERN(i3))) == REG && REGNO(XEXP(SET_SRC(PATTERN(i3)), 0)) != REGNO(SET_DEST(PATTERN(i3)))) {
+
+ rtx_code insn_class = GET_CODE(SET_SRC (PATTERN(i2)));
+
+ if (GET_RTX_CLASS(insn_class) == RTX_BIN_ARITH || GET_RTX_CLASS(insn_class) == RTX_COMM_ARITH || GET_RTX_CLASS(insn_class) == RTX_UNARY) {
+
+ rtx operand1 = XEXP (SET_SRC (PATTERN (i2)), 0);
+ rtx prior_reg = SET_DEST (PATTERN (i2));
+
+ if (GET_CODE(operand1) == REG
+ && GET_MODE(operand1) == GET_MODE(prior_reg)
+ && find_reg_note (i2, REG_DEAD, operand1)
+ && regno_use_in (REGNO(prior_reg), PATTERN(i3))
+ && find_reg_note (i3, REG_DEAD, SET_DEST (PATTERN(i2)))) {
+
+ // Now we have a dead operand register, and we know where the dest dies.
+
+ // Remove the note declaring the register as dead
+ rtx note = find_reg_note (i2, REG_DEAD, operand1);
+ remove_note (i2, note);
+
+ // Overwrite i2 dest with operand1
+ rtx i2_dest = copy_rtx(operand1);
+ SUBST (SET_DEST (PATTERN (i2)), i2_dest);
+
+ // Replace the previous i2 dest register with operand1 in i3
+ rtx op1_copy = copy_rtx(operand1);
+ rtx new_src = simplify_replace_rtx(SET_SRC (PATTERN (i3)), prior_reg, op1_copy);
+ SUBST (SET_SRC (PATTERN (i3)), new_src);
+
+ // Move the dest dead note to the new register
+ note = find_reg_note (i3, REG_DEAD, prior_reg);
+ if (note) {
+ remove_note (i3, note);
+ //add_reg_note (i3, REG_DEAD, op1_copy);
+ }
+
+ newi2pat = PATTERN (i2);
+ newpat = PATTERN (i3);
+
+ subst_insn = i3;
+ subst_low_luid = DF_INSN_LUID (i2);
+ added_sets_2 = added_sets_1 = added_sets_0 = 0;
+ i2dest = i2_dest;
+ i2dest_killed = dead_or_set_p (i2, i2dest);
+
+ changed_i3_dest = false;
+
+ i2_code_number = recog_for_combine (&newi2pat, i2, &new_i2_notes);
+
+ if (i2_code_number >= 0)
+ insn_code_number = recog_for_combine (&newpat, i3, &new_i3_notes);
+
+ if (insn_code_number >= 0)
+ swap_i2i3 = true;
+
+ goto validate_replacement;
+ }
+ }
+ }
+
/* If we already got a failure, don't try to do more. Otherwise, try to
substitute I1 if we have it. */
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 7da9d377120..9f0bd59ac41 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -2160,6 +2160,46 @@ riscv_rtx_costs (rtx x, machine_mode mode, int outer_code, int opno ATTRIBUTE_UN
}
}
+/* Helper for INSN_COST.
+
+ Copied from arc.cc
+
+ Per Segher Boessenkool: rtx_costs computes the cost for any rtx (an
+ insn, a set, a set source, any random piece of one). set_src_cost,
+ set_rtx_cost, etc. are helper functions that use that.
+
+ Those functions do not work for parallels. Also, costs are not
+ additive like this simplified model assumes. Also, more complex
+ backends tend to miss many cases in their rtx_costs function.
+
+ Many passes that want costs want to know the cost of a full insn. Like
+ combine. That's why I created insn_cost: it solves all of the above
+ problems. */
+
+static int
+riscv_insn_cost (rtx_insn *insn, bool speed)
+{
+ rtx pat = PATTERN (insn);
+
+ if (TARGET_RVC && !speed) {
+ if (GET_CODE(pat) == SET && GET_CODE(SET_DEST(pat)) == REG) {
+ rtx src = SET_SRC(pat);
+ rtx dest = SET_DEST(pat);
+ if (GET_CODE(src) == PLUS && GET_CODE(XEXP(src, 0)) == REG && REGNO(XEXP(src, 0)) == REGNO(dest)) {
+ if (GET_CODE(XEXP(src, 1)) == REG)
+ return 2;
+ else if (GET_CODE(XEXP(src, 1)) == CONST_INT && CMPRESD_OPERAND(INTVAL(XEXP(src, 1))))
+ return 2;
+ } else if (GET_CODE(src) == ASHIFT && GET_CODE(XEXP(src, 0)) == REG && REGNO(dest) == REGNO(XEXP(src, 0))) {
+ if (GET_CODE(XEXP(src, 1)) == CONST_INT && UNSIGNED_CMPRESD_OPERAND(INTVAL(XEXP(src, 1))))
+ return 2;
+ }
+ }
+ }
+
+ return pattern_cost (pat, speed);
+}
+
/* Implement TARGET_ADDRESS_COST. */
static int
@@ -5618,6 +5658,8 @@ riscv_asan_shadow_offset (void)
#define TARGET_RTX_COSTS riscv_rtx_costs
#undef TARGET_ADDRESS_COST
#define TARGET_ADDRESS_COST riscv_address_cost
+#undef TARGET_INSN_COST
+#define TARGET_INSN_COST riscv_insn_cost
#undef TARGET_ASM_FILE_START
#define TARGET_ASM_FILE_START riscv_file_start
diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
index 8a4d2cf7f85..be5e77b1394 100644
--- a/gcc/config/riscv/riscv.h
+++ b/gcc/config/riscv/riscv.h
@@ -522,6 +522,13 @@ enum reg_class
#define SMALL_OPERAND(VALUE) \
((unsigned HOST_WIDE_INT) (VALUE) + IMM_REACH/2 < IMM_REACH)
+#define CMPRESD_OPERAND(VALUE) \
+ (VALUE < 32 && VALUE >= -32)
+
+/* True if VALUE is an unsigned 5-bit number. */
+#define UNSIGNED_CMPRESD_OPERAND(VALUE) \
+ (VALUE < 64 && VALUE >= 0)
+
/* True if VALUE can be loaded into a register using LUI. */
#define LUI_OPERAND(VALUE) \
--
2.25.1
next reply other threads:[~2022-03-10 17:35 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-03-10 17:35 Patrick O'Neill [this message]
2022-03-10 17:42 ` H.J. Lu
2022-03-10 17:54 ` [RFC v2] " Patrick O'Neill
2022-03-11 8:07 ` Kito Cheng
2022-03-11 14:18 ` Segher Boessenkool
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20220310173513.3176646-1-patrick@rivosinc.com \
--to=patrick@rivosinc.com \
--cc=gcc-patches@gcc.gnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).