From: Jeff Law <jeffreyalaw@gmail.com>
To: Feng Wang <wangfeng@eswincomputing.com>, gcc-patches@gcc.gnu.org
Cc: kito.cheng@gmail.com, palmer@dabbelt.com
Subject: Re: [PATCH v2 1/1] RISC-V: Optimize the reverse conditions of rotate shift
Date: Mon, 17 Apr 2023 11:53:50 -0600 [thread overview]
Message-ID: <219ae2ab-78bd-c84f-d162-2e5f3fc71aa6@gmail.com> (raw)
In-Reply-To: <20221206091153.27281-2-wangfeng@eswincomputing.com>
[-- Attachment #1: Type: text/plain, Size: 2597 bytes --]
On 12/6/22 02:11, Feng Wang wrote:
> From: wangfeng <wangfeng@eswincomputing.com>
>
> There is no immediate-operand form of the "rol" instruction in the
> B extension, so the rotate amount must first be loaded into a
> register. However, the rotate can be converted to "rori" or "roriw",
> saving the immediate-load instruction.
> This patch therefore adds conditions for reversing the rotate
> direction during RTL expansion and RTL optimization. The rotate is
> reversed when both of the following conditions are met:
> 1. The current insn_code doesn't exist, its operand doesn't match,
> or the shift amount is beyond half the size of the machine mode;
> 2. The reversed insn_code exists and its operand matches.
>
> Please refer to the following use cases:
> unsigned long foo2(unsigned long rs1)
> {
> return (rs1 << 10) | (rs1 >> 54);
> }
>
> The compiler currently generates:
> li a1,10
> rol a0,a0,a1
>
> This patch generates a single instruction instead:
> rori a0,a0,54
>
> At the same time I add the missing "roriw" instruction RTL pattern.
>
> Passed the linux-rv32imafdc-ilp32d-medany, linux-rv64imafdc-lp64d-medany,
> newlib-rv32imafc-ilp32f-medany and newlib-rv64imafdc-lp64d-medany regressions.
>
> gcc/ChangeLog:
>
> * config/riscv/bitmanip.md: Add "roriw" insn output.
> * expmed.cc (expand_shift_1): Call reverse_rotate_by_imm_p to
> determine whether to reverse the rotate direction when expanding
> from GIMPLE to RTL.
> * rtl.h (reverse_rotate_by_imm_p): Add function declaration.
> * simplify-rtx.cc (reverse_rotate_by_imm_p): Add a function to
> determine whether to reverse the rotate direction when simplifying
> RTL. Reverse if both of the following conditions are met:
> 1. The current insn_code doesn't exist, its operand doesn't match,
> or the shift amount is beyond half the size of the machine mode;
> 2. The reversed insn_code exists and its operand matches.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/zbb-rol-ror-04.c: New test.
> * gcc.target/riscv/zbb-rol-ror-05.c: New test.
> * gcc.target/riscv/zbb-rol-ror-06.c: New test.
> * gcc.target/riscv/zbb-rol-ror-07.c: New test.
So I was waiting on test results to say this is ready for gcc-14 with
some minor edits. By the time those test results landed, gcc-14 had
partially opened up. So...
I updated various comments to hopefully make things clearer and adjusted
the logic slightly in reverse_rotate_by_imm_p.
Bootstrapped and regression tested on riscv64.
Pushed to the trunk.
Jeff
[-- Attachment #2: P --]
[-- Type: text/plain, Size: 9996 bytes --]
commit 0ccf520d349a82dafca0deb3d307a1080e8589a0
Author: Feng Wang <wangfeng@eswincomputing.com>
Date: Sat Apr 15 10:11:15 2023 -0600
RISC-V: Optimize the reverse conditions of rotate shift
gcc/ChangeLog:
* config/riscv/bitmanip.md (rotrsi3_sext): Support generating
roriw for constant counts.
* rtl.h (reverse_rotate_by_imm_p): Add function declaration.
* simplify-rtx.cc (reverse_rotate_by_imm_p): New function.
(simplify_context::simplify_binary_operation_1): Use it.
* expmed.cc (expand_shift_1): Likewise.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/zbb-rol-ror-04.c: New test.
* gcc.target/riscv/zbb-rol-ror-05.c: New test.
* gcc.target/riscv/zbb-rol-ror-06.c: New test.
* gcc.target/riscv/zbb-rol-ror-07.c: New test.
diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md
index 7aa591689ba..062968d479f 100644
--- a/gcc/config/riscv/bitmanip.md
+++ b/gcc/config/riscv/bitmanip.md
@@ -322,9 +322,9 @@
(define_insn "*rotrsi3_sext"
[(set (match_operand:DI 0 "register_operand" "=r")
(sign_extend:DI (rotatert:SI (match_operand:SI 1 "register_operand" "r")
- (match_operand:QI 2 "register_operand" "r"))))]
+ (match_operand:QI 2 "arith_operand" "rI"))))]
"TARGET_64BIT && (TARGET_ZBB || TARGET_ZBKB)"
- "rorw\t%0,%1,%2"
+ "ror%i2%~\t%0,%1,%2"
[(set_attr "type" "bitmanip")])
(define_insn "rotlsi3"
diff --git a/gcc/expmed.cc b/gcc/expmed.cc
index 1553ea8e31e..fbd4ce2d42f 100644
--- a/gcc/expmed.cc
+++ b/gcc/expmed.cc
@@ -2535,14 +2535,10 @@ expand_shift_1 (enum tree_code code, machine_mode mode, rtx shifted,
op1 = SUBREG_REG (op1);
}
- /* Canonicalize rotates by constant amount. If op1 is bitsize / 2,
- prefer left rotation, if op1 is from bitsize / 2 + 1 to
- bitsize - 1, use other direction of rotate with 1 .. bitsize / 2 - 1
- amount instead. */
- if (rotate
- && CONST_INT_P (op1)
- && IN_RANGE (INTVAL (op1), GET_MODE_BITSIZE (scalar_mode) / 2 + left,
- GET_MODE_BITSIZE (scalar_mode) - 1))
+ /* Canonicalize rotates by constant amount. We may canonicalize
+ to reduce the immediate or if the ISA can rotate by constants
+ in only one direction. */
+ if (rotate && reverse_rotate_by_imm_p (scalar_mode, left, op1))
{
op1 = gen_int_shift_amount (mode, (GET_MODE_BITSIZE (scalar_mode)
- INTVAL (op1)));
diff --git a/gcc/rtl.h b/gcc/rtl.h
index 52f0419af29..60852aeecd8 100644
--- a/gcc/rtl.h
+++ b/gcc/rtl.h
@@ -3566,6 +3566,7 @@ extern bool val_signbit_known_set_p (machine_mode,
unsigned HOST_WIDE_INT);
extern bool val_signbit_known_clear_p (machine_mode,
unsigned HOST_WIDE_INT);
+extern bool reverse_rotate_by_imm_p (machine_mode, unsigned int, rtx);
/* In reginfo.cc */
extern machine_mode choose_hard_reg_mode (unsigned int, unsigned int,
diff --git a/gcc/simplify-rtx.cc b/gcc/simplify-rtx.cc
index ee75079917f..c57ff3320ee 100644
--- a/gcc/simplify-rtx.cc
+++ b/gcc/simplify-rtx.cc
@@ -2741,6 +2741,44 @@ simplify_context::simplify_distributive_operation (rtx_code code,
return NULL_RTX;
}
+/* Return TRUE if a rotate in mode MODE with a constant count in OP1
+ should be reversed; return FALSE otherwise.
+
+ LEFT is nonzero for a rotate left, zero for a rotate right. */
+
+bool
+reverse_rotate_by_imm_p (machine_mode mode, unsigned int left, rtx op1)
+{
+ if (!CONST_INT_P (op1))
+ return false;
+
+ /* Some targets may only be able to rotate by a constant
+ in one direction. So we need to query the optab interface
+ to see what is possible. */
+ optab binoptab = left ? rotl_optab : rotr_optab;
+ optab re_binoptab = left ? rotr_optab : rotl_optab;
+ enum insn_code icode = optab_handler (binoptab, mode);
+ enum insn_code re_icode = optab_handler (re_binoptab, mode);
+
+ /* If the target can not support the reversed optab, then there
+ is nothing to do. */
+ if (re_icode == CODE_FOR_nothing)
+ return false;
+
+ /* If the target does not support the requested rotate-by-immediate,
+ then we want to try reversing the rotate. We also want to try
+ reversing to minimize the count. */
+ if ((icode == CODE_FOR_nothing)
+ || (!insn_operand_matches (icode, 2, op1))
+ || (IN_RANGE (INTVAL (op1),
+ GET_MODE_UNIT_PRECISION (mode) / 2 + left,
+ GET_MODE_UNIT_PRECISION (mode) - 1)))
+ return (insn_operand_matches (re_icode, 2, op1));
+ return false;
+}
+
/* Subroutine of simplify_binary_operation. Simplify a binary operation
CODE with result mode MODE, operating on OP0 and OP1. If OP0 and/or
OP1 are constant pool references, TRUEOP0 and TRUEOP1 represent the
@@ -4098,15 +4136,10 @@ simplify_context::simplify_binary_operation_1 (rtx_code code,
case ROTATE:
if (trueop1 == CONST0_RTX (mode))
return op0;
- /* Canonicalize rotates by constant amount. If op1 is bitsize / 2,
- prefer left rotation, if op1 is from bitsize / 2 + 1 to
- bitsize - 1, use other direction of rotate with 1 .. bitsize / 2 - 1
- amount instead. */
+ /* Canonicalize rotates by constant amount. Reverse the rotate
+ when reverse_rotate_by_imm_p says reversing is profitable or
+ required. */
#if defined(HAVE_rotate) && defined(HAVE_rotatert)
- if (CONST_INT_P (trueop1)
- && IN_RANGE (INTVAL (trueop1),
- GET_MODE_UNIT_PRECISION (mode) / 2 + (code == ROTATE),
- GET_MODE_UNIT_PRECISION (mode) - 1))
+ if (reverse_rotate_by_imm_p (mode, (code == ROTATE), trueop1))
{
int new_amount = GET_MODE_UNIT_PRECISION (mode) - INTVAL (trueop1);
rtx new_amount_rtx = gen_int_shift_amount (mode, new_amount);
diff --git a/gcc/testsuite/gcc.target/riscv/zbb-rol-ror-04.c b/gcc/testsuite/gcc.target/riscv/zbb-rol-ror-04.c
new file mode 100644
index 00000000000..08053484cb2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zbb-rol-ror-04.c
@@ -0,0 +1,52 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc_zbb -mabi=lp64d -fno-lto -O2" } */
+/* { dg-skip-if "" { *-*-* } { "-g" } } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+/*
+**foo1:
+** rori a0,a0,34
+** ret
+*/
+unsigned long foo1 (unsigned long rs1)
+{ return (rs1 >> (34)) | (rs1 << 30); }
+
+/*
+**foo2:
+** rori a0,a0,54
+** ret
+*/
+unsigned long foo2(unsigned long rs1)
+{
+ return (rs1 << 10) | (rs1 >> 54);
+}
+
+/*
+**foo3:
+** roriw a0,a0,20
+** ret
+*/
+unsigned int foo3(unsigned int rs1)
+{
+ return (rs1 >> 20) | (rs1 << 12);
+}
+
+/*
+**foo4:
+** roriw a0,a0,22
+** ret
+*/
+unsigned int foo4(unsigned int rs1)
+{
+ return (rs1 << 10) | (rs1 >> 22);
+}
+
+/*
+**foo5:
+** rorw a0,a0,a1
+** ret
+*/
+unsigned int foo5(unsigned int rs1, unsigned int rs2)
+{
+ return (rs1 >> rs2) | (rs1 << (32 - rs2));
+}
diff --git a/gcc/testsuite/gcc.target/riscv/zbb-rol-ror-05.c b/gcc/testsuite/gcc.target/riscv/zbb-rol-ror-05.c
new file mode 100644
index 00000000000..85090b1b0fc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zbb-rol-ror-05.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gc_zbb -mabi=ilp32 -fno-lto -O2" } */
+/* { dg-skip-if "" { *-*-* } { "-g" } } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+/*
+**foo1:
+** rori a0,a0,20
+** ret
+*/
+unsigned int foo1(unsigned int rs1)
+{
+ return (rs1 >> 20) | (rs1 << 12);
+}
+
+/*
+**foo2:
+** rori a0,a0,22
+** ret
+*/
+unsigned int foo2(unsigned int rs1)
+{
+ return (rs1 << 10) | (rs1 >> 22);
+}
diff --git a/gcc/testsuite/gcc.target/riscv/zbb-rol-ror-06.c b/gcc/testsuite/gcc.target/riscv/zbb-rol-ror-06.c
new file mode 100644
index 00000000000..70b79abb6ed
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zbb-rol-ror-06.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc_zbb -mabi=lp64d -fno-lto -O2" } */
+/* { dg-skip-if "" { *-*-* } { "-g" } } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+/*
+**foo1:
+** roriw a0,a0,14
+** ret
+*/
+unsigned int foo1 (unsigned int rs1)
+{ return ((rs1 >> 14) | (rs1 << 18)); }
+
+/*
+**foo2:
+** roriw a0,a0,18
+** ret
+*/
+unsigned int foo2 (unsigned int rs1)
+{ return ((rs1 >> 18) | (rs1 << 14)); }
+
+/*
+**foo3:
+** roriw a0,a0,18
+** ret
+*/
+unsigned int foo3 (unsigned int rs1)
+{ return ((rs1 << 14) | (rs1 >> 18)); }
+
+/*
+**foo4:
+** roriw a0,a0,14
+** ret
+*/
+unsigned int foo4 (unsigned int rs1)
+{ return ((rs1 << 18) | (rs1 >> 14)); }
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/riscv/zbb-rol-ror-07.c b/gcc/testsuite/gcc.target/riscv/zbb-rol-ror-07.c
new file mode 100644
index 00000000000..3b6ab385a85
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zbb-rol-ror-07.c
@@ -0,0 +1,64 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc_zbb -mabi=lp64d -fno-lto -O2" } */
+/* { dg-skip-if "" { *-*-* } { "-g" } } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+/*
+**foo1:
+** rori a0,a0,34
+** ret
+*/
+unsigned long foo1 (unsigned long rs1)
+{
+ unsigned long tempt;
+ tempt = rs1 >> 30;
+ tempt = tempt << 2;
+ tempt = tempt >> 6;
+ rs1 = tempt | (rs1 << 30);
+ return rs1 ;
+}
+
+/*
+**foo2:
+** rori a0,a0,24
+** ret
+*/
+unsigned long foo2 (unsigned long rs1)
+{
+ unsigned long tempt;
+ tempt = rs1 >> 20;
+ tempt = tempt << 2;
+ tempt = tempt >> 6;
+ rs1 = tempt | (rs1 << 40);
+ return rs1 ;
+}
+
+/*
+**foo3:
+** rori a0,a0,40
+** ret
+*/
+unsigned long foo3 (unsigned long rs1)
+{
+ unsigned long tempt;
+ tempt = rs1 << 20;
+ tempt = tempt >> 2;
+ tempt = tempt << 6;
+ rs1 = tempt | (rs1 >> 40);
+ return rs1 ;
+}
+
+/*
+**foo4:
+** rori a0,a0,20
+** ret
+*/
+unsigned long foo4 (unsigned long rs1)
+{
+ unsigned long tempt;
+ tempt = rs1 << 40;
+ tempt = tempt >> 2;
+ tempt = tempt << 6;
+ rs1 = tempt | (rs1 >> 20);
+ return rs1 ;
+}
\ No newline at end of file
2022-12-06 9:11 [PATCH v2 0/1] " Feng Wang
2022-12-06 9:11 ` [PATCH v2 1/1] " Feng Wang
2023-04-17 17:53 ` Jeff Law [this message]