* [to-be-committed] [RISC-V][target/116085] Fix rv64 minmax extension avoidance splitter
@ 2024-07-26 14:17 Jeff Law
2024-07-26 19:18 ` Philipp Tomsich
0 siblings, 1 reply; 3+ messages in thread
From: Jeff Law @ 2024-07-26 14:17 UTC (permalink / raw)
To: gcc-patches
[-- Attachment #1: Type: text/plain, Size: 2277 bytes --]
pr116085 is a long standing (since late 2022) regression on the riscv
port.
A patch introduced a pattern to avoid unnecessary extensions when doing
a min/max operation where one of the values is a 32 bit positive constant.
> (define_insn_and_split "*minmax"
> [(set (match_operand:DI 0 "register_operand" "=r")
> (sign_extend:DI
> (subreg:SI
> (bitmanip_minmax:DI (zero_extend:DI (match_operand:SI 1 "register_operand" "r"))
> (match_operand:DI 2 "immediate_operand" "i"))
> 0)))
> (clobber (match_scratch:DI 3 "=&r"))
> (clobber (match_scratch:DI 4 "=&r"))]
> "TARGET_64BIT && TARGET_ZBB && sext_hwi (INTVAL (operands[2]), 32) >= 0"
> "#"
> "&& reload_completed"
> [(set (match_dup 3) (sign_extend:DI (match_dup 1)))
> (set (match_dup 4) (match_dup 2))
> (set (match_dup 0) (<minmax_optab>:DI (match_dup 3) (match_dup 4)))]
Lots going on in here. The key is the nonconstant value is zero
extended from SI to DI in the original RTL and we know the constant
value is unchanged if we were to sign extend it from 32 to 64 bits.
We change the extension of the nonconstant operand from zero to sign
extension. I'm pretty confident the goal there is take advantage of the
fact that SI values are kept sign extended and will often be optimized away.
The problem occurs when the nonconstant operand has the SI sign bit set.
As an example:
smax (0x8000000, 0x7) resulting in 0x80000000
The split RTL will generate
smax (sign_extend (0x80000000), 0x7))
smax (0xffffffff80000000, 0x7) resulting in 0x7
Opps.
We really needed to change the opcode to umax for this transformation to
work. That's easy enough. But there's further improvements we can make.
First the pattern is a define_and_split with a post-reload split
condition. It would be better implemented as a 4->3 define_split so
that the costing model just works. Second, if operands[1] is a suitably
promoted subreg, then we can elide the sign extension when we generate
the split code, so often it'll be a 4->2 split, again with the cost
model working with no adjustments needed.
Tested on rv32 and rv64 in my tester. I'll wait for the pre-commit
tester to spin it as well.
Jeff
[-- Attachment #2: P --]
[-- Type: text/plain, Size: 4256 bytes --]
PR target/116085
gcc/
* config/riscv/bitmanip.md (minmax extension avoidance splitter):
Rewrite as a simpler define_split. Adjust the opcode appropriately.
Avoid emitting sign extension if it's clearly not needed.
* config/riscv/iterators.md (minmax_optab): Rename to uminmax_optab
and map everything to unsigned variants.
gcc/testsuite/
* gcc.target/riscv/pr116085.c: New test.
diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md
index 9fc5215d6e3..e8876bf3c94 100644
--- a/gcc/config/riscv/bitmanip.md
+++ b/gcc/config/riscv/bitmanip.md
@@ -549,23 +549,34 @@ (define_insn "*<bitmanip_optab><mode>3"
;; Optimize the common case of a SImode min/max against a constant
;; that is safe both for sign- and zero-extension.
-(define_insn_and_split "*minmax"
- [(set (match_operand:DI 0 "register_operand" "=r")
+(define_split
+ [(set (match_operand:DI 0 "register_operand")
(sign_extend:DI
(subreg:SI
- (bitmanip_minmax:DI (zero_extend:DI (match_operand:SI 1 "register_operand" "r"))
- (match_operand:DI 2 "immediate_operand" "i"))
+ (bitmanip_minmax:DI (zero_extend:DI (match_operand:SI 1 "register_operand"))
+ (match_operand:DI 2 "immediate_operand"))
0)))
- (clobber (match_scratch:DI 3 "=&r"))
- (clobber (match_scratch:DI 4 "=&r"))]
+ (clobber (match_operand:DI 3 "register_operand"))
+ (clobber (match_operand:DI 4 "register_operand"))]
"TARGET_64BIT && TARGET_ZBB && sext_hwi (INTVAL (operands[2]), 32) >= 0"
- "#"
- "&& reload_completed"
- [(set (match_dup 3) (sign_extend:DI (match_dup 1)))
- (set (match_dup 4) (match_dup 2))
- (set (match_dup 0) (<minmax_optab>:DI (match_dup 3) (match_dup 4)))]
- ""
- [(set_attr "type" "bitmanip")])
+ [(set (match_dup 0) (<uminmax_optab>:DI (match_dup 4) (match_dup 3)))]
+ "
+{
+ /* Load the constant into a regsiter. */
+ emit_move_insn (operands[3], operands[2]);
+
+ /* Sign extend operands[1] if it isn't extended already. */
+ /* If operands[1] is a sign extended SUBREG, then we can use it
+ directly. Otherwise extend it into another temporary. */
+ if (SUBREG_P (operands[1])
+ && SUBREG_PROMOTED_VAR_P (operands[1])
+ && SUBREG_PROMOTED_SIGNED_P (operands[1]))
+ operands[4] = SUBREG_REG (operands[1]);
+ else
+ emit_move_insn (operands[4], gen_rtx_SIGN_EXTEND (DImode, operands[1]));
+
+ /* The minmax is actually emitted from the split pattern. */
+}")
;; ZBS extension.
diff --git a/gcc/config/riscv/iterators.md b/gcc/config/riscv/iterators.md
index 734da041f0c..0a669f560e3 100644
--- a/gcc/config/riscv/iterators.md
+++ b/gcc/config/riscv/iterators.md
@@ -327,10 +327,11 @@ (define_code_attr atomic_optab
[(plus "add") (ior "or") (xor "xor") (and "and")])
; bitmanip code attributes
-(define_code_attr minmax_optab [(smin "smin")
- (smax "smax")
- (umin "umin")
- (umax "umax")])
+;; Unsigned variant of a min/max optab.
+(define_code_attr uminmax_optab [(smin "umin")
+ (smax "umax")
+ (umin "umin")
+ (umax "umax")])
(define_code_attr bitmanip_optab [(smin "smin")
(smax "smax")
(umin "umin")
diff --git a/gcc/testsuite/gcc.target/riscv/pr116085.c b/gcc/testsuite/gcc.target/riscv/pr116085.c
new file mode 100644
index 00000000000..998d82bd235
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/pr116085.c
@@ -0,0 +1,29 @@
+/* { dg-do run } */
+/* { dg-require-effective-target rv64 } */
+/* { dg-options "-march=rv64gc_zbb -mabi=lp64d -fno-ext-dce" } */
+
+extern void abort (void);
+
+int a = 2;
+unsigned b = 0x80000000;
+int arr_5[2][23];
+void test(int, unsigned, int);
+int main() {
+ test(a, b, 1);
+ if (arr_5[1][0] != -2147483648)
+ abort ();
+ return 0;
+}
+
+#define c(a, b) \
+ ({ \
+ long d = a; \
+ long e = b; \
+ d > e ? d : e; \
+ })
+__attribute__((noipa))
+void test(int f, unsigned g, int h) {
+ for (int i = 0; i < h; i = f)
+ arr_5[1][i] = h ? c(g, 7) : 0;
+}
+
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [to-be-committed] [RISC-V][target/116085] Fix rv64 minmax extension avoidance splitter
2024-07-26 14:17 [to-be-committed] [RISC-V][target/116085] Fix rv64 minmax extension avoidance splitter Jeff Law
@ 2024-07-26 19:18 ` Philipp Tomsich
2024-07-26 22:07 ` Jeff Law
0 siblings, 1 reply; 3+ messages in thread
From: Philipp Tomsich @ 2024-07-26 19:18 UTC (permalink / raw)
To: Jeff Law; +Cc: gcc-patches
Nitpick: a typo slipped into the comment — "regsiter" -> "register".
On Fri, 26 Jul 2024 at 16:18, Jeff Law <jeffreyalaw@gmail.com> wrote:
>
> pr116085 is a long standing (since late 2022) regression on the riscv
> port.
>
> A patch introduced a pattern to avoid unnecessary extensions when doing
> a min/max operation where one of the values is a 32 bit positive constant.
>
> > (define_insn_and_split "*minmax"
> > [(set (match_operand:DI 0 "register_operand" "=r")
> > (sign_extend:DI
> > (subreg:SI
> > (bitmanip_minmax:DI (zero_extend:DI (match_operand:SI 1 "register_operand" "r"))
> > (match_operand:DI 2 "immediate_operand" "i"))
> > 0)))
> > (clobber (match_scratch:DI 3 "=&r"))
> > (clobber (match_scratch:DI 4 "=&r"))]
> > "TARGET_64BIT && TARGET_ZBB && sext_hwi (INTVAL (operands[2]), 32) >= 0"
> > "#"
> > "&& reload_completed"
> > [(set (match_dup 3) (sign_extend:DI (match_dup 1)))
> > (set (match_dup 4) (match_dup 2))
> > (set (match_dup 0) (<minmax_optab>:DI (match_dup 3) (match_dup 4)))]
>
>
> Lots going on in here. The key is the nonconstant value is zero
> extended from SI to DI in the original RTL and we know the constant
> value is unchanged if we were to sign extend it from 32 to 64 bits.
>
> We change the extension of the nonconstant operand from zero to sign
> extension. I'm pretty confident the goal there is take advantage of the
> fact that SI values are kept sign extended and will often be optimized away.
>
> The problem occurs when the nonconstant operand has the SI sign bit set.
> As an example:
>
> smax (0x8000000, 0x7) resulting in 0x80000000
>
> The split RTL will generate
> smax (sign_extend (0x80000000), 0x7))
>
> smax (0xffffffff80000000, 0x7) resulting in 0x7
>
> Opps.
>
> We really needed to change the opcode to umax for this transformation to
> work. That's easy enough. But there's further improvements we can make.
>
> First the pattern is a define_and_split with a post-reload split
> condition. It would be better implemented as a 4->3 define_split so
> that the costing model just works. Second, if operands[1] is a suitably
> promoted subreg, then we can elide the sign extension when we generate
> the split code, so often it'll be a 4->2 split, again with the cost
> model working with no adjustments needed.
>
> Tested on rv32 and rv64 in my tester. I'll wait for the pre-commit
> tester to spin it as well.
>
>
>
> Jeff
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [to-be-committed] [RISC-V][target/116085] Fix rv64 minmax extension avoidance splitter
2024-07-26 19:18 ` Philipp Tomsich
@ 2024-07-26 22:07 ` Jeff Law
0 siblings, 0 replies; 3+ messages in thread
From: Jeff Law @ 2024-07-26 22:07 UTC (permalink / raw)
To: Philipp Tomsich; +Cc: gcc-patches
On 7/26/24 1:18 PM, Philipp Tomsich wrote:
> Nitpick: a typo slipped into the comment — "regsiter" -> "register".
Thanks. The pre-commit tester also pointed out a couple formatting
nits. I'll fix both.
Jeff
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2024-07-26 22:07 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-07-26 14:17 [to-be-committed] [RISC-V][target/116085] Fix rv64 minmax extension avoidance splitter Jeff Law
2024-07-26 19:18 ` Philipp Tomsich
2024-07-26 22:07 ` Jeff Law
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).