public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] internal-fn: Do not force vcond operand to reg.
@ 2024-05-10 13:17 Robin Dapp
  2024-05-10 13:21 ` Richard Biener
  0 siblings, 1 reply; 7+ messages in thread
From: Robin Dapp @ 2024-05-10 13:17 UTC (permalink / raw)
  To: gcc-patches; +Cc: rdapp.gcc

Hi,

this only forces the first comparison operator into a register if it is
not already suitable.

Bootstrap and regtest is running on x86 and aarch64, successful on p10.
Regtested on riscv.

gcc/ChangeLog:

	PR middle-end/113474

	* internal-fn.cc (expand_vec_cond_mask_optab_fn):  Only force
	op1 to reg if necessary.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/pr113474.c: New test.

Regards
 Robin

---
 gcc/internal-fn.cc                                  |  3 ++-
 .../gcc.target/riscv/rvv/autovec/pr113474.c         | 13 +++++++++++++
 2 files changed, 15 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113474.c

diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
index 2c764441cde..72cc6b7a1f7 100644
--- a/gcc/internal-fn.cc
+++ b/gcc/internal-fn.cc
@@ -3164,7 +3164,8 @@ expand_vec_cond_mask_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
   rtx_op2 = expand_normal (op2);
 
   mask = force_reg (mask_mode, mask);
-  rtx_op1 = force_reg (mode, rtx_op1);
+  if (!insn_operand_matches (icode, 1, rtx_op1))
+    rtx_op1 = force_reg (mode, rtx_op1);
 
   rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
   create_output_operand (&ops[0], target, mode);
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113474.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113474.c
new file mode 100644
index 00000000000..0364bf9f5e3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113474.c
@@ -0,0 +1,13 @@
+/* { dg-do compile { target riscv_v } }  */
+/* { dg-additional-options "-std=c99" }  */
+
+void
+foo (int n, int **a)
+{
+  int b;
+  for (b = 0; b < n; b++)
+    for (long e = 8; e > 0; e--)
+      a[b][e] = a[b][e] == 15;
+}
+
+/* { dg-final { scan-assembler "vmerge.vim" } }  */
-- 
2.45.0

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] internal-fn: Do not force vcond operand to reg.
  2024-05-10 13:17 [PATCH] internal-fn: Do not force vcond operand to reg Robin Dapp
@ 2024-05-10 13:21 ` Richard Biener
  2024-05-13  6:18   ` Robin Dapp
  0 siblings, 1 reply; 7+ messages in thread
From: Richard Biener @ 2024-05-10 13:21 UTC (permalink / raw)
  To: Robin Dapp; +Cc: gcc-patches

On Fri, May 10, 2024 at 3:18 PM Robin Dapp <rdapp.gcc@gmail.com> wrote:
>
> Hi,
>
> this only forces the first comparison operator into a register if it is
> not already suitable.
>
> Bootstrap and regtest is running on x86 and aarch64, successful on p10.
> Regtested on riscv.

How does this make a difference in the end?  I'd expect say forwprop to
fix things?

> gcc/ChangeLog:
>
>         PR middle-end/113474
>
>         * internal-fn.cc (expand_vec_cond_mask_optab_fn):  Only force
>         op1 to reg if necessary.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.target/riscv/rvv/autovec/pr113474.c: New test.
>
> Regards
>  Robin
>
> ---
>  gcc/internal-fn.cc                                  |  3 ++-
>  .../gcc.target/riscv/rvv/autovec/pr113474.c         | 13 +++++++++++++
>  2 files changed, 15 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113474.c
>
> diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
> index 2c764441cde..72cc6b7a1f7 100644
> --- a/gcc/internal-fn.cc
> +++ b/gcc/internal-fn.cc
> @@ -3164,7 +3164,8 @@ expand_vec_cond_mask_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
>    rtx_op2 = expand_normal (op2);
>
>    mask = force_reg (mask_mode, mask);
> -  rtx_op1 = force_reg (mode, rtx_op1);
> +  if (!insn_operand_matches (icode, 1, rtx_op1))
> +    rtx_op1 = force_reg (mode, rtx_op1);
>
>    rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
>    create_output_operand (&ops[0], target, mode);
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113474.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113474.c
> new file mode 100644
> index 00000000000..0364bf9f5e3
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113474.c
> @@ -0,0 +1,13 @@
> +/* { dg-do compile { target riscv_v } }  */
> +/* { dg-additional-options "-std=c99" }  */
> +
> +void
> +foo (int n, int **a)
> +{
> +  int b;
> +  for (b = 0; b < n; b++)
> +    for (long e = 8; e > 0; e--)
> +      a[b][e] = a[b][e] == 15;
> +}
> +
> +/* { dg-final { scan-assembler "vmerge.vim" } }  */
> --
> 2.45.0

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] internal-fn: Do not force vcond operand to reg.
  2024-05-10 13:21 ` Richard Biener
@ 2024-05-13  6:18   ` Robin Dapp
  2024-05-13  7:53     ` Richard Biener
  0 siblings, 1 reply; 7+ messages in thread
From: Robin Dapp @ 2024-05-13  6:18 UTC (permalink / raw)
  To: Richard Biener; +Cc: rdapp.gcc, gcc-patches

> How does this make a difference in the end?  I'd expect say forwprop to
> fix things?

In general we try to only add the masking "boilerplate" of our
instructions at split time so fwprop, combine et al. can do their
work uninhibited of it (and we don't need numerous
(if_then_else ... (if_then_else) ...) combinations in our patterns).
A vec constant we expand directly to a masked representation, though
which makes further simplification difficult.  I can experiment with
changing that if preferred.

My thinking was, however, that for other operations like binops we
directly emit the right variant via expand_operands without
forcing to a reg and don't even need to fwprop so I wanted to
imitate that.

Regards
 Robin


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] internal-fn: Do not force vcond operand to reg.
  2024-05-13  6:18   ` Robin Dapp
@ 2024-05-13  7:53     ` Richard Biener
  2024-05-13 14:14       ` Robin Dapp
  0 siblings, 1 reply; 7+ messages in thread
From: Richard Biener @ 2024-05-13  7:53 UTC (permalink / raw)
  To: Robin Dapp; +Cc: gcc-patches

On Mon, May 13, 2024 at 8:18 AM Robin Dapp <rdapp.gcc@gmail.com> wrote:
>
> > How does this make a difference in the end?  I'd expect say forwprop to
> > fix things?
>
> In general we try to only add the masking "boilerplate" of our
> instructions at split time so fwprop, combine et al. can do their
> work uninhibited of it (and we don't need numerous
> (if_then_else ... (if_then_else) ...) combinations in our patterns).
> A vec constant we expand directly to a masked representation, though
> which makes further simplification difficult.  I can experiment with
> changing that if preferred.
>
> My thinking was, however, that for other operations like binops we
> directly emit the right variant via expand_operands without
> forcing to a reg and don't even need to fwprop so I wanted to
> imitate that.

Ah, so yeah, it probably makes sense for constants.  Btw,
there's prepare_operand which I think might be better for
its CONST_INT handling?  I can also see we usually do not
bother with force_reg, the force_reg was added with the
initial r6-4696-ga414c77f2a30bb already.

What happens if we simply remove all of the force_reg here?

Thanks,
Richard.

> Regards
>  Robin
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] internal-fn: Do not force vcond operand to reg.
  2024-05-13  7:53     ` Richard Biener
@ 2024-05-13 14:14       ` Robin Dapp
  2024-05-13 14:19         ` Richard Biener
  0 siblings, 1 reply; 7+ messages in thread
From: Robin Dapp @ 2024-05-13 14:14 UTC (permalink / raw)
  To: Richard Biener; +Cc: rdapp.gcc, gcc-patches

> What happens if we simply remove all of the force_reg here?

On x86 I bootstrapped and tested the attached without fallout
(gcc188, so it's no avx512-native machine and therefore limited
coverage).  riscv regtest is unchanged.
For aarch64 I would to rely on the pre-commit CI to pick it
up (does that work on sub-threads?).

Regards
 Robin


gcc/ChangeLog:

	PR middle-end/113474

	* internal-fn.cc (expand_vec_cond_mask_optab_fn):  Remove
	force_regs.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/pr113474.c: New test.
---
 gcc/internal-fn.cc                                  |  3 ---
 .../gcc.target/riscv/rvv/autovec/pr113474.c         | 13 +++++++++++++
 2 files changed, 13 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113474.c

diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
index 2c764441cde..4d226c478b4 100644
--- a/gcc/internal-fn.cc
+++ b/gcc/internal-fn.cc
@@ -3163,9 +3163,6 @@ expand_vec_cond_mask_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
   rtx_op1 = expand_normal (op1);
   rtx_op2 = expand_normal (op2);
 
-  mask = force_reg (mask_mode, mask);
-  rtx_op1 = force_reg (mode, rtx_op1);
-
   rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
   create_output_operand (&ops[0], target, mode);
   create_input_operand (&ops[1], rtx_op1, mode);
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113474.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113474.c
new file mode 100644
index 00000000000..0364bf9f5e3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113474.c
@@ -0,0 +1,13 @@
+/* { dg-do compile { target riscv_v } }  */
+/* { dg-additional-options "-std=c99" }  */
+
+void
+foo (int n, int **a)
+{
+  int b;
+  for (b = 0; b < n; b++)
+    for (long e = 8; e > 0; e--)
+      a[b][e] = a[b][e] == 15;
+}
+
+/* { dg-final { scan-assembler "vmerge.vim" } }  */
-- 
2.45.0


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] internal-fn: Do not force vcond operand to reg.
  2024-05-13 14:14       ` Robin Dapp
@ 2024-05-13 14:19         ` Richard Biener
  2024-05-17 10:35           ` Robin Dapp
  0 siblings, 1 reply; 7+ messages in thread
From: Richard Biener @ 2024-05-13 14:19 UTC (permalink / raw)
  To: Robin Dapp; +Cc: gcc-patches

On Mon, May 13, 2024 at 4:14 PM Robin Dapp <rdapp.gcc@gmail.com> wrote:
>
> > What happens if we simply remove all of the force_reg here?
>
> On x86 I bootstrapped and tested the attached without fallout
> (gcc188, so it's no avx512-native machine and therefore limited
> coverage).  riscv regtest is unchanged.
> For aarch64 I would to rely on the pre-commit CI to pick it
> up (does that work on sub-threads?).

OK if that pre-commit CI works out.

Richard.

> Regards
>  Robin
>
>
> gcc/ChangeLog:
>
>         PR middle-end/113474
>
>         * internal-fn.cc (expand_vec_cond_mask_optab_fn):  Remove
>         force_regs.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.target/riscv/rvv/autovec/pr113474.c: New test.
> ---
>  gcc/internal-fn.cc                                  |  3 ---
>  .../gcc.target/riscv/rvv/autovec/pr113474.c         | 13 +++++++++++++
>  2 files changed, 13 insertions(+), 3 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113474.c
>
> diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
> index 2c764441cde..4d226c478b4 100644
> --- a/gcc/internal-fn.cc
> +++ b/gcc/internal-fn.cc
> @@ -3163,9 +3163,6 @@ expand_vec_cond_mask_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
>    rtx_op1 = expand_normal (op1);
>    rtx_op2 = expand_normal (op2);
>
> -  mask = force_reg (mask_mode, mask);
> -  rtx_op1 = force_reg (mode, rtx_op1);
> -
>    rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
>    create_output_operand (&ops[0], target, mode);
>    create_input_operand (&ops[1], rtx_op1, mode);
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113474.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113474.c
> new file mode 100644
> index 00000000000..0364bf9f5e3
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113474.c
> @@ -0,0 +1,13 @@
> +/* { dg-do compile { target riscv_v } }  */
> +/* { dg-additional-options "-std=c99" }  */
> +
> +void
> +foo (int n, int **a)
> +{
> +  int b;
> +  for (b = 0; b < n; b++)
> +    for (long e = 8; e > 0; e--)
> +      a[b][e] = a[b][e] == 15;
> +}
> +
> +/* { dg-final { scan-assembler "vmerge.vim" } }  */
> --
> 2.45.0
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] internal-fn: Do not force vcond operand to reg.
  2024-05-13 14:19         ` Richard Biener
@ 2024-05-17 10:35           ` Robin Dapp
  0 siblings, 0 replies; 7+ messages in thread
From: Robin Dapp @ 2024-05-17 10:35 UTC (permalink / raw)
  To: Richard Biener; +Cc: rdapp.gcc, gcc-patches

> OK if that pre-commit CI works out.

The CI didn't pick it up, guess it needs to be a bit more explicit.
In the meanwhile, however, I managed to catch a short window with
> 10G free on gcc185 =>  Bootstrap and regtest successful on aarch64.
Going to push the patch later today.

Regards
 Robin

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2024-05-17 10:35 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-05-10 13:17 [PATCH] internal-fn: Do not force vcond operand to reg Robin Dapp
2024-05-10 13:21 ` Richard Biener
2024-05-13  6:18   ` Robin Dapp
2024-05-13  7:53     ` Richard Biener
2024-05-13 14:14       ` Robin Dapp
2024-05-13 14:19         ` Richard Biener
2024-05-17 10:35           ` Robin Dapp

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).