public inbox for gcc-patches@gcc.gnu.org
* [PATCH] middle-end/113622 - allow .VEC_SET and .VEC_EXTRACT for global hard regs
@ 2024-01-29 10:24 Richard Biener
From: Richard Biener @ 2024-01-29 10:24 UTC
  To: gcc-patches; +Cc: Jakub Jelinek

The following expands .VEC_SET and .VEC_EXTRACT instruction selection
to global hard registers, not only automatic variables (possibly)
promoted to registers.  This can avoid some ICEs later and create
better code.

Bootstrapped and tested on x86_64-unknown-linux-gnu.

OK?

Thanks,
Richard.

	PR middle-end/113622
	* gimple-isel.cc (gimple_expand_vec_set_extract_expr):
	Also allow DECL_HARD_REGISTER variables.

	* gcc.target/i386/pr113622-1.c: New testcase.
---
 gcc/gimple-isel.cc                         |  3 ++-
 gcc/testsuite/gcc.target/i386/pr113622-1.c | 12 ++++++++++++
 2 files changed, 14 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr113622-1.c

diff --git a/gcc/gimple-isel.cc b/gcc/gimple-isel.cc
index 7e2392ecd38..e94f292dd38 100644
--- a/gcc/gimple-isel.cc
+++ b/gcc/gimple-isel.cc
@@ -104,7 +104,8 @@ gimple_expand_vec_set_extract_expr (struct function *fun,
       machine_mode outermode = TYPE_MODE (TREE_TYPE (view_op0));
       machine_mode extract_mode = TYPE_MODE (TREE_TYPE (ref));
 
-      if (auto_var_in_fn_p (view_op0, fun->decl)
+      if ((auto_var_in_fn_p (view_op0, fun->decl)
+	   || DECL_HARD_REGISTER (view_op0))
 	  && !TREE_ADDRESSABLE (view_op0)
 	  && ((!is_extract && can_vec_set_var_idx_p (outermode))
 	      || (is_extract
diff --git a/gcc/testsuite/gcc.target/i386/pr113622-1.c b/gcc/testsuite/gcc.target/i386/pr113622-1.c
new file mode 100644
index 00000000000..2d6cb3c89a8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr113622-1.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx512f -w" } */
+
+typedef float __attribute__ ((vector_size (64))) vec;
+register vec a asm("zmm2"), b asm("zmm0"), c asm("zmm1");
+
+void
+test (void)
+{
+  for (int i = 0; i < 8; i++)
+    c[i] = a[i] < b[i] ? 0.1 : 0.2;
+}
-- 
2.35.3


* Re: [PATCH] middle-end/113622 - allow .VEC_SET and .VEC_EXTRACT for global hard regs
  2024-01-29 10:38 ` Jakub Jelinek
@ 2024-01-29 11:58   ` Richard Biener
From: Richard Biener @ 2024-01-29 11:58 UTC
  To: Jakub Jelinek; +Cc: gcc-patches

On Mon, 29 Jan 2024, Jakub Jelinek wrote:

> On Mon, Jan 29, 2024 at 11:24:58AM +0100, Richard Biener wrote:
> > The following expands .VEC_SET and .VEC_EXTRACT instruction selection
> > to global hard registers, not only automatic variables (possibly)
> > promoted to registers.  This can avoid some ICEs later and create
> > better code.
> > 
> > Bootstrapped and tested on x86_64-unknown-linux-gnu.
> > 
> > OK?
> > 
> > Thanks,
> > Richard.
> > 
> > 	PR middle-end/113622
> > 	* gimple-isel.cc (gimple_expand_vec_set_extract_expr):
> > 	Also allow DECL_HARD_REGISTER variables.
> > 
> > 	* gcc.target/i386/pr113622-1.c: New testcase.
> > ---
> >  gcc/gimple-isel.cc                         |  3 ++-
> >  gcc/testsuite/gcc.target/i386/pr113622-1.c | 12 ++++++++++++
> >  2 files changed, 14 insertions(+), 1 deletion(-)
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr113622-1.c
> > 
> > diff --git a/gcc/gimple-isel.cc b/gcc/gimple-isel.cc
> > index 7e2392ecd38..e94f292dd38 100644
> > --- a/gcc/gimple-isel.cc
> > +++ b/gcc/gimple-isel.cc
> > @@ -104,7 +104,8 @@ gimple_expand_vec_set_extract_expr (struct function *fun,
> >        machine_mode outermode = TYPE_MODE (TREE_TYPE (view_op0));
> >        machine_mode extract_mode = TYPE_MODE (TREE_TYPE (ref));
> >  
> > -      if (auto_var_in_fn_p (view_op0, fun->decl)
> > +      if ((auto_var_in_fn_p (view_op0, fun->decl)
> > +	   || DECL_HARD_REGISTER (view_op0))
> >  	  && !TREE_ADDRESSABLE (view_op0)
> >  	  && ((!is_extract && can_vec_set_var_idx_p (outermode))
> >  	      || (is_extract
> 
> All we know here from the earlier checks is DECL_P (view_op0), but
> DECL_HARD_REGISTER uses VAR_DECL_CHECK, shouldn't this be
> 	   || (VAR_P (view_op0) && DECL_HARD_REGISTER (view_op0)))
> instead?

Ah, yeah - will fix.

> > diff --git a/gcc/testsuite/gcc.target/i386/pr113622-1.c b/gcc/testsuite/gcc.target/i386/pr113622-1.c
> > new file mode 100644
> > index 00000000000..2d6cb3c89a8
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/i386/pr113622-1.c
> > @@ -0,0 +1,12 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O2 -mavx512f -w" } */
> > +
> > +typedef float __attribute__ ((vector_size (64))) vec;
> > +register vec a asm("zmm2"), b asm("zmm0"), c asm("zmm1");
> 
> I'd feel better if this used, say, zmm5, zmm6, zmm7 or something similar
> so that it doesn't clash with some of the implicitly used SSE
> registers, but on the other hand still fits into the 8 SSE registers
> which ia32 has access to.

OK, will adjust.

Thanks,
Richard.

> > +
> > +void
> > +test (void)
> > +{
> > +  for (int i = 0; i < 8; i++)
> > +    c[i] = a[i] < b[i] ? 0.1 : 0.2;
> > +}
> 
> Otherwise LGTM.
> 
> 	Jakub
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)


* Re: [PATCH] middle-end/113622 - allow .VEC_SET and .VEC_EXTRACT for global hard regs
       [not found] <20240129103038.CAAD13858439@sourceware.org>
@ 2024-01-29 10:38 ` Jakub Jelinek
  2024-01-29 11:58   ` Richard Biener
From: Jakub Jelinek @ 2024-01-29 10:38 UTC
  To: Richard Biener; +Cc: gcc-patches

On Mon, Jan 29, 2024 at 11:24:58AM +0100, Richard Biener wrote:
> The following expands .VEC_SET and .VEC_EXTRACT instruction selection
> to global hard registers, not only automatic variables (possibly)
> promoted to registers.  This can avoid some ICEs later and create
> better code.
> 
> Bootstrapped and tested on x86_64-unknown-linux-gnu.
> 
> OK?
> 
> Thanks,
> Richard.
> 
> 	PR middle-end/113622
> 	* gimple-isel.cc (gimple_expand_vec_set_extract_expr):
> 	Also allow DECL_HARD_REGISTER variables.
> 
> 	* gcc.target/i386/pr113622-1.c: New testcase.
> ---
>  gcc/gimple-isel.cc                         |  3 ++-
>  gcc/testsuite/gcc.target/i386/pr113622-1.c | 12 ++++++++++++
>  2 files changed, 14 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr113622-1.c
> 
> diff --git a/gcc/gimple-isel.cc b/gcc/gimple-isel.cc
> index 7e2392ecd38..e94f292dd38 100644
> --- a/gcc/gimple-isel.cc
> +++ b/gcc/gimple-isel.cc
> @@ -104,7 +104,8 @@ gimple_expand_vec_set_extract_expr (struct function *fun,
>        machine_mode outermode = TYPE_MODE (TREE_TYPE (view_op0));
>        machine_mode extract_mode = TYPE_MODE (TREE_TYPE (ref));
>  
> -      if (auto_var_in_fn_p (view_op0, fun->decl)
> +      if ((auto_var_in_fn_p (view_op0, fun->decl)
> +	   || DECL_HARD_REGISTER (view_op0))
>  	  && !TREE_ADDRESSABLE (view_op0)
>  	  && ((!is_extract && can_vec_set_var_idx_p (outermode))
>  	      || (is_extract

All we know here from the earlier checks is DECL_P (view_op0), but
DECL_HARD_REGISTER uses VAR_DECL_CHECK, shouldn't this be
	   || (VAR_P (view_op0) && DECL_HARD_REGISTER (view_op0)))
instead?

> diff --git a/gcc/testsuite/gcc.target/i386/pr113622-1.c b/gcc/testsuite/gcc.target/i386/pr113622-1.c
> new file mode 100644
> index 00000000000..2d6cb3c89a8
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr113622-1.c
> @@ -0,0 +1,12 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mavx512f -w" } */
> +
> +typedef float __attribute__ ((vector_size (64))) vec;
> +register vec a asm("zmm2"), b asm("zmm0"), c asm("zmm1");

I'd feel better if this used, say, zmm5, zmm6, zmm7 or something similar
so that it doesn't clash with some of the implicitly used SSE
registers, but on the other hand still fits into the 8 SSE registers
which ia32 has access to.

> +
> +void
> +test (void)
> +{
> +  for (int i = 0; i < 8; i++)
> +    c[i] = a[i] < b[i] ? 0.1 : 0.2;
> +}

Otherwise LGTM.

	Jakub

