public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH RFA] expand: empty class return optimization [PR88529]
@ 2021-06-15  3:24 Jason Merrill
  2021-06-21  7:40 ` Richard Biener
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Jason Merrill @ 2021-06-15  3:24 UTC (permalink / raw)
  To: gcc-patches

The x86_64 psABI says that an empty class isn't passed or returned in memory or
registers, so we shouldn't set %eax in this function.  Is this a reasonable
place to implement that?  Another possibility would be to remove the hack to
prevent i386.c:function_value_64 from returning NULL in this case and fix the
callers to deal, but that seems like more work.

The df-scan hunk catches the case where we look at a 0-length reg and build
a range the length of unsigned int, which happened before I changed
assign_parms to match expand_function_end.

Tested x86_64-pc-linux-gnu.  OK for trunk?

	PR target/88529

gcc/ChangeLog:

	* df-scan.c (df_ref_record): Check that regno < endregno.
	* function.c (assign_parms, expand_function_end): Do nothing with a
	TYPE_EMPTY_P result.

gcc/testsuite/ChangeLog:

	* g++.target/i386/empty-class1.C: New test.
---
 gcc/df-scan.c                                |  2 ++
 gcc/function.c                               | 16 ++++++++++------
 gcc/testsuite/g++.target/i386/empty-class1.C |  9 +++++++++
 3 files changed, 21 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/i386/empty-class1.C

diff --git a/gcc/df-scan.c b/gcc/df-scan.c
index 1268536b3f0..e9da64ff3df 100644
--- a/gcc/df-scan.c
+++ b/gcc/df-scan.c
@@ -2595,6 +2595,8 @@ df_ref_record (enum df_ref_class cl,
 	    ref_flags |= DF_REF_PARTIAL;
 	  ref_flags |= DF_REF_MW_HARDREG;
 
+	  gcc_assert (regno < endregno);
+
 	  hardreg = problem_data->mw_reg_pool->allocate ();
 	  hardreg->type = ref_type;
 	  hardreg->flags = ref_flags;
diff --git a/gcc/function.c b/gcc/function.c
index 67576950983..6abaf3d116f 100644
--- a/gcc/function.c
+++ b/gcc/function.c
@@ -3821,9 +3821,11 @@ assign_parms (tree fndecl)
       tree decl_result = DECL_RESULT (fndecl);
       rtx decl_rtl = DECL_RTL (decl_result);
 
-      if (REG_P (decl_rtl)
-	  ? REGNO (decl_rtl) >= FIRST_PSEUDO_REGISTER
-	  : DECL_REGISTER (decl_result))
+      if ((REG_P (decl_rtl)
+	   ? REGNO (decl_rtl) >= FIRST_PSEUDO_REGISTER
+	   : DECL_REGISTER (decl_result))
+	  /* Unless the psABI says not to.  */
+	  && !TYPE_EMPTY_P (TREE_TYPE (decl_result)))
 	{
 	  rtx real_decl_rtl;
 
@@ -5410,9 +5412,11 @@ expand_function_end (void)
       tree decl_result = DECL_RESULT (current_function_decl);
       rtx decl_rtl = DECL_RTL (decl_result);
 
-      if (REG_P (decl_rtl)
-	  ? REGNO (decl_rtl) >= FIRST_PSEUDO_REGISTER
-	  : DECL_REGISTER (decl_result))
+      if ((REG_P (decl_rtl)
+	   ? REGNO (decl_rtl) >= FIRST_PSEUDO_REGISTER
+	   : DECL_REGISTER (decl_result))
+	  /* Unless the psABI says not to.  */
+	  && !TYPE_EMPTY_P (TREE_TYPE (decl_result)))
 	{
 	  rtx real_decl_rtl = crtl->return_rtx;
 	  complex_mode cmode;
diff --git a/gcc/testsuite/g++.target/i386/empty-class1.C b/gcc/testsuite/g++.target/i386/empty-class1.C
new file mode 100644
index 00000000000..c1992772d26
--- /dev/null
+++ b/gcc/testsuite/g++.target/i386/empty-class1.C
@@ -0,0 +1,9 @@
+// PR target/88529
+// { dg-do compile { target { c++11 && x86_64-*-* } } }
+// { dg-additional-options -fdump-rtl-expand }
+// { dg-final { scan-rtl-dump-not "set" "expand" } }
+// The x86_64 psABI says that f() doesn't put the return value anywhere.
+
+class A{};
+
+A f() { return {}; }

base-commit: 08ce1f4c5091b80b680d15c53a17237544a3cca8
prerequisite-patch-id: 320d08bee3615919d0cf6a9d33bc2247aece51c7
-- 
2.27.0


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH RFA] expand: empty class return optimization [PR88529]
  2021-06-15  3:24 [PATCH RFA] expand: empty class return optimization [PR88529] Jason Merrill
@ 2021-06-21  7:40 ` Richard Biener
  2021-06-22 13:00 ` [PATCH] expand: Fix up empty class return optimization [PR101160] Jakub Jelinek
  2021-06-22 21:18 ` [PATCH RFA] expand: empty class return optimization [PR88529] Joseph Myers
  2 siblings, 0 replies; 5+ messages in thread
From: Richard Biener @ 2021-06-21  7:40 UTC (permalink / raw)
  To: Jason Merrill; +Cc: GCC Patches

On Tue, Jun 15, 2021 at 5:25 AM Jason Merrill via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> The x86_64 psABI says that an empty class isn't passed or returned in memory or
> registers, so we shouldn't set %eax in this function.  Is this a reasonable
> place to implement that?  Another possibility would be to remove the hack to
> prevent i386.c:function_value_64 from returning NULL in this case and fix the
> callers to deal, but that seems like more work.
>
> The df-scan hunk catches the case where we look at a 0-length reg and build
> a range the length of unsigned int, which happened before I changed
> assign_parms to match expand_function_end.
>
> Tested x86_64-pc-linux-gnu.  OK for trunk?

OK.

Thanks,
Richard.

>         PR target/88529
>
> gcc/ChangeLog:
>
>         * df-scan.c (df_ref_record): Check that regno < endregno.
>         * function.c (assign_parms, expand_function_end): Do nothing with a
>         TYPE_EMPTY_P result.
>
> gcc/testsuite/ChangeLog:
>
>         * g++.target/i386/empty-class1.C: New test.
> ---
>  gcc/df-scan.c                                |  2 ++
>  gcc/function.c                               | 16 ++++++++++------
>  gcc/testsuite/g++.target/i386/empty-class1.C |  9 +++++++++
>  3 files changed, 21 insertions(+), 6 deletions(-)
>  create mode 100644 gcc/testsuite/g++.target/i386/empty-class1.C
>
> diff --git a/gcc/df-scan.c b/gcc/df-scan.c
> index 1268536b3f0..e9da64ff3df 100644
> --- a/gcc/df-scan.c
> +++ b/gcc/df-scan.c
> @@ -2595,6 +2595,8 @@ df_ref_record (enum df_ref_class cl,
>             ref_flags |= DF_REF_PARTIAL;
>           ref_flags |= DF_REF_MW_HARDREG;
>
> +         gcc_assert (regno < endregno);
> +
>           hardreg = problem_data->mw_reg_pool->allocate ();
>           hardreg->type = ref_type;
>           hardreg->flags = ref_flags;
> diff --git a/gcc/function.c b/gcc/function.c
> index 67576950983..6abaf3d116f 100644
> --- a/gcc/function.c
> +++ b/gcc/function.c
> @@ -3821,9 +3821,11 @@ assign_parms (tree fndecl)
>        tree decl_result = DECL_RESULT (fndecl);
>        rtx decl_rtl = DECL_RTL (decl_result);
>
> -      if (REG_P (decl_rtl)
> -         ? REGNO (decl_rtl) >= FIRST_PSEUDO_REGISTER
> -         : DECL_REGISTER (decl_result))
> +      if ((REG_P (decl_rtl)
> +          ? REGNO (decl_rtl) >= FIRST_PSEUDO_REGISTER
> +          : DECL_REGISTER (decl_result))
> +         /* Unless the psABI says not to.  */
> +         && !TYPE_EMPTY_P (TREE_TYPE (decl_result)))
>         {
>           rtx real_decl_rtl;
>
> @@ -5410,9 +5412,11 @@ expand_function_end (void)
>        tree decl_result = DECL_RESULT (current_function_decl);
>        rtx decl_rtl = DECL_RTL (decl_result);
>
> -      if (REG_P (decl_rtl)
> -         ? REGNO (decl_rtl) >= FIRST_PSEUDO_REGISTER
> -         : DECL_REGISTER (decl_result))
> +      if ((REG_P (decl_rtl)
> +          ? REGNO (decl_rtl) >= FIRST_PSEUDO_REGISTER
> +          : DECL_REGISTER (decl_result))
> +         /* Unless the psABI says not to.  */
> +         && !TYPE_EMPTY_P (TREE_TYPE (decl_result)))
>         {
>           rtx real_decl_rtl = crtl->return_rtx;
>           complex_mode cmode;
> diff --git a/gcc/testsuite/g++.target/i386/empty-class1.C b/gcc/testsuite/g++.target/i386/empty-class1.C
> new file mode 100644
> index 00000000000..c1992772d26
> --- /dev/null
> +++ b/gcc/testsuite/g++.target/i386/empty-class1.C
> @@ -0,0 +1,9 @@
> +// PR target/88529
> +// { dg-do compile { target { c++11 && x86_64-*-* } } }
> +// { dg-additional-options -fdump-rtl-expand }
> +// { dg-final { scan-rtl-dump-not "set" "expand" } }
> +// The x86_64 psABI says that f() doesn't put the return value anywhere.
> +
> +class A{};
> +
> +A f() { return {}; }
>
> base-commit: 08ce1f4c5091b80b680d15c53a17237544a3cca8
> prerequisite-patch-id: 320d08bee3615919d0cf6a9d33bc2247aece51c7
> --
> 2.27.0
>

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH] expand: Fix up empty class return optimization [PR101160]
  2021-06-15  3:24 [PATCH RFA] expand: empty class return optimization [PR88529] Jason Merrill
  2021-06-21  7:40 ` Richard Biener
@ 2021-06-22 13:00 ` Jakub Jelinek
  2021-06-22 13:15   ` Richard Biener
  2021-06-22 21:18 ` [PATCH RFA] expand: empty class return optimization [PR88529] Joseph Myers
  2 siblings, 1 reply; 5+ messages in thread
From: Jakub Jelinek @ 2021-06-22 13:00 UTC (permalink / raw)
  To: Richard Biener, Eric Botcazou, Jason Merrill; +Cc: gcc-patches

On Mon, Jun 14, 2021 at 11:24:22PM -0400, Jason Merrill via Gcc-patches wrote:
> The x86_64 psABI says that an empty class isn't passed or returned in memory or
> registers, so we shouldn't set %eax in this function.  Is this a reasonable
> place to implement that?  Another possibility would be to remove the hack to
> prevent i386.c:function_value_64 from returning NULL in this case and fix the
> callers to deal, but that seems like more work.
> 
> The df-scan hunk catches the case where we look at a 0-length reg and build
> a range the length of unsigned int, which happened before I changed
> assign_parms to match expand_function_end.

The assign_params change unfortunately breaks e.g. the following testcase.
The problem is that some passes (e.g. subreg lowering but assign_parms
comments also talk about delayed slot scheduling) rely on crtl->return_rtx
not to contain pseudo registers, and the assign_parms change results
in the pseudo in there not being replaced with a hard register.

The following patch instead clears the crtl->return_rtx if a function
returns TYPE_EMPTY_P structure, that way (use (pseudo)) is not emitted
into the IL and it is treated like more like functions returning void.

I've also changed the effective target on the empty-class1.C testcase, so
that it doesn't fail on x86_64-linux with -m32 testing.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2021-06-22  Jakub Jelinek  <jakub@redhat.com>

	PR middle-end/101160
	* function.c (assign_parms): For decl_result with TYPE_EMPTY_P type
	clear crtl->return_rtx instead of keeping it referencing a pseudo.

	* g++.target/i386/empty-class1.C: Require lp64 effective target
	instead of x86_64-*-*.
	* g++.target/i386/empty-class2.C: New test.

--- gcc/function.c.jj	2021-06-22 10:04:46.000000000 +0200
+++ gcc/function.c	2021-06-22 11:30:36.615264498 +0200
@@ -3821,17 +3821,22 @@ assign_parms (tree fndecl)
       tree decl_result = DECL_RESULT (fndecl);
       rtx decl_rtl = DECL_RTL (decl_result);
 
-      if ((REG_P (decl_rtl)
-	   ? REGNO (decl_rtl) >= FIRST_PSEUDO_REGISTER
-	   : DECL_REGISTER (decl_result))
-	  /* Unless the psABI says not to.  */
-	  && !TYPE_EMPTY_P (TREE_TYPE (decl_result)))
+      if (REG_P (decl_rtl)
+	  ? REGNO (decl_rtl) >= FIRST_PSEUDO_REGISTER
+	  : DECL_REGISTER (decl_result))
 	{
 	  rtx real_decl_rtl;
 
-	  real_decl_rtl = targetm.calls.function_value (TREE_TYPE (decl_result),
-							fndecl, true);
-	  REG_FUNCTION_VALUE_P (real_decl_rtl) = 1;
+	  /* Unless the psABI says not to.  */
+	  if (TYPE_EMPTY_P (TREE_TYPE (decl_result)))
+	    real_decl_rtl = NULL_RTX;
+	  else
+	    {
+	      real_decl_rtl
+		= targetm.calls.function_value (TREE_TYPE (decl_result),
+						fndecl, true);
+	      REG_FUNCTION_VALUE_P (real_decl_rtl) = 1;
+	    }
 	  /* The delay slot scheduler assumes that crtl->return_rtx
 	     holds the hard register containing the return value, not a
 	     temporary pseudo.  */
--- gcc/testsuite/g++.target/i386/empty-class1.C.jj	2021-06-22 10:04:46.377208914 +0200
+++ gcc/testsuite/g++.target/i386/empty-class1.C	2021-06-22 11:37:53.463375502 +0200
@@ -1,5 +1,5 @@
 // PR target/88529
-// { dg-do compile { target { c++11 && x86_64-*-* } } }
+// { dg-do compile { target { c++11 && lp64 } } }
 // { dg-additional-options -fdump-rtl-expand }
 // { dg-final { scan-rtl-dump-not "set" "expand" } }
 // The x86_64 psABI says that f() doesn't put the return value anywhere.
--- gcc/testsuite/g++.target/i386/empty-class2.C.jj	2021-06-22 11:34:53.422805115 +0200
+++ gcc/testsuite/g++.target/i386/empty-class2.C	2021-06-22 11:35:34.048257864 +0200
@@ -0,0 +1,20 @@
+// PR middle-end/101160
+// Test passing aligned empty aggregate
+// { dg-do compile }
+// { dg-options "-O2" }
+// { dg-additional-options "-Wno-psabi" { target { { i?86-*-* x86_64-*-* } && ilp32 } } }
+
+struct S { union {} a; } __attribute__((aligned));
+
+S
+foo (S arg)
+{
+  return arg;
+}
+
+void
+bar (void)
+{
+  S arg;
+  foo (arg);
+}


	Jakub


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] expand: Fix up empty class return optimization [PR101160]
  2021-06-22 13:00 ` [PATCH] expand: Fix up empty class return optimization [PR101160] Jakub Jelinek
@ 2021-06-22 13:15   ` Richard Biener
  0 siblings, 0 replies; 5+ messages in thread
From: Richard Biener @ 2021-06-22 13:15 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Eric Botcazou, Jason Merrill, gcc-patches

On Tue, 22 Jun 2021, Jakub Jelinek wrote:

> On Mon, Jun 14, 2021 at 11:24:22PM -0400, Jason Merrill via Gcc-patches wrote:
> > The x86_64 psABI says that an empty class isn't passed or returned in memory or
> > registers, so we shouldn't set %eax in this function.  Is this a reasonable
> > place to implement that?  Another possibility would be to remove the hack to
> > prevent i386.c:function_value_64 from returning NULL in this case and fix the
> > callers to deal, but that seems like more work.
> > 
> > The df-scan hunk catches the case where we look at a 0-length reg and build
> > a range the length of unsigned int, which happened before I changed
> > assign_parms to match expand_function_end.
> 
> The assign_params change unfortunately breaks e.g. the following testcase.
> The problem is that some passes (e.g. subreg lowering but assign_parms
> comments also talk about delayed slot scheduling) rely on crtl->return_rtx
> not to contain pseudo registers, and the assign_parms change results
> in the pseudo in there not being replaced with a hard register.
> 
> The following patch instead clears the crtl->return_rtx if a function
> returns TYPE_EMPTY_P structure, that way (use (pseudo)) is not emitted
> into the IL and it is treated like more like functions returning void.
> 
> I've also changed the effective target on the empty-class1.C testcase, so
> that it doesn't fail on x86_64-linux with -m32 testing.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.
 
> 2021-06-22  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR middle-end/101160
> 	* function.c (assign_parms): For decl_result with TYPE_EMPTY_P type
> 	clear crtl->return_rtx instead of keeping it referencing a pseudo.
> 
> 	* g++.target/i386/empty-class1.C: Require lp64 effective target
> 	instead of x86_64-*-*.
> 	* g++.target/i386/empty-class2.C: New test.
> 
> --- gcc/function.c.jj	2021-06-22 10:04:46.000000000 +0200
> +++ gcc/function.c	2021-06-22 11:30:36.615264498 +0200
> @@ -3821,17 +3821,22 @@ assign_parms (tree fndecl)
>        tree decl_result = DECL_RESULT (fndecl);
>        rtx decl_rtl = DECL_RTL (decl_result);
>  
> -      if ((REG_P (decl_rtl)
> -	   ? REGNO (decl_rtl) >= FIRST_PSEUDO_REGISTER
> -	   : DECL_REGISTER (decl_result))
> -	  /* Unless the psABI says not to.  */
> -	  && !TYPE_EMPTY_P (TREE_TYPE (decl_result)))
> +      if (REG_P (decl_rtl)
> +	  ? REGNO (decl_rtl) >= FIRST_PSEUDO_REGISTER
> +	  : DECL_REGISTER (decl_result))
>  	{
>  	  rtx real_decl_rtl;
>  
> -	  real_decl_rtl = targetm.calls.function_value (TREE_TYPE (decl_result),
> -							fndecl, true);
> -	  REG_FUNCTION_VALUE_P (real_decl_rtl) = 1;
> +	  /* Unless the psABI says not to.  */
> +	  if (TYPE_EMPTY_P (TREE_TYPE (decl_result)))
> +	    real_decl_rtl = NULL_RTX;
> +	  else
> +	    {
> +	      real_decl_rtl
> +		= targetm.calls.function_value (TREE_TYPE (decl_result),
> +						fndecl, true);
> +	      REG_FUNCTION_VALUE_P (real_decl_rtl) = 1;
> +	    }
>  	  /* The delay slot scheduler assumes that crtl->return_rtx
>  	     holds the hard register containing the return value, not a
>  	     temporary pseudo.  */
> --- gcc/testsuite/g++.target/i386/empty-class1.C.jj	2021-06-22 10:04:46.377208914 +0200
> +++ gcc/testsuite/g++.target/i386/empty-class1.C	2021-06-22 11:37:53.463375502 +0200
> @@ -1,5 +1,5 @@
>  // PR target/88529
> -// { dg-do compile { target { c++11 && x86_64-*-* } } }
> +// { dg-do compile { target { c++11 && lp64 } } }
>  // { dg-additional-options -fdump-rtl-expand }
>  // { dg-final { scan-rtl-dump-not "set" "expand" } }
>  // The x86_64 psABI says that f() doesn't put the return value anywhere.
> --- gcc/testsuite/g++.target/i386/empty-class2.C.jj	2021-06-22 11:34:53.422805115 +0200
> +++ gcc/testsuite/g++.target/i386/empty-class2.C	2021-06-22 11:35:34.048257864 +0200
> @@ -0,0 +1,20 @@
> +// PR middle-end/101160
> +// Test passing aligned empty aggregate
> +// { dg-do compile }
> +// { dg-options "-O2" }
> +// { dg-additional-options "-Wno-psabi" { target { { i?86-*-* x86_64-*-* } && ilp32 } } }
> +
> +struct S { union {} a; } __attribute__((aligned));
> +
> +S
> +foo (S arg)
> +{
> +  return arg;
> +}
> +
> +void
> +bar (void)
> +{
> +  S arg;
> +  foo (arg);
> +}
> 
> 
> 	Jakub
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH RFA] expand: empty class return optimization [PR88529]
  2021-06-15  3:24 [PATCH RFA] expand: empty class return optimization [PR88529] Jason Merrill
  2021-06-21  7:40 ` Richard Biener
  2021-06-22 13:00 ` [PATCH] expand: Fix up empty class return optimization [PR101160] Jakub Jelinek
@ 2021-06-22 21:18 ` Joseph Myers
  2 siblings, 0 replies; 5+ messages in thread
From: Joseph Myers @ 2021-06-22 21:18 UTC (permalink / raw)
  To: Jason Merrill; +Cc: gcc-patches

This introduces an ICE building libgomp for ColdFire, as detected by my 
glibc bot.

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101170

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2021-06-22 21:18 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-15  3:24 [PATCH RFA] expand: empty class return optimization [PR88529] Jason Merrill
2021-06-21  7:40 ` Richard Biener
2021-06-22 13:00 ` [PATCH] expand: Fix up empty class return optimization [PR101160] Jakub Jelinek
2021-06-22 13:15   ` Richard Biener
2021-06-22 21:18 ` [PATCH RFA] expand: empty class return optimization [PR88529] Joseph Myers

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).