public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] Fix expand_builtin_atomic_fetch_op for pre-op (PR80902)
@ 2017-05-28 13:31 Segher Boessenkool
  2017-05-31 13:31 ` Segher Boessenkool
  2017-06-23  4:59 ` Jeff Law
  0 siblings, 2 replies; 8+ messages in thread
From: Segher Boessenkool @ 2017-05-28 13:31 UTC (permalink / raw)
  To: gcc-patches; +Cc: Segher Boessenkool

__atomic_add_fetch adds a value to some memory, and returns the result.
If there is no direct support for this, expand_builtin_atomic_fetch_op
is asked to implement this as __atomic_fetch_add (which returns the
original value of the mem), followed by the addition.  Now, the
__atomic_add_fetch could have been a tail call, but we shouldn't
perform the __atomic_fetch_add as a tail call: following code would
not be executed, and in fact thrown away because there is a barrier
after tail calls.

This fixes it.

Tested on powerpc64-linux {-m32,-m64}.  Is this okay for trunk?


Segher


2017-05-28  Segher Boessenkool  <segher@kernel.crashing.org>

	PR middle-end/80902
	* builtins.c (expand_builtin_atomic_fetch_op): If emitting code after
	a call, force the call to not be a tail call.

---
 gcc/builtins.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 4f6c9c4..3a70693 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -6079,6 +6079,12 @@ expand_builtin_atomic_fetch_op (machine_mode mode, tree exp, rtx target,
   gcc_assert (TREE_OPERAND (addr, 0) == fndecl);
   TREE_OPERAND (addr, 0) = builtin_decl_explicit (ext_call);
 
+  /* If we will emit code after the call, the call can not be a tail call.
+     If it is emitted as a tail call, a barrier is emitted after it, and
+     then all trailing code is removed.  */
+  if (!ignore)
+    CALL_EXPR_TAILCALL (exp) = 0;
+
   /* Expand the call here so we can emit trailing code.  */
   ret = expand_call (exp, target, ignore);
 
-- 
1.9.3

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] Fix expand_builtin_atomic_fetch_op for pre-op (PR80902)
  2017-05-28 13:31 [PATCH] Fix expand_builtin_atomic_fetch_op for pre-op (PR80902) Segher Boessenkool
@ 2017-05-31 13:31 ` Segher Boessenkool
  2017-06-11  0:15   ` [ping x2] " Segher Boessenkool
  2017-06-23  4:59 ` Jeff Law
  1 sibling, 1 reply; 8+ messages in thread
From: Segher Boessenkool @ 2017-05-31 13:31 UTC (permalink / raw)
  To: gcc-patches

Ping.

(Sorry for the very aggressive ping; this fixes 764 testsuite failures
on powerpc-linux).


Segher


On Sun, May 28, 2017 at 12:31:12PM +0000, Segher Boessenkool wrote:
> __atomic_add_fetch adds a value to some memory, and returns the result.
> If there is no direct support for this, expand_builtin_atomic_fetch_op
> is asked to implement this as __atomic_fetch_add (which returns the
> original value of the mem), followed by the addition.  Now, the
> __atomic_add_fetch could have been a tail call, but we shouldn't
> perform the __atomic_fetch_add as a tail call: following code would
> not be executed, and in fact thrown away because there is a barrier
> after tail calls.
> 
> This fixes it.
> 
> Tested on powerpc64-linux {-m32,-m64}.  Is this okay for trunk?
> 
> 
> Segher
> 
> 
> 2017-05-28  Segher Boessenkool  <segher@kernel.crashing.org>
> 
> 	PR middle-end/80902
> 	* builtins.c (expand_builtin_atomic_fetch_op): If emitting code after
> 	a call, force the call to not be a tail call.
> 
> ---
>  gcc/builtins.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/gcc/builtins.c b/gcc/builtins.c
> index 4f6c9c4..3a70693 100644
> --- a/gcc/builtins.c
> +++ b/gcc/builtins.c
> @@ -6079,6 +6079,12 @@ expand_builtin_atomic_fetch_op (machine_mode mode, tree exp, rtx target,
>    gcc_assert (TREE_OPERAND (addr, 0) == fndecl);
>    TREE_OPERAND (addr, 0) = builtin_decl_explicit (ext_call);
>  
> +  /* If we will emit code after the call, the call can not be a tail call.
> +     If it is emitted as a tail call, a barrier is emitted after it, and
> +     then all trailing code is removed.  */
> +  if (!ignore)
> +    CALL_EXPR_TAILCALL (exp) = 0;
> +
>    /* Expand the call here so we can emit trailing code.  */
>    ret = expand_call (exp, target, ignore);
>  
> -- 
> 1.9.3

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [ping x2] Re: [PATCH] Fix expand_builtin_atomic_fetch_op for pre-op (PR80902)
  2017-05-31 13:31 ` Segher Boessenkool
@ 2017-06-11  0:15   ` Segher Boessenkool
  0 siblings, 0 replies; 8+ messages in thread
From: Segher Boessenkool @ 2017-06-11  0:15 UTC (permalink / raw)
  To: gcc-patches

Ping.

On Wed, May 31, 2017 at 08:19:56AM -0500, Segher Boessenkool wrote:
> Ping.
> 
> (Sorry for the very aggressive ping; this fixes 764 testsuite failures
> on powerpc-linux).
> 
> 
> Segher
> 
> 
> On Sun, May 28, 2017 at 12:31:12PM +0000, Segher Boessenkool wrote:
> > __atomic_add_fetch adds a value to some memory, and returns the result.
> > If there is no direct support for this, expand_builtin_atomic_fetch_op
> > is asked to implement this as __atomic_fetch_add (which returns the
> > original value of the mem), followed by the addition.  Now, the
> > __atomic_add_fetch could have been a tail call, but we shouldn't
> > perform the __atomic_fetch_add as a tail call: following code would
> > not be executed, and in fact thrown away because there is a barrier
> > after tail calls.
> > 
> > This fixes it.
> > 
> > Tested on powerpc64-linux {-m32,-m64}.  Is this okay for trunk?
> > 
> > 
> > Segher
> > 
> > 
> > 2017-05-28  Segher Boessenkool  <segher@kernel.crashing.org>
> > 
> > 	PR middle-end/80902
> > 	* builtins.c (expand_builtin_atomic_fetch_op): If emitting code after
> > 	a call, force the call to not be a tail call.
> > 
> > ---
> >  gcc/builtins.c | 6 ++++++
> >  1 file changed, 6 insertions(+)
> > 
> > diff --git a/gcc/builtins.c b/gcc/builtins.c
> > index 4f6c9c4..3a70693 100644
> > --- a/gcc/builtins.c
> > +++ b/gcc/builtins.c
> > @@ -6079,6 +6079,12 @@ expand_builtin_atomic_fetch_op (machine_mode mode, tree exp, rtx target,
> >    gcc_assert (TREE_OPERAND (addr, 0) == fndecl);
> >    TREE_OPERAND (addr, 0) = builtin_decl_explicit (ext_call);
> >  
> > +  /* If we will emit code after the call, the call can not be a tail call.
> > +     If it is emitted as a tail call, a barrier is emitted after it, and
> > +     then all trailing code is removed.  */
> > +  if (!ignore)
> > +    CALL_EXPR_TAILCALL (exp) = 0;
> > +
> >    /* Expand the call here so we can emit trailing code.  */
> >    ret = expand_call (exp, target, ignore);
> >  
> > -- 
> > 1.9.3

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] Fix expand_builtin_atomic_fetch_op for pre-op (PR80902)
  2017-05-28 13:31 [PATCH] Fix expand_builtin_atomic_fetch_op for pre-op (PR80902) Segher Boessenkool
  2017-05-31 13:31 ` Segher Boessenkool
@ 2017-06-23  4:59 ` Jeff Law
  2017-06-23 17:44   ` Segher Boessenkool
  2017-07-07 11:56   ` Segher Boessenkool
  1 sibling, 2 replies; 8+ messages in thread
From: Jeff Law @ 2017-06-23  4:59 UTC (permalink / raw)
  To: Segher Boessenkool, gcc-patches

On 05/28/2017 06:31 AM, Segher Boessenkool wrote:
> __atomic_add_fetch adds a value to some memory, and returns the result.
> If there is no direct support for this, expand_builtin_atomic_fetch_op
> is asked to implement this as __atomic_fetch_add (which returns the
> original value of the mem), followed by the addition.  Now, the
> __atomic_add_fetch could have been a tail call, but we shouldn't
> perform the __atomic_fetch_add as a tail call: following code would
> not be executed, and in fact thrown away because there is a barrier
> after tail calls.
> 
> This fixes it.
> 
> Tested on powerpc64-linux {-m32,-m64}.  Is this okay for trunk?
> 
> 
> Segher
> 
> 
> 2017-05-28  Segher Boessenkool  <segher@kernel.crashing.org>
> 
> 	PR middle-end/80902
> 	* builtins.c (expand_builtin_atomic_fetch_op): If emitting code after
> 	a call, force the call to not be a tail call.
Hmmm.  I wonder if we have similar problems elsewhere.  For example
expand_builtin_int_roundingfn_2, stack_protect_epilogue,
expand_builtin_trap (though this one probably isn't broken in practice),
expand_ifn_atomic_compare_exchange_into_call.

OK, but please check the other instances where we call expand_call, then
continue generating code afterwards.  Fixing those can be a follow-up patch.

Sorry for the delay.

Jeff

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] Fix expand_builtin_atomic_fetch_op for pre-op (PR80902)
  2017-06-23  4:59 ` Jeff Law
@ 2017-06-23 17:44   ` Segher Boessenkool
  2017-06-23 17:48     ` Jeff Law
  2017-07-07 11:56   ` Segher Boessenkool
  1 sibling, 1 reply; 8+ messages in thread
From: Segher Boessenkool @ 2017-06-23 17:44 UTC (permalink / raw)
  To: Jeff Law; +Cc: gcc-patches

On Thu, Jun 22, 2017 at 10:59:05PM -0600, Jeff Law wrote:
> On 05/28/2017 06:31 AM, Segher Boessenkool wrote:
> > __atomic_add_fetch adds a value to some memory, and returns the result.
> > If there is no direct support for this, expand_builtin_atomic_fetch_op
> > is asked to implement this as __atomic_fetch_add (which returns the
> > original value of the mem), followed by the addition.  Now, the
> > __atomic_add_fetch could have been a tail call, but we shouldn't
> > perform the __atomic_fetch_add as a tail call: following code would
> > not be executed, and in fact thrown away because there is a barrier
> > after tail calls.
> > 
> > This fixes it.

> > 	PR middle-end/80902
> > 	* builtins.c (expand_builtin_atomic_fetch_op): If emitting code after
> > 	a call, force the call to not be a tail call.
> Hmmm.  I wonder if we have similar problems elsewhere.  For example
> expand_builtin_int_roundingfn_2, stack_protect_epilogue,
> expand_builtin_trap (though this one probably isn't broken in practice),
> expand_ifn_atomic_compare_exchange_into_call.
> 
> OK, but please check the other instances where we call expand_call, then
> continue generating code afterwards.  Fixing those can be a follow-up patch.

I guess we want an expand_call_notail helper?  Or, hrm, why are function
calls expanded as tail calls at all, should that not be decided later?


Segher

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] Fix expand_builtin_atomic_fetch_op for pre-op (PR80902)
  2017-06-23 17:44   ` Segher Boessenkool
@ 2017-06-23 17:48     ` Jeff Law
  0 siblings, 0 replies; 8+ messages in thread
From: Jeff Law @ 2017-06-23 17:48 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: gcc-patches

On 06/23/2017 11:44 AM, Segher Boessenkool wrote:
> On Thu, Jun 22, 2017 at 10:59:05PM -0600, Jeff Law wrote:
>> On 05/28/2017 06:31 AM, Segher Boessenkool wrote:
>>> __atomic_add_fetch adds a value to some memory, and returns the result.
>>> If there is no direct support for this, expand_builtin_atomic_fetch_op
>>> is asked to implement this as __atomic_fetch_add (which returns the
>>> original value of the mem), followed by the addition.  Now, the
>>> __atomic_add_fetch could have been a tail call, but we shouldn't
>>> perform the __atomic_fetch_add as a tail call: following code would
>>> not be executed, and in fact thrown away because there is a barrier
>>> after tail calls.
>>>
>>> This fixes it.
> 
>>> 	PR middle-end/80902
>>> 	* builtins.c (expand_builtin_atomic_fetch_op): If emitting code after
>>> 	a call, force the call to not be a tail call.
>> Hmmm.  I wonder if we have similar problems elsewhere.  For example
>> expand_builtin_int_roundingfn_2, stack_protect_epilogue,
>> expand_builtin_trap (though this one probably isn't broken in practice),
>> expand_ifn_atomic_compare_exchange_into_call.
>>
>> OK, but please check the other instances where we call expand_call, then
>> continue generating code afterwards.  Fixing those can be a follow-up patch.
> 
> I guess we want an expand_call_notail helper? 
Probably.

 Or, hrm, why are function
> calls expanded as tail calls at all, should that not be decided later?
That's how I thought it worked.  We create two streams of insns, then
decide later which of the two streams to use.

But I think part of the criteria for creating streams was that call was
in the tail position to start with.  And that's not the case with the
code you pointed out and the others I found.

Jeff

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] Fix expand_builtin_atomic_fetch_op for pre-op (PR80902)
  2017-06-23  4:59 ` Jeff Law
  2017-06-23 17:44   ` Segher Boessenkool
@ 2017-07-07 11:56   ` Segher Boessenkool
  2017-07-24 17:01     ` Segher Boessenkool
  1 sibling, 1 reply; 8+ messages in thread
From: Segher Boessenkool @ 2017-07-07 11:56 UTC (permalink / raw)
  To: Jeff Law; +Cc: gcc-patches

On Thu, Jun 22, 2017 at 10:59:05PM -0600, Jeff Law wrote:
> On 05/28/2017 06:31 AM, Segher Boessenkool wrote:
> > __atomic_add_fetch adds a value to some memory, and returns the result.
> > If there is no direct support for this, expand_builtin_atomic_fetch_op
> > is asked to implement this as __atomic_fetch_add (which returns the
> > original value of the mem), followed by the addition.  Now, the
> > __atomic_add_fetch could have been a tail call, but we shouldn't
> > perform the __atomic_fetch_add as a tail call: following code would
> > not be executed, and in fact thrown away because there is a barrier
> > after tail calls.

> Hmmm.  I wonder if we have similar problems elsewhere.  For example
> expand_builtin_int_roundingfn_2, stack_protect_epilogue,
> expand_builtin_trap (though this one probably isn't broken in practice),
> expand_ifn_atomic_compare_exchange_into_call.
> 
> OK, but please check the other instances where we call expand_call, then
> continue generating code afterwards.  Fixing those can be a follow-up patch.

We certainly have similar problems elsewhere.

I'm doing tests detecting whenever we create dead code (right after a
barrier); it finds a few things, mostly harmless, but there are quite
a few places where we create dead code during expand.  This will take
a while, but it will need to happen during stage 1...  I'm trying to
fit it in :-/


Segher

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] Fix expand_builtin_atomic_fetch_op for pre-op (PR80902)
  2017-07-07 11:56   ` Segher Boessenkool
@ 2017-07-24 17:01     ` Segher Boessenkool
  0 siblings, 0 replies; 8+ messages in thread
From: Segher Boessenkool @ 2017-07-24 17:01 UTC (permalink / raw)
  To: Jeff Law; +Cc: gcc-patches

On Fri, Jul 07, 2017 at 06:56:09AM -0500, Segher Boessenkool wrote:
> On Thu, Jun 22, 2017 at 10:59:05PM -0600, Jeff Law wrote:
> > On 05/28/2017 06:31 AM, Segher Boessenkool wrote:
> > > __atomic_add_fetch adds a value to some memory, and returns the result.
> > > If there is no direct support for this, expand_builtin_atomic_fetch_op
> > > is asked to implement this as __atomic_fetch_add (which returns the
> > > original value of the mem), followed by the addition.  Now, the
> > > __atomic_add_fetch could have been a tail call, but we shouldn't
> > > perform the __atomic_fetch_add as a tail call: following code would
> > > not be executed, and in fact thrown away because there is a barrier
> > > after tail calls.
> 
> > Hmmm.  I wonder if we have similar problems elsewhere.  For example
> > expand_builtin_int_roundingfn_2, stack_protect_epilogue,
> > expand_builtin_trap (though this one probably isn't broken in practice),
> > expand_ifn_atomic_compare_exchange_into_call.
> > 
> > OK, but please check the other instances where we call expand_call, then
> > continue generating code afterwards.  Fixing those can be a follow-up patch.
> 
> We certainly have similar problems elsewhere.
> 
> I'm doing tests detecting whenever we create dead code (right after a
> barrier); it finds a few things, mostly harmless, but there are quite
> a few places where we create dead code during expand.  This will take
> a while, but it will need to happen during stage 1...  I'm trying to
> fit it in :-/

The following patch bootstraps on powerpc64-linux, and builds cross to
*-linux, and builds Linux with it.  I'm not proposing it for inclusion
right now though; it is too disgusting, and at least it should only do
this with checking enabled.  It might be useful to someone though.  It
does complain about the PR80902 problem, at least.


Segher


diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index c9d8118..d424d91 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -3843,17 +3843,79 @@ expand_gimple_tailcall (basic_block bb, gcall *stmt, bool *can_fallthru)
   last = NEXT_INSN (last);
   gcc_assert (BARRIER_P (last));
 
+  bool fatal_error = false;
   *can_fallthru = false;
-  while (NEXT_INSN (last))
+  for (rtx_insn *insn = NEXT_INSN (last); insn; insn = NEXT_INSN (insn))
     {
       /* For instance an sqrt builtin expander expands if with
 	 sibcall in the then and label for `else`.  */
-      if (LABEL_P (NEXT_INSN (last)))
+      if (LABEL_P (insn))
+	break;
+
+      /* Complain if we would delete an insn that is not a move for the
+	 function return value.  */
+      if (NONDEBUG_INSN_P (insn))
+	{
+	  bool okay = false;
+	  rtx set = single_set (insn);
+	  if (set)
+	    {
+	      rtx dest = SET_DEST (set);
+	      rtx src = SET_SRC (set);
+	      if (GET_CODE (src) == SIGN_EXTEND
+		  || GET_CODE (src) == ZERO_EXTEND)
+		src = XEXP (src, 0);
+	      if ((OBJECT_P (dest) || SUBREG_P (dest))
+		  && (OBJECT_P (src) || SUBREG_P (src)))
+		okay = true;
+	      /* Not too bad so far...  but some weirder things are emitted
+		 as well, unfortunately.  */
+	      if (REG_P (dest) && REGNO (dest) == STACK_POINTER_REGNUM)
+		okay = true;
+	      /* mips does the following:  */
+	      if (REG_P (dest)
+		  && GET_CODE (src) == UNSPEC
+		  && XVECLEN (src, 0) == 1
+		  && REG_P (XVECEXP (src, 0, 0)))
+		okay = true;
+	      /* And at least tile* does this, followed by other code to
+		 calculate more stack addresses.  So let's ignore the rest
+		 of the code.  */
+	      if (GET_CODE (src) == PLUS
+		  && REG_P (dest)
+		  && REG_P (XEXP (src, 0))
+		  && IN_RANGE (REGNO (XEXP (src, 0)),
+			       FIRST_VIRTUAL_REGISTER,
+			       LAST_VIRTUAL_REGISTER))
+		break;
+	    }
+	  else if (GET_CODE (PATTERN (insn)) == CLOBBER)
+	    okay = true;
+	  if (!okay)
+	    fatal_error = true;
+	}
+    }
+
+  if (fatal_error)
+    {
+      fprintf (stderr, "DELETING:\n");
+      for (rtx_insn *insn = NEXT_INSN (last); insn; insn = NEXT_INSN (insn))
+	debug_rtx (insn);
+
+      gcc_assert (0);
+    }
+
+  for (rtx_insn *insn = NEXT_INSN (last); insn; insn = NEXT_INSN (last))
+    {
+      /* For instance an sqrt builtin expander expands if with
+	 sibcall in the then and label for `else`.  */
+      if (LABEL_P (insn))
 	{
 	  *can_fallthru = true;
 	  break;
 	}
-      delete_insn (NEXT_INSN (last));
+
+      delete_insn (insn);
     }
 
   e = make_edge (bb, EXIT_BLOCK_PTR_FOR_FN (cfun), EDGE_ABNORMAL

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2017-07-24 17:01 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-05-28 13:31 [PATCH] Fix expand_builtin_atomic_fetch_op for pre-op (PR80902) Segher Boessenkool
2017-05-31 13:31 ` Segher Boessenkool
2017-06-11  0:15   ` [ping x2] " Segher Boessenkool
2017-06-23  4:59 ` Jeff Law
2017-06-23 17:44   ` Segher Boessenkool
2017-06-23 17:48     ` Jeff Law
2017-07-07 11:56   ` Segher Boessenkool
2017-07-24 17:01     ` Segher Boessenkool

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).