public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH v2] combine: perform jump threading at the end
@ 2018-09-05 12:02 Ilya Leoshkevich
  2018-09-05 12:11 ` Richard Biener
  2018-09-14 22:11 ` Segher Boessenkool
  0 siblings, 2 replies; 8+ messages in thread
From: Ilya Leoshkevich @ 2018-09-05 12:02 UTC (permalink / raw)
  To: gcc-patches; +Cc: krebbel, rdapp, richard.guenther, segher, Ilya Leoshkevich

gcc/ChangeLog:

2018-09-05  Ilya Leoshkevich  <iii@linux.ibm.com>

	PR target/80080
	* combine.c (rest_of_handle_combine): Perform jump threading.

gcc/testsuite/ChangeLog:

2018-09-05  Ilya Leoshkevich  <iii@linux.ibm.com>

	PR target/80080
	* gcc.target/s390/pr80080-4.c: New test.
---
 gcc/combine.c                             | 10 ++++++++--
 gcc/testsuite/gcc.target/s390/pr80080-4.c | 16 ++++++++++++++++
 2 files changed, 24 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/pr80080-4.c

diff --git a/gcc/combine.c b/gcc/combine.c
index a2649b6d5a1..818b4c5b77d 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -14960,10 +14960,16 @@ rest_of_handle_combine (void)
 	free_dominance_info (CDI_DOMINATORS);
       timevar_push (TV_JUMP);
       rebuild_jump_labels (get_insns ());
-      cleanup_cfg (0);
-      timevar_pop (TV_JUMP);
     }
 
+  /* Combining insns can change basic blocks in a way that they end up
+     containing a single jump_insn. This creates an opportunity to improve code
+     with jump threading.  */
+  cleanup_cfg (CLEANUP_THREADING);
+
+  if (rebuild_jump_labels_after_combine)
+    timevar_pop (TV_JUMP);
+
   regstat_free_n_sets_and_refs ();
   return 0;
 }
diff --git a/gcc/testsuite/gcc.target/s390/pr80080-4.c b/gcc/testsuite/gcc.target/s390/pr80080-4.c
new file mode 100644
index 00000000000..91d31ec7845
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/pr80080-4.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-march=z196 -O2" } */
+
+extern void bar(int *mem);
+
+void foo4(int *mem)
+{
+  int oldval = 0;
+  if (!__atomic_compare_exchange_n (mem, (void *) &oldval, 1,
+				    1, __ATOMIC_ACQUIRE, __ATOMIC_RELAXED))
+    {
+      bar (mem);
+    }
+}
+
+/* { dg-final { scan-assembler "\n\tlt\t.*\n\tjne\t(\\.L\\d+)\n(.*\n)*\tcs\t.*\n\tber\t%r14\n\\1:\n\tjg\tbar\n" } } */
-- 
2.18.0

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] combine: perform jump threading at the end
  2018-09-05 12:02 [PATCH v2] combine: perform jump threading at the end Ilya Leoshkevich
@ 2018-09-05 12:11 ` Richard Biener
  2018-09-06 18:11   ` Jeff Law
  2018-09-14 22:11 ` Segher Boessenkool
  1 sibling, 1 reply; 8+ messages in thread
From: Richard Biener @ 2018-09-05 12:11 UTC (permalink / raw)
  To: iii; +Cc: GCC Patches, krebbel, rdapp, Segher Boessenkool

On Wed, Sep 5, 2018 at 2:01 PM Ilya Leoshkevich <iii@linux.ibm.com> wrote:
>
> gcc/ChangeLog:
>
> 2018-09-05  Ilya Leoshkevich  <iii@linux.ibm.com>
>
>         PR target/80080
>         * combine.c (rest_of_handle_combine): Perform jump threading.
>
> gcc/testsuite/ChangeLog:
>
> 2018-09-05  Ilya Leoshkevich  <iii@linux.ibm.com>
>
>         PR target/80080
>         * gcc.target/s390/pr80080-4.c: New test.
> ---
>  gcc/combine.c                             | 10 ++++++++--
>  gcc/testsuite/gcc.target/s390/pr80080-4.c | 16 ++++++++++++++++
>  2 files changed, 24 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/s390/pr80080-4.c
>
> diff --git a/gcc/combine.c b/gcc/combine.c
> index a2649b6d5a1..818b4c5b77d 100644
> --- a/gcc/combine.c
> +++ b/gcc/combine.c
> @@ -14960,10 +14960,16 @@ rest_of_handle_combine (void)
>         free_dominance_info (CDI_DOMINATORS);
>        timevar_push (TV_JUMP);
>        rebuild_jump_labels (get_insns ());
> -      cleanup_cfg (0);
> -      timevar_pop (TV_JUMP);
>      }
>
> +  /* Combining insns can change basic blocks in a way that they end up
> +     containing a single jump_insn. This creates an opportunity to improve code
> +     with jump threading.  */
> +  cleanup_cfg (CLEANUP_THREADING);
> +
> +  if (rebuild_jump_labels_after_combine)
> +    timevar_pop (TV_JUMP);

cleanup_cfg pushes its own timevar so it doesn't make sense to try covering it
with TV_JUMP.  And rebuild_jump_labels immediately pushes TV_REBUILD_JUMP.

So I suggest to remove the timevar_push/pop of TV_JUMP here.

No comment in general about the change, maybe we can detect transforms that
make jump-threading viable and conditionalize that properly?  Note the only
setter of CLEANUP_THREADING guards it with flag_thread_jumps so maybe better
do it above as well (avoids cost at -O0 for example).

Richard.

> +
>    regstat_free_n_sets_and_refs ();
>    return 0;
>  }
> diff --git a/gcc/testsuite/gcc.target/s390/pr80080-4.c b/gcc/testsuite/gcc.target/s390/pr80080-4.c
> new file mode 100644
> index 00000000000..91d31ec7845
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/s390/pr80080-4.c
> @@ -0,0 +1,16 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=z196 -O2" } */
> +
> +extern void bar(int *mem);
> +
> +void foo4(int *mem)
> +{
> +  int oldval = 0;
> +  if (!__atomic_compare_exchange_n (mem, (void *) &oldval, 1,
> +                                   1, __ATOMIC_ACQUIRE, __ATOMIC_RELAXED))
> +    {
> +      bar (mem);
> +    }
> +}
> +
> +/* { dg-final { scan-assembler "\n\tlt\t.*\n\tjne\t(\\.L\\d+)\n(.*\n)*\tcs\t.*\n\tber\t%r14\n\\1:\n\tjg\tbar\n" } } */
> --
> 2.18.0
>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] combine: perform jump threading at the end
  2018-09-05 12:11 ` Richard Biener
@ 2018-09-06 18:11   ` Jeff Law
  2018-09-06 22:39     ` Segher Boessenkool
  2018-09-10 10:58     ` Ilya Leoshkevich
  0 siblings, 2 replies; 8+ messages in thread
From: Jeff Law @ 2018-09-06 18:11 UTC (permalink / raw)
  To: Richard Biener, iii; +Cc: GCC Patches, krebbel, rdapp, Segher Boessenkool

On 09/05/2018 06:11 AM, Richard Biener wrote:
> On Wed, Sep 5, 2018 at 2:01 PM Ilya Leoshkevich <iii@linux.ibm.com> wrote:
>>
>> gcc/ChangeLog:
>>
>> 2018-09-05  Ilya Leoshkevich  <iii@linux.ibm.com>
>>
>>         PR target/80080
>>         * combine.c (rest_of_handle_combine): Perform jump threading.
>>
>> gcc/testsuite/ChangeLog:
>>
>> 2018-09-05  Ilya Leoshkevich  <iii@linux.ibm.com>
>>
>>         PR target/80080
>>         * gcc.target/s390/pr80080-4.c: New test.
>> ---
>>  gcc/combine.c                             | 10 ++++++++--
>>  gcc/testsuite/gcc.target/s390/pr80080-4.c | 16 ++++++++++++++++
>>  2 files changed, 24 insertions(+), 2 deletions(-)
>>  create mode 100644 gcc/testsuite/gcc.target/s390/pr80080-4.c
>>
>> diff --git a/gcc/combine.c b/gcc/combine.c
>> index a2649b6d5a1..818b4c5b77d 100644
>> --- a/gcc/combine.c
>> +++ b/gcc/combine.c
>> @@ -14960,10 +14960,16 @@ rest_of_handle_combine (void)
>>         free_dominance_info (CDI_DOMINATORS);
>>        timevar_push (TV_JUMP);
>>        rebuild_jump_labels (get_insns ());
>> -      cleanup_cfg (0);
>> -      timevar_pop (TV_JUMP);
>>      }
>>
>> +  /* Combining insns can change basic blocks in a way that they end up
>> +     containing a single jump_insn. This creates an opportunity to improve code
>> +     with jump threading.  */
>> +  cleanup_cfg (CLEANUP_THREADING);
>> +
>> +  if (rebuild_jump_labels_after_combine)
>> +    timevar_pop (TV_JUMP);
> 
> cleanup_cfg pushes its own timevar so it doesn't make sense to try covering it
> with TV_JUMP.  And rebuild_jump_labels immediately pushes TV_REBUILD_JUMP.
> 
> So I suggest to remove the timevar_push/pop of TV_JUMP here.
> 
> No comment in general about the change, maybe we can detect transforms that
> make jump-threading viable and conditionalize that properly?  Note the only
> setter of CLEANUP_THREADING guards it with flag_thread_jumps so maybe better
> do it above as well (avoids cost at -O0 for example).
The sad thing is I thought we'd killed the RTL jump threading code eons ago.

THe RTL jump threading code tries to prove that the target block has no
side effects and that we can statically determine the true/false
condition for the conditional branch at the end of the block.

This is (of course) much easier to do when the target block has no insns
other than the conditional branch.  So perhaps only do this when the
target block has just the conditional?

Hard to know if that'd work here since RTL wasn't posted.

Jeff

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] combine: perform jump threading at the end
  2018-09-06 18:11   ` Jeff Law
@ 2018-09-06 22:39     ` Segher Boessenkool
  2018-09-10 12:25       ` Ilya Leoshkevich
  2018-09-10 10:58     ` Ilya Leoshkevich
  1 sibling, 1 reply; 8+ messages in thread
From: Segher Boessenkool @ 2018-09-06 22:39 UTC (permalink / raw)
  To: iii; +Cc: Richard Biener, Jeff Law, GCC Patches, krebbel, rdapp

On Thu, Sep 06, 2018 at 12:11:09PM -0600, Jeff Law wrote:
> On 09/05/2018 06:11 AM, Richard Biener wrote:
> > On Wed, Sep 5, 2018 at 2:01 PM Ilya Leoshkevich <iii@linux.ibm.com> wrote:
> >> +  /* Combining insns can change basic blocks in a way that they end up
> >> +     containing a single jump_insn. This creates an opportunity to improve code
> >> +     with jump threading.  */
> >> +  cleanup_cfg (CLEANUP_THREADING);

Please show an example of when this happens.  For almost all code it does
not happen, so please don't do it always.

Does it improve code at all?  There is a reason we do not run the expensive
cfg cleanups after every pass: they are expensive.  They are only done in
some strategically chosen places.


Segher

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] combine: perform jump threading at the end
  2018-09-06 18:11   ` Jeff Law
  2018-09-06 22:39     ` Segher Boessenkool
@ 2018-09-10 10:58     ` Ilya Leoshkevich
  2018-09-14 21:41       ` Segher Boessenkool
  1 sibling, 1 reply; 8+ messages in thread
From: Ilya Leoshkevich @ 2018-09-10 10:58 UTC (permalink / raw)
  To: Jeff Law; +Cc: Richard Biener, GCC Patches, krebbel, rdapp, Segher Boessenkool



> Am 06.09.2018 um 20:11 schrieb Jeff Law <law@redhat.com>:
> 
> On 09/05/2018 06:11 AM, Richard Biener wrote:
>> On Wed, Sep 5, 2018 at 2:01 PM Ilya Leoshkevich <iii@linux.ibm.com> wrote:
>>> 
>>> gcc/ChangeLog:
>>> 
>>> 2018-09-05  Ilya Leoshkevich  <iii@linux.ibm.com>
>>> 
>>>        PR target/80080
>>>        * combine.c (rest_of_handle_combine): Perform jump threading.
>>> 
>>> gcc/testsuite/ChangeLog:
>>> 
>>> 2018-09-05  Ilya Leoshkevich  <iii@linux.ibm.com>
>>> 
>>>        PR target/80080
>>>        * gcc.target/s390/pr80080-4.c: New test.
>>> ---
>>> gcc/combine.c                             | 10 ++++++++--
>>> gcc/testsuite/gcc.target/s390/pr80080-4.c | 16 ++++++++++++++++
>>> 2 files changed, 24 insertions(+), 2 deletions(-)
>>> create mode 100644 gcc/testsuite/gcc.target/s390/pr80080-4.c
>>> 
>>> diff --git a/gcc/combine.c b/gcc/combine.c
>>> index a2649b6d5a1..818b4c5b77d 100644
>>> --- a/gcc/combine.c
>>> +++ b/gcc/combine.c
>>> @@ -14960,10 +14960,16 @@ rest_of_handle_combine (void)
>>>        free_dominance_info (CDI_DOMINATORS);
>>>       timevar_push (TV_JUMP);
>>>       rebuild_jump_labels (get_insns ());
>>> -      cleanup_cfg (0);
>>> -      timevar_pop (TV_JUMP);
>>>     }
>>> 
>>> +  /* Combining insns can change basic blocks in a way that they end up
>>> +     containing a single jump_insn. This creates an opportunity to improve code
>>> +     with jump threading.  */
>>> +  cleanup_cfg (CLEANUP_THREADING);
>>> +
>>> +  if (rebuild_jump_labels_after_combine)
>>> +    timevar_pop (TV_JUMP);
>> 
>> cleanup_cfg pushes its own timevar so it doesn't make sense to try covering it
>> with TV_JUMP.  And rebuild_jump_labels immediately pushes TV_REBUILD_JUMP.
>> 
>> So I suggest to remove the timevar_push/pop of TV_JUMP here.
>> 
>> No comment in general about the change, maybe we can detect transforms that
>> make jump-threading viable and conditionalize that properly?  Note the only
>> setter of CLEANUP_THREADING guards it with flag_thread_jumps so maybe better
>> do it above as well (avoids cost at -O0 for example).
> The sad thing is I thought we'd killed the RTL jump threading code eons ago.
Do you mean RTL jump threading is deprecated and/or we better rely on
something else to achieve the same results?
> 
> THe RTL jump threading code tries to prove that the target block has no
> side effects and that we can statically determine the true/false
> condition for the conditional branch at the end of the block.
> 
> This is (of course) much easier to do when the target block has no insns
> other than the conditional branch.  So perhaps only do this when the
> target block has just the conditional?
Sounds reasonable, I implemented this in the new patch.  The only thing
is that combine leaves (note NOTE_INSN_DELETED) around, so I needed to
also account for those.  I used side_effects_p for that.
> 
> Hard to know if that'd work here since RTL wasn't posted.
I will post the RTL with the updated patch shortly.
> 
> Jeff

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] combine: perform jump threading at the end
  2018-09-06 22:39     ` Segher Boessenkool
@ 2018-09-10 12:25       ` Ilya Leoshkevich
  0 siblings, 0 replies; 8+ messages in thread
From: Ilya Leoshkevich @ 2018-09-10 12:25 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: Richard Biener, Jeff Law, GCC Patches, krebbel, rdapp



> Am 07.09.2018 um 00:39 schrieb Segher Boessenkool <segher@kernel.crashing.org>:
> 
> On Thu, Sep 06, 2018 at 12:11:09PM -0600, Jeff Law wrote:
>> On 09/05/2018 06:11 AM, Richard Biener wrote:
>>> On Wed, Sep 5, 2018 at 2:01 PM Ilya Leoshkevich <iii@linux.ibm.com> wrote:
>>>> +  /* Combining insns can change basic blocks in a way that they end up
>>>> +     containing a single jump_insn. This creates an opportunity to improve code
>>>> +     with jump threading.  */
>>>> +  cleanup_cfg (CLEANUP_THREADING);
> 
> Please show an example of when this happens.  For almost all code it does
> not happen, so please don't do it always.
This improves the code for the following example from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80080:

extern void bar(int *mem);

void foo5(int *mem)
{
  int oldval = 0;
  __atomic_compare_exchange_n (mem, (void *) &oldval, 1,
                               1, __ATOMIC_ACQUIRE, __ATOMIC_RELAXED);
  if (oldval != 0)
    bar (mem);
}

I posted the corresponding RTL here:
https://gcc.gnu.org/ml/gcc-patches/2018-09/msg00495.html
> 
> Does it improve code at all?  There is a reason we do not run the expensive
> cfg cleanups after every pass: they are expensive.  They are only done in
> some strategically chosen places.
> 
Performance-wise, the net win is insignificant (checked with SPEC
CPU2006), but nevertheless the generated code contains less redundant
jumps.  I intended this to be a small unintrusive improvement, so it
definitely is not good if it would increase the compile times.  I will
check how this affects build times of gcc master and SPEC CPU2006
benchmarks.
> 
> Segher
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] combine: perform jump threading at the end
  2018-09-10 10:58     ` Ilya Leoshkevich
@ 2018-09-14 21:41       ` Segher Boessenkool
  0 siblings, 0 replies; 8+ messages in thread
From: Segher Boessenkool @ 2018-09-14 21:41 UTC (permalink / raw)
  To: Ilya Leoshkevich; +Cc: Jeff Law, Richard Biener, GCC Patches, krebbel, rdapp

On Mon, Sep 10, 2018 at 12:58:19PM +0200, Ilya Leoshkevich wrote:
> > Am 06.09.2018 um 20:11 schrieb Jeff Law <law@redhat.com>:
> > This is (of course) much easier to do when the target block has no insns
> > other than the conditional branch.  So perhaps only do this when the
> > target block has just the conditional?
> Sounds reasonable, I implemented this in the new patch.  The only thing
> is that combine leaves (note NOTE_INSN_DELETED) around, so I needed to
> also account for those.  I used side_effects_p for that.

Combine cannot delete insns because various things in combine refer to
insns by uid.  It could of course delete insns after it is done, hrm,
I'll have a look at that.  Either way, you'll also have lots of other
notes still hanging around (esp. with debug info).


Segher

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] combine: perform jump threading at the end
  2018-09-05 12:02 [PATCH v2] combine: perform jump threading at the end Ilya Leoshkevich
  2018-09-05 12:11 ` Richard Biener
@ 2018-09-14 22:11 ` Segher Boessenkool
  1 sibling, 0 replies; 8+ messages in thread
From: Segher Boessenkool @ 2018-09-14 22:11 UTC (permalink / raw)
  To: Ilya Leoshkevich; +Cc: gcc-patches, krebbel, rdapp, richard.guenther

On Wed, Sep 05, 2018 at 02:00:59PM +0200, Ilya Leoshkevich wrote:
> +/* { dg-final { scan-assembler "\n\tlt\t.*\n\tjne\t(\\.L\\d+)\n(.*\n)*\tcs\t.*\n\tber\t%r14\n\\1:\n\tjg\tbar\n" } } */

About this RE.  "." matches anything, also newlines, so (.*\n)* is the
same as just .* (ignoring captures).  But you probably do not want this,
so you should put (?n) in your RE (traditionally, at the start of it).

You also might want to quote the string with {} instead of "".  Read
https://www.tcl.tk/man/tcl8.4/TclCmd/Tcl.htm for why (it is short; when
you have read it, you will know *all* Tcl syntax.  Well maybe read re_syntax
as well).


Segher

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2018-09-14 21:56 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-09-05 12:02 [PATCH v2] combine: perform jump threading at the end Ilya Leoshkevich
2018-09-05 12:11 ` Richard Biener
2018-09-06 18:11   ` Jeff Law
2018-09-06 22:39     ` Segher Boessenkool
2018-09-10 12:25       ` Ilya Leoshkevich
2018-09-10 10:58     ` Ilya Leoshkevich
2018-09-14 21:41       ` Segher Boessenkool
2018-09-14 22:11 ` Segher Boessenkool

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).