public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] Improve RTL ifcvt for empty else_bb (PR rtl-optimization/80491)
@ 2017-04-25 20:01 Jakub Jelinek
  2017-04-26 11:57 ` Jakub Jelinek
  2017-04-28 18:46 ` Jeff Law
  0 siblings, 2 replies; 7+ messages in thread
From: Jakub Jelinek @ 2017-04-25 20:01 UTC (permalink / raw)
  To: Richard Biener, Jeff Law, Eric Botcazou; +Cc: gcc-patches

Hi!

The following patch is a partial fix for PR80491, an improvement for
if-conversion if there is empty else_bb.

What we do right now is in that case we only look at the immediately
preceeding (non-debug/non-note) instruction before cond_earliest
and if it is not the set of x, we just turn it into attempt to
use previous value of x rather than whatever it has been initialized to.

On the testcases from the PR, we have:
(insn 7 4 8 2 (set (reg:DI 97 [ _15+8 ])
        (const_int 0 [0])) 81 {*movdi_internal}
     (nil))
(insn 8 7 9 2 (set (reg:DI 104 [ a_11(D)->low ])
        (mem:DI (reg/v/f:DI 101 [ a ]) [2 a_11(D)->low+0 S8 A64])) 81 {*movdi_internal}
     (nil))
(insn 9 8 10 2 (set (reg:DI 105 [ b_12(D)->low ])
        (mem:DI (reg/v/f:DI 102 [ b ]) [2 b_12(D)->low+0 S8 A64])) 81 {*movdi_internal}
     (nil))
(insn 10 9 11 2 (parallel [
            (set (reg:CCC 17 flags)
                (compare:CCC (plus:DI (reg:DI 104 [ a_11(D)->low ])
                        (reg:DI 105 [ b_12(D)->low ]))
                    (reg:DI 104 [ a_11(D)->low ])))
            (set (reg:DI 103)
                (plus:DI (reg:DI 104 [ a_11(D)->low ])
                    (reg:DI 105 [ b_12(D)->low ])))
        ]) 316 {*adddi3_cc_overflow_1}
     (expr_list:REG_DEAD (reg:DI 105 [ b_12(D)->low ])
        (expr_list:REG_DEAD (reg:DI 104 [ a_11(D)->low ])
            (nil))))
(jump_insn 11 10 14 2 (set (pc)
        (if_then_else (ltu (reg:CCC 17 flags)
                (const_int 0 [0]))
            (label_ref 14)
            (pc))) 617 {*jcc_1}
     (expr_list:REG_DEAD (reg:CCC 17 flags)
        (int_list:REG_BR_PROB 4 (nil)))
 -> 14)
(code_label 14 11 35 3 3 (nil) [1 uses])
(note 35 14 15 3 [bb 3] NOTE_INSN_BASIC_BLOCK)
(insn 15 35 16 3 (set (reg:DI 97 [ _15+8 ])
        (const_int 1 [0x1])) 81 {*movdi_internal}
     (nil))
(code_label 16 15 36 4 2 (nil) [0 uses])
If insn 7 would come after insn 9 (which doesn't change the behavior, as
insn 8 and insn 9 don't clobber pseudo 97 and const_int 0 is constant),
we'd turn that into a setcc pattern, but otherwise we fail.

This patch let us search for x's setter earlier in the bb.
During testing I found that modified_in_p/modified_in_between_p don't
actually take into account calls that could change MEMs, so the patch
handles that too.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

During bootstrap/regtest it seems to trigger over 10000 times, but
my statistics collection only noted cases where it found an earlier setter
and the noce_process_if_block has been successful, has not attempted to
check if insn_b/set_b was NULL whether it would otherwise fail (but even
in cases where it doesn't fail perhaps with the patch it can generate
better code).

2017-04-25  Jakub Jelinek  <jakub@redhat.com>

	PR rtl-optimization/80491
	* ifcvt.c (noce_process_if_block): When looking for x setter
	with missing else_bb, don't check only the insn right before
	cond_earliest, but look for the last insn that x is modified in
	within the same bb.

--- gcc/ifcvt.c.jj	2017-04-24 19:28:02.151975658 +0200
+++ gcc/ifcvt.c	2017-04-25 16:22:16.760928057 +0200
@@ -3440,7 +3440,36 @@ noce_process_if_block (struct noce_if_in
     }
   else
     {
-      insn_b = prev_nonnote_nondebug_insn (if_info->cond_earliest);
+      bool call_crossed = false;
+      insn_b = if_info->cond_earliest;
+      do
+	{
+	  insn_b = prev_nonnote_nondebug_insn (insn_b);
+	  if (!insn_b
+	      || (BLOCK_FOR_INSN (insn_b)
+		  != BLOCK_FOR_INSN (if_info->cond_earliest))
+	      || modified_in_p (x, insn_b))
+	    break;
+	  /* modified_in_p does not consider calls changing MEMs,
+	     so punt on calls if x or SET_SRC (set_b) are MEMs that
+	     could be changed by calls.  */
+	  if (CALL_P (insn_b))
+	    {
+	      call_crossed = true;
+	      if (MEM_P (x) && !MEM_READONLY_P (x))
+		break;
+	    }
+	}
+      while (1);
+      if (!call_crossed)
+	for (rtx_insn *insn_c = if_info->cond_earliest; insn_c != jump;
+	     insn_c = NEXT_INSN (insn_c))
+	  if (CALL_P (insn_c))
+	    {
+	      call_crossed = true;
+	      break;
+	    }
+
       /* We're going to be moving the evaluation of B down from above
 	 COND_EARLIEST to JUMP.  Make sure the relevant data is still
 	 intact.  */
@@ -3468,6 +3497,29 @@ noce_process_if_block (struct noce_if_in
 	  insn_b = NULL;
 	  set_b = NULL_RTX;
 	}
+      else if (call_crossed)
+	{
+	  subrtx_iterator::array_type array;
+	  /* Punt if there are any non-readonly MEMs and there are any
+	     calls in between insn_b and jump.  Even for
+	     RTL_CONST_OR_PURE_CALL_P calls the call argument stack slots
+	     might be modified, so unless we are prepared to carefully
+	     analyze what might be modified and what can't, just punt.  */
+	  FOR_EACH_SUBRTX (iter, array, SET_SRC (set_b), ALL)
+	    if (MEM_P (*iter) && !MEM_READONLY_P (*iter))
+	      {
+		insn_b = NULL;
+		set_b = NULL_RTX;
+		break;
+	      }
+	  FOR_EACH_SUBRTX (iter, array, x, ALL)
+	    if (MEM_P (*iter) && !MEM_READONLY_P (*iter))
+	      {
+		insn_b = NULL;
+		set_b = NULL_RTX;
+		break;
+	      }
+	}
     }
 
   /* If x has side effects then only the if-then-else form is safe to

	Jakub

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Improve RTL ifcvt for empty else_bb (PR rtl-optimization/80491)
  2017-04-25 20:01 [PATCH] Improve RTL ifcvt for empty else_bb (PR rtl-optimization/80491) Jakub Jelinek
@ 2017-04-26 11:57 ` Jakub Jelinek
  2017-04-28 18:51   ` Jeff Law
  2017-04-29 22:23   ` Steven Bosscher
  2017-04-28 18:46 ` Jeff Law
  1 sibling, 2 replies; 7+ messages in thread
From: Jakub Jelinek @ 2017-04-26 11:57 UTC (permalink / raw)
  To: Richard Biener, Jeff Law, Eric Botcazou; +Cc: gcc-patches

On Tue, Apr 25, 2017 at 09:31:00PM +0200, Jakub Jelinek wrote:
> This patch let us search for x's setter earlier in the bb.
> During testing I found that modified_in_p/modified_in_between_p don't
> actually take into account calls that could change MEMs, so the patch
> handles that too.

Or shall we just:
--- gcc/alias.c	2017-04-25 15:51:31.072923325 +0200
+++ gcc/alias.c	2017-04-26 13:23:55.595048464 +0200
@@ -3221,6 +3221,8 @@ memory_modified_in_insn_p (const_rtx mem
 {
   if (!INSN_P (insn))
     return false;
+  if (CALL_P (insn))
+    return true;
   memory_modified = false;
   note_stores (PATTERN (insn), memory_modified_1, CONST_CAST_RTX(mem));
   return memory_modified;

instead of the call_crossed hacks?  Then modified_between_p and
modified_in_p would return true for !MEM_READONLY_P MEMs crossing a call.

	Jakub

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Improve RTL ifcvt for empty else_bb (PR rtl-optimization/80491)
  2017-04-25 20:01 [PATCH] Improve RTL ifcvt for empty else_bb (PR rtl-optimization/80491) Jakub Jelinek
  2017-04-26 11:57 ` Jakub Jelinek
@ 2017-04-28 18:46 ` Jeff Law
  1 sibling, 0 replies; 7+ messages in thread
From: Jeff Law @ 2017-04-28 18:46 UTC (permalink / raw)
  To: Jakub Jelinek, Richard Biener, Eric Botcazou; +Cc: gcc-patches

On 04/25/2017 01:31 PM, Jakub Jelinek wrote:
> Hi!
> 
> The following patch is a partial fix for PR80491, an improvement for
> if-conversion if there is empty else_bb.
> 
> What we do right now is in that case we only look at the immediately
> preceeding (non-debug/non-note) instruction before cond_earliest
> and if it is not the set of x, we just turn it into attempt to
> use previous value of x rather than whatever it has been initialized to.
> 
> On the testcases from the PR, we have:
> (insn 7 4 8 2 (set (reg:DI 97 [ _15+8 ])
>          (const_int 0 [0])) 81 {*movdi_internal}
>       (nil))
> (insn 8 7 9 2 (set (reg:DI 104 [ a_11(D)->low ])
>          (mem:DI (reg/v/f:DI 101 [ a ]) [2 a_11(D)->low+0 S8 A64])) 81 {*movdi_internal}
>       (nil))
> (insn 9 8 10 2 (set (reg:DI 105 [ b_12(D)->low ])
>          (mem:DI (reg/v/f:DI 102 [ b ]) [2 b_12(D)->low+0 S8 A64])) 81 {*movdi_internal}
>       (nil))
> (insn 10 9 11 2 (parallel [
>              (set (reg:CCC 17 flags)
>                  (compare:CCC (plus:DI (reg:DI 104 [ a_11(D)->low ])
>                          (reg:DI 105 [ b_12(D)->low ]))
>                      (reg:DI 104 [ a_11(D)->low ])))
>              (set (reg:DI 103)
>                  (plus:DI (reg:DI 104 [ a_11(D)->low ])
>                      (reg:DI 105 [ b_12(D)->low ])))
>          ]) 316 {*adddi3_cc_overflow_1}
>       (expr_list:REG_DEAD (reg:DI 105 [ b_12(D)->low ])
>          (expr_list:REG_DEAD (reg:DI 104 [ a_11(D)->low ])
>              (nil))))
> (jump_insn 11 10 14 2 (set (pc)
>          (if_then_else (ltu (reg:CCC 17 flags)
>                  (const_int 0 [0]))
>              (label_ref 14)
>              (pc))) 617 {*jcc_1}
>       (expr_list:REG_DEAD (reg:CCC 17 flags)
>          (int_list:REG_BR_PROB 4 (nil)))
>   -> 14)
> (code_label 14 11 35 3 3 (nil) [1 uses])
> (note 35 14 15 3 [bb 3] NOTE_INSN_BASIC_BLOCK)
> (insn 15 35 16 3 (set (reg:DI 97 [ _15+8 ])
>          (const_int 1 [0x1])) 81 {*movdi_internal}
>       (nil))
> (code_label 16 15 36 4 2 (nil) [0 uses])
> If insn 7 would come after insn 9 (which doesn't change the behavior, as
> insn 8 and insn 9 don't clobber pseudo 97 and const_int 0 is constant),
> we'd turn that into a setcc pattern, but otherwise we fail.
> 
> This patch let us search for x's setter earlier in the bb.
> During testing I found that modified_in_p/modified_in_between_p don't
> actually take into account calls that could change MEMs, so the patch
> handles that too.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> During bootstrap/regtest it seems to trigger over 10000 times, but
> my statistics collection only noted cases where it found an earlier setter
> and the noce_process_if_block has been successful, has not attempted to
> check if insn_b/set_b was NULL whether it would otherwise fail (but even
> in cases where it doesn't fail perhaps with the patch it can generate
> better code).
> 
> 2017-04-25  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR rtl-optimization/80491
> 	* ifcvt.c (noce_process_if_block): When looking for x setter
> 	with missing else_bb, don't check only the insn right before
> 	cond_earliest, but look for the last insn that x is modified in
> 	within the same bb.
OK.

jeff

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Improve RTL ifcvt for empty else_bb (PR rtl-optimization/80491)
  2017-04-26 11:57 ` Jakub Jelinek
@ 2017-04-28 18:51   ` Jeff Law
  2017-04-29 17:22     ` Jakub Jelinek
  2017-04-29 22:23   ` Steven Bosscher
  1 sibling, 1 reply; 7+ messages in thread
From: Jeff Law @ 2017-04-28 18:51 UTC (permalink / raw)
  To: Jakub Jelinek, Richard Biener, Eric Botcazou; +Cc: gcc-patches

On 04/26/2017 05:25 AM, Jakub Jelinek wrote:
> On Tue, Apr 25, 2017 at 09:31:00PM +0200, Jakub Jelinek wrote:
>> This patch let us search for x's setter earlier in the bb.
>> During testing I found that modified_in_p/modified_in_between_p don't
>> actually take into account calls that could change MEMs, so the patch
>> handles that too.
> 
> Or shall we just:
> --- gcc/alias.c	2017-04-25 15:51:31.072923325 +0200
> +++ gcc/alias.c	2017-04-26 13:23:55.595048464 +0200
> @@ -3221,6 +3221,8 @@ memory_modified_in_insn_p (const_rtx mem
>   {
>     if (!INSN_P (insn))
>       return false;
> +  if (CALL_P (insn))
> +    return true;
>     memory_modified = false;
>     note_stores (PATTERN (insn), memory_modified_1, CONST_CAST_RTX(mem));
>     return memory_modified;
> 
> instead of the call_crossed hacks?  Then modified_between_p and
> modified_in_p would return true for !MEM_READONLY_P MEMs crossing a call.
This seems like a good idea as well.  I'm a bit surprised we're not 
handling it this way already.  If you want to go w ith this and simplify 
the patch for 80491, that's fine too.

jeff

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Improve RTL ifcvt for empty else_bb (PR rtl-optimization/80491)
  2017-04-28 18:51   ` Jeff Law
@ 2017-04-29 17:22     ` Jakub Jelinek
  0 siblings, 0 replies; 7+ messages in thread
From: Jakub Jelinek @ 2017-04-29 17:22 UTC (permalink / raw)
  To: Jeff Law; +Cc: Richard Biener, Eric Botcazou, gcc-patches

On Fri, Apr 28, 2017 at 12:32:47PM -0600, Jeff Law wrote:
> On 04/26/2017 05:25 AM, Jakub Jelinek wrote:
> > On Tue, Apr 25, 2017 at 09:31:00PM +0200, Jakub Jelinek wrote:
> > > This patch let us search for x's setter earlier in the bb.
> > > During testing I found that modified_in_p/modified_in_between_p don't
> > > actually take into account calls that could change MEMs, so the patch
> > > handles that too.
> > 
> > Or shall we just:
> > --- gcc/alias.c	2017-04-25 15:51:31.072923325 +0200
> > +++ gcc/alias.c	2017-04-26 13:23:55.595048464 +0200
> > @@ -3221,6 +3221,8 @@ memory_modified_in_insn_p (const_rtx mem
> >   {
> >     if (!INSN_P (insn))
> >       return false;
> > +  if (CALL_P (insn))
> > +    return true;
> >     memory_modified = false;
> >     note_stores (PATTERN (insn), memory_modified_1, CONST_CAST_RTX(mem));
> >     return memory_modified;
> > 
> > instead of the call_crossed hacks?  Then modified_between_p and
> > modified_in_p would return true for !MEM_READONLY_P MEMs crossing a call.
> This seems like a good idea as well.  I'm a bit surprised we're not handling
> it this way already.  If you want to go w ith this and simplify the patch
> for 80491, that's fine too.

Ok, I went with the above patch plus the following simplified version of the
ifcvt.c change.  I've additionally bootstrapped just the alias.c change
compared to vanilla trunk and the only code generation differences (without
the ifcvt.c patch) were in alias.o from cc1* objects (because it was
patched), and all stage3 shared libraries were also identical, which means
it is unlikely it hurts a lot.

2017-04-29  Jakub Jelinek  <jakub@redhat.com>

	PR rtl-optimization/80491
	* ifcvt.c (noce_process_if_block): When looking for x setter
	with missing else_bb, don't check only the insn right before
	cond_earliest, but look for the last insn that x is modified in
	within the same bb.

--- gcc/ifcvt.c.jj	2017-04-04 19:51:33.000000000 +0200
+++ gcc/ifcvt.c	2017-04-24 16:01:09.784787699 +0200
@@ -3440,7 +3440,14 @@ noce_process_if_block (struct noce_if_in
     }
   else
     {
-      insn_b = prev_nonnote_nondebug_insn (if_info->cond_earliest);
+      insn_b = if_info->cond_earliest;
+      do
+	insn_b = prev_nonnote_nondebug_insn (insn_b);
+      while (insn_b
+	     && (BLOCK_FOR_INSN (insn_b)
+		 == BLOCK_FOR_INSN (if_info->cond_earliest))
+	     && !modified_in_p (x, insn_b));
+
       /* We're going to be moving the evaluation of B down from above
 	 COND_EARLIEST to JUMP.  Make sure the relevant data is still
 	 intact.  */


	Jakub

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Improve RTL ifcvt for empty else_bb (PR rtl-optimization/80491)
  2017-04-26 11:57 ` Jakub Jelinek
  2017-04-28 18:51   ` Jeff Law
@ 2017-04-29 22:23   ` Steven Bosscher
  2017-04-30  9:13     ` Jakub Jelinek
  1 sibling, 1 reply; 7+ messages in thread
From: Steven Bosscher @ 2017-04-29 22:23 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Richard Biener, Jeff Law, Eric Botcazou, GCC Patches

On Wed, Apr 26, 2017 at 1:25 PM, Jakub Jelinek wrote:
> Or shall we just:
> --- gcc/alias.c 2017-04-25 15:51:31.072923325 +0200
> +++ gcc/alias.c 2017-04-26 13:23:55.595048464 +0200
> @@ -3221,6 +3221,8 @@ memory_modified_in_insn_p (const_rtx mem
>  {
>    if (!INSN_P (insn))
>      return false;
> +  if (CALL_P (insn))
> +    return true;

+"&& ! RTL_CONST_OR_PURE_CALL_P (insn)" ?

Ciao!
Steven




>    memory_modified = false;
>    note_stores (PATTERN (insn), memory_modified_1, CONST_CAST_RTX(mem));
>    return memory_modified;
>
> instead of the call_crossed hacks?  Then modified_between_p and
> modified_in_p would return true for !MEM_READONLY_P MEMs crossing a call.
>
>         Jakub

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Improve RTL ifcvt for empty else_bb (PR rtl-optimization/80491)
  2017-04-29 22:23   ` Steven Bosscher
@ 2017-04-30  9:13     ` Jakub Jelinek
  0 siblings, 0 replies; 7+ messages in thread
From: Jakub Jelinek @ 2017-04-30  9:13 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: Richard Biener, Jeff Law, Eric Botcazou, GCC Patches

On Sat, Apr 29, 2017 at 10:43:52PM +0200, Steven Bosscher wrote:
> On Wed, Apr 26, 2017 at 1:25 PM, Jakub Jelinek wrote:
> > Or shall we just:
> > --- gcc/alias.c 2017-04-25 15:51:31.072923325 +0200
> > +++ gcc/alias.c 2017-04-26 13:23:55.595048464 +0200
> > @@ -3221,6 +3221,8 @@ memory_modified_in_insn_p (const_rtx mem
> >  {
> >    if (!INSN_P (insn))
> >      return false;
> > +  if (CALL_P (insn))
> > +    return true;
> 
> +"&& ! RTL_CONST_OR_PURE_CALL_P (insn)" ?

Without detailed analysis of the MEM I'm not convinced it is safe.
I believe even const or pure calls invalidate e.g. the argument stack slots,
at least in the ABIs where stack slots are owned by callee, not caller.

	Jakub

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2017-04-29 20:53 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-04-25 20:01 [PATCH] Improve RTL ifcvt for empty else_bb (PR rtl-optimization/80491) Jakub Jelinek
2017-04-26 11:57 ` Jakub Jelinek
2017-04-28 18:51   ` Jeff Law
2017-04-29 17:22     ` Jakub Jelinek
2017-04-29 22:23   ` Steven Bosscher
2017-04-30  9:13     ` Jakub Jelinek
2017-04-28 18:46 ` Jeff Law

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).