public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* Tree tail merging breaks __builtin_unreachable optimization
@ 2012-07-04 17:02 Ulrich Weigand
  2012-07-04 18:09 ` Andrew Pinski
                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Ulrich Weigand @ 2012-07-04 17:02 UTC (permalink / raw)
  To: gcc-patches; +Cc: tom

Hello,

starting with 4.7, if multiple __builtin_unreachable statements occur in
a single function, they are no longer optimized as they used to be.

For example,

int foo(int a)
{
    if (a <= 0)
        __builtin_unreachable();
    if (a > 2)
        __builtin_unreachable();

    return a > 0;
}

results in the following (ARM) code:

foo:
        cmp r0, #0
        ble .L3
        cmp r0, #2
        bgt .L3
        mov r0, #1
        bx lr
.L3:

with the label .L3 hanging off after the end of the function.

With 4.6, we instead get the expected:

foo:
        mov     r0, #1
        bx      lr


The problem seems to be an unfortunate interaction between tree and
RTL optimization passes. In 4.6, we had something like:

<bb 2>:
  if (a_1(D) <= 0)
    goto <bb 3>;
  else
    goto <bb 4>;

<bb 3>:
  __builtin_unreachable ();

<bb 4>:
  if (a_1(D) > 2)
    goto <bb 5>;
  else
    goto <bb 6>;

<bb 5>:
  __builtin_unreachable ();

<bb 6>:
  return 1;

on the tree level; during RTL expansion __builtin_unreachable expands to just a
barrier, and subsequent CFG optimization detects basic blocks containing just a
barrier and optimizes the predecessor blocks.

With 4.7, we get instead:

<bb 2>:
  if (a_1(D) <= 0)
    goto <bb 3>;
  else
    goto <bb 4>;

<bb 3>:
  __builtin_unreachable ();

<bb 4>:
  if (a_1(D) > 2)
    goto <bb 3>;
  else
    goto <bb 5>;

<bb 5>:
  return 1;

where there is just a single basic block containing __builtin_unreachable,
and multiple predecessors branching to it. Now unfortunately the RTL
optimizers detecting unreachable blocks appear to have difficulties if
such a block has multiple predecessors, and fail to optimize them.

The tree pass that merged the two blocks is a new pass called "tail merging",
which was added in the 4.7 cycle. In fact, using -fno-tree-tail-merge gets
the expected result back.

Any suggestions how to fix this?  Should tail merging detect
__builtin_unreachable and not merge such block?  Or else, should
the CFG optimizer be extended (how?) to handle unreachable blocks
with multiple predecessors better?

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  Ulrich.Weigand@de.ibm.com

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Tree tail merging breaks __builtin_unreachable optimization
  2012-07-04 17:02 Tree tail merging breaks __builtin_unreachable optimization Ulrich Weigand
@ 2012-07-04 18:09 ` Andrew Pinski
  2012-07-04 18:17 ` Steven Bosscher
  2012-07-05 12:49 ` Tom de Vries
  2 siblings, 0 replies; 16+ messages in thread
From: Andrew Pinski @ 2012-07-04 18:09 UTC (permalink / raw)
  To: Ulrich Weigand; +Cc: gcc-patches, tom

On Wed, Jul 4, 2012 at 10:02 AM, Ulrich Weigand <uweigand@de.ibm.com> wrote:
> Any suggestions how to fix this?  Should tail merging detect
> __builtin_unreachable and not merge such block?  Or else, should
> the CFG optimizer be extended (how?) to handle unreachable blocks
> with multiple predecessors better?

This bug has nothing to do with tail merging except tail merging
exposing it.  I have seen this behavior on MIPS even without tail
merging.  The problem is how __builtin_unreachable is implemented.  It
is implemented fine on the tree level but once it is expanded to RTL,
to optimize it away it depends on the unreachable basic block being
right after the conditional.

See PR 50385 #c1 which shows exactly the problem I saw while
implementing a basic block reorder on the tree level.

Thanks,
Andrew Pinski

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Tree tail merging breaks __builtin_unreachable optimization
  2012-07-04 17:02 Tree tail merging breaks __builtin_unreachable optimization Ulrich Weigand
  2012-07-04 18:09 ` Andrew Pinski
@ 2012-07-04 18:17 ` Steven Bosscher
  2012-07-05 12:44   ` Michael Matz
  2012-07-05 12:49 ` Tom de Vries
  2 siblings, 1 reply; 16+ messages in thread
From: Steven Bosscher @ 2012-07-04 18:17 UTC (permalink / raw)
  To: Ulrich Weigand; +Cc: gcc-patches, tom

On Wed, Jul 4, 2012 at 7:02 PM, Ulrich Weigand <uweigand@de.ibm.com> wrote:
> Any suggestions how to fix this?  Should tail merging detect
> __builtin_unreachable and not merge such block?

That seems to be the most straight-forward thing to do. I don't think
there are any other passes that do this kind of code merging.

Ciao!
Steven

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Tree tail merging breaks __builtin_unreachable optimization
  2012-07-04 18:17 ` Steven Bosscher
@ 2012-07-05 12:44   ` Michael Matz
  2012-07-05 12:46     ` Richard Guenther
  0 siblings, 1 reply; 16+ messages in thread
From: Michael Matz @ 2012-07-05 12:44 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: Ulrich Weigand, gcc-patches, tom

Hi,

On Wed, 4 Jul 2012, Steven Bosscher wrote:

> On Wed, Jul 4, 2012 at 7:02 PM, Ulrich Weigand <uweigand@de.ibm.com> wrote:
> > Any suggestions how to fix this?  Should tail merging detect
> > __builtin_unreachable and not merge such block?
> 
> That seems to be the most straight-forward thing to do. I don't think
> there are any other passes that do this kind of code merging.

What do we gain by delaying to remove these blocks until RTL?  AFAICS not 
much if anything.  So removing those on the tree level would make more 
sense.


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Tree tail merging breaks __builtin_unreachable optimization
  2012-07-05 12:44   ` Michael Matz
@ 2012-07-05 12:46     ` Richard Guenther
  2012-07-05 13:17       ` Michael Matz
  0 siblings, 1 reply; 16+ messages in thread
From: Richard Guenther @ 2012-07-05 12:46 UTC (permalink / raw)
  To: Michael Matz; +Cc: Steven Bosscher, Ulrich Weigand, gcc-patches, tom

On Thu, Jul 5, 2012 at 2:44 PM, Michael Matz <matz@suse.de> wrote:
> Hi,
>
> On Wed, 4 Jul 2012, Steven Bosscher wrote:
>
>> On Wed, Jul 4, 2012 at 7:02 PM, Ulrich Weigand <uweigand@de.ibm.com> wrote:
>> > Any suggestions how to fix this?  Should tail merging detect
>> > __builtin_unreachable and not merge such block?
>>
>> That seems to be the most straight-forward thing to do. I don't think
>> there are any other passes that do this kind of code merging.
>
> What do we gain by delaying to remove these blocks until RTL?  AFAICS not
> much if anything.  So removing those on the tree level would make more
> sense.

The gain is to derive assertions from the conditional guarding these blocks
and optimize using that knowledge.

Richard.

>
> Ciao,
> Michael.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Tree tail merging breaks __builtin_unreachable optimization
  2012-07-04 17:02 Tree tail merging breaks __builtin_unreachable optimization Ulrich Weigand
  2012-07-04 18:09 ` Andrew Pinski
  2012-07-04 18:17 ` Steven Bosscher
@ 2012-07-05 12:49 ` Tom de Vries
  2012-07-05 13:18   ` Richard Guenther
  2012-07-05 13:30   ` Michael Matz
  2 siblings, 2 replies; 16+ messages in thread
From: Tom de Vries @ 2012-07-05 12:49 UTC (permalink / raw)
  To: Ulrich Weigand
  Cc: gcc-patches, tom, Andrew Pinski, Steven Bosscher,
	Richard Guenther, Michael Matz

[-- Attachment #1: Type: text/plain, Size: 3244 bytes --]

On 04/07/12 19:02, Ulrich Weigand wrote:
> Any suggestions how to fix this?  Should tail merging detect
> __builtin_unreachable and not merge such block?  Or else, should
> the CFG optimizer be extended (how?) to handle unreachable blocks
> with multiple predecessors better?

Ulrich,

I extended the example as such:
...
int
foo(int a)
{
    if (a <= 0)
      {
#ifdef MERGED
      L1:
#endif
	__builtin_unreachable();
      }
    if (a > 2)
#ifdef MERGED
      goto L1;
#else
      __builtin_unreachable();
#endif
    return a > 0;
}
...

Indeed I can reproduce the problem:
...
$ gcc unreachable.c -O2 -S -o- -ftree-tail-merge
foo:
	.cfi_startproc
	testl	%edi, %edi
	jle	.L3
	cmpl	$2, %edi
	jg	.L3
	movl	$1, %eax
	ret
.L3:
	.cfi_endproc
...

And I can make the problem go away with -fno-tree-tail-merge:
...
$ gcc unreachable.c -O2 -S -o- -fno-tree-tail-merge
foo:
	.cfi_startproc
	movl	$1, %eax
	ret
	.cfi_endproc
...

But I can also reproduce the problem with -fno-tree-tail-merge by using -DMERGED:
,,,
$ gcc unreachable.c -O2 -S -o- -fno-tree-tail-merge -DMERGED
foo:
	.cfi_startproc
	testl	%edi, %edi
	jle	.L3
	cmpl	$2, %edi
	jg	.L3
	movl	$1, %eax
	ret
.L3:
	.cfi_endproc
,,,

So it seems that this is a problem of a missed optimization, triggered by, but
not exclusively triggered by -ftree-tail-merge.

How to fix the missed optimization, I'm not sure where it should be done.

I think the optimization could be done in tree-vrp. Currently the assert
expressions look like this:
...
.foo (intD.6 aD.1711)
{
  _BoolD.1693 D.1720;
  intD.6 D.1719;

  # BLOCK 2 freq:10000
  # PRED: ENTRY [100.0%]  (fallthru,exec)
  if (aD.1711_1(D) <= 0)
    goto <bb 3>;
  else
    goto <bb 4>;
  # SUCC: 3 [0.0%]  (true,exec) 4 [100.0%]  (false,exec)

  # BLOCK 3 freq:4
  # PRED: 2 [0.0%]  (true,exec)
  # VUSE <.MEMD.1722_4(D)>
  # USE = nonlocal
  # CLB = nonlocal
  __builtin_unreachableD.997 ();
  # SUCC:

  # BLOCK 4 freq:9996
  # PRED: 2 [100.0%]  (false,exec)
  aD.1711_6 = ASSERT_EXPR <aD.1711_1(D), aD.1711_1(D) > 0>;
  if (aD.1711_6 > 2)
    goto <bb 5>;
  else
    goto <bb 6>;
  # SUCC: 5 [0.0%]  (true,exec) 6 [100.0%]  (false,exec)

  # BLOCK 5 freq:4
  # PRED: 4 [0.0%]  (true,exec)
  # VUSE <.MEMD.1722_4(D)>
  # USE = nonlocal
  # CLB = nonlocal
  __builtin_unreachableD.997 ();
  # SUCC:

  # BLOCK 6 freq:9992
  # PRED: 4 [100.0%]  (false,exec)
  aD.1711_5 = ASSERT_EXPR <aD.1711_6, aD.1711_6 <= 2>;
  D.1720_2 = aD.1711_5 > 0;
  D.1719_3 = (intD.6) D.1720_2;
  # VUSE <.MEMD.1722_4(D)>
  return D.1719_3;
  # SUCC: EXIT [100.0%]

}
..

The asserts allow the return result to be optimized, but not the cfg conditions.

AFAIU, we can insert the asserts earlier. F.i., we can insert
  aD.1711_6 = ASSERT_EXPR <aD.1711_1(D), aD.1711_1(D) > 0>
before the GIMPLE_COND in bb2.

Attached proof-of-concept patch implements this, and works for this example both
with and without -DMERGED, independent of -ftree-tail-merge.

The only problem I can see with this approach is that doing this early (vrp1)
basically removes range annotations which might have had an effect later on. It
could be and idea to only do this in vrp2, which is not much earlier than
expand, were the rtl cfg optimization kicks in for the first time.

Thanks,
- Tom

[-- Attachment #2: tree-vrp-unreachable.patch --]
[-- Type: text/x-patch, Size: 1931 bytes --]

Index: gcc/tree-vrp.c
===================================================================
--- gcc/tree-vrp.c (revision 189007)
+++ gcc/tree-vrp.c (working copy)
@@ -4222,9 +4222,11 @@ register_new_assert_for (tree name, tree
 
   gcc_checking_assert (bb == NULL || e == NULL);
 
+#if 0
   if (e == NULL)
     gcc_checking_assert (gimple_code (gsi_stmt (si)) != GIMPLE_COND
 			 && gimple_code (gsi_stmt (si)) != GIMPLE_SWITCH);
+#endif
 
   /* Never build an assert comparing against an integer constant with
      TREE_OVERFLOW set.  This confuses our undefined overflow warning
@@ -4449,7 +4451,28 @@ register_edge_assert_for_2 (tree name, e
   if (live_on_edge (e, name)
       && !has_single_use (name))
     {
-      register_new_assert_for (name, name, comp_code, val, NULL, e, bsi);
+      bool only_reachable_edge = true;
+      edge_iterator ei2;
+      edge e2;
+      FOR_EACH_EDGE (e2, ei2, e->src->succs)
+	{
+	  if (e2 == e)
+	    continue;
+	  if (EDGE_COUNT (e2->dest->succs) == 0)
+	    continue;
+	  only_reachable_edge = false;
+	  break;
+	}
+
+      if (only_reachable_edge)
+	{
+	  gimple_stmt_iterator bsi2 = bsi;
+	  gsi_prev (&bsi2);
+	  register_new_assert_for (name, name, comp_code, val, e->src, NULL, bsi2);
+	  return true;
+	}
+      else
+	register_new_assert_for (name, name, comp_code, val, NULL, e, bsi);
       retval = true;
     }
 
@@ -5569,7 +5592,13 @@ process_assert_insertions_for (tree name
   /* Otherwise, we can insert right after LOC->SI iff the
      statement must not be the last statement in the block.  */
   stmt = gsi_stmt (loc->si);
-  if (!stmt_ends_bb_p (stmt))
+  if (stmt == NULL)
+    {
+      gimple_stmt_iterator si2 = gsi_last_bb (gsi_bb (loc->si));
+      gsi_insert_before (&si2, assert_stmt, GSI_SAME_STMT);
+      return false;
+    }
+  else if (!stmt_ends_bb_p (stmt))
     {
       gsi_insert_after (&loc->si, assert_stmt, GSI_SAME_STMT);
       return false;

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Tree tail merging breaks __builtin_unreachable optimization
  2012-07-05 12:46     ` Richard Guenther
@ 2012-07-05 13:17       ` Michael Matz
  0 siblings, 0 replies; 16+ messages in thread
From: Michael Matz @ 2012-07-05 13:17 UTC (permalink / raw)
  To: Richard Guenther; +Cc: Steven Bosscher, Ulrich Weigand, gcc-patches, tom

Hi,

On Thu, 5 Jul 2012, Richard Guenther wrote:

> >> On Wed, Jul 4, 2012 at 7:02 PM, Ulrich Weigand <uweigand@de.ibm.com> 
> >> wrote:
> >> > Any suggestions how to fix this?  Should tail merging detect 
> >> > __builtin_unreachable and not merge such block?
> >>
> >> That seems to be the most straight-forward thing to do. I don't think 
> >> there are any other passes that do this kind of code merging.
> >
> > What do we gain by delaying to remove these blocks until RTL?  AFAICS 
> > not much if anything.  So removing those on the tree level would make 
> > more sense.
> 
> The gain is to derive assertions from the conditional guarding these 
> blocks and optimize using that knowledge.

I know that we derive assertions, but that's no reason why we couldn't 
remove the BBs (or move it's sideeffects) in e.g. pass_fold_builtins.  It 
runs late enough that we don't make use of assertions afterwards.


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Tree tail merging breaks __builtin_unreachable optimization
  2012-07-05 12:49 ` Tom de Vries
@ 2012-07-05 13:18   ` Richard Guenther
  2012-07-05 13:30   ` Michael Matz
  1 sibling, 0 replies; 16+ messages in thread
From: Richard Guenther @ 2012-07-05 13:18 UTC (permalink / raw)
  To: Tom de Vries
  Cc: Ulrich Weigand, gcc-patches, tom, Andrew Pinski, Steven Bosscher,
	Michael Matz

On Thu, Jul 5, 2012 at 2:49 PM, Tom de Vries <Tom_deVries@mentor.com> wrote:
> On 04/07/12 19:02, Ulrich Weigand wrote:
>> Any suggestions how to fix this?  Should tail merging detect
>> __builtin_unreachable and not merge such block?  Or else, should
>> the CFG optimizer be extended (how?) to handle unreachable blocks
>> with multiple predecessors better?
>
> Ulrich,
>
> I extended the example as such:
> ...
> int
> foo(int a)
> {
>     if (a <= 0)
>       {
> #ifdef MERGED
>       L1:
> #endif
>         __builtin_unreachable();
>       }
>     if (a > 2)
> #ifdef MERGED
>       goto L1;
> #else
>       __builtin_unreachable();
> #endif
>     return a > 0;
> }
> ...
>
> Indeed I can reproduce the problem:
> ...
> $ gcc unreachable.c -O2 -S -o- -ftree-tail-merge
> foo:
>         .cfi_startproc
>         testl   %edi, %edi
>         jle     .L3
>         cmpl    $2, %edi
>         jg      .L3
>         movl    $1, %eax
>         ret
> .L3:
>         .cfi_endproc
> ...
>
> And I can make the problem go away with -fno-tree-tail-merge:
> ...
> $ gcc unreachable.c -O2 -S -o- -fno-tree-tail-merge
> foo:
>         .cfi_startproc
>         movl    $1, %eax
>         ret
>         .cfi_endproc
> ...
>
> But I can also reproduce the problem with -fno-tree-tail-merge by using -DMERGED:
> ,,,
> $ gcc unreachable.c -O2 -S -o- -fno-tree-tail-merge -DMERGED
> foo:
>         .cfi_startproc
>         testl   %edi, %edi
>         jle     .L3
>         cmpl    $2, %edi
>         jg      .L3
>         movl    $1, %eax
>         ret
> .L3:
>         .cfi_endproc
> ,,,
>
> So it seems that this is a problem of a missed optimization, triggered by, but
> not exclusively triggered by -ftree-tail-merge.
>
> How to fix the missed optimization, I'm not sure where it should be done.
>
> I think the optimization could be done in tree-vrp. Currently the assert
> expressions look like this:
> ...
> .foo (intD.6 aD.1711)
> {
>   _BoolD.1693 D.1720;
>   intD.6 D.1719;
>
>   # BLOCK 2 freq:10000
>   # PRED: ENTRY [100.0%]  (fallthru,exec)
>   if (aD.1711_1(D) <= 0)
>     goto <bb 3>;
>   else
>     goto <bb 4>;
>   # SUCC: 3 [0.0%]  (true,exec) 4 [100.0%]  (false,exec)
>
>   # BLOCK 3 freq:4
>   # PRED: 2 [0.0%]  (true,exec)
>   # VUSE <.MEMD.1722_4(D)>
>   # USE = nonlocal
>   # CLB = nonlocal
>   __builtin_unreachableD.997 ();
>   # SUCC:
>
>   # BLOCK 4 freq:9996
>   # PRED: 2 [100.0%]  (false,exec)
>   aD.1711_6 = ASSERT_EXPR <aD.1711_1(D), aD.1711_1(D) > 0>;
>   if (aD.1711_6 > 2)
>     goto <bb 5>;
>   else
>     goto <bb 6>;
>   # SUCC: 5 [0.0%]  (true,exec) 6 [100.0%]  (false,exec)
>
>   # BLOCK 5 freq:4
>   # PRED: 4 [0.0%]  (true,exec)
>   # VUSE <.MEMD.1722_4(D)>
>   # USE = nonlocal
>   # CLB = nonlocal
>   __builtin_unreachableD.997 ();
>   # SUCC:
>
>   # BLOCK 6 freq:9992
>   # PRED: 4 [100.0%]  (false,exec)
>   aD.1711_5 = ASSERT_EXPR <aD.1711_6, aD.1711_6 <= 2>;
>   D.1720_2 = aD.1711_5 > 0;
>   D.1719_3 = (intD.6) D.1720_2;
>   # VUSE <.MEMD.1722_4(D)>
>   return D.1719_3;
>   # SUCC: EXIT [100.0%]
>
> }
> ..
>
> The asserts allow the return result to be optimized, but not the cfg conditions.
>
> AFAIU, we can insert the asserts earlier. F.i., we can insert
>   aD.1711_6 = ASSERT_EXPR <aD.1711_1(D), aD.1711_1(D) > 0>
> before the GIMPLE_COND in bb2.
>
> Attached proof-of-concept patch implements this, and works for this example both
> with and without -DMERGED, independent of -ftree-tail-merge.

You don't need to do it this way - you can simply remove the CFG path
feeding __builtin_unreachable.  Putting the assert before the if looks
a bit backward, so I'd prefer to not do that.

> The only problem I can see with this approach is that doing this early (vrp1)
> basically removes range annotations which might have had an effect later on. It
> could be and idea to only do this in vrp2, which is not much earlier than
> expand, were the rtl cfg optimization kicks in for the first time.

Indeed.

Richard.

> Thanks,
> - Tom

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Tree tail merging breaks __builtin_unreachable optimization
  2012-07-05 12:49 ` Tom de Vries
  2012-07-05 13:18   ` Richard Guenther
@ 2012-07-05 13:30   ` Michael Matz
  2012-07-05 18:46     ` Tom de Vries
  1 sibling, 1 reply; 16+ messages in thread
From: Michael Matz @ 2012-07-05 13:30 UTC (permalink / raw)
  To: Tom de Vries
  Cc: Ulrich Weigand, gcc-patches, tom, Andrew Pinski, Steven Bosscher,
	Richard Guenther

Hi,

On Thu, 5 Jul 2012, Tom de Vries wrote:

> The asserts allow the return result to be optimized, but not the cfg 
> conditions.
> 
> AFAIU, we can insert the asserts earlier. F.i., we can insert
>   aD.1711_6 = ASSERT_EXPR <aD.1711_1(D), aD.1711_1(D) > 0>
> before the GIMPLE_COND in bb2.

Nope.  That would require some more checks, in particular that the BB 
containing builtin_unreachable doesn't contain any other side-effects.  
Given this:

if (i < 0)
  { do_something_interesting();
    __builtin_unreachable();
  }

moving the assert before the if would remove the if condition, hence 
the call to do_something_interesting.  You need to retain side-effects if 
there are any.


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Tree tail merging breaks __builtin_unreachable optimization
  2012-07-05 13:30   ` Michael Matz
@ 2012-07-05 18:46     ` Tom de Vries
  2012-07-06 11:02       ` Richard Guenther
  0 siblings, 1 reply; 16+ messages in thread
From: Tom de Vries @ 2012-07-05 18:46 UTC (permalink / raw)
  To: Michael Matz
  Cc: Ulrich Weigand, gcc-patches, tom, Andrew Pinski, Steven Bosscher,
	Richard Guenther

[-- Attachment #1: Type: text/plain, Size: 978 bytes --]

On 05/07/12 15:30, Michael Matz wrote:
> Hi,
> 
> On Thu, 5 Jul 2012, Tom de Vries wrote:
> 
>> The asserts allow the return result to be optimized, but not the cfg 
>> conditions.
>>
>> AFAIU, we can insert the asserts earlier. F.i., we can insert
>>   aD.1711_6 = ASSERT_EXPR <aD.1711_1(D), aD.1711_1(D) > 0>
>> before the GIMPLE_COND in bb2.
> 
> Nope.  That would require some more checks, in particular that the BB 
> containing builtin_unreachable doesn't contain any other side-effects.  
> Given this:
> 
> if (i < 0)
>   { do_something_interesting();
>     __builtin_unreachable();
>   }
> 
> moving the assert before the if would remove the if condition, hence 
> the call to do_something_interesting.  You need to retain side-effects if 
> there are any.
> 

Michael,

Thanks for pointing that out.

I tried a first stab at your suggestion of implementing the optimization in
pass_fold_builtins, it works for the test-case.

Thanks,
- Tom

> 
> Ciao,
> Michael.
> 



[-- Attachment #2: ccp-unreachable.patch --]
[-- Type: text/x-patch, Size: 1775 bytes --]

Index: gcc/tree-ssa-ccp.c
===================================================================
--- gcc/tree-ssa-ccp.c (revision 189007)
+++ gcc/tree-ssa-ccp.c (working copy)
@@ -2318,6 +2318,44 @@ optimize_stdarg_builtin (gimple call)
     }
 }
 
+/* Return false if there are staments with side-effects before I.  Otherwise,
+   return true and make the block of I unreachable.  */
+
+static bool
+optimize_unreachable (gimple_stmt_iterator i)
+{
+  gimple stmt;
+  basic_block bb;
+  edge_iterator ei;
+  edge e;
+
+  for (gsi_prev (&i); !gsi_end_p (i); gsi_prev (&i))
+    {
+      stmt = gsi_stmt (i);
+      if (gimple_has_side_effects (stmt))
+	return false;
+    }
+
+  bb = gsi_bb (i);
+
+  FOR_EACH_EDGE (e, ei, bb->preds)
+    {
+      basic_block src = e->src;
+      gimple stmt;
+      i = gsi_last_bb (src);
+      stmt = gsi_stmt (i);
+      gcc_assert (gimple_code (stmt) == GIMPLE_COND);
+      if (e->flags & EDGE_TRUE_VALUE)
+	gimple_cond_make_false (stmt);
+      else if (e->flags & EDGE_FALSE_VALUE)
+	gimple_cond_make_true (stmt);
+      else
+	gcc_unreachable ();
+    }
+
+  return true;
+}
+
 /* A simple pass that attempts to fold all builtin functions.  This pass
    is run after we've propagated as many constants as we can.  */
 
@@ -2379,6 +2417,11 @@ execute_fold_all_builtins (void)
 		gsi_next (&i);
 		continue;
 
+	      case BUILT_IN_UNREACHABLE:
+		if (optimize_unreachable (i))
+		  cfg_changed = true;
+		break;
+
 	      case BUILT_IN_VA_START:
 	      case BUILT_IN_VA_END:
 	      case BUILT_IN_VA_COPY:
@@ -2393,6 +2436,9 @@ execute_fold_all_builtins (void)
 		continue;
 	      }
 
+	  if (result == NULL_TREE)
+	    break;
+
 	  if (dump_file && (dump_flags & TDF_DETAILS))
 	    {
 	      fprintf (dump_file, "Simplified\n  ");

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Tree tail merging breaks __builtin_unreachable optimization
  2012-07-05 18:46     ` Tom de Vries
@ 2012-07-06 11:02       ` Richard Guenther
  2012-07-06 16:37         ` Tom de Vries
  0 siblings, 1 reply; 16+ messages in thread
From: Richard Guenther @ 2012-07-06 11:02 UTC (permalink / raw)
  To: Tom de Vries
  Cc: Michael Matz, Ulrich Weigand, gcc-patches, tom, Andrew Pinski,
	Steven Bosscher

On Thu, Jul 5, 2012 at 8:45 PM, Tom de Vries <Tom_deVries@mentor.com> wrote:
> On 05/07/12 15:30, Michael Matz wrote:
>> Hi,
>>
>> On Thu, 5 Jul 2012, Tom de Vries wrote:
>>
>>> The asserts allow the return result to be optimized, but not the cfg
>>> conditions.
>>>
>>> AFAIU, we can insert the asserts earlier. F.i., we can insert
>>>   aD.1711_6 = ASSERT_EXPR <aD.1711_1(D), aD.1711_1(D) > 0>
>>> before the GIMPLE_COND in bb2.
>>
>> Nope.  That would require some more checks, in particular that the BB
>> containing builtin_unreachable doesn't contain any other side-effects.
>> Given this:
>>
>> if (i < 0)
>>   { do_something_interesting();
>>     __builtin_unreachable();
>>   }
>>
>> moving the assert before the if would remove the if condition, hence
>> the call to do_something_interesting.  You need to retain side-effects if
>> there are any.
>>
>
> Michael,
>
> Thanks for pointing that out.
>
> I tried a first stab at your suggestion of implementing the optimization in
> pass_fold_builtins, it works for the test-case.

+static bool
+optimize_unreachable (gimple_stmt_iterator i)
+{
+  gimple stmt;
+  basic_block bb;
+  edge_iterator ei;
+  edge e;
+
+  for (gsi_prev (&i); !gsi_end_p (i); gsi_prev (&i))
+    {
+      stmt = gsi_stmt (i);
+      if (gimple_has_side_effects (stmt))
+       return false;
+    }

I think we should rely on DCE to remove stmts without side-effects before
__builtin_unreachable.  Thus I'd make this

  basic_block bb = gsi_bb (i);
  for (gsi = gsi_start (bb); !gsi_end_p (i); gsi_next (&gsi))
    {
      if (is_gimple_debug ())
       continue;
      if (gimple_code () == GIMPLE_LABEL)
        /* Verify we do not need to preserve the label.  */;
      if (gsi_stmt () != gsi_stmt (i))
        return false;
    }

  ...

thus simply require the builtin be the first statement in the block.

As for the label I'm concerned about

void foo (int b, int c)
{
  void *x = &&lab;
  if (b)
    {
lab:
      __builtin_unreachable ();
    }
lab2:
  if (c)
    x = &&lab2;
  goto *x;
}

non-sensical, of course, but "valid".

Richard.

> Thanks,
> - Tom
>
>>
>> Ciao,
>> Michael.
>>
>
>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Tree tail merging breaks __builtin_unreachable optimization
  2012-07-06 11:02       ` Richard Guenther
@ 2012-07-06 16:37         ` Tom de Vries
  2012-07-09  8:10           ` Richard Guenther
  2012-07-09 20:36           ` Tree tail merging breaks __builtin_unreachable optimization Ulrich Weigand
  0 siblings, 2 replies; 16+ messages in thread
From: Tom de Vries @ 2012-07-06 16:37 UTC (permalink / raw)
  To: Richard Guenther
  Cc: Michael Matz, Ulrich Weigand, gcc-patches, tom, Andrew Pinski,
	Steven Bosscher

[-- Attachment #1: Type: text/plain, Size: 2754 bytes --]

On 06/07/12 13:01, Richard Guenther wrote:
> On Thu, Jul 5, 2012 at 8:45 PM, Tom de Vries <Tom_deVries@mentor.com> wrote:
>> On 05/07/12 15:30, Michael Matz wrote:
>>> Hi,
>>>
>>> On Thu, 5 Jul 2012, Tom de Vries wrote:
>>>
>>>> The asserts allow the return result to be optimized, but not the cfg
>>>> conditions.
>>>>
>>>> AFAIU, we can insert the asserts earlier. F.i., we can insert
>>>>   aD.1711_6 = ASSERT_EXPR <aD.1711_1(D), aD.1711_1(D) > 0>
>>>> before the GIMPLE_COND in bb2.
>>>
>>> Nope.  That would require some more checks, in particular that the BB
>>> containing builtin_unreachable doesn't contain any other side-effects.
>>> Given this:
>>>
>>> if (i < 0)
>>>   { do_something_interesting();
>>>     __builtin_unreachable();
>>>   }
>>>
>>> moving the assert before the if would remove the if condition, hence
>>> the call to do_something_interesting.  You need to retain side-effects if
>>> there are any.
>>>
>>
>> Michael,
>>
>> Thanks for pointing that out.
>>
>> I tried a first stab at your suggestion of implementing the optimization in
>> pass_fold_builtins, it works for the test-case.
> 
> +static bool
> +optimize_unreachable (gimple_stmt_iterator i)
> +{
> +  gimple stmt;
> +  basic_block bb;
> +  edge_iterator ei;
> +  edge e;
> +
> +  for (gsi_prev (&i); !gsi_end_p (i); gsi_prev (&i))
> +    {
> +      stmt = gsi_stmt (i);
> +      if (gimple_has_side_effects (stmt))
> +       return false;
> +    }
> 
> I think we should rely on DCE to remove stmts without side-effects before
> __builtin_unreachable.  Thus I'd make this
> 
>   basic_block bb = gsi_bb (i);
>   for (gsi = gsi_start (bb); !gsi_end_p (i); gsi_next (&gsi))
>     {
>       if (is_gimple_debug ())
>        continue;
>       if (gimple_code () == GIMPLE_LABEL)
>         /* Verify we do not need to preserve the label.  */;
>       if (gsi_stmt () != gsi_stmt (i))
>         return false;
>     }
> 
>   ...
> 
> thus simply require the builtin be the first statement in the block.

Done.

> 
> As for the label I'm concerned about
> 
> void foo (int b, int c)
> {
>   void *x = &&lab;
>   if (b)
>     {
> lab:
>       __builtin_unreachable ();
>     }
> lab2:
>   if (c)
>     x = &&lab2;
>   goto *x;
> }
> 
> non-sensical, of course, but "valid".
> 

Added this example as test-case.

Bootstrapped and reg-tested (ada inclusive) on x86_64.

OK for trunk?

Thanks,
- Tom


2012-07-06  Tom de Vries  <tom@codesourcery.com>
	    Richard Guenther  <rguenther@suse.de>

	* tree-ssa-ccp.c (optimize_unreachable): New function.
	(execute_fold_all_builtins): Use optimize_unreachable to optimize
	BUILT_IN_UNREACHABLE.  Don't optimize after BUILT_IN_UNREACHABLE.

	* gcc.dg/builtin-unreachable-6.c: New test.
	* gcc.dg/builtin-unreachable-5.c: New test.

[-- Attachment #2: ccp-unreachable.2.patch --]
[-- Type: text/x-patch, Size: 3780 bytes --]

Index: gcc/tree-ssa-ccp.c
===================================================================
--- gcc/tree-ssa-ccp.c (revision 189007)
+++ gcc/tree-ssa-ccp.c (working copy)
@@ -2318,6 +2318,69 @@ optimize_stdarg_builtin (gimple call)
     }
 }
 
+/* Attemp to make the block of __builtin_unreachable I unreachable by changing
+   the incoming jumps.  Return true if at least one jump was changed.  */
+
+static bool
+optimize_unreachable (gimple_stmt_iterator i)
+{
+  basic_block bb = gsi_bb (i);
+  gimple_stmt_iterator gsi;
+  gimple stmt;
+  edge_iterator ei;
+  edge e;
+  bool ret;
+
+  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+    {
+      stmt = gsi_stmt (gsi);
+
+      if (is_gimple_debug (stmt))
+       continue;
+
+      if (gimple_code (stmt) == GIMPLE_LABEL)
+	{
+	  /* Verify we do not need to preserve the label.  */
+	  if (FORCED_LABEL (gimple_label_label (stmt)))
+	    return false;
+
+	  continue;
+	}
+
+      /* Only handle the case that __builtin_unreachable is the first statement
+	 in the block.  We rely on DCE to remove stmts without side-effects
+	 before __builtin_unreachable.  */
+      if (gsi_stmt (gsi) != gsi_stmt (i))
+        return false;
+    }
+
+  ret = false;
+  FOR_EACH_EDGE (e, ei, bb->preds)
+    {
+      gsi = gsi_last_bb (e->src);
+      stmt = gsi_stmt (gsi);
+
+      if (stmt && gimple_code (stmt) == GIMPLE_COND)
+	{
+	  if (e->flags & EDGE_TRUE_VALUE)
+	    gimple_cond_make_false (stmt);
+	  else if (e->flags & EDGE_FALSE_VALUE)
+	    gimple_cond_make_true (stmt);
+	  else
+	    gcc_unreachable ();
+	}
+      else
+	{
+	  /* Todo: handle other cases, f.i. switch statement.  */
+	  continue;
+	}
+
+      ret = true;
+    }
+
+  return ret;
+}
+
 /* A simple pass that attempts to fold all builtin functions.  This pass
    is run after we've propagated as many constants as we can.  */
 
@@ -2379,6 +2442,11 @@ execute_fold_all_builtins (void)
 		gsi_next (&i);
 		continue;
 
+	      case BUILT_IN_UNREACHABLE:
+		if (optimize_unreachable (i))
+		  cfg_changed = true;
+		break;
+
 	      case BUILT_IN_VA_START:
 	      case BUILT_IN_VA_END:
 	      case BUILT_IN_VA_COPY:
@@ -2393,6 +2461,9 @@ execute_fold_all_builtins (void)
 		continue;
 	      }
 
+	  if (result == NULL_TREE)
+	    break;
+
 	  if (dump_file && (dump_flags & TDF_DETAILS))
 	    {
 	      fprintf (dump_file, "Simplified\n  ");
Index: gcc/testsuite/gcc.dg/builtin-unreachable-6.c
===================================================================
--- /dev/null (new file)
+++ gcc/testsuite/gcc.dg/builtin-unreachable-6.c (revision 0)
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-fab" } */
+
+void
+foo (int b, int c)
+{
+  void *x = &&lab;
+  if (b)
+    {
+lab:
+      __builtin_unreachable ();
+    }
+lab2:
+  if (c)
+    x = &&lab2;
+  goto *x;
+}
+
+/* { dg-final { scan-tree-dump-times "lab:" 1 "fab" } } */
+/* { dg-final { scan-tree-dump-times "__builtin_unreachable" 1 "fab" } } */
+/* { dg-final { cleanup-tree-dump "fab" } } */
Index: gcc/testsuite/gcc.dg/builtin-unreachable-5.c
===================================================================
--- /dev/null (new file)
+++ gcc/testsuite/gcc.dg/builtin-unreachable-5.c (revision 0)
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-fab" } */
+
+int
+foo (int a)
+{
+  if (a <= 0)
+    {
+    L1:
+      __builtin_unreachable ();
+    }
+
+  if (a > 2)
+    goto L1;
+
+  return a > 0;
+}
+
+/* { dg-final { scan-tree-dump-times "if \\(" 0 "fab" } } */
+/* { dg-final { scan-tree-dump-times "goto" 0 "fab" } } */
+/* { dg-final { scan-tree-dump-times "L1:" 0 "fab" } } */
+/* { dg-final { scan-tree-dump-times "__builtin_unreachable" 0 "fab" } } */
+/* { dg-final { cleanup-tree-dump "fab" } } */

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Tree tail merging breaks __builtin_unreachable optimization
  2012-07-06 16:37         ` Tom de Vries
@ 2012-07-09  8:10           ` Richard Guenther
  2012-07-16 13:56             ` [RFC] 4.7 backport crashes (was: Re: Tree tail merging breaks __builtin_unreachable optimization) Ulrich Weigand
  2012-07-09 20:36           ` Tree tail merging breaks __builtin_unreachable optimization Ulrich Weigand
  1 sibling, 1 reply; 16+ messages in thread
From: Richard Guenther @ 2012-07-09  8:10 UTC (permalink / raw)
  To: Tom de Vries
  Cc: Michael Matz, Ulrich Weigand, gcc-patches, tom, Andrew Pinski,
	Steven Bosscher

On Fri, Jul 6, 2012 at 6:36 PM, Tom de Vries <Tom_deVries@mentor.com> wrote:
> On 06/07/12 13:01, Richard Guenther wrote:
>> On Thu, Jul 5, 2012 at 8:45 PM, Tom de Vries <Tom_deVries@mentor.com> wrote:
>>> On 05/07/12 15:30, Michael Matz wrote:
>>>> Hi,
>>>>
>>>> On Thu, 5 Jul 2012, Tom de Vries wrote:
>>>>
>>>>> The asserts allow the return result to be optimized, but not the cfg
>>>>> conditions.
>>>>>
>>>>> AFAIU, we can insert the asserts earlier. F.i., we can insert
>>>>>   aD.1711_6 = ASSERT_EXPR <aD.1711_1(D), aD.1711_1(D) > 0>
>>>>> before the GIMPLE_COND in bb2.
>>>>
>>>> Nope.  That would require some more checks, in particular that the BB
>>>> containing builtin_unreachable doesn't contain any other side-effects.
>>>> Given this:
>>>>
>>>> if (i < 0)
>>>>   { do_something_interesting();
>>>>     __builtin_unreachable();
>>>>   }
>>>>
>>>> moving the assert before the if would remove the if condition, hence
>>>> the call to do_something_interesting.  You need to retain side-effects if
>>>> there are any.
>>>>
>>>
>>> Michael,
>>>
>>> Thanks for pointing that out.
>>>
>>> I tried a first stab at your suggestion of implementing the optimization in
>>> pass_fold_builtins, it works for the test-case.
>>
>> +static bool
>> +optimize_unreachable (gimple_stmt_iterator i)
>> +{
>> +  gimple stmt;
>> +  basic_block bb;
>> +  edge_iterator ei;
>> +  edge e;
>> +
>> +  for (gsi_prev (&i); !gsi_end_p (i); gsi_prev (&i))
>> +    {
>> +      stmt = gsi_stmt (i);
>> +      if (gimple_has_side_effects (stmt))
>> +       return false;
>> +    }
>>
>> I think we should rely on DCE to remove stmts without side-effects before
>> __builtin_unreachable.  Thus I'd make this
>>
>>   basic_block bb = gsi_bb (i);
>>   for (gsi = gsi_start (bb); !gsi_end_p (i); gsi_next (&gsi))
>>     {
>>       if (is_gimple_debug ())
>>        continue;
>>       if (gimple_code () == GIMPLE_LABEL)
>>         /* Verify we do not need to preserve the label.  */;
>>       if (gsi_stmt () != gsi_stmt (i))
>>         return false;
>>     }
>>
>>   ...
>>
>> thus simply require the builtin be the first statement in the block.
>
> Done.
>
>>
>> As for the label I'm concerned about
>>
>> void foo (int b, int c)
>> {
>>   void *x = &&lab;
>>   if (b)
>>     {
>> lab:
>>       __builtin_unreachable ();
>>     }
>> lab2:
>>   if (c)
>>     x = &&lab2;
>>   goto *x;
>> }
>>
>> non-sensical, of course, but "valid".
>>
>
> Added this example as test-case.
>
> Bootstrapped and reg-tested (ada inclusive) on x86_64.
>
> OK for trunk?

Ok.
Thanks,
Richard.

> Thanks,
> - Tom
>
>
> 2012-07-06  Tom de Vries  <tom@codesourcery.com>
>             Richard Guenther  <rguenther@suse.de>
>
>         * tree-ssa-ccp.c (optimize_unreachable): New function.
>         (execute_fold_all_builtins): Use optimize_unreachable to optimize
>         BUILT_IN_UNREACHABLE.  Don't optimize after BUILT_IN_UNREACHABLE.
>
>         * gcc.dg/builtin-unreachable-6.c: New test.
>         * gcc.dg/builtin-unreachable-5.c: New test.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Tree tail merging breaks __builtin_unreachable optimization
  2012-07-06 16:37         ` Tom de Vries
  2012-07-09  8:10           ` Richard Guenther
@ 2012-07-09 20:36           ` Ulrich Weigand
  1 sibling, 0 replies; 16+ messages in thread
From: Ulrich Weigand @ 2012-07-09 20:36 UTC (permalink / raw)
  To: Tom de Vries
  Cc: Richard Guenther, Michael Matz, gcc-patches, tom, Andrew Pinski,
	Steven Bosscher

Tom de Vries wrote:

> 2012-07-06  Tom de Vries  <tom@codesourcery.com>
> 	    Richard Guenther  <rguenther@suse.de>
> 
> 	* tree-ssa-ccp.c (optimize_unreachable): New function.
> 	(execute_fold_all_builtins): Use optimize_unreachable to optimize
> 	BUILT_IN_UNREACHABLE.  Don't optimize after BUILT_IN_UNREACHABLE.
> 
> 	* gcc.dg/builtin-unreachable-6.c: New test.
> 	* gcc.dg/builtin-unreachable-5.c: New test.

Many thanks for taking care of this!

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  Ulrich.Weigand@de.ibm.com

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [RFC] 4.7 backport crashes (was: Re: Tree tail merging breaks __builtin_unreachable optimization)
  2012-07-09  8:10           ` Richard Guenther
@ 2012-07-16 13:56             ` Ulrich Weigand
  2012-07-16 14:11               ` Richard Guenther
  0 siblings, 1 reply; 16+ messages in thread
From: Ulrich Weigand @ 2012-07-16 13:56 UTC (permalink / raw)
  To: Richard Guenther
  Cc: Tom de Vries, Michael Matz, gcc-patches, tom, Andrew Pinski,
	Steven Bosscher

Richard Guenther wrote:
> On Fri, Jul 6, 2012 at 6:36 PM, Tom de Vries <Tom_deVries@mentor.com> wrote:
> > Bootstrapped and reg-tested (ada inclusive) on x86_64.
> >
> > OK for trunk?
> 
> Ok.
> Thanks,
> Richard.
> 
> > 2012-07-06  Tom de Vries  <tom@codesourcery.com>
> >             Richard Guenther  <rguenther@suse.de>
> >
> >         * tree-ssa-ccp.c (optimize_unreachable): New function.
> >         (execute_fold_all_builtins): Use optimize_unreachable to optimize
> >         BUILT_IN_UNREACHABLE.  Don't optimize after BUILT_IN_UNREACHABLE.
> >
> >         * gcc.dg/builtin-unreachable-6.c: New test.
> >         * gcc.dg/builtin-unreachable-5.c: New test.


When attempting to backport this patch to our 4.7 branch, I ran into
segmentation faults.  It turns out that at least in 4.7, gsi_stmt
crashes when passed an empty gsi (for which gsi_end_p is true);
on mainline, gsi_stmt simply returns NULL instead.

Now I understand that even on mainline, we're still supposed to check
gsi_end_p before calling gsi_stmt.  The patch below updates
tree-ssa-ccp.c:optimize_unreachable to do that.  In the backport
this fixes the crashes.

This doesn't really have any effect on behaviour on mainline.  Should
it be installed anyway?

(Tested on mainline on i386-linux with no regressions.)

In addition, I was wondering whether we should backport Tom's patch
(including the fix below) to the FSF 4.7 branch: it does fix a
(performance) regression, in the sense that the original testcase
calling __builtin_unreachable multiple times was optimized well
until 4.6, and is now again optimized well on mainline, but it
generates quite bad code on 4.7 at the moment ...

Bye,
Ulrich


ChangeLog:

	* tree-ssa-ccp.c (optimize_unreachable): Check gsi_end_p
	before calling gsi_stmt.

Index: gcc/tree-ssa-ccp.c
===================================================================
*** gcc/tree-ssa-ccp.c	(revision 189459)
--- gcc/tree-ssa-ccp.c	(working copy)
*************** optimize_unreachable (gimple_stmt_iterat
*** 2358,2366 ****
    FOR_EACH_EDGE (e, ei, bb->preds)
      {
        gsi = gsi_last_bb (e->src);
!       stmt = gsi_stmt (gsi);
  
!       if (stmt && gimple_code (stmt) == GIMPLE_COND)
  	{
  	  if (e->flags & EDGE_TRUE_VALUE)
  	    gimple_cond_make_false (stmt);
--- 2358,2368 ----
    FOR_EACH_EDGE (e, ei, bb->preds)
      {
        gsi = gsi_last_bb (e->src);
!       if (gsi_end_p (gsi))
! 	continue;
  
!       stmt = gsi_stmt (gsi);
!       if (gimple_code (stmt) == GIMPLE_COND)
  	{
  	  if (e->flags & EDGE_TRUE_VALUE)
  	    gimple_cond_make_false (stmt);


-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  Ulrich.Weigand@de.ibm.com

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC] 4.7 backport crashes (was: Re: Tree tail merging breaks __builtin_unreachable optimization)
  2012-07-16 13:56             ` [RFC] 4.7 backport crashes (was: Re: Tree tail merging breaks __builtin_unreachable optimization) Ulrich Weigand
@ 2012-07-16 14:11               ` Richard Guenther
  0 siblings, 0 replies; 16+ messages in thread
From: Richard Guenther @ 2012-07-16 14:11 UTC (permalink / raw)
  To: Ulrich Weigand
  Cc: Tom de Vries, Michael Matz, gcc-patches, tom, Andrew Pinski,
	Steven Bosscher

On Mon, Jul 16, 2012 at 3:55 PM, Ulrich Weigand <uweigand@de.ibm.com> wrote:
> Richard Guenther wrote:
>> On Fri, Jul 6, 2012 at 6:36 PM, Tom de Vries <Tom_deVries@mentor.com> wrote:
>> > Bootstrapped and reg-tested (ada inclusive) on x86_64.
>> >
>> > OK for trunk?
>>
>> Ok.
>> Thanks,
>> Richard.
>>
>> > 2012-07-06  Tom de Vries  <tom@codesourcery.com>
>> >             Richard Guenther  <rguenther@suse.de>
>> >
>> >         * tree-ssa-ccp.c (optimize_unreachable): New function.
>> >         (execute_fold_all_builtins): Use optimize_unreachable to optimize
>> >         BUILT_IN_UNREACHABLE.  Don't optimize after BUILT_IN_UNREACHABLE.
>> >
>> >         * gcc.dg/builtin-unreachable-6.c: New test.
>> >         * gcc.dg/builtin-unreachable-5.c: New test.
>
>
> When attempting to backport this patch to our 4.7 branch, I ran into
> segmentation faults.  It turns out that at least in 4.7, gsi_stmt
> crashes when passed an empty gsi (for which gsi_end_p is true);
> on mainline, gsi_stmt simply returns NULL instead.
>
> Now I understand that even on mainline, we're still supposed to check
> gsi_end_p before calling gsi_stmt.  The patch below updates
> tree-ssa-ccp.c:optimize_unreachable to do that.  In the backport
> this fixes the crashes.
>
> This doesn't really have any effect on behaviour on mainline.  Should
> it be installed anyway?

Yes please.

> (Tested on mainline on i386-linux with no regressions.)
>
> In addition, I was wondering whether we should backport Tom's patch
> (including the fix below) to the FSF 4.7 branch: it does fix a
> (performance) regression, in the sense that the original testcase
> calling __builtin_unreachable multiple times was optimized well
> until 4.6, and is now again optimized well on mainline, but it
> generates quite bad code on 4.7 at the moment ...

I'm not sure.

Thanks,
Richard.

> Bye,
> Ulrich
>
>
> ChangeLog:
>
>         * tree-ssa-ccp.c (optimize_unreachable): Check gsi_end_p
>         before calling gsi_stmt.
>
> Index: gcc/tree-ssa-ccp.c
> ===================================================================
> *** gcc/tree-ssa-ccp.c  (revision 189459)
> --- gcc/tree-ssa-ccp.c  (working copy)
> *************** optimize_unreachable (gimple_stmt_iterat
> *** 2358,2366 ****
>     FOR_EACH_EDGE (e, ei, bb->preds)
>       {
>         gsi = gsi_last_bb (e->src);
> !       stmt = gsi_stmt (gsi);
>
> !       if (stmt && gimple_code (stmt) == GIMPLE_COND)
>         {
>           if (e->flags & EDGE_TRUE_VALUE)
>             gimple_cond_make_false (stmt);
> --- 2358,2368 ----
>     FOR_EACH_EDGE (e, ei, bb->preds)
>       {
>         gsi = gsi_last_bb (e->src);
> !       if (gsi_end_p (gsi))
> !       continue;
>
> !       stmt = gsi_stmt (gsi);
> !       if (gimple_code (stmt) == GIMPLE_COND)
>         {
>           if (e->flags & EDGE_TRUE_VALUE)
>             gimple_cond_make_false (stmt);
>
>
> --
>   Dr. Ulrich Weigand
>   GNU Toolchain for Linux on System z and Cell BE
>   Ulrich.Weigand@de.ibm.com
>

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2012-07-16 14:11 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-07-04 17:02 Tree tail merging breaks __builtin_unreachable optimization Ulrich Weigand
2012-07-04 18:09 ` Andrew Pinski
2012-07-04 18:17 ` Steven Bosscher
2012-07-05 12:44   ` Michael Matz
2012-07-05 12:46     ` Richard Guenther
2012-07-05 13:17       ` Michael Matz
2012-07-05 12:49 ` Tom de Vries
2012-07-05 13:18   ` Richard Guenther
2012-07-05 13:30   ` Michael Matz
2012-07-05 18:46     ` Tom de Vries
2012-07-06 11:02       ` Richard Guenther
2012-07-06 16:37         ` Tom de Vries
2012-07-09  8:10           ` Richard Guenther
2012-07-16 13:56             ` [RFC] 4.7 backport crashes (was: Re: Tree tail merging breaks __builtin_unreachable optimization) Ulrich Weigand
2012-07-16 14:11               ` Richard Guenther
2012-07-09 20:36           ` Tree tail merging breaks __builtin_unreachable optimization Ulrich Weigand

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).