public inbox for gcc-patches@gcc.gnu.org
* Tree tail merging breaks __builtin_unreachable optimization
From: Ulrich Weigand @ 2012-07-04 17:02 UTC
  To: gcc-patches; +Cc: tom

Hello,

Starting with GCC 4.7, if multiple __builtin_unreachable calls occur in
a single function, the conditions guarding them are no longer optimized
away as they were in 4.6.

For example,

int foo(int a)
{
    if (a <= 0)
        __builtin_unreachable();
    if (a > 2)
        __builtin_unreachable();

    return a > 0;
}

results in the following (ARM) code:

foo:
        cmp r0, #0
        ble .L3
        cmp r0, #2
        bgt .L3
        mov r0, #1
        bx lr
.L3:

with the label .L3 hanging off after the end of the function.

With 4.6, we instead get the expected:

foo:
        mov     r0, #1
        bx      lr
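
The reason "return 1" is the expected result can be spelled out with a
small check. This is only a sketch: reaches_return and
count_counterexamples are hypothetical helpers, not part of the test case.
They restate the two guards from foo and count inputs that would reach the
return statement while "a > 0" is false.

```c
/* Hypothetical helper: nonzero if 'a' avoids both
   __builtin_unreachable() calls in foo(), i.e. reaches the return. */
static int reaches_return(int a)
{
    return !(a <= 0) && !(a > 2);
}

/* Counts values in [lo, hi] that reach the return statement in foo()
   but for which 'a > 0' would evaluate to false. */
static int count_counterexamples(int lo, int hi)
{
    int count = 0;
    for (int a = lo; a <= hi; a++)
        if (reaches_return(a) && !(a > 0))
            count++;
    return count;
}
```

Only a == 1 and a == 2 reach the return, and both satisfy a > 0, so the
count is zero for any range and the compiler may fold the result to 1.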


The problem seems to be an unfortunate interaction between tree and
RTL optimization passes. In 4.6, we had something like:

<bb 2>:
  if (a_1(D) <= 0)
    goto <bb 3>;
  else
    goto <bb 4>;

<bb 3>:
  __builtin_unreachable ();

<bb 4>:
  if (a_1(D) > 2)
    goto <bb 5>;
  else
    goto <bb 6>;

<bb 5>:
  __builtin_unreachable ();

<bb 6>:
  return 1;

on the tree level; during RTL expansion __builtin_unreachable expands to just a
barrier, and subsequent CFG optimization detects basic blocks containing just a
barrier and optimizes the predecessor blocks.

With 4.7, we get instead:

<bb 2>:
  if (a_1(D) <= 0)
    goto <bb 3>;
  else
    goto <bb 4>;

<bb 3>:
  __builtin_unreachable ();

<bb 4>:
  if (a_1(D) > 2)
    goto <bb 3>;
  else
    goto <bb 5>;

<bb 5>:
  return 1;

where there is just a single basic block containing __builtin_unreachable,
with multiple predecessors branching to it.  Unfortunately, the RTL
optimizers that detect unreachable blocks appear to have difficulty when
such a block has multiple predecessors, and fail to optimize the
predecessor blocks.
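
The structural difference between the two dumps can be made concrete with
a toy model. This is a sketch only: struct toy_block and the helpers are
made up for illustration and are unrelated to GCC's internal CFG data
structures. The two tables transcribe the 4.6 and 4.7 dumps above.

```c
enum { MAX_SUCCS = 2 };

/* Toy basic block: a flag saying whether the block contains only
   __builtin_unreachable (), plus up to two successor indices
   (-1 means "no successor"). */
struct toy_block {
    int is_unreachable;
    int succ[MAX_SUCCS];
};

/* Number of edges entering block 'target'. */
static int count_preds(const struct toy_block *cfg, int n, int target)
{
    int preds = 0;
    for (int b = 0; b < n; b++)
        for (int s = 0; s < MAX_SUCCS; s++)
            if (cfg[b].succ[s] == target)
                preds++;
    return preds;
}

/* Largest predecessor count over all unreachable-only blocks. */
static int max_preds_of_unreachable(const struct toy_block *cfg, int n)
{
    int max = 0;
    for (int b = 0; b < n; b++) {
        if (!cfg[b].is_unreachable)
            continue;
        int p = count_preds(cfg, n, b);
        if (p > max)
            max = p;
    }
    return max;
}

/* 4.6 shape: indices 0..4 stand for <bb 2>..<bb 6>. */
static const struct toy_block cfg_46[5] = {
    { 0, {  1,  2 } },   /* bb 2: a <= 0 ? bb 3 : bb 4 */
    { 1, { -1, -1 } },   /* bb 3: __builtin_unreachable () */
    { 0, {  3,  4 } },   /* bb 4: a > 2 ? bb 5 : bb 6 */
    { 1, { -1, -1 } },   /* bb 5: __builtin_unreachable () */
    { 0, { -1, -1 } },   /* bb 6: return 1 */
};

/* 4.7 shape: indices 0..3 stand for <bb 2>..<bb 5>. */
static const struct toy_block cfg_47[4] = {
    { 0, {  1,  2 } },   /* bb 2: a <= 0 ? bb 3 : bb 4 */
    { 1, { -1, -1 } },   /* bb 3: __builtin_unreachable () */
    { 0, {  1,  3 } },   /* bb 4: a > 2 ? bb 3 : bb 5 */
    { 0, { -1, -1 } },   /* bb 5: return 1 */
};
```

In this model every unreachable-only block in the 4.6 shape has exactly
one predecessor, while the merged block in the 4.7 shape has two.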

The tree pass that merged the two blocks is a new pass called "tail merging",
which was added in the 4.7 cycle. In fact, using -fno-tree-tail-merge gets
the expected result back.
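
A per-function variant of that workaround might look like the sketch
below. It assumes GCC's "optimize" function attribute, which is documented
to treat a string not starting with "O" as an -f option; whether it
restores the 4.6 code for this test case has not been verified here, and
the GCC manual describes the attribute as intended for debugging rather
than production use.

```c
/* Assumption: optimize("no-tree-tail-merge") is handled like
   -fno-tree-tail-merge, but applied to this function only. */
__attribute__((optimize("no-tree-tail-merge")))
int foo(int a)
{
    if (a <= 0)
        __builtin_unreachable();
    if (a > 2)
        __builtin_unreachable();

    return a > 0;
}
```

The function body is unchanged from the test case; only the attribute is
new, so callers passing 1 or 2 still get 1 back.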

Any suggestions on how to fix this?  Should tail merging detect
__builtin_unreachable and not merge such blocks?  Or else, should
the CFG optimizer be extended (how?) to handle unreachable blocks
with multiple predecessors better?

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  Ulrich.Weigand@de.ibm.com


Thread overview: 16+ messages
2012-07-04 17:02 Tree tail merging breaks __builtin_unreachable optimization Ulrich Weigand
2012-07-04 18:09 ` Andrew Pinski
2012-07-04 18:17 ` Steven Bosscher
2012-07-05 12:44   ` Michael Matz
2012-07-05 12:46     ` Richard Guenther
2012-07-05 13:17       ` Michael Matz
2012-07-05 12:49 ` Tom de Vries
2012-07-05 13:18   ` Richard Guenther
2012-07-05 13:30   ` Michael Matz
2012-07-05 18:46     ` Tom de Vries
2012-07-06 11:02       ` Richard Guenther
2012-07-06 16:37         ` Tom de Vries
2012-07-09  8:10           ` Richard Guenther
2012-07-16 13:56             ` [RFC] 4.7 backport crashes (was: Re: Tree tail merging breaks __builtin_unreachable optimization) Ulrich Weigand
2012-07-16 14:11               ` Richard Guenther
2012-07-09 20:36           ` Tree tail merging breaks __builtin_unreachable optimization Ulrich Weigand
