From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-patches-return-294720-listarch-gcc-patches=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 17087 invoked by alias); 18 Jun 2011 23:49:03 -0000
Received: (qmail 17078 invoked by uid 22791); 18 Jun 2011 23:49:02 -0000
X-SWARE-Spam-Status: No, hits=-2.3 required=5.0	tests=AWL,BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FROM,RCVD_IN_DNSWL_LOW,RFC_ABUSE_POST,TW_CF
X-Spam-Check-By: sourceware.org
Received: from mail-qw0-f47.google.com (HELO mail-qw0-f47.google.com) (209.85.216.47)    by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Sat, 18 Jun 2011 23:48:39 +0000
Received: by qwh5 with SMTP id 5so770346qwh.20        for <gcc-patches@gcc.gnu.org>; Sat, 18 Jun 2011 16:48:38 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.229.212.8 with SMTP id gq8mr1588296qcb.73.1308440916572; Sat, 18 Jun 2011 16:48:36 -0700 (PDT)
Received: by 10.229.47.78 with HTTP; Sat, 18 Jun 2011 16:48:36 -0700 (PDT)
In-Reply-To: <4DF985ED.30406@redhat.com>
References: <4DF985ED.30406@redhat.com>
Date: Sun, 19 Jun 2011 07:30:00 -0000
Message-ID: <BANLkTi=UxK9ZbH702MCaGvZH5YhkkXu+Bw@mail.gmail.com>
Subject: Re: Improve jump threading #5 of N
From: "H.J. Lu" <hjl.tools@gmail.com>
To: Jeff Law <law@redhat.com>
Cc: gcc-patches <gcc-patches@gcc.gnu.org>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
X-IsSubscribed: yes
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
X-SW-Source: 2011-06/txt/msg01405.txt.bz2

On Wed, Jun 15, 2011 at 9:26 PM, Jeff Law <law@redhat.com> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
>
>
>
> So as I've mentioned previously, I've been working on a relatively small
> change to the jump threading code which would allow it to duplicate a
> join block when doing so allows us to thread through a successor of the
> join block. =A0This is expected to be the last extension to the existing
> jump threading code.
>
> This was mainly done to improve our ability to eliminate unexecutable
> paths through the CFG which helps avoid false positives with certain
> warnings. =A0It also has the nice property that it eliminates conditionals
> and often results in further optimization of nearby code.
>
> To help evaluate the code generation improvements of this change I built
> gcc-4.6 (checking enabled) using a compiler with and without this
> improvement. =A0I then used the 4.6 cc1s to compile a bunch of .i files
> under the watchful eye of valgrind.
>
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0without patch =A0 =A0with =
patch
> Total cbranches =A0 =A0 =A0 =A0 =A0231072754220 =A0 =A0 229626578262
> Total ibranches: =A0 =A0 =A0 =A0 =A0 7687404775 =A0 =A0 =A0 7686994201
>
>
> cbranches shows the number of dynamically executed conditional branches.
> =A0As you can see, with the patch we eliminated about .625% of the runtime
> conditional branches. =A0Not bad at all. =A0We eliminated a trivial number
> of indirect branches. =A0In all we eliminated 1446595532 runtime branches.
>
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0without patch =A0 =A0with =
patch
> Total instructions: =A0 =A0 1254106133886 =A0 =A01247718004946
>
>
> I was expecting a reduction in the total number of instructions
> executed, but was quite surprised at the actual data. =A0We end up
> eliminating 6388128940 dynamic instructions --- which means that for
> every dynamic branch eliminated, on average we were able to eliminate an
> additional 3.4 dynamic instructions -- that's a huge secondary effect.
> Clearly improving jump threading in this way is allowing the rest of the
> optimizers to do a better job.
>
> Anyway, attached is the patch. =A0Again, the concept is pretty simple,
> when we have a join block which can not be threaded, we peek at the
> successors of the join block and see if one or more of them can be thread=
ed.
>
> If so, we make a duplicate of the join block, wire the incoming edge we
> were originally trying to thread to reach the duplicate rather than the
> original join block. =A0We then wire the outgoing edge from the duplicate
> to the final jump thread target.
>
> So if given a CFG like this (from =A0a routine in cfgexpand):
>
> =A0 =A0 =A0 =A0 =A0 A
> =A0 =A0 =A0 =A0 =A0/ \
> =A0 =A0 =A0 =A0 B =A0 C
> =A0 =A0 =A0 =A0 | =A0/ \
> =A0 =A0 =A0 =A0 | D =A0 E
> =A0 =A0 =A0 =A0 | | =A0/ \
> =A0 =A0 =A0 =A0 | | F =A0 G
> =A0 =A0 =A0 =A0 =A0\| |
> =A0 =A0 =A0 =A0 =A0 =A0\|
> =A0 =A0 =A0 =A0 =A0 =A0 H
> =A0 =A0 =A0 =A0 =A0 =A0/ \
> =A0 =A0 =A0 =A0 =A0 I =A0 J
> =A0 =A0 =A0 =A0 =A0/ \
> =A0 =A0 =A0 =A0 L =A0 M
> =A0 =A0 =A0 =A0 | =A0/ \
> =A0 =A0 =A0 =A0 | N =A0 O
> =A0 =A0 =A0 =A0 | | =A0/ \
> =A0 =A0 =A0 =A0 | | P =A0 Q
> =A0 =A0 =A0 =A0 =A0\| |
> =A0 =A0 =A0 =A0 =A0 =A0\|
> =A0 =A0 =A0 =A0 =A0 =A0 R
>
>
> As it turns out some blocks have the same condition (A,I), (C,M), (E,O).
> But because of the merge block H, no threading is possible. =A0What we
> want to do is make 3 copies of H, each reachable from one predecessor of
> the original H. =A0That exposes the jump threading opportunities B->L,
> D->N and F->P. =A0The final CFG looks something like this:
>
> =A0 =A0 =A0 =A0 =A0 A
> =A0 =A0 =A0 =A0 =A0/ \
> =A0 =A0 =A0 =A0BH'L C
> =A0 =A0 =A0 =A0 | =A0/ \
> =A0 =A0 =A0 =A0 |DH'N E
> =A0 =A0 =A0 =A0 | | =A0/ \
> =A0 =A0 =A0 =A0 | |FH'P G
> =A0 =A0 =A0 =A0 =A0\| |
> =A0 =A0 =A0 =A0 =A0 =A0\|
> =A0 =A0 =A0 =A0 =A0 =A0 R
>
>
>
> Where each H' also has an edge to J from the original CFG, but which is
> hard to show here... Note that I, M, O & Q all disappear and each
> dynamic path through the cfg is shortened, even though we had to
> duplicate H multiple times.
>
> Bootstrapped and regression tested on x86_64-unknown-linux-gnu.
>
> OK for mainline?
>

This caused:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=3D49465

--=20
H.J.