From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-patches-return-195698-listarch-gcc-patches=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 5282 invoked by alias); 25 May 2007 12:58:54 -0000
Received: (qmail 5270 invoked by uid 22791); 25 May 2007 12:58:52 -0000
X-Spam-Check-By: sourceware.org
Received: from mail1.panix.com (HELO mail1.panix.com) (166.84.1.72)     by sourceware.org (qpsmtpd/0.31) with ESMTP; Fri, 25 May 2007 12:58:50 +0000
Received: from mailspool3.panix.com (mailspool3.panix.com [166.84.1.78]) 	by mail1.panix.com (Postfix) with ESMTP id 42C8A595EA; 	Fri, 25 May 2007 08:58:47 -0400 (EDT)
Received: from [192.168.1.60] (pool-70-104-128-175.nycmny.fios.verizon.net [70.104.128.175]) 	by mailspool3.panix.com (Postfix) with ESMTP id 33E3816976; 	Fri, 25 May 2007 08:58:47 -0400 (EDT)
Message-ID: <4656DD86.8030103@naturalbridge.com>
Date: Fri, 25 May 2007 13:00:00 -0000
From: Kenneth Zadeck <zadeck@naturalbridge.com>
User-Agent: Thunderbird 1.5.0.10 (X11/20060911)
MIME-Version: 1.0
To: Bernd Schmidt <bernds_cb1@t-online.de>
CC: gcc-patches <gcc-patches@gcc.gnu.org>,   Steven Bosscher <stevenb.gcc@gmail.com>,  "Park, Seongbae" <seongbae.park@gmail.com>,   "Bonzini, Paolo" <bonzini@gnu.org>,  Serge Belyshev <belyshev@depni.sinp.msu.ru>,  richard.earnshaw@arm.com,   echristo@apple.com,   "Pinski, Andrew" <andrew_pinski@playstation.sony.com>,  "Weigand, Ulrich" <Ulrich.Weigand@de.ibm.com>,   Ian Lance Taylor <iant@google.com>,  "Edelsohn, David" <dje@watson.ibm.com>,   "Berlin, Daniel" <dberlin@dberlin.org>
Subject: Re: dataflow branch merging plans.
References: <46543F49.8060104@naturalbridge.com> <46557380.9060105@t-online.de> <4655EF12.6060605@naturalbridge.com> <4656AFFD.1040605@t-online.de>
In-Reply-To: <4656AFFD.1040605@t-online.de>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
X-SW-Source: 2007-05/txt/msg01714.txt.bz2

Bernd Schmidt wrote:
> Kenneth Zadeck wrote:
>   
>> Bernd Schmidt wrote:
>>     
>>> Kenneth Zadeck wrote:
>>>   
>>>       
>>>> I believe that the dataflow branch is now ready to merge into the
>>>> mainline.  We have fixed almost all of the performance problems
>>>> associated with it.  While there are still some left, we feel
>>>> confident that these can be addressed during the rest of stage I and
>>>> during stage II.
>>>>     
>>>>         
>>> Vlad's last benchmark run still showed up to 11% compile time
>>> regression, didn't it?
>>>
>>>
>>> Bernd
>>>   
>>>       
>> The ia-64 is an outlier at 11% as is the ppc at .9%.  I think that it is
>> quite likely that there is some ia-64 unique pass, like one of the
>> schedulers, that needs a look at.  This will be addressed after the merge. 
>>     
>
> Still, 6% compile time regression on several targets and typically a
> (very small) regression in SPEC scores - am I the only one who's not
> impressed?  We don't normally accept patches with these kinds of
> results, and I don't see why we should make an exception here.
>
>   
Bernd,

I certainly had little expectation that just replacing the dataflow
would, in itself, have a dramatic effect on the compiled code.  While
the analysis is more precise than what flow produces, that in itself was
not something that was going to change the world.
Nor was the replacement of the three optimizations (dse, dce and auto
inc detection) that were embedded in flow.  While these passes are
finding many more opportunities for change than their predecessors in
flow, these passes themselves were not the underlying problems with the
back end.

The gains to be made are going to come from changing the way that
instruction selection, register allocation and scheduling are
implemented and these require a good dataflow foundation to get any
interesting improvements.

The problem is that much of the back end makes only cursory use of
dataflow information.  It either ignores flow's information
altogether, uses the current df, rolls its own (constant propagation), or
is written using a much weaker technique that simply avoids global
information entirely.

In many cases, those passes that did use flow had to be scaled back
(either deliberately or by being patched to death) because flow was
not accurate.
Aside from changing the code to get the information from global_live_at
to DF_LIVE, we have made few other changes to these passes.  It is the
upgrading or flat out replacement of these passes which is the real
target of the df branch technology.

There were several paths that could have been taken for integration:
1) Just continue to replace passes one at a time with the df that is in
the trunk. 
This would have been much more expensive, in compile time and space, than
the current approach.  As it turns out, the only way to make df
efficient is to have certain parts of it (the scanning and several other
structures) persistent.  Flow does the same thing with many of its
structures, including the global_live_at vectors.

Maintaining two sets of persistent information over the entire back end
would have been very expensive and it was clear that this was not going
to be a reasonable option.

2) Replace all of flow with df and then, after the merge,
replace/upgrade any other passes.  This is the path that we have
set out on.  We have replaced flow, but as yet we have not touched
any of the passes that just roll their own.

We have greatly reduced the time and space cost of this conversion.
The time cost is really the only thing that remains, and we
will address that during the rest of stage I and stage II.

The reason for merging df now, rather than later is that we are still in
stage I, and the people who are working on df are now spending almost
all of their time fighting the differences between mainline and the
branch.  There are enough differences between the way that the mainline
uses flow and the way those same passes use df to consume most of my
time and a great deal of the time that Seongbae, Steven and Paolo have
to spend on this branch.  We are no longer making any progress because
of this.

I believe, and in private conversations with Vlad he believes, that we
have reached the point where there are no longer any performance
regressions introduced by the branch.  As Vlad correctly pointed out,
these would have been much more difficult to deal with once the merge
happened, and so resolving these has been our number one priority. 
> Why the rush to merge now?  How exactly will it make 4.3 better?
>   
We are in stage I.  As I have pointed out, df provides the framework
for replacing/upgrading many of the other parts of the back end, but it
can only do that if it is actually in the compiler.  It does no good to
have it be a side branch that just diverges further and further away
from the trunk.

The truth is that there is never a great time to do a house cleaning. 
This is just the least worst time.

> Finally, since a plan to merge next week was indicated - which
> maintainer has approved it?
>
>   
The steering committee set out criteria for merging the branch along the
lines of how tree-ssa was merged.  All of the work has been done in
public, and we have, as I pointed out, been careful to vet the complex
interactions with the proper maintainers as we have done them.

There is no requirement that the branch be vetted by a maintainer, as
a normal patch would be.  We certainly do not take that as a pass to just
go off and do what we want in a vacuum.  I and the rest of the people
who have worked on this branch have been quick to resolve any issues
brought up by anyone who has been watching the patches go in. 

My announcement was consistent with this.  If you or anyone else has
problems with any part of the branch, we will certainly try to address
those issues both before and after the merge.

Kenny
> Bernd
>