From mboxrd@z Thu Jan 1 00:00:00 1970
Received: (qmail 15584 invoked by alias); 30 Jul 2005 19:20:45 -0000
Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
Sender: gcc-owner@gcc.gnu.org
Received: (qmail 15419 invoked by uid 22791); 30 Jul 2005 19:20:36 -0000
Received: from atrey.karlin.mff.cuni.cz (HELO atrey.karlin.mff.cuni.cz) (195.113.31.123) by sourceware.org (qpsmtpd/0.30-dev) with ESMTP; Sat, 30 Jul 2005 19:20:36 +0000
Received: by atrey.karlin.mff.cuni.cz (Postfix, from userid 4018) id 22DEA4B408C; Sat, 30 Jul 2005 21:20:28 +0200 (CEST)
Date: Sat, 30 Jul 2005 19:20:00 -0000
From: Jan Hubicka
To: girish vaitheeswaran
Cc: Jan Hubicka, Janis Johnson, gcc@gcc.gnu.org
Subject: Re: -fprofile-generate and -fprofile-use
Message-ID: <20050730192028.GC11080@atrey.karlin.mff.cuni.cz>
References: <20050726065955.GD7310@atrey.karlin.mff.cuni.cz> <20050726181049.46326.qmail@web80010.mail.yahoo.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20050726181049.46326.qmail@web80010.mail.yahoo.com>
User-Agent: Mutt/1.5.6+20040907i
X-SW-Source: 2005-07/txt/msg01260.txt.bz2

> Jan,

Hi,

> That's going to be rather difficult given that the app
> has over 1000 files. Is there a way I can turn off the
> "default" options one at a time ?

This is unfortunately not possible :( The optimizations use either
profile feedback or the profile guessed by GCC itself. It looks like in
your case the profile guessed by GCC (even if it departs from reality)
causes GCC to produce better code than the real profile (or the real
profile got misread, but there are some sanity checks for this, so this
is quite unlikely).

It seems to me that the only way to proceed from here is some profiling.
The way I usually look into such problems is to produce an oprofile of
both versions of the code and then compare the times spent in individual
functions; then it is sometimes possible to identify the offending code
more easily....

Honza

> Thx
> -girish
>
> --- Jan Hubicka wrote:
>
> > > I have done quite a few experiments with this to
> > > narrow down the problem. The performance numbers are
> > > slower compared to *No Feedback optimization with just
> > > -O3*. Here are some of them. All the experiments were
> > > done on a new build-area in order to eliminate effects
> > > of old feedback files.
> > >
> > > 1. I built the app using -O3 and -fprofile-generate to
> > > generate the feedback data. I then ran the workload
> > > and then recompiled the app using -O3 and
> > > -fprofile-use [app was 20% slower]
> > >
> > > 2. I built the app using -O3 and -fprofile-generate to
> > > generate the feedback data. I then ran the workload
> > > and then recompiled the app using -O3 and
> > > -fprofile-use -fno-vpt -fno-unroll-loops
> > > -fno-peel-loops -fno-tracer (which turns off all the
> > > flags used by -fprofile-use) [App was still 20%
> > > slower]
> > >
> > > 3. I have tried selectively turning off some of the
> > > other flags in the above list as well, but the
> > > performance regression persists.
> > >
> > > 4. I tried with the older flags, namely -fprofile-arcs
> > > and -fbranch-probabilities; still no help.
> >
> > So it looks like the slowdown is caused by one of the profile based
> > optimizations that are enabled by default (basic block reordering or
> > register allocation). If you are getting such a noticeable slowdown, it
> > probably means that your app has a pretty small inner loop. Can you just
> > look into the assembly generated for it with and without profiling and try
> > to spot what is going wrong?
> >
> > Thanks,
> > Honza
> > >
> > > Can someone help me out on how to proceed with this.
> > >
> > > Thanks
> > > -girish
> > >
> > >
> > > --- Jan Hubicka wrote:
> > >
> > > > > I started with a clean slate in my build environment
> > > > > and did not have any residual files hanging around.
> > > > > Are the steps I have indicated in my earlier email
> > > > > correct. Is there a way I can break down the problem
> > > > > into a smaller sub-set of flags and eliminate the flag
> > > > > causing the performance problem. What I mean is since
> > > > > -fprofile-generate and -fprofile-use enable a bunch of
> > > > > flags, would it make sense to avoid profiling and try
> > > > > out some of the individual flags on a trial and error
> > > > > basis. If so what would be the flags to start the
> > > > It would probably be better to just turn off the individual
> > > > optimizations with -fprofile-use (for optimizations that are implied by
> > > > this flag there should be no need to re-profile each time).
> > > > If you can find the particular optimization that gets out of control, it
> > > > would be a lot easier to fix it...
> > > >
> > > > Honza
> > > > > trials with.
> > > > >
> > > > > -girish
> > > > >
> > > > > --- Jan Hubicka wrote:
> > > > >
> > > > > > > On Wed, Jul 20, 2005 at 10:45:01AM -0700, girish vaitheeswaran wrote:
> > > > > > > > > --- Steven Bosscher wrote:
> > > > > > > > >
> > > > > > > > > > On Wednesday 20 July 2005 18:53, girish vaitheeswaran wrote:
> > > > > > > > > > > I am seeing a 20% slowdown with feedback optimization.
> > > > > > > > > > > Does anyone have any thoughts on this.
> > > > > > > > > >
> > > > > > > > > > My first thought is that you should probably first
> > > > > > > > > > tell what compiler
> > > > > > > > > > you are using.
> > > > > > > >
> > > > > > > > I am using gcc 3.4.3
> > > > > > > > -girish
> > > > > > >
> > > > > > > Which platform? I've seen slower code for profile-directed optimizations
> > > > > > > on powerpc64-linux with GCC 4.0 and mainline. It's a bug, but I haven't
> > > > > > > looked into it enough to provide a small test case for a problem report.
> > > > > >
> > > > > > Actually I would be very interested in seeing testcases such as those
> > > > > > (and Girish's slowdown too if possible). In general some slowdowns
> > > > > > in side corners are probably unavoidable but both 3.4.3 and 4.0 seem to
> > > > > > have pretty consistent improvements with profiling at least for SPEC and
> > > > > > i386 I am testing pretty regularly.
> > > > > > Such slowdowns usually indicate problems like an incorrectly updated profile
> > > > > > or an incorrectly read profile because of a mismatch in CFGs between the
> > > > > > profile and feedback runs, which are rather difficult to notice and hunt
> > > > > > down...
> > > > > >
> > > > > > Honza
> > > > > > >
> > > > > > > Janis
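
The per-function comparison suggested in the reply at the top of this message can be sketched as below. This is only an illustration under stated assumptions: it presumes the two profiles (plain -O3 build and -fprofile-use build) were collected with the same sampling setup and have already been reduced to a mapping from function name to sample count. The helper name `compare_profiles`, the function names, and all counts are hypothetical and do not come from the thread or from oprofile itself.

```python
# Hypothetical sketch: compare per-function sample counts from two profiles
# to find the functions that got slower in the feedback-optimized build.
# All names and numbers below are made up for illustration.

def compare_profiles(baseline, feedback):
    """Return (function, baseline_samples, feedback_samples, delta) rows.

    Rows are sorted so that the functions whose sample count grew the most
    in the feedback build come first. A function missing from one profile
    is treated as having zero samples there.
    """
    rows = []
    for name in sorted(set(baseline) | set(feedback)):
        base = baseline.get(name, 0)
        fb = feedback.get(name, 0)
        rows.append((name, base, fb, fb - base))
    # Largest increase first: these are the candidates to inspect in assembly.
    rows.sort(key=lambda row: row[3], reverse=True)
    return rows


# Hypothetical sample counts for the plain -O3 build ...
baseline_samples = {"parse_input": 1200, "inner_loop": 5000, "write_output": 800}
# ... and for the -fprofile-use build of the same workload.
feedback_samples = {"parse_input": 1150, "inner_loop": 6500, "write_output": 790}

for name, base, fb, delta in compare_profiles(baseline_samples, feedback_samples):
    print(f"{name:15s} {base:8d} {fb:8d} {delta:+8d}")
```

Raw sample counts are compared rather than percentages because, when the whole program slows down, a function's share of the total can stay flat even though its absolute time grew. The functions at the top of the list are the ones whose generated assembly is worth diffing between the two builds.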