Date: Sun, 21 Apr 2002 05:46:00 -0000
From: Jan Hubicka
To: Michel LESPINASSE
Cc: gcc list
Subject: Re: GCC performance regression - up to 20% ?
Message-ID: <20020421113238.GC16602@atrey.karlin.mff.cuni.cz>
References: <20020421005718.GA16378@zoy.org>
In-Reply-To: <20020421005718.GA16378@zoy.org>

>
> libmpeg2, on athlon tbird 950, with mmx optimizations:
> gcc-3.0 is about 2% slower than 2.95.4
> gcc-3.1 snapshot is about 10% slower than 2.95.4
>
> libmpeg2, on athlon tbird 950, using pure C code:
> gcc-3.0 is about 4.5% slower than 2.95.4
> gcc-3.1 snapshot is about 5.5% slower than 2.95.4
>
> libmpeg2, on celeron 366, with mmx optimizations:
> gcc-3.0 is about 4% slower than 2.95.4
> gcc-3.1 snapshot is about 20.5% slower than 2.95.4 (!!!!)
>
> These results are all very repeatable. the celeron 366 results are the
> most worrying, as this processor already has borderline performance
> for decoding mpeg2 streams.

Are you able to figure out what exactly makes the code slow? Having a
self-contained testcase would definitely help a lot. What flags do you
use? I would be quite curious whether using profile feedback helps.
(See the documentation of -fprofile-arcs and -fbranch-probabilities.)
You may just have some badly predicted branch in the innermost loop.
The problem with such code is usually that it is tuned to avoid problems
in one particular version of gcc, so even when a new version is faster
overall, it is slower in such places. We've hit a similar case with
Athlon matrix multiplication code, and such problems are usually easy to
fix on the gcc side.

>
> Is there a known performance regression in current GCCs (say, do they
> get lower SPECint scores ?) or is it only with my code ?

No, the SPECint numbers are quite consistently higher than in any
previous release. See http://www.suse.de/~aj/SPEC
In fact, no previous release had such a huge gap in performance.

>
> Also, is there anything I could do in my code to enhance performance
> with newer gcc versions ? One thing I noticed is that 3.1 snapshot
> produces less inlining than 3.0 or 2.95. This probably accounts for
> some of the slowdown I see when using mmx optimizations, as my mmx
> routines are written using a few routines that I really expect to get
> inlined. Is there any way I can get back control about that, so that
> gcc honours the inline keyword ?

I have not managed to do this either. There is a parameter to increase
the inline threshold, as well as the always_inline function attribute.
See the documentation.

Honza

>
> BTW, these two apps I mentionned can be found at
> http://libmpeg2.sourceforge.net/
> http://liba52.sourceforge.net/
>
> Puzzled,
>
> --
> Michel "Walken" LESPINASSE
> Is this the best that god can do ? Then I'm not impressed.
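
For reference, a minimal sketch of the always_inline route mentioned in
the reply above. The helper and the loop are hypothetical and are not
taken from libmpeg2; only the attribute spelling is GCC's. The message
does not name the threshold parameter; -finline-limit=n is one GCC
option of that kind, but that is an assumption, so check the
documentation for your version.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from libmpeg2): clamp an int to 0..255.
 * The always_inline attribute asks GCC to inline this function at
 * every call site regardless of its inlining heuristics. */
static inline __attribute__((always_inline)) uint8_t clip_u8(int v)
{
    return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Hypothetical inner loop: add a signed residual to a run of pixels,
 * saturating each result. The call to clip_u8 is expected to be
 * expanded in place rather than emitted as a function call. */
static void add_residual(uint8_t *dst, const int16_t *res, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = clip_u8(dst[i] + res[i]);
}
```

Whether this recovers the lost performance can only be settled by
measuring; inspecting the generated assembly (gcc -S) shows whether the
helper was actually inlined.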