Date: Mon, 28 May 2007 11:26:00 -0000
From: Rask Ingemann Lambertsen
To: "Barak A. Pearlmutter"
Cc: gcc-help@gcc.gnu.org
Subject: Re: Bizarrely Poor Code from Bizarre Machine-Generated C Sources
Message-ID: <20070528095432.GA5690@sygehus.dk>
References: <20070527180509.GX5690@sygehus.dk>

On Sun, May 27, 2007 at 10:28:09PM +0100, Barak A. Pearlmutter wrote:

> Hope you don't mind if I ask some follow-up questions.

Not at all.

> Yup: we had also noticed the zillions of calls to memcpy with static
> arguments. This is part of what I meant by "unnecessary data
> shuffling". Is there some way to tell GCC that it isn't worth calling
> memcpy to copy such short structures?

GCC optimizes memcpy according to the size of the memory block and the CPU
it is optimizing for.
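To make the memcpy point concrete, here is a minimal sketch of copying a short structure. The structure and function names are hypothetical and not taken from the generated sources; whether GCC expands the memcpy call inline depends on the GCC version, the optimization level and the target CPU.

```c
#include <string.h>

/* Hypothetical short structure, standing in for the small structures
   that the machine-generated code copies around. */
struct pair {
    double re;
    double im;
};

/* Copy through an explicit memcpy call whose size is a compile-time
   constant. With optimization enabled, GCC may expand this inline
   instead of emitting a call, depending on size and target CPU. */
void copy_with_memcpy(struct pair *dst, const struct pair *src)
{
    memcpy(dst, src, sizeof *dst);
}

/* The same copy written as a plain structure assignment. */
void copy_with_assignment(struct pair *dst, const struct pair *src)
{
    *dst = *src;
}
```

Compiling this with something like gcc -O2 -S and comparing the assembly for the two functions is one way to check whether a call to memcpy remains for a given size and target.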
I'm not sure the most recent work on optimizing memcpy() for x86 processors
went into GCC 4.2, though.

> We could re-jigger our back end to generate FORTRAN instead of C and
> use GCC's FORTRAN stuff, maybe that would help?

I don't know FORTRAN, so I have no idea.

> > You will definitely want a lot of inlining for this sort of code, so
> > at least use -O3, but perhaps play with the inlining parameters too.
>
> Right; -O3 didn't make any qualitative difference. (I certainly tried
> that before posting.) I do see a whole bunch of inline-related
> parameters in the GCC documentation, but it is not clear which I
> should tweak. I tried -O3 -finline-limit=60000 (default 600) but
> even that doesn't make any qualitative difference.

You *really* need to crank up those limits. I don't have GCC 4.2, but I
tried GCC 4.3 with

  --param inline-call-cost=10000 --param max-inline-insns-auto=20000
  --param large-function-growth=1000 --param inline-unit-growth=1000

which wasn't enough. I ran out of memory (256 MB RAM + 757 MB swap) with

  -finline-limit=60000 --param inline-call-cost=10000
  --param max-inline-insns-auto=200000 --param large-function-growth=10000
  --param inline-unit-growth=10000

Some versions of GCC need much more memory than others. YMMV.

I only noticed just now that you have many functions marked inline. In that
case you will also want to increase the max-inline-insns-single parameter.

-- 
Rask Ingemann Lambertsen
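
As a small illustration of the difference between the two size limits mentioned above, the sketch below uses one function declared inline and one that is not. The function names and the parameter values in the comment are hypothetical. The split between the two parameters follows the GCC manual's description and may vary between GCC versions.

```c
/* Declared inline: per the GCC manual, the size limit for candidates
   like this one is max-inline-insns-single. */
static inline int add_declared_inline(int a, int b)
{
    return a + b;
}

/* Not declared inline: at -O3 GCC may still inline it automatically,
   subject to the separate limit max-inline-insns-auto. */
static int add_auto(int a, int b)
{
    return a + b;
}

/* Caller that gives the inliner both kinds of candidate. */
int sum_both(int a, int b)
{
    return add_declared_inline(a, b) + add_auto(a, b);
}

/* One possible invocation that raises both limits
   (the values are illustrative only, not recommendations):

     gcc -O3 --param max-inline-insns-single=20000 \
             --param max-inline-insns-auto=20000 -c example.c
*/
```

Functions this small are inlined under the default limits anyway; the parameters only start to matter for very large function bodies such as the machine-generated ones discussed in this thread.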