From: Rask Ingemann Lambertsen
To: "Barak A. Pearlmutter"
Cc: gcc-help@gcc.gnu.org
Date: Sun, 27 May 2007 18:22:00 -0000
Subject: Re: Bizarrely Poor Code from Bizarre Machine-Generated C Sources
Message-ID: <20070527180509.GX5690@sygehus.dk>

On Sun, May 27, 2007 at 03:05:38PM +0100, Barak A. Pearlmutter wrote:

> In particular, it defines gobs of new
> structure types and gobs of very very short functions, and there are
> no pointers used. It should be possible, using the optimization
> techniques already present in GCC, for very tense machine code to be
> generated from this admittedly strange FORTRAN-style C source code.
> But instead, the assembly code GCC generates is full of unnecessary
> data shuffling.

The way you are using structures forces GCC to copy data around. Unless you
somehow manage to inline the whole program into main(), I don't see how it
can be any different.

> - Some small change we could make to the generated C sources that
>   would cause it to be optimized well. (Add some magic __attribute__
>   somewhere.)

Change the structures into scalar variables for a start. GCC has more
freedom to place scalar variables than structures. Also, try to arrange
function parameters such that sibling call optimization has a chance of
working.

BAD:

    int g (int c, int b, int a)
    {
      ...
    }

    int f (int a, int b, int c)
    {
      return g (c, b, a);
    }

GOOD:

    int g (int a, int b, int c)
    {
      ...
    }

    int f (int a, int b, int c)
    {
      return g (a, b, c);
    }

> Below are notes that include detailed version information on the
> compilers used. In the notes below we used
>   -O2 -freg-struct-return -fomit-frame-pointer -mfpmath=sse -msse3
> but the results don't seem to improve by changing them.

You will definitely want a lot of inlining for this sort of code, so at
least use -O3, but perhaps play with the inlining parameters too. On a side
note, consider using -march to tell GCC which model of CPU you intend to
run the code on.

> $ wc --lines *.s
> 163922 particle1-gcc295.s
> 343012 particle1-gcc33.s
> 353057 particle1-gcc34.s
> 100697 particle1-gcc41.s
>  47030 particle1-gcc42.s

I imagine you'll be enlightened by running

    $ for i in *.s; do echo -n "${i}: "; grep -F -e memcpy ${i} | wc --lines; done

-- 
Rask Ingemann Lambertsen
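
As a rough sketch of the "structures into scalar variables" suggestion
above: the names below (vec2, vec2_add_struct, add_x, add_y) are made up
for illustration and are not taken from the generated sources. The first
function passes and returns an aggregate by value; the other two do the
same arithmetic on plain doubles, which GCC can usually keep in registers.

```c
/* Hypothetical illustration only: vec2, vec2_add_struct, add_x and
   add_y are invented names, not part of the original program. */

/* Struct style: each call passes and returns an aggregate by value. */
struct vec2 { double x; double y; };

static struct vec2 vec2_add_struct (struct vec2 a, struct vec2 b)
{
  struct vec2 r;
  r.x = a.x + b.x;
  r.y = a.y + b.y;
  return r;
}

/* Scalar style: one tiny function per component, plain doubles only. */
static double add_x (double ax, double bx)
{
  return ax + bx;
}

static double add_y (double ay, double by)
{
  return ay + by;
}
```

Compiling both styles with -O2 -S and comparing the assembly should show
how much of the difference on a given target comes from the structure
copies, though the result will depend on the GCC version and the flags.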