From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 4599 invoked by alias); 27 May 2007 14:07:41 -0000
Received: (qmail 4588 invoked by uid 22791); 27 May 2007 14:07:39 -0000
X-Spam-Check-By: sourceware.org
Received: from mail.nuim.ie (HELO LARCH.MAY.IE) (149.157.1.19) by sourceware.org (qpsmtpd/0.31) with ESMTP; Sun, 27 May 2007 14:07:36 +0000
Received: from localhost ([149.157.192.252]) by NUIM.IE (PMDF V6.2-X17 #30789) with ESMTP id <01MH3299NCOI017INU@NUIM.IE> for gcc-help@gcc.gnu.org; Sun, 27 May 2007 15:06:19 +0000 (GMT)
Received: from barak by localhost with local (Exim 4.63) (envelope-from ) id 1HsJNK-0005B3-2P; Sun, 27 May 2007 15:05:38 +0100
Date: Sun, 27 May 2007 15:11:00 -0000
From: "Barak A. Pearlmutter"
Subject: Bizarrely Poor Code from Bizarre Machine-Generated C Sources
To: gcc-help@gcc.gnu.org
Reply-to: barak@cs.nuim.ie
Message-id:
Content-transfer-encoding: 7BIT
Mailing-List: contact gcc-help-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-help-owner@gcc.gnu.org
X-SW-Source: 2007-05/txt/msg00276.txt.bz2

A colleague and I have developed a fancy compiler for a new sort of
advanced numeric programming language. The output of this compiler is
C source code. Although optimized in some respects, this C is somewhat
bizarre in others. In particular, it defines gobs of new structure
types and gobs of very, very short functions, and no pointers are used.

It should be possible, using the optimization techniques already
present in GCC, for very tight machine code to be generated from this
admittedly strange FORTRAN-style C source code. But instead, the
assembly code GCC generates is full of unnecessary data shuffling. So
much data shuffling that it outweighs the actual useful arithmetic
instructions by a factor of hundreds, causing a slowdown of similar
magnitude in the generated executable.
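
To make the code style described above concrete, here is a minimal sketch of pointer-free C built from small structure types and very short functions that pass and return structures by value. Every name in it (vec2, state, vec2_add, vec2_scale, euler_step, simulate) is hypothetical, and the Euler-step arithmetic is an assumption chosen only for illustration; none of it is taken from particle1.c.

```c
#include <assert.h>

/* Hypothetical names throughout; nothing here is copied from particle1.c. */
struct vec2  { double x; double y; };
struct state { struct vec2 pos; struct vec2 vel; };

/* Very short helpers that take and return structures by value. */
static struct vec2 vec2_add(struct vec2 a, struct vec2 b)
{
    struct vec2 r;
    r.x = a.x + b.x;
    r.y = a.y + b.y;
    return r;
}

static struct vec2 vec2_scale(double k, struct vec2 a)
{
    struct vec2 r;
    r.x = k * a.x;
    r.y = k * a.y;
    return r;
}

/* One explicit Euler step (assumed for illustration):
   pos += dt * vel, vel unchanged. */
static struct state euler_step(struct state s, double dt)
{
    struct state r;
    r.pos = vec2_add(s.pos, vec2_scale(dt, s.vel));
    r.vel = s.vel;
    return r;
}

/* Driver: applies `steps` Euler steps; no pointers are used anywhere. */
struct state simulate(struct state s, int steps, double dt)
{
    int i;
    assert(steps >= 0);
    for (i = 0; i < steps; i++)
        s = euler_step(s, dt);
    return s;
}
```

Calling simulate() on a state with pos = (0, 0) and vel = (1, 2), for 4 steps with dt = 0.25, yields pos = (1, 2). In code of this shape the useful work per step is two multiplies and two adds; everything else is structure copying that an optimizer would ideally remove once the helpers are inlined.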

The poor optimization is present no matter what we try, across every
GCC version and every combination of optimization flags we tested,
although it does seem to be a little better in GCC 4.2.

What I'm hoping for is one of the following:

 - Some new GCC option magic that would get this all optimized.

 - Some small change we could make to the generated C sources that
   would cause it to be optimized well. (Add some magic __attribute__
   somewhere.)

 - Some other magic (rebuild GCC with build option XXX, or patch the
   GCC sources *here* and *here*) that would make it optimize well.

 - Some combination of the above.

 - A pointer to some other compiler (horrors!) that would optimize
   this well.

The C sources and generated assembly are too long to attach below.
Instead, I am making them available at

  http://www.bcl.hamilton.ie/~barak/stalingrad-vs-gcc/

Below are notes that include detailed version information on the
compilers used. In the notes below we used

  -O2 -freg-struct-return -fomit-frame-pointer -mfpmath=sse -msse3

but changing these flags does not seem to improve the results.

Our thanks, to anyone who takes up the challenge, for looking at and
thinking about this issue.
--
Barak A. Pearlmutter
 Hamilton Institute & Dept Comp Sci, NUI Maynooth, Co. Kildare, Ireland
 http://www.bcl.hamilton.ie/~barak/

----------------------------------------------------------------
--- NOTES ---
----------------------------------------------------------------

$ gcc-4.1 -v
Using built-in specs.
Target: i486-linux-gnu
Configured with: ../src/configure -v --enable-languages=c,c++,fortran,objc,obj-c++,treelang --prefix=/usr --enable-shared --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --enable-nls --program-suffix=-4.1 --enable-__cxa_atexit --enable-clocale=gnu --enable-libstdcxx-debug --enable-mpfr --with-tune=i686 --enable-checking=release i486-linux-gnu
Thread model: posix
gcc version 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)

$ gcc-4.1 -S -O2 -Wall -freg-struct-return -fomit-frame-pointer -mfpmath=sse -msse3 particle1.c
particle1.c:10763: warning: 'f95' defined but not used
particle1.c:10775: warning: 'f110' defined but not used
particle1.c:10788: warning: 'f126' defined but not used
particle1.c:10887: warning: 'f273' defined but not used
particle1.c:10888: warning: 'f274' defined but not used
particle1.c:10889: warning: 'f275' defined but not used
particle1.c:10890: warning: 'f277' defined but not used
particle1.c:12456: warning: 'f2456' defined but not used
particle1.c:12478: warning: 'f2482' defined but not used
particle1.c:12583: warning: 'f2623' defined but not used
particle1.c:12631: warning: 'f2690' defined but not used
particle1.c:12678: warning: 'f2752' defined but not used
particle1.c:12720: warning: 'f2828' defined but not used

$ mv particle1.s particle1-gcc41.s

$ gcc-4.2 -v
Using built-in specs.
Target: i486-linux-gnu
Configured with: ../src/configure -v --enable-languages=c,c++,fortran,objc,obj-c++,treelang --prefix=/usr --enable-shared --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --enable-nls --with-gxx-include-dir=/usr/include/c++/4.2 --program-suffix=-4.2 --enable-clocale=gnu --enable-libstdcxx-debug --enable-mpfr --enable-targets=all --disable-werror --enable-checking=release --build=i486-linux-gnu --host=i486-linux-gnu --target=i486-linux-gnu
Thread model: posix
gcc version 4.2.1 20070525 (prerelease) (Debian 4.2-20070525-1)

$ gcc-4.2 -S -O2 -Wall -freg-struct-return -fomit-frame-pointer -mfpmath=sse -msse3 particle1.c
particle1.c:10763: warning: 'f95' defined but not used
particle1.c:10775: warning: 'f110' defined but not used
particle1.c:10788: warning: 'f126' defined but not used
particle1.c:10887: warning: 'f273' defined but not used
particle1.c:10888: warning: 'f274' defined but not used
particle1.c:10889: warning: 'f275' defined but not used
particle1.c:10890: warning: 'f277' defined but not used
particle1.c:12456: warning: 'f2456' defined but not used
particle1.c:12478: warning: 'f2482' defined but not used
particle1.c:12583: warning: 'f2623' defined but not used
particle1.c:12631: warning: 'f2690' defined but not used
particle1.c:12678: warning: 'f2752' defined but not used
particle1.c:12720: warning: 'f2828' defined but not used

$ mv particle1.s particle1-gcc42.s

$ gcc-2.95 -v
Reading specs from /usr/lib/gcc-lib/i486-linux-gnu/2.95.4/specs
gcc version 2.95.4 20011002 (Debian prerelease)

$ gcc-2.95 -S -O2 -Wall -freg-struct-return -fomit-frame-pointer -mfpmath=sse -msse3 particle1.c
cc1: Invalid option `fpmath=sse'
cc1: Invalid option `sse3'
particle1.c: In function `write_real':
particle1.c:7: warning: use of `l' length character with `g' type character
particle1.c: At top level:
particle1.c:10763: warning: `f95' defined but not used
particle1.c:10775: warning: `f110' defined but not used
particle1.c:10788: warning: `f126' defined but not used
particle1.c:10887: warning: `f273' defined but not used
particle1.c:10888: warning: `f274' defined but not used
particle1.c:10889: warning: `f275' defined but not used
particle1.c:10890: warning: `f277' defined but not used
particle1.c:12456: warning: `f2456' defined but not used
particle1.c:12478: warning: `f2482' defined but not used
particle1.c:12583: warning: `f2623' defined but not used
particle1.c:12631: warning: `f2690' defined but not used
particle1.c:12678: warning: `f2752' defined but not used
particle1.c:12720: warning: `f2828' defined but not used

$ mv particle1.s particle1-gcc295.s

$ gcc-3.3 -v
Reading specs from /usr/lib/gcc-lib/i486-linux-gnu/3.3.6/specs
Configured with: ../src/configure -v --enable-languages=c,c++ --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-gxx-include-dir=/usr/include/c++/3.3 --enable-shared --enable-__cxa_atexit --with-system-zlib --enable-nls --without-included-gettext --enable-clocale=gnu --enable-debug i486-linux-gnu
Thread model: posix
gcc version 3.3.6 (Debian 1:3.3.6-15)

$ gcc-3.3 -S -O2 -Wall -freg-struct-return -fomit-frame-pointer -mfpmath=sse -msse3 particle1.c
particle1.c:10763: warning: `f95' defined but not used
particle1.c:10775: warning: `f110' defined but not used
particle1.c:10788: warning: `f126' defined but not used
particle1.c:10887: warning: `f273' defined but not used
particle1.c:10888: warning: `f274' defined but not used
particle1.c:10889: warning: `f275' defined but not used
particle1.c:10890: warning: `f277' defined but not used
particle1.c:12456: warning: `f2456' defined but not used
particle1.c:12478: warning: `f2482' defined but not used
particle1.c:12583: warning: `f2623' defined but not used
particle1.c:12631: warning: `f2690' defined but not used
particle1.c:12678: warning: `f2752' defined but not used
particle1.c:12720: warning: `f2828' defined but not used

$ mv particle1.s particle1-gcc33.s

$ gcc-3.4 -v
Reading specs from /usr/lib/gcc/i486-linux-gnu/3.4.6/specs
Configured with: ../src/configure -v --enable-languages=c,c++,f77,pascal --prefix=/usr --libexecdir=/usr/lib --with-gxx-include-dir=/usr/include/c++/3.4 --enable-shared --with-system-zlib --enable-nls --without-included-gettext --program-suffix=-3.4 --enable-__cxa_atexit --enable-clocale=gnu --enable-libstdcxx-debug --with-tune=i686 i486-linux-gnu
Thread model: posix
gcc version 3.4.6 (Debian 3.4.6-5)

$ gcc-3.4 -S -O2 -Wall -freg-struct-return -fomit-frame-pointer -mfpmath=sse -msse3 particle1.c
particle1.c:10763: warning: 'f95' defined but not used
particle1.c:10775: warning: 'f110' defined but not used
particle1.c:10788: warning: 'f126' defined but not used
particle1.c:10887: warning: 'f273' defined but not used
particle1.c:10888: warning: 'f274' defined but not used
particle1.c:10889: warning: 'f275' defined but not used
particle1.c:10890: warning: 'f277' defined but not used
particle1.c:12456: warning: 'f2456' defined but not used
particle1.c:12478: warning: 'f2482' defined but not used
particle1.c:12583: warning: 'f2623' defined but not used
particle1.c:12631: warning: 'f2690' defined but not used
particle1.c:12678: warning: 'f2752' defined but not used
particle1.c:12720: warning: 'f2828' defined but not used

$ mv particle1.s particle1-gcc34.s

$ gcc -o particle1 particle1.c -lm

$ ./particle1
0.01999188620615792

$ ls -l
-rw-rw-r-- 1 barak barak    6764 2007-05-27 14:38 NOTES
-rwxrwxr-x 1 barak barak  736714 2007-05-27 13:08 particle1
-rw-r--r-- 1 barak barak  901853 2007-05-27 12:14 particle1.c
-rw-r--r-- 1 barak barak 2383226 2007-05-27 12:41 particle1-gcc295.s
-rw-r--r-- 1 barak barak 7291988 2007-05-27 12:46 particle1-gcc33.s
-rw-r--r-- 1 barak barak 8005026 2007-05-27 12:55 particle1-gcc34.s
-rw-rw-r-- 1 barak barak 1703481 2007-05-27 12:33 particle1-gcc41.s
-rw-r--r-- 1 barak barak 1000722 2007-05-27 12:36 particle1-gcc42.s

$ wc --lines particle1.c
12825 particle1.c

$ wc --lines *.s
  163922 particle1-gcc295.s
  343012 particle1-gcc33.s
  353057 particle1-gcc34.s
  100697 particle1-gcc41.s
   47030 particle1-gcc42.s
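
As an illustration of the "magic __attribute__" idea raised in the message, the sketch below marks a tiny by-value helper with GCC's always_inline function attribute, which asks the compiler to inline the function regardless of its usual inlining limits. The type and function names (pair, pair_add, accumulate) are hypothetical, and whether this attribute actually reduces the data shuffling seen in particle1.c is untested here; it is only one candidate of the kind the message asks about.

```c
#include <assert.h>

/* Hypothetical type; not taken from particle1.c. */
struct pair { double x; double y; };

/* always_inline is a GCC function attribute. Applying it to
   generated helpers like this one is an assumption, not a
   confirmed fix for the problem described in the message. */
static inline __attribute__((always_inline))
struct pair pair_add(struct pair a, struct pair b)
{
    struct pair r;
    r.x = a.x + b.x;
    r.y = a.y + b.y;
    return r;
}

/* Adds the constant pair (1, 2) to an accumulator n times and
   returns the sum of the two components of the result. */
double accumulate(int n)
{
    struct pair acc = { 0.0, 0.0 };
    struct pair step = { 1.0, 2.0 };
    int i;
    assert(n >= 0);
    for (i = 0; i < n; i++)
        acc = pair_add(acc, step);
    return acc.x + acc.y;
}
```

For example, accumulate(4) returns 12.0. Comparing the output of gcc -S with and without the attribute on a small file like this would be one way to see whether the structure copies disappear.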