public inbox for gcc-help@gcc.gnu.org
* Re: Bizarrely Poor Code from Bizarre Machine-Generated C Sources
@ 2007-05-29 16:10 Barak A. Pearlmutter
  2007-05-29 23:39 ` Andrew Haley
  0 siblings, 1 reply; 7+ messages in thread
From: Barak A. Pearlmutter @ 2007-05-29 16:10 UTC (permalink / raw)
  To: gcc-help

Success!

Some working magic seems to be this:

    gcc -s -o particle1 \
        -O3 \
        -march=k8 \
        -mfpmath=sse \
        -finline-limit=100000 \
        --param large-function-insns=1000000 \
        --param inline-unit-growth=1000000 \
        --param sra-field-structure-ratio=0 \
        particle1.c -lm

although it looks like -Os gives an additional improvement.

This (with GCC 4.1) shrinks the code to about 16k from nearly 1M, and
cuts runtime by a factor of about 2700, compared to plain -O3.

Further improvements welcome.

I'd also suggest adding a section to the GCC documentation on "how to
use GCC as a back-end to another compiler" which gives some typical
magic options like the above that would be useful in circumstances
like these.
--
Barak A. Pearlmutter <barak@cs.nuim.ie>
 Hamilton Institute & Dept Comp Sci, NUI Maynooth, Co. Kildare, Ireland
 http://www.bcl.hamilton.ie/~barak/

* Bizarrely Poor Code from Bizarre Machine-Generated C Sources
@ 2007-05-27 15:11 Barak A. Pearlmutter
  2007-05-27 18:22 ` Rask Ingemann Lambertsen
  0 siblings, 1 reply; 7+ messages in thread
From: Barak A. Pearlmutter @ 2007-05-27 15:11 UTC (permalink / raw)
  To: gcc-help

A colleague and I have developed a fancy compiler for a new sort of
advanced numeric programming language.  The output of this compiler is
C source code.  Although optimized in some respects, this C is
somewhat bizarre in others.  In particular, it defines gobs of new
structure types and gobs of very, very short functions, and it uses
no pointers.  It should be possible, using the optimization
techniques already present in GCC, to generate very tight machine
code from this admittedly strange FORTRAN-style C source code.
But instead, the assembly code GCC generates is full of unnecessary
data shuffling: so much that it outnumbers the actual useful
arithmetic instructions by a factor of hundreds, causing a slowdown
in the generated executable of similar magnitude.  The poor
optimization is present no matter what we try, across all the
versions of GCC and all the optimization flags we tested, although
GCC 4.2 does seem to be a little better.
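
To give a flavour of the style described above, here is a minimal
sketch.  It is purely illustrative: the names s0, s1, f0 ... f3 are
invented here and are not taken from particle1.c, and the real
sources are far larger.  It only assumes what is stated above: many
small structure types, very short functions, everything passed and
returned by value, and no pointers.

```c
/* Hypothetical illustration only; nothing here comes from particle1.c. */
struct s0 { double a; double b; };        /* a small aggregate of reals    */
struct s1 { struct s0 p; struct s0 v; };  /* nested aggregate, no pointers */

/* Constructor-like function: builds a struct and returns it by value. */
static struct s0 f0(double a, double b)
{
    struct s0 r;
    r.a = a;
    r.b = b;
    return r;
}

/* Componentwise sum of two aggregates. */
static struct s0 f1(struct s0 x, struct s0 y)
{
    return f0(x.a + y.a, x.b + y.b);
}

/* Scale an aggregate by a real. */
static struct s0 f2(double k, struct s0 x)
{
    return f0(k * x.a, k * x.b);
}

/* One tiny "step": p + dt * v, with v copied through unchanged. */
static struct s1 f3(struct s1 s, double dt)
{
    struct s1 r;
    r.p = f1(s.p, f2(dt, s.v));
    r.v = s.v;
    return r;
}
```

Unless every call is inlined and every struct is split into scalars,
each call above copies whole structures in and out, which is the kind
of data shuffling in question.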

What I'm hoping for is one of the following:

 - Some new GCC option magic that would get this all optimized.

 - Some small change we could make to the generated C sources that
   would cause it to be optimized well.  (Add some magic __attribute__
   somewhere.)

 - Some other magic (rebuild GCC with build option XXX, or patch the
   GCC sources *here* and *here*) that would make it optimize well.

 - Some combination of the above.

 - A pointer to some other compiler (horrors!) that would optimize
   this well.
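
As an example of the second item, one candidate is GCC's
always_inline function attribute applied to each short function.
The sketch below is an assumption about what such a change might
look like, not something tested on particle1.c; the macro
FORCE_INLINE, the type vec2, and the function names are hypothetical.

```c
/* Hypothetical sketch; all names are invented for illustration. */

/* GCC-specific: ask that the function be inlined at every call site,
   regardless of the inliner's size heuristics. */
#define FORCE_INLINE static inline __attribute__((always_inline))

struct vec2 { double x; double y; };

FORCE_INLINE struct vec2 vec2_make(double x, double y)
{
    struct vec2 r;
    r.x = x;
    r.y = y;
    return r;
}

FORCE_INLINE struct vec2 vec2_add(struct vec2 a, struct vec2 b)
{
    return vec2_make(a.x + b.x, a.y + b.y);
}

FORCE_INLINE double vec2_dot(struct vec2 a, struct vec2 b)
{
    return a.x * b.x + a.y * b.y;
}
```

Whether this removes the structure copies as well is a separate
question: the attribute only forces the inlining, and the copies
still have to be eliminated by later passes.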

The C sources, and generated assembly, are too long to attach below.
Instead, I am making them available at

 http://www.bcl.hamilton.ie/~barak/stalingrad-vs-gcc/

Below are notes that include detailed version information on the
compilers used.  For the notes we compiled with
 -O2 -freg-struct-return -fomit-frame-pointer -mfpmath=sse -msse3
but changing these flags doesn't seem to improve the results.

Our thanks, to anyone who takes up the challenge, for looking at and
thinking about this issue.
--
Barak A. Pearlmutter <barak@cs.nuim.ie>
 Hamilton Institute & Dept Comp Sci, NUI Maynooth, Co. Kildare, Ireland
 http://www.bcl.hamilton.ie/~barak/

----------------------------------------------------------------
--- NOTES ---
----------------------------------------------------------------

$ gcc-4.1 -v

Using built-in specs.
Target: i486-linux-gnu
Configured with: ../src/configure -v --enable-languages=c,c++,fortran,objc,obj-c++,treelang --prefix=/usr --enable-shared --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --enable-nls --program-suffix=-4.1 --enable-__cxa_atexit --enable-clocale=gnu --enable-libstdcxx-debug --enable-mpfr --with-tune=i686 --enable-checking=release i486-linux-gnu
Thread model: posix
gcc version 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)

$ gcc-4.1 -S -O2 -Wall -freg-struct-return -fomit-frame-pointer -mfpmath=sse -msse3 particle1.c

particle1.c:10763: warning: 'f95' defined but not used
particle1.c:10775: warning: 'f110' defined but not used
particle1.c:10788: warning: 'f126' defined but not used
particle1.c:10887: warning: 'f273' defined but not used
particle1.c:10888: warning: 'f274' defined but not used
particle1.c:10889: warning: 'f275' defined but not used
particle1.c:10890: warning: 'f277' defined but not used
particle1.c:12456: warning: 'f2456' defined but not used
particle1.c:12478: warning: 'f2482' defined but not used
particle1.c:12583: warning: 'f2623' defined but not used
particle1.c:12631: warning: 'f2690' defined but not used
particle1.c:12678: warning: 'f2752' defined but not used
particle1.c:12720: warning: 'f2828' defined but not used

$ mv particle1.s particle1-gcc41.s 



$ gcc-4.2 -v

Using built-in specs.
Target: i486-linux-gnu
Configured with: ../src/configure -v --enable-languages=c,c++,fortran,objc,obj-c++,treelang --prefix=/usr --enable-shared --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --enable-nls --with-gxx-include-dir=/usr/include/c++/4.2 --program-suffix=-4.2 --enable-clocale=gnu --enable-libstdcxx-debug --enable-mpfr --enable-targets=all --disable-werror --enable-checking=release --build=i486-linux-gnu --host=i486-linux-gnu --target=i486-linux-gnu
Thread model: posix
gcc version 4.2.1 20070525 (prerelease) (Debian 4.2-20070525-1)

$ gcc-4.2 -S -O2 -Wall -freg-struct-return -fomit-frame-pointer -mfpmath=sse -msse3 particle1.c

particle1.c:10763: warning: 'f95' defined but not used
particle1.c:10775: warning: 'f110' defined but not used
particle1.c:10788: warning: 'f126' defined but not used
particle1.c:10887: warning: 'f273' defined but not used
particle1.c:10888: warning: 'f274' defined but not used
particle1.c:10889: warning: 'f275' defined but not used
particle1.c:10890: warning: 'f277' defined but not used
particle1.c:12456: warning: 'f2456' defined but not used
particle1.c:12478: warning: 'f2482' defined but not used
particle1.c:12583: warning: 'f2623' defined but not used
particle1.c:12631: warning: 'f2690' defined but not used
particle1.c:12678: warning: 'f2752' defined but not used
particle1.c:12720: warning: 'f2828' defined but not used

$ mv particle1.s particle1-gcc42.s 



$ gcc-2.95 -v

Reading specs from /usr/lib/gcc-lib/i486-linux-gnu/2.95.4/specs
gcc version 2.95.4 20011002 (Debian prerelease)

$ gcc-2.95 -S -O2 -Wall -freg-struct-return -fomit-frame-pointer -mfpmath=sse -msse3 particle1.c

cc1: Invalid option `fpmath=sse'
cc1: Invalid option `sse3'
particle1.c: In function `write_real':
particle1.c:7: warning: use of `l' length character with `g' type character
particle1.c: At top level:
particle1.c:10763: warning: `f95' defined but not used
particle1.c:10775: warning: `f110' defined but not used
particle1.c:10788: warning: `f126' defined but not used
particle1.c:10887: warning: `f273' defined but not used
particle1.c:10888: warning: `f274' defined but not used
particle1.c:10889: warning: `f275' defined but not used
particle1.c:10890: warning: `f277' defined but not used
particle1.c:12456: warning: `f2456' defined but not used
particle1.c:12478: warning: `f2482' defined but not used
particle1.c:12583: warning: `f2623' defined but not used
particle1.c:12631: warning: `f2690' defined but not used
particle1.c:12678: warning: `f2752' defined but not used
particle1.c:12720: warning: `f2828' defined but not used

$ mv particle1.s particle1-gcc295.s 



$ gcc-3.3 -v

Reading specs from /usr/lib/gcc-lib/i486-linux-gnu/3.3.6/specs
Configured with: ../src/configure -v --enable-languages=c,c++ --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-gxx-include-dir=/usr/include/c++/3.3 --enable-shared --enable-__cxa_atexit --with-system-zlib --enable-nls --without-included-gettext --enable-clocale=gnu --enable-debug i486-linux-gnu
Thread model: posix
gcc version 3.3.6 (Debian 1:3.3.6-15)

$ gcc-3.3 -S -O2 -Wall -freg-struct-return -fomit-frame-pointer -mfpmath=sse -msse3 particle1.c

particle1.c:10763: warning: `f95' defined but not used
particle1.c:10775: warning: `f110' defined but not used
particle1.c:10788: warning: `f126' defined but not used
particle1.c:10887: warning: `f273' defined but not used
particle1.c:10888: warning: `f274' defined but not used
particle1.c:10889: warning: `f275' defined but not used
particle1.c:10890: warning: `f277' defined but not used
particle1.c:12456: warning: `f2456' defined but not used
particle1.c:12478: warning: `f2482' defined but not used
particle1.c:12583: warning: `f2623' defined but not used
particle1.c:12631: warning: `f2690' defined but not used
particle1.c:12678: warning: `f2752' defined but not used
particle1.c:12720: warning: `f2828' defined but not used

$ mv particle1.s particle1-gcc33.s 



$ gcc-3.4 -v

Reading specs from /usr/lib/gcc/i486-linux-gnu/3.4.6/specs
Configured with: ../src/configure -v --enable-languages=c,c++,f77,pascal --prefix=/usr --libexecdir=/usr/lib --with-gxx-include-dir=/usr/include/c++/3.4 --enable-shared --with-system-zlib --enable-nls --without-included-gettext --program-suffix=-3.4 --enable-__cxa_atexit --enable-clocale=gnu --enable-libstdcxx-debug --with-tune=i686 i486-linux-gnu
Thread model: posix
gcc version 3.4.6 (Debian 3.4.6-5)

$ gcc-3.4 -S -O2 -Wall -freg-struct-return -fomit-frame-pointer -mfpmath=sse -msse3 particle1.c

particle1.c:10763: warning: 'f95' defined but not used
particle1.c:10775: warning: 'f110' defined but not used
particle1.c:10788: warning: 'f126' defined but not used
particle1.c:10887: warning: 'f273' defined but not used
particle1.c:10888: warning: 'f274' defined but not used
particle1.c:10889: warning: 'f275' defined but not used
particle1.c:10890: warning: 'f277' defined but not used
particle1.c:12456: warning: 'f2456' defined but not used
particle1.c:12478: warning: 'f2482' defined but not used
particle1.c:12583: warning: 'f2623' defined but not used
particle1.c:12631: warning: 'f2690' defined but not used
particle1.c:12678: warning: 'f2752' defined but not used
particle1.c:12720: warning: 'f2828' defined but not used

$ mv particle1.s particle1-gcc34.s 



$ gcc -o particle1 particle1.c -lm

$ ./particle1
0.01999188620615792


$ ls -l
-rw-rw-r-- 1 barak barak    6764 2007-05-27 14:38 NOTES
-rwxrwxr-x 1 barak barak  736714 2007-05-27 13:08 particle1
-rw-r--r-- 1 barak barak  901853 2007-05-27 12:14 particle1.c
-rw-r--r-- 1 barak barak 2383226 2007-05-27 12:41 particle1-gcc295.s
-rw-r--r-- 1 barak barak 7291988 2007-05-27 12:46 particle1-gcc33.s
-rw-r--r-- 1 barak barak 8005026 2007-05-27 12:55 particle1-gcc34.s
-rw-rw-r-- 1 barak barak 1703481 2007-05-27 12:33 particle1-gcc41.s
-rw-r--r-- 1 barak barak 1000722 2007-05-27 12:36 particle1-gcc42.s

$ wc --lines particle1.c
   12825 particle1.c

$ wc --lines *.s
  163922 particle1-gcc295.s
  343012 particle1-gcc33.s
  353057 particle1-gcc34.s
  100697 particle1-gcc41.s
   47030 particle1-gcc42.s



Thread overview: 7+ messages
2007-05-29 16:10 Bizarrely Poor Code from Bizarre Machine-Generated C Sources Barak A. Pearlmutter
2007-05-29 23:39 ` Andrew Haley
  -- strict thread matches above, loose matches on Subject: below --
2007-05-27 15:11 Barak A. Pearlmutter
2007-05-27 18:22 ` Rask Ingemann Lambertsen
2007-05-28  7:23   ` Barak A. Pearlmutter
2007-05-28  9:42     ` Mattias Engdegård
2007-05-28 11:26     ` Rask Ingemann Lambertsen
