* Time for a compile server status update?
@ 2003-12-13 17:58 Dan Kegel
From: Dan Kegel @ 2003-12-13 17:58 UTC (permalink / raw)
To: Mike Stump, per; +Cc: GCC Mailing List
Hi guys,
it's been a whole six weeks since we last heard news
about the compile server (http://gcc.gnu.org/ml/gcc/2003-10/msg01398.html et seq.)
Could you post a short note with where things are now?
I recently chatted with a guy who used to do compilers
at Borland; he said he'd done a speed benchmark comparing
gcc with msvc, and msvc compiled 11 times faster than gcc
when you turn on all the tricks (including incremental compiling).
He was of the opinion that gcc needed to be burnt to the
ground and rewritten to be purely x86 oriented, etc., etc.
I bit my tongue. I'm really looking forward to being able
to say "The 3.5 release will compile packages with large headers
2x faster than previous releases thanks to the compile server",
or something like that.
Also, it would be cool if you could say a few words about
what it would take to someday run the compile server in parallel
on multiple computers, distcc style. The obvious "just use
distcc" probably won't help as much as one would think
because distcc runs the preprocessor before shipping the
code out, whereas (judging by
http://per.bothner.com/papers/GccSummit03-slides/) your
compile server really wants the original source files.
Presumably distcc could send over source files if
the distcc server hadn't seen them before (based on a
hash of the file). In large installations, you'd want
to go just on a hash of the file's contents, not on its
absolute path, since multiple users might be compiling
the same file.
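The content-hash scheme described above could be sketched roughly as follows. This is a toy illustration only; the `SourceCache` class, `ship_file` helper, and choice of SHA-256 are all hypothetical, not part of distcc or the compile server:

```python
import hashlib

class SourceCache:
    """Toy server-side cache keyed by a hash of file contents,
    not the file's absolute path, so the same header compiled
    by many users is stored only once."""

    def __init__(self):
        self._files = {}  # digest -> file contents

    def has(self, digest):
        return digest in self._files

    def store(self, digest, contents):
        self._files[digest] = contents

    def get(self, digest):
        return self._files[digest]

def content_hash(contents):
    # Hash only the bytes, so identical files from different
    # users/paths collapse to one cache entry.
    return hashlib.sha256(contents).hexdigest()

def ship_file(cache, contents):
    """Client side: send the hash first; ship the bytes only if
    the server has never seen this content before."""
    digest = content_hash(contents)
    if not cache.has(digest):
        cache.store(digest, contents)
        return True   # bytes were transferred
    return False      # cache hit, nothing sent
```

On a second compile of the same header, `ship_file` returns False and no source bytes cross the wire.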
Anyway, we're all salivating over this, so speak on up.
Maybe it'll help draw more people to help work on it.
Thanks!
- Dan
* Re: Time for a compile server status update?
@ 2003-12-14 7:03 ` Per Bothner
From: Per Bothner @ 2003-12-14 7:03 UTC (permalink / raw)
To: Dan Kegel; +Cc: Mike Stump, GCC Mailing List
Dan Kegel wrote:
> it's been a whole six weeks since we last heard news
> about the compile server
> (http://gcc.gnu.org/ml/gcc/2003-10/msg01398.html et seq.)
> Could you post a short note with where things are now?
Yes, it is time for an update. Partly because Monday is my last day at
Apple (due to policies about the maximum length of contracting), so I will
no longer (for now) be able to spend much time on the compile server.
However, Mike Stump will still be working on it for most of his time,
and others, including Steve Naroff, will be helping out.
First, correctness/safety: The compile server handles real C programs
and real header files (which can be pretty ugly) quite well. We've
compiled 125 .c files (a large subset of those that make up cc1) with
their associated #included files with a single invocation of cc1 in
server mode, on both GNU/Linux (Fedora 1.0) and Mac OS X Panther. The
compile succeeds and produces .o files that can be linked together to
produce a working cc1. There are some regressions in the testsuite
(which is run in non-server mode), and there are known ways you can
trick the compile server into "doing the wrong thing", so we're not quite
there yet, but this is very encouraging.
This is for C, which is what I have been working on. Mike Stump has
been working on C++, which is more complicated. He also has a
potentially faster design for "conditional symbol tables", which speeds
up header file re-use. Mike can give an update on these.
Next, speed: This depends very much on the "work mix". A simple .c
file that does nothing except #include <Carbon/Carbon.h> (which includes
most of the headers for Apple's older Carbon GUI framework) achieves
close to 100% reuse (i.e. header file fragments whose declarations can
be reused without requiring reparsing). The exception is a few system
header files like stdarg.h and limits.h. The actual speed-up is about
10x for second and subsequent compiles using the server, compared to
using the same cc1 executable in non-server mode. Mike has some
preliminary numbers suggesting we can do even better.
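The "reuse without reparsing" idea above might be pictured with a minimal sketch: cache a header fragment's parse result together with the macro definitions it depended on, and reuse it only when those macros are unchanged. The class and field names here are illustrative assumptions, not the compile server's actual implementation:

```python
class FragmentCache:
    """Illustrative cache of parsed header-file fragments.
    A fragment's parse result is reused only if every macro it
    read still has the same definition, mirroring the idea of
    conditional reuse; all names here are hypothetical."""

    def __init__(self):
        # (filename, fragment_id) -> (parse_result, macro_snapshot)
        self._cache = {}

    def store(self, key, result, macros_read):
        # Snapshot the macro definitions the fragment depended on.
        self._cache[key] = (result, dict(macros_read))

    def lookup(self, key, macro_env):
        entry = self._cache.get(key)
        if entry is None:
            return None          # never parsed before
        result, snapshot = entry
        # Reuse only if the macros the fragment read are unchanged;
        # otherwise the fragment must be reparsed from scratch.
        if all(macro_env.get(name) == value
               for name, value in snapshot.items()):
            return result
        return None
```

A fragment guarded by `#ifndef FOO_H` whose environment is unchanged hits the cache; redefine any macro it read and the lookup misses, forcing a reparse.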
But of course this is for the best case, where compilation times are
dominated by header files that are included many times in different
compilation units. Such applications are the ones where we expect the
most benefit from the compile server - just as for pre-compiled headers.
We also get a definite speedup compiling 14 .c files using the server.
These are the first 14 files mentioned in the TEST_TARGET rule in the
gcc/Makefile, and are mostly part of cpplib, so these files are medium
size with modest header file reuse. The speedup is around 20% if you
look at "user" time as reported by 'time' (adding server and client time).
If I run 121 .c files (all of TEST_TARGET in gcc/Makefile) the result is
not so good. We don't expect much of a benefit, because many of the .c
files are huge, and there aren't that many header files to re-use. So
on my Powerbook it comes out even. However, on my GNU/Linux server
using the server appears to be about 40% slower in this case. (However,
in absolute numbers the latter is faster.) My best guess is that we're
getting hit by the poor locality (page and especially cache) of GCC's
data structures. This is an issue even when not using the compile server,
but it becomes worse with the compile server just because we now have so
much more stuff saved in memory. So any efforts to improve GCC's cache
locality will probably benefit the compile server even more.
The gcc/Makefile has a couple of rules for trying out the compile
server: Do 'make start-server' to start the server. (This works on
GNU/Linux, but may require some tweaking on other platforms since it
starts up cc1 directly without going via the gcc command.) To compile
the files listed in TEST_TARGET, do (in a different window) 'make
recompile-with-server'.
> Also, it would be cool if you could say a few words about
> what it would take to someday run the compile server in parallel
> on multiple computers, distcc style. The obvious "just use
> distcc" probably won't help as much as one would think
> because distcc runs the preprocessor before shipping the
> code out, whereas (judging by
> http://per.bothner.com/papers/GccSummit03-slides/) your
> compile server really wants the original source files.
> Presumably distcc could send over source files if
> the distcc server hadn't seen them before (based on a
> hash of the file). In large installations, you'd want
> to go just on a hash of the file's contents, not on its
> absolute path, since multiple users might be compiling
> the same file.
I think you've said it. A shared/distributed file system also makes
sense. Our initial "target" is a developer with a laptop or desktop
machine with one CPU or a small number of CPUs.
--
--Per Bothner
per@bothner.com http://per.bothner.com/