* Time for a compile server status update?
@ 2003-12-13 17:58 Dan Kegel
From: Dan Kegel @ 2003-12-13 17:58 UTC (permalink / raw)
To: Mike Stump, per; +Cc: GCC Mailing List
Hi guys,
it's been a whole six weeks since we last heard news
about the compile server (http://gcc.gnu.org/ml/gcc/2003-10/msg01398.html et seq.)
Could you post a short note with where things are now?
I recently chatted with a guy who used to do compilers
at Borland; he said he'd done a speed benchmark comparing
gcc with msvc, and msvc compiled 11 times faster than gcc
when you turn on all the tricks (including incremental compiling).
He was of the opinion that gcc needed to be burnt to the
ground and rewritten to be purely x86 oriented, etc., etc.
I bit my tongue. I'm really looking forward to being able
to say "The 3.5 release will compile packages with large headers
2x faster than previous releases thanks to the compile server",
or something like that.
Also, it would be cool if you could say a few words about
what it would take to someday run the compile server in parallel
on multiple computers, distcc style. The obvious "just use
distcc" probably won't help as much as one would think
because distcc runs the preprocessor before shipping the
code out, whereas (judging by
http://per.bothner.com/papers/GccSummit03-slides/) your
compile server really wants the original source files.
Presumably distcc could send over source files if
the distcc server hadn't seen them before (based on a
hash of the file). In large installations, you'd want
to go just on a hash of the file's contents, not on its
absolute path, since multiple users might be compiling
the same file.
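The content-hash scheme described above could be sketched roughly as follows. This is a toy illustration only; the `SourceCache` class, `ship_file` helper, and choice of SHA-256 are all hypothetical, not part of distcc or the compile server:

```python
import hashlib

class SourceCache:
    """Toy server-side cache keyed by a hash of file contents,
    not the file's absolute path, so the same header compiled
    by many users is stored only once."""

    def __init__(self):
        self._files = {}  # digest -> file contents

    def has(self, digest):
        return digest in self._files

    def store(self, digest, contents):
        self._files[digest] = contents

    def get(self, digest):
        return self._files[digest]

def content_hash(contents):
    # Hash only the bytes, so identical files from different
    # users/paths collapse to one cache entry.
    return hashlib.sha256(contents).hexdigest()

def ship_file(cache, contents):
    """Client side: send the hash first; ship the bytes only if
    the server has never seen this content before."""
    digest = content_hash(contents)
    if not cache.has(digest):
        cache.store(digest, contents)
        return True   # bytes were transferred
    return False      # cache hit, nothing sent
```

On a second compile of the same header, `ship_file` returns False and no source bytes cross the wire.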
Anyway, we're all salivating over this, so speak on up.
Maybe it'll help draw more people to help work on it.
Thanks!
- Dan
* Re: Time for a compile server status update?
@ 2003-12-14 7:03 ` Per Bothner
From: Per Bothner @ 2003-12-14 7:03 UTC (permalink / raw)
To: Dan Kegel; +Cc: Mike Stump, GCC Mailing List
Dan Kegel wrote:
> it's been a whole six weeks since we last heard news
> about the compile server
> (http://gcc.gnu.org/ml/gcc/2003-10/msg01398.html et seq.)
> Could you post a short note with where things are now?
Yes, it is time for an update. Partly because Monday is my last day at
Apple (due to policies about the maximum length of contracting), so I will
no longer (for now) be able to spend much time on the compile server.
However, Mike Stump will still be working on it for most of his time,
and others, including Steve Naroff, will be helping out.
First, correctness/safety: The compile server handles real C programs
and real header files (which can be pretty ugly) quite well. We've
compiled 125 .c files (a large subset of those that make up cc1) with
their associated #included files with a single invocation of cc1 in
server mode, on both GNU/Linux (Fedora 1.0) and Mac OS X Panther. The
compile succeeds and produces .o files that can be linked together to
produce a working cc1. There are some regressions in the testsuite
(which is run in non-server mode), and there are known ways you can
trick the compile server into "doing the wrong thing", so we're not quite
there yet, but this is very encouraging.
This is for C, which is what I have been working on. Mike Stump has
been working on C++, which is more complicated. He also has a
potentially faster design for "conditional symbol tables", which speeds
up header file re-use. Mike can give an update on these.
Next, speed: This depends very much on the "work mix". A simple .c
file that does nothing except #include <Carbon/Carbon.h> (which includes
most of the headers for Apple's older Carbon GUI framework) achieves
close to 100% reuse (i.e. header file fragments whose declarations can
be reused without requiring reparsing). The exception is a few system
header files like stdarg.h and limits.h. The actual speed-up is about
10x for second and subsequent compiles using the server, compared to
using the same cc1 executable in non-server mode. Mike has some
preliminary numbers suggesting we can do even better.
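The "reuse without reparsing" idea above might be pictured with a minimal sketch: cache a header fragment's parse result together with the macro definitions it depended on, and reuse it only when those macros are unchanged. The class and field names here are illustrative assumptions, not the compile server's actual implementation:

```python
class FragmentCache:
    """Illustrative cache of parsed header-file fragments.
    A fragment's parse result is reused only if every macro it
    read still has the same definition, mirroring the idea of
    conditional reuse; all names here are hypothetical."""

    def __init__(self):
        # (filename, fragment_id) -> (parse_result, macro_snapshot)
        self._cache = {}

    def store(self, key, result, macros_read):
        # Snapshot the macro definitions the fragment depended on.
        self._cache[key] = (result, dict(macros_read))

    def lookup(self, key, macro_env):
        entry = self._cache.get(key)
        if entry is None:
            return None          # never parsed before
        result, snapshot = entry
        # Reuse only if the macros the fragment read are unchanged;
        # otherwise the fragment must be reparsed from scratch.
        if all(macro_env.get(name) == value
               for name, value in snapshot.items()):
            return result
        return None
```

A fragment guarded by `#ifndef FOO_H` whose environment is unchanged hits the cache; redefine any macro it read and the lookup misses, forcing a reparse.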
But of course this is for the best case, where compilation times are
dominated by header files that are included many times in different
compilation units. Such applications are the ones where we expect the
most benefit from the compile server - just as for pre-compiled headers.
We also get a definite speedup compiling 14 .c files using the server.
These are the first 14 files mentioned in the TEST_TARGET rule in the
gcc/Makefile, and are mostly part of cpplib, so these files are medium
size with modest header file reuse. The speedup is around 20% if you
look at "user" time as reported by 'time' (adding server and client time).
If I run 121 .c files (all of TEST_TARGET in gcc/Makefile) the result is
not so good. We don't expect much of a benefit, because many of the .c
files are huge, and there aren't that many header files to re-use. So
on my Powerbook it comes out even. However, on my GNU/Linux server
using the server appears to be about 40% slower in this case. (However,
in absolute numbers the latter is faster.) My best guess is that we're
getting hit by the poor locality (page and especially cache) of GCC's
data structures. This is an issue even when not using the compile server,
but it becomes worse with the compile server just because we now have so
much more stuff saved in memory. So any efforts to improve GCC's cache
locality will probably benefit the compile server even more.
The gcc/Makefile has a couple of rules for trying out the compile
server: Do 'make start-server' to start the server. (This works on
GNU/Linux, but may require some tweaking on other platforms since it
starts up cc1 directly without going via the gcc command.) To compile
the files listed in TEST_TARGET, do (in a different window) 'make
recompile-with-server'.
> Also, it would be cool if you could say a few words about
> what it would take to someday run the compile server in parallel
> on multiple computers, distcc style. The obvious "just use
> distcc" probably won't help as much as one would think
> because distcc runs the preprocessor before shipping the
> code out, whereas (judging by
> http://per.bothner.com/papers/GccSummit03-slides/) your
> compile server really wants the original source files.
> Presumably distcc could send over source files if
> the distcc server hadn't seen them before (based on a
> hash of the file). In large installations, you'd want
> to go just on a hash of the file's contents, not on its
> absolute path, since multiple users might be compiling
> the same file.
I think you've said it. A shared/distributed file system also makes
sense. Our initial "target" is a developer with a laptop or desktop
machine with one CPU or a small number of CPUs.
--
--Per Bothner
per@bothner.com http://per.bothner.com/