Thoughts on LLVM and LTO

public inbox for gcc@gcc.gnu.org

* Thoughts on LLVM and LTO
@ 2005-11-22 16:20 Diego Novillo
  2005-11-22 16:37 ` Daniel Jacobowitz
                   ` (4 more replies)
  0 siblings, 5 replies; 59+ messages in thread
From: Diego Novillo @ 2005-11-22 16:20 UTC (permalink / raw)
  To: gcc

First off, regardless of what direction we choose to go, I think we are in 
a great position.  Finally, GCC will have all the obvious and standard 
technology that one reads in textbooks.  Not long ago, GCC didn't even 
build a flowgraph, and now here we are deciding what IPA technology we 
want to implement.

In the end, I don't think it really matters which way we go.  We are not 
doing advanced rocket science here.  Sure, the engineering will be tricky 
and convoluted.  But this technology is relatively mature and there are 
not very many variations on the subject.  The final result will be roughly 
the same.  Different shades of gray, and all that.

Right now, I am more concerned about the approach we take to get there.  I 
am a big proponent of evolution vs revolution, so any approach that 
involves starting from scratch gives me the willies.

In principle, I can't tell which approach will take the most effort.  Both 
seem to be missing X features that the other one has.

If we go with LTO:

GVM, TU combination and all the associated slimming down of our IR data 
structures will be quite a bit of work.  This is also needed for other 
projects.

We would keep a fully functional compiler throughout.  Rewiring internal 
data structures and code to make them smaller/nimbler can be easily tested 
by making sure we can still build the world.

LLVM already has some of the technology we need for link-time optimization.  
Perhaps we should look into it and swipe design ideas, if not code.

Initially, I wasn't too thrilled with the stack-based IR chosen for GVM.  
But I understand the rationale and don't have major objections against it.  

One thing that is not clear from the LTO document is whether GVM will be 
useful for dynamic optimization.  This is one area that we will eventually 
want to move into.

If we choose LLVM, I have more questions than ideas; take these thoughts as 
very preliminary and based on incomplete information:

The initial impression I get is that LLVM involves starting from scratch.  
I don't quite agree that this is necessary.  One of the engineering 
challenges we need to tackle is the requirement of keeping a fully 
functional compiler *while* we improve its architecture.

With our limited resources, we cannot really afford to go off on a 
multi-year tangent nurturing and growing a new technology just to add a 
new feature.

From what I understand, LLVM has never been used outside of a research 
environment and it can only generate code for a very limited set of 
targets.  These two are very serious limitations.  We would be losing 
years of target tweaking and compromise our ability to be a system 
compiler.

LLVM is missing a few other features like debugging information and 
vectorization.  Yes, all of it is fixable, but again, we have limited 
resources.  Furthermore, it may be hard to convince our development 
community to add these missing features: "we already implemented that!".  
It is much easier to entice folks to do something new than to re-implement 
old stuff.

The lack of FSF copyright assignment for LLVM is a problem.  It may even be 
a bigger problem than what we think.  Then again, it may not.  I just 
don't know.  What I do know is that this must absolutely be resolved 
before we even think of adding LLVM to any branch.  Chris said he'd be 
adding LLVM to the apple branch soon.  I hope the FSF assignment is worked 
out by then.  I understand that even code in branches should be under FSF 
copyright assignment.

A minor hurdle is LLVM's implementation language.  Personally, I would be 
ecstatic if we started implementing in C++.  However, not everyone in the 
community thinks this is a good idea.

Another minor nit is performance.  Judging by SPEC, LLVM has some 
performance problems.  It's very good for floating point (a 9% advantage 
over GCC), but GCC has a 24% advantage over LLVM 1.2 in integer code.  I'm 
sure that is fixable, and I only have data for an old release of LLVM.  But 
there is still more work to be done, particularly for targets not yet 
supported by LLVM.

To summarize: I am very impressed with LLVM's technical merits.  It already 
has much of the technology that we want to add to GCC.  But I think moving 
all of GCC's infrastructure to it would present quite a few problems for 
us.

Yes, LLVM gives us a well-defined and solid IPA framework, but it is 
missing quite a few things that we already take for granted.  All of them 
are fixable and some are in the process of being fixed.  But what are the 
timelines?  What resources are needed?  The LLVM solution seems to require 
a whole lot more effort from the whole community.  Not only will the core 
developers need to be involved in it; many of the target and sub-system 
maintainers will need to pitch in as well.  Since GCC is a volunteer project, 
it is not clear whether we will be able to pull it off in a reasonable 
period of time.

The LTO approach has the disadvantage that it needs to play catch-up to 
what LLVM already has.  But it does have that evolutionary flavour that I 
personally find easier to work with.

So, my question/proposal is this: Should we consider swiping chunks of LLVM 
and adapting them to our existing framework?  That is, go from GIMPLE into LLVM, 
stream to disk, do all the link-time stuff using LLVM and then read it 
back in.  In time, we'd hammer the rest of the compiler into shape.

But I would want to avoid anything that forces us to throw away the bath 
water.  Getting the baby back could prove very expensive.

Finally, I would be very interested in timelines.  Neither proposal 
mentions them.  My impression is that they will both take roughly the same 
amount of time, though the LLVM approach (as described) may take longer 
because it seems to have more missing pieces.


* Re: Thoughts on LLVM and LTO
  2005-11-22 16:20 Thoughts on LLVM and LTO Diego Novillo
@ 2005-11-22 16:37 ` Daniel Jacobowitz
  2005-11-22 22:27   ` Chris Lattner
  2005-11-22 16:37 ` Rafael Espíndola
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 59+ messages in thread
From: Daniel Jacobowitz @ 2005-11-22 16:37 UTC (permalink / raw)
  To: Diego Novillo; +Cc: gcc

On Tue, Nov 22, 2005 at 11:20:02AM -0500, Diego Novillo wrote:
> If we choose LLVM, I have more questions than ideas; take these thoughts as 
> very preliminary and based on incomplete information:
> 
> The initial impression I get is that LLVM involves starting from scratch.  
> I don't quite agree that this is necessary.  One of the engineering 
> challenges we need to tackle is the requirement of keeping a fully 
> functional compiler *while* we improve its architecture.

Most of your concerns seem to be based on this impression; I don't
think it's right.  I'll keep this brief since others can probably
answer the details more accurately than I can.

LLVM as a backend, i.e. replacing everything from GIMPLE -> assembly,
would involve a lot of starting from scratch.  e.g. your later example
of limited target support.  One of the options Chris proposed is
an optional GIMPLE -> LLVM -> GIMPLE process, in which:

(A) the LLVM step is only necessary for optimization - I like this for
lots of reasons, not least being that we could bootstrap without a C++
compiler.

(B) the LLVM register allocator, backend, et cetera would be optional
or unused, and the existing GCC backends would be used instead.  Which
are there today, need some modernizing, but work very well.

The LLVM -> GIMPLE translator does not exist yet; I believe Chris has a
prototype of the GIMPLE -> LLVM layer working, and it took him under a
month.  I've been convinced that the opposite direction would be as
straightforward.  That's something a sufficiently motivated developer
could hack out in the course of this discussion.

> From what I understand, LLVM has never been used outside of a research 
> environment and it can only generate code for a very limited set of 
> targets.  These two are very serious limitations.

LLVM is indeed very new.  At this point I believe it has been used
outside of a research environment, but I can't say how thoroughly.

> Finally, I would be very interested in timelines.  Neither proposal 
> mentions them.  My impression is that they will both take roughly the same 
> amount of time, though the LLVM approach (as described) may take longer 
> because it seems to have more missing pieces.

I'd have guessed the other way around; the GVM/LTO proposal is for a
completely new technology and the LLVM proposal is for merging an
existing (already GCC-based) technology to work more closely with GCC.

I'm not actually as biased in favor of LLVM as this message sounds; I
feel that I don't have a good enough understanding of either option.
But I wanted to clarify what I've learned from my earlier conversations
about this topic.

-- 
Daniel Jacobowitz
CodeSourcery, LLC


* Re: Thoughts on LLVM and LTO
  2005-11-22 16:20 Thoughts on LLVM and LTO Diego Novillo
  2005-11-22 16:37 ` Daniel Jacobowitz
@ 2005-11-22 16:37 ` Rafael Espíndola
  2005-11-25  2:17   ` Scott Robert Ladd
  2005-11-22 16:45 ` Daniel Berlin
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 59+ messages in thread
From: Rafael Espíndola @ 2005-11-22 16:37 UTC (permalink / raw)
  To: Diego Novillo; +Cc: gcc

> The initial impression I get is that LLVM involves starting from scratch.
> I don't quite agree that this is necessary.  One of the engineering
> challenges we need to tackle is the requirement of keeping a fully
> functional compiler *while* we improve its architecture.
I don't think that it involves starting from scratch. If we write a
LLVM -> GIMPLE converter the compilation process can look like
GENERIC -> GIMPLE -> LLVM -> GIMPLE -> RTL

In a first stage nothing will be done with the LLVM representation
except convert it back to GIMPLE. This will make sure that all
necessary information (including debug) can pass through the LLVM. The
conversion will also receive very good testing with this.

Later the optimizations can be moved one by one and in a last stage
the backend can also be replaced to work directly with LLVM. This has
the advantage that only the last stage of the port is architecture
dependent.
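
The first stage described above (convert to LLVM and straight back, with no optimization in between) amounts to a round-trip test of the two converters. A minimal sketch of that idea follows. The `GimpleStmt` and `LlvmInst` classes and all function names are hypothetical toy stand-ins invented for illustration; they are not GCC's or LLVM's real data structures or APIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GimpleStmt:
    """Toy stand-in for a GIMPLE statement (hypothetical, not GCC's type)."""
    op: str
    operands: tuple
    debug_loc: str  # hypothetical "file:line" debug annotation


@dataclass(frozen=True)
class LlvmInst:
    """Toy stand-in for an LLVM instruction (hypothetical, not LLVM's type)."""
    opcode: str
    args: tuple
    debug_loc: str


def gimple_to_llvm(stmts):
    """Convert every statement, carrying the debug location along."""
    return [LlvmInst(s.op, s.operands, s.debug_loc) for s in stmts]


def llvm_to_gimple(insts):
    """Convert back; in the first stage nothing else touches the LLVM form."""
    return [GimpleStmt(i.opcode, i.args, i.debug_loc) for i in insts]


def gimple_to_llvm_lossy(stmts):
    """A deliberately buggy converter that drops debug locations."""
    return [LlvmInst(s.op, s.operands, "") for s in stmts]


def round_trips(stmts, to_llvm=gimple_to_llvm, to_gimple=llvm_to_gimple):
    """True if GIMPLE -> LLVM -> GIMPLE reproduces the input exactly."""
    return to_gimple(to_llvm(stmts)) == stmts


program = [
    GimpleStmt("add", ("a", "b"), "foo.c:3"),
    GimpleStmt("ret", ("t1",), "foo.c:4"),
]
```

Here `round_trips(program)` holds for the faithful converters and fails for `gimple_to_llvm_lossy`, which is the kind of information loss (debug info included) that the identity stage is meant to expose before any optimization is moved over.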

Rafael


* Re: Thoughts on LLVM and LTO
  2005-11-22 16:20 Thoughts on LLVM and LTO Diego Novillo
  2005-11-22 16:37 ` Daniel Jacobowitz
  2005-11-22 16:37 ` Rafael Espíndola
@ 2005-11-22 16:45 ` Daniel Berlin
  2005-11-22 18:03   ` Scott Robert Ladd
                     ` (2 more replies)
  2005-11-22 16:57 ` Steven Bosscher
  2005-11-22 18:17 ` Benjamin Kosnik
  4 siblings, 3 replies; 59+ messages in thread
From: Daniel Berlin @ 2005-11-22 16:45 UTC (permalink / raw)
  To: Diego Novillo; +Cc: gcc


> 
> GVM, TU combination and all the associated slimming down of our IR
> data 
> structures will be quite a bit of work.  This is also needed for
> other 
> projects.
> 

I believe it is more work than porting improvements to LLVM and making
LLVM usable.
Significantly more work.
> 
> We would keep a fully functional compiler throughout.  Rewiring
> internal 
> data structures and code to make them smaller/nimbler can be easily
> tested 
> by making sure we can still build the world.
> 

The only way to keep a fully functioning compiler throughout is to have
massive patches.  I highly doubt you can rewrite all the optimizers to
not use tree, be safe about types, etc, without breaking anything or
without massive patches.  It's just not going to work.

> 
> The initial impression I get is that LLVM involves starting from
> scratch.  
> I don't quite agree that this is necessary.  One of the engineering 
> challenges we need to tackle is the requirement of keeping a fully 
> functional compiler *while* we improve its architecture.
> 
> 

I take the absolute opposite view, in that I believe *not using LLVM* is
starting from scratch.

You kinda gloss over the real work that will be required to modify and
implement all our data structure changes.  For example: What makes you
think we will be very successful at reducing memory usage of our data
structures without major changes, when we *never have been able to do
this well before*?

Why should we take a gamble at implementing the things we suck at, when
LLVM does them well, and we'd only need to implement the things we've
done right before?

> 
> With our limited resources, we cannot really afford to go off on a 
> multi-year tangent nurturing and growing a new technology just to add
> a 
> new feature.
> 
What makes you think implementing LTO from scratch is different here?

> Another minor nit is performance.  Judging by SPEC, LLVM has some 
> performance problems.  It's very good for floating point (a 9% advantage 
> over GCC), but GCC has a 24% advantage over LLVM 1.2 in integer code.  I'm 
> sure that is fixable and I only have data for an old release of LLVM

Uh, you are comparing an LLVM from 4 releases ago against the current
release of gcc, and saying "It doesn't do as well".

GCC 3.2 wasn't that good at SPEC either :)

As for the rest, i'm sure Chris will respond, but

1. It has been used outside research environments, and in fact, Apple is
moving to use it as their middle end.

2. It natively supports Alpha, Sparc, IA64, X86, and PowerPC.  An
LLVM->RTL converter is not that hard, which simply removes the entire
argument anyway.

The bottom line is that I just don't see any sane argument for redo'ing what
others have done very well, unless using that will require more
resources than doing it from scratch.

I can't honestly believe that the work required to make LLVM usable for
us is anywhere near the work we are going to need to put into tree-ssa to
do the same things LLVM does.

--Dan


* Re: Thoughts on LLVM and LTO
  2005-11-22 16:20 Thoughts on LLVM and LTO Diego Novillo
                   ` (2 preceding siblings ...)
  2005-11-22 16:45 ` Daniel Berlin
@ 2005-11-22 16:57 ` Steven Bosscher
  2005-11-22 17:28   ` Daniel Berlin
                     ` (2 more replies)
  2005-11-22 18:17 ` Benjamin Kosnik
  4 siblings, 3 replies; 59+ messages in thread
From: Steven Bosscher @ 2005-11-22 16:57 UTC (permalink / raw)
  To: gcc; +Cc: Diego Novillo

On Tuesday 22 November 2005 17:20, Diego Novillo wrote:
> The initial impression I get is that LLVM involves starting from scratch.

I thought it would basically "only" replace the GIMPLE parts of the
compiler.  That is,

FE	-->	GENERIC	-->	LLVM	-->	RTL	--> asm
(trees)		(trees)

In the longer-term, you could maybe cut out the RTL part for targets
for which LLVM has its own backend.

This is not less evolutionary or revolutionary than tree-ssa was IMHO.

> With our limited resources, we cannot really afford to go off on a
> multi-year tangent nurturing and growing a new technology just to add a
> new feature.

It depends on who is going to invest these resources.  Would you want
to tell Apple they can't do this even though they can? ;-)

> LLVM is missing a few other features like debugging information and
> vectorization.  Yes, all of it is fixable, but again, we have limited
> resources.

A lot of work will have to go into debugging information even for GVM,
because debug info will have to go through the IPA machinery also, i.e.
be attached to the call graph, etc...

The vectorizer is a minor piece of code compared to IPA, or a complete
high-level optimizer.  And at least the ideas from the implementation
for GIMPLE that we have now may be re-usable.  (This is still the first
true "portable" vectorizer that I know of, maybe it's not only easy to
port to other targets, but also to other compilers! ;-)

> The lack of FSF copyright assignment for LLVM is a problem.

As is the lack of clear agreements on who would control this project,
e.g. going with LLVM also means we'll have to shuffle our reviewers'
privileges a bit.  After all, in the GCC community only Chris really
knows LLVM well enough -- so would that mean he'd get blanket approval
rights for the LLVM parts of the compiler?

This actually worries me more than the copyright question.

> A minor hurdle is LLVM's implementation language.  Personally, I would be
> ecstatic if we started implementing in C++.  However, not everyone in the
> community thinks this is a good idea.

And not everyone in the community thinks that sticking with C is a
good idea.  So far, the status quo was to stay with C because that's
what we have.  If someone puts up a large body of C++ code now, there
had better be good technical reasons against going with C++...

> But what are the 
> timelines?  What resources are needed?

Interesting questions.  Both projects obviously will take significant
effort.  But IIUC Chris has bits of the LLVM stuff already going, so
he has the head-start (like tree-SSA did when LLVM was introduced to
the GCC community, ironically? ;-) so maybe Chris can have a working
prototype implementation within, what, months?  The GVM plan could
take years to get to that point...

So my dummy prediction would be that the LLVM path would result in a
reasonable product more quickly than the GVM plan -- iff RTL stays.

Gr.
Steven


* Re: Thoughts on LLVM and LTO
  2005-11-22 16:57 ` Steven Bosscher
@ 2005-11-22 17:28   ` Daniel Berlin
  2005-11-22 19:06   ` Richard Henderson
  2005-11-22 22:31   ` Chris Lattner
  2 siblings, 0 replies; 59+ messages in thread
From: Daniel Berlin @ 2005-11-22 17:28 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: gcc, Diego Novillo

On Tue, 2005-11-22 at 17:58 +0100, Steven Bosscher wrote:
> On Tuesday 22 November 2005 17:20, Diego Novillo wrote:
> > The initial impression I get is that LLVM involves starting from scratch.
> 
> I thought it would basically "only" replace the GIMPLE parts of the
> compiler.  That is,
> 
> FE	-->	GENERIC	-->	LLVM	-->	RTL	--> asm
> (trees)		(trees)
> 
> In the longer-term, you could maybe cut out the RTL part for targets
> for which LLVM has its own backend.
> 

This was my impression as well.  I certainly didn't think we'd throw out
RTL anytime soon for anything else.  Maybe nobody said this explicitly,
so Diego thought some of us supported replacing the backends.
That would slow you down years, and not even necessarily be better.

> This is not less evolutionary or revolutionary than tree-ssa was IMHO.
> 
> 
> > With our limited resources, we cannot really afford to go off on a
> > multi-year tangent nurturing and growing a new technology just to add a
> > new feature.
> 
> It depends on who is going to invest these resources.  Would you want
> to tell Apple they can't do this even though they can? ;-)


> As is the lack of clear agreements on who would control this project,
> e.g. going with LLVM also means we'll have to shuffle our reviewers'
> privileges a bit.  After all, in the GCC community only Chris really
> knows LLVM well enough

Well, i'm familiar with a bunch of the optimizers in LLVM (I've
contributed some patches), but not the code generation side.

>  -- so would that mean he'd get blanket approval
> rights for the LLVM parts of the compiler?

I assume if the technical issues were not issues, the SC would help to
resolve these issues.  You'd probably need some kind of transition for
tree-ssa/etc maintainers to help maintain the appropriate parts of LLVM
(assuming they wanted to), or else you'd end up very quickly with a
logjam of patches to those areas, etc.

Maintainership is based mainly on knowledge of design in the area (not
code monkey ability), so i don't think this would be a huge problem.

But this is all conjecture, political issues are best left to political
people, which i am not.



> > But what are the 
> > timelines?  What resources are needed?
> 
> Interesting questions.  Both projects obviously will take significant
> effort.  But IIUC Chris has bits of the LLVM stuff already going, so
> he has the head-start (like tree-SSA did when LLVM was introduced to
> the GCC community, ironically? ;-) so maybe Chris can have a working
> prototype implementation within, what, months? 

I think Apple hopes he will :)

>  The GVM plan could
> take years to get to that point...

> 
> So my dummy prediction would be that the LLVM path would result in a
> reasonable product more quickly than the GVM plan -- iff RTL stays.
> 
> Gr.
> Steven
> 


* Re: Thoughts on LLVM and LTO
  2005-11-22 16:45 ` Daniel Berlin
@ 2005-11-22 18:03   ` Scott Robert Ladd
  2005-11-23 12:11   ` Diego Novillo
  2005-11-27 19:58   ` Devang Patel
  2 siblings, 0 replies; 59+ messages in thread
From: Scott Robert Ladd @ 2005-11-22 18:03 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: Diego Novillo, gcc

Daniel Berlin wrote:
> 2. It natively supports Alpha, Sparc, IA64, X86, and PowerPC.  An
> LLVM->RTL converter is not that hard, which simply removes the entire
> argument anyway.

I see the phrase "doing X is not that hard" in response to many 
questions about this proposal. Now, I'm not arguing the difficulty of the 
given tasks, but even a simple task requires someone to do it. And 
maintain it.

Which begs the question: Are these proposals practical within the 
existing GCC developer community, particularly over the long term 
(years, decades)?

How did moving to tree-ssa affect the developer community? Did more 
people come on board, are fewer people working on GCC now, or did it 
have no net influence on the developer base?

I honestly don't know, hence my queries.

> The bottom line is that I just don't see any sane argument for redo'ing what
> others have done very well, unless using that will require more
> resources than doing it from scratch.
> 
> I can't honestly believe that the work required to make LLVM usable for
> us is anywhere near the work we are going to need to put into tree-ssa to
> do the same things LLVM does.

Reverse the question: What does tree-ssa do that LLVM does not? I know 
that's been covered to some extent in these threads, but maybe someone 
knowledgeable could lay out a very simple bullet point list comparing 
what needs to be done with both plans.

I'm not saying which way GCC should go -- I merely think that all the 
consequences need to be considered carefully.

-- 
Scott Robert Ladd <scott.ladd@coyotegulch.com>
Coyote Gulch Productions
http://www.coyotegulch.com


* Re: Thoughts on LLVM and LTO
  2005-11-22 16:20 Thoughts on LLVM and LTO Diego Novillo
                   ` (3 preceding siblings ...)
  2005-11-22 16:57 ` Steven Bosscher
@ 2005-11-22 18:17 ` Benjamin Kosnik
  2005-11-22 18:27   ` Gabriel Dos Reis
                     ` (3 more replies)
  4 siblings, 4 replies; 59+ messages in thread
From: Benjamin Kosnik @ 2005-11-22 18:17 UTC (permalink / raw)
  To: gcc

> First off, regardless of what direction we choose to go, I think we
> are in a great position.  Finally, GCC will have all the obvious and
> standard technology that one reads in textbooks.  Not long ago, GCC
> didn't even build a flowgraph, and now here we are deciding what IPA
> technology we want to implement.

I agree. It's cool to see this evolution.

> LLVM is missing a few other features like debugging information and
> vectorization.  Yes, all of it is fixable, but again, we have limited
> resources.  Furthermore, it may be hard to convince our development
> community to add these missing features: "we already implemented
> that!".  It is much easier to entice folks to do something new than to
> re-implement old stuff.

Debuggability is essential. I would like to see some comment from LLVM
people about plans for this, and I fully realize that IMA and the
other proposal also have their own debuggability baggage. 

IMHO, some realistic plan to deal with this essential feature is
required.

> The lack of FSF copyright assignment for LLVM is a problem.  It may
> even be a bigger problem than what we think.  Then again, it may not.
> I just don't know.  What I do know is that this must absolutely be
> resolved before we even think of adding LLVM to any branch.  Chris
> said he'd be adding LLVM to the apple branch soon.  I hope the FSF
> assignment is worked out by then.  I understand that even code in
> branches should be under FSF copyright assignment.

This is a solvable problem, and has been pointed out to Chris
repeatedly, by many people, at various venues, for over a year.

http://gcc.gnu.org/ml/gcc/2004-10/msg01146.html

He keeps hand waving, saying that it's possible.

Great. 

I say, enough grandstanding: it's not enough to be possible, it needs
to be actual. If it is indeed actually possible to release LLVM under
the GPL, then he needs to pick a version and GPL it. Then we can get
serious.

Make it so.

He seems to be operating under the mistaken idea that the GCC
community can go ahead and make plans around LLVM being free as
defined by the GNU project, without it actually being so.

> Another minor nit is performance.  Judging by SPEC, LLVM has some
> performance problems.  It's very good for floating point (a 9%
> advantage over GCC), but GCC has a 24% advantage over LLVM 1.2 in
> integer code.  I'm sure that is fixable, and I only have data for an
> old release of LLVM.  But there is still more work to be done,
> particularly for targets not yet supported by LLVM.

What about compile-time performance?

I'd actually like to make this a requirement, regardless of the option
chosen.

-benjamin


* Re: Thoughts on LLVM and LTO
  2005-11-22 18:17 ` Benjamin Kosnik
@ 2005-11-22 18:27   ` Gabriel Dos Reis
  2005-11-22 18:47     ` Daniel Berlin
  2005-11-22 22:06   ` Steven Bosscher
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 59+ messages in thread
From: Gabriel Dos Reis @ 2005-11-22 18:27 UTC (permalink / raw)
  To: Benjamin Kosnik; +Cc: gcc

Benjamin Kosnik  <bkoz@redhat.com> writes:

[...]

| I'd actually like to make this a requirement, regardless of the option
| chosen.

Amen.

-- Gaby


* Re: Thoughts on LLVM and LTO
  2005-11-22 18:27   ` Gabriel Dos Reis
@ 2005-11-22 18:47     ` Daniel Berlin
  2005-11-22 18:50       ` Richard Henderson
  2005-11-22 18:59       ` Gabriel Dos Reis
  0 siblings, 2 replies; 59+ messages in thread
From: Daniel Berlin @ 2005-11-22 18:47 UTC (permalink / raw)
  To: Gabriel Dos Reis; +Cc: Benjamin Kosnik, gcc

On Tue, 2005-11-22 at 19:25 +0100, Gabriel Dos Reis wrote:
> Benjamin Kosnik  <bkoz@redhat.com> writes:
> 
> [...]
> 
> | I'd actually like to make this a requirement, regardless of the option
> | chosen.
> 
> Amen.
> 

Uh, IPA of any sort is generally not about speed.
It's fine to say compile time performance of the middle end portions we
may replace should be same or better, but algorithms that operate on
large portions of the program are never fast, because they aren't
linear.
They usually take *at least* seconds per pass.

So you need to quantify "good".  


* Re: Thoughts on LLVM and LTO
  2005-11-22 18:47     ` Daniel Berlin
@ 2005-11-22 18:50       ` Richard Henderson
  2005-11-22 18:53         ` Daniel Berlin
  2005-11-22 18:59       ` Gabriel Dos Reis
  1 sibling, 1 reply; 59+ messages in thread
From: Richard Henderson @ 2005-11-22 18:50 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: Gabriel Dos Reis, Benjamin Kosnik, gcc

On Tue, Nov 22, 2005 at 01:47:12PM -0500, Daniel Berlin wrote:
> Uh, IPA of any sort is generally not about speed.

Except that we're talking about replacing all the tree optimizations
all of the time with llvm, which affects -O1.  Or at least I thought
that was the suggestion...


r~


* Re: Thoughts on LLVM and LTO
  2005-11-22 18:50       ` Richard Henderson
@ 2005-11-22 18:53         ` Daniel Berlin
  2005-11-22 19:07           ` Benjamin Kosnik
  0 siblings, 1 reply; 59+ messages in thread
From: Daniel Berlin @ 2005-11-22 18:53 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Gabriel Dos Reis, Benjamin Kosnik, gcc

On Tue, 2005-11-22 at 10:49 -0800, Richard Henderson wrote:
> On Tue, Nov 22, 2005 at 01:47:12PM -0500, Daniel Berlin wrote:
> > Uh, IPA of any sort is generally not about speed.
> 
> Except that we're talking about replacing all the tree optimizations
> all of the time with llvm, which affects -O1.  Or at least I thought
> that was the suggestion...

Which is why i said "It's fine to say compile time performance of the
middle end portions we may replace should be same or better".

And if you were to look right now, it's actually significantly better in
some cases :(



* Re: Thoughts on LLVM and LTO
  2005-11-22 18:47     ` Daniel Berlin
  2005-11-22 18:50       ` Richard Henderson
@ 2005-11-22 18:59       ` Gabriel Dos Reis
  2005-11-22 19:06         ` Daniel Berlin
  1 sibling, 1 reply; 59+ messages in thread
From: Gabriel Dos Reis @ 2005-11-22 18:59 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: Benjamin Kosnik, gcc

Daniel Berlin <dberlin@dberlin.org> writes:

| On Tue, 2005-11-22 at 19:25 +0100, Gabriel Dos Reis wrote:
| > Benjamin Kosnik  <bkoz@redhat.com> writes:
| > 
| > [...]
| > 
| > | I'd actually like to make this a requirement, regardless of the option
| > | chosen.
| > 
| > Amen.
| > 
| 
| Uh, IPA of any sort is generally not about speed.
| It's fine to say compile time performance of the middle end portions we
| may replace should be same or better, but algorithms that operate on
| large portions of the program are never fast, because they aren't
| linear.
| They usually take *at least* seconds per pass.
| 
| So you need to quantify "good".  

As I understand it, we are going to merge information from different
translation units for the purpose of link-time optimization.  I expect
some increase in compile-time there.  I don't care that the algorithms
are linear or not.  What I do care about is that for the end-result,
compile-time performance is kept in reasonable bounds -- no matter what
implementation technology is finally decided on. 

-- Gaby


* Re: Thoughts on LLVM and LTO
  2005-11-22 18:59       ` Gabriel Dos Reis
@ 2005-11-22 19:06         ` Daniel Berlin
  2005-11-22 19:21           ` Benjamin Kosnik
  0 siblings, 1 reply; 59+ messages in thread
From: Daniel Berlin @ 2005-11-22 19:06 UTC (permalink / raw)
  To: Gabriel Dos Reis; +Cc: Benjamin Kosnik, gcc

On Tue, 2005-11-22 at 19:57 +0100, Gabriel Dos Reis wrote:
> Daniel Berlin <dberlin@dberlin.org> writes:
> 
> | On Tue, 2005-11-22 at 19:25 +0100, Gabriel Dos Reis wrote:
> | > Benjamin Kosnik  <bkoz@redhat.com> writes:
> | > 
> | > [...]
> | > 
> | > | I'd actually like to make this a requirement, regardless of the option
> | > | chosen.
> | > 
> | > Amen.
> | > 
> | 
> | Uh, IPA of any sort is generally not about speed.
> | It's fine to say compile time performance of the middle end portions we
> | may replace should be same or better, but algorithms that operate on
> | large portions of the program are never fast, because they aren't
> | linear.
> | They usually take *at least* seconds per pass.
> | 
> | So you need to quantify "good".  
> 
> 
> As I understand it, we are going to merge information from different
> translation units for the purpose of link-time optimization.  I expect
> some increase in compile-time there.  I don't care that the algorithms
> are linear or not.  What I do care about is that for the end-result,
> compile-time performance is kept in reasonable bounds -- no matter what
> implementation technology is finally decided on. 

Okay, but you need to understand that reasonable bounds for compiling
the entire program at once are usually 3x-7x more (and in the worst
case, even worse) than doing it separately.

That is the case with completely state of the art algorithms,
implementation techniques, etc.

It's just the way the world goes.

It's in no way reasonable to expect to be able to perform IPA
optimizations on a 1 million line program in 30 seconds, even if we can
compile it normally in 10 seconds.

--Dan

* Re: Thoughts on LLVM and LTO
  2005-11-22 16:57 ` Steven Bosscher
  2005-11-22 17:28   ` Daniel Berlin
@ 2005-11-22 19:06   ` Richard Henderson
  2005-11-22 19:28     ` David Edelsohn
                       ` (2 more replies)
  2005-11-22 22:31   ` Chris Lattner
  2 siblings, 3 replies; 59+ messages in thread
From: Richard Henderson @ 2005-11-22 19:06 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: gcc, Diego Novillo

On Tue, Nov 22, 2005 at 05:58:14PM +0100, Steven Bosscher wrote:
> I thought it would basically "only" replace the GIMPLE parts of the
> compiler.  That is,
> 
> FE	-->	GENERIC	-->	LLVM	-->	RTL	--> asm
> (trees)		(trees)

This is certainly the only way to avoid losing functionality.

I worry that this path will bitrot as 99% of folk use the llvm
path straight through to assembly on i386.  But perhaps a config
option to force the rtl path, plus some automated testing, can
prevent that from happening too fast.

> It depends on who is going to invest these resources.  Would you want
> to tell Apple they can't do this even though they can? ;-)

No, but we also might want to let Apple work on this for a year
and then come back with something more concrete than "it should
be easy".

> A lot of work will have to go into debugging information even for GVM,
> because debug info will have to go through the IPA machinery also, i.e.
> be attached to the call graph, etc...

I think this view is not really correct.  The hard part about debug
info is pushing it through the optimizers.  This is where LLVM has
a huge amount of work to do.

For GVM, all it has to do is hold onto the debug info that we had
when writing out the IL and read it back in.  This is trivial in
the context of the rest of IPA.

The biggest technical problem I see with LLVM is actually the debug
info.  Frankly, I'm not sure I even want to consider LLVM until
that's done.  If it's as easy as Chris and Danny make it out to be,
then they'll have it knocked off in short order.  If not ...

> The GVM plan could take years to get to that point...

Could, but probably won't.  I'd have actually guessed they could
have something functional, if not 100% robust, in 6 months given
2 or 3 people on the project.

r~

* Re: Thoughts on LLVM and LTO
  2005-11-22 18:53         ` Daniel Berlin
@ 2005-11-22 19:07           ` Benjamin Kosnik
  2005-11-22 20:04             ` Some GCC 4.1 benchmarks (Re: Thoughts on LLVM and LTO) Jan Hubicka
  2005-11-22 22:52             ` Thoughts on LLVM and LTO Chris Lattner
  0 siblings, 2 replies; 59+ messages in thread
From: Benjamin Kosnik @ 2005-11-22 19:07 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: rth, gdr, gcc

> Which is why i said "It's fine to say compile time performance of the
> middle end portions we may replace should be same or better".
> 
> And if you were to look right now, it's actually significantly better in
> some cases :(

Can you prove this assertion?

Here is some data:
http://people.redhat.com/dnovillo/spec2000.i686/gcc/global-build-secs_elapsed.html

And some more
http://llvm.cs.uiuc.edu/testresults/X86/2005-11-01.html

I'm not sure about accuracy, or versions of LLVM used, etc.

Although promising on some things (as Diego said), LLVM execution and
compile-time performance is a mixed bag.

It would probably be interesting to run SPEC or something else with icc
IPO enabled, LLVM IPO enabled, and whatever gcc IMA support is
available, to do a true comparison of where things stand. More data
would be interesting.

-benjamin

* Re: Thoughts on LLVM and LTO
  2005-11-22 19:06         ` Daniel Berlin
@ 2005-11-22 19:21           ` Benjamin Kosnik
  2005-11-22 19:37             ` Gabriel Dos Reis
                               ` (2 more replies)
  0 siblings, 3 replies; 59+ messages in thread
From: Benjamin Kosnik @ 2005-11-22 19:21 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: gdr, gcc

> Okay, but you need to understand that reasonable bounds for compiling
> the entire program at once are usually 3x-7x more (and in the worst
> case, even worse) than doing it separately.
> 
> That is the case with completely state of the art algorithms,
> implementation techniques, etc.
> 
> It's just the way the world goes.
> 
> It's in no way reasonable to expect to be able to perform IPA
> optimizations on a 1 million line program in 30 seconds, even if we can
> compile it normally in 10 seconds.

Tree-SSA managed to add new technology to the compiler without major
slowdowns. I'm suggesting that whatever LTO technology is used do
the same for non-LTO programs. I consider this reasonable.

Now, I think you are setting the compile time performance bar for LTO
awfully low. I'm not asking for new functionality to be as fast as the
current technology without the functionality (although that would
certainly be nice, wouldn't it?).

Certainly, icc with IPO is definitely not as slow as you claim. 

-benjamin

* Re: Thoughts on LLVM and LTO
  2005-11-22 19:06   ` Richard Henderson
@ 2005-11-22 19:28     ` David Edelsohn
  2005-11-22 22:19     ` Steven Bosscher
  2005-11-22 22:50     ` Chris Lattner
  2 siblings, 0 replies; 59+ messages in thread
From: David Edelsohn @ 2005-11-22 19:28 UTC (permalink / raw)
  To: Richard Henderson; +Cc: gcc

> I'd have actually guessed they could
> have something functional, if not 100% robust, in 6 months given
> 2 or 3 people on the project.

	The question is the width of the gap between functional and
usable.  A number of people on this thread have implied that GCC's data
structures will need to be trimmed substantially for LTO to meet the
expectations of end users.

	Among some of the difficult areas, GCC is ahead on debug
information and LLVM is ahead on data structures.

	I think the main thing we need is for the LLVM community to start
the necessary effort on the copyright assignment paperwork from the
various contributors for LLVM to be a practical option.  For LLVM to be
seriously considered, the license and assignment needs to be well on its
way to being resolved, not just hand-waving that it can be solved.
Otherwise, this discussion is a distraction that hurts GCC development
progress.

David

* Re: Thoughts on LLVM and LTO
  2005-11-22 19:21           ` Benjamin Kosnik
@ 2005-11-22 19:37             ` Gabriel Dos Reis
  2005-11-22 20:09             ` Daniel Berlin
  2005-11-22 22:15             ` Steven Bosscher
  2 siblings, 0 replies; 59+ messages in thread
From: Gabriel Dos Reis @ 2005-11-22 19:37 UTC (permalink / raw)
  To: Benjamin Kosnik; +Cc: Daniel Berlin, gcc

Benjamin Kosnik <bkoz@redhat.com> writes:

| > Okay, but you need to understand that reasonable bounds for compiling
| > the entire program at once are usually 3x-7x more (and in the worst
| > case, even worse) than doing it separately.
| > 
| > That is the case with completely state of the art algorithms,
| > implementation techniques, etc.
| > 
| > It's just the way the world goes.
| > 
| > It's in no way reasonable to expect to be able to perform IPA
| > optimizations on a 1 million line program in 30 seconds, even if we can
| > compile it normally in 10 seconds.
| 
| Tree-SSA managed to add new technology to the compiler without major
| slowdowns. I'm suggesting that whatever LTO technology is used do
| the same for non-LTO programs. I consider this reasonable.

Indeed.  I don't quite understand why people suddenly get on their horses.

-- Gaby

* Some GCC 4.1 benchmarks (Re: Thoughts on LLVM and LTO)
  2005-11-22 19:07           ` Benjamin Kosnik
@ 2005-11-22 20:04             ` Jan Hubicka
  2005-11-22 20:19               ` Scott Robert Ladd
  2005-11-22 22:52             ` Thoughts on LLVM and LTO Chris Lattner
  1 sibling, 1 reply; 59+ messages in thread
From: Jan Hubicka @ 2005-11-22 20:04 UTC (permalink / raw)
  To: Benjamin Kosnik; +Cc: Daniel Berlin, rth, gdr, gcc

> 
> > Which is why i said "It's fine to say compile time performance of the
> > middle end portions we may replace should be same or better".
> > 
> > And if you were to look right now, it's actually significantly better in
> > some cases :(
> 
> Can you prove this assertion?
> 
> Here is some data:
> http://people.redhat.com/dnovillo/spec2000.i686/gcc/global-build-secs_elapsed.html
> 
> And some more
> http://llvm.cs.uiuc.edu/testresults/X86/2005-11-01.html
> 
> I'm not sure about accuracy, or versions of LLVM used, etc.
> 
> Although promising on some things (as Diego said), LLVM execution and
> compile-time performance is a mixed bag.
> 
> It would probably be interesting to run SPEC or something else with icc
> IPO enabled, LLVM IPO enabled, and whatever gcc IMA support is
> available, to do a true comparison of where things stand. More data
> would be interesting.

I might try to produce somewhat more useful charts, but I recently did
some testing of GCC 4.1 on SPEC and on some C++ testcases, mostly
looking for regressions in the GCC 4.1 release.  I didn't test LLVM, but
I did some ICC comparison and tested both with and without our current
IMA, so it gives a rough idea.

I should note that the comparison to ICC is not quite fair, since ICC
lacks tuning for the Opteron I tested on, but I would say that we are in
the same performance camp on SPECint with IMA (IMA contributes 3.3% to
the result) despite the fact that GCC's IMA and IPA are very primitive.
This may just be proof that SPECint is not the best testcase for testing
future IPA implementations.  I also have some C++ results that are a lot
wilder.  It would be really interesting to see how much benefit one gets
when compiling a full-blown application, and how large a program one can
hope to compile with LTO (i.e. GCC/kernel/mozilla/OOo/... ;).

I am not quite sure how much of the SPECfp loss can be attributed to IMA,
since I would expect it to come more from Fortran tuning.  The only
regressing C benchmark is ART, which indeed needs whole-program
optimization to allow data structure layout changes.  Obviously we made
some notable progress on Fortran performance between 4.0 and 4.1, and
none of that is IPA related.

I am also adding some scores for C++ testcases: tramp3d, which is a single
file, and Gerald's application, which I didn't actually manage to merge
into a single file, but I combined the files that appear hot in coverage.

Concerning compile time at -O2: hammer branch needs 185s, 4.0 192s, and
4.1 205s.  With IPA and no FDO, 4.0 needs 193s when patched with Andrew's
faster type-merging patch, and 4.1 needs 218s.  I didn't record ICC
compilation times, but this clearly shows that we are making compile time
problems worse again with 4.1 overall.  It also shows that IPA is cheap
right now, but only because it is so primitive.  It is also cheap only as
long as you fit in memory (you need over 512MB of memory to build SPEC
with IMA on GCC, which is far from acceptable).
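
To put the compile-time figures above in relative terms, here is a small
sketch.  The timings are the ones quoted in the paragraph above; the
dictionary layout and the helper name percent_slower are made up for this
illustration and are not part of any GCC tooling.

```python
# Compile times in seconds at -O2, as quoted in the paragraph above.
# The names used here are illustrative only.
plain = {"3.3-hammer": 185, "4.0": 192, "4.1": 205}
with_ima = {"4.0": 193, "4.1": 218}  # IPA/IMA enabled, no FDO


def percent_slower(new, old):
    """Return how much slower `new` is than `old`, in percent."""
    return (new / old - 1.0) * 100.0


if __name__ == "__main__":
    for version in ("4.0", "4.1"):
        vs_hammer = percent_slower(plain[version], plain["3.3-hammer"])
        ima_cost = percent_slower(with_ima[version], plain[version])
        print(f"GCC {version}: {vs_hammer:.1f}% slower than hammer, "
              f"IMA adds {ima_cost:.1f}%")
```

Under these numbers the IMA overhead itself stays in the single digits,
while the gap between releases is the larger effect.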

Also note that eon and the Fortran files are not compiled with IMA in the
GCC tests.

-O2, no IMA on both compilers:
	GCC-3.3-hammer	GCC 4.0	GCC 4.1	ICC-9.0
gzip	1162		1181	1199	1151
vpr    	859		853	824	854
gcc    	1057		1035	1028	963
mcf    	540		540	541	543
crafty 	2100		2041	2025	2106
parser 	776		790	783	778
eon    	1793		1874	1952	(failed, substituted as 783 for geomavg)
perlbmk	1407		1453	1438	1503
gap    	1095		1152	1156	1071
vortex 	1689		1663	1666	1618
bzip2  	1009		1011	1000	997
twolf  	843		858	852	823
geomavg	1114.8		1124.95	1122.76	1102
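
For readers who want to check the geomavg row, a minimal sketch of the
calculation follows.  It assumes the row is a plain geometric mean of the
twelve scores in the column; the scores are copied from the GCC 4.1 column
of the table above, and the function name geometric_mean is just a label
chosen for this example.

```python
import math

# SPECint scores for GCC 4.1 at -O2 without IMA, copied from the table above.
gcc41_specint = {
    "gzip": 1199, "vpr": 824, "gcc": 1028, "mcf": 541,
    "crafty": 2025, "parser": 783, "eon": 1952, "perlbmk": 1438,
    "gap": 1156, "vortex": 1666, "bzip2": 1000, "twolf": 852,
}


def geometric_mean(values):
    """Geometric mean computed via logarithms to avoid a huge product."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


if __name__ == "__main__":
    print(f"geomavg: {geometric_mean(gcc41_specint.values()):.2f}")
```

The same helper can be applied to the other columns; for the ICC column
the table substitutes 783 for the failed eon run.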

	GCC-3.3-hammer	GCC 4.0	GCC 4.1	ICC-9.0
wupwise	1218		1079	1304	1278
swim	1038		1065	1070	1064
mgrid	784		728	906	909
applu	772		822	840	884
mesa	1536		1609	1536	1486
galgel	    		803	830	
art	730		739	735	747
equake	1102		1085	1069	1055
facerec	    		905	914	1393
ammp	967		993	1008	985
lucas	    		1106	1113	1264
fma3d	    		976	978	1154
sixtrac	582		591	618	647
apsi	810		922	1004	948
geomavg			933	971	1016

-O2 -static --combine -fwhole-program  -fipa-cp
versus ICC -xW -O3 -ipo -vec_report3
profile feedback is used on both compilers.
	GCC-3.3-hammer	GCC 4.0	GCC-4.1	ICC-9.0
gzip	1269		1299	1264	1337
vpr    	890		864	885	869
gcc    	1112		1095	1175	1023
mcf    	539		536	538	546
crafty 	2055		2034	2236	2301
parser 	960		975	993	851
eon    	2081		1928	2192	2150
perlbmk	1621		1574	1697	1652
gap    	1117		1181	1223	1224
vortex 	1683		2038	2173	2421
bzip2  	1058		1022	1085	1087
twolf  	842		877	877	849
geomavg	1183.41		1195.84	1251.55	1232.97

	GCC-3.3-hammer	GCC 4.0	GCC 4.1	ICC-9.0
wupwise			1305	1401	1678
swim			1065	1293	1360
mgrid			758	884	973
applu			857	918	1060
mesa	1756		1751	1756	1759
galgel			818	848	1790
art	724		734	735	1414
equake	1088		1101	1108	1308
facerec			974	1110	1467
ammp	1008		1034	1063	967
lucas			1111	1104	1261
fma3d			976	1215	1238
sixtrac			643	702	653
apsi			940	988	958
geomavg			973.82	1049.12	1234.02

Tramp3d, iterations per second with and without FDO.
GCC 3.3-hammer	0.36
GCC 4.0		0.45
GCC 4.1		0.56
GCC 4.1 flatten	0.62
GCC 4.1 profile	0.07
GCC 4.1 FDO    	0.81
GCC 4.1 profile	0.08
4.1 FDO flatten	0.89
ICC 9.0		0.14

DLV, speedup in percents relative to GCC 3.3 hammer-branch
		GCC 4.0	GCC 4.1	GCC-4.1 profile	ICC 9.0
STRATCOMP1-ALL	284	287.1	242.86		18.52
STRATCOMP-770.2	-6.25	0	13.33		-10.53
2QBF1		-5.47	-5.87	6.83		-15.23
PRIMEIMPL2	3.09	5.26	12.36		-23.95
3COL-SIMPLEX1	-1.78	-7.78	2.47		9.21
3COL-RANDOM1	-3.88	-0.84	0.21		-20.84
HP-RANDOM1	-26.72	-13.83	-12.45		-9.94
HAMCYCLE-FREE	-1.89	-3.7	0		-17.46
DECOMP2		-6.84	-12.2	-12.35		-11.27
BW-P5-nopush	-6.29	-4.07	-2.75		-5.98
BW-P5-pushbin	-5.28	-1.95	-0.4		-13.75
BW-P5-nopushbin	-6.49	-2.7	0		-8.86
HANOI-Towers	-6.79	-2.58	0		-21.35
RAMSEY		5.41	-3.7	9.86		-5.65
CRISTAL		-17.21	-20.12	-13.53		-8.91
21-QUEENS	-1.71	-2.55	4.24		-34.48
MSTDir[V=13]	2.06	0.2	6		-31.72
MSTDir[V=15]	1.84	1.01	6.87		-32.15
MSTUndir[V=13]	-4.08	-4.08	2.92		-29.5
TIMETABLING	2.65	0.74	7.97		-31.91
AVG		2.71	2.6	7.74		-16.31

* Re: Thoughts on LLVM and LTO
  2005-11-22 19:21           ` Benjamin Kosnik
  2005-11-22 19:37             ` Gabriel Dos Reis
@ 2005-11-22 20:09             ` Daniel Berlin
  2005-11-22 22:15             ` Steven Bosscher
  2 siblings, 0 replies; 59+ messages in thread
From: Daniel Berlin @ 2005-11-22 20:09 UTC (permalink / raw)
  To: Benjamin Kosnik; +Cc: gdr, gcc, Richard Henderson

On Tue, 2005-11-22 at 13:21 -0600, Benjamin Kosnik wrote:
> > Okay, but you need to understand that reasonable bounds for compiling
> > the entire program at once are usually 3x-7x more (and in the worst
> > case, even worse) than doing it separately.
> > 
> > That is the case with completely state of the art algorithms,
> > implementation techniques, etc.
> > 
> > It's just the way the world goes.
> > 
> > It's in no way reasonable to expect to be able to perform IPA
> > optimizations on a 1 million line program in 30 seconds, even if we can
> > compile it normally in 10 seconds.
> 
> Tree-SSA managed to add new technology to the compiler without major
> slowdowns. I'm suggesting that whatever LTO technology is used do
> the same for non-LTO programs. I consider this reasonable.

This is fine.

> 
> Now, I think you are setting the compile time performance bar for LTO
> awfully low. I'm not asking for new functionality to be as fast as the
> current technology without the functionality (although that would
> certainly be nice, wouldn't it?).
> 
> Certainly, icc with IPO is definitely not as slow as you claim. 

I'm done arguing any of these points (yours, Richard's about debug info,
or anyone else's).

For once, I'm simply going to sit on the sidelines for 2 years while
everyone else does (or does not) do something about the problems we face
moving to IPA.

So let me just say, as a final analysis:

1. Richard, LLVM carries debug info in the exact same way we would do so
in GVM (transmitting it in the IL), and keeping it up to date in the
optimizers would also have to be done the same way.  

When Chris says "LLVM doesn't support debug info" he means it does not
*produce* any debug info.  If you compile a file with llvm-gcc -g, you
will see familiar line tracking information, function start/end info,
lexical region info, and compile units.   The thing missing is to
write out the dwarf2 info we've accumulated into the appropriate global
llvm compile unit descriptors, and then pass that along the same exact
way we do in tree-ssa land.

We do almost nothing to truly keep it up to date in the tree optimizers,
except for a very small number of passes, and this would also be true of
LLVM.

2. Ben, I'm not sure where you think icc with IPO is fast.
It takes 15 minutes to IPA compile xemacs on my machine with icc, and
230 meg of memory.

non-ipa it takes 3 minutes and 60 meg of memory.

You are *lucky* to have only a 5x slowdown when you have any large
amount of code.
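
As a quick sanity check on the xemacs figures above, the ratios can be
worked out like this.  This is only a sketch: the dictionary keys and the
ratio helper are invented for the example, and it assumes the quoted
numbers are wall-clock minutes and peak memory in megabytes.

```python
# Figures quoted above for compiling xemacs with icc.
# Assumption: minutes are wall-clock time and memory is peak usage in MB.
non_ipa = {"minutes": 3, "memory_mb": 60}
ipa = {"minutes": 15, "memory_mb": 230}


def ratio(with_ipa, without_ipa, key):
    """How many times larger the IPA figure is than the non-IPA one."""
    return with_ipa[key] / without_ipa[key]


if __name__ == "__main__":
    print(f"time:   {ratio(ipa, non_ipa, 'minutes'):.1f}x")
    print(f"memory: {ratio(ipa, non_ipa, 'memory_mb'):.1f}x")
```

The time ratio is the 5x slowdown mentioned above; memory grows by a
little under 4x in the same comparison.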

I've done a lot of work in IPA, not just on GCC.  You are really in for
a surprise if you think that compiling an industrial application isn't
going to take literally days with IPA optimizations when it takes hours
without it.
Again, it's fine to say the non-LTO programs should compile fast.  But I
am setting the LTO compile time goal to what i believe it can meet.

I still think it would be a mistake to redo all of this from scratch
rather than start with something else, because I think we, as usual,
vastly underestimate the amount of work our data structures will need.

I eagerly await someone making gcc --combine not take 2 gigabytes of
memory to compile gcc, while still optimizing it very well and having
reasonable compile time bounds.

By the by, to compile something like libsablevm (after removing the 6
lines of inline assembly), gcc takes >120 meg of memory (in fact, 4.0
takes 260 meg of memory), and 35 seconds (at O2 or O3, your choice).
This is just tree optimizer time i'm including.

llvm-gcc takes only 64 meg of memory, and 32 seconds, while performing a
whole bunch of interprocedural optimizations to boot.

But again, i'm just going to sit on the sidelines from now on and let
someone else come up with all the technical solutions and work.  

I'm truly not a proponent of either proposal (though it may seem that
way), but nobody has given any more details about how they plan on
changing anything so it's memory and compile time efficient in the other
one, only how to write it out and read it back in.  We've been saying
we'd solve these issues since *I* started working on gcc, but we still
get our ass handed to us by everyone else on things like memory usage.

--Dan

* Re: Some GCC 4.1 benchmarks (Re: Thoughts on LLVM and LTO)
  2005-11-22 20:04             ` Some GCC 4.1 benchmarks (Re: Thoughts on LLVM and LTO) Jan Hubicka
@ 2005-11-22 20:19               ` Scott Robert Ladd
  2005-11-22 21:04                 ` Jan Hubicka
  2005-11-22 22:02                 ` Steven Bosscher
  0 siblings, 2 replies; 59+ messages in thread
From: Scott Robert Ladd @ 2005-11-22 20:19 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Benjamin Kosnik, Daniel Berlin, rth, gdr, gcc

Jan Hubicka wrote:
> I should note that comparison to ICC is not quite fair since it lacks
> Opteron tuning...

I think you may be comparing oranges to tangerines -- not as bad as 
apples and oranges, but still potentially an invalid comparison.

In my experience the extra registers of the Opteron provide a 
significant benefit; GCC has an unfair advantage if ICC only generated 
code for the small set of x86 registers.

..Scott

* Re: Some GCC 4.1 benchmarks (Re: Thoughts on LLVM and LTO)
  2005-11-22 20:19               ` Scott Robert Ladd
@ 2005-11-22 21:04                 ` Jan Hubicka
  2005-11-22 22:02                 ` Steven Bosscher
  1 sibling, 0 replies; 59+ messages in thread
From: Jan Hubicka @ 2005-11-22 21:04 UTC (permalink / raw)
  To: Scott Robert Ladd
  Cc: Jan Hubicka, Benjamin Kosnik, Daniel Berlin, rth, gdr, gcc

> Jan Hubicka wrote:
> >I should note that comparison to ICC is not quite fair since it lacks
> >Opteron tuning...
> 
> I think you may be comparing oranges to tangerines -- not as bad as 
> apples and oranges, but still potentially an invalid comparison.
> 
> In my experience the extra registers of the Opteron provide a 
> significant benefit; GCC has an unfair advantage if ICC only generated 
> code for the small set of x86 registers.

Forgot to mention, all the tests were x86-64, so ICC used extra registers
too.  Also to clarify, the C++ benchmarks were done with -O3 only.

Honza
> 
> ..Scott

* Re: Some GCC 4.1 benchmarks (Re: Thoughts on LLVM and LTO)
  2005-11-22 20:19               ` Scott Robert Ladd
  2005-11-22 21:04                 ` Jan Hubicka
@ 2005-11-22 22:02                 ` Steven Bosscher
  2005-11-23  0:30                   ` Scott Robert Ladd
  1 sibling, 1 reply; 59+ messages in thread
From: Steven Bosscher @ 2005-11-22 22:02 UTC (permalink / raw)
  To: gcc
  Cc: Scott Robert Ladd, Jan Hubicka, Benjamin Kosnik, Daniel Berlin, rth, gdr

On Tuesday 22 November 2005 21:18, Scott Robert Ladd wrote:
> Jan Hubicka wrote:
> > I should note that comparison to ICC is not quite fair since it lacks
> > Opteron tuning...
>
> I think you may be comparing oranges to tangerines -- not as bad as
> apples and oranges, but still potentially an invalid comparison.
>
> In my experience the extra registers of the Opteron provide a
> significant benefit; GCC has an unfair advantage if ICC only generated
> code for the small set of x86 registers.

It obviously doesn't do that.  ICC uses that larger register file, too,
for x86-64.

AMD's Opteron (AMD64) and Intel's Nocona (EM64T-cheap-ass-AMD64-clone)
are both just implementations of the x86-64 architecture.  And ICC is
tuned for EM64T, I would guess.  GCC is tuned for AMD64.

But both compilers compile for x86-64, so both compilers use the larger
register file.

Gr.
Steven

* Re: Thoughts on LLVM and LTO
  2005-11-22 18:17 ` Benjamin Kosnik
  2005-11-22 18:27   ` Gabriel Dos Reis
@ 2005-11-22 22:06   ` Steven Bosscher
  2005-11-22 22:44   ` Chris Lattner
  2005-11-23 12:32   ` Diego Novillo
  3 siblings, 0 replies; 59+ messages in thread
From: Steven Bosscher @ 2005-11-22 22:06 UTC (permalink / raw)
  To: gcc; +Cc: Benjamin Kosnik

On Tuesday 22 November 2005 19:17, Benjamin Kosnik wrote:
> What about compile-time performance?
>
> I'd actually like to make this a requirement, regardless of the option
> chosen.

Amen.

Maybe we should pick a baseline compiler, and require that all
compile time comparisons are made wrt. that baseline instead of
wrt. the head of the trunk, to avoid the gradual 0.05%-per-patch
slowdowns that keep accumulating...
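
To illustrate why small per-patch slowdowns matter, here is a rough
sketch of how they compound.  The 0.05% figure is the one mentioned
above; the 10% threshold, the patch counts, and the function names are
arbitrary choices made for this example only.

```python
import math


def compounded_slowdown(per_patch, patches):
    """Total slowdown factor after `patches` patches, each slower by `per_patch`."""
    return (1.0 + per_patch) ** patches


def patches_until(per_patch, total):
    """Smallest number of patches whose compounded slowdown reaches `total`."""
    return math.ceil(math.log(1.0 + total) / math.log(1.0 + per_patch))


if __name__ == "__main__":
    per_patch = 0.0005  # 0.05% per patch, as in the message above
    for n in (100, 500, 1000):
        factor = compounded_slowdown(per_patch, n)
        print(f"{n:5d} patches -> {100 * (factor - 1):.1f}% slower")
    print("patches to reach 10%:", patches_until(per_patch, 0.10))
```

Measuring against a fixed baseline exposes the accumulated factor,
whereas comparing each patch only against the current trunk shows just
the 0.05% step.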

Gr.
Steven

* Re: Thoughts on LLVM and LTO
  2005-11-22 19:21           ` Benjamin Kosnik
  2005-11-22 19:37             ` Gabriel Dos Reis
  2005-11-22 20:09             ` Daniel Berlin
@ 2005-11-22 22:15             ` Steven Bosscher
  2005-11-22 22:28               ` Eric Botcazou
  2 siblings, 1 reply; 59+ messages in thread
From: Steven Bosscher @ 2005-11-22 22:15 UTC (permalink / raw)
  To: gcc; +Cc: Benjamin Kosnik, Daniel Berlin, gdr

On Tuesday 22 November 2005 20:21, Benjamin Kosnik wrote:
> Tree-SSA managed to add new technology to the compiler without major
> slowdowns.

You must be looking at different timings than I do.

GCC 4.1 is on average almost 40% slower than GCC 3.3.

Gr.
Steven

* Re: Thoughts on LLVM and LTO
  2005-11-22 19:06   ` Richard Henderson
  2005-11-22 19:28     ` David Edelsohn
@ 2005-11-22 22:19     ` Steven Bosscher
  2005-11-22 22:50     ` Chris Lattner
  2 siblings, 0 replies; 59+ messages in thread
From: Steven Bosscher @ 2005-11-22 22:19 UTC (permalink / raw)
  To: Richard Henderson; +Cc: gcc, Diego Novillo

On Tuesday 22 November 2005 20:06, Richard Henderson wrote:
> > The GVM plan could take years to get to that point...
>
> Could, but probably won't.  I'd have actually guessed they could
> have something functional, if not 100% robust, in 6 months given
> 2 or 3 people on the project.

Yes.  But would the memory footprint be as small as LLVM's?

Your worst fears are for debug information.  Mine are for the 'tree'
data structures, for which, frankly, I see absolutely no easy way to
make them smaller without major (or rather, Major) surgery that hurts
at least as much as just moving to some entirely different IL for the
optimizers.

Without less heavy data structures, we're going to have such a huge
memory footprint that IPA would be practically impossible for serious
applications.

Gr.
Steven

* Re: Thoughts on LLVM and LTO
  2005-11-22 16:37 ` Daniel Jacobowitz
@ 2005-11-22 22:27   ` Chris Lattner
  0 siblings, 0 replies; 59+ messages in thread
From: Chris Lattner @ 2005-11-22 22:27 UTC (permalink / raw)
  To: Daniel Jacobowitz; +Cc: Diego Novillo, gcc

On Tue, 22 Nov 2005, Daniel Jacobowitz wrote:
>> The initial impression I get is that LLVM involves starting from scratch.
>> I don't quite agree that this is necessary.  One of the engineering
>> challenges we need to tackle is the requirement of keeping a fully
>> functional compiler *while* we improve its architecture.
>
> Most of your concerns seem to be based on this impression; I don't
> think it's right.  I'll keep this brief since others can probably
> answer the details more accurately than I can.

FWIW, I completely agree with Daniel's message here.

> LLVM as a backend, i.e. replacing everything from GIMPLE -> assembly,
> would involve a lot of starting from scratch.  e.g. your later example
> of limited target support.  One of the options Chris proposed is
> an optional GIMPLE -> LLVM -> GIMPLE process, in which:

Correct.  I think that this is, by far, the most logical first step.  As 
the LLVM code generators mature, enabling them on a per-target basis can 
make sense.  If the GCC RTL backend improves (e.g. with the new RA 
design), perhaps the transition will never occur.

> (A) the LLVM step is only necessary for optimization - I like this for
> lots of reasons, not least being that we could bootstrap without a C++
> compiler.

Great point.

> (B) the LLVM register allocator, backend, et cetera would be optional
> or unused, and the existing GCC backends would be used instead.  Which
> are there today, need some modernizing, but work very well.

Exactly.

> The LLVM -> GIMPLE translator does not exist yet; I believe Chris has a
> prototype of the GIMPLE -> LLVM layer working, and it took him under a
> month.  I've been convinced that the opposite direction would be as
> straightforward.  That's something a sufficiently motivated developer
> could hack out in the course of this discussion.

It took under a month while multitasking and doing several other things 
:).  I will send out the patch for some concrete ideas of what it 
involves.

>> From what I understand, LLVM has never been used outside of a research
>> environment and it can only generate code for a very limited set of
>> targets.  These two are very serious limitations.
>
> LLVM is indeed very new.  At this point I believe it has been used
> outside of a research environment, but I can't say how thoroughly.

Yes, it has been used by several industrial groups, e.g. people targeting 
unconventional devices, and for other things (such as a JIT compiler for 
shaders in a graphics program).  Apple is investing in it as well, as 
mentioned before.

>> Finally, I would be very interested in timelines.  Neither proposal
>> mentions them.  My impression is that they will both take roughly the same
>> amount of time, though the LLVM approach (as described) may take longer
>> because it seems to have more missing pieces.

My personal goal is to have debug info, vectorization, and inline asm 
fully working by Summer 2006, assuming no help outside of what Apple is 
investing.

In terms of bigger goals, I intend to be compiling all of OS/X by next 
December (again assuming no major help).

These are somewhat safe/conservative goals, but they should give some 
indication for my plans.  With help, they would be significantly moved up. 
:)

-Chris

-- 
http://nondot.org/sabre/
http://llvm.org/

* Re: Thoughts on LLVM and LTO
  2005-11-22 22:15             ` Steven Bosscher
@ 2005-11-22 22:28               ` Eric Botcazou
  2005-11-22 22:51                 ` Steven Bosscher
  0 siblings, 1 reply; 59+ messages in thread
From: Eric Botcazou @ 2005-11-22 22:28 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: gcc, Benjamin Kosnik, Daniel Berlin, gdr

> > Tree-SSA managed to add new technology to the compiler without major
> > slowdowns.
>
> You must be looking at different timings than I do.
>
> GCC 4.1 is on average almost 40% slower than GCC 3.3.

That's not true for GCC 4.0.

-- 
Eric Botcazou

* Re: Thoughts on LLVM and LTO
  2005-11-22 16:57 ` Steven Bosscher
  2005-11-22 17:28   ` Daniel Berlin
  2005-11-22 19:06   ` Richard Henderson
@ 2005-11-22 22:31   ` Chris Lattner
  2 siblings, 0 replies; 59+ messages in thread
From: Chris Lattner @ 2005-11-22 22:31 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: gcc, Diego Novillo

On Tue, 22 Nov 2005, Steven Bosscher wrote:
> On Tuesday 22 November 2005 17:20, Diego Novillo wrote:
>> The initial impression I get is that LLVM involves starting from scratch.
> I thought it would basically "only" replace the GIMPLE parts of the
> compiler.  That is,
>
> FE	-->	GENERIC	-->	LLVM	-->	RTL	--> asm
> (trees)		(trees)
>
> In the longer-term, you could maybe cut out the RTL part for targets for 
> which LLVM has its own backend. This is not less evolutionary or 
> revolutionary than tree-ssa was IMHO.

Yes, agreed.  For my work at Apple, we will probably end up using the LLVM 
backends.  For the bigger GCC picture, making use of the RTL backends is 
essential.

>> With our limited resources, we cannot really afford to go off on a
>> multi-year tangent nurturing and growing a new technology just to add a
>> new feature.
>
> It depends on who is going to invest these resources.  Would you want
> to tell Apple they can't do this even though they can? ;-)

:)

>> But what are the
>> timelines?  What resources are needed?
>
> Interesting questions.  Both projects obviously will take significant
> effort.  But IIUC Chris has bits of the LLVM stuff already going, so
> he has the head-start (like tree-SSA did when LLVM was introduced to
> the GCC community, ironically? ;-) so maybe Chris can have a working
> prototype implementation within, what, months?  The GVM plan could
> take years to get to that point...

That is the plan.

> So my dummy prediction would be that the LLVM path would result in a
> reasonable product more quickly than the GVM plan -- iff RTL stays.

Yes, totally agreed.

-Chris

-- 
http://nondot.org/sabre/
http://llvm.org/


* Re: Thoughts on LLVM and LTO
  2005-11-22 18:17 ` Benjamin Kosnik
  2005-11-22 18:27   ` Gabriel Dos Reis
  2005-11-22 22:06   ` Steven Bosscher
@ 2005-11-22 22:44   ` Chris Lattner
  2005-11-23 12:32   ` Diego Novillo
  3 siblings, 0 replies; 59+ messages in thread
From: Chris Lattner @ 2005-11-22 22:44 UTC (permalink / raw)
  To: Benjamin Kosnik; +Cc: gcc

On Tue, 22 Nov 2005, Benjamin Kosnik wrote:
>> Another minor nit is performance.  Judging by SPEC, LLVM has some
>> performance problems.  It's very good for floating point (a 9%
>> advantage over GCC), but GCC has a 24% advantage over LLVM 1.2 in
>> integer code.  I'm sure that is fixable and I only have data for an
>> old release of LLVM.  But is still more work to be done.  Particularly
>> for targets not yet supported by LLVM.

First off, this is for an extremely old version of LLVM (1.5yrs old, which 
represents 1/3 of LLVM's life :).

For a better picture, you can take a look at some of the PowerPC numbers 
here: http://persephone.cs.uiuc.edu/~oscar/nightlytest/

It's hard to decode for people who are not used to staring at the Tables, 
but overall, LLVM is about a 10-20% win over GCC (rough numbers) on PPC. 
The X86 backend is getting more investment now, as it has been without a 
maintainer for quite a while.  I don't think its performance is as good as 
the PPC backend yet.   In any case, going to RTL solves that issue.

> What about compile-time performance?

LLVM has very good compile-time performance overall, and is one of the 
reasons that Apple is interested in it.  It was designed with modern 
principles, and has had the advantage of not having to worry about legacy 
code to maintain.  For -O0 compiles for example, I'm targeting a 20% 
speedup in the backend vs GCC 4 (using the native LLVM code generators).

The individual LLVM optimizers are almost all very fast (though a couple 
need to be tuned), and the link-time stages are all quite fast (though 
obviously not as fast as not doing link-time optzn).  For some examples, 
you can check out my thesis work, which is doing extremely aggressive 
context sensitive analysis and datastructure transformations in 3-4% of 
GCC compile times (i.e., single digit seconds for large programs like 
gcc).  When dealing with large programs, it's "just" a matter of using 
good algorithms and data structures: there is no other solution.

OTOH the specific questions about link-time compile-time performance, as 
others have pointed out, are not really that interesting.  They would only 
be enabled at -O4 (or something) and the other link-time proposal would 
also have a substantial impact on compile-times (I posit that it will be 
far worse than compile times using LLVM).  Besides using good algorithms 
and data structures, there is nothing you can do.  Doing optimization at 
link time *will* be slower than doing it at compile time: the question is 
just how much.

> I'd actually like to make this a requirement, regardless of the option
> chosen.

Agreed.  For my work at Apple at least, compile-times are a very important 
part of the work (and one of the direct motivators).

-Chris

-- 
http://nondot.org/sabre/
http://llvm.org/


* Re: Thoughts on LLVM and LTO
  2005-11-22 19:06   ` Richard Henderson
  2005-11-22 19:28     ` David Edelsohn
  2005-11-22 22:19     ` Steven Bosscher
@ 2005-11-22 22:50     ` Chris Lattner
  2005-11-22 23:23       ` Diego Novillo
  2 siblings, 1 reply; 59+ messages in thread
From: Chris Lattner @ 2005-11-22 22:50 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Steven Bosscher, gcc, Diego Novillo

On Tue, 22 Nov 2005, Richard Henderson wrote:
> On Tue, Nov 22, 2005 at 05:58:14PM +0100, Steven Bosscher wrote:
>> I thought it would basically "only" replace the GIMPLE parts of the
>> compiler.  That is,
>>
>> FE	-->	GENERIC	-->	LLVM	-->	RTL	--> asm
>> (trees)		(trees)
>
> This is certainly the only way to avoid losing functionality.
>
> I worry that this path will bitrot as 99% of folk use the llvm
> path straight through to assembly on i386.  But perhaps a config
> option to force the rtl path, plus some automated testing, can
> prevent that from happening too fast.

I'm not sure that's a real concern.  Considering that (if people wanted to 
use the LLVM code generator at all) it would only be enabled for some 
targets, the mechanism would already be in place to disable the LLVM 
backend.  If the mechanism is already in place, allowing people to disable 
it on targets where it is supported would be trivial.

>> It depends on who is going to invest these resources.  Would you want
>> to tell Apple they can't do this even though they can? ;-)
>
> No, but we also might want to let Apple work on this for a year
> and then come back with something more concrete than "it should
> be easy".

At this point, that is a reasonable decision to make.  The work will 
progress without outside involvement, it would just go *faster* with 
outside involvement :).

In practice, all this discussion boils down to is: when (and if) we merge 
the LLVM work into the main GCC tree.  It can be disabled by default while 
in progress if desired, but are we going to make everyone interested in it 
use the Apple branch?

> The biggest technical problem I see with LLVM is actually the debug
> info.  Frankly, I'm not sure I even want to consider LLVM until
> that's done.  If it's as easy as Chris and Danny make it out to be,
> then they'll have it knocked off in short order.  If not ...

As far as scheduling goes, we will probably start intense work on that in 
January (given that the holidays are coming up).  Waiting until that piece 
is in place would be reasonable.

-Chris

-- 
http://nondot.org/sabre/
http://llvm.org/


* Re: Thoughts on LLVM and LTO
  2005-11-22 22:28               ` Eric Botcazou
@ 2005-11-22 22:51                 ` Steven Bosscher
  2005-11-22 23:05                   ` Eric Botcazou
  0 siblings, 1 reply; 59+ messages in thread
From: Steven Bosscher @ 2005-11-22 22:51 UTC (permalink / raw)
  To: gcc; +Cc: Eric Botcazou, Benjamin Kosnik, Daniel Berlin, gdr

On Tuesday 22 November 2005 23:32, Eric Botcazou wrote:
> > > Tree-SSA managed to add new technology to the compiler without major
> > > slowdowns.
> >
> > You must be looking at different timings than I do.
> >
> > GCC 4.1 is on average almost 40% slower than GCC 3.3.
>
> That's not true for GCC 4.0.

True, but GCC 4.0 produces code that is hardly better than what
GCC 3.3 makes of it, and 4.0 is still significantly slower.  Just
not as much as GCC 4.1 (something like 15%-20% wrt. GCC 3.3, iirc).

Gr.
Steven


* Re: Thoughts on LLVM and LTO
  2005-11-22 19:07           ` Benjamin Kosnik
  2005-11-22 20:04             ` Some GCC 4.1 benchmarks (Re: Thoughts on LLVM and LTO) Jan Hubicka
@ 2005-11-22 22:52             ` Chris Lattner
  1 sibling, 0 replies; 59+ messages in thread
From: Chris Lattner @ 2005-11-22 22:52 UTC (permalink / raw)
  To: Benjamin Kosnik; +Cc: Daniel Berlin, rth, gdr, gcc

On Tue, 22 Nov 2005, Benjamin Kosnik wrote:
>> Which is why i said "It's fine to say compile time performance of the
>> middle end portions we may replace should be same or better".
>>
>> And if you were to look right now, it's actually significantly better in
>> some cases :(
> http://people.redhat.com/dnovillo/spec2000.i686/gcc/global-build-secs_elapsed.html
>
> And some more
> http://llvm.cs.uiuc.edu/testresults/X86/2005-11-01.html
>
> I'm  not sure about accuracy, or versions of LLVM used, etc.

No, this is not fair at all.  The version you're comparing against is a 
debug version of LLVM.  Debug versions of LLVM are literally 10x slower or 
more than release versions.  Further, those compile times are with the 
"old llvm gcc", which is not only doing link-time optimization (which your 
GCC numbers aren't) it's also writing out a massive text file and reading 
it back in at compile-time.

> Although promising on some things (as Diego said), LLVM execute and
> compile performance is a mixed bag.

As I mentioned before, the X86 backend does not currently produce stellar 
code.  The PPC backend is better, and the whole thing is a moot point if 
we're going to RTL :)

-Chris

-- 
http://nondot.org/sabre/
http://llvm.org/


* Re: Thoughts on LLVM and LTO
  2005-11-22 22:51                 ` Steven Bosscher
@ 2005-11-22 23:05                   ` Eric Botcazou
  0 siblings, 0 replies; 59+ messages in thread
From: Eric Botcazou @ 2005-11-22 23:05 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: gcc, Benjamin Kosnik, Daniel Berlin, gdr

> True, but GCC 4.0 produces code that is hardly better than what
> GCC 3.3 makes of it, and 4.0 is still significantly slower.

Maybe compared to your "hammer" branch.  On SPARC, FSF 3.4 is definitely 
better than FSF 3.3 and 4.0 not worse than 3.4.

> Just not as much as GCC 4.1 (something like 15%-20% wrt. GCC 3.3, iirc).

Yes, Tree-SSA per se is not responsible for the 40% slowdown you reported.

-- 
Eric Botcazou


* Re: Thoughts on LLVM and LTO
  2005-11-22 22:50     ` Chris Lattner
@ 2005-11-22 23:23       ` Diego Novillo
  2005-11-22 23:42         ` David Edelsohn
  2005-11-23  2:28         ` Chris Lattner
  0 siblings, 2 replies; 59+ messages in thread
From: Diego Novillo @ 2005-11-22 23:23 UTC (permalink / raw)
  To: Chris Lattner; +Cc: gcc

Chris,

You will need to address two, potentially bigger, issues: license and 
implementation language.  You will need to get University of Illinois and 
past/present LLVM developers to assign the copyright over to the FSF.  
Yes, you've claimed it's easy, but it needs to be done.  Otherwise, we are 
in limbo.  We cannot do anything with LLVM until this is finalized.

Over the last couple of years, there have been some half hearted attempts 
at suggesting C++ as a new implementation language for GCC.  I would 
personally love to see us move to C++, but so far that has not happened.  
I am not quite sure how to address this.


* Re: Thoughts on LLVM and LTO
  2005-11-22 23:23       ` Diego Novillo
@ 2005-11-22 23:42         ` David Edelsohn
  2005-11-22 23:52           ` Daniel Jacobowitz
  2005-11-22 23:57           ` Diego Novillo
  2005-11-23  2:28         ` Chris Lattner
  1 sibling, 2 replies; 59+ messages in thread
From: David Edelsohn @ 2005-11-22 23:42 UTC (permalink / raw)
  To: Diego Novillo; +Cc: Chris Lattner, gcc

>>>>> Diego Novillo writes:

Diego> Over the last couple of years, there have been some half hearted attempts 
Diego> at suggesting C++ as a new implementation language for GCC.  I would 
Diego> personally love to see us move to C++, but so far that has not happened.  
	C++ is not an issue that Chris can address or should be asked to
address.  I will work with the GCC SC and FSF on that issue once the
licensing issue is addressed and we know LLVM is a viable option.

David


* Re: Thoughts on LLVM and LTO
  2005-11-22 23:42         ` David Edelsohn
@ 2005-11-22 23:52           ` Daniel Jacobowitz
  2005-11-23  0:09             ` Joe Buck
  2005-11-22 23:57           ` Diego Novillo
  1 sibling, 1 reply; 59+ messages in thread
From: Daniel Jacobowitz @ 2005-11-22 23:52 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Diego Novillo, Chris Lattner, gcc

On Tue, Nov 22, 2005 at 06:42:11PM -0500, David Edelsohn wrote:
> >>>>> Diego Novillo writes:
> 
> Diego> Over the last couple of years, there have been some half hearted attempts 
> Diego> at suggesting C++ as a new implementation language for GCC.  I would 
> Diego> personally love to see us move to C++, but so far that has not happened.  
> 	C++ is not an issue that Chris can address or should be asked to
> address.  I will work with the GCC SC and FSF on that issue once the
> licensing issue is addressed and we know LLVM is a viable option.

That covers the FSF issue, but the GCC developers have their own say in
the question, too.

Without going any further into this historically touchy subject, I'd
just like to reiterate one point I made earlier: I think that at this
time there would be concrete benefits to confining C++ to the
optimizers, i.e. preserving the ability to bootstrap without a C++
compiler.

That said, I wish it weren't necessary.

-- 
Daniel Jacobowitz
CodeSourcery, LLC


* Re: Thoughts on LLVM and LTO
  2005-11-22 23:42         ` David Edelsohn
  2005-11-22 23:52           ` Daniel Jacobowitz
@ 2005-11-22 23:57           ` Diego Novillo
  2005-11-23  0:05             ` Gabriel Dos Reis
  2005-11-23  0:08             ` Robert Dewar
  1 sibling, 2 replies; 59+ messages in thread
From: Diego Novillo @ 2005-11-22 23:57 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Chris Lattner, gcc

On Tuesday 22 November 2005 18:42, David Edelsohn wrote:

> I will work with the GCC SC and FSF on that issue once the licensing
> issue is addressed and we know LLVM is a viable option.
>
What purpose would that serve?  I'm not concerned about the SC, initially.  
It's the development community at large that needs convincing first.


* Re: Thoughts on LLVM and LTO
  2005-11-22 23:57           ` Diego Novillo
@ 2005-11-23  0:05             ` Gabriel Dos Reis
  2005-11-23  2:24               ` Chris Lattner
  2005-11-23  0:08             ` Robert Dewar
  1 sibling, 1 reply; 59+ messages in thread
From: Gabriel Dos Reis @ 2005-11-23  0:05 UTC (permalink / raw)
  To: Diego Novillo; +Cc: David Edelsohn, Chris Lattner, gcc

Diego Novillo <dnovillo@redhat.com> writes:

| On Tuesday 22 November 2005 18:42, David Edelsohn wrote:
| 
| > I will work with the GCC SC and FSF on that issue once the licensing
| > issue is addressed and we know LLVM is a viable option.
| >
| What purpose would that serve?  I'm not concerned about the SC, initially.  
| It's the development community at large that needs convincing first.

help me finish converting GCC to something compilable with a C++ compiler.
Without having something concrete to test with, we'll go again over
abstract arguments -- I did the conversion (many times) on my machine
as a proof-of-concept.

Lots of things kept (and are keeping) me busy. But, I'm back now and
working on it.  I'd not mind more hands/help. 

-- Gaby


* Re: Thoughts on LLVM and LTO
  2005-11-22 23:57           ` Diego Novillo
  2005-11-23  0:05             ` Gabriel Dos Reis
@ 2005-11-23  0:08             ` Robert Dewar
  1 sibling, 0 replies; 59+ messages in thread
From: Robert Dewar @ 2005-11-23  0:08 UTC (permalink / raw)
  To: Diego Novillo; +Cc: David Edelsohn, Chris Lattner, gcc

Diego Novillo wrote:
> On Tuesday 22 November 2005 18:42, David Edelsohn wrote:

>>I will work with the GCC SC and FSF on that issue once the licensing
>>issue is addressed and we know LLVM is a viable option.
>> 
> What purpose would that serve?  I'm not concerned about the SC, initially.  
> It's the development community at large that needs convincing first.

My view here is that the primary reason for sticking to C is to
avoid aggravating the bootstrapping procedure. I see no objection
on this basis to including the use of C++ in optimizers. Of course
appropriate coding standards etc will have to be agreed on.

There may be people who just plain don't like C++, but that's a
completely different argument. I am not a great C++ fan, but I
see no reason not to have optimizers written in C++. At this stage
plenty of people are quite familiar with this language, so I do
not see that it would restrict development (proposing Ada for this
purpose would be a bit more controversial :-)

It seems reasonable to me to first resolve the licensing issues.


* Re: Thoughts on LLVM and LTO
  2005-11-22 23:52           ` Daniel Jacobowitz
@ 2005-11-23  0:09             ` Joe Buck
  0 siblings, 0 replies; 59+ messages in thread
From: Joe Buck @ 2005-11-23  0:09 UTC (permalink / raw)
  To: David Edelsohn, Diego Novillo, Chris Lattner, gcc



> > >>>>> Diego Novillo writes:

> Over the last couple of years, there have been some half hearted attempts 
> at suggesting C++ as a new implementation language for GCC.  I would 
> personally love to see us move to C++, but so far that has not happened.  

On Tue, Nov 22, 2005 at 06:42:11PM -0500, David Edelsohn wrote:
> > 	C++ is not an issue that Chris can address or should be asked to
> > address.  I will work with the GCC SC and FSF on that issue once the
> > licensing issue is addressed and we know LLVM is a viable option.

On Tue, Nov 22, 2005 at 06:52:33PM -0500, Daniel Jacobowitz wrote:
> That covers the FSF issue, but the GCC developers have their own say in
> the question, too.

RMS has strongly objected to C++ use in the past, but of course there's
no reason to bring up the subject with him unless and until there's developer
consensus.  So IMHO it's premature to have an SC or FSF discussion at this
point ... sometimes a "heads up" message is wise, but I don't think so in
this case.

> Without going any further into this historically touchy subject, I'd
> just like to reiterate one point I made earlier: I think that at this
> time there would be concrete benefits to confining C++ to the
> optimizers, i.e. preserving the ability to bootstrap without a C++
> compiler.

Yes, making bootstrapping more difficult is a real issue, but a mixed
approach could add difficulties of its own.



* Re: Some GCC 4.1 benchmarks (Re: Thoughts on LLVM and LTO)
  2005-11-22 22:02                 ` Steven Bosscher
@ 2005-11-23  0:30                   ` Scott Robert Ladd
  0 siblings, 0 replies; 59+ messages in thread
From: Scott Robert Ladd @ 2005-11-23  0:30 UTC (permalink / raw)
  To: GCC Mailing List

Steven Bosscher wrote:
> It obviously doesn't do that.  ICC uses that larger register file, too,
> for x86-64.

The Intel compiler can be set to compile for multiple processors,
keeping different versions of the same function in an executable and
picking which code to run based on the processor in use. Thus it is
quite possible to compile code (with Intel) on an Opteron, but have it
surreptitiously run using a lesser instruction and register set.

> AMD's Opteron (AMD64) and Intel's Nocona (EM64T-cheap-ass-AMD64-clone)
> are both just implementations of the x86-64 architecture.  And ICC is
> tuned for EM64T, I would guess.  GCC is tuned for AMD64.

Yes, I'm aware of that. Which makes me suspicious of any benchmark that
uses Intel's "EM64T tuned" compiler on the Opteron.

-- 
Scott Robert Ladd <scott.ladd@coyotegulch.com>
Coyote Gulch Productions
http://www.coyotegulch.com


* Re: Thoughts on LLVM and LTO
  2005-11-23  0:05             ` Gabriel Dos Reis
@ 2005-11-23  2:24               ` Chris Lattner
  2005-11-23  2:43                 ` Gabriel Dos Reis
  0 siblings, 1 reply; 59+ messages in thread
From: Chris Lattner @ 2005-11-23  2:24 UTC (permalink / raw)
  To: Gabriel Dos Reis; +Cc: Diego Novillo, David Edelsohn, gcc

On Tue, 23 Nov 2005, Gabriel Dos Reis wrote:
> Diego Novillo <dnovillo@redhat.com> writes:
> | On Tuesday 22 November 2005 18:42, David Edelsohn wrote:
> | > I will work with the GCC SC and FSF on that issue once the licensing
> | > issue is addressed and we know LLVM is a viable option.
> | >
> | What purpose would that serve?  I'm not concerned about the SC, initially.
> | It's the development community at large that needs convincing first.
>
> help me finish converting GCC to something compilable with a C++ compiler.
> Without having something concrete to test with, we'll go again over
> abstract arguments -- I did the conversion (many times) on my machine
> as a proof-of-concept.
>
> Lots of things kept (and are keeping) me busy. But, I'm back now and
> working on it.  I'd not mind more hands/help.

Why is this important?  I'm finding that compiling GCC with a C compiler 
and using extern "C" around the headers works just fine.

-Chris

-- 
http://nondot.org/sabre/
http://llvm.org/


* Re: Thoughts on LLVM and LTO
  2005-11-22 23:23       ` Diego Novillo
  2005-11-22 23:42         ` David Edelsohn
@ 2005-11-23  2:28         ` Chris Lattner
       [not found]           ` <m3fypnhjnm.fsf@gossamer.airs.com>
  1 sibling, 1 reply; 59+ messages in thread
From: Chris Lattner @ 2005-11-23  2:28 UTC (permalink / raw)
  To: Diego Novillo; +Cc: gcc

On Tue, 22 Nov 2005, Diego Novillo wrote:
> You will need to address two, potentially bigger, issues: license and
> implementation language.

> Over the last couple of years, there have been some half hearted attempts
> at suggesting C++ as a new implementation language for GCC.  I would
> personally love to see us move to C++, but so far that has not happened.
> I am not quite sure how to address this.

As mentioned, there is nothing I can do about the implementation language. 
If the GCC community doesn't like C++, they are welcome to ignore LLVM 
and/or take inspiration from it, but I'm personally not interested in being 
involved.

> You will need to get University of Illinois and
> past/present LLVM developers to assign the copyright over to the FSF.
> Yes, you've claimed it's easy, but it needs to be done.  Otherwise, we are
> in limbo.  We cannot do anything with LLVM until this is finalized.

I would definitely like to get this process running, but unfortunately 
it will have to wait until January.  The main person I have to talk to has 
gone to India for Christmas, so I can't really start the process until 
January.  Yes, I'm incredibly frustrated with this as well. :(

-Chris

-- 
http://nondot.org/sabre/
http://llvm.org/


* Re: Thoughts on LLVM and LTO
  2005-11-23  2:24               ` Chris Lattner
@ 2005-11-23  2:43                 ` Gabriel Dos Reis
  0 siblings, 0 replies; 59+ messages in thread
From: Gabriel Dos Reis @ 2005-11-23  2:43 UTC (permalink / raw)
  To: Chris Lattner; +Cc: Diego Novillo, David Edelsohn, gcc

Chris Lattner <sabre@nondot.org> writes:

| On Tue, 23 Nov 2005, Gabriel Dos Reis wrote:
| > Diego Novillo <dnovillo@redhat.com> writes:
| > | On Tuesday 22 November 2005 18:42, David Edelsohn wrote:
| > | > I will work with the GCC SC and FSF on that issue once the licensing
| > | > issue is addressed and we know LLVM is a viable option.
| > | >
| > | What purpose would that serve?  I'm not concerned about the SC, initially.
| > | It's the development community at large that needs convincing first.
| >
| > help me finish converting GCC to something compilable with a C++ compiler.
| > Without having something concrete to test with, we'll go again over
| > abstract arguments -- I did the conversion (many times) on my machine
| > as a proof-of-concept.
| >
| > Lots of things kept (and are keeping) me busy. But, I'm back now and
| > working on it.  I'd not mind more hands/help.
| 
| Why is this important?

For what? LLVM?  I never said it was.

|  I'm finding that compiling GCC with a C
| compiler and using extern "C" around the headers works just fine.

some of us need more than just 'extern "C"' around the headers.  And
also notice that, not a long time ago you could not just pretend an
'extern "C"' around the headers and compile it.  For more
information, search the archive. 

-- Gaby


* Re: Thoughts on LLVM and LTO
  2005-11-22 16:45 ` Daniel Berlin
  2005-11-22 18:03   ` Scott Robert Ladd
@ 2005-11-23 12:11   ` Diego Novillo
  2005-11-27 19:58   ` Devang Patel
  2 siblings, 0 replies; 59+ messages in thread
From: Diego Novillo @ 2005-11-23 12:11 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: gcc

On Tuesday 22 November 2005 11:45, Daniel Berlin wrote:

> > Another minor nit is performance.  Judging by SPEC, LLVM has some
> > performance problems.  It's very good for floating point (a 9%
> > advantage over GCC), but GCC has a 24% advantage over LLVM 1.2 in
> > integer code.  I'm sure that is fixable and I only have data for an
> > old release of LLVM
>
> Uh, you are comparing 4 releases ago of LLVM, against the current
> release of gcc, and saying "It doesn't do as well".
>
Yes, that's why I said I needed more work.  I ran with the latest release I 
could find (LLVM 1.6).  I'm not quite sure how to hook up the gfortran FE 
to LLVM, so for SPECfp I could only run the C tests.  Also, at -O3 LLVM 
1.6 fails eon and perlbmk.

On x86 LLVM 1.6 still lags behind GCC in SPECint (7%) but the gap is 
narrower, an excellent sign.  For SPECfp, the difference is similar to 
what it was with 1.2 (LLVM's score is 10% better).

Chris mentioned that the PPC backend is better.  We wouldn't be using 
LLVM's back end, so I guess this is not really a problem.  

	Processor:  Intel(R) Pentium(R) 4 CPU 2.26GHz (2260.065 Mhz)
	Memory:     1034832 kB
	Cache:      512 KB

Before Compiler
	Compiler:   gcc version 3.4-llvm 20051104 (LLVM 1.6)
	Peak flags: -O3 -Wl,-native-cbe

After Compiler
	Compiler:   gcc version 4.1.0 20051117 (experimental)
	Peak flags: -O3


SPECint results for peak

    Benchmark   Before   After  % diff
     164.gzip   550.30  659.84  + 19.90%
      175.vpr   412.63  424.97  +  2.99%
      176.gcc   726.04  759.82  +  4.65%
      181.mcf   432.00  425.03  -  1.61%
   186.crafty   507.09  680.06  + 34.11%
   197.parser   557.70  610.13  +  9.40%
      252.eon     0.00  575.67  INF
  253.perlbmk     0.00  767.01  INF
      254.gap   726.62  750.56  +  3.29%
   255.vortex  1142.30  833.70  - 27.02%
    256.bzip2   469.00  524.18  + 11.77%
    300.twolf   488.30  532.43  +  9.04%
         mean   573.20  614.47  +  7.20%


SPECfp result for peak

    Benchmark   Before   After  % diff
  168.wupwise     0.00  662.55  INF
     171.swim     0.00  496.77  INF
    172.mgrid     0.00  445.58  INF
    173.applu     0.00  598.96  INF
     177.mesa   521.26  427.31  - 18.02%
   178.galgel     0.00  351.39  INF
      179.art   366.68  189.75  - 48.25%
   183.equake   838.25  858.94  +  2.47%
  187.facerec     0.00  359.06  INF
     188.ammp   352.94  360.97  +  2.27%
    189.lucas     0.00  507.08  INF
    191.fma3d     0.00  408.12  INF
 200.sixtrack     0.00  404.31  INF
     301.apsi     0.00  439.52  INF
         mean   487.64  440.16  -  9.74%
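
A quick way to sanity-check the summary rows: the "mean" lines look like geometric means (that is my reading of the numbers, not something stated in the post), with the 0.00 entries for failed runs left out. A minimal Python sketch using the SPECint figures copied from the table:

```python
import math

# SPECint peak scores from the table: name -> (before, after).
# A score of 0.00 marks a benchmark that failed to build or run.
SPECINT = {
    "164.gzip": (550.30, 659.84),
    "175.vpr": (412.63, 424.97),
    "176.gcc": (726.04, 759.82),
    "181.mcf": (432.00, 425.03),
    "186.crafty": (507.09, 680.06),
    "197.parser": (557.70, 610.13),
    "252.eon": (0.00, 575.67),
    "253.perlbmk": (0.00, 767.01),
    "254.gap": (726.62, 750.56),
    "255.vortex": (1142.30, 833.70),
    "256.bzip2": (469.00, 524.18),
    "300.twolf": (488.30, 532.43),
}


def geomean(values):
    """Geometric mean of the non-zero entries in values."""
    vals = [v for v in values if v > 0]
    return math.exp(sum(math.log(v) for v in vals) / len(vals))


before = geomean(b for b, _ in SPECINT.values())
after_all = geomean(a for _, a in SPECINT.values())
# Same "after" column, restricted to benchmarks that also ran "before".
after_common = geomean(a for b, a in SPECINT.values() if b > 0)

print(f"before       {before:7.2f}")
print(f"after (all)  {after_all:7.2f}  {100 * (after_all / before - 1):+.2f}%")
print(f"after (both) {after_common:7.2f}  {100 * (after_common / before - 1):+.2f}%")
```

By my arithmetic the "Before" mean covers only the 10 benchmarks that ran, while the "After" mean covers all 12. Restricting both columns to the common 10 narrows the SPECint gap to roughly 5.5% instead of 7.2%.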


* Re: Thoughts on LLVM and LTO
  2005-11-22 18:17 ` Benjamin Kosnik
                     ` (2 preceding siblings ...)
  2005-11-22 22:44   ` Chris Lattner
@ 2005-11-23 12:32   ` Diego Novillo
  2005-11-23 18:13     ` Chris Lattner
  3 siblings, 1 reply; 59+ messages in thread
From: Diego Novillo @ 2005-11-23 12:32 UTC (permalink / raw)
  To: gcc; +Cc: Benjamin Kosnik

On Tuesday 22 November 2005 13:17, Benjamin Kosnik wrote:

> What about compile-time performance?
>
Well, it's hard to say, I have not really used LLVM extensively.  The only 
real data I have is compile times for SPECint:

	SPECint build times (secs)

				-O2	-O3

GCC 4.1.0 (20051117)		354	398
LLVM 1.6 (-Wl,-native-cbe)	802	805

So there appears to be a factor of 2 slowdown in LLVM.  However, I know 
LLVM has a separate GCC invocation.  It seems as if it emitted C code that 
is then compiled with GCC (I'm using -Wl,-native-cbe).

I don't think this would be standard procedure in an integrated LLVM.  
Chris, how would one compare compile times?  Not using -Wl,-native-cbe 
implies emitting bytecode, right?


* Re: Thoughts on LLVM and LTO
  2005-11-23 12:32   ` Diego Novillo
@ 2005-11-23 18:13     ` Chris Lattner
  2005-11-23 18:30       ` Diego Novillo
  0 siblings, 1 reply; 59+ messages in thread
From: Chris Lattner @ 2005-11-23 18:13 UTC (permalink / raw)
  To: Diego Novillo; +Cc: gcc, Benjamin Kosnik

On Wed, 23 Nov 2005, Diego Novillo wrote:

> On Tuesday 22 November 2005 13:17, Benjamin Kosnik wrote:
>
>> What about compile-time performance?
>>
> Well, it's hard to say, I have not really used LLVM extensively.  The only
> real data I have is compile times for SPECint:
>
> 	SPECint build times (secs)
>
> 				-O2	-O3
>
> GCC 4.1.0 (20051117)		354	398
> LLVM 1.6 (-Wl,-native-cbe)	802	805
>
> So there appears to be a factor of 2 slowdown in LLVM.  However, I know
> LLVM has a separate GCC invocation.  It seems as if it emitted C code that
> is then compiled with GCC (I'm using -Wl,-native-cbe).

Wow, only 2x slowdown?  That is pretty good, considering what this is 
doing.  I assume you're timing a release build here, not a debug build.

In any case, the LLVM time above includes the following:
1. An incredibly inefficient compile-time stage that is going away in the
    newly integrated compiler.  This involves producing a giant .ll file,
    writing it to disk (cache) then parsing the whole thing back in.  This
    was a pretty expensive process that existed only to avoid linking LLVM
    into GCC.
2. This time includes *full* linktime IPO (the old llvm-gcc wasn't
    integrated well enough to have -O options :( ).
3. This time includes the time to convert the LLVM code to C, write out a
    really large C file for the entire program, then fork/exec 'gcc -O2' on
    the .c file.

Considering that the slowdown is only a factor of two with all that going 
on, I think that's pretty impressive. :)

> I don't think this would be standard procedure in an integrated LLVM.
> Chris, how would one compare compile times?  Not using -Wl,-native-cbe
> implies emitting bytecode, right?

Correct, if you're timing build times, eliminating the -Wl,-native-cbe 
will give you a sense for how expensive #3 is (which will just leave you 
with LLVM .bc files).  I suspect that it is about 30% or more of the 
llvm-gcc time you report above.  For another data point, you can compile 
with '-Wl,-native' instead of -Wl,-native-cbe which will give you the LLVM 
native X86 backend.  It should be significantly faster and should provide 
another interesting performance datapoint (though admittedly probably not 
very good, due to the X86 backend needing work).
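
To compare the three configurations described here (bytecode only, C backend, native X86 backend), one could time each build the same way. A minimal Python sketch: the `llvm-gcc` command lines and the file name `foo.c` are placeholders assumed for illustration, and only the `-Wl,-native-cbe` and `-Wl,-native` flags come from this thread.

```python
import subprocess
import sys
import time


def time_command(argv):
    """Run argv once and return the elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(argv, check=True)
    return time.perf_counter() - start


# Hypothetical build variants; adjust the driver name, options and
# source file to match the actual setup being measured.
VARIANTS = {
    "bytecode only": ["llvm-gcc", "-O3", "foo.c", "-o", "foo"],
    "C backend": ["llvm-gcc", "-O3", "foo.c", "-o", "foo", "-Wl,-native-cbe"],
    "native X86": ["llvm-gcc", "-O3", "foo.c", "-o", "foo", "-Wl,-native"],
}

if __name__ == "__main__":
    # Stand-in command so the script runs anywhere; loop over VARIANTS
    # instead once a real compiler driver is installed.
    elapsed = time_command([sys.executable, "-c", "pass"])
    print(f"placeholder command took {elapsed:.3f}s")
```

Running each variant several times and taking the minimum would reduce noise from disk caching.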

In any case, one of the major motivating factors is reduced compile times.

-Chris

-- 
http://nondot.org/sabre/
http://llvm.org/


* Re: Thoughts on LLVM and LTO
  2005-11-23 18:13     ` Chris Lattner
@ 2005-11-23 18:30       ` Diego Novillo
  2005-11-27  7:59         ` Mike Stump
  0 siblings, 1 reply; 59+ messages in thread
From: Diego Novillo @ 2005-11-23 18:30 UTC (permalink / raw)
  To: Chris Lattner; +Cc: gcc, Benjamin Kosnik

On Wednesday 23 November 2005 13:13, Chris Lattner wrote:

> I assume you're timing a release build here, not a debug build. 
>
Yes, a release build.

> In any case, the LLVM time above includes the following:
> [ ... ]
>
Well, it seems that it's too early to test LLVM, then.  It's slow, and 
integer code performance isn't up to par yet.  I also couldn't test it 
with gfortran.  I'll keep an eye on the apple branch.  Will gfortran work 
on the branch?

Let me know when you folks add the patch so I can build it on my ppc SPEC 
box.  Thanks.


* Re: Thoughts on LLVM and LTO
  2005-11-22 16:37 ` Rafael Espíndola
@ 2005-11-25  2:17   ` Scott Robert Ladd
  2005-11-25 13:00     ` Rafael Espíndola
  2005-11-25 15:29     ` Diego Novillo
  0 siblings, 2 replies; 59+ messages in thread
From: Scott Robert Ladd @ 2005-11-25  2:17 UTC (permalink / raw)
  To: Rafael Espíndola, Diego Novillo; +Cc: gcc

I've been quietly watching the conversation, largely as an interested 
user as opposed to a GCC developer. One of my concerns lies with:

	GENERIC -> GIMPLE -> LLVM -> GIMPLE -> RTL

That design adds two phases (GIMPLE -> LLVM, LLVM -> GIMPLE) here -- 
perhaps simple ones, perhaps not. The line is very straight, but adding 
two more segments makes me wonder if we're complicating the plumbing.

How will this affect compiler speed?

How will debugging information flow accurately through the process?

And will we be making it even more difficult to isolate problems?

Already, we have people who understand frontends, and others who know 
GIMPLE intimately, and still others who focus on RTL generation. Is 
adding two additional passes going to further fragment expertise?

I understand Rafael's comment, as quoted here:

 > In a first stage nothing will be done with the LLVM representation
 > except convert it back to GIMPLE. This will make sure that all
 > necessary information (including debug) can pass through the LLVM. The
 > conversion will also receive very good testing with this.

Does this mean that the "LLVM pass" will initially be invoked only via an 
option, and that a normal compile will continue the current path until 
LLVM is fully tested and accepted?

Just questions; if they are stupid, please be gentle. ;)

-- 
Scott Robert Ladd <scott.ladd@coyotegulch.com>
Coyote Gulch Productions
http://www.coyotegulch.com

* Re: Thoughts on LLVM and LTO
  2005-11-25  2:17   ` Scott Robert Ladd
@ 2005-11-25 13:00     ` Rafael Espíndola
  2005-11-25 15:29     ` Diego Novillo
  1 sibling, 0 replies; 59+ messages in thread
From: Rafael Espíndola @ 2005-11-25 13:00 UTC (permalink / raw)
  To: Scott Robert Ladd; +Cc: gcc

On 11/22/05, Scott Robert Ladd <scott.ladd@coyotegulch.com> wrote:
> I've been quietly watching the conversation, largely as an interested
> user as opposed to a GCC developer. One of my concerns lies with:
I have worked on some toy front ends, so I think that I am a kind of a
user also :)

>         GENERIC -> GIMPLE -> LLVM -> GIMPLE -> RTL
>
> That design adds two phases (GIMPLE -> LLVM, LLVM -> GIMPLE) here --
> perhaps simple ones, perhaps not. The line is very straight, but adding
> two more segments makes me wonder if we're complicating the plumbing.
>
> How will this affect compiler speed?
It is hoped that optimizing in LLVM will be faster than optimizing in
GIMPLE. So optimized builds are likely to be faster.

> How will debugging information flow accurately through the process?
I think that this is an open issue. The major technical one.

> And will we be making it even more difficult to isolate problems?
Not in the LLVM part. If the conversion is turned on all the time it
will receive a lot of testing. And LLVM is simpler and has some nice
tools to help in bug hunting.

> Already, we have people who understand frontends, and others who know
> GIMPLE intimately, and still others who focus on RTL generation. Is
> adding two additional passes going to further fragment expertise?
The algorithms are going to be more or less the same. The data
structures are going to be different. There will be a need to learn a
new API, but this is true for any proposal that involves changing the
internal representation.

> I understand Rafael's comment, as quoted here:
>
>  > In a first stage nothing will be done with the LLVM representation
>  > except convert it back to GIMPLE. This will make sure that all
>  > necessary information (including debug) can pass through the LLVM. The
>  > conversion will also receive very good testing with this.
>
> Does this mean that the "LLVM pass" will initially be invoked only via an
> option, and that a normal compile will continue the current path until
> LLVM is fully tested and accepted?
I was hoping that this would be done only to have a fast track for
adding LLVM. After that, the current path would be ported as fast as
possible. Others have expressed concerns about needing a C++ compiler
to bootstrap. It may then be possible to maintain an option for
bypassing the GIMPLE -> LLVM -> GIMPLE conversion so that stage1 can be
built with a C compiler.

> Just questions; if they are stupid, please be gentle. ;)
>
> --
> Scott Robert Ladd <scott.ladd@coyotegulch.com>
> Coyote Gulch Productions
> http://www.coyotegulch.com
>
Rafael

* Re: Thoughts on LLVM and LTO
  2005-11-25  2:17   ` Scott Robert Ladd
  2005-11-25 13:00     ` Rafael Espíndola
@ 2005-11-25 15:29     ` Diego Novillo
  1 sibling, 0 replies; 59+ messages in thread
From: Diego Novillo @ 2005-11-25 15:29 UTC (permalink / raw)
  To: gcc; +Cc: Scott Robert Ladd, Rafael Espíndola

On Tuesday 22 November 2005 12:53, Scott Robert Ladd wrote:

> 	GENERIC -> GIMPLE -> LLVM -> GIMPLE -> RTL
>
> That design adds two phases (GIMPLE -> LLVM, LLVM -> GIMPLE) here --
> perhaps simple ones, perhaps not. The line is very straight, but adding
> two more segments makes me wonder if we're complicating the plumbing.
>
> How will this affect compiler speed?
>
It will likely slow down the compiler, initially.  Though we may get the 
usual mixed bag: some things will be faster, others slower.  There will be 
some infrastructure duplication, and integration is a tricky game to play.

We are still fighting integration problems with tree-ssa, and LLVM is bound 
to be trickier to integrate, as it brings a completely new data structure 
and implementation language.

In the end, of course, the result is a better compiler (both in compile 
time and codegen quality).  The question is which road takes us faster to 
the goal.

> Already, we have people who understand frontends, and others who know
> GIMPLE intimately, and still others who focus on RTL generation. Is
> adding two additional passes going to further fragment expertise?
>
No.  LLVM is conceptually no different from our current infrastructure and 
where we are planning to go framework-wise.  The reason it is attractive 
is that it already solves some data structure and infrastructure problems.  
Its data structures are leaner and it already provides a modern IPA 
architecture.

* Re: Thoughts on LLVM and LTO
  2005-11-23 18:30       ` Diego Novillo
@ 2005-11-27  7:59         ` Mike Stump
  0 siblings, 0 replies; 59+ messages in thread
From: Mike Stump @ 2005-11-27  7:59 UTC (permalink / raw)
  To: Diego Novillo; +Cc: Chris Lattner, gcc, Benjamin Kosnik

On Nov 23, 2005, at 10:30 AM, Diego Novillo wrote:
> I'll keep an eye on the apple branch.  Will gfortran work on the  
> branch?

I generally like to keep Java and Fortran working on it.  At times it 
can have various breakages, though they tend to be obvious/trivial to 
fix.  For some reason, I think it is currently broken, but I could just 
be misremembering.

* Re: Thoughts on LLVM and LTO
  2005-11-22 16:45 ` Daniel Berlin
  2005-11-22 18:03   ` Scott Robert Ladd
  2005-11-23 12:11   ` Diego Novillo
@ 2005-11-27 19:58   ` Devang Patel
  2005-11-27 20:55     ` Daniel Berlin
  2 siblings, 1 reply; 59+ messages in thread
From: Devang Patel @ 2005-11-27 19:58 UTC (permalink / raw)
  To: Daniel Berlin, Diego Novillo; +Cc: gcc

> >
> > With our limited resources, we cannot really afford to go off on a
> > multi-year tangent nurturing and growing a new technology just to add
> > a
> > new feature.
> >
> What makes you think implementing LTO from scratch is different here?

I read the entire thread (the last message I read is from Mike Stump) but
I did not see any discussion about the productivity of GCC developers.

If one approach provides tools that make developers very productive,
then it may blow the initial work estimates out of the water.

Here are the questions for the LLVM as well as the LTO folks. (To be fair,
Chris gave us some hints on a few of these, but it is OK if people ask him
for clarifications :) And I have not read anything about this in the LTO
proposal, so I take it that this may need extra work not considered in
the LTO time estimates).

1) Documentation

How good is the documentation, so that a _new_ compiler engineer can
become productive sooner?

2) Testability of optimization passes

How much precision can one get while testing a particular feature or
optimization pass?

3) Integrated tools to investigate/debug/fix optimizer bugs

4) Basic APIs needed to implement various optimization techniques

-
Devang

* Re: Thoughts on LLVM and LTO
  2005-11-27 19:58   ` Devang Patel
@ 2005-11-27 20:55     ` Daniel Berlin
  2005-11-28  2:32       ` Chris Lattner
  2006-02-02  3:38       ` Ben Elliston
  0 siblings, 2 replies; 59+ messages in thread
From: Daniel Berlin @ 2005-11-27 20:55 UTC (permalink / raw)
  To: Devang Patel; +Cc: Diego Novillo, gcc

On Sun, 2005-11-27 at 11:58 -0800, Devang Patel wrote:
> > >
> > > With our limited resources, we cannot really afford to go off on a
> > > multi-year tangent nurturing and growing a new technology just to add
> > > a
> > > new feature.
> > >
> > What makes you think implementing LTO from scratch is different here?
> 
> I read the entire thread (the last message I read is from Mike Stump) but
> I did not see any discussion about the productivity of GCC developers.
> 
> If one approach provides tools that make developers very productive,
> then it may blow the initial work estimates out of the water.
> 
> Here are the questions for the LLVM as well as the LTO folks. (To be fair,
> Chris gave us some hints on a few of these, but it is OK if people ask him
> for clarifications :) And I have not read anything about this in the LTO
> proposal, so I take it that this may need extra work not considered in
> the LTO time estimates).
> 
> 1) Documentation
> 
> How good is the documentation, so that a _new_ compiler engineer can
> become productive sooner?

There is no question that LLVM has much better documentation of IR and
semantics than we do.

See, e.g., http://llvm.cs.uiuc.edu/docs/LangRef.html


It has tutorials on writing a pass, as well as example passes:
http://llvm.cs.uiuc.edu/docs/WritingAnLLVMPass.html

> 
> 2) Testability of optimization passes
> 
> How much precision can one get while testing a particular feature or
> optimization pass?

You can run one pass at a time, if you wanted to, using opt (or two, or
three).

> 
> 3) Integrated tools to investigate/debug/fix optimizer bugs

bugpoint beats pretty much anything we have, IMHO :).

> 
> 4) Basic APIs needed to implement various optimization techniques

All the basics are there for scalar opts.  There is no data dependence
yet, but they have a fine working SCEV, so it's only a few months to
implement, at most.






* Re: Thoughts on LLVM and LTO
  2005-11-27 20:55     ` Daniel Berlin
@ 2005-11-28  2:32       ` Chris Lattner
  2006-02-02  3:38       ` Ben Elliston
  1 sibling, 0 replies; 59+ messages in thread
From: Chris Lattner @ 2005-11-28  2:32 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: Devang Patel, Diego Novillo, gcc

On Sun, 27 Nov 2005, Daniel Berlin wrote:
> On Sun, 2005-11-27 at 11:58 -0800, Devang Patel wrote:
>>> What makes you think implementing LTO from scratch is different here?
>>
>> Here are the questions for LLVM as well as LTO folks. (To be fair,
>>
>> 1) Documentation
>>
>> How good is the documentation, so that a _new_ compiler engineer can
>> become productive sooner?
>
> There is no question that LLVM has much better documentation of IR and
> semantics than we do,
>
> See, e.g., http://llvm.org/docs/LangRef.html

> It has tutorials on writing a pass, as well as example passes,
> http://llvm.org/docs/WritingAnLLVMPass.html

Yup, in addition, LLVM has several pretty good docs for various 
subsystems, the full set is included here: http://llvm.org/docs/

Another good tutorial (aimed at people writing mid-level optimization 
passes) is here: 
http://llvm.org/pubs/2004-09-22-LCPCLLVMTutorial.html

Note that the organization of the 'llvm-gcc' compiler reflects the old 
compiler, not the new one.  Other than that it is up-to-date.

For a grab bag of various LLVM apis that you may run into, this document 
is useful: http://llvm.org/docs/ProgrammersManual.html

>> 2) Testability of optimization passes
>>
>> How much precision can one get while testing a particular feature or
>> optimization pass?
>
> You can run one pass at a time, if you wanted to, using opt (or two, or
> three).

Yup; however, there is one specific reason this is important/useful. 
With the ability to write out the IR and a truly modular pass manager, you 
can write really good regression tests.  This means you can write 
regression tests for optimizers/analyses that specify the exact input to a 
pass.

With traditional GCC regtests, you write your test in C (or some other 
language).  If you're testing the 7th pass from the parser, the regression 
test may fail to test what you want as time progresses and the 6 passes 
before you (or the parser) change.  With LLVM, this isn't an issue.

Note that the link-time proposal could also implement this, but would 
require some hacking (e.g. implementing a text form for the IR) and time 
to get right.
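
To make the point about pass-level regression tests concrete, here is a 
minimal sketch in Python.  Everything in it is an assumption made for 
illustration: the three-address text format, the constant_fold pass and 
the run_pass helper are invented here and are neither LLVM's IR nor its 
opt tool.  It only shows why a textual IR lets a test pin down the exact 
input to a single pass.

```python
# Toy illustration only: a made-up three-address IR, not LLVM's or GCC's.

def parse(text):
    """Parse lines such as 'x = add 1 2' into (dest, op, args) tuples."""
    instrs = []
    for line in text.strip().splitlines():
        dest, _, rhs = line.partition("=")
        op, *args = rhs.split()
        instrs.append((dest.strip(), op, args))
    return instrs


def emit(instrs):
    """Write the IR back out in the same textual form that parse() reads."""
    return "\n".join(f"{d} = {op} {' '.join(args)}" for d, op, args in instrs)


def constant_fold(instrs):
    """Hypothetical pass: turn 'add' of two integer literals into 'const'."""
    out = []
    for dest, op, args in instrs:
        if op == "add" and all(a.lstrip("-").isdigit() for a in args):
            out.append((dest, "const", [str(int(args[0]) + int(args[1]))]))
        else:
            out.append((dest, op, args))
    return out


def run_pass(pass_fn, ir_text):
    """Run exactly one pass on exactly the given textual input."""
    return emit(pass_fn(parse(ir_text)))


# The test input is the IR the pass sees, so earlier passes cannot change it.
print(run_pass(constant_fold, "a = add 1 2\nb = add a 3"))
```

Because the test feeds the pass its input directly, changes to the parser 
or to earlier passes cannot silently alter what is being tested.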

>> 3) Integrated tools to investigate/debug/fix optimizer bugs
>
> bugpoint beats pretty much anything we have, IMHO :).

For those that are not familiar with it, here's some info:
http://llvm.org/docs/Bugpoint.html

If you are familiar with delta, it is basically a much faster and more 
powerful (but similar in spirit) automatic debugger.  It can reduce test 
cases, identify which pass is the problem, can debug ICEs and 
miscompilations, and can debug the optimizer, native backend, or JIT 
compiler.
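
For readers who have not seen this style of tool, the following Python 
sketch shows the general idea of shrinking a failing pass list.  It is 
not bugpoint's algorithm: it is a simple greedy reduction, and the pass 
names (p1 to p5) and the fails oracle are hypothetical stand-ins for 
"run these passes and check whether the bug still reproduces".

```python
def reduce_passes(passes, fails):
    """Shrink a list of pass names while fails(list) stays true.

    Greedy one-at-a-time removal; much simpler than real delta debugging.
    """
    current = list(passes)
    changed = True
    while changed:
        changed = False
        for i in range(len(current)):
            candidate = current[:i] + current[i + 1:]
            if fails(candidate):
                # The bug still reproduces without this pass, so drop it.
                current = candidate
                changed = True
                break
    return current


def fails(seq):
    """Hypothetical oracle: the bug shows up when p2 runs before p4."""
    return "p2" in seq and "p4" in seq and seq.index("p2") < seq.index("p4")


culprits = reduce_passes(["p1", "p2", "p3", "p4", "p5"], fails)
print(culprits)  # ['p2', 'p4']
```

The same loop works on any list-shaped input, such as the functions in a 
test case, as long as an oracle can say whether the failure remains.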

>> 4) Basic APIs needed to implement various optimization techniques
>
> All the basics are there for scalar opts.  There is no data dependence
> yet, but they have a fine working SCEV, so it's only a few months to
> implement, at most.

Yup.  LLVM has the scalar optimizations basically covered, but is weak on 
loop optimizations.  This is something that we intend to cover in time (if 
no one else ends up helping) but will come after debug info and other 
things are complete.

-Chris

-- 
http://nondot.org/sabre/
http://llvm.org/

* Re: Thoughts on LLVM and LTO
       [not found]           ` <m3fypnhjnm.fsf@gossamer.airs.com>
@ 2005-11-28  2:54             ` Chris Lattner
  0 siblings, 0 replies; 59+ messages in thread
From: Chris Lattner @ 2005-11-28  2:54 UTC (permalink / raw)
  To: Ian Lance Taylor; +Cc: gcc

On Wed, 23 Nov 2005, Ian Lance Taylor wrote:

> Chris Lattner <sabre@nondot.org> writes:
>
>>> You will need to get University of Illinois and
>>> past/present LLVM developers to assign the copyright over to the FSF.
>>> Yes, you've claimed it's easy, but it needs to be done.  Otherwise, we are
>>> in limbo.  We cannot do anything with LLVM until this is finalized.
>>
>> I would definitely like to get this process running, but unfortunately
>> it will have to wait until January.  The main person I have to talk to
>> has gone to India for Christmas, so I can't really start the process
>> until January.  Yes, I'm incredibly frustrated with this as well. :(
>
> You, or somebody, can start the process by writing to the FSF, at
> assign@fsf.org, to see what forms the FSF would like to see.  Ideally
> those forms will be acceptable to all concerned.  More likely there
> will have to be some negotiation between the FSF and the University.

For the record, I sent an email to the FSF to get the ball rolling and 
find out what needs to be done.

-Chris

-- 
http://nondot.org/sabre/
http://llvm.org/

* Re: Thoughts on LLVM and LTO
  2005-11-27 20:55     ` Daniel Berlin
  2005-11-28  2:32       ` Chris Lattner
@ 2006-02-02  3:38       ` Ben Elliston
  1 sibling, 0 replies; 59+ messages in thread
From: Ben Elliston @ 2006-02-02  3:38 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: Devang Patel, Diego Novillo, gcc

Sorry for the long delay in this thread .. still catching up from the
break.

> > 2) Testability of optimization passes
> > 
> > How much precision can one get while testing a particular feature or
> > optimization pass?
> 
> You can run one pass at a time, if you wanted to, using opt (or two,
> or three).

I have patches to GCC to do this if anyone thinks they'd be useful?

Ben
