public inbox for gcc@gcc.gnu.org
* Link-time optimzation
@ 2005-11-16 22:26 Mark Mitchell
  2005-11-16 22:41 ` Andrew Pinski
                   ` (9 more replies)
  0 siblings, 10 replies; 46+ messages in thread
From: Mark Mitchell @ 2005-11-16 22:26 UTC (permalink / raw)
  To: gcc mailing list

The GCC community has talked about link-time optimization for some time.
In addition to results with other compilers, Geoff Keating's work on
inter-module optimization has demonstrated the potential for improved
code-generation from applying optimizations across translation units.

Some of us (Dan Berlin, David Edelsohn, Steve Ellcey, Shin-Ming Liu,
Tony Linthicum, Mike Meissner, Kenny Zadeck, and myself) have developed
a high-level proposal for doing link-time optimization in GCC.  At this
point, this is just a design sketch.  We look forward to jointly
developing this with the GCC community when the design stabilizes.

Our goal has been to develop a proposal that was sufficiently mature
that it would serve as a plausible approach for consideration -- but we
fully expect comments from the community to shape and change what we've
written, perhaps in quite significant ways.  Certainly, readers will
find many details that are unresolved; we are not claiming that this is
a final, formal specification.

We would prefer not to have this thread devolve into a discussion about
legal and "political" issues relating to reading and writing GCC's
internal representation.  I've said publicly for a couple of years that
GCC would need to have this ability, and, more constructively, David
Edelsohn has talked with the FSF (both RMS and Eben Moglen) about it.
The FSF has indicated that GCC now can explore adding this feature,
although there are still some legal details to resolve.

Therefore, we have taken it as our mission to focus purely on technical
considerations -- and that's what this discussion should be about.  When
we have a technical plan we like, then, before we implement it, we will
get approval from the SC and the FSF -- but, first, let's develop the
technical plan.

The document is on the web here:

  http://gcc.gnu.org/projects/lto/lto.pdf

The LaTeX sources are in htdocs/projects/lto/*.tex.

Thoughts?

-- 
Mark Mitchell
CodeSourcery, LLC
mark@codesourcery.com
(916) 791-8304


* Re: Link-time optimzation
  2005-11-16 22:26 Link-time optimzation Mark Mitchell
@ 2005-11-16 22:41 ` Andrew Pinski
  2005-11-16 22:58 ` Andrew Pinski
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 46+ messages in thread
From: Andrew Pinski @ 2005-11-16 22:41 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: gcc mailing list

> 
> The GCC community has talked about link-time optimization for some time.
> In addition to results with other compilers, Geoff Keating's work on
> inter-module optimization has demonstrated the potential for improved
> code-generation from applying optimizations across translation units.
> 
> Our goal has been to develop a proposal that was sufficiently mature
> that it would serve as a plausible approach for consideration -- but we
> fully expect comments from the community to shape and change what we've
> written, perhaps in quite significant ways.  Certainly, readers will
> find many details that are unresolved; we are not claiming that this is
> a final, formal specification.
> 
> Thoughts?

Yes: we should not have to change the assembler; that is just wrong.

Rationale: people are more willing to use different versions of GCC than
different versions of binutils.  Requiring assembler changes would also
shut out targets which don't use GAS, or which don't have a new enough GAS.


We should not be using debugging sections to emit the IR at all; that
makes things worse, since you could then end up with two different
versions of the same global function after linking.

Thanks,
Andrew Pinski


* Re: Link-time optimzation
  2005-11-16 22:26 Link-time optimzation Mark Mitchell
  2005-11-16 22:41 ` Andrew Pinski
@ 2005-11-16 22:58 ` Andrew Pinski
  2005-11-17  0:02 ` Andrew Pinski
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 46+ messages in thread
From: Andrew Pinski @ 2005-11-16 22:58 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: gcc mailing list

> Some of us (Dan Berlin, David Edelsohn, Steve Ellcey, Shin-Ming Liu,
> Tony Linthicum, Mike Meissner, Kenny Zadeck, and myself) have developed
> a high-level proposal for doing link-time optimization in GCC.  At this
> point, this is just a design sketch.  We look forward to jointly
> developing this with the GCC community when the design stabilizes.


One more thing: we still have a huge number of type-mismatch bugs,
which would cause any such proposal to go bonkers anyway.

Thanks,
Andrew Pinski


* Re: Link-time optimzation
  2005-11-16 22:26 Link-time optimzation Mark Mitchell
  2005-11-16 22:41 ` Andrew Pinski
  2005-11-16 22:58 ` Andrew Pinski
@ 2005-11-17  0:02 ` Andrew Pinski
  2005-11-17  0:25 ` Andrew Pinski
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 46+ messages in thread
From: Andrew Pinski @ 2005-11-17  0:02 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: gcc mailing list

> 
> The GCC community has talked about link-time optimization for some time.
> In addition to results with other compilers, Geoff Keating's work on
> inter-module optimization has demonstrated the potential for improved
> code-generation from applying optimizations across translation units.

I don't understand why everything in the Linker section cannot just be done
from collect2, with the linker itself knowing nothing at all.

That seems like the best way of implementing it -- unless you want to
integrate binutils into GCC, which might be the best goal anyway if we
also want to support MS-style inline asm.


Also, I see there is no mention of LLVM at all in the design.  I would have
thought it should be mentioned why you did not choose that bytecode format
and why you are making another one.

-- Pinski


* Re: Link-time optimzation
  2005-11-16 22:26 Link-time optimzation Mark Mitchell
                   ` (2 preceding siblings ...)
  2005-11-17  0:02 ` Andrew Pinski
@ 2005-11-17  0:25 ` Andrew Pinski
  2005-11-17  0:52   ` Tom Tromey
  2005-11-17  0:26 ` Giovanni Bajo
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 46+ messages in thread
From: Andrew Pinski @ 2005-11-17  0:25 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: gcc mailing list

> 
> The GCC community has talked about link-time optimization for some time.
> In addition to results with other compilers, Geoff Keating's work on
> inter-module optimization has demonstrated the potential for improved
> code-generation from applying optimizations across translation units.

One thing not mentioned here is how you are going to represent different
EH personality functions between languages, because currently we cannot
even handle different ones in the same compilation at all.

This is very important when compiling some C++ and Java code together,
or even Objective-C and C++.

Thanks,
Andrew Pinski


* Re: Link-time optimzation
  2005-11-16 22:26 Link-time optimzation Mark Mitchell
                   ` (3 preceding siblings ...)
  2005-11-17  0:25 ` Andrew Pinski
@ 2005-11-17  0:26 ` Giovanni Bajo
  2005-11-17  0:32   ` Daniel Berlin
  2005-11-17  1:20 ` Richard Henderson
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 46+ messages in thread
From: Giovanni Bajo @ 2005-11-17  0:26 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: gcc

Mark Mitchell <mark@codesourcery.com> wrote:

> Thoughts?


Thanks for working on this. Any specific reason why using the LLVM bytecode
wasn't taken into account? It is proven to be stable, high-level enough to
perform any kind of needed optimization, and already features interpreters,
JITters and whatnot.

Giovanni Bajo


* Re: Link-time optimzation
  2005-11-17  0:26 ` Giovanni Bajo
@ 2005-11-17  0:32   ` Daniel Berlin
  2005-11-17  9:04     ` Giovanni Bajo
  0 siblings, 1 reply; 46+ messages in thread
From: Daniel Berlin @ 2005-11-17  0:32 UTC (permalink / raw)
  To: Giovanni Bajo; +Cc: Mark Mitchell, gcc

On Thu, 2005-11-17 at 01:26 +0100, Giovanni Bajo wrote:
> Mark Mitchell <mark@codesourcery.com> wrote:
> 
> > Thoughts?
> 
> 
> Thanks for working on this. Any specific reason why using the LLVM bytecode
> wasn't taken into account? 

It was.
A large number of alternatives were explored, including CIL, the JVM,
LLVM, etc.

> It is proven to be stable, high-level enough to
> perform any kind of needed optimization,

This is not true, unfortunately.
That's why it is called "low level virtual machine".
It doesn't have things we'd like to do high level optimizations on, like
dynamic_cast removal, etc.

--Dan


* Re: Link-time optimzation
  2005-11-17  0:25 ` Andrew Pinski
@ 2005-11-17  0:52   ` Tom Tromey
  0 siblings, 0 replies; 46+ messages in thread
From: Tom Tromey @ 2005-11-17  0:52 UTC (permalink / raw)
  To: Andrew Pinski; +Cc: gcc mailing list

>>>>> "Andrew" == Andrew Pinski <pinskia@physics.uc.edu> writes:

Andrew> One thing not mentioned here is how you are going to represent
Andrew> different EH personality functions between languages, because
Andrew> currently we cannot even handle different ones in the same
Andrew> compilation at all.

I think that is covered adequately, if not explicitly, under
Requirement 7.

Tom


* Re: Link-time optimzation
  2005-11-16 22:26 Link-time optimzation Mark Mitchell
                   ` (4 preceding siblings ...)
  2005-11-17  0:26 ` Giovanni Bajo
@ 2005-11-17  1:20 ` Richard Henderson
  2005-11-17  1:28   ` Mark Mitchell
  2005-11-17 15:54   ` Kenneth Zadeck
  2005-11-17  1:43 ` Gabriel Dos Reis
                   ` (3 subsequent siblings)
  9 siblings, 2 replies; 46+ messages in thread
From: Richard Henderson @ 2005-11-17  1:20 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: gcc mailing list

On Wed, Nov 16, 2005 at 02:26:28PM -0800, Mark Mitchell wrote:
>   http://gcc.gnu.org/projects/lto/lto.pdf

In Requirement 4, you say that the function F from input files a.o and
b.o should still be named F in the output file.  Why is this requirement
more than simply having the debug information reflect that both names
were originally F?  I see you go to some length in section 3 to ensure
actual symbol table duplicates, and I don't know why.

The rest of the requirements look good.  I cannot immediately think of
anything you've forgotten.

Section 2.2.1

  "invalid if and of"
  s/and/any/

  By focusing on c99 inline functions, you've forgotten the facts
  of gcc inline functions.  I'll give you a guess as to which set
  of semantics is actually used in practice.  Note that gcc's
  inline semantics are a superset of both C++ and c99, and that
  gcc doesn't actually implement c99 semantics at all.
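
  A minimal sketch of the contrast for "extern inline" (gnu89 vs. c99;
  the full rules are subtler than this):

    /* gnu89: an inline body only; no out-of-line definition is
       emitted in this translation unit.  */
    extern inline int f (int x) { return x + 1; }

    /* c99: the same two keywords mean nearly the opposite -- this
       translation unit provides the external definition of g.  */
    extern inline int g (int x) { return x + 1; }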

Section 2.2.2

  You don't specifically mention arrays with differing ranges, but
  identical numbers of elements.  E.g. 0 to 5 and 1 to 6.

Section 3.4

  I don't think I understand how you plan to not duplicate code from
  ld to locate the set of object files.  The only way I can think of
  is for ld to actually extract the objects into new temporary files.
  Perhaps I've forgotten how much effort is involved here, but I can
  see it being easier to just duplicate this code and forget it.

Section 4.1

  Some more detail on what you have in mind wrt mapping scopes would
  be nice.

  My initial thought is that you wouldn't need anything special at
  all.  Just refer to the scopes from within the IL by offset within
  the CIE.  Perhaps I'm missing something that makes this hard.

Section 4.2

  What is the rationale for using a stack-based representation rather
  than a register-based representation?  An infinite register based
  solution would seem to map better onto gimple, making it easier to
  understand the conversion into and out of.  The required stacking
  and unstacking operations would seem to get in the way.

  C.f. plan9's inferno, rather than jvm or cil.

I'll say that I do like the notion of emitting the type and symbol
tables as normal dwarf3 output.  Especially since your proposal seems
to call for regular code to be emitted simultaneously, so that ld -r
can work.


r~


* Re: Link-time optimzation
  2005-11-17  1:20 ` Richard Henderson
@ 2005-11-17  1:28   ` Mark Mitchell
  2005-11-17  1:31     ` Daniel Jacobowitz
                       ` (2 more replies)
  2005-11-17 15:54   ` Kenneth Zadeck
  1 sibling, 3 replies; 46+ messages in thread
From: Mark Mitchell @ 2005-11-17  1:28 UTC (permalink / raw)
  To: Richard Henderson; +Cc: gcc mailing list

Richard Henderson wrote:

In general, I'm going to just collect comments in a folder for a while,
and then try to reply once the dust has settled a bit.  I'm interested
in seeing where things go, and my primary interest is in getting *some*
consensus, independent of a particular one.

But, I'll try to answer this:

> In Requirement 4, you say that the function F from input files a.o and
> b.o should still be named F in the output file.  Why is this requirement
> more than simply having the debug information reflect that both names
> were originally F?  I see you go to some length in section 3 to ensure
> actual symbol table duplicates, and I don't know why.

Our understanding was that the debugger actually uses the symbol table,
in addition to the debugging information, in some cases.  (This must be
true when not running with -g, but I thought it was true in other cases
as well.)  It might be true for other tools, too.

It's true that, from a correctness or code-generation point of view, it
shouldn't matter, so, for non-GNU assemblers, we could fall back to
F.0/F.1, etc.

> The rest of the requirements look good.  I cannot immediately think of
> anything you've forgotten.

Thanks!

-- 
Mark Mitchell
CodeSourcery, LLC
mark@codesourcery.com
(916) 791-8304


* Re: Link-time optimzation
  2005-11-17  1:28   ` Mark Mitchell
@ 2005-11-17  1:31     ` Daniel Jacobowitz
  2005-11-17  3:35     ` Jeffrey A Law
  2005-11-17 11:41     ` Richard Earnshaw
  2 siblings, 0 replies; 46+ messages in thread
From: Daniel Jacobowitz @ 2005-11-17  1:31 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: Richard Henderson, gcc mailing list

On Wed, Nov 16, 2005 at 05:27:58PM -0800, Mark Mitchell wrote:
> > In Requirement 4, you say that the function F from input files a.o and
> > b.o should still be named F in the output file.  Why is this requirement
> > more than simply having the debug information reflect that both names
> > were originally F?  I see you go to some length in section 3 to ensure
> > actual symbol table duplicates, and I don't know why.
> 
> Our understanding was that the debugger actually uses the symbol table,
> in addition to the debugging information, in some cases.  (This must be
> true when not running with -g, but I thought it was true in other cases
> as well.)  It might be true for other tools, too.

It does now, but given the level of complexity associated with
preserving that in your current scheme, it would probably be easier to
fix all the other tools.


-- 
Daniel Jacobowitz
CodeSourcery, LLC


* Re: Link-time optimzation
  2005-11-16 22:26 Link-time optimzation Mark Mitchell
                   ` (5 preceding siblings ...)
  2005-11-17  1:20 ` Richard Henderson
@ 2005-11-17  1:43 ` Gabriel Dos Reis
  2005-11-17  1:53 ` Andrew Pinski
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 46+ messages in thread
From: Gabriel Dos Reis @ 2005-11-17  1:43 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: gcc mailing list

Mark Mitchell <mark@codesourcery.com> writes:

| The GCC community has talked about link-time optimization for some time.
| In addition to results with other compilers, Geoff Keating's work on
| inter-module optimization has demonstrated the potential for improved
| code-generation from applying optimizations across translation units.
| 
| Some of us (Dan Berlin, David Edelsohn, Steve Ellcey, Shin-Ming Liu,
| Tony Linthicum, Mike Meissner, Kenny Zadeck, and myself) have developed
| a high-level proposal for doing link-time optimization in GCC.  At this
| point, this is just a design sketch.  We look forward to jointly
| developing this with the GCC community when the design stabilizes.

I wholeheartedly applaud this effort.  At TAMU, we have been developing
similar high-level, compiler-neutral data structures for program
representation as part of a larger framework.  An overview was
presented recently at LCPC05:

       http://www.research.att.com/~bs/SELL-HPC.pdf

I deeply regret that we did not have time to finish and make a first
release in time for consideration -- the truth is, working on C++ really
is a full-time job, let alone developing supporting tools amid other
academic obligations.

I'm delighted to see GCC moving forward to more advanced internal
infrastructure.  We will most certainly give feedback based on our
own experience so far.


-- Gaby


* Re: Link-time optimzation
  2005-11-16 22:26 Link-time optimzation Mark Mitchell
                   ` (6 preceding siblings ...)
  2005-11-17  1:43 ` Gabriel Dos Reis
@ 2005-11-17  1:53 ` Andrew Pinski
  2005-11-17  2:39 ` Kean Johnston
  2005-11-17  5:53 ` Ian Lance Taylor
  9 siblings, 0 replies; 46+ messages in thread
From: Andrew Pinski @ 2005-11-17  1:53 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: gcc mailing list

Some more comments (this time section by section and a little more thought out):

2.1: 
Requirement 1: a good question is how ICC or even XLC
do this without doing anything special -- or do they keep around
an "on-the-side" database?

(Requirements 2-4 assume Requirement 1)

Requirement 5: is perfectly fine as far as I can tell.

(Requirement 6 assumes Requirement 1)

Requirement 7: seems perfectly fine; maybe go into more detail
about which options are incompatible.  -ftrapv is a good
example, and -fwrapv is another very good one.

2.2:
Requirements 8-10: perfect.

2.2.1:

Maybe I am missing something, but I don't see where common
symbols are mentioned.  Are they really just weak?

2.2.2:

So is a pointer to type1 the same type as a reference to type1?
The way I read the document, pointers and references are
interchangeable.  Maybe add an explicit statement saying whether
they are the same type or not.

(3: all assume requirement 1.)

3.2: does that mean GCC would begin using BFD too, or am I
reading that wrongly?  Or would we read in all file formats
using a new interface, which would mean duplicating all the BFD
reading code?

A good question about the whole of section 3 is how ICC or XLC
handle this.  I understand we don't want to repeat their mistakes,
but not mentioning what they do makes the section seem less
thought through than it really is.

4:

Requirement 11: Very good.

Requirement 12: I am still skeptical of all these source-analysis
programs.  Now, export is a different matter, but using the rest of
the requirements for this is problematic: for one, there is no way
to represent overload sets or all the C++-specific constructs
(though we could extend the bytecode, that is harder if we don't
design it in from the start).


Requirements 13 and 14: no-brainers.

Just one note about 14: a stack-based bytecode does not correspond
that well to the GIMPLE tree representation.  An infinite-register
one, with no stack except for ADDRESSABLE variables, would match
better.

Also, the Rationale of 14 says we go in and out of SSA -- is that
expensive?  For example, PHI insertion is O(N log N); see PR 18594 for
an example of where this can take a long time.


4.1: assumes requirement 1.
Yes, DWARF3 is well defined, but has it ever been used for this
purpose before?  That seems like a question which should be answered here.

4.2: only semi-satisfies requirement 14; see above.

There should be a mention of other bytecodes, so it does not look like
only those two were considered; there are some major ones like LLVM,
which is gaining a huge following inside Apple, for example.  WHIRL is
another one which comes to mind.

the registers: 
  The registers of the GVM correspond one for one with the local 
  variables at the GIMPLE level.

From reading that, does that mean we actually have all variables in
registers and that the stack is not that useful?

An example of what the bytecode would look like under that reading is:
push a
push b
add
pop c

which is the same as c = a + b.  Or, in a non-stack-based bytecode:
add c, a, b
just like a three-operand ISA.

But doesn't that mean we really have a non-stack-based bytecode with
just instructions which act on the stack, and that the stack will be
empty after each GIMPLE instruction is executed?  Then why have a
stack at all, since no part of the compiler, as far as I can see, will
ever generate code which does stuff like:
push a
push b
add
push c
add
pop d
which is equivalent to d = (a+b)+c.

Now, a pre-link-time optimizer could do it, if it can prove that the
(a+b) is only used that once.


What we would produce right now with GIMPLE is:
e = a+b;
d = e+c;

or, in the stack bytecode:

push a; push b; add; pop e; push e; push c; add; pop d;

but we would need an extra pass to optimize away the pop e/push e pair
if we are already out of SSA.


Now, if the registers correspond to GENERIC variables, we would
produce d = (a+b)+c;


-- Pinski


* Re: Link-time optimzation
  2005-11-16 22:26 Link-time optimzation Mark Mitchell
                   ` (7 preceding siblings ...)
  2005-11-17  1:53 ` Andrew Pinski
@ 2005-11-17  2:39 ` Kean Johnston
  2005-11-17  5:53 ` Ian Lance Taylor
  9 siblings, 0 replies; 46+ messages in thread
From: Kean Johnston @ 2005-11-17  2:39 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: gcc mailing list

> The document is on the web here:
> 
>   http://gcc.gnu.org/projects/lto/lto.pdf
> 
> The LaTeX sources are in htdocs/projects/lto/*.tex.
> 
> Thoughts?

It may be worth mentioning that this type of optimization
applies mainly to one given type of output: a non-symbolic
a.out.  When the output is a shared library, or an a.out with
exposed symbols (-Bexport, -E, etc.), the semantics will
need to be different.

In the simplest case where you are producing a non-symbolic
a.out, you can detect that global functions or variables
are completely unused, and therefore discard them. However,
if the a.out has some of its symbols exposed by linker options,
then those symbols, and any other symbols or functions which
they reference, must be left intact. Similarly for shared
libraries, although that case is slightly worse as *any* global
symbol must be left intact, unless a specific export list
is being used, in which case only those symbols, and symbols
which they in turn reference, need survive.
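
As a sketch of how that looks with GNU ld (option spellings vary by
platform, and the symbol names here are made up):

  gcc -shared -o libfoo.so foo.o -Wl,--version-script=foo.map

  # foo.map: only foo_init and foo_run (plus whatever they reach)
  # must survive; everything else is fair game for link-time
  # dead code elimination.
  FOO_1.0 { global: foo_init; foo_run; local: *; };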

At least that's my understanding.

Also, this whole issue of visible symbols and shared libraries
may well be covered by requirement 3; if so, it may be worth
saying so explicitly, since there was no accompanying rationale
for requirement 3.

Kean


* Re: Link-time optimzation
  2005-11-17  1:28   ` Mark Mitchell
  2005-11-17  1:31     ` Daniel Jacobowitz
@ 2005-11-17  3:35     ` Jeffrey A Law
  2005-11-17 14:09       ` Daniel Berlin
  2005-11-17 11:41     ` Richard Earnshaw
  2 siblings, 1 reply; 46+ messages in thread
From: Jeffrey A Law @ 2005-11-17  3:35 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: Richard Henderson, gcc mailing list


> Our understanding was that the debugger actually uses the symbol table,
> in addition to the debugging information, in some cases.  (This must be
> true when not running with -g, but I thought it was true in other cases
> as well.)  It might be true for other tools, too.
I can't offhand recall if GDB actually uses the minimal symbols (the 
symbol table) for anything if there are debug symbols available.  But
even if we prove GDB doesn't use those symbols, we should still
keep them -- other tools, or even other debuggers (Etnus?) might still
use the symbols from the symbol table.

Jeff


* Re: Link-time optimzation
  2005-11-16 22:26 Link-time optimzation Mark Mitchell
                   ` (8 preceding siblings ...)
  2005-11-17  2:39 ` Kean Johnston
@ 2005-11-17  5:53 ` Ian Lance Taylor
  2005-11-17 13:08   ` Ulrich Weigand
  2005-11-17 16:17   ` Kenneth Zadeck
  9 siblings, 2 replies; 46+ messages in thread
From: Ian Lance Taylor @ 2005-11-17  5:53 UTC (permalink / raw)
  To: gcc mailing list

Mark Mitchell <mark@codesourcery.com> writes:

>   http://gcc.gnu.org/projects/lto/lto.pdf

Section 2.2.1 (Variables and Functions) mentions C++ inline functions.
It should also mention gcc's C language "extern inline" functions.

The same section should consider common symbols.  These appear as
uninitialized definitions.  Common symbols should normally be merged.

Obviously the text referring to GNU attributes will need to be
expanded.  Some cases are not obvious; e.g., longcall.

In section 3.3 (Assembler) I'll note that the PowerPC XCOFF assembler
supports a .rename directive, which could easily be made available
for all targets.

I'll also note that for non-GNU targets, we must be able to use the
native assembler.  Therefore, it may not always be possible to keep
symbol names the same as source code names.  That will have to be an
option, fortunately one that only affects debugging information.

In section 3.4 (Linker) I have the same comment: for non-GNU targets,
the native linker is sometimes required, so modifying the linker
should not be a requirement.  And the exact handling of .a files is
surprisingly target dependent, so while it would be easy to code an
archive searcher in gcc, it would be tedious, though doable, to get it
right for all platforms.

Conversely, I don't know how much we are going to care about speed here,
but I assume that we are going to care a bit.  For the linker to
determine which files to pull in from an archive, it is going to have
to read the symbol tables of all the input objects, and it is going to
have to read the archive symbol table, and it is going to have to read
the symbols table of each object included from an archive.  It will
have to build a symbol hash table as it goes along.  This isn't fast;
it's a significant component of link time.  Since the compiler is also
going to have to build a symbol hash table, it is going to be faster
to have the compiler search the archive symbol table and decide which
objects to pull in.  Searching an archive symbol table isn't hard; the
target dependencies come in when deciding which objects to include.

In section 3.5 (Driver), although you don't discuss it, we are going
to want the driver to know whether you are generating a relocatable
object, a shared library, or an executable.  When generating a shared
library, we are going to want the driver to know the set of exported
symbols.  When generating an executable, we are going to want to know
which symbols are referenced by shared libraries included in the link.
We are going to want to know these things so that we can do things
like global dead code elimination and global parameter simplification
(i.e., a function only called with certain parameters) even when linking
against shared libraries for which we do not have the source.

Link time optimization will still be useful if we don't know any of
this stuff, but it is clear that people are going to want the
optimizations which require it.  So we should plan for it from the
start.

Section 4.2 (Executable Representation) describes the GVM as a stack
machine, and mentions load, store, duplicate, and swap operations.
But it also discusses having registers which correspond to GIMPLE
local variables.  The way I put this together is that a MODIFY_EXPR
which sets a local variable, say, will be converted to something that
computes the expression using a stack and then assigns the value to a
register.  It's easy to see how to convert such a MODIFY_EXPR into a
stack machine and back.  But it's not easy to see why that stack
machine is going to need, e.g., a swap operation.  There is no GIMPLE
swap operator.  So I may still be confused about something.
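
For instance, in the informal notation used earlier in this thread, I
would expect c = a + b to come out as something like:

  push a
  push b
  add
  pop c

with the operands pushed in tree order and the stack empty again after
each statement -- which is why I don't see where a swap would ever be
generated.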

Ian


* Re: Link-time optimzation
  2005-11-17  0:32   ` Daniel Berlin
@ 2005-11-17  9:04     ` Giovanni Bajo
  2005-11-17 16:25       ` Kenneth Zadeck
  0 siblings, 1 reply; 46+ messages in thread
From: Giovanni Bajo @ 2005-11-17  9:04 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: Mark Mitchell, gcc

Daniel Berlin <dberlin@dberlin.org> wrote:

>> Thanks for working on this. Any specific reason why using the LLVM
>> bytecode wasn't taken into account?
>
> It was.
> A large number of alternatives were explored, including CIL, the JVM,
> LLVM, etc.
>
>> It is proven to be stable, high-level enough to
>> perform any kind of needed optimization,
>
> This is not true, unfortunately.
> That's why it is called "low level virtual machine".
> It doesn't have things we'd like to do high level optimizations on,
> like dynamic_cast removal, etc.


Anyway, *slightly* extending a VM which already exists, is
production-ready, is GPL-compatible, and is supported by a full toolchain
(including interpreters, disassemblers, jitters, loaders, optimizers...)
looks like a much better deal.  Also, I'm sure Chris would be willing to
provide us with all the needed help.

I also think CIL would have worked perfectly well.  I'm sure the reasons
to refuse it are more political than technical, so it's useless to go into
further detail, I presume.

Giovanni Bajo


* Re: Link-time optimzation
  2005-11-17  1:28   ` Mark Mitchell
  2005-11-17  1:31     ` Daniel Jacobowitz
  2005-11-17  3:35     ` Jeffrey A Law
@ 2005-11-17 11:41     ` Richard Earnshaw
  2005-11-17 21:40       ` Ian Lance Taylor
  2 siblings, 1 reply; 46+ messages in thread
From: Richard Earnshaw @ 2005-11-17 11:41 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: Richard Henderson, gcc mailing list

On Thu, 2005-11-17 at 01:27, Mark Mitchell wrote:
> Richard Henderson wrote:
> > In Requirement 4, you say that the function F from input files a.o and
> > b.o should still be named F in the output file.  Why is this requirement
> > more than simply having the debug information reflect that both names
> > were originally F?  I see you go to some length in section 3 to ensure
> > actual symbol table duplicates, and I don't know why.
> 
> Our understanding was that the debugger actually uses the symbol table,
> in addition to the debugging information, in some cases.  (This must be
> true when not running with -g, but I thought it was true in other cases
> as well.)  It might be true for other tools, too.
> 
> It's true that, from a correctness or code-generation point of view, it
> shouldn't matter, so, for non-GNU assemblers, we could fall back to
> F.0/F.1, etc.

We spend a lot of time printing out the results of compilation as
assembly language, only to have to parse it all again in the assembler.
Given some of the problems this proposal throws up, I think we should
seriously look at bypassing as much of this step as possible, and at
generating object files directly from the compiler.  Ultimately we'd
only need to parse assembly statements for inline asm constructs.

R.


* Re: Link-time optimzation
  2005-11-17  5:53 ` Ian Lance Taylor
@ 2005-11-17 13:08   ` Ulrich Weigand
  2005-11-17 21:42     ` Ian Lance Taylor
  2005-11-17 16:17   ` Kenneth Zadeck
  1 sibling, 1 reply; 46+ messages in thread
From: Ulrich Weigand @ 2005-11-17 13:08 UTC (permalink / raw)
  To: Ian Lance Taylor; +Cc: gcc mailing list

Ian Lance Taylor wrote:

> In section 3.4 (Linker) I have the same comment: for non-GNU targets,
> the native linker is sometimes required, so modifying the linker
> should not be a requirement.  And the exact handling of .a files is
> surprisingly target dependent, so while it would be easy to code an
> archive searcher in gcc, it would be tedious, though doable, to get it
> right for all platforms.
> 
> Conversely, I don't know how much we are going to care about speed here,
> but I assume that we are going to care a bit.  For the linker to
> determine which files to pull in from an archive, it is going to have
> to read the symbol tables of all the input objects, and it is going to
> have to read the archive symbol table, and it is going to have to read
> the symbols table of each object included from an archive.  It will
> have to build a symbol hash table as it goes along.  This isn't fast;
> it's a significant component of link time.  Since the compiler is also
> going to have to build a symbol hash table, it is going to be faster
> to have the compiler search the archive symbol table and decide which
> objects to pull in.  Searching an archive symbol table isn't hard; the
> target dependencies come in when deciding which objects to include.

I'm wondering whether we can't simply employ the linker to handle
all those issues:  Have the driver always (not just as fall-back)
call "ld -r" and the linker will pull together all input files,
including those from archives, and combine them into one single
object file.  Then invoke the new "link-optimizer" on that single
object file, resulting in an optimized object file.
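
Concretely, something like this (with a hypothetical "lto-opt" standing
in for the new link-optimizer):

  ld -r -o combined.o main.o util.o libfoo.a   # archive members resolved here
  lto-opt combined.o -o combined.opt.o         # optimize the single object
  ld -o prog combined.opt.o ...                # final link as usual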

Any reasons why this cannot work?

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  Linux on zSeries Development
  Ulrich.Weigand@de.ibm.com


* Re: Link-time optimzation
  2005-11-17  3:35     ` Jeffrey A Law
@ 2005-11-17 14:09       ` Daniel Berlin
  2005-11-17 14:48         ` mathieu lacage
  0 siblings, 1 reply; 46+ messages in thread
From: Daniel Berlin @ 2005-11-17 14:09 UTC (permalink / raw)
  To: law; +Cc: Mark Mitchell, Richard Henderson, gcc mailing list

On Wed, 2005-11-16 at 20:33 -0700, Jeffrey A Law wrote:
> > Our understanding was that the debugger actually uses the symbol table,
> > in addition to the debugging information, in some cases.  (This must be
> > true when not running with -g, but I thought it was true in other cases
> > as well.)  It might be true for other tools, too.
> I can't offhand recall if GDB actually uses the minimal symbols (the 
> symbol table) for anything if there are debug symbols available.

It does, but only after trying the debug symbols first.

I discovered this when deep hacking into the symbol code of GDB a while
ago.  Apparently, some people enjoy breakpointing symbols by using the
fully mangled name, which appears (nowadays) mainly in the minsym table.
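
That is, things like (with a made-up mangled name):

  (gdb) break _ZN3Foo3barEv

which gdb resolves via the minimal symbols.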

>   But
> even if we prove GDB doesn't use those symbols, we should still
> keep them -- other tools, or even other debuggers (Etnus?) might still
> use the symbols from the symbol table.
> 
> Jeff
> 


* Re: Link-time optimzation
  2005-11-17 14:09       ` Daniel Berlin
@ 2005-11-17 14:48         ` mathieu lacage
  0 siblings, 0 replies; 46+ messages in thread
From: mathieu lacage @ 2005-11-17 14:48 UTC (permalink / raw)
  To: gcc

hi,

Daniel Berlin wrote:

>I discovered this when deep hacking into the symbol code of GDB a while
>ago.  Apparently, some people enjoy breakpointing symbols by using the
>fully mangled name, which appears (nowadays) mainly in the minsym table.
>  
>
This sort of hack is often used to work around what appears to be the
inability of gdb to put breakpoints in C++ constructors (or maybe it is
bad DWARF2 debugging output by gcc, I don't know).

regards,
Mathieu


* Re: Link-time optimzation
  2005-11-17  1:20 ` Richard Henderson
  2005-11-17  1:28   ` Mark Mitchell
@ 2005-11-17 15:54   ` Kenneth Zadeck
  2005-11-17 16:41     ` Jan Hubicka
                       ` (2 more replies)
  1 sibling, 3 replies; 46+ messages in thread
From: Kenneth Zadeck @ 2005-11-17 15:54 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Mark Mitchell, gcc mailing list

On Wed, Nov 16, 2005 at 02:26:28PM -0800, Mark Mitchell wrote:
> >   http://gcc.gnu.org/projects/lto/lto.pdf
> 
> Section 4.2
> 
>   What is the rationale for using a stack-based representation rather
> >   than a register-based representation?  An infinite register based
>   solution would seem to map better onto gimple, making it easier to
>   understand the conversion into and out of.  The required stacking
>   and unstacking operations would seem to get in the way.
> 
>   C.f. plan9's inferno, rather than jvm or cil.

What we wanted was something that was portable and easy to specify but
also easily exported from and imported into GCC.  With respect to
those constraints, the JVM, CIL and LLVM do not make the cut, because
they are different enough from the tree-GIMPLE code that they would
make the translation into and/or out of GCC difficult.  This is not
NIH, just pragmatism about not wanting to commit resources to things
that are not important.

A stack machine representation was chosen for the same reason.  Tree
GIMPLE is a series of statements, each statement being a tree.
Smashing the trees and introducing temps is easy on the output side
but requires a lot of work on the input side.  I am not a fan of our
tree representations, but I did not believe that changing them should
be a prerequisite for link-time optimization.  If we decide we want
to get rid of trees as an intermediate form, this decision should
change.

A well-designed stack machine also provides for a very tight encoding.
It is very desirable, from the point of view of the size of the
portable code, to minimize the number of temps that you create.  You
can do this in several ways:

1) Do some register allocation of the temps so that they are reused.
   This is non-trivial to undo (but truly doable), especially where
   you wish not to adversely impact debugging.

2) Just generate a lot of temps and hope that some other form of
   compression will save the day.

3) Use a stack to represent the intermediate nodes of the tree.  This
   is what we chose to do.

It is trivial to generate the stack code from a single walk of the
tree.  It is trivial to regenerate the tree from a single pass over
the stack code.
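
In outline, the output side is just a post-order walk (a sketch with
hypothetical accessors, not the real GCC API):

  void
  emit_stack_code (tree t)
  {
    if (leaf_p (t))                  /* variable or constant */
      emit_push (t);
    else
      {
        emit_stack_code (left (t));  /* operands first... */
        emit_stack_code (right (t));
        emit_op (code_of (t));       /* ...then the operator */
      }
  }

The input side is the mirror image: one pass over the stack code,
keeping a stack of partially rebuilt trees; a push creates a leaf, and
an operator pops its operands and pushes the combined tree.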

The stack machine that we have in mind will be as stripped down as
possible.  The idea is just to get the trees in and get them back out.


Kenny


* Re: Link-time optimzation
  2005-11-17  5:53 ` Ian Lance Taylor
  2005-11-17 13:08   ` Ulrich Weigand
@ 2005-11-17 16:17   ` Kenneth Zadeck
  1 sibling, 0 replies; 46+ messages in thread
From: Kenneth Zadeck @ 2005-11-17 16:17 UTC (permalink / raw)
  To: Ian Lance Taylor; +Cc: gcc mailing list

Mark Mitchell <mark@codesourcery.com> writes:

>   http://gcc.gnu.org/projects/lto/lto.pdf

> 
> Section 4.2 (Executable Representation) describes the GVM as a stack
> machine, and mentions load, store, duplicate, and swap operations.
> But it also discusses having registers which correspond to GIMPLE
> local variables.  The way I put this together is that a MODIFY_EXPR
> which sets a local variable, say, will be converted to something that
> computes the expression using a stack and then assigns the value to a
> register.  It's easy to see how to convert such a MODIFY_EXPR into a
> stack machine and back.  But it's not easy to see why that stack
> machine is going to need, e.g., a swap operation.  There is no GIMPLE
> swap operator.  So I may still be confused about something.

You are most likely correct.  We either do not need a swap or do not
want to encourage people to try to optimize the trees so much that
they need a swap.
> 
> Ian


* Re: Link-time optimzation
  2005-11-17  9:04     ` Giovanni Bajo
@ 2005-11-17 16:25       ` Kenneth Zadeck
  0 siblings, 0 replies; 46+ messages in thread
From: Kenneth Zadeck @ 2005-11-17 16:25 UTC (permalink / raw)
  To: Ian Lance Taylor; +Cc: gcc mailing list

> >> Thanks for working on this. Any specific reason why using the LLVM
> >> bytecode wasn't taken into account?
> >
> > It was.
> > A large number of alternatives were explored, including CIL, the JVM,
> > LLVM, etc.
> >
> >> It is proven to be stable, high-level enough to
> >> perform any kind of needed optimization,
> >
> > This is not true, unfortunately.
> > That's why it is called "low level virtual machine".
> > It doesn't have things we'd like to do high level optimizations on,
> > like dynamic_cast removal, etc.
> 
> 
> Anyway, *slightly* extending a VM which already exists, is
> production-ready, is GPL-compatible, and is supported by a full toolchain
> (including interpreters, disassemblers, jitters, loaders, optimizers...)
> looks like a much better deal.  Also, I'm sure Chris would be willing to
> provide us with all the needed help.
> 
> I also think CIL would have worked perfectly well.  I'm sure the reasons
> to refuse it are more political than technical, so it's useless to go into
> further detail, I presume.

I do not think that CIL really would have worked.  CIL has been
carefully crafted to support what Microsoft wants it to support, and
unrestricted C, C++, and Fortran are not in that mix.

Remember that Microsoft wants to be able to take a CIL program
produced on one (?Microsoft?) platform and run it on another
(?Microsoft?) platform.  By the time we get to the point where we want
to produce the interchangeable code, the IL has a lot of platform
knowledge (for instance, introduced by the C preprocessor) that cannot
easily be accounted for.


> 
> Giovanni Bajo


* Re: Link-time optimzation
  2005-11-17 15:54   ` Kenneth Zadeck
@ 2005-11-17 16:41     ` Jan Hubicka
  2005-11-18 16:31     ` Michael Matz
  2005-11-18 17:24     ` Nathan Sidwell
  2 siblings, 0 replies; 46+ messages in thread
From: Jan Hubicka @ 2005-11-17 16:41 UTC (permalink / raw)
  To: Kenneth Zadeck; +Cc: Richard Henderson, Mark Mitchell, gcc mailing list

> On Wed, Nov 16, 2005 at 02:26:28PM -0800, Mark Mitchell wrote:
> > >   http://gcc.gnu.org/projects/lto/lto.pdf
> > 
> > Section 4.2
> > 
> >   What is the rationale for using a stack-based representation rather
> >   than a register-based representation?  An infinite register based
> >   solution would seem to map better onto gimple, making it easier to
> >   understand the conversion into and out of.  The required stacking
> >   and unstacking operations would seem to get in the way.
> > 
> >   C.f. plan9's inferno, rather than jvm or cil.
> 
> A stack machine representation was chosen for the same reason.  Tree
> GIMPLE is a series of statements, each statement being a tree.
> Smashing the trees and introducing temps is easy on the output side
> but requires a lot of work on the input side.  I am not a fan of our
> tree representations, but I did not believe that changing them should
> be a prerequisite for link-time optimization.  If we decide we want
> to get rid of trees as an intermediate form, this decision should
> change.

Actually, I also tend to think that using a stack-based IL would be a
mistake here.  GIMPLE is represented as trees at a low level, but
basically it is a flat IL with registers, so having the representation
close to such a language would actually make the translation more
direct, I would expect.  I think we can gradually move away from the
nested tree representation of GIMPLE, and thus it would be better not
to push it deeper into our design decisions.
> 
> A well-designed stack machine also provides for a very tight encoding.
> It is very desirable, from the point of view of the size of the
> portable code, to minimize the number of temps that you create.  You
> can do this in several ways:
> 
> 1) Do some register allocation of the temps so that they are reused.
>    This is non-trivial to undo (but truly doable), especially where
>    you wish not to adversely impact debugging.
> 
> 2) Just generate a lot of temps and hope that some other form of
>    compression will save the day.
> 
> 3) Use a stack to represent the intermediate nodes of the tree.  This
>    is what we chose to do.
> 
> It is trivial to generate the stack code from a single walk of the
> tree.  It is trivial to regenerate the tree from a single pass over
> the stack code.
> 
> The stack machine that we have in mind will be as stripped down as
> possible.  The idea is just to get the trees in and get them back out.

I have a feeling that going in and out of the stack representation
will cause some extra garbage to be produced (pretty much as with
reg-stack, which solves a sort-of-similar problem), making it a bit
more challenging to optimize code at compilation time and save work by
not re-doing it all at link time.  It seems important to me that the
save/restore pair doesn't introduce any non-trivial suboptimalities.
I would probably prefer it if the IL could be saved in SSA form,
unless this is really seen as overkill file-size-wise.

Compactness is important here, but since we are going to expand
everything into a temporary form in memory anyway, we are not going to
win that much, I would say.  My preference here (at this very moment)
would be to hope for 2) to save the day ;))

I am on vacation now, so I won't comment much before I have a chance
to look closer at the proposal on Monday, so these are just my 2
cents...
But thanks for all the work,
Honza


* Re: Link-time optimzation
  2005-11-17 11:41     ` Richard Earnshaw
@ 2005-11-17 21:40       ` Ian Lance Taylor
  2005-11-17 23:10         ` Robert Dewar
  0 siblings, 1 reply; 46+ messages in thread
From: Ian Lance Taylor @ 2005-11-17 21:40 UTC (permalink / raw)
  To: Richard Earnshaw; +Cc: gcc mailing list

Richard Earnshaw <rearnsha@gcc.gnu.org> writes:

> We spend a lot of time printing out the results of compilation as
> assembly language, only to have to parse it all again in the assembler.
> Given some of the problems this proposal throws up, I think we should
> seriously look at bypassing as much of this step as possible, and at
> generating object files directly from the compiler.  Ultimately we'd
> only need to parse assembly statements for inline asm constructs.

I certainly think that is a good idea, and it is one which has been
discussed before.  But I think this is really a separate issue which
should not be confused with the link time optimization proposal.

I think the symbol table issues are a red herring.  The only
troublesome case is when using both a non-GNU assembler and a non-GNU
debugger, and even then the worst case is some difficulty with naming
static functions and variables (we can rename them using a suitable
mangled source file name as a suffix so that it is still possible to
find them, albeit awkward).  I think that if we lay down an
appropriate trail, other tool developers will follow soon enough.

Ian


* Re: Link-time optimzation
  2005-11-17 13:08   ` Ulrich Weigand
@ 2005-11-17 21:42     ` Ian Lance Taylor
  0 siblings, 0 replies; 46+ messages in thread
From: Ian Lance Taylor @ 2005-11-17 21:42 UTC (permalink / raw)
  To: Ulrich Weigand; +Cc: gcc mailing list

Ulrich Weigand <uweigand@de.ibm.com> writes:

> > Conversely, I don't know how much we are going to care about speed here,
> > but I assume that we are going to care a bit.  For the linker to
> > determine which files to pull in from an archive, it is going to have
> > to read the symbol tables of all the input objects, and it is going to
> > have to read the archive symbol table, and it is going to have to read
> > the symbols table of each object included from an archive.  It will
> > have to build a symbol hash table as it goes along.  This isn't fast;
> > it's a significant component of link time.  Since the compiler is also
> > going to have to build a symbol hash table, it is going to be faster
> > to have the compiler search the archive symbol table and decide which
> > objects to pull in.  Searching an archive symbol table isn't hard; the
> > target dependencies come in when deciding which objects to include.
> 
> I'm wondering whether we can't simply employ the linker to handle
> all those issues:  Have the driver always (not just as fall-back)
> call "ld -r" and the linker will pull together all input files,
> including those from archives, and combine them into one single
> object file.  Then invoke the new "link-optimizer" on that single
> object file, resulting in an optimized object file.
> 
> Any reasons why this cannot work?

Well, it means having the linker do a fair amount of work and then
throwing it all away.  And there might be some effort involved in
teasing out the debug information again, although I'm sure that could
be handled.  I tend to think the extra disk I/O and computation would
make this idea a non-starter, but I could certainly be wrong.  It does
seem like a viable fallback position.

Ian


* Re: Link-time optimzation
  2005-11-17 21:40       ` Ian Lance Taylor
@ 2005-11-17 23:10         ` Robert Dewar
  2005-11-17 23:42           ` Ian Lance Taylor
                             ` (2 more replies)
  0 siblings, 3 replies; 46+ messages in thread
From: Robert Dewar @ 2005-11-17 23:10 UTC (permalink / raw)
  To: Ian Lance Taylor; +Cc: Richard Earnshaw, gcc mailing list

Ian Lance Taylor wrote:

>>We spend a lot of time printing out the results of compilation as
>>assembly language, only to have to parse it all again in the assembler. 

I never like arguments which use loaded words like "lot" without
quantification.  Just how long *is* spent in this step?  Is it really
significant?


* Re: Link-time optimzation
  2005-11-17 23:10         ` Robert Dewar
@ 2005-11-17 23:42           ` Ian Lance Taylor
  2005-11-18  2:13             ` Daniel Jacobowitz
  2005-11-18  2:33           ` Dale Johannesen
  2005-11-18 18:30           ` Mike Stump
  2 siblings, 1 reply; 46+ messages in thread
From: Ian Lance Taylor @ 2005-11-17 23:42 UTC (permalink / raw)
  To: Robert Dewar; +Cc: Richard Earnshaw, gcc mailing list

Robert Dewar <dewar@adacore.com> writes:

> Ian Lance Taylor wrote:
> 
> >>We spend a lot of time printing out the results of compilation as
> >> assembly language, only to have to parse it all again in the
> >> assembler.
> 
> I never like arguments which use loaded words like "lot" without
> quantification.  Just how long *is* spent in this step?  Is it really
> significant?

[ It wasn't me that said the above, it was Richard Earnshaw. ]

I haven't measured it for a long time, and I never measured it
properly.  And I don't know how significant it has to be to care
about.  I would expect that it is on the order of 1% of the time of a
typical unoptimized compile+assembly, but I can't prove it today.

A proper measurement is the amount of time spent formatting the output
in the compiler, the amount of time spent making system calls to write
out the output, the amount of time the assembler spends making system
calls reading the input, and the amount of time the assembler spends
preprocessing the assembler file, interpreting the strings, looking up
instructions in hash tables, and parsing the operands.  Profiling will
show that this parsing work is a significant chunk of the time taken by
the assembler, so the questions are: how much time does gcc spend
formatting, and how much time do the system calls take?

I just tried a simple unoptimized compile.  -ftime-report said that
final took 5% of the time (obviously final does more than formatting),
and the assembler took 4% of the total user time, and system time took
16% of wall clock time.  Cutting those numbers in half makes 1% seem
not implausible to me, maybe even low.
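
(For anyone who wants to reproduce this sort of measurement, roughly:

  gcc -c -ftime-report foo.c    # per-pass times, including "final"
  time gcc -S foo.c             # compiler only
  time gcc -c foo.c             # compiler plus assembler

where the difference between the last two approximates the assembler's
share, though the system-call time is harder to tease apart.)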

I'm considering an unoptimized compile because that is where the
assembler makes the most difference--the compiler is faster and the
assembler output probably tends to be longer, and also an unoptimized
compile is when people care most about speed.  For an optimizing
compile, the assembler is obviously going to be less of a factor.

Ian


* Re: Link-time optimzation
  2005-11-17 23:42           ` Ian Lance Taylor
@ 2005-11-18  2:13             ` Daniel Jacobowitz
  2005-11-18  9:29               ` Bernd Schmidt
  2005-11-18 18:35               ` Link-time optimzation Mike Stump
  0 siblings, 2 replies; 46+ messages in thread
From: Daniel Jacobowitz @ 2005-11-18  2:13 UTC (permalink / raw)
  To: Ian Lance Taylor; +Cc: Robert Dewar, Richard Earnshaw, gcc mailing list

On Thu, Nov 17, 2005 at 03:42:29PM -0800, Ian Lance Taylor wrote:
> I just tried a simple unoptimized compile.  -ftime-report said that
> final took 5% of the time (obviously final does more than formatting),
> and the assembler took 4% of the total user time, and system time took
> 16% of wall clock time.  Cutting those numbers in half makes 1% seem
> not implausible to me, maybe even low.
> 
> I'm considering an unoptimized compile because that is where the
> assembler makes the most difference--the compiler is faster and the
> assembler output probably tends to be longer, and also an unoptimized
> compile is when people care most about speed.  For an optimizing
> compile, the assembler is obviously going to be less of a factor.

Also, please keep in mind that generating and then assembling debug
info takes a huge amount of I/O relative to code size.  I'd expect much
more than a 1% saving from eliminating the write-out and read-in with -g.

-- 
Daniel Jacobowitz
CodeSourcery, LLC


* Re: Link-time optimzation
  2005-11-17 23:10         ` Robert Dewar
  2005-11-17 23:42           ` Ian Lance Taylor
@ 2005-11-18  2:33           ` Dale Johannesen
  2005-11-18  3:11             ` Geert Bosch
  2005-11-18 18:43             ` Mike Stump
  2005-11-18 18:30           ` Mike Stump
  2 siblings, 2 replies; 46+ messages in thread
From: Dale Johannesen @ 2005-11-18  2:33 UTC (permalink / raw)
  To: Robert Dewar
  Cc: Ian Lance Taylor, Dale Johannesen, Richard Earnshaw, gcc mailing list

On Nov 17, 2005, at 3:09 PM, Robert Dewar wrote:
> Richard Earnshaw wrote:
>
>>> We spend a lot of time printing out the results of compilation as
>>> assembly language, only to have to parse it all again in the 
>>> assembler.
>>>
> I never like arguments which use loaded words like "lot" without
> quantification.  Just how long *is* spent in this step?  Is it really
> significant?

When I arrived at Apple around 5 years ago, I was told of some recent
measurements that showed the assembler took around 5% of the time.
Don't know if that's still accurate.  Of course the speed of the
assembler is also relevant, and our stubs and lazy pointers probably
mean Apple's .s files are bigger than other people's.


* Re: Link-time optimzation
  2005-11-18  2:33           ` Dale Johannesen
@ 2005-11-18  3:11             ` Geert Bosch
  2005-11-18 18:43             ` Mike Stump
  1 sibling, 0 replies; 46+ messages in thread
From: Geert Bosch @ 2005-11-18  3:11 UTC (permalink / raw)
  To: Dale Johannesen
  Cc: Robert Dewar, Ian Lance Taylor, Richard Earnshaw, gcc mailing list

On Nov 17, 2005, at 21:33, Dale Johannesen wrote:
> When I arrived at Apple around 5 years ago, I was told of some recent
> measurements that showed the assembler took around 5% of the time.
> Don't know if that's still accurate.  Of course the speed of the
> assembler is also relevant, and our stubs and lazy pointers probably
> mean Apple's .s files are bigger than other people's.

Of course, there is a reason why almost any commercial compiler writes
object files directly. If you start feeding serious GCC output through
IAS (the Intel assembler) on a platform like IA64, you'll find that this
really doesn't work. A file that takes seconds to compile can take over
an hour to assemble.

GCC tries to write out assembly in a way that is unambiguous, so
the exact instructions being used are known.  Any platform with a
"smart optimizing" assembler will run into all kinds of issues.
(Think MIPS.)  Many assembler features, such as decimal floating-point
number conversion, are so poorly implemented that they should be avoided
at all cost.  Some assemblers like to do their own instruction splitting,
NOP insertion and dependency detection, completely throwing off choices
made by the compiler's scheduler.  Then there is alignment of code labels.
If there is even the slightest doubt about which exact instruction
encoding the assembler will use, all bets are off here too.

If you'd start from scratch and want to get everything exactly right,
it seems clear that the assembly output path is far harder to implement
than writing object code directly.  When you know exactly what bits you
want, just go and write them.  However, given that all ports are
implemented based on assembly output, and that many users depend on
assembly output being available, changing GCC's ways will
unfortunately be very labor intensive.

   -Geert

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Link-time optimzation
  2005-11-18  2:13             ` Daniel Jacobowitz
@ 2005-11-18  9:29               ` Bernd Schmidt
  2005-11-18 11:19                 ` Robert Dewar
  2005-11-18 11:29                 ` Richard Earnshaw
  2005-11-18 18:35               ` Link-time optimzation Mike Stump
  1 sibling, 2 replies; 46+ messages in thread
From: Bernd Schmidt @ 2005-11-18  9:29 UTC (permalink / raw)
  To: Daniel Jacobowitz
  Cc: Ian Lance Taylor, Robert Dewar, Richard Earnshaw, gcc mailing list

Daniel Jacobowitz wrote:
> On Thu, Nov 17, 2005 at 03:42:29PM -0800, Ian Lance Taylor wrote:
> 
>>I just tried a simple unoptimized compile.  -ftime-report said that
>>final took 5% of the time (obviously final does more than formatting),
>>and the assembler took 4% of the total user time, and system time took
>>16% of wall clock time.  Cutting those numbers in half makes 1% seem
>>not implausible to me, maybe even low.
>>
>>I'm considering an unoptimized compile because that is where the
>>assembler makes the most difference--the compiler is faster and the
>>assembler output probably tends to be longer, and also an unoptimized
>>compile is when people care most about speed.  For an optimizing
>>compile, the assembler is obviously going to be less of a factor.
> 
> 
> Also, please keep in mind that generating and then assembling debug
> info takes a huge amount of I/O relative to code size.  I'd expect much
> more than 1% saving the write-out and write-in on -g.

So, maybe a simpler strategy could be to make minor modifications to gas 
and gcc so that the former is linked in and the latter can pass strings 
to it?  Maybe that could get us a performance improvement without the 
need for a massive overhaul of all backends, and the need to duplicate 
object code generation.
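
To make this concrete, here is a minimal sketch of the kind of
interface I have in mind -- all of these names are hypothetical, not
existing gas entry points:

  /* Hypothetical interface to a gas linked into cc1.  The back ends
     keep producing the same textual assembly; only the final
     write-to-a-.s-file step is replaced by calls into the library.  */

  typedef struct gas_session gas_session;

  /* Start assembling for TARGET, producing the object file OBJFILE.  */
  gas_session *gas_open (const char *target, const char *objfile);

  /* Hand one line of textual assembly to the in-process assembler.  */
  int gas_assemble_line (gas_session *s, const char *line);

  /* Finish up and write out the object file.  */
  int gas_close (gas_session *s);

Usage from the compiler would then be roughly:

  gas_session *s = gas_open ("i686-pc-linux-gnu", "foo.o");
  gas_assemble_line (s, "\tmovl\t$0, %eax");
  gas_close (s);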


Bernd

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Link-time optimzation
  2005-11-18  9:29               ` Bernd Schmidt
@ 2005-11-18 11:19                 ` Robert Dewar
  2005-11-18 11:29                 ` Richard Earnshaw
  1 sibling, 0 replies; 46+ messages in thread
From: Robert Dewar @ 2005-11-18 11:19 UTC (permalink / raw)
  To: Bernd Schmidt
  Cc: Daniel Jacobowitz, Ian Lance Taylor, Richard Earnshaw, gcc mailing list

Bernd Schmidt wrote:

> So, maybe a simpler strategy could be to make minor modifications to gas 
> and gcc so that the former is linked in and the latter can pass strings 
> to it?  Maybe that could get us a performance improvement without the 
> need for a massive overhaul of all backends, and the need to duplicate 
> object code generation.

I don't see that passing strings "internally" would be significantly
faster than the current setup, and depending on how it was done, might
even end up being slower.

And it is well to remember that assembler technology is not trivial.
There are environments in which the object formats are quite complex.

I think it may well be possible to achieve an improvement of a few
per cent in overall performance, but I would guess that the ratio
of effort to improvement could be very high.

Furthermore, I think you would certainly want to retain the possibility
of generating assembler files for two reasons at least:

1) it is very helpful to be able to see the exact asm that is generated,
knowing that it really IS the exact asm, and not some possibly incorrect
attempt to reconstruct it.

2) when doing a new port for a new target with a new peculiar object
format, you really don't want to have to tackle the details of the
object format as part of the port.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Link-time optimzation
  2005-11-18  9:29               ` Bernd Schmidt
  2005-11-18 11:19                 ` Robert Dewar
@ 2005-11-18 11:29                 ` Richard Earnshaw
  2005-11-18 11:40                   ` Directly generating binary code [Was Re: Link-time optimzation] Andrew Haley
  1 sibling, 1 reply; 46+ messages in thread
From: Richard Earnshaw @ 2005-11-18 11:29 UTC (permalink / raw)
  To: Bernd Schmidt
  Cc: Daniel Jacobowitz, Ian Lance Taylor, Robert Dewar, gcc mailing list

On Fri, 2005-11-18 at 09:29, Bernd Schmidt wrote:

> > Also, please keep in mind that generating and then assembling debug
> > info takes a huge amount of I/O relative to code size.  I'd expect much
> > more than 1% saving the write-out and write-in on -g.
> 
> So, maybe a simpler strategy could be to make minor modifications to gas 
> and gcc so that the former is linked in and the latter can pass strings 
> to it?  Maybe that could get us a performance improvement without the 
> need for a massive overhaul of all backends, and the need to duplicate 
> object code generation.

That's surprisingly close to how I'd see a migration strategy working
anyway.

Firstly we have to retain permanent access to the assembler's parser to
handle inline asm statements.  However, we don't have to use the parsing
stage within the guts of the compiler.  So I see a migration strategy
along the following lines:

- Create a more complete set of output routines for printing assembly
(ultimately a backend should never write to asm_out_file but to the new
routines -- we're probably 90%+ towards that goal already).  A backend
can't convert to embedded assembly until that process is complete.

- Once that's done we can start integrating the assembler code; the
first step would be to feed the standard parser directly from the new
assembly output routines (probably a line, or a statement, at a time).

- Then, incrementally, we can bypass the parse layer to call routines
directly in the assembler.  This can be done both for directives and for
assembly of instructions.  *but it can all be done incrementally*.

The main difficulty is preserving -S output in a converted target if
that is requested.  It ought to be possible but it will need some care.
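
To illustrate the first step, the new output routines could be grouped
behind a small table of function pointers (all names here are made up
for the sake of the sketch, not existing GCC interfaces):

  /* Hypothetical assembly-output abstraction.  A back end calls these
     routines instead of fprintf'ing to asm_out_file.  */
  struct asm_out_ops
  {
    void (*insn) (const char *mnemonic, const char *operands);
    void (*label) (const char *name);
    void (*directive) (const char *name, const char *args);
  };

  /* One implementation writes a .s file (this is what -S would
     select); a later one feeds the integrated assembler directly.  */
  extern const struct asm_out_ops text_asm_ops;
  extern const struct asm_out_ops embedded_asm_ops;

With that split, -S is just a matter of selecting text_asm_ops, which
should keep the textual path from rotting.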

R.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Directly generating binary code [Was Re: Link-time optimzation]
  2005-11-18 11:29                 ` Richard Earnshaw
@ 2005-11-18 11:40                   ` Andrew Haley
  2005-11-18 12:04                     ` Laurent GUERBY
  0 siblings, 1 reply; 46+ messages in thread
From: Andrew Haley @ 2005-11-18 11:40 UTC (permalink / raw)
  To: Richard Earnshaw
  Cc: Bernd Schmidt, Daniel Jacobowitz, Ian Lance Taylor, Robert Dewar,
	gcc mailing list

Richard Earnshaw writes:

 > - Then, incrementally, we can bypass the parse layer to call routines
 > directly in the assembler.  This can be done both for directives and for
 > assembly of instructions.  *but it can all be done incrementally*.
 > 
 > The main difficulty is preserving -S output in a converted target if
 > that is requested.  It ought to be possible but it will need some care.

A nightmare scenario is debugging the compiler when its behaviour
changes due to using "-S".  Assembly source is something that we
maintainers use more than anyone else.

I expect that if we go down this road the gcc back end routines that
generate assembly source will rot.  I've seen compilers generate
"assembly" code that failed to assemble even though the binaries they
generated were correct.  Needless to say, this can make life very
difficult.

Andrew.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Directly generating binary code [Was Re: Link-time optimzation]
  2005-11-18 11:40                   ` Directly generating binary code [Was Re: Link-time optimzation] Andrew Haley
@ 2005-11-18 12:04                     ` Laurent GUERBY
  2005-11-18 17:41                       ` Jim Blandy
  0 siblings, 1 reply; 46+ messages in thread
From: Laurent GUERBY @ 2005-11-18 12:04 UTC (permalink / raw)
  To: Andrew Haley
  Cc: Richard Earnshaw, Bernd Schmidt, Daniel Jacobowitz,
	Ian Lance Taylor, Robert Dewar, gcc mailing list

On Fri, 2005-11-18 at 11:40 +0000, Andrew Haley wrote:
> A nightmare scenario is debugging the compiler when its behaviour
> changes due to using "-S".  Assembly source is something that we
> maintainers use more than anyone else.

If we go the direct generation route, I think it would be more
efficient (if possible) to add whatever extra information is needed to
the object file (like the asm template string, compiler comments, ...)
so that the object code dumper can give it back to you on request.  So
skip -S altogether, and always use the object dumper to "look at
assembly".

Of course, you then depend on the full target object toolchain; if it
is a proprietary one and we can't feed extra information through its
object dumper, my scheme won't work.
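
To sketch what I mean (the helper and the section name below are pure
invention, since no direct object writer exists yet), the compiler
would emit, next to the encoded bytes, something like:

  /* Hypothetical: stash the textual form of each instruction in a
     non-loaded section of the object file.  */
  obj_append_string (obj, ".gnu.asm_text",
                     "movl $0, %eax\t# foo.c:42");

and then a plain section dump, e.g.

  objdump -s -j .gnu.asm_text foo.o

(which objdump can already do for any named section) would give you
the -S view back.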

Laurent

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Link-time optimzation
  2005-11-17 15:54   ` Kenneth Zadeck
  2005-11-17 16:41     ` Jan Hubicka
@ 2005-11-18 16:31     ` Michael Matz
  2005-11-18 17:04       ` Steven Bosscher
  2005-11-18 17:24     ` Nathan Sidwell
  2 siblings, 1 reply; 46+ messages in thread
From: Michael Matz @ 2005-11-18 16:31 UTC (permalink / raw)
  To: Kenneth Zadeck; +Cc: Richard Henderson, Mark Mitchell, gcc mailing list

Hi,

On Thu, 17 Nov 2005, Kenneth Zadeck wrote:

> A stack machine representation was chosen for the same reason.  Tree
> gimple is a series of statements, each statement being a tree.

IMHO we should follow that path of thinking.  The representation of GIMPLE
that we do most optimizations on (i.e. tree-ssa) is implemented as
GCC trees, that's true.  But this is just an implementation detail, and one
which will hopefully be changed at some point in the future.  Because in
essence GIMPLE is a rather flat intermediate form, most of the time just
three-address form.  I think it would be a mistake in the long run if we
now used a stack-based external representation just because right now
gimple is implemented via trees.  For instance the gimple statement

  a = b + c

would need to be encoded along the lines of
  push id_b
  push id_c
  add
  pop id_a

The expansion of this trivial operation into four stack ops is horrible to
read (think of reading debug dumps).  Additionally, the change of
representation might introduce hard-to-overcome issues due to mismatches
in expressiveness.  We would possibly need a mini stack optimizer just for
reading this form back into gimple.

I think writing out gimple directly, i.e. using a register machine and
three-address code, is the better way.  I could even imagine some custom
extensions to the three-address form to easily represent nested constructs
which still occur in gimple (e.g. type conversions, address taking, etc.).
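
To illustrate, a register-machine record for the statement above could
be as simple as this (just a sketch of the idea in C, not a proposed
on-disk format):

  /* One serialized three-address gimple statement, "a = b + c".
     Operands are indices into a per-function symbol/temp table.  */
  struct lto_stmt
  {
    enum { LTO_ASSIGN, LTO_ADD, LTO_SUB /* ... */ } opcode;
    unsigned dest;          /* a */
    unsigned src1, src2;    /* b, c */
  };

i.e. one record per statement instead of four stack operations, and the
reader can rebuild the gimple statement directly, without any stack
simulation.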

> 1) Do some register allocation of the temps so that they are reused.
>    This is non-trivial to undo (but truly doable), especially where
>    you wish to not adversely impact debugging.
> 
> 2) Just generate a lot of temps and hope that some other form of
>    compression will save the day.

In the above light I would go for 2), together with perhaps a relatively
trivial form of 1) (e.g. reusing temps per gimple statement, which
reduces the overall need for temps to the maximum Sethi-Ullman number of
the statements to be converted, most of the time lower than, say, 20).

OTOH it might be a good idea to pursue both strategies at first (i.e. a
gimple writer/reader based on a stack machine and one based on a register
machine), and then see which feels better.  Perhaps even a merger of both 
approaches is sensible, three address form for most simple gimple 
statements with falling back to stack encoding for deeply nested operands.


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Link-time optimzation
  2005-11-18 16:31     ` Michael Matz
@ 2005-11-18 17:04       ` Steven Bosscher
  2005-11-18 17:29         ` Michael Matz
  0 siblings, 1 reply; 46+ messages in thread
From: Steven Bosscher @ 2005-11-18 17:04 UTC (permalink / raw)
  To: gcc; +Cc: Michael Matz, Kenneth Zadeck, Richard Henderson, Mark Mitchell

On Friday 18 November 2005 17:31, Michael Matz wrote:
> Perhaps even a merger of both
> approaches is sensible, three address form for most simple gimple
> statements with falling back to stack encoding for deeply nested operands.

That would be a bad violation of the KISS principle.

Gr.
Steven

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Link-time optimzation
  2005-11-17 15:54   ` Kenneth Zadeck
  2005-11-17 16:41     ` Jan Hubicka
  2005-11-18 16:31     ` Michael Matz
@ 2005-11-18 17:24     ` Nathan Sidwell
  2 siblings, 0 replies; 46+ messages in thread
From: Nathan Sidwell @ 2005-11-18 17:24 UTC (permalink / raw)
  To: Kenneth.Zadeck; +Cc: Richard Henderson, Mark Mitchell, gcc mailing list

Kenneth Zadeck wrote:

> The stack machine that we have in mind will be as stripped down as
> possible.  The idea is just to get the trees in and get them back out.

When I first read the proposal, I too wondered if a register machine would be 
better here.  I've come to the conclusion that it wouldn't be, and that a stack 
machine is a fine choice.

*) Unlike JVM, we're not producing something that is supposed to be
immediately executable.  Making hardware stack machines go fast is very
hard -- TOS acts as a huge atomic operator.

*) I can well imagine more complicated gimple nodes than simple 3-address
forms.  A stack machine makes this kind of thing easy to extend.  As
Kenny says, a stack machine is an ideal way to serialize a tree.

*) The stack machine decouples the getting and putting of operands from
the actual operations.  Although this could lead to excessive size, that
does depend on the actual encoding chosen -- something that affects both
stack and register machines.

nathan
-- 
Nathan Sidwell    ::   http://www.codesourcery.com   ::     CodeSourcery LLC
nathan@codesourcery.com    ::     http://www.planetfall.pwp.blueyonder.co.uk

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Link-time optimzation
  2005-11-18 17:04       ` Steven Bosscher
@ 2005-11-18 17:29         ` Michael Matz
  0 siblings, 0 replies; 46+ messages in thread
From: Michael Matz @ 2005-11-18 17:29 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: gcc, Kenneth Zadeck, Richard Henderson, Mark Mitchell

Hi,

On Fri, 18 Nov 2005, Steven Bosscher wrote:

> On Friday 18 November 2005 17:31, Michael Matz wrote:
> > Perhaps even a merger of both
> > approaches is sensible, three address form for most simple gimple
> > statements with falling back to stack encoding for deeply nested operands.
> 
> That would be a bad violation of the KISS principle.

Of course.  It was just an idea that came to mind; you don't have to start
with that.  And sometimes one shouldn't avoid complexity at all costs, if
the gain is high enough ;)


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Directly generating binary code [Was Re: Link-time optimzation]
  2005-11-18 12:04                     ` Laurent GUERBY
@ 2005-11-18 17:41                       ` Jim Blandy
  0 siblings, 0 replies; 46+ messages in thread
From: Jim Blandy @ 2005-11-18 17:41 UTC (permalink / raw)
  To: Laurent GUERBY
  Cc: Andrew Haley, Richard Earnshaw, Bernd Schmidt, Daniel Jacobowitz,
	Ian Lance Taylor, Robert Dewar, gcc mailing list

On 11/18/05, Laurent GUERBY <laurent@guerby.net> wrote:
> On Fri, 2005-11-18 at 11:40 +0000, Andrew Haley wrote:
> > A nightmare scenario is debugging the compiler when its behaviour
> > changes due to using "-S".  Assembly source is something that we
> > maintainers use more than anyone else.
>
> If we go the direct generation route, I think it would be more
> efficient (if possible) to add whatever extra information is needed in
> the object file (like the asm template string, compiler comments, ...)
> so that object code dumper will get it back to you on request. So skip
> the -S altogether, always use the object dumper to "look at assembly".

The point Andrew's raising here is that you are replacing an
intermediate form that is very useful for isolating problems --- the
assembly language file --- with an in-core form that is much more
difficult to inspect.

And there are various side consequences: for example, you know the
compiler isn't going to go and tweak the assembly-language file you're
looking at, because it's exited.  But with an in-core representation
(and a non-type-safe language like C), compiler bugs can still be
mangling the assembly code "after" it's been generated.

For a 2% speedup?  One that hasn't even been carefully measured?

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Link-time optimzation
  2005-11-17 23:10         ` Robert Dewar
  2005-11-17 23:42           ` Ian Lance Taylor
  2005-11-18  2:33           ` Dale Johannesen
@ 2005-11-18 18:30           ` Mike Stump
  2 siblings, 0 replies; 46+ messages in thread
From: Mike Stump @ 2005-11-18 18:30 UTC (permalink / raw)
  To: Robert Dewar; +Cc: Ian Lance Taylor, Richard Earnshaw, gcc mailing list

On Nov 17, 2005, at 3:09 PM, Robert Dewar wrote:
> I never like arguments which have loaded words like "lot" without
> quantification. Just how long *is* spent in this step, is it really
> significant?

'as' (the assembler) is 2-3% of total build time, as I recall
(Finder_FE, C++).

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Link-time optimzation
  2005-11-18  2:13             ` Daniel Jacobowitz
  2005-11-18  9:29               ` Bernd Schmidt
@ 2005-11-18 18:35               ` Mike Stump
  1 sibling, 0 replies; 46+ messages in thread
From: Mike Stump @ 2005-11-18 18:35 UTC (permalink / raw)
  To: Daniel Jacobowitz
  Cc: Ian Lance Taylor, Robert Dewar, Richard Earnshaw, gcc mailing list

On Nov 17, 2005, at 6:13 PM, Daniel Jacobowitz wrote:
> Also, please keep in mind that generating and then assembling debug
> info takes a huge amount of I/O relative to code size.  I'd expect much
> more than 1% saving the write-out and write-in on -g.

I'd hope that we can contribute code to eliminate this, so it might be
possible to leave whatever benefit this would have out of the current
decision making:

compiler -> debug info repository -> gdb (side-stepping the linker and
the assembler)

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Link-time optimzation
  2005-11-18  2:33           ` Dale Johannesen
  2005-11-18  3:11             ` Geert Bosch
@ 2005-11-18 18:43             ` Mike Stump
  1 sibling, 0 replies; 46+ messages in thread
From: Mike Stump @ 2005-11-18 18:43 UTC (permalink / raw)
  To: Dale Johannesen
  Cc: Robert Dewar, Ian Lance Taylor, Richard Earnshaw, gcc mailing list

On Nov 17, 2005, at 6:33 PM, Dale Johannesen wrote:
> When I arrived at Apple around 5 years ago, I was told of some recent
> measurements that showed the assembler took around 5% of the time.

Yeah, it's been sped up actually.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Link-time optimzation
@ 2005-11-17  0:52 Chris Lattner
  0 siblings, 0 replies; 46+ messages in thread
From: Chris Lattner @ 2005-11-17  0:52 UTC (permalink / raw)
  To: dberlin; +Cc: giovannibajo, mark, GCC Development

Daniel Berlin wrote:

 > > It [LLVM] is proven to be stable, high-level enough to
 > > perform any kind of needed optimization,

 > This is not true, unfortunately.  That's why it is called "low level
 > virtual machine".  It doesn't have things we'd like to do high-level
 > optimizations on, like dynamic_cast removal, etc.

For the record, this isn't really true at all.  LLVM does already
capture some high-level program properties, and is constantly being
extended over time.  I will note that it would be far easier to
extend LLVM with the functionality you desire than to reinvent a
whole new way of doing things.

That said, wanting to stay as close as possible to gimple is a
reasonable design point, and LLVM certainly isn't that.

-Chris

^ permalink raw reply	[flat|nested] 46+ messages in thread

end of thread, other threads:[~2005-11-18 18:43 UTC | newest]

Thread overview: 46+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2005-11-16 22:26 Link-time optimzation Mark Mitchell
2005-11-16 22:41 ` Andrew Pinski
2005-11-16 22:58 ` Andrew Pinski
2005-11-17  0:02 ` Andrew Pinski
2005-11-17  0:25 ` Andrew Pinski
2005-11-17  0:52   ` Tom Tromey
2005-11-17  0:26 ` Giovanni Bajo
2005-11-17  0:32   ` Daniel Berlin
2005-11-17  9:04     ` Giovanni Bajo
2005-11-17 16:25       ` Kenneth Zadeck
2005-11-17  1:20 ` Richard Henderson
2005-11-17  1:28   ` Mark Mitchell
2005-11-17  1:31     ` Daniel Jacobowitz
2005-11-17  3:35     ` Jeffrey A Law
2005-11-17 14:09       ` Daniel Berlin
2005-11-17 14:48         ` mathieu lacage
2005-11-17 11:41     ` Richard Earnshaw
2005-11-17 21:40       ` Ian Lance Taylor
2005-11-17 23:10         ` Robert Dewar
2005-11-17 23:42           ` Ian Lance Taylor
2005-11-18  2:13             ` Daniel Jacobowitz
2005-11-18  9:29               ` Bernd Schmidt
2005-11-18 11:19                 ` Robert Dewar
2005-11-18 11:29                 ` Richard Earnshaw
2005-11-18 11:40                   ` Directly generating binary code [Was Re: Link-time optimzation] Andrew Haley
2005-11-18 12:04                     ` Laurent GUERBY
2005-11-18 17:41                       ` Jim Blandy
2005-11-18 18:35               ` Link-time optimzation Mike Stump
2005-11-18  2:33           ` Dale Johannesen
2005-11-18  3:11             ` Geert Bosch
2005-11-18 18:43             ` Mike Stump
2005-11-18 18:30           ` Mike Stump
2005-11-17 15:54   ` Kenneth Zadeck
2005-11-17 16:41     ` Jan Hubicka
2005-11-18 16:31     ` Michael Matz
2005-11-18 17:04       ` Steven Bosscher
2005-11-18 17:29         ` Michael Matz
2005-11-18 17:24     ` Nathan Sidwell
2005-11-17  1:43 ` Gabriel Dos Reis
2005-11-17  1:53 ` Andrew Pinski
2005-11-17  2:39 ` Kean Johnston
2005-11-17  5:53 ` Ian Lance Taylor
2005-11-17 13:08   ` Ulrich Weigand
2005-11-17 21:42     ` Ian Lance Taylor
2005-11-17 16:17   ` Kenneth Zadeck
2005-11-17  0:52 Chris Lattner
