public inbox for gcc@gcc.gnu.org
* Re: A FrontEnd in C++?
@ 2002-08-22 13:50 Chris Lattner
  2002-08-23  0:51 ` AST optimizer " Pop Sébastian
  0 siblings, 1 reply; 39+ messages in thread
From: Chris Lattner @ 2002-08-22 13:50 UTC (permalink / raw)
  To: gcc

> Go right ahead.  All you need is a new branch and some free time.  I
> would personally suggest following some of the design and implementation
> ideas from SUIF and/or Sage++.  They are kind of neat.

As long as we are plugging compiler infrastructures, I thought I might
mention my own project LLVM: http://llvm.cs.uiuc.edu/

It's also completely written in C++ (G++ 3.2), uses a GCC->LLVM frontend,
is SSA based, open-source, and has some interesting technology available
in it.  The LLVM instruction set is roughly equivalent to SIMPLE
( http://llvm.cs.uiuc.edu/docs/LangRef.html ).

A few of the nifty things LLVM provides are the modular pass system:
http://llvm.cs.uiuc.edu/docs/WritingAnLLVMPass.html
and a strong infrastructure for interprocedural, link-time, and
runtime optimizations, etc.  The infrastructure is _very_ fast.

Additionally, because it uses C++ and the STL throughout, writing
transformations is quite simple.  For example, LICM and GCSE are each
just a couple hundred lines of code (which are mostly comments):

http://llvm.cs.uiuc.edu/doxygen/LICM_8cpp-source.html
http://llvm.cs.uiuc.edu/doxygen/GCSE_8cpp-source.html
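
As a concrete flavor of what a transformation looks like, here is a tiny,
purely illustrative pass skeleton.  The names follow the pass interface
described in the document above (FunctionPass, runOnFunction); the exact
headers and registration boilerplate in the current sources may differ:

  #include "llvm/Pass.h"
  #include "llvm/Function.h"

  namespace {
    // Walk every basic block of a function and count its instructions.
    // A real transformation would modify the code and return true.
    struct CountInsts : public FunctionPass {
      virtual bool runOnFunction(Function &F) {
        unsigned NumInsts = 0;
        for (Function::iterator BB = F.begin(); BB != F.end(); ++BB)
          NumInsts += BB->size();      // instructions in this basic block
        return false;                  // the function was not changed
      }
    };
  }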

If you have any questions about LLVM or would like to discuss compiler
architecture, I'd be more than happy to participate.

-Chris


* AST optimizer in C++?
  2002-08-22 13:50 A FrontEnd in C++? Chris Lattner
@ 2002-08-23  0:51 ` Pop Sébastian
  2002-08-23 10:25   ` Chris Lattner
  0 siblings, 1 reply; 39+ messages in thread
From: Pop Sébastian @ 2002-08-23  0:51 UTC (permalink / raw)
  To: Chris Lattner; +Cc: gcc

Hi Chris,

On Thu, Aug 22, 2002 at 03:53:54PM -0500, Chris Lattner wrote:
> 
> > Go right ahead.  All you need is a new branch and some free time.  I
> > would personally suggest following some of the design and implementation
> > ideas from SUIF and/or Sage++.  They are kind of neat.
> 
> As long as we are plugging compiler infrastructures, I thought I might
> mention my own project LLVM: http://llvm.cs.uiuc.edu/
> 
I had a look at the project, and I'm interested in having a look at the sources
(if that's possible).

> It's also completely written in C++ (G++ 3.2), uses a GCC->LLVM frontend,
> is SSA based, open-source, and has some interesting technology available
> in it.  The LLVM instruction set is roughly equivalent to SIMPLE
> ( http://llvm.cs.uiuc.edu/docs/LangRef.html ).
> 
> A few of the nifty things LLVM provides are the modular pass system:
> http://llvm.cs.uiuc.edu/docs/WritingAnLLVMPass.html
> and a strong infrastructure for interprocedural, link-time, and
> runtime optimizations, etc.  The infrastructure is _very_ fast.
> 
> Additionally, because it uses C++ and the STL throughout, writing
> transformations is quite simple.  For example, LICM and GCSE are each
> just a couple hundred lines of code (which are mostly comments):
> 
> http://llvm.cs.uiuc.edu/doxygen/LICM_8cpp-source.html
> http://llvm.cs.uiuc.edu/doxygen/GCSE_8cpp-source.html
> 
> If you have any questions about LLVM or would like to discuss compiler
> architecture, I'd be more than happy to participate.
> 
From what I (quickly) read on the web page, you use gcc's front-ends.
My questions are:
- Do you have a class hierarchy for representing statements and expressions?
- Is it possible to generate GCC trees after your optimizations?  (and then
generate RTL by passing these trees to the RTL-translator...)


In fact what I want for gcc is an AST optimizer that could be built independently
of gcc's front-ends.  This optimizer takes as input SIMPLE trees and its output 
is again gcc trees (a subset of SIMPLE, or even a superset of SIMPLE if we decide 
to promote some stmts/exprs to a higher level during optimization).  

The overall schema could be:

(GCC front-ends) -> (GENERIC trees) -> [(SIMPLE trees) -> (AST Optimizer)] -> 
-> (RTL conversion) -> (Code generation)

with optional components in "-> [...] ->".  These components are not mandatory 
for building the GCC compiler.  They could be built separately once the g++ compiler 
is available.  The AST optimizer could be loaded from a .so file (as in the open64 
compiler), thus avoiding growing the size of the executable too much when 
AST optimization is not needed.


* Re: AST optimizer in C++?
  2002-08-23  0:51 ` AST optimizer " Pop Sébastian
@ 2002-08-23 10:25   ` Chris Lattner
  2002-08-25  9:48     ` [tree-ssa] " Pop Sébastian
  0 siblings, 1 reply; 39+ messages in thread
From: Chris Lattner @ 2002-08-23 10:25 UTC (permalink / raw)
  To: Pop Sébastian; +Cc: gcc

> > As long as we are plugging compiler infrastructures, I thought I might
> > mention my own project LLVM: http://llvm.cs.uiuc.edu/

> I had a look at the project, and I'm interested in having a look at the sources
> (if that's possible).

As it says on the web page, we aren't really ready for a public "release"
yet.   That said, the doxygen documentation is all online, so you are free
to browse the source:

http://llvm.cs.uiuc.edu/doxygen/

I'll get back to the list this afternoon with something more decisive
about making tarballs public.

> > If you have any questions about LLVM or would like to discuss compiler
> > architecture, I'd be more than happy to participate.

> From what I (quickly) read on the web page, you use gcc's front-ends.
> My questions are:
> - Do you have a class hierarchy for representing statements and expressions?

Sorta.  If you take a look at this page:
http://llvm.cs.uiuc.edu/doxygen/hierarchy.html or http://llvm.cs.uiuc.edu/doxygen/inherits.html
you can see the (textual or graphical) class hierarchy for the entire compiler
suite.  Search for "Value" and you can see the root of the tree you're
interested in.

Now this is a "sorta" because we don't have statements and expressions.
:)  LLVM is a low-level, three-address-code style representation, in SSA
form, with _type information_ for all operands.  One of our research
results has been to show that most of the optimizations traditionally done
on ASTs can be done just as well on low-level representations, *as long
as you have rich type information*.  In the context of LLVM, then, this
means that statements and expressions are both instructions; it's just
that "statements" generally return void, i.e. they don't produce a value.

If you'd like me to turn some C code into LLVM code to make this more concrete,
I'd be more than happy to.  It really is a nice representation for
implementing optimizations (see the subclasses of the "Pass" class in
the inheritance diagram for some of the transformations we have
implemented).  There is quite a bit of documentation on the web page (and
it's growing) talking about the LLVM representation and compiler process.
Unfortunately, we are just starting to work on the "intro to coding LLVM
stuff" documents.

> - Is it possible to generate GCC trees after your optimizations?  (and then
> generate RTL by passing these trees to the RTL-translator...)

The basic architecture looks like this (this is covered in detail in this
document: http://llvm.cs.uiuc.edu/pubs/LLVMCompilationStrategy.pdf ):

The cc1 process:
GCC Parser -> Tree -> RTL -> RTLSSA -> RTLSSA2LLVM, output x.s (LLVM assembly)

The 'as' process:
parse x.s -> cleanup -> level raise -> optimize, output x.o (which is in LLVM bytecode)

The 'ld' process:
link *.o bytecode files, generate native code for sparc (postlink), output
x.s (sparc assembly)

Run the native sparc assembler/linker.   We also have a simple C backend.

There are a couple of interesting things to note about this process:

1. We completely discard all of the type information provided to us by
   the source language, and reconstruct what we need from the RTL.

   This is a side-effect of us starting with a GCC snapshot from about 2
   months before GCC 3.0 became official.  With no tree-ssa branch to
   start from, we decided to base our work around the RTL representation
   (which was also important, because not every frontend used the Tree
   representation).  Although we currently only support the C frontend, we
   plan to turn on the C++ frontend soon.

2. We have disabled almost all of the optimizations in GCC (basically all
   the RTL optimizations, keeping the language specific optimizations,
   like builtin optimizations).  This is because the optimizations in the
   GCC frontend were taking a lot more time to execute than they did when
   reimplemented to make use of the SSA and type information.
   Additionally, some optimizations were only applied by the CSE pass,
   which is limited to extended basic blocks, whereas all of our
   optimizations are at least global.  To be specific, compiling vecffe.c
   from the SpecINT2000 GAP benchmark (a big file) gives the following
   -ftime-report output when running with -O2 (showing that almost
   everything is disabled):

Execution times (seconds)
 preprocessing         :   0.31 ( 7%) usr   0.05 (25%) sys   0.53 ( 9%) wall
 lexical analysis      :   0.21 ( 4%) usr   0.03 (15%) sys   0.27 ( 5%) wall
 parser                :   1.05 (22%) usr   0.06 (30%) sys   1.27 (22%) wall
 varconst              :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 jump                  :   0.27 ( 6%) usr   0.00 ( 0%) sys   0.30 ( 5%) wall
 if-conversion         :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall
 convert to SSA        :   0.65 (14%) usr   0.03 (15%) sys   0.70 (12%) wall
 convert from SSA      :   0.82 (17%) usr   0.01 ( 5%) sys   1.13 (19%) wall
 final                 :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall
 symout                :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 rest of compilation   :   0.32 ( 7%) usr   0.01 ( 5%) sys   0.36 ( 6%) wall
 RTL->LLVM code gen    :   0.44 ( 9%) usr   0.00 ( 0%) sys   0.47 ( 8%) wall
 LLVM ResolveOperands  :   0.32 ( 7%) usr   0.00 ( 0%) sys   0.37 ( 6%) wall
 LLVM Code Emission    :   0.25 ( 5%) usr   0.01 ( 5%) sys   0.36 ( 6%) wall
 TOTAL                 :   4.70             0.20             5.90

  Note that the 3 LLVM passes happen between the convert to/from SSA
  passes.  toplev.c is pretty nastily hacked to disable this stuff (which
  could certainly be fixed, of course).  This is a debug build running on a
  slow machine, so the actual timing results shouldn't be taken to mean
  anything.  In particular, the 3 LLVM passes are not really written to be
  efficient, and should be rewritten at some point.

3. We do in fact need a .md file for our target to specify what sort of
   RTL to generate.  Basically we just have completely generic operations,
   with as few constraints as possible on the instructions, trying to get
   unobfuscated code from the frontend.  For example, our andhi3 pattern
   looks like:

(define_insn "andhi3"
  [(set (match_operand:HI 0 "register_operand" "=r")
	(and:HI (match_operand:HI 1 "nonmemory_operand" "")
		(match_operand:HI 2 "nonmemory_operand" "")))]
  ""
  "andhi3 %1, %2, %0")

   If anyone cares, I can make our GCC changes available, although it's
   kind of a hack (let's just say I didn't know GCC very well when I
   started :), and it needs to be largely rewritten.

4. Keeping LLVM code until the link step lets us perform important
   interprocedural optimizations at link time, for example alias analysis,
   that have a big effect on machine code generation (for example,
   scheduling).  Inlining and many other optimizations are also much more
   powerful at link time, as I'm sure everyone is keenly aware.  To keep
   the actual code generation step fast, we do as much optimization as
   possible in the 'gccas' process, including machine specific
   optimization.

5. LLVM allows a full suite of optimizations on global objects as well as
   the bodies of functions.  This means it can completely eliminate unused
   global variables, functions, etc, without problem.  I believe that GCC
   cannot currently do this in the backend (as part of generic
   optimization) because it treats global variables, for example, as
   immutable places in memory accessed by symbol_refs.

6. The LLVM optimizations are very fast (I can post numbers if you'd
   like).  Typically it takes as much time to run through cc1 as it does
   to execute gccas.  Although we do not have ALL of the optimizations
   we'd like implemented for GCCAS, we do have a fair collection, executed
   in this order:

 * FunctionResolving    - Resolves a call to 'int foo(...)' to 'int foo(int)'
                          if there wasn't a prototype for foo specified.
 * globaldce            - Remove unused static global variables &
                          functions; also removes unused 'externs'.
 * deadtypeelimination  - Clean up the symbol table by removing unused
                          typedefs.
 * constantmerge        - Merge global variable constants (basically
                          .rodata) with identical contents together.
 * deadinst elimination - Trivial dead code elimination
 * raise allocations    - Convert malloc/free calls to malloc/free instructions
 * indvar simplify      - Convert indvars to canonical form
 * Level raise          - Recover type information
 * promote mem2reg      - Promote stack values to SSA registers
 * instcombining        - Generalized peephole / value numbering pass, very fast
 * licm                 - Loop invariant code motion
 * gcse                 - Global Common Subexpression Elimination
 * sccp                 - Sparse conditional constant propagation
 * instcombine          - Rerun value simplification
 * adce                 - Aggressive dead code elimination
 * simplifycfg          - Merge and delete blocks

  These use various analyses to do their job.  If you're interested, I can
  also provide timing information.

  There are a couple of additional transformations that I would like added
  to this collection: PRE, strength reduction, etc.  They are being added
  as I have time.

> In fact what I want for gcc is an AST optimizer that could be built independently
> of gcc's front-ends.  This optimizer takes as input SIMPLE trees and its output
> is again gcc trees (a subset of SIMPLE, or even a superset of SIMPLE if we decide
> to promote some stmts/exprs to a higher level during optimization).

LLVM could certainly be used for something like this, but I think it would
fit better as an intermediate representation between SIMPLE (or generic
trees) and RTL.  The current expansion infrastructure in GCC that expands
from tree to RTL could easily be adapted to convert from LLVM, and RTL
certainly doesn't need the type information available in LLVM.  *shrug*
I'm not sure what converting back to trees would give us, although it's
probably possible.

> The overall schema could be:
>
> (GCC front-ends) -> (GENERIC trees) -> [(SIMPLE trees) -> (AST Optimizer)] ->
> -> (RTL conversion) -> (Code generation)

There are many ways to do it; I think it would work best as:

 (GCC front-ends) -> (GENERIC trees) -> LLVM -> [optimizations] -> (RTL conversion)
     -> (Code generation)

but a mix of approaches would certainly have advantages and disadvantages.
Even if GCC was not interested in using LLVM itself as a component (which
I really don't expect), some of the ideas could be incorporated into the
SIMPLE representation to get some of the advantages.

What exactly is gained by converting back to a tree representation
afterwards?  Just simplified integration with the compiler as it stands
now?

> with optional components in "-> [...] ->".  These components are not mandatory
> for building the GCC compiler.  They could be built separately once the g++ compiler
> is available.  The AST optimizer could be loaded from a .so file (as in the open64
> compiler), thus avoiding growing the size of the executable too much when
> AST optimization is not needed.

Sure, makes sense.  There are two ways I see to completely get LLVM out of
the picture when building from an untrusted compiler:

1. Have the simplest LLVM components be written in C, including the
   LLVM->RTL conversion.
2. Convert directly from generic trees to RTL when bootstrapping.

I would advocate #2 personally, simply because it has been done before
and could easily be resurrected.  The only problem
would be that two code paths would have to be maintained: Tree->RTL and
Tree->LLVM.  Although not a huge deal (a lot of stuff could be factored
out, as LLVM and RTL are both "low level" representations), it's still
suboptimal.

Anyway, I'd love to discuss various IR design issues.  If anyone would
like to talk about the pros/cons of optimizing on a tree vs a low-level
IR with type information, I can certainly talk about the experience I have
with it.  Hopefully I can convince you that trees are not really
necessary at all.  :) :)

-Chris

http://llvm.cs.uiuc.edu/
http://www.nondot.org/~sabre/Projects/


* [tree-ssa] AST optimizer in C++?
  2002-08-23 10:25   ` Chris Lattner
@ 2002-08-25  9:48     ` Pop Sébastian
  2002-08-25 14:33       ` Chris Lattner
  2002-08-25 15:35       ` Diego Novillo
  0 siblings, 2 replies; 39+ messages in thread
From: Pop Sébastian @ 2002-08-25  9:48 UTC (permalink / raw)
  To: Chris Lattner, dnovillo; +Cc: gcc

On Fri, Aug 23, 2002 at 12:28:53PM -0500, Chris Lattner wrote:
> 
> Anyway, I'd love to discuss various IR design issues.  If anyone would
> like to talk about the pros/cons of optimizing on a tree vs a low-level
> IR with type information, I can certainly talk about the experience I have
> with it.  Hopefully I can convince you that trees are not really
> necessary at all.  :) :)
> 

Agree, since we deal with (more or less) the same amount of information 
structured differently: trees vs. list of instructions.

AST representation as it exists now has the following information transformed 
directly from the input source code:
- control flow
- def-use 
- caller-callee
- types

Intermediate representations are created to better handle all this information.
IRs can be viewed as sub-languages of the original language:
- CFG is the sub-language that contains information about the control flow 
- SSA and dependence graphs for def-use
- Call-graph for caller-callee
- symbol tables for types
(see also 
http://www.stratego-language.org/twiki/bin/view/Stratego/StrategoPublications
http://www.stratego-language.org/twiki/bin/view/Transform/ProgramTransformation
)

These IRs expose the same information as the original source code but in a more
compact form to optimizers.  

OTOH this information is not ready to be used directly by optimizers, since IRs
can produce different representations for the same semantically equivalent program.
This increases the complexity of optimizers.  The idea is to reduce a class of
semantically equivalent programs to a single representative program.  This is the 
normalization process:
- SIMPLE normalizes def-use IRs
- goto/break/switch elimination and loop canonicalization aim to normalize 
  control flow.
- ??? normalization for caller-callee
- ??? normalization for types
(I have no idea yet about question marks... work in progress...)

  Low Level Virtual Machine (LLVM) defines a virtual instruction set that 
contains exactly the same amount of information as the AST representation and 
it has all the benefits of the three-address-code normalization.
  However LLVM does not provide support for high level loop constructs 
(nested loops for example).  Instead it defines branch "br" instructions.  
You then need extra information for loop nest optimizers.
  Thus, I agree that writing optimizers on LLVM is as difficult :-) as writing 
them on trees under SIMPLE form.  However I think that introducing yet another 
translation from trees to LLVM could be expensive in compile time (even though it is 
a quick transformation from SIMPLE).  

  A good idea I saw in LLVM is that the SSA information is stored directly 
in the operations stream: i.e. (from LLVMCompilationStrategy.pdf)
<result> = phi <type> [<val0>, <label0>], ..., [<valN>, <labelN>]

  Diego, maybe it is worth including PHI nodes directly as a SIMPLE extension?
This could solve the problem with the tree_common.aux field...  The only difficult
bit I see is to remove them before we generate RTL (or otherwise propagate 
information about SSA into RTL through the translation of these nodes).

What do others think?


* Re: [tree-ssa] AST optimizer in C++?
  2002-08-25  9:48     ` [tree-ssa] " Pop Sébastian
@ 2002-08-25 14:33       ` Chris Lattner
  2002-08-25 16:03         ` Diego Novillo
  2002-08-26  8:52         ` Pop Sébastian
  2002-08-25 15:35       ` Diego Novillo
  1 sibling, 2 replies; 39+ messages in thread
From: Chris Lattner @ 2002-08-25 14:33 UTC (permalink / raw)
  To: Pop Sébastian; +Cc: dnovillo, gcc

> > with it.  Hopefully I can convince you that trees are not really
> > necessary at all.  :) :)

> Agree, since we deal with (more or less) the same amount of information
> structured differently: trees vs. list of instructions.

Very true.  Although probably well known, lists of instructions have lots
of advantages as well (over trees):

* Better cache behavior: Tree representations tend to have lots of
  dynamically allocated nodes to represent the equivalent amount of
  information held by a single "instruction".  For example, a single "add"
  instruction can be represented with a single dynamically allocated node
  in list-of-instructions form, but requires at least 3 (and possibly
  more) nodes in a tree representation: one for the instruction, and two
  for the operands (see the sketch after this list).
* Smaller representation: Side effect of the above... with fewer
  allocations and fewer nodes, the total of the per-node overheads is much
  less.
* Simpler to deal with: There are _no_ sharing/unsharing or "amount of
  lowering" issues to deal with in a simple list-of-instructions
  representation.
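
To make the node-count point concrete, here is a deliberately oversimplified
sketch (hypothetical structs, not GCC's or LLVM's actual data structures):

  // Tree form: every operator and every operand is its own allocated node,
  // so "a + b" needs three nodes: one for '+', one for 'a', one for 'b'.
  struct TreeNode {
    int code;                  // PLUS_EXPR, VAR_DECL, ...
    TreeNode *operands[2];     // children, each a separate allocation
  };

  // List-of-instructions form: one allocated node per operation; operands
  // are just pointers to whatever already defines those values.
  struct Value { };
  struct Instruction : Value {
    int opcode;                // Add, Sub, ...
    Value *operands[2];        // no extra nodes allocated for the operands
  };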

These are some of the reasons that I think that the SIMPLE representation
could be made better.  That said, SIMPLE _is_ 90% of the way there, and a
HUGE step up from working on raw language dependent trees or RTL.

> AST representation as it exists now has the following information transformed
> directly from the input source code:
> - control flow
> - def-use
> - caller-callee
> - types

Yes, all of which are very important for one part of the compiler or
another.  See below for how LLVM addresses these.

> Intermediate representations are created to better handle all this information.
> IRs can be viewed as sub-languages of the original language:
> - CFG is the sub-language that contains information about the control flow
> - SSA and dependence graphs for def-use
> - Call-graph for caller-callee
> - symbol tables for types

Very true.

> These IRs expose the same information as the original source code but in a more
> compact form to optimizers.

And, more importantly, it should be an easier to manipulate form.  The AST
is an important representation that has to deal with partial information
(for example, before type checking/propagation has finished), and
potentially erroneous input.  A well designed mid-level IR doesn't have to
deal with these issues, so the representation used can be simplified a
lot, making it much easier to transform.

> OTOH this information is not ready to be used directly by optimizers, since IRs
> can produce different representations for the same semantically equivalent program.

I'm not really sure what you mean by this.  In my view, it is always best
to reduce the number of redundancies and the number of cases that must be
handled in the representation (i.e., by lowering ++X to use primitives, you
don't have to handle the ++ case anymore).  I'm not sure if this is what
you're getting at, but if so, I certainly agree.

> This increases the complexity of optimizers.  The idea is to reduce a class of
> semantically equivalent programs to a single representative program.  This is the
> normalization process:
> - SIMPLE normalizes def-use IRs
> - goto/break/switch elimination and loop canonicalization aim to normalize
>   control flow.
> - ??? normalization for caller-callee
> - ??? normalization for types
> (I have no idea yet about question marks... work in progress...)

Yes, that is how it currently stands.

>   Low Level Virtual Machine (LLVM) defines a virtual instruction set that
> contains exactly the same amount of information as the AST representation and
> it has all the benefits of the three-address-code normalization.

It actually covers all of the bases you list above pretty well; more
detail below.

>   Thus, I agree that writing optimizers on LLVM is as difficult :-) as writing
> them on trees under SIMPLE form.
...
>   A good idea I saw in LLVM is that the SSA information is stored directly
> in the operations stream: i.e. (from LLVMCompilationStrategy.pdf)
> <result> = phi <type> [<val0>, <label0>], ..., [<valN>, <labelN>]

This example that you picked out is just one of a variety of different
ways that LLVM is designed to simplify optimizer design.  Let me explain
how LLVM represents the information that you listed above:

> - SSA and dependence graphs for def-use

This is certainly the easiest one to explain.  LLVM does use SSA, and
as the example you showed above illustrates, the SSA information is
reflected __directly__ in the instructions themselves.  This occurs in
even more subtle ways than the PHI node instruction.  Some background is
necessary.

In our current LLVM implementation, we have the following C++ classes
[you can see the full class hierarchy at the bottom of the
http://llvm.cs.uiuc.edu/doxygen/inherits.html page, if you are so
inclined]:

Value  [ http://llvm.cs.uiuc.edu/doxygen/classValue.html ]
  This class is the base class of all things that can be used as an
  operand to an instruction.  For example, there are subclasses
  "Constant" and "Argument" to represent constants and formal parameters
  to functions.  The Value class keeps track of a list of "User"s [below],
  which are all of the objects that use the LLVM value.  This provides the
  def-use chain information.

User  [ http://llvm.cs.uiuc.edu/doxygen/classUser.html ]
  User is a subclass of Value which represents a Value that can refer to
  other LLVM values [for example Instruction, below].  The main feature of
  User is the operands list, which keeps pointers to all of the Values
  that are used by this User.  This information represents the SSA use-def
  information.  Of course, the C++ implementation provides a nice
  interface that makes it impossible for the use-def & def-use information
  to get out of date...

Instruction [ http://llvm.cs.uiuc.edu/doxygen/classInstruction.html ]
  Instruction is a subclass of User.  It provides information about the
  opcode of the instruction, and some other helpful instruction type
  stuff, in addition to inheriting functionality from its base classes.
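
As a greatly simplified, hypothetical sketch of how those three classes fit
together (the shape of the design, not the actual LLVM headers):

  #include <vector>

  struct User;

  struct Value {
    std::vector<User*> Users;      // def-use: everything that uses this value
    // (the real Value also carries a Type*, see the type discussion below)
  };

  struct User : Value {
    std::vector<Value*> Operands;  // use-def: everything this value uses
    void addOperand(Value *V) {    // the interface keeps both lists in sync
      Operands.push_back(V);
      V->Users.push_back(this);
    }
  };

  struct Instruction : User {
    unsigned Opcode;               // Add, Sub, PHI, Br, ...
  };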

So far, you see how the def-use & use-def SSA information is tracked.
The crucial thing to note here though is that _Instruction is a subclass
of Value_.  This means that when you have the following LLVM code:

  %A = add int %X, %Y
  %B = sub int %A, 0

... that there is a pointer in the operands list for the %B instruction
that points directly to the Add instruction itself.  There are no
intermediates or other references.  There are no tables on the side that
are necessary to update this information.

This makes the representation dramatically simpler to update and
transform.  For example, to optimize away the second instruction there, we
do something similar to the following:

if (B->getOpcode() == Instruction::Sub)
  if (B->getOperand(1) == ConstantZero)
    B->replaceAllUsesWith(B->getOperand(0));

The replaceAllUsesWith method is a member of the Value class
[ http://llvm.cs.uiuc.edu/doxygen/classValue.html#a9 ] that loops over all
of the Users of the value, changing all instances of "this" in their
operand list to instead refer to the argument of replaceAllUsesWith.
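
On the toy structs sketched earlier, replaceAllUsesWith amounts to something
like this (again purely illustrative, not the real implementation):

  void replaceAllUsesWith(Value *Old, Value *New) {
    for (unsigned i = 0; i < Old->Users.size(); ++i) {
      User *U = Old->Users[i];
      for (unsigned j = 0; j < U->Operands.size(); ++j)
        if (U->Operands[j] == Old)
          U->Operands[j] = New;     // rewrite the operand in place
      New->Users.push_back(U);      // New gains the user...
    }
    Old->Users.clear();             // ...and Old loses all of them
  }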

Note that this automatically keeps the SSA information up-to-date.  There
are no external tables to reference, reconstruct, or update.  This makes it
incredibly easy to write transformations.


> - symbol tables for types

LLVM also directly reflects this _in_ the representation, just like it does
the SSA information.  LLVM is a typed
[ http://llvm.cs.uiuc.edu/docs/LangRef.html#typesystem ] representation,
which requires that types exist on all values and that certain type
constraints are always true (for example, the two operands to an 'add' or
'sub' instruction must have the same type).  The type representation is truly
language independent, mapping high-level concepts like classes down
into low-level concepts like structures.

Because every LLVM value must have a type, the Value class contains a
Type* field, recording the type of every value.  Because of this, types
are easily and quickly available for use.  This makes "easy things easy",
and is also quite efficient:

if (B->getOpcode() == Instruction::Sub)
  if (!B->getType()->isFloatingPoint())   // Be careful with NaN's?
    if (B->getOperand(1) == ConstantZero)
      B->replaceAllUsesWith(B->getOperand(0));

Note that this representation is equivalent to how RTL stores the Mode
information in all of the "values" it uses.  Here, however, the type
system is much more powerful (and useful).  My key point is that this
information is kept IN the representation itself, not alongside in a
table that is:

 A. Slow - in the best case, accessing it is a hash table lookup vs a
    simple pointer indirection
 B. Large - again, hash tables have overhead
 C. Painful - The symbol table must always be kept up-to-date.


> - CFG is the sub-language that contains information about the control flow

Keeping with my story, LLVM also represents the CFG directly in the
representation, just like everything else we have discussed.  In
particular, there is a subclass of Value named BasicBlock
[ http://llvm.cs.uiuc.edu/doxygen/classBasicBlock.html ].  BasicBlocks
obviously contain a list of instructions in the block, but they themselves
are also Values (of type 'label').  Because of this, 'BasicBlock*'s may
exist right in the Operand list for the instructions themselves, although
there are only a few instructions that can use operands of type 'label'.
An example of some LLVM code:

   br label %bb2
bb2:
   %reg110 = phi uint [ %reg112, %bb2 ], [ 1, %bb1 ]
   %reg112 = add uint %reg110, 1
   %cond218 = setne uint %reg112, 10
   br bool %cond218, label %bb2, label %bb3
bb3:

In this short example, basic block "BB2" is an operand for the PHI
instruction and for the two branch instructions.  A pointer _to the basic
block_ itself exists in all of these locations.

This is great, but it doesn't really answer how we get a CFG out of this.
The simplest CFG interface can tell you all of the incoming and outgoing
edges of a block.  To do this, we have iterators that wrap the following
[trivial] analyses:

To get the successors of a block:
  One invariant of LLVM is that every basic block must end with a
  "Terminator" instruction
  [ http://llvm.cs.uiuc.edu/doxygen/classTerminatorInst.html ].  Terminator
  instructions are basically control flow instructions
  (conditional/unconditional branch, return, plus a few others to be fully
  general).  These instructions know all possible destinations that they
  may branch to, and therefore expose the "successor" interface used by
  the succ_iterator.

To get predecessors of a block:
  Because BasicBlock is a subclass of Value, it automatically keeps track
  of all of the Users of the block.  There are users of a basic block
  only in the following cases:
    1. A phi instruction; these can be ignored.
    2. Terminator instructions.  In this case, the basic block the
       terminator lives in is a predecessor.

  Thus, the pred_iterator basically loops over the users of the basic
  block, looking for terminator instructions.
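
In code, using those iterators looks roughly like the following hypothetical
sketch (pred_iterator/succ_iterator and the header names follow the documented
interfaces, and may differ in detail from the current sources):

  #include "llvm/BasicBlock.h"
  #include "llvm/Support/CFG.h"

  // Count the incoming and outgoing CFG edges of a basic block.
  void countEdges(BasicBlock *BB, unsigned &In, unsigned &Out) {
    In = Out = 0;
    for (pred_iterator PI = pred_begin(BB), E = pred_end(BB); PI != E; ++PI)
      ++In;     // each predecessor, found through the block's users
    for (succ_iterator SI = succ_begin(BB), E = succ_end(BB); SI != E; ++SI)
      ++Out;    // each successor, found through the terminator instruction
  }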

This just leaves one feature lacking from the CFG support:

>   However LLVM does not provide support for high level loop constructs
> (nested loops for example).  Instead it defines branch "br" instructions.
> You then need extra information for loop nest optimizers.

That is very true.  In fact we intentionally do not provide these
facilities as part of LLVM itself, for several reasons (below).  To
provide LoopInfo [ http://llvm.cs.uiuc.edu/doxygen/classLoopInfo.html ], we
use the LLVM Pass infrastructure (described in detail in:
http://llvm.cs.uiuc.edu/docs/WritingAnLLVMPass.html ) to automate the
construction and deconstruction of loop information as necessary.  Our
"loop information" simply identifies natural loops and their nesting
properties.
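
A pass that needs this information just declares the dependency; roughly
sketched (the pass itself is hypothetical, while getAnalysisUsage,
addRequired and LoopInfo are the interfaces the documents above describe --
details may differ in the actual sources):

  #include "llvm/Pass.h"
  #include "llvm/Analysis/LoopInfo.h"

  namespace {
    struct SomeLoopXform : public FunctionPass {
      // Tell the pass manager that LoopInfo must be available first.
      virtual void getAnalysisUsage(AnalysisUsage &AU) const {
        AU.addRequired<LoopInfo>();
      }
      virtual bool runOnFunction(Function &F) {
        LoopInfo &LI = getAnalysis<LoopInfo>();
        // ... walk LI's natural loops and their nesting here ...
        return false;     // nothing changed in this skeleton
      }
    };
  }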

There are several important aspects of loop information that make it
different from other parts of the representation, warranting different
treatment [instead of being part of the representation itself, it gets
treated like alias analysis information or dominator information]:

1. Updating loop information is not trivial.  We want to make it easy to
   write transformations and analyses, to allow explorative programming.
   After a transformation is mature and finished, "optimizations" like
   incrementally updating loop information can be implemented if desired.
   If not implemented, however, the loop information is simply invalidated
   and recalculated as necessary.  Details are in the "how to write a
   pass" document.

2. Most transformations don't need it.  While it is certainly useful for a
   class of transformations (loop transformations for example, LICM for
   another), there are many that don't need it.  Making all of the
   transformations aware of all of the possible analyses is not a good
   idea.

The other option is to add a "Loop" construct to the intermediate
representation itself.  I'll try to explain why I think this is a bad
idea:

1. A low-level representation should be low-level.  This breaks the whole
   notion of a flat representation (in LLVM's case), and adds a special
   case that must be handled by anything that changes or analyzes the CFG.

2. A Loop construct is inherently limited by one of two things:
    A. It is not general enough to represent all of the possible loops
       that can be handled by a transformation.  For example, if
       you only support Fortran-style "DO" loops, then a recursive list
       prefetching transformation will have absolutely no use for it.
    B. It is TOO general.  Upon realizing that "A" has set in, you might
       make the Loop construct a lot more general.  For example, first you
       add the capability to increment the induction variable by a loop
       invariant amount, then you add the ability to not _have_ an
       induction variable at all.  Perhaps you recognize and represent
       irreducible loops as well or something.

       If you go down this path, then the Loop construct becomes
       meaningless.  Transformations become as complex as if they detected
       the loops themselves.  General natural loops (which can have
       multiple back edges, for example) become a really nasty thing to
       represent as a "Loop" node, so the real generality desired is
       never implemented.

3. Cluttering up an intermediate representation with something that MAY be
   useful to some transformations is always a bad idea. It's better to
   focus on specific things to represent that are useful to nearly all
   transformations & code generation.  Representing loop information in
   the IR doesn't give you any advantage over representing it on the side,
   and it forces all clients of the IR to be aware of it.


> - Call-graph for caller-callee

This email is already WAY too long, so I'll cut this short.  The basic
gist of how LLVM represents this is pretty simple.  Functions
[ http://llvm.cs.uiuc.edu/doxygen/classFunction.html ] contain a list of
BasicBlocks, and themselves derive from Value.  Because of this, they
have def-use information just like everything else in LLVM.  The LLVM
'Call' instruction always takes a pointer to a function as an operand.
In the case of a direct call, then, this operand is a direct pointer to
the Function object itself.

For almost all purposes, this is a sufficient representation for
transformations that need "call graph" information [for example, inlining
does not need anything else].  No symbol-table lookup or other side
information is necessary to represent the caller-callee information.
more complex cases (for example, a transformation that does stuff with
indirect calls), the transformation can request
[ http://llvm.cs.uiuc.edu/docs/WritingAnLLVMPass.html#getAnalysisUsage ] a
call graph object [ http://llvm.cs.uiuc.edu/doxygen/classCallGraph.html ]
that contains this information.

This covers the most common case quite nicely, and has other advantages as
well [it's trivial to find the call sites for a particular function,
etc].

> However I think that introducing yet another
> translation from trees to LLVM could be expensive in compile time (even though it is
> a quick transformation from SIMPLE).

In my mind, one of two things would make sense:

1. Change SIMPLE to incorporate some of the better ideas from LLVM, while
   leaving stuff people don't like alone.
2. Replace SIMPLE with LLVM.  If LLVM looks nice, and having a mature C++
   implementation would also help, I certainly would not be opposed to
   this.  I'm sure many people would though, and I'm explicitly not
   pushing for this.

I do not, however, think that it makes sense to have both a "SIMPLE" and
an "LLVM" phase in the compiler; there is almost complete overlap between
the functionality provided, so there should be only one.

Do you have any idea what kinds of things are possible in SIMPLE that you
don't think would be possible (or pleasant) in LLVM?  I'd like to get an
idea if there is something that would be hard to do in LLVM, but is easy
in SIMPLE (to figure out how much overlap there is).

I'm really sorry for the length of this mail, but I think it helps explain
a bit of the philosophy behind LLVM...

-Chris

http://llvm.cs.uiuc.edu/
http://www.nondot.org/~sabre/Projects/


* Re: [tree-ssa] AST optimizer in C++?
  2002-08-25  9:48     ` [tree-ssa] " Pop Sébastian
  2002-08-25 14:33       ` Chris Lattner
@ 2002-08-25 15:35       ` Diego Novillo
  2002-08-25 15:45         ` Chris Lattner
  1 sibling, 1 reply; 39+ messages in thread
From: Diego Novillo @ 2002-08-25 15:35 UTC (permalink / raw)
  To: Pop Sébastian; +Cc: Chris Lattner, gcc


On Sun, 25 Aug 2002, Pop Sébastian wrote:

> A good idea I saw in LLVM is that the SSA information is
> stored directly in the operations stream: i.e. (from
> LLVMCompilationStrategy.pdf) <result> = phi <type> [<val0>,
> <label0>], ..., [<valN>, <labelN>]
> 
> Diego, maybe it is worth including PHI nodes directly as a
> SIMPLE extension?
>
I am not too keen on SSA re-writing schemes.  When you encode SSA
into your IR, you have the added aggravation of having to
re-write the code twice.  Once to convert it into SSA form and
the second time to convert it out of SSA form.

The approach we use now does not interfere with the underlying
IR.  What it does is overlay a web of def-use/use-def pointers on
top of the CFG.  Each of these def/use/phi references is
associated with a specific operand, expression, statement and
basic block.  This means that building the SSA form is not too
expensive (no IR re-writing) and once you're done, you just toss
the data away.

In any case, I'm always ready to be convinced about the benefits
of other approaches.  We're in the early stages, and are just now
starting to use the infrastructure to write optimizers.  As we
gain experience with it we'll be able to compare different
approaches.


> This could solve the problem with the tree_common.aux field...
>
Hmm, what problem?  The SSA form does not use that field
(directly).  The field is used to annotate trees with various
tidbits of information like what block they belong to, what
variable references are made by it, etc.


Diego.


* Re: [tree-ssa] AST optimizer in C++?
  2002-08-25 15:35       ` Diego Novillo
@ 2002-08-25 15:45         ` Chris Lattner
  2002-08-25 16:08           ` Diego Novillo
  2002-08-25 16:12           ` Daniel Berlin
  0 siblings, 2 replies; 39+ messages in thread
From: Chris Lattner @ 2002-08-25 15:45 UTC (permalink / raw)
  To: Diego Novillo; +Cc: Pop Sébastian, gcc


On Sun, 25 Aug 2002, Diego Novillo wrote:
> On Sun, 25 Aug 2002, Pop Sébastian wrote:

> > Diego, maybe it is worth including PHI nodes directly as a
> > SIMPLE extension?
> >
> I am not too keen on SSA re-writing schemes.  When you encode SSA
> into your IR, you have the added aggravation of having to
> re-write the code twice.  Once to convert it into SSA form and
> the second time to convert it out of SSA form.

Here's one possible way to look at this issue:  SIMPLE should always be in
SSA form.  When SIMPLE is initially constructed, SSA form is built.
When SIMPLE is lowered to RTL, UnSSA is run; or RTL in SSA form could even
be created, and the existing RTL UnSSA pass used eventually.

> The approach we use now does not interfere with the underlying
> IR.  What it does is overlay a web of def-use/use-def pointers on

Is there a reason that you'd like to have a SIMPLE form that is not in
SSA?

-Chris

http://llvm.cs.uiuc.edu/
http://www.nondot.org/~sabre/Projects/


* Re: [tree-ssa] AST optimizer in C++?
  2002-08-25 14:33       ` Chris Lattner
@ 2002-08-25 16:03         ` Diego Novillo
  2002-08-27  9:32           ` Chris Lattner
  2002-08-26  8:52         ` Pop Sébastian
  1 sibling, 1 reply; 39+ messages in thread
From: Diego Novillo @ 2002-08-25 16:03 UTC (permalink / raw)
  To: Chris Lattner; +Cc: Pop Sébastian, gcc

On Sun, 25 Aug 2002, Chris Lattner wrote:

> In my mind, one of two things would make sense:
> 
> 1. Change SIMPLE to incorporate some of the better ideas from LLVM, while
>    leaving stuff people don't like alone.
> 2. Replace SIMPLE with LLVM.  If LLVM looks nice, and having a mature C++
>    implementation would also help, I certainly would not be opposed to
>    this.  I'm sure many people would though, and I'm explicitly not
>    pushing for this.
> 
I agree with #1.  We chose to go with SIMPLE because it
represented an evolutionary approach wrt the existing
implementation.  Evolution is much more civilized than
revolution.

I've seen several neat ideas in LLVM.  Incorporating some of them
in GCC would be great.  And the tree-ssa branch is perfect for
that kind of experimentation.

> Do you have any idea what kinds of things are possible in SIMPLE that you
> don't think would be possible (or pleasant) in LLVM?  I'd like to get an
> idea if there is something that would be hard to do in LLVM, but is easy
> in SIMPLE (to figure out how much overlap there is).
> 
We have just started doing things in SIMPLE.  It is still
incomplete and we haven't found the really hard problems.  All
the work that we are doing is based on published algorithms and
designs, so it's relatively straightforward for people not
familiar with GCC to jump in and see how things are developing.


Diego.


* Re: [tree-ssa] AST optimizer in C++?
  2002-08-25 15:45         ` Chris Lattner
@ 2002-08-25 16:08           ` Diego Novillo
  2002-08-25 16:19             ` Diego Novillo
  2002-08-26  8:35             ` Chris Lattner
  2002-08-25 16:12           ` Daniel Berlin
  1 sibling, 2 replies; 39+ messages in thread
From: Diego Novillo @ 2002-08-25 16:08 UTC (permalink / raw)
  To: Chris Lattner; +Cc: Pop Sébastian, gcc

On Sun, 25 Aug 2002, Chris Lattner wrote:

> > The approach we use now does not interfere with the underlying
> > IR.  What it does is overlay a web of def-use/use-def pointers on
> 
> Is there a reason that you'd like to have a SIMPLE form that is not in
> SSA?
> 
Only that nobody has bothered implementing it.  We'd need to
represent PHI nodes in the IR and teach the rest of the compiler
how to deal with them (RTL expanders, tree manipulation, etc).
In principle it would be a rather mechanical, albeit large, patch.


Diego.


* Re: [tree-ssa] AST optimizer in C++?
  2002-08-25 15:45         ` Chris Lattner
  2002-08-25 16:08           ` Diego Novillo
@ 2002-08-25 16:12           ` Daniel Berlin
  2002-08-26  7:02             ` Pop Sébastian
  2002-08-26  8:52             ` Chris Lattner
  1 sibling, 2 replies; 39+ messages in thread
From: Daniel Berlin @ 2002-08-25 16:12 UTC (permalink / raw)
  To: Chris Lattner; +Cc: Diego Novillo, Pop Sébastian, gcc

> > The approach we use now does not interfere with the underlying
> > IR.  What it does is overlay a web of def-use/use-def pointers on
> 
> Is there a reason that you'd like to have a SIMPLE form that is not in
> SSA?

Um, yes.
It artificially limits us to that representation.
Say later, someone wants to use some other form of SSA, or non-SSA 
(dependence flow graph, etc).
If you make it actually part of the trees, they now have to care about 
it.
If it's just annotations, only the passes that want to look at those 
annotations look at them.

Plus, there are optimization passes that might not be run on SSA (some 
loop optimizations).

Once again, forcing SSA on the representation itself means they have to 
care about it (or you've limited when they can be run), while annotations 
have no such limitations.

There is no good reason to actually make it part of the form, other than 
having to insert copies yourself.  But all algorithms I've seen that are 
written to keep SSA form up to date do this anyway, or make it easy to do.
--Dan


* Re: [tree-ssa] AST optimizer in C++?
  2002-08-25 16:08           ` Diego Novillo
@ 2002-08-25 16:19             ` Diego Novillo
  2002-08-26  8:39               ` Chris Lattner
  2002-08-26  8:35             ` Chris Lattner
  1 sibling, 1 reply; 39+ messages in thread
From: Diego Novillo @ 2002-08-25 16:19 UTC (permalink / raw)
  To: Chris Lattner; +Cc: Pop Sébastian, gcc

On Sun, 25 Aug 2002, Diego Novillo wrote:

> On Sun, 25 Aug 2002, Chris Lattner wrote:
> 
> > Is there a reason that you'd like to have a SIMPLE form that is not in
> > SSA?
> > 
> Only that nobody has bothered implementing it.  We'd need to
>
Well, that and the fact that by encoding SSA into the IR you're
now forcing every transformation to know about SSA.  Not all of
them do.  In principle I don't see why dataflow and IR should be
tied to each other.


Diego.


* Re: [tree-ssa] AST optimizer in C++?
  2002-08-25 16:12           ` Daniel Berlin
@ 2002-08-26  7:02             ` Pop Sébastian
  2002-08-26  8:28               ` Steven Bosscher
  2002-08-26  8:52             ` Chris Lattner
  1 sibling, 1 reply; 39+ messages in thread
From: Pop Sébastian @ 2002-08-26  7:02 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: Chris Lattner, Diego Novillo, gcc

On Sun, Aug 25, 2002 at 07:12:03PM -0400, Daniel Berlin wrote:
> Plus, there are optimization passes that might not be run on SSA (some 
> loop optimizations).
> 
> Once again, forcing SSA on the representation itself means they have to 
> care about it (or you've limited when they can be run), while annotations 
> have no such limitations.
> 
okay.  I get the idea.

> There is no good reason to actually make it part of the form, 

  I'd like us to also examine the case of DECL_STMTs, which seems to fall under the 
same pattern.  
  For now we have a symbol table that is spread all over the tree.  DECL_ nodes
define entries in this symbol table.  Isn't a "stand alone" symbol table a better 
representation for storing this information?  It's probably a good way to store 
variable declarations in C-like languages, but what about Fortran and other 
languages?  



* Re: [tree-ssa] AST optimizer in C++?
  2002-08-26  7:02             ` Pop Sébastian
@ 2002-08-26  8:28               ` Steven Bosscher
  0 siblings, 0 replies; 39+ messages in thread
From: Steven Bosscher @ 2002-08-26  8:28 UTC (permalink / raw)
  To: Pop Sébastian; +Cc: gcc


On Mon 26-08-2002, at 16:02, Pop Sébastian wrote:
> On Sun, Aug 25, 2002 at 07:12:03PM -0400, Daniel Berlin wrote:
> > Plus, there are optimization passes that might not be run on SSA (some 
> > loop optimizations).
> > 
> > Once again, forcing SSA on the representation itself means they have to 
> > care about it (or you've limited when they can be run), while annotations 
> > have no such limitations.
> > 
> okay.  I get the idea.
> 
> > There is no good reason to actually make it part of the form, 
> 
>   I'd like us to also examine the case of DECL_STMTs, which seems to fall under the 
> same pattern.  
>   For now we have a symbol table that is spread all over the tree.  DECL_ nodes
> define entries in this symbol table.  Isn't a "stand alone" symbol table a better 
> representation for storing this information?  It's probably a good way to store 
> variable declarations in C-like languages, but what about Fortran and other 
> languages?  

What do you mean by "stand alone"? You mean, one big separate table for,
say, each function? I don't see how you would declare a local entity
with such a symbol table.

Why would there be any difference between C and other languages?

Greetz
Steven



* Re: [tree-ssa] AST optimizer in C++?
  2002-08-25 16:08           ` Diego Novillo
  2002-08-25 16:19             ` Diego Novillo
@ 2002-08-26  8:35             ` Chris Lattner
  1 sibling, 0 replies; 39+ messages in thread
From: Chris Lattner @ 2002-08-26  8:35 UTC (permalink / raw)
  To: Diego Novillo; +Cc: Pop Sébastian, gcc

On Sun, 25 Aug 2002, Diego Novillo wrote:
> On Sun, 25 Aug 2002, Chris Lattner wrote:
> > > The approach we use now does not interfere with the underlying
> > > IR.  What it does is overlay a web of def-use/use-def pointers on
> >
> > Is there a reason that you'd like to have a SIMPLE form that is not in
> > SSA?

> Only that nobody has bothered implementing it.  We'd need to
> represent PHI nodes in the IR and teach the rest of the compiler
> how to deal with them (RTL expanders, tree manipulation, etc).
> In principle it would be a rather mechanical, albeit large, patch.

In principle, you could avoid teaching all of the backends and
their RTL expanders about PHI nodes by simply simulating them: put
an assignment in the appropriate predecessor blocks.  Performing UnSSA on
a list-of-instructions representation is really quite simple and could be
performed at the same time as RTL expansion.
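
To illustrate the idea, here is a toy sketch (made-up data structures, neither
GCC's nor LLVM's) of lowering a PHI node by appending the corresponding copy
to the end of each predecessor block:

  #include <string>
  #include <utility>
  #include <vector>

  struct Block {
    std::vector<std::string> Code;                          // straight-line stmts
  };

  struct Phi {
    std::string Result;                                     // e.g. "x3"
    std::vector<std::pair<Block*, std::string> > Incoming;  // (pred, value)
  };

  // "x3 = phi(x1 from bb1, x2 from bb2)" becomes "x3 = x1" at the end of bb1
  // and "x3 = x2" at the end of bb2; the PHI itself then disappears.
  void lowerPhi(const Phi &P) {
    for (unsigned i = 0; i < P.Incoming.size(); ++i)
      P.Incoming[i].first->Code.push_back(P.Result + " = " + P.Incoming[i].second);
  }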

-Chris

http://llvm.cs.uiuc.edu/
http://www.nondot.org/~sabre/Projects/


* Re: [tree-ssa] AST optimizer in C++?
  2002-08-25 16:19             ` Diego Novillo
@ 2002-08-26  8:39               ` Chris Lattner
  2002-08-26 17:43                 ` Daniel Berlin
  0 siblings, 1 reply; 39+ messages in thread
From: Chris Lattner @ 2002-08-26  8:39 UTC (permalink / raw)
  To: Diego Novillo; +Cc: Pop Sébastian, gcc

On Sun, 25 Aug 2002, Diego Novillo wrote:
> On Sun, 25 Aug 2002, Diego Novillo wrote:
> > On Sun, 25 Aug 2002, Chris Lattner wrote:
> > > Is there a reason that you'd like to have a SIMPLE form that is not in
> > > SSA?

> > Only that nobody has bothered implementing it.  We'd need to

> Well, that and the fact that by encoding SSA into the IR you're
> now forcing every transformation to know about SSA.  Not all of
> them do.  In principle I don't see why dataflow and IR should be
> tied to each other.

In practice, I have found that it is _very_ easy to extend traditional
transformations to use SSA.  The one exception I've found is the SSA-PRE
algorithm, which you need to wrap your mind around before it makes
complete sense.  Once done, however, it's not a huge deal to implement it.

IMHO, the advantages of using SSA and having it available (for
convenience, optimizer efficiency, and at times more powerful algorithms
than data flow can provide) greatly outweigh the disadvantages of having
to update it.

What kinds of transformations do you have in mind that would be
inconvenient to implement in SSA form?

-Chris

http://llvm.cs.uiuc.edu/
http://www.nondot.org/~sabre/Projects/


* Re: [tree-ssa] AST optimizer in C++?
  2002-08-25 14:33       ` Chris Lattner
  2002-08-25 16:03         ` Diego Novillo
@ 2002-08-26  8:52         ` Pop Sébastian
  2002-08-26  9:17           ` Chris Lattner
  2002-08-26 21:18           ` Daniel Berlin
  1 sibling, 2 replies; 39+ messages in thread
From: Pop Sébastian @ 2002-08-26  8:52 UTC (permalink / raw)
  To: Chris Lattner; +Cc: dnovillo, gcc

On Sun, Aug 25, 2002 at 04:36:59PM -0500, Chris Lattner wrote:
> 
> > OTOH this information is not ready to be used directly by optimizers, since IRs
> > can produce different representations for the same semantically equivalent program.
> 
> I'm not really sure what you mean by this.  In my view, it is always best
> to reduce the number of redundancies and the number of cases that must be
> handled in the representation (i.e., by lowering ++X to use primitives, you
> don't have to handle the ++ case anymore).  I'm not sure if this is what
> you're getting at, but if so, I certainly agree.
> 
Exactly.

  In fact by reducing the number of cases to be handled (as in reducing syntactic 
sugar and control sugar) we simplify the optimizer.  Reducing "control sugar" is 
something that is well done in LLVM, since all loop constructs are lowered to the 
same common denominator: branch instructions.

  However, after the discussion about the SWITCH_STMT elimination I saw things 
differently: in a first step we reduce control structures to a common 
representative; then it is worth considering that some architectures support high 
level control constructs, and a promoting pass has to be applied. 
  Thus the underlying question is: is it worth lowering statements only to promote 
them later?  

  My opinion is that we win by having less complex optimizers.

[...]
> 
> Note that this representation is equivalent to how RTL stores the Mode
> information in all of the "values" it uses.  Here, however, the type
> system is much more powerful (and useful).  My key point is that this
> information is kept IN the representation itself, not alongside in a
> table 

The same thing happens in the tree representation, where the symbol table is
stored in DECL_STMTs directly in the tree.

> that is:
> 
>  A. Slow - in the best case, accessing it is a hash table lookup vs a
>     simple pointer indirection
>  B. Large - again, hash tables have overhead
>  C. Painful - The symbol table must always be kept up-to-date.
> 

You're right... sorry for the previous mail.

> 
> This is great, but it doesn't really answer how we get a CFG out of this.
> The simplest CFG interface can tell you all of the incoming and outgoing
> edges of a block.  To do this, we have iterators that wrap the following
> [trivial] analyses:
> 
[...]
>   the succ_iterator.
[...]
>   the pred_iterator basically loops over the users of the basic
>   block, looking for terminator instructions.
> 

Yep, that's a nice interface for handling CFG nodes.
I saw something like this in open64 as well.  
It's so simple to just say "iterate over all BBs in topological order"
and write "for (topological_CFG::iterator i = ..."  
:-)

> This just leaves one feature lacking from the CFG support:
> 
> >   However LLVM does not provide support for high level loop constructs
> > (nested loops for example).  Instead it defines branch "br" instructions.
> > You then need extra information for loop nest optimizers.
> 
> That is very true.  In fact we intentionally do not provide these
> facilities as part of LLVM itself, for several reasons (below).  To
> provide LoopInfo [ http://llvm.cs.uiuc.edu/doxygen/classLoopInfo.html ], we
> use the LLVM Pass infrastructure (described in detail in:
> http://llvm.cs.uiuc.edu/docs/WritingAnLLVMPass.html ) to automate the
> construction and deconstruction of loop information as neccesary.  Our
> "loop information" simply identifies Natural loops and their nesting
> properties.
> 
> There are several important aspects of loop information that make it
> different from other parts of the representation, warranting different
> treatment [instead of being part of the representation itself, it gets
> treated like alias analysis information or dominator information]:
> 
> 1. Updating loop information is not trivial.  
[...]
> 2. Most transformations don't need it.  
[...]
> 
Right, and right.
It seems that the same thing happens at this level again.
We can store the information in side structures and deal with reduced
sets of loop structures.

> 
> Do you have any idea what kinds of things are possible in simple that you
> don't think would be possible (or pleasant) in LLVM?  I'd like to get an
> idea if there is something that would be hard to do in LLVM, but is easy
> in SIMPLE (to figure out how much overlap there is).
> 
Hmmm, I have to think more about this...

For the moment I see that LLVM can express high level loop structures by
handling them as side information.  This trick can also be used to store
information about loop induction variables, so Fortran's DO-loops can
be handled effectively by optimizers in the same way as on tree structures.
And since you handle only the lowest level loop structures, you win by
having lower complexity optimizers.

[flat trees]
LLVM could be obtained by goto introduction from SIMPLE 
  break, continue -> br
  for, while, do -> br
  switch -> switch ("an extended br")

But then you have to construct loop information before you flatten statements.
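
To make the mapping concrete, here is a rough C-level sketch of what such
goto introduction amounts to (an analogy only: the function names are made
up, and the real SIMPLE and LLVM forms look different):

/* A structured loop...  */
int sum_structured (int n)
{
  int i = 0, s = 0;
  while (i < n)
    {
      s += i;
      ++i;
    }
  return s;
}

/* ...and its flattened, branch-only equivalent ("for, while, do -> br").  */
int sum_flattened (int n)
{
  int i = 0, s = 0;
 test:
  if (!(i < n))
    goto done;   /* conditional branch */
  s += i;
  ++i;
  goto test;     /* back edge */
 done:
  return s;
}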

> I'm really sorry for the length of this mail, but I think it helps explain
> a bit of the philosophy behind LLVM...
> 
thanks a lot for contributing ideas :-)


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [tree-ssa] AST optimizer in C++?
  2002-08-25 16:12           ` Daniel Berlin
  2002-08-26  7:02             ` Pop Sébastian
@ 2002-08-26  8:52             ` Chris Lattner
  2002-08-26 17:40               ` Daniel Berlin
  1 sibling, 1 reply; 39+ messages in thread
From: Chris Lattner @ 2002-08-26  8:52 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: Diego Novillo, Pop Sébastian, gcc

On Sun, 25 Aug 2002, Daniel Berlin wrote:

> > Is there a reason that you'd like to have a SIMPLE form that is not in
> > SSA?
>
> Um, yes.
> It artificially limits us to that representation.
> Say later, someone wants to use some other form of SSA, or non-SSA
> (dependence flow graph, etc).
> If you make it actually part of the trees, they now have to care about
> it.
> If it's just annotations, only what wants to look at those annotations
> look at it.

Actually, that's not true at all.  Using SSA as the base representation is
just like using a conventional register based underlying representation:
you can still layer other representations ON TOP of SSA, you just need to
keep the SSA up-to-date as you transform the layered representation (just
as you must do with the register based representation).

In practice, there are several simple extensions to SSA that are
representable as direct extensions to the representation [eta nodes, pi
nodes from the ABCE paper, etc], which don't require a layered
representation.  Other representations are really major extensions of SSA
which are enhanced by having SSA as the base representation.

Of course, I'm sure there are others I'm not familiar with, but worst
case, you just have to keep the base representation up to date with the
layered representation, which is nothing different than the current
situation with SSA.

The key difference with using SSA as the base representation, however, is
that almost ALL transformations benefit from it.  SSA will probably be the
_most_ commonly used representation when the tree-ssa branch is finally
mature (hence the name), beating out the register based representation by
a lot.  I'm not really sure when and if other representations would
actually be useful, and if their usefulness is worth the cost of creating
and maintaining the representation: whether layered on top of registers or
SSA.

> Plus, there are optimization passes that might not be run on SSA (some
> loop optimizations).

Loop optimizations generally work wonderfully on SSA form.  The PHI nodes
encapsulate a lot of the information you need directly in the code, which
is _quite_ convenient.

> Once again, forcing SSA on the representation itself means they have to
> care about it (or you've limited when they can be run), while annotations
> have no such limitations.

Sure, of course.  If, however, they don't understand the annotations, they
will be corrupted by the pass.  This means that you have to destroy and
regenerate the SSA annotations.  This is greatly expensive, which means
that the transformation will be updated to use SSA anyway (often making it
more efficient).  While forcing SSA from the start makes it harder to
transfer legacy code, I think this is actually a good thing for a couple
of reasons:

1. The code should be rewritten anyway, because the IR is different, there
   are different tradeoffs, and different cases to handle.
2. Writing a transformation taking advantage of SSA often _greatly_
   simplifies the code.  Again, I can show many examples of simplified
   transformations.
3. Destroying and regenerating SSA information is quite expensive when
   it's unnecessary.  Keeping track of whether or not SSA is created is
   also painful, hurting modularity.

> There is no good reason to actually make it part of the form, other than
> having to insert copies yourself.  But all algorithms i've seen that are
> written to keep SSA form up to date do this anyway, or make it easy to do.

I'm not sure if I understand what you're saying here... it seems as though
you are contradicting yourself by saying that SSA _is_ easy to keep
up-to-date (which I believe it is, of course :)  What am I
misinterpreting?

-Chris

http://llvm.cs.uiuc.edu/
http://www.nondot.org/~sabre/Projects/

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [tree-ssa] AST optimizer in C++?
  2002-08-26  8:52         ` Pop Sébastian
@ 2002-08-26  9:17           ` Chris Lattner
  2002-08-26 21:18           ` Daniel Berlin
  1 sibling, 0 replies; 39+ messages in thread
From: Chris Lattner @ 2002-08-26  9:17 UTC (permalink / raw)
  To: Pop Sébastian; +Cc: dnovillo, gcc


On Mon, 26 Aug 2002, Pop Sébastian wrote:
> On Sun, Aug 25, 2002 at 04:36:59PM -0500, Chris Lattner wrote:

>   In my view, it is always best
> > to eliminate the number of redundancies and number of cases that must be
> > handled in the representation (ie, by lowering ++X to use primitives, you

>   In fact by reducing the number of cases to be handled (as in reducing syntactic
> sugar, and control sugar) we simplify the optimizer.  Reducing "control sugar" is
> something that is well done in LLVM since all loop constructs are lowered to the
> same common denominator: branch instructions.

Yes, it makes things much simpler to deal with.

>   However after the discussion about the SWITCH_STMT elimination I saw things
> differently:  in a first step we reduce control structures to a common
> representative, then it is worth to consider that some architectures support high
> level control constructs and then a promoting pass has to be applied.
>   Thus the underlying question is: is it worth to lower statements for promoting
> them later?

This comes up a lot in LLVM, because it is decidedly, well, low-level.  :)
In our case, we do actually have a switch statement as part of the
representation ( http://llvm.cs.uiuc.edu/docs/LangRef.html#i_switch ), but I
am strongly considering removing it, and using a general indirect branch
instead.  The idea is that by having it be an indirect branch, you get the
following features:

1. Optimizations don't have to worry about the special case anymore.
2. Anything that uses the fact that it was a "switch" statement (such as
   subsequent lowering, or other optimizations), can simply be implemented
   to see if the pointer argument is derived from an array of labels.
   Since LLVM provides strong support for representing global values, and
   it would be a constant array, traditional optimizations would not
   really be more complex (especially with a couple of wrapper functions).

Because of this, I'm strongly leaning towards eliminating the instruction,
but haven't yet.
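
To sketch the idea (as a C-level analogy only, using the GCC labels-as-values
extension; the function name is made up and the actual LLVM encoding would
differ), the lowered form of a three-way switch would look roughly like:

void dispatch (int x)
{
  /* A constant array of labels, playing the role of an LLVM constant
     global array of basic block addresses.  */
  static void *targets[] = { &&case0, &&case1, &&case2 };

  goto *targets[x];   /* the indirect branch replacing the switch;
                         assumes 0 <= x <= 2 */
 case0:
  /* ... */
  return;
 case1:
  /* ... */
  return;
 case2:
  /* ... */
  return;
}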

>   My opinion is that we win in having less complex optimizers.

Definitely.  Less complex optimizers are also more likely to be correct.

> > Note that this representation is equivalent to how RTL stores the Mode
> > information in all of the "values" it uses.  Here, however, the type
> > system is much more powerful (and useful).  My key point is that this
> > information is kept IN the representation itself, not along-side in a
> > table
>
> same thing happens in tree representation where symbol table is stored in
> DECL_STMTs directly in the tree.

Ah yes, you're right.  I'm glad.  :)  The next thing that I need to
convince you of is the need for a language independent type system.  :)

> > This is great, but it doesn't really answer how we get a CFG out of this.
> > The simplest CFG interface can tell you all of the incoming and outgoing
> > edges of a block.  To do this, we have iterators that wrap the following
> > [trivial] analyses:
> >   the succ_iterator & pred_iterator

> Yep that's a nice interface for handling CFG nodes.
> I saw something like this also in open64.
> It's so simple to just say: iterate over all BB in topological order
> and just write "for (topological_CFG::iterator i = ..."
> :-)

Definitely.  We have all kinds of nice things like:

// traverse flow graph in DFO
for (df_iterator<BasicBlock> I = df_begin(BB)...

// traverse flow graph in depth first order along the predecessors,
// instead of successor links.
for (idf_iterator<BasicBlock> I = idf_begin(BB)...
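
For anyone who wants to see what such an iterator is doing under the hood, a
toy depth-first walk over successor edges looks roughly like this (a sketch
with made-up types, not the actual LLVM iterator classes):

#include <set>
#include <vector>

struct BB {                      // stand-in for a basic block
  std::vector<BB*> succs;        // successor edges
};

// Visit bb and everything reachable from it, depth first.
void dfs (BB *bb, std::set<BB*> &visited)
{
  if (!visited.insert (bb).second)
    return;                      // already visited
  // ... process bb here ...
  for (unsigned i = 0; i < bb->succs.size (); ++i)
    dfs (bb->succs[i], visited);
}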

> > Do you have any idea what kinds of things are possible in simple that you
> > don't think would be possible (or pleasant) in LLVM?  I'd like to get an
> > idea if there is something that would be hard to do in LLVM, but is easy
> > in SIMPLE (to figure out how much overlap there is).

> For the moment I see that LLVM can express high level loop structures by
> handling them as side information.  This trick can be used also to store
> information about loop induction variables, thus fortran's DO-loops can
> be effectively handled by optimizers in the same way as on tree structures.
> And since you handle only the lowest level loop structures you win in
> having low level complexity optimizers.

Yup, a trick that we use is to have the canonical induction variable for
a loop moved to be the first PHI node in a loop header.  That way you
don't even have to store information on the side for it.  Loop
transformations just look like this:

for each loop
  if (isACanonicalIndVar(EntryBB->front()))
    do stuff

Where isACanonicalIndVar just checks to see if the PHI node is formed
with an incoming constant and an add instruction (which is very fast).
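
Roughly, in code, the check amounts to something like this (a toy model with
made-up structures, not LLVM's real classes):

// One incoming value of the header PHI is a constant (the start value),
// the other is the result of an add instruction (the step).
struct Value { bool is_constant; bool is_add_inst; };
struct PhiNode { Value *incoming[2]; };

bool isACanonicalIndVar (const PhiNode &phi)
{
  const Value *a = phi.incoming[0];
  const Value *b = phi.incoming[1];
  return (a->is_constant && b->is_add_inst)
      || (b->is_constant && a->is_add_inst);
}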

> [flat trees]
> LLVM could be obtained by goto introduction from SIMPLE
>   break, continue -> br
>   for, while, do -> br
>   switch -> switch ("an extended br")
>
> But then you have to construct loop information before you flatten statements.

LLVM computes loop information from dominator information, which it
computes directly from the LLVM code itself.  Loop information is not required
to read and write LLVM; it's one of several analyses that can be computed
on the side.

> > I'm really sorry for the length of this mail, but I think it helps explain
> > a bit of the philosophy behind LLVM...
> >
> thanks a lot for contributing ideas :-)

No problem.  :)  Now I just have to convince everyone of the utility and
advantage of making the IR a full-fledged language in its own right, which
is self-contained.  :)

-Chris

http://llvm.cs.uiuc.edu/
http://www.nondot.org/~sabre/Projects/

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [tree-ssa] AST optimizer in C++?
  2002-08-26  8:52             ` Chris Lattner
@ 2002-08-26 17:40               ` Daniel Berlin
  2002-08-26 17:44                 ` Daniel Berlin
  2002-08-26 18:11                 ` Chris Lattner
  0 siblings, 2 replies; 39+ messages in thread
From: Daniel Berlin @ 2002-08-26 17:40 UTC (permalink / raw)
  To: Chris Lattner; +Cc: Diego Novillo, Pop Sébastian, gcc

On Mon, 26 Aug 2002, Chris Lattner wrote:

> On Sun, 25 Aug 2002, Daniel Berlin wrote:
> 
> > > Is there a reason that you'd like to have a SIMPLE form that is not in
> > > SSA?
> >
> > Um, yes.
> > It artificially limits us to that representation.
> > Say later, someone wants to use some other form of SSA, or non-SSA
> > (dependence flow graph, etc).
> > If you make it actually part of the trees, they now have to care about
> > it.
> > If it's just annotations, only what wants to look at those annotations
> > look at it.
> 
> Actually, that's not true at all.  Using SSA as the base representation is
> just like using a conventional register based underlying representation:
> you can still layer other representations ON TOP of SSA, you just need to
> keep the SSA up-to-date as you transform the layered representation (just
> as you must do with the register based representation).

No kidding, which is the problem.

> 
> In practice, there are several simple extensions to SSA that are
> representable as direct extensions to the representation [eta nodes, pi
> nodes from the ABCE paper, etc], which don't require a layered
> representation.  Other representations are really major extensions of SSA
> which are enhanced by having SSA as the base representation.
Not necessarily true at all.
But I'm also not saying SSA is a bad base; I just don't think we need to
make it the *actual* IR.

> 
> Of course, I'm sure there are others I'm not familiar with, but worst
> case, you just have to keep the base representation up to date with the
> layered representation, which is nothing different than the current
> situation with SSA.
This has nothing to do with keeping it up to date. It has to do with the 
fact that other things would need to know about PHI nodes, etc.
We can easily just rebuild the SSA form if we want to run a non-SSA pass 
in between two SSA ones (or whatever). But it's quicker to be able to just 
*RUN* that pass without having to drop PHI nodes, insert copies, etc., just
so that the non-SSA pass doesn't get confused.

 > 
> The key difference with using SSA as the base representation, however, is
> that almost ALL transformations benefit from it.  SSA will probably be the
> _most_ commonly used representation when the tree-ssa branch is finally
> mature (hence the name), beating out the register based representation by
> a lot.  I'm not really sure when and if other representations would
> actually be useful, and if their usefulness is worth the cost of creating
> and maintaining the representation: whether layered on top of registers or
> SSA.

DFG, PDG, etc are generally worth the cost of creating.


> 
> > Plus, there are optimization passes that might not be run on SSA (some
> > loop optimizations).
> 
> Loop optimizations generally work wonderfully on SSA form.  The PHI nodes
> encapsulates a lot of the information you need directly in the code, which
> is _quite_ convenient.

> > Once again, forcing SSA on the representation itself means they have to
> > care about it (or you've limited when they can be run), while annotations
> > have no such limitations.
> 
> Sure, of course.  If, however, they don't understand the annotations, they
> will be corrupted by the pass.  This means that you have to destroy and
> regenerate the SSA annotations. 
So?
The other option is to not be able to run the pass at all.
Why limit ourselves by making sure we can't run that other pass, when we 
get no actual *benefit* from doing so?

>  This is greatly expensive, which means
> that the transformation will be updated to use SSA anyway (often making it
> more efficient).  While forcing SSA from the start makes it harder to
> transfer legacy code, 

It's not necessarily legacy code.
All the world is not SSA.
Even if it was, it may not stay that way forever.

>    simplifies the code.  Again, I can show many examples of simplified
>    transformations.
> 3. Destroying and regenerating SSA information is quite expensive when
>    it's unneccesary.  Keeping track of whether or not SSA is created is
>    also painful, hurting modularity.
> 
> > There is no good reason to actually make it part of the form, other than
> > having to insert copies yourself.  But all algorithms i've seen that are
> > written to keep SSA form up to date do this anyway, or make it easy to do.
> 
> I'm not sure if I understand what you're saying here... it seems as though
> you are contradicting yourself by saying that SSA _is_ easy to keep
> up-to-date (which I believe it is, of course :)  What am I
> misinterpreting?

You keep missing the actual point.

1. There is no benefit to putting SSA in the trees, as opposed to 
annotations.
2. There is a cost to putting SSA in the trees, regardless of whether
you feel the things that aren't SSA are worth squat.

Thus, given this, *why* do it?

What does it buy us?

That's what I want to know.


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [tree-ssa] AST optimizer in C++?
  2002-08-26  8:39               ` Chris Lattner
@ 2002-08-26 17:43                 ` Daniel Berlin
  2002-08-26 18:12                   ` Chris Lattner
  0 siblings, 1 reply; 39+ messages in thread
From: Daniel Berlin @ 2002-08-26 17:43 UTC (permalink / raw)
  To: Chris Lattner; +Cc: Diego Novillo, Pop Sébastian, gcc

On Mon, 26 Aug 2002, Chris Lattner wrote:

> On Sun, 25 Aug 2002, Diego Novillo wrote:
> > On Sun, 25 Aug 2002, Diego Novillo wrote:
> > > On Sun, 25 Aug 2002, Chris Lattner wrote:
> > > > Is there a reason that you'd like to have a SIMPLE form that is not in
> > > > SSA?
> 
> > > Only that nobody has bothered implementing it.  We'd need to
> 
> > Well, that and the fact that by encoding SSA into the IR you're
> > now forcing every transformation to know about SSA.  Not all of
> > them do.  In principle I don't see why dataflow and IR should be
> > tied to each other.
> 
> In practice, I have found that it is _very_ easy to extend traditional
> transformations to use SSA.  The one exception I've found is the SSA-PRE
> algorithm, which you need to wrap your mind around before it makes
> complete sense.  Once done, however, it's not a huge deal to implement it.
> 
Which is why I did so already.

> IMHO, the advantages of using SSA and having it available (for both
> convenience, optimizer efficiency, and more powerful algorithms than data
> flow can provide at times) greatly out-weigh the disadvantages of having
> to update it.

We already have it available, and use it.
Nobody is saying we shouldn't use SSA algorithms when possible, or convert 
algorithms to SSA.
We aren't talking about whether to use SSA or not, we are talking about 
whether to keep it as annotations, or encode it into the IR.
Please address the benefits of *that*, not of SSA in general.
> 
> What kinds of transformations do you have in mind that would be
> inconvenient to implement in SSA form?

That's not quite relevant.
The real question is:
What does encoding it directly into the IR, rather than doing it as we do
now, buy us that makes it worth it, given the disadvantages?



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [tree-ssa] AST optimizer in C++?
  2002-08-26 17:40               ` Daniel Berlin
@ 2002-08-26 17:44                 ` Daniel Berlin
  2002-08-26 18:11                 ` Chris Lattner
  1 sibling, 0 replies; 39+ messages in thread
From: Daniel Berlin @ 2002-08-26 17:44 UTC (permalink / raw)
  To: Chris Lattner; +Cc: Diego Novillo, Pop Sébastian, gcc

On Mon, 26 Aug 2002, Daniel Berlin wrote:

> On Mon, 26 Aug 2002, Chris Lattner wrote:
> 
> > On Sun, 25 Aug 2002, Daniel Berlin wrote:
> > 
> > > > Is there a reason that you'd like to have a SIMPLE form that is not in
> > > > SSA?
> > >
> > > Um, yes.
> > > It artificially limits us to that representation.
> > > Say later, someone wants to use some other form of SSA, or non-SSA
> > > (dependence flow graph, etc).
> > > If you make it actually part of the trees, they now have to care about
> > > it.
> > > If it's just annotations, only what wants to look at those annotations
> > > look at it.
> > 
> > Actually, that's not true at all.  Using SSA as the base representation is
> > just like using a conventional register based underlying representation:
> > you can still layer other representations ON TOP of SSA, you just need to
> > keep the SSA up-to-date as you transform the layered representation (just
> > as you must do with the register based representation).
> 
> No kidding, which is the problem.

Just to be clear, I mean that the fact that we have to layer things on top of
it is the problem, not the updating.


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [tree-ssa] AST optimizer in C++?
  2002-08-26 17:40               ` Daniel Berlin
  2002-08-26 17:44                 ` Daniel Berlin
@ 2002-08-26 18:11                 ` Chris Lattner
  2002-08-26 18:46                   ` Daniel Berlin
  1 sibling, 1 reply; 39+ messages in thread
From: Chris Lattner @ 2002-08-26 18:11 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: Diego Novillo, Pop Sébastian, gcc

On Mon, 26 Aug 2002, Daniel Berlin wrote:
> On Mon, 26 Aug 2002, Chris Lattner wrote:
> > On Sun, 25 Aug 2002, Daniel Berlin wrote:
> > > > Is there a reason that you'd like to have a SIMPLE form that is not in
> > > > SSA?
> > >
> > > Um, yes.
> > > It artificially limits us to that representation.
> > > Say later, someone wants to use some other form of SSA, or non-SSA
> > > (dependence flow graph, etc).
> > > If you make it actually part of the trees, they now have to care about
> > > it.
> > > If it's just annotations, only what wants to look at those annotations
> > > look at it.
> >
> > Actually, that's not true at all.  Using SSA as the base representation is
> > just like using a conventional register based underlying representation:
> > you can still layer other representations ON TOP of SSA, you just need to
> > keep the SSA up-to-date as you transform the layered representation (just
> > as you must do with the register based representation).
>
> No kidding, which is the problem.

> Just to be clear, i mean the fact that we have to layer things on top of
> it is the problem. not the updating.

I understand exactly what you're saying.  My point is that if you want to
do a non-SSA representation TODAY, that you have to deal with layering.
Changing the base representation to SSA form doesn't change this fact.

> > In practice, there are several simple extensions to SSA that are
> > representable as direct extensions to the representation [eta nodes, pi
> > nodes from the ABCE paper, etc], which don't require a layered
> > representation.  Other representations are really major extensions of SSA
> > which are enhanced by having SSA as the base representation.

> Not necessarily true at all.
> But i'm also not saying SSA is a bad base, i just don't think we need to
> make it the *actual* IR.

I understand what you're saying.  :)

> > Of course, I'm sure there are others I'm not familiar with, but worst
> > case, you just have to keep the base representation up to date with the
> > layered representation, which is nothing different than the current
> > situation with SSA.

> This has nothing to do with keeping it up to date. It has to do with the
> fact that other things would need to know about PHI nodes, etc.
> We can easily just rebuild the SSA form if we want to run a non-SSA pass
> in between two SSA ones (or whatever). But it's quicker to be able to just
> *RUN* that pass without having to drop PHI nodes, insert copies, etc, just
> so that non-SSA pass doesn't get confused.

This has everything to do with keeping it up to date.  This is the
situation as it exists today with a layered SSA representation:

1. All SSA transformations are penalized by having to update both SSA form
   and standard register form.  This makes SSA transformations harder and
   less likely to be written.
2. Alternative representations can be represented just like SSA is now,
   layered.
3. Non-SSA transformations don't have to deal with SSA at all, they can
   just ignore it and have it garbage collected away.
4. Non-SSA transformations cause a huge compile-time performance hit,
   because they invalidate the SSA information, requiring it to be
   rebuilt.  SSA generation isn't SLOW, but regenerating it frequently can
   certainly add noticeably to compile time.  If dominator information is
   also out of date then things get really slow for big codes.
5. Transformations on alternative representations can either be aware of
   SSA or not.  If they aren't, then they suffer the same problems as #4.
   If they are, then they have to maintain and manipulate _3_ different
   representations, keeping them up to date.

... and this is the situation with SSA built into the representation:

1. SSA transformations just update the one true representation.  There is
   nothing to get out of date, and they are simple to write.
2. Transformation in alternative representations are implemented as layers
   on top of the SSA representation.  There isn't a huge difference
   between this case and a register based representation.
3. You don't have to allocate as much to maintain two representations.  You
   don't have to track whether or not things are up to date, because SSA
   is always available.  Thus, the optimizer runs faster.
4. Non-SSA transformations are harder to write.  See below.
5. Transformations on alternative representations have to maintain _at
   most_ two representations.  The register based representation doesn't
   factor into the picture at all.

About Non-SSA transformations:

1. I have yet to be convinced that there is a need for such a thing at
   all.  Entire optimizers (Pro64, LLVM) have been written in SSA, and I
   haven't come across an example of a transformation that is awkward to
   implement in SSA.  If you have a good example, please let me know.
2. Even if you do come up with a transformation, you can use a (generally
   useful) pass like the LLVM mem2reg pass.  This transformation basically
   promotes stack allocated values accessed with load & store
   instructions into SSA registers (memory is not in SSA form).

   If you have a transformation that really doesn't want to maintain and
   update SSA form, you simply have it so that it converts the stuff it's
   working on into loads and stores of a stack allocated value.  After the
   pass you run the mem2reg pass, which cleans everything up for you
   (basically constructs SSA just for the values you're interested in)

Either way, non-ssa transformations are not a problem, and again, I'm not
aware of anything that doesn't like to be in SSA.

> > The key difference with using SSA as the base representation, however, is
> > that almost ALL transformations benefit from it.  SSA will probably be the
> > _most_ commonly used representation when the tree-ssa branch is finally
> > mature (hence the name), beating out the register based representation by
> > a lot.  I'm not really sure when and if other representations would
> > actually be useful, and if their usefulness is worth the cost of creating
> > and maintaining the representation: whether layered on top of registers or
> > SSA.
>
> DFG, PDG, etc are generally worth the cost of creating.

These representations are also easily built "on the side" on top of SSA,
so they aren't very good for showing how SSA gets in the way...

> > > Once again, forcing SSA on the representation itself means they have to
> > > care about it (or you've limited when they can be run), while annotations
> > > have no such limitations.
> >
> > Sure, of course.  If, however, they don't understand the annotations, they
> > will be corrupted by the pass.  This means that you have to destroy and
> > regenerate the SSA annotations.
> So?
> The other option is to not be able to run the pass at all.
> Why limit ourselves by making sure we can't run that other pass, when we
> get no actual *benefit* from doing so?

That's not true at all, please see the remarks above.  You seem very
concerned about these transformations... can you give me some examples?

> >  This is greatly expensive, which means
> > that the transformation will be updated to use SSA anyway (often making it
> > more efficient).  While forcing SSA from the start makes it harder to
> > transfer legacy code,
>
> It's not necessarily legacy code.
> All the world is not SSA.
> Even if it was, it may not stay that way forever.

That's true, but even though SSA is not perfect, it's a HUGE step up over
just using a register based representation.  Why limit ourselves to the
lowest common denominator?

> > > There is no good reason to actually make it part of the form, other than
> > > having to insert copies yourself.  But all algorithms i've seen that are
> > > written to keep SSA form up to date do this anyway, or make it easy to do.
> >
> > I'm not sure if I understand what you're saying here... it seems as though
> > you are contradicting yourself by saying that SSA _is_ easy to keep
> > up-to-date (which I believe it is, of course :)  What am I
> > misinterpreting?
>
> You keep missing the actual point.
>
> 1. There is no benefit to putting SSA in the trees, as opposed to
> annotations.
> 2. There is a cost to putting SSA in the trees, regardless of whether
> you feel the things that aren't SSA, are worth squat.
>
> Thus, given this, *why* do it?
> What does it buy us?
> That's what i want to know.

Obviously because maintaining the representation on the side has costs,
costs that you seem to be ignoring.  If you want me to enumerate some of
them, I can.

The whole point is that SSA is a very nice representation for performing
optimizations on, and therefore you want to make writing these
optimizations easier.  If it makes the less common case a bit harder, then
that's ok, because it's not common.  Sorta like profile guided
optimization, I suppose, where we are optimizing initial development and
ongoing maintenance instead of execution time.

-Chris

http://llvm.cs.uiuc.edu/
http://www.nondot.org/~sabre/Projects/

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [tree-ssa] AST optimizer in C++?
  2002-08-26 17:43                 ` Daniel Berlin
@ 2002-08-26 18:12                   ` Chris Lattner
  0 siblings, 0 replies; 39+ messages in thread
From: Chris Lattner @ 2002-08-26 18:12 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: Diego Novillo, Pop Sébastian, gcc

> > IMHO, the advantages of using SSA and having it available (for both
> > convenience, optimizer efficiency, and more powerful algorithms than data
> > flow can provide at times) greatly out-weigh the disadvantages of having
> > to update it.
>
> We already have it available, and used.
> Nobody is saying we shouldn't use SSA algorithms when possible, or convert
> algorithms to SSA.
> We aren't talking about whether to use SSA or not, we are talking about
> whether to keep it as annotations, or encode it into the IR.
> Please address the benefits of *that*, not of SSA in general.

See: http://gcc.gnu.org/ml/gcc/2002-08/msg01658.html

> > What kinds of transformations do you have in mind that would be
> > inconvenient to implement in SSA form?
>
> That's not quite relevant.
> The real question is:
> What does encoding it directly into the IR, rather than doing is as we do
> now, buy us, that makes it worth it, given the disadvantages?

Ditto.

-Chris

http://llvm.cs.uiuc.edu/
http://www.nondot.org/~sabre/Projects/

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [tree-ssa] AST optimizer in C++?
  2002-08-26 18:11                 ` Chris Lattner
@ 2002-08-26 18:46                   ` Daniel Berlin
  2002-08-27  9:26                     ` Chris Lattner
  0 siblings, 1 reply; 39+ messages in thread
From: Daniel Berlin @ 2002-08-26 18:46 UTC (permalink / raw)
  To: Chris Lattner; +Cc: Diego Novillo, Pop Sébastian, gcc

No kidding, which is the problem.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [tree-ssa] AST optimizer in C++?
  2002-08-26  8:52         ` Pop Sébastian
  2002-08-26  9:17           ` Chris Lattner
@ 2002-08-26 21:18           ` Daniel Berlin
  2002-08-27  1:17             ` Pop Sébastian
                               ` (2 more replies)
  1 sibling, 3 replies; 39+ messages in thread
From: Daniel Berlin @ 2002-08-26 21:18 UTC (permalink / raw)
  To: Pop Sébastian; +Cc: Chris Lattner, dnovillo, gcc



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [tree-ssa] AST optimizer in C++?
  2002-08-26 21:18           ` Daniel Berlin
@ 2002-08-27  1:17             ` Pop Sébastian
  2002-08-27  6:28               ` Diego Novillo
                                 ` (2 more replies)
  2002-08-27  6:44             ` Discussion about a new development plan? (was: Re: [tree-ssa] ASToptimizer in C++?) Steven Bosscher
  2002-08-27  8:29             ` [tree-ssa] AST optimizer in C++? Chris Lattner
  2 siblings, 3 replies; 39+ messages in thread
From: Pop Sébastian @ 2002-08-27  1:17 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: Chris Lattner, dnovillo, gcc

On Tue, Aug 27, 2002 at 12:18:31AM -0400, Daniel Berlin wrote:
> 
> I've yet to deal with the issue of a Tree class (either by conversion, 
> or abstraction).

Keeping the original tree structure as it is avoids having to retranslate the
representation back to trees once we're finished.  So I'm for abstraction.
I think that what we should avoid is dealing with the tree structure directly;
we should build an interface that abstracts the existing structures (maybe the
tree structures will change).

> Though if you guys want to play with a C++ based AST optimizer, i'll 
> throw it on a branch or something 

That's an excellent idea.  What about including it in the tree-ssa branch?

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [tree-ssa] AST optimizer in C++?
  2002-08-27  1:17             ` Pop Sébastian
@ 2002-08-27  6:28               ` Diego Novillo
  2002-08-27  7:08                 ` Steven Bosscher
  2002-08-27  8:37               ` Daniel Berlin
  2002-08-27 12:21               ` Per Bothner
  2 siblings, 1 reply; 39+ messages in thread
From: Diego Novillo @ 2002-08-27  6:28 UTC (permalink / raw)
  To: Pop Sébastian; +Cc: Daniel Berlin, Chris Lattner, gcc


On Tue, 27 Aug 2002, Pop Sébastian wrote:

> > Though if you guys want to play with a C++ based AST optimizer, i'll 
> > throw it on a branch or something 
> 
> That's an excellent idea.  What about including it in the tree-ssa branch?
>
I'd rather do it separately.  This is not directly related to
the goals we have on this branch and might create quite a bit of
disruption.


Diego.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Discussion about a new development plan? (was: Re: [tree-ssa] ASToptimizer in C++?)
  2002-08-26 21:18           ` Daniel Berlin
  2002-08-27  1:17             ` Pop Sébastian
@ 2002-08-27  6:44             ` Steven Bosscher
  2002-08-27  7:09               ` Discussion about a new development plan? (was: Re: [tree-ssa] AST optimizer " Jan Hubicka
  2002-08-27  8:29             ` [tree-ssa] AST optimizer in C++? Chris Lattner
  2 siblings, 1 reply; 39+ messages in thread
From: Steven Bosscher @ 2002-08-27  6:44 UTC (permalink / raw)
  To: gcc

Hello,

Many Great New Improvements for GCC have been discussed lately on this
mailing list. Maybe it's time to devise a sane development plan, to see
what is feasible for what release, and in what time frame.

I know this isn't quite how Free Software is usually developed. But
since GCC is an essential tool for almost everybody in the Open Source
world, I think people will appreciate if they know what to expect
(without making any promises, of course ;-).



ONGOING WORK
(most of you already know this, so skip it or correct me)

Right now there are at least four projects ongoing that involve lots of
infrastructure changes or major rewrites:
1) The new (tree-based?) optimizer framework that people are working
   on in the tree-ssa branch.
2) The other "major rework" project on the cfg-branch, where Jan Hubicka
   and others were working on loop optimizers and midRTL.
3) The new, hand-crafted C++ parser in the cp-parser branch.
4) PCH work in the pch-branch. One guy, many improvements :-)

(BTW It would be nice to see more regular status updates for each
branch, hint hint...)

One would guess this would require all resources available for GCC
development (think `people' and `time' here, not money). But this week,
two new branches were announced. That means that GCC hackers now have 6
branches plus the mainline to deal with. Isn't that going to be a
potential branch maintainer nightmare?

To make things worse, many invasive changes have been discussed in this
mailing list lately, most notably the idea of doing at least some of the
optimizers in C++, which will probably require new bootstrap
infrastructure. That's the kind of thing for a new major release, I
would think...
Add to that the C++ ABI problems and hopefully a test suite to solve
those, the plan to move the C front end to its own directory, the
conversion from K&R to ISO/ANSI C, etc...

In short: GCC is now more than ever a moving target :-(

That's bad for distro builders, it's bad for the gcc users community, and
it's bad for GCC developers as well! For example, the GNU Pascal
compiler still would not even compile as part of GCC 3.0 last time I
looked (GC work isn't done yet).

I suppose most people would agree that it'd be nice to have a stable GCC
release branch for a while, maybe with some bug fixes and compilation
time improvements, but not too many changes.



NEW DEVELOPMENT PLAN (crude)

So maybe we could discuss a new development plan. The old one doesn't
say anything about what's beyond GCC 3.3, so this seems like the right
moment.
Here are my ideas:

I tried to divide the ongoing projects into three categories:
a) suitable for relatively quick merge:
	- PCH work. Most of the infrastructure is already there,
	  and (correct me if I'm wrong) the remaining work should
	  not affect anything in the middle-end or back-end.
	- Most of what's on the cfg-branch now.
	- Anything on the gcc-3_4-basic-improvements-branch, by definition.

b) long-term, invasive reworks and massive infrastructure changes
	- faster-compiler stuff. It would seem this could easily
	  cause conflicts with the tree-ssa stuff, so maybe it's better
	  to put them together on a single branch.
	- tree-ssa-branch. This is starting to look more and more
	  like a complete new middle-end.
	- More builtin folding (and/or expand BUILTIN_FRONTEND to
	  trees instead of RTL?)
	- All ISO C for source code not required for bootstrapping.
	- New C++ parser, and converting other front ends to
	  function-at-a-time mode
	- Bug-free (*cough*) C++ ABI, with testing framework

c) very long term ideas
	- Cross-file interprocedural optimizations and the like. Most
	of this obviously is related to items under (b), but it seems
	far more ambitious.
	- auto-vectorizing pass (as part of middle-end optimizers)

With these categories in mind, you could say, OK, stuff under (a) is
what we want in the 3.x release series. Stuff under (b) is for GCC 4.0,
and (c) is for 4.x (x>0) and beyond.

This means that distro builders/users can rely on GCC producing binary
compatible code for C++ for all GCC 3.x compilers. It also means that no
GCC 3.x will be much faster than what we have now, but I'm not sure how
big an issue this really is, especially once PCH is in.

Once it has been decided how the new optimizers should be implemented
(language, IR, etc), and after 3.3 has been branched, we could branch a
"middle-end-branch".
The mainline would be the base code for 3.4, 3.5, 3.6, and maybe 3.7,
depending on how well things would work out on the middle-end-branch.
Assuming one minor release every six months, GCC 3.6 would be released
in April 2004.

Anything that's on the tree-ssa and faster-compiler branches by then
would be merged onto this new middle-end-branch. Work on this branch
would include:
1) Move the C front end to its own directory and move stuff from
c-common.* to, say, tree-ir.*. (Maybe this should be done on the
"stable" branch as well to avoid huge merge conflicts???)
2) Figure out how to bootstrap with part of the sources written in C++
3a) Implement the new middle-end, using C++ and ISO C (with backports to
the mainline and release branch to avoid merge conflicts), and
3b) Convert the existing front ends to function-at-a-time mode to allow
them to actually use this infrastructure.
4) Include new front ends such as g95 and GNU Pascal??

Actually (3b) may be relevant only to GNU Pascal and GNAT, unless
somebody with serious brain damage stands up and volunteers to rewrite
g77. It may be better to concentrate in g95 instead, it already depends
on much of what is in the tree-ssa branch today, and it actually
produces code (well, for some programs at least).

As is being actively discussed right now, some optimizers will probably
be written in C++, so a stable g++ would be crucial for the work on the
middle-end-branch. Therefore, any major C++ work should be done on a
separate branch. The new C++ parser and all ABI fixes would go on the
cp-parser branch (renamed future-cxx branch by then?), and fixes would
be merged to the middle-end branch only after thorough testing.

Assuming the C++ parser can be finished within one and a half year
(CodeSourcery is working on it again, right?) and the work on the
middle-end will progress at the same speed it has been on in the last
half year, a GCC 4.0 alpha could be ready at about the same time you'd
release 3.6.

Unfortunately nobody can tell how realistic this timeframe is. I suppose
I'm being a bit optimistic... I would certainly like to hear your
opinions ;-)

Sorry for the long mail. I just hope this gets a discussion going...

Greetz
Steven


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [tree-ssa] AST optimizer in C++?
  2002-08-27  6:28               ` Diego Novillo
@ 2002-08-27  7:08                 ` Steven Bosscher
  0 siblings, 0 replies; 39+ messages in thread
From: Steven Bosscher @ 2002-08-27  7:08 UTC (permalink / raw)
  To: Diego Novillo; +Cc: Pop Sébastian, Daniel Berlin, gcc


On Tue 27-08-2002, at 15:25, Diego Novillo wrote:
> On Tue, 27 Aug 2002, Pop Sébastian wrote:
> 
> > > Though if you guys want to play with a C++ based AST optimizer, i'll 
> > > throw it on a branch or something 
> > 
> > That's an excellent idea.  What about including it in the tree-ssa branch?
> >
> I'd rather do it separately.  This is is not directly related to
> the goals we have on this branch and might create quite a bit of
> disruption.

There used to be a time when the tree-ssa branch was the ast-optimizer
branch ;-)

Anyway, I, too, would like to see the code and maybe play with it a bit.
Daniel, can't you post it as a big patch? It may be a little early to
create a whole new branch for something like this...

Greetz
Steven

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Discussion about a new development plan? (was: Re: [tree-ssa] AST optimizer in C++?)
  2002-08-27  6:44             ` Discussion about a new development plan? (was: Re: [tree-ssa] ASToptimizer in C++?) Steven Bosscher
@ 2002-08-27  7:09               ` Jan Hubicka
  0 siblings, 0 replies; 39+ messages in thread
From: Jan Hubicka @ 2002-08-27  7:09 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: gcc

> ONGOING WORK
> (most of you already know this, so skip it or correct me)
> 
> Right now there are at least four projects ongoing that involve lots of
> infrastructure changes or major rewrites:
> 1) The new (tree-based?) optimizer framework that people are working
>    on in the tree-ssa branch.
> 2) The other "major rework" project on the cfg-branch, where Jan Hubicka
>    and others were working on loop optimizers and midRTL.

Concerning cfg-branch, I plan to create (in fact I already did) a new branch
called rtlopt-branch, where RTL based optimizers in a shape ready for GCC
merging (ie not breaking many architectures and provably improving code
quality) should go.  I plan to move the optimizers from cfg-branch there so we
won't slip the release schedule again as happened previously, and to move
forward in the direction of removing the loop and cse optimizers and replacing
them with something sane.

I haven't included any code in the rtlopt branch nor announced it yet, as I
would like to get daily benchmarking of it working first and the machine doing
that has problems; Andreas is working on it.

I also plan to create midrtl-branch in the near future and place there the work
for midrtl I have done so far.  In the first stage I would target code mergeable
into mainline in the near future that just cleans up the insn-* interface and
virtualises it, so multiple machine descriptions can be compiled into the
compiler (it already works for me to some extent; insn-attrtab and friends are
not virtualised, as that is not needed for the midrtl project).

Then I would like to move in the direction of midrtl, once the AST matures
enough that we can clean up the RTL generation enough to make this possible.

Honza

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [tree-ssa] AST optimizer in C++?
  2002-08-26 21:18           ` Daniel Berlin
  2002-08-27  1:17             ` Pop Sébastian
  2002-08-27  6:44             ` Discussion about a new development plan? (was: Re: [tree-ssa] ASToptimizer in C++?) Steven Bosscher
@ 2002-08-27  8:29             ` Chris Lattner
  2 siblings, 0 replies; 39+ messages in thread
From: Chris Lattner @ 2002-08-27  8:29 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: Pop Sébastian, dnovillo, gcc

On Tue, 27 Aug 2002, Daniel Berlin wrote:

> > Yep that's a nice interface for handling CFG nodes.
> > I saw something like this also in open64.
> > It's so simple to just say: iterate over all BB in topological order
> > and just write "for (topological_CFG::iterator i = ..."
> > :-)

> Just an FYI, I reimplemented from scratch (so i wouldn't have to deal
> with any revealing of code i've been asked not to reveal issues) the
> graph_traits stuff, and their callgraph stuff, in a C++ based AST
> optimizer.

Nifty. :)

> I've yet to deal with the issue of a Tree class (either by conversion,
> or abstraction).
> Too lazy.
> It's just something i play with from time to time.
> Though if you guys want to play with a C++ based AST optimizer, i'll
> throw it on a branch or something (It can construct callgraphs
> incrementally, and a few other things).

That's very interesting!  What do you think the likelihood of GCC using a
C++ based optimizer is?  I think everyone will acknowledge that writing it
in C++ would make it much nicer to deal with, but do you think the extra
bootstrap stage (if no native C++ compiler is available) would be
tolerated?

-Chris

http://llvm.cs.uiuc.edu/
http://www.nondot.org/~sabre/Projects/

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [tree-ssa] AST optimizer in C++?
  2002-08-27  1:17             ` Pop Sébastian
  2002-08-27  6:28               ` Diego Novillo
@ 2002-08-27  8:37               ` Daniel Berlin
  2002-08-27 12:21               ` Per Bothner
  2 siblings, 0 replies; 39+ messages in thread
From: Daniel Berlin @ 2002-08-27  8:37 UTC (permalink / raw)
  To: Pop Sébastian; +Cc: Chris Lattner, dnovillo, gcc


On Tuesday, August 27, 2002, at 04:17  AM, Pop Sébastian wrote:

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [tree-ssa] AST optimizer in C++?
  2002-08-26 18:46                   ` Daniel Berlin
@ 2002-08-27  9:26                     ` Chris Lattner
  2002-08-27  9:57                       ` Chris Lattner
  0 siblings, 1 reply; 39+ messages in thread
From: Chris Lattner @ 2002-08-27  9:26 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: Diego Novillo, Pop Sébastian, gcc

On Mon, 26 Aug 2002, Daniel Berlin wrote:

> >> No kidding, which is the problem.
> >
> >> Just to be clear, i mean the fact that we have to layer things on top
> >> of it is the problem. not the updating.
> >
> > I understand exactly what you're saying.  My point is that if you want
> > to do a non-SSA representation TODAY, that you have to deal with
> > layering.

> No you don't, you can just add your own annotations.
> No layering needed.
> Where did you get the idea you need to touch SSA at all to write a
> seperate representation?

I'm sorry, I mis-typed.  I meant that if you want to do a non-register
based representation today, you have to deal with layering.  When I say
layering, I mean a layer of annotations on the side that basically exist
over the top of a base representation.

If SSA were your base representation and you wanted a PDG, for example, you could
have the annotations on the side of SSA just like you would a register
based representation.  Sorry for the confusion!  :)

> > This has everything to do with keeping it up to date.  This is the
> > situation as it exists today with a layered SSA reprsentation:
> >
> > 1. All SSA transformations are penalized by having to update both SSA
> > form
> >    and standard register form.  This makes SSA transformations harder
> > and
> >    less likely to be written

> How does it make it harder, given that  any transformation that keeps
> the SSA form up to date must necessarily be inserting the proper copies
> into the "standard" form and whatnot anyway?

I'm not sure what you mean by "inserting copies".  If SSA is the base
representation there is absolutely no need for a copy instruction (LLVM
doesn't have one, for example).  If SSA is layered on top of a register
based representation (with annotations), of course then you have to insert
copies into the register based representation, but that's only one of the
ways you have to modify it to keep it up to date...

Could you please explain a bit what you mean by this?  Sorry if I'm
missing something obvious.  :)


> > 2. Alternative representations can be represented just like SSA is now,
> >    layered.
> > 3. Non-SSA transformations don't have to deal with SSA at all, they can
> >    just ignore it and have it garbage collected away.
> > 4. Non-SSA transformations cause a huge compile-time performance hit,
> >    because they invalidate the SSA information, requiring it to be
> >    rebuilt.

> Well, not quite.
> You can incrementally update it without the non-SSA pass having to
> worry about it if you really think it's that big a hit.

I'm not sure what you mean by this.  You still have to regenerate the SSA
annotations after transformations on the alternative representation if you
don't update the SSA.  If you do update the SSA, that's case #5 below.

> Or, we could run the optimizers outside the SSA ones, but with your
> scheme, this adds an unssa pass. Not that this is slow, but just
> "another" thing to worry about.

Or, in your scheme, the SSA representation would effectively get garbage
collected away, and regenerated later, which is basically as expensive as
running unssa anyway.  Unssa is a quite fast and simple transformation
when you don't have to worry about RTL level issues.

> > 5. Transformations on alternative representations can either be aware
> > of
> >    SSA or not.  If they aren't, then they suffer the same problems as
> > #4.
> >    If they are, then they have to maintain and manipulate _3_ different
> >    representations, keeping them up to date.

> Not they directly necessarily, but yes, SSA needs to be kept up to date
> or rebuilt.

So my point was that _if_ SSA is the primary representation [ie, used by
most transformations], then it makes sense to make it the base
representation as well.

> > ... and this is the situation with SSA built into the representation:
> >
> > 1. SSA transformations just update the one true representation.  There
> > is
> >    nothing to get out of date, and they are simple to write.

> I still don't see how it was hard in the first place.

It is not _hard_ per se; see the end of the email.


> > 2. Transformation in alternative representations are implemented as
> > layers
> >    on top of the SSA representation.  There isn't a huge difference
> >    between this case and a register based representation.

> The hardness of this step depends on how close to SSA your
> representation is.

Certainly, just as the hardness of layering on top of a register based
representation depends on how close you are to a register based rep.  I
think that you would agree that "new" representations tend to be more
about encoding flow and dependence information into the representation
than they do about modeling explicit registers, along with all four
possible dependences...

> > Obviously because maintaining the representation on the side has costs,
> > costs that you seem to be ignoring.
>
> >  If you want me to enumerate some of
> > them, I can.

> Please do.
> The location of SSA information, be it in annotations, in the IR, or on
> some remote computer in budapest, should not affect the cost of
> implementation (IE the "hardness" that will prevent SSA opts from being
> written) as long as the abstraction is good.
> Maybe the cost of runtime, but even then, consider this:

Here are some of the costs that I can think of:

1. Actively working on (updating, traversing) two separate representations
   at the same time implies that you have less effective cache
   capacity available to you.  This includes L1, L2, TLB, and all of the
   other problems people are tackling.
2. Because you _must_ add extra layers of abstraction on top of things to
   make it easy to update the SSA information, your icache is less
   effective.  In addition, the code is harder to understand because there
   is more abstraction involved.
3. Some things you cannot abstract out, for example, generalized updates
   to the representation.
4. Having two representations active at the same time means that potential
   contributors have to be aware and understand the rules and semantics of
   BOTH of them.  In this case, it means that they have to know how and
   when to update the base representation and what exactly the abstraction
   functions are doing.
5. Speaking from experience, having the representation in SSA form only
   makes it much easier to manipulate instruction nodes, understand the
   lifetime of the underlying representation, and therefore have simple
   rules for when and how memory can be leaked.  This is something that
   GCC has lacked for a long time (not just in the optimizer), which is
   what makes the garbage collector effectively necessary.
6. Having a layered representation makes it much less convenient to define
   your intermediate representation as a _language_ in its own right, which
   gives you all kinds of nice advantages.

Note that this all depends on having a top-notch abstraction interface.
If you don't have a good abstraction interface, then there really is no
comparison.

> Even if we made SSA the IR, we still need the annotations to generate
> the factored edges, unless you are suggesting we get rid of these too
> (which, I should point out, Pro64 and friends have and use).
> If we get rid of those, we'd have to have nice large arrays that
> multimap defs to uses, map defs to uses, etc.
> Unless you want to throw those annotations *directly* in the tree
> structure, which would be another fight with other people.

I'm sorry, I'm not familiar with the term "factored edges".  What are they
used for?


> > The whole point is that SSA is a very nice representation for
> > performing
> > optimizations on, and therefore you want to make writing these
> > optimizations easier.  If it makes the less common case a bit harder,
> > then
> > that's ok, because it's not common.

> Not common *right now*.
> Look how quickly SSA took over.

Sure, because SSA makes a lot of sense for a lot of optimizations.  You
still haven't given me any examples of optimizations that are difficult
to write in SSA.  Most of the other representations are tuned to a
particular class of problem, making them less generally useful.  I would
argue that _this_ is why SSA caught on so fast.


> > Sorta like profile guided
> > optimization I suppose, were we are optimizing initial development and
> > ongoing maintenance instead of execution time.

> In terms of initial development time, i think i can offer some actual
> data:
> I've written 1.5 of the 2 SSA optimizers on the tree-ssa branch.
> You have written 0.

True, but I have written ~30 SSA optimizers in LLVM, plus many "analyses"
as well (dominators, natural loops, intervals, alias analysis, data
structure analysis, etc).  I do have some experience, and I do have some
idea what I'm talking about.  The fact that LLVM didn't exist at all two
years ago is a testament to how fast it has been progressing.

> Is this because SSA is not the IR?

This is for several reasons:

1. There is a huge learning curve to working on GCC
2. There is [basically] no documentation on the tree-ssa branch
3. I'm not paid to work on GCC, and all my free time is taken working on
   LLVM.

That said, I _would_ like to be able to work on GCC, contributing
enhancements here and there.  I _am_ somewhat familiar with other parts of
GCC, just because of the backend work I have done to get LLVM coming out
of GCC.  If the infrastructure were more mature and inviting, I would be
contributing.  A good start would be porting over some of the
transformations from LLVM.

> Honestly, I don't see how making SSA the IR would make me want to write
> 3 instead of 5, or would make it any easier.

I understand your viewpoint.  The fact is, however, that you are an expert
in GCC and an expert in the SSA representation used.  For people who
aren't, having two representations is painful and hard to deal with.
This makes it less likely that you will get _external_ contributions,
which I'd argue is what free software is all about.

> Maybe you could work up a patch to tree-ssa-ccp or tree-ssa-pre.c that
> shows how much easier it would be if we had it in the IR?
> No need to actually make it *work*, i just want to see an actual
> benefit in terms of what the code looks like, or how short it is, etc,
> rather than all theoretical junk.

I'm having a hard time getting access to those files.  subversion.gcc.org
doesn't seem to like displaying branches in viewcvs, and those files don't
show up here:

http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/?only_with_tag=tree-ssa-20020619-branch

any hints?

> It seems to me that all it would buy us in either is that the macros
> might have different names, or access different structure fields.
> For all intents and purposes, except for the fact that we don't
> directly represent the versions in the names (nor does Pro64), and phi
> nodes don't exist in trees, SSA *is* the IR.

I'm not arguing that SSA isn't the IR.  :)  I'm arguing that because it's
not the BASE IR that there are some disadvantages, that's all...

-Chris

http://llvm.cs.uiuc.edu/
http://www.nondot.org/~sabre/Projects/


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [tree-ssa] AST optimizer in C++?
  2002-08-25 16:03         ` Diego Novillo
@ 2002-08-27  9:32           ` Chris Lattner
  0 siblings, 0 replies; 39+ messages in thread
From: Chris Lattner @ 2002-08-27  9:32 UTC (permalink / raw)
  To: Diego Novillo; +Cc: Pop Sébastian, gcc

On Sun, 25 Aug 2002, Diego Novillo wrote:

> I've seen several neat ideas in LLVM.  Incorporating some of them
> in GCC would be great.  And the tree-ssa branch is perfect for
> that kind of experimentation.

Sounds good to me.  I just hope that some of the experience our group has
gained can be useful to the greater good.

-Chris

http://llvm.cs.uiuc.edu/
http://www.nondot.org/~sabre/Projects/

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [tree-ssa] AST optimizer in C++?
  2002-08-27  9:26                     ` Chris Lattner
@ 2002-08-27  9:57                       ` Chris Lattner
  0 siblings, 0 replies; 39+ messages in thread
From: Chris Lattner @ 2002-08-27  9:57 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: Diego Novillo, Pop Sébastian, gcc

> > Maybe you could work up a patch to tree-ssa-ccp or tree-ssa-pre.c that
> > shows how much easier it would be if we had it in the IR?
> > No need to actually make it *work*, i just want to see an actual
> > benefit in terms of what the code looks like, or how short it is, etc,
> > rather than all theoretical junk.
>
> I'm having a hard time getting access to those files.  subversion.gcc.org
> doesn't seem to like displaying branches in viewcvs, and those files don't
> show up here:

Okay I found the code in the Attic directory (is this normal?).
Basically, I think the benefits would be:

* Simpler code.  I'm not really sure why you have two loops in
  ssa_ccp_substitute_constants, but SCCP should only require one loop of
  transformation over the representation to finalize it.  In LLVM this
  looks like this:

  for (Function::iterator BB = F.begin(), BBE = F.end(); BB != BBE; ++BB)
    for (BasicBlock::iterator BI = BB->begin(); BI != BB->end();) {
      Instruction &Inst = *BI;
      if (ValueState[&Inst].isConstant()) {
        Constant *Const = ValueState[&Inst].getConstant();
        // Replaces all of the uses of a variable with uses of the constant.
        Inst.replaceAllUsesWith(Const);
        // Remove the operator from the list of definitions... and delete it.
        BI = BB->getInstList().erase(BI);
      } else {
        ++BI;
      }
    }

  [slightly simplified from
   http://llvm.cs.uiuc.edu/doxygen/SCCP_8cpp-source.html#l00239 ]

  why does your SCCP have the extra loop?

* Macros should disappear.  Given a structured representation, many of the
  macros should go away...
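
  Purely as an illustration (the class and member names below are
  hypothetical, not an existing GCC or LLVM interface), an accessor
  macro like TREE_OPERAND could turn into a small checked method on a
  wrapper class:

    // Hypothetical sketch only: wraps GCC's existing 'union tree_node'.
    union tree_node;

    class Expr {
      union tree_node *node;        // the underlying tree node
    public:
      explicit Expr(union tree_node *n) : node(n) {}

      // Would be implemented with TREE_CODE (node); declaration only here.
      bool isBinaryOp() const;

      // Would be implemented with TREE_OPERAND (node, i), plus an
      // assertion that 'i' is in range; returns another wrapped node.
      Expr getOperand(unsigned i) const;
    };

    // Instead of TREE_OPERAND (TREE_OPERAND (t, 0), 1) you would write:
    //   Expr e(t);
    //   Expr x = e.getOperand(0).getOperand(1);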

-Chris

http://llvm.cs.uiuc.edu/
http://www.nondot.org/~sabre/Projects/

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [tree-ssa] AST optimizer in C++?
  2002-08-27  1:17             ` Pop Sébastian
  2002-08-27  6:28               ` Diego Novillo
  2002-08-27  8:37               ` Daniel Berlin
@ 2002-08-27 12:21               ` Per Bothner
  2002-08-27 12:32                 ` Gabriel Dos Reis
  2 siblings, 1 reply; 39+ messages in thread
From: Per Bothner @ 2002-08-27 12:21 UTC (permalink / raw)
  To: Pop Sébastian; +Cc: gcc

Pop Sébastian wrote:

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [tree-ssa] AST optimizer in C++?
  2002-08-27 12:21               ` Per Bothner
@ 2002-08-27 12:32                 ` Gabriel Dos Reis
  2002-08-27 12:51                   ` Per Bothner
  0 siblings, 1 reply; 39+ messages in thread
From: Gabriel Dos Reis @ 2002-08-27 12:32 UTC (permalink / raw)
  To: Per Bothner; +Cc: Pop Sébastian, gcc

Per Bothner <per@bothner.com> writes:

| Pop Sébastian wrote:
| > Keeping the original tree structure as it is avoids to retranslate the 
| > representation back to trees once we're finished.  So I'm for abstraction.
| > I think that we should avoid is to deal with tree structure directly, and 
| > build an interface that will abstract existent structures (maybe tree 
| > structures will change).
| 
| Perhaps something like:
| 
| class Tree {
|     union tree_node *node;
| };
| 
| Or:
| 
| class Tree {
|    union tree_common common;
| }

Since the class Tree will be a base class, it is likely that we will be
using pointers to Tree almost exclusively, so the second declaration
saves an unnecessary indirection.
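
To make the indirection point concrete, here is a rough sketch (the
stub type below is made up purely for illustration; it is not the real
'struct tree_common'):

  union tree_node;                        // GCC's existing node, opaque here
  struct tree_common_stub { int code; };  // stand-in for the common fields

  // First form: every access to the underlying data costs one extra
  // load through the 'node' pointer.
  class TreeByPointer {
    union tree_node *node;
  };

  // Second form: a pointer to the wrapper points directly at the common
  // fields, so no extra indirection is needed to reach them.
  class TreeByValue {
    tree_common_stub common;
  };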

And, if we were to be using C++, I'm not convinced that we would like
to use "union tree_code", whose purpose is solely to provide a
polymorphic container type.  But, yes, I understand your point:

| Long term, I'd be in favor of re-writing the entire compiler in C++,
| but as a migration strategy it has to be a *wrapper* around the
| existing tree types, and at least for now must allow bootstrapping
| without an existing C++ compiler.  If the experiment looks promising,
| I'd be open to switching the entire compiler to C++, again assuming
| it can be done incrementally, without a total re-write!

Well, I have the impression that there is some sentiment against C++
somewhere...  I wish that impression were wrong...

-- Gaby

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [tree-ssa] AST optimizer in C++?
  2002-08-27 12:32                 ` Gabriel Dos Reis
@ 2002-08-27 12:51                   ` Per Bothner
  2002-08-27 13:02                     ` Gabriel Dos Reis
  0 siblings, 1 reply; 39+ messages in thread
From: Per Bothner @ 2002-08-27 12:51 UTC (permalink / raw)
  To: Gabriel Dos Reis; +Cc: gcc

Gabriel Dos Reis wrote:

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [tree-ssa] AST optimizer in C++?
  2002-08-27 12:51                   ` Per Bothner
@ 2002-08-27 13:02                     ` Gabriel Dos Reis
  0 siblings, 0 replies; 39+ messages in thread
From: Gabriel Dos Reis @ 2002-08-27 13:02 UTC (permalink / raw)
  To: Per Bothner; +Cc: gcc

Per Bothner <per@bothner.com> writes:

[...]

| In any case, it's not going to happen unless someone comes up with a
| good design, and illustrates its use, perhaps with a new (and optional)
| optimization.

I absolutely agree with you.

Thanks,

-- Gaby

^ permalink raw reply	[flat|nested] 39+ messages in thread

end of thread, other threads:[~2002-08-27 13:02 UTC | newest]

Thread overview: 39+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2002-08-22 13:50 A FrontEnd in C++? Chris Lattner
2002-08-23  0:51 ` AST optimizer " Pop Sébastian
2002-08-23 10:25   ` Chris Lattner
2002-08-25  9:48     ` [tree-ssa] " Pop Sébastian
2002-08-25 14:33       ` Chris Lattner
2002-08-25 16:03         ` Diego Novillo
2002-08-27  9:32           ` Chris Lattner
2002-08-26  8:52         ` Pop Sébastian
2002-08-26  9:17           ` Chris Lattner
2002-08-26 21:18           ` Daniel Berlin
2002-08-27  1:17             ` Pop Sébastian
2002-08-27  6:28               ` Diego Novillo
2002-08-27  7:08                 ` Steven Bosscher
2002-08-27  8:37               ` Daniel Berlin
2002-08-27 12:21               ` Per Bothner
2002-08-27 12:32                 ` Gabriel Dos Reis
2002-08-27 12:51                   ` Per Bothner
2002-08-27 13:02                     ` Gabriel Dos Reis
2002-08-27  6:44             ` Discussion about a new development plan? (was: Re: [tree-ssa] ASToptimizer in C++?) Steven Bosscher
2002-08-27  7:09               ` Discussion about a new development plan? (was: Re: [tree-ssa] AST optimizer " Jan Hubicka
2002-08-27  8:29             ` [tree-ssa] AST optimizer in C++? Chris Lattner
2002-08-25 15:35       ` Diego Novillo
2002-08-25 15:45         ` Chris Lattner
2002-08-25 16:08           ` Diego Novillo
2002-08-25 16:19             ` Diego Novillo
2002-08-26  8:39               ` Chris Lattner
2002-08-26 17:43                 ` Daniel Berlin
2002-08-26 18:12                   ` Chris Lattner
2002-08-26  8:35             ` Chris Lattner
2002-08-25 16:12           ` Daniel Berlin
2002-08-26  7:02             ` Pop Sébastian
2002-08-26  8:28               ` Steven Bosscher
2002-08-26  8:52             ` Chris Lattner
2002-08-26 17:40               ` Daniel Berlin
2002-08-26 17:44                 ` Daniel Berlin
2002-08-26 18:11                 ` Chris Lattner
2002-08-26 18:46                   ` Daniel Berlin
2002-08-27  9:26                     ` Chris Lattner
2002-08-27  9:57                       ` Chris Lattner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).