* Re: A FrontEnd in C++?
@ 2002-08-22 13:50 Chris Lattner
2002-08-23 0:51 ` AST optimizer " Pop Sébastian
0 siblings, 1 reply; 39+ messages in thread
From: Chris Lattner @ 2002-08-22 13:50 UTC (permalink / raw)
To: gcc
> Go right ahead. All you need is a new branch and some free time. I
> would personally suggest following some of the design and implementation
> ideas from SUIF and/or Sage++. They are kind of neat.
As long as we are plugging compiler infrastructures, I thought I might
mention my own project LLVM: http://llvm.cs.uiuc.edu/
It's also completely written in C++ (G++ 3.2), uses a GCC->LLVM frontend,
is SSA based, open-source, and has some interesting technology available
in it. The LLVM instruction set is roughly equivalent to SIMPLE
( http://llvm.cs.uiuc.edu/docs/LangRef.html ).
A few of the nifty things LLVM provides are the modular pass system:
http://llvm.cs.uiuc.edu/docs/WritingAnLLVMPass.html
strong infrastructure for interprocedural optimizations, linktime, and
runtime optimizations, etc. The infrastructure is _very_ fast.
Additionally, because it uses C++ and the STL throughout, writing
transformations is quite simple. For example, LICM and GCSE are each
just a couple hundred lines of code (which are mostly comments):
http://llvm/doxygen/LICM_8cpp-source.html
http://llvm/doxygen/GCSE_8cpp-source.html
If you have any questions about LLVM or would like to discuss compiler
architecture, I'd be more than happy to participate.
-Chris
^ permalink raw reply [flat|nested] 39+ messages in thread
* AST optimizer in C++?
  2002-08-22 13:50 A FrontEnd in C++? Chris Lattner
@ 2002-08-23  0:51 ` Pop Sébastian
  2002-08-23 10:25   ` Chris Lattner
  0 siblings, 1 reply; 39+ messages in thread
From: Pop Sébastian @ 2002-08-23 0:51 UTC (permalink / raw)
To: Chris Lattner; +Cc: gcc

Hi Chris,

On Thu, Aug 22, 2002 at 03:53:54PM -0500, Chris Lattner wrote:
>
> > Go right ahead. All you need is a new branch and some free time. I
> > would personally suggest following some of the design and implementation
> > ideas from SUIF and/or Sage++. They are kind of neat.
>
> As long as we are plugging compiler infrastructures, I thought I might
> mention my own project LLVM: http://llvm.cs.uiuc.edu/
>
I had a look at the project, and I'm interested in having a look at the
sources (if that's possible).

> It's also completely written in C++ (G++ 3.2), uses a GCC->LLVM frontend,
> is SSA based, open-source, and has some interesting technology available
> in it. The LLVM instruction set is roughly equivalent to SIMPLE
> ( http://llvm.cs.uiuc.edu/docs/LangRef.html ).
>
> A few of the nifty things LLVM provides are the modular pass system:
> http://llvm.cs.uiuc.edu/docs/WritingAnLLVMPass.html
> strong infrastructure for interprocedural optimizations, linktime, and
> runtime optimizations, etc. The infrastructure is _very_ fast.
>
> Additionally, because it uses C++ and the STL throughout, writing
> transformations is quite simple. For example, LICM and GCSE are each
> just a couple hundred lines of code (which are mostly comments):
>
> http://llvm/doxygen/LICM_8cpp-source.html
> http://llvm/doxygen/GCSE_8cpp-source.html
>
> If you have any questions about LLVM or would like to discuss compiler
> architecture, I'd be more than happy to participate.
>
After what I (quickly) read from the web page, you use gcc's front-ends.
My questions are:
- Do you have a class hierarchy for representing statements and expressions?
- Is it possible to generate GCC trees after your optimizations?
  (and then generate RTL by passing these trees to the RTL-translator...)

In fact what I want for gcc is an AST optimizer that could be built
independently of gcc's front-ends. This optimizer takes as input SIMPLE
trees, and its output is again gcc trees (a subset of SIMPLE, or even a
superset of SIMPLE if we decide to promote some stmts/exprs to a higher
level during optimization).

The overall schema could be:

(GCC front-ends) -> (GENERIC trees) -> [(SIMPLE trees) -> (AST Optimizer)] ->
  -> (RTL conversion) -> (Code generation)

with optional components in "-> [...] ->". These components are not
mandatory for building the GCC compiler. They could be built apart once
the g++ compiler is available. The AST optimizer could be loaded from a
.so file (as in the open64 compiler), thus avoiding growing the size of
the executable too much when AST optimization is not needed.
* Re: AST optimizer in C++?
  2002-08-23  0:51 ` AST optimizer " Pop Sébastian
@ 2002-08-23 10:25   ` Chris Lattner
  2002-08-25  9:48     ` [tree-ssa] " Pop Sébastian
  0 siblings, 1 reply; 39+ messages in thread
From: Chris Lattner @ 2002-08-23 10:25 UTC (permalink / raw)
To: Pop Sébastian; +Cc: gcc

> > As long as we are plugging compiler infrastructures, I thought I might
> > mention my own project LLVM: http://llvm.cs.uiuc.edu/
> I had a look at the project, and I'm interested to have a look at the
> sources (if it's possible).

As it says on the web page, we aren't really ready for a public "release"
yet. That said, the doxygen documentation is all online, so you are free
to browse the source: http://llvm.cs.uiuc.edu/doxygen/

I'll get back to the list this afternoon with something more decisive
about making tarballs public.

> > If you have any questions about LLVM or would like to discuss compiler
> > architecture, I'd be more than happy to participate.
> After what I (quickly) read from the web page you use gcc's front-ends.
> My questions are:
> - Do you have a class hierarchy for representing statements and expressions?

Sorta. If you take a look at this page:
http://llvm/doxygen/hierarchy.html or http://llvm/doxygen/inherits.html
you can see the (textual or graphical) class hierarchy for the entire
compiler suite. Search for "Value" and you can see the root of the tree
you're interested in.

Now this is a "sorta" because we don't have statements and expressions. :)
LLVM is a low-level three-address-code style of representation, in SSA
form, with _type information_ for all operands. One of our research
results has been to show that most of the optimizations traditionally
done on ASTs can be done just as well on low-level representations, *as
long as you have rich type information*. In the context of LLVM, then,
this means that statements and expressions are both instructions; it's
just that "statements" generally return void, i.e. they don't produce a
value.
If you'd like me to turn some C code into LLVM code to be more concrete,
I'd be more than happy to. It really is a nice representation for
implementing optimizations (see the subclasses of the "Pass" class in the
inheritance diagram for some of the transformations we have implemented).
There is quite a bit of documentation on the web page (and it's growing)
talking about the LLVM representation and compiler process. We are just
starting to work on the "intro to coding LLVM stuff" documents,
unfortunately.

> - Is it possible to generate GCC trees after your optimizations? (and then
> generate RTL by passing these trees to the RTL-translator...)

The basic architecture looks like this (this is covered in detail in this
document: http://llvm.cs.uiuc.edu/pubs/LLVMCompilationStrategy.pdf ):

The cc1 process:
  GCC Parser -> Tree -> RTL -> RTLSSA -> RTLSSA2LLVM,
  output x.s (LLVM assembly)
The 'as' process:
  parse x.s -> cleanup -> level raise -> optimize,
  output x.o (which is in LLVM bytecode)
The 'ld' process:
  link *.o bytecode files, generate native code for sparc (postlink),
  output x.s (sparc assembly); run the native sparc assembler/linker.

We also have a simple C backend as well. There are a couple of
interesting things to note about this process:

1. We completely discard all of the type information provided to us by
the source language, and reconstruct what we need from the RTL. This is a
side-effect of us starting with a GCC snapshot from about 2 months before
GCC 3.0 became official. With no tree-ssa branch to start from, we
decided to base our work around the RTL representation (which was also
important, because not every frontend used the Tree representation).
Although we currently only support the C frontend, we plan to turn on the
C++ frontend soon.

2. We have disabled almost all of the optimizations in GCC (basically all
the RTL optimizations, keeping the language specific optimizations, like
builtin optimizations).
This is because the optimizations in the GCC frontend were taking a lot
more time to execute than they did when reimplemented making use of the
SSA and type information. Additionally, some optimizations were only
applied by the CSE pass, which is limited to extended basic blocks,
whereas all of our optimizations are at least global.

To be specific, compiling vecffe.c from the SpecINT2000 GAP benchmark (a
big file) gives the following -ftime-report output when running with -O2
(showing that almost everything is disabled):

Execution times (seconds)
 preprocessing        :   0.31 ( 7%) usr   0.05 (25%) sys   0.53 ( 9%) wall
 lexical analysis     :   0.21 ( 4%) usr   0.03 (15%) sys   0.27 ( 5%) wall
 parser               :   1.05 (22%) usr   0.06 (30%) sys   1.27 (22%) wall
 varconst             :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 jump                 :   0.27 ( 6%) usr   0.00 ( 0%) sys   0.30 ( 5%) wall
 if-conversion        :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall
 convert to SSA       :   0.65 (14%) usr   0.03 (15%) sys   0.70 (12%) wall
 convert from SSA     :   0.82 (17%) usr   0.01 ( 5%) sys   1.13 (19%) wall
 final                :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall
 symout               :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 rest of compilation  :   0.32 ( 7%) usr   0.01 ( 5%) sys   0.36 ( 6%) wall
 RTL->LLVM code gen   :   0.44 ( 9%) usr   0.00 ( 0%) sys   0.47 ( 8%) wall
 LLVM ResolveOperands :   0.32 ( 7%) usr   0.00 ( 0%) sys   0.37 ( 6%) wall
 LLVM Code Emission   :   0.25 ( 5%) usr   0.01 ( 5%) sys   0.36 ( 6%) wall
 TOTAL                :   4.70            0.20            5.90

Note that the 3 LLVM passes happen between the convert to/from SSA
passes. toplev.c is pretty nastily hacked to disable this stuff (which
could certainly be fixed, of course). This is a debug build running on a
slow machine, so the actual timing results shouldn't be taken to mean
anything. In particular, the 3 LLVM passes are not really written to be
efficient, and should be rewritten at some point.

3. We do in fact need a .md file for our target to specify what sort of
RTL to generate.
Basically we just have completely generic operations, with as few
constraints as possible on the instructions, trying to get unobfuscated
code from the frontend. For example, our andhi3 pattern looks like:

(define_insn "andhi3"
  [(set (match_operand:HI 0 "register_operand" "=r")
        (and:HI (match_operand:HI 1 "nonmemory_operand" "")
                (match_operand:HI 2 "nonmemory_operand" "")))]
  ""
  "andhi3 %1, %2, %0")

If anyone cares, I can make our GCC changes available, although it's kind
of a hack (let's just say I didn't know GCC very well when I started :),
and needs to be largely rewritten.

4. Keeping LLVM code until the link step lets us perform important
interprocedural optimizations at link time, for example alias analysis,
that have a big effect on machine code generation (for example,
scheduling). Also inlining, and many other optimizations, are much more
powerful at link-time, as I'm sure everyone is keenly aware. To keep the
actual code generation step fast, we do as much optimization as possible
in the 'gccas' process, including machine specific optimization.

5. LLVM allows a full suite of optimizations on global objects as well as
the bodies of functions. This means it can completely eliminate unused
global variables, functions, etc, without problem. I believe that GCC
cannot currently do this in the backend (as part of generic optimization)
because it treats global variables, for example, as immutable places in
memory accessed by symbol_refs.

6. The LLVM optimizations are very fast (I can post numbers if you'd
like). Typically it takes as much time to run through cc1 as it does to
execute gccas. Although we do not have ALL of the optimizations we'd like
implemented for GCCAS, we do have a fair collection, executed in this
order:

* FunctionResolving - Resolves a call to 'int foo(...)' to 'int foo(int)'
  if there wasn't a prototype for foo specified.
* globaldce - Remove unused static global variables & functions, also
  removes unused 'externs'.
* deadtypeelimination - Clean up the symbol table by removing unused
  typedefs.
* constantmerge - Merge global variable constants (basically .rodata)
  with identical contents together.
* deadinst elimination - Trivial dead code elimination.
* raise allocations - Convert malloc/free calls to malloc/free
  instructions.
* indvar simplify - Convert indvars to canonical form.
* Level raise - Recover type information.
* promote mem2reg - Promote stack values to SSA registers.
* instcombining - Generalized peephole, val# pass, very fast.
* licm - Loop invariant code motion.
* gcse - Global Common Subexpression Elimination.
* sccp - Sparse conditional constant propagation.
* instcombine - Rerun value simplification.
* adce - Aggressive dead code elimination.
* simplifycfg - Merge and delete blocks.

These use various analyses to do their job. If you're interested, I can
also provide timing information for stuff if you'd like. There are a
couple of additional transformations that I would like added to this
collection: PRE, strength reduction, etc. They are being added when I
have time.

> In fact what I want for gcc is an AST optimizer that could be built
> independently of gcc's front-ends. This optimizer takes as input SIMPLE
> trees and its output is again gcc trees (a subset of SIMPLE, or even a
> superset of SIMPLE if we decide to promote some stmts/exprs to a higher
> level during optimization).

LLVM could certainly be used for something like this, but I think it
would fit better as an intermediate representation between SIMPLE (or
generic trees) and RTL. The current expansion infrastructure in GCC that
expands from tree to RTL could easily be adapted to convert from LLVM,
and RTL certainly doesn't need the type information available in LLVM.
*shrug* I'm not sure what converting back to trees would give us,
although it's probably possible.
> The overall schema could be:
>
> (GCC front-ends) -> (GENERIC trees) -> [(SIMPLE trees) -> (AST Optimizer)] ->
>   -> (RTL conversion) -> (Code generation)

There are many ways to do it. I think it would work best as:

(GCC front-ends) -> (GENERIC trees) -> LLVM -> [optimizations] ->
  (RTL conversion) -> (Code generation)

but a mix of approaches would certainly have advantages and
disadvantages. Even if GCC were not interested in using LLVM itself as a
component (which I really don't expect), some of the ideas could be
incorporated into the SIMPLE representation to get some of the
advantages. What exactly is gained by converting back to a tree
representation afterwards? Just simplified integration with the compiler
as it stands now?

> with optional components in "-> [...] ->". These components are not
> mandatory for building the GCC compiler. They could be built apart once
> the g++ compiler is available. The AST optimizer could be loaded from a
> .so file (as in open64 compiler), thus avoiding growing the size of the
> executable too much when AST optimization is not needed.

Sure, makes sense. There are two ways I see to completely get LLVM out of
the picture when building from an untrusted compiler:

1. Have the simplest LLVM components be written in C, including the
   LLVM->RTL conversion.
2. Convert directly from generic trees to RTL when bootstrapping.

I would advocate #2 personally, simply because it has already been done
before in the past and could easily be resurrected. The only problem
would be that two code paths would have to be maintained: Tree->RTL and
Tree->LLVM. Although not a huge deal (a lot of stuff could be factored
out, as LLVM and RTL are both "low level" representations), it's still
suboptimal.

Anyway, I'd love to discuss various IR design issues. If anyone would
like to talk about the pros/cons of optimizing on a tree vs a low-level
IR with type information, I can certainly talk about the experience I
have with it.
Hopefully I can convince you that trees are not really necessary at all.
:) :)

-Chris

http://llvm.cs.uiuc.edu/
http://www.nondot.org/~sabre/Projects/
* [tree-ssa] AST optimizer in C++?
  2002-08-23 10:25 ` Chris Lattner
@ 2002-08-25  9:48   ` Pop Sébastian
  2002-08-25 14:33     ` Chris Lattner
  2002-08-25 15:35     ` Diego Novillo
  0 siblings, 2 replies; 39+ messages in thread
From: Pop Sébastian @ 2002-08-25 9:48 UTC (permalink / raw)
To: Chris Lattner, dnovillo; +Cc: gcc

On Fri, Aug 23, 2002 at 12:28:53PM -0500, Chris Lattner wrote:
>
> Anyway, I'd love to discuss various IR design issues. If anyone would
> like to talk about the Pros/cons of optimizing on a tree vs a low-level
> IR with type information, I can certainly talk about the experience I
> have with it. Hopefully I can convince you that tree's are not really
> neccesary at all. :) :)
>
Agreed, since we deal with (more or less) the same amount of information,
structured differently: trees vs. lists of instructions.

The AST representation as it exists now has the following information,
transformed directly from the input source code:
- control flow
- def-use
- caller-callee
- types

Intermediate representations are created for better handling of all this
information. IRs can be viewed as sub-languages of the original language:
- the CFG is the sub-language that contains information about the control
  flow
- SSA and dependence graphs for def-use
- the call-graph for caller-callee
- symbol tables for types

(see also
http://www.stratego-language.org/twiki/bin/view/Stratego/StrategoPublications
http://www.stratego-language.org/twiki/bin/view/Transform/ProgramTransformation )

These IRs expose the same information as the original source code, but in
a more compact form for optimizers. OTOH this information is not ready to
be used directly by optimizers, since IRs can produce different
representations for the same semantically-equivalent program. This
increases the complexity of optimizers. The idea is to reduce a class of
semantically-equivalent programs to a single representative program.
This is the normalization process:
- SIMPLE normalizes def-use IRs
- goto/break/switch elimination and loop canonicalization aim to
  normalize control flow
- ??? normalization for caller-callee
- ??? normalization for types
(I have no idea yet about the question marks... work in progress...)

The Low Level Virtual Machine (LLVM) defines a virtual instruction set
that contains exactly the same amount of information as the AST
representation, and it has all the benefits of the three-address-code
normalization. However, LLVM does not provide support for high level loop
constructs (nested loops, for example). It defines instead branch "br"
instructions. You then need extra information for loop nest optimizers.
Thus, I agree that writing optimizers on LLVM is as difficult :-) as
writing them on trees under SIMPLE form. However I think that introducing
yet another translation from trees to LLVM could be expensive in
compilation time (yet it is a quick transformation from SIMPLE).

A good idea I saw in LLVM is that the SSA information is stored directly
in the instruction stream, i.e. (from LLVMCompilationStrategy.pdf):

<result> = phi <type> [<val0>, <label0>], ..., [<valN>, <labelN>]

Diego, maybe it is worth including PHI nodes directly as a SIMPLE
extension? This could solve the problem with the tree_common.aux field...
The only difficult bit I see is to remove them before we generate RTL (or
otherwise propagate information about SSA into RTL through the
translation of these nodes). What do others think?
* Re: [tree-ssa] AST optimizer in C++?
  2002-08-25  9:48 ` [tree-ssa] " Pop Sébastian
@ 2002-08-25 14:33   ` Chris Lattner
  2002-08-25 16:03     ` Diego Novillo
  2002-08-26  8:52     ` Pop Sébastian
  2002-08-25 15:35   ` Diego Novillo
  1 sibling, 2 replies; 39+ messages in thread
From: Chris Lattner @ 2002-08-25 14:33 UTC (permalink / raw)
To: Pop Sébastian; +Cc: dnovillo, gcc

> > with it. Hopefully I can convince you that tree's are not really
> > neccesary at all. :) :)
> Agree, since we deal with (more or less) the same amount of information
> structured differently: trees vs. list of instructions.

Very true. Although probably well known, lists of instructions have lots
of advantages as well (over trees):

* Better cache behavior: Tree representations tend to have lots of
  dynamically allocated nodes to represent the equivalent amount of
  information held by a single "instruction". For example, a single "add"
  instruction can be represented with a single dynamically allocated node
  in list-of-instructions form, but requires at least 3 (and possibly
  more) nodes in a tree representation: one for the instruction, and two
  for the operands.

* Smaller representation: A side effect of the above... with fewer
  allocations and fewer nodes, the total of the per-node overheads is
  much less.

* Simpler to deal with: There are _no_ sharing/unsharing or "amount of
  lowering" issues to deal with in a simple list-of-instructions
  representation.

These are some of the reasons that I think the SIMPLE representation
could be made better. That said, SIMPLE _is_ 90% of the way there, and a
HUGE step up from working on raw language dependent trees or RTL.

> AST representation as it exists now has the following information
> transformed directly from the input source code:
> - control flow
> - def-use
> - caller-callee
> - types

Yes, all of which are very important for one part of the compiler or
another. See below for how LLVM addresses these.
> Intermediate representations are created for better handling all this
> information. IR can be viewed as sub-languages of the original language:
> - the CFG is the sub-language that contains information about the
>   control flow
> - SSA and dependence graphs for def-use
> - the call-graph for caller-callee
> - symbol tables for types

Very true.

> These IR expose the same information as the original source code but in
> a more compact form to optimizers.

And, more importantly, it should be an easier form to manipulate. The AST
is an important representation that has to deal with partial information
(for example, before type checking/propagation has finished), and
potentially erroneous input. A well designed mid-level IR doesn't have to
deal with these issues, so the representation used can be simplified a
lot, making it much easier to transform.

> OTOH this information is not ready to be used directly by optimizers
> since IRs can produce different representations for the same
> semantically-equivalent program.

I'm not really sure what you mean by this. In my view, it is always best
to reduce the number of redundancies and the number of cases that must be
handled in the representation (i.e., by lowering ++X to use primitives,
you don't have to handle the ++ case anymore). I'm not sure if this is
what you're getting at, but if so, I certainly agree.

> This increases the complexity of optimizers. The idea is to reduce a
> class of semantically-equivalent programs to a single representative
> program. This is the normalization process:
> - SIMPLE normalizes def-use IRs
> - goto/break/switch elimination and loop canonicalization aim to
>   normalize control flow.
> - ??? normalization for caller-callee
> - ??? normalization for types
> (I have no idea yet about question marks... work in progress...)

Yes, that is how it currently stands.
> Low Level Virtual Machine (LLVM) defines a virtual instruction set that
> contains exactly the same amount of information as the AST
> representation and it has all the benefits of the three-address-code
> normalization.

It actually covers all of the bases you list above pretty well; more
detail below.

> Thus, I agree that writing optimizers on LLVM is as difficult :-) as
> writing them on trees under SIMPLE form.
...
> A good idea I saw in the LLVM is that the SSA information is stored
> directly in the operations stream: i.e. (from LLVMCompilationStrategy.pdf)
> <result> = phi <type> [<val0>, <label0>], ..., [<valN>, <labelN>]

This example that you picked out is just one of a variety of different
ways that LLVM is designed to simplify optimizer design. Let me explain
how LLVM represents the information that you listed above:

> - SSA and dependence graphs for def-use

This is certainly the easiest one to explain. LLVM does use SSA, and as
the example you showed above illustrates, the SSA information is
reflected __directly__ in the instructions themselves. This occurs in
even more subtle ways than the PHI node instruction. Some background is
necessary. In our current LLVM implementation, we have the following C++
classes [you can see the full class hierarchy at the bottom of the
http://llvm.cs.uiuc.edu/doxygen/inherits.html page, if you are so
inclined]:

Value [ http://llvm.cs.uiuc.edu/doxygen/classValue.html ]
  This class is the base class of all things that can be used as an
  operand to an instruction. For example, there are subclasses "Constant"
  and "Argument" to represent constants and formal parameters to
  functions. The Value class keeps track of a list of "User"s [below],
  which are all of the objects that use the LLVM value. This provides the
  def-use chain information.

User [ http://llvm.cs.uiuc.edu/doxygen/classUser.html ]
  User is a subclass of Value which represents a Value that can refer to
  other LLVM values [for example Instruction, below].
  The main feature of User is the operands list, which keeps pointers to
  all of the Values that are used by this User. This information
  represents the SSA use-def information. Of course, the C++
  implementation provides a nice interface that makes it impossible for
  the use-def & def-use information to get out of date...

Instruction [ http://llvm.cs.uiuc.edu/doxygen/classInstruction.html ]
  Instruction is a subclass of User. It provides information about the
  opcode of the instruction, and some other helpful instruction type
  stuff, in addition to inheriting functionality from its base classes.

So far, you see how the def-use & use-def SSA information is tracked. The
crucial thing to note here, though, is that _Instruction is a subclass of
Value_. This means that when you have the following LLVM code:

  %A = add int %X, %Y
  %B = sub int %A, 0

... there is a pointer in the operands list for the %B instruction that
points directly to the add instruction itself. There are no
intermediates or other references, and no tables on the side that are
necessary to update this information. This makes the representation
dramatically simpler to update and transform. For example, to optimize
away the second instruction there, we do something similar to the
following:

  if (B->getOpcode() == Instruction::Sub)
    if (B->getOperand(1) == ConstantZero)
      B->replaceAllUsesWith(B->getOperand(0));

The replaceAllUsesWith method is a member of the Value class
[ http://llvm.cs.uiuc.edu/doxygen/classValue.html#a9 ] that loops over
all of the Users of the value, changing all instances of "this" in their
operand lists to instead refer to the argument of replaceAllUsesWith.
Note that this automatically keeps the SSA information up-to-date. There
are no external tables to reference, reconstruct or update. This makes it
incredibly easy to write transformations.

> - symbol tables for types

LLVM also directly reflects this _in_ the representation, just like it
does the SSA representation.
LLVM is a typed [ http://llvm.cs.uiuc.edu/docs/LangRef.html#typesystem ]
representation, which requires that types exist on all values, and
requires that certain type constraints always hold (for example, the two
operands to an 'add' or 'sub' instruction must have the same type). The
type representation is truly language independent, by mapping high-level
concepts like classes down into low-level concepts like structures.

Because every LLVM value must have a type, the Value class contains a
Type* field, recording the type of every value. Because of this, types
are easily and quickly available for use. This makes "easy things easy",
and also quite efficient:

  if (B->getOpcode() == Instruction::Sub)
    if (!B->getType()->isFloatingPoint())        // Be careful with NaNs?
      if (B->getOperand(1) == ConstantZero)
        B->replaceAllUsesWith(B->getOperand(0));

Note that this representation is equivalent to how RTL stores the Mode
information in all of the "values" it uses. Here, however, the type
system is much more powerful (and useful). My key point is that this
information is kept IN the representation itself, not alongside it in a
table that is:

  A. Slow - in the best case, accessing it is a hash table lookup vs a
     simple pointer indirection
  B. Large - again, hash tables have overhead
  C. Painful - the symbol table must always be kept up-to-date.

> - the CFG is the sub-language that contains information about the
>   control flow

Keeping with my story, LLVM also represents the CFG directly in the
representation, just like everything else we have discussed. In
particular, there is a subclass of Value named BasicBlock
[ http://llvm.cs.uiuc.edu/doxygen/classBasicBlock.html ]. BasicBlocks
obviously contain a list of instructions in the block, but they
themselves are also Values (of type 'label'). Because of this,
'BasicBlock*'s may exist right in the operand list for the instructions
themselves, although there are only a few instructions that can use
operands of type 'label'.
An example of some LLVM code:

        br label %bb2
bb2:
        %reg110 = phi uint [ %reg112, %bb2 ], [ 1, %bb1 ]
        %reg112 = add uint %reg110, 1
        %cond218 = setne uint %reg112, 10
        br bool %cond218, label %bb2, label %bb3
bb3:

In this short example, basic block "bb2" is an operand for the PHI
instruction and for the two branch instructions. A pointer _to the basic
block_ itself exists in all of these locations.

This is great, but it doesn't really answer how we get a CFG out of this.
The simplest CFG interface can tell you all of the incoming and outgoing
edges of a block. To do this, we have iterators that wrap the following
[trivial] analyses:

To get the successors of a block: One invariant of LLVM is that every
basic block must end with a "Terminator" instruction
[ http://llvm.cs.uiuc.edu/doxygen/classTerminatorInst.html ]. Terminator
instructions are basically control flow instructions
(conditional/unconditional branch, return, plus a few others to be fully
general). These instructions know all possible destinations that they may
branch to, and therefore expose the "successor" interface used by the
succ_iterator.

To get the predecessors of a block: Because BasicBlock is a subclass of
Value, it automatically keeps track of all of the Users of the block.
There are users of a basic block only in the following cases:

1. A phi instruction; these can be ignored.
2. Terminator instructions. In this case, the basic block the terminator
   lives in is a predecessor.

Thus, the pred_iterator basically loops over the users of the basic
block, looking for terminator instructions. This just leaves one feature
lacking from the CFG support:

> However LLVM does not provide support for high level loop constructs
> (nested loops for example). It defines instead branch "br" instructions.
> You then need extra information for loop nest optimizers.

That is very true. In fact we intentionally do not provide these
facilities as part of LLVM itself, for several reasons (below).
To provide LoopInfo
[ http://llvm.cs.uiuc.edu/doxygen/classLoopInfo.html ], we use the LLVM
Pass infrastructure (described in detail in:
http://llvm.cs.uiuc.edu/docs/WritingAnLLVMPass.html ) to automate the
construction and deconstruction of loop information as necessary. Our
"loop information" simply identifies natural loops and their nesting
properties.

There are several important aspects of loop information that make it
different from other parts of the representation, warranting different
treatment [instead of being part of the representation itself, it gets
treated like alias analysis information or dominator information]:

1. Updating loop information is not trivial. We want to make it easy to
   write transformations and analyses, to allow explorative programming.
   After a transformation is mature and finished, "optimizations" like
   incrementally updating loop information can be implemented if desired.
   If not implemented, however, the loop information is simply
   invalidated and recalculated as necessary. Details are in the "how to
   write a pass" document.

2. Most transformations don't need it. While it is certainly useful for a
   class of transformations (loop transformations for example, LICM for
   another), there are many that don't need it. Making all of the
   transformations aware of all of the possible analyses is not a good
   idea.

The other option is to add a "Loop" construct to the intermediate
representation itself. I'll try to explain why I think this is a bad
idea:

1. A low-level representation should be low-level. This breaks the whole
   notion of a flat representation (in LLVM's case), and adds a special
   case that must be handled by anything that changes or analyzes the
   CFG.

2. A Loop construct is inherently limited by one of two things:

   A. It is not general enough to represent all of the possible loops
      that are able to be handled by a transformation.
For example, if you only support Fortran-style "DO" loops, then a recursive list prefetching transformation will have absolutely no use for it.

   B. It is TOO general. Upon realizing that "A" has set in, you might make the Loop construct a lot more general. For example, first you add the capability to increment the induction variable by a loop invariant amount, then you add the ability to not _have_ an induction variable at all. Perhaps you recognize and represent irreducible loops as well, or something. If you go down this path, then the Loop construct becomes meaningless. Transformations become as complex as if they detected the loops themselves. General natural loops (which can have multiple back edges, for example) become a really nasty thing to represent as a "Loop" node, so the real generality desired is never implemented.

3. Cluttering up an intermediate representation with something that MAY be useful to some transformations is always a bad idea. It's better to focus on representing specific things that are useful to nearly all transformations & code generation. Representing loop information in the IR doesn't give you any advantage over representing it on the side, and it forces all clients of the IR to be aware of it.

> - Call-graph for caller-callee

This email is already WAY too long, so I'll cut this short. The basic gist of how LLVM represents this is pretty simple. Functions [ http://llvm.cs.uiuc.edu/doxygen/classFunction.html ] contain a list of BasicBlocks, and themselves derive from Value. Because of this, they have def-use information just like everything else in LLVM. The LLVM 'call' instruction always takes a pointer to a function as an operand. In the case of a direct call, then, this operand is a direct pointer to the Function object itself. For almost all purposes, this is a sufficient representation for transformations that need "call graph" information [for example, inlining does not need anything else].
No symbol-table lookup or other side information is necessary to represent the caller-callee information. In more complex cases (for example, a transformation that does stuff with indirect calls), the transformation can request [ http://llvm.cs.uiuc.edu/docs/WritingAnLLVMPass.html#getAnalysisUsage ] a call graph object [ http://llvm.cs.uiuc.edu/doxygen/classCallGraph.html ] that contains this information. This covers the most common case quite nicely, and has other advantages as well [it's trivial to find the call sites for a particular function, etc].

> However I think that introducing yet another
> translation from trees to LLVM could be expensive in compiling time (yet it is
> a quick transformation from SIMPLE).

In my mind, one of two things would make sense:

1. Change SIMPLE to incorporate some of the better ideas from LLVM, while leaving stuff people don't like alone.
2. Replace SIMPLE with LLVM. If LLVM looks nice, and having a mature C++ implementation would also help, I certainly would not be opposed to this. I'm sure many people would be, though, and I'm explicitly not pushing for this.

I do not, however, think that it makes sense to have both a "SIMPLE" and an "LLVM" phase to the compiler; there is almost complete overlap between the functionality provided, so there should be only one. Do you have any idea what kinds of things are possible in SIMPLE that you don't think would be possible (or pleasant) in LLVM? I'd like to get an idea if there is something that would be hard to do in LLVM, but is easy in SIMPLE (to figure out how much overlap there is).

I'm really sorry for the length of this mail, but I think it helps explain a bit of the philosophy behind LLVM...

-Chris

http://llvm.cs.uiuc.edu/
http://www.nondot.org/~sabre/Projects/

^ permalink raw reply	[flat|nested] 39+ messages in thread
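The caller-callee representation described in the message above can be sketched concretely. The C++ below is a hypothetical, simplified model (the class names and structures are invented stand-ins, not LLVM's real code): a call instruction holds the callee Function as an operand, so the call sites of a function are simply those of its users that are call instructions.

```cpp
#include <vector>

// Simplified stand-ins for illustration (not LLVM's real classes).
struct User;

struct Value {
  std::vector<User*> users;   // def-use information, maintained automatically
  virtual ~Value() {}
};

struct User : Value {
  std::vector<Value*> operands;
};

// Functions derive from Value, so they track their users like anything else.
struct Function : Value {};

// A direct call's operand is a pointer to the Function object itself.
struct CallInst : User {
  explicit CallInst(Function* f) {
    operands.push_back(f);
    f->users.push_back(this);
  }
  Function* callee() { return static_cast<Function*>(operands[0]); }
};

// No side table needed: the call sites of f are just its CallInst users.
std::vector<CallInst*> callSitesOf(Function* f) {
  std::vector<CallInst*> sites;
  for (User* u : f->users)
    if (CallInst* c = dynamic_cast<CallInst*>(u))
      sites.push_back(c);
  return sites;
}
```

This is exactly the "trivial to find the call sites for a particular function" property: the information falls out of the def-use machinery for free.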
* Re: [tree-ssa] AST optimizer in C++? 2002-08-25 14:33 ` Chris Lattner @ 2002-08-25 16:03 ` Diego Novillo 2002-08-27 9:32 ` Chris Lattner 2002-08-26 8:52 ` Pop Sébastian 1 sibling, 1 reply; 39+ messages in thread From: Diego Novillo @ 2002-08-25 16:03 UTC (permalink / raw) To: Chris Lattner; +Cc: Pop Sébastian, gcc On Sun, 25 Aug 2002, Chris Lattner wrote: > In my mind, one of two things would make sense: > > 1. Change SIMPLE to incorporate some of the better ideas from LLVM, while > leaving stuff people don't like alone. > 2. Replace SIMPLE with LLVM. If LLVM looks nice, and having a mature C++ > implementation would also help, I certainly would not be opposed to > this. I'm sure many people would though, and I'm explicitly not > pushing for this. > I agree with #1. We chose to go with SIMPLE because it represented an evolutionary approach wrt the existing implementation. Evolution is much more civilized than revolution. I've seen several neat ideas in LLVM. Incorporating some of them in GCC would be great. And the tree-ssa branch is perfect for that kind of experimentation. > Do you have any idea what kinds of things are possible in simple that you > don't think would be possible (or pleasant) in LLVM? I'd like to get an > idea if there is something that would be hard to do in LLVM, but is easy > in SIMPLE (to figure out how much overlap there is). > We have just started doing things in SIMPLE. It is still incomplete and we haven't found the really hard problems. All the work that we are doing is based on published algorithms and designs, so it's relatively straightforward for people not familiar with GCC to jump in and see how things are developing. Diego. ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [tree-ssa] AST optimizer in C++? 2002-08-25 16:03 ` Diego Novillo @ 2002-08-27 9:32 ` Chris Lattner 0 siblings, 0 replies; 39+ messages in thread From: Chris Lattner @ 2002-08-27 9:32 UTC (permalink / raw) To: Diego Novillo; +Cc: Pop Sébastian, gcc On Sun, 25 Aug 2002, Diego Novillo wrote: > I've seen several neat ideas in LLVM. Incorporating some of them > in GCC would be great. And the tree-ssa branch is perfect for > that kind of experimentation. Sounds good to me. I just hope that some of the experience our group has gained can be useful to the greater good. -Chris http://llvm.cs.uiuc.edu/ http://www.nondot.org/~sabre/Projects/ ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [tree-ssa] AST optimizer in C++? 2002-08-25 14:33 ` Chris Lattner 2002-08-25 16:03 ` Diego Novillo @ 2002-08-26 8:52 ` Pop Sébastian 2002-08-26 9:17 ` Chris Lattner 2002-08-26 21:18 ` Daniel Berlin 1 sibling, 2 replies; 39+ messages in thread From: Pop Sébastian @ 2002-08-26 8:52 UTC (permalink / raw) To: Chris Lattner; +Cc: dnovillo, gcc

On Sun, Aug 25, 2002 at 04:36:59PM -0500, Chris Lattner wrote:
> > OTOH this information is not ready to be used directly by optimizers since IRs
> > can produce different representations for the same semantic-equivalent program.
>
> I'm not really sure what you mean by this. In my view, it is always best
> to eliminate the number of redundancies and number of cases that must be
> handled in the representation (ie, by lowering ++X to use primitives, you
> don't have to handle the ++ case anymore). I'm not sure if this is what
> you're getting at, but if so, I certainly agree.

Exactly. In fact, by reducing the number of cases to be handled (as in reducing syntactic sugar and control sugar) we simplify the optimizer. Reducing "control sugar" is something that is well done in LLVM, since all loop constructs are lowered to the same common denominator: branch instructions.

However, after the discussion about the SWITCH_STMT elimination I saw things differently: in a first step we reduce control structures to a common representative; then it is worth considering that some architectures support high level control constructs, and a promoting pass has to be applied. Thus the underlying question is: is it worth lowering statements only to promote them later? My opinion is that we win by having less complex optimizers.

[...]
> Note that this representation is equivalent to how RTL stores the Mode
> information in all of the "values" it uses. Here, however, the type
> system is much more powerful (and useful).
> My key point is that this
> information is kept IN the representation itself, not along-side in a
> table

The same thing happens in the tree representation, where the symbol table is stored in DECL_STMTs directly in the tree.

> that is:
>
> A. Slow - in the best case, accessing it is a hash table lookup vs a
>    simple pointer indirection
> B. Large - again, hash tables have overhead
> C. Painful - The symbol table must always be kept up-to-date.

You're right... sorry for the previous mail.

> This is great, but it doesn't really answer how we get a CFG out of this.
> The simplest CFG interface can tell you all of the incoming and outgoing
> edges of a block. To do this, we have iterators that wrap the following
> [trivial] analyses:
[...]
> the succ_iterator.
[...]
> the pred_iterator basically loops over the users of the basic
> block, looking for terminator instructions.

Yep, that's a nice interface for handling CFG nodes. I saw something like this also in open64. It's so simple to just say: iterate over all BBs in topological order, and just write "for (topological_CFG::iterator i = ..." :-)

> This just leaves one feature lacking from the CFG support:
>
> > However LLVM does not provide support for high level loop constructs
> > (nested loops for example). It defines instead branch "br" instructions.
> > You then need extra informations for loop nest optimizers.
>
> That is very true. In fact we intentionally do not provide these
> facilities as part of LLVM itself, for several reasons (below). To
> provide LoopInfo [ http://llvm.cs.uiuc.edu/doxygen/classLoopInfo.html ], we
> use the LLVM Pass infrastructure (described in detail in:
> http://llvm.cs.uiuc.edu/docs/WritingAnLLVMPass.html ) to automate the
> construction and deconstruction of loop information as neccesary. Our
> "loop information" simply identifies Natural loops and their nesting
> properties.
> There are several important aspects of loop information that make it
> different from other parts of the representation, warranting different
> treatment [instead of being part of the representation itself, it gets
> treated like alias analysis information or dominator information]:
>
> 1. Updating loop information is not trivial.
[...]
> 2. Most transformations don't need it.
[...]

Right, and right. It seems that the same thing happens at this level again: we can store information in side structures and deal with a reduced set of loop structures.

> Do you have any idea what kinds of things are possible in simple that you
> don't think would be possible (or pleasant) in LLVM? I'd like to get an
> idea if there is something that would be hard to do in LLVM, but is easy
> in SIMPLE (to figure out how much overlap there is).

Hmmm, I have to think more about this... For the moment I see that LLVM can express high level loop structures by handling them as side information. This trick can also be used to store information about loop induction variables; thus Fortran's DO-loops can be effectively handled by optimizers in the same way as on tree structures. And since you handle only the lowest level loop structures, you win by having lower complexity optimizers.

[flat trees] LLVM could be obtained by goto introduction from SIMPLE:
    break, continue  ->  br
    for, while, do   ->  br
    switch           ->  switch ("an extended br")

But then you have to construct loop information before you flatten statements.

> I'm really sorry for the length of this mail, but I think it helps explain
> a bit of the philosophy behind LLVM...

Thanks a lot for contributing ideas :-)

^ permalink raw reply	[flat|nested] 39+ messages in thread
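Both sides of this exchange assume that natural loops can be recovered from the flat CFG alone. As a rough illustration of the standard textbook approach (this is generic sketch code, not taken from LLVM or GCC), the snippet below computes dominators iteratively and then collects the body of the natural loop defined by a back edge:

```cpp
#include <algorithm>
#include <iterator>
#include <set>
#include <vector>

// A CFG as an adjacency list of successor edges; node 0 is the entry.
using CFG = std::vector<std::vector<int>>;

static std::vector<std::vector<int>> predecessors(const CFG& succ) {
  std::vector<std::vector<int>> pred(succ.size());
  for (int u = 0; u < (int)succ.size(); ++u)
    for (int v : succ[u])
      pred[v].push_back(u);
  return pred;
}

// Iterative dataflow: dom(n) = {n} union the intersection of dom(p)
// over all predecessors p of n.
std::vector<std::set<int>> dominators(const CFG& succ) {
  int n = succ.size();
  std::vector<std::vector<int>> pred = predecessors(succ);
  std::set<int> all;
  for (int i = 0; i < n; ++i) all.insert(i);
  std::vector<std::set<int>> dom(n, all);
  dom[0] = {0};
  for (bool changed = true; changed; ) {
    changed = false;
    for (int u = 1; u < n; ++u) {
      std::set<int> d = all;
      for (int p : pred[u]) {
        std::set<int> tmp;
        std::set_intersection(d.begin(), d.end(),
                              dom[p].begin(), dom[p].end(),
                              std::inserter(tmp, tmp.begin()));
        d.swap(tmp);
      }
      d.insert(u);
      if (d != dom[u]) { dom[u] = d; changed = true; }
    }
  }
  return dom;
}

// A back edge tail->header (where header dominates tail) defines a natural
// loop: the header plus everything that can reach the tail without passing
// through the header.
std::set<int> naturalLoop(const CFG& succ, int tail, int header) {
  std::vector<std::vector<int>> pred = predecessors(succ);
  std::set<int> loop = {header};
  std::vector<int> work;
  if (loop.insert(tail).second) work.push_back(tail);
  while (!work.empty()) {
    int x = work.back(); work.pop_back();
    for (int p : pred[x])
      if (loop.insert(p).second)
        work.push_back(p);
  }
  return loop;
}
```

Nesting then falls out by containment: loop A nests inside loop B when A's node set is a subset of B's. This is the kind of side analysis that can be invalidated and recomputed at will, exactly as described above.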
* Re: [tree-ssa] AST optimizer in C++? 2002-08-26 8:52 ` Pop Sébastian @ 2002-08-26 9:17 ` Chris Lattner 2002-08-26 21:18 ` Daniel Berlin 1 sibling, 0 replies; 39+ messages in thread From: Chris Lattner @ 2002-08-26 9:17 UTC (permalink / raw) To: Pop Sébastian; +Cc: dnovillo, gcc

On Mon, 26 Aug 2002, Pop Sébastian wrote:

> On Sun, Aug 25, 2002 at 04:36:59PM -0500, Chris Lattner wrote:
> > In my view, it is always best
> > to eliminate the number of redundancies and number of cases that must be
> > handled in the representation (ie, by lowering ++X to use primitives, you
> In fact by reducing the number of cases to be handled (as in reducing syntactic
> sugar and control sugar) we simplify the optimizer. Reducing "control sugar" is
> something that is well done in LLVM since all loop constructs are lowered to the
> same common denominator: branch instructions.

Yes, it makes things much simpler to deal with.

> However after the discussion about the SWITCH_STMT elimination I saw things
> differently: in a first step we reduce control structures to a common
> representative, then it is worth considering that some architectures support
> high level control constructs, and then a promoting pass has to be applied.
> Thus the underlying question is: is it worth lowering statements only to
> promote them later?

This comes up a lot in LLVM, because it is decidedly, well, low-level. :) In our case, we do actually have a switch statement as part of the representation ( http://llvm.cs.uiuc.edu/docs/LangRef.html#i_switch ), but I am strongly considering removing it, and using a general indirect branch instead. The idea is that by having it be an indirect branch, you get the following features:

1. Optimizations don't have to worry about the special case anymore.
2.
Anything that uses the fact that it was a "switch" statement (such as subsequent lowering, or other optimizations) can simply be implemented to see if the pointer argument is derived from an array of labels.

Since LLVM provides strong support for representing global values, and it would be a constant array, traditional optimizations would not really be more complex (especially with a couple of wrapper functions). Because of this, I'm strongly leaning towards eliminating the instruction, but haven't yet.

> My opinion is that we win in having less complex optimizers.

Definitely. Less complex optimizers are more likely to be correct, too.

> > Note that this representation is equivalent to how RTL stores the Mode
> > information in all of the "values" it uses. Here, however, the type
> > system is much more powerful (and useful). My key point is that this
> > information is kept IN the representation itself, not along-side in a
> > table
>
> same thing happens in tree representation where symbol table is stored in
> DECL_STMTs directly in the tree.

Ah yes, you're right. I'm glad. :) The next thing that I need to convince you of is the need for a language independent type system. :)

> > This is great, but it doesn't really answer how we get a CFG out of this.
> > The simplest CFG interface can tell you all of the incoming and outgoing
> > edges of a block. To do this, we have iterators that wrap the following
> > [trivial] analyses:
> > the succ_iterator & pred_iterator
> Yep that's a nice interface for handling CFG nodes.
> I saw something like this also in open64.
> It's so simple to just say: iterate over all BB in topological order
> and just write "for (topological_CFG::iterator i = ..."
> :-)

Definitely. We have all kinds of nice things like:

  // traverse flow graph in DFO
  for (df_iterator<BasicBlock> I = df_begin(BB)...

  // traverse flow graph in depth first order along the predecessors,
  // instead of successor links.
  for (idf_iterator<BasicBlock> I = idf_begin(BB)...

> > Do you have any idea what kinds of things are possible in simple that you
> > don't think would be possible (or pleasant) in LLVM? I'd like to get an
> > idea if there is something that would be hard to do in LLVM, but is easy
> > in SIMPLE (to figure out how much overlap there is).
> For the moment I see that LLVM can express high level loop structures by
> handling them as side information. This trick can be used also to store
> information about loop induction variables, thus fortran's DO-loops can
> be effectively handled by optimizers in the same way as on tree structures.
> And since you handle only the lowest level loop structures you win in
> having low level complexity optimizers.

Yup, a trick that we use is to have the canonical induction variable for a loop moved to be the first PHI node in the loop header. That way you don't even have to store information on the side for it. Loop transformations just look like this:

  for each loop
    if (isACanonicalIndVar(EntryBB->front()))
      do stuff

Where isACanonicalIndVar just checks to see if the PHI node is formed with an incoming constant and an add instruction (which is very fast).

> [flat trees]
> LLVM could be obtained by goto introduction from SIMPLE
> break, continue -> br
> for, while, do -> br
> switch -> switch ("an extended br")
>
> But then you have to construct loop information before you flatten statements.

LLVM computes loop information from dominator information, which it computes directly from the LLVM itself. Loop information is not required to read and write LLVM; it's one of several analyses that can be computed on the side.

> > I'm really sorry for the length of this mail, but I think it helps explain
> > a bit of the philosophy behind LLVM...
> >
> thanks a lot for contributing ideas :-)

No problem.
:) Now I just have to convince everyone of the utility and advantage of making the IR a full-fledged language in its own right, which is self-contained. :)

-Chris

http://llvm.cs.uiuc.edu/
http://www.nondot.org/~sabre/Projects/

^ permalink raw reply	[flat|nested] 39+ messages in thread
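The canonical-induction-variable check sketched in the message above can be modeled concretely. The C++ below is hypothetical (Instruction, Constant, AddInst, and PhiNode are invented stand-ins, and the real check would do more): it merely tests whether a phi merges an incoming constant (the initial value) with an add instruction that steps the phi itself.

```cpp
#include <vector>

// Invented stand-ins for illustration; not LLVM's classes.
struct Instruction { virtual ~Instruction() {} };

struct Constant : Instruction {
  long value;
  explicit Constant(long v) : value(v) {}
};

struct AddInst : Instruction {
  Instruction *lhs, *rhs;
  AddInst(Instruction* l, Instruction* r) : lhs(l), rhs(r) {}
};

struct PhiNode : Instruction {
  std::vector<Instruction*> incoming;  // one value per predecessor edge
};

// True if the phi merges an incoming constant with an add instruction that
// increments the phi itself -- the shape of
//   %i      = phi [ 1, %entry ], [ %i.next, %latch ]
//   %i.next = add %i, 1
bool isACanonicalIndVar(const PhiNode* phi) {
  if (phi->incoming.size() != 2) return false;
  for (int i = 0; i < 2; ++i) {
    Constant* c = dynamic_cast<Constant*>(phi->incoming[i]);
    AddInst* a = dynamic_cast<AddInst*>(phi->incoming[1 - i]);
    if (c && a && (a->lhs == phi || a->rhs == phi))
      return true;
  }
  return false;
}
```

Because the candidate phi sits first in the loop header by convention, a transformation can run this cheap structural test directly, with no side table to consult or keep up to date.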
* Re: [tree-ssa] AST optimizer in C++? 2002-08-26 8:52 ` Pop Sébastian 2002-08-26 9:17 ` Chris Lattner @ 2002-08-26 21:18 ` Daniel Berlin 2002-08-27 1:17 ` Pop Sébastian ` (2 more replies) 1 sibling, 3 replies; 39+ messages in thread From: Daniel Berlin @ 2002-08-26 21:18 UTC (permalink / raw) To: Pop Sébastian; +Cc: Chris Lattner, dnovillo, gcc ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [tree-ssa] AST optimizer in C++? 2002-08-26 21:18 ` Daniel Berlin @ 2002-08-27 1:17 ` Pop Sébastian 2002-08-27 6:28 ` Diego Novillo ` (2 more replies) 2002-08-27 6:44 ` Discussion about a new development plan? (was: Re: [tree-ssa] ASToptimizer in C++?) Steven Bosscher 2002-08-27 8:29 ` [tree-ssa] AST optimizer in C++? Chris Lattner 2 siblings, 3 replies; 39+ messages in thread From: Pop Sébastian @ 2002-08-27 1:17 UTC (permalink / raw) To: Daniel Berlin; +Cc: Chris Lattner, dnovillo, gcc

On Tue, Aug 27, 2002 at 12:18:31AM -0400, Daniel Berlin wrote:
>
> I've yet to deal with the issue of a Tree class (either by conversion,
> or abstraction).

Keeping the original tree structure as it is avoids having to retranslate the representation back to trees once we're finished, so I'm for abstraction. What I think we should avoid is dealing with the tree structure directly; we should build an interface that abstracts the existing structures (maybe the tree structures will change).

> Though if you guys want to play with a C++ based AST optimizer, i'll
> throw it on a branch or something

That's an excellent idea. What about including it in the tree-ssa branch?

^ permalink raw reply	[flat|nested] 39+ messages in thread
* Re: [tree-ssa] AST optimizer in C++? 2002-08-27 1:17 ` Pop Sébastian @ 2002-08-27 6:28 ` Diego Novillo 2002-08-27 7:08 ` Steven Bosscher 2002-08-27 8:37 ` Daniel Berlin 2002-08-27 12:21 ` Per Bothner 2 siblings, 1 reply; 39+ messages in thread From: Diego Novillo @ 2002-08-27 6:28 UTC (permalink / raw) To: Pop Sébastian; +Cc: Daniel Berlin, Chris Lattner, gcc

On Tue, 27 Aug 2002, Pop Sébastian wrote:

> > Though if you guys want to play with a C++ based AST optimizer, i'll
> > throw it on a branch or something
>
> That's an excellent idea. What about including it in the tree-ssa branch?
>
I'd rather do it separately. This is not directly related to the goals we have on this branch and might create quite a bit of disruption.

Diego.

^ permalink raw reply	[flat|nested] 39+ messages in thread
* Re: [tree-ssa] AST optimizer in C++? 2002-08-27 6:28 ` Diego Novillo @ 2002-08-27 7:08 ` Steven Bosscher 0 siblings, 0 replies; 39+ messages in thread From: Steven Bosscher @ 2002-08-27 7:08 UTC (permalink / raw) To: Diego Novillo; +Cc: Pop Sébastian, Daniel Berlin, gcc

On Tue, 2002-08-27, at 15:25, Diego Novillo wrote:
> On Tue, 27 Aug 2002, Pop Sébastian wrote:
>
> > > Though if you guys want to play with a C++ based AST optimizer, i'll
> > > throw it on a branch or something
> >
> > That's an excellent idea. What about including it in the tree-ssa branch?
> >
> I'd rather do it separately. This is not directly related to
> the goals we have on this branch and might create quite a bit of
> disruption.

There used to be a time when the tree-ssa branch was the ast-optimizer branch ;-)

Anyway, I, too, would like to see the code and maybe play with it a bit. Daniel, can't you post it as a big patch? It may be a little early to create a whole new branch for something like this...

Greetz
Steven

^ permalink raw reply	[flat|nested] 39+ messages in thread
* Re: [tree-ssa] AST optimizer in C++? 2002-08-27 1:17 ` Pop Sébastian 2002-08-27 6:28 ` Diego Novillo @ 2002-08-27 8:37 ` Daniel Berlin 2002-08-27 12:21 ` Per Bothner 2 siblings, 0 replies; 39+ messages in thread From: Daniel Berlin @ 2002-08-27 8:37 UTC (permalink / raw) To: Pop Sébastian; +Cc: Chris Lattner, dnovillo, gcc

On Tuesday, August 27, 2002, at 04:17 AM, Pop Sébastian wrote:

^ permalink raw reply	[flat|nested] 39+ messages in thread
* Re: [tree-ssa] AST optimizer in C++? 2002-08-27 1:17 ` Pop Sébastian 2002-08-27 6:28 ` Diego Novillo 2002-08-27 8:37 ` Daniel Berlin @ 2002-08-27 12:21 ` Per Bothner 2002-08-27 12:32 ` Gabriel Dos Reis 2 siblings, 1 reply; 39+ messages in thread From: Per Bothner @ 2002-08-27 12:21 UTC (permalink / raw) To: Pop Sébastian; +Cc: gcc

Pop Sébastian wrote:

^ permalink raw reply	[flat|nested] 39+ messages in thread
* Re: [tree-ssa] AST optimizer in C++? 2002-08-27 12:21 ` Per Bothner @ 2002-08-27 12:32 ` Gabriel Dos Reis 2002-08-27 12:51 ` Per Bothner 0 siblings, 1 reply; 39+ messages in thread From: Gabriel Dos Reis @ 2002-08-27 12:32 UTC (permalink / raw) To: Per Bothner; +Cc: Pop Sébastian, gcc

Per Bothner <per@bothner.com> writes:

| Pop Sébastian wrote:
| > Keeping the original tree structure as it is avoids to retranslate the
| > representation back to trees once we're finished. So I'm for abstraction.
| > I think that we should avoid is to deal with tree structure directly, and
| > build an interface that will abstract existent structures (maybe tree
| > structures will change).
|
| Perhaps something like:
|
| class Tree {
|     union tree_node *node;
| };
|
| Or:
|
| class Tree {
|     union tree_common common;
| }

Since the class Tree will be a base class, it is likely that we will be using pointers to Tree exclusively, and then the second declaration saves an unnecessary indirection. And, if we were to be using C++, I'm not convinced that we would like to use "union tree_code", whose purpose is solely to provide a polymorphic container type. But, yes, I understand your point:

| Long term, I'd be in favor of re-writing the entire compiler in C++,
| but as a migration strategy it has to be a *wrapper* around the
| existing tree types, and at least for now must allow bootstrapping
| without an existing C++ compiler. If the experiment looks promising,
| I'd be open to switching the entire compiler to C++, again assuming
| it can be done incrementally, without a total re-write!

Well, I have the impression that there is some sentiment against C++ somewhere... I wish that impression were wrong...

-- Gaby

^ permalink raw reply	[flat|nested] 39+ messages in thread
* Re: [tree-ssa] AST optimizer in C++? 2002-08-27 12:32 ` Gabriel Dos Reis @ 2002-08-27 12:51 ` Per Bothner 2002-08-27 13:02 ` Gabriel Dos Reis 0 siblings, 1 reply; 39+ messages in thread From: Per Bothner @ 2002-08-27 12:51 UTC (permalink / raw) To: Gabriel Dos Reis; +Cc: gcc Gabriel Dos Reis wrote: ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [tree-ssa] AST optimizer in C++? 2002-08-27 12:51 ` Per Bothner @ 2002-08-27 13:02 ` Gabriel Dos Reis 0 siblings, 0 replies; 39+ messages in thread From: Gabriel Dos Reis @ 2002-08-27 13:02 UTC (permalink / raw) To: Per Bothner; +Cc: gcc Per Bothner <per@bothner.com> writes: [...] | In any case, it's not going to happen unless someone comes up with a | good design, and illustrates its use, perhaps with a new (and optional) | optimization. I absolutely agree with you. Thanks, -- Gaby ^ permalink raw reply [flat|nested] 39+ messages in thread
* Discussion about a new development plan? (was: Re: [tree-ssa] ASToptimizer in C++?) 2002-08-26 21:18 ` Daniel Berlin 2002-08-27 1:17 ` Pop Sébastian @ 2002-08-27 6:44 ` Steven Bosscher 2002-08-27 7:09 ` Discussion about a new development plan? (was: Re: [tree-ssa] AST optimizer " Jan Hubicka 2002-08-27 8:29 ` [tree-ssa] AST optimizer in C++? Chris Lattner 2 siblings, 1 reply; 39+ messages in thread From: Steven Bosscher @ 2002-08-27 6:44 UTC (permalink / raw) To: gcc

Hello,

Many Great New Improvements for GCC have been discussed lately on this mailing list. Maybe it's time to devise a sane development plan, to see what is feasible for what release, and in what time frame. I know this isn't quite how Free Software is usually developed. But since GCC is an essential tool for almost everybody in the Open Source world, I think people will appreciate it if they know what to expect (without making any promises, of course ;-).

ONGOING WORK
(most of you already know this, so skip it or correct me)

Right now there are at least four ongoing projects that involve lots of infrastructure changes or major rewrites:

1) The new (tree-based?) optimizer framework that people are working on in the tree-ssa branch.
2) The other "major rework" project on the cfg-branch, where Jan Hubicka and others were working on loop optimizers and midRTL.
3) The new, hand-crafted C++ parser on the cp-parser branch.
4) PCH work on the pch-branch. One guy, many improvements :-)

(BTW it would be nice to see more regular status updates for each branch, hint hint...)

One would guess this would require all resources available for GCC development (think `people' and `time' here, not money). But this week, two new branches were announced. That means that GCC hackers now have 6 branches plus the mainline to deal with. Isn't that going to be a potential branch maintainer nightmare?
To make things worse, many invasive changes have been discussed on this mailing list lately, most notably the idea of doing at least some of the optimizers in C++, which will probably require new bootstrap infrastructure. That's the kind of thing for a new major release, I would think... Add to that the C++ ABI problems and hopefully a test suite to solve those, the plan to move the C front end to its own directory, the conversion from K&R to ISO/ANSI C, etc...

In short: GCC is now more than ever a moving target :-( That's bad for distro builders, it's bad for the gcc users community, and it's bad for GCC developers as well! For example, the GNU Pascal compiler still would not even compile as part of GCC 3.0 last time I looked (GC work isn't done yet). I suppose most people would agree that it'd be nice to have a stable GCC release branch for a while, maybe with some bug fixes and compilation time improvements, but not too many changes.

NEW DEVELOPMENT PLAN (crude)

So maybe we could discuss a new development plan. The old one doesn't say anything about what's beyond GCC 3.3, so this seems like the right moment. Here are my ideas:

I tried to divide the ongoing projects into three categories:

a) suitable for a relatively quick merge:
- PCH work. Most of the infrastructure is already there, and (correct me if I'm wrong) the remaining work should not affect anything in the middle-end or back-end.
- Most of what's on the cfg-branch now.
- Anything on the gcc-3_4-basic-improvements-branch, by definition.

b) long-term, invasive reworks and massive infrastructure changes
- faster-compiler stuff. It would seem this could easily cause conflicts with the tree-ssa stuff, so maybe it's better to put them together on a single branch.
- tree-ssa-branch. This is starting to look more and more like a complete new middle-end.
- More builtin folding (and/or expand BUILTIN_FRONTEND to trees instead of RTL?)
- All ISO C for source code not required for bootstrapping.
- New C++ parser, and converting other front ends to function-at-a-time mode
- Bug-free (*cough*) C++ ABI, with testing framework

c) very long term ideas
- Cross-file interprocedural optimizations and the like. Most of this obviously is related to items under (b), but it seems far more ambitious.
- auto-vectorizing pass (as part of the middle-end optimizers)

With these categories in mind, you could say: OK, stuff under (a) is what we want in the 3.x release series, stuff under (b) is for GCC 4.0, and (c) is for 4.x (x>0) and beyond. This means that distro builders/users can rely on GCC producing binary compatible code for C++ for all GCC 3.x compilers. It also means that no GCC 3.x will be much faster than what we have now, but I'm not sure how big an issue this really is, especially once PCH is in.

Once it has been decided how the new optimizers should be implemented (language, IR, etc), and after 3.3 has been branched, we could branch a "middle-end-branch". The mainline would be the base code for 3.4, 3.5, 3.6, and maybe 3.7, depending on how well things would work out on the middle-end-branch. Assuming one minor release every six months, GCC 3.6 would be released in April 2004. Anything that's on the tree-ssa and faster-compiler branches by then would be merged onto this new middle-end-branch. Work on this branch would include:

1) Move the C front end to its own directory and move stuff from c-common.* to, say, tree-ir.*. (Maybe this should be done on the "stable" branch as well to avoid huge merge conflicts???)
2) Figure out how to bootstrap with part of the sources written in C++
3a) Implement the new middle-end, using C++ and ISO C (with backports to the mainline and release branch to avoid merge conflicts), and
3b) Convert the existing front ends to function-at-a-time mode to allow them to actually use this infrastructure.
4) Include new front ends such as g95 and GNU Pascal??
Actually, (3b) may be relevant only to GNU Pascal and GNAT, unless somebody with serious brain damage stands up and volunteers to rewrite g77. It may be better to concentrate on g95 instead; it already depends on much of what is in the tree-ssa branch today, and it actually produces code (well, for some programs at least).

As is being actively discussed right now, some optimizers will probably be written in C++, so a stable g++ would be crucial for the work on the middle-end-branch. Therefore, any major C++ work should be done on a separate branch. The new C++ parser and all ABI fixes would go on the cp-parser branch (renamed future-cxx branch by then?), and fixes would be merged to the middle-end branch only after thorough testing.

Assuming the C++ parser can be finished within one and a half years (CodeSourcery is working on it again, right?) and the work on the middle-end progresses at the same speed it has over the last half year, a GCC 4.0 alpha could be ready at about the same time you'd release 3.6. Unfortunately nobody can tell how realistic this timeframe is. I suppose I'm being a bit optimistic...

I would certainly like to hear your opinions ;-) Sorry for the long mail. I just hope this gets a discussion going...

Greetz
Steven

^ permalink raw reply	[flat|nested] 39+ messages in thread
* Re: Discussion about a new development plan? (was: Re: [tree-ssa] AST optimizer in C++?) 2002-08-27 6:44 ` Discussion about a new development plan? (was: Re: [tree-ssa] ASToptimizer in C++?) Steven Bosscher @ 2002-08-27 7:09 ` Jan Hubicka 0 siblings, 0 replies; 39+ messages in thread From: Jan Hubicka @ 2002-08-27 7:09 UTC (permalink / raw) To: Steven Bosscher; +Cc: gcc > ONGOING WORK > (most of you already know this, so skip it or correct me) > > Right now there are at least four projects ongoing that involve lots of > infrastructure changes or major rewrites: > 1) The new (tree-based?) optimizer framework that people are working > on in the tree-ssa branch. > 2) The other "major rework" project on the cfg-branch, where Jan Hubicka > and others were working on loop optimizers and midRTL. Concerning cfg-branch, I plan to create (in fact I've already done so) a new branch called rtlopt-branch, where RTL based optimizers in a shape ready for GCC merging (i.e. not breaking many architectures and provably improving code quality) should go. I plan to move the optimizers from cfg-branch there so we won't slip the release schedule again as happened previously, and plan to move forward in the direction of removing the loop and cse optimizers and replacing them with something sane. I haven't included any code in the rtlopt branch nor announced it yet, as I would like to get daily benchmarking of it working first; the machine doing so has problems, and Andreas is working on it. I also plan to create a midrtl-branch in the near future and place there the work on midrtl I have done so far. 
In the first stage I would aim for code mergeable into mainline in the near future that just cleans up the insn-* interface and virtualises it, so multiple machine descriptions can be compiled into the compiler (it works for me already to some extent - though insn-attrtab and friends are not virtualised, as that is not needed for the midrtl project). Then I would like to move in the direction of midrtl, once the AST matures enough that we can clean up the RTL generation enough to make this possible. Honza ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [tree-ssa] AST optimizer in C++? 2002-08-26 21:18 ` Daniel Berlin 2002-08-27 1:17 ` Pop Sébastian 2002-08-27 6:44 ` Discussion about a new development plan? (was: Re: [tree-ssa] ASToptimizer in C++?) Steven Bosscher @ 2002-08-27 8:29 ` Chris Lattner 2 siblings, 0 replies; 39+ messages in thread From: Chris Lattner @ 2002-08-27 8:29 UTC (permalink / raw) To: Daniel Berlin; +Cc: Pop Sébastian, dnovillo, gcc On Tue, 27 Aug 2002, Daniel Berlin wrote: > > Yep that's a nice interface for handling CFG nodes. > > I saw something like this also in open64. > > It's so simple to just say: iterate over all BB in topological order > > and just write "for (topological_CFG::iterator i = ..." > > :-) > Just an FYI, I reimplemented from scratch (so i wouldn't have to deal > with any revealing of code i've been asked not to reveal issues) the > graph_traits stuff, and their callgraph stuff, in a C++ based AST > optimizer. Nifty. :) > I've yet to deal with the issue of a Tree class (either by conversion, > or abstraction). > Too lazy. > It's just something i play with from time to time. > Though if you guys want to play with a C++ based AST optimizer, i'll > throw it on a branch or something (It can construct callgraphs > incrementally, and a few other things). That's very interesting! What do you think the likelihood of GCC using a C++ based optimizer is? I think everyone will acknowledge that writing it in C++ would make it much nicer to deal with, but do you think the extra bootstrap stage (if no native C++ compiler is available) would be tolerated? -Chris http://llvm.cs.uiuc.edu/ http://www.nondot.org/~sabre/Projects/ ^ permalink raw reply [flat|nested] 39+ messages in thread
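The iterator style quoted above ("for (topological_CFG::iterator i = ...") boils down to visiting basic blocks in reverse post-order. A minimal sketch of that idea, with a made-up adjacency-list CFG standing in for the real graph_traits interface Daniel describes:

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Hypothetical stand-in for a graph_traits-style CFG: block i's
// successors are cfg[i]; block 0 is the entry.
using CFG = std::vector<std::vector<int>>;

// Reverse post-order DFS from the entry block -- the usual
// "topological" visit order for a CFG that may contain back edges.
std::vector<int> reverse_post_order(const CFG &cfg) {
    std::vector<bool> seen(cfg.size(), false);
    std::vector<int> order;
    std::function<void(int)> dfs = [&](int bb) {
        seen[bb] = true;
        for (int succ : cfg[bb])
            if (!seen[succ]) dfs(succ);
        order.push_back(bb);           // emit a block after its successors
    };
    dfs(0);
    std::reverse(order.begin(), order.end());
    return order;
}
```

For a diamond CFG (0 branches to 1 and 2, which both fall through to 3), this puts the entry first and the join block last, which is exactly the visit order forward dataflow passes want.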
* Re: [tree-ssa] AST optimizer in C++? 2002-08-25 9:48 ` [tree-ssa] " Pop Sébastian 2002-08-25 14:33 ` Chris Lattner @ 2002-08-25 15:35 ` Diego Novillo 2002-08-25 15:45 ` Chris Lattner 1 sibling, 1 reply; 39+ messages in thread From: Diego Novillo @ 2002-08-25 15:35 UTC (permalink / raw) To: Pop Sébastian; +Cc: Chris Lattner, gcc On Sun, 25 Aug 2002, Pop Sébastian wrote: > A good idea I saw in the LLVM is that the SSA information is > stored directly in the operations stream: i.e. (from > LLVMCompilationStrategy.pdf) <result> = phi <type> [<val0>, > <label0>], ..., [<valN>, <labelN>] > > Diego, maybe it is worth including PHI nodes directly as a > SIMPLE extension? > I am not too keen on SSA re-writing schemes. When you encode SSA into your IR, you have the added aggravation of having to re-write the code twice. Once to convert it into SSA form and the second time to convert it out of SSA form. The approach we use now does not interfere with the underlying IR. What it does is overlay a web of def-use/use-def pointers on top of the CFG. Each of these def/use/phi references are associated with a specific operand, expression, statement and basic block. This means that building the SSA form is not too expensive (no IR re-writing) and once you're done, you just toss the data away. In any case, I'm always ready to be convinced about the benefits of other approaches. We're in the early stages, and are just now starting to use the infrastructure to write optimizers. As we gain experience with it we'll be able to compare different approaches. > This could solve the problem with the tree_common.aux field... > Hmm, what problem? The SSA form does not use that field (directly). The field is used to annotate trees with various tidbits of information like what block they belong to, what variable references are made by it, etc. Diego. 
^ permalink raw reply [flat|nested] 39+ messages in thread
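The overlay Diego describes above - SSA kept as a web of def-use/use-def pointers beside an untouched IR - can be pictured with a toy side table. All names and the statement encoding here are invented for illustration; the real annotations hang off operands, statements, and basic blocks:

```cpp
#include <map>
#include <string>
#include <vector>

// Toy statement: defines at most one variable and uses at most one;
// an empty string means "none". The statements themselves never change.
struct Stmt { std::string def, use; };

// The SSA "annotation" layer: a side table of def/use links keyed by
// variable name, built over the statement list and discardable at will.
struct SSAWeb {
    std::map<std::string, int> def_of;                // var -> defining stmt
    std::map<std::string, std::vector<int>> uses_of;  // var -> using stmts
};

SSAWeb build_web(const std::vector<Stmt> &stmts) {
    SSAWeb web;
    for (int id = 0; id < (int)stmts.size(); ++id) {
        if (!stmts[id].def.empty()) web.def_of[stmts[id].def] = id;
        if (!stmts[id].use.empty()) web.uses_of[stmts[id].use].push_back(id);
    }
    return web;
}
```

When a pass that knows nothing about SSA runs, the side table simply goes stale and is rebuilt afterwards; the statements were never rewritten, which is the point of the argument.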
* Re: [tree-ssa] AST optimizer in C++? 2002-08-25 15:35 ` Diego Novillo @ 2002-08-25 15:45 ` Chris Lattner 2002-08-25 16:08 ` Diego Novillo 2002-08-25 16:12 ` Daniel Berlin 0 siblings, 2 replies; 39+ messages in thread From: Chris Lattner @ 2002-08-25 15:45 UTC (permalink / raw) To: Diego Novillo; +Cc: Pop Sébastian, gcc On Sun, 25 Aug 2002, Diego Novillo wrote: > On Sun, 25 Aug 2002, Pop Sébastian wrote: > > Diego, maybe it is worth to include PHI nodes directly as a > > SIMPLE extension? > > > I am not too keen on SSA re-writing schemes. When you encode SSA > into your IR, you have the added aggravation of having to > re-write the code twice. Once to convert it into SSA form and > the second time to convert it out of SSA form. Here's one possible way to look at this issue: SIMPLE should always be in SSA form. When SIMPLE is initially constructed, SSA form is built. When SIMPLE is lowered to RTL, UnSSA is run, or RTL in SSA form could even be created and the existing RTL UnSSA pass could be used eventually. > The approach we use now does not interfere with the underlying > IR. What it does is overlay a web of def-use/use-def pointers on Is there a reason that you'd like to have a SIMPLE form that is not in SSA? -Chris http://llvm.cs.uiuc.edu/ http://www.nondot.org/~sabre/Projects/ ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [tree-ssa] AST optimizer in C++? 2002-08-25 15:45 ` Chris Lattner @ 2002-08-25 16:08 ` Diego Novillo 2002-08-25 16:19 ` Diego Novillo 2002-08-26 8:35 ` Chris Lattner 2002-08-25 16:12 ` Daniel Berlin 1 sibling, 2 replies; 39+ messages in thread From: Diego Novillo @ 2002-08-25 16:08 UTC (permalink / raw) To: Chris Lattner; +Cc: Pop Sébastian, gcc On Sun, 25 Aug 2002, Chris Lattner wrote: > > The approach we use now does not interfere with the underlying > > IR. What it does is overlay a web of def-use/use-def pointers on > > Is there a reason that you'd like to have a SIMPLE form that is not in > SSA? > Only that nobody has bothered implementing it. We'd need to represent PHI nodes in the IR and teach the rest of the compiler how to deal with them (RTL expanders, tree manipulation, etc). In principle it would be a rather mechanical, albeit large, patch. Diego. ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [tree-ssa] AST optimizer in C++? 2002-08-25 16:08 ` Diego Novillo @ 2002-08-25 16:19 ` Diego Novillo 2002-08-26 8:39 ` Chris Lattner 2002-08-26 8:35 ` Chris Lattner 1 sibling, 1 reply; 39+ messages in thread From: Diego Novillo @ 2002-08-25 16:19 UTC (permalink / raw) To: Chris Lattner; +Cc: Pop Sébastian, gcc On Sun, 25 Aug 2002, Diego Novillo wrote: > On Sun, 25 Aug 2002, Chris Lattner wrote: > > > Is there a reason that you'd like to have a SIMPLE form that is not in > > SSA? > > > Only that nobody has bothered implementing it. We'd need to > Well, that and the fact that by encoding SSA into the IR you're now forcing every transformation to know about SSA. Not all of them do. In principle I don't see why dataflow and IR should be tied to each other. Diego. ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [tree-ssa] AST optimizer in C++? 2002-08-25 16:19 ` Diego Novillo @ 2002-08-26 8:39 ` Chris Lattner 2002-08-26 17:43 ` Daniel Berlin 0 siblings, 1 reply; 39+ messages in thread From: Chris Lattner @ 2002-08-26 8:39 UTC (permalink / raw) To: Diego Novillo; +Cc: Pop Sébastian, gcc On Sun, 25 Aug 2002, Diego Novillo wrote: > On Sun, 25 Aug 2002, Diego Novillo wrote: > > On Sun, 25 Aug 2002, Chris Lattner wrote: > > > Is there a reason that you'd like to have a SIMPLE form that is not in > > > SSA? > > Only that nobody has bothered implementing it. We'd need to > Well, that and the fact that by encoding SSA into the IR you're > now forcing every transformation to know about SSA. Not all of > them do. In principle I don't see why dataflow and IR should be > tied to each other. In practice, I have found that it is _very_ easy to extend traditional transformations to use SSA. The one exception I've found is the SSA-PRE algorithm, which you need to wrap your mind around before it makes complete sense. Once done, however, it's not a huge deal to implement it. IMHO, the advantages of using SSA and having it available (for both convenience, optimizer efficiency, and more powerful algorithms than data flow can provide at times) greatly out-weigh the disadvantages of having to update it. What kinds of transformations do you have in mind that would be inconvenient to implement in SSA form? -Chris http://llvm.cs.uiuc.edu/ http://www.nondot.org/~sabre/Projects/ ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [tree-ssa] AST optimizer in C++? 2002-08-26 8:39 ` Chris Lattner @ 2002-08-26 17:43 ` Daniel Berlin 2002-08-26 18:12 ` Chris Lattner 0 siblings, 1 reply; 39+ messages in thread From: Daniel Berlin @ 2002-08-26 17:43 UTC (permalink / raw) To: Chris Lattner; +Cc: Diego Novillo, Pop Sébastian, gcc On Mon, 26 Aug 2002, Chris Lattner wrote: > On Sun, 25 Aug 2002, Diego Novillo wrote: > > On Sun, 25 Aug 2002, Diego Novillo wrote: > > > On Sun, 25 Aug 2002, Chris Lattner wrote: > > > > Is there a reason that you'd like to have a SIMPLE form that is not in > > > > SSA? > > > > Only that nobody has bothered implementing it. We'd need to > > > Well, that and the fact that by encoding SSA into the IR you're > > now forcing every transformation to know about SSA. Not all of > > them do. In principle I don't see why dataflow and IR should be > > tied to each other. > > In practice, I have found that it is _very_ easy to extend traditional > transformations to use SSA. The one exception I've found is the SSA-PRE > algorithm, which you need to wrap your mind around before it makes > complete sense. Once done, however, it's not a huge deal to implement it. > Which is why I did so already. > IMHO, the advantages of using SSA and having it available (for both > convenience, optimizer efficiency, and more powerful algorithms than data > flow can provide at times) greatly out-weigh the disadvantages of having > to update it. We already have it available, and used. Nobody is saying we shouldn't use SSA algorithms when possible, or convert algorithms to SSA. We aren't talking about whether to use SSA or not, we are talking about whether to keep it as annotations, or encode it into the IR. Please address the benefits of *that*, not of SSA in general. > > What kinds of transformations do you have in mind that would be > inconvenient to implement in SSA form? That's not quite relevant. 
The real question is: What does encoding it directly into the IR, rather than doing it as we do now, buy us, that makes it worth it, given the disadvantages? ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [tree-ssa] AST optimizer in C++? 2002-08-26 17:43 ` Daniel Berlin @ 2002-08-26 18:12 ` Chris Lattner 0 siblings, 0 replies; 39+ messages in thread From: Chris Lattner @ 2002-08-26 18:12 UTC (permalink / raw) To: Daniel Berlin; +Cc: Diego Novillo, Pop Sébastian, gcc > > IMHO, the advantages of using SSA and having it available (for both > > convenience, optimizer efficiency, and more powerful algorithms than data > > flow can provide at times) greatly out-weigh the disadvantages of having > > to update it. > > We already have it available, and used. > Nobody is saying we shouldn't use SSA algorithms when possible, or convert > algorithms to SSA. > We aren't talking about whether to use SSA or not, we are talking about > whether to keep it as annotations, or encode it into the IR. > Please address the benefits of *that*, not of SSA in general. See: http://gcc.gnu.org/ml/gcc/2002-08/msg01658.html > > What kinds of transformations do you have in mind that would be > > inconvenient to implement in SSA form? > > That's not quite relevant. > The real question is: > What does encoding it directly into the IR, rather than doing is as we do > now, buy us, that makes it worth it, given the disadvantages? Ditto. -Chris http://llvm.cs.uiuc.edu/ http://www.nondot.org/~sabre/Projects/ ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [tree-ssa] AST optimizer in C++? 2002-08-25 16:08 ` Diego Novillo 2002-08-25 16:19 ` Diego Novillo @ 2002-08-26 8:35 ` Chris Lattner 1 sibling, 0 replies; 39+ messages in thread From: Chris Lattner @ 2002-08-26 8:35 UTC (permalink / raw) To: Diego Novillo; +Cc: Pop Sébastian, gcc On Sun, 25 Aug 2002, Diego Novillo wrote: > On Sun, 25 Aug 2002, Chris Lattner wrote: > > > The approach we use now does not interfere with the underlying > > > IR. What it does is overlay a web of def-use/use-def pointers on > > > > Is there a reason that you'd like to have a SIMPLE form that is not in > > SSA? > Only that nobody has bothered implementing it. We'd need to > represent PHI nodes in the IR and teach the rest of the compiler > how to deal with them (RTL expanders, tree manipulation, etc). > In principle it would be a rather mechanic, albeit large, patch. In principle, you could get away from teaching all of the backends and their RTL expanders about PHI nodes by simply simulating them by putting an assignment in the appropriate predecessor blocks. Performing UnSSA on a list-of-instructions representation is really quite simple and could be performed at the same time as RTL expansion. -Chris http://llvm.cs.uiuc.edu/ http://www.nondot.org/~sabre/Projects/ ^ permalink raw reply [flat|nested] 39+ messages in thread
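The copy-insertion scheme Chris sketches - simulate each PHI with a plain assignment at the end of every predecessor block - looks roughly like this. The toy IR and names are hypothetical; a real UnSSA pass additionally has to split critical edges and order copies to handle the lost-copy and swap problems:

```cpp
#include <map>
#include <string>
#include <vector>

// Toy basic block: just a list of textual instructions.
using Block = std::vector<std::string>;

// A phi merges a different incoming value from each predecessor block.
struct Phi {
    std::string target;
    std::map<std::string, std::string> incoming;  // pred name -> value
};

// Lower one phi by appending "target = value" to each predecessor;
// after this runs, the phi itself can simply be deleted.
void lower_phi(const Phi &phi, std::map<std::string, Block> &blocks) {
    for (const auto &in : phi.incoming)
        blocks[in.first].push_back(phi.target + " = " + in.second);
}
```

Done at RTL-expansion time, as suggested above, the back ends never see a PHI at all - only ordinary assignments.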
* Re: [tree-ssa] AST optimizer in C++? 2002-08-25 15:45 ` Chris Lattner 2002-08-25 16:08 ` Diego Novillo @ 2002-08-25 16:12 ` Daniel Berlin 2002-08-26 7:02 ` Pop Sébastian 2002-08-26 8:52 ` Chris Lattner 1 sibling, 2 replies; 39+ messages in thread From: Daniel Berlin @ 2002-08-25 16:12 UTC (permalink / raw) To: Chris Lattner; +Cc: Diego Novillo, Pop Sébastian, gcc > > The approach we use now does not interfere with the underlying > > IR. What it does is overlay a web of def-use/use-def pointers on > > Is there a reason that you'd like to have a SIMPLE form that is not in > SSA? Um, yes. It artificially limits us to that representation. Say later, someone wants to use some other form of SSA, or non-SSA (dependence flow graph, etc). If you make it actually part of the trees, they now have to care about it. If it's just annotations, only what wants to look at those annotations look at it. Plus, there are optimization passes that might not be run on SSA (some loop optimizations). Once again, forcing SSA on the representation itself means they have to care about it (or you've limited when they can be run), while annotations have no such limitations. There is no good reason to actually make it part of the form, other than having to insert copies yourself. But all algorithms i've seen that are written to keep SSA form up to date do this anyway, or make it easy to do. --Dan ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [tree-ssa] AST optimizer in C++? 2002-08-25 16:12 ` Daniel Berlin @ 2002-08-26 7:02 ` Pop Sébastian 2002-08-26 8:28 ` Steven Bosscher 2002-08-26 8:52 ` Chris Lattner 1 sibling, 1 reply; 39+ messages in thread From: Pop Sébastian @ 2002-08-26 7:02 UTC (permalink / raw) To: Daniel Berlin; +Cc: Chris Lattner, Diego Novillo, gcc On Sun, Aug 25, 2002 at 07:12:03PM -0400, Daniel Berlin wrote: > Plus, there are optimization passes that might not be run on SSA (some > loop optimizations). > > Once again, forcing SSA on the representation itself means they have to > care about it (or you've limited when they can be run), while annotations > have no such limitations. > okay. I get the idea. > There is no good reason to actually make it part of the form, I'd also like us to examine the case of DECL_STMTs, which seem to fall under the same pattern. For now we have a symbol table that is spread all over the tree. DECL_ nodes define entries in this symbol table. Isn't a "stand alone" symbol table a better representation for storing this information? It's probably a good way to store variable declarations in C-like languages, but what about Fortran and other languages? ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [tree-ssa] AST optimizer in C++? 2002-08-26 7:02 ` Pop Sébastian @ 2002-08-26 8:28 ` Steven Bosscher 0 siblings, 0 replies; 39+ messages in thread From: Steven Bosscher @ 2002-08-26 8:28 UTC (permalink / raw) To: Pop Sébastian; +Cc: gcc On Mon, 26 Aug 2002, at 16:02, Pop Sébastian wrote: > On Sun, Aug 25, 2002 at 07:12:03PM -0400, Daniel Berlin wrote: > > Plus, there are optimization passes that might not be run on SSA (some > > loop optimizations). > > > > Once again, forcing SSA on the representation itself means they have to > > care about it (or you've limited when they can be run), while annotations > > have no such limitations. > > > okay. I get the idea. > > > There is no good reason to actually make it part of the form, > > I'd like we examine also the case of DECL_STMTs that seems to fall under the > same pattern. > For now we have a symbol table that spreads all over the tree. DECL_ nodes > define entries in this symbol table. Isn't a "stand alone" symbol table a better > representation for storing this information? It's probably a good way to store > variable declarations in C-like languages, but what about fortran and other > languages? What do you mean by "stand alone"? You mean, one big separate table for, say, each function? I don't see how you would declare a local entity with such a symbol table. Why would there be any difference between C and other languages? Greetz Steven ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [tree-ssa] AST optimizer in C++? 2002-08-25 16:12 ` Daniel Berlin 2002-08-26 7:02 ` Pop Sébastian @ 2002-08-26 8:52 ` Chris Lattner 2002-08-26 17:40 ` Daniel Berlin 1 sibling, 1 reply; 39+ messages in thread From: Chris Lattner @ 2002-08-26 8:52 UTC (permalink / raw) To: Daniel Berlin; +Cc: Diego Novillo, Pop Sébastian, gcc On Sun, 25 Aug 2002, Daniel Berlin wrote: > > Is there a reason that you'd like to have a SIMPLE form that is not in > > SSA? > > Um, yes. > It artificially limits us to that representation. > Say later, someone wants to use some other form of SSA, or non-SSA > (dependence flow graph, etc). > If you make it actually part of the trees, they now have to care about > it. > If it's just annotations, only what wants to look at those annotations > look at it. Actually, that's not true at all. Using SSA as the base representation is just like using a conventional register based underlying representation: you can still layer other representations ON TOP of SSA, you just need to keep the SSA up-to-date as you transform the layered representation (just as you must do with the register based representation). In practice, there are several simple extensions to SSA that are representable as direct extensions to the representation [eta nodes, pi nodes from the ABCE paper, etc], which don't require a layered representation. Other representations are really major extensions of SSA which are enhanced by having SSA as the base representation. Of course, I'm sure there are others I'm not familiar with, but worst case, you just have to keep the base representation up to date with the layered representation, which is nothing different than the current situation with SSA. The key difference with using SSA as the base representation, however, is that almost ALL transformations benefit from it. 
SSA will probably be the _most_ commonly used representation when the tree-ssa branch is finally mature (hence the name), beating out the register based representation by a lot. I'm not really sure when and if other representations would actually be useful, and if their usefulness is worth the cost of creating and maintaining the representation: whether layered on top of registers or SSA. > Plus, there are optimization passes that might not be run on SSA (some > loop optimizations). Loop optimizations generally work wonderfully on SSA form. The PHI nodes encapsulate a lot of the information you need directly in the code, which is _quite_ convenient. > Once again, forcing SSA on the representation itself means they have to > care about it (or you've limited when they can be run), while annotations > have no such limitations. Sure, of course. If, however, they don't understand the annotations, they will be corrupted by the pass. This means that you have to destroy and regenerate the SSA annotations. This is very expensive, which means that the transformation will be updated to use SSA anyway (often making it more efficient). While forcing SSA from the start makes it harder to transfer legacy code, I think this is actually a good thing for a couple of reasons: 1. The code should be rewritten anyway, because the IR is different, there are different tradeoffs, and different cases to handle. 2. Writing a transformation taking advantage of SSA often _greatly_ simplifies the code. Again, I can show many examples of simplified transformations. 3. Destroying and regenerating SSA information is quite expensive when it's unnecessary. Keeping track of whether or not SSA is created is also painful, hurting modularity. > There is no good reason to actually make it part of the form, other than > having to insert copies yourself. But all algorithms i've seen that are > written to keep SSA form up to date do this anyway, or make it easy to do. 
I'm not sure if I understand what you're saying here... it seems as though you are contradicting yourself by saying that SSA _is_ easy to keep up-to-date (which I believe it is, of course :) What am I misinterpreting? -Chris http://llvm.cs.uiuc.edu/ http://www.nondot.org/~sabre/Projects/ ^ permalink raw reply [flat|nested] 39+ messages in thread
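One concrete instance of PHI nodes "encapsulating the information you need" for loop optimizations: a canonical induction variable is directly visible in SSA as the pair i = phi(init, i.next) in the loop header with i.next = i + step in the body. A toy sketch of exploiting that shape - the struct is invented for illustration; real code would pattern-match actual IR nodes:

```cpp
#include <optional>

// Toy summary of a loop-header phi whose back-edge input is
// "phi-result + constant", i.e. a canonical induction variable:
//     i      = phi(init, i_next)   // loop header
//     i_next = i + step            // loop body
struct HeaderPhi {
    int init;
    bool backedge_adds_self;  // back-edge value is phi + constant?
    int step;
};

// Once the phi identifies the induction variable, its value on any
// iteration is a closed form -- no iterative dataflow needed.
std::optional<int> value_at_iteration(const HeaderPhi &p, int n) {
    if (!p.backedge_adds_self)
        return std::nullopt;      // not an induction variable
    return p.init + n * p.step;
}
```

This is why strength reduction and loop-bound analysis read off the phi almost directly instead of chasing reaching definitions.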
* Re: [tree-ssa] AST optimizer in C++? 2002-08-26 8:52 ` Chris Lattner @ 2002-08-26 17:40 ` Daniel Berlin 2002-08-26 17:44 ` Daniel Berlin 2002-08-26 18:11 ` Chris Lattner 0 siblings, 2 replies; 39+ messages in thread From: Daniel Berlin @ 2002-08-26 17:40 UTC (permalink / raw) To: Chris Lattner; +Cc: Diego Novillo, Pop Sébastian, gcc On Mon, 26 Aug 2002, Chris Lattner wrote: > On Sun, 25 Aug 2002, Daniel Berlin wrote: > > > > Is there a reason that you'd like to have a SIMPLE form that is not in > > > SSA? > > > > Um, yes. > > It artificially limits us to that representation. > > Say later, someone wants to use some other form of SSA, or non-SSA > > (dependence flow graph, etc). > > If you make it actually part of the trees, they now have to care about > > it. > > If it's just annotations, only what wants to look at those annotations > > look at it. > > Actually, that's not true at all. Using SSA as the base representation is > just like using a conventional register based underlying representation: > you can still layer other representations ON TOP of SSA, you just need to > keep the SSA up-to-date as you transform the layered representation (just > as you must do with the register based representation). No kidding, which is the problem. > > In practice, there are several simple extensions to SSA that are > representable as direct extensions to the representation [eta nodes, pi > nodes from the ABCE paper, etc], which don't require a layered > representation. Other representations are really major extensions of SSA > which are enhanced by having SSA as the base representation. Not necessarily true at all. But i'm also not saying SSA is a bad base, i just don't think we need to make it the *actual* IR. > > Of course, I'm sure there are others I'm not familiar with, but worst > case, you just have to keep the base representation up to date with the > layered representation, which is nothing different than the current > situation with SSA. 
This has nothing to do with keeping it up to date. It has to do with the fact that other things would need to know about PHI nodes, etc. We can easily just rebuild the SSA form if we want to run a non-SSA pass in between two SSA ones (or whatever). But it's quicker to be able to just *RUN* that pass without having to drop PHI nodes, insert copies, etc, just so that non-SSA pass doesn't get confused. > > The key difference with using SSA as the base representation, however, is > that almost ALL transformations benefit from it. SSA will probably be the > _most_ commonly used representation when the tree-ssa branch is finally > mature (hence the name), beating out the register based representation by > a lot. I'm not really sure when and if other representations would > actually be useful, and if their usefulness is worth the cost of creating > and maintaining the representation: whether layered on top of registers or > SSA. DFG, PDG, etc are generally worth the cost of creating. > > > Plus, there are optimization passes that might not be run on SSA (some > > loop optimizations). > > Loop optimizations generally work wonderfully on SSA form. The PHI nodes > encapsulates a lot of the information you need directly in the code, which > is _quite_ convenient. > > Once again, forcing SSA on the representation itself means they have to > > care about it (or you've limited when they can be run), while annotations > > have no such limitations. > > Sure, of course. If, however, they don't understand the annotations, they > will be corrupted by the pass. This means that you have to destroy and > regenerate the SSA annotations. So? The other option is to not be able to run the pass at all. Why limit ourselves by making sure we can't run that other pass, when we get no actual *benefit* from doing so? > This is greatly expensive, which means > that the transformation will be updated to use SSA anyway (often making it > more efficient). 
While forcing SSA from the start makes it harder to > transfer legacy code, It's not necessarily legacy code. All the world is not SSA. Even if it was, it may not stay that way forever. > simplifies the code. Again, I can show many examples of simplified > transformations. > 3. Destroying and regenerating SSA information is quite expensive when > it's unneccesary. Keeping track of whether or not SSA is created is > also painful, hurting modularity. > > > There is no good reason to actually make it part of the form, other than > > having to insert copies yourself. But all algorithms i've seen that are > > written to keep SSA form up to date do this anyway, or make it easy to do. > > I'm not sure if I understand what you're saying here... it seems as though > you are contradicting yourself by saying that SSA _is_ easy to keep > up-to-date (which I believe it is, of course :) What am I > misinterpreting? You keep missing the actual point. 1. There is no benefit to putting SSA in the trees, as opposed to annotations. 2. There is a cost to putting SSA in the trees, regardless of whether you feel the things that aren't SSA, are worth squat. Thus, given this, *why* do it? What does it buy us? That's what i want to know. ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [tree-ssa] AST optimizer in C++? 2002-08-26 17:40 ` Daniel Berlin @ 2002-08-26 17:44 ` Daniel Berlin 2002-08-26 18:11 ` Chris Lattner 1 sibling, 0 replies; 39+ messages in thread From: Daniel Berlin @ 2002-08-26 17:44 UTC (permalink / raw) To: Chris Lattner; +Cc: Diego Novillo, Pop Sébastian, gcc On Mon, 26 Aug 2002, Daniel Berlin wrote: > On Mon, 26 Aug 2002, Chris Lattner wrote: > > > On Sun, 25 Aug 2002, Daniel Berlin wrote: > > > > > > Is there a reason that you'd like to have a SIMPLE form that is not in > > > > SSA? > > > > > > Um, yes. > > > It artificially limits us to that representation. > > > Say later, someone wants to use some other form of SSA, or non-SSA > > > (dependence flow graph, etc). > > > If you make it actually part of the trees, they now have to care about > > > it. > > > If it's just annotations, only what wants to look at those annotations > > > look at it. > > > > Actually, that's not true at all. Using SSA as the base representation is > > just like using a conventional register based underlying representation: > > you can still layer other representations ON TOP of SSA, you just need to > > keep the SSA up-to-date as you transform the layered representation (just > > as you must do with the register based representation). > > No kidding, which is the problem. Just to be clear, i mean the fact that we have to layer things on top of it is the problem. not the updating. ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [tree-ssa] AST optimizer in C++? 2002-08-26 17:40 ` Daniel Berlin 2002-08-26 17:44 ` Daniel Berlin @ 2002-08-26 18:11 ` Chris Lattner 2002-08-26 18:46 ` Daniel Berlin 1 sibling, 1 reply; 39+ messages in thread From: Chris Lattner @ 2002-08-26 18:11 UTC (permalink / raw) To: Daniel Berlin; +Cc: Diego Novillo, Pop Sébastian, gcc On Mon, 26 Aug 2002, Daniel Berlin wrote: > On Mon, 26 Aug 2002, Chris Lattner wrote: > > On Sun, 25 Aug 2002, Daniel Berlin wrote: > > > > Is there a reason that you'd like to have a SIMPLE form that is not in > > > > SSA? > > > > > > Um, yes. > > > It artificially limits us to that representation. > > > Say later, someone wants to use some other form of SSA, or non-SSA > > > (dependence flow graph, etc). > > > If you make it actually part of the trees, they now have to care about > > > it. > > > If it's just annotations, only what wants to look at those annotations > > > look at it. > > > > Actually, that's not true at all. Using SSA as the base representation is > > just like using a conventional register based underlying representation: > > you can still layer other representations ON TOP of SSA, you just need to > > keep the SSA up-to-date as you transform the layered representation (just > > as you must do with the register based representation). > > No kidding, which is the problem. > Just to be clear, i mean the fact that we have to layer things on top of > it is the problem. not the updating. I understand exactly what you're saying. My point is that if you want to do a non-SSA representation TODAY, that you have to deal with layering. Changing the base representation to SSA form doesn't change this fact. > > In practice, there are several simple extensions to SSA that are > > representable as direct extensions to the representation [eta nodes, pi > > nodes from the ABCE paper, etc], which don't require a layered > > representation. 
Other representations are really major extensions of SSA > > which are enhanced by having SSA as the base representation. > Not necessarily true at all. > But i'm also not saying SSA is a bad base, i just don't think we need to > make it the *actual* IR. I understand what you're saying. :) > > Of course, I'm sure there are others I'm not familiar with, but worst > > case, you just have to keep the base representation up to date with the > > layered representation, which is nothing different than the current > > situation with SSA. > This has nothing to do with keeping it up to date. It has to do with the > fact that other things would need to know about PHI nodes, etc. > We can easily just rebuild the SSA form if we want to run a non-SSA pass > in between two SSA ones (or whatever). But it's quicker to be able to just > *RUN* that pass without having to drop PHI nodes, insert copies, etc, just > so that non-SSA pass doesn't get confused. This has everything to do with keeping it up to date. This is the situation as it exists today with a layered SSA representation: 1. All SSA transformations are penalized by having to update both SSA form and standard register form. This makes SSA transformations harder and less likely to be written. 2. Alternative representations can be represented just like SSA is now, layered. 3. Non-SSA transformations don't have to deal with SSA at all, they can just ignore it and have it garbage collected away. 4. Non-SSA transformations cause a huge compile-time performance hit, because they invalidate the SSA information, requiring it to be rebuilt. SSA generation isn't SLOW, but regenerating it frequently can certainly add noticeably to compile time. If dominator information is also out of date then things get really slow for big codes. 5. Transformations on alternative representations can either be aware of SSA or not. If they aren't, then they suffer the same problems as #4.
If they are, then they have to maintain and manipulate _3_ different representations, keeping them up to date. ... and this is the situation with SSA built into the representation: 1. SSA transformations just update the one true representation. There is nothing to get out of date, and they are simple to write. 2. Transformations in alternative representations are implemented as layers on top of the SSA representation. There isn't a huge difference between this case and a register based representation. 3. You don't have to allocate as much to maintain two representations. You don't have to track whether or not things are up to date, because SSA is always available. Thus, the optimizer runs faster. 4. Non-SSA transformations are harder to write. See below. 5. Transformations on alternative representations have to maintain _at most_ two representations. The register based representation doesn't factor into the picture at all. About Non-SSA transformations: 1. I have yet to be convinced that there is a need for such a thing at all. Entire optimizers (Pro64, LLVM) have been written in SSA, and I haven't come across an example of a transformation that is awkward to implement in SSA. If you have a good example, please let me know. 2. Even if you do come up with a transformation, you can use a (generally useful) pass like the LLVM mem2reg pass. This transformation basically promotes stack allocated values accessed with load & store instructions into SSA registers (memory is not in SSA form). If you have a transformation that really doesn't want to maintain and update SSA form, you simply have it convert the stuff it's working on into loads and stores of a stack allocated value. After the pass you run the mem2reg pass, which cleans everything up for you (basically constructs SSA just for the values you're interested in). Either way, non-SSA transformations are not a problem, and again, I'm not aware of anything that doesn't like to be in SSA.
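The mem2reg recipe described above can be sketched in a few lines. This is a hedged toy illustration: the `Inst` record, the slot names, and `promoteLoads` are invented for this sketch and are not LLVM's actual classes, and a real mem2reg pass must also place phi nodes at control-flow joins, which straight-line code doesn't need:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy instruction: "store" writes value into a named stack slot,
// "load" reads the slot (value is unused for loads).
struct Inst { std::string op; std::string slot; int value; };

// Within one basic block, forward each load the value of the most
// recent store to the same slot -- the core of promoting a stack
// slot to an SSA register. Returns the values the loads would yield.
std::vector<int> promoteLoads(const std::vector<Inst>& block) {
    std::map<std::string, int> reachingDef;  // slot -> last stored value
    std::vector<int> promoted;
    for (const Inst& I : block) {
        if (I.op == "store")
            reachingDef[I.slot] = I.value;       // new reaching definition
        else if (I.op == "load")
            promoted.push_back(reachingDef[I.slot]);  // forward it directly
    }
    return promoted;
}
```

Across multiple blocks, the same per-block bookkeeping plus phi placement at join points reconstructs SSA just for the promoted slots, which is what makes the "emit loads/stores, then run mem2reg" recipe work.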
> > The key difference with using SSA as the base representation, however, is > > that almost ALL transformations benefit from it. SSA will probably be the > > _most_ commonly used representation when the tree-ssa branch is finally > > mature (hence the name), beating out the register based representation by > > a lot. I'm not really sure when and if other representations would > > actually be useful, and if their usefulness is worth the cost of creating > > and maintaining the representation: whether layered on top of registers or > > SSA. > > DFG, PDG, etc are generally worth the cost of creating. These representations are also easily built "on the side" on top of SSA, so they aren't very good for showing how SSA gets in the way... > > > Once again, forcing SSA on the representation itself means they have to > > > care about it (or you've limited when they can be run), while annotations > > > have no such limitations. > > > > Sure, of course. If, however, they don't understand the annotations, they > > will be corrupted by the pass. This means that you have to destroy and > > regenerate the SSA annotations. > So? > The other option is to not be able to run the pass at all. > Why limit ourselves by making sure we can't run that other pass, when we > get no actual *benefit* from doing so? That's not true at all, please see the remarks above. You seem very concerned about these transformations... can you give me some examples? > > This is greatly expensive, which means > > that the transformation will be updated to use SSA anyway (often making it > > more efficient). While forcing SSA from the start makes it harder to > > transfer legacy code, > > It's not necessarily legacy code. > All the world is not SSA. > Even if it was, it may not stay that way forever. That's true, but even though SSA is not perfect, it's a HUGE step up over just using a register based representation. Why limit ourselves to the lowest common denominator? 
> > > There is no good reason to actually make it part of the form, other than > > > having to insert copies yourself. But all algorithms i've seen that are > > > written to keep SSA form up to date do this anyway, or make it easy to do. > > > > I'm not sure if I understand what you're saying here... it seems as though > > you are contradicting yourself by saying that SSA _is_ easy to keep > > up-to-date (which I believe it is, of course :) What am I > > misinterpreting? > > You keep missing the actual point. > > 1. There is no benefit to putting SSA in the trees, as opposed to > annotations. > 2. There is a cost to putting SSA in the trees, regardless of whether > you feel the things that aren't SSA, are worth squat. > > Thus, given this, *why* do it? > What does it buy us? > That's what i want to know. Obviously because maintaining the representation on the side has costs, costs that you seem to be ignoring. If you want me to enumerate some of them, I can. The whole point is that SSA is a very nice representation for performing optimizations on, and therefore you want to make writing these optimizations easier. If it makes the less common case a bit harder, then that's ok, because it's not common. Sorta like profile guided optimization I suppose, where we are optimizing initial development and ongoing maintenance instead of execution time. -Chris http://llvm.cs.uiuc.edu/ http://www.nondot.org/~sabre/Projects/ ^ permalink raw reply [flat|nested] 39+ messages in thread
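The "drop PHI nodes, insert copies" step both sides refer to in this exchange can be sketched concretely. This is a hedged toy: the `Phi` record and the block/value names are invented for illustration, and a production out-of-SSA translation must additionally split critical edges and handle the lost-copy and swap problems:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy phi node: def receives incoming[pred] when control arrives
// from predecessor block pred.
struct Phi {
    std::string def;
    std::map<std::string, std::string> incoming;  // pred block -> value
};

// Naive out-of-SSA: for each phi "x = phi(a:B1, b:B2)", schedule a
// copy "x = a" at the end of B1 and "x = b" at the end of B2, so the
// phi itself can be deleted. Returns the copies to append per block.
std::map<std::string, std::vector<std::pair<std::string, std::string>>>
eliminatePhis(const std::vector<Phi>& phis) {
    std::map<std::string, std::vector<std::pair<std::string, std::string>>> copies;
    for (const Phi& p : phis)
        for (const auto& in : p.incoming)
            copies[in.first].push_back({p.def, in.second});
    return copies;
}
```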
* Re: [tree-ssa] AST optimizer in C++? 2002-08-26 18:11 ` Chris Lattner @ 2002-08-26 18:46 ` Daniel Berlin 2002-08-27 9:26 ` Chris Lattner 0 siblings, 1 reply; 39+ messages in thread From: Daniel Berlin @ 2002-08-26 18:46 UTC (permalink / raw) To: Chris Lattner; +Cc: Diego Novillo, Pop Sébastian, gcc No kidding, which is the problem. ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [tree-ssa] AST optimizer in C++? 2002-08-26 18:46 ` Daniel Berlin @ 2002-08-27 9:26 ` Chris Lattner 2002-08-27 9:57 ` Chris Lattner 0 siblings, 1 reply; 39+ messages in thread From: Chris Lattner @ 2002-08-27 9:26 UTC (permalink / raw) To: Daniel Berlin; +Cc: Diego Novillo, Pop Sébastian, gcc On Mon, 26 Aug 2002, Daniel Berlin wrote: > >> No kidding, which is the problem. > > > >> Just to be clear, i mean the fact that we have to layer things on top > >> of it is the problem. not the updating. > > > > I understand exactly what you're saying. My point is that if you want > > to do a non-SSA representation TODAY, that you have to deal with > > layering. > No you don't, you can just add your own annotations. > No layering needed. > Where did you get the idea you need to touch SSA at all to write a > separate representation? I'm sorry, I mis-typed. I meant that if you want to do a non-register based representation today, you have to deal with layering. When I say layering, I mean a layer of annotations on the side that basically exist over the top of a base representation. If SSA was your base representation and you wanted a PDG, f.e., you could have the annotations on the side of SSA just like you would a register based representation. Sorry for the confusion! :) > > This has everything to do with keeping it up to date. This is the > > situation as it exists today with a layered SSA representation: > > > > 1. All SSA transformations are penalized by having to update both SSA > > form > > and standard register form. This makes SSA transformations harder > > and > > less likely to be written > How does it make it harder, given that any transformation that keeps > the SSA form up to date must necessarily be inserting the proper copies > into the "standard" form and whatnot anyway? I'm not sure what you mean by "inserting copies". If SSA is the base representation there is absolutely no need for a copy instruction (LLVM doesn't have one, for example).
If SSA is layered on top of a register based representation (with annotations), of course then you have to insert copies into the register based representation, but that's only one of the ways you have to modify it to keep it up to date... Could you please explain a bit what you mean by this? Sorry if I'm missing something obvious. :) > > 2. Alternative representations can be represented just like SSA is now, > > layered. > > 3. Non-SSA transformations don't have to deal with SSA at all, they can > > just ignore it and have it garbage collected away. > > 4. Non-SSA transformations cause a huge compile-time performance hit, > > because they invalidate the SSA information, requiring it to be > > rebuilt. > Well, not quite. > You can incrementally update it without the non-SSA pass having to > worry about it if you really think it's that big a hit. I'm not sure what you mean by this. You still have to regenerate the SSA annotations after transformations on the alternative representation if you don't update the SSA. If you do update the SSA, that's case #5 below. > Or, we could run the optimizers outside the SSA ones, but with your > scheme, this adds an unssa pass. Not that this is slow, but just > "another" thing to worry about. Or, in your scheme, the SSA representation would effectively get garbage collected away, and regenerated later, which is basically as expensive as running unssa anyway. Unssa is a quite fast and simple transformation when you don't have to worry about RTL level issues. > > 5. Transformations on alternative representations can either be aware > > of > > SSA or not. If they aren't, then they suffer the same problems as > > #4. > > If they are, then they have to maintain and manipulate _3_ different > > representations, keeping them up to date. > Not they directly necessarily, but yes, SSA needs to be kept up to date > or rebuilt. 
So my point was that _if_ SSA is the primary representation [ie, used by most transformations], then it makes sense to make it the base representation as well. > > ... and this is the situation with SSA built into the representation: > > > > 1. SSA transformations just update the one true representation. There > > is > > nothing to get out of date, and they are simple to write. > I still don't see how it was hard in the first place. It is not _hard_ per se, see the end of the email. > > 2. Transformations in alternative representations are implemented as > > layers > > on top of the SSA representation. There isn't a huge difference > > between this case and a register based representation. > The hardness of this step depends on how close to SSA your > representation is. Certainly, just as the hardness of layering on top of a register based representation depends on how close you are to a register based rep. I think that you would agree that "new" representations tend to be more about encoding flow and dependence information into the representation than they do about modeling explicit registers, along with all four possible dependences... > > Obviously because maintaining the representation on the side has costs, > > costs that you seem to be ignoring. > > > If you want me to enumerate some of > > them, I can. > Please do. > The location of SSA information, be it in annotations, in the IR, or on > some remote computer in budapest, should not affect the cost of > implementation (IE the "hardness" that will prevent SSA opts from being > written) as long as the abstraction is good. > Maybe the cost of runtime, but even then, consider this: Here are some of the costs that I can think of: 1. Actively working on (updating, traversing) two separate representations at the same time implies that you have less effective cache capacity available to you. This includes L1, L2, TLB, and all of the other problems people are tackling. 2.
Because you _must_ add extra layers of abstraction on top of things to make it easy to update the SSA information, your icache is less effective. In addition, the code is harder to understand because there is more abstraction involved. 3. Some things you cannot abstract out, for example, generalized updates to the representation. 4. Having two representations active at the same time means that potential contributors have to be aware of and understand the rules and semantics of BOTH of them. In this case, it means that they have to know how and when to update the base representation and what exactly the abstraction functions are doing. 5. Speaking from experience, having the representation in SSA form only makes it much easier to manipulate instruction nodes, understand the lifetime of the underlying representation, and therefore have simple rules for when and how memory can be leaked. This is something that GCC has lacked for a long time (not just in the optimizer), which is what makes the garbage collector effectively necessary. 6. Having a layered representation makes it much less convenient to define your intermediate representation as a _language_ in its own right, which gives you all kinds of nice advantages. Note that this all depends on having a top-notch abstraction interface. If you don't have a good abstraction interface, then there really is no comparison. > Even if we made SSA the IR, we still need the annotations to generate > the factored edges, unless you are suggesting we get rid of these too > (which, I should point out, Pro64 and friends have and use). > If we get rid of those, we'd have to have nice large arrays that > multimap defs to uses, map defs to uses, etc. > Unless you want to throw those annotations *directly* in the tree > structure, which would be another fight with other people. I'm sorry, I'm not familiar with the term "factored edges". What are they used for?
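As a concrete picture of the "multimap defs to uses" side tables under discussion, here is a hedged sketch; the `Inst` shape and the names are invented for illustration. The table lives beside the IR, which is essentially what the annotation approach stores, whereas a pointer-based IR such as LLVM's instead has each value own its use list directly:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy SSA instruction: one defined name plus the names it uses.
struct Inst { std::string def; std::vector<std::string> uses; };

// Build a def -> using-instruction-index multimap "on the side".
// Walking the instruction stream once records, for every SSA name,
// which instructions read it.
std::multimap<std::string, int> buildDefUse(const std::vector<Inst>& insts) {
    std::multimap<std::string, int> defToUses;
    for (int i = 0; i < static_cast<int>(insts.size()); ++i)
        for (const std::string& u : insts[i].uses)
            defToUses.insert({u, i});  // instruction i uses name u
    return defToUses;
}
```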
> > The whole point is that SSA is a very nice representation for > > performing > > optimizations on, and therefore you want to make writing these > > optimizations easier. If it makes the less common case a bit harder, > > then > > that's ok, because it's not common. > Not common *right now*. > Look how quickly SSA took over. Sure, because SSA makes a lot of sense for a lot of optimizations. You still haven't given me any examples about optimizations that are difficult to write in SSA, f.e. Most of the other representations are tuned to a particular class of problem, making them less generally useful. I would argue that _this_ is why SSA caught on so fast. > > Sorta like profile guided > > optimization I suppose, were we are optimizing initial development and > > ongoing maintenance instead of execution time. > In terms of initial development time, i think i can offer some actual > data: > I've written 1.5 of the 2 SSA optimizers on the tree-ssa branch. > You have written 0. True, but I have written ~30 SSA optimizers in LLVM, plus many "analyses" as well (dominators, natural loops, intervals, alias analysis, data structure analysis, etc). I do have some experience, and I do have some idea what I'm talking about. The fact that LLVM didn't exist at all two years ago is a testament to how fast it has been progressing. > Is this because SSA is not the IR? This is for several reasons: 1. There is a huge learning curve to working on GCC 2. There is [basically] no documentation on the tree-ssa branch 3. I'm not paid to work on GCC, and all my free time is taken working on LLVM. That said, I _would_ like to be able to work on GCC, contributing enhancements here and there. I _am_ somewhat familiar with other parts of GCC, just because of the backend work I have done to get LLVM coming out of GCC. If the infrastructure was more mature and inviting, I would be contributing. A good start would be porting over some of the transformations from LLVM. 
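Since dominator information comes up repeatedly in this thread, both as one of the analyses listed above and as a rebuild cost, here is a hedged sketch of the textbook iterative computation. The node numbering and `preds` shape are invented, the sketch assumes every non-entry block is reachable, and production compilers typically use the faster Lengauer-Tarjan or Cooper-Harvey-Kennedy algorithms instead:

```cpp
#include <cassert>
#include <set>
#include <vector>

// Iterative dominator computation: nodes are 0..n-1, node 0 is the
// entry block, preds[b] lists the CFG predecessors of block b.
std::vector<std::set<int>> dominators(const std::vector<std::vector<int>>& preds) {
    int n = static_cast<int>(preds.size());
    std::set<int> all;
    for (int i = 0; i < n; ++i) all.insert(i);
    std::vector<std::set<int>> dom(n, all);   // optimistic start: everything
    dom[0] = {0};                             // entry dominates only itself
    bool changed = true;
    while (changed) {                         // iterate to a fixed point
        changed = false;
        for (int b = 1; b < n; ++b) {
            std::set<int> d = all;            // intersect over predecessors
            for (int p : preds[b]) {
                std::set<int> meet;
                for (int x : d) if (dom[p].count(x)) meet.insert(x);
                d = meet;
            }
            d.insert(b);                      // a block dominates itself
            if (d != dom[b]) { dom[b] = d; changed = true; }
        }
    }
    return dom;
}
```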
> Honestly, I don't see how making SSA the IR would make me want to write > 3 instead of 5, or would make it any easier. I understand your viewpoint. The fact is, however, that you are an expert in GCC and an expert in the SSA representation used. For people who aren't, having two representations is painful and hard to deal with. This makes it less likely that you will get _external_ contributions, which I'd argue is what free software is all about. > Maybe you could work up a patch to tree-ssa-ccp or tree-ssa-pre.c that > shows how much easier it would be if we had it in the IR? > No need to actually make it *work*, i just want to see an actual > benefit in terms of what the code looks like, or how short it is, etc, > rather than all theoretical junk. I'm having a hard time getting access to those files. subversion.gcc.org doesn't seem to like displaying branches in viewcvs, and those files don't show up here: http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/?only_with_tag=tree-ssa-20020619-branch any hints? > It seems to me that all it would buy us in either is that the macros > might have different names, or access different structure fields. > For all intents and purposes, except for the fact that we don't > directly represent the versions in the names (nor does Pro64), and phi > nodes don't exist in trees, SSA *is* the IR. I'm not arguing that SSA isn't the IR. :) I'm arguing that because it's not the BASE IR, there are some disadvantages, that's all... -Chris http://llvm.cs.uiuc.edu/ http://www.nondot.org/~sabre/Projects/ ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [tree-ssa] AST optimizer in C++? 2002-08-27 9:26 ` Chris Lattner @ 2002-08-27 9:57 ` Chris Lattner 0 siblings, 0 replies; 39+ messages in thread From: Chris Lattner @ 2002-08-27 9:57 UTC (permalink / raw) To: Daniel Berlin; +Cc: Diego Novillo, Pop Sébastian, gcc > > Maybe you could work up a patch to tree-ssa-ccp or tree-ssa-pre.c that > > shows how much easier it would be if we had it in the IR? > > No need to actually make it *work*, i just want to see an actual > > benefit in terms of what the code looks like, or how short it is, etc, > > rather than all theoretical junk. > > I'm having a hard time getting access to those files. subversion.gcc.org > doesn't seem to like displaying branches in viewcvs, and those files don't > show up here: Okay I found the code in the Attic directory (is this normal?). Basically, I think the benefits would be: * Simpler code. I'm not really sure why you have two loops in ssa_ccp_substitute_constants, but SCCP should only require one loop of transformation over the representation to finalize it. In LLVM this looks like this:

  for (Function::iterator BB = F.begin(), BBE = F.end(); BB != BBE; ++BB)
    for (BasicBlock::iterator BI = BB->begin(); BI != BB->end();) {
      Instruction &Inst = *BI;
      if (ValueState[&Inst].isConstant()) {
        Constant *Const = ValueState[&Inst].getConstant();
        // Replaces all of the uses of a variable with uses of the constant.
        Inst.replaceAllUsesWith(Const);
        // Remove the operator from the list of definitions... and delete it.
        BI = BB->getInstList().erase(BI);
      } else {
        ++BI;
      }
    }

[slightly simplified from http://llvm.cs.uiuc.edu/doxygen/SCCP_8cpp-source.html#l00239 ] why does your SCCP have the extra loop? * Macros should disappear. Given a structured representation, many of the macros should go away... -Chris http://llvm.cs.uiuc.edu/ http://www.nondot.org/~sabre/Projects/ ^ permalink raw reply [flat|nested] 39+ messages in thread
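The single finalization loop described above can be sketched independently of LLVM's classes. This is a hedged toy, with the `Inst` record and the lattice shape invented for illustration: once the propagation phase has proved a name constant, one pass over the block suffices, since rewiring uses to the constant makes the defining instruction dead and it can be erased in place:

```cpp
#include <cassert>
#include <list>
#include <map>
#include <string>

// Toy instruction defining a name; operands are elided for brevity.
struct Inst { std::string def; };

// One pass over the block: erase every instruction whose result the
// lattice proved constant (its uses now read the constant directly).
// Returns how many instructions were folded away.
int finalizeConstants(std::list<Inst>& block,
                      const std::map<std::string, int>& lattice) {
    int folded = 0;
    for (auto it = block.begin(); it != block.end(); ) {
        if (lattice.count(it->def)) {
            it = block.erase(it);  // erase returns the next iterator
            ++folded;
        } else {
            ++it;
        }
    }
    return folded;
}
```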