Annotalysis is a flow- and path-sensitive analysis in which we track the set of locks known to be held at each program point, and verify that any access to guarded data, or call to a guarded method, is made while holding the appropriate lock. Annotations are used to mark data members and methods as "guarded".

In the ideal world of our imagination, we would like an intermediate representation that supplies the following information, in order of priority:

(1) It must provide a CFG that accurately reflects the source code. This is critical.
(2) It must provide accurate source-language type information for each expression, so that we can find annotations on virtual methods.
(3) We need to look at the IR before any optimizations occur.
(4) It should be stable from release to release of the compiler.
(5) All local variables should be in SSA form; SSA makes life much easier.
(6) It should identify loads and stores.
(7) It should be simple and easy to traverse.

Gimple satisfies 5-7 almost perfectly, does 1-3 pretty well, and fails on 4. Unfortunately, "pretty well" on 1-3 was not sufficient for our needs, and 4 was becoming a problem. I will provide a few examples of things that have caused major headaches. These examples are enclosed in gcc-problem-examples.cpp, which you can compile with -fdump-tree-ssa to follow along. The descriptions refer to gcc 4.6; I have not tested with gcc 4.7.

========
(1) Inaccurate CFG

void foo1(Mutex* mu, bool c) {
  mu->Lock();
  if (c) {
    MyClass mc;
    if (c) {
      mu->Unlock();
      return;
    }
    // extra join point here to call ~MyClass()
  }
  mu->Unlock();
  return;
}

The lowering to GIMPLE produces an extra join point in the CFG that is not present in the source code, in order to call the destructor of MyClass. From a codegen standpoint this is ordinarily a good thing, since different paths in the CFG can share the same destructor code. However, our analysis algorithm merges locksets whenever it hits a join point, and here the mutex is held on one incoming edge but not on the other, so we get false positives. Fixing this problem would require a substantial rewrite of the algorithm to handle conditionally held locks.

========
(2) Inaccurate type information

void foo2(void* p) {
  ((MyClass*) p)->myMethod();
}

This is a minor nit. The virtual function call uses a type cast to grab the vtable, but then passes p (with type void*) as 'this'. So when I go back and try to locate the method later, I get a method lookup failure. To be fair, this sort of thing does not happen very often.

========
(3) "Local" optimizations with non-local effects

Annotalysis is scheduled as a local optimization pass that runs before all other optimizations. Unfortunately, some "local" optimizations have non-local effects that interfere with the analysis. IPA-SRA has been the main source of headaches:

class DummyMutex {
 public:
  void Lock()   EXCLUSIVE_LOCK_FUNCTION(this) { }
  void Unlock() UNLOCK_FUNCTION(this)         { }
};

void foo3(DummyMutex* dmu1, DummyMutex* dmu2) {
  dmu1->Lock();
  dmu2->Lock();
  dmu2->Unlock();
  dmu1->Unlock();
}

DummyMutex is a class that implements the interface of a mutex without doing anything. The EXCLUSIVE_LOCK_FUNCTION and UNLOCK_FUNCTION macros are annotations that the analysis uses to build locksets. IPA-SRA will helpfully notice that the bodies of these functions do not refer to 'this', and thus that they can be lifted to static methods; a good optimization. It fails to notice that the *annotation* refers to 'this'.
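Conceptually, the effect is as if the code had been rewritten like this before the analysis runs. (This is a hypothetical sketch of what the transformation does to the calls, not the actual GIMPLE output; the static function names are invented for illustration.)

// Hypothetical sketch: the effect of IPA-SRA as seen by the analysis.
// The unused 'this' parameter has been dropped, so the annotations,
// which name 'this', no longer identify any object.
static void DummyMutex_Lock()   { }  // was DummyMutex::Lock(), 'this' dropped
static void DummyMutex_Unlock() { }  // was DummyMutex::Unlock(), 'this' dropped

void foo3(DummyMutex* dmu1, DummyMutex* dmu2) {
  DummyMutex_Lock();    // which mutex is being locked? dmu1 no longer appears
  DummyMutex_Lock();
  DummyMutex_Unlock();
  DummyMutex_Unlock();  // locks can no longer be matched to unlocks
}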
The lifting happens "locally", before the compilation of foo3(), so when checking foo3() the analysis can no longer determine which object is being locked or unlocked. The net result of this change is that we get false positives when compiling with optimizations turned on, even though the analysis is supposed to run before any optimizations. (Compare the output of -fdump-tree-ssa when compiling normally versus compiling with -O3.)

========
(4) Unstable platform

Problems (1) and (3) were both introduced in gcc 4.6; they did not exist in earlier versions of the compiler. gcc 4.7 introduces new breakages that I have not investigated. The lowering algorithm from C++ to gimple, the gimple intermediate language itself, and the optimization pass framework are all very fluid, so a lot of maintenance work is required to keep annotalysis up to date.

========
Comparison with Clang

The clang AST satisfies 1-4 perfectly. The AST records all information and follows the C++ spec as closely as possible. Clang does not perform any transformations or optimizations in the front end. It builds a CFG on top of the original AST, with pointers to the original expressions. The AST is lowered to LLVM IR in a separate pass, which happens only after the AST has been fully constructed. Because the AST is keyed to the C++ spec, it rarely changes.

The clang AST fails utterly on 5-7. The AST is not simple, because C++ is not simple. For example, whereas gimple has function calls, clang has constructor, destructor, new, delete, implicit destructor, function, and method calls, each represented differently, so handling a "call" involves considerably more work (see the sketch at the end of this note). We also had to implement our own use-def chains that mimic SSA, and identify expressions that would cause a load or store. (Note that the clang static analyzer framework does provide some of this functionality, but our current implementation does not make use of it.)

As you can see, neither platform perfectly meets our needs. However, issues 5-7 were lower priority, and we had a straightforward way to work around them that did not require major changes to the algorithm, or alterations to the architecture of the compiler.

========
Conclusion

From our perspective, the ideal solution would be a strongly typed, high-level intermediate language, somewhat like the JVM or .NET IRs, with an initial lowering from C++ source code to that high-level IR. Static analysis passes and high-level optimizations could be run on this IR. The high-level IR would then be lowered to a low-level IR like Gimple or LLVM, where low-level optimization and codegen passes would occur. Java uses an architecture like this, which is one reason why academic research has overwhelmingly focused on Java; Java bytecode is vastly easier to work with than C++. Neither clang nor gcc currently uses such an architecture.
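To make the "C++ is not simple" point concrete, here is a rough sketch of the case analysis that the clang AST forces on a checker. It assumes clang's C++ API (RecursiveASTVisitor and the various call-expression node classes); it illustrates the shape of the problem and is not our actual implementation.

#include "clang/AST/RecursiveASTVisitor.h"

using namespace clang;

// Sketch: in gimple, a call is a call. In the clang AST, each flavor of
// "call" is a different node class, and a checker must handle all of them.
// (RecursiveASTVisitor also invokes the Visit* methods of base classes,
// e.g. VisitCallExpr fires for member calls too, which a real checker
// has to account for.)
class CallWalker : public RecursiveASTVisitor<CallWalker> {
 public:
  bool VisitCallExpr(CallExpr *E)                       { return true; }  // plain function call
  bool VisitCXXMemberCallExpr(CXXMemberCallExpr *E)     { return true; }  // method call
  bool VisitCXXOperatorCallExpr(CXXOperatorCallExpr *E) { return true; }  // overloaded operator
  bool VisitCXXConstructExpr(CXXConstructExpr *E)       { return true; }  // constructor call
  bool VisitCXXNewExpr(CXXNewExpr *E)                   { return true; }  // operator new
  bool VisitCXXDeleteExpr(CXXDeleteExpr *E)             { return true; }  // operator delete
  // Implicit destructor calls have no AST node at all; they show up only
  // as elements of the clang CFG (e.g. automatic-object destructors that
  // run at scope exit).
};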