It's been my plan, since finally wrapping my head around Bodik's thesis, to revamp how we handle jump threading to use some of the principles from his thesis. In particular, the back substitution and simplification model feels like the right long term direction.

Sebastian's FSM threader was the first step on that path (gcc-5). Exploiting that threader for more than just FSM loops was the next big step (gcc-6).

This patch takes the next step: disentangling that new jump threading code from the old threading code and from VRP/DOM. The key thing to realize here is that the backwards (FSM) jump threader does not inherently need the DOM tables nor the ASSERT_EXPRs from VRP to do its job. I.e., it can and should run completely independently of DOM/VRP (though one day it may exploit range information that a prior VRP pass has computed).

By moving the backwards threader into its own pass, we can run it prior to DOM/VRP, which allows DOM/VRP to work on a simpler CFG with larger extended basic blocks.

The removal of unexecutable paths before VRP also has the nice effect of eliminating false positive warnings for some work Aldy is doing around out-of-bounds array index warnings.

We can remove all the calls to the backwards threader from the old style threader. The way the FSM bits were wired into the old threader caused redundant path evaluations. That is easily fixed with the FSM bits in their own pass. The net is a 25% reduction in paths examined by the FSM threader.

Finally, we ultimately end up threading more jumps. I don't have the numbers handy anymore, but when I ran this through my tester there was a clear decrease in the number of runtime jumps.

So what are the downsides?

With the threader in its own pass, we end up getting more calls into the CFG & SSA verification routines in a checking-enabled compiler. So the compile-time improvement is lost for a checking-enabled compiler.
The backward threader does not combine identical jump threading paths with different starting edges into a single jump threading path with multiple entry points. This is primarily a code size issue, but it can have a secondary effect on performance. I know how to fix this and it's on the list for gcc-7 along with further cleanups.

Bootstrapped and regression tested on x86_64 Linux. Installing on the trunk momentarily.

Jeff
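
For readers less familiar with the transformation, here is a minimal sketch in C of the kind of redundant test a jump threader can bypass. It is an illustration only, not code from the patch: the names classify, classify_threaded and paths_agree are hypothetical, and the second function is a hand-written picture of the threaded control flow, not actual compiler output.

```c
/* Illustrative only: hypothetical functions, not taken from the patch. */

/* Before threading: the second test re-examines a value whose outcome
   is already known on each incoming path. */
int classify(int x)
{
    int state;

    if (x > 10)
        state = 1;      /* on this path, state == 1 is known to be true */
    else
        state = 0;      /* on this path, state == 1 is known to be false */

    /* Join point: both paths above reach this test. */
    if (state == 1)
        return 100;
    return 200;
}

/* Hand-written picture of the threaded control flow: each incoming
   path goes straight to the destination its known value selects. */
int classify_threaded(int x)
{
    if (x > 10)
        return 100;
    return 200;
}

/* Returns 1 if both versions agree for every x in [lo, hi], else 0. */
int paths_agree(int lo, int hi)
{
    for (int x = lo; x <= hi; x++)
        if (classify(x) != classify_threaded(x))
            return 0;
    return 1;
}
```

In this sketch, walking backward from the state == 1 test to the two assignments is enough to resolve the test on each path. That is the sense in which the backward threader does not inherently need the DOM tables or ASSERT_EXPRs to do its job.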