public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* CFG and trees
@ 2003-11-17 18:44 Andrew MacLeod
  0 siblings, 0 replies; only message in thread
From: Andrew MacLeod @ 2003-11-17 18:44 UTC (permalink / raw)
  To: gcc mailing list


There is a bit of length here, so please bear with me.

First, let me clearly state that I am not opposed to integrating the CFG
into the IL, I just want everyone to be clear about what that means.

Next, lets look at the ideal world.

I picture all optimizations as black boxes. We get a correct
representation of a program in the form of an IL in one end, it gets
massaged around in some presumably beneficial way, and then we get
correct IL out the other end. 

We also have translation black boxes which take one form in, and spit a
different form out. The Parser translates source code to trees,
tree->rtl is another translation box, and rtl->.s is another
translation.

Translation boxes are required for compilation, optimization boxes are
optional. Since they are optional, we ought to be able to mix and match
then in any order and combination, and the program will still be
correct. We try to find the optimal order to get the best code.

So before deciding that a data structure of any sort should exist
through a translation box, we ought to make sure that it doesn't change
anything which might come between/after them.

In our case, we have 2 more translation units. tree->SSA and SSA->tree.
SSA is a form of a tree, but is clearly distinguishable from trees since
any optimization which operates on general trees is not goin to produce
a correct SSA tree.

If we want to make the CFG an integral part of the IL from SSA forward
and keep it through the translation to RTL, we have to examine what
might come in that period between SSA->normal and tree->rtl. Anything
which might will have to be a CFG manipulator as well.

We have optimized trees at that point. If we want to do inlining on
optimized trees, then the inliner will have also to understand the CFG.
Jan mentioned he was thinking about this for profiling information.  In
my mind the inliner doesn't care about the CFG.  It cares profiling in
that how many times a function is called is critical to making good
decisions. Keeping the CFG will make that information available, but
maintaining a CFG while performing inlining seems like extra work that
the inliner normally wouldnt be doing. I would think that annotating
call locations with an execution count would accomplish the same thing
without having to deal with a CFG.

Now, that just one example, and perhaps keeping the CFG up to date when
inlining isn't too big a deal when its integrated tightly with the IL.
It does mean we dont have the option of inlining *before* SSA on 
non-optimized trees without a different version of the inliner.

What I'd like to know is what other things people are envisioning doing
with the compiler on trees that this will have an impact on.  If having
a CFG around is going to be a generally beneficial thing, then we ought
to make sure we architect it that way now. 

Im under the impression that although tree->rtl isn't going to be forced
to be GIMPLE, it is going to be a lowered GENERIC in that some of the
more difficult container constructs will need to be replaced before
expansion. In which case that would seem to be the right place to build
a CFG if we want CFG based trees. Then everything which deals with
lowered GENERIC trees always has the CFG, so we would have a translation
box with goes from GENERIC trees -> lower generic + CFG, and that is
what is used striaght throught to rtl expansion. (with an optional
expansion into SSA and back).  I'll just call it lowered tree's for
now.  

So I think we are mostly deciding whether the CFG ought to be an
integral part of lowered trees or not. 

So what other types of thing might we do, and is a CFG required or would
it be in the way.

There is the inliner. It ought to operate on lowered trees I guess? It
would help it see a consistance view across all F/E's 

the SSA optimizer operates on lowered trees as well. This means the CFG
would exist before we come into the SSA translator. I presume this would
not be an issue, but we clearly need one here.

Presumably we want to do some form of whole program analysis at some
point, which would presumably operate on lowered trees as well.

And... ???


So, I think we currently lower trees and then build the cfg, we would
just have to ensure that they always happen together because thats what
forms our "lowered trees" IL. 

Now, my final summation :-) 

Reading this would seem to indicate that we might have a clear place
where we can combine the CFG and the IL which results in a consistant
view of the IL and a clean abtraction of the compiler as a whole.

I would like to see what others think about this view, and what mistakes
or modifications are in my formulations. I care that we keep a clean
interface and clear distinction between the different "black box"
sections. Ie, if we go this route, we aren't going to want to reuse
something from lowered-trees on regular trees or vice versa, because
then we have unneccesary duplication of effort. The only real impact
from this view is on the phase between SSA->lowered trees and
lowered_trees->rtl. lowered_trees and regular trees currently are
compatible with each other, and they will not be when the CFG is
integrated.


So the translation units we have are:

	FE -> any_trees (GENERIC/GIMPLE, whatever, its a tree)
	any_trees -> lowered trees
[opt]	lowered_tree->SSA,   SSA->lowered_tree
	lowered_tree->RTL

and then when into the rtl optimizer, so I'll just leave that as a black
box since we're not disucussing it. So for "logical" IL's that might be
optimized upon, we have 
  - any_trees
  - lowered_tree
  - SSA trees
  - rtl
and translators between them.

Adding the CFG to the definition of lowered_tree adds a simple
restriction that we can't have a single entity that operates on
any_trees and lowered_trees because they are now incompatible. I doubt
this is much of an issue because I would think the first thing we do
coming out of a F/E would be to translate to lowered_trees, but perhaps
someone has an argument for it.  

Note that this pretty much undoes my previous argument about using the
inliner before and after SSA, because we would now have lowered_trees
for it to operate on, and SSA again becomes a purely optional pass. the
inliner can happen before or after with freedom.

I have a question, how does gimplification fit into this. We need GIMPLE
in the SSA pass, but do we *need* it other places. Ie, would
gimplification also be considered an integral part of the
any_tree->lowered_tree translator?

Are there language specific things that interfere with the cleanliness
of doing everything on lowered_trees? Ie are there some thing that must
be done on trees as the come out of the F/E? I would hope not, or if so,
it out to be done before any_trees->lowered_trees happens.

As you can see, Im not arguing against making the CFG part of
lowered_trees, I do want to be sure no one else has any reasons why its
not a good idea, and I want to see if we can all agree on what the
picture looks like before moving forward. Then we're all talking the
same language.  The picture I see here implies it would be OK. As long
as the picture is correct :-)

Andrew

PS I forgot about mudflap. It would also have to be modified to work on
lowered trees, and it would then presumably benefit from having a CFG,
and it would also have to keep the CFG up to date. It would clearly not
work on lowered-trees if we make this change until it was updated.  I
would think it would benefit from having the CFG, and possibly from
splitting into 2 parts, one which issues some form of instrumentation
which can be optimized before SSA, and then expansion of that
insrumentation into real code after SSA. Then if SSA is run, it would
benefit from all the optimizations performed on SSA form.

Anything else Ive forgotten?


^ permalink raw reply	[flat|nested] only message in thread

only message in thread, other threads:[~2003-11-17 16:55 UTC | newest]

Thread overview: (only message) (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2003-11-17 18:44 CFG and trees Andrew MacLeod

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).